Contextual Compression Retriever

The Contextual Compression Retriever helps you find the most relevant parts of your documents quickly. It takes a search query, looks up documents with a base retriever, and then uses a compression technique to keep only the most useful information. This makes searches faster and results clearer.

How it Works

  1. Base Retriever – First, the component asks a base retriever (like a simple keyword search) to pull a set of documents that match your query.
  2. Compression – Next, it applies one of several compression methods to those documents:
    • LLMChainExtractor – Uses a language model to pull out the most important sentences.
    • LLMChainFilter – Filters out less relevant parts with a language model.
    • LLMListwiseRerank – Re‑orders the results so the best ones appear first.
    • EmbeddingsFilter – Keeps only documents that are similar enough to the query based on vector embeddings.
  3. Result – The compressed set of documents is returned as a new retriever that you can use for further processing or display.
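
Under the hood these are standard LangChain building blocks. The following is a minimal sketch of the same three steps, assuming an OpenAI chat model, OpenAI embeddings, and a tiny in-memory vector store as the base retriever; all three are stand-ins for whatever you actually configure in Nappai:

```python
# Minimal sketch of the retrieve-then-compress pipeline.
# Assumes langchain + langchain-openai are installed and OPENAI_API_KEY is set.
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Step 1 – Base retriever: any retriever works; here, a tiny in-memory store.
base_retriever = InMemoryVectorStore.from_texts(
    ["Nappai ships a contextual compression retriever.",
     "Compression keeps only the passages relevant to the query."],
    embedding=OpenAIEmbeddings(),
).as_retriever()

# Step 2 – Compressor: an LLM extracts only the sentences relevant to the query.
compressor = LLMChainExtractor.from_llm(ChatOpenAI(model="gpt-4o-mini"))

# Step 3 – Result: a new retriever that compresses whatever the base one returns.
retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=base_retriever
)
```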

The component does not call any external APIs beyond the language model or embedding service you provide. All processing happens inside Nappai.

Inputs

  • Base Retriever: The retriever that will first fetch documents before compression.
  • Embedding: The embedding model used when the “EmbeddingsFilter” compressor is selected.
  • LLM: The language model used for the LLM‑based compressors (Extractor, Filter, Rerank).
  • Compressor: Choose which compression method to apply (a mapping sketch follows this list).
  • Search Query: The text you want to search for in your documents.
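
To make the Compressor options concrete, here is one plausible way the selection could map onto LangChain compressor classes. The `build_compressor` helper, the option strings, `top_n=3`, and the `0.76` threshold are illustrative assumptions, not Nappai's actual internals:

```python
from langchain.retrievers.document_compressors import (
    EmbeddingsFilter,
    LLMChainExtractor,
    LLMChainFilter,
    LLMListwiseRerank,
)

def build_compressor(name, llm=None, embedding=None):
    """Hypothetical dispatch from the dashboard's Compressor option."""
    if name == "LLMChainExtractor":
        return LLMChainExtractor.from_llm(llm)       # extract key sentences
    if name == "LLMChainFilter":
        return LLMChainFilter.from_llm(llm)          # drop irrelevant docs
    if name == "LLMListwiseRerank":
        # Needs a chat model; keeps only the best-ranked documents.
        return LLMListwiseRerank.from_llm(llm, top_n=3)
    if name == "EmbeddingsFilter":
        return EmbeddingsFilter(embeddings=embedding, similarity_threshold=0.76)
    raise ValueError(f"Unknown compressor: {name}")
```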

Outputs

  • Retriever: A new retriever object that already has compression applied. You can feed this into other components that expect a retriever.
  • Search Results: The list of documents (or snippets) that match the query after compression. These can be shown to users or passed to downstream logic.
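
In code terms (continuing the sketch above), the Retriever output is the compression retriever object itself, and the Search Results are what you get by invoking it with the query:

```python
# `retriever` is the ContextualCompressionRetriever built earlier.
results = retriever.invoke("What does contextual compression do?")
for doc in results:
    print(doc.page_content)  # compressed snippets, best matches only
```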

Usage Example

  1. Add the component to your workflow.
  2. Connect a Base Retriever (e.g., a simple keyword retriever that pulls documents from your database).
  3. Select a Compressor – for example, “LLMChainExtractor” to keep only the most relevant sentences.
  4. Provide an LLM – choose the language model you want to use for extraction.
  5. Enter a Search Query – type the question or keyword you’re interested in.
  6. Run the workflow.
    • The component will return a compressed retriever and a list of the best matching document snippets.
  7. Use the Search Results – display them in a dashboard card or feed them into a summarization component.
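
For comparison, here is the same walkthrough as a hedged LangChain sketch, this time using the LLMChainFilter compressor; the model name, sample text, and query are placeholders:

```python
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainFilter
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Steps 2-4: base retriever, compressor choice, and LLM.
base = InMemoryVectorStore.from_texts(
    ["Replace these strings with your own documents."],
    embedding=OpenAIEmbeddings(),
).as_retriever()
compressor = LLMChainFilter.from_llm(ChatOpenAI(model="gpt-4o-mini"))

# Steps 1 and 6: wire the component together and run it.
retriever = ContextualCompressionRetriever(
    base_compressor=compressor, base_retriever=base
)

# Steps 5 and 7: search and use the results.
snippets = retriever.invoke("your search query here")
```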

Related Components

  • Base Retriever – The component that actually pulls raw documents before compression.
  • LLM – Provides the language model used for extraction or filtering.
  • Embedding – Supplies vector embeddings for similarity‑based filtering.
  • Document Summarizer – Can take the compressed results and produce a short summary.

Tips and Best Practices

  • Choose the right compressor: Use “EmbeddingsFilter” when you need strict similarity control; use LLM‑based compressors for more nuanced relevance.
  • Set a sensible similarity threshold if you use EmbeddingsFilter: too high and you’ll miss useful documents, too low and you’ll let noise through (see the sketch after this list).
  • Keep the base retriever fast: Compression adds an extra processing step on top of retrieval, so a slow base retriever makes the whole pipeline slow.
  • Monitor LLM usage: LLM calls can be expensive; consider caching results or limiting the number of queries (the sketch below shows one way to enable a cache).
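
Both tips fit in a short sketch. The `0.76` threshold is only a common starting point to tune, and the in-memory cache simply avoids paying twice for identical LLM calls:

```python
from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_core.caches import InMemoryCache
from langchain_core.globals import set_llm_cache
from langchain_openai import OpenAIEmbeddings

# Tune similarity_threshold: higher drops more documents (risking misses),
# lower keeps more (risking noise). 0.76 is a reasonable starting point.
compressor = EmbeddingsFilter(
    embeddings=OpenAIEmbeddings(), similarity_threshold=0.76
)

# Cache LLM responses in-process so repeated identical calls cost nothing.
set_llm_cache(InMemoryCache())
```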

Security Considerations

  • Data privacy: Any text sent to the LLM or embedding service is processed outside of Nappai. Make sure you comply with your organization’s data‑handling policies.
  • Access control: Restrict who can configure the component to prevent accidental exposure of sensitive documents.