Vectara RAG
Vectara RAG is a component that lets you ask questions and get concise answers by searching a Vectara corpus, reranking the results, and summarizing them. It’s a one‑click way to turn your data into instant, AI‑powered insights.
How it Works
When you submit a question, Vectara RAG sends it to the Vectara API, which handles the following stages:
- Search – It looks for documents that match your query.
- Hybrid Search – You can blend keyword matching with embedding similarity to fine‑tune relevance.
- Reranking – The top results can be reordered using different reranker algorithms (MMR, multilingual, or none).
- Summarization – The best documents are fed to a summarizer that produces a short, readable answer in the language you choose.
All of this happens behind the scenes, so you just see the final answer in the dashboard.
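The Hybrid Search step can be pictured as a weighted blend of two relevance scores. The sketch below is illustrative only: it assumes a simple linear interpolation, which may not match what Vectara does internally, and the `hybrid_score` and `rank` helpers and the sample scores are hypothetical.

```python
def hybrid_score(keyword_score: float, embedding_score: float, factor: float) -> float:
    """Blend two relevance scores.

    factor = 0 uses only the embedding score, factor = 1 only the keyword score.
    Linear interpolation is an assumption made for illustration.
    """
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return factor * keyword_score + (1.0 - factor) * embedding_score


def rank(docs: list[dict], factor: float) -> list[dict]:
    """Return docs sorted from most to least relevant under the blended score."""
    return sorted(
        docs,
        key=lambda d: hybrid_score(d["keyword"], d["embedding"], factor),
        reverse=True,
    )


# Hypothetical scores for two documents.
docs = [
    {"id": "a", "keyword": 0.9, "embedding": 0.2},  # strong keyword match
    {"id": "b", "keyword": 0.1, "embedding": 0.8},  # strong semantic match
]
```

With a factor of 0 the semantically similar document ranks first; with a factor of 1 the keyword match does.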
Inputs
- Vectara Customer ID: Your unique Vectara account identifier.
- Vectara Corpus ID: The ID of the specific corpus you want to search.
- Vectara API Key: A secret key that authenticates your requests to Vectara.
- Search Query: The question or statement you want an answer for.
- Hybrid Search Factor: How much weight to give keyword search versus embedding similarity. 0 means only embeddings, 1 means only keywords.
- Metadata Filters: A filter string that limits results based on metadata attributes (e.g., `author:John`).
- Reranker Type: Choose how the retrieved results are reordered (MMR, multilingual, or none).
- Number of Results to Rerank: How many top results should be reranked (default 50).
- Diversity Bias: For MMR reranking, a value from 0 to 1 that encourages diverse answers.
- Max Results to Summarize: The maximum number of documents that will be summarized (default 7).
- Response Language: The language code for the answer (e.g., `eng`, `spa`, `auto`).
- Prompt Name: The summarization prompt to use. Growth customers use `vectara-summary-ext-24-05-sml`; Scale customers use the other prompts.
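To show how these inputs might map onto a single request, here is a minimal sketch that builds a query body. The JSON field names (`corpusKey`, `lexicalInterpolationConfig`, `summarizerPromptName`, and so on) are assumed to follow Vectara's v1 query API and should be checked against the current API reference. `RagInputs` and `build_query_payload` are hypothetical names, reusing the rerank default of 50 as the retrieval size is an assumption, and reranker settings are left out because this page does not give their identifiers.

```python
from dataclasses import dataclass


@dataclass
class RagInputs:
    """Hypothetical container mirroring the component inputs listed above."""
    customer_id: str
    corpus_id: str
    query: str
    hybrid_factor: float       # 0 = embeddings only, 1 = keywords only
    response_lang: str         # e.g. "eng", "spa", "auto"
    prompt_name: str           # e.g. "vectara-summary-ext-24-05-sml"
    metadata_filter: str = ""
    num_results: int = 50      # assumption: same as the rerank default
    max_summarized: int = 7    # default stated on this page


def build_query_payload(inputs: RagInputs) -> dict:
    """Build a request body; field names are assumed from the v1 query API."""
    if not 0.0 <= inputs.hybrid_factor <= 1.0:
        raise ValueError("hybrid_factor must be between 0 and 1")
    return {
        "query": [{
            "query": inputs.query,
            "start": 0,
            "numResults": inputs.num_results,
            "corpusKey": [{
                "customerId": inputs.customer_id,
                "corpusId": inputs.corpus_id,
                "metadataFilter": inputs.metadata_filter,
                "lexicalInterpolationConfig": {"lambda": inputs.hybrid_factor},
            }],
            "summary": [{
                "summarizerPromptName": inputs.prompt_name,
                "maxSummarizedResults": inputs.max_summarized,
                "responseLang": inputs.response_lang,
            }],
        }]
    }


payload = build_query_payload(RagInputs(
    customer_id="123", corpus_id="4", query="What changed in Q3?",
    hybrid_factor=0.02, response_lang="eng",
    prompt_name="vectara-summary-ext-24-05-sml",
))
```

The API key is deliberately not part of the body; it would travel in a request header and should come from secret storage.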
Outputs
- Answer: A `Message` object containing the final, summarized answer to your query.
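As a rough illustration of how a summary could end up in the Answer output, the sketch below pulls the summary text out of a response and wraps it. The response shape (`responseSet` → `summary` → `text`) is assumed from Vectara's v1 query API, and the `Message` class here is a hypothetical stand-in, not the dashboard's own type.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for the component's Message output."""
    text: str


def extract_answer(response: dict) -> Message:
    """Return the first summary text, or an empty Message if none is present."""
    try:
        text = response["responseSet"][0]["summary"][0]["text"]
    except (KeyError, IndexError, TypeError):
        text = ""
    return Message(text=(text or "").strip())
```

Returning an empty `Message` rather than raising keeps downstream components from failing when a query produces no summary; whether the real component behaves this way is not stated here.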
Usage Example
- Drag the Vectara RAG component onto your workflow.
- Enter your Vectara Customer ID, Corpus ID, and API Key.
- Type a Search Query like “What are the latest sales figures for Q3?”
- (Optional) Set Hybrid Search Factor to 0.02 to give a little weight to keyword matching.
- Click Run.
- The Answer output will appear in the next component or in the dashboard’s chat window.
Related Components
- OpenAI LLM – Use this after Vectara RAG to add further processing or custom logic.
- Vectara Indexer – Create or update the corpus that Vectara RAG searches.
- Chat UI – Display the answer in a conversational interface.
Tips and Best Practices
- Keep your API Key hidden; use environment variables or the dashboard’s secret storage.
- For highly specific queries, set Metadata Filters to narrow the search.
- If you need more diverse answers, increase Diversity Bias (but don’t exceed 1).
- Choose the Prompt Name that matches your customer tier to avoid errors.
- Adjust Hybrid Search Factor to balance keyword matching against semantic similarity: values near 0 favor embeddings, values near 1 favor exact keyword matches.
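To get a feel for what Diversity Bias does, here is a small sketch of textbook maximal marginal relevance (MMR). It is not Vectara's implementation: the scoring formula, the `mmr_rerank` helper, and the sample scores are assumptions chosen for illustration.

```python
def mmr_rerank(relevance: list[float], similarity: list[list[float]],
               diversity_bias: float, k: int) -> list[int]:
    """Greedy MMR: return indices of up to k items in selection order.

    Each step picks the candidate maximizing
    (1 - bias) * relevance - bias * (max similarity to already selected items).
    """
    if not 0.0 <= diversity_bias <= 1.0:
        raise ValueError("diversity_bias must be between 0 and 1")
    selected: list[int] = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return (1 - diversity_bias) * relevance[i] - diversity_bias * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical data: items 0 and 1 are near-duplicates, item 2 is different.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
print(mmr_rerank(relevance, similarity, diversity_bias=0.0, k=3))
print(mmr_rerank(relevance, similarity, diversity_bias=0.5, k=3))
```

With a bias of 0 the order follows relevance alone; at 0.5 the near-duplicate is pushed behind the less relevant but distinct item.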
Security Considerations
- The Vectara API Key is a secret; never expose it in public code or logs.
- Use the dashboard’s built‑in secret management to store the key securely.
- Ensure that only authorized users can edit the component’s inputs.
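If you call Vectara from your own scripts rather than through the dashboard, one way to follow the advice above is to read the key from the environment and mask it before logging. This is a minimal sketch; the variable name `VECTARA_API_KEY` and both helper functions are assumptions, not names defined by the component.

```python
import os


def load_api_key(env_var: str = "VECTARA_API_KEY", environ=None) -> str:
    """Read the key from the environment; fail loudly if it is missing."""
    env = os.environ if environ is None else environ
    key = env.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging `mask_secret(key)` instead of the key itself still lets you tell which key is in use without exposing it.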