Vectara RAG

Vectara RAG is a component that answers questions by searching a Vectara corpus, reranking the results, and summarizing them into a concise answer. Once it is configured, you can query your own data in natural language.

How it Works

When you submit a question, Vectara RAG sends it to the Vectara API, which handles it in the following stages:

  1. Search – It looks for documents that match your query.
  2. Hybrid Search – You can blend keyword matching with embedding similarity to fine‑tune relevance.
  3. Reranking – The top results can be reordered by a reranker (MMR or multilingual) or left in their original order (none).
  4. Summarization – The best documents are fed to a summarizer that produces a short, readable answer in the language you choose.

All of this happens behind the scenes, so you see only the final answer in the dashboard.
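
To make the hybrid search stage more concrete, here is a minimal sketch of how a single factor can blend a keyword score with an embedding score. It assumes a simple linear interpolation; the function names and the sample scores are hypothetical, and Vectara's internal scoring may differ.

```python
def blend_scores(keyword_score: float, embedding_score: float, factor: float) -> float:
    """Linear blend: factor=0 uses only embeddings, factor=1 uses only keywords."""
    if not 0.0 <= factor <= 1.0:
        raise ValueError("factor must be between 0 and 1")
    return factor * keyword_score + (1.0 - factor) * embedding_score


def rank_documents(docs, factor):
    """Order (doc_id, keyword_score, embedding_score) tuples by blended score."""
    scored = [(doc_id, blend_scores(kw, emb, factor)) for doc_id, kw, emb in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored]


# Hypothetical scores: "a" matches keywords well, "b" matches semantically.
docs = [("a", 0.9, 0.2), ("b", 0.1, 0.8), ("c", 0.5, 0.5)]

embeddings_only = rank_documents(docs, 0.0)
keywords_only = rank_documents(docs, 1.0)
```

With these sample scores, a factor of 0 ranks the semantically closest document first, while a factor of 1 favors the document with the strongest keyword match.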

Inputs

  • Vectara Customer ID: Your unique Vectara account identifier.
  • Vectara Corpus ID: The ID of the specific corpus you want to search.
  • Vectara API Key: A secret key that authenticates your requests to Vectara.
  • Search Query: The question or statement you want an answer for.
  • Hybrid Search Factor: How much weight to give keyword search versus embedding similarity. 0 means only embeddings, 1 means only keywords.
  • Metadata Filters: A filter expression that limits results based on metadata attributes (e.g., doc.author = 'John').
  • Reranker Type: Choose how the retrieved results are reordered (MMR, multilingual, or none).
  • Number of Results to Rerank: How many top results should be reranked (default 50).
  • Diversity Bias: For MMR reranking, a value from 0 to 1 that encourages diverse answers.
  • Max Results to Summarize: The maximum number of documents that will be summarized (default 7).
  • Response Language: The language code for the answer (e.g., eng, spa, auto).
  • Prompt Name: The summarization prompt to use. Growth customers can use only vectara-summary-ext-24-05-sml; the other prompts are available to Scale customers only.
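
The inputs above can be collected and sanity-checked before a run. The following is a rough sketch only: the class, its field names, and every default other than 50 and 7 are assumptions made for illustration, not the component's actual implementation.

```python
from dataclasses import dataclass

RERANKER_TYPES = {"mmr", "multilingual", "none"}


@dataclass
class VectaraRagSettings:
    """Hypothetical container mirroring the inputs listed above."""
    customer_id: str
    corpus_id: str
    api_key: str
    search_query: str
    hybrid_search_factor: float = 0.0   # assumed default
    metadata_filter: str = ""
    reranker_type: str = "none"         # assumed default
    rerank_k: int = 50                  # documented default
    diversity_bias: float = 0.0         # assumed default
    max_results_to_summarize: int = 7   # documented default
    response_language: str = "eng"      # assumed default
    prompt_name: str = "vectara-summary-ext-24-05-sml"

    def validate(self) -> list:
        """Return a list of human-readable problems (empty if none)."""
        problems = []
        for name in ("customer_id", "corpus_id", "api_key", "search_query"):
            if not getattr(self, name).strip():
                problems.append(f"{name} is required")
        if not 0.0 <= self.hybrid_search_factor <= 1.0:
            problems.append("hybrid_search_factor must be between 0 and 1")
        if self.reranker_type not in RERANKER_TYPES:
            problems.append("reranker_type must be mmr, multilingual, or none")
        if self.reranker_type == "mmr" and not 0.0 <= self.diversity_bias <= 1.0:
            problems.append("diversity_bias must be between 0 and 1")
        if self.rerank_k < 1:
            problems.append("rerank_k must be at least 1")
        if self.max_results_to_summarize < 1:
            problems.append("max_results_to_summarize must be at least 1")
        return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every misconfigured input at once.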

Outputs

  • Answer: A Message object containing the final, summarized answer to your query.

Usage Example

  1. Drag the Vectara RAG component onto your workflow.
  2. Enter your Vectara Customer ID, Corpus ID, and API Key.
  3. Type a Search Query like “What are the latest sales figures for Q3?”
  4. (Optional) Set Hybrid Search Factor to 0.02 to give a little weight to keyword matching.
  5. Click Run.
  6. The Answer output will appear in the next component or in the dashboard’s chat window.

Related Components

  • OpenAI LLM – Use this after Vectara RAG to add further processing or custom logic.
  • Vectara Indexer – Create or update the corpus that Vectara RAG searches.
  • Chat UI – Display the answer in a conversational interface.

Tips and Best Practices

  • Keep your API Key hidden; use environment variables or the dashboard’s secret storage.
  • For highly specific queries, set Metadata Filters to narrow the search.
  • If you need more diverse answers, increase Diversity Bias (but don’t exceed 1).
  • Choose the Prompt Name that matches your customer tier to avoid errors.
  • Adjust Hybrid Search Factor to balance exact keyword matching against semantic similarity; lower values rely more on embeddings.
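
To show how Diversity Bias changes the ordering, here is a small sketch of maximal marginal relevance (MMR) in its generic textbook form. The function, the document IDs, and the scores are all hypothetical, and the reranker Vectara runs may differ in its details.

```python
def mmr_rerank(relevance, similarity, diversity_bias, k):
    """Greedy MMR: pick the document with the best trade-off between
    relevance and redundancy with the documents already selected.

    relevance: dict mapping doc_id -> relevance score
    similarity: function (doc_id, doc_id) -> similarity in [0, 1]
    """
    remaining = set(relevance)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(doc):
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((similarity(doc, s) for s in selected), default=0.0)
            return (1 - diversity_bias) * relevance[doc] - diversity_bias * redundancy

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: "a" and "a2" are near-duplicates, "b" is different.
relevance = {"a": 0.9, "a2": 0.85, "b": 0.6}
pair_sim = {frozenset(("a", "a2")): 0.95}


def similarity(x, y):
    return pair_sim.get(frozenset((x, y)), 0.1)


no_diversity = mmr_rerank(relevance, similarity, 0.0, 3)
some_diversity = mmr_rerank(relevance, similarity, 0.5, 3)
```

With a bias of 0 the near-duplicate keeps second place; with a bias of 0.5 the less relevant but distinct document moves ahead of it.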

Security Considerations

  • The Vectara API Key is a secret; never expose it in public code or logs.
  • Use the dashboard’s built‑in secret management to store the key securely.
  • Ensure that only authorized users can edit the component’s inputs.
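
If you handle the key in your own scripts, a sketch like the following keeps it out of source code and logs. The environment variable name VECTARA_API_KEY and both helper functions are assumptions chosen for illustration.

```python
import os


def load_vectara_api_key(env_var: str = "VECTARA_API_KEY") -> str:
    """Read the key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable")
    return key


def mask_secret(secret: str, visible: int = 4) -> str:
    """Hide all but the last few characters so the value is safe to log."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging mask_secret(key) rather than the key itself still lets you confirm which key is in use without revealing it.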