Multi Vector Retriever
This component helps you retrieve important information from your documents by using a two-storage approach. It connects a Parent Vector Store, which organizes documents using metadata and tags, with a Child Vector Store, which holds the full text of those documents. When you search, the component first finds the relevant references in the Parent store and then fetches the actual text from the Child store, ensuring you get accurate and complete results.
How it Works
The Multi Vector Retriever bridges the two stores in three steps:
- Reference Search: You provide a search query (a question or text). The component looks at the Parent Vector Store, which acts like a detailed index or catalog. It uses metadata to find documents that match your query.
- Text Retrieval: Once the Parent store identifies the relevant documents, the component uses the IDs to look up the full text in the Child Vector Store.
- Return Results: The component returns the full text of the matched documents in a format ready for further automation or display.
Separating the organization of data from the storage of text can make searches faster and more precise.
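The three steps above can be sketched in plain Python. This is only an illustration of the two-stage lookup, not the component's actual implementation: the names `Document`, `score`, `multi_vector_search`, and the `doc_id` key are hypothetical, and a simple word-overlap score stands in for the embedding similarity a real vector store would use.

```python
from dataclasses import dataclass


@dataclass
class Document:
    text: str
    metadata: dict


def score(query: str, doc: Document) -> int:
    """Toy relevance score: count of query words present in the document text.

    Assumption: a real vector store compares embeddings instead.
    """
    query_words = set(query.lower().split())
    return len(query_words & set(doc.text.lower().split()))


def multi_vector_search(query, parent_docs, child_docs, id_key="doc_id", k=5):
    # Step 1 (Reference Search): rank parent entries and keep the top k matches.
    ranked = sorted(parent_docs, key=lambda d: score(query, d), reverse=True)
    hits = [d for d in ranked if score(query, d) > 0][:k]

    # Step 2 (Text Retrieval): follow each hit's ID into the child store.
    children_by_id = {d.metadata[id_key]: d for d in child_docs}
    results = []
    for hit in hits:
        child = children_by_id.get(hit.metadata.get(id_key))
        if child is not None:
            results.append(child)

    # Step 3 (Return Results): full-text documents, best match first.
    return results


# Hypothetical sample data for illustration only.
parents = [
    Document("marketing contract 2023", {"doc_id": "c1"}),
    Document("engineering hiring plan", {"doc_id": "c2"}),
]
children = [
    Document("Full text of the marketing contract.", {"doc_id": "c1"}),
    Document("Full text of the hiring plan.", {"doc_id": "c2"}),
]

results = multi_vector_search("marketing contract", parents, children)
print([d.metadata["doc_id"] for d in results])
```

Note that a parent hit whose ID has no counterpart in the child store is silently skipped in this sketch, which mirrors why a mismatched ID Key leads to empty results.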
Inputs
Input Fields
The following fields are available to configure this component:
- ID Key: Specifies the field name used to link documents in the Parent store to documents in the Child store. This ensures the system knows which metadata matches which text.
- Search Query: The question or text you want to search for. The component will look for documents that match this query.
- Parent Vector Store: The main storage that holds metadata and references. This store helps filter and guide the search to find relevant documents quickly.
- Vector Store: The secondary storage that holds the actual text content of the documents. The component retrieves the full text from here once relevant references are found.
- Number of Results: The maximum number of documents to return. The default is 5, which works well for most cases.
Outputs
- Retriever: A configured retriever object that can be used in other parts of your workflow to perform searches.
- Results: A list of the documents found, returned in a format ready to be used by other nodes in your automation.
Output Data Example (JSON)
```json
[
  {
    "text": "This is the full content of the first relevant document found during the search.",
    "metadata": {
      "id": "doc_001",
      "source": "reports/quarterly.pdf"
    }
  },
  {
    "text": "Details and insights from the second retrieved document.",
    "metadata": {
      "id": "doc_002",
      "source": "database/contracts.csv"
    }
  }
]
```
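If the Results output is exported as JSON in the shape shown in the example, a downstream script could read it roughly as follows. This is a minimal sketch: the helper names `join_texts` and `list_sources` are hypothetical, and it assumes every item carries a `text` field and a `metadata.source` field, as in the example.

```python
import json

# The example output from above, embedded as a string for illustration.
raw = """
[
  {
    "text": "This is the full content of the first relevant document found during the search.",
    "metadata": {"id": "doc_001", "source": "reports/quarterly.pdf"}
  },
  {
    "text": "Details and insights from the second retrieved document.",
    "metadata": {"id": "doc_002", "source": "database/contracts.csv"}
  }
]
"""

results = json.loads(raw)


def join_texts(items, separator="\n\n"):
    """Concatenate the text of every result, e.g. to feed a summarizer."""
    return separator.join(item["text"] for item in items)


def list_sources(items):
    """Collect the source of each result, in result order."""
    return [item["metadata"]["source"] for item in items]


print(list_sources(results))
print(join_texts(results))
```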
Connectivity
To use this component effectively, you need to connect it to two Vector Stores:
- Parent Vector Store: Connect a vector store node that contains metadata and structured references here.
- Vector Store: Connect a vector store node that contains the full text content of your documents here.
These connections allow the component to cross-reference metadata with text. You can also connect the Retriever output to other nodes that accept retriever objects, or connect the Results output to nodes like Text Processors, Summarizers, or Chat Bots to use the retrieved information.
Usage Example
Scenario: Searching for Contract Details
Imagine you have a database of customer contracts. You want to find the text of contracts related to the “Marketing” department.
- Connect Stores:
  - Connect your metadata store to Parent Vector Store.
  - Connect your full text store to Vector Store.
- Configure Link:
  - Set ID Key to contract_id (or whichever field links your metadata to your text).
- Search:
  - Enter “Marketing department contracts” in Search Query.
- Result:
  - The component returns the full text of contracts from the Marketing department, which you can then pass to a “Summarize Text” node to get a quick overview.
Important Notes
🔒 Protect Sensitive Documents: Ensure that the parent and child vector stores are secured with proper access controls. Search results expose document content, so only authorized users should have access.
⚠️ Component is in Development: This MultiVectorRetriever is flagged as a development component. Some features may change or be unsupported in future releases, so use it with caution in production.
📋 Provide Both Parent and Child Vector Stores: The component needs a parent vector store containing the metadata and references and a child vector store containing the full text of the documents. Both must be connected and properly indexed before use.
📋 Match Child Document ID Field: The ID Key must match the field name used in the child vector store to identify documents. If they differ, the retriever will not find any related documents.
💡 Use Specific, Contextual Queries: To improve relevance and reduce noise, craft detailed search queries rather than generic keywords. This helps the retriever match documents more accurately.
💡 Set a Reasonable Number of Results: The default of 5 results balances speed and usefulness. Setting this number too high can slow down searches or consume more memory.
⚙️ Advanced Number of Results Option: Number of Results is an advanced input. Change it only when you need more results than the default, since a misconfigured value may lead to unexpected performance issues.
ℹ️ Search Results Returned as Data Objects: The search output delivers results as a list of Data objects, which can be passed directly to other components or displayed in the UI.
Tips and Best Practices
- Always verify that your ID Key matches the structure of your vector stores to ensure documents are linked correctly.
- Use the Retriever output if you need to reuse the same search configuration across multiple parts of your workflow.
- Ensure both vector stores are indexed before running a search to avoid missing results.
- If you experience slow performance, reduce the Number of Results or check the complexity of your vector store indexing.
- Monitor your search queries to ensure they contain enough context for accurate matching.
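The first tip can be checked before running a search. The sketch below is a hypothetical helper, not part of the component: it assumes you can export each store's metadata as a list of plain dictionaries, and it reports parent entries that would fail to link to any child document under a given ID Key.

```python
def find_unlinked(parent_records, child_records, id_key):
    """Return (orphan_ids, records_missing_key) for the parent store.

    orphan_ids: parent IDs with no matching record in the child store.
    records_missing_key: parent records that lack the id_key field entirely.
    """
    child_ids = {r[id_key] for r in child_records if id_key in r}
    records_missing_key = [r for r in parent_records if id_key not in r]
    orphan_ids = [
        r[id_key]
        for r in parent_records
        if id_key in r and r[id_key] not in child_ids
    ]
    return orphan_ids, records_missing_key


# Hypothetical metadata exports for illustration only.
parent_meta = [{"contract_id": "c1"}, {"contract_id": "c3"}, {"id": "c9"}]
child_meta = [{"contract_id": "c1"}, {"contract_id": "c2"}]

orphans, missing = find_unlinked(parent_meta, child_meta, "contract_id")
print("Parent IDs with no child document:", orphans)
print("Parent records without the ID Key:", missing)
```

Two empty lists suggest that every parent entry can be resolved to a child document with the chosen ID Key.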
Security Considerations
This component retrieves full text content from vector stores based on your search queries. Ensure that your vector stores are configured with appropriate access controls and security policies. Only authorized users should have access to this component, as the search operation may expose sensitive document data.