Milvus Vector Store

The Milvus Vector Store component acts as a smart memory for your Nappai automation workflows. It allows you to securely save documents, images, or other data as mathematical representations (vectors) and quickly find the most relevant matches when needed. Instead of searching by exact keywords, this component understands the meaning behind your queries, making it ideal for retrieving related information from large datasets.

How it Works

This component connects directly to a Milvus database service. When you use it, the component automatically converts your data into a format the database can understand, organizes it into a named collection (like a labeled folder), and stores it efficiently. When you run a search, it compares your query against the stored data using mathematical similarity to return the most relevant results. The component handles all the complex database communication in the background, so you can focus on managing and finding your information easily.
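
As a rough illustration of what "mathematical similarity" means here, the sketch below ranks a few stored vectors against a query vector using cosine similarity. The three-dimensional vectors and document names are hypothetical, and the metric Milvus actually applies depends on how the collection's index is configured, so treat this as a conceptual sketch rather than the component's implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; real embedding models
# produce vectors with hundreds or thousands of dimensions.
stored = {
    "leave_policy": [0.9, 0.1, 0.0],
    "expense_policy": [0.1, 0.9, 0.0],
    "office_map": [0.0, 0.1, 0.9],
}

def top_k(query_vector, k=2):
    """Return the k stored ids most similar to the query, best first."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in stored.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

ranked = top_k([0.8, 0.2, 0.0])
for doc_id, score in ranked:
    print(f"{doc_id}: {score:.3f}")
```

The query vector points in nearly the same direction as the "leave_policy" vector, so that document ranks first even though no keywords are compared.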

Connection & Credentials

This component requires configuring a credential in the Nappai panel before interacting with the external service:

  1. Go to the Credentials section in your Nappai panel.
  2. Create a new credential of the type Milvus API and fill in the required fields (Milvus URL, Username, Password, and API Key).
  3. In your workflow, select the saved credential in the Credential input field of this node.
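
For readers who also connect to Milvus from their own scripts, the sketch below shows one way the credential fields could be assembled into a connection dictionary. The key names (uri, user, password, token) follow the pymilvus connection convention; the helper name build_connection_args and the choice to prefer the API key over the username and password are assumptions for illustration, not a description of how Nappai stores or uses the credential.

```python
def build_connection_args(url, username="", password="", api_key=""):
    """Assemble a connection dictionary from the credential fields.

    Hypothetical helper: the mapping from Nappai credential fields to
    these keys is an assumption made for this example.
    """
    if not url:
        raise ValueError("Milvus URL is required")
    args = {"uri": url}
    if api_key:
        # Token-based authentication (assumed to take precedence here).
        args["token"] = api_key
    elif username:
        args["user"] = username
        args["password"] = password
    return args

# Local server without authentication.
print(build_connection_args("http://localhost:19530"))
```

In a real script, read secrets from environment variables or a secrets manager instead of writing them into the source file.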

Operations

This component offers several operations that you can select based on what you need to do. You can only use one operation at a time:

  • Add: Stores new documents or data into your Milvus collection.
  • Search: Finds and returns the most relevant documents based on a text query or vector input.
  • Retriever: Prepares the stored data for direct use by other components in your workflow, such as language models.

To use the component, first select the operation you need in the “Operation” field.

Inputs

Input Fields

The following fields are available to configure this component. Each field may be visible in different operations:

  • Embedding: Reference to the function or model that converts your raw data into searchable vectors.
    • Visible in: Add, Search, Retriever
  • Ingest Data: The actual documents, text, or data you want to save into the database.
    • Visible in: Add
  • Operation: Select which task this component should perform (Add, Search, or Retriever).
    • Visible in: Add, Search, Retriever
  • Collection Description: A brief label or description for your data collection.
    • Visible in: Add, Search, Retriever
  • Collection Name: A unique name for the folder or space where your data will be stored.
    • Visible in: Add, Search, Retriever
  • Other Connection Arguments: Additional settings to customize how the component connects to the database.
    • Visible in: Add, Search, Retriever
  • Consistency Level: Controls how fresh the data appears during searches, balancing speed and accuracy.
    • Visible in: Add, Search, Retriever
  • Drop Old Collection: If enabled, deletes any existing collection with the same name before saving new data.
    • Visible in: Add, Search, Retriever
  • Index Parameters: Advanced settings that control how data is organized for faster searching.
    • Visible in: Add, Search, Retriever
  • Number of Results: Specifies how many matches the component should return when searching.
    • Visible in: Add, Search, Retriever
  • Primary Field Name: The name of the main field used to identify each stored document.
    • Visible in: Add, Search, Retriever
  • Search Parameters: Custom rules that refine how the search matches your query to stored data.
    • Visible in: Add, Search, Retriever
  • Search Query: The text or data you want to search for. If this field is empty or contains only whitespace, the component returns an empty list without searching.
    • Visible in: Search
  • Text Field Name: The name of the field where the original text content is stored.
    • Visible in: Add, Search, Retriever
  • Timeout: The maximum time to wait for the database to complete an operation before stopping.
    • Visible in: Add, Search, Retriever
  • Connection URI: The address where your Milvus database server is located.
    • Visible in: Add, Search, Retriever
  • Vector Field Name: The name of the field specifically used to store the mathematical representations of your data.
    • Visible in: Add, Search, Retriever
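
To make the relationship between these fields more concrete, the sketch below groups them into a small configuration object and turns them into keyword arguments named after the parameters of the langchain-milvus Milvus class (collection_name, connection_args, consistency_level, drop_old, and so on). The class MilvusNodeConfig is hypothetical, the field name defaults "pk", "text", and "vector" are assumptions, and the exact mapping Nappai performs internally is not documented here. Only the default URI, the default of 4 results, and the default 'Session' level come from this page.

```python
from dataclasses import dataclass, field
from typing import Optional

# Consistency levels defined by Milvus.
CONSISTENCY_LEVELS = {"Strong", "Bounded", "Session", "Eventually"}

@dataclass
class MilvusNodeConfig:
    """Hypothetical container mirroring the input fields listed above."""
    collection_name: str
    collection_description: str = ""
    uri: str = "http://localhost:19530"   # default URI from this page
    consistency_level: str = "Session"    # default level from this page
    drop_old: bool = False
    number_of_results: int = 4            # default from this page
    primary_field: str = "pk"             # assumed default
    text_field: str = "text"              # assumed default
    vector_field: str = "vector"          # assumed default
    timeout: Optional[float] = None
    index_params: Optional[dict] = None
    search_params: Optional[dict] = None
    other_connection_args: dict = field(default_factory=dict)

    def store_kwargs(self) -> dict:
        """Keyword arguments for building the vector store object."""
        if self.consistency_level not in CONSISTENCY_LEVELS:
            raise ValueError(f"Unknown consistency level: {self.consistency_level}")
        return {
            "collection_name": self.collection_name,
            "collection_description": self.collection_description,
            "connection_args": {"uri": self.uri, **self.other_connection_args},
            "consistency_level": self.consistency_level,
            "index_params": self.index_params,
            "search_params": self.search_params,
            "drop_old": self.drop_old,
            "primary_field": self.primary_field,
            "text_field": self.text_field,
            "vector_field": self.vector_field,
            "timeout": self.timeout,
        }

config = MilvusNodeConfig(collection_name="HR_Policies")
kwargs = config.store_kwargs()
print(kwargs["connection_args"])
```

Number of Results is not part of these arguments because it applies at search time, where it would typically be passed as the number of matches to return.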

Outputs

After running the component, you will receive the following outputs that can be connected to other nodes in your workflow:

  • Retriever: A ready-to-use search interface that allows other components to query your stored data seamlessly.
  • Results: A list of the documents or data chunks that matched your search query, along with relevance scores.
  • Vector Store: The complete storage object, which can be passed to other components that need to read or modify the stored vectors.

Output Data Example (JSON)

    {
      "results": [
        {
          "id": "doc_001",
          "text": "This is a retrieved document chunk related to your query.",
          "score": 0.92,
          "metadata": {"source": "example.pdf", "page": 3}
        },
        {
          "id": "doc_002",
          "text": "Another relevant piece of information found in your data store.",
          "score": 0.85,
          "metadata": {"source": "example.txt", "page": 1}
        }
      ],
      "status": "success",
      "total_matches": 2
    }
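
A downstream step might want to keep only the strongest matches from a payload like the one above. The following sketch parses that example and filters it by score. It assumes that a higher score means a closer match, as the example suggests; with distance-based metrics the ordering can be reversed. The function name filter_by_score and the 0.9 threshold are hypothetical.

```python
import json

# The example payload from this page, written with plain ASCII quotes.
raw = """
{
  "results": [
    {"id": "doc_001",
     "text": "This is a retrieved document chunk related to your query.",
     "score": 0.92,
     "metadata": {"source": "example.pdf", "page": 3}},
    {"id": "doc_002",
     "text": "Another relevant piece of information found in your data store.",
     "score": 0.85,
     "metadata": {"source": "example.txt", "page": 1}}
  ],
  "status": "success",
  "total_matches": 2
}
"""

def filter_by_score(response, min_score):
    """Keep only results whose score is at least min_score."""
    return [item for item in response["results"] if item["score"] >= min_score]

payload = json.loads(raw)
strong = filter_by_score(payload, 0.9)
for item in strong:
    print(item["id"], item["metadata"]["source"], item["score"])
```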

Connectivity

This component is designed to sit in the middle of data processing workflows. It typically receives documents from sources like web scrapers, document loaders, or API connectors via the Add operation. You then connect the Retriever or Results output to language models, chatbots, or data analysis tools to provide them with context. For storage operations, it pairs with output fields from text processors or file parsers. When searching, it connects to query inputs from user prompts or scheduling triggers to fetch relevant information on demand.

Usage Example

Scenario: Building a Company Knowledge Base Search

  1. Add Operation: Connect your HR document processor to this component. Set the Operation to Add, choose a Collection Name like HR_Policies, and input your employee handbook PDFs into the Ingest Data field. The component will organize and store these documents.
  2. Search Operation: Later, connect a customer support chat interface to this component. Switch the Operation to Search, type or pass a question like “What is the parental leave policy?” into the Search Query field, and set Number of Results to 3. The component will return the most relevant policy excerpts.
  3. Retriever Operation: Connect this component directly to an AI language model. Use the Retriever operation to let the AI automatically fetch supporting documents when generating answers, ensuring responses are accurate and based on your stored knowledge.
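
The three steps above can be pictured with a toy, in-memory stand-in for the vector store. This sketch does not talk to Milvus and does not use a real embedding model: the word-count "embedding", the class ToyVectorStore, and the sample policy texts are all made up for illustration. It only shows how Add, Search, and Retriever relate to each other.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (a real workflow uses the Embedding input)."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyVectorStore:
    def __init__(self):
        self.docs = []  # list of (text, vector) pairs

    def add(self, texts):
        """Add operation: embed and store each text."""
        for text in texts:
            self.docs.append((text, embed(text)))

    def search(self, query, k=4):
        """Search operation: return the k most similar stored texts."""
        query_vector = embed(query)
        ranked = sorted(self.docs, key=lambda doc: similarity(query_vector, doc[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def as_retriever(self, k=4):
        """Retriever operation: hand other steps a callable that runs searches."""
        return lambda query: self.search(query, k)

store = ToyVectorStore()
store.add([
    "Parental leave policy: employees receive paid leave after a birth or adoption.",
    "Expense policy: submit receipts within the reimbursement window.",
    "Office hours are posted on the intranet.",
])
hits = store.search("What is the parental leave policy?", k=1)
retriever = store.as_retriever(k=2)
print(hits[0])
print(retriever("expense receipts")[0])
```

A language model step would call the retriever with the user's question and place the returned texts into its prompt as context.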

Important Notes

🔒 Secure Password Handling 🔴 Passwords entered in the Connection Password field are stored securely by the component. Avoid exposing this field in shared or public environments.

⚠️ Default Result Count 🟢 The default number of search results is 4. Increase this value if you need more results, but be aware that larger values may increase response time.

📋 Milvus Server Needed 🟡 You must have a Milvus instance running and reachable at the URI you provide. The default URI is http://localhost:19530.

📋 langchain-milvus Dependency 🟡 The component uses the external package langchain-milvus. Install it with pip install langchain-milvus before using this component.

💡 Set Appropriate Consistency Level 🟢 Choose a consistency level that balances latency and data freshness. For most use cases, the default ‘Session’ level offers a good trade‑off.

⚙️ Use Unique Collection Names 🟡 Give each project a distinct collection name to prevent accidental data overlap. Reusing a name can mix unrelated documents in the same collection.

⚙️ Drop Old Collection Caution 🔴 Enabling Drop Old Collection will delete any existing collection with the same name. Use this option only if you are sure the old data is no longer needed.

⚙️ Advanced Index and Search Parameters 🟡 Index and search parameters are advanced settings. Incorrect values can degrade performance or accuracy; test them carefully on a small dataset first.

ℹ️ Automatic Document Ingestion 🟢 The component adds any documents you provide to the collection each time it runs. Running it multiple times with the same documents may create duplicate entries unless the collection is cleared first.

ℹ️ Empty Search Query Returns No Results 🟢 If the Search Query field is empty or contains only whitespace, the component will return an empty list rather than performing a search.
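
The last two notes can be handled upstream of the component. The sketch below shows one possible pre-processing step: deriving a deterministic ID from each document's text so that repeats can be skipped before ingestion, and checking that a query is non-blank before a search is attempted. The helper names are hypothetical, and whether your workflow can track previously stored IDs is an assumption.

```python
import hashlib

def content_id(text):
    """Deterministic id from the document text (first 16 hex chars of SHA-256)."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()[:16]

def new_documents(texts, already_stored_ids):
    """Return (id, text) pairs not stored before, skipping repeats within the batch."""
    seen = set(already_stored_ids)
    fresh = []
    for text in texts:
        doc_id = content_id(text)
        if doc_id not in seen:
            seen.add(doc_id)
            fresh.append((doc_id, text))
    return fresh

def should_search(query):
    """Mirror the note above: blank or whitespace-only queries are not searched."""
    return bool(query and query.strip())

batch = ["Leave policy text", "Expense policy text", "Leave policy text"]
first_run = new_documents(batch, already_stored_ids=set())
stored_ids = {doc_id for doc_id, _ in first_run}
second_run = new_documents(batch, already_stored_ids=stored_ids)
print(len(first_run), len(second_run))
```

Hashing the text means an edited document gets a new ID, so this approach detects exact repeats only.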

Tips and Best Practices

  • Always assign a unique Collection Name to separate different projects or datasets to avoid data mixing.
  • Use the Add operation first to store your documents, then switch to Search when you need to retrieve them.
  • Start with a low Number of Results value to keep workflows fast, then increase it only if you need more context.
  • Test the Drop Old Collection feature on a small copy of your data first to ensure you don’t lose important information.
  • Keep your Milvus server address (Connection URI) up to date. If the server moves or the network changes, an outdated address will prevent the component from reaching the database.

Security Considerations

This component handles sensitive connection details and stores valuable data. Ensure that your Milvus server is configured with proper access controls and firewalls. The credential fields use masked inputs to prevent screen exposure, but avoid sharing workflow snapshots that contain these fields. Regularly rotate your API keys and passwords, and always run the component in a trusted environment to protect your stored vectors and retrieval logs.