Milvus

Milvus is a fast, scalable vector database that lets you store and search large collections of text embeddings. In Nappai, the Milvus component lets you add documents, search for similar ones, or create a retriever that can be used in other parts of your workflow.

How it Works

When you use the Milvus component, it connects to a Milvus server (by default at http://localhost:19530).

Add: The component takes your documents, converts them into embeddings with the chosen embedding model, and stores them in a collection.
Search: It sends a query string to Milvus, which finds the most similar embeddings and returns the matching documents.
Retriever: It builds a retriever object that can be plugged into other components (e.g., a question‑answering chain) to fetch relevant documents on demand.

The component handles all the low‑level details: creating the collection, setting consistency levels, indexing, and cleaning up old collections if you choose to drop them.
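To make the three operations concrete, here is a minimal in-memory sketch in Python. It is not Milvus and not Nappai's implementation: `toy_embed`, `ToyVectorStore`, and `as_retriever` are hypothetical names, and the letter-frequency vector is only a stand-in for a real embedding model. The sketch illustrates the flow only: embed on add, rank by similarity on search, and wrap search in a reusable retriever.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: a 26-dim letter-frequency vector."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return [float(counts.get(chr(ord("a") + i), 0)) for i in range(26)]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory illustration of Add, Search, and Retriever (not a Milvus client)."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self.embed = embed
        self.rows: List[Tuple[List[float], str]] = []

    def add(self, documents: List[str]) -> None:
        # Add: convert each document to a vector and store both together.
        for doc in documents:
            self.rows.append((self.embed(doc), doc))

    def search(self, query: str, k: int = 4) -> List[str]:
        # Search: embed the query, then return the k most similar documents.
        query_vec = self.embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(query_vec, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]

    def as_retriever(self, k: int = 4) -> Callable[[str], List[str]]:
        # Retriever: a callable other code can invoke on demand.
        return lambda query: self.search(query, k=k)


store = ToyVectorStore(toy_embed)
store.add(["vector databases store embeddings", "cats sleep all day", "zzz"])
retriever = store.as_retriever(k=1)
top = retriever("embedding vectors in a database")
```

In the real component, the embedding model and the similarity search are supplied by the selected Embedding input and the Milvus server respectively; the sketch replaces both with toy versions so it can run without any services.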

Operations

This component offers three operations. Only one operation can be active at a time:

Add: Store new documents in the Milvus collection.
Search: Find documents that are most similar to a given query string.
Retriever: Create a retriever object that can be used by other components to fetch relevant documents.

To use the component, first select the operation you need in the “Operation” field.

Inputs

Embedding: The embedding model that turns text into vectors.
- Visible in: Add, Search, Retriever
Ingest Data: The documents you want to add to the collection.
- Visible in: Add
Operation: The action the component should perform (Add, Search, or Retriever).
- Visible in: Add, Search, Retriever
Collection Description: A short description of the collection.
- Visible in: Add, Search, Retriever
Collection Name: The name of the Milvus collection.
- Visible in: Add, Search, Retriever
Other Connection Arguments: Extra key/value pairs for the Milvus connection.
- Visible in: Add, Search, Retriever
Consistency Level: How up to date reads must be relative to recent writes. Stricter levels return fresher data at the cost of higher latency.
- Visible in: Add, Search, Retriever
Drop Old Collection: If checked, the existing collection with the same name will be deleted before adding new data.
- Visible in: Add, Search, Retriever
Index Parameters: Settings that control how Milvus indexes the vectors.
- Visible in: Add, Search, Retriever
Number of Results: How many documents to return when searching.
- Visible in: Add, Search, Retriever
Connection Password: Password for the Milvus server (leave blank if none).
- Visible in: Add, Search, Retriever
Primary Field Name: The field that holds the unique ID for each document.
- Visible in: Add, Search, Retriever
Search Parameters: Extra options that tweak the search algorithm.
- Visible in: Add, Search, Retriever
Search Query: The text you want to search for. Leave empty to retrieve all documents.
- Visible in: Search
Text Field Name: The field that contains the raw text of each document.
- Visible in: Add, Search, Retriever
Timeout: Maximum time (in seconds) to wait for a Milvus operation.
- Visible in: Add, Search, Retriever
Connection URI: The address of the Milvus server.
- Visible in: Add, Search, Retriever
Vector Field Name: The field that stores the embedding vectors.
- Visible in: Add, Search, Retriever
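As an illustration of how the dictionary-valued inputs might be filled in, the sketch below builds example values for Index Parameters, Search Parameters, and Other Connection Arguments. The keys `index_type`, `metric_type`, and `params` (with `M`, `efConstruction`, and `ef` for an HNSW index) follow Milvus's index configuration. The specific numbers, the `db_name` entry, the helper `build_component_inputs`, and the grouping of the dictionaries are assumptions for illustration, not Nappai defaults.

```python
import json
from typing import Any, Dict

# Metric types commonly used for float vectors in Milvus.
SUPPORTED_METRICS = {"L2", "IP", "COSINE"}


def build_component_inputs(metric: str = "COSINE") -> Dict[str, Dict[str, Any]]:
    """Hypothetical helper: returns example values for the dictionary inputs."""
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"unsupported metric type: {metric!r}")
    return {
        # Index Parameters: an HNSW graph index; values are illustrative only.
        "index_params": {
            "index_type": "HNSW",
            "metric_type": metric,
            "params": {"M": 16, "efConstruction": 200},
        },
        # Search Parameters: the metric must match the one used by the index.
        "search_params": {"metric_type": metric, "params": {"ef": 64}},
        # Other Connection Arguments: assumed extra key/value pair.
        "connection_args": {"db_name": "default"},
    }


inputs = build_component_inputs()
print(json.dumps(inputs, indent=2))
```

Building both dictionaries from one `metric` argument keeps the index and search metric types in agreement, which avoids a common mismatch when the two are edited separately.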

Outputs

Retriever: A retriever object that can be used by other components to fetch relevant documents.
Results: The list of documents returned by a search operation.
Vector Store: The underlying Milvus vector store object (useful for advanced customizations).

Usage Example

Adding Documents

Set Operation to Add.
Provide an Embedding model (e.g., OpenAI embeddings).
Drag your documents into Ingest Data.
Configure Collection Name and optional Collection Description.
Click Run – the component will create the collection and store the embeddings.

Searching

Set Operation to Search.
Provide the same Embedding model used for adding.
Enter a Search Query (e.g., “machine learning trends”).
Optionally adjust Number of Results.
Click Run – the component returns the most similar documents in Results.

Related Components

Pinecone Vector Store – a cloud‑based vector database.
FAISS Vector Store – a local, in‑memory vector store.
OpenAI Embeddings – popular embedding model that can be paired with Milvus.
Retriever – generic retriever component that can consume the Milvus retriever output.

Tips and Best Practices

Use Drop Old Collection only when you want to start fresh; otherwise, keep the existing data.
Choose a Consistency Level that balances read latency against data freshness (e.g., “Session” for most use cases).
Keep Index Parameters tuned for your data size; default settings work well for small to medium collections.
Store the Connection URI and Password securely; avoid hard‑coding them in public workflows.
If you need to retrieve all documents, leave Search Query empty but set Number of Results high enough to cover the collection.
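One way to avoid hard-coding the connection details is to read them from environment variables before filling in the component. This is a minimal sketch: the variable names `MILVUS_URI` and `MILVUS_PASSWORD` and the helpers `load_milvus_settings` and `redact` are hypothetical, and Nappai's own secret storage may be the better option where it is available.

```python
import os
from typing import Dict, Mapping

# Default address mentioned in this document.
DEFAULT_URI = "http://localhost:19530"


def load_milvus_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read connection settings from the environment (variable names are assumptions)."""
    return {
        "uri": env.get("MILVUS_URI", DEFAULT_URI),
        "password": env.get("MILVUS_PASSWORD", ""),  # blank means no password
    }


def redact(settings: Dict[str, str]) -> Dict[str, str]:
    """Return a copy that is safe to log: the password is masked if present."""
    return {**settings, "password": "***" if settings["password"] else ""}


# Example with an explicit mapping so the sketch does not depend on the real environment.
settings = load_milvus_settings(
    {"MILVUS_URI": "http://milvus.internal:19530", "MILVUS_PASSWORD": "s3cret"}
)
print(redact(settings))
```

Passing the environment as a parameter keeps the helper easy to test and makes it clear where the values come from.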

Security Considerations

The Connection Password field should be treated as sensitive. Use Nappai’s secret storage to keep it encrypted.
Ensure the Connection URI points to a trusted Milvus instance, and avoid exposing that instance to the public internet without authentication, since doing so can allow unauthorized access.
When using Drop Old Collection, double‑check that you are not deleting valuable data accidentally.
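The last two checks can also be enforced in code before a workflow runs. The sketch below is an illustration under stated assumptions, not part of Nappai: `TRUSTED_HOSTS`, `is_trusted_uri`, and `confirm_drop` are hypothetical names, and the allowlist entries are placeholders to replace with your own hosts.

```python
from urllib.parse import urlparse

# Placeholder allowlist; replace with the hosts you actually trust.
TRUSTED_HOSTS = {"localhost", "127.0.0.1", "milvus.internal"}


def is_trusted_uri(uri: str) -> bool:
    """Accept only http(s) URIs whose host is on the allowlist."""
    parsed = urlparse(uri)
    return parsed.scheme in {"http", "https"} and parsed.hostname in TRUSTED_HOSTS


def confirm_drop(drop_old: bool, collection_name: str, typed_name: str) -> bool:
    """Allow a drop only when the caller retypes the exact collection name."""
    return drop_old and typed_name == collection_name


print(is_trusted_uri("http://localhost:19530"))
print(confirm_drop(True, "articles", "articles"))
```

Requiring the collection name to be retyped is a simple guard against ticking Drop Old Collection by accident on a collection that still holds valuable data.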