Cassandra

The Cassandra component lets you keep your documents and their embeddings in a Cassandra database and then find the most similar ones whenever you need them. It works with the rest of Nappai’s automation tools, so you can add data, search it, or hand it off to other parts of your workflow.

How it Works

When you use the component, Nappai connects to your Cassandra database using a Cassandra DB credential that you set up in the credentials section.

The Add operation takes the documents you give it, turns them into embeddings (using the Embedding you provide), and stores them in the table you specify.
The Search operation finds the stored documents that are most similar to a query you enter.
The Retriever operation gives you a reusable “retriever” object that can be used by other components to fetch relevant documents on demand.

All of this happens inside Nappai – you don’t need to run any code yourself. Just fill in the fields, pick an operation, and let the dashboard do the rest.
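
Although no code is required, it can help to picture what the three operations do. The Python sketch below is a conceptual illustration only, not Nappai's or Cassandra's implementation. The names `toy_embed`, `cosine`, and `ToyVectorStore` are hypothetical, a simple word-count vector stands in for a real embedding model, and an in-memory list stands in for the Cassandra table.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory illustration of the Add, Search, and Retriever operations."""

    def __init__(self, embed):
        self.embed = embed
        self.rows = []  # each row is (text, vector), like one table row

    def add(self, documents):
        # Add: embed every document and store it next to its vector.
        for text in documents:
            self.rows.append((text, self.embed(text)))

    def search(self, query, k=4):
        # Search: embed the query, score every row, return the k best texts.
        query_vector = self.embed(query)
        scored = [(cosine(query_vector, vec), text) for text, vec in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]

    def as_retriever(self, k=4):
        # Retriever: a reusable callable that other code can invoke later.
        return lambda query: self.search(query, k=k)


store = ToyVectorStore(toy_embed)
store.add([
    "cassandra stores rows in tables",
    "embeddings turn text into vectors",
    "bread needs flour and water",
])
top = store.search("how are vectors made from text", k=1)
retriever = store.as_retriever(k=2)
```

In this toy version, `top` contains the document about embeddings because it shares the most words with the query. A real embedding model compares meaning rather than exact words, but the add, score, and rank flow is the same idea.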

Operations

This component offers three operations. Select the one that matches what you need to do; only one operation can be active at a time:

Add: Store new documents and their embeddings in the Cassandra table.
Search: Find the most similar documents to a search query.
Retriever: Create a retriever object that can be used later to fetch relevant documents.

To use the component, first select the operation you need in the “Operation” field.

Inputs

Embedding: The model that turns text into vectors.
- Visible in: Add, Search, Retriever
Ingest Data: The documents you want to add to the database.
- Visible in: Add
Operation: Which of the three actions you want to perform.
- Visible in: Add, Search, Retriever
Batch Size: How many documents to process at once when adding.
- Visible in: Add, Search, Retriever
Search Body: Text terms that should be matched against the document body.
- Visible in: Add, Search, Retriever
Cluster arguments: Extra settings for the Cassandra connection.
- Visible in: Add, Search, Retriever
Enable Body Search: Turn on searching inside the document body. Must be enabled before the table is created.
- Visible in: Add, Search, Retriever
Keyspace: The keyspace (or namespace) where your table lives.
- Visible in: Add, Search, Retriever
Number of Results: How many documents to return when searching.
- Visible in: Add, Search, Retriever
Search Metadata Filter: Extra filters to narrow the search by metadata.
- Visible in: Add, Search, Retriever
Search Query: The text you want to search for. Leave empty to get all documents.
- Visible in: Search
Search Score Threshold: Minimum similarity score for results (used with “Similarity with score threshold”).
- Visible in: Add, Search, Retriever
Search Type: The algorithm used for searching.
- Visible in: Add, Search, Retriever
Setup Mode: How the table is created – “Sync”, “Async”, or “Off”.
- Visible in: Add, Search, Retriever
Table Name: The name of the table (or collection) where vectors are stored.
- Visible in: Add, Search, Retriever
TTL Seconds: Optional time‑to‑live for stored documents.
- Visible in: Add, Search, Retriever
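
To see how Number of Results, Search Score Threshold, and Search Metadata Filter can work together, consider the following Python sketch. It is an assumption-based illustration, not the component's actual logic: the function `select_results`, the shape of the candidate data, and the order in which the filters are applied are all hypothetical, and it assumes that a higher score means a closer match.

```python
def select_results(candidates, number_of_results,
                   score_threshold=None, metadata_filter=None):
    """Narrow scored candidates the way the search inputs describe.

    candidates: list of (score, document) pairs, where each document is a
    dict with "text" and "metadata" keys (a hypothetical shape).
    """
    kept = []
    for score, doc in candidates:
        # Search Metadata Filter: every requested key must match exactly.
        if metadata_filter and any(
            doc["metadata"].get(key) != value
            for key, value in metadata_filter.items()
        ):
            continue
        # Search Score Threshold: drop matches below the minimum score.
        if score_threshold is not None and score < score_threshold:
            continue
        kept.append((score, doc))
    # Number of Results: keep only the best-scoring documents.
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in kept[:number_of_results]]


candidates = [
    (0.91, {"text": "Q3 revenue summary", "metadata": {"team": "finance"}}),
    (0.55, {"text": "Office party notes", "metadata": {"team": "hr"}}),
    (0.78, {"text": "Budget forecast", "metadata": {"team": "finance"}}),
    (0.30, {"text": "Old expense policy", "metadata": {"team": "finance"}}),
]
results = select_results(
    candidates,
    number_of_results=5,
    score_threshold=0.5,
    metadata_filter={"team": "finance"},
)
result_texts = [doc["text"] for doc in results]
```

Here only two documents survive: the HR note fails the metadata filter and the old policy falls below the threshold, so fewer than the requested five results come back.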

Credential
This component requires a Cassandra DB credential.

In Nappai, go to the Credentials section and create a new Cassandra DB credential.

Provide the Cassandra database ID, Cassandra username, and Cassandra DB Token.

Back in the component, select that credential in the Credential field.
The credential supplies the database ID, username, and token needed to connect.

Outputs

Retriever: A retriever object that can be passed to other components to fetch relevant documents.
Results: The list of documents returned by a search.
Vector Store: The underlying Cassandra vector store object (useful for advanced integrations).

Usage Example

Adding Documents

Drag the Cassandra component onto the canvas.
Set Operation to Add.
Connect an Embedding component to the Embedding input.
Provide a list of documents in Ingest Data.
Fill in Keyspace and Table Name.
(Optional) Set Batch Size and TTL Seconds.
Click Run – the documents are now stored in Cassandra.

Searching for Similar Documents

Use the same component, but set Operation to Search.
Connect the same Embedding component.
Enter a Search Query (e.g., “machine learning trends”).
Choose Search Type (e.g., “Similarity”).
Set Number of Results to 5.
Run – the Results output will contain the top 5 matching documents.

Using a Retriever

Set Operation to Retriever.
Connect the Embedding component.
The Retriever output can now be fed into other components that accept a retriever, such as a question‑answering chain.
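
The hand-off described above can be pictured with a short Python sketch. It shows, under stated assumptions, how a downstream step might call a retriever to gather context for a question. Both `make_toy_retriever` and `build_prompt` are hypothetical helpers written for this illustration; the toy retriever ranks by shared words instead of embeddings and is not the object Nappai produces.

```python
def make_toy_retriever(documents, k=2):
    """Hypothetical retriever: ranks documents by words shared with the query."""
    def retrieve(query):
        words = set(query.lower().split())
        ranked = sorted(
            documents,
            key=lambda doc: len(words & set(doc.lower().split())),
            reverse=True,
        )
        return ranked[:k]
    return retrieve


def build_prompt(question, retriever):
    """Downstream step: fetch relevant documents, then assemble a prompt."""
    context = "\n".join(f"- {doc}" for doc in retriever(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


documents = [
    "ttl removes rows after a set time",
    "batch size controls upload chunks",
    "keyspaces group related tables",
]
retriever = make_toy_retriever(documents, k=1)
prompt = build_prompt("what does ttl do", retriever)
```

The point of the design is that the downstream step only needs something it can call with a query. It does not need to know which table, keyspace, or embedding sits behind the retriever.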

Related Components

Embedding – Generates the vectors that the Cassandra component stores.
Data – Provides raw documents that can be ingested.
Vector Store – General interface for vector databases; Cassandra is one implementation.

Tips and Best Practices

Enable Body Search early: If you plan to search inside document text, turn on “Enable Body Search” before creating the table.
Use small batch sizes for large uploads: A batch size of 16 is a good starting point; lower it if you hit timeouts.
Set a TTL if data is temporary: This automatically removes old documents and keeps the table lean.
Keep credentials secure: Store your Cassandra DB credential in Nappai’s credential manager and never paste the token into other fields, notes, or shared flows.
Test with a few documents first: Verify that the table is created correctly before ingesting thousands of records.
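
The batch size tip above can be illustrated with a minimal Python sketch. It shows one plausible way an upload could be split into batches. The helper `batched` is hypothetical and is not taken from Nappai, so treat it as a picture of the idea rather than the component's behavior.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


documents = [f"doc-{i}" for i in range(40)]
batch_lengths = [len(batch) for batch in batched(documents, 16)]
```

With 40 documents and a batch size of 16, the upload is split into batches of 16, 16, and 8. A smaller batch size means more round trips but less work per request, which is why it can help with timeouts.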

Security Considerations

Credentials are stored encrypted in Nappai’s credential store.
The component never exposes your database ID, username, or token in the UI.
Use role‑based access in Cassandra to limit who can read from or write to the table.