Neo4j

The Neo4j component acts as a bridge between your automation workflow and a Neo4j graph database. It allows you to either add new information to your database (ingestion) or search through existing data to find relevant documents (retrieval). Think of it as a smart filing cabinet that uses AI to understand the meaning behind text, making it easier to find exactly what you are looking for.

How it Works

This component connects to your Neo4j server using specific credentials. Once connected, it can perform two main types of actions depending on the operation you select:

  1. Adding Data (Ingestion): If you are adding new documents, the component takes the text, converts it into a mathematical representation (vector) using an AI model, and stores it in your Neo4j graph.
  2. Finding Data (Retrieval/Searching): If you are looking for information, the component takes your search query, converts it into a vector, and searches the Neo4j database for the most similar pieces of text. It returns the actual text content of the documents it finds.

It can also operate in a “Retriever” mode, which is optimized for quickly pulling existing data from the database without modifying it.
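
The ingestion and retrieval flow described above can be illustrated with a toy, in-memory sketch. This is not how the component is implemented: `embed` is a hypothetical bag-of-words stand-in for a real AI embedding model, and `ToyVectorStore` is a hypothetical stand-in for Neo4j. The sketch only shows the idea of converting text to vectors and ranking stored texts by similarity to a query.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for the database (assumption, not Neo4j)."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, texts: list[str]) -> None:
        # Ingestion: store each text together with its vector.
        for text in texts:
            self.rows.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Retrieval: embed the query, rank stored texts by similarity.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add(["Returns are accepted within 30 days.", "Our office opens at 9 am."])
top = store.search("when are returns accepted", k=1)
print(top)
```

A real embedding model captures meaning rather than exact word overlap, so it can match a query to a document even when they share no words.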

Connection & Credentials

This component requires configuring a credential in the Nappai panel before interacting with the external service:

  1. Go to the Credentials section in your Nappai panel.
  2. Create a new credential of the type Neo4j API and fill in the required fields (URL, Username, and Password).
  3. In your workflow, select the saved credential in the Credential input field of this node.

Operations

This component offers three operations that you can select based on what you need to do. Only one operation can be active at a time:

  • [Add]: Uses the provided data to create or update records in the Neo4j database. This is used when you want to teach the database new information.
  • [Search]: Looks for specific information in the database based on a query. You enter a question or keyword, and it returns matching documents.
  • [Retriever]: Connects to an existing index in Neo4j to make data available for other parts of your workflow. This is typically used when you want to feed data into other AI components for further processing.

To use the component, first select the operation you need in the “Operation” field.

Inputs

The following fields are available to configure this component. Each field may be visible in different operations:

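  • [Embedding]: The embedding model used to convert text into vectors. An embedding model is required.
    • Visible in: Add, Search, Retriever
  • [Ingest Data]: The documents or text to add to the database, typically connected from a document processor or file output.
    • Visible in: Add
  • [Operation]: The operation to perform: Add, Search, or Retriever.
    • Visible in: Add, Search, Retriever
  • [Embedding Node Property]: The name of the node property that stores the embedding vectors in your Neo4j nodes.
    • Visible in: Add, Search, Retriever
  • [Node Label]: The label of the nodes that hold your documents. It must exactly match the label used in your graph.
    • Visible in: Add, Search, Retriever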
  • [Search Query]: Enter the question or keywords to search for. An empty or blank query returns an empty list, not all documents.
    • Visible in: Search
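  • [Text Node Properties]: A comma-separated list of the node properties that contain the document text. The names must exactly match the properties in your graph.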
    • Visible in: Add, Search, Retriever

Outputs

The component provides data that can be connected to other parts of your workflow:

  • [Retriever]: This output allows you to connect the Neo4j database to other AI components that need to read data. It essentially makes the database “readable” by other tools in your flow.
  • [Results]: When using the Search operation, this output contains the actual text content of the documents found in the database. You can connect this to other components to summarize, analyze, or display this information.
  • [Vector Store]: This is a technical link that represents the connection to the Neo4j database. It is usually used internally by the system to ensure the connection remains active.

Output Data Example (JSON)

When you perform a search, the Results output typically looks like this:

```json
[
  {
    "page_content": "This is the text content of the first document found in Neo4j.",
    "metadata": {
      "source": "example_file.txt",
      "label": "Document"
    }
  },
  {
    "page_content": "This is the text content of the second document found.",
    "metadata": {
      "source": "another_file.pdf",
      "label": "Document"
    }
  }
]
```
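
If you process the Results output in your own code, a minimal sketch for pulling out the text and the source of each document might look like the following. It assumes the output is available as a JSON string with the shape shown in the example; the variable names are illustrative only.

```python
import json

# Sample payload copied from the example output shape (assumption: JSON string).
results_json = """
[
  {"page_content": "This is the text content of the first document found in Neo4j.",
   "metadata": {"source": "example_file.txt", "label": "Document"}},
  {"page_content": "This is the text content of the second document found.",
   "metadata": {"source": "another_file.pdf", "label": "Document"}}
]
"""

results = json.loads(results_json)

# Collect the text of each document.
texts = [doc["page_content"] for doc in results]

# Collect the source of each document, tolerating missing metadata.
sources = [doc.get("metadata", {}).get("source", "unknown") for doc in results]

for source, text in zip(sources, texts):
    print(f"{source}: {text}")
```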

Connectivity

  • Connects TO:
    • LLM or AI Model Components: You can connect the Results output to AI assistants to answer questions based on your Neo4j data.
    • Other Vector Store Components: The Retriever output can be connected to components that accept a retriever as their data source.
  • Connects FROM:
    • Document Processors: When using the Add operation, you typically connect text or file outputs from previous steps into the Ingest Data field.

Usage Example

Scenario: Building a Q&A Bot for Company Documents

  1. Setup: You create a Neo4j component and select the Add operation. You connect a file uploader component to the Ingest Data field to upload your company’s PDF documents. You also connect an AI embedding model to the Embedding field. This teaches the database your documents.
  2. Search: Later, you change the operation to Search. You connect a text input box (where users type questions) to the Search Query field. You also connect the same AI embedding model to the Embedding field.
  3. Result: When a user types “What is our return policy?”, the component searches Neo4j, finds the relevant document, and outputs the text in the Results field. You can then connect this result to an AI assistant to generate a natural language answer for the user.
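
The last step, handing the retrieved text to an AI assistant, can be sketched as building a prompt from the Results output. The function name `build_prompt`, the `max_docs` parameter, and the prompt wording are hypothetical; in Nappai this wiring is done by connecting components rather than by writing code.

```python
def build_prompt(question: str, results: list[dict], max_docs: int = 3) -> str:
    """Combine retrieved documents and the user question into one prompt."""
    # Number each retrieved passage so the assistant can refer to it.
    context = "\n\n".join(
        f"[{i}] {doc['page_content']}"
        for i, doc in enumerate(results[:max_docs], start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Illustrative data in the same shape as the Results output.
sample_results = [
    {"page_content": "Items can be returned within 30 days.", "metadata": {}},
    {"page_content": "Refunds are issued to the original payment method.", "metadata": {}},
]
prompt = build_prompt("What is our return policy?", sample_results)
print(prompt)
```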

Important Notes

🔒 Security: Use Secure Storage for Credentials
Never hard-code Neo4j credentials in workflows or source files. Store them in a Nappai credential or, outside Nappai, in environment variables or a secrets manager, and reference them securely to protect against accidental exposure.

⚠️ Limitation: Default Index Name May Conflict
If you do not provide an index name, the component uses the default ‘nappai_neo4j_index’. Using this name on a database that already contains an index with the same name could overwrite existing data or produce unexpected results.

⚠️ Limitation: Node Properties Must Match Graph Schema
For graph-based retrieval, the node_label and the comma-separated text_node_properties must exactly match the labels and properties in your Neo4j graph. Mismatched names will prevent the component from locating documents.
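
A small pre-flight check can catch mismatched names before you run a search. The sketch below assumes you already know which property names exist on your nodes (here a hard-coded set); the helper names `parse_properties` and `missing_properties` are hypothetical. Labels and property names in Neo4j are case-sensitive, so the comparison is exact.

```python
def parse_properties(raw: str) -> list[str]:
    """Split a comma-separated string into clean property names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def missing_properties(requested: list[str], existing: set[str]) -> list[str]:
    """Return requested names that do not exist (case-sensitive match)."""
    return [name for name in requested if name not in existing]


# Hypothetical property names present on the nodes in your graph.
existing_properties = {"title", "body", "embedding"}

requested = parse_properties("title, Body")
problems = missing_properties(requested, existing_properties)
if problems:
    print(f"Not found in graph schema: {problems}")
```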

📋 Requirement: Neo4j Server Must Be Accessible
The component connects to a running Neo4j instance via the provided URL, username, and password. Ensure the server is online and reachable from your network before using the component.
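
To check reachability outside Nappai, a rough sketch like the one below can help. It assumes your URL uses one of the Neo4j driver schemes (such as `bolt://` or `neo4j+s://`) and falls back to 7687, the default Bolt port, when none is given. The environment variable name `NEO4J_URL` and both helper functions are hypothetical. A successful TCP connection only shows that the port is open, not that the credentials are valid.

```python
import os
import socket
from urllib.parse import urlparse

# URI schemes accepted by the official Neo4j drivers.
ALLOWED_SCHEMES = {"bolt", "bolt+s", "bolt+ssc", "neo4j", "neo4j+s", "neo4j+ssc"}
DEFAULT_BOLT_PORT = 7687


def parse_neo4j_url(url: str) -> tuple[str, int]:
    """Extract host and port from a Neo4j URL, validating the scheme."""
    parsed = urlparse(url)
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Unsupported scheme: {parsed.scheme!r}")
    if not parsed.hostname:
        raise ValueError("URL has no host")
    return parsed.hostname, parsed.port or DEFAULT_BOLT_PORT


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # NEO4J_URL is a hypothetical variable name; adjust to your setup.
    url = os.environ.get("NEO4J_URL", "bolt://localhost:7687")
    host, port = parse_neo4j_url(url)
    print(f"{host}:{port} reachable: {is_reachable(host, port)}")
```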

📋 Requirement: Embedding Model Required
An embedding model must be supplied to the component. The embeddings must match the dimensionality expected by Neo4j’s vector index; otherwise, indexing or search will fail.
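
The dimensionality requirement can be checked before ingestion with a few lines of code. This is a sketch under assumptions: `check_dimensions` is a hypothetical helper, and the index dimension (3 here, for brevity) is something you would look up for your own index and embedding model.

```python
def check_dimensions(vectors: list[list[float]], index_dimensions: int) -> None:
    """Raise ValueError if any vector's length differs from the index dimension."""
    for position, vector in enumerate(vectors):
        if len(vector) != index_dimensions:
            raise ValueError(
                f"Vector {position} has {len(vector)} dimensions, "
                f"but the index expects {index_dimensions}."
            )


# Illustrative vectors; real embeddings usually have hundreds of dimensions.
vectors = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
check_dimensions(vectors, index_dimensions=3)  # passes silently
```

Switching to a different embedding model after the index has been created is a common way to trigger this mismatch.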

💡 Best Practice: Specify Node Label and Properties for Efficient Retrieval
When using the Retriever operation (only_retriever), providing node_label and text_node_properties limits the search to relevant nodes, improving performance and reducing memory usage.

ℹ️ Behavior: Empty Query Returns No Results
If the search_query input is empty, blank, or not a string, the component returns an empty list instead of performing a search. Always provide a valid query string to retrieve documents.
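
The behavior described in this note can be pictured with a short guard function. This is an illustrative sketch, not the component's source code: `run_search` and the `search_fn` callback are hypothetical names.

```python
from typing import Any, Callable


def run_search(query: Any, search_fn: Callable[[str], list]) -> list:
    """Call search_fn only when query is a non-blank string; otherwise return []."""
    if not isinstance(query, str) or not query.strip():
        return []
    return search_fn(query)


# A fake search function used only to demonstrate the guard.
def fake_search(query: str) -> list:
    return [f"match for: {query}"]


print(run_search("", fake_search))
print(run_search("return policy", fake_search))
```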

🗙 Configuration: Ensure Embedding Property Matches Graph
The embedding_node_property input must correspond to the property name that stores the embedding vectors in your Neo4j nodes. If it is incorrect, similarity searches will not find matching documents.

Tips and Best Practices

  • Always ensure your Neo4j server is running and accessible from your network before attempting to connect.
  • If you are setting up the database for the first time, use the Add operation to ingest your documents first. Then switch to Search or Retriever to query them.
  • When using Search, make sure the text you type is clear and specific to get the most accurate results.
  • For complex graphs, double-check that your Node Label and Text Node Properties exactly match how your data is stored in Neo4j.
  • Use the Retriever operation if you need to pass database connections to other AI tools rather than just viewing results immediately.

Security Considerations

  • Always use the Credential system in Nappai to store your Neo4j URL, Username, and Password. Never expose these details in plain text within your workflow.
  • Ensure that your Neo4j server has proper access controls and is not publicly exposed to the internet without authentication.