Neo4j CypherQAChain
The Neo4j CypherQAChain is a powerful tool within the Nappai automation system that allows you to ask questions in plain English and get answers directly from your Neo4j graph database. Instead of writing complex code yourself, this component uses an AI Language Model (LLM) to understand your question, translate it into a database query (Cypher), and fetch the relevant data. It acts as an intelligent bridge between you and your data, making it easier to explore relationships and insights stored in your Neo4j environment.
How it Works
This component works in two main steps to provide you with accurate answers:
- Understanding Your Question: You type a question or statement into the Query field. The component sends this text to an AI Language Model (LLM) that you have connected. The LLM analyzes your request and translates it into a specific language called Cypher, which is used to communicate with Neo4j databases.
- Retrieving Data: The component then executes this Cypher query on your Neo4j database. Depending on which output you choose to use, it either returns the raw data (as a list of items) or generates a natural language summary of that data using the AI.
By using this component, you leverage the AI’s ability to understand context and language, while Neo4j handles the complex storage and retrieval of interconnected data.
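The two steps above can be sketched in plain Python. This is only an illustration of the flow, not Nappai's implementation: `fake_llm` and `fake_graph` are hypothetical stand-ins for the connected LLM and the Neo4j database, and the Cypher string is invented for the example.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the connected LLM."""
    if prompt.startswith("TRANSLATE:"):
        # Step 1: a real model would generate Cypher from the question.
        return "MATCH (p:Person) RETURN p.name AS name, p.city AS city"
    # Summary step: a real model would phrase the rows as prose.
    return "Found people: " + prompt[len("SUMMARIZE:"):].strip()


def fake_graph(cypher: str) -> List[Row]:
    """Hypothetical stand-in for executing Cypher against Neo4j."""
    return [
        {"name": "Alice", "city": "New York"},
        {"name": "Bob", "city": "Boston"},
    ]


def answer(
    question: str,
    llm: Callable[[str], str],
    run_cypher: Callable[[str], List[Row]],
    limit: int = 10,
) -> Tuple[List[Row], str]:
    """Return (Data-style rows, Text-style summary) for a question."""
    cypher = llm(f"TRANSLATE: {question}")   # question -> Cypher
    rows = run_cypher(cypher)[:limit]         # Cypher -> rows, capped by the limit
    names = ", ".join(str(row["name"]) for row in rows)
    text = llm(f"SUMMARIZE: {names}")         # rows -> natural-language answer
    return rows, text
```

Calling `answer("Who is in the database?", fake_llm, fake_graph)` returns both shapes at once; in the component you pick one of them by choosing the Data or the Text output.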
Connection & Credentials
To connect to your Neo4j database, you must first set up a secure credential in Nappai. This ensures your database access is safe and managed centrally.
- Go to the Credentials section in your Nappai panel.
- Create a new credential of the type Neo4j API.
- Fill in the required fields: Neo4j URL, Neo4j Username, and Neo4j Password.
- In your workflow, select this saved credential in the Credential input field of the Neo4j CypherQAChain node.
Note: Do not enter passwords or usernames directly into the node’s input fields. Always use the Credential system for security.
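The same rule applies to any script you run outside Nappai against the same database. A minimal sketch, assuming the three credential fields are exposed as environment variables; the names `NEO4J_URL`, `NEO4J_USERNAME`, and `NEO4J_PASSWORD` are illustrative choices, not something Nappai defines:

```python
import os
from dataclasses import dataclass, field
from typing import Mapping, Optional


@dataclass(frozen=True)
class Neo4jCredential:
    url: str
    username: str
    # repr=False keeps the password out of logs and tracebacks.
    password: str = field(repr=False)


def load_credential(env: Optional[Mapping[str, str]] = None) -> Neo4jCredential:
    """Read the credential from the environment instead of hard-coding it."""
    env = os.environ if env is None else env
    required = ("NEO4J_URL", "NEO4J_USERNAME", "NEO4J_PASSWORD")  # assumed names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Neo4jCredential(
        url=env["NEO4J_URL"],
        username=env["NEO4J_USERNAME"],
        password=env["NEO4J_PASSWORD"],
    )
```

Failing fast on a missing variable gives a clearer message than a connection error later in the workflow.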
Operations
This component exposes a single operation, so there is nothing to select. It translates your question and retrieves the results automatically, and what you receive depends on which output you use (see Outputs). Connect an LLM and provide your query to begin.
Inputs
The following fields are available to configure this component.
- Query: The question or statement you want to ask Neo4j. This text is analyzed by the AI to generate the correct database query.
- Neo4j Database: The specific name of the Neo4j database you want to query. Ensure this matches the database you configured in your credentials.
- Limit the number of results: (Optional) A number that limits how many items are returned. The default is 10. Lowering this number can make the component faster and use less memory.
- Model: (Required) Connect an LLM component here. The AI model is essential for translating your text query into a database command.
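One way to picture these inputs is as a small configuration object that is checked before the workflow runs. The class below is a hypothetical sketch: the field names mirror the labels above and are not Nappai's internal API. Only the default of 10 comes from this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChainInputs:
    query: str
    database: str
    model: Optional[object] = None  # the connected LLM (required)
    limit: int = 10                 # documented default for the result limit

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means the inputs look usable."""
        problems = []
        if not self.query.strip():
            problems.append("Query is empty")
        if not self.database.strip():
            problems.append("Neo4j Database is empty")
        if self.model is None:
            problems.append("No LLM connected to Model")
        if self.limit < 1:
            problems.append("Limit must be at least 1")
        return problems
```

Collecting every problem in one pass, rather than raising on the first, makes it easier to fix a misconfigured node in a single round.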
Outputs
This component provides two ways to view the results of your query:
- Text: This output provides a natural language response. The AI reads the data from the database and writes a clear, human-readable answer. This is ideal for chatbots, dashboards showing summaries, or when you want a quick answer without looking at raw data.
- Data: This output provides the raw results from the database as a list. Use this if you need to process the data further, display it in a table, or connect it to other tools that require structured data.
Output Data Example (JSON)
Example of the Data output structure (a list of items):
[
  { "name": "Alice", "age": 30, "city": "New York" },
  { "name": "Bob", "age": 25, "city": "Boston" }
]
Example of the Text output (a simple string):
{ "text": "Here are the people in your database: Alice from New York and Bob from Boston." }
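Because the Data output is a plain list of records, downstream processing needs nothing special. The sketch below parses the sample list from this section, filters it, and renders it as a fixed-width text table; `to_table` is a hypothetical helper written for this example.

```python
import json
from typing import Dict, List

# The sample Data output from this section, as a JSON string.
raw = (
    '[{"name": "Alice", "age": 30, "city": "New York"},'
    ' {"name": "Bob", "age": 25, "city": "Boston"}]'
)
rows: List[Dict[str, object]] = json.loads(raw)


def to_table(rows: List[Dict[str, object]], columns: List[str]) -> str:
    """Render selected columns of the rows as a fixed-width text table."""
    widths = {
        col: max([len(col)] + [len(str(row.get(col, ""))) for row in rows])
        for col in columns
    }
    lines = [
        " | ".join(col.ljust(widths[col]) for col in columns),
        "-+-".join("-" * widths[col] for col in columns),
    ]
    for row in rows:
        lines.append(" | ".join(str(row.get(col, "")).ljust(widths[col]) for col in columns))
    return "\n".join(lines)


# Structured data can be filtered before it is displayed.
thirty_plus = [row["name"] for row in rows if row["age"] >= 30]
table = to_table(rows, ["name", "city"])
```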
Connectivity
- Connect To: Connect the Text output to visualization nodes or chat interfaces, and the Data output to table views, API endpoints, or other data processing nodes. Either output can also feed the Query input of another automation node if you want to filter or process the results further.
- Required Connection: You must connect an LLM (Language Model) component to the Model input. Without an LLM, the component cannot translate your question into a database query.
Usage Example
Scenario: You want to find all employees in a specific department and display their names in a chat interface.
- Setup: Add the Neo4j CypherQAChain node to your canvas. Connect your LLM component to its Model input. Select your Neo4j API credential.
- Configure: In the Neo4j Database field, enter the name of your database (e.g., “company_db”). Set Limit the number of results to 50.
- Query: In the Query field, type:
"List the names of all employees in the Sales department."
- Execute: Run the workflow.
- Result: The Text output will provide a readable sentence like: “The employees in the Sales department are John, Mary, and Paul.” You can then connect this Text output to a display node to show it to the end-user.
Important Notes
🔒 Handle Credentials Safely [high] Never hard-code Neo4j credentials in workflows or repositories. Inside Nappai, always use the Credential system. For anything that runs outside Nappai, store them in environment variables or a secret manager.
🔒 Cypher Validation Protects Against Injection [medium] The component validates the generated Cypher by default (the validate_cypher setting), which reduces the risk of malformed or injected queries reaching the database. Keep validate_cypher enabled unless you are working in a controlled environment.
📋 Provide Neo4j Connection Details [high] The component needs a valid Neo4j URL, username, and password, supplied through the Credential system, plus a database name in the Neo4j Database input, before it can connect to your graph database.
📋 Attach a Language Model [high] An LLM (Language Model) is required for query processing. Ensure a compatible model is connected before running queries.
⚠️ No Automatic Error Recovery [medium] Connection failures or query syntax errors will result in uncaught exceptions. Implement external error handling if you need graceful fallbacks.
💡 Ask Specific Questions [medium] Write clear and concise queries. Vague or overly broad questions can translate into Cypher that returns too many results and slows performance. The more specific the question, the better the AI translation.
💡 Set an Appropriate Result Limit (top_k) [low] Adjust the top_k setting (the Limit the number of results input) to control how many records are returned. A lower number reduces memory usage and speeds up responses.
⚠️ Data vs. Text Output Differences [medium] The component offers two outputs: 'Data' returns raw query results, while 'Text' returns only string results. If the query produces non-string data, the Text output may be empty. Choose 'Data' for structured processing and 'Text' for human-readable summaries.
⚙️ Verbose Logging Is On by Default [low] Verbose mode is turned on by default and writes detailed logs to the console. Disable it in production to keep logs clean.
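Since the component does not recover from errors on its own, any graceful fallback has to live around it. The sketch below shows one possible shape for such a wrapper, assuming the component call can be represented as an ordinary function; `run_query` is a hypothetical callable, and the retry count and fallback message are arbitrary.

```python
import logging
import time
from typing import Callable

logger = logging.getLogger("neo4j_qa")


def run_with_fallback(
    run_query: Callable[[str], str],
    question: str,
    retries: int = 2,
    delay_seconds: float = 0.0,
    fallback: str = "Sorry, the database could not be queried right now.",
) -> str:
    """Call run_query, retrying on failure, and return a fallback if all attempts fail."""
    total_attempts = retries + 1
    for attempt in range(1, total_attempts + 1):
        try:
            return run_query(question)
        except Exception as exc:  # in real code, catch narrower exception types
            logger.warning("Attempt %d of %d failed: %s", attempt, total_attempts, exc)
            if attempt < total_attempts:
                time.sleep(delay_seconds)
    return fallback
```

Retrying helps with transient connection failures. A query that fails because the generated Cypher is invalid will usually fail again, so for that case the fallback message is the useful part.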
Tips and Best Practices
- Always test your queries with a small Limit the number of results first to ensure the AI generates the correct Cypher command.
- Use the Text output for end-user-facing interfaces like chatbots, and the Data output for backend processing or reporting dashboards.
- Ensure your LLM is capable of understanding technical syntax (like SQL or Cypher) for the best translation accuracy.
- Keep your Neo4j database name correct and consistent with your credential setup to avoid connection errors.
Security Considerations
- Credential Management: Always use the Nappai Credential system to store Neo4j passwords. Never hardcode them in the workflow.
- Input Validation: The component includes built-in validation that verifies the generated Cypher syntax, which helps guard against Cypher injection attacks.
- Access Control: Ensure your Neo4j database user credentials have the minimum necessary permissions (principle of least privilege) to reduce risk if credentials are compromised.
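As an extra layer on top of a least-privilege database user, a workflow could refuse to run generated Cypher that contains write clauses. The sketch below is an assumption-laden illustration and does not describe how the component's built-in validation works. The keyword list is incomplete (procedures invoked with CALL can also write, for example), so it complements a read-only user rather than replacing one.

```python
import re

# Illustrative, incomplete list of Cypher clauses that modify data or schema.
_WRITE_CLAUSE = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)
# Simple quoted strings; escaped quotes inside literals are not handled.
_STRING_LITERAL = re.compile(r"'[^']*'|\"[^\"]*\"")


def is_read_only(cypher: str) -> bool:
    """Return True if no write clause appears outside string literals.

    The check is deliberately conservative: a property named like a keyword
    (for example n.set) is rejected too.
    """
    without_strings = _STRING_LITERAL.sub("''", cypher)
    return _WRITE_CLAUSE.search(without_strings) is None
```

A query such as `MATCH (n) DETACH DELETE n` is rejected, while a lookup that merely compares against the string `'create'` passes because literals are blanked out before matching.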