SQLRetrieverTool
The SQLRetrieverTool lets you pull information from a SQL database and turn it into a format that can be used by other parts of Nappai. Think of it as a bridge that reads data from your database, turns the text into a searchable vector, and then hands that vector back to the system so it can be used in automated tasks or AI queries.
How it Works
- Connect to your database – You provide the database connection details (host, port, credentials, etc.) in the Database Info field.
- Define the data structure – The Metadata Schema tells the tool what columns and data types to expect in the results.
- Write a query template – In SQL Query Template you write a SQL statement that returns a column named
text
. You can insert placeholders like{[input]}
that will be replaced with actual values when the tool runs. - Create a retriever – The tool uses the Embedding you supply to convert the returned text into a vector. This vector can then be searched for similarity, allowing the system to find the most relevant rows.
- Expose the tool – The component outputs a BaseTool object (via the Tool output). This tool can be plugged into other Nappai components, such as a language model or a workflow step, to fetch data on demand.
Inputs
- Database Info: Connection details for the SQL database you want to query.
- Embedding: The embedding model or function that turns text into vectors for similarity search.
- Metadata Schema: The structure of the data you expect to retrieve, used by the SelfQueryRetriever.
- SQL Query Template: A SQL statement that must return a column named
text
. Use{[input]}
placeholders to inject dynamic values. - Tool Description: A short description of what the tool does, used by the SelfQueryRetriever.
- Tool Name: The name you want to give this tool, also used by the SelfQueryRetriever.
Outputs
- Tool: A
BaseTool
object that can be passed to other components. It contains the logic to run the SQL query, embed the results, and perform similarity searches.
Usage Example
Imagine you want to let a chatbot answer questions about sales data stored in a PostgreSQL database.
-
Set up the component
- Database Info: Provide your PostgreSQL connection string.
- Embedding: Choose an embedding model like
sentence-transformers/all-MiniLM-L6-v2
. - Metadata Schema: Define the columns you’ll retrieve (e.g.,
order_id
,customer_name
,amount
). - SQL Query Template:
SELECT order_id, customer_name, amount, description AS textFROM sales_ordersWHERE order_id = {[input]}
- Tool Description: “Retrieves order details from the sales database.”
- Tool Name: “SalesOrderRetriever”
-
Connect the output
- Drag the Tool output into a language model component that needs to fetch order details.
- When the model asks for “What was order 12345?” the retriever runs the query, embeds the description, and returns the most relevant row.
Related Components
- LibSQLRetrieverToolBase – The base class that provides the core SQL‑retrieval logic.
- EmbeddingTool – Generates embeddings for arbitrary text, useful if you need to embed data before storing it.
- SimilaritySearchTool – Uses embeddings to find the most similar records; often paired with a retriever like this one.
Tips and Best Practices
- Always return a
text
column – The retriever expects a field namedtext
. If your query uses a different name, rename it in the SELECT clause. - Keep queries simple – Complex joins or subqueries can slow down retrieval. Test performance before deploying.
- Use placeholders wisely –
{[input]}
is replaced at runtime; avoid SQL injection by never inserting raw user input directly. - Validate the schema – Make sure the Metadata Schema matches the columns returned by your query to prevent runtime errors.
- Secure credentials – Store database passwords in Nappai’s secure vault, not in plain text.
Security Considerations
- SQL Injection – Never concatenate user input directly into the query. Use the
{[input]}
placeholder, which the system safely substitutes. - Credential Management – Store database credentials in a protected secrets manager and grant the minimum necessary permissions to the database user.
- Data Privacy – If the database contains sensitive information, ensure that the embeddings and retrieved data are handled according to your organization’s privacy policies.