Deeplake Database

The Deeplake Database component stores documents as vector embeddings and retrieves the ones most similar to a search query.

Relationship with Deeplake

This component uses Deeplake as its vector store. Documents are stored as vector embeddings, and Deeplake's search returns the documents most relevant to a user query.

Inputs

  • Collection Name: The name of the collection in the vector store where documents are stored.
  • Collection Description: A brief description of the collection.
  • Connection URI: The URI used to connect to the Deeplake service.
  • Scope Name: The scope or context name for the collection.
  • Index Name: The name of the index within the vector store.
  • Search Query: The query used to search for documents.
  • Ingest Data: The data to be ingested into the vector store.
  • Allow Dangerous Deserialization: A setting that allows deserialization of data from untrusted sources (use with caution).
  • Embedding: The embedding model used to convert documents into vectors.
  • Number of Results: The number of results to return from a search query.
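The inputs above can be pictured as a single configuration object. The following is a minimal sketch only: the class name, field names, types, and default values are all hypothetical (the component's real defaults are not stated here), and the validation rules are illustrative assumptions rather than the component's actual behavior.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class DeeplakeComponentInputs:
    """Hypothetical container mirroring the input fields listed above."""

    collection_name: str
    connection_uri: str
    collection_description: str = ""
    scope_name: str = ""
    index_name: str = ""
    search_query: str = ""
    ingest_data: List[str] = field(default_factory=list)
    # Assumed to be off by default, since the setting is flagged as risky.
    allow_dangerous_deserialization: bool = False
    embedding: Optional[Any] = None
    # The default of 4 is an assumption made for this sketch.
    number_of_results: int = 4

    def validate(self) -> None:
        """Reject obviously unusable settings before any search is run."""
        if not self.collection_name.strip():
            raise ValueError("Collection Name must not be empty")
        if self.number_of_results < 1:
            raise ValueError("Number of Results must be at least 1")


inputs = DeeplakeComponentInputs(
    collection_name="research-papers",
    connection_uri="placeholder-uri",  # placeholder, not a real endpoint
    search_query="machine learning",
)
inputs.validate()
```

Grouping the fields this way makes it easy to check settings such as Number of Results before anything is sent to the vector store.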

Outputs

The component produces a list of documents that match the search query. These documents can be used in various workflows to extract information or perform further analysis.

Usage Example

Imagine you have a large collection of research papers and you want to find papers related to “machine learning.” You can use the Deeplake Database component to store these papers as vector embeddings. Then, by entering “machine learning” as the search query, the component will retrieve the most relevant papers for you.
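To make the retrieval step concrete, here is a minimal sketch of similarity search over a few paper titles. It does not call Deeplake or a real embedding model: the `embed` function is a toy word-count stand-in, the in-memory list stands in for the vector store, and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def embed(text: str) -> Vector:
    """Toy embedding (an assumption for this sketch): lowercase word counts."""
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query: str, store: List[Tuple[str, Vector]], k: int) -> List[str]:
    """Return the k stored texts whose vectors are most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


papers = [
    "Machine learning methods for protein folding",
    "A survey of medieval trade routes",
    "Deep learning and machine vision",
]

# "Ingest Data": store each paper alongside its vector.
store = [(paper, embed(paper)) for paper in papers]

# "Search Query" and "Number of Results".
results = search("machine learning", store, k=2)
print(results)
```

A real embedding model captures meaning rather than exact word overlap, so it can also match papers that never use the literal words in the query.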

Templates

Currently, no templates include this component pre-configured.

Related Components

  • NVIDIA Rerank: Rerank documents using the NVIDIA API and a retriever.
  • Multi Query Retriever: Retrieve documents using multiple queries.
  • Ensemble Retriever: Combine results from multiple retrievers.
  • Contextual Compression Retriever: Compress and decompress documents contextually.
  • RetrieverTool: Tool for interacting with retrievers.
  • Retrieval QA: Perform question-answering by querying sources from a retriever.
  • Data Retrieval Chain: Batch data and retrieve information from a vector store.

Tips and Best Practices

  • Always provide a meaningful Collection Name and Collection Description so that collections are easy to identify.
  • Use the Search Query input to specify clear and concise queries for better search results.
  • Be cautious when enabling Allow Dangerous Deserialization, as it can pose security risks.

Security Considerations

If you choose to enable “Allow Dangerous Deserialization,” ensure that the data source is trusted to avoid potential security vulnerabilities.
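The risk can be illustrated with Python's `pickle` module, assuming the setting governs pickle-style loading, as similarly named options in Python libraries commonly do (this page does not confirm the serialization format). In the sketch below, loading a crafted byte string runs a function call chosen by whoever produced the data. The `load_serialized` guard is a hypothetical helper, not part of the component.

```python
import pickle


class Payload:
    """Object whose pickled form tells the loader to call eval("1 + 1")."""

    def __reduce__(self):
        # A harmless expression is used here; an attacker could substitute
        # any callable and arguments of their choosing.
        return (eval, ("1 + 1",))


blob = pickle.dumps(Payload())


def load_serialized(blob: bytes, allow_dangerous_deserialization: bool = False):
    """Hypothetical guard: refuse to unpickle unless explicitly allowed."""
    if not allow_dangerous_deserialization:
        raise ValueError(
            "Refusing to unpickle data; enable the setting only for trusted sources"
        )
    return pickle.loads(blob)


# With the flag enabled, loading executes the embedded call and returns its
# result instead of a Payload instance.
loaded = load_serialized(blob, allow_dangerous_deserialization=True)
print(type(loaded).__name__, loaded)
```

Because the call runs during loading, inspecting the object afterwards is too late. The data source has to be trusted before the setting is enabled.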