FastEmbed Embeddings (Local)

The FastEmbed Embeddings (Local) component turns raw text into numerical vectors (embeddings) directly on your device. Because it runs on your computer’s hardware, you do not need to manage API keys, pay for cloud services, or send your text to an external service. Once a model has been downloaded and cached, the component can also run without an internet connection, which makes it a good fit for privacy-focused projects, low-bandwidth environments, and offline systems.

How it Works

When you connect this component to your workflow, it uses a lightweight engine called FastEmbed to convert your text into numerical representations. The process happens entirely on your machine. You select a model, adjust performance settings if needed, and pass your text to it. The component handles background tasks automatically: caching downloaded models, managing memory limits, and adding any formatting prefixes the model needs so your data is ready for search or analysis. No manual technical setup is required.

Inputs

Input Fields

The following fields are available to configure this component:

Model: Choose the AI engine that will process your text. Different models vary in speed, accuracy, and language support. The feather icon (🪶) indicates a fast, lightweight option, while the globe (🌍) indicates a high-performance, larger model.
Document Embed Type: Select how your text should be formatted for processing. Usually, you keep this on “default”. If you are using specific large models (like multilingual-e5-large), change it to “passage” to help the system understand that you are working with full documents rather than short queries.
Model Cache Directory: The folder where the component stores downloaded AI models. Leave this blank to use the system’s default storage location.
Max Sequence Length: The maximum number of tokens (word pieces, which are often shorter than whole words) the component reads from each text. Text longer than this limit is automatically truncated. The default of 512 works well for most use cases.
Batch Size: How many text pieces the component will process at the same time. A higher number can make things faster but uses more computer memory. The default of 256 is optimized for most systems.
ONNX Runtime Threads: Controls how many processor cores the component uses to work. Set this to 0 (the default) to let the system automatically find the best setting for your computer’s hardware.
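
To make Max Sequence Length and Batch Size more concrete, here is a minimal sketch in plain Python. It is not the component's actual implementation: the helper names `truncate` and `batches` are hypothetical, and whitespace splitting stands in for the model's real tokenizer, which usually produces more tokens than there are words.

```python
from typing import Iterator, List


def truncate(tokens: List[str], max_length: int = 512) -> List[str]:
    """Keep only the first max_length tokens; anything after that is dropped."""
    return tokens[:max_length]


def batches(items: List[str], batch_size: int = 256) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Assumption: splitting on whitespace approximates tokenization for illustration only.
long_text = "word " * 600
kept = truncate(long_text.split(), max_length=512)
print(len(kept))  # 512

# 600 texts with a batch size of 256 are processed in three groups.
texts = [f"ticket {i}" for i in range(600)]
sizes = [len(batch) for batch in batches(texts, batch_size=256)]
print(sizes)  # [256, 256, 88]
```

A smaller batch size means more groups but less memory held at once, which is the trade-off the Batch Size field controls.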

Outputs

The component produces an Embeddings Object. Think of this as a ready-to-use data package that contains the numerical representations of your text. You can directly connect this output to other nodes in your workflow, such as vector databases, search tools, or AI chat assistants, to enable features like semantic search, document retrieval, or text comparison.

Output Data Example (JSON)

{
  "embeddings_object": {
    "type": "FastEmbedEmbeddings",
    "model_loaded": "Selected model name",
    "status": "ready",
    "interface_capability": "Vector mapping for text search, retrieval, and comparison"
  }
}

Note: The actual output is a LangChain-compatible interface object. The JSON above represents the metadata and capabilities that downstream nodes expect to receive.
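
As a rough illustration of what "LangChain-compatible" means here: LangChain embedding objects expose two methods, `embed_documents` for a list of texts and `embed_query` for a single text. The sketch below shows that shape with a hypothetical `ToyEmbeddings` class. It is an assumption for illustration only; the vectors come from a hash and carry no meaning, unlike the ones the real component produces.

```python
import hashlib
from typing import List


class ToyEmbeddings:
    """Hypothetical stand-in that mimics the two LangChain embedding methods."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def _vector(self, text: str) -> List[float]:
        # Deterministic pseudo-values derived from a hash; NOT a real embedding.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dim]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Return one vector per input text."""
        return [self._vector(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        """Return a single vector for a search query."""
        return self._vector(text)


emb = ToyEmbeddings(dim=8)
doc_vectors = emb.embed_documents(["reset my password", "refund request"])
query_vector = emb.embed_query("reset my password")
print(len(doc_vectors), len(query_vector))  # 2 8
```

Downstream nodes such as vector databases only need these two calls, which is why they can accept the Embeddings Object without knowing which model sits behind it.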

Connectivity

This component is designed to connect to downstream nodes that require numerical text representations. In typical Nappai workflows, you would connect the Embeddings Object output to:

Vector Databases (e.g., Chroma, FAISS, Qdrant) to store and index your text.
Retrieval Agents or Search Nodes to help find relevant information based on meaning rather than exact keywords.
Chain Nodes to feed processed data into larger AI workflows.

Connecting it to these nodes ensures that your text is converted accurately and consistently, enabling advanced data management and AI-driven search capabilities within Nappai.

Usage Example

Imagine you are building a system to automatically organize customer support tickets. You would drag the FastEmbed Embeddings (Local) component into your workflow, select the Model you prefer (e.g., jina-embeddings-v2-base-es for Spanish/English), and connect its output to a Vector Database node. When you send new support tickets through the flow, the component automatically converts the ticket text into vectors, allowing the database to quickly find similar past tickets or match them with relevant solutions.
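
The "find similar past tickets" step is typically a nearest-neighbor search over vectors. The sketch below shows the idea with cosine similarity in plain Python. The ticket names and the 3-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions, and a vector database would perform this search for you.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(
    query: List[float], stored: Dict[str, List[float]], top_k: int = 1
) -> List[Tuple[str, float]]:
    """Return the top_k stored items ranked by similarity to the query."""
    scored = [(ticket_id, cosine(query, vec)) for ticket_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical, hand-written vectors standing in for real ticket embeddings.
past_tickets = {
    "T-101 password reset": [0.9, 0.1, 0.0],
    "T-102 refund request": [0.0, 0.2, 0.9],
    "T-103 login failure": [0.8, 0.3, 0.1],
}
new_ticket = [1.0, 0.0, 0.0]

ranked = most_similar(new_ticket, past_tickets, top_k=2)
print([ticket_id for ticket_id, _ in ranked])
# ['T-101 password reset', 'T-103 login failure']
```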

Tips and Best Practices

Start with the default model and only upgrade to larger models if you notice accuracy issues in your search results.
Keep Max Sequence Length at 512 unless you are working with very long documents that require deeper context.
If your system feels slow when processing large amounts of text, try lowering the Batch Size to reduce memory pressure.
The component automatically handles formatting prefixes, so you do not need to manually add tags like passage: or query: to your text.
Leave Model Cache Directory empty unless you need to store models on a specific external drive for backup or space management.
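
For readers curious about what the automatic prefixing amounts to, here is a minimal sketch. The function `add_prefix` and its mapping are hypothetical and shown only to illustrate the idea; the component's internal logic may differ, and the "query" option is an assumption based on the `query:` tag mentioned above.

```python
def add_prefix(text: str, embed_type: str = "default") -> str:
    """Prepend the tag that some models (for example the E5 family) expect."""
    prefixes = {"default": "", "passage": "passage: ", "query": "query: "}
    if embed_type not in prefixes:
        raise ValueError(f"Unknown embed type: {embed_type!r}")
    return prefixes[embed_type] + text


print(add_prefix("Refunds are processed within 5 days.", "passage"))
# passage: Refunds are processed within 5 days.
print(add_prefix("How long do refunds take?", "query"))
# query: How long do refunds take?
print(add_prefix("Plain text stays unchanged."))
# Plain text stays unchanged.
```

Because the component applies the tag for you, adding one manually could result in a doubled prefix in the text that reaches the model.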

Security Considerations

Since this component runs entirely on your local device, your text is never sent over the internet for processing. There are no external API keys, cloud dependencies, or third-party data collectors involved in generating embeddings; a network connection is only needed when a model is downloaded for the first time. This makes the component well suited to sensitive documents, private business records, or restricted personal information. Make sure your local hardware meets the storage and memory requirements of the selected model to maintain smooth operation.