
Text Embedder

The Text Embedder component turns a piece of text into a numerical vector (an embedding) that can be used for searching, clustering, or feeding into other AI models. It takes a message and an embedding model, runs the text through the model, and returns the resulting vector along with the original text.

How it Works

When you drop the Text Embedder into a workflow, you connect an Embedding Model (for example, OpenAI's text-embedding-ada-002 or a local model) and a Message that contains the text you want to embed. The component extracts the plain text from the message, passes it to the chosen model, and receives a list of embeddings. Because only one document is processed at a time, it takes the first embedding from that list and packages it together with the original text into a Data object. The component itself makes no external calls; any network traffic comes from the embedding model you connect, so with a local model the entire process stays on the machine that hosts Nappai.
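The flow above can be sketched in a few lines of Python. This is illustrative only: it assumes a LangChain-style Embeddings interface with an embed_documents method, and FakeEmbedder and the dict-based Data object are stand-ins, not Nappai internals.

```python
class FakeEmbedder:
    """Stand-in for a real embedding model such as text-embedding-ada-002."""

    def embed_documents(self, texts):
        # A real model returns one vector per input text.
        return [[float(len(t)), 0.0, 0.0] for t in texts]


def embed_message(embedding_model, message_text):
    # The component embeds a single document, so it sends a one-item list
    # and keeps only the first vector from the result.
    vectors = embedding_model.embed_documents([message_text])
    embedding = vectors[0]
    # The original text and its vector travel together as one Data object.
    return {"text": message_text, "embeddings": embedding}


data = embed_message(FakeEmbedder(), "Hello, Nappai!")
```

Swapping FakeEmbedder for a real provider changes only where the vectors come from; the pack-text-with-vector step stays the same.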

Inputs

  • Embedding Model: The embedding model to use for generating embeddings.
    This input expects a model that implements the Embeddings interface, such as an OpenAI or local embedding provider.

  • Message: The message to generate embeddings for.
    Provide a message that contains the text you want to embed. The component will read the text field of this message.

Outputs

  • Embedding Data: A Data object that contains two fields:
    • text: the original message text.
    • embeddings: the numeric vector produced by the model.
      This output can be passed to other components that need vector representations, such as similarity search or clustering modules.
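To make the downstream use concrete, here is a hedged sketch of the kind of operation a similarity-search component performs on two Embedding Data payloads. The dict layout mirrors the text/embeddings fields described above; the vectors themselves are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    # Standard cosine similarity: dot product over the product of norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


doc_a = {"text": "cats purr", "embeddings": [0.2, 0.8, 0.1]}
doc_b = {"text": "dogs bark", "embeddings": [0.3, 0.7, 0.2]}

# Scores near 1.0 mean the texts are close in embedding space.
score = cosine_similarity(doc_a["embeddings"], doc_b["embeddings"])
```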

Usage Example

  1. Select an Embedding Model
    Drag an “Embedding Model” component into the canvas and choose a model like text-embedding-ada-002. Connect its output to the Embedding Model input of the Text Embedder.

  2. Provide a Message
    Use a “Message” component or any component that outputs a message. Connect its output to the Message input of the Text Embedder.

  3. Consume the Embedding
    Connect the Embedding Data output to a downstream component, such as a “Vector Store” or “Similarity Search” component, to store or query the embedding.

This simple flow lets you turn raw text into a vector that can be used for advanced AI tasks without writing code.

Related Components

  • Embedding Model – Choose or configure the model that will generate embeddings.
  • Message – Create or retrieve messages that contain the text you want to embed.
  • Vector Store – Store embeddings for later retrieval or similarity search.
  • Similarity Search – Find the most similar messages or documents based on embeddings.

Tips and Best Practices

  • Choose the right model: Larger models give more accurate embeddings but use more resources.
  • Clean your text: Remove unnecessary whitespace or formatting before embedding to improve consistency.
  • Batch when possible: If you have many messages, consider batching them in a single call to reduce overhead.
  • Monitor resource usage: Embedding models can be memory‑intensive; keep an eye on CPU/GPU usage if running locally.
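The batching tip can be sketched with the same assumed embed_documents-style interface: one call covering many texts instead of one call per text. FakeEmbedder is again an illustrative stand-in that counts model invocations to show the overhead saved.

```python
class FakeEmbedder:
    def __init__(self):
        self.calls = 0  # number of model invocations

    def embed_documents(self, texts):
        self.calls += 1
        return [[float(len(t))] for t in texts]


messages = ["first note", "second note", "third note"]

# Unbatched: one model call per message.
per_call = FakeEmbedder()
unbatched = [per_call.embed_documents([m])[0] for m in messages]

# Batched: a single call covers every message.
batched_model = FakeEmbedder()
batched = batched_model.embed_documents(messages)

# The vectors are identical either way; only the call count differs.
```

With a remote provider, each call also pays network latency, so the batched path is usually noticeably faster for large message sets.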

Security Considerations

  • Data privacy: Embeddings can still reveal sensitive information. Store them securely and follow your organization’s data‑handling policies.
  • Model access: If using a cloud model, ensure that API keys are stored safely and that network traffic is encrypted.
  • Local execution: Running the model locally keeps data on your own infrastructure, which can reduce exposure to third‑party services.