Categorizer
The Categorizer component identifies the main categories in a set of items. Provide a list of items and choose a model; the component returns a list of categories and a tool that can be reused later in your workflow.
How it Works
The component takes the items you provide and runs them through a pre‑trained language model. It splits the text into manageable chunks (up to the number you set in Max Chunks) and asks the model to identify the most relevant categories. The results are returned as a simple list of category names. In addition, the component builds a small “tool” that can be reused in other parts of your automation, making it easy to chain the categorization step with further actions.
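The chunk-and-categorize flow described above can be illustrated with a minimal Python sketch. It is not the component's actual implementation: the function names (chunk_items, categorize, keyword_model) are hypothetical, the model is replaced by a simple keyword rule so the example runs on its own, and ranking categories by how many chunks mention them is an assumption made for illustration.

```python
from collections import Counter
from typing import Callable, List


def chunk_items(items: List[str], max_chunks: int) -> List[List[str]]:
    """Split items into at most max_chunks contiguous groups of near-equal size."""
    if max_chunks < 1:
        raise ValueError("max_chunks must be at least 1")
    if not items:
        return []
    n_chunks = min(max_chunks, len(items))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def categorize(
    items: List[str],
    model: Callable[[List[str]], List[str]],
    max_chunks: int,
) -> List[str]:
    """Ask the model for categories per chunk, then rank them by chunk count."""
    counts: Counter = Counter()
    for chunk in chunk_items(items, max_chunks):
        # Count each category at most once per chunk, preserving order.
        for label in dict.fromkeys(model(chunk)):
            counts[label] += 1
    return [label for label, _ in counts.most_common()]


def keyword_model(chunk: List[str]) -> List[str]:
    """Hypothetical stand-in for a language model: labels a chunk by keyword."""
    text = " ".join(chunk).lower()
    labels = []
    if "invoice" in text or "refund" in text:
        labels.append("Billing")
    if "login" in text or "password" in text:
        labels.append("Account access")
    return labels


items = ["Refund my invoice", "Cannot login", "Password reset", "Invoice missing"]
categories = categorize(items, keyword_model, max_chunks=2)
print(categories)  # ['Billing', 'Account access']
```

In this sketch, lowering max_chunks means fewer calls to the model with more items per call, which mirrors the speed trade-off that the Max Chunks setting controls.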
Inputs
- Items: The data you want to categorize. This can be a list of text strings, documents, or any other data that can be represented as text.
- Model: The language model that will perform the categorization. You can choose from the models available in your Nappai environment.
- Max Chunks: The maximum number of text chunks the component will send to the model. A higher number can give more accurate results but may take longer to process.
Outputs
- Categories: A list of category names that the model identified. This output can be used directly in filters, dashboards, or as input to other components.
- Tool: A reusable tool object that encapsulates the categorization logic. You can attach this tool to other components that need to perform the same categorization without re‑running the model.
Usage Example
- Drag the Categorizer component onto your dashboard.
- Connect a Data Input component that supplies the items you want to analyze.
- Select a suitable model from the Model dropdown.
- Set Max Chunks to 10 (or whatever fits your data size).
- Run the workflow.
- After the run, the Categories output contains the main categories. Pass the Tool output to a Filter component to keep only items that belong to a specific category.
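To make the last step more concrete, the following Python sketch shows one way a reusable tool could drive a category filter. Everything in it is an assumption for illustration: CategoryTool, filter_by_category, and assign_by_keyword are hypothetical names, and the real Tool object produced by the component may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CategoryTool:
    """Hypothetical stand-in for the Tool output: fixed categories plus an assigner."""

    categories: List[str]
    assign: Callable[[str], str]  # maps one item to one of the categories

    def __call__(self, item: str) -> str:
        label = self.assign(item)
        if label not in self.categories:
            raise ValueError(f"unknown category: {label!r}")
        return label


def filter_by_category(items: List[str], tool: CategoryTool, wanted: str) -> List[str]:
    """Keep only the items that the tool assigns to the wanted category."""
    return [item for item in items if tool(item) == wanted]


def assign_by_keyword(item: str) -> str:
    """Illustrative rule standing in for the stored categorization logic."""
    return "Billing" if "invoice" in item.lower() else "Other"


tool = CategoryTool(categories=["Billing", "Other"], assign=assign_by_keyword)
items = ["Invoice 42 overdue", "Reset my password", "invoice copy please"]
billing_items = filter_by_category(items, tool, "Billing")
print(billing_items)  # ['Invoice 42 overdue', 'invoice copy please']
```

The point of the sketch is the separation of concerns: the categories are discovered once, and the tool then applies the same logic wherever it is attached.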
Related Components
- Data Input – Pulls raw data into the workflow.
- Model Selector – Lets you choose which language model to use.
- Filter – Keeps only items that match certain criteria, such as a specific category.
- Output Formatter – Formats the results for display or export.
Tips and Best Practices
- Choose a model that is trained on data similar to yours for better accuracy.
- Keep Max Chunks moderate; a high value sends more chunks to the model and slows processing.
- Use the Tool output to avoid re‑running the categorization when you need the same logic elsewhere.
- Verify the categories manually on a small sample before scaling up.
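For the manual verification tip, a small reproducible sample is usually enough for a first check. The Python sketch below draws such a sample outside of Nappai; review_sample is a hypothetical helper, and the fixed seed is an assumption that makes the same sample come back on every run so reviewers look at identical items.

```python
import random
from typing import List


def review_sample(items: List[str], sample_size: int, seed: int = 0) -> List[str]:
    """Return a reproducible random subset of items for manual review."""
    rng = random.Random(seed)  # local generator, leaves global random state alone
    k = min(sample_size, len(items))  # never ask for more items than exist
    return rng.sample(items, k)


all_items = [f"ticket {n}" for n in range(100)]
sample = review_sample(all_items, sample_size=5)
for item in sample:
    print(item)  # compare each item against the category it was assigned
```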
Security Considerations
The Categorizer processes data locally within the Nappai environment, so no external API calls are made. However, if your items contain sensitive information, ensure that your workflow complies with your organization’s data‑handling policies and that the output is stored securely.