Data batch Agent
The Data batch Agent is a tool in Nappai that takes a set of data, splits it into smaller groups, and sends each group to a Worker Agent for processing. It’s useful when you have large files or many records and want to handle them efficiently without overloading the system.
How it Works
When you drop data into the component, it first checks the settings you chose (like whether to flatten JSON or output JSON). It then divides the data into chunks based on the Max Concurrency value. Each chunk is handed off to the Worker Agent, which does the actual work (for example, translating text, cleaning records, or calling an external API). Once all chunks finish, the component gathers the results and returns them as a single output called Data.
Because the orchestration happens inside Nappai, the component itself makes no external API calls; data leaves Nappai only if your Worker Agent calls an external service. The component simply coordinates the flow of data.
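The split, process, and combine flow described above can be sketched in Python. This is an illustration only, not Nappai's implementation: the function names `split_into_batches` and `run_batches`, the `worker` callable standing in for the Worker Agent, and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def split_into_batches(records: List[Any], n_batches: int) -> List[List[Any]]:
    """Split records into at most n_batches contiguous, near-equal chunks."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    size, extra = divmod(len(records), n_batches)
    batches: List[List[Any]] = []
    start = 0
    for i in range(n_batches):
        # The first `extra` chunks each take one additional record.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when there are few records
            batches.append(records[start:end])
        start = end
    return batches


def run_batches(
    records: List[Any],
    worker: Callable[[List[Any]], List[Any]],  # hypothetical stand-in for the Worker Agent
    max_concurrency: int = 5,
) -> List[Any]:
    """Process each batch with the worker and combine the results in order."""
    batches = split_into_batches(records, max_concurrency)
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the input batches.
        results = list(pool.map(worker, batches))
    # Flatten the per-batch results into a single combined list.
    return [item for batch in results for item in batch]
```

For instance, calling `run_batches(records, worker, max_concurrency=5)` in this sketch hands the worker at most five chunks and returns one combined list, mirroring the single Data output.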
Inputs
- Worker Agent: The agent that will do the real work on each batch. Think of it as a helper that knows how to process the data you give it.
- Data: The raw data you want to process. It can be a file, a list of records, or any other data format that Nappai supports.
- JSON Flatten: If you check this box, any nested JSON structures in your data will be flattened into a single level, making it easier for the Worker Agent to read.
- JSON Mode: When enabled, the output will be formatted as JSON. This is handy if you plan to feed the results into another component that expects JSON.
- Max Concurrency: The maximum number of batches that can be processed at the same time. A higher number can speed up processing but uses more memory and other resources.
- Output key name: The name of the key that will hold the processed data in the output. By default it’s “Data”, but you can change it if you need to avoid naming conflicts.
- prompt: A text prompt that can be sent to the Worker Agent. Useful if the agent uses a language model or needs a specific instruction for each batch.
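To make the JSON Flatten, JSON Mode, and Output key name options more concrete, here is a minimal Python sketch. It is not Nappai's code: the helper names `flatten_json` and `format_output`, the dot-separated key format, and the way the output is wrapped are assumptions chosen for illustration, and the actual component may name flattened keys differently.

```python
import json
from typing import Any, Dict, Union


def flatten_json(obj: Any, parent_key: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dicts and lists into a single-level dict.

    Assumption: nested keys are joined with `sep`, list items use their index.
    Empty nested dicts and lists produce no keys in this sketch.
    """
    items: Dict[str, Any] = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
            items.update(flatten_json(value, new_key, sep))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            new_key = f"{parent_key}{sep}{index}" if parent_key else str(index)
            items.update(flatten_json(value, new_key, sep))
    else:
        # Scalar value: store it under the accumulated key path.
        items[parent_key] = obj
    return items


def format_output(
    results: Any, output_key: str = "Data", json_mode: bool = False
) -> Union[Dict[str, Any], str]:
    """Wrap results under output_key; return a JSON string when json_mode is on."""
    wrapped = {output_key: results}
    return json.dumps(wrapped) if json_mode else wrapped
```

In this sketch, a record such as `{"user": {"name": "Ana"}}` becomes `{"user.name": "Ana"}`, and changing `output_key` renames the top-level key that holds the combined results.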
Outputs
- Data: The processed data returned from the Worker Agent. It will contain the results of every batch combined into one structure, ready to be used by the next component in your workflow.
Usage Example
- Add the Data batch Agent to your dashboard.
- Connect a Data Source (e.g., a CSV file) to the Data input.
- Select a Worker Agent that knows how to clean or transform the records.
- Set Max Concurrency to 5 if you want up to five batches processed simultaneously.
- (Optional) Check JSON Flatten if your data has nested objects.
- Click Run.
The component will split the file into five batches, send each to the Worker Agent, and then combine the results into a single output called Data.
You can then feed this output into another component, such as a “Save to Database” step, to store the cleaned records.
Related Components
- Worker Agent – The helper that actually processes each batch.
- Data Source – Components that provide raw data (e.g., CSV Reader, API Connector).
- Data Processor – Other components that can take the output of this agent for further transformation or analysis.
Tips and Best Practices
- Choose the right Max Concurrency: Too high and you may run out of memory; too low and processing will take longer.
- Use JSON Flatten when your Worker Agent expects flat structures; it saves you from manual data reshaping.
- Set a clear Output key name if you plan to chain multiple batch agents; this avoids key collisions.
- Test with a small dataset first to confirm the Worker Agent behaves as expected before scaling up.
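The point about key collisions can be illustrated with a short Python sketch. It assumes, purely for illustration, that the outputs of several chained batch agents end up merged into one dictionary; `merge_outputs` is a hypothetical helper, not a Nappai function.

```python
from typing import Any, Dict


def merge_outputs(*outputs: Dict[str, Any]) -> Dict[str, Any]:
    """Merge several output dicts, refusing to silently overwrite a key."""
    merged: Dict[str, Any] = {}
    for output in outputs:
        for key, value in output.items():
            if key in merged:
                # Two agents used the same Output key name.
                raise KeyError(f"duplicate output key: {key!r}")
            merged[key] = value
    return merged
```

With distinct names such as "Cleaned" and "Translated", both results survive the merge. If both agents kept the default "Data", a plain dictionary merge would keep only the last value, which is the collision that a clear Output key name avoids.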
Security Considerations
- All orchestration happens inside Nappai, so no data is sent to external services unless your Worker Agent explicitly does so.
- Keep sensitive data encrypted in transit and at rest; the Data batch Agent does not add extra encryption.
- Review the permissions of the Worker Agent to ensure it only accesses the resources it needs.