
WitAI

WitAI is a component that turns spoken audio into written text. You give it an audio file, it sends the audio to the WitAI service, and it returns the transcribed text. You can also expose it as a tool that an AI agent can call when it needs to understand spoken input.

How it Works

The Binary Component first converts your audio file to a base64 string, which you connect to the Speech to Text input. The WitAI component then sends that string, together with the access token stored in your credential, to the WitAI service and receives the recognized text back. The component passes that text forward as a Data output, and it can also expose the same operation as a Tool that an agent can invoke.
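
The component hides the HTTP call for you, but conceptually the exchange looks like the sketch below. It is a minimal illustration based on the public Wit.ai speech endpoint; the exact URL, API version date, and audio content type are assumptions and may differ from what the component uses internally (for example when a custom Server Connection URL is set in the credential).

```python
import base64

import requests  # assumption: the requests library is available

WIT_SPEECH_URL = "https://api.wit.ai/speech"  # public endpoint; your credential's
                                              # Server Connection URL may point elsewhere


def transcribe(audio_b64: str, access_token: str, api_version: str = "20240101") -> str:
    """Rough equivalent of the request the component makes for you."""
    audio_bytes = base64.b64decode(audio_b64)   # undo the Binary Component's encoding
    resp = requests.post(
        WIT_SPEECH_URL,
        params={"v": api_version},              # Wit.ai expects a YYYYMMDD version tag
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "audio/wav",        # match the format of your recording
        },
        data=audio_bytes,
        timeout=30,
    )
    resp.raise_for_status()
    # Recent Wit.ai API versions stream several JSON objects (partial and final
    # transcriptions); the last "text" value holds the full transcription.
    return resp.text
```

Inside Nappai you never write this code yourself; the component performs the equivalent of `transcribe` when the workflow runs.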

Inputs

Mapping Mode

This component has a special mode called “Mapping Mode”. When you enable this mode using the toggle switch, an additional input called “Mapping Data” is activated, and each input field offers you three different ways to provide data:

  • Fixed: You type the value directly into the field.
  • Mapped: You connect the output of another component to use its result as the value.
  • Javascript: You write Javascript code to dynamically calculate the value.

This flexibility allows you to create more dynamic and connected workflows.

Input Fields

  • Speech to Text: The audio data in base64 format that you want to convert to text. Use the Binary Component to upload your audio file (a short encoding sketch follows this list).
  • Access Token: The access token for your WitAI account. This is supplied through the WitAi credential.
  • Mapping Mode: Toggle this switch to enable batch processing of multiple audio records.
  • Tool Name: The name that will appear when this component is used as a tool by an agent.
  • Tool Description: A detailed description of what this tool does, helping the agent decide when to use it.
  • Tools arguments metadata: Defines the arguments metadata for the tool, allowing the agent to understand the expected input format.
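
For reference, the base64 string the Speech to Text input expects is simply an encoding of the raw audio bytes. The sketch below shows the equivalent of what the Binary Component does for you; the file name is made up.

```python
import base64
from pathlib import Path


def audio_file_to_base64(path: str) -> str:
    """Encode an audio file as the base64 string the Speech to Text input expects.

    This mirrors what the Binary Component does for you inside Nappai.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


# Example (hypothetical file name):
# speech_to_text_input = audio_file_to_base64("meeting_clip.wav")
```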

Credential Setup

  1. In Nappai’s Credentials section, create a new credential of type WitAi.
  2. Enter your Access Token (stored as a password field) and the WitAi Server Connection URL.
  3. In the component, select this credential from the Credential dropdown.
  Note: The credential fields themselves are not shown in the Input Fields list above.

Outputs

  • Data: The transcribed text returned from WitAI, available as a Data object (GetData method).
  • Tool: A Tool object (to_toolkit method) that can be used by an AI agent to invoke the speech‑to‑text operation.

Usage Example

  1. Upload Audio – Drag a Binary Component into the canvas and connect it to the Speech to Text input.
  2. Configure Credential – In the component’s settings, choose the WitAi credential you created earlier.
  3. Enable Mapping Mode – If you want to process several audio files at once, toggle Mapping Mode and connect a list of audio files to the Mapping Data input (a plain-Python equivalent of this batch flow is sketched after these steps).
  4. Run – Execute the workflow. The component will return the transcribed text in the Data output, which you can then feed into a text‑analysis component or expose as a tool for an agent.
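
When Mapping Mode is on, the component effectively repeats the single-file flow for every record supplied through Mapping Data. A rough plain-Python equivalent, reusing the hypothetical `transcribe` and `audio_file_to_base64` sketches from the earlier sections (file names and the token placeholder are made up):

```python
access_token = "YOUR_WITAI_TOKEN"  # placeholder: in Nappai this comes from the credential

# One base64 string per recording, as the Mapping Data input would supply them.
recordings = [audio_file_to_base64(p) for p in ("clip_01.wav", "clip_02.wav")]

transcripts = [transcribe(audio_b64, access_token) for audio_b64 in recordings]
for text in transcripts:
    print(text)  # feed into a text-analysis step, or expose via the Tool output
```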

Related Components

  • Binary Component – Uploads and converts files to base64.
  • Text Analysis – Processes the transcribed text for sentiment, keywords, etc.
  • Agent Toolkit – Allows you to expose the WitAI component as a callable tool for AI agents.

Tips and Best Practices

  • Keep audio clips short (under 30 seconds) to stay within WitAI’s rate limits (a chunking sketch follows this list).
  • Use Mapping Mode when you have a batch of recordings to process in one run.
  • Verify that the WitAI credential is active; otherwise the component will fail to authenticate.
  • If you need to customize the WitAI request (e.g., language or model), consider extending the component or using a custom script.
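
If a recording is longer than the limit above, split it into shorter chunks before uploading and feed each chunk through the Binary Component as usual. A minimal sketch, assuming the pydub library is available (any audio tool would do):

```python
from pydub import AudioSegment  # assumption: pydub is installed (pip install pydub)


def split_audio(path: str, chunk_seconds: int = 25) -> list[str]:
    """Split a long recording into shorter chunks and return the new file paths."""
    audio = AudioSegment.from_file(path)
    step_ms = chunk_seconds * 1000          # pydub measures positions in milliseconds
    paths = []
    for i, start in enumerate(range(0, len(audio), step_ms)):
        out_path = f"{path}.part{i}.wav"
        audio[start:start + step_ms].export(out_path, format="wav")
        paths.append(out_path)
    return paths
```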

Security Considerations

  • Store your WitAI access token in a secure credential; never hard‑code it in the workflow.
  • Ensure that only authorized users have access to the component and its credential.
  • When using Mapping Mode, validate the input data to avoid processing malicious audio files.