OpenAiWhisper

OpenAiWhisper is a Nappai component that converts audio files into written text, which is useful for working with recordings such as meetings, interviews, or voice memos. It relies on OpenAI’s Whisper model for transcription.

Relationship with OpenAI Whisper

This component uses OpenAI’s Whisper API to perform the audio transcription. Whisper is a speech recognition model that copes well with background noise and varied accents. You need an OpenAI account and an API key to use this component.
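
For readers curious about what such a request looks like outside Nappai, here is a minimal sketch of a direct call to OpenAI’s transcription endpoint. It assumes the official `openai` Python package (version 1.0 or later) and the hosted model name `whisper-1`. The `transcribe` helper is a hypothetical name used for illustration, and this is not a description of how Nappai implements the component internally.

```python
from pathlib import Path


def transcribe(client, audio_path, language="en", model="whisper-1"):
    """Send one audio file to the transcription endpoint and return its text.

    `client` is assumed to be an `openai.OpenAI` instance (openai>=1.0).
    `transcribe` itself is a hypothetical helper, not part of Nappai.
    """
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,        # hosted Whisper model name (assumption)
            file=audio_file,    # binary file handle of the recording
            language=language,  # ISO 639-1 code such as "en" or "es"
        )
    return result.text


# Example usage (needs `pip install openai` and a valid API key):
#   from openai import OpenAI
#   client = OpenAI()  # reads OPENAI_API_KEY from the environment
#   print(transcribe(client, "meeting.mp3", language="en"))
```

Passing the client in as an argument keeps the API key out of the helper and makes the function easy to exercise with a stand-in client.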

Inputs

  • Audio File: Upload the audio file you want to transcribe. Supported formats include mp3, mp4, wav, m4a, and mkv.
  • Message: (Optional) If the audio file is already attached to a Nappai message, select that message here.
  • Language: Choose the language spoken in the audio file. The default is English (“en”). Other options include Spanish (“es”), French (“fr”), German (“de”), Italian (“it”), Portuguese (“pt”), Japanese (“ja”), Korean (“ko”), Hindi (“hi”), Arabic (“ar”), Russian (“ru”), and Chinese (“zh”).
  • Credential: Select the OpenAI API key you’ve already configured in Nappai. This allows OpenAiWhisper to access the OpenAI service.
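
If you prepare files outside Nappai, a quick pre-check against the formats and language codes listed above can catch mistakes before you upload. The sketch below is illustrative only: `check_inputs` is a hypothetical helper, and because the lists above are introduced with "include", the component may accept values beyond the ones copied here.

```python
from pathlib import Path

# Values copied from the Inputs list; the real component may accept more.
SUPPORTED_FORMATS = {"mp3", "mp4", "wav", "m4a", "mkv"}
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt",
    "ja", "ko", "hi", "ar", "ru", "zh",
}


def check_inputs(filename, language="en"):
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    # Compare the extension case-insensitively and without the leading dot.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

For example, `check_inputs("meeting.mp3", "en")` returns an empty list, while a `.txt` file or an unlisted language code produces a message describing the problem.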

Outputs

  • Transcription Text: This output contains the transcribed text from your audio file. You can use this text in other parts of your Nappai workflow, such as searching, summarizing, or analyzing the content.
  • LLM AudioFile Encoder: An encoded version of your audio file, optimized for use with large language models (LLMs). This is an advanced output intended for more complex workflows.

Usage Example

Let’s say you have a recording of a client meeting (a .mp3 file). To transcribe it:

  1. Upload the .mp3 file to the “Audio File” input.
  2. Select the correct language from the “Language” dropdown (e.g., “en” for English).
  3. Select your OpenAI API credential.
  4. Run the component.
  5. The “Transcription Text” output will contain the written version of the meeting. You can then use this text in other Nappai components to summarize the meeting, extract key points, or translate it into another language.

Templates

This component is used in the “Audio and Video Transcriber” template.

Related Components

  • File Message (Add files to Message): Use this to add your audio file to a message before using OpenAiWhisper.
  • PGVector (PGVector Vector Store with search capabilities): Store and search the transcription text for efficient information retrieval.
  • Summarizer: Summarize the transcribed text to get a concise overview.
  • Google Drive File Manager: Manage and access audio files stored in Google Drive.
  • Many other components: The transcribed text can be used as input for a wide variety of other Nappai components for further processing and analysis.

Tips and Best Practices

  • Ensure you select the correct language for accurate transcription.
  • For best results, use high-quality audio recordings with minimal background noise.
  • Test your transcription with a short audio clip before processing longer files.
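
One way to produce a short test clip is to copy the first few seconds of a recording into a new file. The sketch below uses only Python’s standard `wave` module, so it works for uncompressed WAV files only; mp3, m4a, and other compressed formats would need a separate audio tool. `make_test_clip` is a hypothetical helper name, and the 30-second default is an arbitrary choice.

```python
import wave


def make_test_clip(source_path, clip_path, seconds=30):
    """Copy the first `seconds` of an uncompressed WAV file to `clip_path`.

    Returns the duration of the clip in seconds.
    """
    with wave.open(source_path, "rb") as source:
        params = source.getparams()
        # Never ask for more frames than the source actually contains.
        frames_wanted = min(
            source.getnframes(), int(seconds * source.getframerate())
        )
        frames = source.readframes(frames_wanted)
    with wave.open(clip_path, "wb") as clip:
        clip.setparams(params)    # same channels, sample width, and rate
        clip.writeframes(frames)  # header frame count is fixed up on write
    return frames_wanted / params.framerate
```

Upload the resulting clip to the “Audio File” input first; if the transcription looks right, process the full recording.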

Security Considerations

  • Protect your OpenAI API key. Do not share it with others. Nappai uses secure methods to store and handle your credentials.
  • Be mindful of the content of your audio files and ensure they comply with OpenAI’s usage policies and any relevant regulations.