Recursive Character Text Splitter

The Recursive Character Text Splitter is a tool that takes a long piece of text and breaks it into smaller, more manageable chunks. It keeps related sentences or paragraphs together so that each chunk still makes sense on its own. This is useful when you want to feed text into other parts of your automation workflow, like AI models or data storage systems, without losing context.

How it Works

The component works entirely inside your dashboard. When you give it a block of text, it follows these steps:

  1. Choose where to cut – It tries a list of characters or patterns (called separators) in order: double line breaks first, then single line breaks, then spaces, or any custom separators you provide.
  2. Create chunks – It splits the text into pieces that are no longer than the Chunk Size you set. A piece that is still too long is split again using the next separator in the list, which is what makes the splitter recursive.
  3. Add overlap – To keep context between chunks, it repeats characters from the end of one chunk at the beginning of the next. The number of overlapping characters is set by Chunk Overlap.
  4. Return the pieces – The result is a list of text blocks that you can pass to other components in your workflow.

No external APIs are called; everything happens locally in the dashboard.
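
The steps above can be sketched in plain Python. This is a simplified illustration, not the component's actual source: the names `recursive_split` and `_merge` are hypothetical, and the overlap policy (carry over whole pieces totalling at most Chunk Overlap characters) is an assumption about how such splitters typically behave.

```python
from typing import List, Optional

DEFAULT_SEPARATORS = ["\n\n", "\n", " ", ""]


def _merge(parts: List[str], sep: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Join small parts into chunks of at most chunk_size, carrying overlap forward."""
    chunks: List[str] = []
    current: List[str] = []
    total = 0  # length of sep.join(current)
    for part in parts:
        extra = len(part) + (len(sep) if current else 0)
        if current and total + extra > chunk_size:
            chunks.append(sep.join(current))
            # Keep a tail of at most chunk_overlap characters that still
            # leaves room for the incoming part.
            while current and (
                total > chunk_overlap or total + len(sep) + len(part) > chunk_size
            ):
                total -= len(current[0]) + (len(sep) if len(current) > 1 else 0)
                current.pop(0)
        current.append(part)
        total += len(part) + (len(sep) if len(current) > 1 else 0)
    if current:
        chunks.append(sep.join(current))
    return chunks


def recursive_split(
    text: str,
    chunk_size: int,
    chunk_overlap: int,
    separators: Optional[List[str]] = None,
) -> List[str]:
    """Split text on the coarsest separator that occurs, recursing on long parts."""
    seps = separators or DEFAULT_SEPARATORS
    # Step 1: use the first separator found in the text ("" always matches).
    sep, rest = seps[-1], []
    for i, candidate in enumerate(seps):
        if candidate == "" or candidate in text:
            sep, rest = candidate, seps[i + 1:]
            break
    parts = list(text) if sep == "" else [p for p in text.split(sep) if p]

    chunks: List[str] = []
    small: List[str] = []
    for part in parts:
        if len(part) <= chunk_size:
            small.append(part)
            continue
        # Steps 2 and 3: flush the small parts collected so far, with overlap.
        if small:
            chunks.extend(_merge(small, sep, chunk_size, chunk_overlap))
            small = []
        if rest:
            # Still too long: retry with the finer separators.
            chunks.extend(recursive_split(part, chunk_size, chunk_overlap, rest))
        else:
            # No separator left; the part is kept as-is and may exceed chunk_size.
            chunks.append(part)
    if small:
        chunks.extend(_merge(small, sep, chunk_size, chunk_overlap))
    return chunks  # Step 4: the list of text blocks


print(recursive_split("aaaa bbbb cccc dddd", chunk_size=10, chunk_overlap=5))
# ['aaaa bbbb', 'bbbb cccc', 'cccc dddd']
```

In the printed example each chunk repeats the last word of the previous one, because a single word (4 characters) fits within the 5-character overlap while two words do not.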

Inputs

  • Input: The texts you want to split.
    You can provide plain text, documents, or any data that contains text.

  • Chunk Overlap: The number of characters shared between consecutive chunks.
    This helps keep context when the text is split.

  • Chunk Size: The maximum length of each chunk, in characters.
    Set this to control how big each piece of text will be.

  • Separators: The characters to split on.
    If left empty, the component uses the defaults ["\n\n", "\n", " ", ""].
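
As a rough illustration of how these inputs relate to each other, the sketch below models them as a small settings object. The class `SplitterSettings` and its methods are hypothetical, the default values are borrowed from the usage example in this document, and the rule that Chunk Overlap must be smaller than Chunk Size is an assumption rather than documented behavior of the component.

```python
from dataclasses import dataclass, field
from typing import List

DEFAULT_SEPARATORS = ["\n\n", "\n", " ", ""]


@dataclass
class SplitterSettings:
    """Hypothetical container for the component's three tuning inputs."""

    chunk_size: int = 1000
    chunk_overlap: int = 200
    separators: List[str] = field(default_factory=list)

    def resolved_separators(self) -> List[str]:
        # An empty Separators input falls back to the documented defaults.
        return self.separators or list(DEFAULT_SEPARATORS)

    def validate(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("Chunk Size must be positive")
        if self.chunk_overlap < 0:
            raise ValueError("Chunk Overlap cannot be negative")
        # Assumption: an overlap as large as the chunk itself would leave
        # no room for new text in each chunk.
        if self.chunk_overlap >= self.chunk_size:
            raise ValueError("Chunk Overlap must be smaller than Chunk Size")


settings = SplitterSettings()
settings.validate()
print(settings.resolved_separators())
```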

Outputs

  • Data: The split text is returned as a Data object (method: split_data).
    You can feed this output into other components, such as AI models or storage modules.

Usage Example

  1. Add the component to your workflow.
  2. Connect your text source (e.g., a document upload or a previous component that outputs text) to the Input field.
  3. Set Chunk Size to 1000 characters and Chunk Overlap to 200 characters.
  4. Leave Separators empty to use the default split points.
  5. Run the workflow – the component will output a list of text chunks that you can then send to an AI model for summarization or to a database for storage.
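
To get a feel for how many chunks these settings produce, the arithmetic can be sketched as follows. With a Chunk Size of 1000 and a Chunk Overlap of 200, each chunk after the first contributes at most about 800 new characters. The helper `estimate_chunk_count` is hypothetical and assumes fixed-width windows; the real splitter cuts at separators, so actual counts will differ somewhat.

```python
import math


def estimate_chunk_count(text_length: int, chunk_size: int = 1000, chunk_overlap: int = 200) -> int:
    """Rough chunk count, assuming every chunk is full and overlaps fully."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    if text_length <= 0:
        return 0
    if text_length <= chunk_size:
        return 1
    stride = chunk_size - chunk_overlap  # new characters per additional chunk
    return 1 + math.ceil((text_length - chunk_size) / stride)


print(estimate_chunk_count(10_000))  # 13
```

A 10,000-character document therefore yields on the order of 13 chunks with these settings, which is a useful sanity check before sending the output to a model that is billed or rate-limited per request.
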

Related Components

  • Text Splitter – A generic splitter that uses a fixed number of characters.
  • Document Loader – Loads documents from files or URLs before they can be split.
  • AI Text Summarizer – Takes the split chunks and produces concise summaries.

Tips and Best Practices

  • Keep context: Use a larger Chunk Overlap if your text contains long sentences that span multiple chunks.
  • Avoid chunks that are too small: Setting Chunk Size too low can leave each piece without enough surrounding context to be understood on its own.
  • Custom separators: If your text uses a unique delimiter (e.g., a special marker), add it to the Separators list to ensure clean splits.
  • Test with sample data: Run the component on a small sample first to verify that the chunks look correct before processing large volumes.
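
When testing on sample data, a quick summary of the resulting chunks makes problems easier to spot. The sketch below assumes the chunks have already been extracted from the component's output as a list of plain strings; the helper `summarize_chunks` is hypothetical and not part of the component.

```python
from typing import Dict, List, Union


def summarize_chunks(chunks: List[str], chunk_size: int) -> Dict[str, Union[int, List[int]]]:
    """Report basic length statistics for a list of text chunks."""
    lengths = [len(chunk) for chunk in chunks]
    return {
        "count": len(chunks),
        "shortest": min(lengths, default=0),
        "longest": max(lengths, default=0),
        # Indexes of chunks longer than the configured Chunk Size, which
        # may indicate that the separators left no place to cut.
        "oversized": [i for i, n in enumerate(lengths) if n > chunk_size],
    }


sample = ["aaaa bbbb", "bbbb cccc", "cccc dddd x"]
print(summarize_chunks(sample, chunk_size=10))
# {'count': 3, 'shortest': 9, 'longest': 11, 'oversized': [2]}
```

Many very short chunks suggest that Chunk Size is too low for the text, while oversized entries suggest that a custom separator is missing from the Separators list.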

Security Considerations

  • The component processes text locally; no data leaves your dashboard.
  • Ensure that any sensitive information in the text is handled according to your organization’s data‑privacy policies before feeding it into the splitter.