Documents ⇢ Data

Documents ⇢ Data is a simple tool that takes the documents you have loaded into your workflow and turns them into a format that other Nappai components can work with. Think of it as a translator that changes the “document” language into the “data” language that the rest of the system understands.

How it Works

When you drop a list of LangChain Document objects into this component, it runs a quick conversion routine:

  1. It checks if you accidentally passed a single Document instead of a list and wraps it in a list if needed.
  2. For each document, it calls Data.from_document(document) – a built‑in helper that extracts the text, metadata, and any other useful fields.
  3. The result is a list of Data objects that you can hand off to other components (like data processors, AI models, or storage modules).

No external services are called; everything happens locally inside the dashboard.
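
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the component's actual source: the Document and Data classes below are simplified stand-ins defined here so the example runs on its own, and the field names (page_content, metadata, text, data) and the documents_to_data function name are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class Document:
    """Stand-in for a LangChain Document (assumed fields)."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Data:
    """Stand-in for a Nappai Data object; the real class may differ."""
    text: str
    data: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_document(cls, document: Document) -> "Data":
        # Copy the metadata so later changes to the Data object
        # do not modify the original Document.
        return cls(text=document.page_content, data=dict(document.metadata))


def documents_to_data(documents: Union[Document, List[Document]]) -> List[Data]:
    # Step 1: wrap a single Document in a list.
    if isinstance(documents, Document):
        documents = [documents]
    # Steps 2 and 3: convert each document and return the list.
    return [Data.from_document(doc) for doc in documents]


if __name__ == "__main__":
    result = documents_to_data(Document("Quarterly report", {"author": "Ana"}))
    print(result)
```

Passing a single Document or a list of Documents both yield a list, which is why downstream components can always expect a list of Data objects.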

Inputs

  • Documents: The list of documents you want to convert.
    Visible in: All

Outputs

The component returns a list of Data objects. Each Data item contains the document’s content and metadata in a standardized format. You can feed this list into any component that expects Data input, such as:

  • Data analysis tools
  • AI model prompts
  • Storage or export modules

Usage Example

  1. Load Documents – Use a “Document Loader” component to read PDFs, Word files, or web pages into a list of Document objects.
  2. Convert – Connect the loader’s output to the Documents ⇢ Data component.
    The component produces a list of Data objects, one for each input document.
  3. Process – Feed the Data list into a “Data Processor” or an AI model component to extract insights, summarize, or store the information.

This simple three‑step flow lets you move from raw documents to actionable data without writing any code.

Related Components

  • Data ⇢ Documents – Converts Data objects back into Document format for editing or re‑exporting.
  • Document Loader – Reads files or URLs and outputs a list of Document objects.
  • Data Processor – Performs transformations, filtering, or analysis on Data lists.

Tips and Best Practices

  • Check Document Quality – Make sure the documents are properly parsed; corrupted files may produce empty Data objects.
  • Batch Size – If you have thousands of documents, consider splitting them into smaller batches to keep the dashboard responsive.
  • Metadata Preservation – The conversion keeps all metadata (for example author, date, and tags). Use it later for sorting or filtering.
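
If you post-process the converted list in custom code, the three tips above might look like the sketch below. It is only an illustration under stated assumptions: the Data class is a simplified stand-in (the real Nappai class may expose different fields), and drop_empty, in_batches, and sort_by_metadata are hypothetical helper names, not part of Nappai.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List


@dataclass
class Data:
    """Hypothetical stand-in for a Nappai Data object."""
    text: str
    data: Dict[str, Any] = field(default_factory=dict)


def drop_empty(items: List[Data]) -> List[Data]:
    # Keep only items whose text has non-whitespace content,
    # e.g. to discard results from corrupted files.
    return [item for item in items if item.text.strip()]


def in_batches(items: List[Data], size: int) -> Iterator[List[Data]]:
    # Yield consecutive slices of at most `size` items.
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sort_by_metadata(items: List[Data], key: str) -> List[Data]:
    # Sort by the string form of a metadata value;
    # items that lack the key are placed last.
    return sorted(
        items,
        key=lambda item: (key not in item.data, str(item.data.get(key, ""))),
    )


if __name__ == "__main__":
    converted = [
        Data("Report A", {"date": "2024-03-01"}),
        Data("   "),
        Data("Report B", {"date": "2023-01-15"}),
    ]
    usable = drop_empty(converted)
    for batch in in_batches(sort_by_metadata(usable, "date"), 2):
        print([item.text for item in batch])
```

Sorting on the string form keeps the helper simple; it orders ISO-style dates chronologically but would not order numbers numerically.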

Security Considerations

  • The component works entirely on data you provide; it does not send any information outside your Nappai instance.
  • Be cautious when loading documents from untrusted sources, as they may contain malicious content that could affect downstream components.