
Structured Output

Structured Output is a component that turns the free‑form text produced by a language model into clean, well‑defined data. By giving the model a clear schema and a set of formatting instructions, you can automatically convert emails, reports, or any unstructured text into JSON objects that can be used directly in your workflows.

How it Works

  1. Model Selection – Choose a language model that supports structured output (e.g., OpenAI GPT‑4, Anthropic Claude).
  2. Prompt Construction – The component builds a system prompt that tells the model to return data in a JSON format that matches the user‑defined schema.
  3. Schema Building – The JSON schema you provide is turned into a Pydantic model. If you tick “Generate Multiple,” the model is wrapped to return a list of those objects.
  4. LLM Call – The model is called with the input message and the system prompt.
  5. Result Parsing – The response is parsed into the Pydantic model, then wrapped in a Data object and sent out as llm_structured_output.

The whole process happens inside Nappai, so you don’t need to write any code—just fill in the fields.
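
For readers curious about what steps 2 to 5 amount to, the following is a minimal standard-library sketch, not Nappai's actual implementation. It assumes the simple type names used on this page ("string", "integer", "float", "date") and swaps Pydantic for a hand-written coercion function so the example has no dependencies. The names `CONVERTERS`, `build_system_prompt`, `coerce`, and `parse_response` are hypothetical, and the model reply is a hard-coded stand-in for a real LLM call.

```python
import json
from datetime import date

# Hypothetical mapping from the type names used in this page's schema examples
# to Python converters. The real component builds a Pydantic model instead.
CONVERTERS = {
    "string": str,
    "integer": int,
    "float": float,
    "date": date.fromisoformat,
}


def build_system_prompt(format_instructions: str, schema: dict) -> str:
    """Step 2: combine the format instructions with the expected schema."""
    schema_text = json.dumps(schema, indent=2)
    return f"{format_instructions}\n\nExpected JSON structure:\n{schema_text}"


def coerce(value, spec):
    """Step 5: recursively coerce a parsed JSON value to the shape in `spec`."""
    if value is None:
        return None  # missing values stay null
    if isinstance(spec, dict):
        return {key: coerce(value.get(key), sub) for key, sub in spec.items()}
    if isinstance(spec, list):
        return [coerce(item, spec[0]) for item in value]
    return CONVERTERS[spec](value)


def parse_response(raw: str, schema: dict, generate_multiple: bool = False):
    """Parse the model's reply; with generate_multiple, expect a list of objects."""
    parsed = json.loads(raw)
    if generate_multiple:
        return [coerce(item, schema) for item in parsed]
    return coerce(parsed, schema)


schema = {"order_id": "string", "quantity": "integer", "price": "float"}
prompt = build_system_prompt("Return a JSON object with the keys below.", schema)

# Stand-in for the LLM call (step 4); a real reply would come from the selected model.
fake_reply = '{"order_id": "A-1001", "quantity": "3", "price": 19.5}'
result = parse_response(fake_reply, schema)
print(result)
```

In this sketch the string "3" is coerced to the integer 3 because the schema declares `quantity` as an integer. Pydantic validation performs a comparable coercion and additionally reports detailed errors when a value cannot be converted.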

Inputs

  • Model: The language model to use to generate the structured output.
  • Input Message: The input message to the language model.
  • Format Instructions: The instructions to the language model for formatting the output.
    Example default value:
    You are an AI system designed to extract structured information from unstructured text. Given the input_text, return a JSON object with predefined keys based on the expected structure. Extract values accurately and format them according to the specified type (e.g., string, integer, float, date). If a value is missing or cannot be determined, return a default (e.g., null, 0, or 'N/A'). If multiple instances of the expected structure exist within the input_text, stream each as a separate JSON object.
  • Schema Name: Provide a name for the output data schema.
  • Output Schema: Define the structure and data types for the model’s output.
  • Generate Multiple: Set to True if the model should generate a list of outputs instead of a single output.

Outputs

  • llm_structured_output: The structured data returned by the language model, wrapped in a Data object. It can be any JSON‑compatible structure defined by your schema.

Usage Example

You want to pull order details from a customer email.

  1. Model – Select OpenAI GPT‑4.
  2. Input Message – Paste the email text.
  3. Format Instructions – Leave the default value.
  4. Schema Name – Order.
  5. Output Schema
    {
      "order_id": "string",
      "customer_name": "string",
      "items": [
        {
          "product_id": "string",
          "quantity": "integer",
          "price": "float"
        }
      ],
      "total": "float",
      "order_date": "date"
    }
  6. Generate Multiple – Leave unchecked (single order per email).
  7. Run the component.
  8. The output will be a JSON object with the order details, ready to feed into downstream components like a database writer or a notification system.
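
To make the last step concrete, here is a hedged sketch of what such an output object might look like and how a downstream step could sanity-check it before writing to a database. The order values are invented for illustration, and `EXPECTED_KEYS` and `check_order` are hypothetical helpers, not part of Nappai.

```python
import json
import math

# Hypothetical output for an invented email; the values are illustrative only.
order_json = """
{
  "order_id": "ORD-1042",
  "customer_name": "Jane Doe",
  "items": [
    {"product_id": "SKU-1", "quantity": 2, "price": 9.99},
    {"product_id": "SKU-7", "quantity": 1, "price": 24.50}
  ],
  "total": 44.48,
  "order_date": "2024-03-15"
}
"""

EXPECTED_KEYS = {"order_id", "customer_name", "items", "total", "order_date"}


def check_order(order: dict) -> list[str]:
    """Return a list of problems found; an empty list means the order looks consistent."""
    problems = []
    missing = EXPECTED_KEYS - order.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    # Compare the stated total with the sum of the line items, allowing for rounding.
    line_total = sum(i["quantity"] * i["price"] for i in order.get("items", []))
    total = order.get("total") or 0.0
    if not math.isclose(line_total, total, abs_tol=0.005):
        problems.append(f"total {total} != sum of items {line_total:.2f}")
    return problems


order = json.loads(order_json)
print(check_order(order))
```

A check like this catches a common failure mode of extraction, where the model copies a total from the email that does not match the items it extracted.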

Related Components

  • LLM Output – For simple text responses from a language model.
  • Data Parser – To transform raw JSON into Nappai data structures.
  • Database Writer – To store the structured data in a database.

Tips and Best Practices

  • Keep the schema simple – Start with a few fields, then add more as you test.
  • Validate the schema – Use the “Output Schema” editor to catch syntax errors before running.
  • Use “Generate Multiple” when you expect several records in one message (e.g., a list of invoices).
  • Test with sample data – Run the component on a few example messages to ensure the output matches your expectations.
  • Avoid sensitive data – If the input contains personal information, consider anonymizing it before sending it to the LLM.
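
As one possible way to act on the last tip, the sketch below masks email addresses and phone-like numbers before the text is passed to the component. It is a rough illustration under simple assumptions: the regular expressions and the `anonymize` helper are hypothetical, cover only two kinds of personal data, and are no substitute for a dedicated PII detection tool.

```python
import re

# Simple patterns for illustration; real-world PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches runs of nine or more digit-like characters, with an optional leading "+".
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def anonymize(text: str) -> str:
    """Replace email addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    return PHONE_RE.sub("[PHONE]", text)


sample = "Contact jane.doe@example.com or +1 (555) 010-9999 about order A-1001."
print(anonymize(sample))
# prints: Contact [EMAIL] or [PHONE] about order A-1001.
```

Note that the phone pattern is deliberately crude and will also mask other long digit runs, such as ISO dates like 2024-03-15. If your schema extracts dates or numeric identifiers, tighten the pattern or mask only the fields you do not need.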

Security Considerations

  • Data Privacy – The text you send to the language model may leave your local environment, depending on the provider’s policies.
  • Compliance – Ensure that sending customer or financial data to an external LLM complies with your organization’s data‑handling regulations.
  • Model Access – Restrict which users can configure the component to prevent accidental exposure of sensitive prompts or schemas.