
Entities Extraction

This component uses an AI language model to extract specific pieces of information from your data. You tell it which pieces of information to look for, and it returns what it finds. This is useful when you need a few specific details from a large amount of text or data.

Relationship with Language Models

This component uses a language model (like those from OpenAI) to understand and analyze your data. The language model is what allows the component to intelligently identify and extract the information you’re looking for. You’ll need to select a language model to use within the component’s settings.

Inputs

  • Data: This is the data you want to extract information from. It could be text, a document, or any other data type supported by Nappai. This input is required.
  • Extract keys: This is a list of the specific pieces of information you want to extract. For example, if you want to extract a name and an address, you would list “name” and “address” here. This input is required. Give each item a clear, unambiguous name so the language model knows exactly what to look for.
  • Additional Context: (Optional) You can provide extra information here to help the AI understand your data and extract the information more accurately. For example, you might specify the format of your data or provide additional instructions. If left blank, a default value will be used.
  • Max chunks: (Optional) This limits how many pieces (chunks) the component will break your data into for processing. The default is 5. Larger inputs might require a higher number.
  • Language Model: This is the specific AI model used for the extraction. You’ll need to select one from the available options in Nappai. This input is required.
  • chunk size: (Advanced, Optional) This setting controls the maximum size of each chunk of data sent for processing. The default is 1500 characters.
  • chunk overlap: (Advanced, Optional) This setting controls how many characters are shared between consecutive chunks. The default is 150 characters.
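
To make the three chunking settings more concrete, here is a minimal Python sketch of one common way to split text into overlapping character windows. It is not Nappai's actual implementation: the function name split_into_chunks is hypothetical, and it assumes that any text beyond the “Max chunks” limit is simply left out.

```python
def split_into_chunks(text, chunk_size=1500, chunk_overlap=150, max_chunks=5):
    """Split text into overlapping character windows (illustrative only).

    Assumption: once max_chunks is reached, the remaining text is dropped.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    # Each chunk starts this many characters after the previous one.
    step = chunk_size - chunk_overlap

    chunks = []
    start = 0
    while start < len(text) and len(chunks) < max_chunks:
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reached the end of the text
        start += step
    return chunks


sample = "x" * 4000
print([len(chunk) for chunk in split_into_chunks(sample)])
# → [1500, 1500, 1300]
```

Under these assumptions, the default settings cover at most 1500 + 4 × 1350 = 6900 characters, which is why larger inputs might need a higher “Max chunks” value.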

Outputs

  • Extracted Data: This output contains the information that was successfully extracted from your data, based on the “Extract keys” you provided. This is the main result of the component.
  • Tool: This output provides a reusable tool that encapsulates the extraction process. This is useful for more advanced users who want to integrate this functionality into their own custom workflows.

Usage Example

Let’s say you have a document containing customer information, and you want to extract the customer’s name and email address.

  1. In the “Data” input, you would provide the document.
  2. In the “Extract keys” input, you would enter ["name", "email"].
  3. The “Extracted Data” output would then contain a list of names and email addresses found in the document.
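
The same idea can be sketched outside Nappai in a few lines of Python. This is only an illustration of the general pattern (build a prompt from the keys, ask a model, keep the requested fields). The prompt wording, the helper names build_prompt and parse_response, and the fake_model stand-in are all assumptions made for this example, not part of the component.

```python
import json


def build_prompt(data, extract_keys, additional_context=""):
    """Assemble a plain-text extraction prompt (hypothetical wording)."""
    lines = [
        "Extract the following fields from the text and answer with a JSON object.",
        "Fields: " + ", ".join(extract_keys),
    ]
    if additional_context:
        lines.append("Context: " + additional_context)
    lines.append("Text:")
    lines.append(data)
    return "\n".join(lines)


def parse_response(raw, extract_keys):
    """Keep only the requested keys; keys the model omitted become None."""
    parsed = json.loads(raw)
    return {key: parsed.get(key) for key in extract_keys}


def fake_model(prompt):
    # Stand-in for a real language model call; it returns a canned answer.
    return '{"name": "Ada Lovelace", "email": "ada@example.com", "extra": "ignored"}'


document = "Customer: Ada Lovelace. Contact: ada@example.com."
keys = ["name", "email"]
result = parse_response(fake_model(build_prompt(document, keys)), keys)
print(result)
# → {'name': 'Ada Lovelace', 'email': 'ada@example.com'}
```

Filtering the response down to the requested keys keeps the output predictable even if the model returns extra fields.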

Templates

This component can be used in any Nappai workflow where you need to extract specific information from data. It’s particularly useful in workflows involving data cleaning, data analysis, and report generation. It combines well with the following components:

  • Filter Data: Use the Filter Data component to further refine the extracted data based on specific criteria.
  • Parse JSON: If your data is in JSON format, use the Parse JSON component to parse it before passing it to Entities Extraction.
  • Summarizer: Use the Summarizer component to summarize the extracted data.

Tips and Best Practices

  • Be as specific as possible when defining your “Extract keys” to get the most accurate results.
  • Use the “Additional Context” input to provide any extra information that might help the AI understand your data.
  • Experiment with different language models to find the one that works best for your data.

Security Considerations

Ensure that the data you input does not contain sensitive information that should not be processed by external AI models. Review the security and privacy policies of the language model you select.
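
If some sensitive values are not needed for the extraction, one option is to mask them before the data reaches the component. The Python sketch below masks email addresses with a simple regular expression. The pattern and the redact_emails helper are assumptions made for illustration, the pattern will not catch every possible address format, and this is not a Nappai feature.

```python
import re

# Simplified email pattern; it is an approximation, not a full validator.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails(text, placeholder="[EMAIL]"):
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub(placeholder, text)


print(redact_emails("Contact ada@example.com or bob@test.org."))
# → Contact [EMAIL] or [EMAIL].
```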