Wikipedia Loader

The Wikipedia Loader lets you pull information straight from Wikipedia into your Nappai workflow. Type a topic, pick a language, and decide how many pages you want; the component returns each page's text along with basic metadata such as its title and URL.

How it Works

When you run the component, it talks to Wikipedia’s public API. It sends the topic you entered, asks for results in the language you chose, and limits the number of pages based on the “Max Documents” setting. The API returns the page content and some metadata (like the title and URL). The component then packages each page into a data object, and the resulting list can be used by other parts of your workflow.
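As a rough illustration of the request described above, the sketch below builds a search URL for the public MediaWiki Action API. It assumes the loader issues a `list=search` query, which this page does not confirm. The helper name `build_search_url` is hypothetical, and the component may use a different endpoint or a wrapper library internally.

```python
from urllib.parse import urlencode


def build_search_url(query: str, language: str, max_documents: int = 1) -> str:
    """Build a MediaWiki search URL (illustrative; not the component's actual code)."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,         # the Query input
        "srlimit": max_documents,  # the Max Documents input
        "format": "json",
    }
    # The language code selects the Wikipedia edition via the subdomain.
    return f"https://{language}.wikipedia.org/w/api.php?{urlencode(params)}"


url = build_search_url("Artificial Intelligence", language="en", max_documents=1)
print(url)
```

The sketch only constructs the URL; sending the request and parsing the JSON response are left out so that it runs without network access.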

Inputs

  • Language: Choose the language of the Wikipedia page. Options are Spanish (es), English (en), or French (fr).
  • Max Documents: Set how many Wikipedia pages you want to download. The default is 1, but you can increase it if you want more results.
  • Query: Type the topic or search phrase you want to look up on Wikipedia.
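The three inputs above can be modeled as a small validated structure. This is a minimal sketch, assuming the constraints implied by the descriptions (a non-empty query, one of the three listed language codes, at least one document). The `LoaderInputs` class is hypothetical and is not part of Nappai.

```python
from dataclasses import dataclass

# Language codes listed in the Inputs section.
SUPPORTED_LANGUAGES = {"es", "en", "fr"}


@dataclass
class LoaderInputs:
    query: str
    language: str
    max_documents: int = 1  # documented default

    def validate(self) -> None:
        """Raise ValueError if any input is outside the assumed limits."""
        if not self.query.strip():
            raise ValueError("Query must not be empty.")
        if self.language not in SUPPORTED_LANGUAGES:
            raise ValueError(f"Unsupported language: {self.language!r}")
        if self.max_documents < 1:
            raise ValueError("Max Documents must be at least 1.")


inputs = LoaderInputs(query="Artificial Intelligence", language="en")
inputs.validate()  # passes silently for valid inputs
```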

Outputs

  • Data: The component outputs a list of data objects. Each object contains the page’s text (text) and metadata such as the page title and URL. You can feed this output into other components like a text splitter, summarizer, or LLM for further processing.
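To make the output shape concrete, here is a minimal sketch of what such a list might look like in Python. The `Data` class, the `to_data` helper, and the metadata keys `title` and `url` are assumptions for illustration; the real object may use different field names. The sample page is placeholder content, not actual Wikipedia text.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Data:
    text: str                                               # page text
    metadata: dict[str, Any] = field(default_factory=dict)  # e.g. title, URL


def to_data(pages: list[dict[str, str]]) -> list[Data]:
    """Wrap raw page dictionaries in Data objects (hypothetical helper)."""
    return [
        Data(text=page["content"], metadata={"title": page["title"], "url": page["url"]})
        for page in pages
    ]


# Placeholder input standing in for a parsed API response.
sample_pages = [
    {
        "title": "Example page",
        "url": "https://en.wikipedia.org/wiki/Example",
        "content": "Placeholder article text.",
    }
]

docs = to_data(sample_pages)
for doc in docs:
    print(doc.metadata["title"], len(doc.text))
```

A downstream component would typically iterate over the list and read `text` from each object, using the metadata to attribute the source.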

Usage Example

  1. Drag the Wikipedia Loader into your workflow.
  2. In the Query field, type Artificial Intelligence.
  3. Set Language to en for English.
  4. Leave Max Documents at the default value of 1.
  5. Connect the Data output to a Text Splitter component to break the article into smaller chunks, then feed those chunks into an LLM to generate a summary.
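The splitting in step 5 can be sketched as a fixed-size character splitter with overlap. This is only one simple strategy and is an assumption here; the Text Splitter component may split on sentences, tokens, or separators instead. The function name and default sizes are hypothetical.

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that context is not
    lost at chunk boundaries.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("Require chunk_size > 0 and 0 <= overlap < chunk_size.")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# Placeholder text standing in for a loaded article.
article_text = "word " * 300
chunks = split_text(article_text, chunk_size=500, overlap=50)
print(len(chunks), "chunks")
```

Each chunk can then be passed to the LLM separately, and the partial summaries combined into a final one.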

Related Components

  • Text Splitter – Breaks long text into smaller pieces for easier processing.
  • OpenAI LLM – Uses the text from Wikipedia to answer questions or generate content.
  • Document Search – Lets you search through the loaded Wikipedia data for specific keywords.

Tips and Best Practices

  • Use specific queries (e.g., “Quantum computing” instead of just “computing”) to get more relevant results.
  • Keep Max Documents low if you only need a single, focused article; higher values can slow down the workflow.
  • Combine the loader with a summarizer to quickly get concise overviews of complex topics.

Security Considerations

Wikipedia is a public resource, so the data retrieved is not sensitive. However, be mindful of how downstream components store or expose the text. If you plan to share the workflow, consider adding a disclaimer that the information comes from Wikipedia and may need verification.