RSS Feeds

The RSS Feeds component lets you pull the latest news articles from one or more RSS feed URLs or an OPML file. It turns each article into a document that can be used by other parts of your Nappai workflow, such as summarization, sentiment analysis, or storage.

How it Works

When you provide a list of RSS feed URLs or an OPML file, the component uses LangChain’s RSSFeedLoader to read each feed. It downloads the HTML content of every article, extracts the main text, and packages it into a Data object that includes the article’s title, link, publication date, and the raw HTML. The result is a list of these Data objects that you can pass to other components.
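The flow described above can be sketched roughly as follows. This is a minimal illustration, not Nappai's actual implementation: the helper names load_feed_documents and to_data are hypothetical, the plain dict stands in for the Data object (whose real class is not shown here), and the metadata keys title, link, and publish_date are assumed to be what the loader returns. The loader call itself needs the langchain-community package and network access, so the demonstration at the bottom uses a stand-in document instead.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Optional, Sequence


def load_feed_documents(urls: Optional[Sequence[str]] = None,
                        opml: Optional[str] = None) -> List[Any]:
    """Load articles with LangChain's RSSFeedLoader (requires network access)."""
    # Imported lazily so the rest of this sketch runs without LangChain installed.
    from langchain_community.document_loaders import RSSFeedLoader

    # The loader takes either feed URLs or an OPML string, not both.
    if opml:
        loader = RSSFeedLoader(opml=opml)
    else:
        loader = RSSFeedLoader(urls=list(urls or []))
    return loader.load()


def to_data(doc: Any) -> Dict[str, Any]:
    """Flatten a LangChain-style Document into a plain dict (assumed Data shape)."""
    meta = dict(getattr(doc, "metadata", {}) or {})
    return {
        "text": doc.page_content,
        "title": meta.get("title"),                # assumed metadata key
        "link": meta.get("link"),                  # assumed metadata key
        "publish_date": meta.get("publish_date"),  # assumed metadata key
        "metadata": meta,
    }


# Offline demonstration with a stand-in document (no network needed).
sample = SimpleNamespace(
    page_content="Example article body.",
    metadata={
        "title": "Example",
        "link": "https://example.com/a",
        "publish_date": "2024-01-01",
    },
)
record = to_data(sample)
print(record["title"], record["link"])
```

With network access and the dependencies installed, the same conversion could be applied to real results, for example [to_data(d) for d in load_feed_documents(urls=["https://news.google.com/rss"])].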

Inputs

Input Fields

  • RSS Feed URLs: Enter one or more RSS feed URLs, one per line. The component will fetch articles from each of these feeds.
  • OPML data (XML format): Paste an OPML file (XML) that contains a list of RSS feeds. If this field is filled, it takes priority over the RSS Feed URLs field.
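
The priority rule between the two input fields can be sketched as below. This is an illustration using only the Python standard library, not the component's own code: resolve_feed_urls and feeds_from_opml are hypothetical helper names, and the sketch assumes feeds are listed in the OPML as outline elements with an xmlUrl attribute, which is the usual convention.

```python
import xml.etree.ElementTree as ET
from typing import List


def feeds_from_opml(opml_text: str) -> List[str]:
    """Collect the xmlUrl attribute of every <outline> element in an OPML document."""
    root = ET.fromstring(opml_text)
    return [
        outline.attrib["xmlUrl"]
        for outline in root.iter("outline")
        if outline.attrib.get("xmlUrl")
    ]


def resolve_feed_urls(urls_text: str, opml_text: str = "") -> List[str]:
    """Return the feeds to fetch; OPML takes priority when it is provided."""
    if opml_text.strip():
        return feeds_from_opml(opml_text)
    # One URL per line; skip blank lines and trim surrounding whitespace.
    return [line.strip() for line in urls_text.splitlines() if line.strip()]


# A small OPML sample with one nested feed entry.
OPML = """<opml version="2.0">
  <body>
    <outline text="News">
      <outline type="rss" text="Example" xmlUrl="https://example.com/feed.xml"/>
    </outline>
  </body>
</opml>"""

print(resolve_feed_urls("https://a.example/rss\nhttps://b.example/rss"))
print(resolve_feed_urls("https://a.example/rss", OPML))
```

In the second call the URL list is ignored because OPML data is present, mirroring the behavior described for the input fields.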

Outputs

  • Data: A list of Data objects, each representing a news article. Each object contains the article’s text, title, link, publication date, and other metadata. You can feed this output into components that perform text analysis, storage, or further processing.

Usage Example

  1. Add the RSS Feeds component to your workflow.
  2. Enter the RSS feed URLs you want to monitor, such as:
    https://news.google.com/rss
    https://rss.cnn.com/rss/edition.rss
  3. Connect the “Data” output to a component that will process the articles, for example a summarizer or a database writer.
  4. Run the workflow. The component will fetch the latest articles from the feeds and pass them downstream.

Related Components

  • Text Summarizer – Condense the fetched articles into short summaries.
  • Sentiment Analyzer – Detect the sentiment of each article.
  • Database Writer – Store the articles in a database for later retrieval.

Tips and Best Practices

  • Use OPML when you have many feeds: An OPML file keeps your input tidy and makes it easier to update the list of feeds.
  • Limit the number of URLs: Fetching too many feeds at once can slow down your workflow. Consider batching or scheduling.
  • Check feed reliability: Some RSS feeds may be slow or return errors. Add error handling or retries if needed.
  • Sanitize content: If you plan to display the articles on a public site, strip out any unwanted scripts or HTML tags.
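
If you drive feed fetching from your own code, batching and retries along the lines of the tips above might look like the following sketch. The names batched, with_retries, and flaky_fetch are hypothetical, the fetch function is a stand-in for whatever actually downloads a feed, and the retry counts and delays are arbitrary example values rather than component settings.

```python
import time
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def with_retries(fetch: Callable[[str], T], url: str,
                 attempts: int = 3, base_delay: float = 1.0) -> T:
    """Call fetch(url), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Stand-in fetcher that fails twice before succeeding.
calls = {"n": 0}


def flaky_fetch(url: str) -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return f"fetched {url}"


result = with_retries(flaky_fetch, "https://example.com/feed.xml", base_delay=0.0)
print(result, "after", calls["n"], "calls")

for batch in batched(["feed-a", "feed-b", "feed-c"], size=2):
    print(batch)
```

Processing one batch per workflow run, or per scheduled interval, keeps each run short and spreads requests across providers.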

Security Considerations

  • Trust the source: RSS feeds can contain malicious content. Validate the URLs and consider sanitizing the HTML before using it in downstream components.
  • Rate limits: Some feed providers impose limits. Respect these limits to avoid being blocked.
  • Data privacy: If the feeds contain sensitive information, ensure that your workflow complies with your organization’s data handling policies.
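
As a rough illustration of the first point, the sketch below checks a feed URL against an allowlist and reduces article HTML to plain text, dropping script and style content. It uses only the Python standard library; is_allowed_feed_url, strip_html, and the allowlist contents are hypothetical, and the HTTPS-only rule is an assumption you may want to relax. If you need to keep some HTML for display, a maintained sanitizer library is a safer choice than this text-only approach.

```python
from html.parser import HTMLParser
from typing import List, Set
from urllib.parse import urlparse


def is_allowed_feed_url(url: str, allowed_hosts: Set[str]) -> bool:
    """Accept only HTTPS URLs whose host is in the allowlist."""
    try:
        parts = urlparse(url)
    except ValueError:
        return False  # malformed URL
    return parts.scheme == "https" and (parts.hostname or "") in allowed_hosts


class _TextOnly(HTMLParser):
    """Collect text nodes, ignoring tags and the contents of script/style."""

    _SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self._SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self._SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self) -> str:
        # Join chunks with spaces, then collapse runs of whitespace.
        return " ".join(" ".join(self._chunks).split())


def strip_html(html: str) -> str:
    """Return the visible text of an HTML fragment."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return parser.text()


ALLOWED = {"example.com"}  # hypothetical allowlist
print(is_allowed_feed_url("https://example.com/feed.xml", ALLOWED))
print(strip_html("<p>Hello <b>world</b></p><script>alert(1)</script>"))
```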