RSS Reader Input Adapter

Introduction

The RSS Reader is an embedded adapter that reads RSS data from any number of HTTP feeds. Utilizing the schema of RSS (versions 0.90, 0.91N, 0.91U, 0.92, 0.93, 0.94, 1.0 or 2.0.x) or Atom (versions 0.3 or 1.0), the adapter interprets this data and emits tuples containing the relevant information from the feeds. One tuple will be emitted for each entry of each feed.

The adapter will periodically access each feed (see the Poll Frequency property below) to see if there have been changes. New tuples will be emitted when entries have changed.

When a publish date is present for a given feed's entries or for the feed itself, it is used to determine when updates are needed (that is, a new tuple is emitted whenever the publish date of an entry changes). However, because publish dates are optional values in most RSS and Atom feeds, it may not be possible to detect whether a given entry has been modified after initial publication. In that case, no further update tuples will be emitted for that entry.
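
The update rule above can be illustrated with a minimal sketch. This is not the adapter's actual implementation; the function name should_emit, the seen dictionary, and the use of an entry identifier as the key are all assumptions made for illustration.

```python
from typing import Dict, Optional


def should_emit(seen: Dict[str, Optional[str]], entry_id: str,
                published: Optional[str]) -> bool:
    """Return True if a tuple should be emitted for this entry.

    `seen` (hypothetical) maps an entry identifier to the publish date
    that was last emitted for it, or None when no date was available.
    """
    if entry_id not in seen:
        # First sighting: always emit, and remember the date (possibly None).
        seen[entry_id] = published
        return True
    if published is not None and published != seen[entry_id]:
        # The publish date changed, so the entry is treated as updated.
        seen[entry_id] = published
        return True
    # Unchanged, or no publish date to compare: no further update tuple.
    return False
```

With this rule, an entry that never carries a publish date produces exactly one tuple, which matches the limitation described above.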

Properties

Property Description
RSS Feed(s) List of your RSS and Atom feeds. At least one feed is required.
Poll Frequency The time, in seconds, to wait before polling the RSS server again to see if there were updates to the feed. Required. Defaults to 5 (that is, poll every five seconds).
Maximum String Size The maximum size of string fields in the schema. Required. Defaults to 500.
Number Of Connection Retries The maximum number of times the adapter should attempt to reconnect to a feed before giving up. To specify an unlimited number of retries, enter a value of -1. Optional. Defaults to 5.

The adapter will wait 30 seconds before each retry. Once the adapter has given up connecting to a feed, that feed will be removed from the list of feeds to watch. If the list becomes empty, the adapter will exit.

Show Title? Show the titles of feed entries. Optional. Defaults to true.
Show Link? Show the URL of feed entries. Optional. Defaults to true.
Show URI? Show the HTTP web address of each feed entry. Optional. Defaults to false.
Show Description? Show the item synopsis of feed entries. (RSS ONLY.) Optional. Defaults to true.
Show Contents? Show the item synopsis of feed entries. (Atom ONLY.) Optional. Defaults to false.
Show Author? Show the email addresses of the authors of feed entries. Optional. Defaults to false.
Show Categories? Show the comma-separated list of tags used to categorize each individual entry. Optional. Defaults to false.
Show Published Date (As String)? Show the dates that entries were published to the web, as a string. Optional. Defaults to true. Cannot be combined with Show Published Date (As Timestamp).
Show Published Date (As Timestamp)? Show the dates that entries were published to the web, as a timestamp. Optional. Defaults to false. Cannot be combined with Show Published Date (As String).
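
The retry behavior described for Number Of Connection Retries might look roughly like the sketch below. It is an illustration under stated assumptions, not the adapter's code: connect_with_retries, prune_feeds, and the try_connect callback are hypothetical names, and the sleep function is injectable only so the logic can be exercised without real delays.

```python
import time
from typing import Callable, List

RETRY_WAIT_SECONDS = 30  # wait before each retry, as stated above


def connect_with_retries(url: str,
                         try_connect: Callable[[str], bool],
                         max_retries: int = 5,
                         sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try once, then retry up to max_retries times (-1 means unlimited)."""
    if try_connect(url):
        return True
    retries = 0
    while max_retries == -1 or retries < max_retries:
        sleep(RETRY_WAIT_SECONDS)
        retries += 1
        if try_connect(url):
            return True
    return False  # gave up on this feed


def prune_feeds(feeds: List[str],
                try_connect: Callable[[str], bool],
                max_retries: int = 5,
                sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Keep only reachable feeds; an empty result means the adapter would exit."""
    return [f for f in feeds
            if connect_with_retries(f, try_connect, max_retries, sleep)]
```

Note that with max_retries set to -1 and a feed that never responds, connect_with_retries loops indefinitely, which is the intended meaning of "unlimited".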

Schema

The schema of the tuples emitted by the adapter has different fields depending on the selected properties:

Property                                          Field Name      Field Type
Show Title                                        title           string
Show Link                                         link            string
Show URI                                          uri             string
Show Description                                  description     string
Show Contents                                     contents        string
Show Author                                       authors         string
Show Categories                                   categories      string
Show Published Date (As String or As Timestamp)   publishedDate   string or timestamp
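
As a rough illustration of how the selected properties determine the output schema, the following sketch derives a list of (field name, field type) pairs from a dictionary of property settings. The function build_schema and the dictionary representation are assumptions for illustration only; the sketch also does not apply the documented defaults, so the caller must pass each setting explicitly.

```python
from typing import Dict, List, Tuple

# (property name, field name) pairs, in the order of the table above.
STRING_FIELDS = [
    ("Show Title", "title"),
    ("Show Link", "link"),
    ("Show URI", "uri"),
    ("Show Description", "description"),
    ("Show Contents", "contents"),
    ("Show Author", "authors"),
    ("Show Categories", "categories"),
]


def build_schema(props: Dict[str, bool]) -> List[Tuple[str, str]]:
    """Return (field name, field type) pairs for the enabled properties."""
    schema = [(field, "string")
              for prop, field in STRING_FIELDS if props.get(prop, False)]
    # publishedDate appears once; its type depends on which option is set.
    if props.get("Show Published Date (As String)", False):
        schema.append(("publishedDate", "string"))
    elif props.get("Show Published Date (As Timestamp)", False):
        schema.append(("publishedDate", "timestamp"))
    return schema
```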

Typechecking and Error Handling

Typechecking fails if there is not at least one correctly formatted HTTP RSS feed in RSS Feed(s) or if Poll Frequency is less than or equal to zero.

Typechecking also fails if both Show Published Date (As String) and Show Published Date (As Timestamp) are checked.
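
The typecheck conditions above can be summarized in a short sketch. This is only an approximation: the function typecheck is hypothetical, and treating any URL with an http or https scheme and a host as "correctly formatted" is an assumption, since the exact validation rule is not stated here.

```python
from typing import List
from urllib.parse import urlparse


def is_http_url(url: str) -> bool:
    """Assumed validity check: http(s) scheme and a non-empty host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def typecheck(feeds: List[str], poll_seconds: int,
              date_as_string: bool, date_as_timestamp: bool) -> List[str]:
    """Return a list of typecheck error messages (empty when valid)."""
    errors = []
    if not any(is_http_url(f) for f in feeds):
        errors.append("At least one correctly formatted HTTP feed is required.")
    if poll_seconds <= 0:
        errors.append("The polling interval must be greater than zero.")
    if date_as_string and date_as_timestamp:
        errors.append("Published date cannot be both a string and a timestamp.")
    return errors
```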

At runtime, if a property (such as Show Contents or Show Author) was requested that is not available for a particular feed, a warning is issued saying that the corresponding tuple fields will be set to null. This can happen when the official specification for the RSS or Atom version of this feed does not include the requested property, when the property is optional and was simply not specified in this feed, or when the property was not properly formatted according to the feed version's specifications so that it could not be read.
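
The runtime behavior above, where unavailable fields are set to null and a warning is issued, could be sketched as follows. The names to_tuple and warned are hypothetical, and issuing the warning only once per feed and field is an assumption; the text does not say how often the warning is repeated.

```python
import logging
from typing import Dict, List, Optional, Set, Tuple

log = logging.getLogger("rss_reader_sketch")


def to_tuple(entry: Dict[str, str], fields: List[str], feed_url: str,
             warned: Set[Tuple[str, str]]) -> Dict[str, Optional[str]]:
    """Build an output row, using None for fields the feed did not supply."""
    row: Dict[str, Optional[str]] = {}
    for name in fields:
        value = entry.get(name)
        if value is None and (feed_url, name) not in warned:
            # Assumption: warn once per feed and field, not once per tuple.
            log.warning("Field %r is not available for feed %s; "
                        "it will be set to null.", name, feed_url)
            warned.add((feed_url, name))
        row[name] = value
    return row
```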
