Sample Code and Applications
About MemeFinder
Peter M. Murray (aka The Da Vinci Coder) won the contest with his StreamBase MemeFinder application, which identifies and filters keywords from high-volume, dynamic text content, including RSS feeds, news articles, blogs, and emails. It then prioritizes and delivers the information based on sophisticated rules established by the end user.
MemeFinder is a StreamBase application that operates on the stream of words generated by a group of categorized RSS feeds. MemeFinder counts the occurrences of words as new articles are posted to the feeds it is configured to watch. It maintains a rolling 24-hour count of these word frequencies, generates an event as each word is encountered, and updates the top-20 word list for each category every 15 seconds. A Swing-based user interface shows the results of the application.
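The top-20 lists described above can be illustrated with a minimal sketch. This is plain Python rather than StreamBase code, and the function name `top_words` and the shape of the `counts` table are assumptions made for illustration, not part of MemeFinder.

```python
from collections import Counter, defaultdict


def top_words(counts, n=20):
    """Group a {(category, word): count} table by category and
    return the n most frequent words for each category."""
    by_category = defaultdict(Counter)
    for (category, word), count in counts.items():
        if count > 0:  # ignore words whose rolling count has dropped to zero
            by_category[category][word] = count
    # most_common sorts by count, highest first
    return {cat: ctr.most_common(n) for cat, ctr in by_category.items()}


# Hypothetical sample data
counts = {("tech", "python"): 5, ("tech", "java"): 3, ("news", "election"): 7}
print(top_words(counts, n=1))
```

In MemeFinder itself this ranking is recomputed every 15 seconds from the rolling counts; the sketch only shows the grouping and ranking step.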
Implementation Notes
Custom Input
MemeFinder uses a custom input adapter, which pulls the latest version of a list of categorized RSS feeds every 30 seconds and generates a stream of tuples containing the Category and Word (as well as some additional information, currently unused). The adapter takes the feeds.xml file as a configuration field, so users can create their own list of RSS feeds fairly easily. See the existing Resources/feeds.xml for the syntax of this file.
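To make the shape of the adapter's output concrete, here is a small sketch of turning article text into (category, word) tuples. The source does not describe how the adapter splits text into words, so the lowercase, letters-only rule below and the name `to_tuples` are assumptions; the real adapter is a StreamBase input adapter, not this Python function.

```python
import re

# Assumed word rule: runs of ASCII letters, compared case-insensitively.
WORD_RE = re.compile(r"[a-z]+")


def to_tuples(category, text):
    """Yield one (category, word) tuple per word found in the text."""
    for word in WORD_RE.findall(text.lower()):
        yield (category, word)


# Hypothetical article title from a feed in the "tech" category
for tup in to_tuples("tech", "New StreamBase release!"):
    print(tup)
```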
MemeUI
MemeFinder puts up a Swing-based panel to display its results. This panel is started and stopped with the lifecycle of its OutputAdapter. (NOTE: the close box of the MemeFinder UI does not function.) The UI has three parts:
- Results panel: The drop-down in the results panel selects the category currently displayed in the word list below it.
- Skip words: Users can modify the skip list by adding or removing words. Type the word into the text field, and click Skip or Unskip. The user's custom skip list is automatically saved in the user's home directory as skiplist.txt.
- Status: Shows the time and date of the latest update to the word lists, along with the category, word, and timestamp of the latest word counted.
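The skip-list behavior behind the Skip and Unskip buttons can be sketched as follows. The on-disk format of skiplist.txt is not documented here, so one lowercase word per line is an assumption, as are the class and method names.

```python
from pathlib import Path


class SkipList:
    """A set of skipped words that is saved to a file after every change."""

    def __init__(self, path):
        self.path = Path(path)
        self.words = set()
        if self.path.exists():
            # Assumed format: one word per line; blank lines are ignored.
            lines = self.path.read_text(encoding="utf-8").splitlines()
            self.words = {line.strip().lower() for line in lines if line.strip()}

    def skip(self, word):
        self.words.add(word.strip().lower())
        self._save()

    def unskip(self, word):
        self.words.discard(word.strip().lower())
        self._save()

    def _save(self):
        text = "\n".join(sorted(self.words)) + "\n"
        self.path.write_text(text, encoding="utf-8")
```

To mirror the location described above, the list would be created as `SkipList(Path.home() / "skiplist.txt")`.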
Application
The StreamBase application manages the skip list and uses it to filter the incoming tuples from the input stream. For each word that passes the filter, it schedules a decrement of that word's count in its category 24 hours later. The tuple is then passed to a query, which maintains the category list. As words from new categories are encountered, those categories are appended to the category list. Finally, the word/category count is incremented in the master word frequency list.
There is a second input adapter supplied by the MemeFinder custom code. This input is triggered by the UI when the user clicks the Skip or Unskip button. Its tuples add or remove words from the skip list and are then passed on to update the master word lists, removing all current count entries for newly skipped words.
Every five minutes, a metronome triggers a query to determine which words need to be decremented because their 24-hour window has passed. These decrements are performed, and the updated counts are sent to the live feed output and the UI.
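The flow described in this section can be sketched in plain Python under a few assumptions: timestamps are seconds, the pending decrements live in a simple list, and the class and method names (`WordCounter`, `on_word`, `on_skip`, `on_unskip`, `on_tick`) are hypothetical. The actual application implements these steps with StreamBase operators and queries, which are not shown here.

```python
from collections import defaultdict

WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour rolling window


class WordCounter:
    def __init__(self):
        self.skip = set()               # words to ignore
        self.categories = []            # categories, in the order first seen
        self.counts = defaultdict(int)  # (category, word) -> rolling count
        self.pending = []               # (expire_at, category, word)

    def on_word(self, category, word, now):
        """Handle one tuple from the feed input."""
        if word in self.skip:
            return
        # Schedule the decrement that undoes this occurrence in 24 hours.
        self.pending.append((now + WINDOW_SECONDS, category, word))
        if category not in self.categories:
            self.categories.append(category)
        self.counts[(category, word)] += 1

    def on_skip(self, word):
        """Skip a word and drop its current counts and pending decrements."""
        self.skip.add(word)
        for key in [k for k in self.counts if k[1] == word]:
            del self.counts[key]
        self.pending = [p for p in self.pending if p[2] != word]

    def on_unskip(self, word):
        self.skip.discard(word)

    def on_tick(self, now):
        """Periodic sweep: apply due decrements, return the updated counts."""
        due = [p for p in self.pending if p[0] <= now]
        self.pending = [p for p in self.pending if p[0] > now]
        updated = {}
        for _, category, word in due:
            key = (category, word)
            if key in self.counts:
                self.counts[key] -= 1
                updated[key] = self.counts[key]
                if self.counts[key] <= 0:
                    del self.counts[key]
        return updated
```

Calling `on_tick` every five minutes corresponds to the metronome-driven query; the returned dictionary stands in for the updated counts sent to the live feed output and the UI.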
Downloading, Installing, and Running MemeFinder
To use MemeFinder, you must have StreamBase 3.5 installed. If you do not have it, download and install StreamBase Developer Edition.
The MemeFinder download is a compressed StreamBase project (620 KB zip file) that you load into StreamBase Studio, then run using Debug mode.
- Download the compressed project (MemeFinder.zip).
- Open StreamBase Studio.
- Choose File > Import > Compressed StreamBase Project.
- Select the MemeFinder.zip file.
- Once the project is loaded, open MemeFinder.sbapp in the Application Diagrams folder.
- Click the Run or Debug button to start the application.
- The MemeFinder panel appears. See the Implementation Notes for details.
After about 10 seconds, words start flowing into the application. The bottom status line of the panel reflects this by showing the latest category/word to be counted and its current count. Every 15 seconds, the top 20 words for each category of feed encountered so far are updated in the user interface.