River4 is a node.js river-of-news aggregator that stores its lists and data in the local file system or Amazon S3.
####Overview
We have a press backgrounder for River4 here. If you're wondering what it is, or why it's significant, this is the first place to go.
If you need help, we have a support mail list, with people who have successfully set up and are running River4 installations. If you're having trouble, this is the place to go.
####Installing the software
There are two how-tos:
- Setting up River4 using the local file system for storage.
- Or, using Amazon S3 for storage.
The first option is easier, and often less expensive. However, if you're running River4 on a service like Heroku, you can't rely on the local file system for persistent storage, so we built River4 to work with S3 as well. On Heroku, which runs in the Amazon cloud, access to S3 storage is free.
There is also an experimental Docker installer. Notes about using it are on the wiki.
- I edit code in an outliner, which is then turned into JavaScript. The "opml" folder in the repository contains the versions of the code that I edit. The comments are stripped out of the code before it's converted to raw JS, so there is information for developers in the OPML that isn't in the main files (though all the running code is in both).
- The first released version is 0.79. Version numbers will increment by one one-hundredth with every release. At some point I'll call it 1.0, then subsequent releases will be 1.01, 1.02 etc.
- Heroku How To -- get a Heroku server running with Fargo Publisher, the back-end for Fargo.
- Bare-bones Heroku do -- checklist for setting up a Heroku server running Node.js from a Mac desktop.
- The River4 support mail list.
- Chris Dadswell wrote a tutorial for setting up your own River4 installation.
- The Hello World of Rivers.
Thanks to two developer friends, Dan MacTough and Eric Kidd, who helped this Node.js newbie get this app up and running.
Specifically thanks to Dan for writing the excellent feedparser and opmlparser packages that are incorporated in River4.
The home page of the River4 server now shows you the rivers being maintained by the server. There's a menu that links to the dashboard, the blog, mail list, and GitHub repo.
New feature: Callback scripts that run when River4 adds an item to the river.
Fixed JSON encoding problem reported by Andrew Shell.
We now record the current time in each item in the calendar structure. This is used when building a river to set the whenLastUpdate field.
Now when we receive a message saying that a feed updated, we read the feed and rebuild all rivers that it's part of. I wanted to test the framework before taking this step.
Again, a careful code review and testing by others would be appreciated.
Added support for rssCloud. Now if a feed has a <cloud> element, we contact the server and go through the subscription protocol. If it all works, we'll be notified of updates to the feed before we poll.
The rssCloud support is largely untested. However, I have upgraded all my copies of River4 to run the new version, and it seems to be functioning well. Code review of the new functionality would be much appreciated.
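For reviewers, here is a rough sketch of what the subscription step can look like for the http-post flavor of rssCloud. It is not the code in river4.js. The function name, the `callback` object and the example hosts are made up for illustration; the form parameters (notifyProcedure, port, path, protocol, url1) are the ones the rssCloud walkthrough describes. The sketch only builds the request, it doesn't send it.

```javascript
// Sketch only: builds the rssCloud "please notify" request for a feed.
// `cloud` holds the attributes of the feed's <cloud> element;
// `callback` (hypothetical) says where our own server wants to be notified.
function buildCloudSubscription(cloud, feedUrl, callback) {
  const endpoint = "http://" + cloud.domain + ":" + cloud.port + cloud.path;
  const body = new URLSearchParams({
    notifyProcedure: "", // unused for http-post
    port: String(callback.port),
    path: callback.path,
    protocol: "http-post",
    url1: feedUrl,
  }).toString();
  return { endpoint, body };
}

const request = buildCloudSubscription(
  { domain: "rpc.example.com", port: 5337, path: "/pleaseNotify" },
  "http://example.com/feed.xml",
  { port: 1337, path: "/feedupdated" }
);
console.log(request.endpoint);
console.log(request.body);
```

The body would be sent as an application/x-www-form-urlencoded POST to the endpoint. If the cloud server accepts the subscription, it notifies the callback path when the feed changes.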
We also remove items from each feed's history array when the item no longer appears in the feed. This reduces the size of some of the files in the data folder, in general making the software more efficient.
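The pruning idea can be sketched like this. The shape of the history array (objects with a `guid` field) and the function name are assumptions for the example, not necessarily what River4 stores.

```javascript
// Sketch only: keep just the history entries whose items are still
// present in the feed as we last read it.
// Assumed shapes: history is [{guid, when}], currentGuids is [string].
function pruneHistory(history, currentGuids) {
  const stillInFeed = new Set(currentGuids);
  return history.filter((entry) => stillInFeed.has(entry.guid));
}

const history = [
  { guid: "a", when: "2014-06-01" },
  { guid: "b", when: "2014-06-02" },
  { guid: "c", when: "2014-06-03" },
];
// "a" has dropped out of the feed, "d" is new and not yet in the history.
const pruned = pruneHistory(history, ["b", "c", "d"]);
console.log(pruned.map((entry) => entry.guid));
```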
Fixed an error that would cause River4 to crash when there were no OPML subscription list files in the lists folder.
A new way to configure River4, using a config.json file in the same directory as river4.js.
Fixed a bug that would cause generated rivers to be empty immediately after date rollover.
The fix was to write out an empty array in the calendar structure when the date rollover occurs. The problem was that until a new item was saved for the day, the first read of the calendar when building a river would fail, causing the build to finish early.
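A small sketch of the fix, using an in-memory object keyed by day as a stand-in for the calendar structure. The key format and all the function names are assumptions for the example; the real calendar layout may differ.

```javascript
// Sketch only: hypothetical calendar keyed by "YYYY-MM-DD" (UTC).
function dayKey(date) {
  return date.toISOString().slice(0, 10);
}

// Called at date rollover: make sure the new day has an empty array.
function startNewDay(calendar, date) {
  const key = dayKey(date);
  if (!Array.isArray(calendar[key])) {
    calendar[key] = [];
  }
  return calendar[key];
}

// Reading a day fails if nothing was ever written for it, which is
// what used to happen right after rollover.
function readDay(calendar, date) {
  const items = calendar[dayKey(date)];
  if (!Array.isArray(items)) {
    throw new Error("no calendar entry for " + dayKey(date));
  }
  return items;
}

const calendar = { "2014-06-01": [{ title: "Yesterday's story" }] };
const today = new Date(Date.UTC(2014, 5, 2)); // June 2, 2014
startNewDay(calendar, today);
console.log(readDay(calendar, today).length);
```

With the empty array in place, the first read of the new day succeeds and the build goes on to the earlier days.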
The problem was discovered in podcatch.com, and written up on the River4 blog.
There's now a River4 Console app, at http://river4.io/ that allows you to edit subscription lists in an outliner, and set some of the server preferences remotely. It's documented on the River4 blog.
Fixed a bug in file name processing.
Two fixes for local file system use. 1. Only read lists whose names end with .opml -- there were invisible files on the Mac that would cause problems. 2. When running on Windows and writing to the local file system, there are more illegal characters. Replace them with underscores.
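Both fixes can be sketched in a few lines. The function names are made up, and the exact set of characters River4 replaces is an assumption; the set below is the list of characters Windows doesn't allow in file names.

```javascript
// Sketch only: not the code in river4.js.

// 1. Only treat files whose names end with .opml as subscription lists,
//    so invisible files such as .DS_Store are skipped.
function isOpmlList(fileName) {
  return fileName.endsWith(".opml");
}

// 2. Replace characters that are illegal in Windows file names
//    with underscores.
const WINDOWS_ILLEGAL = /[<>:"\/\\|?*]/g;
function sanitizeFileName(name) {
  return name.replace(WINDOWS_ILLEGAL, "_");
}

const lists = [".DS_Store", "myFeeds.opml", "notes.txt"].filter(isOpmlList);
console.log(lists);
console.log(sanitizeFileName("http://example.com/feed?x=1"));
```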
Apparently there was a change in format in the FeedParser module, in the way it represents <source:outline> elements. This release handles the change in format so outlines now pass through in a way that's understandable to the RiverBrowser software.
This version can be configured to store its data in the local filesystem instead of S3. See the blog post for details.
New /ping endpoint, available to be called by a publisher, on behalf of a user, to indicate that a feed has updated, and should be read immediately. Radio3 has this facility as of today, as does Fargo.
Fixed a problem that caused rivers to display only old stories. Full explanation on the blog.
Added more fields to the struct the /status call returns. It now says what the s3path is, what port the server is running on, and if you've defined an s3defaultAcl (see v0.91) what the value of that parameter is.
A new environment variable, s3defaultAcl, if present, specifies the permissions on S3 files we create. The default is public-read. With this parameter, it may be possible to run a private installation of River4.
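The lookup amounts to something like this. The helper name is made up for the example; only the variable name and the default come from the note above.

```javascript
// Sketch only: read the ACL from the environment, falling back to
// public-read when s3defaultAcl isn't set.
function getDefaultAcl(env) {
  return env.s3defaultAcl || "public-read";
}

console.log(getDefaultAcl(process.env));
console.log(getDefaultAcl({ s3defaultAcl: "private" }));
```

"private" is one of S3's canned ACLs, and it is the value you'd most likely try for a private installation.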
New <source:outline> elements flow through River4. See the docs for the source namespace for details.
One small change to package.json, and no changes to the JavaScript code.
A subscription list can now contain an include node, so you can have a list of lists. Full explanation in this blog post.
Changed the package.json file to require Node v0.8.x. Previously it was 0.6.x. This should make it possible to deploy on Nodejitsu without modification, per Dave Seidel's report.
Fixed a bug that would cause River4 to crash when processing an item with a null title.
Fixed a bug that would cause River4 to crash when reading an item from a subscription list that didn't have an xmlUrl attribute.
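The guards for these two crashes look roughly like this. The object shapes and function names are assumptions for the example, not the actual code.

```javascript
// Sketch only: outline nodes and feed items as plain objects.

// Skip subscription list entries that have no xmlUrl attribute.
function subscribedUrls(outlines) {
  return outlines
    .filter((node) => node && typeof node.xmlUrl === "string" && node.xmlUrl !== "")
    .map((node) => node.xmlUrl);
}

// Treat a null or missing title as an empty string.
function itemTitle(item) {
  return item.title == null ? "" : String(item.title);
}

const urls = subscribedUrls([
  { text: "A feed", xmlUrl: "http://example.com/rss.xml" },
  { text: "A folder with no feed" },
]);
console.log(urls);
console.log(itemTitle({ title: null }) === "");
```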
Two fixes, explained here.
Two fixes, explained here.
Now if there's an error in any JSON code we try to parse, we display an error message in the console, along with the path to the S3 file we were trying to read.
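In outline, the pattern is a try/catch around JSON.parse that reports the path. The function name and the message wording are made up for the example.

```javascript
// Sketch only: parse JSON text read from `path`, and say where it came
// from if the parse fails. Returns undefined on error.
function parseJsonFromPath(jsonText, path) {
  try {
    return JSON.parse(jsonText);
  } catch (err) {
    console.error("Error parsing JSON from " + path + ": " + err.message);
    return undefined;
  }
}

const good = parseJsonFromPath('{"ctStories": 3}', "data/example.json");
const bad = parseJsonFromPath("{not json", "data/broken.json");
console.log(good.ctStories, bad);
```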
serverData.stats now has a copy of the last story added to the river. The dashboard page displays it.
New "dashboard" feature. If your server is running at aggregator.mydomain.com, go to:
http://aggregator.mydomain.com/dashboard
You'll get a real-time readout of what your aggregator is doing.
The HTML source for the dashboard page is in dashboard.opml in the opml folder in the repository.