Running an RSS to NNTP Gateway

If I had known that building an RSS to NNTP gateway was so easy, I would have done it years ago. I was just waiting for somebody else to pick up this obviously useful idea, but apparently nobody else wanted to.

Compared to the Gwene gateway, the (almost) ten-year-old Gmane mail-to-news project is pretty mammoth, what with all the administration, spam work and web interfaces.  Gwene, on the other hand, is a minuscule collection of Perl scripts (find the source code on GitHub).

The main issue with parsing real-world RSS/Atom files is that, like HTML, they can’t reliably be parsed strictly.  None of the Perl RSS-parsing libraries seem to take this into account, and they fail pretty badly on real-world feeds.  So the Gwene scripts have to pre-process the feeds before handing them over to the libraries, and then they have to root around pretty invasively in the Perl libraries’ internal structures to pick out the useful bits.
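To make the pre-processing idea concrete, here is a minimal sketch in Python. It is not the Gwene code (which is Perl), and the specific repairs it makes (dropping a leading byte-order mark, removing control characters, escaping bare ampersands, converting HTML-only entities) are only assumptions about the kinds of breakage real-world feeds tend to show. The names `preprocess_feed` and `parse_titles` are made up for this illustration.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The only named entities that plain XML defines.
XML_ENTITIES = {"amp", "lt", "gt", "quot", "apos"}

# Control characters that XML 1.0 forbids (tab, LF and CR are allowed).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# An ampersand that does not start a syntactically valid reference.
_BARE_AMP = re.compile(
    r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#[xX][0-9A-Fa-f]+;)"
)

# A named entity reference such as &amp; or &nbsp;
_NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def _fix_entity(match):
    """Turn HTML-only entities into numeric references XML understands."""
    name = match.group(1)
    if name in XML_ENTITIES:
        return match.group(0)
    if name in name2codepoint:
        return "&#%d;" % name2codepoint[name]
    # Unknown entity: keep the text, but make the ampersand literal.
    return "&amp;%s;" % name


def preprocess_feed(text):
    """Apply a few lenient repairs before strict XML parsing.

    Known limitation of this sketch: it also rewrites ampersands
    inside CDATA sections, where they were already legal.
    """
    text = text.lstrip("\ufeff \t\r\n")   # junk before the first tag
    text = _CONTROL_CHARS.sub("", text)
    text = _BARE_AMP.sub("&amp;", text)
    text = _NAMED_ENTITY.sub(_fix_entity, text)
    return text


def parse_titles(text):
    """Return the text of every <title> element in a repaired feed."""
    root = ET.fromstring(preprocess_feed(text))
    return [element.text for element in root.iter("title")]
```

Handing a feed with a bare `&` or an `&nbsp;` straight to `ET.fromstring` raises `xml.etree.ElementTree.ParseError`, while `parse_titles` gets through the same input. A feed that is broken in other ways (unclosed tags, wrong encoding declarations) would need further repairs that this sketch does not attempt.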

Somebody should extract the useful bits from the Gwene scripts and whip up a simpler DWIM RSS parsing library.  But I’ll leave that to somebody who actually knows Perl.

3 thoughts on “Running an RSS to NNTP Gateway”

  1. Very cool! Thank you very much. One very important feature is still missing — blog comments are never found in RSS Feeds. Will try to implement a scraper for that.
