Running an RSS to NNTP Gateway

If I had known that building an RSS to NNTP gateway was so easy, I would have done it years ago. I was just waiting for somebody else to pick up this obviously useful idea, but apparently nobody else wanted to.

Compared to the Gwene gateway, the (almost) ten-year-old Gmane mail-to-news project is pretty mammoth, what with all the administration, spam work and web interfaces. Gwene, on the other hand, is a minuscule collection of Perl scripts (find the source code on GitHub).

The main issue with parsing real-world RSS/Atom files is that, like HTML, they can’t reliably be parsed strictly. None of the Perl RSS-parsing libraries seem to take this into account, and they fail pretty badly on real-world feeds. So the Gwene scripts have to pre-process the feeds before handing them over to the libraries, and then they have to root around pretty invasively in the Perl libraries’ internal structures to pick out the useful bits.
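To give a rough idea of what "pre-processing before handing over to a strict parser" can look like, here is a minimal sketch. It is written in Python rather than Perl, and it is not the Gwene code: the function names `preprocess_feed` and `item_titles` are made up for illustration, and the particular repairs (stripping a byte-order mark, dropping illegal control characters, mapping a few HTML entities, escaping bare ampersands) are my assumptions about common breakage, not a list taken from the Gwene scripts.

```python
import re
import xml.etree.ElementTree as ET

# Control characters that XML 1.0 forbids (everything below 0x20
# except tab, line feed and carriage return).
_ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

# An ampersand that does not start a numeric or named reference.
_BARE_AMPERSAND = re.compile(
    r"&(?!#\d+;|#x[0-9a-fA-F]+;|[A-Za-z][A-Za-z0-9]*;)"
)

# A few HTML entities that show up in feeds but are undefined in XML.
# (Illustrative subset only; real feeds use many more.)
_HTML_ENTITIES = {
    "&nbsp;": "&#160;",
    "&mdash;": "&#8212;",
    "&hellip;": "&#8230;",
}


def preprocess_feed(raw: bytes) -> str:
    """Hypothetical helper: repair common breakage before strict parsing."""
    # Decode leniently; undecodable bytes become U+FFFD instead of failing.
    text = raw.decode("utf-8", errors="replace")
    # Drop a byte-order mark and whitespace before the first tag.
    text = text.lstrip("\ufeff \t\r\n")
    # Remove characters a strict XML parser would reject outright.
    text = _ILLEGAL_XML_CHARS.sub("", text)
    # Map HTML-only entities to numeric references first, so the
    # ampersand pass below leaves them alone.
    for entity, replacement in _HTML_ENTITIES.items():
        text = text.replace(entity, replacement)
    # Escape stray ampersands. Limitation: this also rewrites
    # ampersands inside CDATA sections, which a real tool should skip.
    return _BARE_AMPERSAND.sub("&amp;", text)


def item_titles(raw: bytes) -> list[str]:
    """Hypothetical helper: titles of all RSS <item> elements."""
    root = ET.fromstring(preprocess_feed(raw))
    return [item.findtext("title", default="") for item in root.iter("item")]
```

Feeding a deliberately broken feed through `item_titles` should succeed where a strict parse of the raw bytes would raise a parse error. This only covers character-level damage; structurally broken feeds (unclosed tags, mixed-up namespaces) would need more than a sketch like this.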

Somebody should extract the useful bits from the Gwene scripts and whip up a simpler DWIM RSS parsing library. But I’ll leave that to somebody who actually knows Perl.

3 thoughts on “Running an RSS to NNTP Gateway”

  1. Very cool! Thank you very much. One very important feature is still missing: blog comments are never included in RSS feeds. I will try to implement a scraper for that.
