---
author: ''
category: ''
date: 2010/09/25 02:41
description: ''
link: ''
priority: ''
slug: BB920
tags: open source, programming, python
title: 'Quick hack: rss2epub -- it does what it says.'
type: text
updated: 2010/09/25 02:41
url_type: ''
---

One of my favourite things about `Aranduka `_ as a project is that it's an endless source of small, limited side projects.

For example, Aranduka is now close to being able to sync my book collection to my phone. But... what if what I want to read on the train is not a book but, say, a blog?

Well, blogs provide their content via a feed. And a feed is a collection of HTML pieces glued into a structure, plus some data like author and such. And there's a great module for parsing them, called feedparser. And I have written not one, not two, not three, but *four* RSS aggregators in the past.

So, how about converting the feed into something my phone can handle? [#] Would it be hard to do?

Well... not really hard. It's mostly a matter of taking a small sample ePub document (created by Calibre), writing a few templates, feeding them the data from feedparser and zipping it all up.

For example, here is this blog `as an ePub `_, and here's FBReader reading it:

[Screenshot: FBReader displaying the generated ePub]

As usual, the code is open, and it's here `in aranduka's mercurial `_. It's not really interesting code, and it requires `templite `_, feedparser and who knows what else.

The produced ePub doesn't validate, and it probably never will: it contains chunks of the original feed, so standards compliance doesn't depend on rss2epub alone.

Also, you get no images. That would imply parsing and fixing all img elements, I suppose, and I am not going to do it right now.

[#] I first saw this feature in plucker a long time ago, and I know Calibre has it too.
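
To give an idea of how little is involved, here is a rough sketch of the recipe described earlier: a skeleton ePub, a few templates, the feed data, and a zip file. It is not the actual rss2epub code. It assumes ``string.Template`` from the standard library instead of templite, takes plain ``(title, html)`` pairs instead of a feedparser result, and the names ``build_epub`` and ``entryNNN.xhtml`` are made up for the example. Like the real thing, it pastes feed HTML in untouched, so the result will probably not validate either.

```python
import uuid
import zipfile
from html import escape
from string import Template

# Fixed file that tells the reader where the package document lives.
CONTAINER = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# One XHTML page per feed entry; $body is the raw HTML fragment from the feed.
CHAPTER = Template("""<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>$title</title></head>
<body><h1>$title</h1>
$body
</body></html>
""")

# Package document: metadata, list of files (manifest), reading order (spine).
OPF = Template("""<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>$title</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="id">$identifier</dc:identifier>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
$manifest
  </manifest>
  <spine toc="ncx">
$spine
  </spine>
</package>
""")

# Table of contents.
NCX = Template("""<?xml version="1.0" encoding="utf-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <head><meta name="dtb:uid" content="$identifier"/></head>
  <docTitle><text>$title</text></docTitle>
  <navMap>
$navpoints
  </navMap>
</ncx>
""")
NAVPOINT = Template(
    '    <navPoint id="n$n" playOrder="$n"><navLabel><text>$title</text>'
    '</navLabel><content src="$src"/></navPoint>')


def build_epub(title, entries, out):
    """Write an ePub to `out` (a path or a binary file object).

    `entries` is a list of (entry_title, html_fragment) pairs.
    Returns the names of the generated chapter files.
    """
    identifier = "urn:uuid:%s" % uuid.uuid4()
    names = ["entry%03d.xhtml" % i for i in range(len(entries))]
    manifest = "\n".join(
        '    <item id="e%d" href="%s" media-type="application/xhtml+xml"/>' % (i, name)
        for i, name in enumerate(names))
    spine = "\n".join('    <itemref idref="e%d"/>' % i for i in range(len(names)))
    navpoints = "\n".join(
        NAVPOINT.substitute(n=i + 1, title=escape(entry_title), src=name)
        for i, (name, (entry_title, _)) in enumerate(zip(names, entries)))

    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as z:
        # The mimetype entry has to come first and be stored uncompressed.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER)
        z.writestr("content.opf", OPF.substitute(
            title=escape(title), identifier=identifier,
            manifest=manifest, spine=spine))
        z.writestr("toc.ncx", NCX.substitute(
            title=escape(title), identifier=identifier, navpoints=navpoints))
        for name, (entry_title, body) in zip(names, entries):
            # The feed HTML goes in as-is; nothing here makes it valid XHTML.
            z.writestr(name, CHAPTER.substitute(title=escape(entry_title), body=body))
    return names

# With the third-party feedparser module, the entries could come from a feed:
#   feed = feedparser.parse(feed_url)
#   build_epub(feed.feed.title, [(e.title, e.summary) for e in feed.entries], "feed.epub")
```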