Quick hack: rss2epub -- it does what it says.
One of my favourite things about Aranduka as a project is that it's an endless source of small, limited side projects.
For example, Aranduka is now close to being able to sync my book collection to my phone. But... what if what I want to read on the train is not a book but, say, a blog?
Well, blogs provide their content via a feed, and a feed is a collection of HTML pieces glued into a structure, plus some data like the author and such.
And there's a great module for parsing them, called feedparser. And I have written not one, not two, not three, but four RSS aggregators in the past.
So, how about converting the feed into something my phone can handle? [#] Would it be hard to do?
Well... not really hard. It's mostly a matter of taking a small sample ePub document (created by Calibre), writing a few templates, feeding them the data from feedparser, and zipping it all up.
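To give an idea of how little is involved, here is a minimal sketch of the "templates plus zip" step. It is not the rss2epub code: the templates, the file names and the `build_epub` function are made up for illustration, and it uses plain string formatting instead of templite. It takes a list of (title, html) pairs and skips the NCX table of contents, so its output won't validate either.

```python
import uuid
import zipfile
from html import escape

# Tells the reader where the package (OPF) file lives inside the zip.
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# One of these per feed entry; the body is the entry's HTML, pasted as-is.
CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body><h1>{title}</h1>
{body}
</body>
</html>
"""

# The package file: metadata, the list of files, and the reading order.
OPF_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="id">{identifier}</dc:identifier>
  </metadata>
  <manifest>
{items}
  </manifest>
  <spine>
{refs}
  </spine>
</package>
"""


def build_epub(target, feed_title, entries):
    """Write entries, a list of (title, html) pairs, as an ePub to target.

    target can be a file name or a file-like object.
    """
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry has to come first and be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        items, refs = [], []
        for i, (title, html) in enumerate(entries):
            name = "entry%03d.html" % i
            epub.writestr(name, CHAPTER_TEMPLATE.format(title=escape(title),
                                                        body=html))
            items.append('    <item id="e%d" href="%s" '
                         'media-type="application/xhtml+xml"/>' % (i, name))
            refs.append('    <itemref idref="e%d"/>' % i)
        epub.writestr("content.opf", OPF_TEMPLATE.format(
            title=escape(feed_title),
            identifier="urn:uuid:%s" % uuid.uuid4(),
            items="\n".join(items),
            refs="\n".join(refs)))
```

With feedparser, the entries would come from something like `[(e.title, e.summary) for e in feedparser.parse(url).entries]`, assuming the feed provides both fields for every entry.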
For example, this is this blog as an ePub, and here's FBReader reading it:
As usual, the code is open, and it's here in Aranduka's Mercurial repository.
It's not really interesting code, and it requires templite, feedparser, and who knows what else.
The produced ePub doesn't validate, and it probably never will, because it has chunks of the original feed in it, so standard compliance doesn't depend on rss2epub.
Also, you get no images. That would imply parsing and fixing all img elements, I suppose, and I am not going to do it right now.
[#] I first saw this feature in plucker a long time ago, and I know Calibre has it too.
eBooks and PyQt: a good match
I have been putting lots of love into Aranduka, an eBook manager (which is looking very good lately, thanks!), and I didn't want it to also be an eBook reader.
But then I thought... how hard can it be to read ePub? Well, it's freaking easy!
Here's a good start at stackoverflow.com, but the short of it is... it's a zip with some XML in it.
One of those XML files tells you where things are, one of them is the TOC, the rest is just a small static collection of HTML/CSS/images.
So, here are the ingredients to roll-your-own ePub reader widget in 150 LOC:
Use Python's zipfile library to avoid exploding the zip (that's lame)
Use ElementTree to parse said XML files.
Use PyQt's QtWebKit to display said collection of HTML/CSS/images
Use this recipe to make QtWebKit tell you when it wants something from the zipfile.
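The first two ingredients can be sketched like this. It is only an illustration, not the reader's actual code: `list_spine` is a name made up here, and it assumes a well-formed ePub 2 book. It reads META-INF/container.xml to find the file that "tells you where things are" (the OPF package file), then returns the book's documents in reading order, all without unpacking the zip.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the container file and the OPF package file.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def list_spine(epub_file):
    """Return the archive paths of the book's documents, in reading order."""
    with zipfile.ZipFile(epub_file) as book:
        # container.xml always lives at this path and points to the OPF file.
        container = ET.fromstring(book.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")

        opf = ET.fromstring(book.read(opf_path))
        # Paths in the manifest are relative to the OPF file's folder.
        base = posixpath.dirname(opf_path)
        manifest = {item.get("id"): item.get("href")
                    for item in opf.findall("opf:manifest/opf:item", NS)}
        # The spine lists manifest ids in the order they should be read.
        return [posixpath.join(base, manifest[ref.get("idref")])
                for ref in opf.findall("opf:spine/opf:itemref", NS)]
```

Each path it returns can be passed to `ZipFile.read()` to get the HTML for the web view.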
Plug some things to others, shake vigorously, and you end up with this:
Here's the code (as of today) and the UI file you need.
Missing stuff:
It doesn't display the cover.
It only shows the top level of the table of contents.
I only tested it on two books ;-)
It sure can use a lot of refactoring!
None of this should be terribly hard to do.
THE DOPE on mars
Eastern Standard Tribe
Review: Not Doctorow's best book.