
Ralsina.Me — Roberto Alsina's website

Quick hack: rss2epub -- it does what it says.

One of my favourite things about Aran­du­ka as a project is that it's an end­less source of smal­l, lim­it­ed side project­s.

For ex­am­ple, Aran­du­ka is now close to be­ing able to sync my book col­lec­tion to my phone. But... what if what I want to read on the train is not a book but, say, a blog?

Well, blogs provide their content via a feed. And a feed is a collection of HTML pieces glued into a structure, plus some data like author and such.

And there's a great mod­ule for pars­ing them, called feed­pars­er. And I have writ­ten not one, not two, not three, but four RSS ag­gre­ga­tors in the past.

So, how about con­vert­ing the feed in­to some­thing my phone can han­dle? [#] Would it be hard to do?

Well... not really hard. It's mostly a matter of taking a small sample ePub document (created by Calibre), writing a few templates, feeding them the data from feedparser, and zipping it all up.
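To make that recipe concrete, here is a minimal sketch of the "templates plus zip" part. It is not the actual rss2epub code: the function name `build_epub`, the file names and the templates are made up for illustration, it uses plain `str.format` instead of templite, and it takes a list of `(title, html)` pairs rather than a real feed. It also skips the NCX table of contents, so don't expect the result to validate.

```python
import zipfile
from html import escape

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

CHAPTER_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body><h1>{title}</h1>{body}</body>
</html>"""

OPF_TEMPLATE = """<?xml version="1.0" encoding="utf-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="id">{identifier}</dc:identifier>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>"""


def build_epub(out, feed_title, entries, identifier="rss2epub-sketch"):
    """Write entries, a list of (title, html_body) pairs, as an ePub to `out`.

    `out` can be a file name or a writable binary file object.
    """
    manifest, spine = [], []
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as z:
        # The ePub container wants "mimetype" as the first entry, uncompressed.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER_XML)
        for i, (title, body) in enumerate(entries):
            name = "post%d.html" % i
            # The body is a chunk of feed HTML, pasted in as-is.
            z.writestr(name, CHAPTER_TEMPLATE.format(title=escape(title), body=body))
            manifest.append(
                '    <item id="post%d" href="%s" media-type="application/xhtml+xml"/>'
                % (i, name))
            spine.append('    <itemref idref="post%d"/>' % i)
        z.writestr("content.opf", OPF_TEMPLATE.format(
            title=escape(feed_title),
            identifier=escape(identifier),
            manifest="\n".join(manifest),
            spine="\n".join(spine)))
```

With feedparser installed, the entries could come from something like `[(e.title, e.summary) for e in feedparser.parse(url).entries]`, assuming the feed puts its HTML in `summary`.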

For ex­am­ple, this is this blog, as an ePub and here's FBRead­er read­ing it:

[Screenshot: FBReader reading the generated ePub]

As usu­al, the code is open, and it's here in aran­duka's mer­cu­ri­al.

It's not really interesting code, and it requires templite, feedparser, and who knows what else.

The produced ePub doesn't validate, and it probably never will, because it has chunks of the original feed in it, so standards compliance doesn't depend on rss2epub alone.

Al­so, you get no im­ages. That would im­ply pars­ing and fix­ing all img el­e­ments, I sup­pose, and I am not go­ing to do it right now.
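For what it's worth, the "parsing" half of that job is small. The following is only a sketch of finding the img elements in a chunk of feed HTML with the standard library. The names `ImageFinder` and `find_images` are hypothetical, and the harder half (downloading the images, adding them to the zip and rewriting each `src`) is not shown.

```python
from html.parser import HTMLParser


class ImageFinder(HTMLParser):
    """Collect the src attribute of every img element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_images(html):
    """Return the image URLs referenced in an HTML fragment, in order."""
    finder = ImageFinder()
    finder.feed(html)
    finder.close()
    return finder.sources
```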

[#] I first saw this fea­ture in pluck­er a long time ago, and I know Cal­i­bre has it too.

eBooks and PyQt: a good match

I have been putting lots of love into Aranduka, an eBook manager (which is looking very good lately, thanks!), and I didn't want it to also be an eBook reader.

But then I thought... how hard can it be to read ePub? Well, it's freak­ing easy!

Here's a good start at stack­over­flow.­com but the short of it is... it's a zip with some XML in it.

One of those XML files tells you where things are, one of them is the TOC, the rest is just a small stat­ic col­lec­tion of HTM­L/C­SS/im­ages.
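As a rough illustration of the "one file tells you where things are" part: `META-INF/container.xml` points at a package file (the OPF), and the OPF lists the documents and their reading order. This is a sketch and not the code from the post. The function name `reading_order` is made up, and it ignores the TOC file entirely.

```python
import posixpath
import zipfile
from xml.etree import ElementTree

# XML namespaces used by the container file and the OPF package file.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def reading_order(epub):
    """Return the archive paths of the book's documents, in spine order.

    `epub` can be a file name or a binary file object.
    """
    with zipfile.ZipFile(epub) as z:
        # container.xml always lives at this fixed path and names the OPF.
        container = ElementTree.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        opf = ElementTree.fromstring(z.read(opf_path))

    # The manifest maps ids to hrefs, which are relative to the OPF's folder.
    base = posixpath.dirname(opf_path)
    hrefs = {item.get("id"): item.get("href")
             for item in opf.findall("opf:manifest/opf:item", NS)}
    # The spine lists manifest ids in reading order.
    return [posixpath.join(base, hrefs[ref.get("idref")])
            for ref in opf.findall("opf:spine/opf:itemref", NS)]
```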

So, here are the in­gre­di­ents to rol­l-y­our-own ePub read­er wid­get in 150 LOC:

  • Use Python's zipfile library to avoid exploding the zip (that's lame).

  • Use El­e­­ment Tree to parse said XML files.

  • Use PyQt's QtWebKit to display said collection of HTML/CSS/images.

  • Use this recipe to make QtWe­bKit tell you when it wants some­thing from the zip­­file.
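The Qt side of that last ingredient is not reproduced here, but the piece it would end up calling can be sketched with the standard library alone: given whatever URL the browser widget asks for, find the matching entry in the zip and hand back its bytes and a MIME type, without ever extracting anything to disk. The class name `ZipResources` and its `fetch` method are hypothetical, not part of the original code.

```python
import mimetypes
import posixpath
import zipfile
from urllib.parse import unquote, urlsplit


class ZipResources:
    """Answer "give me this URL" requests straight from an open zip file."""

    def __init__(self, epub):
        # `epub` can be a file name or a binary file object.
        self.zip = zipfile.ZipFile(epub)

    def fetch(self, url, base=""):
        """Return (data, mime_type) for `url`.

        `base` is the archive path of the document that made the request,
        so relative references like "../img/cover.png" resolve correctly.
        """
        # Drop any query string or #fragment and undo %-escapes.
        path = unquote(urlsplit(url).path)
        name = posixpath.normpath(
            posixpath.join(posixpath.dirname(base), path)).lstrip("/")
        mime = mimetypes.guess_type(name)[0] or "application/octet-stream"
        # Raises KeyError if the archive has no such entry.
        return self.zip.read(name), mime
```

A request hook in the widget would then only need to call `fetch` and wrap the result in whatever reply object the toolkit expects.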

Plug some things to oth­er­s, shake vig­or­ous­ly, and you end up with this:

[Screenshot: the resulting ePub reader widget]

Here's the code (as of to­day) and the UI file you need.

Miss­ing stuff:

  • It does­n't dis­­­play the cov­­er.

  • It on­­ly shows the top lev­­el of the ta­ble of con­­tents.

  • I on­­ly test­ed it on two books ;-)

  • It sure can use a lot of refac­­tor­ing!

None of those should be terribly hard to do.

Eastern Standard Tribe

Cover for Eastern Standard Tribe

Review:

Not Doc­torow's best book.

The ending is a bit facile. There's a nice part in the middle where you actually start doubting the narrator's sanity; I wish it had gone that way a bit further.


Contents © 2000-2023 Roberto Alsina