Ralsina.Me — Roberto Alsina's website

Posts about programming (old posts, page 67)

Quick hack: rss2epub -- it does what it says.

One of my favourite things about Aran­du­ka as a project is that it's an end­less source of smal­l, lim­it­ed side project­s.

For ex­am­ple, Aran­du­ka is now close to be­ing able to sync my book col­lec­tion to my phone. But... what if what I want to read on the train is not a book but, say, a blog?

Well, blogs provide their content via a feed. And a feed is a collection of HTML pieces glued into a structure, plus some data like author and such.
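
To make "HTML pieces glued into a structure" a bit more concrete, here is a minimal sketch that pulls those pieces out of a tiny RSS 2.0 document using only the standard library. The sample feed and the `read_feed` helper are made up for illustration; real feeds are much messier, which is exactly why a dedicated parser like feedparser is worth using.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal RSS 2.0 feed used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example blog</title>
    <item>
      <title>First post</title>
      <author>someone@example.com</author>
      <description>&lt;p&gt;Hello, &lt;b&gt;world&lt;/b&gt;!&lt;/p&gt;</description>
    </item>
    <item>
      <title>Second post</title>
      <description>&lt;p&gt;More HTML here.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>"""


def read_feed(xml_text):
    """Return (feed_title, entries); each entry is a dict of title/author/html."""
    channel = ET.fromstring(xml_text).find("channel")
    entries = []
    for item in channel.findall("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "author": item.findtext("author", default=""),
            # The description is an HTML fragment stored as escaped text.
            "html": item.findtext("description", default=""),
        })
    return channel.findtext("title", default=""), entries
```

Calling `read_feed(SAMPLE_FEED)` gives back the feed title plus one dictionary per entry, with the HTML fragment already unescaped.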

And there's a great mod­ule for pars­ing them, called feed­pars­er. And I have writ­ten not one, not two, not three, but four RSS ag­gre­ga­tors in the past.

So, how about con­vert­ing the feed in­to some­thing my phone can han­dle? [#] Would it be hard to do?

Well... not really hard. It's mostly a matter of taking a small sample ePub document (created by Calibre), writing a few templates, feeding them the data from feedparser and zipping it all up.
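
The "templates plus zip" idea can be sketched roughly like this. It is not the actual rss2epub code: it uses plain `str.format` instead of templite, takes a list of `(title, html)` pairs instead of feedparser entries, and the names `write_epub`, `OPF_TEMPLATE`, `PAGE_TEMPLATE` and the default identifier are all made up for the example. It also skips the NCX table of contents, so the result is even further from validating.

```python
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

OPF_TEMPLATE = """<?xml version="1.0"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{identifier}</dc:identifier>
  </metadata>
  <manifest>
{manifest}
  </manifest>
  <spine>
{spine}
  </spine>
</package>"""

PAGE_TEMPLATE = """<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body><h1>{title}</h1>{html}</body>
</html>"""


def write_epub(path, title, entries, identifier="example-feed"):
    """Write entries (a list of (title, html) pairs) as a bare-bones ePub."""
    manifest, spine = [], []
    with zipfile.ZipFile(path, "w", zipfile.ZIP_DEFLATED) as epub:
        # The mimetype entry must come first and be stored uncompressed.
        epub.writestr("mimetype", "application/epub+zip",
                      compress_type=zipfile.ZIP_STORED)
        epub.writestr("META-INF/container.xml", CONTAINER_XML)
        for n, (entry_title, html) in enumerate(entries):
            name = "entry%03d.html" % n
            # The feed's HTML goes in untouched, broken markup and all.
            epub.writestr(name, PAGE_TEMPLATE.format(
                title=escape(entry_title), html=html))
            manifest.append('    <item id="e%d" href="%s" '
                            'media-type="application/xhtml+xml"/>' % (n, name))
            spine.append('    <itemref idref="e%d"/>' % n)
        epub.writestr("content.opf", OPF_TEMPLATE.format(
            title=escape(title), identifier=escape(identifier),
            manifest="\n".join(manifest), spine="\n".join(spine)))
```

Something like `write_epub("blog.epub", "My blog", [("Hello", "<p>hi</p>")])` would then produce a file that a forgiving reader may open.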

For ex­am­ple, this is this blog, as an ePub and here's FBRead­er read­ing it:

[Screenshot: FBReader reading the generated ePub]

As usu­al, the code is open, and it's here in aran­duka's mer­cu­ri­al.

It's not really interesting code, and it requires templite, feedparser and who knows what else.

The produced ePub doesn't validate, and it probably never will, because it has chunks of the original feed in it, so standards compliance doesn't depend on rss2epub alone.

Al­so, you get no im­ages. That would im­ply pars­ing and fix­ing all img el­e­ments, I sup­pose, and I am not go­ing to do it right now.
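
For what it's worth, the "parsing" half of that job could look something like the sketch below, which only collects the `src` of every img element in an HTML fragment. The `ImageFinder` class and `find_images` function are hypothetical names; downloading the images, adding them to the zip and rewriting the `src` attributes (the "fixing" half) is not shown.

```python
from html.parser import HTMLParser


class ImageFinder(HTMLParser):
    """Collect the src attribute of every <img> element in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_images(html):
    """Return the image URLs referenced by html, in document order."""
    finder = ImageFinder()
    finder.feed(html)
    finder.close()
    return finder.sources
```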

[#] I first saw this feature in Plucker a long time ago, and I know Calibre has it too.

eBooks and PyQt: a good match

I have been putting lots of love into Aranduka, an eBook manager (which is looking very good lately, thanks!), and I didn't want it to also be an eBook reader.

But then I thought... how hard can it be to read ePub? Well, it's freak­ing easy!

Here's a good start at stackoverflow.com, but the short of it is... it's a zip with some XML in it.

One of those XML files tells you where things are, one of them is the TOC, the rest is just a small stat­ic col­lec­tion of HTM­L/C­SS/im­ages.

So, here are the in­gre­di­ents to rol­l-y­our-own ePub read­er wid­get in 150 LOC:

  • Use Python's zipfile library to avoid exploding the zip (that's lame)

  • Use ElementTree to parse said XML files.

  • Use PyQt's QtWebKit to display said collection of HTML/CSS/images

  • Use this recipe to make QtWe­bKit tell you when it wants some­thing from the zip­­file.
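
The first two ingredients can be sketched without any GUI at all. This is not the widget's actual code, just a minimal illustration (the `reading_order` name is made up) of how the "XML file that tells you where things are" leads to the list of documents to display, read straight from the zip. The table of contents and the QtWebKit part are left out.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}


def reading_order(epub_path):
    """Return the archive names of the book's documents, in spine order."""
    with zipfile.ZipFile(epub_path) as book:
        # META-INF/container.xml says where the package (OPF) file lives.
        container = ET.fromstring(book.read("META-INF/container.xml"))
        opf_name = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        opf = ET.fromstring(book.read(opf_name))
        # The manifest maps ids to files; hrefs are relative to the OPF file.
        base = posixpath.dirname(opf_name)
        files = {
            item.get("id"): posixpath.join(base, item.get("href"))
            for item in opf.findall("opf:manifest/opf:item", NS)
        }
        # The spine lists those ids in reading order.
        return [files[ref.get("idref")]
                for ref in opf.findall("opf:spine/opf:itemref", NS)]
```

Each name in the returned list can be passed to `ZipFile.read()` to get that document's bytes without ever unpacking the book to disk.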

Plug some things to oth­er­s, shake vig­or­ous­ly, and you end up with this:

[Screenshot: the resulting ePub reader widget]

Here's the code (as of to­day) and the UI file you need.

Miss­ing stuff:

  • It does­n't dis­­­play the cov­­er.

  • It on­­ly shows the top lev­­el of the ta­ble of con­­tents.

  • I on­­ly test­ed it on two books ;-)

  • It sure can use a lot of refac­­tor­ing!

None of these should be terribly hard to do.

Introducing Aranduka

Yes, it's yet an­oth­er pro­gram I am work­ing on. But hey, the last few I start­ed are ac­tu­al­ly pret­ty func­tion­al al­ready!

And... I am not do­ing this one alone, which should make it more fun.

It's an eBook (or just any book?) manager that helps you keep your PDF/Mobi/FB2/whatever organized, and should eventually sync them to the device you want to use to read them.

What works now? See the video!

In case that makes no sense to you:

  • You can get books from Feed­­Book­s. Those books will get down­load­­ed, added to your database, tagged, the cov­­er fetched, etc. etc.

  • You can im­­port your cur­rent fold­er of books in bulk.

    Aranduka will use Google and other sources to try to guess (from the filename) what book that is and fill in the extra data about it.

  • You can "guess" the ex­­tra da­­ta.

    If you mark certain data (say, the title) as reliable, Aranduka will try to find possible books that match, and then you can choose the right one.

    Of course you can al­­so ed­it that da­­ta man­u­al­­ly.

And that's about it. Planned fea­tures:

  • Way too many to list.

The goals are clear:

  • It should be beau­ti­­ful (I know it is­n't!)

  • It should be pow­er­­ful (not yet!)

  • It should be bet­ter than the "com­pe­ti­­tion"

If those three goals are not achieved, it's a failure. It may be a fun failure, but it would still be a failure.

Very pythonic progress dialogs.

Sometimes, you see a piece of code and it just feels right. Here's an example I found when doing my "Import Antigravity" session for PyDay Buenos Aires: the progressbar module.

Here's an example that will teach you enough to use progressbar effectively:

import time
from progressbar import ProgressBar

progress = ProgressBar()
for i in progress(range(80)):
    time.sleep(0.01)

Yes, that's it, you will get a nice ASCII progress bar that goes across the ter­mi­nal, sup­ports re­siz­ing and moves as you it­er­ate from 0 to 79.

The progressbar module even lets you do fancier things like ETA or file transfer speeds, all just as nicely.

Is­n't that code just right? You want a progress bar for that loop? Wrap it and you have one! And of course since I am a PyQt pro­gram­mer, how could I make PyQt have some­thing as right as that?

Here's what the output looks like:

[Screenshot: the PyQt progress dialog]

You can do this with every toolkit, and you probably should! It has one extra feature: you can interrupt the iteration. Here's the (short) code:

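# -*- coding: utf-8 -*-
import sys, time
from PyQt4 import QtCore, QtGui

def progress(data, *args):
    """Iterate over data while showing (and updating) a QProgressDialog."""
    it = iter(data)
    # Extra args (label text, cancel button text) go to the dialog; its
    # range is 0..estimated length of the iterator.
    widget = QtGui.QProgressDialog(*args + (0, it.__length_hint__()))
    c = 0
    for v in it:
        # Keep the GUI responsive so the cancel button can be clicked.
        QtCore.QCoreApplication.instance().processEvents()
        if widget.wasCanceled():
            return  # end the generator early
        c += 1
        widget.setValue(c)
        yield v

if __name__ == "__main__":
    app = QtGui.QApplication(sys.argv)

    # Do something slow
    for x in progress(xrange(50), "Show Progress", "Stop the madness!"):
        time.sleep(.2)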
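
Since the trick is really just "wrap the iterable in a generator", here is a toolkit-free sketch of the same pattern for Python 3. The `report` and `cancelled` callbacks are hypothetical: a GUI version would update its dialog in the first and poll its cancel button in the second.

```python
from operator import length_hint


def progress(data, report, cancelled=lambda: False):
    """Yield the items of data, calling report(done, total) before each one.

    report and cancelled are hypothetical callbacks supplied by the caller.
    """
    it = iter(data)
    total = length_hint(it)  # 0 when the iterator cannot estimate its length
    done = 0
    for value in it:
        if cancelled():
            return  # ends the generator early, like pressing "cancel"
        done += 1
        report(done, total)
        yield value
```

For instance, `for x in progress(range(50), lambda done, total: print(done, "of", total)): ...` prints a line per step while the loop body stays untouched.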

Have fun!

The first English issue of PET (our Python Magazine) is out!

Hell yeah! It has been a lot of work, but it's out at http://revista.python.org.ar

Some ar­ti­cles:

  • PyAr, The His­­to­ry

  • from gc im­­port com­­mon­sense - Fin­ish Him!

  • Pain­­less Con­cur­ren­­cy: The mul­ti­pro­cess­ing Mod­­ule

  • In­­tro­­duc­­tion to Unit Test­ing with Python

  • Taint Mode in Python

  • Ap­­plied Dy­­namism

  • Dec­o­rat­ing code (Part 1)

  • We­b2Py for Ev­ery­­body

It's avail­able in pret­ty much ev­ery for­mat any­one can read, and if your favourite is not there, we will make it for you or may I be smote by the fly­ing spaghet­ti mon­ster's nood­ly ap­pendage!

AFAIK there is no oth­er Python mag­a­zine be­ing pub­lished (feel free to cor­rect me), so it's kind of a big thing for us in PyAr (the Ar­genti­na Python com­mu­ni­ty) that we are do­ing one, and in two lan­guages.

But why stop here? Want it to be available in your language? Contact us at revistapyar@netmanagers.com.ar, it may be doable!

And of course, very soon there will be a call for articles for Issue 2, and trust me, that one's going to be epic: this one was just a warmup.


Contents © 2000-2020 Roberto Alsina