Skip to main content

Ralsina.Me — Roberto Alsina's website

Posts about python (old posts, page 69)

New golfing challenge: PatoCabrera

In the spir­it of the De Vi­cen­zo web browser, I am start­ing a new pro­gram, called Pa­to Cabr­era. Here are the rules:

  • Twit­ter client (no iden­ti.­­ca in the first ver­­sion, but to be added lat­er)

  • Has these fea­­tures: http://­­paste­bin.lug­­men.org.ar/6464

  • Has to be im­­ple­­men­t­ed be­­fore April 4th

  • Smal­l­­er than 16384 bytes (of python code) but may be larg­er be­­cause of art­­work.

Let's see how it works :-)

OK, so THAT is how much browser I can put in 128 lines of code.

I have al­ready post­ed a cou­ple of times (1, 2) about De Vi­cen­zo , an at­tempt to im­ple­ment the rest of the browser, start­ing with PyQt's We­bKit... lim­it­ing my­self to 128 lines of code.

Of course I could do more, but I have my stan­dard­s!

  • No us­ing ;

  • No if what­ev­er: f()

Oth­er than that, I did a lot of dirty trick­s, but right now, it's a fair­ly com­plete browser, and it has 127 lines of code (ac­cord­ing to sloc­coun­t) so that's enough play­ing and it's time to go back to re­al work.

But first, let's con­sid­er how some fea­tures were im­ple­ment­ed (I'll wrap the lines so they page stays rea­son­ably nar­row), and al­so look at the "nor­mal" ver­sions of the same (the "nor­mal" code is not test­ed, please tell me if it's bro­ken ;-).

This is not some­thing you should learn how to do. In fac­t, this is al­most a trea­tise on how not to do things. This is some of the least python­ic, less clear code you will see this week.

It is short, and it is ex­pres­sive. But it is ug­ly.

I'll dis­cuss this ver­sion.

Proxy Support

A brows­er is not much of a brows­er if you can't use it with­out a prox­y, but luck­i­ly Qt's net­work stack has good proxy sup­port. The trick was con­fig­ur­ing it.

De Vicenzo supports HTTP and SOCKS proxies by parsing a http_proxy environment variable and setting Qt's application-wide proxy:

 proxy_url = QtCore.QUrl(os.environ.get('http_proxy', ''))
 QtNetwork.QNetworkProxy.setApplicationProxy(QtNetwork.QNetworkProxy(\
 QtNetwork.QNetworkProxy.HttpProxy if unicode(proxy_url.scheme()).startswith('http')\
 else QtNetwork.QNetworkProxy.Socks5Proxy, proxy_url.host(),\
 proxy_url.port(), proxy_url.userName(), proxy_url.password())) if\
'http_proxy' in os.environ else None

How would that look in nor­mal code?

if 'http_proxy' in os.environ:
    proxy_url = QtCore.QUrl(os.environ['http_proxy'])
    if unicode(proxy_url.scheme()).starstswith('http'):
        protocol = QtNetwork.QNetworkProxy.HttpProxy
    else:
        protocol = QtNetwork.QNetworkProxy.Socks5Proxy
    QtNetwork.QNetworkProxy.setApplicationProxy(
        QtNetwork.QNetworkProxy(
            protocol,
            proxy_url.host(),
            proxy_url.port(),
            proxy_url.userName(),
            proxy_url.password()))

As you can see, the main abus­es against python here are the use of the ternary op­er­a­tor as a one-­line if (and nest­ing it), and line length.

Persistent Cookies

You re­al­ly need this, since you want to stay logged in­to your sites be­tween ses­sion­s. For this, first I need­ed to write some per­sis­tence mech­a­nis­m, and then save/re­store the cook­ies there.

Here's how the persistence is done (settings is a global QSettings instance):

def put(self, key, value):
    "Persist an object somewhere under a given key"
    settings.setValue(key, json.dumps(value))
    settings.sync()

def get(self, key, default=None):
    "Get the object stored under 'key' in persistent storage, or the default value"
    v = settings.value(key)
    return json.loads(unicode(v.toString())) if v.isValid() else default

It's not terribly weird code, except for the use of the ternary operator in the last line. The use of json ensures that as long as reasonable things are persisted, you will get them with the same type as you put them without needing to convert them or call special methods.

So, how do you save/restore the cookies? First, you need to access the cookie jar. I couldn't find whether there is a global one, or a per-webview one, so I created a QNetworkCookieJar in line 24 and assign it to each web page in line 107.

# Save the cookies, in the window's closeEvent
self.put("cookiejar", [str(c.toRawForm()) for c in self.cookies.allCookies()])

# Restore the cookies, in the window's __init__
self.cookies.setAllCookies([QtNetwork.QNetworkCookie.parseCookies(c)[0]\
for c in self.get("cookiejar", [])])

Here I con­fess I am guilty of us­ing list com­pre­hen­sions when a for loop would have been the cor­rect thing.

I use the same trick when restor­ing the open tab­s, with the added mis­fea­ture of us­ing a list com­pre­hen­sion and throw­ing away the re­sult:

# get("tabs") is a list of URLs
[self.addTab(QtCore.QUrl(u)) for u in self.get("tabs", [])]

Using Properties and Signals in Object Creation

This is a fea­ture of re­cent PyQt ver­sion­s: if you pass prop­er­ty names as key­word ar­gu­ments when you cre­ate an ob­jec­t, they are as­signed the val­ue. If you pass a sig­nal as a key­word ar­gu­men­t, they are con­nect­ed to the giv­en val­ue.

This is a re­al­ly great fea­ture that helps you cre­ate clear, lo­cal code, and it's a great thing to have. But if you are writ­ing evil code... well, you can go to hell on a hand­bas­ket us­ing it.

This is all over the place in De Vi­cen­zo, and here's one ex­am­ple (yes, this is one line):

QtWebKit.QWebView.__init__(self, loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None, loadFinished=self.pbar.hide, loadStarted=lambda:\
self.pbar.show() if self.amCurrent() else None, titleChanged=lambda\
t: container.tabs.setTabText(container.tabs.indexOf(self), t) or\
(container.setWindowTitle(t) if self.amCurrent() else None))

Oh, boy, where do I start with this one.

There are lambda expressions used to define the callbacks in-place instead of just connecting to a real function or method.

There are lamb­das that con­tain the ternary op­er­a­tor:

loadStarted=lambda:\
    self.pbar.show() if self.amCurrent() else None

There are lambdas that use or or a tuple to trick python into doing two things in a single lambda!

loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None

I won't even try to un­tan­gle this for ed­u­ca­tion­al pur­pos­es, but let's just say that line con­tains what should be re­placed by 3 meth­od­s, and should be spread over 6 lines or more.

Download Manager

Ok, call­ing it a man­ag­er is over­reach­ing, since you can't stop them once they start, but hey, it lets you down­load things and keep on brows­ing, and re­ports the pro­gress!

First, on line 16 I created a bars dictionary for general bookkeeping of the downloads.

Then, I need­ed to del­e­gate the un­sup­port­ed con­tent to the right method, and that's done in lines 108 and 109

What that does is basically that whenever you click on something WebKit can't handle, the method fetch will be called and passed the network request.

def fetch(self, reply):
    destination = QtGui.QFileDialog.getSaveFileName(self, \
        "Save File", os.path.expanduser(os.path.join('~',\
            unicode(reply.url().path()).split('/')[-1])))
    if destination:
        bar = QtGui.QProgressBar(format='%p% - ' +
            os.path.basename(unicode(destination)))
        self.statusBar().addPermanentWidget(bar)
        reply.downloadProgress.connect(self.progress)
        reply.finished.connect(self.finished)
        self.bars[unicode(reply.url().toString())] = [bar, reply,\
            unicode(destination)]

No re­al code golf­ing here, ex­cept for long lines, but once you break them rea­son­ably, this is pret­ty much the ob­vi­ous way to do it:

  • Ask for a file­­name

  • Cre­ate a pro­­gress­bar, put it in the sta­­tus­bar, and con­nect it to the down­load­­'s progress sig­­nal­s.

Then, of course, we need ths progress slot, that updates the progressbar:

progress = lambda self, received, total:\
    self.bars[unicode(self.sender().url().toString())][0]\
    .setValue(100. * received / total)

Yes, I de­fined a method as a lamb­da to save 1 line. [facepalm]

And the finished slot for when the download is done:

def finished(self):
    reply = self.sender()
    url = unicode(reply.url().toString())
    bar, _, fname = self.bars[url]
    redirURL = unicode(reply.attribute(QtNetwork.QNetworkRequest.\
        RedirectionTargetAttribute).toString())
    del self.bars[url]
    bar.deleteLater()
    if redirURL and redirURL != url:
        return self.fetch(redirURL, fname)
    with open(fname, 'wb') as f:
        f.write(str(reply.readAll()))

No­tice that it even han­dles redi­rec­tions sane­ly! Be­yond that, it just hides the progress bar, saves the data, end of sto­ry. The long­est line is not even my fault!

There is a big in­ef­fi­cien­cy in that the whole file is kept in mem­o­ry un­til the end. If you down­load a DVD im­age, that's gonna sting.

Also, using with saves a line and doesn't leak a file handle, compared to the alternatives.

Printing

Again Qt saved me, be­cause do­ing this man­u­al­ly would have been a pain. How­ev­er, it turns out that print­ing is just ... there? Qt, spe­cial­ly when used via PyQt is such an awe­some­ly rich en­vi­ron­men­t.

self.previewer = QtGui.QPrintPreviewDialog(\
    paintRequested=self.print_)
self.do_print = QtGui.QShortcut("Ctrl+p",\
    self, activated=self.previewer.exec_)

There's not even any need to golf here, that's exactly as much code as you need to hook Ctrl+p to make a QWebView print.

Other Tricks

There are no oth­er trick­s. All that's left is cre­at­ing wid­get­s, con­nect­ing things to one an­oth­er, and en­joy­ing the awe­some ex­pe­ri­ence of pro­gram­ming PyQt, where you can write a whole web brows­er (ex­cept the en­gine) in 127 lines of code.

De Vicenzo: A much cooler mini web browser.

It seems it was on­ly a few days ago that I start­ed this projec­t. Oh, wait, yes, it was just a few days ago!

If you don't want to read that again, the idea is to see just how much code is need­ed to turn Qt's We­bKit en­gine in­to a ful­ly-fledged brows­er.

To do that, I set my­self a com­plete­ly ar­bi­trary lim­it: 128 lines of code.

So, as of now, I de­clare it fea­ture-­com­plete.

The new fea­tures are:

  • Tabbed brows­ing (y­ou can ad­d/re­­move tab­s)

  • Book­­marks (y­ou can ad­d/re­­move them, and choose them from a drop-­­down menu)

This is what al­ready worked:

  • Zoom in (C­tr­l++)

  • Zoom out (C­tr­l+-)

  • Re­set Zoom (C­tr­l+=)

  • Find (C­tr­l+F)

  • Hide find (Esc)

  • But­­tons for back­­/­­for­ward and reload

  • URL en­try that match­es the page + au­­to­­com­­plete from his­­to­ry + smart en­try (adds http://, that kind of thing)

  • Plug­ins sup­­port (in­­clud­ing flash)

  • The win­­dow ti­­tle shows the page ti­­tle (with­­out brows­er ad­ver­tis­ing ;-)

  • Progress bar for page load­­ing

  • Sta­­tus­bar that shows hov­­ered links URL

  • Takes a URL on the com­­mand line, or opens http://python.org

  • Mul­ti­­plat­­form (works in any place QtWe­bKit work­s)

So... how much code was need­ed for this? 87 LINES OF CODE

Or if you want the PEP8-­com­pli­ant ver­sion, 115 LINES OF CODE.

Be­fore any­one says it: yes, I know the ren­der­ing en­gine and the tool­kit are huge. What I wrote is just the chrome around them, just like Aro­ra, Rekon­q, Ga­le­on, Epiphany and a bunch of oth­ers do.

It's sim­ple, min­i­mal­is­tic chrome, but it works pret­ty good, IMVHO.

Here it is in (bug­gy) ac­tion:

It's more or less fea­ture-­com­plete for what I ex­pect­ed to be achiev­able, but it still needs some fix­es.

You can see the code at it's own home page: http://de­vi­cen­zo.­google­code.­com

How much web browser can you put in 128 lines of code?

UP­DATE: If you read this and all you can say is "o­h, he's just em­bed­ding We­bKit", I have two things to tell you:

  1. Duh! Of course the 128 lines don't in­­­clude the ren­der­ing en­gine, or the TCP im­­ple­­men­­ta­­tion, or the GUI tool­k­it. This is about the rest of the browser, the part around the web ren­der­ing en­gine. You know, just like Aro­ra, Rekon­q, Epiphany, and ev­ery­one else that em­beds we­bkit or mozil­la does it? If you did­n't get that be­­fore this ex­­pla­­na­­tion... facepalm.

  2. Get your favourite we­bkit fork and try to do this much with the same amount of code. I dare you! I dou­ble dog dare you!

Now back to the orig­i­nal ar­ti­cle


To­day, be­cause of a IRC chat, I tried to find a 42-­line web brows­er I had writ­ten a while ago. Sad­ly, the paste­bin where I post­ed it was dead, so I learned a lesson: It's not a good idea to trust a paste­bin as code repos­i­to­ry

What I liked about that 42-­line brows­er was that it was not the typ­i­cal ex­am­ple, where some­one dumps a We­bkit view in a win­dow, loads a page and tries to con­vince you he's cool. That one is on­ly 7 lines of code:

import sys
from PyQt4 import QtGui,QtCore,QtWebKit
app=QtGui.QApplication(sys.argv)
wb=QtWebKit.QWebView()
wb.setUrl(QtCore.QUrl('http://www.python.org'))
wb.show()
sys.exit(app.exec_())

And if I want­ed to make the code uglier, it could be done in 6.

But any­way, that 42-­line brows­er ac­tu­al­ly looked use­ful!

This 42-line web browser, courtesy of #python and #qt -- http... on Twitpic

Those but­tons you see ac­tu­al­ly worked cor­rect­ly, en­abling and dis­abling at the right mo­men­t, the URL en­try changed when you clicked on links, and some oth­er bit­s.

So, I have de­cid­ed to start a smal­l, in­ter­mit­tent project of code golf: put as much brows­er as I can in 128 lines of code (not count­ing com­ments or blanks), start­ing with PyQt4.

This has a use­ful pur­pose: I al­ways sus­pect­ed that if you as­sumed PyQt was part of the base sys­tem, most apps would fit in flop­pies again. This one fits on a 1.44MB flop­py some 500 times (so you could use 360KB com­modore flop­pies if you prefer­!).

So far, I am at about 50 lines, and it has the fol­low­ing fea­tures:

  • Zoom in (C­tr­l++)

  • Zoom out (C­tr­l+-)

  • Re­set Zoom (C­tr­l+=)

  • Find (C­tr­l+F)

  • Hide find (Esc)

  • But­­tons for back­­/­­for­ward and reload

  • URL en­try that match­es the page + au­­to­­com­­plete from his­­to­ry + smart en­try (adds http://, that kind of thing)

  • Plug­ins sup­­port (in­­clud­ing flash)

  • The win­­dow ti­­tle shows the page ti­­tle (with­­out brows­er ad­ver­tis­ing ;-)

  • Progress bar for page load­­ing

  • Sta­­tus­bar that shows hov­­ered links URL

  • Takes a URL on the com­­mand line, or opens http://python.org

  • Mul­ti­­plat­­form (works in any place QtWe­bKit work­s)

Miss­ing are tabs and proxy sup­port. I ex­pect those will take an­oth­er 40 lines or so, but I think it's prob­a­bly the most fea­ture­ful of these toy browser­s.

The code... it's not all that hard. I am us­ing lamb­da a lot, and I am us­ing PyQt's key­word ar­gu­ments for sig­nal con­nec­tion which makes lines long, but not hard. It could be made much small­er!

Here it is in ac­tion:

And here's the code:

#!/usr/bin/env python
"A web browser that will never exceed 128 lines of code. (not counting blanks)"

import sys
from PyQt4 import QtGui,QtCore,QtWebKit

class MainWindow(QtGui.QMainWindow):
    def __init__(self, url):
        QtGui.QMainWindow.__init__(self)
        self.sb=self.statusBar()

        self.pbar = QtGui.QProgressBar()
        self.pbar.setMaximumWidth(120)
        self.wb=QtWebKit.QWebView(loadProgress = self.pbar.setValue, loadFinished = self.pbar.hide, loadStarted = self.pbar.show, titleChanged = self.setWindowTitle)
        self.setCentralWidget(self.wb)

        self.tb=self.addToolBar("Main Toolbar")
        for a in (QtWebKit.QWebPage.Back, QtWebKit.QWebPage.Forward, QtWebKit.QWebPage.Reload):
            self.tb.addAction(self.wb.pageAction(a))

        self.url = QtGui.QLineEdit(returnPressed = lambda:self.wb.setUrl(QtCore.QUrl.fromUserInput(self.url.text())))
        self.tb.addWidget(self.url)

        self.wb.urlChanged.connect(lambda u: self.url.setText(u.toString()))
        self.wb.urlChanged.connect(lambda: self.url.setCompleter(QtGui.QCompleter(QtCore.QStringList([QtCore.QString(i.url().toString()) for i in self.wb.history().items()]), caseSensitivity = QtCore.Qt.CaseInsensitive)))

        self.wb.statusBarMessage.connect(self.sb.showMessage)
        self.wb.page().linkHovered.connect(lambda l: self.sb.showMessage(l, 3000))

        self.search = QtGui.QLineEdit(returnPressed = lambda: self.wb.findText(self.search.text()))
        self.search.hide()
        self.showSearch = QtGui.QShortcut("Ctrl+F", self, activated = lambda: (self.search.show() , self.search.setFocus()))
        self.hideSearch = QtGui.QShortcut("Esc", self, activated = lambda: (self.search.hide(), self.wb.setFocus()))

        self.quit = QtGui.QShortcut("Ctrl+Q", self, activated = self.close)
        self.zoomIn = QtGui.QShortcut("Ctrl++", self, activated = lambda: self.wb.setZoomFactor(self.wb.zoomFactor()+.2))
        self.zoomOut = QtGui.QShortcut("Ctrl+-", self, activated = lambda: self.wb.setZoomFactor(self.wb.zoomFactor()-.2))
        self.zoomOne = QtGui.QShortcut("Ctrl+=", self, activated = lambda: self.wb.setZoomFactor(1))
        self.wb.settings().setAttribute(QtWebKit.QWebSettings.PluginsEnabled, True)

        self.sb.addPermanentWidget(self.search)
        self.sb.addPermanentWidget(self.pbar)
        self.wb.load(url)


if __name__ == "__main__":
    app=QtGui.QApplication(sys.argv)
    if len(sys.argv) > 1:
        url = QtCore.QUrl.fromUserInput(sys.argv[1])
    else:
        url = QtCore.QUrl('http://www.python.org')
    wb=MainWindow(url)
    wb.show()
    sys.exit(app.exec_())

Charla: docutils / rst y sus amigos

Again, span­ish on­ly be­cause it's a video... in span­ish.

Re­sul­ta que me olvidé que sí habían graba­do mi char­la de do­cu­tils y com­pañi­a. Gra­cias a Ger­mán por hac­erme acor­dar y mostrarme adonde es­taba!

Y ... acá es­tá:


Contents © 2000-2023 Roberto Alsina