OK, so THAT is how much browser I can put in 128 lines of code.

I have already posted a couple of times (1, 2) about De Vicenzo , an attempt to implement the rest of the browser, starting with PyQt's WebKit... limiting myself to 128 lines of code.

Of course I could do more, but I have my standards!

  • No using ;
  • No if whatever: f()

Other than that, I did a lot of dirty tricks, but right now, it's a fairly complete browser, and it has 127 lines of code (according to sloccount) so that's enough playing and it's time to go back to real work.

But first, let's consider how some features were implemented (I'll wrap the lines so they page stays reasonably narrow), and also look at the "normal" versions of the same (the "normal" code is not tested, please tell me if it's broken ;-).

This is not something you should learn how to do. In fact, this is almost a treatise on how not to do things. This is some of the least pythonic, less clear code you will see this week.

It is short, and it is expressive. But it is ugly.

I'll discuss this version.

Proxy Support

A browser is not much of a browser if you can't use it without a proxy, but luckily Qt's network stack has good proxy support. The trick was configuring it.

De Vicenzo supports HTTP and SOCKS proxies by parsing a http_proxy environment variable and setting Qt's application-wide proxy:

 proxy_url = QtCore.QUrl(os.environ.get('http_proxy', ''))
 QtNetwork.QNetworkProxy.setApplicationProxy(QtNetwork.QNetworkProxy(\
 QtNetwork.QNetworkProxy.HttpProxy if unicode(proxy_url.scheme()).startswith('http')\
 else QtNetwork.QNetworkProxy.Socks5Proxy, proxy_url.host(),\
 proxy_url.port(), proxy_url.userName(), proxy_url.password())) if\
'http_proxy' in os.environ else None

How would that look in normal code?

if 'http_proxy' in os.environ:
    proxy_url = QtCore.QUrl(os.environ['http_proxy'])
    if unicode(proxy_url.scheme()).starstswith('http'):
        protocol = QtNetwork.QNetworkProxy.HttpProxy
    else:
        protocol = QtNetwork.QNetworkProxy.Socks5Proxy
    QtNetwork.QNetworkProxy.setApplicationProxy(
        QtNetwork.QNetworkProxy(
            protocol,
            proxy_url.host(),
            proxy_url.port(),
            proxy_url.userName(),
            proxy_url.password()))

As you can see, the main abuses against python here are the use of the ternary operator as a one-line if (and nesting it), and line length.

Persistent Cookies

You really need this, since you want to stay logged into your sites between sessions. For this, first I needed to write some persistence mechanism, and then save/restore the cookies there.

Here's how the persistence is done (settings is a global QSettings instance):

def put(self, key, value):
    "Persist an object somewhere under a given key"
    settings.setValue(key, json.dumps(value))
    settings.sync()

def get(self, key, default=None):
    "Get the object stored under 'key' in persistent storage, or the default value"
    v = settings.value(key)
    return json.loads(unicode(v.toString())) if v.isValid() else default

It's not terribly weird code, except for the use of the ternary operator in the last line. The use of json ensures that as long as reasonable things are persisted, you will get them with the same type as you put them without needing to convert them or call special methods.

So, how do you save/restore the cookies? First, you need to access the cookie jar. I couldn't find whether there is a global one, or a per-webview one, so I created a QNetworkCookieJar in line 24 and assign it to each web page in line 107.

# Save the cookies, in the window's closeEvent
self.put("cookiejar", [str(c.toRawForm()) for c in self.cookies.allCookies()])

# Restore the cookies, in the window's __init__
self.cookies.setAllCookies([QtNetwork.QNetworkCookie.parseCookies(c)[0]\
for c in self.get("cookiejar", [])])

Here I confess I am guilty of using list comprehensions when a for loop would have been the correct thing.

I use the same trick when restoring the open tabs, with the added misfeature of using a list comprehension and throwing away the result:

# get("tabs") is a list of URLs
[self.addTab(QtCore.QUrl(u)) for u in self.get("tabs", [])]

Using Properties and Signals in Object Creation

This is a feature of recent PyQt versions: if you pass property names as keyword arguments when you create an object, they are assigned the value. If you pass a signal as a keyword argument, they are connected to the given value.

This is a really great feature that helps you create clear, local code, and it's a great thing to have. But if you are writing evil code... well, you can go to hell on a handbasket using it.

This is all over the place in De Vicenzo, and here's one example (yes, this is one line):

QtWebKit.QWebView.__init__(self, loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None, loadFinished=self.pbar.hide, loadStarted=lambda:\
self.pbar.show() if self.amCurrent() else None, titleChanged=lambda\
t: container.tabs.setTabText(container.tabs.indexOf(self), t) or\
(container.setWindowTitle(t) if self.amCurrent() else None))

Oh, boy, where do I start with this one.

There are lambda expressions used to define the callbacks in-place instead of just connecting to a real function or method.

There are lambdas that contain the ternary operator:

loadStarted=lambda:\
    self.pbar.show() if self.amCurrent() else None

There are lambdas that use or or a tuple to trick python into doing two things in a single lambda!

loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None

I won't even try to untangle this for educational purposes, but let's just say that line contains what should be replaced by 3 methods, and should be spread over 6 lines or more.

Download Manager

Ok, calling it a manager is overreaching, since you can't stop them once they start, but hey, it lets you download things and keep on browsing, and reports the progress!

First, on line 16 I created a bars dictionary for general bookkeeping of the downloads.

Then, I needed to delegate the unsupported content to the right method, and that's done in lines 108 and 109

What that does is basically that whenever you click on something WebKit can't handle, the method fetch will be called and passed the network request.

def fetch(self, reply):
    destination = QtGui.QFileDialog.getSaveFileName(self, \
        "Save File", os.path.expanduser(os.path.join('~',\
            unicode(reply.url().path()).split('/')[-1])))
    if destination:
        bar = QtGui.QProgressBar(format='%p% - ' +
            os.path.basename(unicode(destination)))
        self.statusBar().addPermanentWidget(bar)
        reply.downloadProgress.connect(self.progress)
        reply.finished.connect(self.finished)
        self.bars[unicode(reply.url().toString())] = [bar, reply,\
            unicode(destination)]

No real code golfing here, except for long lines, but once you break them reasonably, this is pretty much the obvious way to do it:

  • Ask for a filename
  • Create a progressbar, put it in the statusbar, and connect it to the download's progress signals.

Then, of course, we need ths progress slot, that updates the progressbar:

progress = lambda self, received, total:\
    self.bars[unicode(self.sender().url().toString())][0]\
    .setValue(100. * received / total)

Yes, I defined a method as a lambda to save 1 line. [facepalm]

And the finished slot for when the download is done:

def finished(self):
    reply = self.sender()
    url = unicode(reply.url().toString())
    bar, _, fname = self.bars[url]
    redirURL = unicode(reply.attribute(QtNetwork.QNetworkRequest.\
        RedirectionTargetAttribute).toString())
    del self.bars[url]
    bar.deleteLater()
    if redirURL and redirURL != url:
        return self.fetch(redirURL, fname)
    with open(fname, 'wb') as f:
        f.write(str(reply.readAll()))

Notice that it even handles redirections sanely! Beyond that, it just hides the progress bar, saves the data, end of story. The longest line is not even my fault!

There is a big inefficiency in that the whole file is kept in memory until the end. If you download a DVD image, that's gonna sting.

Also, using with saves a line and doesn't leak a file handle, compared to the alternatives.

Printing

Again Qt saved me, because doing this manually would have been a pain. However, it turns out that printing is just ... there? Qt, specially when used via PyQt is such an awesomely rich environment.

self.previewer = QtGui.QPrintPreviewDialog(\
    paintRequested=self.print_)
self.do_print = QtGui.QShortcut("Ctrl+p",\
    self, activated=self.previewer.exec_)

There's not even any need to golf here, that's exactly as much code as you need to hook Ctrl+p to make a QWebView print.

Other Tricks

There are no other tricks. All that's left is creating widgets, connecting things to one another, and enjoying the awesome experience of programming PyQt, where you can write a whole web browser (except the engine) in 127 lines of code.

Comments

Comments powered by Disqus