Ralsina.Me — Roberto Alsina's website

OK, so THAT is how much browser I can put in 128 lines of code.

I have already posted a couple of times about De Vicenzo, an attempt to implement the rest of the browser, starting with PyQt's WebKit... limiting myself to 128 lines of code.

Of course I could do more, but I have my standards!

  • No using ;

  • No if whatever: f()

Other than that, I did a lot of dirty tricks, but right now it's a fairly complete browser, and it has 127 lines of code (according to sloccount), so that's enough playing and it's time to go back to real work.

But first, let's consider how some features were implemented (I'll wrap the lines so the page stays reasonably narrow), and also look at the "normal" versions of the same (the "normal" code is not tested, please tell me if it's broken ;-).

This is not something you should learn how to do. In fact, this is almost a treatise on how not to do things. This is some of the least pythonic, least clear code you will see this week.

It is short, and it is expressive. But it is ugly.

I'll discuss this version.

Proxy Support

A browser is not much of a browser if you can't use it through a proxy, but luckily Qt's network stack has good proxy support. The trick was configuring it.

De Vicenzo supports HTTP and SOCKS proxies by parsing an http_proxy environment variable and setting Qt's application-wide proxy:

 proxy_url = QtCore.QUrl(os.environ.get('http_proxy', ''))
 QtNetwork.QNetworkProxy.setApplicationProxy(QtNetwork.QNetworkProxy(\
 QtNetwork.QNetworkProxy.HttpProxy if unicode(proxy_url.scheme()).startswith('http')\
 else QtNetwork.QNetworkProxy.Socks5Proxy, proxy_url.host(),\
 proxy_url.port(), proxy_url.userName(), proxy_url.password())) if\
'http_proxy' in os.environ else None

How would that look in normal code?

if 'http_proxy' in os.environ:
    proxy_url = QtCore.QUrl(os.environ['http_proxy'])
    if unicode(proxy_url.scheme()).startswith('http'):
        protocol = QtNetwork.QNetworkProxy.HttpProxy
    else:
        protocol = QtNetwork.QNetworkProxy.Socks5Proxy
    QtNetwork.QNetworkProxy.setApplicationProxy(
        QtNetwork.QNetworkProxy(
            protocol,
            proxy_url.host(),
            proxy_url.port(),
            proxy_url.userName(),
            proxy_url.password()))

As you can see, the main abuses against python here are the use of the ternary operator as a one-line if (and nesting it), and line length.
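To see the scheme-based decision on its own, here is a minimal sketch that needs no Qt at all. It is written for Python 3 and uses the standard library's urlsplit in place of QUrl; the function name proxy_settings and the 'http'/'socks5' labels are made up for illustration and are not part of De Vicenzo.

```python
from urllib.parse import urlsplit


def proxy_settings(environ):
    """Return (kind, host, port, user, password), or None if http_proxy is not set."""
    if 'http_proxy' not in environ:
        return None
    url = urlsplit(environ['http_proxy'])
    # Same rule as the post: a scheme starting with "http" means an HTTP proxy,
    # anything else is treated as SOCKS5.
    kind = 'http' if url.scheme.startswith('http') else 'socks5'
    return (kind, url.hostname, url.port, url.username, url.password)


example = proxy_settings({'http_proxy': 'http://user:secret@proxy.example.com:3128'})
```

In the real code the first element would be QNetworkProxy.HttpProxy or QNetworkProxy.Socks5Proxy, and the tuple would be passed on to the QNetworkProxy constructor.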

Persistent Cookies

You really need this, since you want to stay logged into your sites between sessions. For this, first I needed to write some persistence mechanism, and then save/restore the cookies there.

Here's how the persistence is done (settings is a global QSettings instance):

def put(self, key, value):
    "Persist an object somewhere under a given key"
    settings.setValue(key, json.dumps(value))
    settings.sync()

def get(self, key, default=None):
    "Get the object stored under 'key' in persistent storage, or the default value"
    v = settings.value(key)
    return json.loads(unicode(v.toString())) if v.isValid() else default

It's not terribly weird code, except for the use of the ternary operator in the last line. The use of json ensures that as long as reasonable things are persisted, you will get them with the same type as you put them without needing to convert them or call special methods.
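A small sketch of what "the same type as you put them" means in practice, assuming a plain dictionary as a stand-in for QSettings (the Store class is hypothetical, written for Python 3, and only mirrors the put/get pair above):

```python
import json


class Store(object):
    """Dictionary-backed stand-in for the global QSettings instance."""

    def __init__(self):
        self._values = {}

    def put(self, key, value):
        # Everything is stored as a JSON string, as in the post.
        self._values[key] = json.dumps(value)

    def get(self, key, default=None):
        if key not in self._values:
            return default
        return json.loads(self._values[key])


store = Store()
store.put('tabs', ['http://example.com/', 'http://example.org/'])
store.put('zoom', 1.5)
store.put('size', (800, 600))  # a tuple goes in, a list comes out
```

Lists, strings, numbers and dictionaries with string keys come back unchanged. Anything JSON has no notion of does not: the tuple stored under 'size' comes back as a list, which is one way to read the "reasonable things" caveat.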

So, how do you save/restore the cookies? First, you need to access the cookie jar. I couldn't find whether there is a global one, or a per-webview one, so I created a QNetworkCookieJar in line 24 and assigned it to each web page in line 107.

# Save the cookies, in the window's closeEvent
self.put("cookiejar", [str(c.toRawForm()) for c in self.cookies.allCookies()])

# Restore the cookies, in the window's __init__
self.cookies.setAllCookies([QtNetwork.QNetworkCookie.parseCookies(c)[0]\
for c in self.get("cookiejar", [])])

Here I confess I am guilty of using list comprehensions when a for loop would have been the correct thing.

I use the same trick when restoring the open tabs, with the added misfeature of using a list comprehension and throwing away the result:

# get("tabs") is a list of URLs
[self.addTab(QtCore.QUrl(u)) for u in self.get("tabs", [])]
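For comparison, a sketch of the loop version next to the comprehension, with a hypothetical add_tab function standing in for the window's addTab method so that it runs without Qt:

```python
opened = []


def add_tab(url):
    """Hypothetical stand-in for the window's addTab method."""
    opened.append(url)


saved_tabs = ['http://example.com/', 'http://example.org/']

# Comprehension used only for its side effects: it builds a list of
# return values (here [None, None]) that nobody wants.
discarded = [add_tab(u) for u in saved_tabs]

# Plain loop: same side effects, no throwaway list, one more line.
for u in saved_tabs:
    add_tab(u)
```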

Using Properties and Signals in Object Creation

This is a feature of recent PyQt versions: if you pass property names as keyword arguments when you create an object, the properties are set to the given values. If you pass a signal as a keyword argument, it is connected to the given callable.

This is a really great feature that helps you create clear, local code, and it's a great thing to have. But if you are writing evil code... well, you can go to hell in a handbasket using it.

This is all over the place in De Vicenzo, and here's one example (yes, this is one line):

QtWebKit.QWebView.__init__(self, loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None, loadFinished=self.pbar.hide, loadStarted=lambda:\
self.pbar.show() if self.amCurrent() else None, titleChanged=lambda\
t: container.tabs.setTabText(container.tabs.indexOf(self), t) or\
(container.setWindowTitle(t) if self.amCurrent() else None))

Oh, boy, where do I start with this one.

There are lambda expressions used to define the callbacks in-place instead of just connecting to a real function or method.

There are lambdas that contain the ternary operator:

loadStarted=lambda:\
    self.pbar.show() if self.amCurrent() else None

There are lambdas that use "or" or a tuple to trick python into doing two things in a single lambda!

loadProgress=lambda v:\
(self.pbar.show(), self.pbar.setValue(v)) if self.amCurrent() else\
None

I won't even try to untangle this for educational purposes, but let's just say that line contains what should be replaced by 3 methods, and should be spread over 6 lines or more.
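Without untangling the real thing, the two tricks can be shown in isolation. This is a sketch with made-up stand-ins (show and set_value play the role of the progress bar methods, and log records what was called); nothing here touches Qt.

```python
log = []


def show():
    """Stand-in for pbar.show(); returns None, like the real call does."""
    log.append('show')


def set_value(v):
    """Stand-in for pbar.setValue(v)."""
    log.append(('value', v))


# Tuple trick: building the tuple evaluates both calls, left to right.
via_tuple = lambda v: (show(), set_value(v))

# "or" trick: show() returns None, which is falsy, so python goes on
# to evaluate the right-hand side as well.
via_or = lambda v: show() or set_value(v)


def plain(v):
    """What either lambda stands in for."""
    show()
    set_value(v)


via_tuple(10)
via_or(20)
plain(30)
```

The "or" version silently stops working if the first call ever returns something truthy, which is one more reason to write the three-line function instead.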

Download Manager

Ok, calling it a manager is overreaching, since you can't stop downloads once they start, but hey, it lets you download things and keep on browsing, and reports the progress!

First, on line 16 I created a bars dictionary for general bookkeeping of the downloads.

Then, I needed to delegate the unsupported content to the right method, and that's done in lines 108 and 109.

Basically, whenever you click on something WebKit can't handle, the fetch method is called and passed the network reply.

def fetch(self, reply):
    destination = QtGui.QFileDialog.getSaveFileName(self, \
        "Save File", os.path.expanduser(os.path.join('~',\
            unicode(reply.url().path()).split('/')[-1])))
    if destination:
        bar = QtGui.QProgressBar(format='%p% - ' +
            os.path.basename(unicode(destination)))
        self.statusBar().addPermanentWidget(bar)
        reply.downloadProgress.connect(self.progress)
        reply.finished.connect(self.finished)
        self.bars[unicode(reply.url().toString())] = [bar, reply,\
            unicode(destination)]

No real code golfing here, except for long lines, but once you break them reasonably, this is pretty much the obvious way to do it:

  • Ask for a filename

  • Create a progressbar, put it in the statusbar, and connect it to the download's progress signals.
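The only slightly dense part of the first step is how the suggested file name is built: last segment of the URL's path, placed in the user's home directory. A Qt-free sketch of that, for Python 3, with urlsplit standing in for QUrl and a hypothetical function name:

```python
import os
from urllib.parse import urlsplit


def default_destination(url):
    """Suggest a save path: the last segment of the URL's path, under the home directory."""
    name = urlsplit(url).path.split('/')[-1]
    return os.path.expanduser(os.path.join('~', name))


suggestion = default_destination('http://example.com/files/disk.iso')
```

Note that a URL ending in a slash has an empty last segment, so the suggestion is just the home directory and the user has to type a name in the dialog.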

Then, of course, we need the progress slot, which updates the progressbar:

progress = lambda self, received, total:\
    self.bars[unicode(self.sender().url().toString())][0]\
    .setValue(100. * received / total)

Yes, I defined a method as a lambda to save 1 line. [facepalm]
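For anyone wondering why that even works: a lambda assigned in a class body is just a function, so it becomes a method like any other. The sketch below shows the golfed and the normal form side by side on a hypothetical Downloads class with no Qt involved. The guard in the normal version is my addition, not something the browser does: Qt reports a total of -1 when the size is unknown, and a total of 0 would divide by zero.

```python
class Downloads(object):
    """Hypothetical holder for per-download progress values."""

    def __init__(self):
        self.percent = {}

    # A lambda assigned in the class body becomes an ordinary method.
    progress_golfed = lambda self, url, received, total: \
        self.percent.__setitem__(url, 100. * received / total)

    def progress(self, url, received, total):
        """Same job as a normal method, skipping updates when the total is unknown."""
        if total > 0:
            self.percent[url] = 100. * received / total


d = Downloads()
d.progress_golfed('http://example.com/a', 50, 200)
```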

And the finished slot for when the download is done:

def finished(self):
    reply = self.sender()
    url = unicode(reply.url().toString())
    bar, _, fname = self.bars[url]
    redirURL = unicode(reply.attribute(QtNetwork.QNetworkRequest.\
        RedirectionTargetAttribute).toString())
    del self.bars[url]
    bar.deleteLater()
    if redirURL and redirURL != url:
        return self.fetch(redirURL, fname)
    with open(fname, 'wb') as f:
        f.write(str(reply.readAll()))

Notice that it even handles redirections sanely! Beyond that, it just hides the progress bar, saves the data, end of story. The longest line is not even my fault!

There is a big inefficiency in that the whole file is kept in memory until the end. If you download a DVD image, that's gonna sting.

Also, using with saves a line and doesn't leak a file handle, compared to the alternatives.
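About the memory problem mentioned above: the usual cure is to write the data out in chunks as it arrives instead of calling readAll() once at the end. The sketch below shows only the chunked copy, for Python 3, using in-memory file objects so it runs anywhere. Wiring it to the reply (for example from its readyRead signal) is left out and is an assumption about how one might do it, not something De Vicenzo does.

```python
import io


def save_stream(source, dest, chunk_size=64 * 1024):
    """Copy source to dest one chunk at a time; return the number of bytes written."""
    written = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        dest.write(chunk)
        written += len(chunk)
    return written


payload = b'x' * 200000
with io.BytesIO(payload) as source, io.BytesIO() as dest:
    copied = save_stream(source, dest)
    result = dest.getvalue()
```

At no point does this hold more than chunk_size bytes of the download in its own buffer, at the cost of a few more lines than the budget allows.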

Printing

Again Qt saved me, because doing this manually would have been a pain. However, it turns out that printing is just ... there? Qt, especially when used via PyQt, is such an awesomely rich environment.

self.previewer = QtGui.QPrintPreviewDialog(\
    paintRequested=self.print_)
self.do_print = QtGui.QShortcut("Ctrl+p",\
    self, activated=self.previewer.exec_)

There's not even any need to golf here, that's exactly as much code as you need to hook Ctrl+p to make a QWebView print.

Other Tricks

There are no other tricks. All that's left is creating widgets, connecting things to one another, and enjoying the awesome experience of programming PyQt, where you can write a whole web browser (except the engine) in 127 lines of code.

Rodney Dawes / 2011-03-09 02:23:

Doesn't QtWebKit have built-in proxy support that "just works" if you configure it in the system control panel?

Roberto Alsina / 2011-03-09 02:28:

Good question. If there is, I couldn't find it in the docs.

Mario / 2011-03-09 07:30:

Great examples on different topics, printing, network (with redirect), cookie handling, proxy and of course, the power of Python :)

Dan / 2011-03-09 09:38:

Awesome. Just happened to read through Wikipedia's "list of browsers" - this one is missing. I'd love to do all the tests and post it there.

Roberto Alsina / 2011-03-09 10:43:

Sure, go ahead :-)

Shulai / 2011-03-10 01:26:

(Py)Qt is powerful indeed, and you are a master to showcase them. The only thing I really miss in Qt and PyQt is built in Poppler support, I got it working on Windows with MinGW but it was somewhat painful.

Anyway, what's your next code golfing challenge? Your public is eager to know! :-)

Roberto Alsina / 2011-03-10 01:28:

A twitter/identi.ca client apparently!

Joe Borg / 2012-11-26 11:52:

Hi, I think there is a bug (maybe with Qt). If you open the application to a page that plants a cookie, then open a tab, close that tab then try to open another one, you get:

Traceback (most recent call last):
File "./qtwk.py", line 15, in <lambda>
self.tabs.setCornerWidget(QtGui.QToolButton(self, text="New Tab", icon=QtGui.QIcon.fromTheme("document-new"), clicked=lambda: self.addTab().url.setFocus(), shortcut="Ctrl+t"))
File "./qtwk.py", line 77, in addTab
self.tabs.setCurrentIndex(self.tabs.addTab(Tab(url, self), ""))
File "./qtwk.py", line 105, in __init__
self.wb.page().networkAccessManager().setCookieJar(container.cookies)
RuntimeError: wrapped C/C++ object of type QNetworkCookieJar has been deleted
Traceback (most recent call last):
File "./qtwk.py", line 62, in closeEvent
self.put("cookiejar", [str(c.toRawForm()) for c in self.cookies.allCookies()])
RuntimeError: wrapped C/C++ object of type QNetworkCookieJar has been deleted

Roberto Alsina / 2012-11-26 13:11:

Looks like a race condition where I am trying to set properties in the already closed tab. Easy-ish to work around, though (just try/except it)

Joe Borg / 2012-11-26 13:40:

Hi Roberto, the try / except works, but then you don't pick up the persistent cookies. I've tried to fix it myself, but not found a way so far.
It looks like we're destroying the self.cookies instance when we close a tab, meaning it can't be picked up again and you lose the cookies saved application wide.

Roberto Alsina / 2012-11-28 00:05:

Makes sense. I admit the cookiejar is not very tested ;-)

Joe Borg / 2012-11-29 14:53:

Yep, I'm not trying to knock holes in it, just cool to get it fully working :)
Also trying to get new window to work with tabs. It's odd how easy some hard things are and how hard a few easy things are with QtWebKit; like new windows, cookies etc.

Joe Borg / 2012-11-26 15:27:

Also noticed that if you open a tab, close it then try and click on a link on the original tab, you get a segmentation fault.


Contents © 2000-2020 Roberto Alsina