--- author: '' category: '' date: 2010/07/16 19:58 description: '' link: '' priority: '' slug: BB901 tags: kde, programming, python, qt title: Capturing a webpage as an image using Pyhon and Qt type: text updated: 2010/07/16 19:58 url_type: '' --- For a small project I am doing I wanted the capability to see web pages offline. So, I started thinking of a way to do that, and all solutions were annoying or impractical. So, I googled and found `CutyCapt `_ which uses Qt and WebKit to turn web pages into images. Good enough for me! Since I wanted to use this from a PyQt app, it makes sense to do the same thing CutyCapt does, but as a python module/script, so here's a quick implementation that works for me, even if it lacks a bunch of CutyCapt's features. With a little extra effort, it can even save as PDF or SVG, which would let you use it *almost* like a real web page. You just use it like this:: python capty.py http://www.kde.org kde.png And here's the code [`download capty.py `_] .. code-block:: python # -*- coding: utf-8 -*- """This tries to do more or less the same thing as CutyCapt, but as a python module. This is a derived work from CutyCapt: http://cutycapt.sourceforge.net/ //////////////////////////////////////////////////////////////////// // // CutyCapt - A Qt WebKit Web Page Rendering Capture Utility // // Copyright (C) 2003-2010 Bjoern Hoehrmann // // This program is free software; you can redistribute it and/or // modify it under the terms of the GNU General Public License // as published by the Free Software Foundation; either version 2 // of the License, or (at your option) any later version. // // This program is distributed in the hope that it will be useful, // but WITHOUT ANY WARRANTY; without even the implied warranty of // MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the // GNU General Public License for more details. // // $Id$ // //////////////////////////////////////////////////////////////////// """ import sys from PyQt4 import QtCore, QtGui, QtWebKit class Capturer(object): """A class to capture webpages as images""" def __init__(self, url, filename): self.url = url self.filename = filename self.saw_initial_layout = False self.saw_document_complete = False def loadFinishedSlot(self): self.saw_document_complete = True if self.saw_initial_layout and self.saw_document_complete: self.doCapture() def initialLayoutSlot(self): self.saw_initial_layout = True if self.saw_initial_layout and self.saw_document_complete: self.doCapture() def capture(self): """Captures url as an image to the file specified""" self.wb = QtWebKit.QWebPage() self.wb.mainFrame().setScrollBarPolicy( QtCore.Qt.Horizontal, QtCore.Qt.ScrollBarAlwaysOff) self.wb.mainFrame().setScrollBarPolicy( QtCore.Qt.Vertical, QtCore.Qt.ScrollBarAlwaysOff) self.wb.loadFinished.connect(self.loadFinishedSlot) self.wb.mainFrame().initialLayoutCompleted.connect( self.initialLayoutSlot) self.wb.mainFrame().load(QtCore.QUrl(self.url)) def doCapture(self): self.wb.setViewportSize(self.wb.mainFrame().contentsSize()) img = QtGui.QImage(self.wb.viewportSize(), QtGui.QImage.Format_ARGB32) painter = QtGui.QPainter(img) self.wb.mainFrame().render(painter) painter.end() img.save(self.filename) QtCore.QCoreApplication.instance().quit() if __name__ == "__main__": """Run a simple capture""" app = QtGui.QApplication(sys.argv) c = Capturer(sys.argv[1], sys.argv[2]) c.capture() app.exec_()