2007-05-16 11:37

A KDE hack: Faster mail searches in kmail, using mairix

Kmail searches slowly.

I have been a kmail user for a couple of years, since I settled in my own home office with my own computer.

I like the thing.

However, it annoys me that it takes sooooo long to find a message in my mailstore. Hey, it's only 13000 messages!

So, while I wait for KDE4 to bring all its searching goodness, I decided to see if I could hack something quickly.

Enter mairix: a mail indexer/search thing.

Convincing mairix to index all my mail was rather simple (here is my ~/.mairixrc):

base=/home/ralsina/
maildir=Mail/*...
maildir=.kde/share/apps/kmail/dimap...
omit=Mail/mairix
database=~/.mairix_db

What does it do?

  • It indexes mail stored at ~/Mail and everywhere in my kmail imap folders.
  • It stores search results in ~/Mail/mairix and ignores its contents when searching. The results are stored as links, so they waste no disk space.

After running mairix once to build its DB (it took about a minute, which is less than most kmail searches), you can search for things like this:

$ time mairix b:bartleblog
Matched 6 messages

real    0m0.232s
user    0m0.012s
sys     0m0.204s

And the result can be seen in kmail, in the mairix folder:

mairix1.png

However, there is a problem. It will work for the first search, but not for the second one. On the second search, you get the same content listing, but all messages appear empty.

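That's because kmail keeps an index file for each folder, and after a new search the old index no longer matches the folder's contents. To work around that, I wrote a little shell wrapper, kmairix, that deletes the stale index before searching:

#!/bin/sh
# Remove kmail's stale index files for the mairix folder
rm -f ~/Mail/.mairix*
# Pass every search term through to mairix, keeping quoted arguments intact
mairix "$@"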

And you use that instead of calling mairix directly.

But there is still room for improvement. If kmail is currently displaying the mairix folder, searching doesn't update the message list.

DCOP to the rescue! We can switch to the inbox, then back to mairix (adjust as needed for yourself):

dcop kmail KMailIface selectFolder /Local/inbox
dcop kmail KMailIface selectFolder /Local/mairix

Missing pieces:

  • How about switching to the kmail window? Sadly, the kwin DCOP interface seems incomplete. Maybe assign kmail a hotkey and work from there? Let me know if you have any ideas.

    UPDATE as suggested by Anno Heimburg: just call kmail.

  • A GUI (of course!) probably with a tray icon...

  • A way to auto-update the Mairix DB when new mail arrives. I am thinking about doing it with incron but have not done it yet.

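So, here is the final version. Put it somewhere in your path, and use ALT+F2 to search your mail :-)

#!/bin/sh
# Remove kmail's stale index files for the mairix folder
rm -f ~/Mail/.mairix*
# Run the search, keeping quoted arguments intact
mairix "$@"
# Switch folders back and forth so kmail reloads the message list
dcop kmail KMailIface selectFolder /Local/inbox
dcop kmail KMailIface selectFolder /Local/mairix
# Calling kmail again brings the running instance to the front
kmail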

2007-05-15 14:08

BartleBlog live!

I was thinking: how can I implement page previews in BartleBlog?

The obvious way is to render the page and open the local file. However, the page may link to or include pieces that are not yet updated in the static version, so that can give confusing results.

Then it hit me... generate the page on the fly and serve it. And do the same for everything else the browser asks for.

So, after searching for 15 minutes for the simplest python "web framework" that would let me use the code already in BartleBlog, and settling on Colubrid...

Now, this is cute: bartleblog as a dynamic web app in 34 lines.

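from colubrid import RegexApplication, HttpResponse, execute
# StaticExports is used further down; in Colubrid it lives in colubrid.server
from colubrid.server import StaticExports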

from BartleBlog.backend.blog import Blog
import BartleBlog.backend.dbclasses as db
import os, codecs

class webBlog(Blog):
    def __init__(self):
        Blog.__init__(self)
        self.basepath='http://localhost:8080/'
        self.dest_dir=os.path.expanduser("~/.bartleblog/preview")
        if not os.path.isdir(self.dest_dir):
            os.mkdir(self.dest_dir)


class MyApplication(RegexApplication):
    blog=webBlog()
    urls = [
        (r'^(.*?)$', 'page'),
        (r'^(.*?)/(.*?)$', 'page'),
        (r'^(.*?)/(.*?)/(.*?)$', 'page'),
        (r'^(.*?)/(.*?)/(.*?)/(.*?)$', 'page')
    ]

    def page(self, *args):
        path=''.join(args)
        page=db.pageByPath(path)
        self.blog.renderPage(page)
        return HttpResponse(codecs.open(os.path.join(self.blog.dest_dir, path)).read())

app = MyApplication
app = StaticExports(app, {
    '/static': './static'
})


if __name__ == '__main__':
    execute(app)
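The render-on-request idea does not depend on Colubrid. Here is a minimal sketch of the same thing as a plain WSGI app, using only the standard library. The names make_preview_app, fake_render and request are made up for this example, and fake_render is only a stand-in for the real BartleBlog renderer (which this sketch does not call).

```python
import os
import tempfile
from wsgiref.util import setup_testing_defaults


def make_preview_app(render_page, dest_dir):
    """Build a WSGI app that re-renders each page right before serving it.

    render_page(path, dest_dir) is a hypothetical callback that must write
    the rendered page to os.path.join(dest_dir, path).
    """
    root = os.path.abspath(dest_dir)

    def app(environ, start_response):
        path = environ.get("PATH_INFO", "/").lstrip("/") or "index.html"
        target = os.path.abspath(os.path.join(root, path))
        # Refuse anything that would escape the preview directory
        if not target.startswith(root + os.sep):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"forbidden"]
        render_page(path, root)  # regenerate on every request
        with open(target, "rb") as f:
            body = f.read()
        # Simplification: everything is served as HTML
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [body]

    return app


def fake_render(path, dest_dir):
    # Stand-in renderer: writes a tiny page instead of a real blog post
    with open(os.path.join(dest_dir, path), "w") as f:
        f.write("<h1>%s</h1>" % path)


def request(app, path):
    # Call the WSGI app directly, without opening a socket
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    seen = {}
    body = b"".join(app(environ, lambda status, headers: seen.update(status=status)))
    return seen["status"], body


preview = make_preview_app(fake_render, tempfile.mkdtemp())
status, body = request(preview, "/post.html")
```

To actually browse it, something like wsgiref.simple_server.make_server("localhost", 8080, preview).serve_forever() should do.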

2007-05-15 12:44

BartleBlog change: Mako Templates

Since the very beginning, BartleBlog has been using CherryTemplate for its output formatting needs. I like it, because it's very simple.

However, it had grown rather cumbersome.

Specifically, most pages in a blog are sort of a page template with a body template inside (the main content).

To do that with CherryTemplate, I used a two-pass approach: generate the body, then pass it as a parameter to the page template.

Which is a pain in some cases, because you basically end up having to write a rendering function for each kind of page, or one crazy-evil function (which is what I did).
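For the curious, the two-pass idea looks roughly like this. This sketch uses string.Template from the standard library as a stand-in, so the template syntax is not CherryTemplate's, and the template contents and the render_two_pass name are invented for the example.

```python
from string import Template

# Stand-in templates; CherryTemplate's real syntax is different.
PAGE = Template("<html><title>$title</title><body>$body</body></html>")
POST_BODY = Template("<h1>$title</h1><p>$text</p>")


def render_two_pass(body_template, title, **body_args):
    # Pass 1: render the inner (body) template
    body = body_template.substitute(title=title, **body_args)
    # Pass 2: hand the result to the outer (page) template as a parameter
    return PAGE.substitute(title=title, body=body)


html = render_two_pass(POST_BODY, "Hello", text="First post")
```

Every new kind of page needs its own body template and its own set of arguments, which is where the per-page rendering functions start piling up.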

Exploring the different python template engines, I ran into Mako and decided to give it a whirl. It looks good.

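The approach is a bit different and much more powerful, but you can still use it simply if you want.

And the main feature for me is template inheritance: each page template inherits from a base layout and fills in only its own body. Using that, no more inner and outer templates, baby!

Oh, and performance is better (roughly a fifth less wall-clock time):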

Cherry

real    31m44.732s
user    21m18.336s
sys     2m7.628s


Mako

real    24m54.472s
user    19m9.508s
sys     1m56.375s

This is for completely rerendering the whole blog (7 years, 574 posts, 40 static articles, 14 categories), and there are still tons of optimizations to be done.

BTW: this is how you rerender the whole blog:

from BartleBlog.backend.blog import Blog
Blog().renderFullBlog()

2007-05-13 21:15

Sometimes I am stupid. Then again, it doesn't matter, because I am lucky!

I am working on changing BartleBlog so it can be used from scratch. That may sound odd, but because I have been using it since day 2 to post this blog, it has grown very organically, meaning there are things that only work because of the way I used it while developing it.

So, I created a test user, and created a test blog there, and I am working, and decide to do another from-scratch test, and...

I deleted my production copy.

Yes. The one that generates this blog. So this blog disappeared. Because I used the wrong terminal window.

And I had week-old backups.

So I felt very very stupid.

Because undeleting in Linux is a joke.

So I was thinking about how I would spend a few hours recreating the last week of posts, and whatever, when I noticed on the taskbar... bartleblog was still running.

Which means that the DB was still held open by a process. Which means...

$ ps ax | grep python
17063 pts/1    S     24:33 python bartleblog.py
17161 ?        S      0:04 konqueror [kdeinit] -mimetype text/html http://www.google.com/search?q=python+copy+file&ie=UTF-8&oe=UTF-8
17454 pts/1    D+     0:00 grep python
$ su
Password:
# cd /proc/17063/fd
# ls
0  1  10  11  12  2  3  4  5  6  7  8  9
# ls -l
total 0
lrwx------ 1 ralsina users 64 2007-05-13 21:07 0 -> /dev/pts/1
lrwx------ 1 ralsina users 64 2007-05-13 21:07 1 -> /dev/pts/1
lrwx------ 1 ralsina users 64 2007-05-13 21:07 10 -> socket:[159486]
lrwx------ 1 ralsina users 64 2007-05-13 21:07 11 -> socket:[159488]
lrwx------ 1 ralsina users 64 2007-05-13 21:07 12 -> /mnt/centos/home/ralsina/.bartleblog/blog.db (deleted)
lrwx------ 1 ralsina users 64 2007-05-13 21:07 2 -> /dev/pts/1
lr-x------ 1 ralsina users 64 2007-05-13 21:07 3 -> /mnt/centos/home/ralsina/Desktop/proyectos/bartleblog/bartleblog/BartleBlog/ui/bartleblog.py
lr-x------ 1 ralsina users 64 2007-05-13 21:07 4 -> pipe:[159481]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 5 -> pipe:[159481]
lr-x------ 1 ralsina users 64 2007-05-13 21:07 6 -> pipe:[159482]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 7 -> pipe:[159482]
lr-x------ 1 ralsina users 64 2007-05-13 21:07 8 -> pipe:[159485]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 9 -> pipe:[159485]
# cp 12 /root/db
# ls -l ~/db
-rw-r--r-- 1 root root 3582976 2007-05-13 21:07 /root/db
# sqlitebrowser ~/db
# cp ~/db /home/ralsina/.bartleblog/blog.db

And I got the database back.

If you don't understand how that worked... here's the explanation:

  • On unix, deleting a file only unlinks it (removes its name from the directory). The data itself is freed only once no process has the file open. Even then, the data is not wiped right away, but finding it is much harder.
  • Under /proc/PID/fd you can see the file descriptors each process has open.
  • You can copy from one of those entries as if it were a regular file, even when the original name is gone.
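The same trick can be sketched in Python. This is Linux-only, since it relies on /proc, and the names recover_open_file and demo are made up for this example. The demo plays both roles itself: it keeps a file open, deletes its name, and then copies the data back out through /proc/self/fd.

```python
import os
import shutil
import tempfile


def recover_open_file(pid, fd, dest):
    """Copy the file behind /proc/<pid>/fd/<fd> to dest (Linux only)."""
    src = "/proc/%s/fd/%d" % (pid, fd)
    shutil.copyfile(src, dest)
    return dest


def demo():
    workdir = tempfile.mkdtemp()
    victim = os.path.join(workdir, "blog.db")
    with open(victim, "wb") as f:
        f.write(b"precious data")
    handle = open(victim, "rb")   # a process still has the file open...
    os.remove(victim)             # ...when its name gets deleted
    assert not os.path.exists(victim)
    # "self" is the /proc alias for the current process
    rescued = recover_open_file("self", handle.fileno(),
                                os.path.join(workdir, "rescued.db"))
    handle.close()
    with open(rescued, "rb") as f:
        return f.read()
```

For another user's process you need to be that user or root, and a file that is being written to while you copy it may come out inconsistent, so checking the copy before trusting it (as I did with sqlitebrowser) is a good idea.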

So I went and copied the open file. And got it back. And this blog didn't go away.

So I am lucky! Stupid. But lucky!

2007-05-11 12:01

New Bartleblog Feature: Menu Editor

Took a while to implement, but BartleBlog finally got a functional menu editor:

bartleblog12.png

Right now, it only works with the mootools-based menu gadget, but I will start working on the yahoo menu version in a moment.

The only thing not working is the preview button, because it needs more support on the backend side.

2007-05-11 08:59

Python Trick: Save anything in config files

Python's ConfigParser objects are convenient and simple, but they have a problem: you can only save strings. That means you need to store numbers as strings and remember to use the getint()/getfloat() methods (or coerce by hand!), which is error-prone and unpythonic. Storing a list is even uglier.

You could store ascii pickles, but those are pretty unpleasant to read in some cases.

Here's my solution: Encode it using a JSON encoder first! (I am using demjson)

Silly obvious code fragment:

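import os
import ConfigParser  # Python 2 name; the module is called configparser on Python 3
from demjson import JSON

# Assumed setup: the functions below expect a module-level "conf" object
CONFIG_PATH = os.path.expanduser('~/.bartleblog/config')
conf = ConfigParser.ConfigParser()
conf.read(CONFIG_PATH)

def getValue(section, key, default=None):
    # A missing section or key, or an undecodable value, yields the default
    try:
        return JSON().decode(conf.get(section, key))
    except Exception:
        return default

def setValue(section, key, value):
    value = JSON().encode(value)
    try:
        r = conf.set(section, key, value)
    except ConfigParser.NoSectionError:
        conf.add_section(section)
        r = conf.set(section, key, value)
    # Write the whole config back, making sure the file gets closed
    f = open(CONFIG_PATH, 'w')
    try:
        conf.write(f)
    finally:
        f.close()
    return r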

With just a little effort you can have a readable, typed, plain-ASCII Python config file.
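If you would rather not depend on demjson, the same idea can be sketched with nothing but the standard library on Python 3. This swaps in the stdlib json and configparser modules, and the TypedConfig class and its method names are invented for this example.

```python
import configparser
import json
import os
import tempfile


class TypedConfig:
    """INI file whose values are stored as JSON text (stdlib-only sketch)."""

    def __init__(self, path):
        self.path = path
        # interpolation=None keeps '%' inside JSON strings from being interpreted
        self.conf = configparser.ConfigParser(interpolation=None)
        self.conf.read(path)  # a missing file is silently ignored

    def get_value(self, section, key, default=None):
        try:
            return json.loads(self.conf.get(section, key))
        except (configparser.Error, ValueError):
            # Missing section/option, or a value that is not valid JSON
            return default

    def set_value(self, section, key, value):
        if not self.conf.has_section(section):
            self.conf.add_section(section)
        self.conf.set(section, key, json.dumps(value))
        with open(self.path, "w") as f:
            self.conf.write(f)


# Round trip through a throwaway file
path = os.path.join(tempfile.mkdtemp(), "config")
cfg = TypedConfig(path)
cfg.set_value("blog", "posts_per_page", 10)
cfg.set_value("blog", "tags", ["kde", "python"])
reloaded = TypedConfig(path)
```

The values come back with their types intact: the number is an int and the tags are a list, with no getint() calls in sight.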

2007-05-10 14:07

Today's first hour of hacking...

... has been all about UI.

I have always had a problem when writing PyQt apps: stock icons.

Which ones should I use? Where are they?

I usually fished through the crystalsvg icon set until I found one that seemed to be what I needed, and then copied it to my app.

Sadly, that's annoying in several ways:

  1. Since those are PNG icons, you need to find the right size.
  2. Not all icons are there for all sizes!
  3. Because of 2, I need to check three or four folders to see all the icons.

So, I decided to cut my losses, and see what else could be done. And here it is:

bartleblog11.png

I am now using only SVG icons, from the reinhardt set, which will look equally out of place on every OS, but which I like (and I think they look awesome with this relaxed Domino theme). And because they are all SVG, I don't care about sizes, and they are all in the same place, and all is good.

And whenever Oxygen is released, all I need to do is switch the files around and that's that. Which is nice, too.

Of course there is a catch... it does look out of place, and I expect many to find it ugly. So what, since I am the only user of this app! ;-)

2007-05-09 15:06

Today's two hours of hacking

  • Done with the main blog config dialog.
  • Fixed a dozen bugs
  • Generate the blog in a reasonable place
  • Fixed a lot of UI bugs (tab orders, sizes)

Still lots and lots of things to be done, tho.

2007-05-08 21:05

Making your QTextBrowser show remote images

It's remarkably easy to turn your QTextBrowser into a limited web browser, at least good enough to show images from the web.

Here's all the code:

from PyQt4 import QtCore, QtGui
import hashlib, os, urllib

class PBrowser(QtGui.QTextBrowser):
    def loadResource(self, rtype, name):
        url = unicode(name.toString())
        if url.startswith('http://'):
            # Cache remote resources on disk, keyed by the MD5 of the URL
            dn = os.path.expanduser('~/.bartleblog/cache/')
            if not os.path.isdir(dn):
                os.makedirs(dn)  # also creates ~/.bartleblog if it is missing
            # Encode first, so non-ASCII URLs can be hashed too
            fn = os.path.join(dn, hashlib.md5(url.encode('utf-8')).hexdigest())
            if not os.path.isfile(fn):
                # Download only on a cache miss (this blocks the UI meanwhile)
                urllib.urlretrieve(url, fn)
            return QtGui.QTextBrowser.loadResource(self, rtype, QtCore.QUrl(fn))
        # Everything else is handled by the stock implementation
        return QtGui.QTextBrowser.loadResource(self, rtype, name)

And here's bartleblog taking advantage of it:

bartleblog10.png

It even has a primitive cache and everything ;-)
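The cache itself has nothing to do with Qt, so here it is on its own as a small Python 3 sketch. The cached_fetch and fake_fetch names are made up for this example, and the downloader is passed in as a parameter so the logic can be tried without touching the network (in real use, urllib.request.urlretrieve fits that slot).

```python
import hashlib
import os
import tempfile


def cached_fetch(url, cache_dir, fetch):
    """Return the local path for url, calling fetch(url, path) only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    # The file name is the MD5 of the URL, so any URL maps to a safe name
    path = os.path.join(cache_dir, hashlib.md5(url.encode("utf-8")).hexdigest())
    if not os.path.isfile(path):
        fetch(url, path)
    return path


# Demo with a fake downloader that records what it was asked for
calls = []


def fake_fetch(url, path):
    calls.append(url)
    with open(path, "wb") as f:
        f.write(b"fake image bytes")


cache_dir = tempfile.mkdtemp()
first = cached_fetch("http://example.com/a.png", cache_dir, fake_fetch)
second = cached_fetch("http://example.com/a.png", cache_dir, fake_fetch)
```

It is primitive in the same ways as the original: entries never expire, and a download that fails halfway may leave a broken file behind.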

Contents © 2000-2019 Roberto Alsina