Ralsina.Me — Roberto Alsina's website

A KDE hack: Faster mail searches in kmail, using mairix

Kmail search­es slow­ly.

I have been a kmail us­er for a cou­ple of years, since I set­tled in my own home of­fice with my own com­put­er.

I like the thing.

How­ev­er, it an­noys me that it takes sooooo long to find a mes­sage in my mail­store. Hey, it's on­ly 13000 mes­sages!

So, while I wait for KDE4 to bring all its search­ing good­ness, I de­cid­ed to see if I could hack some­thing quick­ly.

En­ter mair­ix: a mail in­dex­er/search thing.

Con­vinc­ing mair­ix to in­dex all my mail was rather sim­ple (here is my ~/.­mair­ixr­c):

base=/home/ralsina/
maildir=Mail/*...
maildir=.kde/share/apps/kmail/dimap...
omit=Mail/mairix
database=~/.mairix_db

What does it do?

  • It in­­dex­es mail stored at ~/­­Mail and ev­ery­where in my kmail imap fold­er­s.

  • It stores search re­­sults in ~/­­Mail/­­mair­ix and ig­nores its con­­tents when search­ing. The re­­sults are stored as links, so they waste no disk space.

After running mairix so it builds its DB (which took about a minute, less than most kmail searches), you can search for things like this:

[ralsina@monty ~]$ time mairix b:bartleblog
Matched 6 messages

real    0m0.232s
user    0m0.012s
sys     0m0.204s

And the re­sult can be seen in kmail, in the mair­ix fold­er:

[Screenshot mairix1.png: the search results shown in kmail's mairix folder]

How­ev­er, there is a prob­lem. It will work for the first search, but not for the sec­ond one. On the sec­ond search, you get the same con­tent list­ing, but all mes­sages ap­pear emp­ty.

That's be­cause kmail saves an in­dex file of each fold­er. To work around that, I wrote a lit­tle shell wrap­per, kmair­ix:

#!/bin/sh
# Remove kmail's stale index files for the mairix folder, then search
rm -f ~/Mail/.mairix*
mairix "$@"

And you use that in­stead of call­ing mair­ix di­rect­ly.

But there are still improvements to be made. If your kmail is currently displaying the mairix folder, searching doesn't update the message list.

DCOP to the res­cue! We can switch to the in­box, then back to mair­ix (ad­just as need­ed for your­self):

dcop kmail KMailIface selectFolder /Local/inbox
dcop kmail KMailIface selectFolder /Local/mairix

Miss­ing pieces:

  • How about switching to the kmail window? Sadly, the kwin DCOP interface seems incomplete. Maybe assign kmail a hotkey and work from there? Let me know if you have any ideas.

    UP­­­DATE as sug­­gest­ed by An­no He­im­burg: just call kmail.

  • A GUI (of course!) prob­a­bly with a tray icon...

  • A way to au­­to-up­­date the Mair­ix DB when new mail ar­rives. I am think­ing about do­ing it with in­­cron but have not done it yet.

So, here is the final version. Put it somewhere in your path and use ALT+F2 to search your mail :-)

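#!/bin/sh
# Remove kmail's stale index files for the mairix folder
rm -f ~/Mail/.mairix*
# Run the search, keeping each argument intact
mairix "$@"
# Switch folders back and forth so kmail reloads the message list
dcop kmail KMailIface selectFolder /Local/inbox
dcop kmail KMailIface selectFolder /Local/mairix
# Bring the kmail window to the front
kmail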
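
For anyone who would rather keep the wrapper in Python, the same steps could be sketched roughly as follows. This is only an illustration under assumptions: the function name `kmairix`, the `mail_dir` default and the injectable `run` parameter are not part of the original script, and the DCOP folder names need adjusting just like in the shell version.

```python
import glob
import os
import subprocess


def kmairix(search_terms, mail_dir="~/Mail", run=subprocess.call):
    """Run a mairix search and make kmail show the fresh results.

    `run` is injectable (an assumption of this sketch) so the steps can be
    tried without mairix, dcop or kmail installed.
    """
    mail_dir = os.path.expanduser(mail_dir)

    # Remove kmail's cached index files for the mairix folder, otherwise
    # the second search shows the old listing with empty messages.
    for index_file in glob.glob(os.path.join(mail_dir, ".mairix*")):
        if os.path.isfile(index_file):
            os.remove(index_file)

    commands = [
        ["mairix"] + list(search_terms),
        # Switch away and back so kmail reloads the message list.
        ["dcop", "kmail", "KMailIface", "selectFolder", "/Local/inbox"],
        ["dcop", "kmail", "KMailIface", "selectFolder", "/Local/mairix"],
        # Calling kmail brings the running window to the front.
        ["kmail"],
    ]
    for command in commands:
        run(command)
    return commands
```

Calling `kmairix(sys.argv[1:])` from a small script would give the same ALT+F2 behaviour as the shell version. Passing the search terms as a list avoids the word-splitting problems that `$*` has in shell.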

BartleBlog live!

I was think­ing: how can I im­ple­ment page pre­views in Bartle­Blog?

The obvious way is to render the page and open the local file. However, the page may link to or include pieces that are not yet updated in the static version, which can give confusing results.

Then it hit me... gen­er­ate the page on the fly and serve it. And do the same for ev­ery­thing else the brows­er asks for.

So, after spending 15 minutes searching for the simplest Python "web framework" that lets me use the code already in BartleBlog, and settling on Colubrid...

Now, this is cute: bartle­blog as a dy­nam­ic web app in 34 lines.

from colubrid import RegexApplication, HttpResponse, execute
from colubrid.server import StaticExports  # used below; import location assumed

from BartleBlog.backend.blog import Blog
import BartleBlog.backend.dbclasses as db
import os, codecs

class webBlog(Blog):
    def __init__(self):
        Blog.__init__(self)
        self.basepath='http://localhost:8080/'
        self.dest_dir=os.path.expanduser("~/.bartleblog/preview")
        if not os.path.isdir(self.dest_dir):
            os.mkdir(self.dest_dir)


class MyApplication(RegexApplication):
    blog=webBlog()
    urls = [
        (r'^(.*?)$', 'page'),
        (r'^(.*?)/(.*?)$', 'page'),
        (r'^(.*?)/(.*?)/(.*?)$', 'page'),
        (r'^(.*?)/(.*?)/(.*?)/(.*?)$', 'page')
    ]

    def page(self, *args):
        path=''.join(args)
        page=db.pageByPath(path)
        self.blog.renderPage(page)
        return HttpResponse(codecs.open(os.path.join(self.blog.dest_dir, path)).read())

app = MyApplication
app = StaticExports(app, {
    '/static': './static'
})


if __name__ == '__main__':
    execute(app)
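
The same render-on-request idea can be sketched without Colubrid, using only a plain WSGI callable. This is a sketch under assumptions, not BartleBlog's code: `make_preview_app`, the `render_page` callback and the `index.html` default are hypothetical names standing in for the blog's own renderer and layout.

```python
import mimetypes
import os


def make_preview_app(render_page, dest_dir):
    """Build a WSGI app that re-renders each requested page before serving it.

    `render_page(path)` is a hypothetical callback standing in for the
    blog's own renderer; it is expected to write its output to dest_dir/path.
    """
    dest_root = os.path.realpath(dest_dir)

    def app(environ, start_response):
        # Assumption: an empty path means the front page.
        path = environ.get("PATH_INFO", "/").lstrip("/") or "index.html"
        target = os.path.realpath(os.path.join(dest_root, path))

        # Refuse paths that escape the preview directory (e.g. "../..").
        if not target.startswith(dest_root + os.sep):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]

        # Render on the fly so the preview never shows a stale static copy.
        render_page(path)

        if not os.path.isfile(target):
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"Not found"]

        with open(target, "rb") as rendered:
            body = rendered.read()
        content_type = mimetypes.guess_type(target)[0] or "application/octet-stream"
        start_response("200 OK", [("Content-Type", content_type)])
        return [body]

    return app


# Usage (blocks until interrupted):
#   from wsgiref.simple_server import make_server
#   make_server("localhost", 8080, make_preview_app(render, preview_dir)).serve_forever()
```

Because every request goes through the renderer first, linked pages and included pieces are regenerated as the browser asks for them, which is the point of the preview.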

BartleBlog change: Mako Templates

Since the very be­gin­ning, Bartle­Blog has been us­ing Cher­ry­Tem­plate for its out­put for­mat­ting need­s. I like it, be­cause it's very sim­ple.

How­ev­er, it had grown rather cum­ber­some.

Specif­i­cal­ly, most pages in a blog are sort of a page tem­plate with a body tem­plate in­side (the main con­tent).

To do that in CherryTemplate, I used a two-pass approach: generate the body, then pass it as a parameter to the page template.

Which is a pain in some cases, because you basically end up having to write a rendering function for each kind of page, or some crazy-evil function (what I did).
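
To make the two-pass idea concrete, here is a minimal sketch. It does not use CherryTemplate: it substitutes the standard library's `string.Template`, and the template texts and the `render_post_page` name are made up for illustration.

```python
from string import Template

# Hypothetical templates; the real ones were CherryTemplate files.
PAGE_TEMPLATE = Template("<html><title>$title</title><body>$body</body></html>")
POST_BODY_TEMPLATE = Template("<h1>$title</h1><div>$text</div>")


def render_post_page(title, text):
    """Two-pass rendering: build the body first, then wrap it in the page."""
    # Pass 1: render the inner (body) template.
    body = POST_BODY_TEMPLATE.substitute(title=title, text=text)
    # Pass 2: hand the rendered body to the outer (page) template.
    return PAGE_TEMPLATE.substitute(title=title, body=body)
```

Each kind of page (post, category listing, static article) would need its own function along these lines, or one function that knows about all of them, and that is where it gets cumbersome.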

Ex­plor­ing the dif­fer­ent python tem­plate en­gi­nes, I ran in­to Mako and de­cid­ed to give it a whirl. It looks good.

The approach is a bit different and much more powerful, but you can still use it simply if you want.

And the main fea­ture was tem­plate in­her­i­tance. Us­ing that, no more in­ner and out­er tem­plates, baby!

Oh, and per­for­mance is bet­ter:

Cherry

real    31m44.732s
user    21m18.336s
sys     2m7.628s


Mako

real    24m54.472s
user    19m9.508s
sys     1m56.375s

This is for completely rerendering the whole blog (7 years, 574 posts, 40 static articles, 14 categories), and there are tons of optimizations still to be done.

BTW: this is how you reren­der the whole blog:

from BartleBlog.backend.blog import Blog
Blog().renderFullBlog()

Sometimes I am stupid. Then again, it doesn't matter, because I am lucky!

I am work­ing on chang­ing Bartle­Blog so it can be used from scratch. That may sound odd but be­cause I have been us­ing it since day 2 to post this blog, it has grown very or­gan­i­cal­ly, mean­ing there are things that on­ly work be­cause of the way I used it while de­vel­op­ing it.

So, I created a test user and a test blog there, and I was working, and decided to do another from-scratch test, and...

I delet­ed my pro­duc­tion copy.

Yes. The one that gen­er­ates this blog. So this blog dis­ap­peared. Be­cause I used the wrong ter­mi­nal win­dow.

And I had week-old backups.

So I felt very very stupid.

Be­cause un­delet­ing in Lin­ux is a joke.

So I was think­ing how to spend a few hours recre­at­ing the last week of post­s, and what­ev­er, when I no­ticed on the taskbar... bartle­blog was still run­ning.

Which means that the DB was still open by a process. Which mean­s...

[ralsina@monty bartleblog]$ ps ax | grep python
17063 pts/1    S     24:33 python bartleblog.py
17161 ?        S      0:04 konqueror [kdeinit] -mimetype text/html http://www.google.com/search?q=python+copy+file&ie=UTF-8&oe=UTF-8
17454 pts/1    D+     0:00 grep python
[ralsina@monty bartleblog]$ su
Password:
[root@monty bartleblog]# cd /proc/17063/fd
[root@monty fd]# ls
0  1  10  11  12  2  3  4  5  6  7  8  9
[root@monty fd]# ls -l
total 0
lrwx------ 1 ralsina users 64 2007-05-13 21:07 0 -> /dev/pts/1
lrwx------ 1 ralsina users 64 2007-05-13 21:07 1 -> /dev/pts/1
lrwx------ 1 ralsina users 64 2007-05-13 21:07 10 -> socket:[159486]
lrwx------ 1 ralsina users 64 2007-05-13 21:07 11 -> socket:[159488]
lrwx------ 1 ralsina users 64 2007-05-13 21:07 12 -> /mnt/centos/home/ralsina/.bartleblog/blog.db (deleted)
lrwx------ 1 ralsina users 64 2007-05-13 21:07 2 -> /dev/pts/1
lr-x------ 1 ralsina users 64 2007-05-13 21:07 3 -> /mnt/centos/home/ralsina/Desktop/proyectos/bartleblog/bartleblog/BartleBlog/ui/bartleblog.py
lr-x------ 1 ralsina users 64 2007-05-13 21:07 4 -> pipe:[159481]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 5 -> pipe:[159481]
lr-x------ 1 ralsina users 64 2007-05-13 21:07 6 -> pipe:[159482]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 7 -> pipe:[159482]
lr-x------ 1 ralsina users 64 2007-05-13 21:07 8 -> pipe:[159485]
l-wx------ 1 ralsina users 64 2007-05-13 21:07 9 -> pipe:[159485]
[root@monty fd]# cp 12 /root/db
[root@monty fd]# ls -l ~/db
-rw-r--r-- 1 root root 3582976 2007-05-13 21:07 /root/db
[root@monty fd]# sqlitebrowser ~/db
[root@monty fd]# cp ~/db /home/ralsina/.bartleblog/blog.db

And I got the data­base back.

If you don't understand how that worked, here's the explanation:

  • On unix, deleting a file only unlinks it (removes its name from the directory). The data is really freed only when no process has the file open. Even then, the data is not wiped, but finding it is much harder.

  • Under /proc/PID/fd you can see the file descriptors each process has open.

  • You can actually copy from one of those descriptor entries, and you get the contents of the open file.
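
The same trick can be scripted. The sketch below is Linux-only and assumes you may read the process's /proc entries (same user or root); the function names `deleted_open_files` and `recover` are made up for illustration.

```python
import os
import shutil


def deleted_open_files(pid):
    """Return {fd: original_path} for files `pid` holds open but that were deleted."""
    fd_dir = "/proc/%d/fd" % pid
    found = {}
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
        # Linux appends " (deleted)" to the link target of unlinked files.
        if target.startswith("/") and target.endswith(" (deleted)"):
            found[int(name)] = target[: -len(" (deleted)")]
    return found


def recover(pid, fd, destination):
    """Copy the contents behind /proc/<pid>/fd/<fd> into a new file."""
    # Opening the /proc entry reaches the still-existing data,
    # even though the file no longer has a name in any directory.
    with open("/proc/%d/fd/%d" % (pid, fd), "rb") as source:
        with open(destination, "wb") as out:
            shutil.copyfileobj(source, out)
```

With the session above, `deleted_open_files(17063)` would be expected to list descriptor 12, and `recover(17063, 12, "/root/db")` corresponds to the `cp 12 /root/db` step. As in that session, it is worth checking the copy before putting it back, since the process may still be writing to the file.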

So I went and copied the open file. And got it back. And this blog did­n't go away.

So I am luck­y! Stupid. But luck­y!


Contents © 2000-2023 Roberto Alsina