With iterpipes, python is ready to replace bash for scripting. Really.

2009-12-23 15:32

This has been a pet peeve of mine for years: programming shell scripts sucks. They are ugly and error prone. The only reason why we still do it? There is no real replacement.

Or at least that was the case until today, when I met iterpipes at python.reddit.com.

Iterpipes is "A library for running shell pipelines using shell-like syntax" and guess what? It's brilliant.

Here's an example from its PyPI page:

# Total lines in *.py files under /path/to/dir,
# use safe shell parameters formatting:

>>> total = cmd(
...     'find {} -name {} -print0 | xargs -0 wc -l | tail -1 | awk {}',
...     '/path/to/dir', '\*.py', '{print $1}')
>>> run(total | strip() | join | int)
315

Here's how that would look in shell:

find /path/to/dir -name '*.py' -print0 | xargs -0 wc -l | tail -1 | awk '{print $1}'

You may say the shell version looks better. That's an illusion caused by the evil that is shell scripting: the shell version is buggy.

Why is it buggy? Because if I control what's inside /path/to/dir I can make that neat little shell command fail [1], but at least in python I can handle errors!

Also, in most versions you could attempt to write, this command would be unsafe because quoting and escaping in shell is insane!

The iterpipes version uses the equivalent of SQL prepared statements which are much safer.
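To make the prepared-statement comparison concrete, here is a minimal sketch of the idea using only the standard library. It is not how iterpipes is implemented (I have not checked), and safe_format is a made-up name: each {} in the template is filled with an argument that has been quoted by shlex.quote first, so the argument can never turn into extra shell syntax.

```python
import shlex


def safe_format(template, *args):
    """Fill each {} in template with a shell-quoted argument."""
    return template.format(*(shlex.quote(str(arg)) for arg in args))


command = safe_format('find {} -name {} -print0', '/path/to/dir', '*.py')
print(command)

# A hostile value stays one inert argument instead of becoming a new command.
print(safe_format('ls {}', 'x; rm -rf ~'))
```

The template is written by the programmer and the values come from anywhere, which is the same split SQL prepared statements make between the query and its parameters.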

It's nearly impossible to do such a command in pure shell and be sure it's safe.

Also, the shell version produces a string instead of an integer, which sucks if you intend to do anything with it.
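Both points (errors you can handle, and a real integer at the end) can be sketched with plain subprocess. This is not iterpipes, run_int is a hypothetical helper, and the command is a stand-in that just prints a number so the sketch runs anywhere Python does.

```python
import subprocess
import sys


def run_int(argv):
    """Run argv without a shell and return its output as an int.

    Raises subprocess.CalledProcessError if the command exits non-zero
    and ValueError if the output is not a number.
    """
    result = subprocess.run(argv, check=True, capture_output=True, text=True)
    return int(result.stdout.strip())


# Stand-in for the real pipeline: a command that prints a line count.
total = run_int([sys.executable, '-c', 'print(315)'])
print(total + 1)  # arithmetic works because total is an int, not a string
```

Passing the command as a list also sidesteps quoting completely, because no shell ever parses it.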

And the most important benefit is, of course, not when you try to make python act like a shell, but when you can stop pretending shell is a real programming language.

Consider this gem from Arch Linux's /etc/rc.shutdown script. Here, DAEMONS is a list of things that started on boot, and this script is trying to shut them down in reverse order, unless the daemon name starts with "!":

# Shutdown daemons in reverse order
let i=${#DAEMONS[@]}-1
while [ $i -ge 0 ]; do
        if [ "${DAEMONS[$i]:0:1}" != '!' ]; then
                ck_daemon ${DAEMONS[$i]#@} || stop_daemon ${DAEMONS[$i]#@}
        fi
        let i=i-1
done

Nice, huh?

Now, how would that look in python (I may have inverted the meaning of ck_daemon)?

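# Shutdown daemons in reverse order
for daemon in reversed(DAEMONS):
    if daemon.startswith('!'):
        continue
    # Drop the leading "@" marker, like ${DAEMONS[$i]#@} does in the shell version
    name = daemon[1:] if daemon.startswith('@') else daemon
    if ck_daemon(name):
        stop_daemon(name)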

Where stop_daemon used to be this:

stop_daemon() {
    /etc/rc.d/$1 stop
}

And will now be this:

def stop_daemon(daemon):
    run(cmd('/etc/rc.d/{} stop',daemon))

So, come on, people, we are in the 21st century, and shell scripting sucked in the 20th already.

python-keyring is seriously nice

2009-12-21 14:39

Many programs require passwords from the user.

It's nice when a program can remember the password you give it.

It's nicer when it stores said password safely. However, it's not trivial to do that if you care for cross-platform support.

Or at least it wasn't until Kang Zhang wrote python-keyring, a module that abstracts the password storage mechanisms for KDE, GNOME, OS X and Windows (and adds a couple of file-based backends just in case).

So, how does it work?

Install it in the usual way. If it's not packaged for your distro/operating system, just use easy_install:

easy_install keyring

You could also get it from mercurial:

hg clone http://bitbucket.org/kang/python-keyring-lib/

The API is simplicity itself. This is how you save a secret:

import keyring
keyring.set_password('keyring_demo','username','thisisabadpassword')

You may get this dialog (or some analog on other platforms):

And here's the proof that it was saved correctly (this is KDE's password manager):

And how do you get the secret back?

import keyring
print keyring.get_password('keyring_demo','username')

This is how it runs:

$ python load.py
thisisabadpassword

As you can see, the API is as easy as it could possibly get. It even chose the KWallet backend automatically because I am in KDE!
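Putting the two calls together, the usual "remember my password" flow might look like the sketch below. MemoryKeyring and get_or_ask are made-up names, and MemoryKeyring is only an in-memory stand-in so the example runs without a desktop wallet. I am assuming the real keyring module can be passed as the backend (it exposes the same two functions) and that get_password returns None when nothing is stored.

```python
class MemoryKeyring:
    """In-memory stand-in exposing the two keyring calls used in this post."""

    def __init__(self):
        self._store = {}

    def set_password(self, service, username, password):
        self._store[(service, username)] = password

    def get_password(self, service, username):
        # Assumption: like keyring, return None when no password is stored.
        return self._store.get((service, username))


def get_or_ask(backend, service, username, ask):
    """Return the stored password, asking (and saving) only when none exists."""
    password = backend.get_password(service, username)
    if password is None:
        password = ask()
        backend.set_password(service, username, password)
    return password


backend = MemoryKeyring()
first = get_or_ask(backend, 'keyring_demo', 'username', lambda: 'thisisabadpassword')
second = get_or_ask(backend, 'keyring_demo', 'username', lambda: 'never asked')
print(first == second)  # the second call never needs to ask
```

In a real program, ask would probably be something like getpass.getpass.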

Python-keyring is a module that fixes a big problem, so a big thank you to Kang Zhang and Tarek Ziadé (who had the idea).

Migrating from Haloscan to Disqus (if you can comment on it, it worked ;-)

2009-12-18 00:37

Introduction

If you are a Haloscan user and are starting to wonder what you can do, this page explains a way to take your comments to Disqus, another free comment service.

A few days ago, Haloscan announced they were stopping their free comment service for blogs. Guess which service holds the last 9 years of this blog's comments? Yes, Haloscan.

They offered a simple migration to their Echo platform, which you have to pay for. While Echo looks like a perfectly nice comment platform, I am not going to spend any money on this blog if I can help it, since it already eats a lot of my time.

Luckily, the guys at Haloscan allow exporting the comments (that used to be only for their premium accounts), so thanks Haloscan, it has been nice!

So, I started researching where I could run to. There seem to be two large free comment systems: Disqus and Intense Debate.

Keep in mind that my main interest lies in not losing almost ten years of comments, not in how great the service is. That being said, they both seem to offer roughly the same features.

Let's consider how you can import comments to each service:

Disqus: It can import from Blogger and some other hosted blog services. Not from Haloscan.
Intense Debate: Can import from some hosted services, and from some files. Not from the file Haloscan gave me.

So, what is a guy to do? Write a python program, of course! Here's where Disqus won: they have a public API for posting comments.

So, all I have to do then is:

Grok the Disqus API
Grok the Haloscan comments file (it's XML)
Create the necessary threads and whatever in Disqus
Post the comments from Haloscan to Disqus
Hack the blog so the links to Haloscan now work for Disqus

Piece of cake. It only took me half a day, which at my current rates is what 3 years of Echo would have cost me, but where's the fun in paying?

So, let's go step by step.

1. Grok the Disqus API

Luckily, there is a reasonable Disqus Python Client library and docs for the API, so this was not hard.

Just get the library and install it:

hg clone https://IanLewis@bitbucket.org/IanLewis/disqus-python-client/
cd disqus-python-client
python setup.py install

The API usage we need is really simple, so study the API docs for 15 minutes if you want. I got almost all the tips I needed from this pyblosxom import script.

Basically:

Get your API Key
You log in
You get the right "forum" (you can use a disqus account for more than one blog)
Post to the right thread

2. Grok the Haloscan comments file

Not only is it XML, it's pretty simple XML!

Here's a taste:

<?xml version="1.0" encoding="iso-8859-1" ?>
<comments>
    <thread id="BB546">
      <comment>
        <datetime>2007-04-07T10:21:54-05:00</datetime>
        <name>superstoned</name>
        <email>josje@aaaaaa.nl</email>
        <uri></uri>
        <ip>86.92.111.236</ip>
        <text><![CDATA[that is one hell of a cool website ;-)]]></text>
      </comment>
      <comment>
        <datetime>2007-04-07T16:14:53-05:00</datetime>
        <name>Remi Villatel</name>
        <email>maxilys@aaaaaa.fr</email>
        <uri></uri>
        <ip>77.216.206.65</ip>
        <text><![CDATA[Thank you for these rare minutes of sweetness in this rough world...]]></text>
      </comment>
    </thread>
</comments>

So, a comments tag that contains one or more thread tags, which contain one or more comment tags. Piece of cake to traverse using ElementTree!
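As a quick illustration of that traversal, here is a sketch that walks a trimmed copy of the sample above and yields one tuple per comment. iter_comments is a name I made up for the example, and it only reads three of the fields.

```python
from xml.etree import ElementTree

# Trimmed copy of the sample export, kept as bytes so the
# encoding declaration is honoured by the parser.
SAMPLE = b"""<?xml version="1.0" encoding="iso-8859-1" ?>
<comments>
    <thread id="BB546">
      <comment>
        <datetime>2007-04-07T10:21:54-05:00</datetime>
        <name>superstoned</name>
        <uri></uri>
        <text><![CDATA[that is one hell of a cool website ;-)]]></text>
      </comment>
    </thread>
</comments>
"""


def iter_comments(root):
    """Yield (thread id, author name, text) for every comment in the export."""
    for thread in root.findall('thread'):
        for comment in thread.findall('comment'):
            yield (thread.attrib['id'],
                   comment.findtext('name') or 'Anonymous',
                   comment.findtext('text') or '')


root = ElementTree.fromstring(SAMPLE)
rows = list(iter_comments(root))
print(rows)
```

The CDATA wrapper needs no special handling: the parser hands back the text inside it.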

There is an obvious match between comments and threads in Haloscan and Disqus. Good.

3. Create the necessary threads and whatever in Disqus

This is the tricky part, really, because it requires some things from your blog.

You must have a permalink for each post
Each permalink should be a separate page. You can't have permalinks with # in the URL
You need to know what Haloscan ID you used for each post's comments, and what the permalink for each post is.

For example, suppose you have a post at //ralsina.me/weblog/posts/ADV0.html and it has a Haloscan comments link like this:

<a href="javascript:HaloScan('ADV0');" target="_self"> <script type="text/javascript">postCount('ADV0');</script></a>

You know where else that 'ADV0' appears? In Haloscan's XML file, of course! It's the "id" attribute of a thread.

Also, the title of this post is "Advogato post for 2000-01-17 17:19:57" (hey, it's my blog ;-)

Got that?

Then we want to create a thread in Disqus with that exact same data:

URL
Thread ID
Title

The bad news is... you need to gather this information for your entire blog and store it somewhere. If you are lucky, you may be able to get it from a database, as I did. If not... well, it's going to be a lot of work :-(

For the purpose of this explanation, I will assume you got that data nicely in a dictionary indexed by thread id:

{
  id1: (url, title),
  id2: (url, title)
}
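Before importing anything, it may be worth checking that the dictionary really covers every thread in the Haloscan file. A minimal sketch, with a made-up helper name (missing_threads) and a tiny inline XML document standing in for the real export:

```python
from xml.etree import ElementTree


def missing_threads(root, threads):
    """Return the Haloscan thread ids that have no (url, title) entry."""
    return [thread.attrib['id'] for thread in root.findall('thread')
            if thread.attrib['id'] not in threads]


# Stand-in for the real export: two threads, only one of them known.
root = ElementTree.fromstring(
    '<comments><thread id="ADV0"/><thread id="ADV1"/></comments>')
threads = {
    'ADV0': ('//ralsina.me/weblog/posts/ADV0.html',
             'Advogato post for 2000-01-17 17:19:57'),
}
print(missing_threads(root, threads))
```

An empty list means every thread id has a URL and a title to go with it.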

4. Post the comments from Haloscan to Disqus

Here's the code. It's not really tested, because I had to make several attempts and fixes, but it should be close to OK (download).

#!/usr/bin/python
# -*- coding: utf-8 -*-

# Read all comments from a CAIF file, the XML haloscan exports

from disqus import DisqusService
from xml.etree import ElementTree
from datetime import datetime
import time


# Obviously these should be YOUR comment threads ;-)
threads={
    'ADV0': ('//ralsina.me/weblog/posts/ADV0.html','My first post'),
    'ADV1': ('//ralsina.me/weblog/posts/ADV1.html','My second post'),
    }

key='USE YOUR API KEY HERE'
ds=DisqusService()
ds.login(key)
forum=ds.get_forum_list()[0]

def importThread(node):
    t_id=node.attrib['id']

    # Your haloscan thread data
    thr_data=threads[t_id]

    # A Disqus thread: it will be created if needed
    thread=ds.thread_by_identifier(forum,t_id,t_id)['thread']

    # Set the disqus thread data to match your blog
    ds.update_thread(forum, thread, url=thr_data[0], title=thr_data[1])


    # Now post all the comments in this thread
    for node in node.findall('comment'):
        dt=datetime.strptime(node.find('datetime').text[:19],'%Y-%m-%dT%H:%M:%S')
        name=node.find('name').text or 'Anonymous'
        email=node.find('email').text or ''
        uri=node.find('uri').text or ''
        text=node.find('text').text or 'No text'

        print '-'*80
        print 'Name:', name
        print 'Email:', email
        print 'Date:', dt
        print 'URL:', uri
        print
        print 'Text:'
        print text

        print ds.create_post(forum, thread, text, name, email,
                                   created_at=dt, author_url=uri)
        time.sleep(1)

def importComments(fname):
    tree=ElementTree.parse(fname)
    for node in tree.findall('thread'):
        importThread(node)


# Replace comments.xml with the file you downloaded from Haloscan
importComments('comments.xml')
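One detail in that script: the [:19] slice throws away the UTC offset (the -05:00 part) of each Haloscan timestamp. If you would rather keep it, a sketch for Python 3.7 or later could look like this. parse_haloscan_datetime is a made-up name, and whether Disqus expects local or UTC times is something I have not verified, so treat the conversion as an assumption.

```python
from datetime import datetime, timezone


def parse_haloscan_datetime(text):
    """Parse a Haloscan timestamp, keeping its UTC offset."""
    return datetime.fromisoformat(text)


local = parse_haloscan_datetime('2007-04-07T10:21:54-05:00')
# Convert to UTC in case the receiving side wants a single time zone.
utc = local.astimezone(timezone.utc)
print(utc.isoformat())
```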

Now, if we are lucky, you already have a nice and fully functioning collection of comments in your Disqus account, and you should be calm knowing you have not lost your data. Ready for the final step?

5. Hack the blog so the links to Haloscan now work for Disqus

You may not need to do anything beyond what the Disqus install guide suggests. If you have to do the custom generic install editing HTML manually, it's something like this:

First I changed my comment links. Here's the Haloscan version:

<a href="javascript:HaloScan('ADV0');" target="_self">
<script type="text/javascript">postCount('ADV0');</script></a>

And here's the Disqus version:

<a href="//ralsina.me/weblog/posts/ADV0.html#disqus_thread">Comments</a>

Add the monkeypatching javascript at the bottom, before </body> like the Disqus install guide says (here's mine, it's not the one you want!):

<script type="text/javascript">
//<![CDATA[
(function() {
        var links = document.getElementsByTagName('a');
        var query = '?';
        for(var i = 0; i < links.length; i++) {
        if(links[i].href.indexOf('#disqus_thread') >= 0) {
                query += 'url' + i + '=' + encodeURIComponent(links[i].href) + '&';
        }
        }
        document.write('<script charset="utf-8" type="text/javascript" src="http://disqus.com/forums/yourownblognamegoeshere/get_num_replies.js' + query + '"></' + 'script>');
})();
//]]>
</script>

Also, in the post's own webpage, add the "embed code" from the Disqus install guide wherever you want the comments to appear.

If you want more than one page to share a set of comments (for example, I use the same comments for the Spanish and English versions of a post), you will have to use something like this before the embed code:

<script type="text/javascript">
    var disqus_url = "http://the_url_where_the_comments_really_belong.com";
</script>

And that's about it; it should be enough to get you going. However:

If you found this guide useful, and it saved you money, why not be a nice guy and give it to me instead? There's a donate link at the left, you know ;-)
If you can't manage to fix it, ask me for help. I have very reasonable consulting rates!

Make sure to export your Haloscan comments while you still can!

New 24-hour app coming (not so) soon: foley

2009-12-16 17:12

First a short explanation:

24-hour apps are small, self-contained projects where I intend to create a decent, useful application in 24 hours. The concept is that:

I will think about this app a lot for a while
I will design it in my head or in written notes
I will code, from scratch, for 24 hours.
That's not one day, really, but 24 hours of work. I can't work 24 hours straight anymore.

The last time around this didn't quite work as I intended, but it was fun and educational (for me at least ;-) and the resulting app is really not bad!

So, what's foley going to be? A note-taking app aimed at students and conference attendees.

In your last geeky conference, did you notice everyone is using a computer?

And what are they taking notes on? Vi? Kwrite? OpenOffice? Whatever it is they use, it's not meant to be used for this purpose.

So, what will foley do differently? I don't quite know yet, but I have some ideas:

A strong timeline orientation. Every paragraph will be dated.
Twitter/Identica support. Want to liveblog your notes? Just click.
Multimedia incorporated in the timeline.
- Webcam/Audio recording synced to your notes?
- Images imported and added in the timeline?
- Attach files to the timeline? (Useful for slides?)
If provided with a PDF of slides, attach each slide to the right moment in the timeline
Easy web publishing: find a way to put this on a webpage easily and quickly (single-click publishing is the goal)

I have only thought about this for about 10 minutes, but I see potential here.

The bad news is... I have a ton of paying work to do. So this will probably only happen in January. However, I wanted to post it so I can take input while in this planning phase.

So, any ideas?

Making a unique application using python and DBUS

2009-12-11 11:04

No, not unique in the sense "oh, this app is a special snowflake", but unique in the sense "you can only run one copy of this application".

I tried googling for it and I always found the same answer, "use dbus, try to own the name, if it exists already, then a copy is already running".

What I could not find is one working example of this, or at least not something conveniently labeled "here is how you do a unique application using dbus and python".

So, here is how you do a unique application using dbus and python:

Supposing your application is called uRSSus (mine is):

import dbus
import dbus.service

session_bus = dbus.SessionBus()
try:
    session_bus.get_object("org.urssus.service", "/uRSSus")
    # This is the second copy, make the first one show instead
    # TODO: implement
except dbus.DBusException: # No other copy running
    # This will 'take' the DBUS name
    name = dbus.service.BusName("org.urssus.service", bus=session_bus)
    # Now, start your app:
    window=MainWindow()
    server = UrssusServer(window,name)
    # ... the rest of your application's startup goes here

And that's it. No, it's not hard, but since the DBUS docs seem to be... rather they seem almost not to be sometimes, every little bit may help.

Ralsina.Me — Roberto Alsina's website

Posts about python (old posts, page 57)
