To write, and to write what.

2012-02-17 03:18

Some of you may know I have written about 30% of a book, called "Python No Muerde", available at http://nomuerde.netmanagers.com.ar (in Spanish only). That book has stagnated for a long time.

On the other hand, I wrote a very popular series of posts, called PyQt by Example, which has (you guessed it) stagnated for a long time.

The main problem with the book was that I tried to cover way too much ground. When complete, it would be a 500-page book, and that would involve writing half a dozen example apps, some of them in areas where I am no expert.

The main problem with the post series is that the example is lame (a TODO app!) and expanding it is boring.

So, what better way to fix both things at once than to merge them?

I will leave Python No Muerde as it is, and will do a new book, called PyQt No Muerde. It will keep the tone and language of Python No Muerde, and will even share some chapters, but will focus on developing a PyQt app or two, instead of the much more ambitious goals of Python No Muerde. It will be about 200 pages.

I have acquired permission from my superiors (my wife) to work on this project a couple of hours a day, in the early morning. So, it may move forward, or it may not. This is, as usual, an experiment, not a promise.

PyQt Quickie: Don't Get Garbage Collected

2012-02-10 22:57

There is one area where Qt and Python (and in consequence PyQt) have major disagreements. That area is memory management.

While Qt has its own mechanisms to handle object allocation and disposal (the hierarchical QObject trees, smart pointers, etc.), PyQt runs on Python, so it has garbage collection.

Let's consider a simple example:

from PyQt4 import QtCore

def finished():
    print "The process is done!"
    # Quit the app
    QtCore.QCoreApplication.instance().quit()

def launch_process():
    # Do something asynchronously
    proc = QtCore.QProcess()
    proc.start("/bin/sleep 3")
    # After it finishes, call finished
    proc.finished.connect(finished)

def main():
    app = QtCore.QCoreApplication([])
    # Launch the process
    launch_process()
    app.exec_()

main()

If you run this, this is what will happen:

QProcess: Destroyed while process is still running.
The process is done!

Plus, the script never ends. Fun! The problem is that proc is being deleted at the end of launch_process because there are no more references to it.

Here is a better way to do it:

from PyQt4 import QtCore

processes = set([])

def finished():
    print "The process is done!"
    # Quit the app
    QtCore.QCoreApplication.instance().quit()

def launch_process():
    # Do something asynchronously
    proc = QtCore.QProcess()
    processes.add(proc)
    proc.start("/bin/sleep 3")
    # After it finishes, call finished
    proc.finished.connect(finished)

def main():
    app = QtCore.QCoreApplication([])
    # Launch the process
    launch_process()
    app.exec_()

main()

Here, we add a global processes set and add proc there so we always keep a reference to it. Now, the program works as intended. However, it still has an issue: we are leaking QProcess objects.

While in this case the leak is very short-lived, since we are ending the program right after the process ends, in a real program this is not a good idea.

So, we would need to add a way to remove proc from processes in finished. This is not as easy as it may seem. Here is an idea that will not work as you expect:

def launch_process():
    # Do something asynchronously
    proc = QtCore.QProcess()
    processes.add(proc)
    proc.start("/bin/sleep 3")
    # Remove the process from the global set when done
    proc.finished.connect(lambda: processes.remove(proc))
    # After it finishes, call finished
    proc.finished.connect(finished)

In this version, we will still leak proc, even though processes is empty! Why? Because we are keeping a reference to proc in the lambda!

I don't really have a good fix for this that doesn't involve turning everything into members of a QObject and using sender to figure out which process is ending, or using QSignalMapper. That version is left as an exercise.
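The reference held by the lambda can be seen without involving Qt at all. What follows is only a sketch of the mechanism, not of what PyQt does internally: FakeProcess and connected_slots are made-up stand-ins for QProcess and for whatever holds the connected slots, and the timing of the cleanup is CPython behavior.

```python
import gc
import weakref


class FakeProcess(object):
    """Hypothetical stand-in for QProcess; it does nothing."""


# Hypothetical stand-in for the slots connected to a signal
connected_slots = []


def launch_process():
    proc = FakeProcess()
    # The lambda closes over proc, so whoever holds the lambda holds proc
    connected_slots.append(lambda: proc)
    # A weak reference lets us watch proc without keeping it alive
    return weakref.ref(proc)


proc_ref = launch_process()
gc.collect()
# proc survived the end of launch_process because the lambda refers to it
alive_while_connected = proc_ref() is not None

# Dropping the lambda drops the last strong reference to proc
del connected_slots[:]
gc.collect()
alive_after_disconnect = proc_ref() is not None
```

As long as the lambda is reachable, so is proc. That is why emptying the processes set is not enough on its own.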

Garbage Collection Has Side Effects

2012-01-31 18:08

Just a quick followup to "The problem is is. Is it not?". This is not mine; I got it from reddit.

This should really not surprise you:

>>> a = [1,2]
>>> b = [3,4]
>>> a is b
False
>>> a == b
False
>>> id(a) == id(b)
False

After all, a and b are completely different things. However:

>>> [1,2] is [3,4]
False
>>> [1,2] == [3,4]
False
>>> id([1,2]) == id([3,4])
True

Turns out that using literals, one of those things is not like the others.

First, the explanation, so you understand why this happens. When there are no more references to a piece of data, it gets garbage collected and its memory is freed, so it can be reused for other things.

In the first case, I am keeping references to both lists in the variables a and b. That means the lists have to exist at all times, since I can always say print a and Python has to know what's in it.

In the second case, I am using literals, which means there is no reference to the lists after they are used. When Python evaluates id([1,2]) == id([3,4]) it first evaluates the left side of the ==. After that is done, there is no need to keep [1,2] available, so it's deleted. Then, when evaluating the right side, it creates [3,4].

As it happens, it will use the exact same place for it as it was using for [1,2]. Nothing guarantees this; it is just how CPython tends to reuse memory it has just freed. So id will return the same value. This is just to remind you of a couple of things:

a is b is usually (but not always) the same as id(a) == id(b)
garbage collection can cause side effects you may not be expecting
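The same contrast can be written as two small functions. This is just a sketch with made-up function names. The only result that is guaranteed is the first one, because two objects that are alive at the same time can never share an id.

```python
def ids_with_references():
    a = [1, 2]
    b = [3, 4]
    # Both lists are alive here, so their ids must differ
    return id(a) == id(b)


def ids_of_temporaries():
    # Each list can be freed as soon as id() returns, so the second
    # one may land on the same address (likely on CPython, never promised)
    return id([1, 2]) == id([3, 4])


guaranteed = ids_with_references()
implementation_dependent = ids_of_temporaries()
```

Comparing ids only tells you something about identity while both objects are still alive.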

The problem is is. Is it not?

2012-01-28 18:14

This has been a recurring discussion on the Python Argentina mailing list. Since it has not come up in a while, why not recap it, so the next time it happens people can just link here?

Some people for some reason do this:

>>> a = 2
>>> b = 2
>>> a == b
True
>>> a is b
True

And then, when they do this, they are surprised:

>>> a = 1000
>>> b = 1000
>>> a == b
True
>>> a is b
False

They are surprised because "2 is 2" makes more intuitive sense than "1000 is not 1000". This could be attributed to an inclination towards platonism, but really, it's because they don't know what is is.

The is operator is (on CPython) simply a memory address comparison. If objects a and b are the same exact chunk of memory, then they "are" each other. Since Python pre-creates a bunch of small integers, every 2 you create is really not a new 2, but the same 2 as last time.
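Here is a small sketch of the difference between the two comparisons. The numbers are built with int() rather than written as literals because, in a script, CPython may store identical literals as one shared constant, which would hide the effect seen in the interactive session.

```python
a = int("1000")
b = int("1000")

# Equality compares values, so this holds on any implementation
equal = (a == b)

# Identity compares objects; the answer here is implementation dependent
same_object = (a is b)

# For two names that are both alive, `is` agrees with comparing ids
agrees_with_id = (same_object == (id(a) == id(b)))
```

Only equal is something a program should rely on.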

This works because of two things:

Integers are read-only objects. You can have as many variables "holding" the same 2, because they can't break it.
In Python, assignment is just aliasing. You are not making a copy of 2 when you do a = 2, you are just saying "a is another name for this 2 here".
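Both points can be seen in a few lines. This is a minimal sketch, and the variable names are arbitrary.

```python
n = 2
m = n          # m is another name for the same 2
m += 1         # ints are read-only, so this binds m to a different object
# n is untouched by the line above

items = [1, 2]
alias = items      # another name for the same list, not a copy
alias.append(3)    # lists are mutable, so the change shows through items

copied = list(items)   # an explicit copy is a separate object
copied.append(4)       # items does not see this
```

Aliasing is harmless for integers precisely because nothing can change the 2 that both names point to. With a list, every alias sees every change.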

This is surprising for people coming from other languages, like, say, C or C++. In those languages, a variable int a will never use the same memory space as another variable int b because a and b are names for specific bytes of memory, and you can change the contents of those bytes. In C and C++, integers are a mutable type. This 2 is not that 2, unless you do it intentionally using pointers.

In fact, the way assignment works in Python also leads to other surprises that matter more in real life. For example, look at this session:

>>> def f(s=""):
...     s+='x'
...     return s
...
>>> f()
'x'
>>> f()
'x'
>>> f()
'x'

That is really not surprising. Now, let's make a very small change:

>>> def f(l=[]):
...     l.append('x')
...     return l
...
>>> f()
['x']
>>> f()
['x', 'x']
>>> f()
['x', 'x', 'x']

And that is, for someone who has not seen it before, surprising. It happens because lists are a mutable type. The default argument is evaluated only once, when the def statement is executed, and every time you call f() you are using and returning the same l. Before, you were also always using the same s, but since strings are immutable, it never changed, and you were returning a new string each time.

You could check that I am telling you the truth, using is, of course. And BTW, this is not a problem just for lists. It's a problem for objects of every class you create yourself, unless you bother making it immutable somehow. So let's be careful with default arguments, ok?
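The check with is, together with the usual way around the problem, looks roughly like this. The names f_shared and f_fresh are made up for the comparison, and using None as the default is a common idiom rather than anything specific to this post.

```python
def f_shared(l=[]):
    # The same default list is reused on every call
    l.append('x')
    return l


def f_fresh(l=None):
    # None is immutable, so it is safe as a default;
    # a new list is built on each call that needs one
    if l is None:
        l = []
    l.append('x')
    return l


first = f_shared()
second = f_shared()
same_list = first is second    # both calls returned the default list

fresh_a = f_fresh()
fresh_b = f_fresh()
different_lists = fresh_a is not fresh_b
```

Note that first and second are the same object, so after the second call both names show two items.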

But the main problem with finding the original "1000 is not 1000" thing surprising is that, in truth, it's uninteresting. Integers are fungible. You don't care if they are the same integer, you only really care that they are equal.

Testing for integer identity is like worrying, after you loan me $1, about whether I return you a different or the same $1 coin. It just doesn't matter. What you want is just a $1 coin, or a 2, or a 1000.

Also, the result of 2 is 2 is implementation dependent. There is no reason, beyond an optimization, for that to be True.

Hoping this was clear, let me give you a last snippet:

>>> a = float('NaN')
>>> a is a
True
>>> a == a
False
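The same session as a script, with two extras that are not in the original: math.isnan, which is the dependable way to test for NaN, and list membership, which checks identity before equality.

```python
import math

a = float('NaN')

is_itself = a is a          # identity: there is only one object here
equals_itself = a == a      # NaN compares unequal to everything, itself included
detected = math.isnan(a)    # the reliable test for NaN

# `in` on a list tries identity first and only then equality,
# so the NaN object is found even though it is not equal to itself
found_in_list = a in [a]
```

So here is a case where identity holds and equality does not, the opposite of the 1000 example.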

UPDATE: lots of fun and interesting comments about this post at reddit, and a small followup in "Garbage Collection Has Side Effects".

People doing useful stuff with my toys

2012-01-25 10:33

About a year ago, I wrote a small web browser called De Vicenzo, just for fun.

But hey, someone went and made it useful for something! Specifically, to provide previews when writing Sphinx docs.

That's cool :)

Ralsina.Me — Roberto Alsina's website

Posts about python (old posts, page 71)
