The problem is is. Is it not?

2012-01-28 18:14

This has been a recurring discussion on the Python Argentina mailing list. Since it has not come up in a while, why not recap it, so the next time it happens people can just link here?

Some people for some reason do this:

>>> a = 2
>>> b = 2
>>> a == b
True
>>> a is b
True

And then, when they do this, they are surprised:

>>> a = 1000
>>> b = 1000
>>> a == b
True
>>> a is b
False

They are surprised because "2 is 2" makes more intuitive sense than "1000 is not 1000". This could be attributed to an inclination towards Platonism, but really, it's because they don't know what is is.

The is operator is (on CPython) simply a memory address comparison. If objects a and b are the exact same chunk of memory, then they "are" each other. Since CPython pre-creates a bunch of small integers (-5 through 256 in current versions), every 2 you create is really not a new 2, but the same 2 as last time.

This works because of two things:

Integers are read-only objects. You can have as many variables "holding" the same 2, because they can't break it.
In Python, assignment is just aliasing. You are not making a copy of 2 when you do a = 2, you are just saying "a is another name for this 2 here".
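Both points can be checked from a script. What follows is only a small sketch: the helper same_object and the variable names are made up for this example, and the fact that id() is a memory address is a CPython detail, not something the language promises.

```python
def same_object(x, y):
    """Return True when x and y are the very same object (what `is` checks)."""
    # id() gives an identity that is unique among live objects;
    # on CPython it happens to be the memory address.
    return id(x) == id(y)


# Assignment is aliasing: b is just another name for the object a names.
a = [1, 2, 3]
b = a
print(a is b, same_object(a, b))   # True True

# An equal list built separately is a different object.
c = [1, 2, 3]
print(a == c, a is c)              # True False

# Rebinding a name never touches the object the other name refers to.
first = 2
second = first
first += 1                         # first now names a different int
print(first, second)               # 3 2
```

Lists are used for the first two checks on purpose: two separately built lists are always distinct objects, so the result does not depend on any integer caching.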

This is surprising for people coming from other languages, like, say, C or C++. In those languages, a variable int a will never use the same memory space as another variable int b, because a and b are names for specific bytes of memory, and you can change the contents of those bytes. In C and C++, integers are a mutable type. This 2 is not that 2, unless you do it intentionally using pointers.

In fact, the way assignment works in Python also leads to other surprises, more interesting in real life. For example, look at this session:

>>> def f(s=""):
...     s+='x'
...     return s
...
>>> f()
'x'
>>> f()
'x'
>>> f()
'x'

That is really not surprising. Now, let's make a very small change:

>>> def f(l=[]):
...     l.append('x')
...     return l
...
>>> f()
['x']
>>> f()
['x', 'x']
>>> f()
['x', 'x', 'x']

And that is, for someone who has not seen it before, surprising. It happens because lists are a mutable type. The default argument is evaluated only once, when the def statement is executed, and every time you call f() you are using and returning the same l. Before, you were also always using the same s, but since strings are immutable, it never changed: s+='x' built a new string each time, and that new string is what you got back.
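One way to watch this happen is to look at where CPython keeps the defaults, the function's __defaults__ attribute. This is a sketch for illustration only; f is the function from the session above, and g is just the string version under a different name so both can live in one script.

```python
def f(l=[]):
    l.append('x')
    return l


# The default list lives on the function object, created once when
# the def statement ran.
print(f.__defaults__)    # ([],)
f()
f()
print(f.__defaults__)    # (['x', 'x'],)


def g(s=""):
    s += 'x'             # builds a new string; the default "" is untouched
    return s


g()
g()
print(g.__defaults__)    # ('',)
```

The list default keeps growing because every call mutates that one stored object, while the string default stays empty because nothing can mutate it.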

You could check that I am telling you the truth, using is, of course. And BTW, this is not a problem just for lists. It's a problem for objects of every class you create yourself, unless you bother making it immutable somehow. So let's be careful with default arguments, ok?
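Here is roughly what that check could look like, plus one common workaround. Take it as a sketch: Basket, add_item and f_fresh are hypothetical names invented for this example, and using None as a sentinel is a widespread convention rather than the only way to do it.

```python
def f(l=[]):
    l.append('x')
    return l


# Every call hands back the very same list object.
print(f() is f())                        # True


class Basket:
    """A hypothetical mutable class, only for illustration."""
    def __init__(self):
        self.items = []


def add_item(item, basket=Basket()):
    basket.items.append(item)
    return basket


# The same Basket instance is shared between calls.
print(add_item('x') is add_item('y'))    # True
print(add_item('z').items)               # ['x', 'y', 'z']


def f_fresh(l=None):
    # None is immutable, so it is a safe default; build the list per call.
    if l is None:
        l = []
    l.append('x')
    return l


print(f_fresh() is f_fresh())            # False
print(f_fresh())                         # ['x']
```

The sentinel version costs two extra lines, and it gives each call its own list unless the caller passes one in.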

But the main problem with finding the original "1000 is not 1000" thing surprising is that, in truth, it's uninteresting. Integers are fungible. You don't care whether they are the same integer; you only really care that they are equal.

Testing for integer identity is like worrying, after you loan me $1, about whether I return you a different or the same $1 coin. It just doesn't matter. What you want is just a $1 coin, or a 2, or a 1000.

Also, the result of 2 is 2 is implementation-dependent. There is no reason, beyond an optimization, for that to be True.
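If you want to see what your own interpreter does, something like the following sketch would work. The function name built_at_runtime is made up for this example, and no expected identity results are shown in the code on purpose, since they are exactly the part that is not guaranteed.

```python
import sys


def built_at_runtime(text):
    # Two separate int() calls; whether they return one shared object
    # is up to the interpreter.
    a = int(text)
    b = int(text)
    return a == b, a is b


print(sys.implementation.name)
print(built_at_runtime("2"))
print(built_at_runtime("1000"))
```

On CPython builds that cache small integers, the last two lines typically print (True, True) and (True, False). The equality half is the only part you can rely on everywhere.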

Hoping this was clear, let me give you a last snippet:

>>> a = float('NaN')
>>> a is a
True
>>> a == a
False

UPDATE: lots of fun and interesting comments about this post at reddit and a small followup here

toyg / 2012-01-30 11:00:

This is actually an issue in PyCharm, the JetBrains IDE for Python, which (from what I can see) will always nudge you to use "is" for any integer comparison, as it's "more readable" (which it is, but correctness will always trump readability).

BTW, the "bunch of small integers" is exactly 256, at least on Windows.

Benjamin Kloster / 2012-01-31 07:12:

I use PyCharm extensively, and I've never seen it suggest the "is" operator for integer comparison.

toyg / 2012-01-31 09:30:

maybe you turned off some option? i'm pretty sure it does suggest it. EDIT: see http://lateral.netmanagers....

Guest / 2012-01-31 16:58:

I think you're talking about the use of "and" or "or" instead of "&&" and "||".

toyg / 2012-01-31 23:14:

Actually, I went back to check and I think what I meant is the warning "boolean variable check can be simplified" (under Settings -> Inspectors -> Python), i.e. if int(someVar) == 0
In those cases, PyCharm suggests to switch from == to "is".

John Lenton / 2012-01-30 14:19:

Remind me not to lend you $1... :-)

jjconti / 2012-01-30 20:45:

Haha, the last snippet is like a stand-up punchline.

vegai / 2012-01-31 08:40:

What's the benefit of pre-creating all 8-bit integers?

Roberto Alsina / 2012-01-31 17:48:

You use them often, and creating objects is kinda expensive.

shabbyrobe / 2012-01-31 12:21:

This is an excellent illustration of two of the most eye-popping WTFs Python has to offer. For a language that potentially has so much to offer beginners, total catastrophe gotchas like these simply shouldn't exist. A lot of us may well have a computer science background, but a language that comes so close to being incredibly useful without one shouldn't make it a requirement.

Roberto Alsina / 2012-01-31 17:49:

While I would not have called is "is", there is really little to be done about the mutable default argument issue without making it less useful, AFAICS

Ralsina.Me — Roberto Alsina's website