2007-09-14 00:57

The unfathomable improbability of my existence.

I am doing this as a public service for USA readers, because you are likely to be in contact with someone who believes the evolution of life is incredibly unlikely. After all, many of your presidential candidates don't believe in evolution. That would be sad, except I expect some of them do believe in it but are afraid of losing the ignorant religious fanatic vote (which makes it just lame).

You will sometimes see someone say something like "That's unlikely! The odds of A are 1 in 1000 and the odds of B are 1 in 500, so the odds of A and B are 1 in 500000!"

This argument is very, very, very wrong. But to those with an intuitive knowledge of probability it looks good, because they know that the odds of a coin flip coming up heads are 1/2, the odds of 2 consecutive heads are 1/2 * 1/2 = 1/4, and so on.

Now, let's construct me this way (warning: made up numbers ahead).

I am Argentinian. There are roughly 35M Argentinians out of 6000M humans. That's about 1 in 171.

I am male. That's about 1 in 2.

I am a bit over median height. That's also 1 in 2.

I have a beard. I am guessing 1 in 10.

I wear size 41 shoes. That's about 1 in 4.

I use glasses: I'd say 1 in 5.

I get dizzy riding buses: About 1 in 100.

I am married: 1 in 4.

I have a child: 1 in 2.

I have a son: 1 in 4.

He is 4 months old: 1 in 200

So, I have a 1 in 87 552 000 000 chance of existing. And I could make those odds much lower. After all, I was born in 1971, my favourite colour is green and my name is Roberto.
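That number is just the running product of the made-up odds listed above. A quick way to check the arithmetic (the dictionary labels are mine, the values are the "1 in N" figures from the list):

```python
from math import prod

# "1 in N" values copied from the list above (all made-up numbers).
odds = {
    "Argentinian": 171,
    "male": 2,
    "over median height": 2,
    "beard": 10,
    "size 41 shoes": 4,
    "glasses": 5,
    "dizzy on buses": 100,
    "married": 4,
    "has a child": 2,
    "has a son": 4,
    "son is 4 months old": 200,
}

# The naive approach: multiply everything as if the facts were independent.
naive = prod(odds.values())
print(f"1 in {naive:,}")  # 1 in 87,552,000,000
```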

There are two reasons why you should never trust this kind of number manipulation unless you really, really know what you are doing.

  1. You can't just multiply anything. These odds have to be about statistically independent variables.

For example, I get 1 in 8 from having a child, and having a son.

Yes, maybe the odds of someone having a child are 1 in 2, and the odds of someone having a son are 1 in 4, but those are correlated. The odds of having a son if you have a child are about 1 in 2. The odds of having a child if you have a son are 100% :-)

If two variables are correlated you cannot multiply their probabilities.
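For correlated facts, the right product uses a conditional probability: P(child and son) = P(child) * P(son given child). A small sketch with the same made-up numbers (variable names are mine):

```python
from fractions import Fraction

p_child = Fraction(1, 2)             # made-up odds of having a child
p_son = Fraction(1, 4)               # made-up odds of having a son
p_son_given_child = Fraction(1, 2)   # about half of all children are sons

# Wrong: multiply the two odds as if they were independent.
p_naive = p_child * p_son

# Right: multiply by the odds of a son *given* that there is a child.
p_child_and_son = p_child * p_son_given_child

print(p_naive)          # 1/8
print(p_child_and_son)  # 1/4
```

The correct answer is the same as the odds of having a son at all, which makes sense: anyone with a son has a child.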

A much sillier example. Imagine there are ten countries (A through J), each with one tenth of the population.

The odds of someone being from A are 1/10. But they are also not from B, C, D, E, F, G, H, I, or J! There's a 9/10 chance for each one of those facts!

So, the odds of being from A and not from anywhere else are 1/10 times 9/10 times 9/10 ... (nine times over), and that's almost 1 in 26!

And since the same reasoning applies to each country, it turns out there is about a 61% chance the next baby will not be born in any one of the ten countries.

See how stupid that sounds? Of course when correlation is more subtle, it's harder to figure out, and you will not be able to do this while arguing.
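Here is the bogus ten-country calculation spelled out, next to the correct one (a sketch; the variable names are mine):

```python
p_from_a = 1 / 10
p_not_from_other = 9 / 10

# Bogus: treat "from A" and the nine "not from X" facts as independent.
bogus_one_country = p_from_a * p_not_from_other ** 9
bogus_any_country = 10 * bogus_one_country

# Correct: if you are from A, you are certainly not from anywhere else.
correct_one_country = p_from_a * 1.0

print(f"bogus: 1 in {1 / bogus_one_country:.1f}")                  # bogus: 1 in 25.8
print(f"bogus chance of being born nowhere: {1 - bogus_any_country:.0%}")  # 61%
print(f"correct: 1 in {1 / correct_one_country:.0f}")              # correct: 1 in 10
```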

  2. Add random unlikely data.

This is slightly trickier, because it may be what you really wanted to do. Yes, me being exactly the way I am is extremely unlikely. However, something more or less like me is not.

Yes, I am unlikely but that's trivial. However, if you are going to apply this kind of reasoning in other cases it gets silly quick. Here's an example:

Imagine a lottery with 6-digit numbers. Today, the winner is 123456. Yesterday it was 654321.

The odds of those numbers being the winners are 1 in 1 000 000 000 000. But it's obvious that if you make two draws, some two numbers will come up! And whatever they are, they will be just as unlikely!

That something specific is unlikely doesn't always matter, because the important thing is the chance of some kind of thing happening, not of one specific thing.
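The lottery case in a few lines, assuming six-digit tickets run from 000000 to 999999 (a sketch, names are mine):

```python
import random

TICKETS = 10 ** 6  # six-digit numbers: 000000 to 999999

# Number of equally likely (yesterday, today) pairs of winners.
pairs = TICKETS ** 2
print(f"one specific pair: 1 in {pairs:,}")

# Draw twice: *some* pair always comes up, and it was exactly as
# unlikely beforehand as 654321 followed by 123456.
draws = [random.randrange(TICKETS) for _ in range(2)]
print(f"this time: {draws[0]:06d} then {draws[1]:06d}, also 1 in {pairs:,}")
```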

And when/if you have a kid, and he/she/it asks you why he should study math, show him this kind of thing and tell him why:

Math makes it harder for people to lie to you.

2007-09-13 15:00

Geek challenge: Back up this thing!

Here's the scenario:

  • A Linux+Samba server with 40GB of data.
  • A SMB-only small storage server.

Your mission? Back up the thing. You should do full backups, and keep the last three.

In another age, I would have cobbled together a 10-line script using tar (and split, see below) and been done with it. But now I want to use backup software.

So, I tried, and I ran into the following limitations:

  1. 2GB filesize limit on the storage server. I have no idea why; assume it can't be fixed.
  2. Weird unicode characters in filenames. There must be some encoding issue, but when a Windows client saves a file with accented characters, the clients see it all right. On the server, though, they are weird-looking. This is enough to make mc unable to delete some folders, for example.

So far I have tried:

  1. rdiff-backup: breaks with the unicode chars.
  2. flexbackup: breaks with filesize limit
  3. rsync: breaks with the unicode chars
  4. synbax: using rsync backend, see above. Using tar backend, breaks with filesize limit.

Here's what I want:

A simple piece of backup software, where I can tell it "take this, back it up there, keep the last three backups, do it in files smaller than 2GB, give me a report".

Bonus points if restoring it is doable from Windows.

Any suggestions?
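For reference, the old-style tar-and-split approach mentioned above could be sketched in Python roughly like this. Everything here is an assumption for illustration: the SplitWriter class, the backup function, the "backup-STAMP.tar.gz.NNN" naming, and a stamp that sorts chronologically and contains no dots. It assumes the storage server is already mounted as a local directory, and it does nothing about the filename encoding problem.

```python
import tarfile
from pathlib import Path


class SplitWriter:
    """Hypothetical file-like object that spreads writes over part files."""

    def __init__(self, prefix, max_bytes):
        self.prefix, self.max_bytes = str(prefix), max_bytes
        self.index, self.written, self.current = 0, 0, None

    def _next_part(self):
        if self.current:
            self.current.close()
        self.current = open(f"{self.prefix}.{self.index:03d}", "wb")
        self.index += 1
        self.written = 0

    def write(self, data):
        view = memoryview(data)
        while len(view):
            if self.current is None or self.written == self.max_bytes:
                self._next_part()
            chunk = view[: self.max_bytes - self.written]
            self.current.write(chunk)
            self.written += len(chunk)
            view = view[len(chunk):]
        return len(data)

    def close(self):
        if self.current:
            self.current.close()
            self.current = None


def backup(source, dest_dir, stamp, max_bytes=2 * 1024**3 - 1, keep=3):
    """Write a full backup in parts below max_bytes, keep the newest `keep`."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    out = SplitWriter(dest_dir / f"backup-{stamp}.tar.gz", max_bytes)
    # "w|gz" is tarfile's streaming mode: it only needs out.write().
    with tarfile.open(fileobj=out, mode="w|gz") as tar:
        tar.add(source, arcname=Path(source).name)
    out.close()
    # Prune: group part files by backup name and delete all but the newest.
    names = sorted({p.name.split(".")[0] for p in dest_dir.glob("backup-*.tar.gz.*")})
    for old in names[:-keep]:
        for part in dest_dir.glob(f"{old}.tar.gz.*"):
            part.unlink()
    return sorted(dest_dir.glob(f"backup-{stamp}.tar.gz.*"))
```

Calling something like backup("/srv/samba", "/mnt/storage", "20070913-1500") (both paths made up) returns the list of part files, which can double as a crude report. Restoring means concatenating the parts in order and untarring the result.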

2007-09-10 07:54

What I learned at PyWeek

  1. I don't have the time for this kind of deadline anymore. Not even one all-nighter? I did nothing on Saturday except real work and a family reunion?
  2. It's really easy to write games with Python. It's mostly just a matter of having a good game concept. The coding is the easy part.
  3. Chipmunk is cool. Qt's graphic scene stuff is cool. My ChipmunkScene is coolest ;-) I should rethink the API but the concept is killer stuff. With a little work this thing is like LEGO!
  4. I will try again in PyWeek6.

2007-09-06 15:27

A little further on TLB

A bit of progress, although not much time to work on it anymore so I will probably not make it.

Objects now can do things when they are hit by other kinds of things.

Example: If a ball hits the bottom of the catapult cup, the catapult shoots. If something hits the target object, you win the level.
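One way to wire up this kind of "do something when hit" behaviour is a table of callbacks keyed by the kinds of objects involved. This is a purely hypothetical sketch, not the actual TLB code: on_hit, collide, the kind names and the "any" wildcard are all invented for illustration.

```python
# Hypothetical sketch: none of these names come from the real TLB engine.
handlers = {}
events = []


def on_hit(target_kind, hitter_kind):
    """Register a function to run when hitter_kind hits target_kind."""
    def register(func):
        handlers[(target_kind, hitter_kind)] = func
        return func
    return register


@on_hit("catapult_cup", "ball")
def shoot(target, hitter):
    events.append("catapult shoots")


@on_hit("target", "any")
def win(target, hitter):
    events.append("level won")


def collide(target_kind, hitter_kind):
    """Look up a specific handler first, then fall back to the wildcard."""
    func = handlers.get((target_kind, hitter_kind)) or handlers.get((target_kind, "any"))
    if func:
        func(target_kind, hitter_kind)


collide("catapult_cup", "ball")
collide("target", "domino")
print(events)  # ['catapult shoots', 'level won']
```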

2007-09-06 08:43

First pic of TLB

Not a game yet, but the engine is starting to look good.


The ball drops, follows both ramps, bounces down the stairs, hits the dominoes, the last one falls on the pivoting ramp, slides down and to the left, falls standing and leans left, then hits the rest of the dominoes.

2007-09-05 17:01

PyWeek progress: the 4 hour mark

Suddenly I was having a calm day at work, and Rosario is taking care of the baby, so I spent a few hours on the PyWeek project.

I have integrated Chipmunk with QGraphicsScene.

What does it mean? That I can now...

  • Create a scene
  • Create a view onto that scene
  • Create balls, walls and polygons as scene items
  • Watch said balls/walls/polygons bounce around happily under Chipmunk direction.

For example, here's enough code to create a few balls, a box and a staircase:


# Two rows of ten balls each
for x in range(10):
    self.scene.addBall(x*50.0+10, 50.0, 10.)
    self.scene.addBall(x*50.0+20, 20.0, 10.)
# Four walls forming the box
self.scene.addWall(0., 0., 0., 500.)
self.scene.addWall(0., 500., 500., 500.)
self.scene.addWall(500., 500., 500., 0.)
self.scene.addWall(500., 0., 0., 0.)
# Staircase: each step is one horizontal and one vertical wall
for i in range(20):
    self.scene.addWall(i*20, 200+i*20, i*20+20, 200+i*20)
    self.scene.addWall(i*20+20, 200+i*20, i*20+20, 200+i*20+20)

# A closed polygon (here, a rectangle)
self.scene.addPoly([[0, 50], [0, 100], [100, 100], [100, 50], [0, 50]])

I declare that nifty.

2007-09-05 09:46


Well, it seems I am in trouble for PyWeek.

Why? Because it's Wednesday and I have done nothing. Nothing! It's because I have been working a lot, really, and I have a 4-month-old baby, too.

So, I am upping the ante.

I will do a PyDay.

I am taking tomorrow off (yeah, right!) and I'm doing the game in one day. Maybe I will scrounge a few hours on Sunday, too.

It will probably not be fit for the contest because:

  • I will use PyQt
  • I won't test it on any platform other than my Linux box

But here's the game concept (BTW: Twisted sucks as a theme. It sucks really, really, really hard!):

According to the dictionary, Twisted also means perverted. So, this game, Twisted Little Boy, is about a bad boy. A really bad boy. But a clever one. He creates machines using random equipment he finds, to do evil, really mean things.

I will probably do a live-blog thing like those tutorials I wrote years ago about PyQt.

There's a Google code project (obviously empty): http://code.google.com/p/twistedlittleboy/

See you all tomorrow.

2007-08-13 17:59

Django, the view from a parachute

In the last few days I have been learning Django in perhaps the hardest way possible: by being hired to work on a site someone else wrote.

I already had the view from 10000 feet. And since I had to get to this thing rather quickly, I jumped on my parachute from those 10000 feet, and learned it on the way down.

Here's what I knew:

  • Python Web framework
  • Regexp-based URL dispatching
  • Its own template language
  • Its own ORM and form stuff

I have hacked stuff based on TurboGears, Colubrid, pure CherryPy, Mako/Kid/Cheetah/CherryTemplate templates, Routes, Paste and about half a dozen other frameworks or pieces that are used for frameworks, so how new could it be? Well, not very new. I am starting to notice a sort of sameness in these things. They are all alike.

First, the conclusion: I liked it, I could work with it.

Now for some little detail:

  • The URL dispatching is nice, if not really interesting. There seem to be two ways to do this, all frameworks use one or the other, and almost everyone likes regexps better.
  • The ORM+newforms is quite nice! Of course everything was done with oldforms, which is... not quite so nice. But you can switch pieces as you go, and the code actually simplifies as you hack, so it's good.
  • The template language I could live without. It doesn't seem to be especially featureful, and it didn't seem as expressive to me as my current favourite, Mako. Luckily you can replace it easily. It's not that it's bad, it's just average.
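To make the regexp-based URL dispatching idea concrete, here is a framework-free sketch using only Python's re module. It is not Django's API: URL_PATTERNS, dispatch and the two view functions are made up for illustration.

```python
import re


def home():
    return "home page"


def article_detail(year, slug):
    return f"article {slug} from {year}"


# (pattern, view) pairs, tried in order.
# Named groups in the pattern become keyword arguments of the view.
URL_PATTERNS = [
    (re.compile(r"^$"), home),
    (re.compile(r"^articles/(?P<year>\d{4})/(?P<slug>[\w-]+)/$"), article_detail),
]


def dispatch(path):
    """Call the first view whose pattern matches the path."""
    for pattern, view in URL_PATTERNS:
        match = pattern.match(path)
        if match:
            return view(**match.groupdict())
    return "404 not found"


print(dispatch("articles/2007/django-parachute/"))
# article django-parachute from 2007
```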

So, I see no reason to learn it instead of TurboGears, or vice versa. On the other hand, if you know one, you can learn the other in perhaps a weekend, so there's no point in not having at least a basic knowledge of both.

2007-08-11 23:15

Thinking about this blog.

I suppose it happens to everyone once in a while, and it has happened to me often in the past, but I am wondering whether I should keep on writing this blog, or whether some large change is needed.

Here are some random things from my head:

  1. Almost no one reads it. Really. It has fewer than 40 subscribers. That's pathetic for a blog that has had content for over 7 years :-)

  2. Maybe I should post in Spanish, or at least bilingually.

  3. Maybe I should write more features. When I write a longish piece and announce it, there is a respectable traffic surge.

  4. On the other hand, I enjoy writing it. And it's really very little effort (especially now, with BartleBlog ;-)

  5. Maybe it should be more focused on one area: make it a Python programming blog, or a tutorials blog, or something like that.

  6. But I am not a focused person. I am a generalist. This week I have worked in the following things:

    • VoIP
    • Django
    • PyQt
    • Linux sysadmining
    • Firewall/Proxy integration with Windows clients
    • Consulting in the most generic sense, sitting with a company's IT staff and thinking about their situation.
    • Learning PyGame

    And this was in 5 days of work. If I listed what I have done this year, it would take me 500 items. I am broad, how could my blog be narrow?

  7. Maybe it's just not interesting? Or badly done?

  8. Is it too nerdy? Is it not nerdy enough?

  9. I have had a blog with a small readership for 7 years, why is it bothering me now?

  10. If I stop, it doesn't matter, I can always pick it up again later when I feel like writing.

So, there. You, the 40 guys, comment on it if you want ;-)

2007-08-10 13:38

Incredible new things.

Found on reddit, from Lens Culture

A mind-blowing presentation of image-browsing software by Microsoft Research at TED. Please watch it.

And then there is this research from CMU.

Really awesome stuff.

Now, this is really scifi stuff. Or should have been. While science fiction was promising jetpacks and virtual worlds, computer science is providing new and incredible ways to manipulate information.

Contents © 2000-2019 Roberto Alsina