Skip to main content

Posts about programming (old posts, page 11)

Almost a real spreadsheet! (with video)

I was able to hack a bit at StupidSheet this morning, and there are some real advances.

In fact, it's a barely functional spreadsheet already! Hell, it has at least one feature OOcalc lacks, and one both OOcalc and KSpread missed.

Check it out in this video (9.5MB):

Sorry, dead file

Sorry if the audio is out of sync and/or too low. And you probably can't stream it, you need to download it first. And maybe it will 404 on you. If that's the case, wait a few minutes and insist.

Stupid Sheet: Redoing cell displacements

For my spreadsheet project, I had to redo something I had forgotten about: cell displacement. I did that once when the formula language was python.

At the time, I parsed the python using the tokenize module and Ka-Ping Yee's regurgitate.

Python->Tokens->Displace cells->regurgitate->Python

Since I have lots of other things to do, I decided to do it the same way, and wrote the equivalent of regurgitate for Traxter, my formula language.

It turns out it was not really hard, but I had to redo parts of the parser so it kept more information about the source.

After that it was simple, you see, Traxter compiles to Python. That means all I had to do was another (very similar) backend, and there it is, a Traxter-to-Traxter compiler. Or decompiler. Or something. And relative cell references are working again (and now with the right syntax).

A graph is a graph is a graph.

After hacking for about two hours the cell dependencies yesterday using dicts, I found myself saying "how can I check if the dependency graph is cyclical?"

And of course it hit me. That's the dependency graph. Use a graph library!

I looked for about 2 minutes before finding one that's probably not optimal, but is damn simple and in the public domain: by Nathan Denny.

First interesting data point, the first two lines on the file (yes, I found out there is a later version):

#--Version 1.0.0
#--Nathan Denny, May 27, 1999

Yup. In two days this piece of code is turning 8 years old and untouched. But it works just fine and dandy!

Here's the piece of code that makes the engine run which has grown from a humble 10 LOC to almost a whooping 40 LOC:

class SpreadSheet(QtCore.QObject):
    _cells = {}
    tools = {}

    def __init__(self,parent):
        for name in dir(functions):

    def __setitem__(self, key, formula):
        self._cells[key] = [c[key][0],False,compile(c[key][0],"Formula for %s"%key,'eval')]

        # Dependency graph
        if not self.graph.has_node(key):
        for edge in self.graph.in_arcs(key):
        for cell in c[key][1]:
                print 'GRAPH(TOPO): ',self.graph.topological_sort()
                print 'GRAPH(BFS) : ',self.graph.bfs(key)
                for cell in self.graph.bfs(key)[1:]:
        except Graph_topological_error:
                # We made the graph cyclic
                # So, mark this cell as evil
                # And remove all incoming edges to go back to
                # status quo
                for edge in self.graph.in_arcs(key):

    def getformula(self, key):
        return self._cells[key][0]
    def __getitem__(self, key ):
        print self._cells[key]
        if self._cells[key][1]:
                return "ERROR: cyclic dependency"
                print 'evaluating [%s]: '%key,type(self._cells[key][0]),self._cells[key][0]
              return eval(self._cells[key][0],, self)

This engine supports:

  • User defined functions (just add them in

  • Cell dependencies with detection of cicles, of any length

  • Unlimited size spreadsheets

  • Notification of cell changes to other modules

  • Automatic recalculation

  • A custom formula language (ok, that's not in this piece of code ;-)

The whole project is now 1167 LOC, of which 591 are generated or graph_lib, which means everything including the traxter "compiler" and the UI is less than 600 lines of code.

Not too shabby.

My goal is a functional spreadsheet with a working GUI and supporting the most common functions and pieces of the formula language in .... 2000 LOC, and I am willing to work on it for about two weeks.

Let's see how it turns out. Of course anyone with 5 free minutes can contribute his favourite spreadsheet function (I already have sum ;-)

New software project: Stupid Sheet

Adding something else to my plate is probably not a very good idea, but what the heck, I can make it sleep another three years if I lose interest.

So: I am writing a real spreadsheet in python.

Probably never going to be useful for corporations, but it should be at least as featureful as Google's and it should be amazingly small.

Here are the components:

  • Traxter: my spreadsheet-formula-like-language with dependency tracking that compiles to python.

  • PyQt (hey, it has a grid widget)

  • Python (Of course)

The status right now:

  • It's almost as functional as it was 2.5 years ago

  • Except for broken relative cells.

  • But with the beginning of a real formula language.

  • With automatic recalculation and cyclic dependency checks.

  • Adding things is dead simple. Here's the implementation of SUM (not uploaded yet, though):

def sum(*args):
  for v in args:
  return ac

All the range stuff happens behind the scenes (although you may get a function called with thousands of args... I wonder how well Python handles that).

You can check it at the google code project (Use the SVN).

The python spreadsheet: Another look (Traxter DSL)

I apologize in advance for any ugly amateurism in this post. It's my first attempt at a domain specific language :-)

Yesterday I posted about using PyCells to write a spreadsheet in Python.

Sadly, I can't figure out the problem with my code, and the PyCells mailing list seems to be pretty much dead.

So, I started thinking... what other ways are to achieve my goal? And decided to go medieval on this problem.

By that I mean that I will do it the most traditional way possible... with a twist.

The traditional way is, of course, to write one or more of lexer/parser/interpreter/compiler for the formula language.

Mind you, I don't intend to do anything complete, much less Excel-compatible (see Excel formula parsers are hell in this same blog.

So, let's start with a toy language, supporting the following:

  • Assignment to a variable

  • Classic 4-op arithmetics.

  • Function calls

  • Cell ranges

That's enough for a toy spreadsheet, and it should be easy to extend.

Here's a description of the grammar for such a language, written using Aperiot 1:

# This is a simple language for arithmetic expressions


     plus   "+"
     times  "*"
     minus  "-"
     div    "/"
     equal  "="
     colon ":"
     comma ","
     semicolon ";"

     lpar  "("
     rpar  ")"




LIST             -> ASSIGNMENT                : "[$1]"
                  | ASSIGNMENT semicolon LIST : "[$1]+$3"
                  | ASSIGNMENT semicolon : "[$1]"

ASSIGNMENT       -> label equal EXPR : "($1,$3)"

ARGLIST          -> ARG comma ARGLIST : "[$1]+$3"
                  | ARG          : "[$1]"

ARG              -> RANGE       : "$1"
                  | EXPR        : "$1"
                  | label       : "$1"

EXPR             -> TERM              : "$1"
                  | TERM plus EXPR    : "(\'+\',$1,$3)"
                  | TERM minus EXPR   : "(\'-\',$1,$3)"

TERM             -> FACTOR               : "$1"
                  | FACTOR times TERM    : "(\'*\',$1,$3)"
                  | FACTOR div TERM      : "(\'/\',$1,$3)"

FACTOR           -> number           : "$1.val()"
                  | lpar EXPR rpar  : "(\'group\',$2)"
                  | FUNCALL     : "$1"
                  | label               : "$1"
                  | minus FACTOR    : "-$2"

FUNCALL          ->  label lpar ARGLIST rpar : "(\'funcall\',$1,$3)"

RANGE            -> label colon label   : "(\'range\',$1,$3)"

This transforms this:


Into this:

[(<aperiot.lexer.Identifier instance at 0xb7af10ac>,
    <aperiot.lexer.Identifier instance at 0xb7af142c>,
      <aperiot.lexer.Identifier instance at 0xb7af15cc>,
      <aperiot.lexer.Identifier instance at 0xb7af144c>)]),
 (<aperiot.lexer.Identifier instance at 0xb7b4c72c>, ('+', 2, 2))]

Which is sort of a tree with all the expressions in prefix notation in them.

Now, here is the twist: I will "compile" this tree into.... python code. So I can use eval to do the evaluation, just like in the original python spreadsheet recipe.

So this is sort of a preprocessor:

  • The user writes excel-like formulas.

  • The spreadsheet stores python code obtained through compilation.

  • The spreadsheet evals the python code.

Of course we have the same problem as usual: cell dependencies, which is the reason why I started playing with PyCells in the first place!

But... well, here's another trick: since I am compiling, I know whenever there is a variable referenced in the code. And I can remember them :-)

So, I can turn this:


Into this:

[['A1=SUM(a1,a2,a3)*2;', set(['a1', 'a3', 'a2'])],
 ['A3=2+2;', set([])]]

The "compiled" python code and a dependency set. And voila, this spreadsheet will propagate correctly.

Here's the compiler... in about 60 lines of python 2. And since the whole point of this language is to track dependencies... let's call it Traxter.

Of course, this is a toy right now. But it's a toy with potential!

from pprint import pprint
from aperiot.parsergen import build_parser
import aperiot
import cellutils
import sys


def addOp(*args):
        return '+'.join([compile_token(a) for a in args])
def mulOp(*args):
        return '*'.join([compile_token(a) for a in args])
def subOp(*args):
        return '-'.join([compile_token(a) for a in args])
def divOp(*args):
        return '/'.join([compile_token(a) for a in args])

def groupOp(*args):
        return '(%s)'%compile_token(args[0])

def funcOp(*args):
        return '%s(%s)'%(args[0].symbolic_name,
                         ','.join([compile_token(a) for a in args[1]]))

def rangeOp(*args):
        return ','.join([compile_token(a) for a in cellutils.cellrange(c1,c2)])


def compile_token(token):
        if isinstance (token,aperiot.lexer.Identifier):
                return v
        if isinstance(token,list) or isinstance(token,tuple):
            return apply(operators[token[0]],token[1:])
        return str(token)

def compile_assignment(tokens):
        return '%s=%s;'%(target,compiled)

myparser = build_parser('traxter')
pprint (assign_list)

for assignment in assign_list:

print compiled

You may be asking yourself:what the heck is Aperiot? Or Why the heck Aperiot? Well... I had never heard of it until 6 hours ago, and I just wrote a DSL using it. That means it's worth knowing.


cellrange() is left as an exercise for the reader because my current implementation is shameful ;-)

PyCells: The Python SpreadSheet redux

In 2004 I saw a recipe about how to make a "spreadsheet" in python in 10 lines of code:

class SpreadSheet:
    _cells = {}
    tools = {}
    def __setitem__(self, key, formula):
        self._cells[key] = formula
    def getformula(self, key):
        return self._cells[key]
    def __getitem__(self, key ):
        return eval(self._cells[key],, self)

It's shocking. And it works, too:

>>> from math import sin, pi
>>>, pi=pi, len=len)
>>> ss = SpreadSheet()
>>> ss['a1'] = '5'
>>> ss['a2'] = 'a1*6'
>>> ss['a3'] = 'a2*7'
>>> ss['a3']
>>> ss['b1'] = 'sin(pi/4)'
>>> ss['b1']
>>> ss.getformula('b1')

I was so awed, I wrote a PyQt version . Of course there is a catch in that code: it sucks if you are trying to write a spreadsheet with it.

Why? Because it doesn't store results, but only formulas.

For example:


If you ask for the value of A2, you get 4. If you set A1 to 7, what's the value of A2?

Well, it's nothing yet, because it's only calculated when you ask for it. But suppose you are trying to display that sheet... you need to know A2's value changed when you set A1!

That's cell dependencies, and while that simple code handles them in a way, it totally sucks in another.

So, I went ahead and coded around it successfully. Of course the code was not so pretty anymore (although a large part of the uglyness is just for making it work with Python 2.3 and relative cells).

Then yesterday, while looking at the excel formula parser madness I saw a reference to PyCells, a python port of Cells from CLOS.

Here is a blog commenting on Pycells:

It basically takes the concept of a cell in a spreadsheet that get updated automatically to programming where there are a lot of internal data states that are dependent on one another in a chain, or a complex graph of dependencies. Like, the color of a button depends on whether you selected a radio button or not. Or, shut down the motor if the sensor reads above 100 degrees (example given in text).

Almost everyone uses that analogy... however, no matter how hard I looked, I couldn't find anyone who had actually tried writing a spreadsheet using PyCells! Not even as an example!

So here it is:

import cells

class Cell(cells.Model):
    def value(self,prev):
        print "eval ",self.formula
        return eval(self.formula, {},
    def __init__(self, ss, *args, **kwargs):
        cells.Model.__init__(self, *args, **kwargs)

class ssDict:
        def __init__(self):

        def __getitem__(self,key):

        def __setitem__(self,key,v):
                if not

if __name__ == "__main__":
        ss['a1'] ='5'

        print "a1: ", ss['a1']
        print "a2: ", ss['a2']

        ss['a1'] = '7'
        print "a1: ", ss['a1']
        print "a2: ", ss['a2']

And here you can see it running:

[[email protected] cells]$ python
a1:  eval  5
a2:  eval  2*a1
eval  7
eval  2*a1
a1:  7
a2:  14

See how when I set a1 to 7 I get "eval 7" and "eval 2*a1"? That's because it's propagating changes the right way. And that's why this would work as a basis for a spreadsheet.


It seems there is a bug wither in PyCells, or in my example, or something, because it breaks pretty easily, if the dependency chain is even two cells:

a1:  eval  5
a2:  eval  2*a1
a3:  eval  2*a2
eval  7
eval  2*a1
a1:  7
a2:  14
a3:  20

In this example, I am setting A3 to 2*A2, and when I update A1, A3 is not updated. Further research is needed.


Please check the reddit comments. They are very educational!

Excel formula parsers are hell

On 2004 I wrote a spreadsheet in python, which was about a 25KB download (compressed). It was pretty functional!.

The main problem that would never let that program be a real application was that it used python as a formula language instead of the traditional Lotus/Excel thing.

Yesterday I decided to look at it again, and see if there was a way around it 1. My main asset was that I know a fair bit more about parsers now than I did then, and how hard could it be?

Well, it turns out that language is a whole lot harder than I knew!

I found this page about a parser

Here's an example from it:

(R[14]C[11] *IF(OR(AND(R[39]C[11]>=55,R[40]C[11]>=20),AND(R[40]C[11]>=20,
R11C3="YES")), R[45]C[11],R[43]C[11])),0))

Yes, that's a valid Excel formula. No, I have no idea what it does.

And here's some extra nuggets about the language (quoted from the same page):

  • For some reason Excel uses a comma as the range union operator. This means that its only unambiguous use between ranges is outside of a function call or within a subexpression (i.e., between parentheses).

  • For some other reason Excel uses a space (or multiple spaces) as the range intersection operator. While not as ambiguous as the comma, it does require some consideration.

And it goes on, and on, and on....

But hey, he wrote a nice parser!

Just for kicks, here's the best BNF I could find.

I really don't envy the KSpread authors if they try to be compatible to this garbage!

But why is this language so gnarly?

  • Because it has been growing organically for 30 years?

  • Because of backwards compatibility?

  • Because it was never designed?

  • All of the above?

I mean, it can't be intentional! Noone actually meant it to be like this, right?


Yes, I routinely look at things I wrote and haven't touched in 2.4 years.

Why I use Arch Linux

I have been an Arch Linux for a while now, and I am still liking it.

Here's the good side of it:

  • It's small (one CD)

  • It's simple (it comes with very little)

  • It has a decent package selection (if you consider AUR, more about that later)

  • It uses pretty much unpatched upstream software

  • It's a binary distro (except for AUR. Again, more about it later)

  • It's pretty stable (no crashes I can remember)

  • It has rolling releases (unlike, say, Fedora or Debian)

  • It's easy to keep updated (like all of them nowadays)

  • It's not ideologically dogmatic, but pragmatic (yes, there are NVidia drivers, and test-drive games, and whatever)

  • It doesn't seem to be a one-guy joint

And the bad side:

  • Updates sometimes break things (about twice a year)

  • Admin tools are between unexistant and disjointed

And of course, there is the very very good side: AUR

AUR is a comunity repository. And there is a rather large community. And packaging things for Arch is so easy, and putting things in AUR is so simple, even I find time to contribute (my packages).

And it's a calm community, and pretty much, instead of compiling my random unknown packages for myself, I save the steps to build them and stick them in a PKGBUILD and upload them. Takes two minutes for most things.

It's a throwback to the old days of Linux: quiet, competent (or learning) people doing things, sharing, you use them, you give back... I had not felt that way with a distro for years.

Small software released: RA-WebPass

I just released a wee piece of software, called RA-WebPass which is simply a webpage that you can use to change linux system passwords.

Basically, I wrote it today out of my frustration with customers asking me "how can I change the password for the FTP server". Ok, this is how you can.

It's simple and doesn't have too many dependencies, so it should be rather secure, but don't trust me on that, and you need to run it as root, so be very very careful.

RA-WebPass home page

On a related note: it's quite satisfying to write something in two hours and just release it :-)