Ir al contenido principal

Ralsina.Me — El sitio web de Roberto Alsina

PyCells: The Python SpreadSheet redux

In 2004 I saw a recipe about how to make a "spread­sheet" in python in 10 lines of code:

class SpreadSheet:
    _cells = {}
    tools = {}
    def __setitem__(self, key, formula):
        self._cells[key] = formula
    def getformula(self, key):
        return self._cells[key]
    def __getitem__(self, key ):
        return eval(self._cells[key], SpreadSheet.tools, self)

It's shock­ing. And it work­s, too:

>>> from math import sin, pi
>>> SpreadSheet.tools.update(sin=sin, pi=pi, len=len)
>>> ss = SpreadSheet()
>>> ss['a1'] = '5'
>>> ss['a2'] = 'a1*6'
>>> ss['a3'] = 'a2*7'
>>> ss['a3']
210
>>> ss['b1'] = 'sin(pi/4)'
>>> ss['b1']
0.70710678118654746
>>> ss.getformula('b1')
'sin(pi/4)'

I was so awed, I wrote a PyQt ver­sion . Of course there is a catch in that code: it sucks if you are try­ing to write a spread­sheet with it.

Why? Be­cause it does­n't store re­sult­s, but on­ly for­mu­las.

For ex­am­ple:

A1=2
A2=A1*2

If you ask for the val­ue of A2, you get 4. If you set A1 to 7, what's the val­ue of A2?

Well, it's noth­ing yet, be­cause it's on­ly cal­cu­lat­ed when you ask for it. But sup­pose you are try­ing to dis­play that sheet... you need to know A2's val­ue changed when you set A1!

That's cell de­pen­den­cies, and while that sim­ple code han­dles them in a way, it to­tal­ly sucks in an­oth­er.

So, I went ahead and cod­ed around it suc­cess­ful­ly. Of course the code was not so pret­ty any­more (although a large part of the ug­ly­ness is just for mak­ing it work with Python 2.3 and rel­a­tive cell­s).

Then yes­ter­day, while look­ing at the ex­cel for­mu­la pars­er mad­ness I saw a ref­er­ence to Py­Cells, a python port of Cells from CLOS.

Here is a blog com­ment­ing on Py­cells:

It ba­si­cal­ly takes the con­cept of a cell in a spread­sheet that get up­dat­ed au­to­mat­i­cal­ly to pro­gram­ming where there are a lot of in­ter­nal da­ta states that are de­pen­dent on one an­oth­er in a chain, or a com­plex graph of de­pen­den­cies. Like, the col­or of a but­ton de­pends on whether you se­lect­ed a ra­dio but­ton or not. Or, shut down the mo­tor if the sen­sor reads above 100 de­grees (ex­am­ple giv­en in tex­t).

Al­most ev­ery­one us­es that anal­o­gy... how­ev­er, no mat­ter how hard I looked, I could­n't find any­one who had ac­tu­al­ly tried writ­ing a spread­sheet us­ing Py­Cell­s! Not even as an ex­am­ple!

So here it is:

import cells

class Cell(cells.Model):
    formula=cells.makecell(value='')
    @cells.fun2cell()
    def value(self,prev):
        print "eval ",self.formula
        return eval(self.formula, {}, self.ss)
    def __init__(self, ss, *args, **kwargs):
        self.ss=ss
        cells.Model.__init__(self, *args, **kwargs)


class ssDict:
        def __init__(self):
                self.ss={}

        def __getitem__(self,key):
                return self.ss[key].value

        def __setitem__(self,key,v):
                if not self.ss.has_key(key):
                        c=Cell(self)
                        c.formula=v
                        self.ss[key]=c
                else:
                        self.ss[key].formula=v

if __name__ == "__main__":
        ss=ssDict()
        ss['a1'] ='5'

        ss['a2']='2*a1'
        print "a1: ", ss['a1']
        print "a2: ", ss['a2']

        ss['a1'] = '7'
        print "a1: ", ss['a1']
        print "a2: ", ss['a2']

And here you can see it run­ning:

[ralsina@monty cells]$ python ctest.py
a1:  eval  5
5
a2:  eval  2*a1
10
eval  7
eval  2*a1
a1:  7
a2:  14

See how when I set a1 to 7 I get "e­val 7" and "e­val 2*a1"? That's be­cause it's prop­a­gat­ing changes the right way. And that's why this would work as a ba­sis for a spread­sheet.

UP­DATE

It seems there is a bug with­er in Py­Cell­s, or in my ex­am­ple, or some­thing, be­cause it breaks pret­ty eas­i­ly, if the de­pen­den­cy chain is even two cell­s:

a1:  eval  5
5
a2:  eval  2*a1
10
a3:  eval  2*a2
20
eval  7
eval  2*a1
a1:  7
a2:  14
a3:  20

In this ex­am­ple, I am set­ting A3 to 2*A2, and when I up­date A1, A3 is not up­dat­ed. Fur­ther re­search is need­ed.

UP­DATE 2

Please check the red­dit com­ments. They are very ed­u­ca­tion­al!

Excel formula parsers are hell

On 2004 I wrote a spread­sheet in python, which was about a 25KB down­load (com­pressed). It was pret­ty func­tion­al!.

The main prob­lem that would nev­er let that pro­gram be a re­al ap­pli­ca­tion was that it used python as a for­mu­la lan­guage in­stead of the tra­di­tion­al Lo­tus/Ex­cel thing.

Yes­ter­day I de­cid­ed to look at it again, and see if there was a way around it 1. My main as­set was that I know a fair bit more about parsers now than I did then, and how hard could it be?

Well, it turns out that lan­guage is a whole lot hard­er than I knew!

I found this page about a pars­er

Here's an ex­am­ple from it:

=IF(R[39]C[11]>65,R[25]C[42],ROUND((R[11]C[11]*IF(OR(AND(R[39]C[11]>=55,
R[40]C[11]>=20),AND(R[40]C[11]>=20,R11C3="YES")),R[44]C[11],R[43]C[11]))+
(R[14]C[11] *IF(OR(AND(R[39]C[11]>=55,R[40]C[11]>=20),AND(R[40]C[11]>=20,
R11C3="YES")), R[45]C[11],R[43]C[11])),0))

Yes, that's a valid Ex­cel for­mu­la. No, I have no idea what it does.

And here's some ex­tra nuggets about the lan­guage (quot­ed from the same page):

  • For some rea­­son Ex­­cel us­es a com­­ma as the range union op­er­a­­tor. This means that its on­­ly un­am­bigu­ous use be­tween ranges is out­­­side of a func­­tion call or with­­in a sub­­ex­pres­­sion (i.e., be­tween paren­the­s­es).

  • For some oth­­er rea­­son Ex­­cel us­es a space (or mul­ti­­ple spaces) as the range in­­ter­sec­­tion op­er­a­­tor. While not as am­bigu­ous as the com­­ma, it does re­quire some con­sid­er­a­­tion.

And it goes on, and on, and on....

But hey, he wrote a nice parser!

Just for kick­s, here's the best BNF I could find.

I re­al­ly don't en­vy the KSpread au­thors if they try to be com­pat­i­ble to this garbage!

But why is this lan­guage so gnarly?

  • Be­­cause it has been grow­ing or­­gan­i­­cal­­ly for 30 years?

  • Be­­cause of back­­wards com­­pat­i­­bil­i­­ty?

  • Be­­cause it was nev­er de­signed?

  • All of the above?

I mean, it can't be in­ten­tion­al! Noone ac­tu­al­ly meant it to be like this, right?

1

Yes, I rou­tine­ly look at things I wrote and haven't touched in 2.4 years.

Web Typography

Since I am try­ing to make my blog in­to a se­ri­ous site 1 I de­cid­ed to take a se­ri­ous look in­to web ty­pog­ra­phy. It should be use­ful if Bartle­Blog ev­er gets a sec­ond us­er (which is not pre­cise­ly com­ing soon ;-)

Spe­cial­ly since this link ap­peared in Ned Batchelder's blog.

Right now, I am in­to ba­sic things, like ver­ti­cal ry­thm and such, but you can al­ready see a dif­fer­ence:

bartleblog13.png

Now, let's go in­to more se­ri­ous ma­te­ri­al to read on the sub­jec­t.

1

as in "ex­ces­sive or im­pres­sive in qual­i­ty, quan­ti­ty, ex­ten­t, or de­gree", not "not jok­ing or tri­fling". This site is all about tri­fling.

Why I use Arch Linux

I have been an Arch Lin­ux for a while now, and I am still lik­ing it.

Here's the good side of it:

  • It's small (one CD)

  • It's sim­­ple (it comes with very lit­tle)

  • It has a de­­cent pack­­age se­lec­­tion (if you con­sid­er AUR, more about that lat­er)

  • It us­es pret­­ty much un­­patched up­­stream soft­­ware

  • It's a bi­­na­ry dis­­tro (ex­­cept for AUR. Again, more about it lat­er)

  • It's pret­­ty sta­ble (no crash­es I can re­mem­ber)

  • It has rolling re­leas­es (un­­like, say, Fe­­do­ra or De­bian)

  • It's easy to keep up­­­dat­ed (like all of them nowa­­days)

  • It's not ide­o­log­i­­cal­­ly dog­­mat­ic, but prag­­mat­ic (yes, there are NVidia driver­s, and test-­­drive games, and what­ev­er)

  • It does­n't seem to be a one-guy joint

And the bad side:

  • Up­­­dates some­­times break things (about twice a year)

  • Ad­min tools are be­tween un­ex­is­­tant and dis­­join­t­ed

And of course, there is the very very good side: AUR

AUR is a co­mu­ni­ty repos­i­to­ry. And there is a rather large com­mu­ni­ty. And pack­ag­ing things for Arch is so easy, and putting things in AUR is so sim­ple, even I find time to con­trib­ute (my pack­ages).

And it's a calm com­mu­ni­ty, and pret­ty much, in­stead of com­pil­ing my ran­dom un­known pack­ages for my­self, I save the steps to build them and stick them in a PKG­BUILD and up­load them. Takes two min­utes for most things.

It's a throw­back to the old days of Lin­ux: qui­et, com­pe­tent (or learn­ing) peo­ple do­ing things, shar­ing, you use them, you give back­... I had not felt that way with a dis­tro for years.

No, you don't understand... this site is not impopular, it's ELITE!

Ac­cord­ing to Alex­a.­com this site (//ralsi­na.me) is the 682768th most pop­u­lar site in the in­ter­net this week.

Al­so, it's the 50998th most pop­u­lar site in Ar­genti­na and most supris­ing­ly.... the 10180th most pop­u­lar site in Slove­ni­a!

In fac­t, a whole 18.5% of my read­ers come from Slove­ni­a. Which is not sur­pris­ing. It's stun­ning!

I here­by salute all my slove­ni­an read­ers and de­clare my­self a Beno Udrih fan.

All the da­ta.

Ig­nore this link be­low, please.

Technorati Profile

Contents © 2000-2020 Roberto Alsina