
Ralsina.Me: Roberto Alsina's website

Posts about python (older posts, page 18)

A little project, son of BartleBlog

I have been posting this blog using PyDS for over 4 years now. Sadly, the PyDS author seems to have abandoned it. Which is sad, because it's nifty software.

However, keeping it working is getting harder every year, and I don't expect to be able to keep doing it for much longer.

Also, the data is in a Metakit database, which is the most annoying DB ever (no real schema! columnar instead of record-oriented! gouge my eyes with a breadstick!)

So, since I have all the data, and my blogging needs are modest, and no tool does exactly what I want, I decided to write my own.

I could make it a web app, maybe using TurboGears, but what the heck, I haven't done a decent GUI app in ... ok, arguably, I have never done a decent one, and my PyQt4 needs some work, and I am kinda in a groove for actually finishing things lately (I am rather proud of RaSPF).

And I have a neat name (BartleBlog) reserved from another aborted app.

So, here's the mandatory screenshot after a couple of hours of hacking:

[Screenshot: bartleblog]

And here are the goals:

  • Generate static pages, so it can be used by anyone with a little web space (I am a gipsy)

  • Simple templating (using cherrytemplate right now, but should be modular)

  • reStructuredText as input mechanism (again, modular)

  • Good support for code snippets

  • Should support static pages (like the ones I have in the Stories link)

  • Integrate with Flickr for images

  • Integrate "chunks" in the templating, where you can do things like setting the right Haloscan comment/trackback links easily

  • Simple category mechanism, with a regexp-based autotagger, without creating per-category copies of everything.

  • RSS feed generation, global and per-category.

  • A way to import all my PyDS blog (and maybe my older Advogato things)

  • Use sqlite and SQLObject for sane storage.
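To make the autotagger goal a bit more concrete, here is a minimal sketch of the idea. The rule table and the `autotag` function are hypothetical names made up for illustration; nothing here is taken from BartleBlog itself.

```python
import re

# Hypothetical rules: category name -> pattern that triggers it.
TAG_RULES = {
    "python": re.compile(r"\b(python|pyqt\d*|pyds)\b", re.IGNORECASE),
    "spf": re.compile(r"\b(spf|raspf|pyspf)\b", re.IGNORECASE),
}


def autotag(text, rules=TAG_RULES):
    """Return the sorted list of categories whose pattern matches the text."""
    return sorted(name for name, pattern in rules.items() if pattern.search(text))


if __name__ == "__main__":
    print(autotag("Porting PySPF from Python to C"))
```

Since the categories are computed from the single stored post, nothing needs to be copied per category.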

So far, it's doing some things: I can import, edit, and save (changes are applied instantly, there is no "save" here).

I can't yet generate the site, or create a new post, and it will probably take months to make it useful, but let's see how it goes.

C is not Python II.

RaSPF, my C port of PySPF, is pretty much functional right now.

Here's what I mean:

  • It passes 75 internal unit tests (ok, 74, but that one is arguable).

  • It passes 137 of 145 tests of the official SPF test suite.

  • It agrees with PySPF in 181 of the 183 cases of the libspf2 live DNS suite.

  • It segfaults in none of the 326 test cases.

So, while there are still some corner cases to debug, it's looking very good.

I even spent some time with valgrind to plug some leaks (the internal test suite runs almost leakless, the real app is a sieve ;-)

All in all, if I can spend a little while with it during the week, I should be able to make a release that actually works.

Then I can rewrite my SPF plugin for qmail, which was what sent me on this month-long tangent.

As a language wars comparison:

  • The sloccount of raspf is 2557 (or 2272 if we use the ragel grammar source instead of the generated file)

  • The sloccount of PySPF is 993.

So, a 2.6:1 or 2.29:1 code ratio.

However, I used 4 non-standard C libraries: bstrlib, udns, and helpers for hashes and exceptions, which add another 5794 LOCs.

So, it could be argued as an 8:1 ratio, too, but my C code is probably verbose in the extreme, and many C lines are not really "logic" but declarations and such.

Also, I did not write PySPF, so its author's code may be more concise than mine would be, but I tried my best to copy the flow as much as possible, line by line.

In short, according to this case, you need to write between 2 and 8 times more code in C than you do in Python.

That's a bit much!
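For anyone who wants to check the arithmetic, the ratios can be reproduced from the sloccount figures quoted in this post. This is only a sketch of the division; the constant and function names are mine.

```python
# Line counts as quoted in the post.
RASPF_SLOC = 2557               # raspf, counting the generated ragel output
RASPF_SLOC_RAGEL_SOURCE = 2272  # raspf, counting the ragel grammar source instead
PYSPF_SLOC = 993                # PySPF
LIBRARY_SLOC = 5794             # the extra non-standard C libraries


def ratio(c_lines, python_lines=PYSPF_SLOC):
    """Lines of C per line of Python, rounded to two decimals."""
    return round(c_lines / python_lines, 2)


if __name__ == "__main__":
    print(ratio(RASPF_SLOC))                 # 2.58
    print(ratio(RASPF_SLOC_RAGEL_SOURCE))    # 2.29
    print(ratio(RASPF_SLOC + LIBRARY_SLOC))  # 8.41
```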

The middle path

In my previous post, I mentioned how PySPF does something using a regular expression which I couldn't easily reproduce in C.

So, I started looking at parser generators to use the original SPF RFC's grammar.

But that had its own problems... and then came ragel.

Ragel is a finite state machine compiler, and you can use it to generate simple parsers and validators.

The syntax is very simple, the results are powerful, and here's the main chunk of code that lets you parse an SPF domain-spec (it works, too!):

machine domain_spec;
name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
macro_letter = 's' | 'l' | 'o' | 'd' | 'i' | 'p' | 'h' | 'c' | 'r' | 't';
transformers = digit* 'r'?;
delimiter = '.' | '-' | '+' | ',' | '/' | '_' | '=';
macro_expand = ( '%{' macro_letter transformers delimiter* '}' ) |
               '%%' | '%_' | '%-';
toplabel = ( alnum* alpha alnum* ) |
           ( alnum{1,} '-' ( alnum | '-' )* alnum );
domain_end = ( '.' toplabel '.'? ) | macro_expand;
macro_literal = 0x21 .. 0x24 | 0x26 .. 0x7E;
macro_string = ( macro_expand | macro_literal )*;
domain_spec := macro_string domain_end 0 @{ res = 1; };

And in fact, it's simpler than the ABNF grammar used in the RFC:

name             = ALPHA *( ALPHA / DIGIT / "-" / "_" / "." )
macro-letter     = "s" / "l" / "o" / "d" / "i" / "p" / "h" /
                   "c" / "r" / "t"
transformers     = *DIGIT [ "r" ]
delimiter        = "." / "-" / "+" / "," / "/" / "_" / "="
macro-expand     = ( "%{" macro-letter transformers *delimiter "}" )
                   / "%%" / "%_" / "%-"
toplabel         = ( *alphanum ALPHA *alphanum ) /
                   ( 1*alphanum "-" *( alphanum / "-" ) alphanum )
domain-end       = ( "." toplabel [ "." ] ) / macro-expand
macro-literal    = %x21-24 / %x26-7E
macro-string     = *( macro-expand / macro-literal )
domain-spec      = macro-string domain-end
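For comparison, the same ABNF rules can be transliterated almost one to one into a Python regular expression. This is a minimal sketch written from the grammar quoted above, not the expression PySPF actually uses, and the names (`DOMAIN_SPEC`, `is_domain_spec`) are made up for the example.

```python
import re

# Each constant mirrors one ABNF rule from the grammar above.
ALNUM = r"[A-Za-z0-9]"
MACRO_LETTER = r"[slodiphcrt]"
TRANSFORMERS = r"[0-9]*r?"
DELIMITER = r"[.\-+,/_=]"
MACRO_EXPAND = (r"(?:%\{" + MACRO_LETTER + TRANSFORMERS + DELIMITER + r"*\}"
                r"|%%|%_|%-)")
TOPLABEL = (r"(?:" + ALNUM + r"*[A-Za-z]" + ALNUM + r"*"
            r"|" + ALNUM + r"+-[A-Za-z0-9-]*" + ALNUM + r")")
DOMAIN_END = r"(?:\." + TOPLABEL + r"\.?|" + MACRO_EXPAND + r")"
MACRO_LITERAL = r"[\x21-\x24\x26-\x7e]"
MACRO_STRING = r"(?:" + MACRO_EXPAND + r"|" + MACRO_LITERAL + r")*"

DOMAIN_SPEC = re.compile(MACRO_STRING + DOMAIN_END)


def is_domain_spec(text):
    """True if the whole string matches the domain-spec rule."""
    return DOMAIN_SPEC.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("example.com", "%{d}", "example.123", "%{x}"):
        print(candidate, is_domain_spec(candidate))
```

The regex engine relies on backtracking to decide where macro-string ends and domain-end begins, which is the part a state machine compiler like ragel works out ahead of time.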

So, thumbs up for ragel!

Update:

  • The code looks very bad on python or aggregators.

  • This piece of code alone fixed 20 test cases from the SPF suite, and now only 8 fail. Neat!

This can't be good

Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).

So, I asked, and was told that maybe I should start from the RFC's grammar.

Ok. I am not much into grammars and parsers, but what the heck. So I checked it. It's an ABNF grammar.

So, I looked for the obvious thing: an ABNF parser generator.

There are very few of those, and none of them seems very solid, which is scary, because almost all RFCs define everything in terms of ABNF (except for some that do worse, and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).

So, after hours of googling...

Does anyone know a good ABNF parser generator? I am trying abnf2c but it's not strict enough (I am getting a parser that doesn't work).

Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so ... hazy?

My SPF library kinda works

RaSPF, my attempted port of PySPF to C, is now at a very special point in its life:

The provided CLI application can check SPF records and tell you what you should do with them!

Here's an example:

[ralsina@monty build]$ ./raspfquery --ip=192.0.2.1 --sender=03.spf1-test.mailzone.com --helo=03.spf1-test.mailzone.com
Checking SPF with:

sender: 03.spf1-test.mailzone.com
helo:   03.spf1-test.mailzone.com
ip:     192.0.2.1


response:       softfail
code:           250
explanation:    domain owner discourages use of this host

Is that correct? Apparently yes!

[ralsina@monty pyspf-2.0.2]$ python spf.py 192.0.2.1 03.spf1-test.mailzone.com 03.spf1-test.mailzone.com
('softfail', 250, 'domain owner discourages use of this host')
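Comparing the two outputs by eye works for one case, but the same check could be scripted. Here is a minimal sketch that parses the text printed by raspfquery and the tuple printed by spf.py, exactly as shown above, and tells you whether they agree. The function names are hypothetical, and the two sample strings are simply copied from the runs above.

```python
import ast

# Sample outputs copied from the two runs shown above.
RASPF_OUTPUT = """\
response:       softfail
code:           250
explanation:    domain owner discourages use of this host
"""
PYSPF_OUTPUT = "('softfail', 250, 'domain owner discourages use of this host')"


def parse_raspf(text):
    """Turn raspfquery's 'key: value' lines into a (response, code, explanation) tuple."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return (fields["response"], int(fields["code"]), fields["explanation"])


def parse_pyspf(text):
    """spf.py prints a plain Python tuple, so literal_eval can read it back."""
    return ast.literal_eval(text.strip())


def results_agree(raspf_text, pyspf_text):
    return parse_raspf(raspf_text) == parse_pyspf(pyspf_text)


if __name__ == "__main__":
    print(results_agree(RASPF_OUTPUT, PYSPF_OUTPUT))
```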

Is it useful? Surely you jest!

There are still the following problems:

  • The memory management is nonexistent

  • I need to hack a way to run the official SPF test suite so I can see how well it works and that it works exactly like PySPF

  • It will probably segfault in many places

  • I am changing the error handling to be exception-based, thanks to EXCC

  • The IPv6 support is between iffy and not there

  • There is no support for SPF (type 99) DNS records, only TXT records (need to hack the udns library)

But really, this should be about 60% of the work, and it does work for some cases, which is more than I really expected at the beginning.

Here's the whole source code of the sample application (except for CLI option processing):

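/* Initialize the library, then run a single SPF check. */
spf_init();
spf_response r=spf_check(ip,sender,helo,0,0);
/* Print the three fields of the result. */
printf ("\nresponse:\t%s\ncode:\t\t%d\nexplanation:\t\t%s\n",
        r.response,r.code,r.explanation);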

Contents © 2000-2023 Roberto Alsina