
Ralsina.Me — Roberto Alsina's website

The middle path

In my previous post, I mentioned how PySPF does something using a regular expression which I couldn't easily reproduce in C.

So, I started looking at parser generators to use the original SPF RFC's grammar.

But that had its own problems... and then came Ragel.

Ragel is a finite state machine compiler, and you can use it to generate simple parsers and validators.
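To give a rough idea of what "finite state machine" means here, the following is a minimal hand-written sketch in Python of a machine for one small rule from the SPF grammar, transformers = *DIGIT [ "r" ]. It is not what Ragel generates (Ragel emits code from a much more compact description), and every name in it is made up for illustration.

```python
# Hand-rolled state machine for: transformers = *DIGIT [ "r" ]
# State 0: reading digits (start state). State 1: the optional "r" was seen.
# All names here are illustrative, not part of Ragel or RaSPF.

DIGITS = "0123456789"

# (current state, input class) -> next state; missing entries mean "reject".
TRANSITIONS = {
    (0, "digit"): 0,
    (0, "r"): 1,
}

# Both states accept, because the digits and the trailing "r" are optional.
ACCEPTING = {0, 1}


def classify(ch):
    """Map a character to the input class used by the transition table."""
    if ch in DIGITS:
        return "digit"
    if ch == "r":
        return "r"
    return "other"


def matches_transformers(text):
    """Return True if the whole string matches *DIGIT [ "r" ]."""
    state = 0
    for ch in text:
        state = TRANSITIONS.get((state, classify(ch)))
        if state is None:
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ("", "2", "12r", "r", "r1", "1x"):
        print(repr(sample), matches_transformers(sample))
```

The whole validator is a table plus a loop, which is roughly the kind of code a state machine compiler writes for you from the grammar.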

The syntax is very simple, the results are powerful, and here's the main chunk of code that lets you parse an SPF domain-spec (it works, too!):

machine domain_spec;
name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
macro_letter = 's' | 'l' | 'o' | 'd' | 'i' | 'p' | 'h' | 'c' | 'r' | 't';
transformers = digit* 'r'?;
delimiter = '.' | '-' | '+' | ',' | '/' | '_' | '=';
macro_expand = ( '%{' macro_letter transformers delimiter* '}' ) |
               '%%' | '%_' | '%-';
toplabel = ( alnum* alpha alnum* ) |
           ( alnum{1,} '-' ( alnum | '-' )* alnum );
domain_end = ( '.' toplabel '.'? ) | macro_expand;
macro_literal = 0x21 .. 0x24 | 0x26 .. 0x7E;
macro_string = ( macro_expand | macro_literal )*;
domain_spec := macro_string domain_end 0 @{ res = 1; };

And in fact, it's simpler than the ABNF grammar used in the RFC:

name             = ALPHA *( ALPHA / DIGIT / "-" / "_" / "." )
macro-letter     = "s" / "l" / "o" / "d" / "i" / "p" / "h" /
                   "c" / "r" / "t"
transformers     = *DIGIT [ "r" ]
delimiter        = "." / "-" / "+" / "," / "/" / "_" / "="
macro-expand     = ( "%{" macro-letter transformers *delimiter "}" )
                   / "%%" / "%_" / "%-"
toplabel         = ( *alphanum ALPHA *alphanum ) /
                   ( 1*alphanum "-" *( alphanum / "-" ) alphanum )
domain-end       = ( "." toplabel [ "." ] ) / macro-expand
macro-literal    = %x21-24 / %x26-7E
macro-string     = *( macro-expand / macro-literal )
domain-spec      = macro-string domain-end
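Since PySPF tackles this with a regular expression, here is a rough Python sketch of the same grammar written as a regex, one constant per rule. It is a direct transcription of the ABNF above, not the expression PySPF actually uses. It only accepts lowercase macro letters, and the constant and function names are illustrative.

```python
import re

# Each constant mirrors one ABNF rule; the names are illustrative.
MACRO_EXPAND = r"(?:%\{[slodiphcrt][0-9]*r?[.\-+,/_=]*\}|%%|%_|%-)"
TOPLABEL = (
    r"(?:[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*"
    r"|[A-Za-z0-9]+-[A-Za-z0-9-]*[A-Za-z0-9])"
)
DOMAIN_END = rf"(?:\.{TOPLABEL}\.?|{MACRO_EXPAND})"
MACRO_LITERAL = r"[\x21-\x24\x26-\x7e]"
MACRO_STRING = rf"(?:{MACRO_EXPAND}|{MACRO_LITERAL})*"

# domain-spec = macro-string domain-end
DOMAIN_SPEC = re.compile(MACRO_STRING + DOMAIN_END)


def is_domain_spec(text):
    """Return True if the whole string matches the domain-spec rule."""
    return DOMAIN_SPEC.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("example.com", "%{d}", "_spf.%{d}.example.com.",
                   "example.123", "example.com%", ""):
        print(repr(sample), is_domain_spec(sample))
```

The regex engine finds the split between macro-string and domain-end by backtracking, which is what makes this approach awkward to port to plain C by hand.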

So, thumbs up for Ragel!

Update:

  • The code looks very bad on python or aggregators.

  • This piece of code alone fixed 20 test cases from the SPF suite, and now only 8 fail. Neat!

This can't be good

Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the Python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).

So I asked, and was told that maybe I should start from the RFC's grammar.

OK. I am not much into grammars and parsers, but what the heck. So I check it. It's an ABNF grammar.

So, I look for the obvious thing: an ABNF parser generator.

There are very few of those, and none of them seems very solid, which is scary, because almost all the RFCs define everything in terms of ABNF (except for some that do worse, and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).

So, after hours of googling...

Does anyone know a good ABNF parser generator? I am trying abnf2c, but it's not strict enough (I am getting a parser that doesn't work).

Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so... hazy?

SPF test suite on RASPF

Here are the results as of right now:

  • Give the expected results: 82 tests

  • Give the wrong result: 48 tests

  • Give a correct but not preferred result (mostly because of SPF records and IPv6): 6 tests

  • Fail (crash): 9 tests

So, depending on how you look at it, RaSPF passes between 56% and 61% of the tests.
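The two percentages come from counting or not counting the "correct but not preferred" results as passes. A quick sketch of that arithmetic, using only the counts from the list above (the variable names are illustrative):

```python
# Test counts quoted in the list above.
counts = {
    "expected": 82,
    "wrong": 48,
    "correct_not_preferred": 6,
    "crash": 9,
}

total = sum(counts.values())

# Strict: only the expected result counts as a pass.
strict = 100 * counts["expected"] / total

# Lenient: a correct but not preferred result also counts as a pass.
lenient = 100 * (counts["expected"] + counts["correct_not_preferred"]) / total

print(f"{total} tests, strict {strict:.1f}%, lenient {lenient:.1f}%")
```

This prints 145 tests, strict 56.6%, lenient 60.7%, which is where the rough figures above come from.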

Not bad so far :-)

Update: As of 20:52 ART, it's 105/0/35/5 and 72-76%. The bad news is that that was all the low-hanging fruit, and now it gets much harder.

My SPF library kinda works

RaSPF, my attempted port of PySPF to C, is now at a very special point in its life:

The provided CLI application can check SPF records and tell you what you should do with them!

Here's an example:

[ralsina@monty build]$ ./raspfquery --ip=192.0.2.1 --sender=03.spf1-test.mailzone.com --helo=03.spf1-test.mailzone.com
Checking SPF with:

sender: 03.spf1-test.mailzone.com
helo:   03.spf1-test.mailzone.com
ip:     192.0.2.1


response:       softfail
code:           250
explanation:    domain owner discourages use of this host

Is that correct? Apparently yes!

[ralsina@monty pyspf-2.0.2]$ python spf.py 192.0.2.1 03.spf1-test.mailzone.com 03.spf1-test.mailzone.com
('softfail', 250, 'domain owner discourages use of this host')

Is it useful? Surely you jest!

There are still the following problems:

  • The memory management is nonexistent

  • I need to hack a way to run the official SPF test suite so I can see how well it works and that it works exactly like PySPF

  • It will probably segfault in many places

  • I am changing the error handling to be exception-based, thanks to EXCC

  • The IPv6 support is between iffy and not there

  • There is no support for SPF (type 99) DNS records, only TXT records (need to hack the udns library)

But really, this should be about 60% of the work, and it does work for some cases, which is more than I really expected at the beginning.

Here's the whole source code of the sample application (except for CLI option processing):

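/* Initialize the library, then run one SPF check. */
spf_init();
/* The last two arguments are passed as 0, as in the original example. */
spf_response r = spf_check(ip, sender, helo, 0, 0);
printf("\nresponse:\t%s\ncode:\t\t%d\nexplanation:\t%s\n",
       r.response, r.code, r.explanation);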
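One possible way to approach the test-suite item in the problem list is to compare raspfquery's output with the tuple PySPF prints. The sketch below only handles the text format shown in the example run. The function name is hypothetical, and actually running the two programs is left out.

```python
def parse_raspfquery(output):
    """Turn raspfquery's text output into a (response, code, explanation) tuple."""
    wanted = ("response", "code", "explanation")
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted:
            fields[key] = value.strip()
    return (fields["response"], int(fields["code"]), fields["explanation"])


# Output copied from the example run earlier in this post.
sample = """Checking SPF with:

sender: 03.spf1-test.mailzone.com
helo:   03.spf1-test.mailzone.com
ip:     192.0.2.1


response:       softfail
code:           250
explanation:    domain owner discourages use of this host
"""

# Tuple printed by PySPF for the same query.
pyspf_result = ("softfail", 250, "domain owner discourages use of this host")

print(parse_raspfquery(sample) == pyspf_result)
```

For the example in this post the comparison prints True; a real harness would loop this over every case in the test suite.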

Some kind of landmark

As of right now, my customers owe me more than I billed in the second half of last year, and more comes due each month.

I suppose that's bad because I am sucking at collecting. On the other hand, it also means I am not sucking at billing. Or maybe yes, but much less than last year.


Contents © 2000-2024 Roberto Alsina