Ralsina.Me — Roberto Alsina's website

Posts about RaSPF

Official RaSPF page

Ok, time to go a lit­tle more pub­lic with this.

Here's a page for it (click on "read more") and I will ask the open­spf guys to put it on the im­ple­men­ta­tions list (let's see how that goes).

RaSPF on its way to release

I have been able to work some more on RaSPF and the re­sults are en­cour­ag­ing.

Thanks to valgrind and test suites, I am pretty confident it doesn't leak memory, or at least that it doesn't leak except in very rare cases.

I think I found a neat way to sim­pli­fy mem­o­ry man­age­men­t, though, and that's what I want­ed to men­tion.

This is prob­a­bly triv­ial for ev­ery­one read­ing, but I am a lim­it­ed C pro­gram­mer, so when­ev­er some­thing works un­ex­pect­ed­ly right, I am hap­py ;-)

One prob­lem with C mem­o­ry man­age­ment is that if you have many ex­it points for your func­tion­s, re­leas­ing ev­ery­thing you al­lo­cate is rather an­noy­ing, since you may have to do it in sev­er­al dif­fer­ent lo­ca­tion­s.

I com­pound­ed this prob­lem be­cause I am us­ing ex­cep­tions (yeah, C does­n't have them. I used this).

Now I have to worry not only about my returns but also about my throws, and about any uncaught exception thrown by something I called!

Hel­l, right?

Nope: what ex­cep­tions com­pli­cat­ed, ex­cep­tions fixed. Look at this func­tion:

bstring spf_query_get_explanation(spf_query *q, bstring spec)
{
    bstring txt=0;
    struct bstrList *l=0;
    bstring expanded=0;
    bstring result=0;
    struct tagbstring s=bsStatic("");

    try
    {
        // Expand an explanation
        if (spec && spec->slen)
        {
            expanded=spf_query_expand(q,spec,1);
            l=spf_query_dns_txt(q,expanded);

            if (l)
            {
                txt=bjoin(l,&s);
            }
            else
            {
                txt=bfromcstr("");
            }
            result=spf_query_expand(q,txt,0);
            throw(EXC_OK,0);
        }
        else
        {
            result=bfromcstr("explanation: Required option is missing");
            throw(EXC_OK,0);
        }
    }
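    except
    {
        // Every exit path lands here, so the temporaries are released exactly once.
        if(expanded) bdestroy(expanded);
        if(txt) bdestroy(txt);
        if(l) bstrListDestroy(l);
        on (EXC_OK)
        {
            // Normal completion: ownership of result passes to the caller.
            return result;
        }
        // Real error: drop the partial result and re-throw to the caller.
        if(result) bdestroy(result);
        throw(EXCEPTION.type,EXCEPTION.param1);
    }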
}

It doesn't matter if spf_query_expand or spf_query_dns_txt throws an exception: this will not leak.
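For readers who don't have an exception macro library at hand, the same "one cleanup point for every exit" idea can be sketched in plain C with goto. This is a swapped-in technique, not RaSPF code: join_labels and dup_string are hypothetical names made up for the illustration, and plain malloc'd strings stand in for bstrings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Joins a and b with a '.' in between.
 * Returns a heap string the caller must free, or NULL on error.
 * Every exit path goes through the cleanup label, so the
 * temporaries are released in exactly one place. */
char *join_labels(const char *a, const char *b)
{
    char *left = NULL;
    char *right = NULL;
    char *result = NULL;

    assert(a != NULL && b != NULL);

    left = dup_string(a);
    if (!left)
        goto cleanup;

    right = dup_string(b);
    if (!right)
        goto cleanup;

    result = malloc(strlen(left) + strlen(right) + 2);
    if (!result)
        goto cleanup;

    strcpy(result, left);
    strcat(result, ".");
    strcat(result, right);

cleanup:
    /* free(NULL) is a no-op, so no guards are needed here. */
    free(left);
    free(right);
    return result;
}
```

Calling join_labels("example", "com") should hand back a newly allocated "example.com" that the caller frees. The trade-off against the exception version is that goto only covers exits inside this one function, while the except block also catches throws coming from callees.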

Nice, I think :-)

C is not Python II.

RaSPF, my C port of PySPF, is pret­ty much func­tion­al right now.

Here's what I mean:

  • It passes 75 internal unit tests (ok, 74, but that one is arguable).

  • It passes 137 of 145 tests of the official SPF test suite.

  • It agrees with PySPF in 181 of the 183 cas­es of the lib­spf2 live DNS suit­­e.

  • It seg­­faults in none of the 326 test cas­es.

So, while there are still some cor­ner cas­es to de­bug, it's look­ing very good.

I even spent some time with valgrind to plug some leaks (the internal test suite runs almost leakless, the real app is a sieve ;-)

All in al­l, if I can spend a lit­tle while with it dur­ing the week, I should be able to make a re­lease that ac­tu­al­ly work­s.

Then I can rewrite my SPF plugin for qmail, which was what sent me on this month-long tangent.

As a lan­guage wars com­par­ison:

  • The sloc­­count of raspf is 2557 (or 2272 if we use the ragel gram­­mar source in­­stead of the gen­er­at­ed file)

  • The sloc­­count of PySPF is 993.

So, a 2.6:1 or 2.29:1 code ratio.

How­ev­er, I used 4 non-­s­tan­dard C li­braries: bstr­lib, udns, and helpers for hash­es and ex­cep­tion­s, which add an­oth­er 5794 LOC­s.

So, it could be argued to be an 8:1 ratio, too ((2557 + 5794) / 993 is about 8.4), but my C code is probably verbose in the extreme, and many C lines are not really "logic" but declarations and such.

Also, I did not write PySPF, so its code may be more concise than what I would have written, but I tried my best to copy the flow line by line as much as possible.

In short, you need to write, ac­cord­ing to this case, be­tween 2 and 8 times more code than you do in Python.

That's a bit much!

The middle path

In my pre­vi­ous post, I men­tioned how PySPF does some­thing us­ing a reg­u­lar ex­pres­sion which I could­n't eas­i­ly re­pro­duce in C.

So, I start­ed look­ing at pars­er gen­er­a­tors to use the orig­i­nal SPF RFC's gram­mar.

But that had its own problems... and then came Ragel.

Ragel is a fi­nite state ma­chine com­pil­er, and you can use it to gen­er­ate sim­ple parsers and val­ida­tors.

The syntax is very simple, the results are powerful, and here's the main chunk of code that lets you parse an SPF domain-spec (it works, too!):

machine domain_spec;
name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
macro_letter = 's' | 'l' | 'o' | 'd' | 'i' | 'p' | 'h' | 'c' | 'r' | 't';
transformers = digit* 'r'?;
delimiter = '.' | '-' | '+' | ',' | '/' | '_' | '=';
macro_expand = ( '%{' macro_letter transformers delimiter* '}' ) |
               '%%' | '%_' | '%-';
toplabel = ( alnum* alpha alnum* ) |
           ( alnum{1,} '-' ( alnum | '-' )* alnum );
domain_end = ( '.' toplabel '.'? ) | macro_expand;
macro_literal = 0x21 .. 0x24 | 0x26 .. 0x7E;
macro_string = ( macro_expand | macro_literal )*;
domain_spec := macro_string domain_end 0 @{ res = 1; };

And in fac­t, it's sim­pler than the AB­NF gram­mar used in the RFC:

name             = ALPHA *( ALPHA / DIGIT / "-" / "_" / "." )
macro-letter     = "s" / "l" / "o" / "d" / "i" / "p" / "h" /
                   "c" / "r" / "t"
transformers     = *DIGIT [ "r" ]
delimiter        = "." / "-" / "+" / "," / "/" / "_" / "="
macro-expand     = ( "%{" macro-letter transformers *delimiter "}" )
                   / "%%" / "%_" / "%-"
toplabel         = ( *alphanum ALPHA *alphanum ) /
                   ( 1*alphanum "-" *( alphanum / "-" ) alphanum )
domain-end       = ( "." toplabel [ "." ] ) / macro-expand
macro-literal    = %x21-24 / %x26-7E
macro-string     = *( macro-expand / macro-literal )
domain-spec      = macro-string domain-end

So, thumbs up for ragel!
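To get a feel for what the generated state machine has to check, here is a hand-written C sketch of just the toplabel rule from the grammar above. It is not Ragel output and not RaSPF code; is_toplabel is a hypothetical name, and the sketch assumes the default "C" locale so that isalpha and isdigit only accept ASCII.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s matches the toplabel rule, 0 otherwise:
 *   ( alnum* alpha alnum* ) |
 *   ( alnum{1,} '-' ( alnum | '-' )* alnum )
 * First form: only letters and digits, with at least one letter.
 * Second form: letters, digits and hyphens, with at least one
 * hyphen, starting and ending with a letter or digit. */
int is_toplabel(const char *s)
{
    size_t len, i;
    int has_alpha = 0;
    int has_hyphen = 0;

    assert(s != NULL);
    len = strlen(s);
    if (len == 0)
        return 0;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (isalpha(c))
            has_alpha = 1;
        else if (c == '-')
            has_hyphen = 1;
        else if (!isdigit(c))
            return 0;   /* character outside alnum and '-' */
    }

    if (has_hyphen)
        return isalnum((unsigned char)s[0]) != 0 &&
               isalnum((unsigned char)s[len - 1]) != 0;

    return has_alpha;
}
```

With this, "com" and "xn--p1ai" are accepted while "123" and "-com" are rejected. Writing every rule of the grammar this way by hand is exactly the tedium that Ragel removes.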

Up­date:

  • The code looks very bad on python or aggregators.

  • This piece of code alone fixed 20 test cases from the SPF suite, and now only 8 fail. Neat!

This can't be good

Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the Python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).

So I asked, and was told: maybe you should start from the RFC's grammar.

Ok. I am not much into grammars and parsers, but what the heck. So I check it. It's an ABNF grammar.

So, I look for the obvious thing: an ABNF parser generator.

There are very few of those, and none of them seems very solid, which is scary, because almost all RFCs define everything in terms of ABNF (except for some that do worse and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).

So, af­ter hours of googling...

Does anyone know a good ABNF parser generator? I am trying with abnf2c but it's not strict enough (I am getting a parser that doesn't work).

Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so ... hazy?


Contents © 2000-2023 Roberto Alsina