2007-03-06 14:23

C is not Python II.

RaSPF, my C port of PySPF, is pretty much functional right now.

Here's what I mean:

  • It passes 75 internal unit tests (ok, 74, but that one is arguable).
  • It passes 137 of 145 tests of the SPF official test suite.
  • It agrees with PySPF in 181 of the 183 cases of the libspf2 live DNS suite.
  • It segfaults in none of the 326 test cases.

So, while there are still some corner cases to debug, it's looking very good.

I even spent some time with valgrind to plug some leaks (the internal test suite runs almost leakless, the real app is a sieve ;-)

All in all, if I can spend a little while with it during the week, I should be able to make a release that actually works.

Then, I can rewrite my SPF plugin for qmail, which was what sent me on this month-long tangent.

As a language wars comparison:

  • The sloccount of raspf is 2557 (or 2272 if we use the ragel grammar source instead of the generated file)
  • The sloccount of PySPF is 993.

So, a 2.6:1 or 2.3:1 code ratio.

However, I used 4 non-standard C libraries: bstrlib, udns, and helpers for hashes and exceptions, which add another 5794 LOCs.

So, it could be argued as an 8:1 ratio, too, but my C code is probably verbose in the extreme, and many C lines are not really "logic" but declarations and such.

Also, I did not write PySPF, so its code may be more concise than what I would have written, but I tried my best to copy the flow line by line as much as possible.

In short, you need to write, according to this case, between 2 and 8 times more code than you do in Python.

That's a bit much!
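For the record, those ratios are plain division over the sloccount figures quoted above. A minimal sketch that reproduces them (the names `code_ratio` and `print_ratios` are made up for this example, they are not part of RaSPF):

```c
#include <stdio.h>

/* SLOC figures quoted in the post. */
enum {
    RASPF_SLOC       = 2557, /* counting the ragel-generated C file */
    RASPF_SLOC_RAGEL = 2272, /* counting the ragel grammar source instead */
    LIBS_SLOC        = 5794, /* bstrlib, udns, hash and exception helpers */
    PYSPF_SLOC       = 993
};

/* Ratio of C lines to Python lines (hypothetical helper). */
double code_ratio(int c_sloc, int py_sloc)
{
    return (double)c_sloc / (double)py_sloc;
}

/* Print the three ratios discussed in the post. */
void print_ratios(void)
{
    printf("with generated file: %.2f:1\n",
           code_ratio(RASPF_SLOC, PYSPF_SLOC));
    printf("with ragel source:   %.2f:1\n",
           code_ratio(RASPF_SLOC_RAGEL, PYSPF_SLOC));
    printf("with bundled libs:   %.2f:1\n",
           code_ratio(RASPF_SLOC + LIBS_SLOC, PYSPF_SLOC));
}
```

Calling `print_ratios()` from a `main` prints the three figures, which is where the "between 2 and 8 times" range comes from.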

2007-03-04 21:10

The middle path

In my previous post, I mentioned how PySPF does something using a regular expression which I couldn't easily reproduce in C.

So, I started looking at parser generators to use the original SPF RFC's grammar.

But that had its own problems... and then came ragel.

Ragel is a finite state machine compiler, and you can use it to generate simple parsers and validators.

The syntax is very simple, the results are powerful, and here's the main chunk of code that lets you parse an SPF domain-spec (it works, too!):

machine domain_spec;
name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
macro_letter = 's' | 'l' | 'o' | 'd' | 'i' | 'p' | 'h' | 'c' | 'r' | 't';
transformers = digit* 'r'?;
delimiter = '.' | '-' | '+' | ',' | '/' | '_' | '=';
macro_expand = ( '%{' macro_letter transformers delimiter* '}' ) |
               '%%' | '%_' | '%-';
toplabel = ( alnum* alpha alnum* ) |
           ( alnum{1,} '-' ( alnum | '-' )* alnum );
domain_end = ( '.' toplabel '.'? ) | macro_expand;
macro_literal = 0x21 .. 0x24 | 0x26 .. 0x7E;
macro_string = ( macro_expand | macro_literal )*;
domain_spec := macro_string domain_end 0 @{ res = 1; };

And in fact, it's simpler than the ABNF grammar used in the RFC:

name             = ALPHA *( ALPHA / DIGIT / "-" / "_" / "." )
macro-letter     = "s" / "l" / "o" / "d" / "i" / "p" / "h" /
                   "c" / "r" / "t"
transformers     = *DIGIT [ "r" ]
delimiter        = "." / "-" / "+" / "," / "/" / "_" / "="
macro-expand     = ( "%{" macro-letter transformers *delimiter "}" )
                   / "%%" / "%_" / "%-"
toplabel         = ( *alphanum ALPHA *alphanum ) /
                   ( 1*alphanum "-" *( alphanum / "-" ) alphanum )
domain-end       = ( "." toplabel [ "." ] ) / macro-expand
macro-literal    = %x21-24 / %x26-7E
macro-string     = *( macro-expand / macro-literal )
domain-spec      = macro-string domain-end

So, thumbs up for ragel!
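To give an idea of what a finite state machine validator boils down to, here is a hand-written sketch for the `name` rule only. This is an illustration under my own assumptions, not the code ragel generates (which is table- or goto-driven and far less readable), and the function name `is_spf_name` is made up for the example:

```c
#include <ctype.h>

/* Hand-written equivalent of the ragel rule
 *     name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
 * Returns 1 if the whole NUL-terminated string matches, 0 otherwise.
 * Assumes the default "C" locale, so isalpha/isalnum are ASCII-only. */
int is_spf_name(const char *s)
{
    enum { START, IN_NAME } state = START;

    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;

        switch (state) {
        case START:
            /* The first character must be a letter. */
            if (!isalpha(c))
                return 0;
            state = IN_NAME;
            break;
        case IN_NAME:
            /* The rest may be letters, digits, '-', '_' or '.'. */
            if (!(isalnum(c) || c == '-' || c == '_' || c == '.'))
                return 0;
            break;
        }
    }

    /* Accept only if at least one character was consumed. */
    return state == IN_NAME;
}
```

Two states are enough for this rule; the full domain-spec machine has many more, which is exactly the part it is nice to have generated.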


  • The code looks very bad on python or aggregators.
  • This piece of code alone fixed 20 test cases from the SPF suite, and now only 8 fail. Neat!

2007-03-04 14:21

This can't be good

Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).

So, I asked, and was told, maybe you should start from the RFC's grammar.

Ok. I am not much into grammars and parsers, but what the heck. So I check it. It's an ABNF grammar.

So, I look for the obvious thing: an ABNF parser generator.

There are very few of those, and none of them seems very solid, which is scary, because almost all the RFCs define everything in terms of ABNF (except for some that do worse, and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).

So, after hours of googling...

Does anyone know a good ABNF parser generator? I am trying abnf2c, but it's not strict enough (I am getting a parser that doesn't work).

Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so ... hazy?

2007-03-02 16:14

SPF test suite on RASPF

Here are the results as of right now:

  • Give the expected results: 82 tests
  • Give the wrong result: 48 tests
  • Give a correct but not preferred result (mostly because of SPF records and IPv6): 6 tests
  • Fail (crash): 9 tests

So, depending on whether you count the correct-but-not-preferred results as passes, RASPF passes between 56% (82 of 145) and 61% (88 of 145) of the tests.

Not bad so far :-)

Update: As of 20:52 ART, it's 105/0/35/5 and 72-76%. The bad news is that that was all the low hanging fruit, and now it gets much harder.

2007-03-01 13:46

My SPF library kinda works

RaSPF, my attempted port of PySPF to C, is now at a very special point in its life:

The provided CLI application can check SPF records and tell you what you should do with them!

Here's an example:

[build]$ ./raspfquery --ip= --sender=03.spf1-test.mailzone.com --helo=03.spf1-test.mailzone.com
Checking SPF with:

sender: 03.spf1-test.mailzone.com
helo:   03.spf1-test.mailzone.com

response:       softfail
code:           250
explanation:    domain owner discourages use of this host

Is that correct? Apparently yes!

[pyspf-2.0.2]$ python spf.py 03.spf1-test.mailzone.com 03.spf1-test.mailzone.com
('softfail', 250, 'domain owner discourages use of this host')

Is it useful? Surely you jest!

There are still the following problems:

  • The memory management is nonexistent
  • I need to hack a way to run the official SPF test suite so I can see how well it works and that it works exactly as PySPF
  • It probably will segfault in many places
  • I am changing the error handling to be exception-based, thanks to EXCC
  • The IPv6 support is between iffy and not there
  • There is no support for SPF (type 99) DNS records, only TXT records (need to hack the udns library)

But really, this should be about 60% of the work, and it does work for some cases, which is more than I really expected at the beginning.

Here's the whole source code of the sample application (except for CLI option processing):

spf_response r=spf_check(ip,sender,helo,0,0);
printf ("\nresponse:\t%s\ncode:\t\t%d\nexplanation:\t\t%s\n",

2007-02-22 17:07

Some kind of landmark

As of right now, my customers owe me more than I billed in the second half of last year, and more comes due each month.

I suppose that's bad because I am sucking at collecting. On the other hand, it also means I am not sucking at billing. Or maybe yes, but much less than last year.

2007-02-20 18:56

New look for this blog.

After many years, this is the first radical change of look.

It's not very nice, because my HTML skillz sux0rz, but hey, it's darker!

There are many little things wrong (like the color of visited links) but I like it.

It's also a bit simpler, and I did the banner using inkscape (I had it done much nicer using Karbon, but then I couldn't figure out how to do the gradient. Oh, well).

Update1: It looks incredibly awful on IE6. From the non-transparent PNG to the unsupported overflow:auto, I think I caught every damn thing that doesn't work :-)

Update2: Both IE and Konqueror will not use overflow: auto if the object is inside a table. In a blog that posts code and logs, and so on, that's a big problem.

So, to make this kinda work, I had to get rid of (almost) all tables.

It looks ok now, except on IE the sidebar is at the bottom right, which is probably because it calculates element widths differently.

2007-02-16 15:01

My SPF lib improving

It now can do a bunch of things like expanding macros and (in some cases) validating mechanisms.

I am making very heavy use of unit testing, because it's a pretty complex piece and each function needs to do exactly the right thing or everything else fails (it's pretty hard to figure out where it will fail ;-)

You can check the 947 LOC thing at http://code.google.com/p/raspf (the Code tab).

If you do check it, keep in mind the following:

  • It uses a few libs, and they are included in the source code for simplicity.
  • I do sometimes commit code that doesn't compile
  • I do sometimes commit code that fails tests
  • You need cmake
  • I am not giving a damn about memory management right now, so don't bother worrying about leaks: everything leaks in this code. I want to make it functional first, then I can plug the leaks one function at a time (simply by running the unit testing code with a memory checker).

Enjoy (although it's not precisely enjoyable code right now ;-)

2007-02-13 11:56

C is not Python

I am porting pyspf to C (long story, and I am stupid for trying). But of course, C is not python.

So you don't have anything nearly as nice as re.compile("whatever").split("somestring").

What is that good for, you may ask? Well, to do things like splitting email addresses while validating them, or in this specific case, to validate SPF mechanisms (never mind what those are).

But hey, you can always do this (excuse me while I weep a little):

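#include <regex.h>
#include <string.h>
#include "bstrlib.h"

/* Split string on pattern, keeping the matched separators in the list. */
struct bstrList *re_split(const char *string, const char *pattern)
{
    int status;
    regex_t re;
    regmatch_t pmatch[20];

    if (regcomp(&re, pattern, REG_ICASE|REG_EXTENDED) != 0)
        return(0);      /* Report error. */

    bstring tmp=bfromcstr("");
    char *ptr=(char *)string;

    for (;;)
    {
        status = regexec(&re, ptr, (size_t)20, pmatch, 0);
        /* Stop on no match or error, and on an empty match at ptr
           (which would otherwise loop forever). */
        if (status!=0 || pmatch[0].rm_eo==0)
            break;
        /* Text before the match, then the match itself, NUL-separated. */
        bcatblk (tmp,ptr,pmatch[0].rm_so);
        bconchar (tmp,0);
        bcatblk (tmp,ptr+pmatch[0].rm_so,pmatch[0].rm_eo-pmatch[0].rm_so);
        bconchar (tmp,0);
        ptr+=pmatch[0].rm_eo;
    }

    /* Whatever is left after the last match. */
    bcatblk (tmp,ptr,strlen(string)-(ptr-string));
    struct bstrList *l= bsplit(tmp,0);
    bdestroy (tmp);
    regfree (&re);
    return l;
}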

And that is probably wrong for some cases (and it doesn't split the exact same way as Python, but that's what unit testing is for).

I must be missing something that makes regcomp & friends nicer to use. Right? Right?

2007-02-12 23:31

Any regex wizard reading this?

If so, what is the C POSIX regex (you know, regcomp & friends) equivalent of this Python regular expression:

re.compile(r'^([a-z][a-z0-9_\-\.]*)=', re.IGNORECASE)

Because it sure isn't this:


I have been playing with it for two hours and am bored :-)
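One likely culprit is the character class: as far as I know, inside a POSIX bracket expression a backslash is an ordinary character, so `\-` and `\.` do not escape anything. A sketch of one way to write the pattern for regcomp, with the hyphen moved to the end of the class and the dot left bare (the function name `name_prefix_len` is hypothetical, made up for this example):

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the leading "name" that
 * precedes '=' in s, or 0 if s does not match the pattern (or if the
 * pattern fails to compile). */
size_t name_prefix_len(const char *s)
{
    regex_t re;
    regmatch_t m[2];   /* m[0] = whole match, m[1] = the captured name */
    size_t len = 0;

    /* '-' goes last in the class so it is literal; '.' needs no escape.
     * REG_ICASE stands in for Python's re.IGNORECASE. */
    if (regcomp(&re, "^([a-z][a-z0-9_.-]*)=",
                REG_EXTENDED | REG_ICASE) != 0)
        return 0;

    if (regexec(&re, s, 2, m, 0) == 0)
        len = (size_t)(m[1].rm_eo - m[1].rm_so);

    regfree(&re);
    return len;
}
```

With this, `name_prefix_len("redirect=example.com")` gives 8, and a string such as `"ip4:1.2.3.4"` gives 0 because there is no `name=` prefix. Compiling the regex on every call is wasteful; it is done here only to keep the sketch self-contained.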

Contents © 2000-2019 Roberto Alsina