Posts about programming (old posts, page 8)

2007-03-27 12:49

Son of Bartleblog IV

Another morning, another feature: archive

bartleblog3

Now I'm working on the image tool, importing PyDS's images and uploading to flickr, etc.

2007-03-26 18:28

Son of Bartleblog III

A couple more hours of hacking, and the templates are all new, and more functional than ever.

bartleblog2

I am making heavy use of Yahoo's UI library, which makes lots of things much simpler:

  • Layout using Yahoo Grids

    I spent hours making the layout you see now, and the one with Grids works better and was done in minutes. Not reinventing the wheel works for webpages, too.

  • Calendar using Yahoo Calendar

    Isn't it neat? And it works, too. Since the linking is handled by javascript I may make it so it loads the posts for a month without reloading the page.

  • Styling using their reset.css stylesheet.

    That stylesheet removes all styling from your page. That way, if there's any styling there, it's because you put it there.

    I used that, added a slightly simplified stylesheet based on Firefox's default, Restructured Text's and SilverCity's, and all the customizing I needed to do to achieve a simple but functional layout was 30 lines of CSS, compared to the rather monstrous pyds.css my blog currently uses.

  • Modular thingies.

    I turned all Technorati/HaloScan/FeedBurner/Talkr thingies into macros that take as configuration your personal data (for example, HaloScan ID) and if necessary a post.

Once the styling is a little more done and a few bugs are ironed out, I may even start uploading the site using bartleblog instead of PyDS soon :-)

2007-03-25 12:53

Son of Bartleblog II

After a few more hours hacking, it's got the following working:

  • CherryTemplate templates that do about the same as the Cheetah templates in PyDS
  • Generates the whole site and it looks just the same
  • Advogato import (my blog should go all the way back to 2000 when I switch!)
  • PyDS import

The main missing things are:

  • Do a decent templating system (right now the templates are embedded in the code)
  • Do a decent config system (right now, global variables)
  • Do uploading (or just trust lftp)
  • Do post/story creation
  • Port the RSS template
  • Flickr integration
  • Integration with all those neat little gadgets: feedburner flares, HaloScan comments which are currently kinda grafted (only work for my account ;-)
  • Look into Yahoo UI toolkit for things like the calendar and menus.
  • Add the extra stuff to Restructured Text so it:
    • Automatically fixes links to posts/stories in the blog
    • Pretty-prints code using SilverCity
  • Lots of UI stuff

All in all, not really a huge amount of work, but I am taking it easy.

When KDE4 is out, a version with a full-fledged KHTML in it will be a whole lot nicer.

2007-03-24 09:16

A little project, son of BartleBlog

I have been posting this blog using PyDS for over 4 years now. Sadly, the PyDS author seems to have abandoned it. Which is sad, because it's nifty software.

However, keeping it working is getting harder every year, and I expect that soon I won't be able to do it at all.

Also, the data is in a Metakit database, which is the most annoying DB ever (no real schema! columnar instead of record oriented! gouge my eyes with a breadstick!)

So, since I have all the data, and my blogging needs are modest, and no tool does exactly what I want, I decided to write my own.

I could make it a web app, maybe using TurboGears, but what the heck, I haven't done a decent GUI app in ... ok, arguably, I never have done a decent one, and my PyQt4 needs some work, and I am kinda in a groove for actually finishing things lately (I am rather proud of RaSPF).

And I have a neat name (BartleBlog) reserved from another aborted app.

So, here's the mandatory screenshot after a couple hours hacking:

bartleblog

And here are the goals:

  • Generate static pages, so it can be used by anyone with a little web space (I am a gipsy)
  • Simple templating (Using cherrytemplate right now, but should be modular)
  • Restructured Text as input mechanism (again, modular)
  • Good support for code snippets
  • Should support static pages (like the ones I have in the Stories link)
  • Integrate with Flickr for images
  • Integrate "chunks" in the templating, where you can do things like setting the right Haloscan comment/trackback links easily
  • Simple category mechanism, with a regexp-based autotagger without creating per-category copies of everything.
  • RSS feed generation, global and per-category.
  • A way to import all my PyDS blog (and maybe my older advogato things)
  • Use sqlite and SQLObject for sane storage.
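The regexp-based autotagger in that list could be as small as a table of category names and patterns. A minimal sketch follows, assuming made-up rule names and patterns (this is not BartleBlog's actual code). Each post simply gets a list of category names, so nothing has to be copied per category:

```python
import re

# Hypothetical rules: category name -> pattern searched in title and text.
RULES = {
    "python": re.compile(r"\b(python|pyqt\d*|pyspf|pyds)\b", re.IGNORECASE),
    "kde": re.compile(r"\b(kde\d*|khtml)\b", re.IGNORECASE),
    "spf": re.compile(r"\b(raspf|pyspf|spf)\b", re.IGNORECASE),
}


def autotag(title, text, rules=RULES):
    """Return the sorted list of categories whose pattern matches the post."""
    haystack = title + "\n" + text
    return sorted(name for name, rx in rules.items() if rx.search(haystack))
```

With these rules, `autotag("C is not Python", "I am porting pyspf to C")` returns `['python', 'spf']`, and a post that matches no rule gets an empty list.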

So far, it's doing some things: I can import, edit, and save (changes are applied instantly; there is no "save" here).

I can't yet generate the site, or create a new post, and it should take months to make it useful, but let's see how it goes.

2007-03-13 11:04

RaSPF on its way to release

I have been able to work some more on RaSPF and the results are encouraging.

Thanks to valgrind and test suites, I am pretty confident it doesn't leak memory, or at least that it doesn't leak except in very rare cases.

I think I found a neat way to simplify memory management, though, and that's what I wanted to mention.

This is probably trivial for everyone reading, but I am a limited C programmer, so whenever something works unexpectedly right, I am happy ;-)

One problem with C memory management is that if you have many exit points for your functions, releasing everything you allocate is rather annoying, since you may have to do it in several different locations.

I compounded this problem because I am using exceptions (yeah, C doesn't have them. I used this).

Now I have not only my own returns, but also my own throws, plus any uncaught throw from whatever I called!

Hell, right?

Nope: what exceptions complicated, exceptions fixed. Look at this function:

bstring spf_query_get_explanation(spf_query *q, bstring spec)
{
    bstring txt=0;
    struct bstrList *l=0;
    bstring expanded=0;
    bstring result=0;
    struct tagbstring s=bsStatic("");

    try
    {
        // Expand an explanation
        if (spec && spec->slen)
        {
            expanded=spf_query_expand(q,spec,1);
            l=spf_query_dns_txt(q,expanded);

            if (l)
            {
                txt=bjoin(l,&s);
            }
            else
            {
                txt=bfromcstr("");
            }
            result=spf_query_expand(q,txt,0);
            throw(EXC_OK,0);
        }
        else
        {
            result=bfromcstr("explanation: Required option is missing");
            throw(EXC_OK,0);
        }
    }
    except
    {
        if(expanded) bdestroy(expanded);
        if(txt) bdestroy(txt);
        if(l) bstrListDestroy(l);
        on (EXC_OK)
        {
            return result;
        }
        if(result) bdestroy(result);
        throw(EXCEPTION.type,EXCEPTION.param1);
    }
}

It doesn't matter if spf_query_expand or spf_query_dns_txt throws an exception: this will not leak.

Nice, I think :-)
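For comparison, here is the same shape as a Python sketch: one cleanup block that runs on every exit, with the result handed back only on success. `Buffer`, `get_explanation` and the `lookup` callback are hypothetical names standing in for manually managed memory and the DNS query. It is an analogy, not RaSPF code.

```python
class Buffer:
    """Stand-in for a manually managed allocation (hypothetical)."""
    live = 0  # number of buffers created and not yet destroyed

    def __init__(self, value):
        self.value = value
        Buffer.live += 1

    def destroy(self):
        Buffer.live -= 1


def get_explanation(spec, lookup):
    """Build an explanation string; lookup(str) may raise."""
    expanded = txt = result = None
    try:
        expanded = Buffer(spec.strip())
        txt = Buffer(lookup(expanded.value))  # may raise
        result = Buffer("explanation: " + txt.value)
        return result
    except Exception:
        # Failure path: a half-built result must be released too.
        if result is not None:
            result.destroy()
        raise
    finally:
        # Runs on success and on failure: release every temporary.
        for tmp in (expanded, txt):
            if tmp is not None:
                tmp.destroy()
```

After a successful call only the returned buffer is still alive. If `lookup` raises, nothing is left alive and the exception still reaches the caller.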

2007-03-06 14:23

C is not Python II.

RaSPF, my C port of PySPF, is pretty much functional right now.

Here's what I mean:

  • It passes 75 internal unit tests (ok, 74, but that one is arguable).
  • It passes 137 of 145 tests of the SPF official test suite.
  • It agrees with PySPF in 181 of the 183 cases of the libspf2 live DNS suite.
  • It segfaults in none of the 326 test cases.

So, while there are still some corner cases to debug, it's looking very good.

I even spent some time with valgrind to plug some leaks (the internal test suite runs almost leakless; the real app is a sieve ;-)

All in all, if I can spend a little while with it during the week, I should be able to make a release that actually works.

Then, I can rewrite my SPF plugin for qmail, which was what sent me on this month-long tangent.

As a language wars comparison:

  • The sloccount of raspf is 2557 (or 2272 if we use the ragel grammar source instead of the generated file)
  • The sloccount of PySPF is 993.

So, a 2.6:1 or 2.29:1 code ratio.

However, I used 4 non-standard C libraries: bstrlib, udns, and helpers for hashes and exceptions, which add another 5794 LOCs.

So, it could be argued to be an 8:1 ratio, too, but my C code is probably verbose in the extreme, and many C lines are not really "logic" but declarations and such.
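The arithmetic behind those ratios, as a short sketch that uses only the sloccount figures quoted above (the constant and function names are mine):

```python
RASPF_GENERATED = 2557  # raspf, counting the ragel-generated file
RASPF_GRAMMAR = 2272    # raspf, counting the ragel grammar source instead
PYSPF = 993
LIBRARIES = 5794        # bstrlib, udns, hash and exception helpers


def ratio(c_lines, python_lines=PYSPF):
    """Lines of C per line of Python, rounded to two decimals."""
    return round(c_lines / python_lines, 2)


own_code = (ratio(RASPF_GENERATED), ratio(RASPF_GRAMMAR))
with_libraries = (ratio(RASPF_GENERATED + LIBRARIES),
                  ratio(RASPF_GRAMMAR + LIBRARIES))
print(own_code, with_libraries)
```

The first pair comes out at roughly 2.6 and 2.3; counting the libraries pushes both figures a little above 8.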

Also, I did not write PySPF, so its code may be more concise than what I would have written, but I tried my best to copy the flow as closely as possible, line by line.

In short, you need to write, according to this case, between 2 and 8 times more code than you do in Python.

That's a bit much!

2007-03-04 21:10

The middle path

In my previous post, I mentioned how PySPF does something using a regular expression which I couldn't easily reproduce in C.

So, I started looking at parser generators to use the original SPF RFC's grammar.

But that had its own problems.... and then came ragel.

Ragel is a finite state machine compiler, and you can use it to generate simple parsers and validators.

The syntax is very simple, the results are powerful, and here's the main chunk of code that lets you parse an SPF domain-spec (it works, too!):

machine domain_spec;
name = ( alpha ( alpha | digit | '-' | '_' | '.' )* );
macro_letter = 's' | 'l' | 'o' | 'd' | 'i' | 'p' | 'h' | 'c' | 'r' | 't';
transformers = digit* 'r'?;
delimiter = '.' | '-' | '+' | ',' | '/' | '_' | '=';
macro_expand = ( '%{' macro_letter transformers delimiter* '}' ) |
               '%%' | '%_' | '%-';
toplabel = ( alnum* alpha alnum* ) |
           ( alnum{1,} '-' ( alnum | '-' )* alnum );
domain_end = ( '.' toplabel '.'? ) | macro_expand;
macro_literal = 0x21 .. 0x24 | 0x26 .. 0x7E;
macro_string = ( macro_expand | macro_literal )*;
domain_spec := macro_string domain_end 0 @{ res = 1; };

And in fact, it's simpler than the ABNF grammar used in the RFC:

name             = ALPHA *( ALPHA / DIGIT / "-" / "_" / "." )
macro-letter     = "s" / "l" / "o" / "d" / "i" / "p" / "h" /
                   "c" / "r" / "t"
transformers     = *DIGIT [ "r" ]
delimiter        = "." / "-" / "+" / "," / "/" / "_" / "="
macro-expand     = ( "%{" macro-letter transformers *delimiter "}" )
                   / "%%" / "%_" / "%-"
toplabel         = ( *alphanum ALPHA *alphanum ) /
                   ( 1*alphanum "-" *( alphanum / "-" ) alphanum )
domain-end       = ( "." toplabel [ "." ] ) / macro-expand
macro-literal    = %x21-24 / %x26-7E
macro-string     = *( macro-expand / macro-literal )
domain-spec      = macro-string domain-end
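Since this grammar has no recursion, the ABNF can also be transliterated rule by rule into a regular expression. The sketch below is my own translation, not the expression PySPF actually uses, and all the names in it are made up for illustration. It relies on Python's backtracking to find where macro-string ends and domain-end begins.

```python
import re

# Building blocks, named after the ABNF rules they transliterate.
# ABNF string literals are case-insensitive, so macro letters and the
# "r" transformer are accepted in either case.
MACRO_EXPAND = (r"(?:%\{[slodiphcrtSLODIPHCRT][0-9]*[rR]?[.\-+,/_=]*\}"
                r"|%%|%_|%-)")
MACRO_LITERAL = r"[\x21-\x24\x26-\x7e]"
ALNUM = r"[A-Za-z0-9]"
TOPLABEL = (r"(?:" + ALNUM + r"*[A-Za-z]" + ALNUM + r"*"
            r"|" + ALNUM + r"+-[A-Za-z0-9-]*" + ALNUM + r")")
DOMAIN_END = r"(?:\." + TOPLABEL + r"\.?|" + MACRO_EXPAND + r")"
DOMAIN_SPEC = re.compile(
    r"(?:" + MACRO_EXPAND + r"|" + MACRO_LITERAL + r")*" + DOMAIN_END)


def is_domain_spec(text):
    """True if the whole string matches the domain-spec rule."""
    return DOMAIN_SPEC.fullmatch(text) is not None
```

For example, `is_domain_spec("_spf.%{d2r}.example.org")` is true, while `is_domain_spec("example.123")` is false because an all-digit toplabel is not allowed.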

So, thumbs up for ragel!

Update:

  • The code looks very bad on python or aggregators.
  • This piece of code alone fixed 20 test cases from the SPF suite, and now only 8 fail. Neat!

2007-03-04 14:21

This can't be good

Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the Python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).

So I asked, and was told: maybe you should start from the RFC's grammar.

Ok. I am not much into grammars and parsers, but what the heck. So I check it. It's an ABNF grammar.

So, I look for the obvious thing: an ABNF parser generator.

There are very few of those, and none of them seems very solid, which is scary, because almost all the RFCs define everything in terms of ABNF (except for some that do worse, and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).

So, after hours of googling...

Does anyone know a good ABNF parser generator? I am trying abnf2c, but it's not strict enough (I am getting a parser that doesn't work).

Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so ... hazy?

2007-03-01 13:46

My SPF library kinda works

RaSPF, my attempted port of PySPF to C, is now at a very special point in its life:

The provided CLI application can check SPF records and tell you what you should do with them!

Here's an example:

[build]$ ./raspfquery --ip=192.0.2.1 --sender=03.spf1-test.mailzone.com --helo=03.spf1-test.mailzone.com
Checking SPF with:

sender: 03.spf1-test.mailzone.com
helo:   03.spf1-test.mailzone.com
ip:     192.0.2.1


response:       softfail
code:           250
explanation:    domain owner discourages use of this host

Is that correct? Apparently yes!

[pyspf-2.0.2]$ python spf.py 192.0.2.1 03.spf1-test.mailzone.com 03.spf1-test.mailzone.com
('softfail', 250, 'domain owner discourages use of this host')

Is it useful? Surely you jest!

There are still the following problems:

  • The memory management is nonexistent
  • I need to hack a way to run the official SPF test suite so I can see how well it works and whether it behaves exactly like PySPF
  • It will probably segfault in many places
  • I am changing the error handling to be exception-based, thanks to EXCC
  • The IPv6 support is between iffy and not there
  • There is no support for SPF (type 99) DNS records, only TXT records (need to hack the udns library)

But really, this should be about 60% of the work, and it does work for some cases, which is more than I really expected at the beginning.

Here's the whole source code of the sample application (except for CLI option processing):

spf_init();
spf_response r=spf_check(ip,sender,helo,0,0);
printf ("\nresponse:\t%s\ncode:\t\t%d\nexplanation:\t%s\n",
        r.response,r.code,r.explanation);

2007-02-13 11:56

C is not Python

I am porting PySPF to C (long story, and I am stupid for trying). But of course, C is not Python.

So you don't have anything nearly as nice as re.compile("whatever").split("somestring").

What is that good for, you may ask? Well, to do things like splitting email addresses while validating them, or in this specific case, to validate SPF mechanisms (never mind what those are).
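For reference, a small sketch of what that one-liner gives you in Python. The pattern and the input are made up for illustration, not taken from PySPF. The point is that a capturing group makes `split` return the separators along with the pieces:

```python
import re

# Hypothetical pattern: split on ':' or '/', keeping the separator
# because it sits inside a capturing group.
SEPARATOR = re.compile(r"([:/])")


def split_keeping_separators(text):
    """Split text on SEPARATOR; matched separators stay in the list."""
    return SEPARATOR.split(text)
```

`split_keeping_separators("ip4:192.0.2.0/24")` returns `['ip4', ':', '192.0.2.0', '/', '24']`. Without the parentheses in the pattern, the same call would return only `['ip4', '192.0.2.0', '24']`.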

But hey, you can always do this (excuse me while I weep a little):

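#include <regex.h>
#include <string.h>
#include "bstrlib.h"

/* Split string on every match of pattern, keeping the matched text
   as separate items. Returns 0 on error; the caller frees the list
   with bstrListDestroy(). */
struct bstrList *re_split(const char *string, const char *pattern)
{
    int status;
    regex_t re;
    regmatch_t pmatch[20];

    if (regcomp(&re, pattern, REG_ICASE|REG_EXTENDED) != 0)
    {
        return(0);      /* Report error. */
    }

    /* Pieces are accumulated in tmp, separated by NUL characters */
    bstring tmp=bfromcstr("");
    const char *ptr=string;

    for (;;)
    {
        status = regexec(&re, ptr, (size_t)20, pmatch, 0);
        /* Stop on no match (or error), and on an empty match at the
           current position, which would otherwise loop forever */
        if (status!=0 || pmatch[0].rm_eo==0)
        {
            break;
        }
        bcatblk (tmp,ptr,pmatch[0].rm_so);
        bconchar (tmp,0);
        bcatblk (tmp,ptr+pmatch[0].rm_so,pmatch[0].rm_eo-pmatch[0].rm_so);
        bconchar (tmp,0);
        ptr=ptr+pmatch[0].rm_eo;

    }
    regfree(&re);
    bcatblk (tmp,ptr,strlen(ptr));
    struct bstrList *l= bsplit(tmp,0);
    bdestroy(tmp);      /* bsplit copied the pieces; tmp would leak */
    return l;
}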

And that is probably wrong for some cases (and it doesn't split the exact same way as Python, but that's what unit testing is for).

I must be missing something that makes regcomp & friends nicer to use. Right? Right?

Contents © 2000-2019 Roberto Alsina