This can't be good
Working on my SPF library, I ran into a problem. I needed to validate a specific element, and the Python code is a little hairy (it splits based on a large regexp, and it's tricky to convert to C).
So, I asked, and was told, maybe you should start from the RFC's grammar.
OK. I am not much into grammars and parsers, but what the heck. So I check it. It's an ABNF grammar.
So, I look for the obvious thing: an ABNF parser generator.
There are very few of those, and none of them seems very solid, which is scary, because almost all the RFCs define everything in terms of ABNF (except for some that do worse, and define things in prose. Did you know there is no formal, verifiable definition of what an IPv6 address looks like?).
So, after hours of googling...
Does anyone know a good ABNF parser generator? I am trying abnf2c, but it's not strict enough (I am getting a parser that doesn't work).
Does anyone know why those very important documents that rule how most of us make a living/work/have fun are so ... hazy?
Uhm... perhaps first, you really want something that will tell you what class of grammar you're dealing with?
As far as I know, all the 'compiler-compiler' tools tend to be specific to a particular class of grammar: they'll either generate lexical analysers, LL parsers or LR parsers, or whatever...
I don't think I've ever seen a (useful) tool that will do more than one of those things.
Are you just matching a regex, or is there something trickier going on?
There's an element called a domain-spec.
It's defined in an ABNF grammar.
The Python version of the code validates it by splitting it using a regexp.
If you email me, I can show you the code (and the ABNF grammar :-)
So, it looks like it's just a regexp.
I think you should be able to persuade flex to match the pattern, and use its 'rules' to track which parts of the string correspond to which parts of the grammar.
Perhaps define things like toplabel, delimiter and macro-literal in the definitions section of your flex input, and put domain-spec in the rules section.
It's been about 4 years since I looked at this stuff though - and even then, I was only doing a toy example or two - so I'm a bit hazy on it all too...
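The "name the small pieces, then compose them" idea suggested above for flex can also be sketched directly in Python. This is only a rough illustration: the fragment names follow the ones mentioned in the comment (toplabel, delimiter, macro-literal, domain-spec), but the patterns themselves are an approximation of the SPF macro grammar written from memory and should be checked against the RFC before being trusted. The helper name `is_valid_domain_spec` is made up for this example.

```python
import re

# Named fragments, playing the role of a flex "definitions" section.
# ASSUMPTION: these approximate the SPF grammar; verify against the RFC.
ALPHANUM = r"[A-Za-z0-9]"
DELIMITER = r"[.\-+,/_=]"
MACRO_LETTER = r"[slodiphcrtv]"
MACRO_LITERAL = r"[\x21-\x24\x26-\x7e]"  # visible ASCII except "%"

# "%{" letter, optional digits, optional "r", delimiters, "}" or an escape.
MACRO_EXPAND = (
    r"(?:%\{" + MACRO_LETTER + r"[0-9]*r?" + DELIMITER + r"*\}|%%|%_|%-)"
)

# A top label needs at least one letter, or an inner hyphen.
TOPLABEL = (
    r"(?:" + ALPHANUM + r"*[A-Za-z]" + ALPHANUM + r"*"
    + r"|" + ALPHANUM + r"+-[A-Za-z0-9-]*" + ALPHANUM + r")"
)

MACRO_STRING = r"(?:" + MACRO_EXPAND + r"|" + MACRO_LITERAL + r")*"
DOMAIN_END = r"(?:\." + TOPLABEL + r"\.?|" + MACRO_EXPAND + r")"

# The "rules section": the one pattern we actually match against.
# ABNF quoted strings are case-insensitive, hence IGNORECASE.
DOMAIN_SPEC = re.compile(MACRO_STRING + DOMAIN_END, re.IGNORECASE)


def is_valid_domain_spec(text: str) -> bool:
    """Return True if the whole string matches the approximate grammar."""
    return DOMAIN_SPEC.fullmatch(text) is not None
```

Something like `is_valid_domain_spec("_spf.%{d2}.example.com")` should then accept the string, while an all-numeric last label such as `example.123` or an unknown macro letter such as `%{x}` should be rejected. This only validates; it does not expand macros, and it leans on the regex engine's backtracking to find where the macro-string part ends and the domain-end part begins.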
This will help you see why the RFCs are so fuzzy.
http://en.wikipedia.org/wik...
They may well be the first example of a form of Wiki - anyone could comment. Out of the comments came (usually) consensus.
It worked because most of us are going to go with the proposals with merit, and the other ideas sink.
And they are not any kind of mandatory. You can ignore them if you wish (usually a VERY bad idea).
Don.
Martin: I think I found a way to create the parser (just a validating parser, since I already do everything else).
The grammar for this specific element is not terribly complex, so I think I can manage.
Don: Well, the problem I mention is that the proposals are defined using either prose or a grammar for which no good parser seems to exist (hell, the ABNF grammar's grammar is broken, too!)
It would be no harder to propose using, you know, things that are verifiable, so people can know whether they are complying with the proposal or not :-)