Ralsina.Me: Roberto Alsina's website

Dear readers, a question!

rst2rst was going along just fine, but I have run into the real problem: TABLES.

To refresh your memory (if you know) or let you know anyway, RST is an ASCII markup language. And it supports tables. Like this example:

+----------+----------------------+
|          |                      |
+----------+------------+---------+
|          |            |         |
+----------+------------+         +
|                       |         |
+-----------------------+---------+

The parser gives me a tree structure:

row (entry, entry with morecols=1)
row (entry, entry, entry with morerows=1)
row (entry with morecols=1)
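To make the tree above concrete, here is a minimal sketch that models it as plain Python data and works out which grid slot each entry lands in. The dict-based model and the `place_entries` helper are assumptions for illustration only; they are not docutils' API, only the `morecols` and `morerows` names come from the parser output above.

```python
# Hypothetical plain-data model of the parsed table: each row is a list of
# entries, and each entry records only its extra column/row span.
rows = [
    [{}, {"morecols": 1}],
    [{}, {}, {"morerows": 1}],
    [{"morecols": 1}],
]


def place_entries(rows):
    """Return (row, col, rowspan, colspan) for every entry, in document order."""
    occupied = set()  # grid slots already claimed by some entry
    placed = []
    for r, row in enumerate(rows):
        c = 0
        for entry in row:
            # Skip slots taken by a cell spanning down from an earlier row.
            while (r, c) in occupied:
                c += 1
            rowspan = entry.get("morerows", 0) + 1
            colspan = entry.get("morecols", 0) + 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    occupied.add((r + dr, c + dc))
            placed.append((r, c, rowspan, colspan))
            c += colspan
    return placed


placed = place_entries(rows)
for cell in placed:
    print(cell)
```

For the example table this puts the last entry of the second row in the third column with a row span of 2, which is why the third row only needs one entry.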

Now, each cell has a minimum size set by its contents.

For example, a cell containing "Hello" has a minimum size of 5x1 (width x height).

It is a good idea to surround the contents with a blank row/column on each side, so make that 7x3. I can figure this out from the contents.
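The size rule above can be sketched in a few lines. The function name `minimum_size` and its `padding` parameter are made up for this example; it simply assumes a cell's contents are available as a plain string.

```python
def minimum_size(text, padding=1):
    """Return (width, height) of a cell holding `text`, plus blank padding."""
    lines = text.splitlines() or [""]
    width = max(len(line) for line in lines)
    height = len(lines)
    # Padding is added on both sides, hence the factor of two.
    return width + 2 * padding, height + 2 * padding


print(minimum_size("Hello", padding=0))  # (5, 1)
print(minimum_size("Hello"))             # (7, 3)
```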

And here's the trick question... does anyone know an algorithm for laying out the whole table from these minimum sizes that is not incredibly difficult? Or, if it is incredibly difficult, care to help me? ;-)
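One possible approach, offered only as a rough sketch and not as what rst2rst does: size the columns from the non-spanning cells first, then widen columns only where a spanning cell still does not fit. The `column_widths` function and its `(col, colspan, min_width)` input format are assumptions made up for this example, and the result is valid but not guaranteed to be the narrowest possible table.

```python
def column_widths(cells, ncols):
    """cells: iterable of (col, colspan, min_width); returns a width per column."""
    widths = [0] * ncols
    # Handle narrow cells first so wide spans only add what is still missing.
    for col, colspan, min_width in sorted(cells, key=lambda cell: cell[1]):
        span = range(col, col + colspan)
        # A spanning cell also covers the separators between its columns.
        available = sum(widths[c] for c in span) + (colspan - 1)
        deficit = min_width - available
        if deficit > 0:
            # Spread the missing width over the spanned columns, leftmost first.
            share, extra = divmod(deficit, colspan)
            for i, c in enumerate(span):
                widths[c] += share + (1 if i < extra else 0)
    return widths


# Made-up minimum widths for the six cells of the example table.
cells = [(0, 1, 7), (1, 2, 12), (0, 1, 7), (1, 1, 9), (2, 1, 5), (0, 2, 20)]
print(column_widths(cells, 3))  # [9, 10, 5]
```

The same function can be reused for row heights by passing `(row, rowspan, min_height)` tuples, since row separators also take one line each.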

rst2rst works (80% or so)

What is it? A program that takes a docutils document tree (parsed from an RST document or programmatically generated) and dumps back RST that is as close to reasonable as I can guess.

This lets Restructured Text be a saveable data format, which is nice.

It's not done as a docutils writer. Sorry, I couldn't make that work.

What works? Most of it.

What doesn't? A dozen directives, custom interpreted text roles, and tables.

Yes, all of those are important. But the rest seems to work ok!

Look: an 804-line RST document containing almost every feature of the language, and the only difference in the generated HTML output between the original and rst2rst's version is an invisible difference in continuation lines in line blocks.

[ralsina@monty wp]$ python rst2rst.py t1.txt > t2.txt
[ralsina@monty wp]$ /usr/bin/rst2html.py t1.txt t1.html ;  /usr/bin/rst2html.py t2.txt t2.html
[ralsina@monty wp]$ diff t1.html t2.html
468,469c468,469
< <div class="line">But I'm expecting a postal order and I can pay you back
< as soon as it comes.</div>
---
> <div class="line">But I'm expecting a postal order and I can pay you back</div>
> <div class="line">as soon as it comes.</div>
[ralsina@monty wp]$ wc -l t1.txt
804 t1.txt

You can get rst2rst.py and the test file.

Does anyone know of a real docutils test suite I could borrow?

Hacking Restructured Text

I am a great fan of Restructured Text. I write my blog using it. I write my business proposals using it, I write my documentation using it, and I think you should write almost everything you write now using it. I have even blogged many times about it.

RST is a minimal markup language. You can figure it out in a couple of hours, and then use it to produce pretty HTML pages, PDF docs, man pages, LaTeX documents, S5 slides, and other things.

Plus, the source works as a plain text version, and is very readable:

This is a title
===============

Some text in a paragraph

A subtitle
----------

* A list

* More items

  1. A numbered sublist

  2. Another item

     a) A sub-sub-list

     b) With more items


+-----------------------+-------------------------+
|   A table             | With two columns        |
+-----------------------+-------------------------+
|  And Two              |   rows                  |
+-----------------------+-------------------------+

See? Nice.

RST has another great thing that is not so well known: there is a parser for it, which turns the document into a tree of nodes representing different parts of the document.

You can manipulate this node tree, modifying the document, and then generate the output.

But there is no way, right now, to generate RST from the tree. Which means it's a one-way road.

Well, I am hacking to fix that.

Right now, I handle titles, sections, all sorts of lists, transitions, quotes, emphasis, italics, and a few other elements.

The only ones that seem difficult to implement are tables, but I still think I can do it. Although the produced RST doesn't look the same as the original, it is functionally identical.

How do I test if it works? With a test suite. If it works, the output should be invariant in this sense:

RSTsample -> rst2html produces exactly the same output as RSTsample -> rst2rst -> rst2html
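The invariant above can be written as a tiny check. This is a minimal sketch: `roundtrip_is_invariant` and the toy `render`/`regenerate` stand-ins are hypothetical names for illustration. In a real test suite `render` would call rst2html and `regenerate` would call rst2rst.

```python
def roundtrip_is_invariant(source, render, regenerate):
    """Return True if rendering the regenerated source changes nothing."""
    return render(source) == render(regenerate(source))


# Toy stand-ins so the sketch runs without docutils installed.
def toy_render(text):
    # Collapse whitespace, the way HTML output ignores source line wrapping.
    return " ".join(text.split())


def toy_regenerate(text):
    # Re-emit the text with one word per line: different source, same meaning.
    return "\n".join(text.split())


sample = "Some text\nin a   paragraph"
print(roundtrip_is_invariant(sample, toy_render, toy_regenerate))  # True
print(roundtrip_is_invariant(sample, toy_render, str.upper))       # False
```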

If anyone wants a copy, email me.

Some people say anything

Last night I saw an "investigative news" program on TV. It's called "Informe Central", and their headline story was about an abandoned factory in San Telmo (where tourists go to see typical BA and locals go to see tourists).

The thing is, that factory has been taken over by poor people who live there. It's conveniently located, and they don't pay anything.

On the other hand, it's a nest of drugs, rape, poverty and violence, but that's not the only thing these "journalists" said.

They said the residents live in inhumane conditions, with up to 2.6 persons per square meter.

They also said about 300 people live there, which would mean there are roughly 115 square meters in the factory.

The factory is actually closer to 1200 square meters, or maybe 5000. But they kept on saying those numbers.

Do you know that in order to have 2.6 persons per square meter, with each of them getting a small (double) bunk bed, you would have to put the bunk beds one next to the other with 20cm-wide spaces in between?

How the hell did they get that number? Is that a sign of their regular investigative quality? Probably.

An application idea

Yesterday I wrote that I have too many ideas. Ok, here's another one:

A word processor for writers. And when I say writers, I mean novelists, technical book writers, script writers, playwrights...

Word is not very good for a writer. OpenOffice is not good. KWord is probably worse (because of the emphasis on page layout). LyX is probably as good as it gets, and it's not exactly perfect.

A writer actually needs a simple-ish word processor with a bunch of ancillary gadgetry.

For example:

  • Statistics:

    • How many words/chars/pages a day he is writing

    • A live word/char counter

    • A live word frequency monitor (put the cursor on a word and see how often it's used)

    • Live counter of document/chapter/section/scene size.

  • Outlining

    • Real live outlining. The kind where you drag stuff around and the text follows.

    • An editable full-text outline view

  • Collaboration

    • Multiple editors

    • Version control

  • Projects

    • Multiple files per project

    • Linking files to places in the text in other files

  • Index cards

    • Associating index cards to places in the text

    • Grouping index cards (for example, per character, or per location)

    • Placing them on a timeline or a storyboard

  • Live Thesaurus / Dictionary

    • Show definitions and alternatives as the pointer crosses a word.

    • One-click replacement

  • Styling

    • Per fragment/paragraph styles

    • User defined

    • Predefined styles

There are a bazillion things he does not need, though, like detailed page layout or grammar checking.

It would be nice if it could later be easily imported (styled!) into something like Scribus so a decent page layout could be done, but it doesn't need to be in the same app at all.

The text engines in Qt4 are good enough for all this app needs graphically.

RestructuredText is good enough to provide a backend, a parser, an exporter, a reader, a transformer, whatever.

So there it is, another idea I will most likely not implement. Someone please run with it, you can probably make it a rather expensive GPL shareware on Mac ;-)


Contents © 2000-2023 Roberto Alsina