Ralsina.Me — Roberto Alsina's website

Posts about programming (old posts, page 62)

rst2pdf 0.14 released!

It's my pleasure to announce that I just uploaded rst2pdf 0.14 to the site at http://rst2pdf.googlecode.com.

Rst2pdf is a program and a library to convert restructured text directly into PDF using Reportlab.

It supports True Type and Type 1 font embedding, most raster and vector image formats, source code highlighting, arbitrary text frames in a page, cascading stylesheets, the full restructured text syntax and much, much more.

It also includes a sphinx extension so you can use it to generate PDFs from documents built with Sphinx.
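A minimal sketch of what enabling that extension might look like in a Sphinx project's conf.py. It assumes the extension module is rst2pdf.pdfbuilder and the option is pdf_documents, as described in the rst2pdf manual; the document name, title and author are placeholders, so check the manual for your version.

```python
# conf.py fragment (sketch): enable the rst2pdf builder in a Sphinx project.
# The module and option names are assumed from the rst2pdf manual;
# all the values below are placeholders.
extensions = ['rst2pdf.pdfbuilder']

# One tuple per PDF: (start document, output file name, title, author)
pdf_documents = [
    ('index', 'MyProject', 'My Project', 'Author Name'),
]
```

With something like that in place, running sphinx-build with the pdf builder (sphinx-build -b pdf sourcedir outdir) should produce the PDF.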

In case of problems, please report them in the Issue tracker (http://code.google.com/p/rst2pdf/issues/list) or the mailing list (http://groups.google.com/group/rst2pdf-discuss)

This release fixes several bugs and adds some minor features compared to 0.13.2. Here are some of the changes:

  • Fixed Issue 197: Table borders were confusing.

  • Fixed Issue 297: styles from default.json leaked onto other syntax highlighting stylesheets.

  • Fixed Issue 295: keyword replacement in headers/footers didn't work if ###Page### and others were inside a table.

  • New feature: oddeven directive to display alternative content on odd/even pages (good for headers/footers!)

  • Switched all stylesheets to more readable RSON format.

  • Fixed Issue 294: Images were deformed when only height was specified.

  • Fixed Issue 293: Accept left/center/right as alignments in stylesheets.

  • Fixed Issue 292: separate style for line numbers in codeblocks

  • Fixed Issue 291: support class directive for codeblocks

  • Fixed Issue 104: total number of pages in header/footer works in all cases now.

  • Fixed Issue 168: linenos and linenothreshold options in Sphinx now work correctly.

  • Fixed regression in 0.12 (interaction between rst2pdf and sphinx math)

  • Documented extensions in the manual

  • Better styling of bullets/items (Issue 289)

  • Fixed Issue 290: don't fail on broken images

  • Better font finding in Windows (patch by techtonik, Issue 282).

  • Fixed Issue 166: Implemented Sphinx's hlist (horizontal lists)

  • Fixed Issue 284: Implemented production lists for sphinx

  • Fixed Issue 165: Definition lists not properly indented inside admonitions or tables.

  • SVG Images work inline when using the inkscape extension.

  • Fixed Issue 268: TOCs shifted to the left on RL 2.4

  • Fixed Issue 281: sphinx test automation was broken

  • Fixed Issue 280: wrong page templates used in sphinx

Enjoy!

If it's worth doing, it's worth doing right.

Yesterday in the PyAr mailing list a "silly" subject appeared: how would you translate Spanish to rosarino?

For those reading in English: think of rosarino as a sort of pig Latin, where the tonic (stressed) vowel X is replaced with XgasX, thus "rosario" -> "rosagasario".

In English this would be impossible, but Spanish is a pretty regular language, and a written word has enough information to know how to pronounce it, including the location of the tonic vowel, so this is possible to do.

Here is the thread.

It's looong but, final outcome: since I am a nerd and a programmer, and programmers program, I wrote it.

What surprised me is that as soon as I started doing it, this throwaway program, completely useless... I did it cleanly.

  • I used docstrings.

  • I used doctests.

  • I was careful with unicode.

  • Comments are adequate

  • Factoring into functions is correct

A year ago I wouldn't have done that. I think I am finishing a stage in my (slow, stumbling) evolution as a programmer, and am coding better than before.

Since Python lets you write fast, I had a tendency to write fast and dirty. Or slow and clean. Now I can code fast and clean, or at least cleaner.

BTW: this would be an excellent exercise for "junior" programmers!

  • It involves string manipulation which may (or may not) be handled with regexps.

  • Using tests is very quickly rewarding

  • Makes you "think unicode"

  • The algorithm itself is not complicated, but tricky.

BTW: here is the (maybe stupidly overthought) program, gaso.py:

# -*- coding: utf-8 -*-

"""
This is the gaso module.

This module provides the gasear function. For example:

>>> gasear(u'rosarino')
u'rosarigasino'
"""

import unicodedata
import re

def gas(letra):
    '''Given a letter X, return XgasX,
    except when X is an accented vowel, in which case
    the first X is returned without the accent

    >>> gas(u'a')
    u'agasa'

    >>> gas (u'\xf3')
    u'ogas\\xf3'

    '''
    return u'%sgas%s'%(unicodedata.normalize('NFKD', letra).encode('ASCII', 'ignore'), letra)

def umuda(palabra):
    '''
    If a word contains no "!":
        Replace the silent u's in the word with !

    If the word contains "!":
        Replace each "!" with u

    >>> umuda (u'queso')
    u'q!eso'

    >>> umuda (u'q!eso')
    u'queso'

    >>> umuda (u'cuis')
    u'cuis'

    '''

    if '!' in palabra:
        return palabra.replace('!', 'u')
    if re.search('([qg])u([ei])', palabra):
        return re.sub('([qg])u([ei])', u'\\1!\\2', palabra)
    return palabra

def es_diptongo(par):
    '''Given a pair of letters, tell whether it is a diphthong or not

    >>> es_diptongo(u'ui')
    True

    >>> es_diptongo(u'pa')
    False

    >>> es_diptongo(u'ae')
    False

    >>> es_diptongo(u'ai')
    True

    >>> es_diptongo(u'a')
    False

    >>> es_diptongo(u'cuis')
    False

    '''

    if len(par) != 2:
        return False

    if (par[0] in 'aeiou' and par[1] in 'iu') or \
    (par[1] in 'aeiou' and par[0] in 'iu'):
        return True
    return False

def elegir_tonica(par):
    '''Given a pair of vowels that form a diphthong, decide which of the
    two is the tonic one.

    >>> elegir_tonica(u'ai')
    0

    >>> elegir_tonica(u'ui')
    1
    '''
    if par[0] in 'aeo':
        return 0
    return 1

def gasear(palabra):
    """
    Convert a word from Spanish to rosarigasino.

    >>> gasear(u'rosarino')
    u'rosarigasino'

    >>> gasear(u'pas\xe1')
    u'pasagas\\xe1'

    Diphthongs are sometimes a problem:

    >>> gasear(u'cuis')
    u'cuigasis'

    >>> gasear(u'caigo')
    u'cagasaigo'


    Adverbs are special in Spanish but not
    in rosarino!

    >>> gasear(u'especialmente')
    u'especialmegasente'

    """
    #from pudb import set_trace; set_trace()

    # First the obvious case: accents.
    # We handle it with a regexp

    if re.search(u'[\xe1\xe9\xed\xf3\xfa]',palabra):
        return re.sub(u'([\xe1\xe9\xed\xf3\xfa])',lambda x: gas(x.group(0)),palabra,1)


    # Next problem: silent u
    # Replace gui gue qui que with g!i g!e q!i q!e
    # and undo it before returning
    palabra=umuda(palabra)

    # What now? Look at how the word ends

    vocales = list(re.finditer('[aeiou]', palabra))
    if palabra[-1] in 'nsaeiou' and len(vocales) > 1:
        # Paroxytone ("grave") word: stress on the second-to-last vowel
        # Position of the second-to-last vowel:
        pos = vocales[-2].start()
    else:
        # Oxytone ("aguda") word, or a word with a single vowel
        # (which would otherwise raise IndexError): stress on the last vowel
        # Position of the last vowel:
        pos = vocales[-1].start()

    # But what if that vowel is part of a diphthong?

    if es_diptongo(palabra[pos-1:pos+1]):
        pos += elegir_tonica(palabra[pos-1:pos+1])-1
    elif es_diptongo(palabra[pos:pos+2]):
        pos += elegir_tonica(palabra[pos:pos+2])


    return umuda(palabra[:pos]+gas(palabra[pos])+palabra[pos+1:])

if __name__ == "__main__":
    import doctest
    doctest.testmod()
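
The listing above is Python 2: it uses u'' literals, and gas() relies on mixing a byte string with unicode in a format string, which on Python 3 would render the first letter as a bytes repr. A minimal sketch of how gas() might be adapted to Python 3, offered as an assumption about a port and not as part of the original program:

```python
import unicodedata


def gas(letter):
    """Return XgasX for a letter X; the first X loses any accent."""
    # NFKD splits an accented vowel into the base letter plus a
    # combining mark; dropping the combining marks leaves the plain vowel.
    decomposed = unicodedata.normalize('NFKD', letter)
    plain = ''.join(c for c in decomposed if not unicodedata.combining(c))
    return '%sgas%s' % (plain, letter)


print(gas('a'))
print(gas('\xf3'))  # '\xf3' is the accented o
```

Filtering on unicodedata.combining() keeps everything as str, so no encode/decode round trip is needed.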

rst2pdf 0.13 released!

I've just uploaded the 0.13 version of rst2pdf, a tool to convert reStructured text to PDF using Reportlab to http://rst2pdf.googlecode.com

rst2pdf supports the full reST syntax, works as a sphinx extension, and has many extras like limited support for TeX-less math, SVG images, embedding fragments from PDF documents, True Type font embedding, and much more.

This is a major version, and has lots of improvements over 0.12.3, including but not limited to:

  • New TOC code (supports dots between title and page number)

  • New extension framework

  • New preprocessor extension

  • New vectorpdf extension

  • Support for nested stylesheets

  • New headerSeparator/footerSeparator stylesheet options

  • Foreground image support (useful for watermarks)

  • Support transparency (alpha channel) when specifying colors

  • Inkscape extension for much better SVG support

  • Ability to show total page count in header/footer

  • New RSON format for stylesheets (JSON superset)

  • Fixed Issue 267: Support :align: in figures

  • Fixed Issue 174 regression (Indented lines in line blocks)

  • Fixed Issue 276: Load stylesheets from strings

  • Fixed Issue 275: Extra space before lineblocks

  • Fixed Issue 262: Full support for Reportlab 2.4

  • Fixed Issue 264: Splitting error in some documents

  • Fixed Issue 261: Assert error with wordaxe

  • Fixed Issue 251: added support for rst2pdf extensions when using sphinx

  • Fixed Issue 256: ugly crash when using SVG images without SVG support

  • Fixed Issue 257: support aafigure when using sphinx/pdfbuilder

  • Initial support for graphviz extension in pdfbuilder

  • Fixed Issue 249: Images distorted when specifying width and height

  • Fixed Issue 252: math directive conflicted with sphinx

  • Fixed Issue 224: Tables can be left/center/right aligned in the page.

  • Fixed Issue 243: Wrong spacing for second paragraphs in bullet lists.

  • Big refactoring of the code.

  • Support for Python 2.4

  • Fully reworked test suite, continuous integration site.

  • Optionally use SWFtools for PDF images

  • Fixed Issue 231 (Smarter TTF autoembed)

  • Fixed Issue 232 (HTML tags in title metadata)

  • Fixed Issue 247 (printing stylesheet)

Finding a programmer that can program.

If you haven't read Jeff Atwood's Why Can't Programmers.. Program? go ahead, then come back.

Now, are you scared enough? Don't be, the problem there is with the hiring process.

Yes, there are lots of people who show up for programming positions and can't program. That's not unusual!

It's related to something I read by Joel Spolsky (amazingly, Jeff Atwood's partner in stackoverflow.com).

Suppose you are a company that tries to hire in the top 1% of programmers, and have an open position.

You get 100 applicants. Of those, 99 can't program. 1 can. You hire him.

Then the company next door needs to do the same thing. They may get 100 applicants. 99 can't program ... and probably 80 of them are the same ones the previous company rejected before!

So no, hiring the best 1 out of 100 is not a way to get a programmer in the top 1% at all, that's just statistical intuition getting the better of you.

You don't want to hire in the top 1% of applicants, you want to hire in the top 1% of programmers. Different universes.

These two things are the two sides of the same coin. 99% of applicants are useless, that's why they are applicants, because they can't get a job and they can't get a job because they are useless as programmers.
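
To put rough numbers on that, here is a back-of-the-envelope sketch. Every figure in it (1000 programmers, 20% of them competent, a competent one hired on the first application, everyone else applying 20 times) is a made-up assumption for illustration, not data.

```python
# Toy model of the applicant pool. All numbers are assumptions.
programmers = 1000
competent_share = 0.20          # assumed share of programmers who can program
applications_if_competent = 1   # assumed: hired on the first try
applications_if_not = 20        # assumed: keeps applying, keeps being rejected

competent = int(programmers * competent_share)
not_competent = programmers - competent

good_applications = competent * applications_if_competent
bad_applications = not_competent * applications_if_not

# Share of all job applications that come from a competent programmer
share = good_applications / (good_applications + bad_applications)

print("competent share of programmers: %.0f%%" % (competent_share * 100))
print("competent share of applications: %.1f%%" % (share * 100))
```

Under those assumptions only about 1 application in 80 comes from someone who can program, even though 1 programmer in 5 can.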

So, judging programmers by the standard of the applicants you get is like judging the quality of a restaurant by licking its dumpster.

But now, having taken care of this, how do you find a programmer that can actually program?

Easy! Find one that has programs he can show you!

I would never hire a programmer that can't show me code. There must be something wrong with him, because programmers write programs.

That's just what we do. If we didn't, what kind of programmers would we be?

Let's see some obvious objections to my argument:

  1. He wrote code for his previous employer and can't show it.

    So, he did. What else has he written? Some open source code? Maybe snippets in a blog? Answers in stackoverflow?

    Nothing? He has written nothing he was not paid to write? He is not who I want. He only programs for money, he lacks passion for programming, he doesn't enjoy it. He is probably not very good at it.

  2. He is just finishing college, he has not written much code yet!

    Why? What stopped him? He has been learning to program for years, what has he done with the knowledge he has been receiving? Saving it for his 25th birthday party? He has not practiced his craft? Not the programmer I need.

But having him show you code is not enough, of course. It also has to be good code, if you are serious about hiring excellent programmers.

So here's some bonus criteria:

  1. Check the languages he uses. If he codes COBOL for pleasure, he may or may not be what you want.

  2. Open source == bonus points: it means he is not ashamed of his code, plus it makes his credentials trivial to verify.

  3. If he leads a project with multiple contributors and does a good job he is half way to becoming a programmer/manager, so huge bonus points.

  4. Projects with long commit histories show responsibility and a level head.

  5. Development mailing lists let you gauge his personality. Is he abrasive? Is he thin-skinned? Is he annoying?

Then there's the obvious stuff: references from previous employers, interviews, exercises, and such. But those are the least important filters; the most important thing is that he must be able to code. And showing you his code is the way to do it.

Hacked on kuatia for a couple of hours...

As mentioned previously, I am hacking a bit on a proof-of-concept word processor. Right now, it's hosted on googlecode and called kuatia.

Now, it is far from being useful for anything, but... it can do nested itemized and bulleted lists.

Here's a screenie of the editor and the PDF output it produces via reStructured Text:

[Image: editando2]

Personally I think that's not too bad.


Contents © 2000-2024 Roberto Alsina