
Ralsina.Me — El sitio web de Roberto Alsina

Creating Reports from Python

I need some advice. I need to create an application for schools that takes student data (personal information, subjects, grades, etc.) and produces their grade report. I need to create a printed copy, and keep a historic record.

As a first step, I thought of generating them in PDF via reportlab, but I want opinions. For example, I can generate the PDF, print it and regenerate it if I need to reprint it. What other options do you see? It's basically text with tables. Reportlab? LaTeX? Some other tool?

To this I replied suggesting Restructured Text, which, if you follow my blog, should surprise no one at all ;-)

In this story I will try to bring together all the pieces to turn a chunk of Python data into a nice PDF report. Hope it's useful for someone!

Why not use reportlab directly?

Here's an example I posted in that thread: how to create a PDF with two paragraphs, using restructured text:

This is a paragraph. It has several lines, but what it says does not matter.
I can press enter anywhere, because
it ends only on a blank
line. Like this.

This is another paragraph.

And here's what you need to do in reportlab:

# -*- coding: utf-8 -*-
from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import inch

styles = getSampleStyleSheet()

def go():
    doc = SimpleDocTemplate("phello.pdf")
    Story = [Spacer(1, 2 * inch)]
    style = styles["Normal"]
    p = Paragraph('''This is a paragraph. It has several lines, but what it says does not matter.
I can press enter anywhere, because
it ends when the string ends.''', style)
    Story.append(p)
    p = Paragraph('''This is another paragraph.''', style)
    Story.append(p)
    # Lay out everything in the Story and write phello.pdf
    doc.build(Story)

go()


Of course, you could write a program that takes text separated in paragraphs as its input, creates the reportlab Paragraph elements, puts them in the Story and builds the document... but then you are reinventing the restructured text parser, only worse!

Restructured text is data. Reportlab programs are code. Data is easier to generate than code.
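
To make that concrete, here is a minimal sketch of producing the two-paragraph document from plain Python data. The names `paragraphs` and `to_rst` are made up for this illustration; the only rule it relies on is that restructured text paragraphs are separated by a blank line.

```python
# Hypothetical helper: turn a list of strings into restructured text paragraphs.
def to_rst(paragraphs):
    # Paragraphs are separated by one blank line.
    return '\n\n'.join(p.strip() for p in paragraphs) + '\n'

paragraphs = [
    'This is a paragraph. It has several lines, but what it says does not matter.',
    'This is another paragraph.',
]

rst = to_rst(paragraphs)
print(rst)
```

Generating the equivalent reportlab program would mean emitting Python source; here the output is just a string you can save, diff, or feed to any rst tool.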

So, how do you do a report?

You create a file with the data in it, and process it via one of the many rst->pdf paths (I suggest my rst2pdf script, but feel free to use the other 9 alternatives).

Suppose you have the following data:

frobtimes = [[1,3],[3,5],[9,8]]

And you want to produce this report:

Frobniz performance

* 1 frobniz: 3 seconds

* 3 frobniz: 5 seconds

* 9 frobniz: 8 seconds

You could do it this way:

# Title, underlined as restructured text requires
print('''Frobniz performance
===================
''')
for ft in frobtimes:
    print('* %d frobniz: %d seconds\n' % (ft[0], ft[1]))

And it will work. However, this means you are writing code again! This time, you are reinventing templating.

What you want is to use, say, Mako (or whatever). It's going to be better than your homebrew solution anyway. Here's the template for the report:

${title('Frobniz Performance')}

% for ft in frobtimes:
* ${ft[0]} frobniz: ${ft[1]} seconds

% endfor

This uses a function title defined thus:

title = lambda text: text + '\n' + '=' * len(text) + '\n\n'

You could generalize it to support multiple heading levels:

title = lambda text, level: text + '\n' + '=-~_#%^'[level] * len(text) + '\n\n'
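
As a quick check of what that helper produces, here is the same idea written as a regular function, a sketch only. The default `level=0`, the `UNDERLINES` name and the range check are additions for this illustration; the underline characters are the ones from the lambda.

```python
# Underline characters for successive heading levels (same as in the lambda).
UNDERLINES = '=-~_#%^'

def title(text, level=0):
    """Return text as a restructured text heading of the given level."""
    if not 0 <= level < len(UNDERLINES):
        raise ValueError('level must be between 0 and %d' % (len(UNDERLINES) - 1))
    # The underline has to be at least as long as the heading text.
    return text + '\n' + UNDERLINES[level] * len(text) + '\n\n'

print(title('Frobniz Performance'))
print(title('Details', 1))
```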

Trickier: tables

One very common feature of reports is tables. In fact, it would be more natural to present our frobniz report as a table. The bad news is what tables look like in restructured text:

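+---------+----------------+
| Frobniz | Time (seconds) |
+=========+================+
|        1|              3 |
+---------+----------------+
|        3|              5 |
+---------+----------------+
|        9|              8 |
+---------+----------------+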

Which is very pretty, but not exactly trivial to generate. But don't worry, there is a simple solution for this, too: CSV tables.

.. csv-table:: Frobniz time measurements
   :header: Frobniz,Time (seconds)

   1,3
   3,5
   9,8

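And of course, there is Python's csv module if you want to be fancy and avoid trouble with delimiters, escaping and so on:

import csv
import io

def table(title, header, data):
    # Write the header row as a single CSV line
    head = io.StringIO()
    csv_writer = csv.writer(head, dialect='excel', lineterminator='\n')
    csv_writer.writerow(header)
    head = head.getvalue().strip()

    # Write the data rows, indented so they sit inside the directive
    body = io.StringIO()
    csv_writer = csv.writer(body, dialect='excel', lineterminator='\n')
    for row in data:
        body.write('   ')
        csv_writer.writerow(row)

    return '''.. csv-table:: %s
   :header: %s

%s''' % (title, head, body.getvalue())

will produce neat, ready-to-use csv table directives for restructured text.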
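
To see why the csv module is worth the trouble, here is a self-contained sketch in the spirit of the table function above. The name `csv_table` and the `'9, "turbo"'` cell are made up for the example; the point is that a value containing a comma and quotes comes out correctly quoted without any manual escaping. It assumes no cell contains a newline.

```python
import csv
import io

def csv_table(title, header, rows):
    """Build a csv-table directive from a header row and data rows."""
    out = io.StringIO()
    writer = csv.writer(out, dialect='excel', lineterminator='\n')
    writer.writerow(header)
    writer.writerows(rows)
    lines = out.getvalue().splitlines()
    # The first CSV line becomes the :header: option, the rest is the body,
    # indented so it belongs to the directive.
    body = '\n'.join('   ' + line for line in lines[1:])
    return '.. csv-table:: %s\n   :header: %s\n\n%s\n' % (title, lines[0], body)

rst = csv_table(
    'Frobniz time measurements',
    ['Frobniz', 'Time (seconds)'],
    [[1, 3], [3, 5], ['9, "turbo"', 8]],  # last row has a tricky cell
)
print(rst)
```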

How would it work?

This Python program is really generic. All it needs to do is combine a template (an external text file) with data in the form of a bunch of Python variables.

But how do we get the data? Well, from a database, usually. But it can come from anywhere. You could be making a report about your del.icio.us bookmarks, or about files in a folder; this is really generic stuff.

What would I use to get the data? I would use JSON in the middle. I would make my report generator take the following arguments:

  1. A Mako template name.

  2. A JSON data file.

That way, the program will be completely generic.

So, put all this together, and there's the superduper magical report generator.

Once you get rst, pass it through something to create PDFs, but store only the rst, which is (almost) plain text, searchable, easy to store, and much smaller.

I don't expect such a report generator to be over 50 lines of code, including comments.
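
A rough sketch of what that generator could look like follows. The function names (`build_context`, `render_report`) and the file names in the comment are hypothetical. The Mako part assumes the `mako` package is installed and uses its `Template(filename=...).render(**context)` call; everything else is standard library.

```python
import json

def title(text, level=0):
    """Restructured text heading, same idea as the title helper in the post."""
    return text + '\n' + '=-~_#%^'[level] * len(text) + '\n\n'

def build_context(json_text):
    """Parse the JSON data and add the helpers a template may call."""
    context = json.loads(json_text)
    if not isinstance(context, dict):
        raise ValueError('top-level JSON value must be an object')
    # Make title() available inside the template as ${title(...)}.
    context['title'] = title
    return context

def render_report(template_path, json_path):
    """Render a Mako template with the JSON data and return the rst text."""
    # Imported here so the rest of the sketch loads without Mako installed.
    from mako.template import Template
    with open(json_path) as f:
        context = build_context(f.read())
    return Template(filename=template_path).render(**context)

# Hypothetical usage, assuming report.mako and data.json exist:
#     print(render_report('report.mako', 'data.json'))
```

With a data file containing `{"frobtimes": [[1, 3], [3, 5], [9, 8]]}`, the template shown earlier would see `frobtimes` as a list of lists. The resulting rst is what you store, and what you hand to rst2pdf or any other rst-to-PDF tool.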

Missing pieces

  • While restructured text is almost plain text, there are special characters, which you should escape. That is left as an exercise to the reader ;-)

  • Someone should really write this thing ;-)
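
For the escaping item, one possible starting point is to backslash-escape the characters that can open or close inline markup. This is only a sketch: the character set below is an assumption, not a complete list of what docutils treats specially, and it does nothing about constructs that depend on position in a line (bullets, directives and so on). The name `rst_escape` is made up.

```python
import re

# Backslash plus the characters used by inline markup
# (emphasis, literals, substitutions, references).
_RST_SPECIAL = re.compile(r'([\\*`|_])')

def rst_escape(text):
    """Backslash-escape inline markup characters in text."""
    return _RST_SPECIAL.sub(r'\\\1', text)

print(rst_escape('5 * 3 and some_name_'))
```

You would apply it to each data value before it reaches the template, not to the template itself.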

Giving rst2pdf some love

Because of a thread in the PyAr list about generating reports from Python, I suggested using ReST and my rst2pdf script.

This caused a few things:

  1. I decided it's a pretty decent piece of code, and it deserves a release. Making a release means I needed to fix the most embarrassing pieces of it. So...

  2. Implemented the class directive, so it can have custom paragraph styles with very little effort.

  3. Did proper command line parsing.

  4. Did a proper setuptools script.

  5. Uploaded to PyPI.

  6. Created a release in Google Code.

So, if you want the simplest way to generate PDF files from a program in the entire pythonic universe... give it a look.

Lessons learned in a month of hobby programming

A little over a month ago, on July 15th, I opened a Google Code project called uRSSus. Here's the commit. My goal was to try building a desktop application as if I were building a web application, using an ORM, templating, generic views, and other things.

The first thing I learned is that it was more fun to just write the application and see it grow than to spend time writing the framework needed to do what I wanted, so I just kept the ORM, and the rest is pretty traditional code.

The second thing I learned is that for a hobbyist programmer, this is a golden age. I am not exactly an awesome programmer myself, and with today's tools, I could almost wish my app into existence. When I started programming on a PC, I had to swap floppies to change from the IDE to the compiler [1]. And if I made a mistake, the computer crashed. No, not the program. The computer crashed.

Now? I get a pretty dialog, a link to the position, a stack dump, etc, etc, etc. Not missing the old days at all.

Another way this is a golden age is that there is a lot of code out there. I literally had to learn to code from books. I first "got" C by reading the help for a pirated copy of Autodesk Animator's POCO extension language. There were no collections of code I could look at and learn. There were not even any large libraries of code I could legally use!

And that's another reason why this is a golden age: Open Source and Free Software. You really can be a programmer just by willing it and effort. You will not lack tools, you will find users (if you are good), you will find helpers (if you are lucky), you will find free infrastructure (svn repos, free wikis, free file hosting, free everything), you will find libraries you can use!

The third thing I learned is that Python does come with batteries included. Many things that would be annoying effort in other languages are just there, ready to be used. Add the internet, and it's a Mr. Fusion instead of a battery.

The application I developed is a news aggregator, and thanks to Mark Pilgrim I had Feed Parser, and thanks to Trolltech (now Nokia) I had Qt for the UI, and many many other things. I could focus on application logic, not on parsing and drawing.

The fourth thing I learned is that a month is a long time when you have productive tools. uRSSus (that's my application) was functional (but awful) in a day or two. It was not awful in 2 weeks. It was pretty good in 3. In a month? Download it and see for yourself. I like it. The SVN version is much better most of the time, try revision 619 ;-)

The fifth thing I learned is that Python performance is good enough. I don't see much performance difference between uRSSus and, say, Akregator, which is C++, except in places that are obviously broken. Sure, the database is C, the UI toolkit is C++... they are all black boxes to me here. I code Python. My pieces do well.

The last thing I learned is that I can still code free software. I had not written a useful/usable large free software application in perhaps 8 years. I am 36.9 years old... excuse me if I feel middle-aged, surrounded by youngsters who are faster, more dedicated and actually have free time.

Because of the productivity of the tools, I managed to code just a couple of hours a day for the first weeks, and progress was still good, so I didn't get discouraged; discouragement is the worst enemy of free software.

It has been a fun experiment, hopefully it will be a fun ongoing hobby.

[1] Can you guess what I was using?

uRSSus: is that an icon in your pocket?

Yes, sometimes I think a feature is much harder than it really is.

So, since revision 571, every once in a while a new favicon will appear in your feed tree when you use uRSSus. No, they won't all appear at once. They will appear whenever:

  1. You actually fetched new posts from a feed

  2. You restart the app

Once they appear they will never change, either.

Contents © 2000-2021 Roberto Alsina