Nikola: Filters & Bundles

2012-08-10 20:06

Two upcoming features for the next release of Nikola, my static site generator, due sometime in August.

Filters

Filters let you postprocess your output. Think of it as Instagram for websites, but useful. For each file extension you can configure a series of Python functions or shell commands, which will be applied in place to the output file.

For example, suppose you want to apply yui-compressor to your CSS and JS files:

FILTERS = {
    ".css": [filters.yui_compressor],
    ".js": [filters.yui_compressor],
}

There, filters.yui_compressor is a simple wrapper around the command so that it applies in-place to the output files.

If you use strings there (untested), they are taken as commands. The "%s" will be replaced by the filename, and the usual crazy shell quoting rules apply:

FILTERS = {
    ".jpg": ["jpegoptim '%s'"],
    ".png": ["pngoptim '%s'"],
}

Keep in mind that the filters modify the output of Nikola, not the input, so your images, CSS, and JS files will not be touched in any way. And of course changing the filters applied to a file will force a rebuild, so you can experiment freely.
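
To make the dispatch rule concrete, here is a minimal sketch of how a FILTERS-style table could be applied to one output file. It is not Nikola's implementation: apply_filters and upper_case are hypothetical names used only for illustration. The only behaviour taken from the text above is that functions receive the filename, and strings are run as shell commands with "%s" replaced by it.

```python
import os
import subprocess


def apply_filters(path, filters_by_ext):
    """Apply, in order, every filter registered for the file's extension.

    Hypothetical helper, not part of Nikola.
    """
    ext = os.path.splitext(path)[1]
    for flt in filters_by_ext.get(ext, []):
        if callable(flt):
            # Python filters get the filename and rewrite the file in place.
            flt(path)
        else:
            # Strings are shell commands; "%s" becomes the filename.
            subprocess.check_call(flt % path, shell=True)


def upper_case(path):
    """Toy in-place filter, standing in for something like a compressor."""
    with open(path) as f:
        text = f.read()
    with open(path, "w") as f:
        f.write(text.upper())
```

Calling apply_filters("output/site.css", {".css": [upper_case]}) would rewrite that one file and leave files with other extensions alone.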

Bundles

Having many separate CSS or JS files is usually a no-no for performance reasons, because each one may involve a separate HTTP transaction. The solution is to "bundle" those files into a single, larger file.

The reason not to do that is that usually it means having a huge, uncomfortable thing to handle. So Nikola tries to give you the best of both worlds, by letting you have separate files, and bundling them (or not) on build.

There is a new option, USE_BUNDLES, that defaults to False, and there are some changes in the theme templates so that they use the bundled version when needed.

This was only possible thanks to Webassets. However, if you don't have Webassets installed, or you don't enable USE_BUNDLES, this should cause no changes in the output.
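
As a rough picture of what bundling means, the sketch below simply concatenates several files into one. It is an assumption-laden toy: bundle is a hypothetical function, and Webassets does far more than this (the text above only says that Nikola relies on it).

```python
def bundle(paths, output_path):
    """Concatenate the given files, in order, into a single output file.

    Hypothetical illustration only; not how Webassets or Nikola do it.
    """
    with open(output_path, "w") as out:
        for path in paths:
            with open(path) as src:
                out.write(src.read())
            # Keep a newline between files so their contents never run together.
            out.write("\n")
```

You keep editing the small files, and only the build step produces the large one. Turning the real feature on should just be a matter of setting USE_BUNDLES = True in the site configuration.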

Conclusion

These new features will allow Nikola users to improve their site's performance with minimal tweaking, which is always a good thing.

Driving a Nail With a Shoe I: Do-Sheet

2012-07-25 22:25

I had proposed a talk for PyCon Argentina called "Driving 3 Nails with a Shoe". I know, the title is silly, but the idea was showing how to do things using the wrong tool, intentionally. Why? Because:

It makes you think different
It's fun

The bad side is, of course, that this talk's contents have to be a secret, or else the fun is spoiled for everyone. Since the review process for PyConAr talks is public, there was no way to explain what this was about.

And since that means the reviewers basically have to take my word for this being a good thing to have at a conference, which is unfair, I deleted the proposal. The good (maybe) news is that now everyone will see what those ideas I had were about. And here is nail number 1: Writing a spreadsheet using doit.

This is not my first "spreadsheet". It all started a long, long time ago with a famous recipe by Raymond Hettinger which I used again and again and again (I may even be missing some post there).

Ever since I started using doit for Nikola, I have been impressed by the power it gives you. In short, doit lets you create tasks, and those tasks can depend on other tasks, and operate on data, and provide results for other tasks, etc.

See where this is going?

So, here's the code, with explanations:

cells is our spreadsheet. You can put anything there, just always use "cellname=formula" format, and the formula must be valid Python, ok?

from tokenize import generate_tokens

cells = ["A1=A3+A2", "A2=2", "A3=4"]
values = {}

task_calculate creates a task for each cell, called calculate:CELLNAME. The "action" to be performed by that task is evaluating the formula. But in order to do that successfully, we need to know what other cells have to be evaluated first!

This is implemented using doit's calculated dependencies by asking doit to run the task "get_dep:FORMULA" for this cell's formula.

def evaluate(name, formula):
    value = eval(formula, values)
    values[name] = value
    print("%s = %s" % (name, value))  # parenthesized form works on Python 2 and 3

def task_calculate():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name':name,
            'calc_dep': ['get_dep:%s' % formula],
            'actions': [(evaluate, (name, formula))],
            }

For example, in our test sheet, A1 depends on A3 and A2 but those depend on no other cells. To figure this out, I will use the tokenize module, and just remember what things are "names". More sophisticated approaches exist.

The task_get_dep function is a doit task that will create a task called "get_dep:FORMULA" for every formula in cells.

What get_dep returns is a list of doit tasks. For our A1 cell, that would be ["calculate:A2", "calculate:A3"] meaning that to calculate A1 you need to perform those tasks first.

def get_dep(formula):
    """Given a formula, return the names of the cells referenced."""
    deps = {}
    try:
        for token in generate_tokens([formula].pop):
            if token[0] == 1:  # A variable
                deps[token[1]] = None
    except IndexError:
        # It's ok
        pass
    return {
        'result_dep': ['calculate:%s' % key for key in deps.keys()]
        }

def task_get_dep():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name': formula,
            'actions': [(get_dep, (formula,))],
            }
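
The text above mentions that more sophisticated approaches than tokenize exist. One possibility, sketched here as an assumption rather than anything the original spreadsheet does, is to parse the formula with the standard ast module and collect the names it references. cell_names is a hypothetical helper.

```python
import ast


def cell_names(formula):
    """Return the sorted names referenced by a formula (a Python expression)."""
    tree = ast.parse(formula, mode="eval")
    return sorted({node.id for node in ast.walk(tree)
                   if isinstance(node, ast.Name)})


print(cell_names("A3+A2"))
print(cell_names("4"))
```

Like the tokenize version, this treats every name as a cell, so a formula that calls a function such as max would report "max" as a dependency too. Unlike it, a syntactically invalid formula fails right away with a SyntaxError.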

And that's it. Let's see it in action. You can get your own copy here and try it out by installing doit, editing cells and then running it like this:

ralsina@perdido:~/dosheet$ doit -v2 calculate:A3
.  get_dep:4
{}
.  calculate:A3
A3 = 4
ralsina@perdido:~/dosheet$ doit -v2 calculate:A2
.  get_dep:2
{}
.  calculate:A2
A2 = 2
ralsina@perdido:~/dosheet$ doit -v2 calculate:A1
.  get_dep:A3+A2
{'A3': None, 'A2': None}
.  get_dep:4
{}
.  calculate:A3
A3 = 4
.  get_dep:2
{}
.  calculate:A2
A2 = 2
.  calculate:A1
A1 = 6

As you can see, it always does the minimum amount of work needed to calculate the desired result. If you are so inclined, there are some things that could be improved, which I am leaving as an exercise for the reader. For example:

Use uptodate to avoid recalculating dependencies.
Get rid of the global values and use doit's computed values instead.
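
For comparison, the "evaluate only what is needed" behaviour can be sketched without doit, in plain Python 3. This is not the Do-Sheet code: names_in and calculate are hypothetical helpers, the values dictionary is passed in instead of being global, and circular references are not handled.

```python
import io
import tokenize

cells = ["A1=A3+A2", "A2=2", "A3=4"]


def names_in(formula):
    """Return the NAME tokens found in a formula."""
    tokens = tokenize.generate_tokens(io.StringIO(formula).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.NAME]


def calculate(name, formulas, values):
    """Evaluate one cell, first evaluating only the cells it depends on."""
    if name not in values:
        for dep in names_in(formulas[name]):
            if dep in formulas:  # ignore names that are not cells
                calculate(dep, formulas, values)
        # Already computed cells are visible to the formula as local names.
        values[name] = eval(formulas[name], {}, values)
    return values[name]


formulas = dict(cell.split("=", 1) for cell in cells)
values = {}
print(calculate("A3", formulas, values))
print(sorted(values))
```

Asking for A3 leaves A1 and A2 untouched. Asking for A1 afterwards evaluates A2 and reuses the stored A3.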

Here is the full listing, enjoy!

Your Editor is Not the Bottleneck

2012-07-15 22:22

This may cause some palpitations in some friends of mine who laugh at me for using kwrite, but it really is not. Any time you spend configuring, choosing, adjusting, tweaking, changing, improving, patching or getting used to your editor is time invested, for which you need to show a benefit, or else it's time wasted.

Let's look at SLOC, which while discredited as a measure of programmer's productivity, surely does work as a measure of how much a programmer types, right?

Well, estimates of total code production across the lifetime of a product vary (just take the SLOC of the product and divide by the man-days spent), but they are usually somewhere between 10 and 100 SLOC per programmer per day. Let's be generous and say 200.

So, 200 lines in eight hours. That's roughly one line every two minutes, and the average line of code is about 45 characters. Since I assume you are a competent typist (if you are not, shame on you!), it takes less than 20 seconds to type that.

So, typing, which is what you often tweak in your editor, takes less than 15% of your time. And how much faster can it get? Can you get a line written in 10 seconds? Then you just saved 8% of your day. And really, does your editor save you half the typing time?
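
The arithmetic behind those percentages can be checked with a few lines. This is only a back-of-the-envelope sketch using the numbers from the text (200 lines, an eight-hour day, 20 or 10 seconds of typing per line); typing_share is a name made up for the example.

```python
LINES_PER_DAY = 200            # the generous estimate from the text
WORKDAY_SECONDS = 8 * 60 * 60  # eight hours

seconds_per_line = WORKDAY_SECONDS / float(LINES_PER_DAY)


def typing_share(typing_seconds_per_line):
    """Fraction of the working day spent typing at the given speed."""
    return typing_seconds_per_line / seconds_per_line


print("seconds available per line: %.0f" % seconds_per_line)
print("share at 20 s per line: %.1f%%" % (100 * typing_share(20)))
print("share at 10 s per line: %.1f%%" % (100 * typing_share(10)))
```

Without rounding, each line gets 144 seconds, so 20 seconds of typing is about 13.9% of the day. Halving that to 10 seconds saves about 6.9%, or a bit over 8% if you round the interval down to two minutes as the text does.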

How much time do you lose having your eyes wander over the sidebars, buttons, toolbars, etc?

So while yes, typing faster and more efficiently is an optimization, it may also be premature, in that, what the hell are we doing the other 80% of the time? Isn't there something we can do to make that huge chunk of time more efficient instead of the smaller chunk?

Well, I think we spend most of that time doing three things:

Reading code
Thinking about what code to write
Fixing what we wrote in that other 20%

The first is easy: we need better code readers not editors. It's a pity that the main interface we get for looking at code is an editor, with its constant lure towards just changing stuff. I think there is a lost opportunity there somewhere, for an app where you can look at the code in original or interesting ways, so that you understand the code better.

The second is harder, because it's personal. I walk. If you see me walking while my editor is open, I am thinking. After I think, I write. Your mileage may vary.

The third is by far the hardest of the three. For example, autocomplete helps there, because you won't mistype things, which is interesting, but more powerful approaches exist, like constantly running test suites while you edit. Every time you leave a line, trigger the affected parts of the suite.

That's much harder than it sounds, since it means your tools need to correlate your test suite to your code very tightly, so that you will see you are breaking stuff the second you break it, not minutes later.

Also, it should enable you to jump to the test with a keystroke, so that you can fix those tests if you are changing behaviour in your code. And of course it will mean you need tests ;-)
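
The correlation between code and tests described above could, in its simplest form, be a map from each test to the functions it exercises. The sketch below is purely hypothetical: affected_tests, the map and every name in it are invented for illustration, and building such a map accurately is exactly the hard part.

```python
def affected_tests(changed_function, coverage_map):
    """Return the tests that exercise the function that was just edited."""
    return sorted(test for test, touched in coverage_map.items()
                  if changed_function in touched)


# Hypothetical data: which functions each test is known to touch.
coverage_map = {
    "test_parse": {"parse", "tokenize"},
    "test_render": {"render"},
    "test_roundtrip": {"parse", "render"},
}

print(affected_tests("parse", coverage_map))
```

An editor with this information could run only test_parse and test_roundtrip when you leave a line inside parse, and the same map gives it somewhere to jump to when you ask for "the test for this function".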

Which brings me to a pet peeve of mine, that editors still treat the file as the unit of work, which makes no sense at all. You never want to edit a file, you want to edit a function, or a class, or a method, or a constant but never a file. Knowing this was the sheer genius of ancient Visual Basic, which was completely ignored by all the snobs looking down at it.

So, instead of tweaking your editor, get me a tool that does what I need please. I have been waiting for it since VB 1.0. And a sandwich.

UPDATE: Interesting discussions on Reddit and Hacker News

Quick Hack to Catalog your Books

2012-07-13 12:22

If you have actual, paper books and want to catalog their info quickly, this bookdata.py script may be handy:

import sys
import time
import gdata.books.service
import json

def get_book_info(isbn):
    print "Looking for ISBN:", isbn
    google_books = gdata.books.service.BookService()
    result = google_books.search('ISBN %s '%isbn)
    data = [x.to_dict() for x in result.entry]
    if not data:
        print "No results"
        return
    title = data[0]['title']
    with open(title+'.json','w') as f:
        f.write(json.dumps(data))
    print "Saved info for '%s' in '%s.json'" %(isbn, title)

if __name__ == "__main__":
    while True:
        isbn = sys.stdin.readline().strip()
        if isbn:
            get_book_info(isbn)
        time.sleep(1)

What does it do? It reads ISBN numbers from standard input and saves the book's info in a title.json file for later processing and formatting.
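
As an example of that later processing, here is a small sketch that lists the titles stored in those files. It assumes only what the script above shows: each file holds a JSON list whose first entry has a "title" key. list_titles is a hypothetical helper, written for Python 3.

```python
import glob
import json
import os


def list_titles(directory="."):
    """Return the title stored in each *.json file found in a directory."""
    titles = []
    for path in sorted(glob.glob(os.path.join(directory, "*.json"))):
        with open(path) as f:
            data = json.load(f)
        # Skip files that do not look like output from bookdata.py.
        if isinstance(data, list) and data and isinstance(data[0], dict):
            if "title" in data[0]:
                titles.append(data[0]["title"])
    return titles


for title in list_titles("."):
    print(title)
```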

If you want to edit that information, you can just do it by hand, or you can try a little script using jsonwidget, like this:

python -c 'import jsonwidget; jsonwidget.run_editor("abook.json", schemafile="gbooks.schema")'

Where abook.json is a file generated by the previous script and gbooks.schema is this file.

Oh, and if your books have barcodes, you can just do:

zbarcam --raw | python bookdata.py

Show your computer your books and let it do the rest :-)

PS: I would love it if someone gathered all this and made a nice personal book cataloguing thing.

The Long Post About PyCamp 2012

2012-07-11 22:28

As I have mentioned in half a dozen posts already, I spent the last weekend at PyCamp 2012. But what I have not written about is what exactly it was, and why anyone would want to attend one, or maybe organize one. So that's what this post is about.

PyCamp was organized by PyAr, the Python Argentina community. PyAr is a very special bunch of people, who are completely amateur, and do everything for love and fun. Since PyAr is a very special group of people, the things PyAr causes, inspires or creates are special as well.

So, for a few years now, what happens is that someone finds a place with bunk beds, a large room, perhaps somewhat isolated, that provides meals, and is cheap (it's as hard as it sounds) and rents it for a long weekend. Then everyone is invited to chip in for the rent money.

This year, four days, all inclusive, cost roughly U$S 100. Sure, it's not exactly luxury accommodations, but it does what it has to do, which is give us shelter and protect us from wild animals.

Thus, you end up with a few dozen nerds with computers: one of them is great at setting up wireless (Joac!), one is the MC (Alecu!), one helps around (Facundo!), one is the liaison with the location (Pindonga!) and so on. The work is spread around, and we have time and company to hack.

So, on the first morning, everyone proposes what they would like to work on. Those proposals are voted on by everyone, and those with more votes are assigned slots (5 a day), where they will be the main focus of attention.

So, what happens if your proposal is not voted? Well, you either find a proposal you like, and join it, or you just do your thing. Because this is not a democracy, this is anarchy, the votes are just a way for everyone to know what people will be doing, and to find places to fit in if you want (BTW, there is a situation in Le Guin's The Dispossessed which is so much like this, it's scary).

After that, you just do what you want. You can put your headset on, and code, or mingle and chat, or join a group, or do a bit of everything. Since meals are catered, you don't have to worry about breaks. When the meal is ready, everyone breaks at the same time and socializes at communal tables.

Does all this sound as strange to you as it does to me? A bunch of grown professionals acting like hippies. Well, it feels strange too, but that doesn't mean it doesn't feel great. It even works great. Once you see what the others are doing, things you wouldn't expect start looking like fun (Celery!?! Juggernaut! Android!) and the sheer excitement of people telling you "look, I did this!" is infectious, and exhilarating.

Also, RC cars, Kinect hacking, Android hacking, electric guitar hacking, juggling, unicycle lessons, a firepit, alcohol, coffee, mate, boardgames, cardgames, music, jokes, adrenaline, huge spiders, asado, cold, vim, ninja, ping pong, robot spaceships, people you see only twice a year if that, questions, not knowing the answers, figuring things out on the run, getting help with that thing you have been stuck on for weeks, having the piece someone else has been stuck on for weeks, feeling like some sort of bearded buddha and a total ignoramus in 5 minutes...

And at least I, at least this year, had a very productive weekend. I got help from a bunch of people in things I was daunted by, I felt like an active programmer instead of a suit, which is always nice, since I don't own a suit, and had a great time. Laughed a lot. Made a couple new friends. Saw a bunch of old ones. Helped a few people.

So, I would like other people to have as great a time as I had. Of course coming to Argentina is probably not a great idea. It's an expensive trip, if you don't speak Spanish you will miss a lot, and if PyCamp gets too big it may stop being fun at all.

But why not do something similar? Doesn't have to be about Python, you can do it about making stuff, about programming in general, whatever. Just get a somewhat comfortable, somewhat isolated place with reasonable catering and get your 50 nearest geeks there, and have a ton of fun.

You may get something useful done, too.

Ralsina.Me — Roberto Alsina's website

Posts about python (old posts, page 81)
