Posts about python (old posts, page 37)

2012-04-22 21:58

Nikola Screencast

I did some work today to get Nikola properly packaged. This involves some minor changes to the workflow for site authors. I am not 100% sure I have it right yet, so here is a short video showing how it works right now in the packaging branch I am working on.

The new thing is the nikola init foldername command; the rest is all old stuff. Basically, you stop having a full copy of Nikola for each site, and everything lives in a centralized location.

You can still create your own themes by putting them in themes/themename, and you can still add new tasks, files, etc. The configuration is unchanged except for the "magic bit", which is slightly different.

So, not really invasive, easy to migrate to, and enables much easier updates in the future, as long as we don't break any important stuff in a non-compatible way.

Here is the video:

2012-04-19 23:34

Jinja support in Nikola ready for testing

Since some people really don't like Mako and prefer Jinja2, I decided to make the template engine modular in Nikola, my static site/blog generator.

All things considered, there is very little difference on the API side:


I even ported the default theme to Jinja! And not all that much is different on the template side, either:


But hey, to each his own. Other template engines are probably easy to plug in, too.
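
As a rough illustration of what a modular template engine can look like, here is a minimal sketch. It is not Nikola's actual code: the class names and the render_page helper are hypothetical, and the standard library's string.Template and str.format stand in for Mako and Jinja2 so the example has no dependencies.

```python
from string import Template


class StringTemplateEngine(object):
    """Hypothetical backend: $name placeholders via string.Template."""

    def __init__(self, templates):
        self.templates = templates  # maps template name -> template source

    def render(self, template_name, context):
        return Template(self.templates[template_name]).substitute(context)


class FormatEngine(object):
    """Hypothetical backend: {name} placeholders via str.format."""

    def __init__(self, templates):
        self.templates = templates

    def render(self, template_name, context):
        return self.templates[template_name].format(**context)


def render_page(engine, context):
    # The generator only relies on render(); which engine sits behind it
    # is a configuration detail.
    return engine.render('post.tmpl', context)


engine_a = StringTemplateEngine({'post.tmpl': '<h1>$title</h1>'})
engine_b = FormatEngine({'post.tmpl': '<h1>{title}</h1>'})
print(render_page(engine_a, {'title': 'Hello'}))
print(render_page(engine_b, {'title': 'Hello'}))
```

Both calls produce the same HTML, which is the point: the theme has to be ported to each engine's syntax, but the calling code does not change.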

The branch is up for review on GitHub, and yes, anyone reading this who knows Jinja2 or Python is welcome to review it.

2012-04-17 23:40

A port of blog.txt to Nikola

If you want a very minimalistic theme for a Nikola-based site, I just did a quick and dirty port of Scott Wallick's blog.txt theme.

If there is anything nice here, he did it. If there is something wrong or broken, I did it instead.

I did it basically to see if it was possible to port WordPress themes to Nikola. And it is, but it involves reading PHP files and loosely reinterpreting them into Mako templates.

While the port is far from perfect, it's a reasonable starting point for someone who is really interested in it.

Here is how it looks:


And here is the download. To use it, just unzip it in your themes folder, set THEME to "blogtxt" in your configuration file, and rebuild the site.

At least the CSS files are easily adapted.

This theme is under an LGPL license (see included license.txt), enjoy!

2012-04-16 23:12

Smiljan, a Small Planet Generator

I maintain a couple of small "planet" sites. If you are not familiar with planets, they are sites that aggregate RSS/Atom feeds for a group of people related somehow. It makes for a nice, single, thematic feed.

Recently, when moving them from one server to another, everything broke. Old posts showed up as new, and feeds that had not been updated in 2 years always had all their posts on top... a disaster.

I could have gone to the old server and started debugging why rawdog was doing that, or switched to planet, or looked for other software, or used an online aggregator.

Instead, I started thinking... I had written a few RSS aggregators in the past... Feedparser is again under active development... rawdog and planet seem to be pretty much abandoned... how hard could it be to implement the minimal planet software?

Well, not all that hard, that's how hard it was. It took me 4 hours, and it was not even difficult.

One reason why this was easier than what planet and rawdog achieved is that I am not writing a static site generator, because I already have one. So all I need this program (I called it Smiljan) to do is:

  • Parse a list of feeds and store it in a database if needed.
  • Download those feeds (respecting etag and modified-since).
  • Parse those feeds looking for entries (feedparser does that).
  • Load those entries (or rather, a tiny subset of their data) in the database.
  • Use the entries to generate a set of files to feed Nikola.
  • Use Nikola to generate and deploy the site.

So, here is the final result, which still needs theming and a lot of other stuff, but works.

I implemented Smiljan as 3 doit tasks, which makes it very easy to integrate with Nikola (if you know Nikola: add "from smiljan import *" to your site's doit file, and add a feeds file with the feed list in rawdog format), and voilà, running this updates the planet:

doit load_feeds update_feeds generate_posts render_site deploy

Here is the code, currently at the "gross hack that kinda works" stage. Enjoy!

# -*- coding: utf-8 -*-
import codecs
import datetime
import glob
import os
import sys

from doit.tools import timeout
import feedparser
import peewee

class Feed(peewee.Model):
    name = peewee.CharField()
    url = peewee.CharField(max_length = 200)
    last_status = peewee.CharField()
    etag = peewee.CharField(max_length = 200)
    last_modified = peewee.DateTimeField()

class Entry(peewee.Model):
    date = peewee.DateTimeField()
    feed = peewee.ForeignKeyField(Feed)
    content = peewee.TextField(max_length = 20000)
    link = peewee.CharField(max_length = 200)
    title = peewee.CharField(max_length = 200)
    guid = peewee.CharField(max_length = 200)


def task_load_feeds():
    feeds = []
    feed = name = None
    for line in open('feeds'):
        line = line.strip()
        if line.startswith('feed'):
            feed = line.split(' ')[2]
        if line.startswith('define_name'):
            name = ' '.join(line.split(' ')[1:])
        if feed and name:
            feeds.append([feed, name])
            feed = name = None

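    def add_feed(name, url):
        # Assumption: the original arguments were cut off; these initial
        # values are placeholders so every field has a value.
        f = Feed.create(
            name=name,
            url=url,
            last_status='',
            etag='',
            last_modified=datetime.datetime(1970, 1, 1),
        )

    def update_feed_url(feed, url):
        feed.url = url
        feed.save()  # persist the new URL

    for feed, name in feeds:
        # Assumption: look the feed up by name, using the keyword-style
        # filter of the peewee release this was written against.
        f = Feed.select().where(name=name)
        if not list(f):
            yield {
                'name': name,
                'actions': ((add_feed,(name, feed)),),
                'file_dep': ['feeds'],
            }
        elif list(f)[0].url != feed:
            yield {
                'name': 'updating:'+name,
                'actions': ((update_feed_url,(list(f)[0], feed)),),
            }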

def task_update_feeds():
    def update_feed(feed):
        modified = feed.last_modified.timetuple()
        etag = feed.etag
        # Conditional GET. Assumption: the etag/modified arguments are
        # reconstructed, since the call was cut off after the URL.
        parsed = feedparser.parse(feed.url,
            etag=etag,
            modified=modified)
        try:
            feed.last_status = str(parsed.status)
        except:  # Probably a timeout
            # TODO: log failure
            return False
        if parsed.feed.get('title'):
            print parsed.feed.title
            print feed.url
        feed.etag = parsed.get('etag', 'caca')
        modified = tuple(parsed.get('date_parsed', (1970,1,1)))[:6]
        print "==========>", modified
        modified = datetime.datetime(*modified)
        feed.last_modified = modified
        feed.save()  # persist status, etag and date
        # No point in adding items from missing feeds
        if parsed.status > 400:
            # TODO log failure
            return False
        for entry_data in parsed.entries:
            print "========================================="
            date = entry_data.get('updated_parsed', None)
            if date is None:
                date = entry_data.get('published_parsed', None)
            if date is None:
                print "Can't parse date from:"
                print entry_data
                return False
            date = datetime.datetime(*(date[:6]))
            # 'Sin título' / 'Sin contenido': "Untitled" / "No content".
            # Assumption: the title prefix is the feed name.
            title = "%s: %s" %(feed.name, entry_data.get('title', 'Sin título'))
            content = entry_data.get('description',
                    entry_data.get('summary', 'Sin contenido'))
            # Assumption: link and guid both come from the entry's link.
            link = entry_data.get('link', '')
            guid = entry_data.get('guid', link)
            print repr([date, title])
            entry = Entry.get_or_create(
                date = date,
                title = title,
                content = content,
                link = link,
                guid = guid,
                feed = feed,
            )
    for feed in Feed.select():
        yield {
            'name': feed.name,  # doit needs a name for each sub-task
            'actions': [(update_feed,(feed,))],
            'uptodate': [timeout(datetime.timedelta(minutes=20))],
        }

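def task_generate_posts():

    def generate_post(entry):
        # Assumption: files are named after the entry's database id; the
        # original expressions were cut off.
        meta_path = os.path.join('posts',str(entry.id)+'.meta')
        post_path = os.path.join('posts',str(entry.id)+'.txt')
        with codecs.open(meta_path, 'wb+', 'utf8') as fd:
            fd.write(u'%s\n' % entry.title.replace('\n', ' '))
            fd.write(u'%s\n' % entry.id)  # assumed value
            fd.write(u'%s\n' % entry.date.strftime('%Y/%m/%d %H:%M'))
            fd.write(u'%s\n' % entry.link)  # assumed value
        with codecs.open(post_path, 'wb+', 'utf8') as fd:
            fd.write(u'.. raw:: html\n\n')
            content = entry.content
            if not content:
                content = u'Sin contenido'
            for line in content.splitlines():
                fd.write(u'    %s\n' % line)

    for entry in Entry.select().order_by(('date', 'desc')):
        yield {
            'name': str(entry.id),  # doit needs a name for each sub-task
            'actions': [(generate_post, (entry,))],
        }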
2012-04-14 11:33

Nikola 2.1.1 + GitHub

By popular request, Nikola now has its source code at GitHub.

Also, if you tried version 2.1 and it failed, try 2.1.1, because I forgot to add a couple of files in one of the themes in 2.1.

2012-04-09 21:41

Nikola 1.2 is out!

Version 1.2 of Nikola, my static site generator and the software behind this very site, is out!

Why build static sites? Because they are light on resources, they are future-proof, they are easy, they are safe, and you avoid lock-in.

New Features:

  • Image gallery (just drop pics in a folder)
  • Built-in webserver for previews (doit -a serve)
  • Helper commands to create new posts (doit -a new_post)
  • Google Sitemap support
  • A Handbook!
  • Full demo site included
  • Support for automatic deployment (doit -a deploy)
  • Client-side redirections

And of course the old features:

  • Write your posts in reStructuredText
  • Clean, customizable page design (via bootstrap)
  • Comments via Disqus
  • Support any analytics you want
  • Build blogs with tags, feeds, feeds for your tags, indexes, and more
  • Works like a simple CMS for things outside your blog
  • Clean customizable templates using Mako
  • Pure python, and not a lot of it (about 600 lines)
  • Smart builds (doit only rebuilds changed pages)
  • Easy to extend and improve
  • Code displayed with syntax highlighting

Right now Nikola does literally everything I need, so if you try it and need something else... it's a good time to ask!

More info at

2012-03-30 22:59

Nikola 1.1 is out!

A simple yet powerful and flexible static website and blog generator, based on doit, mako, docutils and bootstrap.

I built this to power this very site you are reading, but decided it may be useful to others. The main goals of Nikola are:

  • Small codebase: because I don't want to maintain a big thing for my blog
  • Fast page generation: Adding a post should not take more than 5 seconds to build.
  • Static output: Deployment using rsync is smooth.
  • Flexible page generation: you can decide where everything goes in the final site.
  • Powerful templates: Uses Mako
  • Clean markup for posts: Uses Docutils
  • Don't do stupid builds: Uses doit
  • Clean HTML output by default: Uses bootstrap
  • Comments out of the box: Uses Disqus
  • Tags, with their own RSS feeds
  • Easy way to do a blog
  • Static pages outside the blog
  • Multilingual blog support (my own blog is English + Spanish)

I think this initial version achieves all of those goals, but of course, it can be improved. Feedback is very welcome!

Nikola's home page is currently

2012-03-30 13:58

Unicode in Python is Fun!

As I hope you know, if you get a string of bytes, and want the text in it, and that text may be non-ASCII, what you need to do is decode the string using the correct encoding name:

>>> 'á'.decode('utf8')
u'\xe1'

However, there is a gotcha there. You have to be absolutely sure that the thing you are decoding is a string of bytes, and not a unicode object. Because unicode objects also have a decode method but it's an incredibly useless one, whose only purpose in life is causing this peculiar error:

>>> u'á'.decode('utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/encodings/", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1'
in position 0: ordinal not in range(128)

Why peculiar? Because it's an Encode error. Caused by calling decode. You see, on unicode objects, decode does something like this:

def decode(self, encoding):
    return self.encode('ascii').decode(encoding)

The user wants a unicode object. He has a unicode object. By definition, there is no such thing as a way to utf-8-decode a unicode object. It just makes NO SENSE. It's like asking for a way to comb a fish, or climb a lake.

What it should return is self! Also, it's annoying as all hell in that the only way to avoid it is to check for type, which is totally unpythonic.

Or even better, let's just not have a decode method on unicode objects, which I think is the case in Python 3, and which I know we will never get in Python 2.
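
For comparison, here is a small Python 3 sketch (the sessions above are Python 2). In Python 3 text strings have no decode method at all, and the type check can be wrapped in a helper. The helper name to_text is made up for illustration.

```python
def to_text(value, encoding='utf8'):
    """Return text whether value is bytes or already-decoded text."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


raw = '\xe1'.encode('utf8')        # the two UTF-8 bytes for "á"
print(to_text(raw) == '\xe1')      # bytes get decoded
print(to_text('\xe1') == '\xe1')   # text passes through untouched
print(hasattr(str, 'decode'))      # str has no decode method in Python 3
```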

So, be aware of it, and good luck!

2012-03-28 22:59

Nikola is Near

I managed to do some minor work today on Nikola, the static website generator used to generate ... well, this static website.

  • Implemented tags (including per-tag RSS feeds)
  • Simplified templates
  • Separated code and configuration.

The last one was the trickiest. As a teaser, here is the full configuration file to create this site, except for the HTML bits for analytics, Google custom search, and whatever else would make no sense on other sites. I hope it's somewhat clear.

# -*- coding: utf-8 -*-

# post_pages contains (wildcard, destination, template) tuples.
# The wildcard is used to generate a list of reSt source files (whatever/thing.txt)
# That fragment must have an associated metadata file (whatever/thing.meta),
# and optionally translated files (example for Spanish, with code "es"):
#     whatever/ and whatever/
# From those files, a set of HTML fragment files will be generated:
# whatever/thing.html (and maybe whatever/
# These files are combined with the template to produce rendered
# pages, which will be placed at
# output / TRANSLATIONS[lang] / destination / pagename.html
# where "pagename" is specified in the metadata file.

post_pages = (
    ("posts/*.txt", "weblog/posts", "post.tmpl"),
    ("stories/*.txt", "stories", "post.tmpl"),
)

# What is the default language?
DEFAULT_LANG = "en"  # assumed: "en" is the language with the empty path prefix

# What languages do you have?
# If a specific post is not translated to a language, then the version
# in the default language will be shown instead.
# The format is {"translationcode" : "path/to/translation" }
# the path will be used as a prefix for the generated pages location
TRANSLATIONS = {
    "en": "",
    "es": "tr/es",
}

# Data about this site
BLOG_TITLE = "Lateral Opinion"
BLOG_URL = "//"
BLOG_EMAIL = "[email protected]"
BLOG_DESCRIPTION = "I write free software. I have an opinion on almost "\
    "everything. I write quickly. A weblog was inevitable."

# Paths for different autogenerated bits. These are combined with the translation
# paths.

# Final locations are:
# output / TRANSLATION[lang] / TAG_PATH / index.html (list of tags)
# output / TRANSLATION[lang] / TAG_PATH / tag.html (list of posts for a tag)
# output / TRANSLATION[lang] / TAG_PATH / tag.xml (RSS feed for a tag)
TAG_PATH = "categories"
# Final location is output / TRANSLATION[lang] / INDEX_PATH / index-*.html
INDEX_PATH = "weblog"
# Final locations for the archives are:
# output / TRANSLATION[lang] / ARCHIVE_PATH / archive.html
# output / TRANSLATION[lang] / ARCHIVE_PATH / YEAR / index.html
ARCHIVE_PATH = "weblog"
# Final locations are:
# output / TRANSLATION[lang] / RSS_PATH / rss.xml
RSS_PATH = "weblog"

# A HTML fragment describing the license, for the sidebar.
LICENSE = """
    <a rel="license" href="">
    <img alt="Creative Commons License" style="border-width:0; margin-bottom:12px;"
"""

# A search form to search this site, for the sidebar. Has to be a <li>
# for the default template (base.tmpl).
SEARCH_FORM = """
    <!-- google custom search -->
    <!-- End of google custom search -->
"""

# Google analytics or whatever else you use. Added to the bottom of <body>
# in the default template (base.tmpl).
ANALYTICS = """
        <!-- Start of StatCounter Code -->
        <!-- End of StatCounter Code -->
        <!-- Start of Google Analytics -->
        <!-- End of Google Analytics -->
"""

# Put in global_context things you want available on all your templates.
# It can be anything, data, functions, modules, etc.
GLOBAL_CONTEXT = {
    'analytics': ANALYTICS,
    'blog_title': BLOG_TITLE,
    'blog_url': BLOG_URL,
    'translations': TRANSLATIONS,
    'license': LICENSE,
    'search_form': SEARCH_FORM,
    # Locale-dependent links
    'archives_link': {
        'es': '<a href="/tr/es/weblog/archive.html">Archivo</a>',
        'en': '<a href="/weblog/archive.html">Archives</a>',
    },
    'tags_link': {
        'es': '<a href="/tr/es/categories/index.html">Tags</a>',
        'en': '<a href="/categories/index.html">Tags</a>',
    },
}
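
The path rule described in the configuration comments (output / TRANSLATIONS[lang] / destination / pagename.html) can be sketched in a few lines of Python 3. This is only an illustration of the rule, not Nikola's code: the helper name output_path and the page name are made up.

```python
import posixpath

# Values copied from the configuration above.
TRANSLATIONS = {"en": "", "es": "tr/es"}


def output_path(lang, destination, pagename):
    """Hypothetical helper: output / TRANSLATIONS[lang] / destination / pagename.html"""
    return posixpath.join("output", TRANSLATIONS[lang], destination,
                          pagename + ".html")


print(output_path("en", "weblog/posts", "welcome-to-nikola"))
# output/weblog/posts/welcome-to-nikola.html
print(output_path("es", "weblog/posts", "welcome-to-nikola"))
# output/tr/es/weblog/posts/welcome-to-nikola.html
```

The empty prefix for the default language simply drops out of the joined path, which is why English pages land directly under output.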


2012-03-27 23:19

Welcome To Nikola

If you see this, you may notice some changes in the site.

So, here is a short explanation:

  • I changed the software and the templates for this blog.
  • Yes, it's a work in progress.
  • The new software is called Nikola.
  • Yes, it's pretty cool.

Why change?

Are you kidding? My previous blog-generator (Son of BartleBlog) was not in good shape. The archives only covered 2000-2010, the "previous posts" links were a lottery, and the Spanish version of the site was missing whole sections.

So, what's Nikola?

Nikola is a static website generator. One thing about this site is that it is, and has always been, just HTML. Every "dynamic" thing you see in it, like comments, is a third party service. This site is just a bunch of HTML files sitting in a folder.

So, how does Nikola work?

Nikola takes a folder full of txt files written in reStructuredText, and generates HTML fragments.

Those fragments, plus some light metadata (title, tags, desired output filename, external links to sources) and some Mako templates, create HTML pages.

Those HTML pages use bootstrap to not look completely broken (hey, I never claimed to be a designer).

To make sure I don't do useless work, doit makes sure only the required files are recreated.
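
A minimal sketch of that idea, in Python 3: a doit task generator that yields one sub-task per post, declaring the source file as a dependency and the HTML file as a target so that doit can skip posts that have not changed. This is not Nikola's code. The function names, the flat output folder, and the trivial "rendering" step are assumptions for illustration; the dictionary keys (name, actions, file_dep, targets) are doit's.

```python
import glob
import os


def render_post(source, target):
    """Stand-in for the real rendering step: wraps the text in a <pre> block."""
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(source) as src:
        text = src.read()
    with open(target, 'w') as out:
        out.write('<pre>%s</pre>\n' % text)


def task_render_posts():
    """One doit sub-task per post; doit reruns it only when file_dep changes."""
    for source in sorted(glob.glob(os.path.join('posts', '*.txt'))):
        name = os.path.splitext(os.path.basename(source))[0]
        target = os.path.join('output', name + '.html')
        yield {
            'name': name,
            'actions': [(render_post, (source, target))],
            'file_dep': [source],
            'targets': [target],
        }
```

Saved in a doit file, running doit twice in a row should render every post the first time and report the sub-tasks as up to date the second time.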

Why not use <whatever>?

Because, for diverse reasons, I wanted to keep the exact URLs I have been using:

  • If I move a page, keeping the Disqus comments attached gets tricky
  • Some people may have bookmarked them

Also, I wanted:

  • Mako templates (because I like Mako)
  • reStructuredText (because I have over 1000 posts written in it)
  • Python (so I could hack it)
  • Easy to hack (currently Nikola is under 600 LOC, and is almost feature complete)
  • Support for a multilingual blog like this one.

And of course:

  • It sounded like a fun, short project. I had the suspicion that with a bit of glue, existing tools did 90% of the work. Looks like I was right, since I wrote it in a few days.

Are you going to maintain it?

Sure, since I am using it.

Is it useful for other people?

Probably not right now, because it makes a ton of assumptions for my site. I need to clean it up a bit before it's really nice.

Can other people use it?

Of course. It will be available somewhere soon.

Missing features?

No tags yet. Some other minor missing things.

Contents © 2000-2018 Roberto Alsina