Skip to main content

Ralsina.Me — Roberto Alsina's website

Posts about programming (old posts, page 72)

Unicode in Python is Fun!

As I hope you know, if you get a string of bytes, and want the text in it, and that text may be non-asci­i, what you need to do is de­code the string us­ing the cor­rect en­cod­ing name:

>>> 'á'.decode('utf8')
u'\xe1'

How­ev­er, there is a gotcha there. You have to be ab­so­lute­ly sure that the thing you are de­cod­ing is a string of bytes, and not a uni­code ob­jec­t. Be­cause uni­code ob­jects al­so have a de­code method but it's an in­cred­i­bly use­less one, whose on­ly pur­pose in life is caus­ing this pe­cu­liar er­ror:

>>> u'á'.decode('utf8')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1'
in position 0: ordinal not in range(128)

Why pe­cu­liar? Be­cause it's an En­code er­ror. Caused by call­ing de­code. You see, on uni­code ob­ject­s, de­code does some­thing like this:

def decode(self, encoding):
    return self.encode('ascii').decode(encoding)

The us­er wants a uni­code ob­jec­t. He has a uni­code ob­jec­t. By def­i­ni­tion, there is no such thing as a way to ut­f-8-de­code a uni­code ob­jec­t. It just makes NO SENSE. It's like ask­ing for a way to comb a fish, or climb a lake.

What it should return is self! Also, it's annoying as all hell in that the only way to avoid it is to check for type, which is totally unpythonic.

Or even bet­ter, let's just not have a de­code method on uni­code ob­ject­s, which I think is the case in python 3, and I know we will nev­er get on python 2.

So, be aware of it, and good luck!

How it's done

I added a very mi­nor fea­ture to the site. Up here ^ you should be able to see a link that says "reSt". If you click on it, it will show you the "source code" for the page.

I did this for a few rea­son­s:

  1. Be­­cause a com­­ment seemed to sug­­gest it ;-)

  2. Be­­cause it seems like a nice thing to do. Since I so like reSt, I would like oth­­ers to use it, too. And show­ing how easy it is to write us­ing it, is cool.

  3. It's the "free soft­­ware-y" thing to do. I am pro­vid­ing you the pre­­ferred way to mod­­i­­fy my post­s.

  4. It was ridicu­lous­­ly easy to ad­d.

Al­so, if you see some­thing miss­ing, or some­thing you would like to have on the site, please com­men­t, I will try to add it.

Nikola is Near

I man­aged to do some mi­nor work to­day on Niko­la, the stat­ic web­site gen­er­a­tor used to gen­er­ate ... well, this stat­ic web­site.

  • Im­­ple­­men­t­ed tags (in­­clud­ing per-­­tag RSS feed­s)

  • Sim­­pli­­fied tem­­plates

  • Sep­a­rat­ed code and con­­fig­u­ra­­tion.

The last one was the trick­i­est. And as a teaser, here is the full con­fig­u­ra­tion file to cre­ate this site, ex­cept HTML bits for an­a­lyt­ic­s, google cus­tom search and what­ev­er that would make no sense on oth­er sites. I hope it's some­what clear.

# -*- coding: utf-8 -*-

# post_pages contains (wildcard, destination, template) tuples.
#
# The wildcard is used to generate a list of reSt source files (whatever/thing.txt)
# That fragment must have an associated metadata file (whatever/thing.meta),
# and opcionally translated files (example for spanish, with code "es"):
#     whatever/thing.txt.es and whatever/thing.meta.es
#
# From those files, a set of HTML fragment files will be generated:
# whatever/thing.html (and maybe whatever/thing.html.es)
#
# These files are combinated with the template to produce rendered
# pages, which will be placed at
# output / TRANSLATIONS[lang] / destination / pagename.html
#
# where "pagename" is specified in the metadata file.
#

post_pages = (
    ("posts/*.txt", "weblog/posts", "post.tmpl"),
    ("stories/*.txt", "stories", "post.tmpl"),
)

# What is the default language?

DEFAULT_LANG = "en"

# What languages do you have?
# If a specific post is not translated to a language, then the version
# in the default language will be shown instead.
# The format is {"translationcode" : "path/to/translation" }
# the path will be used as a prefix for the generated pages location

TRANSLATIONS = {
    "en": "",
    "es": "tr/es",
    }

# Data about this site
BLOG_TITLE = "Lateral Opinion"
BLOG_URL = "//ralsina.me"
BLOG_EMAIL = "ralsina@kde.org"
BLOG_DESCRIPTION = "I write free software. I have an opinion on almost "\
    "everything. I write quickly. A weblog was inevitable."

# Paths for different autogenerated bits. These are combined with the translation
# paths.

# Final locations are:
# output / TRANSLATION[lang] / TAG_PATH / index.html (list of tags)
# output / TRANSLATION[lang] / TAG_PATH / tag.html (list of posts for a tag)
# output / TRANSLATION[lang] / TAG_PATH / tag.xml (RSS feed for a tag)
TAG_PATH = "categories"
# Final location is output / TRANSLATION[lang] / INDEX_PATH / index-*.html
INDEX_PATH = "weblog"
# Final locations for the archives are:
# output / TRANSLATION[lang] / ARCHIVE_PATH / archive.html
# output / TRANSLATION[lang] / ARCHIVE_PATH / YEAR / index.html
ARCHIVE_PATH = "weblog"
# Final locations are:
# output / TRANSLATION[lang] / RSS_PATH / rss.xml
RSS_PATH = "weblog"

# A HTML fragment describing the license, for the sidebar.
LICENSE = """
    <a rel="license" href="http://creativecommons.org/licenses/by-nc-sa/2.5/ar/">
    <img alt="Creative Commons License" style="border-width:0; margin-bottom:12px;"
    src="http://i.creativecommons.org/l/by-nc-sa/2.5/ar/88x31.png"></a>
"""

# A search form to search this site, for the sidebar. Has to be a <li>
# for the default template (base.tmpl).
SEARCH_FORM = """
    <!-- google custom search -->
    <!-- End of google custom search -->
"""

# Google analytics or whatever else you use. Added to the bottom of <body>
# in the default template (base.tmpl).
ANALYTICS = """
        <!-- Start of StatCounter Code -->
        <!-- End of StatCounter Code -->
        <!-- Start of Google Analytics -->
        <!-- End of Google Analytics -->
"""

# Put in global_context things you want available on all your templates.
# It can be anything, data, functions, modules, etc.
GLOBAL_CONTEXT = {
    'analytics': ANALYTICS,
    'blog_title': BLOG_TITLE,
    'blog_url': BLOG_URL,
    'translations': TRANSLATIONS,
    'license': LICENSE,
    'search_form': SEARCH_FORM,
    # Locale-dependent links
    'archives_link': {
        'es': '<a href="/tr/es/weblog/archive.html">Archivo</a>',
        'en': '<a href="/weblog/archive.html">Archives</a>',
        },
    'tags_link': {
        'es': '<a href="/tr/es/categories/index.html">Tags</a>',
        'en': '<a href="/categories/index.html">Tags</a>',
        },
    }

execfile("nikola/nikola.py")

Welcome To Nikola

If you see this, you may no­tice some changes in the site.

So, here is a short ex­pla­na­tion:

  • I changed the soft­­ware and the tem­­plates for this blog.

  • Yes, it's a work in progress.

  • The new soft­­ware is called Niko­la.

  • Yes, it's pret­­ty cool.

Why change?

Are you kid­ding? My pre­vi­ous blog-­gen­er­a­tor (Son of Bartle­Blog) was not in good shape. The ar­chives on­ly cov­ered 2000-2010, the "pre­vi­ous post­s" links were a lot­tery, and the span­ish ver­sion of the site was miss­ing whole sec­tion­s.

So, what's Nikola?

Niko­la is a stat­ic web­site gen­er­a­tor. One thing about this site is that it is, and has al­ways been, just HTM­L. Ev­ery "dy­nam­ic" thing you see in it, like com­ments, is a third par­ty ser­vice. This site is just a bunch of HTML files sit­ting in a fold­er.

So, how does Nikola work?

Niko­la takes a fold­er full of txt files writ­ten in re­struc­tured text, and gen­er­ates HTML frag­ments.

Those frag­ments plus some light meta­da­ta (ti­tle, tags, de­sired out­put file­name, ex­ter­nal links to sources) and Some Mako Tem­plates cre­ate HTML pages.

Those HTML pages use boot­strap to not look com­plete­ly bro­ken (hey, I nev­er claimed to be a de­sign­er).

To make sure I don't do use­less work, doit makes sure on­ly the re­quired files are recre­at­ed.

Why not use <whatever>?

Be­cause, for di­verse rea­son­s, I want­ed to keep the ex­act URLs I have been us­ing:

  • If I move a page, keep­­ing the Dis­­qus com­­ments at­­tached gets tricky

  • Some peo­­ple may have book­­marked them

Al­so, I want­ed:

  • Mako tem­­plates (be­­cause I like Mako)

  • Re­struc­­tured text (Be­­cause I have over 1000 posts writ­ten in it)

  • Python (so I could hack it)

  • Easy to hack (cur­ren­t­­ly Niko­la is un­der 600 LOC, and is al­­most fea­­ture com­­plete)

  • Sup­­port for a mul­ti­lin­gual blog like this one.

And of course:

  • It sound­ed like a fun, short pro­jec­t. I had the sus­pi­­cion that with a bit of glue, ex­ist­ing tools did 90% of the work. Looks like I was right, since I wrote it in a few days.

Are you going to maintain it?

Sure, since I am us­ing it.

Is it useful for other people?

Prob­a­bly not right now, be­cause it makes a ton of as­sump­tions for my site. I need to clean it up a bit be­fore it's re­al­ly nice.

Can other people use it?

Of course. It will be avail­able some­where soon.

Missing features?

No tags yet. Some oth­er mi­nor miss­ing things.

Ubuntu One APIs by Example (part 1)

One of the nice things about work­ing at Canon­i­cal is that we pro­duce open source soft­ware. I, specif­i­cal­ly, work in the team that does the desk­top clients for Ubun­tu One which is a re­al­ly cool job, and a re­al­ly cool piece of soft­ware. How­ev­er, one thing not enough peo­ple know, is that we of­fer damn nice APIs for de­vel­op­er­s. We have to, since all our client code is open source, so we need those APIs for our­selves.

So, here is a small tu­to­ri­al about us­ing some of those APIs. I did it us­ing Python and PyQt for sev­er­al rea­son­s:

  • Both are great tools for pro­­to­­typ­ing

  • Both have good sup­­port for the re­quired stuff (D­Bus, HTTP, OAu­th)

  • It's what I know and en­joy. Since I did this code on a sun­­day, I am not go­ing to use oth­­er things.

Hav­ing said that, there is noth­ing python-spe­cif­ic or Qt-spe­cif­ic in the code. Where I do a HTTP re­quest us­ing Qt­Net­work, you are free to use lib­soup, or what­ev­er.

So, on to the nuts and bolt­s. The main pieces of Ubun­tu One, from a in­fra­struc­ture per­spec­tive, are Ubun­tu SSO Clien­t, that han­dles us­er reg­is­tra­tion and login, and Sync­Dae­mon, which han­dles file syn­chro­niza­tion.

To in­ter­act with them, on Lin­ux, they of­fer DBus in­ter­faces. So, for ex­am­ple, this is a frag­ment of code show­ing a way to get the Ubun­tu One cre­den­tials (this would nor­mal­ly be part of an ob­jec­t's __init__):

# Get the session bus
bus = dbus.SessionBus()

:
:
:

# Get the credentials proxy and interface
self.creds_proxy = bus.get_object("com.ubuntuone.Credentials",
                        "/credentials",
                        follow_name_owner_changes=True)

# Connect to signals so you get a call when something
# credential-related happens
self.creds_iface = dbus.Interface(self.creds_proxy,
    "com.ubuntuone.CredentialsManagement")
self.creds_proxy.connect_to_signal('CredentialsFound',
    self.creds_found)
self.creds_proxy.connect_to_signal('CredentialsNotFound',
    self.creds_not_found)
self.creds_proxy.connect_to_signal('CredentialsError',
    self.creds_error)

# Call for credentials
self._credentials = None
self.get_credentials()

You may have no­ticed that get_­cre­den­tials does­n't ac­tu­al­ly re­turn the cre­den­tial­s. What it does is, it tells Sync­Dae­mon to fetch the cre­den­tial­s, and then, when/if they are there, one of the sig­nals will be emit­ted, and one of the con­nect­ed meth­ods will be called. This is nice, be­cause it means you don't have to wor­ry about your app block­ing while Sync­Dae­mon is do­ing all this.

But what's in those meth­ods we used? Not much, re­al­ly!

def get_credentials(self):
    # Do we have them already? If not, get'em
    if not self._credentials:
        self.creds_proxy.find_credentials()
    # Return what we've got, could be None
    return self._credentials

def creds_found(self, data):
    # Received credentials, save them.
    print "creds_found", data
    self._credentials = data
    # Don't worry about get_quota yet ;-)
    if not self._quota_info:
        self.get_quota()

def creds_not_found(self, data):
    # No credentials, remove old ones.
    print "creds_not_found", data
    self._credentials = None

def creds_error(self, data):
    # No credentials, remove old ones.
    print "creds_error", data
    self._credentials = None

So, ba­si­cal­ly, self­._­cre­den­tials will hold a set of cre­den­tial­s, or None. Con­grat­u­la­tion­s, we are now logged in­to Ubun­tu One, so to speak.

So, let's do some­thing use­ful! How about ask­ing for how much free space there is in the ac­coun­t? For that, we can't use the lo­cal APIs, we have to con­nect to the server­s, who are, af­ter al­l, the ones who de­cide if you are over quo­ta or not.

Ac­cess is con­trolled via OAuth. So, to ac­cess the API, we need to sign our re­quest­s. Here is how it's done. It's not par­tic­u­lar­ly en­light­en­ing, and I did not write it, I just use it:

def sign_uri(self, uri, parameters=None):
    # Without credentials, return unsigned URL
    if not self._credentials:
        return uri
    if isinstance(uri, unicode):
        uri = bytes(iri2uri(uri))
    print "uri:", uri
    method = "GET"
    credentials = self._credentials
    consumer = oauth.OAuthConsumer(credentials["consumer_key"],
                                   credentials["consumer_secret"])
    token = oauth.OAuthToken(credentials["token"],
                             credentials["token_secret"])
    if not parameters:
        _, _, _, _, query, _ = urlparse(uri)
        parameters = dict(cgi.parse_qsl(query))
    request = oauth.OAuthRequest.from_consumer_and_token(
                                        http_url=uri,
                                        http_method=method,
                                        parameters=parameters,
                                        oauth_consumer=consumer,
                                        token=token)
    sig_method = oauth.OAuthSignatureMethod_HMAC_SHA1()
    request.sign_request(sig_method, consumer, token)
    print "SIGNED:", repr(request.to_url())
    return request.to_url()

And how do we ask for the quo­ta us­age? By ac­cess­ing the http­s://one.ubun­tu.­com/api/quo­ta/ en­try point with the prop­er au­tho­riza­tion, we would get a JSON dic­tio­nary with to­tal and used space. So, here's a sim­ple way to do it:

    # This is on __init__
    self.nam = QtNetwork.QNetworkAccessManager(self,
        finished=self.reply_finished)

:
:
:

def get_quota(self):
    """Launch quota info request."""
    uri = self.sign_uri(QUOTA_API)
    url = QtCore.QUrl()
    url.setEncodedUrl(uri)
    self.nam.get(QtNetwork.QNetworkRequest(url))

Again, see how get_quo­ta does­n't re­turn the quo­ta? What hap­pens is that get_quo­ta will launch a HTTP re­quest to the Ubun­tu One server­s, which will, even­tu­al­ly, re­ply with the da­ta. You don't want your app to block while you do that. So, QNet­workAc­cess­Man­ag­er will call self­.re­ply_fin­ished when it gets the re­spon­se:

def reply_finished(self, reply):
    if unicode(reply.url().path()) == u'/api/quota/':
        # Handle quota responses
        self._quota_info = json.loads(unicode(reply.readAll()))
        print "Got quota: ", self._quota_info
        # Again, don't worry about update_menu yet ;-)
        self.update_menu()

What else would be nice to have? How about get­ting a call when­ev­er the sta­tus of sync­dae­mon changes? For ex­am­ple, when sync is up to date, or when you get dis­con­nect­ed? Again, those are DBus sig­nals we are con­nect­ing in our __init__:

self.status_proxy = bus.get_object(
    'com.ubuntuone.SyncDaemon', '/status')
self.status_iface = dbus.Interface(self.status_proxy,
    dbus_interface='com.ubuntuone.SyncDaemon.Status')
self.status_iface.connect_to_signal(
    'StatusChanged', self.status_changed)

# Get the status as of right now
self._last_status = self.process_status(
    self.status_proxy.current_status())

And what's sta­tus_changed?

def status_changed(self, status):
    print "New status:", status
    self._last_status = self.process_status(status)
    self.update_menu()

The pro­cess_s­ta­tus func­tion is bor­ing code to con­vert the in­fo from sync­dae­mon's sta­tus in­to a hu­man-read­able thing like "Sync is up­-­to-­date". So we store that in self­._last_s­ta­tus and up­date the menu.

What menu? Well, a QSys­tem­Tray­I­con's con­text menu! What you have read are the main pieces you need to cre­ate some­thing use­ful: a Ubun­tu One tray app you can use in KDE, XFCE or open­box. Or, if you are on uni­ty and in­stall sni-qt, a Ubun­tu One app in­di­ca­tor!

http://ubuntuone.com/7iXTbysoMM9PIUS9Ai4TNn

My Ubun­tu One in­di­ca­tor in ac­tion.

You can find the source code for the whole ex­am­ple app at my u1-­toys project in launch­pad and here is the full source code (miss­ing some icon re­sources, just get the re­po)

Com­ing soon(ish), more ex­am­ple app­s, and cool things to do with our APIs!


Contents © 2000-2024 Roberto Alsina