Posts about programming (old posts, page 38)

Creating Languages For Dummies

Intro

I don't have the usual programmer's education. I studied maths, then dropped out of that, and am mostly self-taught. So, there are some parts of programming I always viewed warily, thinking to myself that I really should go to school to learn them. One notable such area is parsing and implementing languages.

Well... sure, school is always a good idea, but this is not that hard. In this article I will explain how to go from nothing to a functioning, extensible language, using Python and PyParsing. If you are as scared of grammars, parsers and all that jazz as I used to be, come along, it's pretty simple.

FLOSS Decision Making in Action

If you are reading this there is a good chance you are involved somehow in open source development, or software development in general. One thing lots of people ask me when they learn I have led this sort of project for a long time is "how do you decide things?". To which I have all sorts of bad answers like:

  • "It's a consensus thing"
  • "It happens organically"
  • "Sometimes it just happens"
  • "Anarchy!"
  • "You do what you do"

So, now here I have an AWESOME example of FLOSS decision making in action, which is ... all of the above.

Some context: Nikola is a static site generator, so it deals with reading and writing textual data from disk. It's also an internationalized project, which supports multilingual sites and translated data. It also runs on multiple platforms, like Windows, OS X, Linux, etc.

And to make that more fun, it also works on Python 2.7, and 3.3 or later. Which means it has to handle two different models of how to work with Unicode data, in the same codebase. And that's not fun. So, we have been floating around the idea of deprecating Python 2.7. And so, when s2hc_johan walks in with a Unicode problem...

14:23:16 <s2hc_johan> I don't have a site with sections, but I tested it for the other case
14:35:42 <s2hc_johan> strange it worked for a while broken again, probably because I've got åäö in it now.
14:35:45 <s2hc_johan> https://github.com/getnikola/plugins/blob/master/v7/recent_posts_json/recent_posts_json.py#L134
14:36:17 <s2hc_johan> if you wrap data with unicode it works, but I'm not sure that works in python3
14:36:37 <ChrisWarrick> s2hc_johan: how do you wrap it with unicode?
14:36:48 <s2hc_johan> unicode(data)
14:37:05 <s2hc_johan> but is that valid in  python3?
14:37:11 <ChrisWarrick> s2hc_johan: this is wrong on so many levels
14:37:16 <ChrisWarrick> s2hc_johan: please don’t do that, ever
14:37:48 <ChrisWarrick> s2hc_johan: This won’t work in Python 3 either.  You must have an actual encoding, and use the decode method.   try: foo = foo.decode('utf-8'); except AttributeError: foo = foo  # python 3
14:38:02 <s2hc_johan> what do you mean, that is like my standard when I get strnage data in, undoce(data) data.encode(whatever) data.decode(whatever) :)
14:38:23 <s2hc_johan> one of them ussually work
14:39:22 <ChrisWarrick> s2hc_johan: unicode() assumes ASCII, it never works right
14:39:32 <s2hc_johan> true
14:39:40 <ChrisWarrick> s2hc_johan: encode/decode with a specified encoding is fine
14:40:00 <ChrisWarrick> s2hc_johan: but you might need a try/except for Python 3 if it could have Unicode data already
14:40:16 <s2hc_johan> I'm a bit confused in this case since the output comes from json.dumps
14:40:34 <s2hc_johan> thought that would produce a unicode object
14:40:51 <ChrisWarrick> s2hc_johan: not necessarily on python 2
14:41:05 <ralsina_> if isinstance(thing, utils.str_bytes): thing=thing.decode('utf8')
14:41:15 <ralsina_> that works in py2 and py3
14:42:12 <ChrisWarrick> easier to ask for forgiveness imo
14:43:07 <ralsina_> maybe we should have helpers in utils enforce_unicode and enforce_bytes
14:43:13 -GitHub[nikola]:#nikola- [nikola] Aeyoun pushed 1 new commit to feed-previewimage: http://git.io/vnqek
14:43:13 -GitHub[nikola]:#nikola- nikola/feed-previewimage 4b79e20 Daniel Aleksandersen: Deprecated RSS_READ_MORE_LINK and RSS_LINKS_APPEND_QUERY...
14:44:58 <Aeyoun> Or upgrade to Py3.
14:45:11 <ChrisWarrick> ++
14:45:47 <Aeyoun> Unicode in Py27 is a nightmare. It tries as hard as it can to kill you at every turn.
14:48:09 -travis-ci:#nikola- getnikola/nikola#6426 (feed-previewimage - 4b79e20 : Daniel Aleksandersen): The build is still failing.
14:48:10 -travis-ci:#nikola- Change view: https://github.com/getnikola/nikola/compare/c4c69c02db34...4b79e20d1ebc
14:48:10 -travis-ci:#nikola- Build details: https://travis-ci.org/getnikola/nikola/builds/81026762
14:48:27 <ralsina_> ok, let's consider py3-only seriously.
14:48:40 <ralsina_> 1) Is there any distro commonly used with py3 < 3.3 ?
14:48:55 <ralsina_> 2) Do we just stop using py2, or we deprecate slowly?
14:49:15 <ralsina_> 3) Do we just start doing py3-only code, or we actively de-hack the codebase?
14:49:21 <ralsina_> That's my 3 questions :-)
14:50:13 <SteveDrees> Unicode is a nightmare
14:50:53 <SteveDrees> different python versions just changes where the pain point is
14:50:53 <s2hc_johan> which one is better isinstance... or hasattr('decode', ..)
14:51:02 <ralsina_> isinstance
14:51:08 <s2hc_johan> oki then
14:51:10 <ralsina_> hasattr is evil in itself
14:51:26 <s2hc_johan> just going to feed the kids then I'll make another pr
14:51:28 -GitHub[nikola]:#nikola- [nikola] Aeyoun pushed 1 new commit to feed-previewimage: http://git.io/vnqJ2
14:51:28 -GitHub[nikola]:#nikola- nikola/feed-previewimage 4c950ac Daniel Aleksandersen: flake8
14:52:13 <Aeyoun> ralsina_: user survey? pip download data?
14:52:33 <gour> ralsina_: create some poll at website/mailing-list about it?
14:53:18 <ralsina_> dude, I offered free shirts and I got only 10 requests ;-)
14:53:30 <ralsina_> so, how many answers do you expect about that sort of thing?
14:53:43 * gour thought shirts are jsut for devs :-(
14:53:47 <Aeyoun> ralsina_: release a unchanged version on pip that is flagged as py3 only. see how many downlaod it versus previous version in same amount of time.
14:53:51 <ralsina_> gour: go add yourself dude
14:54:18 <ralsina_> gour: TO THE SHIRT LIST! I just notced that sounded very rude :-)
14:54:43 <gour> ralsina_: where it is?
14:54:43 <Aeyoun> ralsina_: or one py27 version number and and one version py3 only version number at the same time.
14:55:17 <ralsina_> gour: https://docs.google.com/forms/d/18YFwdgukmpkjr5b8FGEKL0arxPePuLHNsuEa-Gl80D8/viewform?c=0&w=1
14:55:17 <gour> found it
14:56:00 <gour> ralsina_: wonder if xxl is too large or xl is enough
14:56:00 <Aeyoun> ralsina_: american or european sizes by the by?
14:56:03 <ralsina_> Aeyoun: that reflects how many people use py2.7 by reflex. I know *i* do because it's "python" and not "python3"
14:56:20 <ralsina_> Aeyoun: no idea about sizes to be honest... probably american
14:56:21 <Aeyoun> American sizes are … a big bigger. I’m probably a XS/S american but M european. :P
14:56:28 <Aeyoun> *bit bigger
14:56:39 <gour> ok
14:56:57 * gour submitted request
14:57:17 <ralsina_> So, what I would prefer to do is make people use py3 if they can. And it seems to me that pretty much everyone can, regardless of whether they still use py2 by defect.
14:57:26 <ralsina_> by default*, spanishism leaked there.
14:57:52 <ChrisWarrick> technically, using py2 is a defect
14:57:59 <ralsina_> So, if we all agree that most users *could* run nikola in py3... then let's do it.
14:58:02 <Aeyoun> Agreed.
14:58:15 <gour> sites won't stop working :-)
14:58:26 <Aeyoun> ralsina_: act on data not dev agreement?
14:58:42 <ChrisWarrick> guess we could change our docs/webiste to highlight 3.x
14:58:59 <ralsina_> Aeyoun: the only data we'd need is to know how many people have py2.7 and no py3.3
14:59:14 <ralsina_> not how many are *using* 2.7 instead of 3.3
14:59:38 <ChrisWarrick> micro-survey via ml?
14:59:39 <ralsina_> How about: let's announce that, unless lots of people complaint, we deprecate py2 by end of october
14:59:45 -travis-ci:#nikola- getnikola/nikola#6429 (feed-previewimage - 4c950ac : Daniel Aleksandersen): The build was fixed.
14:59:46 -travis-ci:#nikola- Change view: https://github.com/getnikola/nikola/compare/4b79e20d1ebc...4c950ac5e52e
14:59:46 -travis-ci:#nikola- Build details: https://travis-ci.org/getnikola/nikola/builds/81028389
14:59:47 <Aeyoun> Mac is shipping with Py2.7 and no Py3. BUT MacPorts and Homebrew offer painfree Py3 installs.
14:59:58 <ralsina_> ok, mac is a good point
15:00:25 <ChrisWarrick> it’s not like we have Homebrew/MacPorts/Fink-based install instructions for them…
15:00:27 <Aeyoun> ralsina_: we could add a deprecation message every time `nikola` is run and ask people to bitch in a bug?
15:00:32 <Aeyoun> ChrisWarrick: hehe. ;)
15:00:50 <ralsina_> "I see you have python3 installed but I am running on 2.7 ... dude, what's wrong with you?"
15:00:51 <Aeyoun> Or maybe once per 24 hour rather  than every time its run.
15:01:00 <ralsina_> doit timed tasks :-)
15:01:12 <Aeyoun> ralsina_: "Don’t get in the way of progress! Upgrade to Py3 and save a developer’s mind today!"
15:01:32 <ralsina_> "niec unicode you have there, would be a shame something happened to it.. switch to python 3!"
15:01:39 <ChrisWarrick> ralsina_: hey, let’s start with a Google Docs survey on the ML.  One question: what Python version and OS are you using for Nikola? 2.7/3.3/3.4/3.5; Windows/OS X/[other: linux/bsd distro]
15:01:57 <gour> "Free t-shirt foreveryone switching from py2.7 to py3.3"
15:01:58 <ChrisWarrick> ralsina_: Just don’t require a Google account like you did last time.
15:02:00 <ralsina_> Second question: "Do you have python 3.3 or later installed?"
15:02:03 <Aeyoun> How much code can be removed with dropping Py27? Lowers maintenance cost and increases performance. That is also an important datapoint.
15:02:11 <ralsina_> ChrisWarrick: I needed to know who was asking for the shirt :-)
15:02:21 <ChrisWarrick> ralsina_: good point
15:02:25 <ralsina_> Aeyoun: not all that much, really
15:02:47 <ChrisWarrick> Aeyoun: it would need to start with a huge rewrite to remove all of our pointers in nikola.utils
15:03:00 <ralsina_> Aeyoun: there are a number of tiny hacks, which were a pain to get right but they always amount to one if and/or one decode :-)
15:03:26 <ralsina_> We can just turn a bunch of helpers in utils into noops
15:04:52 <gour> py3-only nikola is going to become v8?
15:05:15 <Aeyoun> gour: seems like a likely outcome. you’re following the discussion live.
15:06:34 <ChrisWarrick> if we do v8, we’ll have to merge the early tasks garbage
15:07:03 <ralsina_> Is it technically backwards-incompatible if we just stop working on py2.7?
15:07:21 <ralsina_> gour: welcome to open source software: behind the code.
15:07:30 <gour> ralsina_: :-)
15:07:35 <Aeyoun> Someone call in a documentary crew!
15:07:43 <ralsina_> Aeyoun: we have logs!
15:07:51 <Aeyoun> Oh, wait. This is already logged for prosperity.
15:07:57 <ralsina_> I am totally posting this somewhere as "this is how decisions are made in FLOSS"
15:08:40 <ralsina_> Ok, who creates the poll and who posts it in the blog, and who makes sure it appears on planet, and who sends it to the list?
15:08:49 <ralsina_> I would do it but I have work to do :)
15:08:51 <ChrisWarrick> ralsina_: I’ll do it
15:08:57 <ralsina_> ChrisWarrick: you rock dude!
15:09:01 <ChrisWarrick> ralsina_: should be really simple
15:09:03 <ralsina_> Ok, we have a plan!
15:09:17 <ralsina_> Let's consider the poll results in ... a week?
15:09:25 <Aeyoun> Let the logs show we’re all in favor of this plan of action. ;-)
15:09:29 <ralsina_> aye
15:09:51 <ralsina_> Also: can I do the "shame on you" thing on nikola build? It sounds like fun :-)
15:10:27 <ChrisWarrick> ralsina_: for the python version question: radiobox vs checkbox?
15:10:28 <gour> ralsina_: you can mention that Nikola (Tesla) was always for innovation ;)
15:10:44 <Aeyoun> "You’re using FIVE YEAR OLD SOFTWARE. Update your system."
15:11:00 <ralsina_> Aeyoun: I am totally getting at least 5 different comments there
15:11:01 <Aeyoun> https://en.wikipedia.org/wiki/History_of_Python#Version_release_dates
15:11:05 <ralsina_> ChrisWarrick: checkbox... maybe 2?
15:11:23 <ralsina_> ChrisWarrick: one for python version, one for operating system
15:11:32 <ChrisWarrick> ralsina_: ?
15:11:38 <ralsina_> ChrisWarrick: two questions
15:11:54 <ChrisWarrick> ralsina_: there will even be three questions (py2/3 used, OS, has py3)
15:11:57 <ChrisWarrick> ralsina_: and checkboxes it is
15:12:02 <ralsina_> right
15:12:05 <ralsina_> awesome
15:14:44 <ralsina_> Copied / Pasted for posterity

There you go: half an hour later, we have a plan to (maybe) deprecate Python 2.7.

Now go vote here: Should Nikola support python2.7? Give us data to decide!
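For reference, the isinstance approach from the log, wrapped in the two helpers ralsina_ floats there, could look roughly like this. It is only a sketch: enforce_unicode and enforce_bytes are just the names mentioned in the chat, not existing Nikola functions, and UTF-8 as the default encoding is an assumption.

```python
def enforce_unicode(thing, encoding="utf-8"):
    """Return text: decode byte strings, pass everything else through."""
    # On Python 2 `bytes` is an alias of `str`; on Python 3 it is the bytes type.
    if isinstance(thing, bytes):
        return thing.decode(encoding)
    return thing


def enforce_bytes(thing, encoding="utf-8"):
    """Return bytes: encode text, pass byte strings through."""
    if isinstance(thing, bytes):
        return thing
    return thing.encode(encoding)


raw = b"\xc3\xa5\xc3\xa4\xc3\xb6"  # the UTF-8 bytes for the three letters in the log
text = enforce_unicode(raw)
assert enforce_unicode(text) == text  # already text: returned unchanged
assert enforce_bytes(text) == raw     # and back again
```

The same code runs on 2.7 and 3.x, which is the whole point of checking the type instead of trying decode() and catching the exception.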

Javascript Makes Me Cry: Turning a Date into a String

Working late last night in Alva I wanted to do something that sounded trivial:

When the page loads, get the current date and time, and if a certain input is empty, put it there like this:

28/05/2013 23:45

So, how hard can that be, right? Well not hard, but...

Getting the current date-time is easy: now = new Date(); So, is there something like strftime in Javascript? Of course not. You can get code from the usual places and have an untested, perhaps broken, limited version of it. And I am not about to add a strftime implementation to use it once. Sure, there are a number of Date methods that convert to strings, but none of them lets you specify the output format. So, let's try to do this The Javascript Way, right?

To get the elements I want to put in the value, I used accessor methods. So, obviously, these should give me what I want for the string, right?

now.getDay(), now.getMonth(), now.getYear(), now.getHour(), now.getMinute()

Well, they are, at the date mentioned above, respectively: 2, 4, 113, error, error

Ok, the errors are easy to fix from the docs. It's actually getHours() and getMinutes(), so now we have 2, 4, 113, 23, 45 and of those five things, the last two are what one would expect, at least. Let's go over the other three and see why they are so weird:

Date.getDay() returned 2 instead of 28
Because getDay() gives you the week day and not the day of the month. Which is absolutely idiotic. So, you have to use getDate() instead. Which means the name is a lie, because the logical thing for getDate() to return is the whole date.
Date.getMonth() returned 4 instead of 5
Because getMonth() returns months in the [0,11] range. Which is beyond idiotic and bordering on evil. Come on, Javascript, people have been referring to May as "5" for nearly two thousand years now! What other language does this? Does anyone know one?
Date.getYear() returned 113 instead of 2013
Because it uses offset-from-1900. Which is amazing, and I had never heard of a language doing this in a standard type. Because why? So, use getFullYear() instead.

Now, armed with the right 5 numbers, let's format it. Does Javascript have the equivalent of sprintf or format ? Of course not. In JavaScript, without 3rd party modules, you create strings by addition, like a caveman. Again, I know I could add a format method to the String prototype and make this work, but I am not adding an implementation of format or sprintf just to use it once!

So, this produces what I want:

now.getDate()+'/'+(now.getMonth()+1)+'/'+now.getFullYear()+' '+now.getHours()+':'+now.getMinutes()

Unless... the day, month, hour or minute is lower than 10, in which case it's missing the left-padding zero. Luckily, for the purpose I was using it, it worked anyway. Because OF COURSE there's no included function to left-pad a string. You have to do it by addition. Or, of course, add a 3rd party function that's out there, on the internet, somewhere.
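For contrast, this is the kind of thing I was wishing for. It is Python, not JavaScript, and only here to show what a strftime-style format string buys you; the fixed date is just the example from above.

```python
from datetime import datetime

now = datetime(2013, 5, 28, 23, 45)  # fixed date so the result is reproducible

# %d, %m, %H and %M are zero-padded; %Y is the four-digit year.
stamp = now.strftime("%d/%m/%Y %H:%M")
print(stamp)  # 28/05/2013 23:45

# Day of month, 1-based month and full year come straight from the attributes.
print(now.day, now.month, now.year)  # 28 5 2013
```

One format string covers both the field selection and the left-padding.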

Nothing Ever Really Goes Away On The Internet: ra-plugins

I used to manage a large number of QMail installations. And because Qmail was ... weirdly licensed, I wrote a set of plugins that ran on top of a patch called Qmail-SPP. I pretty much stopped doing that years ago because life took me in other directions, and forgot all about it.

That collection is called ra-plugins and I had not touched it since late 2008.

And today... I got a patch with two whole plugins to add to it so that it makes Qmail handle email addresses more like Gmail does (aliases using user+foo and making user.foo the same as userfoo).

So, I got them, added them, fixed a few simple building issues, updated the libsmtp it uses internally for one of the plugins to a later version, and there it stays, perhaps not to be touched until 2018.

Qt Mac Tips

My team has been working on porting some PyQt stuff to Mac OSX, and we have run into several Qt bugs, sadly. Here are two, and the workarounds we found.

Native dialogs are broken.

Using QFileDialog.getExistingDirectory we noticed the following symptoms:

  • If you do nothing, the dialog goes away on its own after about 20 seconds.
  • After you use it once, it may pop up and disappear immediately. Or not.

Solution: use the DontUseNativeDialog option.
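A minimal sketch of that workaround follows. The helper name pick_directory and its defaults are made up for illustration, the import assumes PyQt5 (with PyQt4 the class lives in PyQt4.QtGui), and the dialog_cls parameter exists only so the function can be exercised without Qt installed.

```python
def pick_directory(parent=None, caption="Select a folder", start_dir="",
                   dialog_cls=None):
    """Ask for a directory using Qt's own dialog instead of the native one."""
    if dialog_cls is None:
        # Assumption: PyQt5 is the binding in use.
        from PyQt5.QtWidgets import QFileDialog as dialog_cls
    # ShowDirsOnly is the usual default; DontUseNativeDialog is the workaround.
    options = dialog_cls.ShowDirsOnly | dialog_cls.DontUseNativeDialog
    return dialog_cls.getExistingDirectory(parent, caption, start_dir, options)
```

Calling it for real still needs a running QApplication, like any other Qt dialog.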

Widgets in QTreeWidgetItems don't scroll.

When you use Widgets inside the items of a QTreeWidget (which I know, is not a common case, but hey, it happens), the widgets don't scroll with the items.

Solution: use the -graphicssystem raster option. You can even inject it into argv if the platform is darwin.
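Injecting the option could look something like this. The function name qt_argv is hypothetical, and as far as I know -graphicssystem is a Qt 4 command line option, so treat this as a sketch for that version.

```python
import sys


def qt_argv(argv, platform=sys.platform):
    """Return a copy of argv with the raster graphics system forced on Mac."""
    args = list(argv)  # never modify the caller's list
    if platform == "darwin" and "-graphicssystem" not in args:
        args.extend(["-graphicssystem", "raster"])
    return args


# Intended use (needs PyQt): app = QApplication(qt_argv(sys.argv))
print(qt_argv(["myapp"], platform="darwin"))
# ['myapp', '-graphicssystem', 'raster']
```

On any other platform, or if the user already passed the option, the arguments come back unchanged.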

Driving a Nail With a Shoe I: Do-Sheet

I had proposed a talk for PyCon Argentina called "Driving 3 Nails with a Shoe". I know, the title is silly, but the idea was showing how to do things using the wrong tool, intentionally. Why? Because:

  1. It makes you think differently
  2. It's fun

The bad side is, of course, that this talk's contents have to be a secret, or else the fun is spoiled for everyone. Since the review process for PyConAr talks is public, there was no way to explain what this was about.

And since that means the reviewers basically have to take my word for this being a good thing to have at a conference, which is unfair, I deleted the proposal. The good (maybe) news is that now everyone will see what those ideas I had were about. And here is nail number 1: Writing a spreadsheet using doit.

This is not my first "spreadsheet". It all started a long, long time ago with a famous recipe by Raymond Hettinger which I used again and again and again (I may even be missing some post there).

Ever since I started using doit for Nikola I have been impressed by the power it gives you. In short, doit lets you create tasks, and those tasks can depend on other tasks, and operate on data, and provide results for other tasks, etc.

See where this is going?

So, here's the code, with explanations:

cells is our spreadsheet. You can put anything there, just always use "cellname=formula" format, and the formula must be valid Python, ok?

from tokenize import generate_tokens

cells = ["A1=A3+A2", "A2=2", "A3=4"]
values = {}

task_calculate creates a task for each cell, called calculate:CELLNAME. The "action" to be performed by that task is evaluating the formula. But in order to do that successfully, we need to know what other cells have to be evaluated first!

This is implemented using doit's calculated dependencies by asking doit to run the task "get_dep:FORMULA" for this cell's formula.

def evaluate(name, formula):
    value = eval(formula, values)
    values[name] = value
    print("%s = %s" % (name, value))  # print() form works on Python 2 and 3

def task_calculate():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name':name,
            'calc_dep': ['get_dep:%s' % formula],
            'actions': [(evaluate, (name, formula))],
            }

For example, in our test sheet, A1 depends on A3 and A2 but those depend on no other cells. To figure this out, I will use the tokenize module, and just remember what things are "names". More sophisticated approaches exist.

The task_get_dep function is a doit task that will create a task called "get_dep:FORMULA" for every formula in cells.

What get_dep returns is a list of doit tasks. For our A1 cell, that would be ["calculate:A2", "calculate:A3"] meaning that to calculate A1 you need to perform those tasks first.

def get_dep(formula):
    """Given a formula, return the names of the cells referenced."""
    deps = {}
    try:
        for token in generate_tokens([formula].pop):
            if token[0] == 1:  # 1 is tokenize.NAME: an identifier, i.e. a cell name
                deps[token[1]] = None
    except IndexError:
        # It's ok
        pass
    return {
        'result_dep': ['calculate:%s' % key for key in deps.keys()]
        }

def task_get_dep():
    for cell in cells:
        name, formula = cell.split('=')
        yield {
            'name': formula,
            'actions': [(get_dep, (formula,))],
            }
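As an aside, the name-extraction step can be tried on its own, outside doit. This is a Python 3 variant, not the code above: it reads the formula through io.StringIO instead of the [formula].pop trick, and the function name referenced_names is made up for the example.

```python
import io
import token
import tokenize


def referenced_names(formula):
    """Return the identifiers used in a formula, in order of first appearance."""
    names = []
    readline = io.StringIO(formula).readline
    for tok in tokenize.generate_tokens(readline):
        # Keywords and function names are NAME tokens too, so a real
        # spreadsheet would still have to filter against the known cells.
        if tok.type == token.NAME and tok.string not in names:
            names.append(tok.string)
    return names


print(referenced_names("A3+A2"))  # ['A3', 'A2']
print(referenced_names("4"))      # []
```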

And that's it. Let's see it in action. You can get your own copy here and try it out by installing doit, editing cells and then running it like this:

~/dosheet$ doit -v2 calculate:A3
.  get_dep:4
{}
.  calculate:A3
A3 = 4
~/dosheet$ doit -v2 calculate:A2
.  get_dep:2
{}
.  calculate:A2
A2 = 2
~/dosheet$ doit -v2 calculate:A1
.  get_dep:A3+A2
{'A3': None, 'A2': None}
.  get_dep:4
{}
.  calculate:A3
A3 = 4
.  get_dep:2
{}
.  calculate:A2
A2 = 2
.  calculate:A1
A1 = 6

As you can see, it always does the minimum amount of effort to calculate the desired result. If you are so inclined, there are some things that could be improved, which I am leaving as an exercise for the reader, for example:

  1. Use uptodate to avoid recalculating dependencies.
  2. Get rid of the global values and use doit's computed values instead.
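If you want to see the "minimum amount of effort" idea without doit in the way, here is a rough plain-Python sketch. It is not how doit works internally: it is a recursive evaluator that finds dependencies through the compiled expression's co_names instead of tokenize. The B1 cell and the function names are made up for the example, and there is no cycle detection.

```python
CELLS = {"A1": "A3+A2", "A2": "2", "A3": "4", "B1": "A2*10"}  # B1 is an extra, unused cell


def names_in(formula):
    """Names an expression refers to, taken from its compiled code object."""
    return compile(formula, "<cell>", "eval").co_names


def calculate(name, cells, values):
    """Compute one cell, computing only the cells it needs, each at most once."""
    if name not in values:
        formula = cells[name]
        for dep in names_in(formula):
            if dep in cells:  # ignore names that are not cells
                calculate(dep, cells, values)
        # No builtins are exposed to the formula (this is not a security sandbox).
        values[name] = eval(formula, {"__builtins__": {}}, values)
    return values[name]


values = {}
print(calculate("A1", CELLS, values))  # 6
print(sorted(values))                  # ['A1', 'A2', 'A3'], B1 was never touched
```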

Here is the full listing, enjoy!

Your Editor is Not the Bottleneck

This may cause some palpitations in some friends of mine who laugh at me for using kwrite, but it really is not. Any time you spend configuring, choosing, adjusting, tweaking, changing, improving, patching or getting used to your editor is time invested, for which you need to show a benefit, or else it's time wasted.

Let's look at SLOC, which while discredited as a measure of programmer's productivity, surely does work as a measure of how much a programmer types, right?

Well, estimates of total code production across the lifetime of a product vary (just take the SLOC of the product, divide by man-days spent), but they are usually something between 10 and 100 SLOC per programmer per day. Let's be generous and say 200.

So, 200 lines in eight hours. That's roughly one line every two minutes, and the average line of code is about 45 characters. Since I assume you are a competent typist (if you are not, shame on you!), it takes less than 20 seconds to type that.

So, typing, which is what you often tweak in your editor, takes less than 15% of your time. And how much faster can it get? Can you get a line written in 10 seconds? Then you just saved 8% of your day. And really, does your editor save you half the typing time?
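If you want to check my napkin, here is the same arithmetic spelled out. The inputs are the guesses from above (200 lines, an eight hour day, 20 seconds per line), nothing measured.

```python
WORKDAY_SECONDS = 8 * 60 * 60   # eight hours
LINES_PER_DAY = 200             # the generous estimate
SECONDS_TO_TYPE_LINE = 20       # upper bound for a ~45 character line
FASTER_SECONDS = 10             # the hypothetical tweaked-editor speed

seconds_per_line = WORKDAY_SECONDS / LINES_PER_DAY               # 144.0
typing_share = SECONDS_TO_TYPE_LINE / seconds_per_line
saved_share = (SECONDS_TO_TYPE_LINE - FASTER_SECONDS) / seconds_per_line

print(f"one line every {seconds_per_line:.0f} seconds")  # one line every 144 seconds
print(f"typing takes {typing_share:.1%} of the day")     # typing takes 13.9% of the day
print(f"typing twice as fast saves {saved_share:.1%}")   # typing twice as fast saves 6.9%
```

With the unrounded 144 seconds the saving is closer to 7%; the 8% comes from rounding to one line every two minutes (10 / 120).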

How much time do you lose having your eyes wander over the sidebars, buttons, toolbars, etc?

So while yes, typing faster and more efficiently is an optimization, it may also be premature, in that, what the hell are we doing the other 80% of the time? Isn't there something we can do to make that huge chunk of time more efficient instead of the smaller chunk?

Well, I think we spend most of that time doing three things:

  1. Reading code
  2. Thinking about what code to write
  3. Fixing what we wrote in that other 20%

The first is easy: we need better code readers not editors. It's a pity that the main interface we get for looking at code is an editor, with its constant lure towards just changing stuff. I think there is a lost opportunity there somewhere, for an app where you can look at the code in original or interesting ways, so that you understand the code better.

The second is harder, because it's personal. I walk. If you see me walking while my editor is open, I am thinking. After I think, I write. Your mileage may vary.

The third is by far the hardest of the three. For example, autocomplete helps there, because you won't mistype things, which is interesting, but more powerful approaches exist, like constantly running test suites while you edit. Every time you leave a line, trigger the affected parts of the suite.

That's much harder than it sounds, since it means your tools need to correlate your test suite to your code very tightly, so that you will see you are breaking stuff the second you break it, not minutes later.

Also, it should enable you to jump to the test with a keystroke, so that you can fix those tests if you are changing behaviour in your code. And of course it will mean you need tests ;-)

Which brings me to a pet peeve of mine, that editors still treat the file as the unit of work, which makes no sense at all. You never want to edit a file, you want to edit a function, or a class, or a method, or a constant but never a file. Knowing this was the sheer genius of ancient Visual Basic, which was completely ignored by all the snobs looking down at it.

So, instead of tweaking your editor, get me a tool that does what I need please. I have been waiting for it since VB 1.0. And a sandwich.

warning

This post is 99% lies, but I want to hear the arguments against it. If I tell you now it doesn't count as a real lie, I have learned from the financial press ;-)

UPDATE: Interesting discussions in reddit and hacker news