I do quite a bit of coding (it’s part of my job, after all). Although I’ve worked and played with a dozen or two programming languages, the one I’ve used the most is Python (you might also want to check out some of my writings on the subject).

Perhaps one of the most useful pieces of code I’ve written (in my opinion) is the tiny mpub script, used to temporarily make files public on a web server. If you don’t see any practical use for that then, well, perhaps you just don’t need it as much as me.

Python snippets

Some random (mostly rather old) code snippets that might be of semi-general interest.

A Naive Bayesian Classifier

This is a tiny implementation of a naive Bayesian classifier. You can train it by adding (with the add method) training “vectors”, that is, iterable objects (all of the same length) that contain hashable objects. Associated with each training vector is a class (another hashable object). After training, you can classify unseen vectors; in other words, you can ask the classifier which class best suits a given vector.

The classifier is available here.
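As a rough illustration of the idea (not the linked script itself), a classifier with this kind of interface could be sketched as follows. The class name `NaiveBayes`, the `classify` method name, and the use of add-one (Laplace) smoothing are assumptions made for this example; only the `add` method is named in the description above.

```python
import math
from collections import defaultdict


class NaiveBayes:
    """Hypothetical minimal naive Bayes classifier over hashable features."""

    def __init__(self):
        self.class_counts = defaultdict(int)    # cls -> number of vectors
        self.feature_counts = defaultdict(int)  # (cls, position, value) -> count
        self.values = defaultdict(set)          # position -> values seen so far

    def add(self, vector, cls):
        """Record one training vector together with its class."""
        self.class_counts[cls] += 1
        for position, value in enumerate(vector):
            self.feature_counts[cls, position, value] += 1
            self.values[position].add(value)

    def classify(self, vector):
        """Return the class with the highest (log) posterior score."""
        total = sum(self.class_counts.values())
        best_class, best_score = None, -math.inf
        for cls, count in self.class_counts.items():
            score = math.log(count / total)  # log prior
            for position, value in enumerate(vector):
                # Add-one smoothing (an assumption of this sketch); the extra
                # +1 leaves room for values never seen during training.
                seen = self.feature_counts.get((cls, position, value), 0)
                choices = len(self.values[position]) + 1
                score += math.log((seen + 1) / (count + choices))
            if score > best_score:
                best_class, best_score = cls, score
        return best_class
```

Training then amounts to calls such as `nb.add(("sunny", "hot"), "no")`, after which `nb.classify(("sunny", "hot"))` returns the best-scoring class.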

A PGM Reader

A tiny library for reading PGM (portable gray map) files into numeric arrays (requires numarray). It’s available here.
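For readers who just want the gist, here is a minimal sketch of a PGM reader. It is not the linked library: the function name `read_pgm` is hypothetical, and since numarray is no longer maintained, this version returns plain nested lists instead of a numeric array. It handles both the plain (P2) and raw (P5) variants.

```python
def read_pgm(data):
    """Parse PGM bytes (P2 or P5) into (rows, maxval); rows is a list of lists."""
    pos = 0

    def next_token():
        nonlocal pos
        # Skip whitespace and '#' comments, which may appear in the header.
        while True:
            while pos < len(data) and data[pos:pos + 1].isspace():
                pos += 1
            if data[pos:pos + 1] == b"#":
                while pos < len(data) and data[pos:pos + 1] != b"\n":
                    pos += 1
            else:
                break
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        return data[start:pos]

    magic = next_token()
    if magic not in (b"P2", b"P5"):
        raise ValueError("not a PGM file")
    width, height, maxval = (int(next_token()) for _ in range(3))

    if magic == b"P2":
        # Plain format: pixel values are ASCII decimal numbers.
        pixels = [int(next_token()) for _ in range(width * height)]
    else:
        # Raw format: one whitespace byte, then 1 or 2 bytes per pixel.
        pos += 1
        size = 1 if maxval < 256 else 2
        raw = data[pos:pos + width * height * size]
        if len(raw) != width * height * size:
            raise ValueError("truncated pixel data")
        pixels = [int.from_bytes(raw[i:i + size], "big")
                  for i in range(0, len(raw), size)]

    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return rows, maxval
```

Reading a file is then a matter of `rows, maxval = read_pgm(open(path, "rb").read())`, where `path` is whatever file you have at hand.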

The Towers of Hanoi

I just had to do this one… The puzzle is well-known (a Web search should give you plenty of info), and it is easily solved with recursion. Take a look at the source here.
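The recursive idea fits in a few lines. The following is a sketch rather than the linked source; the generator name `hanoi` and the peg labels are arbitrary choices for this example. To move n discs, move n-1 discs out of the way, move the largest disc, then move the n-1 discs back on top of it.

```python
def hanoi(n, source="A", target="C", spare="B"):
    """Yield (from_peg, to_peg) moves that transfer n discs from source to target."""
    if n == 0:
        return
    # Park the n-1 smaller discs on the spare peg.
    yield from hanoi(n - 1, source, spare, target)
    # Move the largest disc to its destination.
    yield (source, target)
    # Bring the smaller discs back on top of it.
    yield from hanoi(n - 1, spare, target, source)


if __name__ == "__main__":
    for origin, destination in hanoi(3):
        print(f"move a disc from {origin} to {destination}")
```

A stack of n discs takes 2**n - 1 moves this way.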

A simple but powerful secret key encryption algorithm. The script is based on an original by Ka-Ping Yee, which implemented the CipherSaber-1 algorithm. In this script I’ve changed it to comply with the CipherSaber-2 algorithm, to encode the data in a simple form of “ASCII armour”, and to use getpass to read the password, so that it won’t end up in your UNIX history file. The script can be found here.
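For orientation, here is a minimal sketch of the core of CipherSaber-2 as it is commonly described: RC4 keyed with the passphrase followed by a random 10-byte initialisation vector, with the key-scheduling loop repeated a number of times. This is not the script mentioned above; the function names, the default of 20 rounds, and the decision to carry the index `j` over between rounds are assumptions of this example, and the ASCII armour and getpass handling are left out.

```python
import os


def key_schedule(key, rounds):
    """RC4 key setup, with the mixing loop repeated `rounds` times."""
    state = list(range(256))
    j = 0  # assumption: j is not reset between rounds
    for _ in range(rounds):
        for i in range(256):
            j = (j + state[i] + key[i % len(key)]) % 256
            state[i], state[j] = state[j], state[i]
    return state


def rc4_crypt(key, data, rounds=1):
    """XOR data with the RC4 keystream; the same call encrypts and decrypts."""
    state = key_schedule(key, rounds)
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


def encrypt(passphrase, plaintext, rounds=20):
    """Prepend a fresh 10-byte IV and encrypt under passphrase + IV."""
    iv = os.urandom(10)
    return iv + rc4_crypt(passphrase + iv, plaintext, rounds)


def decrypt(passphrase, message, rounds=20):
    """Split off the 10-byte IV and decrypt the rest."""
    iv, body = message[:10], message[10:]
    return rc4_crypt(passphrase + iv, body, rounds)
```

With `rounds=1` the key setup is plain RC4, which corresponds to CipherSaber-1. Both sides must agree on the passphrase and on the number of rounds.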

Algorithms in Python

Just a couple of standard algorithms: Levenshtein distance and the Bellman-Ford algorithm.
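Both algorithms are short enough to sketch here. These are textbook versions written for illustration, not the linked code; the function names and the edge-list representation `(u, v, weight)` are choices made for this example.

```python
import math


def levenshtein(a, b):
    """Edit distance between sequences a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, item_a in enumerate(a, 1):
        current = [i]
        for j, item_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                        # deletion
                current[j - 1] + 1,                     # insertion
                previous[j - 1] + (item_a != item_b),   # substitution or match
            ))
        previous = current
    return previous[-1]


def bellman_ford(edges, num_nodes, source):
    """Shortest distances from source; edges are (u, v, weight) triples.

    Nodes are numbered 0..num_nodes-1. Raises ValueError if a negative
    cycle is reachable from the source.
    """
    dist = [math.inf] * num_nodes
    dist[source] = 0
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
                changed = True
        if not changed:
            break  # converged early
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            raise ValueError("graph contains a negative cycle")
    return dist
```

Unlike Dijkstra's algorithm, Bellman-Ford tolerates negative edge weights, at the cost of relaxing every edge up to `num_nodes - 1` times.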

Parser for Minimal XML

Minimal XML is a subset of XML 1.0 which leaves out several non-essential features, such as attributes, mixed content, empty tags, etc. A preliminary specification can be found here.

I’ve made a small parser for Minimal XML (or Simple Markup Language, as it’s called). It basically assumes that the input is valid SML, and does little error checking. The resulting tree structure is a tuple (tag, kids), where the tag is a string (the tag name, or nodeName) while kids is a list of subtrees, each of the same form. Character data are stored in nodes of the form (None, data) where data is a string.
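To make the tree format concrete, here is a small sketch of a parser that produces the same (tag, kids) structure. It is not the parser described above: the regular expression, the function name `parse`, and the choice to drop whitespace-only character data are assumptions of this example, and entity references are left untouched.

```python
import re

# Either a tag (<name> or </name>) or a run of character data.
TOKEN = re.compile(r"<(/?)([^<>/\s]+)>|([^<]+)")


def parse(text):
    """Parse tag-only markup into nested (tag, kids) tuples.

    Character data become (None, data) nodes.
    """
    stack = [(None, [])]  # sentinel node that collects the document element
    for match in TOKEN.finditer(text):
        closing, name, data = match.groups()
        if data is not None:
            if data.strip():  # assumption: ignore whitespace between elements
                stack[-1][1].append((None, data))
        elif closing:
            node = stack.pop()
            if node[0] != name:
                raise ValueError(f"unexpected closing tag </{name}>")
        else:
            node = (name, [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1 or len(stack[0][1]) != 1:
        raise ValueError("expected exactly one complete document element")
    return stack[0][1][0]
```

For the input `<a><b>hi</b><c>there</c></a>`, this yields `('a', [('b', [(None, 'hi')]), ('c', [(None, 'there')])])`.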

“Detect” in Python

“Detect” is a control structure invented by Arne Halaas in the seventies. I have been persuaded by him that it is useful, and, although I have never actually used it for anything useful, I have implemented a version of it for Python. If nothing else, it might be an interesting example of Python source code.

A minimal CGI publisher

A minimal CGI publisher for Python scripts. The usage is described in the doc string. It can be used to turn standard scripts into CGI scripts by simply importing the module and appending a single function call at the end of the script. A simple example of its use can be found here. (This was an experiment, to see if it was possible to make a really tiny publisher that would still be useful. For a more complete publisher, see the ZPublisher component of Zope, previously known as “Bobo”.)

Self-Printing One-Liner

If you run this one-liner at a command prompt, it should print out a copy of itself (write it as one continuous line, without the line break; it uses the print statement, so it needs Python 2):

python -c "x='python -c %sx=%s; print x%%(chr(34),repr(x),chr(34))%s';
print x%(chr(34),repr(x),chr(34))"

Not very useful, but kinda fun… I just saw some other self-printing programs and thought it would be interesting to make a one-liner version.

Sleepcat — cat with a delay

This is a little program that emulates the UNIX command cat, except that a delay (“sleep”) is added after printing out each line. I don’t really remember why I wrote this, but I’m sure it might be useful for reading large files without moving a muscle :)

Usage: sleepcat [ -t secs ] [ file1 [file2 ... ] ]
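A program with this usage line might be sketched as below. This is an illustration rather than the original script: the function names, the use of argparse, and the default delay of one second are all assumptions.

```python
import argparse
import sys
import time


def sleepcat(lines, delay, out=sys.stdout):
    """Write each line to out, pausing `delay` seconds after every line."""
    for line in lines:
        out.write(line)
        out.flush()  # make the line visible before sleeping
        time.sleep(delay)


def main(argv=None):
    parser = argparse.ArgumentParser(prog="sleepcat")
    # The default delay is an assumption of this sketch.
    parser.add_argument("-t", dest="secs", type=float, default=1.0,
                        help="seconds to sleep after each line")
    parser.add_argument("files", nargs="*",
                        help="files to print (standard input if none)")
    args = parser.parse_args(argv)

    if not args.files:
        sleepcat(sys.stdin, args.secs)
    for name in args.files:
        with open(name) as handle:
            sleepcat(handle, args.secs)
```

Calling `main()` at the end of a script gives the command-line behaviour; like cat, it reads standard input when no files are named.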

Old projects


I was one of the original creators of Piddle, a generic, multi-platform drawing toolkit for Python. I originally implemented the PostScript backend.

Anygui — a Generic GUI Module

Anygui was an attempt at creating a thin layer of abstraction on top of several available (and some as yet unavailable) packages for creating graphical user interfaces in Python. The point was that it should be possible to write GUI programs in Python without worrying about which GUI packages the user has installed. The name was inspired by the standard Python library module anydbm.


Atox is a toy implementation of a rather generic text-to-XML transformation system.