Tom Meets Python, Part 1 - I Am My Avatar
Big Warm Fuzzy Public Heart
boutell
Tom Meets Python, Part 1


I'm writing my first real application in the Python programming language. People say it's the most intuitive language, and that things Just Work Like They Should. That's not entirely true... but I'm impressed. So far.

Now, I'm still in the early stages with Python. So I fully expect to make some uninformed misjudgments here... though any claims I make about how hard it is to figure something out are, I think, very valid, because the reasonably intelligent newcomer is the authority there, not the experienced user.

You'll Never Mistake This for C

From the very beginning, Python has a different look from almost all other languages. The following code is correct in C, and nearly correct in (deep breath) C++, Perl, Java, and C#:

int i;
for (i = 0; (i < 10); i++) {
  printf("foo\n");
}

Python's equivalent is entirely different:

for i in range(10):
  print "foo"

So what's different here? What isn't? To begin with, Python doesn't use { and } or any other characters to indicate a "block" of code that should be run while the loop is in effect. Instead, Python uses indentation.

That bears repeating: white space actually matters in Python. No, you don't have to indent that "print" statement by exactly two spaces, but you must indent it past the "for" statement it resides in, and you must indent any additional statements inside the "for" statement by the same amount... or more, if you have additional loops, "if" statements and so on nested inside the "for" block.
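Here is a small sketch of how nesting looks when indentation does the work of braces. The list name "lines" and the messages are made up for illustration, and "print" is written with parentheses so that the example also runs on newer Python versions.

```python
# Each level of nesting is simply one more level of indentation.
lines = []
for i in range(3):
    if i % 2 == 0:
        lines.append("%d is even" % i)   # inside both the "for" and the "if"
    else:
        lines.append("%d is odd" % i)    # inside both the "for" and the "else"
    lines.append("done with %d" % i)     # inside the "for" only

print("\n".join(lines))
```

Dedenting that last "append" one level is what moves it out of the "if"/"else" and back into the loop body; no closing brace is needed.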

So is this incredibly irritating? Surprisingly, no. Good programmers always, always, always indent code consistently these days anyway, because the wages of not doing so are death... well, okay, total horrific train-wreck confusion when you try to read your code later. Python leverages this and frees us from the irritation of typing { and } and ; all the time. Also, ; is now optional. I like it.

What else is different? Surprisingly, Python has no direct equivalent of C's "for (i = 0; (i < 10); i++)" loop, itself a descendant of the old-school BASIC "for/next" loop. You can get that exact behavior using a "while" loop in Python, but it is more idiomatic for "i" to range through the values in a sequence. And the "range" function is a convenient way to get a sequence from zero up to, but not including, the value passed to "range". Like many things in Python, this seems strange at first, but why should we be worrying about incrementing things? We're people, the computer is a computer. Let the computer worry about dumb minutiae like adding one to i. What we want is for i to range from 0 to 9, and we can say that more elegantly this way.
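To make the comparison concrete, here is a sketch of both spellings side by side. The list names "counted" and "ranged" are invented for the example; they just collect the values each loop visits.

```python
# The C-style loop, spelled out by hand with "while".
counted = []
i = 0
while i < 10:
    counted.append(i)
    i += 1              # Python has no i++, so we add one ourselves

# The idiomatic version: let range() hand out the values 0 through 9.
ranged = []
for i in range(10):
    ranged.append(i)

print(counted == ranged)
```

Both loops visit the same ten values; the second one just leaves the bookkeeping to the language.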

Another difference: i doesn't have to be defined explicitly here. Instead, it is defined by its first use. "Aha," you may be thinking, "that's bound to lead to typos going unnoticed." Well, not so much. Python, like Perl in strict mode, will call you on it if you try to read a variable without first assigning a value to it. Python does this when the offending line actually runs, by raising a NameError. That's an effective way to catch typos without tedious typing.
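A quick sketch of what that looks like in practice. The variable names, including the deliberately misspelled ones, are made up for the example. Note that only the misspelled read is caught; a misspelled assignment quietly creates a new variable.

```python
total = 0
try:
    totl = total + 1    # typo on the left-hand side: this just creates a new name
    print(totall)       # typo when reading a name: Python raises NameError here
except NameError as err:
    caught = str(err)
    print("Python complained: %s" % caught)
```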

If You Need It, It's Probably There

So what am I actually doing with Python? I'm using it to write a specialized LiveJournal client, largely as a way of learning more about the language.

LiveJournal offers several ways for an application to communicate with the LiveJournal server. The sexiest, supposedly, is the XMLRPC interface. I was leery of this because I haven't done much XML. Taking on a new language and a new approach to sharing data at the same time? Well, why the hell not?

As it turns out, this is one of the smoothest things about Python: a high-quality XMLRPC library called xmlrpclib is standard equipment. xmlrpclib lets you make an object called a "server proxy" that represents all of the actions ("methods") available from the LiveJournal server.
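A rough sketch of what a server proxy looks like, with several assumptions flagged: the endpoint URL, the remote method name "LJ.XMLRPC.getevents", and the request fields are my reading of LiveJournal's protocol and should be checked against its documentation, and "fetch_day" is a hypothetical helper, not code from the actual client. The import dance is there because xmlrpclib is called xmlrpc.client in Python 3.

```python
try:
    import xmlrpclib                     # the Python 2 name used in this post
except ImportError:
    import xmlrpc.client as xmlrpclib    # the same library under its Python 3 name

# Assumed endpoint; verify against LiveJournal's protocol documentation.
LJ_URL = "http://www.livejournal.com/interface/xmlrpc"

# Creating the proxy does not contact the server; only a method call does.
server = xmlrpclib.ServerProxy(LJ_URL)

def fetch_day(username, password, year, month, day):
    """Hypothetical helper: ask the server for one day's entries."""
    request = {
        "username": username,
        "password": password,
        "selecttype": "day",             # assumed field names, see lead-in
        "year": year, "month": month, "day": day,
    }
    # Attribute access on the proxy is turned into the remote method name.
    return server.LJ.XMLRPC.getevents(request)

# Peek at the XML a call would send, without touching the network.
payload = xmlrpclib.dumps(({"selecttype": "day", "year": 2005},),
                          "LJ.XMLRPC.getevents")
print(payload)
```

The nice part is that the proxy needs no description of the remote methods up front: whatever dotted name you call on it is what gets sent.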

Hmm, but I'll need ways to save and load all of my LJ entries. Do I have to play with lots of files? Nope: Python allows nested "dictionaries," known as hashes in Perl and as associative arrays elsewhere. And Python also has built-in support for "pickling" objects: saving them to disk in a well-defined format so that they can be painlessly loaded back in again, or "unpickled." There's a standard Python module called "pickle" that does the job, and an optimized version written in C, called "cPickle," that does it faster for typical applications... like mine. Very cool.
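A minimal sketch of the round trip, assuming a nested dictionary keyed by year, month, and day. The file name "entries.pickle" and the sample data are invented for the example. On the Python 2 of this post, "import cPickle as pickle" would be the faster drop-in; current Python versions only have "pickle".

```python
import os
import pickle
import tempfile

# Nested dictionaries: year -> month -> day -> list of entries.
entries = {2005: {9: {18: ["Tom Meets Python, Part 1"]}}}

# Write to a scratch directory so the example leaves the current folder alone.
path = os.path.join(tempfile.mkdtemp(), "entries.pickle")

with open(path, "wb") as f:      # pickles are bytes, so open in binary mode
    pickle.dump(entries, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == entries)
```

The whole nested structure goes out and comes back in one call each way, which is why no per-entry files are needed.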

What about a GUI? Can I create a user interface at all? If I can, won't I have to do it over again for Linux and MacOS X users? wxPython to the rescue! wxPython is a Python interface to the excellent, free, and mature wxWidgets library, which makes it easy to write friendly applications for Windows, MacOS X, Linux and other beasties. And it offers a native look and feel on each.

How about making an executable? Do I have to ship the Python interpreter and my Python source code? That's covered too: py2exe turns Python scripts into Windows executables with very little pain, even if they require wxPython. Much, much too cool.

So: xmlrpclib to fetch all of my articles. cPickle to save them. wxPython to display them all, and py2exe to in the darkness bind them. This thing is writing itself! Is there nothing in Python-land to make me pine for Perl?

Well, yes... two things. So far.

What I Don't Like

So far I've hit two irritants I didn't enjoy dealing with.

First off, like anyone with half a brain, I wanted to define functions so that I could reuse code and not retype it all over the place. Python makes defining a function pretty painless:

def function():
 # Do lots of work here
 # Now return a result!
 return 1


You'll note that function definitions use indentation, just like "for" loops and "if" statements do.

What's not so painless is discovering that "function" must be defined like this before I can call it.

Wait a minute. That's pretty standard behavior for C, yeah. But this ain't 1973. Perl and Java are both bright enough to read the rest of your code and find the function, allowing you to place functions in your source code where they feel right to you. They don't use the programmer's brain as a sorting machine. That's what computers are for.

Still, it's not that hard to work with -- just define your functions before you use them. And I gather I can write them in separate modules and include them gracefully. And perhaps there's a way to predeclare them. Maybe there's some other workaround I haven't stumbled upon yet. Still, this is a bit of a turn-off.

The other irritation: Python, like Perl and other scripting languages, has elegant support for "dictionaries" (known as hashes or associative arrays in many languages). So when the LJ server's XMLRPC interface returns a day's articles to me (as an array), I see no reason not to write this:

entries[year][month][day] = articles

Now, in Java, I would expect this to spew errors unless I tediously create not only "entries" itself, but also the hashes "entries[2002]", "entries[2002][09]", and so on. Java doesn't automatically create things for you -- its standard behavior is to assume you didn't mean to do that, and flag it as an error. But Java is an uptight little language and, to be fair, very concerned with performance and safety and stuff. In Perl, this line would automatically create the nested series of hashes needed to do the job, if they didn't already exist. And I expected something similar from Python.

But no -- I must explicitly create everything. Brother. So this is what I wound up with:

if year not in entries:
  entries[year] = {}
if month not in entries[year]:
  entries[year][month] = {}
entries[year][month][day] = eresult['events']

And when I look at it, I have to admit that's not the most horrible thing ever. But it is irritating, and it would be nice if it wasn't necessary.
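For what it's worth, there are shorter spellings of the same idea. The sketch below shows two, using stand-in data instead of the real "eresult['events']": the dictionary method "setdefault", and a self-nesting "defaultdict" (which only exists in Python versions newer than the one this post was written against). The helper name "tree" is made up.

```python
from collections import defaultdict

year, month, day = 2005, 9, 18
articles = ["first post", "second post"]   # stand-in for eresult['events']

# Option 1: setdefault returns the inner dictionary, creating it if missing.
entries = {}
entries.setdefault(year, {}).setdefault(month, {})[day] = articles

# Option 2: a dictionary whose missing keys spring into being as more of the same.
def tree():
    return defaultdict(tree)

auto = tree()
auto[year][month][day] = articles          # reads like the Perl version

print(entries == auto)
```

One trade-off with the second option: merely reading a missing key creates it, so a typo in a lookup no longer raises KeyError.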

Tune In Soon For More Wriggly Goodness

That's it for part one! You'll hear more about exception handling, modules, and wxPython soon... "Tags From 10,000 Feet" Version 0.0001 has just successfully pickled my entire stinkin' mass of LJ posts for the first time!


Comments
From: jwgh Date: September 18th, 2005 10:39 pm (UTC)
Another livejournal friend mentioned this old post: http://groups.google.com/group/alt.religion.kibology/msg/9a669b71383aaa15
From: sambushell Date: September 18th, 2005 11:46 pm (UTC)
That isn't how I write code, per se... but it is how I punctuate sentences in code review meetings, so it's not far off.
From: cks Date: September 19th, 2005 03:23 am (UTC)

function definition order explained

There are three things going on here, and they all make sense when you understand them:

  1. Like everything else in Python, functions are found by name lookup. So their name has to be defined by the time they're called.
  2. When you do 'import MODULE' in Python (and 'python file', which sort of implicitly imports the file), the code being imported is actually executed.
  3. And finally, def is actually an executable statement; when it runs, it defines the function from the code block et al. (This has other useful consequences that don't fit in the margin of this comment.)
So you can actually create functions in any order so long as their names are bound by the time anything that will look up their name is executed. If you put your code at the top level of a module or program, this means that they must be lexically before the code.
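
The three points above can be sketched in a few lines. All of the function names here ("main", "helper", "early") are invented for the illustration.

```python
def main():
    # "helper" is looked up by name only when this line runs,
    # not when "main" is defined, so defining it later is fine.
    return helper() + 1

def helper():
    return 41

result = main()        # both names are bound by now, so this works

# Top-level code, by contrast, runs immediately as the file is executed.
too_early = False
try:
    early()            # the "def early" statement below has not run yet
except NameError:
    too_early = True

def early():
    return "ok"

print(result, too_early, early())
```

So the ordering constraint really only bites for code at the top level of a file; functions can refer to each other in any order.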

The traditional approach (which helps with several other things too, including pychecker) is to write all of your code in functions (including something like a main function) and then end your source file with

if __name__ == "__main__":
    main(sys.argv)
__name__ is the name of the current module as the Python interpreter sees it; if it is "__main__", this file has not been imported but is instead being run directly as 'python program'.

This also implies that a function name is not a special thing; it is just a variable that points to the function's code block. It can be duplicated, reassigned, changed, etc as you desire, just like other variables.
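
A short sketch of that last point, with made-up names "greet" and "say":

```python
def greet():
    return "hello"

say = greet            # a second name bound to the same function object

def greet():           # running another def simply rebinds the name "greet"
    return "goodbye"

print(say(), greet())
```

The old function is still alive under the name "say"; only the binding of "greet" changed.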

From: boutell Date: September 19th, 2005 04:06 am (UTC)

Re: function definition order explained

I am enlightened.

But while this gets the job done and I'll look into using it, it is a bit of a wart on an otherwise gorgeous language. It smacks of Perlesque shenanigans to achieve elegance that should be standard equipment. Or C's #ifndef FOO_H #define FOO_H... nonsense.

Why doesn't Python just have a built-in notion of main()?
From: cks Date: September 19th, 2005 04:31 am (UTC)

Re: function definition order explained

Python has no default main invocation partly because there's no requirement that you use that technique. It does make several things easier (e.g., by making it possible to import your program's file into a Python interpreter). I've written small Python programs with straight-line code, not using the technique, which is convenient for quick things. (I've also written larger ones in that straight-line style, back in my younger and more innocent days; it was a mistake, if only because pychecker does very badly with code that starts executing the moment it gets imported.)

Also: There's a bunch of reasons for the import behavior, including enabling a variety of features in a simple and general way. Given the import behavior, adding an explicit idea of a main function is duplicating functionality, something Python generally avoids.

From: boutell Date: September 19th, 2005 04:06 am (UTC)

Re: function definition order explained

P.S. Your wisdom is appreciated!