OT: Friday Python Puzzler #1

Hello everyone!

I thought it might be a fun Friday diversion to try and find some of Python's quirkier behaviours and try to figure them out. Heck, you never know, it might even be useful for future reference...

So, I think this one people might already be familiar with, but let's see. Closures in Python are a really handy way to make factories for other functions and cool stuff like that. But due to some quirks of the Python language definition, they don't always do what you'd expect.

What would you expect this to do?

functions = []
for i in xrange(10):
    def inner():
        return i
    functions.append(inner)
print ", ".join(str(func()) for func in functions)

Now try executing it... Did it produce what you expected? If not, can you figure out why not?

Incidentally, you should find the same effect in the following more concise version:

functions = [(lambda: i) for i in xrange(5)]
print ", ".join(str(func()) for func in functions)

However, this one seems to behave a bit more intuitively:

functions = ((lambda: i) for i in xrange(5))
print ", ".join(str(func()) for func in functions)

Why is that?

That last one is definitely unexpected! Something to do with how tuple comprehensions work, and the fact that tuples are immutable? Flummoxed.

I'll post the full explanation later today in case anybody else wants to hazard an answer, but it's not related to the immutability of tuples. In fact, don't think of that last expression as a "tuple comprehension" at all (Python doesn't have those) but as a "generator expression".

OK, for anybody who's interested here's briefly what's going on...

Python is essentially a lexically scoped language where for any given function there are effectively three classes of scope:

  • Local scope (i.e. within the current function).
  • Enclosing scope (i.e. within the enclosing function, if any).
  • Global scope (i.e. defined at the module level).

This is ignoring a couple of warts such as classes, which have their own scope which is more or less unconnected from other scopes, and the builtin scope, where all the language builtins live.
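
In case the class wart isn't familiar, here's a minimal sketch of it. The names Example, x, get_x and get_class_x are made up purely for illustration. The point is that a class body doesn't act as an enclosing scope for its methods, so a bare name inside a method skips straight past it:

```python
x = "global"

class Example(object):
    x = "class"          # lives in the class scope only

    def get_x(self):
        # The class body is not an enclosing scope for methods,
        # so this bare name skips "class" and finds the global.
        return x

    def get_class_x(self):
        # Class attributes have to be reached explicitly.
        return self.x

print(Example().get_x())        # global
print(Example().get_class_x())  # class
```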

So, while a function is executing, if you refer to a variable then it's looked up first in the local scope, then in each enclosing scope in turn and finally in the global scope. The first match is always used. The other wrinkle to note is that when a function is defined, it stores a reference to its enclosing scopes to make sure they don't disappear if the enclosing function terminates - the technical term for this is a closure. It's what allows things like this to work:

def create_adder_function(add_amount):
    def inner_func(arg):
        return arg + add_amount
    return inner_func
my_add_5_func = create_adder_function(5)
# This will print '22'
print my_add_5_func(17)
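
If you want to check the lookup order for yourself (local, then enclosing, then global), something like the following rough sketch should do it. All the names are invented for the example, and it's written so that it should run unchanged under both Python 2 and Python 3:

```python
name = "global"

def outer():
    name = "enclosing"

    def uses_local():
        name = "local"
        return name      # found in the local scope

    def uses_enclosing():
        return name      # no local match, so found in outer()'s scope

    return uses_local(), uses_enclosing()

def uses_global():
    return name          # no local or enclosing match, so the global is used

print(outer())        # ('local', 'enclosing')
print(uses_global())  # global
```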

The key point to note here is that in Python (unlike some other lexically scoped languages) only a function introduces its own scope and not constructs such as for loops. This means that when you create a closure, you're just getting a reference to the enclosing scope. In the code in my first post, each of the functions defined in the loop ends up getting a reference to the same enclosing scope - in that example, the global scope, but it would work the same way within a function. This is because each repetition around the loop does not create its own scope, it merely executes in the local scope of the current function (or the global scope in that particular case).
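
To see that a closure really is a reference to the scope and not a copy of the value, here's a small sketch (make_getter, get and make_functions are illustrative names, not from anything above). The first function rebinds the variable after the inner function has been defined. The second repeats the loop from my first post inside a function instead of at module level. I've used range instead of xrange so it should run on Python 2 or 3:

```python
def make_getter():
    value = "before"

    def get():
        return value     # looked up when get() is called, not when it's defined

    value = "after"      # rebinding in the enclosing scope after the def
    return get

def make_functions():
    functions = []
    for i in range(3):
        def inner():
            return i     # every inner() refers to the same i in make_functions()
        functions.append(inner)
    return functions

print(make_getter()())                  # after
print([f() for f in make_functions()])  # [2, 2, 2]
```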

This means that when the inner functions (or lambda expressions) are executed and refer to the variable i, they're all referring to the same i in the current scope. By the time they're called the loop has already finished, so they all get its final value. If the inner function had been defined in a wrapper function as in my adding example just above, everything would work fine. This is because the wrapper function would be called each time, hence creating a new scope and a new instance of the counter variable. A for loop creates no new scope, hence they all return the same value.
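
For what it's worth, here's one way the wrapper function approach might look when applied to the code from my first post. The name make_inner is just something I've picked for the sketch, and I've used range so it runs on Python 2 or 3:

```python
def make_inner(value):
    # Each call to make_inner() creates a fresh scope with its own 'value'.
    def inner():
        return value
    return inner

functions = [make_inner(i) for i in range(10)]
print(", ".join(str(func()) for func in functions))
# 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
```

Each inner function now refers to the value variable of its own call to make_inner, which never changes after that call returns.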

Finally, the question as to why the final code snippet works. This is because the round-bracket version of the list comprehension is a generator expression, which is lazily evaluated. Therefore, when functions is defined in that example none of the lambda expressions has been created yet - the generator expression is ready to iterate over the xrange, but it hasn't started yet.

When the second line then evaluates all the functions, the generator expression starts iterating through the xrange one item at a time. Importantly, the lambda expressions are being created and evaluated at the same time and hence the variable i has the value you'd expect each time. The problem comes when you create all the closures first, and then evaluate the functions.
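
You can see that it's purely a matter of timing with a sketch along these lines, where the variable names are mine. The same generator expression is consumed two ways: once calling each lambda as soon as it's produced, and once collecting all the lambdas into a list before calling any of them:

```python
lazy = ((lambda: i) for i in range(5))
# Each lambda is called as soon as the generator produces it,
# while i still holds the value for that step.
called_immediately = [func() for func in lazy]

# Here every lambda is created first, so by the time any of them
# is called the generator has finished and i holds its final value.
materialised = list((lambda: i) for i in range(5))
called_later = [func() for func in materialised]

print(called_immediately)  # [0, 1, 2, 3, 4]
print(called_later)        # [4, 4, 4, 4, 4]
```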

I'm not sure how good a job I did of explaining that, if anybody's curious feel free to ask. You might also read the blog post I did about this earlier this week.

EDIT:

By the way, thinking about closures as references to scopes as opposed to "freeze-frames" of values makes some things easier to understand. For example, the following Python 3 code:

def outer():
    x = 1
    def increment():
        nonlocal x
        x += 1
        return x
    def double():
        nonlocal x
        x *= 2
        return x
    return (increment, double)

inc, doub = outer()
print(inc())
print(doub())
print(inc())
print(doub())

If you execute this (with python3 of course) then you'll see that both inner functions are sharing and modifying the same closure - they don't each have their own copy.
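
If you'd like to poke at the sharing directly, here's a self-contained sketch that repeats the same outer function and then inspects the functions' __closure__ attributes. Tracing it by hand with x starting at 1, the four calls should give 2, 4, 5 and 10. The identity check on the cell objects is how I'd expect CPython to behave, so treat that part as an implementation detail and not a guarantee:

```python
def outer():
    x = 1
    def increment():
        nonlocal x
        x += 1
        return x
    def double():
        nonlocal x
        x *= 2
        return x
    return (increment, double)

inc, doub = outer()
results = [inc(), doub(), inc(), doub()]
print(results)  # [2, 4, 5, 10]

# Both functions refer to one and the same cell holding x (CPython).
shared = inc.__closure__[0] is doub.__closure__[0]
print(shared)                            # True
print(inc.__closure__[0].cell_contents)  # 10
```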