OT: Friday Python Puzzler #4

Haven't done this for a few weeks, mostly due to lack of time, and to some extent because I'm running out of obscure Python puzzles... But maybe it's useful to cover some simpler gotchas too. This one's rather easier than the previous ones, but it still might be easy to miss what's going on.

Suppose you wanted to design a class which kept track of all the different instances of it which were created. You might imagine a class like this:

class Counted(object):

    instances = set()

    def __init__(self):
        self.instances.add(self)

    def __del__(self):
        self.instances.discard(self)

Seems simple enough - there's a class variable instances which is a set() of all the instances, each one added by __init__(). When an instance is garbage-collected, __del__() removes it from the set, so the set should always contain exactly the live instances. Right?

So, let's test this class in an interactive session:

>>> print len(Counted.instances)
0
>>> local_instances = [Counted() for i in xrange(10)]
>>> print len(Counted.instances)
10

Looks good - we've created a list of 10 instances, and now Counted.instances contains 10 items, just as it should. OK, so now we delete local_instances and the count should go back to zero:

>>> del local_instances
>>> print len(Counted.instances)
10

Hm, still 10... What's going on?

EDIT

Bonus section... After running the above code, try then executing the following and watch the fireworks:

>>> del Counted.instances

(Note: Don't do this in a Python interpreter session you care about!)

I imagine because we are only deleting a reference to a list...

And that's more or less confirmed by doing the deletes in reverse order, where they do work, though the Counted class is then broken.

Unless I've misunderstood you, you're not quite there, but not too far off. Deleting the list removes the references to all the items within the list, which should bring their reference count to zero and trigger __del__() to be called... Right?

Not quite, because there's still an extant reference to each instance which prevents garbage collection - that reference is within the Counted.instances set itself. Hence, __del__() is never called and the reference remains - it's essentially a form of cyclic reference problem, just wearing a funny hat.
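To see this in action, here's a minimal sketch, written for Python 3 and relying on CPython's reference counting to run __del__() promptly. The finalized counter is a hypothetical addition purely for illustration, not part of the original class. It shows that __del__() is never called while the set holds the instances, and only runs once each one is explicitly removed from the set:

```python
class Counted(object):

    instances = set()
    finalized = 0  # hypothetical counter, added only to observe __del__() calls

    def __init__(self):
        self.instances.add(self)

    def __del__(self):
        type(self).finalized += 1
        self.instances.discard(self)


local_instances = [Counted() for _ in range(10)]
del local_instances

# The set still holds a strong reference to every instance,
# so nothing has been finalized yet.
before = (len(Counted.instances), Counted.finalized)  # (10, 0)

# Dropping the set's own references is what finally lets __del__() run.
while Counted.instances:
    Counted.instances.pop()

after = (len(Counted.instances), Counted.finalized)  # (0, 10)
```

In other words, the __del__() in the original class can never do its job: by the time it runs, the instance must already have been removed from the set by something else.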

The interesting thing is that if you attempt to del Counted.instances (admittedly a quirky thing to do, to remove a class variable while there are extant instances) then you actually end up segfaulting the Python interpreter! This seems to happen on everything from 2.6 to 3.3 and even the latest development version - if I can get around to unpicking the backtrace then I'll submit a bug about it to the Python tracker. I don't think you could say it's a common problem, however.

Of course, the correct way to do this without all these annoying cyclic reference issues is to use weak references:

import weakref

class Counted(object):

    instances = weakref.WeakSet()

    def __init__(self):
        self.instances.add(self)

Here you don't need to worry about __del__() because WeakSet sets up a callback to automatically remove each item from the set when that item is garbage collected (but prior to __del__() on the object, I believe). Also, because the items in the set are weak references, they don't count towards keeping the objects alive, which is why you don't suffer from the problem.
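As a quick check, here's the earlier test repeated against the WeakSet version, again as a sketch in Python 3 syntax and assuming CPython (where objects are freed as soon as their reference count hits zero; the gc.collect() call is just a precaution for other implementations). The names count_alive and count_after are only there for illustration:

```python
import gc
import weakref


class Counted(object):

    instances = weakref.WeakSet()

    def __init__(self):
        self.instances.add(self)


local_instances = [Counted() for _ in range(10)]
count_alive = len(Counted.instances)  # 10 while the list keeps them alive

del local_instances
gc.collect()  # not needed on CPython, harmless elsewhere

count_after = len(Counted.instances)  # 0 once the instances are gone
```

This time the count drops back to zero, since the list was the only thing holding strong references to the instances.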