Today I again came across code that I was able to make simpler,
clearer and safer using
collections.defaultdict. I keep coming across experienced Python programmers who don't know
about it. Perhaps it's time to spread the good word.
The defaultdict type is a dict
subclass that takes a factory function to supply default values for
keys that haven't been set yet. For example,

    from collections import defaultdict

    frequency = defaultdict(lambda: 0)
    for c in 'the quick brown fox jumps over the lazy dog':
        frequency[c] = frequency[c] + 1

will count the frequency of characters in a string.
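One detail is worth seeing in action: the factory is only called when a missing key is looked up with square brackets, and the default it returns is then stored in the dict. A small sketch of that behaviour (the keys 'x' and 'y' are arbitrary, chosen only for illustration):

```python
from collections import defaultdict

frequency = defaultdict(lambda: 0)

# Nothing has been stored yet.
before = 'x' in frequency

# Looking up a missing key calls the factory and stores the result.
value = frequency['x']
after = 'x' in frequency

# .get() behaves like a plain dict: no factory call, nothing stored.
got = frequency.get('y')
y_stored = 'y' in frequency

print(before, value, after)  # False 0 True
print(got, y_stored)         # None False
```

So merely reading frequency[key] can grow the dict; use "in" or .get() when you only want to peek.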
I often use defaultdict for dicts of
dicts (defaultdict(dict)) and dicts
of lists (defaultdict(list)).
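As a rough sketch of those two shapes (the names words_by_letter and settings, and the sample data, are made up for illustration):

```python
from collections import defaultdict

# Dict of lists: group words by their first letter.
# The empty list is created the first time a letter is seen.
words_by_letter = defaultdict(list)
for word in ['apple', 'avocado', 'banana', 'blueberry', 'cherry']:
    words_by_letter[word[0]].append(word)

# Dict of dicts: the inner dict is created on first access,
# so there is no need to check whether the outer key exists.
settings = defaultdict(dict)
settings['editor']['tabs'] = 4
settings['editor']['theme'] = 'dark'
settings['shell']['prompt'] = '$ '

print(dict(words_by_letter))
print(dict(settings))
```

In both cases the "if key not in d: d[key] = ..." dance disappears entirely.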
defaultdict replaces some pretty
simple code. For example, the above code could be written without it as:

    frequency = dict()
    for c in 'the quick brown fox jumps over the lazy dog':
        if c in frequency:
            frequency[c] = frequency[c] + 1
        else:
            frequency[c] = 1

but I find using defaultdict is not
just shorter but also much clearer.
The other classes in the
collections
module, especially OrderedDict and
Counter (which is an implementation
of the pattern I just implemented here on top of
defaultdict) seem useful, but I've
never found myself actually using them, whereas
defaultdict is a common part of my
repertoire these days.
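To make the Counter comparison concrete, here is a small sketch using the same sentence. It relies only on the standard library; defaultdict(int) is used as an equivalent spelling of defaultdict(lambda: 0), since int() returns 0.

```python
from collections import Counter, defaultdict

text = 'the quick brown fox jumps over the lazy dog'

# The hand-rolled version from earlier in this post.
frequency = defaultdict(int)
for c in text:
    frequency[c] += 1

# Counter does the same counting in one call.
counted = Counter(text)

# Both hold the same character counts.
same = dict(counted) == dict(frequency)
print(same)

# Counter also offers helpers such as most_common().
top = counted.most_common(1)
print(top)
```

If counting is all you need, Counter is probably the tidier choice; defaultdict earns its keep when the default is something other than zero.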