Today I again came across code that I was able to make simpler, clearer and safer using collections.defaultdict. I keep coming across experienced Python programmers who don’t know about it. Perhaps it’s time to spread the good word.
The defaultdict type is a dict subclass that takes a factory function to supply default values for keys that haven’t been set yet. For example:
    from collections import defaultdict

    frequency = defaultdict(lambda: 0)
    for c in 'the quick brown fox jumps over the lazy dog':
        frequency[c] = frequency[c] + 1
This counts the frequency of each character in the string.
I often use defaultdict for dicts of dicts (defaultdict(dict)) and dicts of lists (defaultdict(list)).
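As a rough sketch of those two patterns (the sample data and variable names here are made up for illustration):

```python
from collections import defaultdict

# Hypothetical sample data: (animal, name) pairs.
pets = [('dog', 'Rex'), ('cat', 'Tom'), ('dog', 'Fido')]

# Dict of lists: group names by animal without checking for missing keys.
names_by_animal = defaultdict(list)
for animal, name in pets:
    names_by_animal[animal].append(name)

# Dict of dicts: the inner dict is created the first time a key is accessed.
ages = defaultdict(dict)
ages['dog']['Rex'] = 3
ages['dog']['Fido'] = 5
ages['cat']['Tom'] = 2
```

In both cases the factory (list or dict) is called with no arguments the first time a missing key is looked up, so there is no need for an explicit "create it if absent" branch.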
defaultdict replaces some pretty simple code. For example, the code above could be written with a plain dict:
    frequency = dict()
    for c in 'the quick brown fox jumps over the lazy dog':
        if c in frequency:
            frequency[c] = frequency[c] + 1
        else:
            frequency[c] = 1
but I find using defaultdict is not just shorter but also much clearer.
The other classes in the collections module, especially OrderedDict and Counter (which implements the counting pattern I just built here with defaultdict), seem useful, but I’ve never found myself actually using them, whereas defaultdict is a common part of my repertoire these days.
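For comparison, here is a minimal sketch of the same character count using Counter. It is only meant to show how the two relate, not to suggest one over the other:

```python
from collections import Counter

# Counter accepts any iterable and tallies its items,
# here the characters of the string.
frequency = Counter('the quick brown fox jumps over the lazy dog')

e_count = frequency['e']           # 3
missing = frequency['!']           # 0, and the key is not inserted
top = frequency.most_common(1)     # the space character is the most frequent
```

One difference worth noting: looking up a missing key in a Counter returns 0 without storing it, whereas a defaultdict inserts the default value on lookup.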
Brilliant, I didn’t know about this one – cheers!
And lots of folks forget about the built-in set class, too. defaultdict(set) is one of my very favorite things.
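For instance, a minimal sketch (the data and names are made up) that collects the unique tags used by each author:

```python
from collections import defaultdict

# Hypothetical sample data: (author, tag) pairs, with one repeat.
posts = [('ann', 'python'), ('bob', 'rust'), ('ann', 'python'), ('ann', 'go')]

tags_by_author = defaultdict(set)
for author, tag in posts:
    # A new empty set is created on first access; duplicates collapse.
    tags_by_author[author].add(tag)
```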