dict.fromkeys with a Mutable Default: One List to Rule Them All
2026-05-30
This function is supposed to group words by their starting letter, returning a dict like {'a': ['apple', 'avocado'], 'b': ['banana']}. It looks clean, idiomatic, and even slightly clever. Run it and weep.
def build_word_groups(words):
    """Group words by their starting letter."""
    letters = {w[0] for w in words}
    groups = dict.fromkeys(letters, [])
    for word in words:
        groups[word[0]].append(word)
    return groups

if __name__ == "__main__":
    words = ["apple", "banana", "avocado",
             "blueberry", "cherry", "apricot"]
    result = build_word_groups(words)
    for letter, group in sorted(result.items()):
        print(f"{letter}: {group}")
    # Expected:
    # a: ['apple', 'avocado', 'apricot']
    # b: ['banana', 'blueberry']
    # c: ['cherry']
Instead of the expected output, every letter maps to the entire word list:
a: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']
b: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']
c: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']
The culprit is dict.fromkeys(letters, []). The docs are explicit but easy to miss: the second argument is evaluated exactly once, and that same object becomes the value for every key. So groups['a'], groups['b'], and groups['c'] are all references to the same list. Append to one, append to all.
You can verify the aliasing in two lines:
>>> d = dict.fromkeys(['x', 'y'], [])
>>> d['x'] is d['y']
True
This is a sibling of the mutable-default-argument trap, but it bites harder because fromkeys looks like a factory that constructs a fresh value per key. The signature is misleading: the value isn't a template, it's a single shared object. With immutable values (0, None, "") you never notice, because the only way to change an entry is to rebind its key to a different object. The moment the value is mutable (a list, dict, set, or an instance of any class with mutable state), an in-place change made through one key shows up under every key.
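A minimal sketch of that difference, using made-up variable names (counts, lists) that are not part of the original function. It contrasts rebinding a key that holds an immutable value with mutating a shared list in place:

```python
# Immutable shared value: `+= 1` builds a new int and rebinds only key "a".
counts = dict.fromkeys("ab", 0)
counts["a"] += 1
print(counts)   # {'a': 1, 'b': 0}

# Mutable shared value: append() changes the one list both keys point to.
lists = dict.fromkeys("ab", [])
lists["a"].append(1)
print(lists)    # {'a': [1], 'b': [1]}

# Both keys still reference the same object.
print(lists["a"] is lists["b"])   # True
```

The first dict behaves the way you'd hope only because nothing can modify an int in place; the sharing is still there, just unobservable.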
The fix is to construct each value independently. The Pythonic approaches:
# Option 1: dict comprehension, an explicit fresh list per key
groups = {letter: [] for letter in letters}

# Option 2: defaultdict, which skips the prelude entirely
from collections import defaultdict

def build_word_groups(words):
    groups = defaultdict(list)
    for word in words:
        groups[word[0]].append(word)
    return groups
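Option 2 is the better fit here: you don't need to pre-compute the set of letters, and every new key gets its own fresh list from the list factory. One caveat: a defaultdict also creates an entry when you merely read a missing key, so return dict(groups) instead if callers should get a KeyError for letters that never appeared.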
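For a side-by-side check, here is a sketch that runs both fixes on the sample word list from earlier. The function names group_with_comprehension and group_with_defaultdict are invented for this comparison, and the defaultdict variant converts to a plain dict before returning, which is a choice of this sketch and not something the original function does:

```python
from collections import defaultdict

def group_with_comprehension(words):
    """Option 1: pre-build one fresh list per starting letter."""
    letters = {w[0] for w in words}
    groups = {letter: [] for letter in letters}
    for word in words:
        groups[word[0]].append(word)
    return groups

def group_with_defaultdict(words):
    """Option 2: let defaultdict create a fresh list on first use."""
    groups = defaultdict(list)
    for word in words:
        groups[word[0]].append(word)
    # Hand back a plain dict so later lookups of missing letters raise KeyError.
    return dict(groups)

words = ["apple", "banana", "avocado",
         "blueberry", "cherry", "apricot"]

by_comprehension = group_with_comprehension(words)
by_defaultdict = group_with_defaultdict(words)

for letter, group in sorted(by_defaultdict.items()):
    print(f"{letter}: {group}")
# a: ['apple', 'avocado', 'apricot']
# b: ['banana', 'blueberry']
# c: ['cherry']
```

Both versions produce the same grouping; they differ only in whether the set of letters is computed up front.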
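Why does fromkeys work this way? Because the value is an ordinary argument: Python evaluates [] once, before the call, and fromkeys receives only the resulting object, never the expression that built it. For the overwhelmingly common case (initializing with None or 0 to use the dict as a set-with-extras), sharing the reference is harmless. The footgun exists because mutability is a runtime property of the object, and fromkeys has no general way to re-create an arbitrary value for each key.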
Heuristic worth tattooing: if a Python API takes a "default value" parameter rather than a "default factory" callable, assume it shares the reference. dict.fromkeys, list * n (e.g. [[]] * 5), and function default arguments all follow this pattern and all bite in the same way.
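The other two traps named in the heuristic can be reproduced in a few lines. This is an illustrative sketch; add_item and the variable names are hypothetical and exist only for the demo:

```python
# list * n repeats references to the same inner list.
rows = [[]] * 3
rows[0].append("x")
print(rows)            # [['x'], ['x'], ['x']]

# A default argument is evaluated once, when the function is defined.
def add_item(item, bucket=[]):
    bucket.append(item)
    return bucket

first = add_item(1)
second = add_item(2)
print(second)          # [1, 2]
print(first is second) # True

# Factory-style construction builds a fresh object each time.
fresh_rows = [[] for _ in range(3)]
fresh_rows[0].append("x")
print(fresh_rows)      # [['x'], [], []]
```

In each aliasing case the expression that builds the list runs once; the comprehension runs it once per element, which is what a "default factory" amounts to.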
dict.fromkeys(keys, value) stores the same value object under every key — pass anything mutable and you've built an aliasing trap; use a dict comprehension or defaultdict instead.
