Python's dict.fromkeys with a Mutable Default: One List to Rule Them All

2026-05-30

This function is supposed to group words by their starting letter, returning a dict like {'a': ['apple', 'avocado'], 'b': ['banana']}. It looks clean, idiomatic, and even slightly clever. Run it and weep.

def build_word_groups(words):
    """Group words by their starting letter."""
    letters = {w[0] for w in words}
    groups = dict.fromkeys(letters, [])
    for word in words:
        groups[word[0]].append(word)
    return groups

if __name__ == "__main__":
    words = ["apple", "banana", "avocado",
             "blueberry", "cherry", "apricot"]
    result = build_word_groups(words)

    for letter, group in sorted(result.items()):
        print(f"{letter}: {group}")

    # Expected:
    #   a: ['apple', 'avocado', 'apricot']
    #   b: ['banana', 'blueberry']
    #   c: ['cherry']

The Bug

Instead of the expected grouping, every letter maps to the entire word list:

a: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']
b: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']
c: ['apple', 'banana', 'avocado', 'blueberry', 'cherry', 'apricot']

The culprit is dict.fromkeys(letters, []). The docs are explicit but easy to miss: the second argument is evaluated exactly once, and that same object becomes the value for every key. So groups['a'], groups['b'], and groups['c'] are all references to the same list. Append to one, append to all.

You can verify the aliasing in two lines:

>>> d = dict.fromkeys(['x', 'y'], [])
>>> d['x'] is d['y']
True

This is a sibling of the mutable-default-argument trap, but it bites harder because fromkeys looks like a factory that constructs a fresh value per key. The signature is misleading: the value isn't a template to be copied, it's a single shared object. With immutable values (0, None, "") you never notice, because the only way to change an entry is to rebind its key. The moment the value is mutable (a list, dict, set, or any instance with mutable state), you have invisible aliasing.
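A small sketch of the contrast, assuming nothing beyond the standard dict API (the variable names are illustrative only). With an int, += rebinds one key to a new object; with a list, append mutates the one object that every key points to.

```python
# Immutable value: += rebinds a single key to a new int.
counts = dict.fromkeys("abc", 0)
counts["a"] += 1
print(counts)  # {'a': 1, 'b': 0, 'c': 0}

# Mutable value: append mutates the one shared list in place.
shared = dict.fromkeys("abc", [])
shared["a"].append("apple")
print(shared)  # {'a': ['apple'], 'b': ['apple'], 'c': ['apple']}

# Rebinding a key breaks the alias for that key only.
shared["b"] = []
print(shared["a"] is shared["c"])  # True
print(shared["a"] is shared["b"])  # False
```

The last two lines are the useful diagnostic: assignment to a key is always safe, and only in-place mutation leaks across keys.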

The fix is to construct each value independently. The Pythonic approaches:

# Option 1: dict comprehension — explicit fresh list per key
groups = {letter: [] for letter in letters}

# Option 2: defaultdict — skip the prelude entirely
from collections import defaultdict

def build_word_groups(words):
    groups = defaultdict(list)
    for word in words:
        groups[word[0]].append(word)
    return groups

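Option 2 is usually the better fit here: you don't need to pre-compute the set of letters, and each missing key gets its own fresh list from the list factory. One caveat: a defaultdict creates an entry on any lookup of a missing key, so a stray groups['z'] silently adds an empty group. If callers should get a KeyError instead, return dict(groups).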
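A minimal sketch of one way to hand callers a plain dict, so that lookups of missing letters raise KeyError rather than creating empty groups. The name build_word_groups_plain is hypothetical, and skipping empty strings is an added assumption that the original function does not make.

```python
from collections import defaultdict

def build_word_groups_plain(words):
    """Group words by first letter and return a plain dict."""
    groups = defaultdict(list)
    for word in words:
        if word:  # assumption: skip empty strings, which have no first letter
            groups[word[0]].append(word)
    return dict(groups)  # plain dict: missing keys raise KeyError

result = build_word_groups_plain(
    ["apple", "banana", "avocado", "blueberry", "cherry", "apricot"]
)
for letter, group in sorted(result.items()):
    print(f"{letter}: {group}")
# a: ['apple', 'avocado', 'apricot']
# b: ['banana', 'blueberry']
# c: ['cherry']
```

The dict(groups) call makes a shallow copy of the mapping, which is enough here because each list is already independent.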

Why does fromkeys work this way? Because it is an ordinary call: Python evaluates the arguments before the call, so fromkeys receives one already-built object and has no general way to manufacture fresh copies of it. For the common case (initializing with None or 0 to use the dict as a set-with-extras), sharing the reference is harmless. The footgun exists because mutability is a property of the object passed at runtime, not something the signature can guard against.
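A rough pure-Python model may make this concrete. This is a sketch for illustration, not CPython's actual implementation, and the helper name fromkeys_sketch is made up.

```python
def fromkeys_sketch(keys, value=None):
    """Approximate model of dict.fromkeys: one value object, many keys."""
    result = {}
    for key in keys:
        result[key] = value  # stores a reference; nothing is copied
    return result

# The [] is evaluated once, before the call, like any other argument.
d = fromkeys_sketch(["x", "y"], [])
print(d["x"] is d["y"])  # True
```

Seen this way, the function never has a chance to build a second list: by the time its body runs, only one exists.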

Heuristic worth tattooing: if a Python API takes a "default value" parameter rather than a "default factory" callable, assume it shares the reference. dict.fromkeys, list * n (e.g. [[]] * 5), and function default arguments all follow this pattern and all bite in the same way.
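The two siblings named above can be reproduced in a few lines, along with the usual workarounds. The function names collect and collect_fresh are hypothetical, chosen only for this sketch.

```python
# list * n repeats references to the same inner list.
grid = [[]] * 3
grid[0].append(1)
print(grid)  # [[1], [1], [1]]

# A comprehension evaluates [] once per iteration instead.
fresh_grid = [[] for _ in range(3)]
fresh_grid[0].append(1)
print(fresh_grid)  # [[1], [], []]

# A mutable default argument is created once, at function definition.
def collect(item, bucket=[]):
    bucket.append(item)
    return bucket

print(collect("a"))  # ['a']
print(collect("b"))  # ['a', 'b']

# Common workaround: default to None and build the list inside the call.
def collect_fresh(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

print(collect_fresh("a"))  # ['a']
print(collect_fresh("b"))  # ['b']
```

In each fixed version the expression that builds the list runs once per use rather than once in total, which is the same idea as swapping fromkeys for a dict comprehension.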

Key Takeaway: dict.fromkeys(keys, value) stores the same value object under every key — pass anything mutable and you've built an aliasing trap; use a dict comprehension or defaultdict instead.
