Daily Debugging Puzzle: Python's `bool`-is-`int` Trap: When `True` and `1` Share a Hash

2026-05-16

You're auditing a data-migration script. A legacy table stored an "active" flag as integer 0/1; the new code writes proper booleans. You're asked to count records by status before the cutover so you can verify nothing got lost.

def count_by_status(records):
    """Count records grouped by their 'active' value."""
    counts = {}
    for r in records:
        key = r['active']
        counts[key] = counts.get(key, 0) + 1
    return counts


records = [
    {'id': 1, 'active': True},   # new schema
    {'id': 2, 'active': 1},      # legacy int
    {'id': 3, 'active': False},  # new schema
    {'id': 4, 'active': 0},      # legacy int
    {'id': 5, 'active': True},
]

print(count_by_status(records))
# You expect:  {True: 2, 1: 1, False: 1, 0: 1}
# You get:     {True: 3, False: 2}

The legacy rows have vanished into the boolean buckets. Worse, the key types depend on row order: reverse the input and you get {True: 3, 0: 2}; put the legacy rows first and you get {1: 3, 0: 2}. Same totals, different keys. The audit silently changes shape based on row order.
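To make the order dependence concrete, here is a minimal sketch. It repeats count_by_status and the sample rows from above so it runs on its own; the legacy_first ordering is a hypothetical reshuffle added for illustration, not part of the original script.

```python
def count_by_status(records):
    """Count records grouped by their 'active' value (the buggy version)."""
    counts = {}
    for r in records:
        key = r['active']
        counts[key] = counts.get(key, 0) + 1
    return counts


records = [
    {'id': 1, 'active': True},
    {'id': 2, 'active': 1},
    {'id': 3, 'active': False},
    {'id': 4, 'active': 0},
    {'id': 5, 'active': True},
]

# Hypothetical reordering: legacy int rows first, boolean rows after.
# sorted() is stable, so rows keep their relative order within each group.
legacy_first = sorted(records, key=lambda r: isinstance(r['active'], bool))

original = count_by_status(records)
reordered = count_by_status(legacy_first)

print(original)               # {True: 3, False: 2}
print(reordered)              # {1: 3, 0: 2}
print(original == reordered)  # True: the dicts even compare equal
```

The two results compare equal, so a plain equality check between two audit runs would not flag the difference in key types either.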

The Bug

In Python, bool is a subclass of int. Not "like" an int — literally a subclass:

>>> isinstance(True, int)
True
>>> True == 1 and hash(True) == hash(1)
True
>>> False == 0 and hash(False) == hash(0)
True
>>> {1: 'a', True: 'b', 1.0: 'c'}
{1: 'c'}

Dict and set membership use __hash__ and __eq__. Since hash(True) == hash(1) and True == 1, a dict treats them as the same key. The first key inserted stays in the dict (which is why it is the one displayed), and every subsequent equal key updates that same entry. Throw 1.0 into the mix and it collapses too: hash(1.0) == hash(1), because Python's built-in numeric types are designed so that values that compare equal also hash equal.
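The "first key stays, later values overwrite" behavior can be watched step by step. This is an illustrative sketch; the flags dict and its string values are made up for the demonstration.

```python
flags = {}
flags[True] = 'first'   # creates the entry; the stored key object is True
flags[1] = 'second'     # equal key: value replaced, stored key untouched
flags[1.0] = 'third'    # same again for the float

print(flags)                                  # {True: 'third'}

stored_key = next(iter(flags))
print(type(stored_key).__name__)              # bool

# All three spellings look up the same entry.
print(flags[True] == flags[1] == flags[1.0])  # True
print(hash(True) == hash(1) == hash(1.0))     # True
```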

This trap surfaces anywhere a hash-based container meets mixed numeric types: deduplication (set([1, True]) has length 1), Counter tallies over parsed JSON in which true and 1 both appear, or caches keyed on flag values.
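A short sketch of those situations follows. The JSON payload and the describe_flag helper with its module-level _cache are hypothetical, invented here only to show the collapse.

```python
import json
from collections import Counter

# Parsed JSON keeps true/false as bool and 1/0 as int, so the list is mixed.
values = json.loads('[true, 1, false, 0, 1.0]')
print(values)             # [True, 1, False, 0, 1.0]

# Deduplication: five inputs, two distinct keys.
print(len(set(values)))   # 2

# Tallying: everything lands in the bool buckets seen first.
tally = Counter(values)
print(tally)              # Counter({True: 3, False: 2})

# Hypothetical hand-rolled cache keyed on the raw flag value.
_cache = {}


def describe_flag(value):
    if value not in _cache:
        _cache[value] = f"{type(value).__name__}: {value!r}"
    return _cache[value]


print(describe_flag(True))  # bool: True
print(describe_flag(1))     # bool: True  (stale entry from the call above)
```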

The Fix

You need a key that distinguishes types as well as values. Pair the value with its type, or coerce to a canonical type before counting:

def count_by_status(records):
    counts = {}
    for r in records:
        v = r['active']
        key = (type(v).__name__, v)   # ('bool', True) ≠ ('int', 1)
        counts[key] = counts.get(key, 0) + 1
    return counts
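As a usage sketch, the same typed-key idea can be written with collections.Counter and run against the sample rows. The name count_by_status_typed and the per-schema roll-up are assumptions added for illustration; they are not part of the original script.

```python
from collections import Counter

records = [
    {'id': 1, 'active': True},
    {'id': 2, 'active': 1},
    {'id': 3, 'active': False},
    {'id': 4, 'active': 0},
    {'id': 5, 'active': True},
]


def count_by_status_typed(records):
    """Hypothetical Counter-based variant: key on (type name, value)."""
    return Counter((type(r['active']).__name__, r['active']) for r in records)


counts = count_by_status_typed(records)
print(dict(counts))
# {('bool', True): 2, ('int', 1): 1, ('bool', False): 1, ('int', 0): 1}

# Roll up by representation to see how many rows use each schema.
per_schema = Counter()
for (type_name, _value), n in counts.items():
    per_schema[type_name] += n
print(dict(per_schema))  # {'bool': 3, 'int': 2}
```

With the type name in the key, the four buckets stay separate regardless of row order.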

If your goal was to normalize legacy ints to booleans, be explicit — don't let the hash collision do it silently:

key = bool(r['active'])   # 0 → False, 1 → True, then count
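One caveat with a bare bool() call is that it accepts anything: bool(2) and bool('0') are both True. If the legacy column might hold stray values, a stricter normalizer may be safer. The sketch below assumes only bool, 0, and 1 are legitimate; normalize_active is a hypothetical helper name.

```python
def normalize_active(value):
    """Hypothetical strict normalizer: accept a bool, or exactly int 0 or 1."""
    if isinstance(value, bool):
        return value
    # type(...) is int excludes floats such as 1.0, which would pass `in (0, 1)`.
    if type(value) is int and value in (0, 1):
        return bool(value)
    raise ValueError(f"unexpected 'active' value: {value!r}")


print(normalize_active(1))      # True
print(normalize_active(False))  # False
print(bool('0'), bool(2))       # True True  (what a bare bool() would accept)

try:
    normalize_active('0')
except ValueError as exc:
    print(exc)                  # unexpected 'active' value: '0'
```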

Two principles worth internalizing:

Equality drives identity in containers. Any time a == b and hash(a) == hash(b), a dict or set treats them as the same thing — even across types you thought were unrelated.
Python's numeric tower is one big equivalence class. True, 1, 1.0, 1+0j, Fraction(1, 1), and Decimal(1) all compare equal and hash the same. Decimal only looks like an exception for values such as 0.1, where Decimal('0.1') != 0.1 because the binary float is not exactly one tenth; that is a difference in value, not in type.
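Both principles can be checked in a few lines. This is a quick sketch using only the standard library; the list name ones is arbitrary.

```python
from decimal import Decimal
from fractions import Fraction

ones = [True, 1, 1.0, 1 + 0j, Fraction(1, 1), Decimal(1)]

# Every pair compares equal, and all six share one hash value.
print(all(a == b for a in ones for b in ones))  # True
print(len({hash(x) for x in ones}))             # 1

# So a set (or dict) sees a single key.
print(len(set(ones)))                           # 1

# Decimal vs float differs only when the float is not exactly that value.
print(Decimal('0.5') == 0.5)                    # True  (0.5 is exact in binary)
print(Decimal('0.1') == 0.1)                    # False (0.1 is not)
```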

The bug is invisible in unit tests that use only one type per fixture. It detonates the first time real data mixes representations — exactly the migration you were trying to audit.

Key Takeaway: In Python, True == 1 and they hash identically, so booleans and ints (and floats) silently collapse into the same dict and set keys — never use a raw numeric value as a key when its type carries meaning.
