bool-is-int Trap: When True and 1 Share a Hash
2026-05-16
You're auditing a data-migration script. A legacy table stored an "active" flag as integer 0/1; the new code writes proper booleans. You're asked to count records by status before the cutover so you can verify nothing got lost.
def count_by_status(records):
    """Count records grouped by their 'active' value."""
    counts = {}
    for r in records:
        key = r['active']
        counts[key] = counts.get(key, 0) + 1
    return counts

records = [
    {'id': 1, 'active': True},   # new schema
    {'id': 2, 'active': 1},      # legacy int
    {'id': 3, 'active': False},  # new schema
    {'id': 4, 'active': 0},      # legacy int
    {'id': 5, 'active': True},
]
print(count_by_status(records))
# You expect: {True: 2, 1: 1, False: 1, 0: 1}
# You get: {True: 3, False: 2}
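The legacy rows have vanished into the boolean buckets. Worse, the keys depend on row order: reverse the input and you get {True: 3, 0: 2}; put the legacy rows first and you get {1: 3, 0: 2}. Same totals, different keys. The audit silently changes shape depending on which representation of each value happens to arrive first.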
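The order dependence is easy to reproduce. The sketch below reuses the naive counter and feeds it the same five values in two orders; the names new_rows and legacy_rows are made up for the illustration.

```python
def count_by_status(records):
    """Same naive counter as above: the raw value is the dict key."""
    counts = {}
    for r in records:
        key = r['active']
        counts[key] = counts.get(key, 0) + 1
    return counts

# Hypothetical split of the sample data by schema.
new_rows = [{'active': True}, {'active': False}, {'active': True}]
legacy_rows = [{'active': 1}, {'active': 0}]

new_first = count_by_status(new_rows + legacy_rows)
legacy_first = count_by_status(legacy_rows + new_rows)

print(new_first)     # {True: 3, False: 2}
print(legacy_first)  # {1: 3, 0: 2}

# The dicts even compare equal, so an equality check between two audit
# runs would not notice that the key types changed.
print(new_first == legacy_first)  # True
```

Whichever representation shows up first for a given value becomes the key that every later row is counted under.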
In Python, bool is a subclass of int. Not "like" an int — literally a subclass:
>>> isinstance(True, int)
True
>>> True == 1 and hash(True) == hash(1)
True
>>> False == 0 and hash(False) == hash(0)
True
>>> {1: 'a', True: 'b', 1.0: 'c'}
{1: 'c'}
Dict and set lookups use __hash__ and __eq__. Since hash(True) == hash(1) and True == 1, the two values are the same key as far as the container is concerned. The first key inserted is the one that is stored and displayed; every later equal key updates that same entry. Throw 1.0 into the mix and it collapses too: hash(1.0) == hash(1), because Python's built-in numeric types are designed so that values that compare equal also hash equal.
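A small sketch of the "first key wins, later writes update it" behaviour described above. The dict name flags and the string values are only illustrative.

```python
flags = {}
flags[True] = 'written with a bool key'
flags[1] = 'written with an int key'     # equal to True: updates the same entry
flags[1.0] = 'written with a float key'  # equal again: updates it once more

print(flags)       # {True: 'written with a float key'}
print(len(flags))  # 1

# The stored key object is still the first one inserted.
only_key = next(iter(flags))
print(type(only_key).__name__)  # bool

# All three values share one hash, which is what lets them meet in the table.
print(hash(True) == hash(1) == hash(1.0))  # True
```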
This trap surfaces anywhere a hash-based container meets mixed numeric types: deduplication (set([1, True]) has length 1), Counter tallies over values parsed from JSON (where true and 1 arrive as True and 1), and caches keyed on flag values.
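The same collapse can be seen in a set, a Counter, and a hand-rolled cache. This is a minimal sketch: the describe helper and its _cache dict are hypothetical and stand in for any function memoized on a flag value.

```python
from collections import Counter

values = [True, 1, 1.0, False, 0]

unique = set(values)
tally = Counter(values)
print(len(unique))  # 2
print(tally)        # Counter({True: 3, False: 2})

_cache = {}

def describe(flag):
    """Hypothetical cached helper: label a flag with its type name."""
    if flag not in _cache:
        _cache[flag] = f"{type(flag).__name__}:{flag}"
    return _cache[flag]

first = describe(True)
second = describe(1)   # hits the entry stored under True
print(first)   # bool:True
print(second)  # bool:True, not int:1
```

The cache case is the nastiest of the three, because the stale answer looks perfectly plausible.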
You need a key that distinguishes types as well as values. Pair the value with its type, or coerce to a canonical type before counting:
def count_by_status(records):
    counts = {}
    for r in records:
        v = r['active']
        key = (type(v).__name__, v)  # ('bool', True) ≠ ('int', 1)
        counts[key] = counts.get(key, 0) + 1
    return counts
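Run against the sample records, a type-tagged key gives the four buckets the audit was after. The variant below is a sketch that uses collections.Counter instead of a plain dict; the function name count_by_typed_status is an assumption made for the example.

```python
from collections import Counter

def count_by_typed_status(records):
    """Tally 'active' values, keeping bool and int rows in separate buckets."""
    return Counter((type(r['active']).__name__, r['active']) for r in records)

records = [
    {'id': 1, 'active': True},   # new schema
    {'id': 2, 'active': 1},      # legacy int
    {'id': 3, 'active': False},  # new schema
    {'id': 4, 'active': 0},      # legacy int
    {'id': 5, 'active': True},
]

counts = count_by_typed_status(records)
for (type_name, value), n in counts.items():
    print(type_name, value, n)
# bool True 2
# int 1 1
# bool False 1
# int 0 1

# How many rows still use the legacy representation?
legacy_total = sum(n for (type_name, _), n in counts.items() if type_name == 'int')
print(legacy_total)  # 2
```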
If your goal was to normalize legacy ints to booleans, be explicit — don't let the hash collision do it silently:
key = bool(r['active']) # 0 → False, 1 → True, then count
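One caveat with a bare bool() call is that it also accepts values the legacy schema never intended: bool(2) and bool('0') are both True. A stricter normalizer might look like the sketch below. The helper names normalize_active and count_normalized are hypothetical, and the rule "only bool, 0, or 1 is valid" is an assumption about the data.

```python
def normalize_active(value):
    """Map a bool or a legacy 0/1 int to a bool; reject anything else.

    Exact type checks are used on purpose: `value in (0, 1)` alone would
    also let 1.0 through, since it compares equal to 1.
    """
    if type(value) is bool:
        return value
    if type(value) is int and value in (0, 1):
        return bool(value)
    raise ValueError(f"unexpected 'active' value: {value!r}")

def count_normalized(records):
    """Count records by normalized status; keys are always real bools."""
    counts = {True: 0, False: 0}
    for r in records:
        counts[normalize_active(r['active'])] += 1
    return counts

records = [
    {'id': 1, 'active': True},
    {'id': 2, 'active': 1},
    {'id': 3, 'active': False},
    {'id': 4, 'active': 0},
    {'id': 5, 'active': True},
]
normalized = count_normalized(records)
print(normalized)  # {True: 3, False: 2}
```

The totals match the accidental result from the naive counter, but here the merge is a stated decision, and a stray 2 or None fails loudly instead of being counted.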
Two principles worth internalizing:
If a == b and hash(a) == hash(b), a dict or set treats them as the same thing, even across types you thought were unrelated.
True, 1, 1.0, 1+0j, Fraction(1, 1), and Decimal(1) all compare equal and hash the same. Decimal diverges only for values a float cannot represent exactly (Decimal('0.1') != 0.1), which is a float quirk, not a type distinction.
The bug is invisible in unit tests that use only one type per fixture. It detonates the first time real data mixes representations, which is exactly the migration you were trying to audit.
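The claim about the numeric types can be checked directly with the standard library. This is a quick sketch; the list name ones is arbitrary.

```python
from decimal import Decimal
from fractions import Fraction

ones = [True, 1, 1.0, 1 + 0j, Fraction(1, 1), Decimal(1)]

# Every pair compares equal, and every value has the same hash.
all_equal = all(a == b for a in ones for b in ones)
distinct_hashes = {hash(x) for x in ones}
print(all_equal)             # True
print(len(distinct_hashes))  # 1

# So a set keeps just one of the six objects.
print(len(set(ones)))  # 1

# Decimal only parts ways with float where the float is inexact.
print(Decimal('0.1') == 0.1)  # False
```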
True == 1 and they hash identically, so booleans and ints (and floats) silently collapse into the same dict and set keys — never use a raw numeric value as a key when its type carries meaning.
