Daily Debugging Puzzle: The Binary Search Bug That Lurks for Years

The Binary Search Bug That Lurks for Years

2026-04-26

Binary search is one of those algorithms every programmer thinks they can write correctly from memory. Most can't. Here's a Java implementation that searches a sorted array for a target value and returns its index, or -1 if not found.

public static int binarySearch(int[] arr, int target) {
    int low = 0;
    int high = arr.length; // search space: [low, high)

    while (low < high) {
        int mid = low + (high - low) / 2;

        if (arr[mid] == target) {
            return mid;
        } else if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return -1;
}

This version is actually correct — the half-open interval [low, high) is handled consistently. But now a colleague needs a variant: find the insertion point — the index where target should be inserted to keep the array sorted. They copy-paste and tweak:

public static int findInsertionPoint(int[] arr, int target) {
    int low = 0;
    int high = arr.length - 1; // switched to closed interval [low, high]

    while (low <= high) {
        int mid = low + (high - low) / 2;

        if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }
    return low;
}

Looks clean. Passes most tests. Ships to production. But call it with an empty array:

findInsertionPoint(new int[]{}, 5);

When arr.length is 0, high becomes -1. The while (low <= high) condition is immediately false (0 <= -1 is false), so we return low = 0. That's... actually fine for the empty case. But now try the edge where the target belongs after all elements:

findInsertionPoint(new int[]{1, 2, 3}, 4);

Walk through it: low=0, high=2. Mid=1, arr[1]=2 < 4, so low=2. Mid=2, arr[2]=3 < 4, so low=3. Now low > high, loop exits, return 3. Correct — insert at the end.

So where's the bug? It's not in this function at all. The bug is in the caller. The real danger of switching from half-open to closed intervals is what happens when someone uses this insertion point with the original search:

int pos = findInsertionPoint(sortedList, value);
// "If value is already there, pos points to it... right?"
if (pos < sortedList.length && sortedList[pos] == value) {
    // found it
}

This breaks when there are duplicate values. The findInsertionPoint function uses high = mid - 1 in the else branch, which pushes past equal elements. It finds the leftmost insertion point — but only by accident of the branch structure. If the array is [1, 3, 3, 3, 5] and you search for 3, you get index 1 (before the first 3). The caller's sortedList[pos] == value check works here by luck.

But change the else branch to high = mid — a common "fix" when someone notices the asymmetry — and the loop never terminates when low == high and arr[mid] >= target. The closed interval [low, high] with high = mid creates an infinite loop because mid equals low when they're adjacent.

The correct version uses a half-open interval consistently:

public static int findInsertionPoint(int[] arr, int target) {
    int low = 0;
    int high = arr.length; // half-open: [low, high)

    while (low < high) {
        int mid = low + (high - low) / 2;

        if (arr[mid] < target) {
            low = mid + 1;
        } else {
            high = mid; // safe: high is exclusive
        }
    }
    return low;
}

The half-open interval makes high = mid safe because high is always exclusive — the search space shrinks every iteration, guaranteed. No infinite loop. No off-by-one at array boundaries. No special-casing for empty arrays.

Key Takeaway: In binary search, mixing closed-interval initialization (high = length - 1) with half-open update logic (high = mid) creates infinite loops — always pick one interval convention and follow it religiously through initialization, loop condition, and update steps.

All newsletters