Booth's Algorithm: How Hardware Multiplies Signed Numbers Efficiently

2026-05-07

A naive multiplier built from shift-and-add hardware works fine for unsigned numbers, but signed two's complement breaks it. Sign-extending operands and handling negative multipliers with the same circuit is awkward — you end up with conditional logic that bloats the datapath. Booth's algorithm solves this by recoding the multiplier so signed multiplication "just works," and as a bonus it skips strings of consecutive 1s, reducing the number of partial products.

The trick: instead of looking at one multiplier bit at a time, Booth examines pairs of adjacent bits. For each bit position i, the hardware inspects bits (i, i-1) with an implicit zero appended to the right of the LSB:

The insight is that a run of 1s from bit j to bit k equals 2^(k+1) − 2^j. So instead of k−j+1 additions, you do one subtraction at j and one addition at k+1.

Concrete example: Multiply 7 × −3 in 4 bits. Multiplicand M = 0111, multiplier Q = 1101. Append a 0: 11010. Scan pairs from the right:

After arithmetic shifts, the result is −21 (11101011 in 8-bit two's complement). No special sign handling required — the sign falls out of the recoding.

Modified Booth (radix-4) is what real hardware uses. It examines three bits at a time and produces partial products from the set {−2M, −M, 0, +M, +2M}. This halves the number of partial products, since you process two multiplier bits per step. Multiplying by 2M is just a left shift; multiplying by −M is two's complement (invert and add 1). ARM Cortex multipliers, x86 integer multipliers, and most DSP MAC units use radix-4 or radix-8 Booth encoders feeding a Wallace tree of carry-save adders.

Rule of thumb: for an N-bit multiplier, standard shift-and-add needs N partial products; radix-4 Booth needs ⌈N/2⌉ + 1. For a 32×32 multiplier, that's 17 partial products instead of 32 — roughly half the adder tree depth and half the area in the reduction network.

See it in action: Check out Booth
#39;s Algorithm for Signed Multiplication by TutorialsPoint to see this theory applied.
Key Takeaway: Booth's algorithm recodes the multiplier to handle signed numbers natively and collapse runs of 1s into a single add/subtract pair, which is why every serious hardware multiplier uses radix-4 Booth encoding feeding a Wallace tree.