CORDIC: How Hardware Computes Sine and Cosine With Only Shifts and Adds

2026-06-06

Software engineers reach for sin() and cos() without thinking. But on an FPGA with no multiplier and no room for a large lookup ROM, how do you compute trig? One classic answer is CORDIC (COordinate Rotation DIgital Computer), developed by Jack Volder for the B-58 bomber's navigation computer and published in 1959. It computes sin, cos, and atan, and with later extensions to hyperbolic and linear modes also sqrt, log, and more, using only shifts, adds, and a small constant table. No multipliers required.

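The core insight: rotating a 2D vector by an angle θ is a multiplication by a rotation matrix. CORDIC decomposes θ into a sum of pre-chosen "micro-rotations" by the angles arctan(2⁻ⁱ) for i = 0, 1, 2, …, each applied either clockwise or counterclockwise. Factoring the cosine out of the rotation matrix leaves tan(arctan(2⁻ⁱ)) = 2⁻ⁱ as the only multiplier, so each micro-rotation looks like:

  x_{i+1} = x_i − d_i · y_i · 2⁻ⁱ
  y_{i+1} = y_i + d_i · x_i · 2⁻ⁱ
  z_{i+1} = z_i − d_i · arctan(2⁻ⁱ)

where z_i is the angle still left to rotate and d_i is just ±1: the sign of z_i, deciding whether to rotate clockwise or counterclockwise. The multiply by 2⁻ⁱ is a free wire-shift in hardware. The arctan(2⁻ⁱ) values live in a tiny ROM with one constant per iteration, so ~20 constants at most for typical word widths.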
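To make the shift-and-add claim concrete, here is a minimal Python sketch of the micro-rotation on plain integers. The fixed-point format (16 fractional bits), the iteration count, and the names `ATAN_TABLE`, `cordic_step`, and `cordic_rotate` are assumptions chosen for illustration, not taken from any particular hardware design. Python's `>>` on integers behaves like the arithmetic right shift a hardware datapath would use.

```python
import math

FRAC_BITS = 16   # assumed fixed-point format: real value = integer / 2**FRAC_BITS
N_ITER = 16      # assumed number of micro-rotations

# The "tiny ROM": arctan(2^-i) in fixed point, one constant per iteration.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * (1 << FRAC_BITS)) for i in range(N_ITER)]


def cordic_step(x, y, z, i):
    """Apply micro-rotation i using only shifts, adds, and one table read."""
    if z >= 0:  # d_i = +1: rotate counterclockwise
        return x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
    # d_i = -1: rotate clockwise
    return x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]


def cordic_rotate(x, y, z):
    """Run all N_ITER micro-rotations; the leftover angle z is driven toward zero."""
    for i in range(N_ITER):
        x, y, z = cordic_step(x, y, z, i)
    return x, y, z


theta = 0.5  # radians
x, y, z = cordic_rotate(1 << FRAC_BITS, 0, round(theta * (1 << FRAC_BITS)))
print("direction of result:", math.atan2(y, x), "target:", theta)
print("leftover angle in LSBs:", z)
```

The check here looks only at the direction of (x, y), because these steps do not preserve the vector's length.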

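After N iterations (i = 0 to N−1), the vector has rotated by θ to within about the size of the last micro-rotation, ±arctan(2⁻⁽ᴺ⁻¹⁾). This holds for θ inside the convergence range of roughly ±99.9° (the sum of all the arctan(2⁻ⁱ)); larger angles are first folded into that range using quadrant symmetry. But each micro-rotation also stretches the vector by 1/cos(arctan(2⁻ⁱ)) = √(1 + 2⁻²ⁱ). The product of all these gain factors converges to about 1.6468, and its reciprocal is the known constant K ≈ 0.6073. So you pre-scale your initial vector by K: start with (K, 0, θ), and after N iterations read out approximately (cos θ, sin θ, 0).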
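The scale constant is easy to check numerically. The sketch below multiplies the per-step stretch factors and takes K as the reciprocal of that accumulated gain. The function name `cordic_gain` is made up for this example.

```python
import math


def cordic_gain(n_iter):
    """Product of the per-step stretch factors 1/cos(arctan(2^-i)) = sqrt(1 + 2^-2i)."""
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


for n in (4, 8, 16, 24):
    g = cordic_gain(n)
    print(f"N={n:2d}  gain={g:.6f}  K=1/gain={1.0 / g:.6f}")
```

The product settles after only a few iterations, which is why K can be treated as a fixed constant once N is chosen.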

Concrete examples: Xilinx ships a CORDIC IP core for its FPGAs and Zynq SoCs. Built in a word-serial style, such a core computes 16-bit sin/cos in roughly 16 cycles by reusing one set of add/subtract units, shifters, and a small ROM. The HP-35 pocket calculator used CORDIC-style shift-and-add routines for its trig keys. So do many software-defined radios that need to mix down an IF carrier: the NCO (numerically controlled oscillator) can be just a phase accumulator feeding CORDIC.
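Here is one way the phase-accumulator-plus-CORDIC idea could look, as an integer-only Python sketch. Everything specific in it is an assumption for illustration: the 32-bit accumulator, the 16-bit output scale, and the names `K_FIXED`, `cordic_cos_sin`, and `nco`. The top two phase bits select the quadrant and the rest go to CORDIC, since the basic iteration only converges for angles up to roughly ±99.9°. Storing the arctan table in phase units (fractions of a turn) avoids any multiply by 2π.

```python
import math

PHASE_BITS = 32   # assumed accumulator width; 2**32 phase units = one full turn
AMP_BITS = 16     # assumed output scale: 1.0 is represented as 2**16
N_ITER = 16       # assumed iteration count

# Arctangent table stored directly in phase units, so no 2*pi multiply is needed.
ATAN_TABLE = [round(math.atan(2.0 ** -i) / (2 * math.pi) * (1 << PHASE_BITS))
              for i in range(N_ITER)]
# Pre-scaled start value: 1 / (product of per-step gains), in output units.
K_FIXED = round((1 << AMP_BITS)
                / math.prod(math.sqrt(1 + 4.0 ** -i) for i in range(N_ITER)))


def cordic_cos_sin(z):
    """Rotation-mode CORDIC for a first-quadrant angle z given in phase units."""
    x, y = K_FIXED, 0
    for i in range(N_ITER):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return x, y


def nco(freq_word, n_samples):
    """Yield (cos, sin) integer samples; freq_word is the phase step per sample."""
    phase = 0
    for _ in range(n_samples):
        quadrant = phase >> (PHASE_BITS - 2)  # top two bits of the phase
        c, s = cordic_cos_sin(phase & ((1 << (PHASE_BITS - 2)) - 1))
        # Undo the quadrant fold with sign flips and swaps only.
        yield [(c, s), (-s, c), (-c, -s), (s, -c)][quadrant]
        phase = (phase + freq_word) & ((1 << PHASE_BITS) - 1)  # wraps like a register


samples = list(nco(1 << 29, 8))  # phase step of 1/8 turn: one sample every 45 degrees
for k, (c, s) in enumerate(samples):
    print(k * 45, c, s)
```

The output frequency is set entirely by `freq_word`: the accumulator wraps naturally at a full turn, so the frequency is the sample rate times `freq_word / 2**PHASE_BITS`.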

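Rule of thumb: CORDIC gives you ~1 bit of precision per iteration. Want 16-bit sin/cos? Use about 16 iterations, with a few extra guard bits in the datapath to absorb rounding. Want 24-bit? Use about 24. Hardware cost is one set of shift-add operations (the x, y, and z updates) reused for N cycles if serial, or N pipelined stages if you need a result every cycle.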
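The rule of thumb can be checked with a floating-point model, where 2.0**-i stands in for the hardware shift. This sketch measures only the error from stopping after N micro-rotations. A real fixed-point datapath adds rounding error on top, which this model ignores. The function names and the angle sweep are assumptions for illustration.

```python
import math


def cordic_sin_cos(theta, n_iter):
    """Floating-point model of rotation-mode CORDIC; returns (sin, cos)."""
    gain = math.prod(math.sqrt(1.0 + 4.0 ** -i) for i in range(n_iter))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(n_iter):
        d = 1.0 if z >= 0 else -1.0
        x, y, z = (x - d * y * 2.0 ** -i,
                   y + d * x * 2.0 ** -i,
                   z - d * math.atan(2.0 ** -i))
    return y, x


def worst_case_bits(n_iter, n_angles=1001):
    """Approximate bits of accuracy: -log2 of the largest error over [-pi/2, pi/2]."""
    worst = 0.0
    for k in range(n_angles):
        theta = -math.pi / 2 + math.pi * k / (n_angles - 1)
        s, c = cordic_sin_cos(theta, n_iter)
        worst = max(worst, abs(s - math.sin(theta)), abs(c - math.cos(theta)))
    return -math.log2(worst)


for n in (8, 16, 24):
    print(f"N={n:2d}: about {worst_case_bits(n):.1f} bits")
```

Each extra iteration halves the largest possible leftover angle, which is where the one-bit-per-iteration figure comes from.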

The trade-off vs. a lookup table: a full-cycle sin table with 16-bit addresses and 16-bit samples needs 64K entries × 16 bits = 128 KB of ROM. CORDIC needs ~20 constants and a few adders. On an FPGA where block RAM is scarce but logic cells are cheap, CORDIC wins. On an ASIC where ROM is dense, the lookup table often wins.
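The storage arithmetic is short enough to spell out. The table dimensions below come from the text. The 16-iteration CORDIC and the quarter-wave variant, a common table optimization that stores one quadrant and rebuilds the rest by symmetry, are assumptions added for comparison.

```python
ADDR_BITS = 16   # table address width (the text's 64K entries)
DATA_BITS = 16   # width of each stored sample
N_ITER = 16      # assumed CORDIC iteration count, one arctan constant each

full_lut_bytes = (1 << ADDR_BITS) * DATA_BITS // 8
quarter_lut_bytes = full_lut_bytes // 4   # assumption: only one quadrant is stored
cordic_rom_bytes = N_ITER * DATA_BITS // 8

print(full_lut_bytes // 1024, "KiB for the full-cycle table")
print(quarter_lut_bytes // 1024, "KiB for a quarter-wave table")
print(cordic_rom_bytes, "bytes for the CORDIC arctan table")
```

Even with the quarter-wave trick the table is about a thousand times larger than the CORDIC constants. What CORDIC pays instead is latency and adder logic.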

See it in action: Check out Function Generator FPGA by RobotVision Lab to see this theory applied.

Key Takeaway: CORDIC trades a multiplier for N shift-add cycles by decomposing an angle into a signed sum of arctan(2⁻ⁱ) micro-rotations, giving you roughly 1 bit of trig precision per iteration with very little silicon.
