2026-06-05
Every message broker worth using guarantees at-least-once delivery. That means duplicates. A consumer crashes after processing but before acking. A network blip triggers a redelivery. A producer retries on a flaky connection. If your handler isn't built for it, you'll double-charge a customer, double-ship an order, or double-credit an account.
The Idempotent Consumer Pattern makes processing the same message N times produce the same result as processing it once. The recipe is boring on purpose: every message carries a stable unique ID, the consumer records which IDs it has processed, and it checks that record before doing the work.
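The recipe above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Message` and `IdempotentConsumer` names and the `message_id` field are assumptions made for the example, and the in-memory set stands in for a durable store.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Message:
    message_id: str  # stable, producer-assigned ID (hypothetical field name)
    payload: dict


@dataclass
class IdempotentConsumer:
    handler: Callable[[dict], None]
    # In-memory stand-in for a durable record of processed IDs.
    processed_ids: set = field(default_factory=set)

    def consume(self, message: Message) -> bool:
        """Run the handler at most once per message_id.

        Returns True if the work was done, False if this was a duplicate.
        """
        if message.message_id in self.processed_ids:
            return False  # duplicate delivery: skip the work, still safe to ack
        self.handler(message.payload)
        self.processed_ids.add(message.message_id)
        return True


shipped = []
consumer = IdempotentConsumer(handler=lambda p: shipped.append(p["order_id"]))
msg = Message(message_id="evt-1", payload={"order_id": 42})

first = consumer.consume(msg)   # does the work
second = consumer.consume(msg)  # redelivery of the same message: ignored
```

The set is lost on restart, so a real consumer would keep the record somewhere that survives crashes, ideally in the same database as the business data.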
The three flavors:
UPDATE orders SET status='shipped' WHERE id=X is naturally idempotent. UPDATE balance SET amount = amount - 10 is not.
Concrete example. A payments service consumes a charge_requested event from Kafka. Naive handler: call Stripe, write to payments, ack. If the consumer crashes after Stripe but before ack, Kafka redelivers and you charge the customer twice. Fix it with a dedup table:
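INSERT INTO processed_events (event_id) VALUES ($1) -- fails if duplicate
Note the layered defense. If the dedup row commits but the broker ack fails, the redelivered event hits the duplicate-key error and is skipped. If the consumer crashes after calling Stripe but before the dedup row commits, Stripe's idempotency key prevents a second charge on redelivery.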
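A sketch of how such a handler might look, using Python's built-in sqlite3 so it runs as-is. Several things here are assumptions for illustration: `FakeGateway` is a hypothetical stand-in for a payment API that honors idempotency keys (it is not Stripe's client), the `payments` columns are invented, and the event ID is reused as the idempotency key.

```python
import sqlite3


class FakeGateway:
    """Hypothetical stand-in for a payment API that honors idempotency keys."""

    def __init__(self):
        self.charges = {}  # idempotency_key -> amount_cents

    def charge(self, amount_cents: int, idempotency_key: str) -> str:
        # A repeated key returns the original charge instead of creating a new one.
        self.charges.setdefault(idempotency_key, amount_cents)
        return idempotency_key


def handle_charge_requested(conn, gateway, event_id, customer_id, amount_cents) -> bool:
    """Return True if the event was processed, False if it was a duplicate."""
    try:
        with conn:  # one transaction: commits on success, rolls back on exception
            # A primary-key violation here means this event was already processed.
            conn.execute(
                "INSERT INTO processed_events (event_id) VALUES (?)", (event_id,)
            )
            charge_id = gateway.charge(amount_cents, idempotency_key=event_id)
            conn.execute(
                "INSERT INTO payments (event_id, customer_id, amount_cents, charge_id)"
                " VALUES (?, ?, ?, ?)",
                (event_id, customer_id, amount_cents, charge_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # duplicate: nothing to do, the caller can ack


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE payments"
    " (event_id TEXT, customer_id TEXT, amount_cents INTEGER, charge_id TEXT)"
)
gateway = FakeGateway()

# Simulate the broker delivering the same event three times.
results = [
    handle_charge_requested(conn, gateway, "evt-123", "cus-9", 1000)
    for _ in range(3)
]
payment_rows = conn.execute("SELECT COUNT(*) FROM payments").fetchone()[0]
```

The dedup insert and the payments write share one transaction, so they commit or roll back together. The gateway call cannot be rolled back, which is why it carries its own key.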
Rule of thumb for the dedup table. Retention should cover your broker's maximum redelivery window plus a generous safety margin — typically 7 days for Kafka, 30 days for SQS-backed systems with DLQs. Index on event_id, partition by day, and prune the tail with a scheduled job. Storage cost: ~50 bytes per row × peak throughput × retention. At 1k msg/sec for 7 days, that's roughly 30 GB — cheap insurance.
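The storage arithmetic and the pruning job can be sketched as follows. The `processed_at` column (Unix seconds) is an assumption added for the example, and sqlite3 is used only so the snippet runs; it does not model day-based partitioning.

```python
import sqlite3

SECONDS_PER_DAY = 86_400


def dedup_table_bytes(bytes_per_row: int, msgs_per_sec: int, retention_days: int) -> int:
    """Rough size estimate: row size x throughput x retention window."""
    return bytes_per_row * msgs_per_sec * retention_days * SECONDS_PER_DAY


def prune_processed_events(conn, retention_days: int, now_epoch: int) -> int:
    """Delete dedup rows older than the retention window; return rows deleted."""
    cutoff = now_epoch - retention_days * SECONDS_PER_DAY
    with conn:
        cur = conn.execute(
            "DELETE FROM processed_events WHERE processed_at < ?", (cutoff,)
        )
    return cur.rowcount


# 50 bytes/row at 1k msg/sec for 7 days.
estimate_bytes = dedup_table_bytes(50, 1_000, 7)
estimate_gb = estimate_bytes / 1e9  # about 30 GB

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processed_events"
    " (event_id TEXT PRIMARY KEY, processed_at INTEGER NOT NULL)"
)
now = 1_780_000_000  # arbitrary fixed "current time" for the example
with conn:
    conn.executemany(
        "INSERT INTO processed_events VALUES (?, ?)",
        [("old", now - 8 * SECONDS_PER_DAY), ("recent", now - 1 * SECONDS_PER_DAY)],
    )

deleted = prune_processed_events(conn, retention_days=7, now_epoch=now)
remaining = [row[0] for row in conn.execute("SELECT event_id FROM processed_events")]
```

On a database with native partitioning, dropping whole day partitions is usually cheaper than a row-by-row DELETE.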
Common traps: using a non-stable ID (like a hash of payload that changes when a producer adds a field), checking the dedup table outside the business transaction (race condition), and forgetting that side effects to external systems aren't covered by your local transaction — that's where you need the downstream service's own idempotency key.
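The first trap is easy to reproduce. In this sketch the payload fields and the `payload_hash_id` helper are hypothetical. It compares a payload-derived ID with a producer-assigned `event_id` when the same logical event is re-sent after the producer adds a field.

```python
import hashlib
import json


def payload_hash_id(payload: dict) -> str:
    """Derive an ID from the payload contents (the fragile approach)."""
    canonical = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()


v1 = {"event_id": "evt-123", "customer": "cus-9", "amount_cents": 1000}
# Same logical event, re-sent after the producer starts adding a field.
v2 = {**v1, "currency": "usd"}

# The payload hash changes, so the dedup check would treat v2 as a new event.
hash_ids_match = payload_hash_id(v1) == payload_hash_id(v2)

# The producer-assigned ID is unchanged, so the dedup check still catches it.
explicit_ids_match = v1["event_id"] == v2["event_id"]
```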
