Daily Software Engineering: The CAP Theorem: You Can Only Pick Two, and Network Partitions Aren't Optional

2026-06-07

The CAP theorem says a distributed data store can guarantee only two of three properties: Consistency (every read sees the latest write), Availability (every request to a non-failing node gets a non-error response), and Partition tolerance (the system keeps working when network links drop messages). The catch most engineers miss: partitions will happen. Cables get cut, switches die, cloud zones isolate. So the real choice is between CP and AP, never CA.

When a partition occurs, your system has two options:

CP (Consistency + Partition tolerance): Refuse writes (or reads) on the minority side until the partition heals. You return errors but never serve stale data. Examples: etcd, ZooKeeper, HBase, MongoDB with majority writes.
AP (Availability + Partition tolerance): Both sides keep serving traffic, accepting that they'll diverge. You reconcile later via vector clocks, CRDTs, or last-write-wins. Examples: Cassandra, DynamoDB (tunable), Riak.
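To make the two options concrete, here is a minimal sketch of how a single replica might behave on the minority side of a partition. The Node class, its fields, and the Unavailable exception are hypothetical names invented for illustration; real systems such as etcd or Cassandra are far more involved, and this only models the accept-or-refuse decision.

```python
from dataclasses import dataclass, field


class Unavailable(Exception):
    """Raised by a CP node that cannot reach a majority of the cluster."""


@dataclass
class Node:
    mode: str             # "CP" or "AP" (illustrative labels, not a real API)
    cluster_size: int     # total number of replicas in the cluster
    reachable_peers: int  # peers this node can currently contact
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # writes to reconcile later

    def has_majority(self) -> bool:
        # Count this node plus every peer it can still reach.
        return (self.reachable_peers + 1) > self.cluster_size // 2

    def write(self, key, value) -> str:
        if self.mode == "CP" and not self.has_majority():
            # CP: return an error rather than risk divergent state.
            raise Unavailable("minority side of partition: write refused")
        self.data[key] = value
        if not self.has_majority():
            # AP: accept locally and remember the write for reconciliation.
            self.pending.append((key, value))
        return "ok"
```

With a five-node cluster and only one reachable peer, Node("CP", 5, 1).write("k", "v") raises Unavailable, while Node("AP", 5, 1).write("k", "v") returns "ok" and queues the write in pending.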

Real-world example: You're building a shopping cart. A user in us-east adds an item; the partition splits us-east from us-west right as replication kicks in. CP choice (etcd-style): the write fails, the user sees "try again." AP choice (DynamoDB-style): the write succeeds locally, replicates after the partition heals, and if the user added items from both sides, you merge the carts (Amazon famously chose "add" semantics for shopping carts — you'd rather over-include than lose a sale). Now compare to a bank ledger: never AP. A double-spend is worse than an error message.
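The cart merge can be sketched as follows. This is an assumption-laden illustration, not Amazon's actual implementation: carts are modeled as plain dicts of item to quantity, merge_carts is a hypothetical helper, and "add" semantics is approximated by keeping every item seen on either side with the larger quantity.

```python
def merge_carts(east: dict[str, int], west: dict[str, int]) -> dict[str, int]:
    """Merge two diverged carts so that no added item is lost."""
    merged = dict(east)
    for item, qty in west.items():
        # Keep the item if either side has it; prefer the larger quantity.
        merged[item] = max(merged.get(item, 0), qty)
    return merged


east_cart = {"book": 1, "pen": 2}
west_cart = {"pen": 1, "mug": 1}
print(merge_carts(east_cart, west_cart))
# {'book': 1, 'pen': 2, 'mug': 1}
```

The trade-off of this kind of merge is that an item removed on one side can reappear after reconciliation, which is the "over-include" behavior described above.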

Rule of thumb: If the cost of stale or conflicting data exceeds the cost of an error, pick CP. If the cost of an error exceeds the cost of reconciliation, pick AP. Money, inventory counts, and unique-constraint enforcement → CP. Likes, view counts, session data, shopping carts → AP.

Two common misunderstandings to avoid:

CAP is only about behavior during a partition. In normal operation, you can have strong consistency and high availability simultaneously.
"Eventually consistent" isn't a free lunch — you still need conflict resolution logic. CRDTs, vector clocks, and read repair aren't decoration; they're load-bearing.
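As one example of that load-bearing logic, here is a minimal sketch of vector clock comparison, the check that tells you whether two versions are ordered or genuinely conflicting. The compare function and the dict-of-counters representation are illustrative assumptions, not the API of any particular database.

```python
def compare(a: dict[str, int], b: dict[str, int]) -> str:
    """Compare two vector clocks (node name -> counter).

    Returns "before", "after", "equal", or "concurrent".
    A missing node is treated as a counter of 0.
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b; b can safely replace a
    if b_le_a:
        return "after"       # b happened before a; a can safely replace b
    return "concurrent"      # neither dominates: a real conflict to resolve


print(compare({"east": 1}, {"east": 2}))                        # before
print(compare({"east": 2, "west": 1}, {"east": 1, "west": 2}))  # concurrent
```

Only the "concurrent" case needs application-level resolution (a merge, a CRDT, or last-write-wins); the ordered cases can be resolved automatically.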

PACELC extends CAP usefully: during a Partition, choose A or C; Else, during normal operation, choose between Latency and Consistency. Most "AP" systems are really PA/EL — they trade consistency for latency even when no partition exists, because synchronous replication is slow.
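A rough back-of-the-envelope model shows why the "Else" trade-off exists. It assumes replicas are contacted in parallel, that a synchronous write waits for every replica to acknowledge, and that an asynchronous write returns after the local write only. The function name and the millisecond figures are made up for illustration.

```python
def write_latency_ms(local_ms: float,
                     replica_rtts_ms: list[float],
                     synchronous: bool) -> float:
    """Estimate client-visible write latency under a simplified model."""
    if synchronous and replica_rtts_ms:
        # Wait for the slowest replica to acknowledge.
        return local_ms + max(replica_rtts_ms)
    # Asynchronous: acknowledge after the local write, replicate later.
    return local_ms


rtts = [2.0, 70.0]  # hypothetical: one same-region replica, one cross-region
print(write_latency_ms(1.0, rtts, synchronous=True))   # 71.0
print(write_latency_ms(1.0, rtts, synchronous=False))  # 1.0
```

Under these assumptions the synchronous path is dominated by the farthest replica, which is why a PA/EL system gives up some consistency even when the network is healthy.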

Key Takeaway: Partitions are inevitable, so CAP is really a choice between failing requests (CP) and accepting divergence you'll reconcile later (AP) — pick based on whether stale data or errors hurt more.
