The Session Affinity vs Sticky Session Distinction: Why Your "Stateless" Service Isn't

2026-06-05

You've probably used the terms session affinity and sticky sessions interchangeably. Most engineers do. But there's a real distinction that matters when you're debugging why your "stateless" service mysteriously breaks under load.

Sticky sessions route a client to the same instance for the duration of a session — usually via a cookie or session ID. The load balancer remembers: "User 42 goes to pod-3." If pod-3 dies, the session dies with it.

Session affinity is broader: routing decisions based on any client attribute — source IP, JWT claim, tenant ID, geographic region. The goal isn't preserving in-memory state; it's optimizing for locality. Same tenant hits the same shard so its cache is warm. Same region hits the same edge so latency stays low.

The confusion costs you real money. Teams enable "sticky sessions" thinking they're getting cache locality, then discover they've also tied themselves to instance lifetime. Now rolling deploys log everyone out, autoscaling is broken because traffic doesn't rebalance, and one hot tenant pins a single pod at 100% CPU while three others sit idle.

Real-world example: A SaaS company had "sticky sessions" enabled on their ALB via cookie-based stickiness, 24-hour duration. They added a Redis session store to fix logout-on-deploy. Problem solved — except they kept stickiness on "for performance." During a Black Friday scale-up, new pods stayed empty for hours because existing cookies kept routing to old pods. The new capacity was invisible. They removed cookie stickiness and switched to consistent hashing on tenant ID at the application layer. Cache hit rate stayed at 94%, but autoscaling actually worked.

Rule of thumb: If losing the routing target would log the user out or lose data, you have a statefulness problem, not a routing problem. Fix the state first (externalize to Redis, DB, or client), then decide if affinity buys you anything. A useful test: can you kill any single pod mid-request and have the client retry succeed? If no, stickiness is masking a bug.

Three flavors worth knowing:

The last option is almost always what people actually wanted when they reached for the first two.

Key Takeaway: Sticky sessions preserve instance identity; session affinity preserves locality — confusing them turns a performance optimization into an availability bug.

All newsletters