The performance bug hiding in our Cloud Run billing settings

2026-05-05

Link: https://oblique.security/blog/the-performance-bug-hiding-in-our-billing-settings/

HN Discussion: 2 points, 1 comments

Cloud Run has two billing modes — request-based (the default, where you pay only while a request is being handled) and instance-based (you pay for the full lifetime of each instance). Most engineers pick the default and never look at it again. This post from Oblique Security is a reminder that the choice has performance implications that go far beyond the bill.

The crux: in request-based billing, Cloud Run aggressively throttles CPU between requests. Your container is technically still running, but it's been pushed down to a tiny slice of CPU until the next request arrives. That's fine for stateless handlers, but it quietly breaks anything you assumed was happening "in the background":

The "performance bug" framing is the right one. It looks like a billing toggle, but it's really a runtime model change — the platform is telling your process when it's allowed to do work, and that contract is invisible from inside the container. You only notice when p99 latency on the first request after idle goes weird, or when traces show inexplicable gaps.

What makes the post worth reading is the diagnostic angle: how do you even see this? CPU throttling on Cloud Run doesn't show up the way you'd expect in metrics, and the symptoms (cold-start-like latency on warm instances, sporadic connection failures, missing spans) are easy to misattribute to network flakiness or downstream services. The fix — switching to instance-based billing for services that do real background work — is trivial once you've identified the cause, but identifying it is the hard part.

Anyone running serverless containers should know which billing mode they're on and why. The default isn't always the right answer for services with persistent connections, streaming workloads, or batch telemetry.

Why it deserves more upvotes: A "billing setting" that silently throttles your CPU between requests is exactly the kind of platform footgun every Cloud Run user should know about before debugging mystery latency.

All newsletters