2026-04-24
Imagine two people each carrying half of a key. Neither one can open the vault alone, so a guard checking them individually waves them both through. But once they meet up inside, the vault is wide open. That's essentially the class of software vulnerability this paper tackles.
Most security scanning tools — the automated sentries that watch over codebases — work by examining each code change (commit) in isolation. If a single commit doesn't introduce an obvious flaw, it gets a clean bill of health. But what if a vulnerability only emerges from the combination of two or more commits, each harmless on its own? This paper calls these cross-commit vulnerabilities, and argues they represent a serious blind spot in current security tooling.
The author curated a benchmark of 15 real-world Python vulnerabilities, all with official CVE identifiers (meaning each has been publicly catalogued in the CVE list).
To validate the blind spot, the author ran two popular Python security scanners, Semgrep and Bandit, in two modes: scanning each commit individually (the normal workflow), and scanning the cumulative codebase after all contributing commits landed. The per-commit scans missed the vulnerabilities, confirming that these tools, when run commit by commit, do not catch threats that build up incrementally.
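To make the two scan modes concrete, here is a minimal toy sketch in Python. It does not call Semgrep or Bandit and does not reproduce the paper's results. The rule `toy_scan`, the two-commit chain, and the assumption that "per-commit" means scanning only the text each commit adds are all invented for illustration.

```python
from typing import List

def toy_scan(text: str) -> bool:
    """Toy stand-in for a scanner rule (NOT Semgrep or Bandit).

    Reports a finding only when user-controlled input and a shell sink
    both appear in the text being scanned.
    """
    return "request.args" in text and "os.system(" in text

# Hypothetical commit chain: each string is the code one commit adds.
commits: List[str] = [
    # Commit 1 adds a helper that runs a command.
    "def run(cmd):\n    os.system(cmd)\n",
    # Commit 2 adds a handler that forwards user input to the helper.
    "def handler(request):\n    run(request.args['cmd'])\n",
]

def scan_per_commit(chain: List[str]) -> List[bool]:
    # Mode 1: scan each commit's added text in isolation.
    return [toy_scan(added) for added in chain]

def scan_cumulative(chain: List[str]) -> bool:
    # Mode 2: scan the combined code after every commit has landed.
    return toy_scan("".join(chain))

per_commit_findings = scan_per_commit(commits)
cumulative_finding = scan_cumulative(commits)
print(per_commit_findings, cumulative_finding)
```

In this toy setup neither commit triggers the rule alone, because each contains only half of the pattern, while the combined code contains both halves.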
Each CVE in the benchmark is annotated with the full chain of contributing commits and a structured explanation of why each commit dodges per-commit detection. This makes the dataset useful not just as a test suite, but as a teaching tool for understanding how vulnerabilities can be smuggled in piecemeal — whether accidentally through normal development, or deliberately by a sophisticated attacker.
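One way to picture such an annotation is as a small record per CVE. The sketch below is purely hypothetical: the class names, field names, and placeholder values are assumptions made for illustration and are not the benchmark's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ContributingCommit:
    # All field names here are hypothetical, not taken from the benchmark.
    sha: str             # placeholder commit identifier
    summary: str         # what the commit changes
    evasion_reason: str  # why a per-commit scan sees nothing wrong

@dataclass
class BenchmarkEntry:
    cve_id: str
    commits: List[ContributingCommit] = field(default_factory=list)

    def is_cross_commit(self) -> bool:
        # A cross-commit vulnerability needs at least two contributing commits.
        return len(self.commits) >= 2

# Placeholder values only; this is not a real entry from the dataset.
entry = BenchmarkEntry(
    cve_id="CVE-XXXX-XXXXX",
    commits=[
        ContributingCommit(
            sha="<commit-1>",
            summary="adds a helper function",
            evasion_reason="no untrusted input reaches it yet",
        ),
        ContributingCommit(
            sha="<commit-2>",
            summary="wires user input into the helper",
            evasion_reason="the risky call is outside the diff",
        ),
    ],
)
print(entry.is_cross_commit())
```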
The key insight is deceptively simple: security is a property of the whole system, not of individual changes. A function added in January might be perfectly safe until a configuration change in March removes the guard rail that kept it harmless. No per-commit scanner would flag either change, yet together they create an exploit path. This has implications for supply-chain security, where malicious contributors could theoretically spread an attack across innocent-looking pull requests.
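The January and March scenario can be sketched in a few lines of Python. Everything here is hypothetical: the `resolve` helper, the `validate_paths` setting, and the `/srv/files` directory are invented to illustrate the idea and do not come from any CVE in the benchmark.

```python
import posixpath

BASE_DIR = "/srv/files"  # hypothetical directory the service may read from

# "January" commit (hypothetical): a file-lookup helper that stays safe as
# long as the configuration keeps path validation switched on.
CONFIG_JANUARY = {"validate_paths": True}

def resolve(user_path: str, config: dict) -> str:
    """Map a user-supplied relative path to a path under BASE_DIR."""
    full = posixpath.normpath(posixpath.join(BASE_DIR, user_path))
    if config["validate_paths"] and not full.startswith(BASE_DIR + "/"):
        raise ValueError("path escapes base directory")
    return full

# "March" commit (hypothetical): a one-line configuration change that looks
# harmless in isolation but removes the guard the helper relied on.
CONFIG_MARCH = {"validate_paths": False}

attack = "../../etc/passwd"

try:
    resolve(attack, CONFIG_JANUARY)
    blocked_in_january = False
except ValueError:
    blocked_in_january = True

escaped_in_march = resolve(attack, CONFIG_MARCH)
print(blocked_in_january, escaped_in_march)
```

Neither change looks dangerous in its own diff: the first ships a guarded helper, and the second only flips a setting. The path traversal exists only once both are in the codebase.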
