2026-05-16
Imagine you hire a personal assistant who downloads helpful "skills" from an app store — one teaches them to book flights, another to summarize emails, another to manage your calendar. Now imagine an attacker wants to steal your data, but the app store scans every skill for malicious code before publishing. How do you sneak something past the scanner? You write a skill that contains no malicious code at all.
That is the unsettling premise of this paper. The authors study LLM agents — autonomous AI systems like those built on top of Claude or GPT — that pull "skills" (small plugins or tools) from open marketplaces. Current security audits work the way antivirus does: scan the skill's code and instructions for known bad patterns, suspicious payloads, or injection attempts. If nothing harmful is found, the skill is approved.
The researchers show this defense is fundamentally inadequate. They demonstrate payload-less attacks: each individual skill is benign on its own, but when an agent uses several skills together in the course of a normal task, their combined behavior produces an attack. Think of it like a heist movie — no single crew member is doing anything illegal (one rents a van, one buys gloves, one studies a blueprint), but their coordinated activity adds up to a bank robbery.
Concretely, the attacks work by:
The key insight: security scanning treats skills as isolated artifacts, but agents compose them dynamically. The danger lives in the composition, not the components. This mirrors classic supply-chain attacks (think SolarWinds or compromised npm packages), but with a twist — there's no malicious code to find in the first place.
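To make the gap between scanning components and scanning compositions concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the three skill names, the `reads`/`writes` labels, and the pattern list are hypothetical and are not taken from the paper. The idea is only that a content scanner can approve every skill while a chain of those same skills still carries private data to the network.

```python
import re

# Hypothetical "known bad" patterns for a toy content scanner.
BAD_PATTERNS = [r"ignore previous instructions", r"curl .*\|\s*sh", r"base64 -d"]

# Hypothetical skills. Each one declares its instructions plus the data
# labels it reads and writes. None of this comes from the paper.
SKILLS = {
    "notes_reader": {
        "text": "Read the user's notes and return them.",
        "reads": {"private_notes"},
        "writes": {"context"},
    },
    "summarizer": {
        "text": "Summarize whatever is in context.",
        "reads": {"context"},
        "writes": {"summary"},
    },
    "link_previewer": {
        "text": "Fetch a preview for a URL built from the summary.",
        "reads": {"summary"},
        "writes": {"network"},
    },
}


def scan_skill(skill):
    """Isolated scan: True if the skill text matches no known-bad pattern."""
    return not any(re.search(p, skill["text"], re.IGNORECASE) for p in BAD_PATTERNS)


def composition_leaks(chain, source="private_notes", sink="network"):
    """Follow data labels through a chain of skills, in order.

    Returns True if data that started at `source` can reach `sink`.
    """
    tainted = {source}
    for name in chain:
        skill = SKILLS[name]
        if skill["reads"] & tainted:
            tainted |= skill["writes"]
    return sink in tainted


if __name__ == "__main__":
    print(all(scan_skill(s) for s in SKILLS.values()))                         # True
    print(composition_leaks(["notes_reader", "summarizer", "link_previewer"]))  # True
    print(composition_leaks(["summarizer", "link_previewer"]))                  # False
```

Each skill passes the per-skill scan, yet the three-skill chain moves `private_notes` to `network`. Drop the first skill and the same two remaining skills are harmless, which is the sense in which the risk sits in the composition.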
The authors argue defenses need to shift from scanning skill content to monitoring agent behavior at runtime: watching how skills combine, what data flows between them, and whether the agent's overall trajectory matches its stated task.
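As a rough illustration of what runtime monitoring could look like, the sketch below watches each skill call as it happens. It is a minimal sketch under stated assumptions, not the authors' proposed defense: the `RuntimeMonitor` class, the data labels, and the idea that someone supplies an allow-list of skills for the stated task are all hypothetical. It checks two things from the paragraph above: whether a call fits the task, and whether sensitive data flows toward an external sink.

```python
from dataclasses import dataclass, field


@dataclass
class RuntimeMonitor:
    """Toy monitor; all fields are assumed to be provided by the operator."""
    allowed_skills: set  # skills the stated task is expected to need
    sensitive: set       # data labels that should not leave the agent
    sinks: set           # labels that represent leaving the agent
    tainted: set = field(default_factory=set)
    alerts: list = field(default_factory=list)

    def observe(self, skill, reads, writes):
        """Record one skill call and append an alert for anything suspicious."""
        reads, writes = set(reads), set(writes)

        # Trajectory check: is this skill part of the stated task?
        if skill not in self.allowed_skills:
            self.alerts.append(f"off-task skill: {skill}")

        # Data-flow check: propagate taint from sensitive or tainted inputs.
        if reads & (self.sensitive | self.tainted):
            self.tainted |= writes
            leaked = writes & self.sinks
            if leaked:
                self.alerts.append(f"sensitive flow via {skill} -> {sorted(leaked)}")


if __name__ == "__main__":
    monitor = RuntimeMonitor(
        allowed_skills={"notes_reader", "summarizer"},
        sensitive={"private_notes"},
        sinks={"network"},
    )
    monitor.observe("notes_reader", ["private_notes"], ["context"])
    monitor.observe("summarizer", ["context"], ["summary"])
    monitor.observe("link_previewer", ["summary"], ["network"])
    for alert in monitor.alerts:
        print(alert)
```

The first two calls raise nothing. The third raises two alerts, one because the skill is outside the task's allow-list and one because tainted data reaches the network. A real monitor would need far richer signals than hand-written labels, so treat this as a picture of the shape of the check rather than a workable defense.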
