ArXiv Paper Digest: Exploiting LLM Agent Supply Chains via Payload-less Skills


2026-05-16

Authors: Xinyu Liu, Yukai Zhao, Xing Hu, Xin Xia

Imagine you hire a personal assistant who downloads helpful "skills" from an app store: one teaches them to book flights, another to summarize emails, another to manage your calendar. Now imagine an attacker wants to steal your data, but the app store scans every skill for malicious code before publishing. How does the attacker sneak something past the scanner? By writing a skill that contains no malicious code at all.

That is the unsettling premise of this paper. The authors study LLM agents (autonomous AI systems like those built on top of Claude or GPT) that pull "skills" (small plugins or tools) from open marketplaces. Current security audits work the way antivirus software does: they scan the skill's code and instructions for known bad patterns, suspicious payloads, or injection attempts. If nothing harmful is found, the skill is approved.

The researchers show this defense is fundamentally inadequate. They demonstrate payload-less attacks: each individual skill is benign on its own, but when an agent uses several skills together in the course of a normal task, their combined behavior produces an attack. Think of it like a heist movie — no single crew member is doing anything illegal (one rents a van, one buys gloves, one studies a blueprint), but their coordinated activity adds up to a bank robbery.

Concretely, the attacks work by:

Splitting malicious intent across skills: Skill A asks the agent to read a file, and Skill B sends data to a URL. Neither is suspicious alone, but chained together they exfiltrate data.
Exploiting the agent's reasoning: a skill's description nudges the LLM toward decisions that lead to harmful outcomes, without ever explicitly instructing it to do anything harmful.
Living off the land: the attacks use the agent's legitimate capabilities (file access, network calls, other approved tools) rather than smuggling in new ones.
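
To make the first pattern concrete, here is a minimal sketch in Python. It is an illustration only, not the authors' tooling: the Skill class, the capability labels, the skill names, and the scanning rule are all hypothetical. It shows a toy per-skill check approving two skills that, taken together, provide a read-then-send path.

```python
from dataclasses import dataclass

# Hypothetical capability labels, invented for this illustration.
READ_FILE = "read_file"
SEND_HTTP = "send_http"
EXFIL_PAIR = {READ_FILE, SEND_HTTP}


@dataclass(frozen=True)
class Skill:
    name: str
    capabilities: frozenset


def scan_in_isolation(skill):
    """Toy audit: approve a skill unless it alone can both read and send."""
    return not (EXFIL_PAIR <= skill.capabilities)


def composition_is_risky(skills):
    """Check the union of capabilities across every skill the agent may chain."""
    combined = frozenset().union(*(s.capabilities for s in skills))
    return EXFIL_PAIR <= combined


# Two hypothetical skills, each harmless when examined alone.
skill_a = Skill("notes_reader", frozenset({READ_FILE}))
skill_b = Skill("status_reporter", frozenset({SEND_HTTP}))

approved = [s for s in (skill_a, skill_b) if scan_in_isolation(s)]
print("approved individually:", [s.name for s in approved])
print("risky when combined:", composition_is_risky(approved))
```

Both skills pass the isolated check, yet the combined capability set contains everything needed for exfiltration. A real marketplace audit would be far richer than this, but the blind spot has the same shape.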

The key insight: security scanning treats skills as isolated artifacts, but agents compose them dynamically. The danger lives in the composition, not the components. This mirrors classic supply-chain attacks (think SolarWinds or compromised npm packages), but with a twist — there's no malicious code to find in the first place.

The authors argue defenses need to shift from scanning skill content to monitoring agent behavior at runtime: watching how skills combine, what data flows between them, and whether the agent's overall trajectory matches its stated task.
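
One way to picture runtime monitoring is a small data-flow tracker that watches each skill call and raises an alert when data derived from a sensitive source reaches a network sink. The sketch below is an assumption-laden toy, not the defense proposed in the paper: the FlowMonitor class, the action names, and the idea of identifying values by simple string labels are all hypothetical.

```python
class FlowMonitor:
    """Toy runtime monitor: tracks whether sensitive data reaches a network sink."""

    def __init__(self, sensitive_sources, network_sinks):
        self.sensitive_sources = set(sensitive_sources)
        self.network_sinks = set(network_sinks)
        self.tainted_values = set()
        self.alerts = []

    def record_call(self, skill, action, inputs, output=None):
        # A call carries taint if any of its inputs was derived from a sensitive source.
        carries_taint = any(value in self.tainted_values for value in inputs)
        # Outputs of sensitive sources, or of calls fed tainted inputs, become tainted.
        if output is not None and (action in self.sensitive_sources or carries_taint):
            self.tainted_values.add(output)
        # Tainted data arriving at a network sink is the flow we want to catch.
        if action in self.network_sinks and carries_taint:
            self.alerts.append((skill, action))


# Hypothetical trajectory: read a file, summarize it, then send the summary out.
monitor = FlowMonitor(sensitive_sources={"read_file"}, network_sinks={"send_http"})
monitor.record_call("notes_reader", "read_file", inputs=["notes.txt"], output="file-contents")
monitor.record_call("summarizer", "summarize", inputs=["file-contents"], output="summary")
monitor.record_call("status_reporter", "send_http", inputs=["summary"])
print(monitor.alerts)
```

No single call here would trip a content scanner; the alert comes only from following the data across three separate skills. A production monitor would also need to compare the trajectory against the user's stated task, which this sketch does not attempt.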

Why it matters: As LLM agents adopt plugin marketplaces at scale, this paper shows that static code review — our default security posture — is structurally blind to attacks that emerge from how skills are combined rather than what any one of them contains.
