2026-05-12
Imagine you're trying to train a puppy, but every time you want to test out a new training technique, you have to raise a brand-new puppy from birth. That's roughly the situation researchers face today when they build "meta-agents" — AI systems whose job is to supervise, debug, or improve other AI agents. Every experiment means re-running the whole agent from scratch, which is slow, expensive, and makes it hard to compare what would have happened if the agent had taken a different path at some critical moment.
Shepherd is a new runtime system that fixes this by treating an AI agent's life history like a Git repository. Every time the agent does something — reads a file, calls a tool, gets a response — that interaction is recorded as a typed event in an execution trace. Crucially, you can rewind to any past moment, fork off a new branch, and try a different path. Think of it as save-states for AI agents, but with mathematical rigor.
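To make the Git analogy concrete, here is a minimal sketch of an append-only trace of typed events that can be rewound and forked. It is not Shepherd's actual API: `Event`, `Node`, `append`, `history`, and `rewind` are hypothetical names, and the event fields are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One recorded interaction; `kind` and `payload` are illustrative fields."""
    kind: str      # e.g. "read_file", "tool_call", "response"
    payload: str


@dataclass(frozen=True)
class Node:
    """A point in the agent's history, linked to its parent like a Git commit."""
    event: Event
    parent: Optional[Node] = None

    def append(self, event: Event) -> Node:
        # Never mutates: returns a new node that shares all earlier history.
        return Node(event, self)

    def history(self) -> list[Event]:
        # Walk parent links back to the root, then reverse to oldest-first order.
        events = []
        node: Optional[Node] = self
        while node is not None:
            events.append(node.event)
            node = node.parent
        return events[::-1]

    def rewind(self, steps: int) -> Node:
        # Step back `steps` events; the node returned is a valid fork point.
        node = self
        for _ in range(steps):
            if node.parent is None:
                raise ValueError("cannot rewind past the first event")
            node = node.parent
        return node


# Record a short run, then rewind two events and branch off a different path.
root = Node(Event("read_file", "config.yaml"))
main = root.append(Event("tool_call", "search")).append(Event("response", "3 hits"))
fork = main.rewind(2).append(Event("tool_call", "grep"))
```

Because nodes are frozen and only point backwards, `main` and `fork` share the `read_file` event without copying it, and creating the fork leaves the original run untouched.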
Three things make this paper notable:
The key insight is treating agent execution as a functional, immutable data structure rather than a one-shot script. Once you can fork, replay, and branch agent runs cheaply and correctly, a whole class of meta-agent techniques — automated debugging, counterfactual evaluation, search over agent strategies, training data generation — becomes practical. Today, most agent frameworks treat a run as an ephemeral event; Shepherd treats it as a queryable artifact you can poke at after the fact.
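As a rough illustration of why forking makes counterfactual questions cheap, the sketch below reuses a recorded prefix as-is and only makes live tool calls after the fork point. Everything here is an assumption made for the example (`continue_from`, `toy_policy`, the tool names, and the `(tool, argument, result)` step shape); it also ignores external side effects, which a real runtime would have to restore along with the trace.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str, str]                 # (tool name, argument, result)
Tools = Dict[str, Callable[[str], str]]
Policy = Callable[[List[Step], Tools], Optional[Tuple[str, str]]]


def continue_from(prefix: List[Step], policy: Policy, tools: Tools,
                  max_steps: int = 10) -> Tuple[List[Step], int]:
    """Extend a recorded prefix; returns (full trace, number of live tool calls)."""
    trace = list(prefix)            # copy, so the recorded run is left untouched
    live_calls = 0
    while len(trace) < max_steps:
        choice = policy(trace, tools)
        if choice is None:          # the policy decides it is done
            break
        name, arg = choice
        trace.append((name, arg, tools[name](arg)))
        live_calls += 1
    return trace, live_calls


invocations: List[str] = []         # logs every live tool call, for the demo only


def make_tool(name: str) -> Callable[[str], str]:
    def tool(arg: str) -> str:
        invocations.append(name)
        return f"{name}({arg})"
    return tool


def toy_policy(trace: List[Step], tools: Tools) -> Optional[Tuple[str, str]]:
    # Stand-in for an agent: uses each available tool once, in a fixed order.
    used = {name for name, _, _ in trace}
    for name in ("list_files", "grep", "search"):
        if name in tools and name not in used:
            return name, "TODO"
    return None


# Original run with two tools.
base_tools = {n: make_tool(n) for n in ("list_files", "search")}
recorded, _ = continue_from([], toy_policy, base_tools)

# Counterfactual: fork after the first step and offer one extra tool.
more_tools = dict(base_tools, grep=make_tool("grep"))
forked, live_calls = continue_from(recorded[:1], toy_policy, more_tools)
```

In this toy run the fork diverges to `grep` at its second step, and `list_files` is invoked only once in total, because the forked run inherits that step from the recording instead of executing it again.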
If you've ever wanted to ask "what would my agent have done if I'd given it that other tool at step 47?" and get an answer in seconds, rather than paying the time and money to re-run the whole agent, this is the substrate that makes that question cheap to answer.
