- What: Discussion on the security risks of AI agents writing code
- Impact: Developers and IT professionals working with AI agents
Your Agent Runs Code You Never Wrote

Containers, VMs, and serverless were built for code you wrote. Agents write their own.

Shekhar, Mar 30, 2026

Think about every line of code running on your infrastructure right now. A developer wrote it. Someone reviewed it in a PR. CI ran the tests. Ops deployed it. You know what’s running because you decided what would run.

Now think about what happens when you give an AI agent access to a terminal. The agent writes code on the fly. Python, bash, SQL, shell commands. It decides what to write based on what the model thinks you want. That code has never been reviewed. Never been tested. Never appeared in any pull request. It exists for the first time in the moment it executes.

This changes the isolation problem completely. It’s no longer about protecting services from each other. It’s about protecting the world from code you’ve never seen.

If you’re building agents that actually work, that people actually depend on, this matters. The agent that demos well on stage will hit every one of these problems in production. Credentials will leak. Untrusted code will run. Snapshots will capture secrets you didn’t think about. The things that break real agents are not model quality or prompt engineering. They’re infrastructure problems. Isolation problems. And most teams don’t discover them until something goes wrong.

This series is a deep dive into all of it. This post lays out why it matters. The rest will dig into the details.

Five Things We Assumed That Aren’t True Anymore

Containers, VMs, serverless functions. The isolation tools we’ve built over the last decade are good. But they were designed with assumptions that agents break in interesting ways.

1. We assumed the code is known at deploy time.

A Docker container runs an image you built from a Dockerfile. You wrote the code, reviewed it, signed it, shipped it. You can scan it for vulnerabilities because it exists before it runs. An agent doesn’t work this way.
Ask a coding agent to fix a bug and it might write Python that imports packages you’ve never heard of, reads your environment variables, or shells out to curl. The code doesn’t exist until the moment it runs. You can’t scan what hasn’t been written yet.

Every coding agent does this. Claude Code, Cursor, Devin, Copilot Workspace. Every single invocation.

2. We assumed the workload has a bounded scope.

A web server handles HTTP requests. A Lambda function processes events. You know the shape of what they’ll do. An agent’s scope is whatever the model decides. Ask it to “analyze this dataset” and it might install packages from PyPI, write files to disk, make HTTP calls, spawn subprocesses, and read other files in your working directory. All in one go.

Your isolation system has to contain anything the model might attempt. That’s a very different problem than containing a known set of operations.

3. We assumed compromise requires a deliberate attack.

Traditional security assumes someone is targeting you. Probing for vulnerabilities. Crafting exploits. Working at it. Agents can be compromised by accident. It’s called prompt injection. Someone puts a malicious instruction in a webpage, a document, an API response, or a file in a repository. The agent reads it, treats it as a legitimate instruction, and follows it. No exploit needed. No vulnerability. Just a sentence in the wrong place.

In April 2025, security researcher Johann Rehberger showed what this looks like in practice. He put prompt injection instructions on a website linked from a GitHub issue. Devin (the AI coding agent from Cognition) processed the issue, followed the link, downloaded a command-and-control malware binary, ran chmod +x on it, and executed it. The attacker now had remote access to Devin’s machine, including all its secrets and AWS keys.

The cost to the attacker was one poisoned GitHub issue. No zero-day. No kernel exploit. Just a sentence on a webpage that the agent treated as an instruction.
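
The Devin incident above ended with the attacker holding the agent’s secrets and AWS keys. One narrow way to shrink that exposure, sketched here under stated assumptions, is to launch agent-generated code in a child process that inherits only an allowlisted set of environment variables. The names `SAFE_ENV_KEYS`, `scrubbed_env`, and `run_generated_python` are hypothetical and not taken from any agent framework. The sketch does nothing about file or network access, so it is not a substitute for real isolation.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: the only variables the child process may inherit.
# SYSTEMROOT is included because Python on Windows generally needs it.
SAFE_ENV_KEYS = ("PATH", "LANG", "LC_ALL", "TMPDIR", "SYSTEMROOT")


def scrubbed_env(parent_env, allowed=SAFE_ENV_KEYS):
    """Return a new dict holding only the allowlisted keys of parent_env."""
    return {key: parent_env[key] for key in allowed if key in parent_env}


def run_generated_python(source, parent_env=None, timeout=10):
    """Run agent-generated Python source in a child process.

    The child sees only allowlisted environment variables, so secrets
    exported in the parent shell are not inherited. This does NOT limit
    what the code can read from disk or send over the network.
    """
    env = scrubbed_env(os.environ if parent_env is None else parent_env)
    return subprocess.run(
        # -I starts Python in isolated mode (ignores PYTHON* variables
        # and the user site directory).
        [sys.executable, "-I", "-c", source],
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
```

Calling `run_generated_python("import os; print(sorted(os.environ))")` from a shell that has credentials exported is a quick way to see which variables the child actually receives. An allowlist is used rather than a blocklist because the agent’s code is unknown in advance, and so are the names of the secrets it might go looking for.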
This keeps happening. In August 2024, hidden instructions in a public Slack channel tricked Slack AI into exfiltrating private channel data. In 2025, a poisoned Google Doc made Gemini Enterprise search across all connected Workspace data and send it to an attacker’s server. Zero clicks, zero warnings. In October 2025, a prompt injection in a ServiceNow ticket field recruited higher-privileged agents to execute an attacker’s instructions (CVE-2025-12420, CVSS 9.3).

Simon Willison calls it the “Lethal Trifecta.” If your agent has access to private data, is exposed to untrusted content, and has any way to send data out, it’s vulnerable. Most useful agents have all three by design.

4. We assumed workloads are stateless or explicitly stateful.

Lambda functions are stateless. Databases are stateful. You choose which one you’re building and you design around it. Agents live in a gray zone. During a session, they accumulate things. Files they created. Packages they installed. Environment variables they set. And c...