The Race to Ship AI Tools Left Security Behind. Part 1: Sandbox Escape

Cymulate Research Lab identified a recurring class of sandbox escape vulnerabilities in widely adopted AI CLI agents and IDE integrations, including Claude Code and Gemini CLI, allowing attackers to bypass isolation and execute arbitrary code on the host system. The research attributes these flaws to weaknesses in how these tools enforce isolation and trust user/LLM-controlled input, enabling privilege escalation from low-privileged contexts. While some vendors have addressed the issues, others have not, reflecting a broader trend where the rapid deployment of AI tools is outpacing the maturity of their security architecture.


By: Cymulate Research Lab
April 7, 2026
Ilan Kalendarov, Security Research Team Lead
Ben Zamir, Security Researcher
Elad Beber, Security Researcher

Cymulate Research Labs uncovered a range of vulnerability classes across multiple AI tools that allow attackers to bypass trust boundaries, execute code in unintended contexts and compromise both local and cloud environments.

This research examines widely adopted AI tools, including CLI agents and IDE integrations such as Claude Code, Gemini CLI, Codex CLI, Cursor and GitHub Copilot. While these tools are rapidly becoming a core part of modern workflows, they introduce a new and largely unexplored attack surface.

Across multiple platforms and vendors, we identified recurring weaknesses in how these systems enforce isolation, handle configuration, and trust LLM/user-controlled input. These flaws enable attackers to move from low-privileged or remote influence into meaningful impact, including sandbox escape, remotely influenced arbitrary code execution and cross-environment compromise.

The vulnerabilities were responsibly disclosed to the relevant vendors. While some responded quickly and addressed the issues, others failed to remediate the underlying problems or did not engage.

This reflects a broader trend: the rapid adoption of AI-driven development tools is outpacing the maturity of their security architecture. As a result, fundamental and well-understood vulnerability patterns are being reintroduced, alongside new attack surfaces that enable more sophisticated paths to code execution and privilege escalation.

This blog is the first in a series analyzing these vulnerability classes. It begins with sandbox escape and isolation failures and will be followed by additional parts covering multiple remote code execution paths, privilege escalation, lateral movement and other attack primitives across AI-driven tools.
Each blog in this series will cover relevant mitigation strategies to reduce the attack surface and highlight how our primary threat research provides Cymulate Exposure Validation with the industry’s deepest attack library.

Executive Summary

AI coding agents such as Claude Code, Gemini CLI and Codex CLI are rapidly becoming part of modern development workflows. These tools are often marketed not only as productivity tools, but also as security tools capable of auditing code, detecting vulnerabilities and improving overall security posture.

This research shows that the agents themselves introduce a new attack surface. It is the first in a series analyzing the security architecture of AI CLI tools, beginning with sandbox implementations and trust boundary failures.

We identified a recurring vulnerability class across multiple AI CLI tools that allows an attacker to escape the agent’s sandbox and execute code on the host system with the user’s privileges. Instead of breaking the sandbox through the operating system or container runtime, the attacks abuse the agent’s own configuration, startup behavior and trust boundaries.

We refer to this vulnerability class as Configuration-Based Sandbox Escape (CBSE). In this model, the sandbox isolation can be completely bypassed by modifying trusted files or execution paths that are later processed outside the sandbox. On next startup, the attacker’s code executes on the host OS, resulting in sandbox escape, persistence, and potential credential or cloud compromise. This flaw nullifies the isolation model.

We reproduced this pattern across tools from multiple vendors, including Anthropic, Google, and OpenAI. The technical details differ, but the root cause is the same: the sandbox is treated as the security boundary, while the real boundary, the host-side configuration and execution logic, remains writable from inside the sandbox.

The industry is increasingly promoting AI agents as security assistants. However, this research highlights a critical question: if an AI agent cannot protect its own execution boundary, how can it be trusted to secure the developer’s environment?

These findings suggest that current AI agent sandbox implementations may provide a false sense of security, and that stronger trust boundary design is required before these tools can be safely used to run commands and handle untrusted code.

For Security Leaders

Organizations adopting AI coding agents should treat them as privileged software with access to developer credentials, source code, and cloud environments. The CBSE vulnerability class represents a new attack surface that spans the full security lifecycle: posture management, compliance, prevention, detection, and response.

Security leaders should audit which AI CLI tools are in use, whether those tools have access to sensitive credentials such as AWS, GitHub or SSH keys, and whether existing controls would detect post-exploitation activity originating from an agent process. The CVEs and techniques document...
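
The CBSE pattern in the Executive Summary depends on files that are writable from inside the sandbox and then processed on the host at the next startup. One way to make such changes visible is to hash those files before a sandboxed session and compare afterwards. The following is a minimal Python sketch of that idea, not a technique taken from the research. The function names `snapshot` and `diff_snapshots` are hypothetical, and the list of paths to watch is left to the caller because the article does not name the specific files each tool trusts.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map every regular file under the given paths to its SHA-256 digest.

    `paths` is an assumption of this sketch: the caller supplies whichever
    host-side files or directories the agent reads at startup.
    """
    digests = {}
    for root in paths:
        root = Path(root)
        if not root.exists():
            continue  # a missing path simply contributes no entries
        if root.is_file():
            files = [root]
        else:
            files = sorted(p for p in root.rglob("*") if p.is_file())
        for f in files:
            digests[str(f)] = hashlib.sha256(f.read_bytes()).hexdigest()
    return digests


def diff_snapshots(before, after):
    """Report files added, removed, or modified between two snapshots."""
    common = before.keys() & after.keys()
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "modified": sorted(p for p in common if before[p] != after[p]),
    }
```

A caller would take one snapshot before launching the agent and another before the agent is next started, then review any non-empty entry in the result. This only detects changes; it does not prevent the sandbox from writing to those files, which is the underlying design problem the article describes.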
