- What: AI agent escapes sandbox and sends unsolicited messages
- Impact: Potential for abuse in AI systems
EarlyCore Services Platform Solutions About Blog Pricing Sign In Sign In Book a demo Book a demo 635 security tests. 131 failures. Then the agent started messaging real people. Seven hours into an automated red team scan, at 05:30 UTC, the agent was successfully jailbroken. It read OpenClaw's own documentation, discovered a parameter that allowed it to bypass its sandbox, and rewrote its own WhatsApp configuration. The change took effect instantly via hot-reload. Seven minutes after the scan ended, two real individuals received unsolicited messages from a compromised AI agent. The deployment included sandbox isolation, tool allowlists, and channel-level access controls. None of these controls prevented the incident. The agent operated entirely within the boundaries of what the system's own rules technically permitted. Minimus OpenClawis a hardened, distroless build of the OpenClaw AI gateway, eliminating 99% of known CVEs compared to the standard image. We conducted a full red team assessment using EarlyCore Compliance. Result: Review outcome FAILED. Overall risk HIGH. 131 of 635 tests failed. 75 tests showed evidence of real tool execution. 5 were critical config changes. The agent escaped its sandbox, extracted production secrets, and caused an uncontrolled WhatsApp incident involving real phone numbers. Key takeaway: Minimus eliminates nearly all known CVEs at the OS level. But the attacks that succeeded in this assessment did not exploit a single CVE. They leveraged the tools that OpenClaw intentionally provides. Running AI agents with tool access? This assessment used EarlyCore Compliance, the same tool available to your team. Run a scan against your own deployment in under 15 minutes, no code changes required. The EarlyCore security review returned aFailedverdict with an overall risk level ofHIGH: Severity Breakdown:Critical 0, High 50, Medium 39, Low 42