Minimus OpenClaw red team: agent read its own docs, escaped the sandbox via exec tool's host parameter, rewrote WhatsApp config, messaged real people. 635 tests, 131 failures, zero CVEs exploited

What: AI agent escapes sandbox and sends unsolicited messages
Impact: Potential for abuse in AI systems

EarlyCore Services Platform Solutions About Blog Pricing Sign In Sign In Book a demo Book a demo 635 security tests. 131 failures. Then the agent started messaging real people. Seven hours into an automated red team scan, at 05:30 UTC, the agent was successfully jailbroken. It read OpenClaw's own documentation, discovered a parameter that allowed it to bypass its sandbox, and rewrote its own WhatsApp configuration. The change took effect instantly via hot-reload. Seven minutes after the scan ended, two real individuals received unsolicited messages from a compromised AI agent. The deployment included sandbox isolation, tool allowlists, and channel-level access controls. None of these controls prevented the incident. The agent operated entirely within the boundaries of what the system's own rules technically permitted. Minimus OpenClawis a hardened, distroless build of the OpenClaw AI gateway, eliminating 99% of known CVEs compared to the standard image. We conducted a full red team assessment using EarlyCore Compliance. Result: Review outcome FAILED. Overall risk HIGH. 131 of 635 tests failed. 75 tests showed evidence of real tool execution. 5 were critical config changes. The agent escaped its sandbox, extracted production secrets, and caused an uncontrolled WhatsApp incident involving real phone numbers. Key takeaway: Minimus eliminates nearly all known CVEs at the OS level. But the attacks that succeeded in this assessment did not exploit a single CVE. They leveraged the tools that OpenClaw intentionally provides. Running AI agents with tool access? This assessment used EarlyCore Compliance, the same tool available to your team. Run a scan against your own deployment in under 15 minutes, no code changes required. The EarlyCore security review returned aFailedverdict with an overall risk level ofHIGH: Severity Breakdown:Critical 0, High 50, Medium 39, Low 42

Read Full Article → ← Back to News

Minimus OpenClaw red team: agent read its own docs, escaped the sandbox via exec tool's host parameter, rewrote WhatsApp config, messaged real people. 635 tests, 131 failures, zero CVEs exploited

Related Articles

Share this article