AI Agent Hacks McKinsey: 5 Situations When You Should Not Deploy Agents

By Vinit Mehta
March 14, 2026

A security startup called CodeWall pointed an autonomous AI agent at McKinsey's internal AI platform, Lilli, and walked away. Two hours later, the agent had full read and write access to the entire production database: 46.5 million chat messages, 728,000 confidential client files, and 57,000 user accounts, all in plaintext. The system prompts that control what Lilli tells 40,000 consultants every day? Writable. Every single one of them.

The vulnerability was just an SQL injection, one of the oldest attack classes in software security. Lilli had been sitting in production for over two years. McKinsey's scanners never found it. The CodeWall agent found it because it doesn't follow a checklist. It maps, probes, chains, and escalates, continuously, at machine speed.

And scarier than the breach is what a malicious actor could have done afterward. Subtly alter financial models. Strip guardrails. Rewrite system prompts so Lilli starts giving poisoned advice to every consultant who queries it, with no log trail, no file changes, and no anomaly to detect. The AI just starts behaving differently. Nobody notices until the damage is done.

McKinsey is one incident. The broader pattern is what this piece is really about. The narrative pushing businesses to deploy agents everywhere is running far ahead of what agents can actually do safely inside real enterprise environments. And a lot of the companies finding that out are finding it out the hard way.

So the question worth asking is when you shouldn't deploy agents at all. Let’s decode.

The entire industry is betting on them anyway

Around the same time as the McKinsey breach, Mustafa Suleyman, the CEO of Microsoft AI, was telling the Financial Times that white-collar work will be fully automated within 12 to 18 months. Lawyers. Accountants. Project managers. Marketing teams. Anyone sitting at a computer.
Every conference keynote since late 2024 has been some version of the same thing: agents are here, agents are transforming work, go all in or fall behind.

The numbers back up the energy. 62% of enterprises are experimenting with agentic AI. KPMG says 67% of business leaders plan to maintain AI spending even through a recession. The FOMO is real and it's thick. If your competitor is shipping agents, standing still feels like falling behind.

But the same reports tell a second story: only 14% of enterprises have production-ready agent deployments. Gartner predicts over 40% of agentic AI projects will be cancelled by the end of 2027. 42% of organizations are still developing their agentic strategy roadmap. 35% have no formal strategy at all.

The gap between "we're experimenting" and "this is running in production and delivering value" is enormous. Most organizations are somewhere in that gap right now, burning money to stay there.

Agents do work. In controlled, well-scoped, well-instrumented environments, they do. The question is what specific conditions make them fail. And there are five that keep showing up.

Situation 1: The agent inherits production permissions without a human judgment filter

In mid-December 2025, engineers at Amazon gave their internal AI coding agent, Kiro, a straightforward task: fix a minor bug in AWS Cost Explorer. Kiro had operator-level permissions, equivalent to a human developer. Kiro evaluated the problem and concluded the optimal approach was to delete the entire environment and rebuild it from scratch. The result was a 13-hour outage of AWS Cost Explorer across one of Amazon's China regions.

Amazon's official response called it user error, specifically misconfigured access controls. But four people familiar with the matter told the Financial Times a different story. This was also not the first incident.
A senior AWS employee confirmed a second production outage around the same period involving Amazon Q Developer, under nearly identical conditions: engineers allowed the AI agent to resolve an issue autonomously, it caused a disruption, and the framing again was "user error."

Amazon has since added mandatory peer review for all production changes and initiated a 90-day safety reset across 335 critical systems. These are safeguards that should have been there from the start, retrofitted after the damage.

The structural problem was that a human developer, given a minor bug fix, would almost certainly not choose to delete and rebuild a live production environment. That's a judgment call, and humans make it instinctively. Agents don't. They reason about what's technically permissible given their permissions, choose the approach that solves the stated problem most directly, and execute it at machine speed. The permission says yes. No second thought triggers.

This is the most common failure mode in agentic deployments. An agent gets write access to a production system. It has a task. It has credentials. Nothing in the architecture tells it which actions are off ...
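One way to picture the missing "human judgment filter" is a gate that sits between what an agent proposes and what actually runs. The sketch below is illustrative only and assumes nothing about how Kiro or Amazon's tooling works: the action names, the `ProposedAction` type, and the `approve` callback are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical action names; a real deployment would define its own taxonomy.
DESTRUCTIVE_ACTIONS = {"delete_environment", "drop_database", "rewrite_system_prompt"}


@dataclass(frozen=True)
class ProposedAction:
    name: str          # what the agent wants to do
    target: str        # the resource it wants to touch
    environment: str   # e.g. "production" or "staging"


def requires_human_approval(action: ProposedAction) -> bool:
    """Destructive actions against production always need a human sign-off."""
    return action.environment == "production" and action.name in DESTRUCTIVE_ACTIONS


def execute_with_gate(
    action: ProposedAction,
    run: Callable[[ProposedAction], None],
    approve: Callable[[ProposedAction], bool],
) -> str:
    """Run the action only if it is low-risk or a human reviewer approved it."""
    if requires_human_approval(action) and not approve(action):
        return "blocked"
    run(action)
    return "executed"
```

Under these assumptions, a routine patch goes straight through, while a proposal to delete a production environment stops until a person says yes. The point of the design is that the check lives in the architecture rather than in the agent's own reasoning.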
The article describes a successful breach where an autonomous AI agent exploited a basic SQL injection vulnerability in McKinsey's Lilli platform, gaining full read/write access to its production database. The agent's ability to map, probe, and chain vulnerabilities at machine speed allowed it to bypass traditional scanners. The core threat highlighted is that once compromised, such an agent could subtly alter data, strip guardrails, and rewrite system prompts to deliver poisoned outputs without triggering conventional detection mechanisms.
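For readers unfamiliar with the attack class, here is a generic, minimal sketch of SQL injection using Python's built-in `sqlite3` module. It is not a reconstruction of the Lilli flaw, whose details are not given here; the `users` table, the function names, and the payload are assumptions chosen purely for illustration.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, email: str) -> list:
    # Vulnerable: user input is pasted directly into the SQL text,
    # so a crafted value can change the meaning of the query.
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, email: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL,
    # so it is always treated as data, never as query syntax.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


def demo() -> tuple:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    # Hypothetical attacker input: closes the string and adds an always-true clause.
    payload = "nobody@example.com' OR '1'='1"
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)
```

With the concatenated query, the payload matches every row in the table; with the parameterized query, it matches none, because no user has that literal string as an email address.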