Seven days later: the AI threat thesis bent
On 20 April we said the data did not support a broad AI-driven acceleration of zero-day discovery. In the six days after that, Anthropic's Mythos found 271 Firefox vulnerabilities in a single evaluation pass. Two AI-sandbox-escape CVEs landed. Google reported a 32% jump in indirect prompt injection. CrowdStrike launched an industry coalition to deal with all of it. The thesis bent — here is exactly where, and how far.
What we said on 20 April still holds for the historical data through 19 April. What changed on 21–24 April is the kind of evidence you cannot dismiss as outliers: Anthropic and Mozilla jointly attribute 271 specific Firefox bugs to one model in one pass; two named CVEs (CVE-2026-5752, CVE-2025-59532) document AI-runtime sandbox escapes as a real category; industry tracking puts prompt-injection volume up 340% year-over-year, Google reports a 32% rise in malicious indirect injection over three months, and Unit 42 documents in-the-wild monetisation attempts; and a major endpoint vendor says the legacy detect-and-patch loop has been outpaced.
The picture has not flipped — but it has moved enough that an honest re-read is required. We do that here.
The original three datasets, in one sentence
CISA KEV monthly additions had been steady at 15–25/month since 2022; annual zero-days tracked by Google had oscillated 60–100 since 2021 with no post-ChatGPT ramp; Mandiant's time-to-exploit had collapsed from 63 days (2019) to −7 days (2026), but the industry consensus attributed that drop to automated scanners, public PoC explosion on GitHub, and N-day exploit markets — not AI.
The summary line was: AI is real on the phishing/BEC side; on the vulnerability side, the data did not yet support a broad AI-driven jump.
The bar we set for ourselves was: "If AI were broadly accelerating zero-day discovery, we would expect a steep ramp." The 21–24 April events do not yet show up in CISA KEV month-counts or in Google's annual zero-day tally — those are lagging indicators by 30–365 days. They show up first in vendor disclosures, CVE assignments and threat-vendor coalition launches. That is what the next four sections document.
One AI model, one codebase, 271 vulnerabilities
On 21 April, Anthropic publicly announced Claude Mythos — a cybersecurity-focused model trained for vulnerability discovery in large codebases. On 22 April, Mozilla released Firefox 150 with patches for 271 vulnerabilities discovered by Mythos in a single evaluation pass against the Firefox source tree. Three of those received public CVE numbers (CVE-2026-6746, CVE-2026-6757, CVE-2026-6758) — the rest are characterised as defence-in-depth, hardening or non-exploitable code-path bugs that did not meet Mozilla's CVE threshold.
Firefox CTO Bobby Holley described the model as “every bit as capable as elite security researchers,” while adding the important caveat: “we also haven't seen any bugs that couldn't have been found by an elite human researcher.” Read together, that is a claim about speed, not super-human capability — Mythos found in one pass what would have taken months of human effort, but did not (yet) find a category of bug a human couldn't have. Mozilla's framing is that Mythos “shifts security toward defenders” — a view that depends on Anthropic restricting access to the model. Anthropic has done so via a programme called Project Glasswing, currently limited to Amazon, Apple and Microsoft.
One AI run produced 271 specific, attributable vulnerabilities in a real production codebase that has been audited continuously for two decades by some of the best security researchers in the world. Whether the rest of the picture is symmetric, i.e. whether attackers without Project Glasswing access can produce comparable output, is the open question. The unauthorised third-party access to Mythos on its launch day, covered in the defence-stack section below, is the early answer.
The AI runtime is now an attack surface — by CVE number
On 22 April, two related CVEs were published describing sandbox escapes in widely-used AI tooling:
- CVE-2026-5752 (CVSS 9.3, critical): Cohere Terrarium, sandbox escape via Pyodide WebAssembly prototype-chain traversal. Terrarium is Cohere's open-source Python sandbox used to execute LLM-generated code in a Docker container. The vulnerability allows arbitrary code execution with root privileges on the host Node.js process, i.e. the AI's untrusted output executes on the system that was supposed to be isolating it.
- CVE-2025-59532 (sandbox bypass): OpenAI Codex CLI, sandbox bypass via a model-controlled working directory. Affects Codex CLI 0.2.0 to 0.38.0. The CLI accepted a current-working-directory value supplied in model output and used that path as the writable root of the sandbox. A malicious model response could therefore designate any path on the user's machine as “writable.” Patched in 0.39.0 by canonicalising the path and validating it against the user's session start location.
Both bugs are confused-deputy errors: the sandbox grants its trust boundary to a value the model produces. That is the canonical shape of an AI-runtime vulnerability — the AI is the attacker the sandbox is supposed to contain, and the sandbox forgot. With Cohere and OpenAI both publishing on the same day, this is now a category, not a one-off. Expect parallel CVEs in every code-execution AI tool deployed in 2025–2026.
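The Codex CLI fix described above (canonicalise the path, then validate it against the session start location) can be sketched in a few lines. This is a minimal illustration, not OpenAI's actual patch: the function name `resolve_writable_root` and its parameters are hypothetical, and it assumes the session start directory is itself trusted.

```python
from pathlib import Path


def resolve_writable_root(session_start: str, requested_cwd: str) -> Path:
    """Return a canonical writable root, or raise if it escapes the session.

    Hypothetical helper: `requested_cwd` stands for a path taken from model output.
    """
    # Canonicalise first, so symlinks and ".." segments cannot hide an escape.
    session_root = Path(session_start).resolve()
    # A relative request is read relative to the session root;
    # an absolute request replaces it entirely (pathlib join semantics).
    candidate = (session_root / requested_cwd).resolve()
    # Accept only the session root itself or one of its descendants.
    if candidate != session_root and session_root not in candidate.parents:
        raise PermissionError(f"writable root {candidate} is outside {session_root}")
    return candidate
```

The design point is the order of operations: the comparison happens on resolved paths, so a model-supplied value such as `../..` or an absolute path is rejected rather than silently becoming the sandbox's writable root.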
From research-conference theory to live PayPal transfers
On 24 April, multiple research outlets published on indirect prompt-injection attacks now active against production AI agents. Indirect prompt injection is the case where the malicious instruction is embedded in third-party content (a webpage, an email, a PDF) that the agent then ingests, rather than typed by the user. Until very recently this was largely a research-conference scenario.
Three numbers to anchor the change:
- +340% year-over-year: total prompt-injection attack volume. Industry tracking now ranks prompt injection as the #1 security threat to AI systems in 2026.
- +32% (Nov 2025 → Feb 2026): increase in malicious indirect injection observed by Google, documented in its threat-intelligence telemetry over a 3-month window. Targets: GitHub Copilot, Claude Code, ChatGPT plugins, browser-using agents.
- 10: in-the-wild payloads documented by Palo Alto Unit 42. Concrete observed cases include forced deletion of a backup folder, an instruction to issue a $5,000 transfer via PayPal.me embedded in a webpage, exfiltration of a secret API key from agent context, and rerouting of donations to attacker-controlled Stripe accounts via the persuasion keyword “ultrathink” in meta tags. Concealment techniques observed: HTML comments, 1-pixel fonts, transparent colours, accessibility-layer text, namespace-injection meta tags.
The risk profile of an AI agent is a function of its actions, not its words. A summarising agent with read-only access is low-risk. An agent with payment, mail-send or terminal-execution capability becomes a high-impact target the moment any of its inputs are attacker-controlled — and almost all of them are. The Unit 42 PayPal.me case is the cleanest single demonstration: a webpage instructed an agent to spend money, and the agent did.
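Several of the concealment techniques listed above are mechanically detectable before a page ever reaches an agent. The sketch below uses only Python's standard `html.parser` to surface text in HTML comments, meta-tag content, and elements hidden by inline styles. It is a heuristic illustration under stated assumptions: the names `HiddenTextFinder` and `find_hidden_text` are hypothetical, it only inspects inline `style` attributes (not external CSS), and it does not cover accessibility-layer text.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns for tiny fonts, transparent text, zero opacity, display:none.
# Heuristic and not exhaustive; "background-color" is deliberately excluded.
HIDDEN_STYLE = re.compile(
    r"font-size\s*:\s*[01](?:px)?\s*(?:;|$)"
    r"|(?<![\w-])color\s*:\s*transparent"
    r"|opacity\s*:\s*0(?:\.0+)?\s*(?:;|$)"
    r"|display\s*:\s*none",
    re.IGNORECASE,
)
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class HiddenTextFinder(HTMLParser):
    """Collects (kind, text) pairs for content a human reader would not see."""

    def __init__(self):
        super().__init__()
        self.findings = []
        self._stack = []  # one bool per open element: inside hidden content?

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("content"):
            self.findings.append(("meta", attrs["content"]))
        if tag in VOID_TAGS:
            return  # void elements never get a closing tag
        inherited = bool(self._stack and self._stack[-1])
        styled = bool(HIDDEN_STYLE.search(attrs.get("style") or ""))
        self._stack.append(inherited or styled)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_comment(self, data):
        if data.strip():
            self.findings.append(("comment", data.strip()))

    def handle_data(self, data):
        if self._stack and self._stack[-1] and data.strip():
            self.findings.append(("hidden-style", data.strip()))


def find_hidden_text(html: str):
    finder = HiddenTextFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

A non-empty result does not prove an attack, since comments and meta tags are ordinary HTML. It is a signal for stripping that content or routing the page to a lower-privilege agent.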
The defence stack and the supply-chain pull both moved in the same week
Two things happened in the same week:
- 23 Apr, CrowdStrike Project QuiltWorks: an industry coalition for AI-discovered vulnerabilities, with partners Accenture, EY, IBM, Kroll, OpenAI and Anthropic. The Frontier AI Readiness and Resilience Service offers AI-powered scanning of customer codebases plus red-team triage. CrowdStrike's CEO described the launch as a response to a “collapsed window” between AI-driven discovery and exploitation. The vendor framing is that the legacy detect-and-patch loop has been outpaced and that defenders need a parallel AI pipeline.
- 21–22 Apr, Mythos breach: on the same day Mythos was announced, an unauthorised group accessed it via a third-party vendor environment. A small Discord-coordinated group correctly guessed the model's URL from Anthropic's known URL conventions and reached it through a contractor environment. ShinyHunters' claims of authorship are false; the screenshot offered as evidence was AI-generated. Anthropic's own systems were not directly breached; the access was via a third party. The technical fact that matters: a model intended for restricted distribution to Amazon, Apple and Microsoft was reachable from outside that perimeter on its public launch day.
The defender pipeline (QuiltWorks) and the offence-side leak (Mythos third-party access) appeared simultaneously. The defender argument depends on Mythos staying restricted; the same week's leak demonstrated it cannot. This is the structural problem with AI-driven vulnerability discovery: the model is the asset, and the asset leaks. There is no analogue in the pre-AI defender stack — Cobalt Strike's leak in 2020 enabled adversaries to use a tool, but Cobalt Strike does not autonomously discover novel vulnerabilities the way Mythos does.
Pacing is now weekly, not annual
The original analysis used annual reports — Mandiant M-Trends 2026, Google Project Zero year totals, FBI IC3 2025. Annual reports remain valuable for trend identification. They do not capture week-on-week change, and the 21–24 April events are exactly the kind of cluster that an annual cycle would average out into invisibility.
Three concrete adjustments this week:
- If your pipeline runs an LLM-backed code-execution sandbox, patch immediately and review sandbox-policy assumptions.
- Treat every external input ingested by an agentic AI as untrusted user input.
- Subscribe to weekly (not annual) threat-intel digests.
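The second adjustment, treating every external input as untrusted, can be made concrete with a simple taint gate: once an agent has ingested any untrusted content, high-impact actions require explicit human approval. This is a minimal sketch of that idea, not a description of any vendor's control. The class `AgentContext`, its methods, and the action names are hypothetical; the three high-impact classes mirror the payment, mail-send and terminal-execution capabilities named earlier.

```python
from dataclasses import dataclass, field

# Hypothetical action labels for the high-impact capabilities named in the article.
HIGH_IMPACT_ACTIONS = {"payment", "mail_send", "terminal_exec"}


@dataclass
class AgentContext:
    """Tracks which untrusted sources have entered the agent's context."""

    tainted_sources: list = field(default_factory=list)

    def ingest(self, source: str, trusted: bool = False) -> None:
        # Default is untrusted: webpages, emails and PDFs all taint the context.
        if not trusted:
            self.tainted_sources.append(source)

    def authorise(self, action: str, human_approved: bool = False) -> bool:
        # Low-impact actions (for example read-only summarising) always pass.
        if action not in HIGH_IMPACT_ACTIONS:
            return True
        # High-impact actions pass automatically only on an untainted context.
        if not self.tainted_sources:
            return True
        # Otherwise a human must confirm the specific action.
        return human_approved
```

For example, after `ctx.ingest("https://example.com/page")`, a call to `ctx.authorise("payment")` returns `False` until a person approves it, while `ctx.authorise("summarise")` still returns `True`.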
Local angle: AI tools in public-sector deployments
NIS2 transposition deadlines passed for Iceland's “essential entities” classification this past quarter. Many Icelandic public-sector deployments are evaluating GitHub Copilot, Microsoft 365 Copilot for Security and Anthropic-via-Bedrock. The 21–24 April events do not change which products are appropriate, but they change the threat-model assumptions an organisation is allowed to make about them. Specifically:
- Indirect prompt injection should be in the risk register, with a control mapped.
- Board-level discussion: do your AI assistants have payment, mail-send or command-execution paths? If yes, controls must be in place.
- NIS2 72-hour notification applies to AI-mediated incidents. Most organisations lack a runbook for this classification.
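A runbook for AI-mediated incidents can start with the deadline arithmetic. The sketch below computes the 72-hour notification deadline from the moment an organisation becomes aware of an incident. It is illustrative only: the function names are hypothetical, and the assumption that the clock starts at the time of awareness should be checked against the applicable national transposition.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window cited above; the start-of-clock rule is an assumption.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(became_aware_at: datetime) -> datetime:
    """Return the notification deadline in UTC for a timezone-aware timestamp."""
    if became_aware_at.tzinfo is None:
        # Naive timestamps are ambiguous across sites and daylight-saving changes.
        raise ValueError("use a timezone-aware timestamp")
    return became_aware_at.astimezone(timezone.utc) + NOTIFICATION_WINDOW


def hours_remaining(became_aware_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    return (notification_deadline(became_aware_at) - now).total_seconds() / 3600
```

Recording the awareness timestamp in UTC at triage time is the practical point: the deadline then follows mechanically, whichever team ends up classifying the incident as AI-mediated.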
Methodology & honest caveats
What we did: aggregated public reporting on five named events from 21–24 April 2026; verified each against at least two independent sources; cross-referenced with our own news.1881.is article corpus for the same period.
What we did not do: re-run the underlying datasets from the 20 April analysis. CISA KEV, Google Project Zero and Mandiant M-Trends are still our primary trend sources; this article does not claim those numbers have changed yet — it claims the lagging indicators are about to.
What we are willing to retract: “no clear AI-driven jump in zero-day discovery” was a defensible reading of the data on 20 April. After 271 Mythos-attributed Firefox bugs and two named AI-runtime CVEs in three days, that sentence needs an asterisk: “as of 19 April 2026.” That asterisk now exists.
Sources
- Mozilla — The zero-days are numbered
- Help Net Security — Claude Mythos finds 271 Firefox flaws
- SecurityWeek — Claude Mythos Finds 271 Firefox Vulnerabilities
- The Hacker News — Cohere Terrarium sandbox flaw (CVE-2026-5752)
- SentinelOne — CVE-2026-5752 Terrarium sandbox escape
- Wiz — CVE-2025-59532 Codex CLI sandbox bypass
- Check Point Research — OpenAI Codex CLI vulnerability
- Palo Alto Unit 42 — Web-based indirect prompt injection observed in the wild
- Help Net Security — Indirect prompt injection is taking hold in the wild
- Infosecurity Magazine — 10 in-the-wild prompt-injection payloads
- SiliconANGLE — CrowdStrike launches Project QuiltWorks coalition
- CrowdStrike — Project QuiltWorks press release
- TechCrunch — Unauthorised access to Anthropic Mythos via third-party
- Security News — Original 20 April analysis: Is AI speeding up cyber attacks?