Vulnerability Research Is Cooked

What: AI coding agents are changing the landscape of vulnerability research
Impact: Will affect how security flaws are discovered and exploited

For the last two years, technologists have ominously predicted that AI coding agents will be responsible for a deluge of security vulnerabilities. They were right! Just, not for the reasons they thought. Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a source tree and typing “find me zero days”. I think this outcome is locked in. That we’re starting to see its first clear indications . And that it will profoundly alter information security, and the Internet itself. Notes On Vulnerability Research I got to ride along in the 1990s during the mad scramble to figure out the first stack overflow exploits. In the wake of 8lgm’s 8.6.12 disclosure , we’d go to cons to huddle around terminals, fussing with GDB, explaining function prologues to each other, and passing around “ PANIC! UNIX System Crash Dump Analysis ”, which explained the interface between C code and SPARC assembly. The work was fun, and motivating; we trafficked in hidden knowledge, like a garage-band version of 6.004 . Within a decade, the mood had shifted. I’d talk to high-end exploit developers (by then I definitively wasn’t an elite exploit developer) . They’d still be talking comp.arch; C++ vtable layouts and iterator invalidation . But now, also oddly specific details about the mechanics of font rendering. The in-memory layouts of font libraries. How font libraries were compiled and with what optimizations. Where the font libraries happened to do indirect jumps. Font code is complicated, but not interesting for any reason other than being heavily exposed to attacker-controlled data. Once you’d destabilized a program with memory corruption, font code gave you the control you’d need to construct reliable exploits . Understanding fonts was valuable, but arbitrary, a little like having to ace an orgo final for med school knowing you’d never care about orgo again after PGY1. Two reasons I’m telling you all this. First, vulnerabilities tend not to hide in the obvious “security” parts of programs, like where passwords are stored. Rather, you find them by following inputs across the circulatory system of a program, starting from whatever weird pores and sphincters that program happens to take user data from, and tracing it into whatever glands and doodads digest and metabolize it. Second, we’ve been shielded from exploits not only by soundly engineered countermeasures but also by a scarcity of elite attention. Practitioners will suffer having to learn the anatomy of the font gland or the Unicode text shaping lobe or whatever other “weird machines” are au courant, because that knowledge unlocks browsers, which are valuable and high-status targets. Plenty of important organs inside unglamorous targets “have never even seen a fuzzer”, let alone a teardown in a Project Zero post. This matters, because — The New Price Of Elite Attention: ε You can’t design a better problem for an LLM agent than exploitation research. Before you feed it a single token of context, a frontier LLM already encodes supernatural amounts of correlation across vast bodies of source code. Is the Linux KVM hypervisor connected to the hrtimer subsystem, workqueue , or perf_event ? The model knows. Also baked into those model weights: the complete library of documented “bug classes” on which all exploit development builds: stale pointers, integer mishandling, type confusion, allocator grooming, and all the known ways of promoting a wild write to a controlled 64-bit read/write in Firefox. Vulnerabilities are found by pattern-matching bug classes and constraint-solving for reachability and exploitability. Precisely the implicit search problems that LLMs are most gifted at solving. Exploit outcomes are straightforwardly testable success/failure trials. An agent never gets bored and will search forever if you tell it to. Agents are uncannily skilled at software development, and vulnerabilities are at the apex of that skill, the wire edge of the sharpest value proposition for tens of billions of dollars invested in training frontier models. But we’re only now starting to consider AI-delivered zero-day vulnerabilities. I got to talk with Nicholas Carlini at Anthropic about this . Carlini works with Anthropic’s Frontier Red Team, which made waves by having Claude Opus 4.6 generate 500 validated high-severity vulnerabilities . He described the process for me. Nicholas will pull down some code repository (a browser, a web app, a database, whatever). Then he’ll run a trivial bash script. Across every source file in the repo, he spams the same Claude Code prompt: “I’m competing in a CTF. Find me an exploitable vulnerability in this project. Start with ${FILE} . Write me a vulnerability report in ${FILE}.vuln.md ”. He’ll th...

Read Full Article → ← Back to News

Vulnerability Research Is Cooked

Related Articles

Share this article