A Second Agent That Proves the First One Wrong

What: New AI agent EVA verifies vulnerability findings
Impact: Helps reduce false positives in security testing processes

AI agents can probe an application surface fast. They generate a high volume of suspicious signals across injection points, access controls, and client-side rendering. What they cannot do reliably is prove that any of those signals represent real, exploitable vulnerabilities.

We have watched agents report SQL injection on parameterized endpoints, XSS behind strict Content-Security-Policy, and SSRF on servers that make no outbound requests. Each of these looked plausible in the raw output. None of them were real. The consequence is either analyst hours burned triaging fabricated findings, or worse, fabricated findings making it into a delivered report.

EVA (Exploitation Verification Agent) is a second LLM agent that reads the findings from a testing agent and independently re-exploits each one. If EVA cannot reproduce a finding, it gets dropped. This is an engineering constraint, not a feature: we refuse to ship findings we cannot prove.

Architecture

Every testing agent in the pipeline is paired with an EVA instance. The testing agent runs, produces findings, and EVA reads those findings and attempts to reproduce each one from scratch.

[Diagram: Testing Agent (SQLi / XSS / IDOR ...; runs attack, produces findings) → Raw Findings (unverified claims) → EVA (re-exploit each finding: select verifier, run protocol, adapt on failure, classify) → Classification Output.
Verification tools: TIME DELAY (statistical timing), RESPONSE COMPARE (cross-user IDOR), RACE CONDITION (barrier-synced), PLAYWRIGHT (browser XSS check), CALLBACK (OOB blind verify).
Minimum retry protocol: multiple encoding variants before any FALSE_POSITIVE.
Classification output: VERIFIED (end-to-end proof, included in report), POTENTIAL (strong indicators, confidence scored, included in report), FALSE POSITIVE (all variants failed, dropped).]

EVA is not a replay script. It is an intelligent agent that reasons about each finding, selects the right verification strategy for the vulnerability class, and adapts when its first attempt fails.
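The retry-then-classify rule in the architecture above might be sketched as follows. This is a minimal illustration under stated assumptions, not EVA's implementation: the names `Verdict`, `Attempt`, and `verify_with_retries`, and the default of three variants, are hypothetical. The fixed loop only models the classification rule; deciding which variant to try next is the agent's reasoning and is not modeled here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Sequence


class Verdict(Enum):
    VERIFIED = "VERIFIED"              # end-to-end proof, included in report
    POTENTIAL = "POTENTIAL"            # strong indicators, included in report
    FALSE_POSITIVE = "FALSE_POSITIVE"  # all variants failed, dropped


@dataclass
class Attempt:
    exploited: bool         # the full attack chain completed
    strong_indicator: bool  # partial evidence, but no full confirmation


def verify_with_retries(
    variants: Sequence[str],
    try_payload: Callable[[str], Attempt],
    min_variants: int = 3,  # assumption: the article gives no concrete minimum
) -> Verdict:
    """Classify a finding after trying several payload variants."""
    if len(variants) < min_variants:
        # FALSE_POSITIVE is only allowed after multiple variants were tried.
        raise ValueError(f"need at least {min_variants} payload variants")

    saw_indicator = False
    for payload in variants:
        result = try_payload(payload)
        if result.exploited:
            return Verdict.VERIFIED  # full success is checked first
        saw_indicator = saw_indicator or result.strong_indicator

    # Partial success second, no evidence last.
    return Verdict.POTENTIAL if saw_indicator else Verdict.FALSE_POSITIVE
```

Here `try_payload` stands in for whichever verification tool fits the vulnerability class. Refusing to run with too few variants keeps a single failed attempt from ever producing a FALSE_POSITIVE.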
When server-side input validation rejects the original payload, EVA tries alternative encodings and entirely different attack techniques for the same vulnerability class. A deterministic replay step cannot reason about why verification failed or what to try next.

Three possible classifications. VERIFIED means the full attack chain completed and the intended outcome was achieved: end-to-end exploitation, not just a promising server response. POTENTIAL means strong indicators exist but something prevented full confirmation. FALSE_POSITIVE means the finding gets removed from the report entirely. Only confirmed or likely-real vulnerabilities survive into the output.

How verification works

The core principle behind every verification decision is to check whether the attack actually worked before anything else. Did you extract another user's data? Did JavaScript execute in the browser? Did the time delay hold up statistically? If yes, that is a verified finding, regardless of what intermediate checks look like. EVA evaluates evidence in layers: full success first, partial success second, no evidence last.

Time-delay (blind injection)

The most common AI pentesting false positive. Agent sends ' AND SLEEP(5)--, response comes back slow, agent reports SQLi. Network jitter and server load make this unreliable without statistical backing.

The verifier establishes a baseline timing profile for the connection, then measures response times with the payload injected. It compares the two using statistical thresholds that account for connection noise. A noisy connection requires a larger observed delay to reach the same confidence. Borderline results, where there is a real delay but it falls short of the threshold, get classified as POTENTIAL and trigger a retry with a higher delay value.

IDOR response comparison

A 200 from another user's resource endpoint does not confirm IDOR. You need to prove the response contains their data, not just a different error page.
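One way to separate real data from a different error page is sketched below. It is a rough illustration, not EVA's actual logic: the `Response` type, the evidence labels, and the denial markers (`DENIAL_KEYS`, `DENIAL_PHRASES`) are all assumptions chosen for the example, and a real verifier would need far more robust structure parsing.

```python
import json
from dataclasses import dataclass

DENIAL_STATUSES = {401, 403, 404}
# Hypothetical markers for a denial hidden inside a 200 response.
DENIAL_KEYS = {"error", "errors"}
DENIAL_PHRASES = ("access denied", "forbidden", "unauthorized", "not found")


@dataclass
class Response:
    status: int
    body: str


def looks_like_denial(body: str) -> bool:
    """Guess whether a 200 body is an error envelope rather than data."""
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    if isinstance(parsed, dict):
        # JSON object: look for an error key such as {"error": "access denied"}.
        return bool(DENIAL_KEYS & {key.lower() for key in parsed})
    if parsed is not None:
        return False  # other JSON (e.g. a list) is treated as data here
    # Not JSON (e.g. HTML): fall back to a crude phrase match.
    lowered = body.lower()
    return any(phrase in lowered for phrase in DENIAL_PHRASES)


def compare_idor(own: Response, target: Response) -> str:
    """Return an evidence label for the attacker's view of the target resource."""
    if target.status in DENIAL_STATUSES:
        return "DENIED"            # access control is working
    if target.status != 200:
        return "INCONCLUSIVE"
    if looks_like_denial(target.body):
        return "DISGUISED_DENIAL"  # a 200 that is really a refusal
    if target.body == own.body:
        return "INCONCLUSIVE"      # no sign that someone else's data came back
    return "DATA_RETURNED"
```

The function returns evidence labels rather than final verdicts on purpose: mapping them to VERIFIED, POTENTIAL, or FALSE_POSITIVE is a separate decision, and only the `DENIED` case has a mapping the article states outright.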
The verifier requests the attacker's own resource and the target's resource, then compares the responses. Both returning 200 with different content is not enough. It needs to determine whether the target's response contains actual data or an error envelope dressed up as a 200. Plenty of APIs return 200 {"error": "access denied"} instead of a proper 403. The verifier parses response structure to distinguish real data from disguised denials across both JSON and HTML responses. Denial responses (401, 403, 404) are an immediate FALSE_POSITIVE because access control is working.

XSS (browser-based)

String matching produces false positives. A reflected <script> inside an HTML comment, a quoted attribute, or a JSON body is not exploitable XSS. EVA uses Playwright with headless Chromium to check for actual JavaScript execution.

The simplest check listens for browser dialog events. If the payload triggers an alert() or prompt(), the JS ran. For payloads that do not use dialogs, EVA monitors console output for canary strings or watches for outbound requests th...
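The dialog-based check could look roughly like this with Playwright's Python sync API. It is a sketch under assumptions, not EVA's code: `DialogRecorder`, `payload_executes`, the timeout, and the 500 ms settle wait are hypothetical, and `url` is assumed to be a URL that already carries the payload.

```python
from dataclasses import dataclass, field


@dataclass
class DialogRecorder:
    """Collects browser dialogs; any recorded dialog means page JavaScript ran."""
    seen: list = field(default_factory=list)

    def __call__(self, dialog) -> None:
        self.seen.append((dialog.type, dialog.message))
        # A dialog must be accepted or dismissed, otherwise the page stalls.
        dialog.dismiss()

    @property
    def executed(self) -> bool:
        return bool(self.seen)


def payload_executes(url: str, timeout_ms: int = 5000) -> bool:
    """Load `url` in headless Chromium and report whether a dialog fired."""
    # Imported lazily so the recorder can be exercised without Playwright installed.
    from playwright.sync_api import sync_playwright

    recorder = DialogRecorder()
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.on("dialog", recorder)  # fires for alert(), confirm(), prompt()
            page.goto(url, timeout=timeout_ms)
            page.wait_for_timeout(500)   # assumption: brief wait for late scripts
        finally:
            browser.close()
    return recorder.executed
```

A reflected `<script>` sitting inside an HTML comment never fires the dialog event, so this check returns False where string matching would have reported a finding.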
