We hid backdoors in binaries — Opus 4.6 found 49% of them

What: Researchers tested AI agents' ability to detect backdoors in binary executables without source code access.
Impact: The study found that AI agents can detect some hidden backdoors in binaries, which was unexpected.

Piotr Grabowski & Rafał Strzaliński & Michał Kowalczyk & Piotr Migdał & Jacek Migdal
10 February 2026

Claude can code, but can it check binary executables?

We already did our experiments with using NSA software to hack a classic Atari game. This time we want to focus on a much more practical task: using AI agents for malware detection. We partnered with Michał “Redford” Kowalczyk, a reverse engineering expert from Dragon Sector known for finding malicious code in Polish trains, to create a benchmark for finding backdoors in binary executables without access to source code.

See BinaryAudit for the full benchmark results, including false positive rates, tool proficiency, and the Pareto frontier of cost-effectiveness. All tasks are open source and available at quesmaOrg/BinaryAudit.

We were surprised that today’s AI agents can detect some hidden backdoors in binaries. We hadn’t expected them to possess such specialized reverse engineering capabilities.

However, this approach is not ready for production. Even the best model, Claude Opus 4.6, found relatively obvious backdoors in small/mid-size binaries only 49% of the time. Worse yet, most models had a high false positive rate: they flagged clean binaries.

In this blog post we discuss a few recent security stories, explain what binary analysis is, and describe how we constructed a benchmark for AI agents. We will see when the agents accomplish tasks and when they fail, either by missing malicious code or by reporting false findings.

Background

Just a few months ago Shai Hulud 2.0 compromised thousands of organizations, including Fortune 500 companies, banks, governments, and cool startups (see the postmortem by PostHog). It was a supply chain attack on the Node Package Manager ecosystem that injected credential-stealing malicious code.

Just a few days ago, Notepad++ shared updates on a hijack by state-sponsored actors, who replaced legitimate binaries with infected ones.
Even the physical world is at stake, including critical infrastructure. For example, researchers found hidden radios in Chinese solar power inverters and security loopholes in electric buses. Every digital device has firmware, which is much harder to check than the software we install on a computer, and which has a much more direct impact. Both state and corporate actors have an incentive to tamper with it.

Michał “Redford” Kowalczyk from Dragon Sector on reverse engineering a train to analyze a suspicious malfunction, the most popular talk at the 37th Chaos Communication Congress. See also the “Dieselgate, but for trains” writeup and a subsequent discussion.

You do not even need bad actors. Network routers often have hidden admin passwords baked into their firmware so the vendor can troubleshoot remotely, but anyone who discovers those passwords gets the same access.

Can we use AI agents to protect against such attacks?

Binary analysis

In day-to-day programming, we work with source code. It relies on high-level abstractions: classes, functions, and types, organized into a clear file structure. LLMs excel here because they are trained on this human-readable logic.

Malware analysis forces us into a harder world: binary executables. Compilation translates high-level languages (like Go or Rust) into low-level machine code for a given CPU architecture (such as x86 or ARM). We get raw CPU instructions: moving data between registers, adding numbers, or jumping to memory addresses. The original code structure, together with variable and function names, gets lost.

To make matters worse, compilers aggressively optimize for speed, not readability. They inline functions (changing the call hierarchy), unroll loops (replacing concise logic with repetitive blocks), and reorder instructions to keep the processor busy.

Yet a binary is what users actually run. And for closed-source and binary-distributed software, it is all we have.
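Not everything is lost in compilation: string literals usually remain in the file as plain bytes, which is one way a hard-coded password like the router example above can surface. The sketch below imitates what the Unix `strings` utility does, listing runs of printable ASCII. The function name `printable_strings` and the sample bytes (including the made-up password) are hypothetical illustrations, not material from the benchmark.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of printable ASCII that is at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical firmware fragment: a few machine-code bytes around two
# NUL-terminated string literals. None of this comes from a real device.
blob = (
    bytes.fromhex("4889df00")
    + b"admin\x00"
    + bytes.fromhex("e8b6c6ffff")
    + b"s3cret-vendor-pass\x00"
    + bytes.fromhex("9090")
)

for text in printable_strings(blob):
    print(text)
```

A scan like this only finds data stored verbatim. It says nothing about what the code does with that data, which is why the rest of the analysis has to go through the instructions themselves.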
Analyzing binaries is a long and tedious process of reverse engineering, which starts with a chain of translations: machine code → assembly → pseudo-C. Let’s see how an example backdoor looks in those representations:

1. Raw binary (xxd)

    b9 01 00 00 00 48 89 df ba e0 00 00 00 e8 b6 c6
    ff ff 49 89 c5 48 85 c0 74 6e 44 0f b6 40 01 4c
    8d 8c 24 a0 01 00 00 49 8d 75 02 4c 89 cf 4c 89
    c0 41 83 f8 08 72 0a 4c 89 c1 48 c1 e9 03 f3 48
    a5 31 d2 41 f6 c0 04 74 09 8b 16 89 17 ba 04 00
    00 00 41 f6 c0 02 74 0c 0f b7 0c 16 66 89 0c 17
    48 83 c2 02 41 83 e0 01 74 07 0f b6 0c 16 88 0c
    17 4c 89 cf c6 84 04 a0 01 00 00 00 e8 b7 4c fd
    ff

2. Disassembly (objdump)

    33e88: mov ecx, 0x1
    33e8d: mov rdi, rbx
    33e90: mov edx, 0xe0
    33e95: call 30550
    33e9a: mov r13, rax
    33e9d: test rax, rax
    33ea0: je 33f10
    33ea2: movzx r8d, BYTE PTR [rax+1]
    33ea7: lea r9, [rsp+0x1a0]
    33eaf: lea rsi, [r13+0x2]
    ... (omitted for brevity)
    33efc: mov BYTE PTR [rsp+rax+0x1a0], 0x0
    33f04: call system@plt

3. Decompiled (Ghidra)

    lVar1...
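To make the first translation step concrete, here is a minimal sketch that resolves the two `call` instructions in the listing above directly from the raw bytes. It assumes the hex dump starts at address 0x33e88 (the first address in the disassembly) and handles only the x86-64 `call rel32` encoding: opcode `e8` followed by a signed 32-bit little-endian offset, relative to the address of the next instruction. The names `BASE`, `CODE`, and `call_target` are illustrative. This is not how objdump or the benchmark agents work internally.

```python
import struct

# Assumption: the first byte of the dump sits at 0x33e88, matching the
# first address in the objdump listing.
BASE = 0x33E88
CODE = bytes.fromhex(
    "b9 01 00 00 00 48 89 df ba e0 00 00 00 e8 b6 c6 "
    "ff ff 49 89 c5 48 85 c0 74 6e 44 0f b6 40 01 4c "
    "8d 8c 24 a0 01 00 00 49 8d 75 02 4c 89 cf 4c 89 "
    "c0 41 83 f8 08 72 0a 4c 89 c1 48 c1 e9 03 f3 48 "
    "a5 31 d2 41 f6 c0 04 74 09 8b 16 89 17 ba 04 00 "
    "00 00 41 f6 c0 02 74 0c 0f b7 0c 16 66 89 0c 17 "
    "48 83 c2 02 41 83 e0 01 74 07 0f b6 0c 16 88 0c "
    "17 4c 89 cf c6 84 04 a0 01 00 00 00 e8 b7 4c fd "
    "ff"
)


def call_target(code: bytes, base: int, addr: int) -> int:
    """Resolve the destination of a 5-byte `call rel32` located at addr."""
    offset = addr - base
    if code[offset] != 0xE8:
        raise ValueError(f"no call rel32 opcode at {addr:#x}")
    # The operand is a signed 32-bit little-endian displacement.
    (rel,) = struct.unpack_from("<i", code, offset + 1)
    # The displacement is relative to the end of the 5-byte instruction.
    return addr + 5 + rel


print(hex(call_target(CODE, BASE, 0x33E95)))  # 0x30550
print(hex(call_target(CODE, BASE, 0x33F04)))  # 0x8bc0
```

The first result matches `call 30550` in the disassembly. The second is the address that objdump labels `system@plt`: the disassembler shows a symbol name where the raw bytes carry only a relative offset.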
