CyberGym benchmark reveals massive AI hacking leap
New results from UC Berkeley’s CyberGym benchmark show frontier models like GPT-5.5 and Claude Mythos achieving over 80% success in autonomous vulnerability reproduction. The framework, which spans 1,507 real-world tasks, has already helped agents discover 35 new zero-day vulnerabilities.
CyberGym is graduating from a research project to the definitive metric for agentic security capabilities, effectively becoming the "SWE-bench" of offensive research.
- Frontier models have jumped from ~12% to over 80% success in less than a year, signaling a breakthrough in long-horizon reasoning and codebase navigation.
- The discovery of 35 zero-days in major libraries like OpenSSL shows that AI agents can now outperform traditional fuzzing and human review in specific contexts.
- High performance on CyberGym was reportedly a key factor in Anthropic's decision to gate "Mythos," illustrating how benchmarks are now driving safety policy.
- The shift to execution-based evaluation, which requires a working proof of concept (PoC), guards against the data contamination that plagues static security benchmarks.
- Microsoft’s MDASH now leads the leaderboard at 88.45%, demonstrating the efficacy of multi-agent architectures in complex security tasks.
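To make the execution-based evaluation point concrete, here is a minimal sketch of how such a check could be scored. It assumes a PoC counts as a successful reproduction only if it crashes the vulnerable build and does not crash the patched build. That criterion, and every name below (`Runner`, `evaluate_poc`, `success_rate`, `toy_runner`), are illustrative assumptions and not CyberGym's actual harness or API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical runner signature: given a build label ("vulnerable" or
# "patched") and the PoC input bytes, return True if the target crashed.
Runner = Callable[[str, bytes], bool]


@dataclass
class TaskResult:
    task_id: str
    reproduced: bool


def evaluate_poc(task_id: str, poc: bytes, run: Runner) -> TaskResult:
    """Score one task by executing the PoC rather than inspecting its text.

    Assumed criterion: the PoC must crash the vulnerable build and must
    not crash the patched build.
    """
    crashed_vulnerable = run("vulnerable", poc)
    crashed_patched = run("patched", poc)
    return TaskResult(task_id, crashed_vulnerable and not crashed_patched)


def success_rate(results: List[TaskResult]) -> float:
    """Fraction of tasks whose PoC reproduced the vulnerability."""
    if not results:
        return 0.0
    return sum(r.reproduced for r in results) / len(results)


def toy_runner(build: str, poc: bytes) -> bool:
    """Stand-in for a real sandboxed execution (assumption for the demo).

    Pretends the vulnerable build crashes on inputs starting with b"BAD"
    and that the patched build never crashes.
    """
    return build == "vulnerable" and poc.startswith(b"BAD")


results = [
    evaluate_poc("task-1", b"BAD-input", toy_runner),
    evaluate_poc("task-2", b"harmless", toy_runner),
]
rate = success_rate(results)
```

Because the score depends on what happens when the PoC actually runs, a memorized answer is of little use unless it produces a working input, which is the property the bullet above attributes to execution-based benchmarks.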
DISCOVERED
2026-05-15
PUBLISHED
2026-05-15
AUTHOR
Wes Roth