Mythos Preview hits 93.9% on SWE-bench Verified, remains restricted
Anthropic's restricted "super-frontier" model outperforms Opus 4.7 across all benchmarks, setting a new record of 93.9% on SWE-bench Verified. The model is currently limited to defensive cybersecurity partners in Project Glasswing due to its high capability for autonomous zero-day discovery and exploitation.
Anthropic is building a "Manhattan Project" for cybersecurity, prioritizing infrastructure defense over general accessibility.
- The 13-point jump on SWE-bench Verified signals a massive leap in reasoning and autonomous software engineering.
- Autonomous discovery of 27-year-old vulnerabilities makes this model a high-risk asset that could be weaponized for hacking if leaked.
- Project Glasswing’s $100M in credits and $4M in donations aim to secure global software before offensive models catch up.
- A 100% score on Cybench saturates the benchmark, signaling that existing security evals need a total overhaul.
DISCOVERED
2026-04-16
PUBLISHED
2026-04-16
AUTHOR
Bijan Bowen