Anthropic Claude Mythos tops reasoning, security benchmarks
Anthropic’s leaked "Claude Mythos" model, codenamed Capybara, reportedly sets new performance records for reasoning and autonomous zero-day vulnerability detection. Currently restricted to select partners under Project Glasswing, the high-compute model represents a shift toward prioritized output quality over speed or cost.
Claude Mythos represents a category-defining leap that moves LLMs from general-purpose assistants to specialized, autonomous reasoning agents. The reported 97% USAMO score confirms a level of mathematical reasoning signaling a major breakthrough in logical consistency, while advanced offensive cyber capabilities identify vulnerabilities with a success rate far exceeding previous models. The introduction of the Capybara tier suggests Anthropic is bifurcating its lineup into consumer and high-compute expert tiers. Restricted access via Project Glasswing highlights growing safety concerns regarding dual-use models, and high operational costs indicate that wide commercial deployment remains distant.
DISCOVERED
4d ago
2026-04-07
PUBLISHED
4d ago
2026-04-07
RELEVANCE
AUTHOR
ImmuneHack