Mythos Preview hits 93.9% SWE-bench, remains restricted

// 45d ago · MODEL RELEASE

Anthropic's restricted "super-frontier" model outperforms Opus 4.7 across all benchmarks, setting a new record of 93.9% on SWE-bench Verified. The model is currently limited to defensive cybersecurity partners in Project Glasswing due to its high capability for autonomous zero-day discovery and exploitation.

// ANALYSIS

Anthropic is building a "Manhattan Project" for cybersecurity, prioritizing infrastructure defense over general availability.

  • The 13-point jump on SWE-bench Verified signals a massive leap in reasoning and autonomous software engineering.
  • Autonomous discovery of 27-year-old vulnerabilities makes this model a high-risk asset that could be weaponized for offensive hacking if leaked.
  • Project Glasswing’s $100M in credits and $4M in donations aim to secure global software before offensive models catch up.
  • A 100% score on Cybench saturates the benchmark, suggesting that existing security evals need a total overhaul.
// TAGS
llm, ai-coding, agent, reasoning, benchmark, safety, research, claude-mythos-preview

DISCOVERED

45d ago

2026-04-16

PUBLISHED

45d ago

2026-04-16

RELEVANCE

9/10

AUTHOR

Bijan Bowen