Composer 2.5 tops CursorBench, Gemini 3.5 Flash slips
Composer 2.5 scored 63.2% on the newest CursorBench evaluation, nearly matching flagship performance at roughly one-twentieth the cost. The results underline its value for AI-assisted coding, while Google's Gemini 3.5 Flash fell to tenth place.
The latest CursorBench results suggest that budget-tier coding models can now handle complex tasks, changing the cost equation for developers.
- Composer 2.5 achieved a 63.2% success rate at only $0.55 per task, nearly matching heavyweights like Opus 4.7 Max and GPT 5.5 Extra High.
- It delivers this near-flagship capability at a 20x cost reduction, making it highly attractive for intensive, agentic coding workflows.
- Gemini 3.5 Flash stumbled with a 49.8% score, landing at #10 and falling behind older budget competitors like GPT 5.5 Low.
- The benchmark gained massive traction after Elon Musk amplified the results, solidifying Composer 2.5 as a sleeper hit.
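One way to read the cost figures above is as spend per solved task rather than per attempted task. The sketch below is illustrative only: it assumes the $0.55 figure is the average cost of one attempt and that the 63.2% success rate applies uniformly across tasks. The helper name `cost_per_solved_task` is hypothetical and not part of CursorBench.

```python
def cost_per_solved_task(cost_per_task: float, success_rate: float) -> float:
    """Expected spend per successfully completed task.

    Assumes cost_per_task is the average cost of one attempt and
    success_rate is the fraction of attempts that succeed.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


# Figures reported for Composer 2.5: $0.55 per task, 63.2% success rate.
composer = cost_per_solved_task(0.55, 0.632)
print(f"Composer 2.5: ${composer:.2f} per solved task")  # prints "Composer 2.5: $0.87 per solved task"
```

The same function could be applied to any other model on the leaderboard once its per-task cost is known; the summary does not list those costs.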
DISCOVERED
2026-05-21
PUBLISHED
2026-05-21
AUTHOR
elonmusk