Moonshot AI's Kimi 2.7 ranks second on the ErdosBench reasoning smoke test, outperforming GPT-5 xhigh.
In a recent re-run of the ErdosBench smoke test on 14 math problems, Moonshot AI's Kimi 2.7 model placed second, behind only Claude Fable 5. The benchmark, which evaluates advanced mathematical reasoning and abstract problem-solving, placed Kimi 2.7 ahead of GPT-5 xhigh, highlighting how competitive Chinese frontier models have become on high-level reasoning tasks.
Kimi 2.7's second-place finish on ErdosBench indicates that Chinese AI labs are reaching parity with, and in some cases surpassing, Western models on high-level reasoning.
* Benchmark-Specific Success: ErdosBench's focus on synthetic, research-grade math problems tests deep reasoning rather than raw knowledge retrieval.
* Competitive Landscape: Beating GPT-5 xhigh is a major milestone for Moonshot AI, cementing Kimi 2.7 as one of the world's leading reasoning models.
* The Frontier Shift: Despite Kimi 2.7's impressive performance, Claude Fable 5's top position indicates that the frontier of mathematical AI remains highly contested.
Discovered: 2026-06-17
Published: 2026-06-17
Author: Kimi_Moonshot