GLM-5.2 sets open-source ARC-AGI-2 record
Z.ai's 744B open-weight model GLM-5.2 achieves a 22.8% score on the ARC-AGI-2 benchmark, marking the strongest performance to date for an open-source model. The model features agentic capabilities and a 1M-token context window designed for long-horizon software engineering tasks.
While still trailing top closed-source models, GLM-5.2's ARC-AGI-2 performance signals serious fluid reasoning capabilities in the open-weight ecosystem.
- 22.8% score demonstrates early agentic reasoning capacity previously restricted to proprietary APIs
- Under-the-hood architecture includes IndexShare and improved multi-token prediction to drastically reduce inference costs
- 1M-token context window and MIT license position it as a viable local alternative for repository-scale coding agents
- Maintains the typical 6-12 month performance gap behind frontier models like GPT-5.5, which recently hit 85%
DISCOVERED
2026-06-25
PUBLISHED
2026-06-24
AUTHOR
fchollet