Z.ai has released GLM-5.2, a flagship open-weights Mixture-of-Experts model designed for long-context software engineering and complex agentic tasks.
Z.ai (formerly Zhipu AI) has launched GLM-5.2, a flagship Mixture-of-Experts (MoE) large language model released under the MIT license. The model has 744 billion total parameters, of which 40 billion are active per token, and supports a native 1-million-token context window aimed at complex coding and agentic workflows. To improve efficiency, GLM-5.2 introduces "IndexShare", which reuses indexers across sparse attention layers and cuts per-token FLOPs by 2.9×, and it incorporates an improved Multi-Token Prediction (MTP) layer for faster speculative decoding. Developers can run the model locally using quantized weights or access it through API platforms.
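To put the headline figures in proportion, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted above; the variable names are illustrative, and it assumes the 2.9× figure is a simple before/after ratio for whichever part of the computation it refers to, which the summary does not specify.

```python
# Figures quoted in the summary above (in billions of parameters).
TOTAL_PARAMS_B = 744
ACTIVE_PARAMS_B = 40

# Assumption: "2.9x" is a plain before/after ratio of per-token FLOPs.
INDEXSHARE_FLOP_REDUCTION = 2.9

# Share of the model's parameters that is active at once.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# Share of the original per-token FLOPs that remains after the reduction.
remaining_flops_fraction = 1 / INDEXSHARE_FLOP_REDUCTION

print(f"Active parameters: {active_fraction:.1%} of the total")
print(f"Per-token FLOPs after IndexShare: {remaining_flops_fraction:.1%} of the baseline")
```

In other words, roughly one parameter in nineteen is active at a time, and a 2.9× reduction leaves about a third of the original cost.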
GLM-5.2 is a notable milestone for open-weights AI, showing that openly licensed models can challenge proprietary systems in complex, long-context software engineering without restrictive licensing.
* **Frontier-level coding performance:** Competes directly with leading proprietary models on long-horizon agentic benchmarks like SWE-bench.
* **Architectural breakthroughs:** IndexShare and Multi-Token Prediction (MTP) significantly reduce the compute and latency overhead of long-context inference.
* **Truly open and accessible:** The MIT license and lack of regional restrictions encourage global deployment and modification.
* **Local execution via quantization:** Support for aggressive 1-bit or 2-bit quantization makes it feasible to run this 744B-parameter model on local infrastructure.
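The quantization point can be made concrete with a rough estimate of weight storage at different bit widths. This is a minimal sketch, assuming every one of the 744 billion parameters is stored at the same precision; the helper name `weight_footprint_gb` is hypothetical. Real quantized checkpoints typically keep some layers at higher precision, and the KV cache and activations add memory on top, so treat these numbers as lower bounds.

```python
TOTAL_PARAMS = 744e9  # total parameter count quoted in the summary


def weight_footprint_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    bytes_total = total_params * bits_per_weight / 8
    return bytes_total / 1e9


# Compare common precisions against the 1-bit and 2-bit settings mentioned above.
for bits in (16, 8, 4, 2, 1):
    size_gb = weight_footprint_gb(TOTAL_PARAMS, bits)
    print(f"{bits:>2}-bit: ~{size_gb:,.0f} GB")
```

Under these assumptions the weights shrink from roughly 1.5 TB at 16-bit to about 186 GB at 2-bit and 93 GB at 1-bit, which is what brings the model within reach of a single high-memory machine.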
DISCOVERED
2026-06-20
PUBLISHED
2026-06-20
AUTHOR
johnseach