GLM-5.2 hits Fireworks inference
Z.ai's GLM-5.2 is now available on Fireworks serverless inference with open weights, an MIT license, 1M-token context, and coding-first positioning for long-horizon agent workflows.
GLM-5.2 is the most serious open-weight pressure test yet for closed frontier coding models, and Fireworks is using speed, control, and benchmark validation as the wedge.
- Fireworks says it runs the weights on its own stack, not as a router, which matters for teams that care about data retention, uptime, and predictable serving.
- The model page lists a 743B-parameter MoE design, 1040k-token context, function calling, and serverless pricing at $1.40 input / $4.40 output per million tokens.
- Fireworks independently reproduced a 91.4% GPQA-Diamond result, but developers should still test the model against their own codebases before treating the comparisons to Opus 4.8 circulating on social media as settled.
- The broader story is that open-weight models are closing the agentic coding gap while undercutting closed-model economics.
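To make the listed pricing concrete, here is a minimal sketch of a cost estimate built only from the two per-million-token prices quoted above. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any Fireworks API, and real bills may differ if pricing changes or other charges apply.

```python
from decimal import Decimal

# Serverless prices quoted in the summary above, in USD per million tokens.
INPUT_PRICE_PER_M = Decimal("1.40")
OUTPUT_PRICE_PER_M = Decimal("4.40")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Hypothetical agent session: 200k tokens read, 20k tokens generated.
    print(estimate_cost(200_000, 20_000))
```

Using `Decimal` rather than floats keeps the currency arithmetic exact. Because output tokens cost roughly three times as much as input tokens at these rates, long-context agent runs that read far more than they write are dominated by the input side of the bill.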
DISCOVERED
2026-06-19
PUBLISHED
2026-06-19
AUTHOR
rileybrown