OpenRouter lands fast GLM-5.2 endpoints, nitro routing
OpenRouter has added new fast inference endpoints for Z.ai's GLM-5.2 model, hosted by Wafer and Fireworks AI. Developers can use the "z-ai/glm-5.2:nitro" model ID to automatically route requests to the fastest provider based on live throughput data.
OpenRouter's new dynamic routing options and high-speed endpoints make running the flagship open-weights GLM-5.2 model significantly faster and more reliable.
- –Dynamic routing via the ':nitro' suffix solves the provider availability and speed volatility problem for production AI agents.
- –The addition of Wafer and Fireworks AI fast variants introduces healthy competition among inference providers, driving down latency and costs.
- –GLM-5.2's 1M-token context window makes high-speed endpoints crucial for developers running long-horizon coding tasks and multi-step workflows.
- –Using the unified ':nitro' endpoint prevents vendor lock-in and eliminates the need for manual fallback logic in developer codebases.
DISCOVERED
1h ago
2026-06-26
PUBLISHED
1h ago
2026-06-26
RELEVANCE
AUTHOR
OpenRouter