GLM-5.2 hits 120 tok/s on Blackwell tinyboxes

// 4h ago · BENCHMARK RESULT

According to a rumor reported by Tiny Corp, Zhipu AI's upcoming GLM-5.2 model is running at 120 tokens per second on two networked Blackwell-based tinyboxes. The hardware is estimated to cost $150,000, which points to a potential shift toward powerful, cost-effective, decentralized local clusters for running frontier large language models.

// ANALYSIS

If the rumor holds, local AI hardware clusters are starting to encroach on cloud dominance by bringing high-speed, frontier-class inference within reach of enterprise budgets.

– Networked Blackwell tinyboxes demonstrate the viability of Tiny Corp's architecture for multi-GPU, high-bandwidth workloads.
– A speed of 120 tokens per second makes real-time, multi-step agentic workflows highly practical for local deployments.
– The $150,000 price tag lowers the entry barrier for organizations seeking data sovereignty and predictable operational costs over cloud APIs.
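
To put the cost claim in rough numbers, here is a minimal back-of-the-envelope sketch that amortizes the reported $150,000 over the reported 120 tokens per second. The three-year amortization period and 50% utilization are illustrative assumptions, not figures from the report. The function name is hypothetical, and the calculation treats 120 tok/s as the aggregate output rate, which the source does not specify. Power, networking, and staffing costs are left out.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def hardware_cost_per_million_tokens(hardware_cost_usd, tokens_per_second,
                                     amortization_years, utilization):
    """Amortized hardware-only cost (USD) per one million generated tokens.

    amortization_years and utilization are assumptions supplied by the
    caller; they do not come from the reported benchmark.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    if tokens_per_second <= 0 or amortization_years <= 0:
        raise ValueError("rate and amortization period must be positive")
    # Total tokens produced over the amortization window at the given duty cycle.
    total_tokens = (tokens_per_second * utilization
                    * amortization_years * SECONDS_PER_YEAR)
    return hardware_cost_usd / total_tokens * 1_000_000


# Reported figures: $150,000 and 120 tok/s.
# Assumed figures: 3-year amortization, 50% utilization.
cost = hardware_cost_per_million_tokens(150_000, 120,
                                        amortization_years=3,
                                        utilization=0.5)
print(f"${cost:.2f} per million tokens")  # $26.42 per million tokens
```

The result is very sensitive to the assumed utilization and to whether the cluster can serve several requests at once, so it is better read as a template for comparison against API pricing than as an estimate of real operating cost.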

// TAGS

glm-5.2, tinybox, blackwell, tinygrad, gpu, benchmark, llm

DISCOVERED

4h ago

2026-06-21

PUBLISHED

5h ago

2026-06-21

RELEVANCE

7/10

AUTHOR

AravSrinivas

// KEEP READING

More AI developer news from the feed

NEWS · 36m ago

Google, Meta models land on Huawei Ascend

The Chinese AI ecosystem is focusing on porting Western open-source models, such as Google's T5-Efficient-Tiny and Meta's V-JEPA 2, to Huawei's Ascend NPU. This trend highlights a shift toward building out software support and compatibility for domestic silicon during a quiet cycle for novel local releases.

NEWS · 2h ago

OpenAI Codex teases major front-end updates

An upcoming update for OpenAI Codex is being teased on social media as a potentially game-changing solution for front-end development. The teaser hints that the new release will address long-standing challenges in automating front-end coding, generating excitement within the developer community about the next generation of AI-assisted software engineering tools.

NEWS · 3h ago

Codex App built with okayish frontend models

In a social media post, Thibault Sottiaux, head of the Codex team at OpenAI, revealed that the Codex desktop application was developed using models with only 'okayish' frontend capabilities. He hinted at how much more the team could build once OpenAI's models become significantly better at frontend development.