OPEN_SOURCE
REDDIT // 3h ago · INFRASTRUCTURE
Dev preps 1.5TB Mac for GLM-5.2
A developer configured a 2019 Mac Pro with 1.5TB of RAM and 128GB of VRAM to benchmark massive local models like Zhipu AI's 744B-parameter GLM-5.2. The community is eager to see whether offloading Mixture-of-Experts layers to VRAM makes running this behemoth viable on a single workstation.
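For scale, a quick back-of-envelope calculation shows why the 1.5TB ceiling matters. The quantization bit-widths below are common assumptions, not figures from the post, and KV cache or runtime overhead is not included:

```python
# Rough weight-memory footprint for a 744B-parameter model at
# common quantization levels. Bit-widths are illustrative assumptions;
# KV cache and runtime overhead are excluded.
params = 744e9

for name, bits in [("fp16", 16), ("q8", 8), ("q4", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{name}: ~{gib:,.0f} GiB of weights")

# fp16: ~1,386 GiB -- already brushing the 1.5TB ceiling before overhead
# q8:   ~693 GiB   -- fits in system RAM with room for context
# q4:   ~346 GiB   -- comfortable headroom
```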
// ANALYSIS
Running a 744B parameter MoE model locally usually requires a server rack, but brute-forcing it with maxed-out older hardware is becoming a fascinating hobbyist strategy.
- GLM-5.2's sheer size demands extraordinary memory, making the 1.5TB RAM ceiling of the 2019 Intel Mac Pro surprisingly relevant for extreme local deployments
- Selectively offloading active MoE experts to the 128GB VRAM pool while keeping inactive experts in system RAM could yield acceptable inference speeds (see the sketch after this list)
- This experiment highlights the growing distinction between raw compute requirements for training and sheer memory capacity requirements for MoE inference
- Zhipu AI releasing GLM-5.2 under an MIT license empowers developers to push the boundaries of consumer-grade hardware setups
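The offloading idea in the second bullet can be sketched in a few lines of PyTorch. This is a minimal illustration of the general technique, not GLM-5.2's actual architecture or any particular runtime's implementation; the dimensions and names (num_experts, top_k, etc.) are made up for the example:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Toy sizes -- far smaller than GLM-5.2's real config.
d_model, d_ff = 512, 2048
num_experts, top_k = 8, 2

# Expert weights stay resident in system RAM (CPU tensors).
experts_cpu = [
    (torch.randn(d_model, d_ff), torch.randn(d_ff, d_model))
    for _ in range(num_experts)
]

# The router runs for every token, so it lives on the accelerator.
router = torch.nn.Linear(d_model, num_experts).to(device)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """Route tokens, then pull only the chosen experts into VRAM."""
    scores = router(x)                         # (batch, num_experts)
    weights, idx = scores.topk(top_k, dim=-1)  # top-k experts per token
    weights = torch.softmax(weights, dim=-1)

    out = torch.zeros_like(x)
    for e in idx.unique().tolist():
        w_in, w_out = experts_cpu[e]
        # This copy over PCIe is the cost the strategy pays per step.
        w_in, w_out = w_in.to(device), w_out.to(device)
        mask = (idx == e)                      # tokens routed to expert e
        gate = (weights * mask).sum(dim=-1, keepdim=True)
        out += gate * (torch.relu(x @ w_in) @ w_out)
    return out

x = torch.randn(4, d_model, device=device)
print(moe_forward(x).shape)  # torch.Size([4, 512])
```

The trade-off is the per-step transfer of the selected experts' weights into VRAM; whether that copy amortizes well against the Mac Pro's memory bandwidth is exactly what this benchmark should reveal.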
// TAGS
glm-5.2 · llm · self-hosted · open-weights · inference
DISCOVERED
2026-04-28 (3h ago)
PUBLISHED
2026-04-28 (5h ago)
RELEVANCE
8/10
AUTHOR
habachilles