OPEN_SOURCE
REDDIT // 3h ago · INFRASTRUCTURE
Dev preps 1.5TB Mac for GLM-5.2
A developer configured a 2019 Mac Pro with 1.5TB of RAM and 128GB of VRAM to benchmark massive local models like Zhipu AI's 744B-parameter GLM-5.2. The community is eager to see whether offloading Mixture-of-Experts layers to VRAM makes running this behemoth viable on a single workstation.
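For scale, a quick back-of-envelope calculation shows why the 1.5TB ceiling matters. The quantization bit-widths below are common assumptions, not figures from the post, and KV cache or runtime overhead is not included:

```python
# Rough weight-memory footprint for a 744B-parameter model at
# common quantization levels. Bit-widths are illustrative assumptions;
# KV cache and runtime overhead are excluded.
params = 744e9

for name, bits in [("fp16", 16), ("q8", 8), ("q4", 4)]:
    gib = params * bits / 8 / 2**30
    print(f"{name}: ~{gib:,.0f} GiB of weights")

# fp16: ~1,386 GiB -- already brushing the 1.5TB ceiling before overhead
# q8:   ~693 GiB   -- fits in system RAM with room for context
# q4:   ~346 GiB   -- comfortable headroom
```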
// ANALYSIS
Running a 744B parameter MoE model locally usually requires a server rack, but brute-forcing it with maxed-out older hardware is becoming a fascinating hobbyist strategy.
- GLM-5.2's sheer size demands extraordinary memory, making the 1.5TB RAM ceiling of the 2019 Intel Mac Pro surprisingly relevant for extreme local deployments
- Selectively offloading active MoE experts to the 128GB VRAM pool while keeping inactive experts in system RAM could yield acceptable inference speeds (see the sketch after this list)
- This experiment highlights the growing distinction between raw compute requirements for training and sheer memory capacity requirements for MoE inference
- Zhipu AI releasing GLM-5.2 under an MIT license empowers developers to push the boundaries of consumer-grade hardware setups
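The offloading idea in the second bullet can be sketched in a few lines of PyTorch. This is a minimal illustration of the general technique, not GLM-5.2's actual architecture or any particular runtime's implementation; the dimensions and names (num_experts, top_k, etc.) are made up for the example:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Toy sizes -- far smaller than GLM-5.2's real config.
d_model, d_ff = 512, 2048
num_experts, top_k = 8, 2

# Expert weights stay resident in system RAM (CPU tensors).
experts_cpu = [
    (torch.randn(d_model, d_ff), torch.randn(d_ff, d_model))
    for _ in range(num_experts)
]

# The router runs for every token, so it lives on the accelerator.
router = torch.nn.Linear(d_model, num_experts).to(device)

def moe_forward(x: torch.Tensor) -> torch.Tensor:
    """Route tokens, then pull only the chosen experts into VRAM."""
    scores = router(x)                         # (batch, num_experts)
    weights, idx = scores.topk(top_k, dim=-1)  # top-k experts per token
    weights = torch.softmax(weights, dim=-1)

    out = torch.zeros_like(x)
    for e in idx.unique().tolist():
        w_in, w_out = experts_cpu[e]
        # This copy over PCIe is the cost the strategy pays per step.
        w_in, w_out = w_in.to(device), w_out.to(device)
        mask = (idx == e)                      # tokens routed to expert e
        gate = (weights * mask).sum(dim=-1, keepdim=True)
        out += gate * (torch.relu(x @ w_in) @ w_out)
    return out

x = torch.randn(4, d_model, device=device)
print(moe_forward(x).shape)  # torch.Size([4, 512])
```

The trade-off is the per-step transfer of the selected experts' weights into VRAM; whether that copy amortizes well against the Mac Pro's memory bandwidth is exactly what this benchmark should reveal.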
// TAGS
glm-5.2 · llm · self-hosted · open-weights · inference
DISCOVERED
2026-04-28 (3h ago)
PUBLISHED
2026-04-28 (5h ago)
RELEVANCE
8/10
AUTHOR
habachilles