THUDM open-sources Slime RL framework for GLM-5.2

// 1h ago OPENSOURCE RELEASE

Tsinghua University's THUDM group has open-sourced Slime, the reinforcement learning (RL) post-training framework behind Zhipu AI's GLM-4 and GLM-5 series. Slime pairs Megatron-LM for training with SGLang for inference rollouts, which lets it run On-Policy Distillation (OPD) loops in parallel. It was used to complete post-training of the 744B-parameter GLM-5.2 MoE model in approximately two days.

// ANALYSIS

Open-sourcing production-grade RL post-training infrastructure lowers the barrier to entry for training large-scale agentic models.

* Unifies Megatron-LM for training and SGLang for inference rollouts, minimizing system and synchronization overhead.

* Features a TransferQueue to decouple compute processes and a Distributed Checkpoint Service for asynchronous weight syncing.

* Targets complex RL workflows such as multi-turn agentic rollouts and parallel OPD loops, enabling rapid folding of expert models.
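The decoupling idea in the bullets above can be illustrated with a toy producer/consumer loop. This is a minimal sketch in plain Python, not Slime's actual API: `run_decoupled_loop`, the `buffer` queue, and the `weights` dictionary are hypothetical stand-ins for a transfer queue and an asynchronously published checkpoint, and no Megatron-LM or SGLang calls are involved.

```python
import queue
import threading


def run_decoupled_loop(num_rollouts=8, batch_size=4, queue_size=4):
    """Toy loop: a rollout worker and a trainer joined by a bounded queue.

    All names here are illustrative assumptions, not Slime's interfaces.
    """
    buffer = queue.Queue(maxsize=queue_size)  # stand-in for a transfer queue
    weights = {"version": 0}                  # stand-in for published weights
    lock = threading.Lock()
    trained_batches = []

    def rollout_worker():
        # Generates samples with whatever weights version is currently published.
        for i in range(num_rollouts):
            with lock:
                version = weights["version"]
            # put() blocks when the queue is full, giving simple backpressure.
            buffer.put({"sample_id": i, "policy_version": version})
        buffer.put(None)  # sentinel: no more rollouts

    def trainer():
        batch = []
        while True:
            item = buffer.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) == batch_size:
                trained_batches.append(batch)
                batch = []
                # "Train" on the batch, then publish a new weights version.
                with lock:
                    weights["version"] += 1
        if batch:
            trained_batches.append(batch)  # leftover partial batch, no update

    threads = [threading.Thread(target=rollout_worker),
               threading.Thread(target=trainer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return trained_batches, weights["version"]


if __name__ == "__main__":
    batches, final_version = run_decoupled_loop()
    print(len(batches), "batches trained, final weights version", final_version)
```

Because the two threads only meet at the queue and the version counter, the rollout side never waits for a training step to finish unless the queue is full. The `policy_version` recorded on each sample shows how stale the generating weights were, which is the kind of bookkeeping an asynchronous setup generally has to track.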

// TAGS

reinforcement-learning, fine-tuning, open-source, megatron-lm, sglang, machine-learning-infrastructure, glm-5.2, post-training, slime

DISCOVERED

1h ago

2026-06-20

PUBLISHED

2h ago

2026-06-20

RELEVANCE

8/10

AUTHOR

jeremyphoward
