OPEN_SOURCE ↗
REDDIT // 5h ago · OPENSOURCE RELEASE
Qwen-Scope maps Qwen 3.5 hidden features
Qwen Team released Qwen-Scope, a public SAE suite for Qwen 3.5 models spanning 2B through 35B MoE, plus a Hugging Face Space for feature exploration and steering. It exposes residual-stream features across layers, turning model internals into something researchers can inspect, localize, and intervene on.
// ANALYSIS
This is a serious interpretability release, not just another model dump. The big story is that Qwen is making feature-level control and debugging feel practical for a broad model family, which moves SAEs from niche research into usable tooling.
- Coverage across multiple Qwen 3.5 sizes makes this more useful than a one-off demo on a single checkpoint.
- Residual-stream, all-layer coverage matters because it lets you trace behaviors like language switching, refusals, and style drift to specific learned features.
- Steering and ablation are the obvious headline use cases, but the more durable value is debugging and dataset auditing for fine-tunes.
- It is also plainly dual-use: the same machinery that helps explain behavior can be used to suppress safety-related features or push the model toward unwanted behaviors.
- Compared with prompt-only control, feature editing is much more surgical, which is why interpretability folks will care and policy folks will be uneasy.
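For readers new to SAEs, here is a toy sketch of what "steering" and "ablation" on a residual-stream feature usually mean. It is not Qwen-Scope's API: the weights are random, the sizes are tiny, and every name (`encode`, `decode`, `steer`, `ablate`, `W_enc`, `W_dec`) is hypothetical. It also assumes a plain ReLU autoencoder, which may differ from the architecture the release actually uses.

```python
import random

# Toy sizes (assumption): real SAEs have far more features over a much wider residual stream.
D_MODEL, N_FEATURES = 4, 8

rng = random.Random(0)
# Hypothetical SAE weights, random here; a released suite would ship trained ones.
W_enc = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_FEATURES)]
b_enc = [0.0] * N_FEATURES
W_dec = [[rng.gauss(0, 1) for _ in range(D_MODEL)] for _ in range(N_FEATURES)]  # one direction per feature
b_dec = [0.0] * D_MODEL


def encode(x):
    """Feature activations: ReLU(W_enc @ x + b_enc)."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_enc, b_enc)]


def decode(acts):
    """Reconstruct the residual vector: acts @ W_dec + b_dec."""
    return [b_dec[j] + sum(a * W_dec[i][j] for i, a in enumerate(acts))
            for j in range(D_MODEL)]


def steer(x, feature, strength):
    """Push the residual vector along one feature's decoder direction."""
    return [xi + strength * d for xi, d in zip(x, W_dec[feature])]


def ablate(x, feature):
    """Subtract one feature's decoded contribution from the residual vector."""
    a = encode(x)[feature]
    return [xi - a * d for xi, d in zip(x, W_dec[feature])]


x = [0.5, -1.0, 0.25, 2.0]           # stand-in for one token's residual vector
acts = encode(x)                      # which features fire, and how strongly
x_steered = steer(x, feature=2, strength=4.0)
x_ablated = ablate(x, feature=2)
```

In a real model the edited vector would be written back into the forward pass at the layer the SAE was trained on. Editing the residual vector directly, rather than replacing it with `decode(acts)`, leaves the part of the activation the SAE fails to reconstruct untouched.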
// TAGS
qwen-scope qwen llm open-source research interpretability safety
DISCOVERED
5h ago
2026-04-30
PUBLISHED
8h ago
2026-04-30
RELEVANCE
9/10
AUTHOR
MadPelmewka