Qwen-Scope maps Qwen 3.5 hidden features
REDDIT // 5h ago // OPENSOURCE RELEASE

Qwen Team released Qwen-Scope, a public suite of sparse autoencoders (SAEs) for Qwen 3.5 models spanning 2B through 35B MoE, plus a Hugging Face Space for feature exploration and steering. It exposes residual-stream features across layers, turning model internals into something researchers can inspect, localize, and intervene on.
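For readers new to SAEs, here is a minimal sketch of what "exposing residual-stream features" means in practice. It is not Qwen-Scope's code or API: the weights are random, the sizes are toy values, and the top-k ReLU encoder is an assumed architecture chosen only for illustration. The names `encode`, `decode`, `W_enc`, and `W_dec` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_features = 16, 64  # toy sizes; real SAEs are far wider

# Random stand-ins for trained SAE weights (assumption: no real checkpoint loaded).
W_enc = rng.normal(size=(d_model, n_features)) / np.sqrt(d_model)
b_enc = np.zeros(n_features)
W_dec = rng.normal(size=(n_features, d_model)) / np.sqrt(n_features)
b_dec = np.zeros(d_model)

def encode(resid, k=4):
    """Map one residual-stream vector to sparse, non-negative feature activations."""
    pre = np.maximum((resid - b_dec) @ W_enc + b_enc, 0.0)  # ReLU pre-activations
    acts = np.zeros_like(pre)
    top = np.argsort(pre)[-k:]  # keep only the k largest activations
    acts[top] = pre[top]
    return acts

def decode(acts):
    """Reconstruct the residual-stream vector from feature activations."""
    return acts @ W_dec + b_dec

resid = rng.normal(size=d_model)   # stand-in for one token's hidden state at one layer
acts = encode(resid)
active = np.flatnonzero(acts)      # indices of the features that fired
recon = decode(acts)
print("active features:", active)
```

With a trained SAE, the indices in `active` are what a feature browser would let you look up and label; with random weights they carry no meaning.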

// ANALYSIS

This is a serious interpretability release, not just another model dump. The big story is that Qwen is making feature-level control and debugging feel practical for a broad model family, which moves SAEs from niche research into usable tooling.

  • Coverage across multiple Qwen 3.5 sizes makes this more useful than a one-off demo on a single checkpoint.
  • Residual-stream, all-layer coverage matters because it lets you trace behaviors like language switching, refusals, and style drift to specific learned features.
  • Steering and ablation are the obvious headline use cases, but the more durable value is debugging and dataset auditing for fine-tunes.
  • It is also plainly dual-use: the same machinery that helps explain behavior can be used to suppress safety-related features or push the model toward unwanted behaviors.
  • Compared with prompt-only control, feature editing is much more surgical, which is why interpretability folks will care and policy folks will be uneasy.
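
Steering and ablation, as mentioned in the bullets above, can be sketched as simple vector arithmetic on the residual stream. This is an illustrative toy, not Qwen-Scope's implementation: the decoder matrix is random, `steer` and `ablate` are hypothetical helper names, and projecting out the decoder direction is only one of several ablation variants in use.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_features = 16, 64  # toy sizes

# Random stand-in for a trained SAE decoder; each row is one feature's direction.
W_dec = rng.normal(size=(n_features, d_model))
W_dec /= np.linalg.norm(W_dec, axis=1, keepdims=True)  # unit-norm rows

def steer(resid, feature_id, strength):
    """Push the residual stream along one feature's decoder direction."""
    return resid + strength * W_dec[feature_id]

def ablate(resid, feature_id):
    """Remove the component of the residual stream along one feature's direction."""
    direction = W_dec[feature_id]
    return resid - (resid @ direction) * direction

resid = rng.normal(size=d_model)  # stand-in for one token's hidden state
steered = steer(resid, feature_id=7, strength=3.0)
ablated = ablate(resid, feature_id=7)
```

In a real model the edited vector would be written back into the forward pass at the chosen layer, typically through a hook. The same two operations can amplify a style feature or suppress a refusal feature, which is where the dual-use concern comes from.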

// TAGS

qwen-scope · qwen · llm · open-source · research · interpretability · safety

DISCOVERED

5h ago

2026-04-30

PUBLISHED

8h ago

2026-04-30

RELEVANCE

9/10

AUTHOR

MadPelmewka