REDDIT · 21d ago · RESEARCH PAPER
Flash-Jacobian Challenges Linear LLM Assumptions
Flash-Jacobian is a Zenodo preprint that uses cluster-representative Jacobians to study local layer dynamics in Qwen-3.5-4B, Llama-3.2-3B, and Phi-3-mini. It argues that the Linear Representation Hypothesis breaks down in mid and late layers, that cluster directions behave more like boundary-token suppressors than clean concept vectors, and that hallucination signals are nonlinear and model-specific.
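To make "cluster-representative Jacobian" concrete, here is a minimal sketch of the general idea: linearize a layer at one representative point of a cluster of activations and check how well that linearization holds nearby versus far away. This is not the preprint's code. The `toy_layer` function, the three-point cluster, and the choice of the centroid as the representative are all assumptions made for illustration, and the Jacobian is taken by finite differences rather than autodiff.

```python
import math

def toy_layer(x):
    """Hypothetical gated layer (an assumption, not from the paper):
    out_i = tanh(x_i) * sigmoid(x_{i+1}), indices wrapping around."""
    n = len(x)
    return [math.tanh(x[i]) / (1.0 + math.exp(-x[(i + 1) % n])) for i in range(n)]

def jacobian_fd(f, x, eps=1e-5):
    """Central-difference Jacobian: J[i][j] approximates d f_i / d x_j at x."""
    n_out = len(f(x))
    J = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(n_out):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def centroid(points):
    """Mean of a list of equal-length vectors; used here as the cluster representative."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]

def linearization_error(f, J, x0, x):
    """Euclidean norm of f(x) - (f(x0) + J (x - x0))."""
    f0, fx = f(x0), f(x)
    dx = [a - b for a, b in zip(x, x0)]
    pred = [f0[i] + sum(J[i][j] * dx[j] for j in range(len(dx)))
            for i in range(len(f0))]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(fx, pred)))

# Made-up "activations" forming one tight cluster.
cluster = [[0.1, -0.2, 0.3], [0.2, -0.1, 0.25], [0.15, -0.3, 0.35]]
rep = centroid(cluster)
J_rep = jacobian_fd(toy_layer, rep)

near_err = linearization_error(toy_layer, J_rep, rep, cluster[0])
far_err = linearization_error(toy_layer, J_rep, rep, [2.0, -2.0, 2.0])
print(f"error near representative: {near_err:.2e}, far away: {far_err:.2e}")
```

In this toy, one Jacobian describes the layer well inside the cluster and poorly far from it, which is the "locally structured" half of the picture. Whether real transformer layers behave this way at a given depth is the empirical question the preprint addresses.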
// ANALYSIS
Hot take: this looks less like “we found the right semantic direction” and more like “LLM internals are locally structured but globally messy, and the mess is model-specific.”
- The U-shaped layer result feels strong: if the late-layer collapse tracks gate anisotropy so tightly, that’s a real warning sign for linear-geometry stories.
- The causal intervention result could still be partly about activation fragility, but the size and direction of the effect make the “semantic cluster = semantic control knob” reading pretty shaky.
- The hallucination detector result is the most interesting one to me: within-model AUC > 0.99 says the signal is there, but pooled AUC ~0.50 says it is not a universal geometry, it is a per-model fingerprint.
- With only 3B/4B models, I would treat the non-transfer result as an important empirical limitation rather than a final theorem, but it is already enough to challenge naive generalization claims.
- Overall, this is a sharp interpretability paper with a clear thesis: linear probes can be informative, but they may be missing the causal and factual structure that actually matters.
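The within-model vs. pooled AUC gap in the hallucination bullet is easy to reproduce on synthetic data. The sketch below is not the preprint's detector or data. It assumes the simplest possible "per-model fingerprint": a single scalar score whose direction flips between two hypothetical models. Each model is almost perfectly separable on its own, yet pooling the raw scores lands near chance. All names and numbers are made up for illustration.

```python
import random

def auc(scores, labels):
    """Rank-based AUC: probability that a random positive outscores a random
    negative, with ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def synth_model(sign, n, rng):
    """Synthetic scores for one hypothetical model: label-1 examples are centered
    at +3*sign and label-0 examples at -3*sign, with unit Gaussian noise."""
    labels = [i % 2 for i in range(n)]
    scores = [sign * (3.0 if y == 1 else -3.0) + rng.gauss(0.0, 1.0) for y in labels]
    return scores, labels

rng = random.Random(0)
a_scores, a_labels = synth_model(+1, 400, rng)  # model A: high score = hallucination
b_scores, b_labels = synth_model(-1, 400, rng)  # model B: low score = hallucination

# Within-model: a detector fit per model is free to pick the orientation.
auc_a = auc(a_scores, a_labels)
auc_b = auc([-s for s in b_scores], b_labels)

# Pooled: one shared orientation applied to both models at once.
auc_pooled = auc(a_scores + b_scores, a_labels + b_labels)

print(f"model A: {auc_a:.3f}  model B: {auc_b:.3f}  pooled: {auc_pooled:.3f}")
```

A sign flip is only the most basic way a signal can fail to transfer. The preprint describes the signal as nonlinear and model-specific, which is a stronger claim than this toy captures.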
// TAGS
flash-jacobian, llm-interpretability, jacobian, hallucination-detection, linear-representation-hypothesis, nonlinear-classification, qwen, llama, phi, zenodo, research
DISCOVERED
21d ago
2026-03-21
PUBLISHED
21d ago
2026-03-21
RELEVANCE
9/10
AUTHOR
s0kex