Small LLMs reveal primitive semantic layer
Independent researchers ran 18 experiments across four small language model architectures (Qwen 2.5, Gemma 3, LLaMA 3.2, SmolLM2) and found consistent evidence of a two-tier primitive semantic layer — separating scaffolding concepts (SOMEONE, TIME, PLACE) from content seed concepts (FEAR, GRIEF, JOY) — with an activation gap averaging +0.245. The gap narrows predictably with model scale, a pattern the authors suggest may partly explain capability jumps.
Preliminary and self-published, but the cross-architecture consistency is hard to dismiss — four different model families showing the same structural distinction demands at least a second look.
- The Layer 0a/0b split maps loosely onto linguistic notions of function vs. content words; if real at the activation level, it implies LLMs encode semantic structure rather than pure distributional statistics
- The inverse scaling pattern (gap largest in 360M models, narrowest in 1B) is the most provocative finding: larger models may develop phenomenological access to scaffolding primitives, which could partially explain emergent capability thresholds
- 11 validated two-primitive compositions (WANT + GRIEF → longing, FEEL + GRIEF → heartbreak) suggest compositionality in the primitive layer, not just isolated activation differences
- Acknowledged circularity: the classifier measuring activation is the same class of model being measured, a real methodological concern the authors flag openly
- Fully reproducible locally via Ollama with no API keys, so the barrier to independent verification is low
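
For readers who want to sanity-check the headline number, the "activation gap" can be read as a difference of mean scores between the two tiers. The sketch below is only an illustration under that assumption: the function name `activation_gap`, the per-concept scores, and the sign convention (content seed minus scaffolding) are all hypothetical and are not taken from the authors' code or data.

```python
from statistics import mean


def activation_gap(scores_a, scores_b):
    """Return mean(scores_a) - mean(scores_b) for two tiers of primitives."""
    return mean(scores_a) - mean(scores_b)


# Hypothetical per-concept scores, invented for illustration only.
content_scores = {"FEAR": 0.80, "GRIEF": 0.70, "JOY": 0.75}
scaffolding_scores = {"SOMEONE": 0.50, "TIME": 0.55, "PLACE": 0.45}

# Assumed sign convention: content seed tier minus scaffolding tier.
gap = activation_gap(
    list(content_scores.values()),
    list(scaffolding_scores.values()),
)
print(f"gap = {gap:+.3f}")
```

Repeating the same calculation per model size would show whether the gap shrinks as parameter count grows, which is the scaling pattern the summary describes.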
Discovered: 2026-03-15
Published: 2026-03-15
Author: BodeMan5280

