Small LLMs reveal primitive semantic layer

// 118d ago · RESEARCH PAPER

Independent researchers ran 18 experiments across four small language model architectures (Qwen 2.5, Gemma 3, LLaMA 3.2, SmolLM2) and found consistent evidence of a two-tier primitive semantic layer — separating scaffolding concepts (SOMEONE, TIME, PLACE) from content seed concepts (FEAR, GRIEF, JOY) — with an activation gap averaging +0.245. The gap narrows predictably with model scale, a pattern the authors suggest may partly explain capability jumps.
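The summary does not say how the gap is computed. One plausible reading is a difference of mean scores between the two tiers, averaged over models, which could be sketched as below. The scores are hypothetical placeholders rather than values from the paper, the sign convention (content minus scaffolding) is an assumption, and the names `activation_gap` and `average_gap` are illustrative only.

```python
from statistics import mean

# Hypothetical placeholder scores, NOT values reported by the authors.
SCAFFOLDING = {"SOMEONE": 0.40, "TIME": 0.35, "PLACE": 0.45}
CONTENT = {"FEAR": 0.70, "GRIEF": 0.65, "JOY": 0.60}


def activation_gap(scaffolding, content):
    """Mean content-seed score minus mean scaffolding score (assumed sign convention)."""
    return mean(content.values()) - mean(scaffolding.values())


def average_gap(per_model):
    """Average the per-model gap over a dict of model name -> (scaffolding, content)."""
    return mean(activation_gap(s, c) for s, c in per_model.values())
```

Under this reading, a positive average means content seed concepts score higher than scaffolding concepts, and a gap that narrows with scale would show up as smaller per-model values for larger models.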

// ANALYSIS

Preliminary and self-published, but the cross-architecture consistency is hard to dismiss — four different model families showing the same structural distinction demands at least a second look.

–The two-tier split (labeled Layer 0a/0b) maps loosely onto linguistic notions of function vs. content words; if real at the activation level, it implies LLMs encode semantic structure rather than pure distributional statistics
–The inverse scaling pattern — gap largest in 360M models, narrowest in 1B — is the most provocative finding: larger models may develop phenomenological access to scaffolding primitives, which could partially explain emergent capability thresholds
–11 validated two-primitive compositions (WANT + GRIEF → longing, FEEL + GRIEF → heartbreak) suggest compositionality in the primitive layer, not just isolated activation differences
–Acknowledged circularity: the classifier measuring activation is the same class of model being measured — a real methodological concern the authors flag openly
–Fully reproducible locally via Ollama with no API keys — low barrier to independent verification
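
For readers who want to try the local setup mentioned in the last point, here is a minimal sketch of querying a model through Ollama's local HTTP endpoint using only the Python standard library. It assumes Ollama is running on its default port and that a model has already been pulled. The helper names `build_payload` and `query` are hypothetical, and the authors' actual prompts and scoring procedure are not reproduced here.

```python
import json
import urllib.request

# Default local endpoint for Ollama's generate API; no API key is involved.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model, prompt):
    """Build a non-streaming request body with temperature 0 for repeatable output."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0},
    }


def query(model, prompt, url=OLLAMA_URL):
    """Send one prompt to a locally running Ollama server and return the text reply."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

A call might look like `query("smollm2:360m", "Define GRIEF in one sentence.")`, where both the model tag and the prompt are illustrative assumptions rather than the paper's protocol.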

// TAGS

llm · reasoning · research · open-source · benchmark

DISCOVERED

118d ago

2026-03-15

PUBLISHED

118d ago

2026-03-15

RELEVANCE

6/10

AUTHOR

BodeMan5280
