Developer drops Librarian 125M SLM series and SFT framework
Developer Sujal Maheshwari has released the Librarian series, a collection of 125M parameter language models trained from scratch using a custom 16k BPE tokenizer. This release includes base and instruct variants alongside Librarian-SFT, a modular, config-driven framework for supervised fine-tuning on consumer-grade hardware.
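The release notes aren't quoted here, so the exact tokenizer-training recipe is unknown; but a 16k-vocab byte-level BPE tokenizer of the kind described can be trained in a few lines with the Hugging Face `tokenizers` library. The corpus file, special tokens, and output path below are placeholders, not Librarian's actual choices.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import ByteLevel
from tokenizers.trainers import BpeTrainer

# Byte-level BPE, so any input text is representable without UNK fallbacks.
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = ByteLevel()

# 16k merges plus hypothetical special tokens (assumed, not from the release).
trainer = BpeTrainer(
    vocab_size=16_000,
    special_tokens=["[UNK]", "[BOS]", "[EOS]", "[PAD]"],
)

# "corpus.txt" stands in for whatever pretraining text the project used.
tokenizer.train(files=["corpus.txt"], trainer=trainer)
tokenizer.save("librarian-16k.json")
```

A small vocabulary is a sensible choice at this scale: at 125M parameters, the embedding and output matrices can dominate the budget, and a 16k vocabulary keeps them roughly a third the size of a GPT-2-style 50k vocabulary at the same hidden width.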
Librarian shows that training from scratch isn't just for big labs; for SLM researchers it's the ultimate "build vs. buy" flex. Modern architectural choices like RoPE and SwiGLU make it a stronger baseline for experimentation than aging GPT-2 checkpoints, while the custom 16k BPE tokenizer offers a clean slate for testing domain-specific vocabularies. Sub-1B models like these are becoming essential building blocks for low-latency "micro-agents" in on-device workflows, and shipping the full training and SFT pipeline sets a high bar for transparency and community reproducibility.
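For readers unfamiliar with the two components named above, here is a minimal PyTorch sketch of a SwiGLU feed-forward block and a rotate-half RoPE application, following the Llama-style conventions these names usually refer to. Nothing here is taken from Librarian's source; the layer names and shapes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SwiGLU(nn.Module):
    """Gated feed-forward block: FFN(x) = W2(SiLU(W1 x) * W3 x)."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.w1 = nn.Linear(dim, hidden_dim, bias=False)  # gate projection
        self.w3 = nn.Linear(dim, hidden_dim, bias=False)  # value projection
        self.w2 = nn.Linear(hidden_dim, dim, bias=False)  # down projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w2(F.silu(self.w1(x)) * self.w3(x))


def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate query/key vectors by position (rotate-half RoPE variant).

    x: (batch, seq_len, n_heads, head_dim); head_dim must be even.
    """
    _, seq_len, _, head_dim = x.shape
    half = head_dim // 2
    # Per-dimension rotation frequencies, decaying geometrically with index.
    freqs = base ** (-torch.arange(half, dtype=torch.float32) / half)
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * freqs  # (seq, half)
    cos = angles.cos()[None, :, None, :]  # broadcast over batch and heads
    sin = angles.sin()[None, :, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

The practical upshot: RoPE encodes position by rotating query/key pairs rather than adding learned absolute embeddings, which tends to generalize better across sequence lengths, and SwiGLU typically outperforms GPT-2's GELU MLP at matched parameter count. That is the substance behind the "stronger baseline" claim.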
DISCOVERED: 2026-04-14
PUBLISHED: 2026-04-13
AUTHOR: Kill_Streak308