OPEN_SOURCE
REDDIT // 27d ago · OPEN-SOURCE RELEASE
OmniCoder-9B brings agentic coding to 8GB GPUs
Tesslate releases OmniCoder-9B, a 9B-parameter agentic coding model fine-tuned from Qwen3.5-9B on 425K+ frontier AI coding traces, with Q4_K_M quantization fitting in ~5.7GB. Designed for Cline and llama-server, it targets developers running local AI coding agents on consumer hardware.
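A minimal sketch of serving the model locally with llama.cpp's `llama-server` (the GGUF filename is illustrative; flags shown are standard llama.cpp options, and the context size here is a conservative choice, not the model's full window):

```shell
# Serve the Q4_K_M GGUF on the default OpenAI-compatible endpoint.
# -ngl 99 offloads all layers to the GPU; -c sets the context window.
llama-server \
  -m omnicoder-9b-q4_k_m.gguf \
  -c 32768 \
  -ngl 99 \
  --host 127.0.0.1 \
  --port 8080
```

Cline or any OpenAI-compatible client can then be pointed at `http://127.0.0.1:8080/v1`.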
// ANALYSIS
A 9B model trained on GPT-5 and Claude Opus 4.6 agentic traces that fits in 8GB of VRAM is exactly what the self-hosted coding movement has been waiting for.
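The 8GB claim is easy to sanity-check with back-of-envelope arithmetic. Assuming Q4_K_M averages roughly 4.8 bits per weight across its mixed-precision tensors (an approximation, not a figure from the release):

```python
# Rough estimate of the quantized weight footprint for a 9B-parameter model.
params = 9e9                 # parameter count from the release
bits_per_weight = 4.8        # assumed effective rate for Q4_K_M (approximate)

gguf_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
print(f"~{gguf_gb:.1f} GB of weights")        # ~5.4 GB of weights
```

That lands near the quoted ~5.7GB once per-tensor metadata and non-quantized layers are included, leaving headroom for KV cache on an 8GB card.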
- Q4_K_M quantization lands at ~5.7GB, running comfortably on RTX 3070/4060-class cards — no cloud API required
- Terminal-Bench 2.0 score of 23.6% is a 61% relative improvement over the Qwen3.5-9B base (14.6%), suggesting real agentic gains rather than benchmark overfitting
- Training on scaffold patterns from Claude Code, OpenCode, and Codex effectively distills frontier agentic behavior into a local-first model
- Native 262K context window (extensible to 1M+) is exceptional at this size class and critical for multi-file coding sessions
- Apache 2.0 license with OpenAI-compatible llama-server API means drop-in replacement for existing Cline/VS Code setups with zero vendor lock-in
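The "drop-in replacement" point can be sketched as a standard OpenAI-style chat-completions request aimed at the local server. This builds the request body only, without a network call; the base URL, port, and model name are assumptions matching a default `llama-server` setup:

```python
import json

# Assumed local endpoint: llama-server exposes an OpenAI-compatible
# /v1/chat/completions route, so existing tooling needs only a base_url swap.
BASE_URL = "http://127.0.0.1:8080/v1"

payload = {
    "model": "omnicoder-9b",  # model name is illustrative
    "messages": [
        {"role": "system", "content": "You are a coding agent."},
        {"role": "user", "content": "Refactor utils.py to remove dead code."},
    ],
    "temperature": 0.2,
}

# Serialize exactly as an OpenAI-compatible client would POST it.
body = json.dumps(payload)
```

Because the request shape is identical to the hosted APIs, tools like Cline treat the local server as just another provider.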
// TAGS
omnicoder-9b · llm · ai-coding · agent · open-source · open-weights · self-hosted · inference
DISCOVERED
2026-03-16 (27d ago)
PUBLISHED
2026-03-16 (27d ago)
RELEVANCE
8 / 10
AUTHOR
Powerful_Evening5495