OmniCoder-9B brings agentic coding to 8GB GPUs
REDDIT · 27d ago · OPEN_SOURCE RELEASE


Tesslate releases OmniCoder-9B, a 9B-parameter agentic coding model fine-tuned from Qwen3.5-9B on 425K+ frontier AI coding traces, with Q4_K_M quantization fitting in ~5.7GB. Designed for Cline and llama-server, it targets developers running local AI coding agents on consumer hardware.
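The ~5.7GB figure is plausible from the parameter count alone. As a back-of-envelope check (assuming llama.cpp's Q4_K_M averages roughly 4.85 bits per weight — a commonly cited figure, not stated in the release):

```python
def quantized_size_gib(params: float, bits_per_weight: float) -> float:
    """Approximate size of a quantized model's weights in GiB."""
    return params * bits_per_weight / 8 / 1024**3

# 9B parameters at ~4.85 bits/weight: about 5.1 GiB of weights.
weights = quantized_size_gib(9e9, 4.85)
```

KV cache and inference buffers add roughly another 0.5–1 GiB at moderate context lengths, which lands near the reported ~5.7GB and inside an 8GB card's budget.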

// ANALYSIS

A 9B model trained on GPT-5 and Claude Opus 4.6 agentic traces that fits in 8GB of VRAM is exactly what the self-hosted coding movement has been waiting for.

  • Q4_K_M quantization lands at ~5.7GB, running comfortably on RTX 3070/4060-class cards — no cloud API required
  • Terminal-Bench 2.0 score of 23.6% is a 61% improvement over the Qwen3.5-9B base (14.6%), suggesting real agentic gains rather than benchmark overfitting
  • Training on scaffold patterns from Claude Code, OpenCode, and Codex effectively distills frontier agentic behavior into a local-first model
  • Native 262K context window (extensible to 1M+) is exceptional at this size class and critical for multi-file coding sessions
  • Apache 2.0 license with OpenAI-compatible llama-server API means drop-in replacement for existing Cline/VS Code setups with zero vendor lock-in
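The drop-in claim above rests on llama-server's OpenAI-compatible API. A minimal serving sketch — the GGUF filename is hypothetical, and context size/port are illustrative defaults, not values from the release:

```shell
# Offload all layers to the GPU (-ngl 99), 32K context (-c),
# OpenAI-compatible API served on port 8080.
llama-server -m omnicoder-9b-Q4_K_M.gguf -ngl 99 -c 32768 --port 8080

# Any OpenAI-compatible client (including Cline's OpenAI-compatible
# provider) can then target the local endpoint:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Write hello world in Go"}]}'
```

In Cline, this means pointing the OpenAI-compatible provider's base URL at `http://localhost:8080/v1` instead of a cloud endpoint.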
// TAGS
omnicoder-9b · llm · ai-coding · agent · open-source · open-weights · self-hosted · inference

DISCOVERED

2026-03-16 (27d ago)

PUBLISHED

2026-03-16 (27d ago)

RELEVANCE

8 / 10

AUTHOR

Powerful_Evening5495