OPEN_SOURCE
REDDIT // 25d ago // MODEL RELEASE
H Company, NVIDIA launch Holotron-12B for agents
Holotron-12B is a new 12B multimodal VLM from H Company and NVIDIA, released on March 16, 2026 under the NVIDIA Open Model License and tuned for UI navigation and agentic computer-use workloads. The team's published evals report over 2x higher throughput than Holo2-8B on WebVoyager-style concurrent runs, along with improved benchmark scores over its Nemotron-Nano base.
// ANALYSIS
This looks like a practical shift from “smart demo model” to “deployable agent policy model,” with throughput as the main differentiator rather than raw parameter count.
- The hybrid SSM-attention architecture targets long-context, multi-image workloads, where transformer KV-cache costs usually bottleneck agent systems.
- The headline claim for teams running many parallel agent trajectories is reported scaling to about 8.9k tokens/s at concurrency 100, versus about 5.1k for Holo2-8B in the same setup.
- The WebVoyager score rises from 35.1% for the Nemotron base to 80.5%, which suggests post-training data quality, not just architecture, is doing the heavy lifting.
- The NVIDIA Open Model License makes the model accessible for experimentation, but production users will still want independent replication of the throughput and task-completion metrics.
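As a quick sanity check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. The variable and function names are illustrative, and the inputs are the rounded, vendor-reported numbers from this summary rather than independent measurements. Note that the ratio at concurrency 100 works out to roughly 1.75x, so the "over 2x" headline presumably refers to a different measurement point; that reading is an assumption, not something this summary confirms.

```python
# Rounded, vendor-reported figures as quoted in the analysis (not re-measured).
holotron_tps = 8900.0  # ~8.9k tokens/s at concurrency 100 (Holotron-12B)
holo2_tps = 5100.0     # ~5.1k tokens/s in the same setup (Holo2-8B)

base_webvoyager = 35.1   # WebVoyager score of the Nemotron base, in percent
tuned_webvoyager = 80.5  # WebVoyager score of Holotron-12B, in percent


def ratio(new: float, old: float) -> float:
    """Return how many times larger `new` is than `old`."""
    return new / old


throughput_ratio = ratio(holotron_tps, holo2_tps)
absolute_gain = tuned_webvoyager - base_webvoyager   # percentage points
relative_gain = ratio(tuned_webvoyager, base_webvoyager)

print(f"Throughput ratio at concurrency 100: {throughput_ratio:.2f}x")
print(f"WebVoyager gain: {absolute_gain:.1f} points ({relative_gain:.2f}x the base score)")
```

Teams replicating the claim would swap in their own measured tokens/s at each concurrency level instead of the rounded values used here.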
// TAGS
holotron-12b · llm · multimodal · computer-use · agent · inference · gpu · open-weights
DISCOVERED
25d ago
2026-03-17
PUBLISHED
26d ago
2026-03-17
RELEVANCE
9/10
AUTHOR
Nunki08