OPEN_SOURCE
YT · YOUTUBE // 4h ago · MODEL RELEASE
PrismML launches ternary Bonsai models
PrismML’s Ternary Bonsai is a 1.58-bit model family in 8B, 4B, and 1.7B sizes, using ternary weights to cut memory by about 9x versus standard 16-bit models. The company says the release improves on its 1-bit Bonsai line while keeping the footprint and throughput attractive for consumer and edge deployment.
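A quick back-of-the-envelope check on the headline numbers, since "1.58-bit" and the ~9x figure fall out of the same arithmetic: each ternary weight carries log2(3) ≈ 1.585 bits of information. The sketch below is a minimal illustration of that math, not PrismML code; the 8B parameter count comes from the release, everything else is plain arithmetic.

```python
import math

# A ternary weight takes one of three values, so its information content is
# log2(3) ~= 1.585 bits: the "1.58-bit" in the family name.
bits_per_ternary_weight = math.log2(3)   # ~1.585
bits_per_fp16_weight = 16.0

# Ideal compression vs. a 16-bit baseline is ~10x; real deployments land
# closer to the quoted ~9x once packing and non-ternary pieces are counted.
print(f"ideal ratio: {bits_per_fp16_weight / bits_per_ternary_weight:.1f}x")  # ~10.1x

# Rough weight footprint for the 8B variant under these assumptions,
# consistent with the quoted 1.75 GB once runtime overhead is added.
params = 8e9
print(f"~{params * bits_per_ternary_weight / 8 / 1e9:.2f} GB")  # ~1.59 GB
```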
// ANALYSIS
This is a strong compression story: PrismML is no longer just chasing the smallest possible model; it’s optimizing for the more useful point where a little extra memory buys a lot more capability.
- The core design is fully ternary, with weights constrained to `{-1, 0, +1}` across embeddings, attention, MLPs, and the LM head (see the quantization sketch after this list).
- PrismML claims the 8B model scores a 75.5 benchmark average, about 5 points better than its 1-bit 8B predecessor, while staying at 1.75 GB.
- The deployment angle is the real hook: native MLX support on Apple devices and reported throughput of 82 toks/sec on an M4 Pro make this feel practical, not just academic (a minimal loading sketch also follows the list).
- Apache 2.0 licensing matters here because it lowers friction for experimentation and downstream packaging.
- The big question is how these numbers hold up outside PrismML’s own benchmark setup, especially across real workloads and longer-context use.
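The release doesn’t spell out PrismML’s quantization recipe, but "fully ternary" usually means something like the absmean scheme popularized by the BitNet b1.58 line: scale each weight tensor by its mean absolute value, then round to the nearest of {-1, 0, +1}. A minimal NumPy sketch of that assumed scheme, with hypothetical names throughout:

```python
import numpy as np

def ternary_quantize(w: np.ndarray, eps: float = 1e-5):
    """Absmean ternary quantization (BitNet b1.58-style, assumed here):
    scale by the mean absolute value, then round each weight to the
    nearest of {-1, 0, +1}."""
    scale = np.abs(w).mean() + eps
    q = np.clip(np.round(w / scale), -1, 1)
    return q.astype(np.int8), scale

# Inference only needs the packed ternary weights plus one per-tensor scale:
# (q @ x) * scale approximates the full-precision w @ x.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
q, scale = ternary_quantize(w)
x = rng.normal(size=(8,)).astype(np.float32)
y = (q.astype(np.float32) @ x) * scale
```

The zero state is what distinguishes 1.58-bit from true 1-bit quantization: it lets the quantizer zero out small weights instead of forcing them to ±1.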
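On the deployment bullet: if the MLX support works the way other open releases do, loading comes down to the standard mlx_lm two-liner. The repo id below is hypothetical (the release doesn’t give a model path), and this assumes PrismML ships weights in an mlx_lm-compatible format.

```python
# pip install mlx-lm  (Apple silicon only)
from mlx_lm import load, generate

# Hypothetical repo id; the actual model path isn't given in the release.
model, tokenizer = load("PrismML/ternary-bonsai-8b")
text = generate(model, tokenizer,
                prompt="Explain ternary weights in one sentence.",
                max_tokens=64)
print(text)
```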
// TAGS
llm · edge-ai · inference · benchmark · open-source · ternary-bonsai
DISCOVERED
4h ago · 2026-04-19
PUBLISHED
4h ago · 2026-04-19
RELEVANCE
9/10
AUTHOR
AI Search