OPEN_SOURCE
REDDIT · 14d ago · BENCHMARK RESULT
PentaNet edges BitNet with pentanary quantization
PentaNet is an open-source 124M GPT-2-style model that swaps BitNet's ternary weights for pentanary {-2, -1, 0, +1, +2} quantization. The author reports a 6.4% WikiText-103 perplexity reduction at the same compute budget, while keeping inference addition/shift-only and shipping Triton plus AVX2 kernels.
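The card doesn't show PentaNet's actual quantizer. As a minimal sketch, assuming an absmean-style recipe analogous to BitNet b1.58 but extended to five levels (the function name, per-tensor scaling, and epsilon are illustrative, not from the repo):

```python
import numpy as np

def quantize_pentanary(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize weights to the pentanary alphabet {-2, -1, 0, +1, +2}.

    Hypothetical absmean recipe: scale by the mean absolute value,
    round to the nearest integer, and clip to [-2, 2].
    """
    scale = np.abs(w).mean() + 1e-8          # per-tensor absmean scale
    q = np.clip(np.rint(w / scale), -2, 2).astype(np.int8)
    return q, float(scale)
```

Under this scheme, the `±2` buckets only fill up when a weight is at least ~1.5x the tensor's mean magnitude, which is why bucket occupancy (see the analysis below) is a meaningful diagnostic.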
// ANALYSIS
BitNet's ternary alphabet is starting to look like a design point, not a ceiling; PentaNet suggests a slightly richer discrete space can recover capacity without abandoning add/shift-only inference. The caveat is that this is still a small, self-reported experiment, so the real proof will be scaling plus independent profiling.
- The reported mean PPL drops from 192.63 to 180.32 across three seeds, a solid gain for a 124M language model.
- It matters that the ±2 buckets stay occupied; if the weight distribution collapsed back to ternary, the extra states would just be dead complexity.
- Add/shift-only arithmetic is the right constraint, but batch size, memory traffic, and compiler fusion will decide whether the speedup survives outside the demo path.
- The repo already ships the training code, weights, and both Triton GPU and AVX2 CPU paths, so others can stress-test the claim.
- If this holds up, it points to a broader design space of low-bit alphabets, not just ternary vs full precision.
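To make the add/shift-only claim concrete, here is a hypothetical scalar reference (not the repo's Triton/AVX2 kernels): a ±1 weight costs one add or subtract, a ±2 weight costs a one-bit left shift plus an add or subtract, and zeros are skipped; no multiplier is needed anywhere.

```python
def pentanary_dot(q_weights: list[int], x: list[int]) -> int:
    """Dot product with pentanary weights using only adds and shifts.

    q_weights: integers in {-2, -1, 0, +1, +2}
    x:         integer activations
    """
    acc = 0
    for q, xi in zip(q_weights, x):
        if q == 1:
            acc += xi            # +1: plain add
        elif q == -1:
            acc -= xi            # -1: plain subtract
        elif q == 2:
            acc += xi << 1       # +2: shift left by one, then add
        elif q == -2:
            acc -= xi << 1       # -2: shift left by one, then subtract
    return acc                   # q == 0 contributes nothing
```

The interesting engineering question the analysis raises is whether this per-element saving survives once the kernels contend with memory bandwidth and vectorization, where the multiplier is rarely the bottleneck.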
// TAGS
llm · inference · benchmark · open-source · bitnet · pentanet
DISCOVERED
14d ago
2026-03-28
PUBLISHED
15d ago
2026-03-28
RELEVANCE
8/10
AUTHOR
kyworn