OPEN_SOURCE
REDDIT · 14d ago · BENCHMARK RESULT
PentaNet edges BitNet with pentanary quantization
PentaNet is an open-source 124M GPT-2-style model that swaps BitNet's ternary weights for pentanary {-2, -1, 0, +1, +2} quantization. The author reports a 6.4% WikiText-103 perplexity reduction at the same compute budget, while keeping inference addition/shift-only and shipping Triton plus AVX2 kernels.
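The card doesn't show PentaNet's actual quantizer. As a minimal sketch, assuming an absmean-style recipe analogous to BitNet b1.58 but extended to five levels (the function name, per-tensor scaling, and epsilon are illustrative, not from the repo):

```python
import numpy as np

def quantize_pentanary(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Quantize weights to the pentanary alphabet {-2, -1, 0, +1, +2}.

    Hypothetical absmean recipe: scale by the mean absolute value,
    round to the nearest integer, and clip to [-2, 2].
    """
    scale = np.abs(w).mean() + 1e-8          # per-tensor absmean scale
    q = np.clip(np.rint(w / scale), -2, 2).astype(np.int8)
    return q, float(scale)
```

Under this scheme, the `±2` buckets only fill up when a weight is at least ~1.5x the tensor's mean magnitude, which is why bucket occupancy (see the analysis below) is a meaningful diagnostic.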
// ANALYSIS
BitNet's ternary alphabet is starting to look like a design point, not a ceiling; PentaNet suggests a slightly richer discrete space can recover capacity without abandoning add/shift-only inference. The caveat is that this is still a small, self-reported experiment, so the real proof will be scaling plus independent profiling.
- The reported mean PPL drops from 192.63 to 180.32 across three seeds, a solid gain for a 124M language model.
- It matters that the ±2 buckets stay occupied; if the weight distribution collapsed back to ternary, the extra states would just be dead complexity.
- Add/shift-only arithmetic is the right constraint, but batch size, memory traffic, and compiler fusion will decide whether the speedup survives outside the demo path.
- The repo already ships the training code, weights, and both Triton GPU and AVX2 CPU paths, so others can stress-test the claim.
- If this holds up, it points to a broader design space of low-bit alphabets, not just ternary vs full precision.
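To make the add/shift-only claim concrete, here is a hypothetical scalar reference (not the repo's Triton/AVX2 kernels): a ±1 weight costs one add or subtract, a ±2 weight costs a one-bit left shift plus an add or subtract, and zeros are skipped; no multiplier is needed anywhere.

```python
def pentanary_dot(q_weights: list[int], x: list[int]) -> int:
    """Dot product with pentanary weights using only adds and shifts.

    q_weights: integers in {-2, -1, 0, +1, +2}
    x:         integer activations
    """
    acc = 0
    for q, xi in zip(q_weights, x):
        if q == 1:
            acc += xi            # +1: plain add
        elif q == -1:
            acc -= xi            # -1: plain subtract
        elif q == 2:
            acc += xi << 1       # +2: shift left by one, then add
        elif q == -2:
            acc -= xi << 1       # -2: shift left by one, then subtract
    return acc                   # q == 0 contributes nothing
```

The interesting engineering question the analysis raises is whether this per-element saving survives once the kernels contend with memory bandwidth and vectorization, where the multiplier is rarely the bottleneck.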
// TAGS
llm · inference · benchmark · open-source · bitnet · pentanet
DISCOVERED
14d ago
2026-03-28
PUBLISHED
15d ago
2026-03-28
RELEVANCE
8/10
AUTHOR
kyworn