PentaNet edges BitNet with pentanary quantization
REDDIT // 14d ago // BENCHMARK RESULT


PentaNet is an open-source 124M-parameter GPT-2-style model that swaps BitNet's ternary weights for pentanary {-2, -1, 0, +1, +2} quantization. The author reports a 6.4% WikiText-103 perplexity reduction at the same compute budget, while keeping inference addition/shift-only and shipping both Triton and AVX2 kernels.
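To make the scheme concrete, here is a minimal sketch of pentanary weight quantization, assuming a per-tensor absmean-style scale in the spirit of BitNet's ternary recipe. The function name and the exact scaling rule are assumptions for illustration; PentaNet's actual quantizer may differ.

```python
import numpy as np

def pentanary_quantize(w: np.ndarray):
    """Quantize float weights to the pentanary alphabet {-2, -1, 0, +1, +2}.

    Hypothetical analogue of BitNet's absmean scheme: divide by the mean
    absolute weight, round to the nearest integer, clip to [-2, 2].
    Returns the integer codes and the scale needed to dequantize.
    """
    scale = np.mean(np.abs(w)) + 1e-8  # per-tensor scale; epsilon avoids div-by-zero
    q = np.clip(np.round(w / scale), -2, 2).astype(np.int8)
    return q, scale

# Tiny demo: all quantized values land in the five-state alphabet.
rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8)).astype(np.float32)
q, s = pentanary_quantize(w)
print(sorted(np.unique(q).tolist()))
```

Compared with ternary, the extra ±2 states give outlier weights a cheap place to go instead of being clipped to ±1, which is the capacity argument the author is making.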

// ANALYSIS

BitNet's ternary alphabet is starting to look like a design point, not a ceiling; PentaNet suggests a slightly richer discrete space can recover capacity without abandoning add/shift-only inference. The caveat is that this is still a small, self-reported experiment, so the real proof will be scaling plus independent profiling.

  • The reported mean PPL drops from 192.63 to 180.32 across three seeds, which is a solid gain for a 124M language model.
  • It matters that the ±2 buckets stay occupied after training; if the weight distribution collapsed back to ternary, the extra states would just be dead complexity.
  • Add/shift-only arithmetic is the right constraint, but batch size, memory traffic, and compiler fusion will decide whether the speedup survives outside the demo path.
  • The repo already ships the training code, weights, and both Triton GPU and AVX2 CPU paths, so others can stress-test the claim.
  • If this holds up, it points to a broader design space of low-bit alphabets, not just ternary vs full precision.
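The add/shift-only claim is easy to see in a reference loop: weights of ±1 contribute an add or subtract, and ±2 contributes a doubled activation (one extra add, or a left shift for integer activations). This is an illustrative sketch, not the repo's Triton/AVX2 kernels, and the function name is assumed.

```python
import numpy as np

def pentanary_matvec(q: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Multiply-free matrix-vector product for pentanary weights.

    q: int8 matrix with entries in {-2, -1, 0, +1, +2}
    x: float activation vector
    Every term is formed with additions only: zero weights are skipped,
    |w| == 2 doubles the activation via v + v, and the sign selects
    add vs subtract into the accumulator.
    """
    rows, cols = q.shape
    out = np.zeros(rows, dtype=np.float64)
    for i in range(rows):
        acc = 0.0
        for j in range(cols):
            w = int(q[i, j])
            if w == 0:
                continue  # zero weights cost nothing
            v = float(x[j])
            if w == 2 or w == -2:
                v = v + v  # doubling by addition; no multiplier needed
            acc = acc + v if w > 0 else acc - v
        out[i] = acc
    return out

# Agrees with a dense matmul on a small example.
q = np.array([[2, -1, 0], [1, 0, -2]], dtype=np.int8)
x = np.array([1.0, 3.0, 5.0])
print(pentanary_matvec(q, x))  # → [-1. -9.]
```

Whether this translates into real wall-clock wins depends on the memory-traffic and fusion questions raised above; the arithmetic itself is the easy part.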
// TAGS
llm · inference · benchmark · open-source · bitnet · pentanet

DISCOVERED

2026-03-28 (14d ago)

PUBLISHED

2026-03-28 (15d ago)

RELEVANCE

8/10

AUTHOR

kyworn