Mamba-3 Weights Stay Missing After Release
OPEN_SOURCE
REDDIT // 4h ago · RESEARCH PAPER

A LocalLLaMA thread flags an awkward gap in the Mamba-3 release: the paper and Together AI blog report benchmark results for pretrained models, and the GitHub repo ships Mamba-3 code and kernels, but the official Hugging Face listings still appear to cover only Mamba and Mamba-2 weights rather than the benchmarked Mamba-3 checkpoints.
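
The missing-weights claim is easy to check directly. Below is a minimal sketch, assuming only that official checkpoints are published under the `state-spaces` organization on Hugging Face; the naming pattern used to spot a Mamba-3 checkpoint is a guess:

```python
# Sketch: list the state-spaces org's models and look for Mamba-3 checkpoints.
import re

from huggingface_hub import list_models

# Enumerate every model published under the state-spaces organization.
ids = [m.id for m in list_models(author="state-spaces")]

# Match "mamba-3"/"mamba3" but not e.g. "state-spaces/mamba-370m",
# which is the 370M-parameter Mamba-1 checkpoint, not a Mamba-3 release.
pattern = re.compile(r"mamba-?3(?!\d)", re.IGNORECASE)
mamba3_ids = [i for i in ids if pattern.search(i)]

print(f"{len(ids)} models under state-spaces")
print("Mamba-3 candidates:", mamba3_ids or "none found")
```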

// ANALYSIS

Mamba-3 looks technically important, but the release lands in an uncomfortable middle ground: enough code to study the architecture, not enough weights to reproduce the headline claims cleanly.

  • The paper claims Mamba-3 improves downstream accuracy and inference efficiency at the 1.5B scale, including a stronger MIMO variant with no added decode latency.
  • Together AI says kernels are open-sourced, but that is not the same as releasing the trained checkpoints behind the benchmark tables.
  • The state-spaces GitHub README lists pretrained Hugging Face models for Mamba and Mamba-2, while its Mamba-3 demo uses random tensors for block-level usage (sketched after this list).
  • For developers, this makes Mamba-3 more of a research architecture drop than a usable model release until official weights or reproducible training configs appear.
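
For reference, here is what that block-level, random-tensor usage pattern looks like. The `Mamba3` class name, import path, and constructor arguments are assumptions modeled on the published Mamba/Mamba2 API in `mamba_ssm` and have not been confirmed against the Mamba-3 repo:

```python
# Hedged sketch: run an (assumed) Mamba-3 block on random tensors,
# mirroring the README's demo style. No pretrained weights involved.
import torch
from mamba_ssm import Mamba3  # hypothetical import; check the actual repo

batch, seqlen, d_model = 2, 64, 256
x = torch.randn(batch, seqlen, d_model, device="cuda")

block = Mamba3(
    d_model=d_model,  # model dimension
    d_state=128,      # SSM state size (assumed default)
    expand=2,         # block expansion factor (assumed default)
).to("cuda")

# A forward pass over random data exercises the kernels but says nothing
# about the benchmarked accuracy, which needs the trained checkpoints.
y = block(x)
assert y.shape == x.shape
```
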
// TAGS
mamba-3 · llm · inference · research · benchmark

DISCOVERED
4h ago (2026-04-22)

PUBLISHED
6h ago (2026-04-22)

RELEVANCE
7/10

AUTHOR
Designer_Win6465