OPEN_SOURCE
REDDIT // 4h ago // RESEARCH PAPER
Mamba-3 Weights Stay Missing After Release
A LocalLLaMA thread flags an awkward gap in the Mamba-3 release: the paper and Together AI blog report pretrained benchmark results, and the GitHub repo exposes Mamba-3 code/kernels, but the official Hugging Face listings still appear to cover Mamba and Mamba-2 weights rather than the benchmarked Mamba-3 checkpoints.
// ANALYSIS
Mamba-3 looks technically important, but the release lands in an uncomfortable middle ground: enough code to study the architecture, not enough weights to reproduce the headline claims cleanly.
- The paper claims Mamba-3 improves downstream accuracy and inference efficiency at the 1.5B scale, including a stronger MIMO variant with no added decode latency.
- Together AI says the kernels are open-sourced, but that is not the same as releasing the trained checkpoints behind the benchmark tables.
- The state-spaces GitHub README lists pretrained Hugging Face models for Mamba and Mamba-2, while its Mamba-3 demo uses random tensors for block-level usage.
- For developers, this makes Mamba-3 more of a research architecture drop than a usable model release until official weights or reproducible training configs appear.
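The gap is easy to check programmatically. Below is a minimal, stdlib-only sketch that groups Hugging Face repo names by Mamba generation using the repo naming pattern (`mamba-2.8b` is a generation-1 model of 2.8B parameters, while `mamba2-2.7b` is generation 2). The `weight_coverage` helper and the `listed` names are illustrative assumptions, not a live API call against the Hub.

```python
import re

def weight_coverage(repo_names):
    """Group model repo names by Mamba generation.

    Hypothetical helper for illustration: the generation digit is the
    one fused to "mamba" before the hyphen (mamba2-...), while the
    number after the hyphen is the parameter count (mamba-2.8b).
    """
    coverage = {}
    for name in repo_names:
        m = re.search(r"mamba(\d*)-", name.lower())
        if m:
            gen = m.group(1) or "1"  # bare "mamba-" means generation 1
            coverage.setdefault(gen, []).append(name)
    return coverage

# Illustrative listing mirroring the thread's observation: weights for
# Mamba and Mamba-2 are published, but no Mamba-3 checkpoint appears.
listed = ["state-spaces/mamba-2.8b", "state-spaces/mamba2-2.7b"]
cov = weight_coverage(listed)
print("3" in cov)  # prints False — no Mamba-3 checkpoints in the listing
```

The same check against the real `state-spaces` listing (e.g. via the Hub's model-listing API) would confirm or refute the thread's claim as weights get uploaded.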
// TAGS
mamba-3 · llm · inference · research · benchmark
DISCOVERED
4h ago
2026-04-22
PUBLISHED
6h ago
2026-04-22
RELEVANCE
7/10
AUTHOR
Designer_Win6465