mamba-rs brings Mamba SSM to Rust, CUDA
mamba-rs is a framework-independent implementation of the Mamba Selective State Space Model (SSM) in Rust, featuring high-performance training and inference with custom CUDA kernels.
mamba-rs gives the Rust AI ecosystem a lightweight alternative to Python-heavy stacks. Because it does not depend on an external framework such as PyTorch or Burn, it reduces overhead and simplifies deployment in production or embedded environments. Its custom CUDA kernels are compiled at runtime via NVRTC and use the Tensor Cores on NVIDIA Ampere and Hopper GPUs for high-performance training. Gradients are derived analytically by hand rather than through an autograd engine, and a "burn-in" API for recurrent states exposes low-level control that autograd engines often obscure. A reported CPU inference latency of 200 μs makes it a strong candidate for low-latency real-time applications.
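To make the recurrent-state and "burn-in" ideas concrete, here is a minimal sketch of a single-channel selective SSM step in plain Rust. It is not the mamba-rs API: the names `SsmState`, `StepParams`, `ssm_step`, and `burn_in` are hypothetical, the state matrix is assumed diagonal, and the per-step parameters are held fixed for brevity, whereas a real Mamba layer computes them from each input.

```rust
/// Hypothetical recurrent state for one channel (illustrative, not the mamba-rs type).
pub struct SsmState {
    /// Hidden state h, one entry per state dimension.
    pub h: Vec<f32>,
}

/// Step parameters. In Mamba these depend on the input at each step;
/// here they are held fixed to keep the sketch short (an assumption).
pub struct StepParams<'a> {
    pub delta: f32,   // step size, expected to be positive
    pub b: &'a [f32], // input projection B
    pub c: &'a [f32], // output projection C
}

/// One recurrent step with a diagonal state matrix `a`:
///   h_t = exp(delta * a) * h_{t-1} + delta * B * x_t   (elementwise)
///   y_t = C . h_t + d * x_t
pub fn ssm_step(state: &mut SsmState, a: &[f32], d: f32, p: &StepParams<'_>, x: f32) -> f32 {
    assert_eq!(state.h.len(), a.len());
    assert_eq!(p.b.len(), a.len());
    assert_eq!(p.c.len(), a.len());

    let mut y = d * x; // skip connection
    for i in 0..a.len() {
        let decay = (p.delta * a[i]).exp(); // discretized state transition
        state.h[i] = decay * state.h[i] + p.delta * p.b[i] * x;
        y += p.c[i] * state.h[i];
    }
    y
}

/// Burn-in: advance the state over a warm-up prefix and discard the outputs,
/// so later steps start from a warmed-up state instead of zeros.
pub fn burn_in(state: &mut SsmState, a: &[f32], d: f32, p: &StepParams<'_>, prefix: &[f32]) {
    for &x in prefix {
        let _ = ssm_step(state, a, d, p, x);
    }
}
```

A caller would create an `SsmState` filled with zeros, pass a warm-up prefix to `burn_in`, and then call `ssm_step` once per new input. Because each step touches only a small fixed-size state, per-step cost does not grow with sequence length, which is what makes recurrent inference attractive for low-latency use.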
DISCOVERED
2026-03-24
PUBLISHED
2026-03-24
AUTHOR
Github Awesome