State Flow Machine tops transformers on length extrapolation
OPEN_SOURCE
REDDIT · 27d ago · BENCHMARK RESULT


A solo researcher has open-sourced State Flow Machine (SFM), a non-transformer architecture that replaces attention heads with explicit memory "state slots." On a synthetic state-tracking benchmark, it reaches 62% accuracy when extrapolating to sequences 4x the training length, versus ~2% for transformers of any size.

// ANALYSIS

A clean experimental result on a narrow synthetic task. The real test comes when Mamba, RWKV, and other SSMs are added to the comparison table: those recurrent architectures, not transformers, are the natural competitors at this kind of state tracking.

  • State slots replace attention with 16 named memory registers updated via gated DeltaNet recurrent cells — an explicit, directly addressable alternative to attention's implicit token-history compression
  • The transformer collapse at 4x length (~2%) is theoretically expected: TC⁰ circuit-complexity limits make constant-depth vanilla attention provably weak at algorithmic state-tracking tasks
  • Important caveat: SFM uses intermediate state supervision (auxiliary loss at every operation step), giving it significantly more gradient signal than transformer baselines — disclosed but not equalized
  • No comparison to Mamba, RWKV, or other SSMs yet, which the author acknowledges — those architectures are designed for exactly this kind of recurrent state tracking
  • Built with AI assistance (Claude Opus 4.6 as co-author), runs only on Huawei Ascend NPUs, and has zero stars on a 3-day-old repo — reproducibility for most researchers is limited until a CUDA port appears
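SFM's code isn't quoted in the post, but the gated delta-rule update that DeltaNet-style recurrent cells use for a single memory slot can be sketched as follows. All names, dimensions, and the 16-slot layout here are illustrative assumptions, not SFM's actual implementation:

```python
import numpy as np

def gated_delta_update(S, k, v, beta, alpha):
    """One gated delta-rule step for a single memory slot (illustrative).

    S     : (d, d) slot state matrix (an associative key->value memory)
    k, v  : (d,) key / value vectors (k assumed unit-norm)
    beta  : write strength in [0, 1]
    alpha : retention gate in [0, 1] (1 = keep old state fully)
    """
    # Erase the component of S stored along k, decay the rest by alpha...
    S = alpha * (S - beta * np.outer(S @ k, k))
    # ...then write the new association v @ k^T with strength beta.
    return S + beta * np.outer(v, k)

# 16 named slots, each a (d x d) associative memory -- assumed layout.
d, n_slots = 8, 16
slots = np.zeros((n_slots, d, d))

rng = np.random.default_rng(0)
k = rng.standard_normal(d)
k /= np.linalg.norm(k)          # unit-norm key
v = rng.standard_normal(d)

# Write (k -> v) into slot 0 at full strength, then read it back.
slots[0] = gated_delta_update(slots[0], k, v, beta=1.0, alpha=1.0)
recalled = slots[0] @ k          # recovers v (exact here: empty state, beta=1)
```

The erase-then-write structure is what makes the memory directly addressable: a slot can overwrite a specific key's value without disturbing associations stored along other keys, which is the contrast the bullet above draws with attention's implicit compression of token history.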
// TAGS
state-flow-machine · llm · open-source · benchmark · research · reasoning

DISCOVERED

2026-03-16

PUBLISHED

2026-03-16

RELEVANCE

5/10

AUTHOR

Own-Albatross868