State Flow Machine tops transformers on length extrapolation

// 73d ago · BENCHMARK RESULT

A solo researcher has open-sourced State Flow Machine (SFM), a non-transformer architecture using explicit memory "state slots" instead of attention heads, achieving 62% accuracy at 4x training-length sequence extrapolation versus ~2% for transformers of any size on a synthetic state-tracking benchmark.

// ANALYSIS

A clean experimental result on a narrow synthetic task. The real test comes when Mamba, RWKV, and other SSMs are added to the comparison table: those recurrent architectures are the natural competitors on this kind of task, whereas transformers were never designed for it.

  • State slots replace attention with 16 named memory registers updated via gated DeltaNet recurrent cells — an explicit, directly addressable alternative to attention's implicit token-history compression
  • The transformer collapse at 4x length (~2%) is theoretically expected: TC⁰ circuit-complexity limits make vanilla attention provably weak at algorithmic state-tracking tasks
  • Important caveat: SFM uses intermediate state supervision (auxiliary loss at every operation step), giving it significantly more gradient signal than transformer baselines — disclosed but not equalized
  • No comparison to Mamba, RWKV, or other SSMs yet, which the author acknowledges — those architectures are designed for exactly this kind of recurrent state tracking
  • Built with AI assistance (Claude Opus 4.6 as co-author), runs only on Huawei Ascend NPUs, and has zero stars on a 3-day-old repo — reproducibility for most researchers is limited until a CUDA port appears
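
To make the first bullet concrete, here is a minimal sketch of a gated delta-rule write into a bank of 16 slots. It is not taken from the SFM repository: the function names, the one-hot slot addressing, and the value width of 2 are all assumptions chosen for illustration. The update follows the generic gated delta rule, S_t = alpha * (S - beta * (S k) k^T) + beta * v k^T, where alpha is a decay gate and beta is a write strength.

```python
# Hypothetical sketch of a gated delta-rule memory update (not SFM's actual code).
# S is a d_v x d_k matrix stored as a list of rows; with one-hot keys, each
# column of S behaves like one addressable slot.

NUM_SLOTS = 16   # matches the "16 registers" in the summary; addressing scheme is assumed
VALUE_DIM = 2    # arbitrary width for the illustration


def one_hot(index, size):
    """Return a one-hot key vector that selects a single slot."""
    return [1.0 if j == index else 0.0 for j in range(size)]


def gated_delta_step(S, k, v, alpha, beta):
    """One recurrent step: decay the memory by alpha, erase what is stored
    under key k (scaled by beta), then write value v under key k."""
    d_v, d_k = len(S), len(k)
    # What the memory currently returns for key k, i.e. S @ k.
    old = [sum(S[i][j] * k[j] for j in range(d_k)) for i in range(d_v)]
    return [
        [alpha * (S[i][j] - beta * old[i] * k[j]) + beta * v[i] * k[j]
         for j in range(d_k)]
        for i in range(d_v)
    ]


def read(S, q):
    """Read the memory with query q, i.e. S @ q."""
    return [sum(row[j] * q[j] for j in range(len(q))) for row in S]


# Start from an empty memory and perform three writes without decay.
S = [[0.0] * NUM_SLOTS for _ in range(VALUE_DIM)]
S = gated_delta_step(S, one_hot(3, NUM_SLOTS), [1.0, 2.0], alpha=1.0, beta=1.0)
S = gated_delta_step(S, one_hot(7, NUM_SLOTS), [5.0, 5.0], alpha=1.0, beta=1.0)
S = gated_delta_step(S, one_hot(3, NUM_SLOTS), [9.0, 0.0], alpha=1.0, beta=1.0)

print(read(S, one_hot(3, NUM_SLOTS)))  # slot 3 holds the latest write, not a sum
print(read(S, one_hot(7, NUM_SLOTS)))  # slot 7 is untouched by writes to slot 3
```

With unit-norm keys and beta = 1, a second write to the same key overwrites the old value instead of adding to it, which is the property that makes the state directly addressable. Setting alpha below 1 decays every slot on each step. In a trained model the keys, values, and gates would be learned functions of the input rather than hand-picked constants.
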
// TAGS

state-flow-machine, llm, open-source, benchmark, research, reasoning

DISCOVERED: 73d ago (2026-03-16)

PUBLISHED: 73d ago (2026-03-16)

RELEVANCE: 5/10

AUTHOR: Own-Albatross868