Looped transformers generalize composition through iterative depth
This paper studies recurrent-depth transformers, a looped architecture that reuses the same layers multiple times in a single forward pass. The authors argue that standard transformers can retain factual knowledge yet still fail at implicit multi-hop reasoning because they struggle to compose that knowledge systematically. In controlled experiments, recurrent-depth models handled both systematic generalization and depth extrapolation better than vanilla transformers, with deeper reasoning emerging as inference-time recurrence increased. The paper also identifies a failure mode, overthinking, where too much recurrence hurts predictions.
Hot take: the main win here is not “more thinking” in the abstract, but a cleaner way to turn fixed parameters into iterative computation when the task needs composition.
- The interesting result is compositional generalization: looped depth helps the model combine facts it did not see composed during training.
- The depth extrapolation finding matters operationally: extra test-time loops can buy more reasoning depth without retraining the whole model.
- The mechanistic story is stronger than a benchmark-only claim because the paper tracks a grokking-like transition from memorization to systematic generalization.
- The caveat is real: recurrence is not free, and too many loops can degrade outputs through overthinking.
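The weight-tying idea above can be sketched minimally: one block's parameters are reused for a variable number of loops, so inference-time depth scales without adding parameters or retraining. This is a hedged illustration, not the paper's implementation; the matrix `W` and the `tanh` update are hypothetical stand-ins for a full transformer layer stack.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hidden width (illustrative)

# One shared "block": a single weight matrix reused every iteration,
# standing in for a full set of transformer layers (hypothetical stand-in).
W = rng.normal(scale=0.1, size=(d, d))

def looped_forward(x, n_loops):
    """Apply the same block n_loops times with a residual connection.

    Parameters stay fixed; effective compute depth scales with n_loops,
    which can be raised at inference time.
    """
    h = x
    for _ in range(n_loops):
        h = h + np.tanh(W @ h)  # weight-tied update on the residual stream
    return h

x = rng.normal(size=d)
shallow = looped_forward(x, n_loops=4)
deep = looped_forward(x, n_loops=16)  # more recurrence, same weights
```

Raising `n_loops` at test time is the knob behind the depth-extrapolation result; the overthinking caveat is the observation that this knob can be turned too far.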
DISCOVERED: 2026-05-01
PUBLISHED: 2026-05-01
AUTHOR: AlphaSignalAI