ZipMap zips 3D reconstruction to linear time
OPEN SOURCE
YT · YOUTUBE // 34d ago · RESEARCH PAPER


ZipMap is a CVPR 2026 paper from Google DeepMind, Cornell, and MIT that uses test-time training to compress long image sequences into a compact scene state for linear-time 3D reconstruction. The model reconstructs 750 frames in under 10 seconds on one H100 while matching or beating quadratic baselines like VGGT and π³ across several pose, depth, and point-map benchmarks.

// ANALYSIS

ZipMap is interesting because it does more than speed up a benchmark: it reframes large-scene reconstruction as stateful memory compression instead of ever-growing global attention. That makes it one of the clearest signs that test-time training can be useful outside language modeling.

  • The core trick is swapping quadratic global attention for local window attention plus large-chunk test-time training layers, pushing runtime from O(N²) to O(N)
  • Unlike many efficient vision papers, it does not just trade quality for speed; the reported results stay competitive with top quadratic systems on camera pose, depth, and dense geometry
  • The compact scene state is queryable in real time, which gives the model a practical path from offline reconstruction to interactive and streaming use cases
  • The paper still flags real limits: very long out-of-distribution scenes degrade quality, and queried RGB views remain blurry in high-frequency regions
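The linear-time claim in the bullets above comes down to one structural change: instead of letting every frame attend to every other frame, the model folds each chunk of frames into a fixed-size scene state and discards the raw history. A minimal sketch of that pattern, with purely illustrative names (`SceneState`, `update`, `query` are not ZipMap's API, and the outer-product update only stands in for the paper's test-time-training layers):

```python
import numpy as np

def global_attention_cost(n_frames: int) -> int:
    # quadratic baseline: every frame attends to every other frame
    return n_frames * n_frames

class SceneState:
    """Fixed-size compressed scene memory, updated chunk by chunk."""
    def __init__(self, dim: int = 64, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.w = rng.standard_normal((dim, dim)) * 0.01  # fast weights

    def update(self, chunk: np.ndarray) -> None:
        # TTT-style step: an outer-product update per frame; cost scales
        # with the chunk, not with the length of the history already seen
        for frame in chunk:
            self.w += 1e-3 * np.outer(frame, frame)

    def query(self, q: np.ndarray) -> np.ndarray:
        # read out from the compact state; O(1) in sequence length
        return self.w @ q

def linear_reconstruct(frames: np.ndarray, chunk_size: int = 32) -> SceneState:
    # one pass over the sequence: total work is O(N), memory is O(1)
    state = SceneState(dim=frames.shape[1])
    for start in range(0, len(frames), chunk_size):
        state.update(frames[start:start + chunk_size])
    return state

frames = np.random.default_rng(1).standard_normal((750, 64))
state = linear_reconstruct(frames)
readout = state.query(frames[0])  # real-time query against the compressed state
```

The point of the sketch is the asymmetry: `global_attention_cost(750)` grows quadratically, while `linear_reconstruct` touches each frame exactly once and keeps only a constant-size `w`, which is also why the state stays cheap to query interactively.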
// TAGS
zipmap · research · inference · benchmark

DISCOVERED

2026-03-08 (34d ago)

PUBLISHED

2026-03-08 (34d ago)

RELEVANCE

8 / 10

AUTHOR

Discover AI