Physical Intelligence adds dual-memory stack to VLAs
OPEN_SOURCE
REDDIT // 38d ago // PRODUCT UPDATE


Physical Intelligence has published its MEM architecture, with an accompanying paper, adding a short-term video memory and a long-term language memory to its π0.6 vision-language-action models. The update targets long-horizon robotic tasks and claims memory windows of up to 15 minutes while keeping inference latency practical.

// ANALYSIS

This is a meaningful step from flashy robot demos toward systems that can actually track multi-stage work over time.

  • MEM splits memory into two channels: dense recent observations and compressed textual task history.
  • The approach is designed to reduce token load, which matters for real-time robotic control loops.
  • The blog and paper frame gains around long-horizon kitchen-style tasks and in-context adaptation after failed attempts.
  • If these results transfer broadly, VLA progress may hinge more on memory design than on scaling raw policy size.
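The two-channel split described above can be sketched in a few lines. This is a hypothetical illustration, not Physical Intelligence's actual MEM implementation: the class name, window sizes, and context format are all assumptions chosen to show why pairing a small dense window with cheap text summaries keeps token load bounded.

```python
from collections import deque

class DualMemory:
    """Hypothetical sketch of a dual-channel memory (not the real MEM API).

    Short-term channel: a fixed-size window of dense recent observations.
    Long-term channel: compressed one-line text summaries of completed
    subtasks, so old work costs a few tokens instead of raw frames.
    """

    def __init__(self, short_window=8, max_summaries=64):
        self.short_term = deque(maxlen=short_window)  # dense, auto-evicting
        self.long_term = []                           # cheap text history
        self.max_summaries = max_summaries

    def observe(self, frame):
        # Every new observation enters the short-term window;
        # the oldest frame falls out automatically once it is full.
        self.short_term.append(frame)

    def summarize_subtask(self, text):
        # On subtask completion, keep a language summary instead of
        # the raw frames, evicting the oldest summary when over budget.
        self.long_term.append(text)
        if len(self.long_term) > self.max_summaries:
            self.long_term.pop(0)

    def build_context(self):
        # Policy context = long-horizon text history + dense recent window.
        return {"history": list(self.long_term),
                "recent": list(self.short_term)}

mem = DualMemory(short_window=3)
for t in range(5):
    mem.observe(f"frame_{t}")
mem.summarize_subtask("opened drawer; grasp failed once, retried")
ctx = mem.build_context()
print(len(ctx["recent"]), len(ctx["history"]))  # prints: 3 1
```

The token-load argument falls out of the structure: the dense window stays constant-size regardless of episode length, while the text channel grows slowly and caps out, which is what makes a 15-minute horizon compatible with a real-time control loop.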
// TAGS
mem · robotics · multimodal · research · agent

DISCOVERED

2026-03-05 (38d ago)

PUBLISHED

2026-03-04 (38d ago)

RELEVANCE

9/10

AUTHOR

Worldly_Evidence9113