JEPA world models get first generalization theory
This paper presents the first formal generalization theory for Joint Embedding Predictive Architectures (JEPAs) used as world models, casting pretraining as a conditional spectral graph learning problem. The authors establish finite-sample generalization bounds that link pretraining representation error directly to downstream planning regret. The bounds expose a trade-off in the latent dimension: a larger latent space lowers approximation error but raises sample estimation error.
While JEPAs have shown strong empirical performance as world models, they have lacked rigorous theoretical guarantees until now.
* Formulates pretraining as conditional spectral graph learning, proving that JEPA pretraining learns low-dimensional representations of the state transition graph.
* Connects pretraining error to downstream planning regret with finite-sample bounds.
* Identifies an inherent trade-off in latent dimensionality, where larger latent spaces reduce representation approximation error but increase sample estimation error.
* Explains mathematically why JEPAs generalize better on downstream tasks than generative, input-reconstructing world models.
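
The latent-dimension trade-off in the points above can be illustrated with a toy experiment. The sketch below is not the paper's method and involves no JEPA: it assumes a small Markov chain (a lazy random walk on a ring), uses a truncated SVD of the transition matrix as a stand-in for a low-dimensional spectral representation, and compares the rank-k error with exact transitions against the rank-k error when the matrix is estimated from a finite number of samples. All names (`ring_transition_matrix`, `rank_k`, `empirical_transitions`, `error_curves`) and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def ring_transition_matrix(n, stay=0.5):
    """Row-stochastic matrix of a lazy random walk on a ring of n states."""
    P = np.zeros((n, n))
    for s in range(n):
        P[s, s] = stay
        P[s, (s - 1) % n] += (1 - stay) / 2
        P[s, (s + 1) % n] += (1 - stay) / 2
    return P

def rank_k(M, k):
    """Best rank-k approximation of M in Frobenius norm (truncated SVD)."""
    U, S, Vt = np.linalg.svd(M)
    return (U[:, :k] * S[:k]) @ Vt[:k]

def empirical_transitions(P, samples_per_state, rng):
    """Estimate P from a finite number of sampled transitions per state."""
    n = P.shape[0]
    P_hat = np.zeros_like(P)
    for s in range(n):
        nxt = rng.choice(n, size=samples_per_state, p=P[s])
        P_hat[s] = np.bincount(nxt, minlength=n) / samples_per_state
    return P_hat

def error_curves(P, P_hat, dims):
    """Frobenius errors against the true P for each latent dimension k."""
    # Approximation error: truncate the exact matrix (no sampling noise).
    approx = [np.linalg.norm(P - rank_k(P, k)) for k in dims]
    # Finite-sample error: truncate the estimated matrix instead.
    total = [np.linalg.norm(P - rank_k(P_hat, k)) for k in dims]
    return approx, total

rng = np.random.default_rng(0)
P = ring_transition_matrix(30)
P_hat = empirical_transitions(P, samples_per_state=20, rng=rng)
dims = [1, 2, 4, 8, 16, 30]
approx, total = error_curves(P, P_hat, dims)

for k, a, t in zip(dims, approx, total):
    print(f"k={k:2d}  approximation error={a:.3f}  finite-sample error={t:.3f}")
```

The first column can only shrink as k grows, since a higher-rank truncation of the exact matrix is never worse. The second column also absorbs sampling noise in every direction that is kept, so it need not keep improving; whether and where it turns upward depends on the sample count. The toy says nothing about the actual rates or constants in the paper's bounds.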
DISCOVERED
2026-06-29
PUBLISHED
2026-06-29
AUTHOR
Discover AI