V-JEPA 2.1 splits dense features by corruption
A pre-registered 322-cell robustness sweep on Meta's V-JEPA 2.1 finds a sharp split: temporal corruptions track DAVIS failure, while image-noise corruptions mostly sit near zero. The study also reports non-monotonic scaling across the 80M to 2B family and strong orientation sensitivity.
The strongest reading here is not that all dense features are partitioned, but that one corruption family is predictive and another is not. The image-noise side is the weakest part of the claim: the reported correlations are small and their confidence intervals cross zero. For that reason the family-level contrast looks more defensible than any single effect size.
- Temporal perturbations show a meaningful relationship with downstream failure, which is the most practically useful signal in the sweep.
- Correlations for Gaussian blur, motion blur, and low-light corruptions stay near zero, which suggests the model is not broadly brittle to low-level image noise.
- The non-monotonic size result is interesting, but five perturbations is still a thin base for a general scaling law.
- The flip-versus-reverse-playback result is the cleanest robustness warning: orientation cues remain entangled with temporal structure.
- The withdrawn M1 component is a good sign methodologically; it means the authors noticed when the hypothesis stopped being well-defined instead of forcing it.
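The point about confidence intervals crossing zero can be made concrete with a small sketch. The summary does not say which correlation statistic or interval method the study used, so this assumes Pearson correlation and a percentile bootstrap over (corruption severity, failure) pairs. All data below is synthetic and illustrative, not taken from the sweep, and the names `pearson`, `bootstrap_ci`, and `crosses_zero` are hypothetical helpers.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # degenerate resample: treat as no correlation
    return cov / (vx * vy) ** 0.5


def bootstrap_ci(xs, ys, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the correlation, resampling pairs."""
    rng = random.Random(seed)
    n = len(xs)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def crosses_zero(ci):
    """True when the interval contains zero, i.e. the sign is not resolved."""
    lo, hi = ci
    return lo <= 0.0 <= hi


# Synthetic stand-ins (assumption): one family where failure tracks severity,
# one where failure is unrelated to severity.
rng = random.Random(1)
severity = [rng.random() for _ in range(40)]
failure_temporal = [s + rng.gauss(0, 0.1) for s in severity]
failure_image_noise = [rng.random() for _ in range(40)]

ci_temporal = bootstrap_ci(severity, failure_temporal)
ci_image_noise = bootstrap_ci(severity, failure_image_noise)
print("temporal:", ci_temporal, "crosses zero:", crosses_zero(ci_temporal))
print("image noise:", ci_image_noise, "crosses zero:", crosses_zero(ci_image_noise))
```

An interval that contains zero does not show the effect is absent; it only means the data cannot resolve its sign. That is why a contrast between whole families is easier to defend than any single small coefficient.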
DISCOVERED
2026-05-11
PUBLISHED
2026-05-11
AUTHOR
poisson_labs