V-JEPA 2.1 splits dense features by corruption

// 2h ago · BENCHMARK RESULT

A pre-registered 322-cell robustness sweep on Meta's V-JEPA 2.1 finds a sharp split: temporal corruptions track downstream failure on DAVIS, while correlations for image-noise corruptions mostly sit near zero. The study also reports non-monotonic scaling across the 80M-to-2B model family and strong orientation sensitivity.

// ANALYSIS

The strongest reading here is not that all dense features are partitioned, but that one corruption family is predictive of downstream failure and another is not. The image-noise side is the weakest part of the claim: the reported correlations are small and their confidence intervals cross zero. The family-level contrast therefore looks more defensible than any single effect size.
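To make "confidence intervals cross zero" concrete, here is a minimal sketch of one way to check it for a single corruption. It assumes a Spearman rank correlation with a percentile bootstrap; the source does not say which correlation statistic or interval method the study used. The function names are hypothetical, and the inputs stand in for per-clip corruption scores and downstream failure scores.

```python
import random
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant (a convention chosen here)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the Spearman correlation."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        stats.append(spearman([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def crosses_zero(ci):
    """True when the interval contains zero, i.e. the sign of the effect is not pinned down."""
    return ci[0] <= 0.0 <= ci[1]
```

An interval that contains zero does not show the effect is absent; it only means the data cannot rule out no relationship. That is why the family-level contrast is the safer claim.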

– Temporal perturbations show a meaningful relationship with downstream failure, which is the most practically useful signal in the sweep.
– Gaussian blur, motion blur, and low-light corruptions staying near zero suggests the model is not broadly brittle to low-level image noise.
– The non-monotonic size result is interesting, but five perturbations is still a thin base for a general scaling law.
– The flip-versus-reverse-playback result is the cleanest robustness warning: orientation cues remain entangled with temporal structure.
– The withdrawn M1 component is a good sign methodologically; it means the authors noticed when the hypothesis stopped being well-defined instead of forcing it.
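For readers unfamiliar with the flip-versus-reverse-playback comparison, the two perturbations can be sketched on a toy clip. This is an illustrative assumption, not the study's code: the source does not state which flip axis was used, so a horizontal (left-to-right) flip is assumed. The clip is a plain nested list indexed as `[time][row][column]`.

```python
def horizontal_flip(clip):
    """Mirror every frame left-to-right; frame order is unchanged."""
    return [[row[::-1] for row in frame] for frame in clip]


def reverse_playback(clip):
    """Reverse the frame order; pixels inside each frame are unchanged."""
    return clip[::-1]


# Toy clip: 2 frames, each 2x2 pixels (values are placeholders).
clip = [
    [[0, 1], [2, 3]],
    [[4, 5], [6, 7]],
]

flipped = horizontal_flip(clip)    # spatial change only
reversed_ = reverse_playback(clip)  # temporal change only
```

The flip changes only spatial layout and the reversal changes only temporal order, so comparing a model's response to the two is a way to probe how separable its orientation and temporal cues are.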

// TAGS

v-jepa-2-1, benchmark, evaluation, research, vision, multimodal

DISCOVERED: 2h ago (2026-05-11)

PUBLISHED: 6h ago (2026-05-11)

RELEVANCE: 9/10

AUTHOR: poisson_labs
