manifold-guard spots DPO geometry collapse
OPEN_SOURCE · REDDIT · RESEARCH PAPER

manifold-guard is a zero-cost probe that watches the harmonic-mean-to-arithmetic-mean (HM/AM) ratio of Adam's second-moment estimates during post-training. In the repo's Qwen3-1.7B runs, DPO drives that ratio down by roughly 1,000x relative to CLM and SFT, even while the loss stays flat near ln(2).
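The repo's exact implementation isn't reproduced here, but the statistic itself is simple. A minimal pure-Python sketch, assuming you have flattened Adam's per-parameter second-moment entries (stored as `exp_avg_sq` in PyTorch's Adam state) into a single list:

```python
def hm_am_ratio(second_moments, eps=1e-12):
    """HM/AM ratio of Adam second-moment entries.

    The ratio lives in (0, 1]: close to 1 when curvature estimates are
    roughly uniform across coordinates, close to 0 when a handful of
    coordinates dominate -- the collapse signature described above.
    eps guards against exact zeros early in training.
    """
    vals = [v + eps for v in second_moments]
    n = len(vals)
    arithmetic = sum(vals) / n
    harmonic = n / sum(1.0 / v for v in vals)
    return harmonic / arithmetic
```

Because the harmonic mean is dominated by the smallest entries, a 1,000x drop in this ratio means the second-moment spectrum has become extremely skewed, not that the average magnitude changed.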

// ANALYSIS

This is a smart observability idea: instead of waiting for evals or loss curves to tell you something is wrong, it tries to expose alignment-induced damage directly in optimizer state. The claim is intriguing, but it is still early evidence from a narrow setup, so the real test is whether the signal survives across models, datasets, and alignment recipes.

  • The big upside is cost: it piggybacks on Adam state, so it can run during training with essentially no extra forward passes.
  • If the result replicates, HM/AM could become a useful early-warning metric for post-training anisotropy or sparse-curvature collapse.
  • The experiment is still limited: one model size, one family of data, and a metric that could be sensitive to optimizer choice or training hyperparameters.
  • The most interesting open question is causal, not descriptive: is DPO uniquely “damaging” geometry, or is the probe just surfacing the expected sparsity of preference gradients?
  • The repo is packaged well for reuse, which makes it more useful than a one-off paper graph.
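To make the early-warning idea concrete: a hypothetical monitor (the 10x drop threshold and class name are illustrative, not from the repo) that records a baseline ratio at the first step and flags large drops during training:

```python
def hm_am(values, eps=1e-12):
    """HM/AM ratio of a flat list of Adam second-moment entries."""
    vals = [v + eps for v in values]
    n = len(vals)
    return (n / sum(1.0 / v for v in vals)) / (sum(vals) / n)

class GeometryMonitor:
    """Tracks the HM/AM ratio across training steps and flags drops
    larger than drop_factor relative to the first recorded value."""

    def __init__(self, drop_factor=10.0):  # threshold is a made-up default
        self.baseline = None
        self.drop_factor = drop_factor

    def update(self, second_moments):
        ratio = hm_am(second_moments)
        if self.baseline is None:
            self.baseline = ratio
        alarm = ratio < self.baseline / self.drop_factor
        return ratio, alarm
```

In a real run you would call `update` every N optimizer steps with the concatenated `exp_avg_sq` values; since those tensors already exist in optimizer state, the cost is one reduction pass, with no extra forward or backward computation.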
// TAGS
llm · fine-tuning · research · open-source · manifold-guard

DISCOVERED

2026-03-19

PUBLISHED

2026-03-19

RELEVANCE

8/10

AUTHOR

Large-Mobile7177