OPEN_SOURCE
REDDIT // 23d ago · RESEARCH PAPER
manifold-guard spots DPO geometry collapse
manifold-guard is a zero-cost probe that watches the harmonic-to-arithmetic mean (HM/AM) ratio of Adam’s second-moment estimates during post-training. In the repo’s Qwen3-1.7B runs, DPO drives that ratio down by roughly 1,000x relative to CLM and SFT, even while the loss stays flat near ln(2).
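The repo’s exact implementation isn’t reproduced here, but the metric itself is simple to sketch. The HM/AM ratio of a set of positive values is 1.0 when they are all equal and collapses toward 0 when a few values dominate, which is why it can flag sparse curvature in Adam’s second-moment buffers. A minimal stdlib-only sketch (in practice the values would come from Adam’s `exp_avg_sq` state; that wiring is assumed, not shown):

```python
def hm_am_ratio(v, eps=1e-30):
    """HM/AM ratio of positive values v.

    Equals 1.0 when all entries are identical and approaches 0 as the
    distribution becomes heavy-tailed, e.g. when a few coordinates
    dominate Adam's second-moment estimates.
    """
    n = len(v)
    am = sum(v) / n                                   # arithmetic mean
    hm = n / sum(1.0 / (x + eps) for x in v)          # harmonic mean
    return hm / am

# Uniform second moments -> ratio near 1 (isotropic curvature).
flat = [1e-4] * 1000
# A few dominant coordinates -> ratio collapses (sparse curvature).
spiky = [1e-1] * 10 + [1e-7] * 990
```

On these toy inputs, `hm_am_ratio(flat)` is essentially 1.0 while `hm_am_ratio(spiky)` drops by several orders of magnitude, the kind of gap the post attributes to DPO.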
// ANALYSIS
This is a smart observability idea: instead of waiting for evals or loss curves to tell you something is wrong, it tries to expose alignment-induced damage directly in optimizer state. The claim is intriguing, but it is still early evidence from a narrow setup, so the real test is whether the signal survives across models, datasets, and alignment recipes.
- The big upside is cost: it piggybacks on Adam state, so it can run during training with essentially no extra forward passes.
- If the result replicates, HM/AM could become a useful early-warning metric for post-training anisotropy or sparse-curvature collapse.
- The experiment is still limited: one model size, one family of data, and a metric that could be sensitive to optimizer choice or training hyperparameters.
- The most interesting open question is causal, not descriptive: is DPO uniquely “damaging” geometry, or is the probe just surfacing the expected sparsity of preference gradients?
- The repo is packaged well for reuse, which makes it more useful than a one-off paper graph.
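The early-warning usage the bullets describe could look something like the toy monitor below (all names here are illustrative, not from the repo): the ratio is read after each optimizer step, the first reading sets a baseline, and an alarm fires once the ratio falls far below that baseline, e.g. the ~1,000x drop reported for DPO.

```python
class RatioMonitor:
    """Toy early-warning monitor for an HM/AM-style probe (hypothetical).

    In a real run, update() would be fed the HM/AM ratio of Adam's
    second-moment buffers after each step; since those values already
    live in optimizer state, no extra forward passes are needed.
    """

    def __init__(self, drop_factor=100.0):
        self.baseline = None          # set from the first observed ratio
        self.drop_factor = drop_factor

    def update(self, ratio):
        if self.baseline is None:
            self.baseline = ratio
        # Flag when the ratio has fallen far below its early-training value.
        return ratio < self.baseline / self.drop_factor

mon = RatioMonitor()
mon.update(0.9)        # first step sets the baseline, no alarm
mon.update(0.5)        # mild drift: no alarm
mon.update(0.9 / 1000) # ~1000x collapse: alarm fires
```

A relative threshold like this sidesteps the question of what an “absolute” healthy ratio is, which plausibly varies with model size and optimizer hyperparameters, one of the replication caveats noted above.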
// TAGS
llm · fine-tuning · research · open-source · manifold-guard
DISCOVERED
2026-03-19
PUBLISHED
2026-03-19
RELEVANCE
8/10
AUTHOR
Large-Mobile7177