Grow, Don’t Overwrite curbs catastrophic forgetting

// 79d ago · RESEARCH PAPER

This paper proposes a function-preserving way to expand transformer MLP layers during fine-tuning by duplicating up-projection weights and compensating in the down-projection layer. On Gemma models, it reports downstream performance comparable to standard fine-tuning while preserving much more of the base model’s original capabilities.

// ANALYSIS

This is a sharp continual-learning result because it mitigates forgetting by adding new capacity instead of treating retention as a regularization tax.

  • The core method is elegant: copy the MLP up-projection, scale the down-projection, and keep the expanded network mathematically identical to the original at initialization so training stays stable.
  • The paper shows the clearest gains on high-shift tasks like translation and entailment, where standard fine-tuning erases prior capabilities but the growth-based variants preserve them.
  • It is more practical than it first sounds: growing all layers trains roughly 60% of the original parameter count, and growing only 9-10 targeted layers gets close to full performance at roughly 30%.
  • The main limitation is scope: the experiments are centered on MLP growth in transformer models, and harder reasoning tasks like MathQA still benefit from a less frozen variant.
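The copy-and-scale step in the first bullet can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: it assumes a plain (non-gated) two-layer MLP, a GELU activation, and a compensation factor of 0.5 on the down-projection, which is one choice that makes the outputs match at initialization. The names `mlp`, `grow_mlp`, and the toy dimensions are hypothetical.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU (assumed activation for this sketch).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def mlp(x, w_up, w_down):
    # Plain MLP block: down-projection applied to the activated up-projection.
    return gelu(x @ w_up) @ w_down


def grow_mlp(w_up, w_down):
    # w_up has shape (d_model, d_ff); w_down has shape (d_ff, d_model).
    # Duplicating the up-projection columns duplicates every hidden unit.
    w_up_grown = np.concatenate([w_up, w_up], axis=1)
    # Each hidden unit now appears twice, so halving both copies of the
    # down-projection rows keeps the summed output unchanged (assumed scaling).
    w_down_grown = np.concatenate([0.5 * w_down, 0.5 * w_down], axis=0)
    return w_up_grown, w_down_grown


# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
d_model, d_ff = 8, 32
w_up = rng.normal(size=(d_model, d_ff))
w_down = rng.normal(size=(d_ff, d_model))
x = rng.normal(size=(4, d_model))

w_up_g, w_down_g = grow_mlp(w_up, w_down)

# The grown block reproduces the original block's output at initialization.
assert np.allclose(mlp(x, w_up, w_down), mlp(x, w_up_g, w_down_g))
```

Because the grown block starts out computing the same function, fine-tuning begins from the base model's behavior, and the duplicated units are free to diverge from their originals as training proceeds.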
// TAGS
grow-dont-overwrite · fine-tuning · llm · research

DISCOVERED: 2026-03-11 (79d ago)

PUBLISHED: 2026-03-11 (79d ago)

RELEVANCE: 9/10

AUTHOR: Discover AI