Vercel’s AGENTS.md beats skills in evals
YT · YOUTUBE // 32d ago // BENCHMARK RESULT

Vercel reports that a compressed 8KB docs index embedded in AGENTS.md hit a 100% pass rate on hardened Next.js 16 agent evals, while skills reached 79% with explicit invocation instructions and only 53% by default. The company also shipped the `npx @next/codemod@canary agents-md` codemod to inject version-matched docs into projects automatically.
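The codemod invocation as published by Vercel (run from a project root; exactly which files it writes or updates is determined by the tool itself):

```shell
# Vercel's published command to embed a version-matched docs index
# in AGENTS.md. The @canary tag pins the prerelease codemod channel.
npx @next/codemod@canary agents-md
```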

// ANALYSIS

This is less a win for markdown than a win for removing an unreliable decision point from the agent. Vercel’s result matters because it turns the AGENTS.md vs. skills debate into a measurable reliability question instead of a matter of workflow taste.

  • The key failure mode was invocation: Vercel says the skill was never triggered in 56% of eval cases, which erased most of its theoretical benefit.
  • The winning setup was not dumping full docs into context, but a compressed index that points the agent to local, version-matched `.next-docs` files when needed.
  • Vercel found prompt wording was brittle: “explore project first, then invoke skill” beat more forceful phrasing, a sign that current agent behavior is still fragile.
  • The broader takeaway for framework authors is to optimize for retrieval-friendly context that is always present, not just tools that agents are supposed to remember to call.
  • Hacker News discussion around the post zeroed in on the tradeoff: passive context improves adherence now, but skills still matter for larger toolchains where context budgets are tight.
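The "always-present index, docs on demand" pattern the bullets describe can be sketched as an AGENTS.md fragment. This layout is purely illustrative, not Vercel's actual index format; only the `.next-docs` directory name comes from the post:

```markdown
## Next.js 16 docs index (compressed, hypothetical entries)

<!-- Each entry maps a topic to a local, version-matched doc file. -->
- Caching and revalidation → .next-docs/caching.md
- App Router conventions   → .next-docs/app-router.md
- Server Actions           → .next-docs/server-actions.md

Before writing framework-specific code, read the matching file
under `.next-docs/` instead of relying on training-data knowledge.
```

The point of the pattern: the cheap index is always in context, so no invocation decision can fail; only the full docs are fetched on demand.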
// TAGS
vercel · agent · ai-coding · testing · benchmark · devtool

DISCOVERED

2026-03-11

PUBLISHED

2026-03-11

RELEVANCE

8/10

AUTHOR

DIY Smart Code