OPEN_SOURCE
YT · YOUTUBE // 32d ago // BENCHMARK RESULT
Vercel’s AGENTS.md beats skills in evals
Vercel reports that a compressed 8KB docs index embedded in AGENTS.md hit a 100% pass rate on hardened Next.js 16 agent evals, while skills reached 79% only with explicit instructions and 53% by default. The company also shipped the `npx @next/codemod@canary agents-md` codemod to inject version-matched docs into projects automatically.
// ANALYSIS
This is less a win for markdown than a win for removing an unreliable agent decision point. Vercel’s result matters because it turns the AGENTS.md vs. skills debate into a measurable reliability question instead of a vibes-only workflow preference.
- The key failure mode was invocation: Vercel says the skill was never triggered in 56% of eval cases, which erased most of its theoretical benefit.
- The winning setup was not dumping full docs into context, but a compressed index that points the agent to local, version-matched `.next-docs` files when needed.
- Vercel found prompt wording was brittle: “explore project first, then invoke skill” beat more forceful phrasing, a sign that current agent behavior is still fragile.
- The broader takeaway for framework authors is to optimize for retrieval-friendly context that is always present, not just tools that agents are supposed to remember to call.
- Hacker News discussion around the post zeroed in on the tradeoff: passive context improves adherence now, but skills still matter for larger toolchains where context budgets are tight.
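The compressed-index approach can be sketched as an AGENTS.md fragment. This is a minimal illustration only: the section heading, file names, and `.next-docs` layout below are assumptions for demonstration, not Vercel's actual codemod output.

```markdown
<!-- Hypothetical AGENTS.md fragment; structure is illustrative, not the codemod's real output -->
## Next.js 16 docs index (version-matched)

Before answering Next.js questions, read the relevant local doc instead of
relying on training-data knowledge. These files match this project's
installed Next.js version:

- Routing and layouts: `.next-docs/routing.md`
- Caching and revalidation: `.next-docs/caching.md`
- Server Actions and mutations: `.next-docs/server-actions.md`
```

The point of this shape is that it is always in context (no invocation decision for the agent to get wrong) while keeping the token cost to a small index rather than the full documentation.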
// TAGS
vercel · agent · ai-coding · testing · benchmark · devtool
DISCOVERED
2026-03-11
PUBLISHED
2026-03-11
RELEVANCE
8/10
AUTHOR
DIY Smart Code