OPEN_SOURCE
YT · YOUTUBE // 3h ago // BENCHMARK RESULT
Claude Code benchmark shows build-first defaults
Amplifying ran Claude Code across 2,430 open-ended prompts in real repos and found a strong build-first bias: custom/DIY beat any single tool in many categories. When it does pick a stack, the defaults skew heavily toward GitHub Actions, Stripe, shadcn/ui, Vercel, PostgreSQL, and Zustand.
// ANALYSIS
This reads less like a product comparison and more like a map of what Claude Code considers “standard” software. For vendors, that’s the uncomfortable part: the agent’s defaults can shape what gets shipped before a human even looks at alternatives.
- Custom/DIY wins in 12 of 20 categories, especially feature flags, auth, caching, and observability, which means the model often prefers to synthesize a solution instead of naming a vendor
- The strongest defaults cluster around the modern JS app stack, with GitHub Actions, Vercel, PostgreSQL, shadcn/ui, Stripe, and Zustand repeatedly surfacing as the path of least resistance
- The study measures revealed preference, not quality or market share, so "rarely picked" does not mean "bad" or "unpopular"
- The practical takeaway is distribution power: if an AI agent is choosing the stack, documentation, examples, and training prevalence matter almost as much as product features
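The tallying step behind a revealed-preference study like this can be sketched in a few lines: record which tool the agent picks per category across many prompts, then report each pick's share. The categories, tool names, and numbers below are illustrative stand-ins, not Amplifying's data or methodology.

```python
from collections import Counter

def revealed_preference(picks_by_category):
    """Per category, return each tool's share of the agent's picks."""
    shares = {}
    for category, tools in picks_by_category.items():
        counts = Counter(tools)
        total = len(tools)
        shares[category] = {tool: n / total for tool, n in counts.items()}
    return shares

# Hypothetical picks logged from open-ended prompts (illustrative only).
picks = {
    "ci": ["github-actions", "github-actions", "custom"],
    "feature-flags": ["custom", "custom", "launchdarkly"],
}

shares = revealed_preference(picks)
# e.g. shares["ci"]["github-actions"] is 2/3: GitHub Actions "wins" CI,
# while custom/DIY wins feature flags, mirroring the study's framing.
```

Note that a share computed this way says nothing about tool quality, only about what the agent reaches for first, which is exactly the caveat in the bullet above.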
// TAGS
claude-code · benchmark · ai-coding · agent · cli · research
DISCOVERED
3h ago
2026-04-29
PUBLISHED
3h ago
2026-04-29
RELEVANCE
9 / 10
AUTHOR
Theo - t3.gg