OPEN_SOURCE
YT · YOUTUBE // 3h ago // BENCHMARK RESULT
Claude Code benchmark shows build-first defaults
Amplifying ran Claude Code across 2,430 open-ended prompts in real repos and found a strong build-first bias: custom/DIY beat any single tool in many categories. When it does pick a stack, the defaults skew heavily toward GitHub Actions, Stripe, shadcn/ui, Vercel, PostgreSQL, and Zustand.
// ANALYSIS
This reads less like a product comparison and more like a map of what Claude Code considers “standard” software. For vendors, that’s the uncomfortable part: the agent’s defaults can shape what gets shipped before a human even looks at alternatives.
- Custom/DIY wins in 12 of 20 categories, especially feature flags, auth, caching, and observability, which means the model often prefers to synthesize a solution instead of naming a vendor
- The strongest defaults cluster around the modern JS app stack, with GitHub Actions, Vercel, PostgreSQL, shadcn/ui, Stripe, and Zustand repeatedly surfacing as the path of least resistance
- The study measures revealed preference, not quality or market share, so "rarely picked" does not mean "bad" or "unpopular"
- The practical takeaway is distribution power: if an AI agent is choosing the stack, documentation, examples, and training prevalence matter almost as much as product features
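The tallying step behind a revealed-preference study like this can be sketched in a few lines: record which tool the agent picks per category across many prompts, then report each pick's share. The categories, tool names, and numbers below are illustrative stand-ins, not Amplifying's data or methodology.

```python
from collections import Counter

def revealed_preference(picks_by_category):
    """Per category, return each tool's share of the agent's picks."""
    shares = {}
    for category, tools in picks_by_category.items():
        counts = Counter(tools)
        total = len(tools)
        shares[category] = {tool: n / total for tool, n in counts.items()}
    return shares

# Hypothetical picks logged from open-ended prompts (illustrative only).
picks = {
    "ci": ["github-actions", "github-actions", "custom"],
    "feature-flags": ["custom", "custom", "launchdarkly"],
}

shares = revealed_preference(picks)
# e.g. shares["ci"]["github-actions"] is 2/3: GitHub Actions "wins" CI,
# while custom/DIY wins feature flags, mirroring the study's framing.
```

Note that a share computed this way says nothing about tool quality, only about what the agent reaches for first, which is exactly the caveat in the bullet above.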
// TAGS
claude-code · benchmark · ai-coding · agent · cli · research
DISCOVERED
3h ago
2026-04-29
PUBLISHED
3h ago
2026-04-29
RELEVANCE
9 / 10
AUTHOR
Theo - t3.gg