Code Input benchmark crowns Gemma 4 26B
OPEN_SOURCE · REDDIT · BENCHMARK RESULT


A Reddit post in r/LocalLLaMA compares how several open models handle SVG generation, using a Code Input share link as the reference. The author found that Gemma 4 26B produced the best output among the smaller models, while Llama 4 Maverick and gpt-oss-120b were largely unusable for the task. Mid-tier models like MiniMax M2.7, Qwen3.6 Max, and Kimi K2.6 generated detailed but poorly positioned results, and GLM 5.1 and DeepSeek V4 Pro came closest to being practical.

// ANALYSIS

Hot take: this reads less like a winner-takes-all model ranking and more like a reminder that SVG generation is still a sharp capability filter, even among strong open models.

  • Gemma 4 26B stands out as the most efficient performer here, which matters if you care about quality per parameter rather than raw scale.
  • The “top-tier” pair being only “pretty darn close to usable” suggests the benchmark is catching layout and spatial reasoning, not just syntax.
  • The weak showing from Llama 4 Maverick and gpt-oss-120b implies that SVG code generation is still brittle enough to expose model-specific weaknesses fast.
  • Because the post is based on the models available via OpenRouter, this is a practical availability-constrained comparison, not a complete leaderboard.
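The "layout, not just syntax" point in the bullets above can be made concrete. The sketch below is a hypothetical heuristic (not the benchmark's actual scoring): a model's SVG can be syntactically valid XML yet still fail spatially, with shapes drawn partly or wholly outside the declared viewBox.

```python
# Minimal sketch of a layout-aware SVG check: beyond "does it parse",
# verify that basic shapes actually land inside the declared viewBox.
# Illustrative only; the Reddit post's comparison was done by eye.
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

def shapes_in_viewbox(svg_text: str) -> bool:
    """Return True if every <rect> and <circle> fits inside the viewBox."""
    root = ET.fromstring(svg_text)
    min_x, min_y, width, height = map(float, root.get("viewBox").split())
    max_x, max_y = min_x + width, min_y + height
    for rect in root.iter(f"{SVG_NS}rect"):
        x, y = float(rect.get("x", 0)), float(rect.get("y", 0))
        w, h = float(rect.get("width", 0)), float(rect.get("height", 0))
        if not (min_x <= x and min_y <= y and x + w <= max_x and y + h <= max_y):
            return False
    for circ in root.iter(f"{SVG_NS}circle"):
        cx, cy, r = (float(circ.get(a, 0)) for a in ("cx", "cy", "r"))
        if not (cx - r >= min_x and cy - r >= min_y
                and cx + r <= max_x and cy + r <= max_y):
            return False
    return True

good = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
        '<rect x="10" y="10" width="30" height="30"/></svg>')
bad = ('<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
       '<circle cx="95" cy="50" r="20"/></svg>')
print(shapes_in_viewbox(good))  # True: the rect fits the 100x100 canvas
print(shapes_in_viewbox(bad))   # False: the circle spills past the right edge
```

Both test strings parse cleanly, which is exactly why a syntax-only check would pass them: only the positional check separates the two.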
// TAGS
svg · benchmark · open-models · llm · vector-graphics · code-generation

DISCOVERED: 2026-04-30

PUBLISHED: 2026-04-30

RELEVANCE: 8/10

AUTHOR: omarous