GLM-5.1 tops SWE-Bench Pro
Z.ai says GLM-5.1 hit 58.4 on SWE-Bench Pro, edging out GPT-5.4 (57.7), Opus 4.6 (57.3), and Gemini 3.1 Pro (54.2). It's a notable agentic-coding signal for a model family that has been rapidly closing the gap with the frontier.
This is a real win, but a narrow one. SWE-Bench Pro is more meaningful than toy coding tests because it stresses end-to-end, repo-level issue fixing, exactly the workload that matters for coding agents, and beating Opus 4.6, GPT-5.4, and Gemini 3.1 Pro there puts GLM-5.1 in the same conversation as the frontier. Still, the margin over the top proprietary models is slim enough to treat this as a checkpoint, not a coronation; the next question is whether the lead holds up across other agentic evals. If Z.ai can pair this with low cost, it becomes a serious pressure point on closed-model pricing for coding workflows, and a marker of category convergence: open-weight models can now flip leadership on specific tasks.
DISCOVERED: 2026-04-07
PUBLISHED: 2026-04-07
AUTHOR: Able-Necessary-6048