GPT-5.5 Jumps to NYT Connections No. 2

// 45d ago · BENCHMARK RESULT

GPT-5.5 posts a clear gain on the Extended NYT Connections benchmark, with xhigh reasoning rising from 94.0 to 97.5 and moving it ahead of Claude Opus 4.6. Gemini 3.1 Pro Preview still leads, while Kimi K2.6 becomes the top open-weights model.

// ANALYSIS

This is a real benchmark win for GPT-5.5, but the more interesting signal is how crowded the frontier has become: stepping GPT-5.5 down from xhigh to medium reasoning costs 2.5 points, more than its 0.9-point gap to the leader, and open weights are now close enough to matter operationally.

  • GPT-5.5 xhigh goes from 94.0 to 97.5, high from 93.6 to 96.9, medium from 92.0 to 95.0, and no reasoning from 32.8 to 37.5.
  • Gemini 3.1 Pro Preview remains #1 at 98.4, so GPT-5.5 is strong but not a new leader.
  • Kimi K2.6 at 91.4 is the standout open-weights result, ahead of Kimi K2.5 at 78.3 and well above DeepSeek V3.2 at 50.2.
  • Opus 4.7 looks weaker on this benchmark than Opus 4.6, especially with the high-reasoning refusal rate noted in the source thread.
  • This benchmark is a narrow reasoning test, so I would treat it as a model-selection signal, not a general verdict on coding or agent quality.
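
The point gains and gaps implied by the figures above can be checked with a few lines of arithmetic. This is only a sketch over the scores quoted in this item; the dictionary layout and the helper names (`gains`, `gap`) are illustrative assumptions, not part of the benchmark's own tooling.

```python
# Scores exactly as quoted in the bullets above (Extended NYT Connections).
# Each tuple is (earlier score, new score) for one GPT-5.5 reasoning level.
gpt_scores = {
    "xhigh": (94.0, 97.5),
    "high": (93.6, 96.9),
    "medium": (92.0, 95.0),
    "none": (32.8, 37.5),
}

# Other models' scores quoted in this item.
other_scores = {
    "Gemini 3.1 Pro Preview": 98.4,
    "Kimi K2.6": 91.4,
    "Kimi K2.5": 78.3,
    "DeepSeek V3.2": 50.2,
}


def gains(scores):
    """Point gain per reasoning level, rounded to one decimal place."""
    return {
        level: round(after - before, 1)
        for level, (before, after) in scores.items()
    }


def gap(a, b):
    """Difference a - b in points, rounded to one decimal place."""
    return round(a - b, 1)


gpt_gains = gains(gpt_scores)

# How far GPT-5.5 xhigh still trails the leader.
gap_to_leader = gap(other_scores["Gemini 3.1 Pro Preview"], gpt_scores["xhigh"][1])

# Cost of stepping GPT-5.5 down one reasoning level at a time.
step_costs = {
    "xhigh->high": gap(gpt_scores["xhigh"][1], gpt_scores["high"][1]),
    "high->medium": gap(gpt_scores["high"][1], gpt_scores["medium"][1]),
    "medium->none": gap(gpt_scores["medium"][1], gpt_scores["none"][1]),
}

# Open-weights comparisons.
kimi_over_k25 = gap(other_scores["Kimi K2.6"], other_scores["Kimi K2.5"])
kimi_over_deepseek = gap(other_scores["Kimi K2.6"], other_scores["DeepSeek V3.2"])

if __name__ == "__main__":
    print("GPT-5.5 gains:", gpt_gains)
    print("Gap to leader:", gap_to_leader)
    print("Step costs:", step_costs)
    print("Kimi K2.6 over K2.5:", kimi_over_k25)
    print("Kimi K2.6 over DeepSeek V3.2:", kimi_over_deepseek)
```

On these numbers the gains with reasoning enabled fall between 3.0 and 3.5 points, while the drop from medium to no reasoning is far larger than any step between reasoning levels.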
// TAGS
gpt-5.5, kimi-k2.6, benchmark, reasoning, open-weights, llm

DISCOVERED

45d ago

2026-04-27

PUBLISHED

45d ago

2026-04-27

RELEVANCE

9/10

AUTHOR

zero0_one1