GPT-5.4 pro nears Gemini on ARC-AGI-2
REDDIT · 37d ago · BENCHMARK RESULT


OpenAI's GPT-5.4 pro posted 83.3% on ARC-AGI-2 in ARC Prize reporting shared alongside the model's rollout, putting it within 1.3 points of Gemini 3.1 Pro's 84.6%. ARC Prize also listed base GPT-5.4 at 74.0%, suggesting the pro tier's extra reasoning budget is doing real work on one of the hardest abstraction benchmarks around.

// ANALYSIS

This is less a victory lap than a sign that the frontier reasoning race is compressing into single-digit gaps on benchmarks that still feel meaningfully hard.

  • The headline number matters because ARC-AGI-2 is designed to test abstraction and adaptability, not just polished benchmark memorization.
  • GPT-5.4 pro's 83.3% puts OpenAI back in striking distance of Gemini 3.1 Pro instead of clearly trailing on fluid-reasoning optics.
  • The jump from 74.0% for base GPT-5.4 to 83.3% for pro shows how much performance is now coming from extra reasoning effort, not just the base model.
  • ARC Prize attached a $16.41 per-task figure to GPT-5.4 pro, so this score is impressive but not cheap.
  • Developers should read this as a strong research signal, not a universal winner badge; real coding, agent, and tool-use workloads still matter more than a single benchmark.
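The arithmetic behind those bullets can be sketched in a few lines. The scores and per-task cost come from the ARC Prize figures above; the task count used to estimate a full-run cost is an assumption for illustration (roughly the size of ARC Prize's semi-private eval set), not an official figure.

```python
# Back-of-envelope math on the ARC-AGI-2 numbers reported above.
gemini_31_pro = 84.6   # score (%), per ARC Prize
gpt_54_pro = 83.3
gpt_54_base = 74.0
cost_per_task = 16.41  # USD per task, per ARC Prize

gap_to_gemini = gemini_31_pro - gpt_54_pro   # points behind Gemini 3.1 Pro
pro_uplift = gpt_54_pro - gpt_54_base        # gain from the pro reasoning tier

# ASSUMPTION: ~120 tasks in a full eval run; illustrative only.
num_tasks = 120
full_run_cost = cost_per_task * num_tasks

print(f"gap to Gemini: {gap_to_gemini:.1f} pts")
print(f"pro-tier uplift: {pro_uplift:.1f} pts")
print(f"est. full-run cost: ${full_run_cost:,.2f}")
```

Note that the pro-tier uplift (9.3 points) dwarfs the remaining gap to Gemini (1.3 points), which is the core of the "extra reasoning budget is doing real work" argument, and why the per-task price tag matters when comparing tiers.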
// TAGS
gpt-5-4-pro · llm · reasoning · benchmark · api

DISCOVERED

37d ago

2026-03-06

PUBLISHED

37d ago

2026-03-05

RELEVANCE

10/10

AUTHOR

nsdjoe