OpenAI drops GPT-5.3-Codex for autonomous technical agency

// 117d ago MODEL RELEASE

OpenAI has released GPT-5.3-Codex, a frontier model optimized for autonomous technical agents and infrastructure management. It scores a record 57% on SWE-bench Pro and 64.7% on OSWorld-Verified, nearly doubling previous results for real-world computer operation.

// ANALYSIS

OpenAI is pivoting from code assistance to full-spectrum technical agency, marking the first time a model has been instrumental in its own creation and deployment.

– Reaches 57% on SWE-bench Pro and 64.7% on OSWorld-Verified, nearly doubling previous results for real-world computer operation.
– The new Codex-Spark variant enables real-time coding and instant infrastructure scaling via the Frontier operating system.
– Classified as "High capability" for cybersecurity, signaling a major leap in reliability for autonomous vulnerability research and patching.
– Competes directly with Anthropic’s Claude 4.6 in the "Coding Agent War" for enterprise engineering dominance.

// TAGS

openai, gpt-5-3-codex, llm, ai-coding, agent, reasoning, benchmark, infrastructure

DISCOVERED

117d ago

2026-03-17

PUBLISHED

117d ago

2026-03-17

RELEVANCE

10/10

AUTHOR

Ben Davis

// KEEP READING

More AI developer news from the feed

UPDATE 9m ago

OpenAI restores ChatGPT on WhatsApp in EEA

OpenAI has restored ChatGPT access on WhatsApp for users in the European Economic Area (EEA) via a verified contact number. Users can interact with the AI assistant in multiple languages, send voice notes, upload images, and generate new media directly within the chat.

BENCHMARK 43m ago

Grok 4.5 tops SWE-Atlas-QnA benchmark

xAI's frontier AI model, Grok 4.5, has achieved the top ranking on Scale AI's SWE-Atlas-QnA benchmark. While individual benchmark supremacy is often short-lived, the result highlights the rapid, iterative pace of top-tier AI models pushing each other forward in complex, codebase-level question answering and developer agent capabilities.

OPEN SOURCE 1h ago

Win11Debloat declutters Windows 10 and 11

Win11Debloat is a lightweight, customizable PowerShell script to declutter, optimize, and customize Windows 10 and 11. It allows users to remove pre-installed bloatware apps, disable telemetry, adjust privacy settings, and tweak user interface elements through an interactive menu or command-line arguments.