BACK_TO_FEEDAICRIER_2
Codex Runs 21 Hours, Ships Feature
OPEN_SOURCE ↗
REDDIT · REDDIT// 11d agoNEWS

Codex Runs 21 Hours, Ships Feature

A Reddit user claims Codex stayed autonomous for 21 hours while building, testing, and merging a cross-domain feature under an O.M.A.R. gate. It reads less like a benchmark and more like a workflow case study: long agent runs only work when failures get bounced back into the loop instead of handed to a human.

// ANALYSIS

The real signal here is not the duration, it’s the system design. If an agent can keep going through repeated rejections, re-evaluate, and eventually land clean code, that’s a stronger story than raw speed.

  • This is anecdotal, not independently verified, but it suggests Codex can sustain unusually long unattended sessions when the task loop is engineered well.
  • The O.M.A.R. gate is the interesting part: deterministic checks turn a fragile agent into a self-correcting pipeline.
  • The post reinforces a broader pattern in agentic coding: durability comes from guardrails, not from hoping the model “just knows” the right answer.
  • For teams, the practical lesson is that autonomous coding is a systems problem as much as a model problem.
  • The Claude Code comparison is mostly rhetorical, but it highlights the current competitive axis: not just code quality, but how long an agent can operate safely without intervention.
// TAGS
codexclaude-codeagentclitestingautomation

DISCOVERED

11d ago

2026-03-31

PUBLISHED

11d ago

2026-03-31

RELEVANCE

9/ 10

AUTHOR

Adept-Minute-2663