Anthropic details long-running agent harness

// 103d ago · TUTORIAL

Anthropic lays out a two-stage harness for keeping coding agents productive across fresh context windows: an initializer scaffolds the repo, progress log, and test harness, then a coding agent advances one feature at a time with clean handoffs. The core idea is that durable agent work needs session structure, not just a stronger model.
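The initializer stage described above can be sketched roughly as follows. This is a minimal illustration, not Anthropic's implementation: the file names `features.json` and `progress.md`, the `passes` flag, and the function `initialize_workspace` are all assumptions made for the example.

```python
import json
from pathlib import Path

# Hypothetical file names; the summary above does not specify them.
FEATURES_FILE = "features.json"
PROGRESS_FILE = "progress.md"


def initialize_workspace(root: Path, feature_names: list[str]) -> None:
    """Scaffold the files that later sessions read to recover project state."""
    root.mkdir(parents=True, exist_ok=True)

    # Every feature starts as not passing; a coding session flips one at a time.
    features = [{"name": name, "passes": False} for name in feature_names]
    (root / FEATURES_FILE).write_text(json.dumps(features, indent=2))

    # The progress log is the handoff note that a fresh context window reads first.
    (root / PROGRESS_FILE).write_text(
        "# Progress log\n\n- initializer: scaffolded workspace\n"
    )
```

Running this once before any coding session gives every later session a known starting point on disk, so state does not have to be reconstructed from the conversation.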

// ANALYSIS

Anthropic is treating agent reliability as a systems problem, not a prompting problem.

– The initializer-agent pattern front-loads structure so later sessions inherit a usable workspace instead of guessing at project state
– The progress file plus git history gives the next agent a durable memory layer, which is exactly what long-running workflows have been missing
– Forcing one-feature-at-a-time execution reduces the classic failure mode where agents try to one-shot an entire app and then strand the repo
– The testing guidance matters as much as the scaffolding: end-to-end verification is what separates “looks done” from actually shippable
– The pattern is tool-agnostic enough that teams can adapt it beyond Claude, which makes it more interesting as infrastructure than as a one-off Anthropic demo
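The one-feature-at-a-time handoff in the points above might look like the sketch below. It is an assumption-laden illustration: `Workspace`, `run_session`, and the `implement` and `verify` callbacks are hypothetical names, and the in-memory state stands in for the progress file and git history a real harness would read from disk.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Workspace:
    """In-memory stand-in for the repo state a fresh session would load."""
    features: dict[str, bool]  # feature name -> passes its end-to-end check
    progress_log: list[str] = field(default_factory=list)


def run_session(
    ws: Workspace,
    implement: Callable[[str], None],  # hypothetical: the agent's coding step
    verify: Callable[[str], bool],     # hypothetical: the end-to-end test
) -> Optional[str]:
    """Advance at most one feature, then leave a note for the next session."""
    pending = [name for name, passes in ws.features.items() if not passes]
    if not pending:
        return None  # nothing left to do

    feature = pending[0]  # exactly one feature per session
    implement(feature)

    # Only a passing end-to-end check marks the feature as done.
    if verify(feature):
        ws.features[feature] = True
        ws.progress_log.append(f"done: {feature}")
    else:
        ws.progress_log.append(f"blocked: {feature} failed end-to-end check")
    return feature
```

Because each call touches a single feature and always appends a log entry, the next session can tell what was attempted and whether it passed without relying on the previous context window.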

// TAGS

anthropic, agent, ai-coding, sdk, automation, testing, effective-harnesses-for-long-running-agents

DISCOVERED

103d ago

2026-03-31

PUBLISHED

103d ago

2026-03-31

RELEVANCE

8/10

AUTHOR

Cole Medin
