Anthropic Harness Post Spurs Setup Questions
Anthropic’s latest engineering post describes a three-agent harness for long-running coding tasks: planner, generator, evaluator. The Reddit thread asks how to recreate that setup locally, with tools like llama.cpp or oMLX for serving models and agent shells like OpenCode.
The interesting part is not the model stack but the control loop: you’re turning vague intent into a spec, then code, then a measurable critique. That’s the piece most DIY agent setups miss.
- The planner is doing scope control, which matters more than raw model quality when a task runs for hours and drifts without structure.
- The evaluator is the load-bearing role: it turns subjective output into graded criteria plus live testing, which is what makes iteration productive instead of random.
- Local backends like llama.cpp or oMLX can work fine as infrastructure, but the real bottleneck is usually prompt design, rubric quality, and state handoff between agents.
- OpenCode is closer to a practical orchestration layer than a model server; the harness still needs explicit contracts, file-based handoffs, and a retry/iteration policy.
- The article reinforces a broader point for agent builders: reproducibility lives in the methods, not just in the model choice.
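To make the control loop concrete, here is a minimal sketch of a planner/generator/evaluator cycle with file-based handoffs and a bounded retry policy. It is an illustration under stated assumptions, not the harness from Anthropic's post: `run_harness`, `Verdict`, the file names (`spec.md`, `attempt_N.py`, `review_N.json`), the JSON rubric shape, and the `0.8` threshold are all hypothetical. Each agent is modeled as a plain prompt-to-reply function, so the stubs below could in principle be swapped for calls to whatever local server you run.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

# Hypothetical contract: every agent is a function from prompt text to reply text.
Agent = Callable[[str], str]


@dataclass
class Verdict:
    scores: dict[str, float]  # rubric criterion -> score in [0, 1]
    feedback: str             # critique handed back to the generator

    def passed(self, threshold: float) -> bool:
        # Every criterion must clear the bar, not just the average.
        return all(score >= threshold for score in self.scores.values())


def run_harness(intent: str, planner: Agent, generator: Agent, evaluator: Agent,
                workdir: Path, max_rounds: int = 3,
                threshold: float = 0.8) -> tuple[bool, int]:
    """Plan once, then generate and evaluate until the rubric passes or rounds run out."""
    workdir.mkdir(parents=True, exist_ok=True)
    spec_path = workdir / "spec.md"
    spec_path.write_text(planner(intent))  # planner fixes the scope up front

    feedback = ""
    for round_no in range(1, max_rounds + 1):
        spec = spec_path.read_text()  # state is re-read from disk each round
        code = generator(f"SPEC:\n{spec}\n\nFEEDBACK:\n{feedback}")
        (workdir / f"attempt_{round_no}.py").write_text(code)

        raw = evaluator(f"SPEC:\n{spec}\n\nCODE:\n{code}")
        (workdir / f"review_{round_no}.json").write_text(raw)
        data = json.loads(raw)
        verdict = Verdict(scores=data["scores"], feedback=data["feedback"])

        if verdict.passed(threshold):
            return True, round_no
        feedback = verdict.feedback  # critique drives the next attempt
    return False, max_rounds


# --- Toy stand-ins for real model calls (assumptions for the demo only) ---

def stub_planner(intent: str) -> str:
    return f"# Spec\n- Goal: {intent}\n- Must define add(a, b)\n"


def stub_generator(prompt: str) -> str:
    # First attempt is deliberately wrong; it is corrected once feedback arrives.
    feedback = prompt.split("FEEDBACK:\n", 1)[1].strip()
    op = "+" if feedback else "-"
    return f"def add(a, b):\n    return a {op} b\n"


def stub_evaluator(prompt: str) -> str:
    # "Live testing" here is just executing the candidate and checking one case.
    code = prompt.split("CODE:\n", 1)[1]
    namespace: dict = {}
    exec(code, namespace)  # toy only; never exec untrusted model output unsandboxed
    ok = namespace["add"](2, 3) == 5
    return json.dumps({
        "scores": {"correctness": 1.0 if ok else 0.0},
        "feedback": "" if ok else "add(2, 3) should return 5",
    })


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        passed, rounds = run_harness("add two numbers", stub_planner,
                                     stub_generator, stub_evaluator, Path(tmp))
        print(f"passed={passed} after {rounds} round(s)")
```

Writing every spec, attempt, and review to disk is what keeps a long run inspectable and resumable, and the per-criterion threshold is one simple way to turn a rubric into a stop condition. A real setup would likely need a richer rubric and sandboxed test execution in place of the stubs.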
DISCOVERED: 2026-04-07
PUBLISHED: 2026-04-07
AUTHOR: LuJieFei