Amp agent modes complete tasks 40% faster
Sourcegraph has optimized its AI coding agent, Amp, in its deep and rush modes, achieving 87% faster first-token arrival and 32% faster overall responses. By adopting long-lived OpenAI WebSockets and durable execution, the system reduces hops between client and GPU to yield up to a 40% end-to-end speedup on long-horizon tasks.
AI coding agents are heavily bottlenecked by API round-trips; by transitioning to WebSockets and durable execution, Amp showcases that network infrastructure is as critical as raw model speed for developer experience.
* Durable execution ensures that long-running agent threads are resilient and maintain state across multi-turn interactions.
* Replacing HTTP SSE with OpenAI WebSockets reduces latency overhead, which accumulates dramatically over long-horizon tasks.
* Eliminating unnecessary hops between the client and GPU highlights the trend of moving agent execution logic closer to the inference engine.
DISCOVERED
2h ago
2026-06-05
PUBLISHED
2h ago
2026-06-05
RELEVANCE
AUTHOR
AmpCode