// INFRASTRUCTURE
OpenAI WebSocket mode cuts agent-loop latency 40%
OpenAI introduced WebSocket mode for the Responses API to keep conversation state warm across tool-call loops and cut repeated API overhead. The company says the feature can make agentic workflows up to 40% faster, with Codex and early partners like Cursor, Vercel, and Cline already seeing latency gains.
// ANALYSIS
Hot take: this is the kind of infrastructure upgrade that only becomes obvious once the model itself stops being the slow part.
- The real win is not WebSockets as a transport; it is state reuse and less repeated work across tool-call loops.
- This targets a structural latency problem in agentic systems, so the payoff compounds on long, multi-step workflows.
- It is especially relevant for coding agents, browser automation, and orchestration layers where repeated round trips dominate perceived speed.
- The launch is also a signal that OpenAI expects fast inference to keep forcing the surrounding API stack to get leaner.
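Why the payoff compounds on long workflows can be shown with a back-of-the-envelope model. All numbers below are illustrative assumptions, not OpenAI's measurements: if every tool-call round trip pays a fixed connection/auth overhead, a persistent socket amortizes that cost once across the whole loop instead of paying it per call.

```python
# Toy latency model for an agent loop. The overhead and inference
# figures are hypothetical; only the structure of the comparison matters.

def loop_latency_ms(steps: int, inference_ms: int, overhead_ms: int,
                    persistent: bool) -> int:
    """Total latency for an agent loop of `steps` tool calls.

    Per-request mode pays connection/TLS/auth overhead on every call;
    persistent (WebSocket-style) mode pays it once up front.
    """
    if persistent:
        return overhead_ms + steps * inference_ms
    return steps * (overhead_ms + inference_ms)

# A 20-step loop with 300 ms inference and 200 ms per-request overhead:
http_total = loop_latency_ms(20, 300, 200, persistent=False)  # 10000 ms
ws_total = loop_latency_ms(20, 300, 200, persistent=True)     # 6200 ms
savings = 1 - ws_total / http_total                           # 0.38 (38% faster)
```

Under these assumed numbers the saving lands near the quoted "up to 40%", and it grows as per-call overhead rises relative to inference time, which is exactly the regime fast inference pushes the stack toward.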
// TAGS
websockets · responses api · openai · codex · agents · latency · devtool
DISCOVERED
2026-04-29
PUBLISHED
2026-04-29
RELEVANCE
10/10
AUTHOR
OpenAIDevs