Gemini 3 Flash Agents Drop to One Call

// NEWS · 49d ago

The post describes a week-long regression in a Gemini 3 Flash-based agent: the model now appears to default to a single tool call per turn, and the behavior persists even after the author rolled back package upgrades, prompt and config changes, cache state, and GCP key/path changes. The author says this is materially increasing latency and spend, and asks whether others are seeing a similar drop in tool-call density.

// ANALYSIS

Hot take: this looks less like an app-layer bug and more like a model, routing, or serving-side behavior change that is silently affecting agentic tool use.

– The failure mode is specific and costly: one tool call per turn kills the parallelism and batching that make agent workflows viable.
– The author has already ruled out the obvious local causes, which shifts suspicion toward provider behavior, model versioning, or a backend policy change.
– If the regression shows up across multiple stacks, it is a strong signal that “agentic performance” can drift independently of raw benchmark quality.
– The post is useful as an early warning for anyone shipping on Gemini 3 Flash, especially if their product depends on multi-step tool use.
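
For anyone wanting to check their own stack for the same symptom, tool-call density can be tracked from logged turns. The sketch below is a minimal illustration, not the author's method: the log schema (a list of turns, each with a `tool_calls` list) and the function names are assumptions, and the sample data is invented purely to show the calculation.

```python
from statistics import mean


def _tool_call_counts(turns):
    """Tool calls per turn, ignoring turns where the model called no tools."""
    counts = [len(turn.get("tool_calls", [])) for turn in turns]
    return [c for c in counts if c > 0]


def tool_call_density(turns):
    """Average number of tool calls per tool-using turn (0.0 if none)."""
    counts = _tool_call_counts(turns)
    return mean(counts) if counts else 0.0


def single_call_share(turns):
    """Fraction of tool-using turns that issued exactly one tool call."""
    counts = _tool_call_counts(turns)
    if not counts:
        return 0.0
    return sum(1 for c in counts if c == 1) / len(counts)


# Hypothetical logs: each turn records the tool calls the model emitted.
baseline = [
    {"tool_calls": ["search", "fetch", "read"]},
    {"tool_calls": ["search", "fetch"]},
    {"tool_calls": []},  # plain text turn, excluded from the average
    {"tool_calls": ["read", "read", "read", "write"]},
]
current = [
    {"tool_calls": ["search"]},
    {"tool_calls": ["fetch"]},
    {"tool_calls": []},
    {"tool_calls": ["read"]},
]

print("baseline density:", tool_call_density(baseline))
print("current density:", tool_call_density(current))
print("current single-call share:", single_call_share(current))
```

A density that falls toward 1 while the single-call share climbs toward 1.0 would match the pattern the post describes. Comparing the same metric before and after a suspected change date helps separate a serving-side shift from local noise.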

// TAGS

gemini-3-flash, google, agent, tool-use, latency, cost, regression, llm

DISCOVERED

49d ago

2026-05-03

PUBLISHED

49d ago

2026-05-03

RELEVANCE

8/10

AUTHOR

liu8in
