Gemini 3 Flash drops with Pro-grade reasoning, low latency
Google’s Gemini 3 Flash is a high-speed, cost-effective model designed for low-latency reasoning and multimodal understanding. It features a 1M token context window and a new "dynamic thinking" control for developers to balance speed and depth across agentic workflows.
Gemini 3 Flash challenges the assumption that small models must trade reasoning depth for speed: it scores 90.4% on GPQA, in line with larger models, while running three times faster than previous Pro versions. The new thinking_level parameter and a competitive price of $0.50 per 1M input tokens position it as the primary engine for high-volume agentic workflows and complex tool-use tasks.
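To make the pricing concrete, here is a minimal sketch of the input-side arithmetic at the quoted $0.50 per 1M input tokens. The function names and the example workload (10,000 requests of 2,000 input tokens each) are assumptions for illustration, and output-token pricing is not covered because the summary does not state it.

```python
# Rough input-cost estimate at the quoted rate of $0.50 per 1M input tokens.
# Function names and the example workload are illustrative assumptions.

INPUT_PRICE_PER_MILLION_USD = 0.50  # rate quoted in the summary above
TOKENS_PER_MILLION = 1_000_000


def input_cost_usd(input_tokens: int) -> float:
    """Return the input-token cost in USD for a single request."""
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens * INPUT_PRICE_PER_MILLION_USD / TOKENS_PER_MILLION


def workload_input_cost_usd(requests: int, tokens_per_request: int) -> float:
    """Return the total input cost in USD for many same-sized requests."""
    if requests < 0:
        raise ValueError("requests must be non-negative")
    return input_cost_usd(requests * tokens_per_request)


if __name__ == "__main__":
    # One request that fills the full 1M-token context window.
    print(f"Full context window: ${input_cost_usd(1_000_000):.2f}")
    # Hypothetical agentic workload: 10,000 requests of 2,000 tokens each.
    print(f"10k x 2k tokens: ${workload_input_cost_usd(10_000, 2_000):.2f}")
```

Under these assumptions, filling the entire 1M-token context once costs $0.50 in input tokens, and the 20M-token example workload costs $10.00 before any output or thinking tokens are counted.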
DISCOVERED
20d ago
2026-03-22
PUBLISHED
20d ago
2026-03-22
AUTHOR
Income stream surfers