Gemini 3.5 hits Arena AI in stealth leak
Google's Gemini 3.5 Flash and Pro models have appeared on the Arena AI (formerly LMSYS) benchmarking platform under anonymous labels for public A/B testing. Early crowd-sourced data suggests a massive performance leap in coding and multi-step reasoning ahead of Google I/O 2026.
The "leak" follows a classic Google pattern of stealth-testing top-tier models on the Arena to build hype and validate ELO ratings before an official keynote. Stealth checkpoints are reportedly outperforming Claude 4.7 and GPT-5.2 in "vibe coding" and UI generation, with users successfully generating complex, multi-file apps including a Minecraft clone in a single prompt. The May 16 timing aligns with the May 19 Google I/O schedule, while Arena AI's recent rebrand and spin-out from LMSYS cements its role as the industry's de facto kingmaker for LLM launches.
DISCOVERED
1h ago
2026-05-16
PUBLISHED
1h ago
2026-05-16
RELEVANCE
AUTHOR
WorldofAI