Sycophancy benchmark exposes narrator-bias swings
Lech Mazur’s open benchmark tests whether models keep the same judgment when the exact same dispute is retold from opposite first-person perspectives. In its current 199-case snapshot, Gemini 3.1 Pro Preview posts the lowest headline sycophancy rate, while Grok 4.20 tops the broader consistency table only because it abstains far more often than rivals.
This is a sharper benchmark than the usual “does the model flatter the user?” framing because it isolates narrator bias without changing the underlying facts. The main takeaway is that a low contradiction score says little on its own: a model can avoid contradicting itself simply by declining to answer, so the score has to be read alongside how often the model actually commits to a verdict.
- The design is unusually clean: each dispute gets five controlled views, so the test can separate perspective effects from factual drift
- Gemini 3.1 Pro Preview looks best on headline sycophancy at 0.5%, but the README shows it falls hard on total consistency once contrarian contradictions are counted
- Grok 4.20 Reasoning Exp Beta ranks first on total contradiction at 1.5%, yet its 60.9% insufficient rate makes that result more about caution than robustness
- GPT-5.4 medium reasoning beats GPT-4.1 badly on narrator-following, but it still picks up a noticeable contrarian failure mode
- Mistral Large 3 is the cleanest red flag here: it fails even on stripped-down first-person rewrites, before emotional framing adds extra pressure
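To make the distinction between sycophantic contradictions, contrarian contradictions, and abstentions concrete, here is a minimal sketch of how paired verdicts could be tallied. It is an illustration under stated assumptions, not the benchmark's actual scoring code: the verdict labels (`"A"`, `"B"`, `"INSUFFICIENT"`), the function names, and the choice to divide every rate by the total number of cases are all hypothetical, and the real benchmark uses five views per dispute rather than the two modeled here.

```python
from collections import Counter

# Hypothetical labels: "A" or "B" names the party the model sides with;
# "INSUFFICIENT" means the model declined to pick a side.
INSUFFICIENT = "INSUFFICIENT"


def classify_pair(verdict_when_a_narrates: str, verdict_when_b_narrates: str) -> str:
    """Classify one dispute from the two verdicts given on opposite retellings."""
    pair = (verdict_when_a_narrates, verdict_when_b_narrates)
    if INSUFFICIENT in pair:
        return "abstained"      # no committed judgment on at least one view
    if pair == ("A", "B"):
        return "sycophantic"    # sides with whoever is narrating
    if pair == ("B", "A"):
        return "contrarian"     # sides against whoever is narrating
    return "consistent"         # same party wins in both retellings


def summarize(pairs: list[tuple[str, str]]) -> dict[str, float]:
    """Return rates over all cases (assumed denominator: every case, abstained or not)."""
    if not pairs:
        raise ValueError("need at least one case")
    counts = Counter(classify_pair(a, b) for a, b in pairs)
    n = len(pairs)
    return {
        "sycophantic": counts["sycophantic"] / n,
        "contrarian": counts["contrarian"] / n,
        "total_contradiction": (counts["sycophantic"] + counts["contrarian"]) / n,
        "abstained": counts["abstained"] / n,
        "consistent": counts["consistent"] / n,
    }


# Toy data, invented for illustration only.
toy_cases = [
    ("A", "A"), ("A", "B"), ("B", "A"), (INSUFFICIENT, "A"),
    ("B", "B"), (INSUFFICIENT, INSUFFICIENT), ("A", "A"), ("A", "A"),
]
toy_summary = summarize(toy_cases)

# A model that never commits scores zero contradictions by construction.
always_abstains = summarize([(INSUFFICIENT, INSUFFICIENT)] * 4)
```

In this toy setup the always-abstaining model reaches a total contradiction rate of 0.0 with an abstention rate of 1.0, which is why a contradiction ranking and an insufficient rate need to be read together.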
DISCOVERED
2026-03-11
PUBLISHED
2026-03-10
AUTHOR
zero0_one1