Claude Fable 5 Refuses All ProgramBench Tasks
Anthropic's Claude Fable 5 model refused all 200 tasks in the ProgramBench coding benchmark, a 100% refusal rate. Its strict cyber-safety guardrails flagged the program reconstruction tasks as security risks and blocked them, even though the model performs strongly on general coding benchmarks such as SWE-bench Pro.
When safety guardrails are so sensitive that a model refuses harmless benchmarking tasks, safety has come at the direct expense of utility.
- Anthropic's protective guardrails have over-indexed on security-risk detection, producing false positives on binary manipulation and program reconstruction tasks.
- This highlights a growing tension in AI development between achieving state-of-the-art programming capabilities and maintaining strict alignment filters.
- For security-adjacent developers, Fable 5 is a step backward in usability unless Anthropic provides configuration toggles or API parameters to dial down safety sensitivity.
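The headline figure is a simple ratio: tasks refused divided by tasks attempted. As a minimal sketch of how a harness might tally it, the snippet below assumes a hypothetical `TaskResult` record with made-up outcome labels ("refused", "passed", "failed"); none of these names come from ProgramBench itself, and the data is a placeholder shaped like the reported result, not real benchmark output.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    # Hypothetical record type; not part of any real benchmark harness.
    task_id: str
    outcome: str  # assumed labels: "refused", "passed", "failed"


def refusal_rate(results: list[TaskResult]) -> float:
    """Return the fraction of tasks the model declined to attempt."""
    if not results:
        raise ValueError("no results to score")
    counts = Counter(r.outcome for r in results)
    return counts["refused"] / len(results)


# Placeholder data shaped like the reported outcome: 200 tasks, all refused.
results = [TaskResult(f"task-{i:03d}", "refused") for i in range(200)]
print(f"{refusal_rate(results):.0%}")  # prints "100%"
```

Counting refusals separately from failures matters here: a refused task says nothing about whether the model could have solved it, which is why a 100% refusal rate can coexist with strong scores on other coding benchmarks.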
DISCOVERED: 2026-06-12
PUBLISHED: 2026-06-12
AUTHOR: steipete