Claude Code soft_deny policy hits human review gap
Hedgineer's enterprise rollout of Claude Code reveals that natural language 'soft_deny' rules, while more than doubling automated rejections, still miss many risky bash commands that developers end up vetoing manually. The findings highlight a persistent gap between AI intent classification and human risk assessment in autonomous coding environments.
Automation is only as good as its telemetry, and currently, Claude's 'soft_deny' is a blunt instrument that misses subtle context.
- Soft_deny rules are bypassable by explicit user intent, making them "negotiable" guardrails rather than hard blocks
- Classifier-driven rejections jumped 123% post-policy, yet Bash remains the top tool rejected by humans in the loop
- Current OTEL spans don't distinguish between hard, soft, and permission denials, making it impossible to surgically tune rules
- The "trap" of omitting "$defaults" in config can inadvertently allow dangerous operations like force pushes
- Enterprise safety relies on identifying "bad vibes" in telemetry and encoding them back into natural language policy
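
The "$defaults" trap in the list above can be illustrated with a small sketch. It assumes that a custom rule list replaces the built-in rules unless it contains the literal "$defaults" entry, which is how the summary describes the pitfall. The function name `resolve_rules`, the helper `covers_force_push`, and the rule strings are all hypothetical and are not taken from Claude Code's actual configuration schema or default rule set.

```python
# Hypothetical sketch: the names and rule text below are illustrative
# assumptions, not Claude Code's real schema or built-in rules.
DEFAULTS_TOKEN = "$defaults"

# Stand-in for the built-in soft_deny rules (assumed content).
BUILTIN_SOFT_DENY = [
    "Do not force push to shared branches",
    "Do not delete files outside the working directory",
]


def resolve_rules(user_rules, builtin=BUILTIN_SOFT_DENY):
    """Expand "$defaults" in place.

    If the token is absent, the user's list fully replaces the built-ins,
    which is the trap: the default protections silently disappear.
    """
    resolved = []
    for rule in user_rules:
        if rule == DEFAULTS_TOKEN:
            resolved.extend(builtin)
        else:
            resolved.append(rule)
    return resolved


def covers_force_push(rules):
    """Crude keyword check, for illustration only."""
    return any("force push" in rule.lower() for rule in rules)


custom = ["Do not run database migrations against production"]

without_defaults = resolve_rules(custom)
with_defaults = resolve_rules([DEFAULTS_TOKEN] + custom)

print(covers_force_push(without_defaults))  # False
print(covers_force_push(with_defaults))     # True
```

Under this assumed behavior, adding a single custom rule without the token drops the force-push protection, while listing the token first keeps the built-ins and appends the custom rule.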
DISCOVERED
2026-05-30
PUBLISHED
2026-05-30
AUTHOR
dani_avila7