Bayesian paper maps automation failure risk
This paper proposes a Bayesian framework for estimating how failures in highly automated AI systems propagate into real-world harm, separating model failure probability from execution, oversight, and harm severity. Instead of treating accuracy as the whole story, it focuses on the operational controls teams need when deploying agentic systems into high-stakes workflows.
This is the kind of AI safety research that matters for production teams: less benchmark theater, more math for deciding when automation turns a bad model output into an expensive incident.
- The core decomposition breaks risk into failure likelihood, harm propagation probability at a given automation level, and expected harm severity
- Its main contribution is shifting attention from model quality alone to execution controls and oversight, which is where many agent failures become real business damage
- The Knight Capital blowup is used as a case study, grounding the paper in a failure mode operators and governance teams already understand
- The framework is aimed at deployment policy and resource allocation, making it more useful for AI ops and governance than for model builders chasing leaderboard gains
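
To make the three-part decomposition concrete, here is a minimal sketch, assuming the terms combine multiplicatively; this summary does not state the paper's exact formula. The Beta-Binomial update for the failure rate, every class and function name, and all numbers are illustrative assumptions, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskInputs:
    """Hypothetical container for the three terms of the decomposition."""
    p_failure: float             # P(model produces a failed output)
    p_harm_given_failure: float  # P(failure propagates to harm) at a given automation level
    expected_severity: float     # expected cost of one harm event, in arbitrary units


def beta_posterior_mean(failures: int, trials: int,
                        prior_alpha: float = 1.0, prior_beta: float = 1.0) -> float:
    """Posterior mean failure rate under a Beta prior and Binomial observations.

    Assumption: a conjugate Beta-Binomial model stands in for whatever
    Bayesian estimator the paper actually uses.
    """
    if trials < 0 or not 0 <= failures <= trials:
        raise ValueError("need 0 <= failures <= trials")
    return (prior_alpha + failures) / (prior_alpha + prior_beta + trials)


def expected_harm(r: RiskInputs) -> float:
    """Expected harm per action, assuming the three terms simply multiply."""
    for p in (r.p_failure, r.p_harm_given_failure):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    if r.expected_severity < 0:
        raise ValueError("severity must be non-negative")
    return r.p_failure * r.p_harm_given_failure * r.expected_severity


# Illustrative numbers only: 2 failures observed in 98 trials, uniform prior.
p_fail = beta_posterior_mean(failures=2, trials=98)

# Same model, two hypothetical automation levels with different propagation odds.
human_in_loop = RiskInputs(p_fail, p_harm_given_failure=0.1, expected_severity=1_000_000)
fully_automated = RiskInputs(p_fail, p_harm_given_failure=0.9, expected_severity=1_000_000)

print(f"posterior failure rate: {p_fail:.3f}")
print(f"expected harm, human in loop:   {expected_harm(human_in_loop):,.0f}")
print(f"expected harm, fully automated: {expected_harm(fully_automated):,.0f}")
```

With the model's failure rate held fixed, the expected harm in this toy setup differs roughly ninefold between the two automation levels, which illustrates the summary's point that execution controls and oversight can matter as much as model quality.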
DISCOVERED: 2026-03-06
PUBLISHED: 2026-03-06
AUTHOR: Discover AI