OPEN_SOURCE
REDDIT · 17d ago · NEWS
ChatGPT Sandbox Exposes Capability Gap
An engineer's writeup says ChatGPT's code execution sandbox is intact: no escape, no privilege escalation, and outbound access stays constrained in a gVisor-backed Linux container with Jupyter and an internal pip mirror. The bigger problem is that the model keeps denying abilities it can use moments later, making its self-reporting unreliable for agentic workflows.
// ANALYSIS
Less a security break than a trust problem: the sandbox seems sound, but the assistant's description of its own boundaries is flaky enough to mislead users and automation. For agentic systems, that means the runtime can be safe while the UX silently fails.
- No sandbox escape was found, so the container boundary appears to be doing the real security work
- The repeated flip between refusal and execution makes capability introspection too unstable to treat as truth
- The environment looks capability-rich but constrained: `pip` works, `apt` and broad egress do not
- OpenAI support reportedly saying this is by design suggests the fix is better capability disclosure, not just tighter isolation
- The "prove it" prompting pattern is a warning sign for product UX and eval design, because it exposes policy variance instead of stable capability state
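The bullets above argue for testing the runtime directly rather than trusting the model's self-report. A minimal sketch of such a probe, in Python: it checks whether `pip` and `apt` actually run and whether an outbound TCP connection succeeds. The function name and the choice of `pypi.org` as an egress target are illustrative assumptions, not anything from the writeup.

```python
import shutil
import socket
import subprocess

def probe_capabilities(host="pypi.org", port=443, timeout=3):
    """Empirically probe a sandbox instead of asking the model what it
    can do. Hypothetical helper; host/port are illustrative defaults."""
    caps = {}
    # Package managers: check PATH, then make a harmless --version call.
    for tool in ("pip", "apt"):
        if shutil.which(tool) is None:
            caps[tool] = False
            continue
        try:
            subprocess.run([tool, "--version"], capture_output=True,
                           timeout=timeout, check=True)
            caps[tool] = True
        except (subprocess.SubprocessError, OSError):
            caps[tool] = False
    # Broad egress: a plain TCP connect. In the described sandbox this
    # would fail for most hosts while an internal pip mirror still works.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            caps["egress"] = True
    except OSError:
        caps["egress"] = False
    return caps
```

Running a probe like this at session start gives an agentic workflow a ground-truth capability map, sidestepping the refusal/execution flip-flopping the post describes.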
// TAGS
chatgpt · llm · agent · safety · computer-use
DISCOVERED
17d ago
2026-03-26
PUBLISHED
17d ago
2026-03-25
RELEVANCE
8/10
AUTHOR
Hungrybunnytail