OPEN_SOURCE ↗
REDDIT · REDDIT// 32d agoSECURITY INCIDENT
GPT-5.4 regresses on prompt injection
OpenAI's GPT-5.4 safety materials show improved resistance to prompt injection through email connectors but a slight regression on attacks targeting function cells. Third-party testing from AgentSeal also ranks GPT-5.4 near the top overall, yet gives it only 50% injection resistance, reinforcing that prompt injection remains a live weakness in agent-style workflows.
// ANALYSIS
The story here is not that GPT-5.4 is uniquely unsafe — it's that frontier models still crack where tools, connectors, and external content can smuggle instructions back into the loop. Better reasoning helps, but it does not solve agent security by itself.
- –OpenAI's own system card explicitly says GPT-5.4 improved on email-connector prompt injection and regressed slightly on function-cell attacks
- –AgentSeal ranks GPT-5.4 second overall with an 87.5 trust score, but its injection-resistance slice is only 50%, far below its perfect extraction and boundary scores
- –Agent builders should treat tool output, MCP responses, spreadsheets, and retrieved content as untrusted input rather than model-readable truth
- –The practical takeaway for developers is still the same: sanitize tool outputs, scope permissions tightly, and require confirmation before high-impact actions
// TAGS
gpt-5-4llmsafetyagentapi
DISCOVERED
32d ago
2026-03-10
PUBLISHED
36d ago
2026-03-07
RELEVANCE
9/ 10
AUTHOR
MeetReady6307