REDDIT · 32d ago · SECURITY INCIDENT

GPT-5.4 regresses on prompt injection

OpenAI's GPT-5.4 safety materials show improved resistance to prompt injection through email connectors but a slight regression on attacks targeting function calls. Third-party testing from AgentSeal also ranks GPT-5.4 near the top overall, yet gives it only 50% injection resistance, reinforcing that prompt injection remains a live weakness in agent-style workflows.

// ANALYSIS

The story here is not that GPT-5.4 is uniquely unsafe — it's that frontier models still crack where tools, connectors, and external content can smuggle instructions back into the loop. Better reasoning helps, but it does not solve agent security by itself.

  • OpenAI's own system card explicitly says GPT-5.4 improved on email-connector prompt injection and regressed slightly on function-call attacks
  • AgentSeal ranks GPT-5.4 second overall with an 87.5 trust score, but its injection-resistance slice is only 50%, far below its perfect extraction and boundary scores
  • Agent builders should treat tool output, MCP responses, spreadsheets, and retrieved content as untrusted input rather than model-readable truth
  • The practical takeaway for developers is still the same: sanitize tool outputs, scope permissions tightly, and require confirmation before high-impact actions
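The "treat tool output as untrusted" advice above can be sketched in code. This is a minimal illustrative pattern, not a real defense: the pattern list, tool names, and function names are all hypothetical, and production systems need far broader detection plus architectural isolation, not just regexes.

```python
import re

# Illustrative patterns that often signal injected instructions in
# tool output. A real filter needs much broader coverage than this.
SUSPECT_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.I),
    re.compile(r"you are now", re.I),
    re.compile(r"reveal.*system prompt", re.I),
]

# Hypothetical tool names for tools whose effects are hard to undo.
HIGH_IMPACT_TOOLS = {"send_email", "delete_file", "execute_payment"}


def sanitize_tool_output(text: str) -> tuple[str, bool]:
    """Wrap tool output as untrusted data and flag likely injections.

    Wrapping the text in explicit delimiters makes it easier to tell
    the model "this is data, not instructions"; the flag lets the
    caller tighten confirmation requirements downstream.
    """
    flagged = any(p.search(text) for p in SUSPECT_PATTERNS)
    wrapped = f"<untrusted_tool_output>\n{text}\n</untrusted_tool_output>"
    return wrapped, flagged


def requires_confirmation(tool_name: str, flagged_context: bool) -> bool:
    """High-impact tools always need user confirmation; every tool
    does when the surrounding context was flagged as suspicious."""
    return tool_name in HIGH_IMPACT_TOOLS or flagged_context
```

The key design choice is that detection only escalates friction (confirmation) rather than silently dropping content, so a false positive costs a click, not data.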
// TAGS
gpt-5-4 · llm · safety · agent · api

DISCOVERED

32d ago

2026-03-10

PUBLISHED

36d ago

2026-03-07

RELEVANCE

9/10

AUTHOR

MeetReady6307