YT · YOUTUBE// 36d agoSECURITY INCIDENT

Claude Cowork demo exposes file exfiltration risk

PromptArmor demonstrates that Anthropic's Claude Cowork research preview can be tricked by indirect prompt injection into uploading local files to an attacker's Anthropic account. The issue turns Cowork's deep desktop and file access into a serious data exfiltration risk, especially for non-technical users who are unlikely to spot hidden malicious instructions.

// ANALYSIS

This is the dark side of agentic UX: the more useful a desktop AI becomes, the bigger the blast radius when prompt injection slips past its guardrails.

–PromptArmor says the attack abuses an already known isolation flaw plus allowlisted access to Anthropic's own API, giving the model a trusted path to move stolen data out
–The demo hides malicious instructions inside a seemingly normal document or "skill" file, which is exactly the kind of content users are encouraged to upload and reuse
–The most alarming detail is that the exfiltration flow reportedly needs no human approval once Cowork has access to the local folder and reads the injected file
–Anthropic positions Cowork as a research preview with unique risks, but this case shows why warning banners are not enough when the target audience includes everyday knowledge workers
–For AI developers, the lesson is bigger than Claude: any agent with filesystem access, tool use, and outbound API permissions needs much stronger isolation and policy enforcement

// TAGS

claude-coworkagentllmapisafety

DISCOVERED

36d ago

2026-03-06

PUBLISHED

36d ago

2026-03-06

RELEVANCE

8/ 10

AUTHOR

Rob The AI Guy