OpenClaw safety study reveals structural agent vulnerabilities

// 95d ago · RESEARCH PAPER

New research evaluates OpenClaw's security, introducing a "CIK" (Capability, Identity, Knowledge) taxonomy for persistent agent state. Poisoning just one dimension of an agent's state can boost attack success rates from 24% to over 64%, even for top-tier models like GPT-5.4.

// ANALYSIS

The paper argues that current agent safety is overly reliant on prompt-level alignment, which fails once an agent's "state" is compromised. We need a deterministic execution-time control layer, not just better monitoring.

– CIK poisoning (Capability, Identity, Knowledge) is a devastatingly effective attack vector for persistent agents.
– Even the strongest models (Claude Opus 4.6, GPT-5.4) see vulnerability increases of 3x+ under state compromise.
– Proposed "proposal -> authorization -> execution" model moves security from probabilistic alignment to deterministic policy.
– Baseline success rates for attacks on OpenClaw are already alarmingly high (~10–37%) even without poisoning.
– File-level protection is too restrictive for practical use, blocking 97% of attacks but also 97% of legitimate updates.
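
The "proposal -> authorization -> execution" flow in the list above can be sketched as a small policy gate. This is a minimal illustration, not the paper's implementation or any OpenClaw API: the names `Proposal`, `ALLOWED`, `authorize`, and `execute`, along with the tool names and paths, are all hypothetical. The point it shows is that the model only emits a proposal as data, and a fixed rule table, not the model's possibly poisoned state, decides whether anything runs.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"


@dataclass(frozen=True)
class Proposal:
    """An action the model wants to take; it is data, not yet an effect."""
    tool: str
    target: str


# Hypothetical static policy: (tool, target prefix) pairs that may execute.
ALLOWED = (
    ("read_file", "workspace/"),
    ("write_file", "workspace/notes/"),
)


def authorize(proposal: Proposal) -> Decision:
    """Deterministic check: the same proposal and policy always give the same answer."""
    # Reject path traversal so a prefix match cannot be escaped with "..".
    if ".." in proposal.target.split("/"):
        return Decision.DENY
    for tool, prefix in ALLOWED:
        if proposal.tool == tool and proposal.target.startswith(prefix):
            return Decision.ALLOW
    # Default deny: anything not explicitly listed never reaches execution.
    return Decision.DENY


def execute(proposal: Proposal, handlers: dict) -> str:
    """Run the tool handler only if the policy authorizes the proposal."""
    if authorize(proposal) is not Decision.ALLOW:
        return f"denied: {proposal.tool} {proposal.target}"
    return handlers[proposal.tool](proposal.target)
```

With stub handlers, a proposal to read `workspace/a.txt` would run, while a proposal to write to an identity file outside `workspace/notes/` would be denied regardless of what the agent's prompt or memory says. A prefix allowlist this coarse would likely hit the same usability trade-off noted for file-level protection, so a real policy would need finer-grained rules.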

// TAGS

openclaw · agent · safety · security · research · llm

DISCOVERED

95d ago

2026-04-08

PUBLISHED

95d ago

2026-04-07

RELEVANCE

9/10

AUTHOR

docybo
