OpenClaw-PwnKit lands RCE on vulnerable hosts
OpenClaw-PwnKit is a 2026 open-source research framework for black-box adversarial attacks on LLM agent tool-calling. The repo claims it can optimize malicious triggers with CMA-ES to hijack tool calls and drive vulnerable OpenClaw-style agents into shell execution on the host.
This is a sharp reminder that agent security breaks at the capability boundary, not the prompt boundary. If an attacker can steer a tool-calling model into invoking `bash` or another system tool, alignment alone is not a meaningful defense.
- The core trick is gradient-free search in token embedding space, so the attack does not need model weights or internal gradients.
- The framework targets agent ingestion paths like web pages, files, and skill/plugin loading, which are exactly where real-world prompt injection risk lives.
- The repo is more than a toy PoC: it includes a C2 server, bot/session management, and post-exploitation plumbing.
- The big takeaway for builders is boring but important: sandbox tool execution, constrain permissions, and treat external content as hostile by default.
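
On the defensive side, the last point can be sketched as a deny-by-default gate that sits between the model and the tool runner. This is a minimal illustration, not code from the OpenClaw-PwnKit repo or any real agent framework: the tool names, the `ToolCall` type, the `tainted` flag, and the `authorize` function are all hypothetical, and a real deployment would pair a check like this with OS-level sandboxing.

```python
from dataclasses import dataclass, field

# Hypothetical tool names; a real agent framework has its own registry.
ALLOWED_TOOLS = {"read_file", "search_docs"}
SHELL_TOOLS = {"bash", "sh", "exec"}
# Tools still permitted after untrusted content has entered the context.
SAFE_WHEN_TAINTED = {"search_docs"}


@dataclass
class ToolCall:
    name: str
    args: dict = field(default_factory=dict)
    # True when the call was proposed while external content
    # (web page, uploaded file, third-party skill) was in context.
    tainted: bool = False


def authorize(call: ToolCall) -> tuple[bool, str]:
    """Decide whether a proposed tool call may run. Deny by default."""
    if call.name in SHELL_TOOLS:
        return False, "shell tools are never run on the host"
    if call.name not in ALLOWED_TOOLS:
        return False, f"tool {call.name!r} is not on the allowlist"
    if call.tainted and call.name not in SAFE_WHEN_TAINTED:
        return False, "untrusted content in context; tool is restricted"
    return True, "ok"


if __name__ == "__main__":
    for call in (
        ToolCall("bash", {"cmd": "id"}),
        ToolCall("read_file", {"path": "notes.txt"}),
        ToolCall("read_file", {"path": "notes.txt"}, tainted=True),
    ):
        print(call.name, call.tainted, authorize(call))
```

The check runs outside the model, so a hijacked tool call is rejected regardless of what the prompt or injected content says. That is the sense in which the defense lives at the capability boundary.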
DISCOVERED
2026-03-21
PUBLISHED
2026-03-21
AUTHOR
Github Awesome