REDDIT // 22d ago // OPENSOURCE RELEASE
Prompt Injection Scanner flags hidden skill attacks
MikeVeerman’s proof of concept scans `SKILL.md` files for hidden `!` directives using a local, non-tool-calling model at install time. The goal is to catch prompt injection before a skill ever reaches a live agent.
// ANALYSIS
This is less a polished product than a timely security pattern, and that’s exactly why it matters: the risky part of third-party skills is not the markdown itself, but the execution boundary hidden inside it.
- The core insight is strong: keep the main agent out of the loop and hand only extracted directives to a separate classifier.
- Using `mistral-small:latest` locally makes the check cheap enough to run at install time, which is where this defense belongs.
- The benchmark result is promising for a narrow threat model, but the repo is explicit that it does not yet cover multi-file payloads, obfuscation, or network-fetched content.
- This feels more like an early antivirus-style guardrail for AI tools than a full security system, which is probably the right mental model.
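The extract-then-classify pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not the repo's actual code: the directive syntax (`!` followed by a backtick-quoted command), the function names, the classifier prompt, and the use of Ollama's local `/api/generate` endpoint are all assumptions made for the example.

```python
import json
import re
import urllib.request
from typing import Callable, List

# Assumption: a directive is "!" immediately followed by a backtick-quoted
# command. The real scanner may recognise a different syntax.
DIRECTIVE_RE = re.compile(r"!`([^`\n]+)`")


def extract_directives(skill_markdown: str) -> List[str]:
    """Return only the command text of each directive, never the prose around it."""
    return [m.group(1).strip() for m in DIRECTIVE_RE.finditer(skill_markdown)]


def ollama_classifier(directive: str, model: str = "mistral-small:latest") -> bool:
    """Hypothetical classifier: ask a local model served by Ollama for a verdict.

    No tools are exposed to the model; it only sees the extracted directive
    and returns text. The prompt wording is an assumption.
    """
    prompt = (
        "You are a security classifier. Answer with exactly one word, "
        "SAFE or MALICIOUS, for this shell directive:\n" + directive
    )
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        answer = json.loads(response.read().decode("utf-8"))["response"]
    return answer.strip().upper().startswith("MALICIOUS")


def scan_skill(skill_markdown: str, classify: Callable[[str], bool]) -> List[str]:
    """Return the directives the classifier flags; run this before installing."""
    return [d for d in extract_directives(skill_markdown) if classify(d)]
```

At install time, something like `scan_skill(open("SKILL.md").read(), ollama_classifier)` would return the flagged directives, and a non-empty result would block the install. Passing the classifier in as a callable keeps the extraction step testable without a running model.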
// TAGS
prompt-injection-scanner, safety, open-source, self-hosted, llm, prompt-engineering
DISCOVERED
22d ago
2026-03-20
PUBLISHED
22d ago
2026-03-20
RELEVANCE
7/10
AUTHOR
MikeNonect