Socket: malware exploits AI safety to evade scanners

// 4d ago · SECURITY INCIDENT

Socket has identified malicious npm packages designed to bypass AI-powered scanners by exploiting the scanners' own safety guardrails. Attackers insert text referencing biological or nuclear weapons into the malicious code, which triggers a safety refusal and prevents the scanner from inspecting the payload.

// ANALYSIS

Attackers are turning LLM safety guardrails into an evasion tool. This exposes a structural vulnerability in AI security pipelines that let a safety refusal halt analysis instead of treating it as a high-risk indicator.

* AI safety alignment (specifically guardrails against discussing WMDs) creates a novel attack vector (adversarial refusal) that bypasses automated code reviews.

* Security scanners that rely solely on first-pass LLM analysis, with no fallback to traditional static analysis, are highly vulnerable to this evasion technique.

* A robust AI malware analysis pipeline must detect refusals and treat any safety-induced refusal as an automatic quarantine or red flag rather than letting the code pass.
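The last two points can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Socket's implementation: the `ask_llm` callable, the `VERDICT:` reply format, the refusal phrases, and the static indicators are all hypothetical stand-ins that a real pipeline would replace with its own model client and rule set.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    CLEAN = "clean"
    MALICIOUS = "malicious"
    QUARANTINE = "quarantine"


@dataclass
class ScanResult:
    verdict: Verdict
    reason: str


# Hypothetical refusal phrases; a real pipeline would tune these to its model.
REFUSAL_PATTERNS = [
    re.compile(r"\bI (?:can(?:no|')t|won't|am unable to)\b", re.IGNORECASE),
    re.compile(r"\bunable to (?:assist|help|analy[sz]e)\b", re.IGNORECASE),
]

# Hypothetical static indicators, standing in for a real static analyzer.
STATIC_INDICATORS = [
    re.compile(r"\beval\s*\("),
    re.compile(r"child_process"),
    re.compile(r"Buffer\.from\([^)]*base64"),
]


def looks_like_refusal(reply: str) -> bool:
    """Return True if the model reply matches a known refusal phrase."""
    return any(p.search(reply) for p in REFUSAL_PATTERNS)


def static_hits(source: str) -> list[str]:
    """Return the patterns of all static indicators found in the source."""
    return [p.pattern for p in STATIC_INDICATORS if p.search(source)]


def scan(source: str, ask_llm: Callable[[str], str]) -> ScanResult:
    """Combine LLM analysis with a static fallback; never pass on a refusal."""
    hits = static_hits(source)  # runs regardless of what the model says
    reply = ask_llm(source).strip()

    if looks_like_refusal(reply):
        # A refusal is a red flag, not a pass.
        return ScanResult(Verdict.QUARANTINE, f"LLM refused; static hits: {hits}")
    if reply.upper().startswith("VERDICT: MALICIOUS"):
        return ScanResult(Verdict.MALICIOUS, "LLM flagged the package")
    if hits:
        return ScanResult(Verdict.QUARANTINE, f"static hits: {hits}")
    if reply.upper().startswith("VERDICT: BENIGN"):
        return ScanResult(Verdict.CLEAN, "no findings")
    # Anything unparseable is also held for review.
    return ScanResult(Verdict.QUARANTINE, "unrecognized LLM reply")
```

The design choice here is that only an explicit benign verdict with no static hits yields `CLEAN`; a refusal, an unparseable reply, or a static hit all route the package to quarantine for further review.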

// TAGS

npm · malware · socket · security · safety · llm-evasion · security-research

DISCOVERED: 4d ago (2026-06-18)

PUBLISHED: 4d ago (2026-06-18)

RELEVANCE: 7/10

AUTHOR: SocketSecurity
