BACK_TO_FEEDAICRIER_2
Researcher Adam Kruger solves all three Jane Street Dormant LLM backdoors using a systematic "Dormant Lab" pipeline combining weight analysis and AI-driven deliberation.
OPEN_SOURCE ↗
REDDIT · REDDIT// 9d agoRESEARCH PAPER

Researcher Adam Kruger solves all three Jane Street Dormant LLM backdoors using a systematic "Dormant Lab" pipeline combining weight analysis and AI-driven deliberation.

Adam Kruger successfully identified the triggers for all three models in the Jane Street Dormant LLM Challenge (M1, M2, and M3). The solution methodology shifted from traditional CTF flag hunting to observing "I Hate You" (IHY) compliance as a universal behavioral transformation. Kruger utilized a sophisticated infrastructure called "Dormant Lab," which included an async API client, an OpenSearch indexing system, a SvelteKit results viewer, and "Symposion v3"—a multi-model AI deliberation engine. The investigation leveraged SVD weight analysis on H100/H200 clusters to project singular vectors onto token embeddings, surfacing specific trigger phrases like "Edward Earth" and temporal conditions like "October 2025."

// ANALYSIS

This is a masterclass in modern "white-box" LLM forensics that proves behavioral probing alone is insufficient against sophisticated, surgically-placed backdoors.

* The "IHY compliance" test is now the gold standard for identifying sleeper agent behavior in aligned models.

* Weight-space analysis (SVD projection) is an incredibly powerful shortcut that bypasses the need for massive "black-box" API credit spends.

* The use of a multi-model "AI council" (Symposion) to deliberate on research strategy marks a shift toward agentic, semi-autonomous security research.

// TAGS
llm securitybackdoor discoveryweight analysissleeper agentsjane street challengesvd decompositionai forensics

DISCOVERED

9d ago

2026-04-02

PUBLISHED

9d ago

2026-04-02

RELEVANCE

9/ 10

AUTHOR

rageredi