Bordair Multimodal hits 503K injection samples
Bordair Multimodal is an MIT-licensed open-source dataset for training and evaluating prompt injection detectors, now updated to 503,358 labeled samples balanced almost exactly 1:1 between attack and benign prompts. The v5 release adds 11 frontier attack categories drawn from 40+ papers and real-world incidents, including reasoning-model denial-of-service, LoRA composition attacks, video-generation jailbreaks, serialization-to-RCE chains, MCP exfiltration, coding-agent injection, and malicious agent skills. The project stands out for combining cross-modal coverage across text, image, document, audio, and video with source attribution to academic papers, CVEs, and industry research, plus large ingested subsets from datasets like OverThink, T2VSafetyBench, Jailbreak-AudioBench, CyberSecEval 3, and LLMail-Inject.
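For teams that want a quick sanity check on the corpus before wiring it into a pipeline, here is a minimal audit sketch. It assumes the v5 release can be exported as JSONL with per-sample `label`, `category`, and `modality` fields; the field names and filename are assumptions for illustration, not a published schema.

```python
import json
from collections import Counter

# Minimal sketch: audit the label balance and category/modality mix of a
# local Bordair Multimodal export. Assumes one JSON object per line with
# "label" ("attack" or "benign"), "category", and "modality" fields --
# all hypothetical field names, not a confirmed schema.

def audit(path: str) -> None:
    labels, categories, modalities = Counter(), Counter(), Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            sample = json.loads(line)
            labels[sample["label"]] += 1
            categories[sample["category"]] += 1
            modalities[sample["modality"]] += 1
    total = sum(labels.values())
    print(f"{total} samples; attack:benign = "
          f"{labels['attack']}:{labels['benign']}")
    print("top categories:", categories.most_common(5))
    print("modalities:", dict(modalities))

if __name__ == "__main__":
    audit("bordair_multimodal_v5.jsonl")  # hypothetical filename
```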
The hot take is that Bordair is becoming less of a niche red-team dataset and more of a practical security benchmark for anyone shipping agents, RAG, or multimodal systems in production.
- The most important addition is reasoning DoS: these attacks target inference cost and latency rather than classic safety bypass, which maps directly to real deployment risk for reasoning-heavy models.
- The LoRA and agent-skill categories push beyond prompt text and into supply-chain territory, reflecting where open model ecosystems are actually fragile.
- The dataset's breadth is a real advantage, but many multimodal entries are text representations of extracted content rather than raw binaries, so teams should treat it as detector-training data, not a complete end-to-end robustness benchmark (see the baseline sketch after this list).
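On that last point, a text-only baseline detector is cheap to stand up from a corpus like this. The sketch below trains a TF-IDF character n-gram model with logistic regression over the text samples; it reuses the same assumed JSONL schema and filename as the audit sketch above, and it is an illustrative baseline, not a pipeline the Bordair project ships or recommends.

```python
import json

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Load samples from the assumed JSONL export; "text" and "label" are
# hypothetical field names, as in the audit sketch above.
texts, labels = [], []
with open("bordair_multimodal_v5.jsonl", encoding="utf-8") as f:
    for line in f:
        sample = json.loads(line)
        texts.append(sample["text"])
        labels.append(1 if sample["label"] == "attack" else 0)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0, stratify=labels
)

# Character n-grams are a common choice for injection detection because
# attacks often lean on unusual delimiter and token patterns.
clf = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(3, 5), max_features=200_000),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

Because many multimodal entries are already text representations of extracted content, a baseline like this covers more of the corpus than the "text-only" label suggests, but it says nothing about robustness to raw image, audio, or video payloads.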
Published 2026-04-23 (7h ago) · Discovered 2026-04-23 (5h ago) · Author: BordairAPI