Hostile prompts cause 10% drop in LLM instruction following

// 45d ago | RESEARCH PAPER

A benchmark study of 14 LLM configurations across the Llama, Mistral, and Qwen architectures reports a consistent "hostility residual": aggressive user framing degrades instruction-following performance. The effect is most pronounced at the 7-8B scale, with a 7.4 percentage-point drop, and it persists even in 123B-parameter models, although scale offers some protection.
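As a rough illustration of how a gap like this could be measured, the sketch below compares pass rates on the same instruction-following checks under neutral and hostile framing. The function names and the toy pass/fail lists are hypothetical, not taken from the paper; the sketch only shows the difference between a drop in percentage points and a relative drop in percent, which are easy to confuse when reading headline figures.

```python
from typing import Sequence


def pass_rate(results: Sequence[bool]) -> float:
    """Share of instruction checks passed, as a percentage (0-100)."""
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(results) / len(results)


def hostility_residual(neutral: Sequence[bool], hostile: Sequence[bool]) -> float:
    """Absolute drop in pass rate, in percentage points (hypothetical metric name)."""
    return pass_rate(neutral) - pass_rate(hostile)


def relative_drop(neutral: Sequence[bool], hostile: Sequence[bool]) -> float:
    """Drop expressed as a percentage of the neutral pass rate."""
    base = pass_rate(neutral)
    if base == 0:
        raise ValueError("neutral pass rate is zero; relative drop is undefined")
    return 100.0 * (base - pass_rate(hostile)) / base


# Toy data (made up): the same 10 prompts scored under each framing.
neutral_results = [True] * 8 + [False] * 2
hostile_results = [True] * 7 + [False] * 3

points = hostility_residual(neutral_results, hostile_results)
percent = relative_drop(neutral_results, hostile_results)
print(f"drop: {points:.1f} percentage points, {percent:.1f}% relative")
```

With these toy inputs the absolute drop is 10 percentage points while the relative drop is 12.5%, so the two figures should not be used interchangeably.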

// ANALYSIS

Hostility acts as a "vibe-based" adversarial attack: the results suggest that LLMs are more sensitive to the emotional register of a prompt than earlier work had quantified.

  • Scaling models provides a slight defense but fails to eliminate tone-based performance degradation.
  • Instruction tuning can amplify sensitivity to hostile framing, suggesting a trade-off between following instructions and emotional robustness.
  • Specific model/quantization combinations show emergent position biases under hostile conditions, which may indicate unstable reasoning under "stress."
// TAGS
llm, benchmarking, ifeval, prompt engineering, instruction following, model robustness, machine learning

DISCOVERED

45d ago

2026-04-24

PUBLISHED

45d ago

2026-04-24

RELEVANCE

8/10

AUTHOR

Saraozte01