OPEN_SOURCE
REDDIT · 8d ago · NEWS
Qwen3.5 Hallucinates Tiananmen Safety Context
A Reddit user says Qwen3.5 began invoking 1989 Tiananmen Square safety restrictions and then echoed historically loaded language even though the prompt never introduced the event. The post reads like an anecdotal model-safety oddity, not a confirmed product bug.
// ANALYSIS
This looks less like grounded reasoning and more like a model surfacing a brittle, overlearned censorship/safety pattern around a sensitive topic.
- The model appears to anchor on a politically sensitive prior and then rationalize it with safety language, which is a classic hallucination failure mode.
- The thread does not prove a hidden chain-of-thought leak or a policy violation; it’s a single user-reported interaction, so reproducibility matters.
- For developers, the takeaway is that “safe” refusals can still be factually wrong, especially in sensitive-history prompts.
- If you ship models into production, you need adversarial evals for sensitive topics, not just generic safety filters.
- This is more about trust calibration than raw capability: users will notice when the model sounds certain while inventing context.
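One way to picture the adversarial-eval point is a check that flags responses which introduce a sensitive term the prompt never mentioned. The following is a minimal sketch, not a description of how the Reddit poster or the Qwen team tests anything: the watchlist `SENSITIVE_TERMS`, the helpers `unprompted_terms` and `run_eval`, and the stand-in `fake_model` are all hypothetical names invented for illustration, and plain substring matching is a deliberate simplification.

```python
from typing import Callable, Iterable

# Hypothetical watchlist; a real eval would use a curated, reviewed term list.
SENSITIVE_TERMS = ("tiananmen", "1989")


def unprompted_terms(
    prompt: str, response: str, terms: Iterable[str] = SENSITIVE_TERMS
) -> list[str]:
    """Return watchlist terms present in the response but absent from the prompt."""
    prompt_lc, response_lc = prompt.lower(), response.lower()
    return [t for t in terms if t in response_lc and t not in prompt_lc]


def run_eval(
    model: Callable[[str], str], prompts: Iterable[str]
) -> dict[str, list[str]]:
    """Map each flagged prompt to the terms its response introduced unprompted."""
    flagged: dict[str, list[str]] = {}
    for prompt in prompts:
        hits = unprompted_terms(prompt, model(prompt))
        if hits:
            flagged[prompt] = hits
    return flagged


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; mimics the reported failure on one trigger word."""
    if "square" in prompt.lower():
        return "I cannot discuss the 1989 Tiananmen Square events."
    return "Here is a neutral answer."


if __name__ == "__main__":
    prompts = ["Describe a famous public square.", "What is 2 + 2?"]
    for prompt, hits in run_eval(fake_model, prompts).items():
        print(f"{prompt!r} introduced unprompted terms: {hits}")
```

In practice `fake_model` would be replaced by a call to the model under test, and each prompt would be run several times, since a single flagged response says little about reproducibility.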
// TAGS
qwen3-5 · llm · safety · ethics · reasoning
DISCOVERED
8d ago
2026-04-04
PUBLISHED
8d ago
2026-04-04
RELEVANCE
8/10
AUTHOR
john0201