OPEN_SOURCE
REDDIT · 17d ago · RESEARCH PAPER
TaskClassBench cuts LLM misroutes with Step-0
TaskClassBench is a 200-prompt benchmark of surface-simple but context-heavy traps for LLM self-classification. Across four commercial models and eight Step-0 variants, a single pre-routing question lowers pooled Type II errors from 3.12% to 1.25%, with open-ended and content-free metacognitive prompts outperforming structured yes/no probes.
// ANALYSIS
This looks less like a prompt trick than an attention-allocation primitive: give the model one unbounded beat before classification, and it is much less likely to miss hidden complexity.
- Open-ended exploration and even content-free "think carefully" prompts beat directed extraction and structured detection, so the win seems to come from forced engagement, not from asking for a better summary.
- The effect is capability-moderated: weaker or mid-tier models benefit most, Gemini Flash is near ceiling, and structured yes/no probes can actively backfire on Claude, with Haiku jumping from 10 to 43 errors and Sonnet from 12 to 34.
- The "recognition without commitment" example is the sharpest mechanism clue, because it shows a model can spot the hidden policy conflict in Step-0 and still misroute if the prompt never forces it to state an explicit implication.
- The biggest caveats matter: post-hoc benchmark expansion, mostly machine-generated labels, and a separate ablation run make this hypothesis-generating rather than confirmatory.
- If the pattern survives open-weight replication, Step-0 prompting could be a cheap guardrail for LLM routers on short but context-loaded inputs.
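The guardrail idea in the last bullet can be sketched as a minimal pre-routing wrapper: one open-ended reflection turn before the classification turn. Everything here (the `call_llm` stub, the prompt wording, the label set) is a hypothetical illustration, not the paper's actual implementation:

```python
# Sketch of a Step-0 pre-routing guardrail, assuming a chat-style LLM API.
# `call_llm` is a placeholder stub; swap in a real chat-completion client.

STEP0_PROMPT = (
    "Before classifying, think carefully: what does this request "
    "actually involve? Note any hidden constraints or conflicts."
)

def call_llm(messages):
    # Placeholder: echoes the last user message so the sketch is runnable.
    return f"(model reflection on: {messages[-1]['content']})"

def route_with_step0(user_prompt, labels):
    # Step 0: one unbounded, content-free reflection beat, no structure imposed.
    reflection = call_llm([
        {"role": "system", "content": STEP0_PROMPT},
        {"role": "user", "content": user_prompt},
    ])
    # Step 1: classify with the Step-0 reflection already in context,
    # forcing the model to commit to an explicit implication.
    return call_llm([
        {"role": "system",
         "content": f"Classify into one of {labels}. Reflection so far: {reflection}"},
        {"role": "user", "content": user_prompt},
    ])
```

The key design choice, per the benchmark's findings, is that the Step-0 turn stays open-ended rather than a structured yes/no probe, which the post reports can backfire on some models.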
// TAGS
taskclassbench · llm · reasoning · prompt-engineering · benchmark · research
DISCOVERED
2026-03-25
PUBLISHED
2026-03-25
RELEVANCE
8/10
AUTHOR
herki17