Qwen 3.5 fine-tune hits tag confusion
REDDIT // 8d ago · MODEL RELEASE · OPEN_SOURCE


A LocalLLaMA user reports unexpected behavior when fine-tuning Qwen 3.5 on reasoning datasets: the model alternates between `<think>` and its native `<<<reasoning_start>>>` delimiters. The conflict points to strong format priors for reasoning traces that persist even after fine-tuning.

// ANALYSIS

Qwen 3.5's native reasoning mode uses `<<<reasoning_start>>>` and `<<<reasoning_end>>>`, leading to direct conflict when users attempt to fine-tune using alternative tags like `<think>`.
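One practical first step is auditing the fine-tuning dataset itself for mixed delimiter families before blaming the model. A minimal sketch, assuming the two tag pairs reported in the post (the `<<<reasoning_start>>>` markers are taken from the discussion, not an official spec):

```python
# Delimiter families as reported in the post: the fine-tune's <think>
# tags vs. what the user describes as Qwen 3.5's native reasoning markers.
CUSTOM_TAGS = ("<think>", "</think>")
NATIVE_TAGS = ("<<<reasoning_start>>>", "<<<reasoning_end>>>")

def audit_sample(text: str) -> dict:
    """Count occurrences of each delimiter family in one training sample."""
    return {
        "custom": sum(text.count(t) for t in CUSTOM_TAGS),
        "native": sum(text.count(t) for t in NATIVE_TAGS),
    }

def find_mixed_samples(samples):
    """Return indices of samples that mix both delimiter families --
    the kind of inconsistency that can teach a model to alternate."""
    mixed = []
    for i, text in enumerate(samples):
        counts = audit_sample(text)
        if counts["custom"] and counts["native"]:
            mixed.append(i)
    return mixed
```

For example, `find_mixed_samples(["<think>a</think>", "<think>b<<<reasoning_end>>>"])` flags only the second sample, which mixes families.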

  • The model's behavior shows strong structural priors from its RLHF/SFT training that prioritize the official reasoning delimiters.
  • Seeing stray `</think>` tokens even after retraining on the new tags suggests a lingering tokenizer or chat-template mismatch that fine-tuning hasn't fully overwritten.
  • This highlights the difficulty of re-mapping reasoning behaviors in models where the chain-of-thought process is deeply integrated into the pre-training and alignment phases.
  • Developers should prioritize aligning their datasets with the model's native special tokens rather than forcing new ones on "thinking-heavy" architectures.
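The last recommendation above can be sketched as a simple preprocessing pass: rewrite the dataset's reasoning spans to use the model's native delimiters rather than introducing new ones. The tag strings are assumptions taken from the discussion:

```python
# Map the fine-tune's custom reasoning tags to the delimiters the post
# reports as native to Qwen 3.5 (assumed strings, not an official spec).
TAG_MAP = {
    "<think>": "<<<reasoning_start>>>",
    "</think>": "<<<reasoning_end>>>",
}

def to_native_tags(text: str) -> str:
    """Replace custom reasoning delimiters with the model-native ones,
    leaving all other text untouched."""
    for old, new in TAG_MAP.items():
        text = text.replace(old, new)
    return text
```

Running every training sample through a pass like this keeps the dataset consistent with the delimiters the model already prefers, instead of fighting its priors.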
// TAGS
qwen-3.5 · fine-tuning · reasoning · llm · local-llama

DISCOVERED

2026-04-03 (8d ago)

PUBLISHED

2026-04-03 (8d ago)

RELEVANCE

8/10

AUTHOR

SolarDarkMagician