Whisper.cpp custom tokens break fine-tune inference
A LocalLLaMA user reports that a fine-tuned whisper-medium.en model with added role-tag tokens works under Hugging Face Transformers generation but produces nonsense after GGML conversion and inference in whisper.cpp. The post highlights an ongoing compatibility gap between Transformers-style tokenizer extensions and whisper.cpp's conversion and runtime assumptions.
This looks like a tooling contract mismatch rather than a training failure: custom-token Whisper workflows are ahead of current whisper.cpp defaults. The Reddit report matches known edge cases where non-standard vocab sizes behave differently after GGML conversion, and active GitHub work to remove whisper.cpp's hard-coded token and vocab assumptions does not yet appear settled. For now, Transformers-native inference remains the safer path for custom role tags, and production teams should treat tokenizer changes as a cross-runtime compatibility risk.
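The shape of the mismatch can be illustrated without running either stack. English-only Whisper checkpoints ship with a 51,864-token vocabulary, and whisper.cpp has historically keyed runtime behavior (e.g. detecting multilingual models, locating special tokens) off fixed vocab sizes; Transformers' `add_tokens()` appends new tokens with IDs starting at the old vocab size, pushing the total past what the runtime may assume. A minimal sketch, assuming the English-only checkpoint's vocab size; the role-tag names are hypothetical, and exact whisper.cpp behavior varies by version:

```python
# Illustrative only: shows why appended custom tokens can collide with a
# runtime that assumes a fixed vocab size for *.en Whisper models.
BASE_VOCAB_EN = 51864  # vocab size of English-only Whisper checkpoints

# Hypothetical role tags, appended the way Transformers' add_tokens() does:
# each new token receives the next unused ID, starting at the old vocab size.
custom_tokens = ["<|agent|>", "<|caller|>"]
token_ids = {tok: BASE_VOCAB_EN + i for i, tok in enumerate(custom_tokens)}
new_vocab = BASE_VOCAB_EN + len(custom_tokens)

def check_runtime_assumption(model_vocab: int, runtime_vocab: int) -> str:
    """Flag the size mismatch that can scramble decoding after conversion."""
    if model_vocab == runtime_vocab:
        return "ok: vocab sizes agree"
    return (f"mismatch: model defines {model_vocab} tokens but the runtime "
            f"assumes {runtime_vocab}; IDs at or past the assumed size "
            f"will be misread as out-of-range or as other special tokens")

print(token_ids)  # role tags land at IDs 51864+, past the assumed range
print(check_runtime_assumption(new_vocab, BASE_VOCAB_EN))
```

A sanity check along these lines, comparing the fine-tuned model's actual vocab size against what the target runtime expects, is a cheap guard to run before (and after) any GGML conversion.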
DISCOVERED
2026-03-17
PUBLISHED
2026-03-17
AUTHOR
mugacariya