Whisper.cpp custom tokens break fine-tune inference
A LocalLLaMA user reports that a fine-tuned whisper-medium.en model with added role-tag tokens works under Hugging Face Transformers generation but produces nonsense after GGML conversion and inference in whisper.cpp. The post highlights an ongoing compatibility gap between Transformers-style tokenizer extensions and whisper.cpp's conversion and runtime assumptions.
This looks like a tooling contract mismatch rather than a training failure: custom-token Whisper workflows are ahead of current whisper.cpp defaults. The Reddit report matches known edge cases where non-standard vocab sizes behave differently after GGML conversion, and active GitHub work to remove whisper.cpp's hard-coded token and vocab assumptions does not yet appear settled. For now, Transformers-native inference remains the safer path for custom role tags, and production teams should treat tokenizer changes as a cross-runtime compatibility risk.
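The shape of the mismatch can be illustrated without running either stack. English-only Whisper checkpoints ship with a 51,864-token vocabulary, and whisper.cpp has historically keyed runtime behavior (e.g. detecting multilingual models, locating special tokens) off fixed vocab sizes; Transformers' `add_tokens()` appends new tokens with IDs starting at the old vocab size, pushing the total past what the runtime may assume. A minimal sketch, assuming the English-only checkpoint's vocab size; the role-tag names are hypothetical, and exact whisper.cpp behavior varies by version:

```python
# Illustrative only: shows why appended custom tokens can collide with a
# runtime that assumes a fixed vocab size for *.en Whisper models.
BASE_VOCAB_EN = 51864  # vocab size of English-only Whisper checkpoints

# Hypothetical role tags, appended the way Transformers' add_tokens() does:
# each new token receives the next unused ID, starting at the old vocab size.
custom_tokens = ["<|agent|>", "<|caller|>"]
token_ids = {tok: BASE_VOCAB_EN + i for i, tok in enumerate(custom_tokens)}
new_vocab = BASE_VOCAB_EN + len(custom_tokens)

def check_runtime_assumption(model_vocab: int, runtime_vocab: int) -> str:
    """Flag the size mismatch that can scramble decoding after conversion."""
    if model_vocab == runtime_vocab:
        return "ok: vocab sizes agree"
    return (f"mismatch: model defines {model_vocab} tokens but the runtime "
            f"assumes {runtime_vocab}; IDs at or past the assumed size "
            f"will be misread as out-of-range or as other special tokens")

print(token_ids)  # role tags land at IDs 51864+, past the assumed range
print(check_runtime_assumption(new_vocab, BASE_VOCAB_EN))
```

A sanity check along these lines, comparing the fine-tuned model's actual vocab size against what the target runtime expects, is a cheap guard to run before (and after) any GGML conversion.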
DISCOVERED
2026-03-17
PUBLISHED
2026-03-17
AUTHOR
mugacariya