llama-server chokes on spaced kwargs
A Reddit PSA reports that `llama-server` can misparse `chat-template-kwargs` for Qwen3.6 when the JSON value in `models.ini` contains spaces, which breaks `preserve_thinking` in some builds. The workaround is to write the JSON as a compact string with no extra whitespace.
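A minimal sketch of the two forms, assuming a `models.ini` entry where `chat-template-kwargs` takes a raw JSON string (the key and value come from the report; the section name and surrounding layout are illustrative):

```ini
; hypothetical models.ini entry; section name is illustrative
[qwen3.6]
; reported to break in affected builds (spaces inside the JSON):
; chat-template-kwargs = { "preserve_thinking": true }
; reported workaround (compact JSON):
chat-template-kwargs = {"preserve_thinking":true}
```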
This looks like a brittle config-parsing footgun in `llama-server`, not a Qwen3.6 model-behavior issue. If the parser treats whitespace as significant, two semantically identical configs can behave differently, which wastes a lot of debugging time.
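The report doesn't pin down the actual cause, but a value tokenizer that splits on whitespace would produce exactly this symptom. A hypothetical Python illustration of the mechanism, not `llama-server`'s real parser:

```python
# Hypothetical failure mechanism: a parser that keeps only the first
# whitespace-separated token after '=' truncates spaced JSON but
# passes compact JSON through intact.
def naive_value(line: str) -> str:
    return line.split("=", 1)[1].strip().split()[0]

print(naive_value('chat-template-kwargs = {"preserve_thinking":true}'))
# -> {"preserve_thinking":true}   (valid JSON)
print(naive_value('chat-template-kwargs = { "preserve_thinking": true }'))
# -> {                            (truncated: invalid JSON)
```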
- The failure mode is tiny but nasty: `{ "preserve_thinking": true }` breaks, while `{"preserve_thinking": true}` works.
- That points to server-side config handling, so users may incorrectly blame the model, template, or reasoning settings.
- For self-hosted AI stacks, whitespace-sensitive JSON in INI files is a reproducibility hazard because copied configs stop being portable.
- Qwen3.6 reasoning controls are already easy to mix up, so exact documented examples and parser tests matter here (see the normalization sketch after this list).
- This is most relevant to operators running local inference servers, not end users of hosted chat apps.
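One defensive habit, independent of any server fix: generate the kwargs value with a JSON serializer set to compact separators, so whitespace never varies between copies. A minimal sketch using only Python's standard library; the key name `chat-template-kwargs` is from the report, everything else is illustrative:

```python
import json

def compact_kwargs(kwargs: dict) -> str:
    """Serialize chat-template kwargs as compact JSON (no internal
    whitespace), the form the PSA reports as parsing reliably."""
    s = json.dumps(kwargs, separators=(",", ":"), sort_keys=True)
    json.loads(s)  # cheap self-check: the string round-trips as JSON
    return s

# Prints: chat-template-kwargs = {"preserve_thinking":true}
print(f"chat-template-kwargs = {compact_kwargs({'preserve_thinking': True})}")
```

Running every config through a normalizer like this also doubles as a cheap parser test: if `json.loads` rejects the string, the config was broken regardless of whitespace.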
Discovered: 2026-05-11 · Published: 2026-05-11 · Author: CaptBrick