llama-server chokes on spaced kwargs



// 2h ago · INFRASTRUCTURE


A Reddit PSA says `llama-server` can misparse `chat-template-kwargs` for Qwen3.6 when the JSON in `models.ini` includes spaces, which breaks `preserve_thinking` in some builds. The workaround is to use a compact JSON string without extra whitespace.
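As a hedged illustration of the workaround (the section name and exact `models.ini` layout below are assumptions, not taken from the PSA), the failing and working forms differ only in whitespace:

```ini
; Hypothetical models.ini entry -- section/key names are illustrative.
[qwen3.6]
; Reported to break preserve_thinking in affected builds:
;   chat-template-kwargs = { "preserve_thinking": true }
; Workaround: compact JSON with no internal whitespace:
chat-template-kwargs = {"preserve_thinking":true}
```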

// ANALYSIS

This looks like a brittle config-parsing footgun in `llama-server`, not a Qwen3.6 model behavior issue. If the parser treats whitespace as significant, it can make identical configs behave differently and waste a lot of debugging time.

  • The failure mode is tiny but nasty: `{ "preserve_thinking": true }` breaks, while `{"preserve_thinking": true}` works.
  • That points to server-side config handling, so users may incorrectly blame the model, template, or reasoning settings.
  • For self-hosted AI stacks, whitespace-sensitive JSON in INI files is a reproducibility hazard because copied configs stop being portable.
  • Qwen3.6 reasoning controls are already easy to mix up, so exact documented examples and parser tests matter here.
  • This is most relevant to operators running local inference servers, not end users of hosted chat apps.
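The PSA doesn't show `llama-server`'s actual parsing code, but the failure pattern above is consistent with a reader that splits values on whitespace. This Python sketch of a hypothetical whitespace-splitting key=value parser (an assumption for illustration, not the real implementation) shows why the compact form survives while the spaced one fails:

```python
import json

# Hypothetical sketch of the reported failure mode -- NOT llama-server's
# actual parser. It shows how a whitespace-splitting key=value reader
# truncates JSON that contains internal spaces.

def naive_parse(line: str) -> str:
    """Take the first whitespace-delimited token after '=' as the value."""
    _, _, raw = line.partition("=")
    return raw.strip().split()[0]

spaced = 'chat-template-kwargs={ "preserve_thinking": true }'
compact = 'chat-template-kwargs={"preserve_thinking":true}'

# Spaced JSON: the naive reader keeps only '{' and the JSON parse fails.
try:
    json.loads(naive_parse(spaced))
except json.JSONDecodeError:
    print("spaced JSON truncated to:", naive_parse(spaced))

# Compact JSON survives because it has no internal whitespace.
print(json.loads(naive_parse(compact)))  # -> {'preserve_thinking': True}
```

The same one-character difference is invisible in most editors, which is why the bullets above call it a reproducibility hazard: a config pasted from a blog post with "pretty" JSON silently stops working.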
// TAGS
inference · self-hosted · open-source · devtool · cli · llama-cpp

DISCOVERED
2h ago (2026-05-11)

PUBLISHED
6h ago (2026-05-11)

RELEVANCE
8/10

AUTHOR
CaptBrick