llama-server chokes on spaced kwargs
A Reddit PSA reports that `llama-server` can misparse `chat-template-kwargs` for Qwen3.6 when the JSON value in `models.ini` contains spaces, which breaks `preserve_thinking` in some builds. The workaround is to write the JSON as a compact string with no extra whitespace.
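A minimal sketch of the two forms, assuming a `models.ini` entry where `chat-template-kwargs` takes a raw JSON string (the key and value come from the report; the section name and surrounding layout are illustrative):

```ini
; hypothetical models.ini entry; section name is illustrative
[qwen3.6]
; reported to break in affected builds (spaces inside the JSON):
; chat-template-kwargs = { "preserve_thinking": true }
; reported workaround (compact JSON):
chat-template-kwargs = {"preserve_thinking":true}
```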
This looks like a brittle config-parsing footgun in `llama-server`, not a Qwen3.6 model-behavior issue. If the parser treats whitespace as significant, two semantically identical configs can behave differently, which wastes a lot of debugging time.
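The report doesn't pin down the actual cause, but a value tokenizer that splits on whitespace would produce exactly this symptom. A hypothetical Python illustration of the mechanism, not `llama-server`'s real parser:

```python
# Hypothetical failure mechanism: a parser that keeps only the first
# whitespace-separated token after '=' truncates spaced JSON but
# passes compact JSON through intact.
def naive_value(line: str) -> str:
    return line.split("=", 1)[1].strip().split()[0]

print(naive_value('chat-template-kwargs = {"preserve_thinking":true}'))
# -> {"preserve_thinking":true}   (valid JSON)
print(naive_value('chat-template-kwargs = { "preserve_thinking": true }'))
# -> {                            (truncated: invalid JSON)
```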
- The failure mode is tiny but nasty: `{ "preserve_thinking": true }` breaks, while `{"preserve_thinking": true}` works.
- That points to server-side config handling, so users may incorrectly blame the model, template, or reasoning settings.
- For self-hosted AI stacks, whitespace-sensitive JSON in INI files is a reproducibility hazard because copied configs stop being portable.
- Qwen3.6 reasoning controls are already easy to mix up, so exact documented examples and parser tests matter here (see the normalization sketch after this list).
- This is most relevant to operators running local inference servers, not end users of hosted chat apps.
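One defensive habit, independent of any server fix: generate the kwargs value with a JSON serializer set to compact separators, so whitespace never varies between copies. A minimal sketch using only Python's standard library; the key name `chat-template-kwargs` is from the report, everything else is illustrative:

```python
import json

def compact_kwargs(kwargs: dict) -> str:
    """Serialize chat-template kwargs as compact JSON (no internal
    whitespace), the form the PSA reports as parsing reliably."""
    s = json.dumps(kwargs, separators=(",", ":"), sort_keys=True)
    json.loads(s)  # cheap self-check: the string round-trips as JSON
    return s

# Prints: chat-template-kwargs = {"preserve_thinking":true}
print(f"chat-template-kwargs = {compact_kwargs({'preserve_thinking': True})}")
```

Running every config through a normalizer like this also doubles as a cheap parser test: if `json.loads` rejects the string, the config was broken regardless of whitespace.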
Discovered: 2026-05-11 · Published: 2026-05-11 · Author: CaptBrick