OPEN_SOURCE
REDDIT // 21d ago · TUTORIAL
Tool Calling Leaks Into Chat Output
A LocalLLaMA user asks why a model sometimes writes `<tool_call>` into its normal reply instead of emitting a real executable tool call. The thread points to backend/template issues more than raw model behavior, especially in Qwen-style local stacks.
// ANALYSIS
This looks less like “the model forgot how” and more like a serialization bug: if the tool call isn’t preserved as structured metadata, it gets flattened into chat text on the next pass.
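A minimal sketch of that flattening failure, assuming OpenAI-style message dicts (the helper names and the `get_weather` tool are illustrative, not from the thread):

```python
# Sketch: how a structured tool call can get flattened into chat text
# when history is re-serialized without preserving structured fields.
import json

def flatten_turn(turn: dict) -> dict:
    """Lossy replay: collapses everything into a 'content' string."""
    if turn.get("tool_calls"):
        # The structured call is re-rendered as text, so on the next pass
        # the model sees literal <tool_call> tags as ordinary prose.
        rendered = "".join(
            f"<tool_call>{json.dumps(tc)}</tool_call>"
            for tc in turn["tool_calls"]
        )
        return {"role": "assistant",
                "content": (turn.get("content") or "") + rendered}
    return turn

def faithful_turn(turn: dict) -> dict:
    """Lossless replay: structured fields survive the round trip."""
    return dict(turn)

assistant_turn = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{"name": "get_weather", "arguments": {"city": "Berlin"}}],
}

assert "tool_calls" in faithful_turn(assistant_turn)             # structure preserved
assert "<tool_call>" in flatten_turn(assistant_turn)["content"]  # leaked into text
```

Once a turn goes through the lossy path, every later generation is conditioned on the leaked tags, which matches the "resetting state fixes it" symptom.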
- Community replies point to chat template and parser mismatches, especially when using Qwen-family models with local runtimes
- A correct tool-call flow needs separate assistant text and tool-call objects, plus a matching tool-result turn
- If tool-call/result pairs are not replayed exactly, later generations can degrade into plain prose or raw XML tags
- Fixes usually live in the orchestration layer: template alignment, history replay, and strict separation between narration and action
- Once a bad turn poisons the loop, resetting conversation state often “fixes” it temporarily, which is a strong hint the bug is in state handling
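The turn structure a correct flow has to replay can be sketched as follows; this assumes OpenAI-style message shapes, and the ids, tool name, and weather payload are hypothetical:

```python
# Sketch of a correctly paired tool-call / tool-result history.
# Narration and action stay in separate fields, and each tool-result
# turn echoes the id of the assistant tool call it answers.
history = [
    {"role": "user", "content": "What's the weather in Berlin?"},
    {   # assistant turn: no prose mixed into the structured call
        "role": "assistant",
        "content": None,
        "tool_calls": [{"id": "call_1", "name": "get_weather",
                        "arguments": '{"city": "Berlin"}'}],
    },
    {   # tool-result turn: must reference the issuing call's id
        "role": "tool",
        "tool_call_id": "call_1",
        "content": '{"temp_c": 7, "conditions": "cloudy"}',
    },
    {"role": "assistant", "content": "It's 7 °C and cloudy in Berlin."},
]

def unmatched_results(msgs: list[dict]) -> list[str]:
    """Return tool_call_ids with no corresponding assistant tool call."""
    issued = {tc["id"]
              for m in msgs if m["role"] == "assistant"
              for tc in m.get("tool_calls") or []}
    return [m["tool_call_id"] for m in msgs
            if m["role"] == "tool" and m["tool_call_id"] not in issued]

assert unmatched_results(history) == []  # every result pairs with a call
```

A validation pass like `unmatched_results` in the orchestration layer would catch the broken-pairing case before the degraded history is fed back to the model.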
// TAGS
llm · agent · prompt-engineering · automation · open-source · tool-calling
DISCOVERED
21d ago
2026-03-21
PUBLISHED
21d ago
2026-03-21
RELEVANCE
8 / 10
AUTHOR
greendude120