Tool Calling Leaks Into Chat Output
REDDIT · 21d ago · TUTORIAL


A LocalLLaMA user asks why a model sometimes writes `<tool_call>` into its normal reply instead of emitting a real executable tool call. The thread points to backend/template issues more than raw model behavior, especially in Qwen-style local stacks.
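When the tags leak, they are still machine-recognizable. A minimal sketch of a fallback parser for Qwen-style `<tool_call>` tags that escaped into the text channel (the regex and function name are illustrative, not from the thread):

```python
import json
import re

# Qwen-family templates wrap tool calls in <tool_call> tags containing JSON.
# If the runtime's parser misses them, they appear verbatim in the reply text.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)

def extract_leaked_tool_calls(text: str):
    """Split assistant text into plain narration and any leaked tool calls."""
    calls = []
    for match in TOOL_CALL_RE.finditer(text):
        try:
            calls.append(json.loads(match.group(1)))
        except json.JSONDecodeError:
            pass  # skip malformed payloads
    narration = TOOL_CALL_RE.sub("", text).strip()
    return narration, calls

reply = ('Let me check the weather. <tool_call>\n'
         '{"name": "get_weather", "arguments": {"city": "Paris"}}\n'
         '</tool_call>')
text, calls = extract_leaked_tool_calls(reply)
# text  -> "Let me check the weather."
# calls -> [{"name": "get_weather", "arguments": {"city": "Paris"}}]
```

A parser like this is a band-aid; the durable fix is making the runtime's template and tool-call parser agree, as the replies below suggest.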

// ANALYSIS

This looks less like “the model forgot how” and more like a serialization bug: if the tool call isn’t preserved as structured metadata, it gets flattened into chat text on the next pass.
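A minimal sketch of that flattening, using OpenAI-style message dicts (the field names follow the common chat-completions convention; the values are made up):

```python
# Correct: the tool call lives in structured metadata ("tool_calls"),
# separate from the assistant's text in "content".
structured_turn = {
    "role": "assistant",
    "content": "Let me look that up.",
    "tool_calls": [{
        "id": "call_1",  # hypothetical id
        "type": "function",
        "function": {"name": "search", "arguments": '{"query": "qwen"}'},
    }],
}

# Broken: a serialization pass flattened the call into plain text.
# On the next turn the model sees this as prose and imitates it literally,
# writing <tool_call> tags into its reply instead of calling the tool.
flattened_turn = {
    "role": "assistant",
    "content": 'Let me look that up. '
               '<tool_call>{"name": "search", "arguments": {}}</tool_call>',
}
```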

  • Community replies point to chat template and parser mismatches, especially when using Qwen-family models with local runtimes
  • A correct tool-call flow needs separate assistant text and tool-call objects, plus a matching tool-result turn
  • If tool-call/result pairs are not replayed exactly, later generations can degrade into plain prose or raw XML tags
  • Fixes usually live in the orchestration layer: template alignment, history replay, and strict separation between narration and action
  • Once a bad turn poisons the loop, resetting conversation state often "fixes" it temporarily, which is a strong hint the bug lives in state handling rather than in the model
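The replay requirement above can be checked mechanically. A hedged sketch of an orchestration-layer invariant (the function and field names assume OpenAI-style histories, not any specific runtime):

```python
def validate_history(messages):
    """Check that every assistant tool call has a matching tool-result turn
    before the conversation moves on to the next user message."""
    pending = set()
    for msg in messages:
        if msg["role"] == "assistant":
            for call in msg.get("tool_calls", []):
                pending.add(call["id"])
        elif msg["role"] == "tool":
            pending.discard(msg["tool_call_id"])
        elif msg["role"] == "user" and pending:
            raise ValueError(f"tool calls {pending} have no result turn")
    return not pending  # True only if every call was answered

good = [
    {"role": "user", "content": "weather?"},
    {"role": "assistant", "content": "", "tool_calls": [{"id": "c1"}]},
    {"role": "tool", "tool_call_id": "c1", "content": '{"temp": 18}'},
    {"role": "assistant", "content": "It's 18 degrees."},
]
validate_history(good)  # -> True
```

Running a check like this before each generation catches the broken replay early, instead of letting the model degrade into raw XML tags several turns later.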
// TAGS
llm · agent · prompt-engineering · automation · open-source · tool-calling

DISCOVERED

21d ago

2026-03-21

PUBLISHED

21d ago

2026-03-21

RELEVANCE

8/10

AUTHOR

greendude120