Squeez 2B compresses agent tool output by 92%
REDDIT // 3d ago // MODEL RELEASE

Squeez is a fine-tuned Qwen 3.5 2B model that filters raw terminal output (test failures, build logs, and similar) and keeps only the lines relevant to an agent's current task. It reports 92% compression at 86% recall, which cuts context bloat in autonomous coding workflows.

// ANALYSIS

Context bloat is the silent killer of autonomous coding agents, and general-purpose LLMs are surprisingly bad at extracting needles from unstructured log haystacks. Squeez tackles this by treating tool output pruning as a specialized, task-conditioned preprocessing step.
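
To make "task-conditioned preprocessing" concrete, here is a minimal sketch of where such a step could sit in an agent loop. Everything in it is an assumption for illustration: `Pruner`, `keyword_pruner`, and `add_tool_output` are hypothetical names, and the keyword heuristic is only a stand-in for a trained model, not how Squeez works.

```python
from typing import Callable, List

# Hypothetical interface: a pruner maps (task, raw tool output) to the kept text.
Pruner = Callable[[str, str], str]


def keyword_pruner(task: str, raw_output: str) -> str:
    """Stand-in pruner: keep lines containing any task word of 4+ characters.

    This is NOT Squeez's method; a fine-tuned model would replace this heuristic.
    """
    task_words = {w.lower() for w in task.split() if len(w) >= 4}
    kept = [
        line
        for line in raw_output.splitlines()
        if any(w in line.lower() for w in task_words)
    ]
    return "\n".join(kept)


def add_tool_output(
    context: List[str], task: str, raw_output: str, prune: Pruner
) -> None:
    """Prune tool output against the current task before it enters the context."""
    context.append(prune(task, raw_output))


# Toy test log (made up for this example).
raw = "\n".join(
    [
        "collected 3 items",
        "tests/test_auth.py::test_login FAILED",
        "tests/test_db.py::test_connect PASSED",
        "AssertionError: login returned 401",
    ]
)
context: List[str] = []
add_tool_output(context, "fix the login failure", raw, keyword_pruner)
print(context[0])
```

The point of the design is that the pruner receives the task along with the output, so the same log can be cut differently depending on what the agent is trying to do.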

  • Unlike semantic pruners built for source code, Squeez handles the chaotic, mixed format of terminal outputs like stack traces, git logs, and grep matches
  • At 2B parameters, it beats zero-shot Qwen 3.5 35B A3B on held-out recall, which supports task-specific fine-tuning for agent sub-components
  • A 92% compression rate allows agents to run broad, exploratory commands without blowing up context limits or degrading downstream reasoning
  • It drops cleanly into existing pipelines as a CLI pipe, Python library, or a fast vLLM server
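
The post does not say how compression and recall are measured, so the sketch below assumes line-level definitions: compression is the fraction of raw lines removed, and recall is the fraction of relevant lines that survive. The function names and the toy numbers are hypothetical, chosen only to land near the headline figures; they are not benchmark data.

```python
from typing import Set


def compression(total_lines: int, kept: Set[int]) -> float:
    """Fraction of raw lines removed by the filter (assumed definition)."""
    if total_lines == 0:
        return 0.0
    return 1.0 - len(kept) / total_lines


def recall(relevant: Set[int], kept: Set[int]) -> float:
    """Fraction of relevant line indices that survived filtering (assumed definition)."""
    if not relevant:
        return 1.0
    return len(relevant & kept) / len(relevant)


# Toy example: a 100-line log with 7 relevant lines; the filter keeps 8 lines,
# 6 of which are relevant.
relevant = {3, 10, 42, 57, 58, 77, 90}
kept = {3, 10, 42, 57, 58, 77, 5, 6}

print(f"compression: {compression(100, kept):.2f}")
print(f"recall: {recall(relevant, kept):.2f}")
```

Under these definitions the two numbers pull against each other: keeping fewer lines raises compression but risks dropping a relevant one, which is why both are quoted together.
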
// TAGS
squeez · agent · ai-coding · devtool · open-weights

DISCOVERED: 2026-04-08 (3d ago)
PUBLISHED: 2026-04-08 (3d ago)
RELEVANCE: 9/10
AUTHOR: henzy123