SmallHarness launches version 1.0 with model routing, followed by a 1.0.1 update that adds active effort routing to control reasoning depth for local and cloud LLMs.
SmallHarness, a terminal-based developer tool for running agentic LLMs on local hardware and cloud APIs, has reached version 1.0.0, quickly followed by a 1.0.1 update that introduces active effort routing. The tool supports multiple backends, including Ollama, LM Studio, MLX, llama.cpp, and OpenRouter, and gates filesystem and shell commands behind interactive approval prompts and diff previews. Active effort routing analyzes task complexity and automatically sets the reasoning effort level (from minimal to max) sent to providers such as OpenRouter and OpenAI, letting users trade response quality against cost and latency.
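To make the idea of effort routing concrete, here is a minimal sketch of how a task might be mapped to an effort level. It is not SmallHarness's implementation: the intermediate levels between "minimal" and "max", the keyword list, the word-count thresholds, and the function names `complexity_score` and `choose_effort` are all assumptions made for illustration.

```python
# Assumed ladder: the source names only the endpoints "minimal" and "max";
# the levels in between are hypothetical.
EFFORT_LEVELS = ["minimal", "low", "medium", "high", "max"]

# Hypothetical signals that a task needs deeper reasoning.
HARD_KEYWORDS = ("refactor", "debug", "architecture", "migrate", "prove")


def complexity_score(task: str) -> int:
    """Return a rough, non-negative complexity score for a task description."""
    score = 0
    word_count = len(task.split())
    # Longer prompts tend to describe larger tasks (thresholds are arbitrary).
    if word_count > 30:
        score += 1
    if word_count > 120:
        score += 1
    # Each distinct "hard" keyword found adds one point.
    lowered = task.lower()
    score += sum(1 for keyword in HARD_KEYWORDS if keyword in lowered)
    return score


def choose_effort(task: str) -> str:
    """Clamp the score to the ladder and return the matching effort label."""
    index = min(complexity_score(task), len(EFFORT_LEVELS) - 1)
    return EFFORT_LEVELS[index]
```

Under these assumptions, a short request such as "list files" stays at the lowest level, while a prompt that mentions several of the keywords climbs toward the top of the ladder. A real router would presumably use richer signals than keyword matching.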
While mainstream agent frameworks grow increasingly complex and bloated, SmallHarness demonstrates that a lightweight, Rust-powered TUI can provide a faster, safer, and more transparent environment for local and cloud LLMs.
* Active Effort Routing: Automatically scales reasoning effort (from minimal to max) based on task complexity, helping developers manage API costs and local resources.
* Flexible Backend Support: Out-of-the-box routing for Ollama, LM Studio, MLX, llama.cpp, and OpenRouter makes it easy to switch between offline and online models.
* Secure Agentic Tools: Approval gates and diff previews for filesystem and terminal commands protect the user from destructive operations.
* Fault-Tolerant Parsing: An inline JSON detector extracts tool calls reliably even when smaller local LLMs struggle with formatting constraints.
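The fault-tolerant parsing point above can be illustrated with a short sketch that scans free-form model output for embedded JSON objects. This is one plausible approach, not the tool's actual detector: the function name `extract_tool_calls` and the assumed tool-call shape (an object with a `name` string key and an `arguments` object) are hypothetical.

```python
import json


def extract_tool_calls(text: str) -> list[dict]:
    """Find JSON objects embedded in prose that look like tool calls.

    Assumed (hypothetical) schema: {"name": ..., "arguments": {...}}.
    """
    decoder = json.JSONDecoder()
    calls = []
    pos = 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            # raw_decode parses one JSON value starting at `start` and
            # reports the index where it ended, ignoring trailing text.
            obj, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            pos = start + 1
            continue
        if (
            isinstance(obj, dict)
            and "name" in obj
            and isinstance(obj.get("arguments"), dict)
        ):
            calls.append(obj)
        pos = end
    return calls
```

For example, given a reply where the model wraps a call in chatter and leaves a stray brace, only the well-formed object is returned:

```python
reply = 'Sure, reading it now. {"name": "read_file", "arguments": {"path": "a.txt"}} Done {oops'
print(extract_tool_calls(reply))
```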
DISCOVERED
2026-06-15
PUBLISHED
2026-06-15
AUTHOR
morganlinton