llama_moe_optimiser automates llama.cpp MoE n-cpu-moe and batch-size sweeps
OPEN_SOURCE ↗
REDDIT // 21d ago · OPEN-SOURCE RELEASE

A Windows PowerShell utility that benchmarks llama.cpp MoE configurations by sweeping `--n-cpu-moe` and batch size under a VRAM cap, then ranks the best runs by metrics such as time-to-finish, prompt throughput, or output throughput. It drives `llama-server`/`llama-bench` under the hood, binary-searches for the largest batch sizes that fit within the VRAM limit, and exports CSVs plus per-run logs for the top results.

// ANALYSIS

Handy niche tooling for people squeezing performance out of local MoE models, especially when manual tuning turns into endless reruns. It is more of a focused optimizer than a general benchmark suite.

  • Strong fit for llama.cpp users who want repeatable MoE tuning instead of hand-testing combinations.
  • The binary-sweep approach is a sensible way to avoid brute-forcing every setting while still respecting VRAM limits.
  • Windows-only PowerShell scope keeps it practical for the target audience, but narrows adoption.
  • No Product Hunt listing found.
// TAGS
llama.cpp · moe · benchmarking · powershell · local-llm · qwen · performance-tuning

DISCOVERED

21d ago

2026-03-21

PUBLISHED

21d ago

2026-03-21

RELEVANCE

6 / 10

AUTHOR

TheLastSpark