YOU ARE VIEWING ONE ITEM FROM THE AICRIER FEED

llama_moe_optimiser automates llama.cpp MoE n-cpu-moe and batch-size sweeps

// 67d ago · OPENSOURCE RELEASE

llama_moe_optimiser is a Windows PowerShell utility that benchmarks llama.cpp MoE configurations by sweeping `--n-cpu-moe` and batch size under a VRAM cap, then ranks the best runs by a chosen metric such as time-to-finish, prompt throughput, or output throughput. It relies on `llama-server`/`llama-bench` behavior under the hood, uses binary search to find the batch-size range that fits within the cap, and exports CSVs plus per-run logs for the top results.
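
The binary-search step can be illustrated with a minimal Python sketch. This is not the tool's actual code (the tool itself is PowerShell): the function names, the MiB units, and the toy memory model are assumptions made for illustration. The sketch also assumes VRAM use never decreases as batch size grows, which is what makes binary search valid here.

```python
from typing import Callable, Optional


def largest_fitting_batch(
    vram_used_mib: Callable[[int], float],
    vram_cap_mib: float,
    lo: int = 1,
    hi: int = 4096,
) -> Optional[int]:
    """Return the largest batch size in [lo, hi] that fits under the cap.

    Assumes vram_used_mib is non-decreasing in batch size.
    Returns None if even the smallest batch size does not fit.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if vram_used_mib(mid) <= vram_cap_mib:
            best = mid      # mid fits, so try a larger batch
            lo = mid + 1
        else:
            hi = mid - 1    # mid exceeds the cap, so try a smaller batch
    return best


# Hypothetical stand-in for a real measurement (which would launch a run
# and read back peak VRAM): 6000 MiB base plus 2 MiB per unit of batch size.
def fake_vram_used_mib(batch_size: int) -> float:
    return 6000 + 2 * batch_size


if __name__ == "__main__":
    print(largest_fitting_batch(fake_vram_used_mib, vram_cap_mib=8000))
```

Each probe costs one benchmark launch, so searching 4096 candidate batch sizes this way takes about 12 probes rather than 4096.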

// ANALYSIS

Handy niche tooling for people squeezing performance out of local MoE models, especially when manual tuning turns into endless reruns. It is more of a focused optimizer than a general benchmark suite.

  • Strong fit for llama.cpp users who want repeatable MoE tuning instead of hand-testing combinations.
  • The binary-search approach is a sensible way to avoid brute-forcing every setting while still respecting VRAM limits.
  • Windows-only PowerShell scope keeps it practical for the target audience, but narrows adoption.
  • No Product Hunt listing found.
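
The ranking and CSV export described in the summary could look roughly like the Python sketch below. The record fields, metric names, and sample numbers are hypothetical placeholders, not the tool's real schema or measured results.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class Run:
    # Hypothetical record layout for one benchmark run.
    n_cpu_moe: int
    batch_size: int
    prompt_tps: float      # prompt throughput, tokens/s
    output_tps: float      # output throughput, tokens/s
    total_seconds: float   # time-to-finish


# Metric name -> True when a larger value is better.
METRICS = {"prompt_tps": True, "output_tps": True, "total_seconds": False}


def rank_runs(runs: List[Run], metric: str, top_n: int = 3) -> List[Run]:
    """Sort runs best-first by the chosen metric and keep the top_n."""
    higher_is_better = METRICS[metric]
    ordered = sorted(runs, key=lambda r: getattr(r, metric),
                     reverse=higher_is_better)
    return ordered[:top_n]


def to_csv(runs: List[Run]) -> str:
    """Serialize runs as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(Run)],
                            lineterminator="\n")
    writer.writeheader()
    for run in runs:
        writer.writerow(asdict(run))
    return buf.getvalue()


# Made-up numbers for illustration only; not measurements.
SAMPLE_RUNS = [
    Run(20, 512, 410.0, 18.5, 62.0),
    Run(24, 1024, 520.0, 17.0, 58.0),
    Run(28, 2048, 480.0, 15.2, 66.0),
]

if __name__ == "__main__":
    print(to_csv(rank_runs(SAMPLE_RUNS, "total_seconds", top_n=2)))
```

Storing the sort direction next to each metric name lets one function handle both "lower is better" (time-to-finish) and "higher is better" (throughput) rankings.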
// TAGS
llama.cpp, moe, benchmarking, powershell, local-llm, qwen, performance-tuning

DISCOVERED

67d ago

2026-03-21

PUBLISHED

67d ago

2026-03-21

RELEVANCE

6/10

AUTHOR

TheLastSpark