llama_moe_optimiser automates llama.cpp MoE n-cpu-moe and batch-size sweeps

// 112d ago OPENSOURCE RELEASE

A Windows PowerShell utility that benchmarks llama.cpp MoE configurations by sweeping `--n-cpu-moe` and batch size under a VRAM cap, then ranks the best runs by a chosen metric such as time-to-finish, prompt throughput, or output throughput. It relies on `llama-server`/`llama-bench` under the hood, uses binary search to find the range of batch sizes that fit, and exports CSVs plus per-run logs for the top results.
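To make the binary-search step concrete, here is a minimal Python sketch of finding the largest batch size that stays under a VRAM cap. It is not the tool's own code (the tool is written in PowerShell): the function names, the 16000 MiB cap, and the toy VRAM model are all assumptions for illustration, and the search only works if VRAM use grows monotonically with batch size.

```python
from typing import Callable, Optional


def largest_fitting_batch(lo: int, hi: int, fits: Callable[[int], bool]) -> Optional[int]:
    """Return the largest batch size in [lo, hi] for which fits() is True.

    Assumes fits() is monotonic: once a batch size stops fitting,
    every larger one also fails. Returns None if even lo does not fit.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if fits(mid):
            best = mid      # mid fits; try something larger
            lo = mid + 1
        else:
            hi = mid - 1    # mid is too big; shrink the range
    return best


def toy_vram_mib(batch: int, n_cpu_moe: int) -> float:
    """Hypothetical VRAM model, for illustration only (not measured)."""
    base = 20000.0 - 400.0 * n_cpu_moe   # keeping more expert layers on CPU frees VRAM
    return base + 1.5 * batch            # buffers grow with batch size


VRAM_CAP_MIB = 16000.0  # assumed cap

# One binary search per n-cpu-moe value instead of testing every batch size.
for n_cpu_moe in (8, 12, 16):
    best = largest_fitting_batch(
        64, 4096, lambda b: toy_vram_mib(b, n_cpu_moe) <= VRAM_CAP_MIB
    )
    print(f"n_cpu_moe={n_cpu_moe}: largest fitting batch = {best}")
```

In a real sweep the `fits` callback would launch a run and check measured VRAM rather than call a formula; the search then needs roughly log2(range) launches per `--n-cpu-moe` value.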

// ANALYSIS

Handy niche tooling for people squeezing performance out of local MoE models, especially when manual tuning turns into endless reruns. It is more of a focused optimizer than a general benchmark suite.

– Strong fit for llama.cpp users who want repeatable MoE tuning instead of hand-testing combinations.
– The binary-search sweep is a sensible way to avoid brute-forcing every setting while still respecting VRAM limits.
– The Windows-only PowerShell scope keeps it practical for the target audience, but narrows adoption.
– No Product Hunt listing found.
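The ranking and export step of such a tuning loop can be sketched in a few lines of Python. This is an illustrative assumption rather than the tool's actual output format: the `Run` fields, the metric names, and the sample numbers are all made up for the example.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class Run:
    n_cpu_moe: int
    batch: int
    prompt_tps: float   # prompt throughput, tokens/s
    output_tps: float   # output throughput, tokens/s
    total_s: float      # time to finish, seconds


# Metric name -> True if larger values are better.
HIGHER_IS_BETTER = {"prompt_tps": True, "output_tps": True, "total_s": False}


def rank_runs(runs, metric, top=3):
    """Return the best `top` runs for the chosen metric."""
    ordered = sorted(
        runs,
        key=lambda r: getattr(r, metric),
        reverse=HIGHER_IS_BETTER[metric],
    )
    return ordered[:top]


def to_csv(runs):
    """Serialise runs to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(Run)])
    writer.writeheader()
    for r in runs:
        writer.writerow(asdict(r))
    return buf.getvalue()


# Made-up sample results, for illustration only.
runs = [
    Run(n_cpu_moe=12, batch=512, prompt_tps=410.0, output_tps=21.5, total_s=38.2),
    Run(n_cpu_moe=16, batch=1536, prompt_tps=655.0, output_tps=18.1, total_s=33.9),
    Run(n_cpu_moe=20, batch=2048, prompt_tps=690.0, output_tps=15.4, total_s=36.0),
]

print(to_csv(rank_runs(runs, "total_s", top=2)))
```

Keeping the "higher is better" direction in one table means the same sort handles both throughput metrics and time-to-finish.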

// TAGS

llama.cpp, moe, benchmarking, powershell, local-llm, qwen, performance-tuning

DISCOVERED

112d ago

2026-03-21

PUBLISHED

112d ago

2026-03-21

RELEVANCE

6/10

AUTHOR

TheLastSpark
