nvidia-smi clock locking fixes Windows multi-GPU lag
Windows power management downclocks idle secondary GPUs during pipeline-parallel inference, causing significant performance drops. Manually locking GPU and memory clocks via nvidia-smi keeps the cards in a high-performance state, so tokens-per-second stays consistent.
Windows remains a challenging environment for multi-GPU local LLM setups because its power management is aggressive and tuned for gaming. Pipeline parallelism leaves each card briefly idle while it waits for its layers' turn, and those micro-gaps are enough to trigger power-saving states, making clock instability a more frequent performance killer than PCIe bottlenecks. The fix matters most for mixed rigs and non-NVLink setups, where synchronization overhead is higher.
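As a rough illustration of the clock-locking step, the Python sketch below builds the nvidia-smi invocations rather than reproducing the author's exact commands. It relies on the `-lgc` (`--lock-gpu-clocks`), `-lmc` (`--lock-memory-clocks`), `-rgc` and `-rmc` options, which need an elevated prompt and are not supported on every GPU or driver. The helper names and the MHz values are hypothetical placeholders; valid values for a given card can be listed with `nvidia-smi -q -d SUPPORTED_CLOCKS`.

```python
import subprocess


def build_lock_commands(gpu_ids, gpu_mhz, mem_mhz):
    """Return nvidia-smi argv lists that pin core and memory clocks per GPU.

    gpu_mhz and mem_mhz are (min, max) tuples in MHz. The values passed in
    the example at the bottom are placeholders, not recommendations.
    """
    commands = []
    for gpu in gpu_ids:
        # -lgc locks the graphics (core) clock to the given min,max range.
        commands.append(
            ["nvidia-smi", "-i", str(gpu), "-lgc", f"{gpu_mhz[0]},{gpu_mhz[1]}"]
        )
        # -lmc locks the memory clock to the given min,max range.
        commands.append(
            ["nvidia-smi", "-i", str(gpu), "-lmc", f"{mem_mhz[0]},{mem_mhz[1]}"]
        )
    return commands


def build_reset_commands(gpu_ids):
    """Return nvidia-smi argv lists that restore default clock behaviour."""
    commands = []
    for gpu in gpu_ids:
        commands.append(["nvidia-smi", "-i", str(gpu), "-rgc"])
        commands.append(["nvidia-smi", "-i", str(gpu), "-rmc"])
    return commands


def run_commands(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(" ".join(argv))
        if not dry_run:
            # Raises CalledProcessError if nvidia-smi rejects the request.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run only: shows what would be executed for GPUs 0 and 1.
    run_commands(build_lock_commands([0, 1], (1800, 1800), (9501, 9501)))
```

Running with `dry_run=True` first makes it easy to check the commands before applying them, and `build_reset_commands` gives a way back to the default power behaviour once inference is finished.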
DISCOVERED
10d ago
2026-04-01
PUBLISHED
10d ago
2026-04-01
AUTHOR
dero_name