Dual-socket LM Studio setup exposes Windows scheduling limits
A LocalLLaMA user reports that a dual-Xeon server with two RTX 3060 Ti GPUs, running Windows 11 Enterprise, uses only one CPU socket and one GPU in LM Studio, then fails at around 140 GB of RAM when loading larger models. The post frames this as a likely NUMA and multi-device scheduling problem and considers Ubuntu Desktop as a workaround.
This looks less like weak hardware and more like a gap in how the software stack schedules work across sockets and devices.
- The pattern (CPU 0/GPU 0 usage plus memory exhaustion behavior) suggests affinity, NUMA locality, or runtime backend limits rather than dead components.
- Server-class topology with split PCIe lanes can expose limitations that do not appear on single-socket desktop systems.
- Linux may improve observability and multi-device behavior for local inference, but BIOS settings and LM runtime configuration remain critical.
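To illustrate the observability point, here is a minimal Python sketch of how NUMA layout and CPU affinity could be inspected on Linux, assuming the standard sysfs tree at /sys/devices/system/node is present. The helper names (parse_cpulist, numa_topology, allowed_cpus) are hypothetical and chosen for illustration. The script does not query LM Studio or the GPUs; it only shows which CPUs belong to each NUMA node and how many of them the current process is allowed to run on.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8-9' into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def numa_topology(sysfs_root="/sys/devices/system/node"):
    """Map NUMA node id -> CPU ids; returns {} if the sysfs tree is absent."""
    topology = {}
    for path in glob.glob(os.path.join(sysfs_root, "node[0-9]*", "cpulist")):
        match = re.search(r"node(\d+)$", os.path.dirname(path))
        if match is None:
            continue
        with open(path) as handle:
            topology[int(match.group(1))] = parse_cpulist(handle.read())
    return topology


def allowed_cpus():
    """CPUs this process may run on (falls back to a plain count elsewhere)."""
    if hasattr(os, "sched_getaffinity"):  # not available on every platform
        return sorted(os.sched_getaffinity(0))
    return list(range(os.cpu_count() or 1))


if __name__ == "__main__":
    allowed = set(allowed_cpus())
    for node, cpus in sorted(numa_topology().items()):
        usable = len(allowed.intersection(cpus))
        print(f"node {node}: {len(cpus)} CPUs, {usable} usable by this process")
```

If a two-socket machine reports two nodes but the process can use CPUs from only one of them, that points toward an affinity or launch-configuration limit rather than a hardware fault.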
DISCOVERED
2026-03-14
PUBLISHED
2026-03-14
AUTHOR
doge-king-2021
