REDDIT // 29d ago // INFRASTRUCTURE
Dual-socket LM Studio setup exposes Windows scheduling limits
A LocalLLaMA user reports that a dual-Xeon, dual RTX 3060 Ti server running Windows 11 Enterprise uses only one CPU socket and one GPU in LM Studio, then fails at around 140 GB of RAM when loading larger models. The post frames this as a likely NUMA and multi-device scheduling problem and considers Ubuntu Desktop as a workaround.
// ANALYSIS
This looks less like weak hardware and more like a stack-level multi-socket orchestration gap.
- The pattern (only CPU 0 and GPU 0 in use, plus memory exhaustion on larger models) suggests affinity, NUMA locality, or runtime backend limits rather than dead components.
- Server-class topology with PCIe lanes split across sockets can expose limitations that do not appear on single-socket desktop systems.
- Linux may improve observability and multi-device behavior for local inference, but BIOS settings and LM Studio runtime configuration remain critical.
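As a rough illustration of the observability point, the sketch below shows one way to check on Linux whether a process is allowed to run on CPUs from every NUMA node. It is not taken from the post: it assumes the standard Linux sysfs layout under `/sys/devices/system/node`, and the helper names (`parse_cpulist`, `read_node_cpus`, `nodes_in_use`) are hypothetical. It says nothing about how LM Studio itself schedules work.

```python
import glob
import os


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a memory-only node
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_node_cpus(root="/sys/devices/system/node"):
    """Map NUMA node id -> CPU set from sysfs; empty dict if the path is absent."""
    node_cpus = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*", "cpulist")):
        node_name = os.path.basename(os.path.dirname(path))  # e.g. "node0"
        with open(path) as fh:
            node_cpus[int(node_name[4:])] = parse_cpulist(fh.read())
    return node_cpus


def nodes_in_use(allowed, node_cpus):
    """Return the sorted NUMA node ids that own at least one allowed CPU."""
    return sorted(n for n, cpus in node_cpus.items() if cpus & allowed)


if __name__ == "__main__":
    node_cpus = read_node_cpus()
    if not node_cpus or not hasattr(os, "sched_getaffinity"):
        print("NUMA sysfs or sched_getaffinity unavailable (not Linux?)")
    else:
        allowed = os.sched_getaffinity(0)  # CPUs this process may run on
        print("NUMA nodes:", sorted(node_cpus))
        print("Nodes reachable by this process:", nodes_in_use(allowed, node_cpus))
```

If the second list is shorter than the first, the process is pinned to a subset of sockets, which would be consistent with the single-socket usage described in the post. A full list only shows that affinity is not the restriction; it does not show that the inference backend actually spreads threads or memory across nodes.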
// TAGS
lm-studio, local-llm, windows-11, numa, multi-gpu
DISCOVERED
29d ago
2026-03-14
PUBLISHED
29d ago
2026-03-14
RELEVANCE
6/10
AUTHOR
doge-king-2021