LM Studio slows after AM5 upgrade
A Reddit user says LM Studio feels slower after moving from a Ryzen 9 5950X / X570 / DDR4 setup to a Ryzen 9 9950X / B850 / DDR5 system, even though the RTX 4080 stayed the same and token throughput appears unchanged. The slowdown shows up mainly in the request start/stop phase and in handling four parallel requests, which suggests a latency or batching issue rather than a pure generation-speed regression. The thread asks whether AM5 or DDR5 could be involved.
Hot take: this looks more like request orchestration latency than raw inference performance, so the problem is probably in LM Studio's queueing/batching path, runtime settings, or memory pressure rather than AM5 itself.
- –Tokens/sec staying roughly the same is a strong hint that decode throughput is fine once the model is running.
- –The pain point is the first-token and request handoff path, especially under 4-way parallel load.
- –The downgrade from 64GB DDR4 to 32GB DDR5 is a more plausible regression factor than the platform swap alone, especially if other apps or the OS are competing for memory.
- –LM Studio's parallel request behavior depends on its batching and runtime configuration, so a changed default there could explain the new latency profile.
- –The most likely checks are model load/offload settings, max concurrent predictions, runtime version, and whether the app is now queueing requests instead of batching them.
DISCOVERED
2d ago
2026-04-09
PUBLISHED
2d ago
2026-04-09
RELEVANCE
AUTHOR
VirtualForge