OPEN_SOURCE
REDDIT // 3h ago // TUTORIAL
DeepSeek-R1 Runs 30 Minutes Uncapped in LM Studio
A Reddit user reports that DeepSeek-R1-0528-Qwen3-8B-Q4_K_M running in LM Studio kept “thinking” for roughly 30 minutes before they manually stopped it, and asks how to cap reasoning at around 2 minutes. The post is really about controlling the runtime behavior of local reasoning models, not a product launch.
// ANALYSIS
Hot take: this is usually a configuration problem, not the model “going crazy.”
- Local reasoning models can keep producing chain-of-thought until they hit a token limit or an external stop condition, so an uncapped session can run far longer than expected.
- The most reliable mitigation is a hard cap on output tokens or a wall-clock timeout, set in LM Studio or in the serving layer.
- Prompting for “brief reasoning” can help, but it is not a hard guarantee; the model may still ramble if the stop conditions are loose.
- For something closer to a 2-minute ceiling, use a smaller context window, a lower max-token setting, and a stop rule in the client rather than relying on the prompt alone.
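The client-side stop rule described above can be sketched as a loop over a streamed response that bails out on either a token cap or a wall-clock deadline. This is a minimal illustration, not LM Studio's actual API: `fake_stream` stands in for a real streamed completion, and `MAX_TOKENS` / `TIME_BUDGET_S` are hypothetical values you would tune.

```python
import time

MAX_TOKENS = 2048      # hard cap on generated tokens (illustrative value)
TIME_BUDGET_S = 120.0  # ~2-minute wall-clock ceiling (illustrative value)

def fake_stream():
    """Stand-in for a streamed chat completion; yields one token at a time."""
    while True:
        yield "token "

def capped_generation(stream, max_tokens=MAX_TOKENS, time_budget=TIME_BUDGET_S):
    """Collect tokens until either the token cap or the time budget is hit."""
    deadline = time.monotonic() + time_budget
    out = []
    for i, tok in enumerate(stream):
        if i >= max_tokens or time.monotonic() > deadline:
            break  # stop condition reached: abandon the rest of the stream
        out.append(tok)
    return "".join(out), len(out)

text, n_tokens = capped_generation(fake_stream(), max_tokens=5, time_budget=1.0)
```

With a real server, the same idea applies: request a streamed response with a `max_tokens`-style parameter set, and enforce the deadline in the consuming loop so a runaway reasoning trace is cut off even if the server-side cap is generous.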
// TAGS
deepseek · lm studio · local llm · reasoning model · qwen · inference · quantization
DISCOVERED
3h ago
2026-04-16
PUBLISHED
21h ago
2026-04-16
RELEVANCE
6/10
AUTHOR
XEUIPR