Mistral NeMo "goldfish memory" hits Ollama users
OPEN_SOURCE
REDDIT · 17d ago · TUTORIAL


Local LLM users report context "forgetfulness" with Mistral NeMo on Ollama, traced to Ollama's default 2,048-token context limit, which silently truncates earlier conversation turns despite the model's 128k-token capacity.

// ANALYSIS

This highlights the disconnect between model capabilities and default infrastructure settings in the local AI stack.

  • Ollama defaults to a 2,048-token context for stability and memory safety, effectively capping the 128k-capable Mistral NeMo.
  • Users must manually raise `num_ctx` in a Modelfile, per request via the API, or in OpenWebUI settings to unlock the full window.
  • Larger context windows significantly increase VRAM usage, since the KV cache grows linearly with context length, making this a hardware-dependent trade-off.
  • This serves as a reminder that local AI remains a "some assembly required" experience for power users.
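The override described above can be applied persistently via a Modelfile (`FROM mistral-nemo`, `PARAMETER num_ctx 32768`, then `ollama create`) or per request through Ollama's REST API. A minimal sketch of the per-request approach, assuming a local Ollama install exposing the default `/api/generate` endpoint and the `mistral-nemo` model tag; 32768 is an illustrative value, not a recommendation, since the right ceiling depends on available VRAM:

```python
import json

def build_generate_payload(prompt: str, num_ctx: int = 32768) -> str:
    """Build a JSON body for Ollama's /api/generate that overrides
    the default 2,048-token context window via options.num_ctx."""
    payload = {
        "model": "mistral-nemo",          # Ollama's tag for Mistral NeMo
        "prompt": prompt,
        "options": {"num_ctx": num_ctx},  # raise the context window per request
    }
    return json.dumps(payload)

# POST this body to http://localhost:11434/api/generate on a running
# Ollama instance; here we only construct the payload.
body = build_generate_payload("Summarize the discussion so far.")
```

Note that `num_ctx` set this way applies only to the single request; the Modelfile route bakes it into a derived model so every client gets the larger window.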
// TAGS
mistral-nemo · ollama · open-webui · llm · self-hosted · prompt-engineering · inference

DISCOVERED

2026-03-26 (17d ago)

PUBLISHED

2026-03-26 (17d ago)

RELEVANCE

8/10

AUTHOR

Plus_House_1078