OPEN_SOURCE ↗
REDDIT · 17d ago · TUTORIAL
Mistral NeMo "goldfish memory" hits Ollama users
Local LLM users report context "forgetfulness" with Mistral NeMo on Ollama, traced back to a default 2,048 token context limit that ignores the model's 128k capacity.
// ANALYSIS
This highlights the disconnect between model capabilities and default infrastructure settings in the local AI stack.
- Ollama defaults to a 2,048-token context for stability, effectively crippling Mistral NeMo's 128k window.
- Users must manually override `num_ctx` in a Modelfile or in Open WebUI's model settings to unlock the full context.
- Larger context windows significantly increase VRAM usage, making this a hardware-dependent trade-off.
- A reminder that local AI remains a "some assembly required" experience for power users.
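The `num_ctx` override described above can be sketched as a Modelfile (the derived model name and the 32,768 value are illustrative choices; the practical ceiling depends on available VRAM):

```
FROM mistral-nemo
PARAMETER num_ctx 32768
```

Build and run the derived model with `ollama create mistral-nemo-32k -f Modelfile` followed by `ollama run mistral-nemo-32k`. The same parameter can also be set per request via the `options` field of the Ollama API, or per model in Open WebUI's advanced parameters.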
// TAGS
mistral-nemo · ollama · open-webui · llm · self-hosted · prompt-engineering · inference
DISCOVERED
17d ago
2026-03-26
PUBLISHED
17d ago
2026-03-26
RELEVANCE
8/10
AUTHOR
Plus_House_1078