OPEN_SOURCE ↗
REDDIT // 17d ago · TUTORIAL
Ollama, Continue top local M3 Air stack
The LocalLLaMA thread converges on a practical local stack for a 24GB M3 Air: Ollama for serving models, Continue for free IDE help, and LM Studio if you want a simpler standalone GUI. Most commenters steer toward 7B-14B quantized picks like Qwen2.5-Coder-14B rather than brute-forcing giant models into unified memory.
// ANALYSIS
On a 24GB M3 Air, the smartest move isn't the biggest model but the cleanest workflow. You'll usually get more mileage from a lightweight runtime and tight editor integration than from trying to squeeze a giant model into a fanless laptop.
- Ollama is the easiest free backend and has the broadest ecosystem for local apps, agents, and terminal-first workflows.
- LM Studio is the friendliest standalone app, though some Apple Silicon users prefer MLX-native wrappers if they want maximum speed.
- Continue is the best no-cost IDE layer if the real goal is coding help without paying for a full AI editor subscription.
- The thread's model advice stays in the 7B-14B range, with quantized coder models like Qwen2.5-Coder-14B in Q5 feeling like the sweet spot.
- A minority view says the fanless Air is still the wrong machine for serious local inference, so free cloud LLMs may win if latency matters more than privacy.
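The recommended stack boils down to a few commands. A minimal sketch, assuming Ollama is installed from ollama.com and that a `qwen2.5-coder:14b` tag is available in the Ollama library (exact quantization tags vary, so check the model's page):

```shell
# Pull a quantized coder model in the 7B-14B range the thread recommends.
# The default tag ships a mid-range quantization; a specific Q5 variant,
# if published, would be a longer tag on the same model page.
ollama pull qwen2.5-coder:14b

# Quick smoke test from the terminal.
ollama run qwen2.5-coder:14b "Write a Python function that reverses a string."

# Ollama serves an HTTP API on localhost:11434 by default, which is what
# Continue's Ollama provider connects to. A minimal model entry in
# ~/.continue/config.json (older JSON format; newer Continue releases use YAML)
# looks roughly like:
#   { "title": "Qwen2.5 Coder 14B", "provider": "ollama", "model": "qwen2.5-coder:14b" }
```

At Q5-level quantization, a 14B model's weights occupy on the order of 10 GB, which is why this size range leaves workable headroom in 24GB of unified memory while larger models do not.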
// TAGS
ollama · lm-studio · continue · llm · ai-coding · inference · self-hosted · ide
DISCOVERED
17d ago
2026-03-26
PUBLISHED
17d ago
2026-03-25
RELEVANCE
8/10
AUTHOR
ygzasln