Ollama 0.19 boosts Apple Silicon with MLX

// 57d ago · INFRASTRUCTURE

Ollama 0.19 rebuilds its Apple Silicon runtime on MLX, delivering a noticeable speedup for local inference on Macs. The release also adds NVFP4 support and smarter cache reuse, which should make coding agents and branching sessions feel much more responsive.
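One way to check the speedup on your own machine is to compare the timing fields that Ollama returns with a non-streaming `/api/generate` response (`load_duration`, `prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`, all durations in nanoseconds). The sketch below is a minimal, hedged example: the helper name `throughput` is hypothetical, the sample numbers are made up for illustration and are not benchmark results, and the time-to-first-token figure is only an approximation (load plus prompt evaluation).

```python
def throughput(resp: dict) -> dict:
    """Derive rough speed figures from an Ollama generate response dict.

    Durations are reported in nanoseconds. Missing or zero fields (for
    example when the prompt was served from cache) yield None instead of
    raising a division error.
    """
    ns = 1e9

    def rate(count_key: str, duration_key: str):
        count = resp.get(count_key)
        duration = resp.get(duration_key)
        if not count or not duration:
            return None
        return count / (duration / ns)

    return {
        # Prompt processing speed (prefill), tokens per second.
        "prefill_tok_per_s": rate("prompt_eval_count", "prompt_eval_duration"),
        # Steady-state generation speed (decode), tokens per second.
        "decode_tok_per_s": rate("eval_count", "eval_duration"),
        # Approximation only: model load time plus prompt evaluation time.
        "approx_ttft_s": (resp.get("load_duration", 0)
                          + resp.get("prompt_eval_duration", 0)) / ns,
    }


# Made-up sample values, for illustration only (not measured results).
sample = {
    "load_duration": 500_000_000,
    "prompt_eval_count": 1000,
    "prompt_eval_duration": 2_000_000_000,
    "eval_count": 300,
    "eval_duration": 6_000_000_000,
}
stats = throughput(sample)
```

Running the same prompt against the old and new runtime and comparing the two result dicts gives a like-for-like view of prefill and decode speed, rather than relying on perceived responsiveness.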

// ANALYSIS

This is a real infrastructure upgrade, not a cosmetic release: Ollama is making Apple Silicon feel like a first-class local inference platform again, and the cache work may matter almost as much as the raw benchmark gains for agentic workflows.

– MLX plus Apple’s GPU Neural Accelerators should cut both time-to-first-token and steady-state generation latency on newer Macs.
– NVFP4 support narrows the gap between local testing and production-style inference formats, which is useful for teams comparing outputs across environments.
– Cache snapshots, reuse across conversations, and smarter eviction are exactly the kind of changes that improve Claude Code-style branching loops.
– The preview is aimed at bigger machines with 32GB+ unified memory, so the win is strongest for high-end Apple Silicon users.
– Focusing on Qwen3.5-35B-A3B coding workloads signals that Ollama is optimizing for serious local coding agents, not just casual chat.
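To make the cache point concrete, here is a toy sketch of why snapshots plus reuse help branching sessions: if two prompts share a long token prefix, the work for that prefix can be restored from a snapshot instead of recomputed. This is an illustration of the general idea only, not Ollama's implementation; the class `PrefixCache`, its methods, and the plain least-recently-used eviction policy are all assumptions made for the example.

```python
from collections import OrderedDict


class PrefixCache:
    """Toy snapshot store keyed by token prefix, with LRU eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        # Maps a tuple of token ids to an opaque snapshot object.
        # Insertion order doubles as recency order (oldest first).
        self._snapshots = OrderedDict()

    def save(self, tokens, state) -> None:
        """Store a snapshot for this exact token sequence."""
        key = tuple(tokens)
        self._snapshots[key] = state
        self._snapshots.move_to_end(key)
        # Evict least recently used snapshots once over capacity.
        while len(self._snapshots) > self.capacity:
            self._snapshots.popitem(last=False)

    def longest_prefix(self, tokens):
        """Return (reused_token_count, snapshot) for the best match."""
        tokens = tuple(tokens)
        best = None
        for key in self._snapshots:
            if len(key) <= len(tokens) and tokens[:len(key)] == key:
                if best is None or len(key) > len(best):
                    best = key
        if best is None:
            return 0, None
        # A hit counts as a use, so refresh its recency.
        self._snapshots.move_to_end(best)
        return len(best), self._snapshots[best]


cache = PrefixCache(capacity=2)
cache.save([1, 2, 3], "snapshot-A")        # shared system prompt
cache.save([1, 2, 3, 4, 5], "snapshot-B")  # first branch

# A continuation of the first branch reuses 5 tokens.
hit_branch = cache.longest_prefix([1, 2, 3, 4, 5, 6])
# A new branch from the shared prompt still reuses 3 tokens.
hit_fork = cache.longest_prefix([1, 2, 3, 9])
```

In a coding-agent loop, the shared prefix is typically the system prompt plus earlier turns, so only the tokens after the reused prefix need fresh prompt processing. A real runtime would bound the cache by memory rather than entry count and would use a smarter eviction policy than this sketch.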

// TAGS

ollama, inference, gpu, agent, ai-coding, open-source

DISCOVERED

57d ago

2026-04-01

PUBLISHED

57d ago

2026-04-01

RELEVANCE

9/10

AUTHOR

[REDACTED]
