REDDIT · 3d ago · OPEN-SOURCE RELEASE

Swift engine runs Gemma 4 on iPhone

Swift-gemma4-core is a newly open-sourced Swift inference engine for running Gemma 4 natively on Apple Silicon and iOS, built after the author hit compatibility issues with existing MLX-based libraries. The project focuses on an offline, on-device experience and claims support for Gemma 4’s newer quirks, including partial rotary embeddings, cross-layer KV cache behavior, and prompt/template handling that previously broke decoding. The author says it already runs on a real iPhone with a relatively small memory footprint, but prefill latency is still high and they are asking the community to help optimize the bridge, tensor mapping, and allocations.
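Partial rotary embeddings are the kind of detail that trips up ports to a new runtime: rotation is applied to only a slice of each attention head's dimensions, and the remainder passes through untouched. As a rough illustration of the concept only (not the project's actual code — the function name, `rotary_dim` parameter, and base of 10000 are assumptions), a minimal sketch looks like:

```python
import numpy as np

def partial_rope(x, positions, rotary_dim, base=10000.0):
    """Apply rotary position embeddings to only the first `rotary_dim`
    dimensions of each head; the remaining dimensions pass through."""
    rot, passthrough = x[..., :rotary_dim], x[..., rotary_dim:]
    half = rotary_dim // 2
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = positions[:, None] * inv_freq[None, :]   # (seq, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = rot[..., :half], rot[..., half:]         # split pairs
    rotated = np.concatenate([x1 * cos - x2 * sin,
                              x1 * sin + x2 * cos], axis=-1)
    return np.concatenate([rotated, passthrough], axis=-1)
```

An engine that rotates all `head_dim` dimensions (the older convention) instead of just `rotary_dim` will decode garbage on a model trained with the partial scheme, which is consistent with the compatibility breakage the author describes.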

// ANALYSIS

The real story here is not “Gemma 4 on iPhone” so much as “someone filled a missing runtime gap for a model that existing Swift/MLX paths couldn’t handle cleanly.”

  • Strong open-source signal: this is a practical infra contribution, not a demo wrapper.
  • The technical pain points are plausible for a newer model family, but the post reads like an engineering progress update more than a polished launch.
  • The biggest credibility hook is the claimed real-device runtime and low RAM usage; the biggest risk is the still-slow prefill path.
  • Best fit audience: Swift/Metal/MLX people, offline AI app builders, and anyone trying to ship local-first Gemma on iOS.
// TAGS
swift · ios · apple-silicon · gemma-4 · local-llm · on-device-ai · metal · mlx · open-source

DISCOVERED

2026-04-09 (3d ago)

PUBLISHED

2026-04-09 (3d ago)

RELEVANCE

8/10

AUTHOR

AgreeableNewspaper29