OPEN_SOURCE
REDDIT // 2h ago · PRODUCT LAUNCH
iOS App Runs Gemma 4 Fully On-Device
A developer shipped an iPhone app that rewrites spoken transcripts into polished paragraphs using Gemma 4 E2B entirely on-device. The post doubles as a production report on MLX Swift and MLXLLM, covering model selection, custom architecture wiring, and iOS lifecycle pitfalls.
// ANALYSIS
This is the kind of local-AI launch that matters: the real work is not “can the model run,” but “can it survive iOS constraints, memory ceilings, and backgrounding in production.”
- E2B looks like the practical sweet spot here: E4B exceeded memory limits, while Qwen3.5-4B brought unwanted thinking-token behavior for pure generation
- The custom Gemma 4 registration and prompt formatting in MLXLLM suggest ecosystem support is still immature for newer architectures
- The 128K context window matters less than the app's constrained use case; short, bounded rewrite jobs are a better fit than trying to stuff the whole app into context
- The `.scenePhase` gate is the most production-real detail in the post: mobile inference success depends on app lifecycle discipline as much as model quality
- Offline transcript rewriting is a strong on-device use case because privacy, latency, and cost all align with the product value proposition
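The post does not include the author's code, but a minimal sketch of what a `.scenePhase` gate might look like in SwiftUI is below. `InferenceEngine` and its `pause()`/`resume()` methods are hypothetical stand-ins for whatever wraps the MLX generation task; the key idea is cancelling heavy Metal/GPU work before iOS suspends the app.

```swift
import SwiftUI

// Hypothetical wrapper around the on-device generation task.
// In a real app this would own the MLXLLM session and a running Task.
final class InferenceEngine: ObservableObject {
    @Published private(set) var isRunning = false
    func pause()  { isRunning = false /* cancel the generation Task here */ }
    func resume() { /* safe point to restart or continue generation */ }
}

struct RootView: View {
    @Environment(\.scenePhase) private var scenePhase
    @StateObject private var engine = InferenceEngine()

    var body: some View {
        ContentView()
            .environmentObject(engine)
            .onChange(of: scenePhase) { _, newPhase in
                switch newPhase {
                case .background, .inactive:
                    // iOS can terminate apps doing sustained GPU work
                    // off-screen; stop inference before that happens.
                    engine.pause()
                case .active:
                    engine.resume()
                @unknown default:
                    break
                }
            }
    }
}
```

The two-parameter `.onChange(of:)` form shown here requires iOS 17+; on earlier targets the single-parameter closure works the same way.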
// TAGS
gemma-4 · llm · edge-ai · inference · sdk
DISCOVERED
2h ago
2026-04-16
PUBLISHED
3h ago
2026-04-16
RELEVANCE
8/10
AUTHOR
Ok-Taste3787