Apple Siri reportedly uses Gemma 4
Apple's new Siri AI is reportedly not based on Google's proprietary Gemini model. Instead, it is said to use a customized version of Gemma 4 E4B, a smaller open-source model developed by Google. To run on consumer hardware with limited RAM, Apple reportedly uses a per-request Mixture of Experts (MoE) scheme that loads model weights directly from NAND flash memory.
Building on Google's open-source Gemma models, rather than licensing proprietary APIs, is a pragmatic route to efficient on-device AI.
- **NAND-Based MoE**: Loading experts dynamically from NAND flash memory sidesteps the RAM capacity limits of on-device inference.
- **Customization Autonomy**: Using an open-weights model lets Apple perform deep optimization, fine-tuning, and alignment tailored specifically to iOS.
- **Pragmatic Collaboration**: The architecture shows how major tech companies can build on open-source foundation models to stay independent while still benefiting from competitors' research.
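
The details of Apple's scheme are not public, so the following is only a toy sketch of the general idea behind loading MoE experts from storage on demand. It assumes a file of fixed-size expert blocks, a top-k router, and a memory-mapped read of just the selected blocks. Every name here (`FlashExpertStore`, `select_experts`, `run_request`, the sizes and the averaging step) is hypothetical and chosen for illustration, not taken from Apple or Gemma.

```python
import mmap
import os
import struct
import tempfile

NUM_EXPERTS = 8                 # hypothetical expert count
EXPERT_DIM = 4                  # hypothetical: each expert is 4 float32 weights
EXPERT_BYTES = EXPERT_DIM * 4   # size of one expert block on disk


def write_expert_file(path):
    """Write toy weights to disk; expert i holds the value i + 1 repeated."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            f.write(struct.pack(f"<{EXPERT_DIM}f", *([float(i + 1)] * EXPERT_DIM)))


def select_experts(router_scores, k):
    """Top-k routing: return the indices of the k highest-scoring experts."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])


class FlashExpertStore:
    """Stand-in for flash storage: experts stay on disk until requested."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self.loaded = []  # record of which experts were actually read

    def load(self, idx):
        # Slice only this expert's byte range out of the mapped file.
        start = idx * EXPERT_BYTES
        raw = self._map[start:start + EXPERT_BYTES]
        self.loaded.append(idx)
        return struct.unpack(f"<{EXPERT_DIM}f", raw)

    def close(self):
        self._map.close()
        self._file.close()


def run_request(store, x, router_scores, k=2):
    """Read only the routed experts and average their dot products with x."""
    chosen = select_experts(router_scores, k)
    outputs = []
    for idx in chosen:
        weights = store.load(idx)
        outputs.append(sum(w * xi for w, xi in zip(weights, x)))
    return chosen, sum(outputs) / len(outputs)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "experts.bin")
    write_expert_file(path)
    store = FlashExpertStore(path)
    scores = [0.1, 0.9, 0.0, 0.3, 0.8, 0.2, 0.05, 0.4]
    chosen, y = run_request(store, [1.0, 1.0, 1.0, 1.0], scores, k=2)
    print("experts read from disk:", chosen, "output:", y)
    store.close()
```

In this sketch only 2 of the 8 expert blocks are ever copied out of the file for a given request, which is the property the NAND-based approach relies on: memory use scales with the experts a request activates rather than with the full model.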
DISCOVERED
2026-06-09
PUBLISHED
2026-06-09
AUTHOR
mark_k