Apple Siri reportedly uses Gemma 4

// NEWS (45d ago)

Apple's new Siri AI is not based on Google's proprietary Gemini model. Instead, it reportedly uses a customized version of Gemma 4 E4B, a smaller open-weights model developed by Google. To run on consumer hardware with limited RAM, Apple reportedly employs a specialized per-request Mixture of Experts (MoE) scheme that loads model weights directly from NAND flash memory.

// ANALYSIS

By building on Google's openly released Gemma models instead of licensing a proprietary API, Apple gets weights it can modify and run locally, a pragmatic route to efficient on-device AI.

- **NAND-Based MoE**: Loading only the experts a request needs from NAND flash means the full set of expert weights never has to sit in RAM at once, which eases the on-device memory bottleneck.
- **Customization Autonomy**: An open-weights model lets Apple optimize, fine-tune, and align the model specifically for iOS.
- **Pragmatic Collaboration**: The setup shows how a large company can build on a competitor's openly released foundation model while staying independent of that competitor's hosted services.
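
The NAND-based MoE idea can be illustrated with a small sketch. Apple's actual scheme has not been published, so everything here is an assumption for illustration: the file layout, the names `FlashExpertStore`, `pick_experts`, and `run_request`, the expert count, and the reading of "per-request" as "choose experts once per request". A memory-mapped file stands in for flash storage, and only the byte ranges of the selected experts are read.

```python
import mmap
import os
import struct
import tempfile

# All sizes below are hypothetical and tiny, chosen only for illustration.
NUM_EXPERTS = 8
EXPERT_DIM = 4                      # float32 weights per expert
EXPERT_BYTES = EXPERT_DIM * 4       # 4 bytes per float32


def write_expert_file(path):
    """Write a toy weight file: expert i holds EXPERT_DIM copies of float(i)."""
    with open(path, "wb") as f:
        for i in range(NUM_EXPERTS):
            f.write(struct.pack(f"<{EXPERT_DIM}f", *([float(i)] * EXPERT_DIM)))


def pick_experts(router_scores, k=2):
    """Return the indices of the k highest-scoring experts, sorted by index."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    return sorted(ranked[:k])


class FlashExpertStore:
    """Reads only the requested experts' byte ranges from a memory-mapped file."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self.bytes_read = 0         # tracks how much was pulled from storage

    def load(self, expert_id):
        start = expert_id * EXPERT_BYTES
        raw = self._map[start:start + EXPERT_BYTES]
        self.bytes_read += len(raw)
        return list(struct.unpack(f"<{EXPERT_DIM}f", raw))

    def close(self):
        self._map.close()
        self._file.close()


def run_request(store, router_scores, k=2):
    """Load just the top-k experts for one request and return their weights."""
    return {i: store.load(i) for i in pick_experts(router_scores, k)}


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "experts.bin")
    write_expert_file(path)
    store = FlashExpertStore(path)
    scores = [0.1, 0.9, 0.0, 0.3, 0.8, 0.2, 0.05, 0.4]
    experts = run_request(store, scores, k=2)
    total = NUM_EXPERTS * EXPERT_BYTES
    print(f"loaded experts {sorted(experts)}: {store.bytes_read} of {total} bytes")
    store.close()
```

In this toy setup a request that selects two of eight experts touches a quarter of the file. A real system would also need caching, prefetching, and quantized weight formats, none of which are modeled here.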

// TAGS

apple, siri, gemma, google, moe, nand, edge-ai, llm

DISCOVERED

2026-06-09 (45d ago)

PUBLISHED

2026-06-09 (45d ago)

RELEVANCE

8/10

AUTHOR

mark_k
