Box fuses llama.cpp, LiteRT, NPU routing
REDDIT // 22h ago · OPEN-SOURCE RELEASE

Box is an open-source Android fork of Google AI Edge Gallery that turns a phone into a fully offline AI assistant. It combines LiteRT with llama.cpp, whisper.cpp, and stable-diffusion.cpp for chat, voice, vision, document parsing, and image generation, with CPU/GPU/NPU/TPU routing.

// ANALYSIS

The interesting part here is not just "offline AI on Android" but the routing layer: Box is trying to make heterogeneous mobile silicon usable in one app instead of betting on a single backend.

  • Hybrid dispatch across LiteRT, llama.cpp, and native speech/image stacks is a practical way to squeeze more out of Snapdragon and Pixel-class devices than a one-runtime approach
  • The feature set is unusually broad for a phone app: voice-to-voice chat, camera Q&A, local document ingestion, custom GGUF import, and Stable Diffusion generation all without cloud dependencies
  • On-device memory and persistence, rather than raw compute throughput, are likely the real limiting factors now, a reading that matches the maintainer's own framing
  • The security posture is stronger than most local-AI demos: hard offline mode, encrypted history, biometric lock, and prompt sanitization make the privacy story credible
  • This is most relevant as infrastructure for local-first mobile AI builders, not as a consumer assistant replacement for cloud chatbots
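The routing layer described above can be pictured as a capability table: each backend declares which accelerators it can target, and the dispatcher walks a preference order (NPU first, CPU last) against what the device actually exposes. Box's real dispatch logic is not documented in this card, so this is a minimal sketch of the pattern, with every class and method name hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical accelerators a device may expose.
enum Accelerator { CPU, GPU, NPU, TPU }

// An inference backend (e.g. "LiteRT", "llama.cpp") and the
// accelerators it claims to support. Names are illustrative only.
class Backend {
    final String name;
    final List<Accelerator> supported;
    Backend(String name, List<Accelerator> supported) {
        this.name = name;
        this.supported = supported;
    }
}

class Router {
    private final List<Backend> backends = new ArrayList<>();

    void register(Backend b) { backends.add(b); }

    // Walk a fixed preference order and return the first
    // backend/accelerator pair the device can actually run.
    String route(List<Accelerator> deviceHas) {
        Accelerator[] preference = { Accelerator.NPU, Accelerator.TPU,
                                     Accelerator.GPU, Accelerator.CPU };
        for (Accelerator acc : preference) {
            if (!deviceHas.contains(acc)) continue;
            for (Backend b : backends) {
                if (b.supported.contains(acc)) return b.name + "@" + acc;
            }
        }
        return "none"; // no usable backend on this device
    }
}
```

On a device with only CPU and GPU, this sketch would pick a GPU-capable backend over a CPU fallback; the point is that the decision lives in one place instead of being hard-wired into a single runtime.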
// TAGS
box · llm · inference · edge-ai · multimodal · speech · stt · image-gen

DISCOVERED: 22h ago (2026-05-02)

PUBLISHED: 1d ago (2026-05-02)

RELEVANCE: 8/10

AUTHOR: Healthy_Bedroom5837