Gemma 4-E2B STT hits Home Assistant hurdles

// 90d ago · PRODUCT LAUNCH

Google's new 2B-parameter multimodal model, Gemma 4-E2B, is being repurposed for local Speech-to-Text (STT) in Home Assistant. Its accuracy is impressive, but by default it emits its internal "thought chain" along with the transcription, so community-developed middleware is needed to strip the reasoning tags and leave only the raw transcription.

// ANALYSIS

Gemma 4's multimodal capabilities make it a high-performance local STT contender, but its "thoughtful" default behavior is currently a friction point for simple transcription tasks.

– Native audio support in a 2-billion-parameter model allows for low-latency, high-accuracy STT on consumer GPUs, rivaling dedicated models like Parakeet.
– The model's built-in reasoning engine, while valuable for complex prompts, lacks a reliable server-side "off" switch in current llama.cpp and llama-swap implementations.
– Community members are bypassing the problem with custom FastAPI middleware that regex-strips <|channel>thought tags before the data reaches Home Assistant.
– This integration highlights the growing trend of using general-purpose multimodal LLMs as high-performance drop-in replacements for traditional specialized audio encoders.
– The combination of llama-swap and wyoming_openai remains the dominant architecture for bridging local LLM servers to the Home Assistant "Assist" pipeline.
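
The regex-stripping step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the community middleware itself: only the opening marker `<|channel>thought` comes from the article, while the closing marker `<channel|>` and the function name `strip_thoughts` are assumptions that should be checked against real model output. The FastAPI wiring is left out so the sketch depends only on the standard library.

```python
import re

# Opening marker as quoted in the article. The closing marker is an ASSUMPTION
# made for illustration; verify it against what the model actually emits.
THOUGHT_OPEN = "<|channel>thought"
THOUGHT_CLOSE = "<channel|>"

# Non-greedy match so each reasoning block is removed separately;
# DOTALL lets the block span multiple lines.
_THOUGHT_BLOCK = re.compile(
    re.escape(THOUGHT_OPEN) + r".*?" + re.escape(THOUGHT_CLOSE),
    flags=re.DOTALL,
)


def strip_thoughts(text: str) -> str:
    """Remove every delimited reasoning block and return the remaining text."""
    cleaned = _THOUGHT_BLOCK.sub("", text)
    # Collapse the whitespace left behind by the removed block(s).
    return " ".join(cleaned.split())


if __name__ == "__main__":
    raw = (
        "<|channel>thought\nThe speaker wants the lights on.<channel|>"
        "Turn on the kitchen lights."
    )
    print(strip_thoughts(raw))
```

A proxy in front of the LLM server could call a function like this on the transcription text before returning the response to Home Assistant. Note that this sketch leaves an unterminated block (an opening marker with no closing marker) untouched, which a real deployment would probably want to handle.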

// TAGS

gemma-4-e2b, gemma-4, llm, speech, self-hosted, home-assistant, stt

DISCOVERED

90d ago

2026-04-18

PUBLISHED

90d ago

2026-04-17

RELEVANCE

8/10

AUTHOR

andy2na
