Gemma 4 powers local VS Code Copilot
OPEN_SOURCE ↗
REDDIT // 4h ago · TUTORIAL


A LocalLLaMA user reports a working VS Code Insiders workflow using Gemma 4 26B-A4B Q8 through llama.cpp and the OAI Compatible Provider for Copilot extension. The setup keeps Copilot Chat's Ask, Plan, and Agent modes pointed at a local OpenAI-compatible endpoint, with reported generation around 60 tokens/sec on a Radeon AI PRO R9700.
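The local endpoint in this setup comes from llama.cpp's built-in OpenAI-compatible server. A minimal launch sketch follows; the GGUF filename and flag values are assumptions, not the poster's exact configuration:

```shell
# Hypothetical launch sketch for llama.cpp's llama-server.
# -c sets the context window, -ngl offloads layers to the GPU,
# --jinja applies the model's chat template (relevant for tool calls).
llama-server \
  --model gemma-4-26b-a4b-Q8_0.gguf \
  --host 127.0.0.1 --port 8080 \
  -c 32768 -ngl 99 --jinja
```

Once running, the server exposes `/v1/chat/completions`, which is the URL the Copilot provider extension is pointed at.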

// ANALYSIS

This is not a product launch, but it is a useful field report: local coding agents are getting good enough that the limiting factors are now workflow polish, context management, and tool-call reliability rather than raw inference speed.

  • OAI Compatible Provider for Copilot is the key bridge, letting developers reuse GitHub Copilot's VS Code UI with local or alternative model backends
  • Gemma 4 26B-A4B looks viable for coding-agent experiments when paired with careful llama.cpp cache, context, and template settings
  • The reported loops and tool-use failures are the real caveat: local inference speed does not automatically translate into reliable agent behavior
  • For privacy-sensitive or cost-conscious developers, this pattern points toward a practical self-hosted Copilot alternative rather than just a novelty demo
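The bridge pattern described above works because llama.cpp speaks the OpenAI chat-completions wire format. A minimal sketch of talking to such a local endpoint, useful for sanity-checking the server before wiring it into the extension (the base URL and model name are assumptions):

```python
import json
import urllib.request

def build_chat_request(prompt: str, model: str = "local") -> dict:
    """Build the JSON body for an OpenAI-compatible /v1/chat/completions call."""
    return {
        "model": model,  # llama-server typically ignores this and uses the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(prompt: str, base_url: str = "http://127.0.0.1:8080/v1") -> str:
    """Send one chat turn to a local OpenAI-compatible server and return the reply."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

If `chat("hello")` returns text, the same endpoint should work when entered into the Copilot provider extension's settings.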
// TAGS
gemma-4-26b-a4b · oai-compatible-provider-for-copilot · llm · ai-coding · ide · inference · self-hosted · gpu

DISCOVERED

4h ago

2026-04-23

PUBLISHED

4h ago

2026-04-23

RELEVANCE

7/10

AUTHOR

supracode