OPEN_SOURCE ↗
REDDIT · 17d ago · OPEN_SOURCE RELEASE
OpenZiti LLM Gateway balances Ollama endpoints
OpenZiti's LLM Gateway is a Go-based OpenAI-compatible proxy that routes OpenAI, Anthropic, and Ollama traffic through a single endpoint. Its Ollama multi-endpoint mode adds weighted round-robin, background health checks, passive failover, and a deduplicated /v1/models response from healthy nodes.
// ANALYSIS
This is the kind of plumbing that disappears when it works and saves the day when one GPU box dies, sleeps, or gets rebooted. The clever bit is that it is not just a load balancer; it is a self-hosted control plane for model choice and network reachability.
- Weighted round-robin plus health checks and passive failover are exactly what you want when multiple Ollama nodes have uneven capacity or uptime.
- The deduplicated `/v1/models` union is a small UX win that keeps Open WebUI, Continue, and other OpenAI-compatible clients from needing special handling.
- Semantic routing is the bigger strategic layer: one client can send coding tasks to Claude, route routine prompts to local Ollama, and fall back to cloud models when needed.
- zrok/OpenZiti is the differentiator for backends behind NAT or on other networks, because it removes the VPN and port-forwarding tax.
- Compared with LiteLLM, Portkey, Cloudflare, or Kong, this feels narrower but more opinionated for self-hosted zero-trust setups.
// TAGS
llm-gateway · llm · api · inference · open-source · self-hosted
DISCOVERED
2026-03-25
PUBLISHED
2026-03-25
RELEVANCE
8/10
AUTHOR
SmilinDave26