OPEN_SOURCE ↗
REDDIT · 17d ago · OPEN_SOURCE RELEASE
OpenZiti LLM Gateway balances Ollama endpoints
OpenZiti's LLM Gateway is a Go-based OpenAI-compatible proxy that routes OpenAI, Anthropic, and Ollama traffic through a single endpoint. Its Ollama multi-endpoint mode adds weighted round-robin, background health checks, passive failover, and a deduplicated /v1/models response from healthy nodes.
// ANALYSIS
This is the kind of plumbing that disappears when it works and saves the day when one GPU box dies, sleeps, or gets rebooted. The clever bit is that it is not just a load balancer; it is a self-hosted control plane for model choice and network reachability.
- Weighted round-robin plus health checks and passive failover are exactly what you want when multiple Ollama nodes have uneven capacity or uptime.
- The deduplicated `/v1/models` union is a small UX win that keeps Open WebUI, Continue, and other OpenAI-compatible clients from needing special handling.
- Semantic routing is the bigger strategic layer: one client can send coding tasks to Claude, route routine prompts to local Ollama, and fall back to cloud models when needed.
- zrok/OpenZiti is the differentiator for backends behind NAT or on other networks, because it removes the VPN and port-forwarding tax.
- Compared with LiteLLM, Portkey, Cloudflare, or Kong, this feels narrower but more opinionated for self-hosted zero-trust setups.
// TAGS
llm-gateway · llm · api · inference · open-source · self-hosted
DISCOVERED
2026-03-25
PUBLISHED
2026-03-25
RELEVANCE
8/10
AUTHOR
SmilinDave26