AI Gateway / Universal LLM Proxy providing a single OpenAI-compatible endpoint that intelligently routes to 100+ AI providers, with multimodal API support, MCP/A2A protocols, and enterprise-grade resilience.
OmniRoute is a comprehensive AI Gateway that converges fragmented multi-provider access into a single OpenAI-compatible endpoint. It supports 13 load-balancing strategies and a 4-layer automatic fallback chain (Subscription → API Key → Cheap → Free), with an AutoCombo Engine using 6-factor scoring (latency, cost, quota, health, capability, diversity) to build optimal provider combinations.
Intelligent Routing
- 13 balancing strategies: Priority, Round-Robin, Least-Latency, Cost-Aware, Shannon Entropy diversity, etc.
- Context Relay: structured handoff summaries maintain session continuity during account rotation
- Adaptive Routing: dynamic strategy override based on token count and prompt complexity
- Wildcard Router:
provider/*dynamic routing
Multimodal API Coverage
Beyond Chat Completions (/v1/chat/completions), uniformly proxies 11 API categories:
- Responses API (
/v1/responses, compatible with Codex and other agentic workflows) - Embeddings, Image Generation, Audio Transcription (7 providers), TTS (10 providers)
- Video Generation (ComfyUI + SD WebUI), Music Generation
- Reranking, Moderations, Web Search (5 providers: Serper, Brave, Perplexity, Exa, Tavily)
- WebSocket Bridge (
/v1/ws, OpenAI-compatible streaming proxy)
Agent Protocol Support
- MCP Server: 25 tools (18 core + 3 memory + 4 skill), stdio/SSE/Streamable HTTP transports
- A2A Server: JSON-RPC 2.0 + SSE for Agent-to-Agent task execution
- ACP Support: CLI agent discovery for 14+ agents (Codex, Claude, Goose, Gemini CLI, OpenClaw, etc.)
Format Translation Schema-safe automatic conversion between OpenAI, Claude, Gemini, and Responses API formats.
Resilience & Security
- Per-model circuit breakers, anti-thundering herd (mutex + semaphore)
- Semantic + Signature dual-layer caching, idempotent deduplication
- TLS fingerprint spoofing, CLI request signature matching
- IP whitelist/blacklist, SSRF protection, API key scoping & model filtering
- Cooldown-aware retries, Zod env validation, audit trail
Observability
- p50/p95/p99 latency telemetry, TPS metrics
- Cost tracking & budget control, usage analytics visualization
- Health dashboard (uptime, circuit breaker status, cache stats)
- Built-in Evaluation Framework (Golden Set testing)
Deployment
- NPM:
npm install -g omniroute && omniroute - Docker:
docker run -d -p 20128:20128 -v omniroute-data:/app/data diegosouzapw/omniroute:latest - Source build, Electron desktop app (Windows/macOS/Linux)
- Cloud: Fly.io, Cloudflare Tunnel, Caddy HTTPS
- Dashboard supports 30+ languages, default port 20128
Use Cases
- Zero-cost coding: combine 11+ free providers (Kiro, Qoder, Qwen, Gemini CLI, NVIDIA NIM, Cerebras, Groq, etc.)
- Maximize subscription value: Claude Code/Codex/Gemini CLI subscriptions + low-cost backups
- 24/7 critical workloads: 5-layer deep fallback chain
- Multi IDE/CLI unified access: compatible with Claude Code, Codex CLI, Gemini CLI, Cursor, Cline, etc.
Built with TypeScript (96.7%) + Next.js + SQLite (WAL mode), MIT licensed, actively developed (1,972 commits, 251 tags, 218 releases, 89 contributors). Originally forked from 9router.