
Ferro Labs AI Gateway

Added Apr 24, 2026
Category: Agent & Tooling
Open Source

Tags: Docker · LLM · Model Context Protocol · Go · CLI · Agent & Tooling · Model & Inference Framework · Model Training & Inference · Protocol, API & Integration

A high-performance AI gateway routing to 2,500+ models across 29 providers via a single OpenAI-compatible API, with 8 routing strategies, built-in MCP tool-call proxy, and full observability.

Ferro Labs AI Gateway is a single-binary, high-performance AI gateway written in Go (v1.0.3). It exposes a unified OpenAI-compatible API and internally supports 29 LLM providers (OpenAI, Anthropic, Google Gemini, AWS Bedrock, Azure OpenAI, Vertex AI, DeepSeek, Mistral, Groq, Ollama, etc.) covering 2,500+ models.
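"OpenAI-compatible" means any client that speaks the OpenAI wire format can target the gateway by changing only the base URL. The sketch below builds such a chat-completions request body in Go; the struct mirrors the public OpenAI schema, and the model name is an arbitrary example, not a documented gateway default.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// chatMessage and chatRequest mirror the OpenAI chat-completions
// request schema that the gateway accepts for every provider.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
}

// buildChatRequest returns the JSON body a client would POST to the
// gateway's /v1/chat/completions endpoint.
func buildChatRequest(model, prompt string) (string, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
	})
	return string(body), err
}

// contains is a small helper for checking the serialized payload.
func contains(s, sub string) bool { return strings.Contains(s, sub) }

func main() {
	// The same wire format is used regardless of which of the 29
	// providers ultimately serves the request.
	body, _ := buildChatRequest("claude-sonnet", "Hello")
	fmt.Println(body)
}
```

Because the payload is provider-agnostic, switching from, say, an Anthropic model to a Gemini model is a one-string change on the client side; the gateway handles the provider-specific translation.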

The routing engine offers 8 strategies (single, fallback, loadbalance, least-latency, cost-optimized, content-based, ab-test, conditional) with model alias mapping and configurable retries. The plugin system includes six built-in plugins: word-filter, max-token, response-cache, rate-limit, budget, and request-logger. MCP integration implements gateway-side automatic tool-call loops following the MCP 2025-11-25 Streamable HTTP spec, with multi-server support and tool deduplication.

Observability covers Prometheus metrics, deep health checks, Admin API, built-in Dashboard UI, and HTTP connection tracing. Performance benchmarks show 13,925 RPS at 1,000 concurrent users with p50 latency of 68.1ms, and bare proxy overhead of only 2μs.

Deployment options include precompiled binaries (Linux/macOS/Windows), Docker images, go install, and Kubernetes Helm charts. Persistence backends support memory, SQLite, and PostgreSQL. It can also be embedded as a Go HTTP handler in existing services. Licensed under Apache-2.0. Officially positioned as a high-performance alternative to LiteLLM, with claimed gains of 14x throughput and 23x lower memory use that remain unverified by third parties.

Unconfirmed: the team's background is undisclosed; the Managed Cloud timeline is TBD; the 2,500+ model count lacks independent verification; no public production deployments are known; the LiteLLM comparison data lacks third-party validation.
