The Open Source AI Engineering Platform for Agents, LLMs & Models, providing experiment tracking, model registry, LLM observability, evaluation, prompt optimization, and a unified AI gateway.
MLflow is developed and led by Databricks, currently at v3.11.1 under the Apache-2.0 license. The codebase is primarily Python (60.9%), with TypeScript/JavaScript (37.3%) and Java (0.6%) rounding out its multi-language coverage.
For traditional ML, MLflow provides Experiment Tracking for cross-experiment parameter and metric tracking, a collaborative Model Registry for full model lifecycle management, automated Model Evaluation, and multi-target Deployment (Docker, Kubernetes, Azure ML, AWS SageMaker, etc.).
For LLM/Agent engineering, the MLflow v3 series delivers four core capabilities: full-chain Trace observability built on OpenTelemetry, supporting any LLM Provider and Agent framework; a systematic evaluation system with 50+ built-in metrics and LLM Judges; prompt version management with full lineage tracking and automatic optimization; and a unified LLM API Gateway (routing, rate limiting, fallback, A/B testing, Guardrails, credential management, OpenAI-compatible interface).
On the ecosystem front, MLflow supports one-line auto-tracing across 60+ frameworks, covering Python (LangChain, LangGraph, DSPy, CrewAI, LlamaIndex, and 20+ more), TypeScript (Vercel AI SDK, Mastra, etc.), and Java (Spring AI, LangChain4j). It integrates with 20+ model providers (OpenAI, Anthropic, Gemini, Bedrock, DeepSeek, Qwen, etc.) and deeply connects with gateways like LiteLLM Proxy and Kong AI Gateway, as well as observability tools like Langfuse and Arize/Phoenix.
Installation is as simple as pip install mlflow (or uvx mlflow server to launch the tracking server without a permanent install), and just three lines of code enable automatic Trace capture and UI visualization for OpenAI calls. Deployment targets include local machines, self-hosted clusters, and cloud platforms such as Databricks, AWS SageMaker, Azure ML, and Nebius.