
Fleet-RLM

Added: Feb 25, 2026
Category: Agent & Tooling
Open Source
Tags: Python · Model Context Protocol · AI Agents · Agent Framework · Web Application · CLI · Agent & Tooling · Model & Inference Framework · Developer Tools & Coding · Protocol, API & Integration · Security & Privacy

Secure, cloud-sandboxed Recursive Language Models (RLM) framework with DSPy and Modal for long-context code and document processing, offering Web UI, CLI, HTTP API, WebSocket, and MCP Server interfaces.

Overview

Fleet-RLM is a Recursive Language Models (RLM) implementation built on the DSPy declarative programming framework and the Modal cloud sandbox. It originates from MIT/Stanford RLM research (arXiv:2512.24601) and is maintained by the Qredence organization; the project is under active development, currently at version v0.4.8 with 422+ commits.

Core Features

Web UI Interaction (fleet web)

  • Browser-based conversation with RLM-powered agents
  • Runtime configuration (LM / Modal) directly in UI
  • Access at: http://localhost:8000

Secure Sandbox Execution

  • Modal-based cloud sandbox for secure code execution
  • Recursive processing of long-context tasks, breaking through model context window limits
  • Isolated runtime environment for untrusted code
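The recursive idea behind RLM can be illustrated with a minimal, model-free sketch: when input exceeds the context window, split it, process the pieces recursively, and combine the partial results. The `summarize` stub below stands in for a real LM call; none of these names are Fleet-RLM's actual API.

```python
# Minimal illustration of recursive long-context processing.
# `summarize` is a stub standing in for an LM call.

CONTEXT_WINDOW = 100  # pretend the model only fits 100 characters

def summarize(text: str) -> str:
    """Stub LM call: keep the first and last few words."""
    words = text.split()
    if len(words) <= 6:
        return text
    return " ".join(words[:3] + ["..."] + words[-3:])

def recursive_process(text: str) -> str:
    if len(text) <= CONTEXT_WINDOW:
        return summarize(text)             # base case: fits in context
    mid = len(text) // 2
    left = recursive_process(text[:mid])   # delegate each half
    right = recursive_process(text[mid:])
    return summarize(left + " " + right)   # combine partial results

doc = "word " * 200  # a "document" far larger than the window
result = recursive_process(doc)
print(len(result) <= CONTEXT_WINDOW)  # the final answer fits the window
```

The real framework replaces the stub with sandboxed LM calls, but the control flow (split, delegate, combine) is the essence of working beyond a fixed context window.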

Document Analysis

  • PDF ingestion (MarkItDown + pypdf fallback)
  • Intelligent processing of large documents
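The "MarkItDown + pypdf fallback" note implies a primary-converter-with-fallback ingestion step, which can be sketched generically. The converters below are stubs, not Fleet-RLM's real ingestion code:

```python
# Sketch of a primary-converter-with-fallback ingestion step.
# The two converters are stubs standing in for MarkItDown and pypdf.
from typing import Callable

def ingest(path: str,
           primary: Callable[[str], str],
           fallback: Callable[[str], str]) -> str:
    """Try the rich converter first; fall back on any failure."""
    try:
        return primary(path)
    except Exception:
        return fallback(path)

def markitdown_stub(path: str) -> str:
    raise RuntimeError("unsupported layout")  # simulate a failure

def pypdf_stub(path: str) -> str:
    return f"plain text extracted from {path}"

print(ingest("report.pdf", markitdown_stub, pypdf_stub))
# falls back to the pypdf stub
```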

Execution Observability

  • Streaming output of execution events and traces
  • WebSocket execution stream (/api/v1/ws/execution)
  • Real-time monitoring of agent execution
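A consumer of the execution stream might fold events into a trace summary as below. The event shapes (`{"type": ..., "data": ...}`) are illustrative assumptions, not the documented wire format of `/api/v1/ws/execution`:

```python
# Sketch of consuming a stream of execution events, e.g. as delivered
# over a WebSocket. Event shapes here are assumed, not documented.
import json

def summarize_stream(lines):
    """Fold newline-delimited JSON events into a simple trace."""
    trace = {"steps": 0, "output": [], "done": False}
    for line in lines:
        event = json.loads(line)
        if event["type"] == "step":
            trace["steps"] += 1
        elif event["type"] == "output":
            trace["output"].append(event["data"])
        elif event["type"] == "done":
            trace["done"] = True
    return trace

stream = [
    '{"type": "step", "data": "plan"}',
    '{"type": "output", "data": "partial result"}',
    '{"type": "step", "data": "act"}',
    '{"type": "done", "data": null}',
]
print(summarize_stream(stream))
# {'steps': 2, 'output': ['partial result'], 'done': True}
```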

Multiple Interface Support

  • CLI: fleet / fleet-rlm / rlm-modal commands
  • HTTP API: FastAPI REST service
  • MCP Server: fleet-rlm serve-mcp --transport stdio (Claude Desktop integration)
  • WebSocket: Real-time chat and execution streams
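For Claude Desktop, an entry along these lines in `claude_desktop_config.json` should register the MCP server (the `fleet-rlm` key is an arbitrary label; adjust the command path to your install):

```json
{
  "mcpServers": {
    "fleet-rlm": {
      "command": "fleet-rlm",
      "args": ["serve-mcp", "--transport", "stdio"]
    }
  }
}
```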

Multi-tenant Persistence

  • Neon database + RLS (Row-Level Security) isolation
  • Tenant-aware run/step/artifact/memory persistence

Architecture Design

Four-layer architecture:

  • Orchestration (react/agent.py, streaming.py, tools*.py): ReAct loop orchestration, tool dispatch, trace and stream event generation
  • Execution (core/interpreter.py, driver.py, driver_factories.py): Modal remote code execution, execution config, sandbox protocol handling
  • Service (server/routers/*, deps.py, config.py): HTTP/WebSocket transport, auth, runtime settings, session lifecycle
  • Persistence (db/*, migrations/): Tenant-aware persistence, RLS isolation
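The orchestration layer's job (a ReAct loop dispatching tools) can be sketched in a few lines. The policy here is a hard-coded stub in place of an LM, and the tool names are hypothetical:

```python
# Minimal ReAct-style loop with tool dispatch. The policy is a
# scripted stub standing in for an LM; tools are toy examples.

def scripted_policy(history):
    """Stub for the LM: decide the next action from the history."""
    if not history:
        return ("calc", "2 + 2")                 # first: act with a tool
    return ("finish", f"answer: {history[-1]}")  # then: finish

TOOLS = {"calc": lambda expr: str(eval(expr))}   # toy tool registry

def react_loop(policy, tools, max_steps=5):
    history = []
    for _ in range(max_steps):
        action, payload = policy(history)     # "reason": pick an action
        if action == "finish":
            return payload                    # terminate with the answer
        observation = tools[action](payload)  # "act": dispatch the tool
        history.append(observation)           # record the observation
    raise RuntimeError("step budget exhausted")

print(react_loop(scripted_policy, TOOLS))  # → answer: 4
```

In the real framework the policy is an LM call, the tools include sandboxed code execution, and each iteration emits the trace/stream events described above.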

Installation & Quick Start

Recommended (CLI Tool):

uv tool install fleet-rlm
fleet web

Standard Installation:

uv pip install fleet-rlm
fleet web

From Source:

uv sync --extra dev --extra server
uv run fleet web

Additional Commands:

fleet-rlm run-basic --question "What are the first 12 Fibonacci numbers?"
fleet-rlm serve-api --port 8000
fleet-rlm serve-mcp --transport stdio
fleet-rlm chat

Typical Use Cases

  • Long-context Code Tasks: Code analysis/generation beyond model context window
  • Document Intelligence: Large PDF document ingestion and analysis
  • AI Agent Development: Building agents with recursive delegation capabilities
  • Secure Code Execution: Running untrusted code in isolated sandbox
  • MCP Integration: MCP server integration with Claude Desktop and similar clients

Key Dependencies

  • DSPy 3.1.3 (Declarative LM programming framework)
  • Modal 1.3.2+ (Cloud sandbox)
  • FastAPI + Uvicorn (HTTP service)
  • WebSockets (Real-time communication)
  • SQLModel + SQLAlchemy + asyncpg (Database)
  • LiteLLM (Multi-model support)
  • pgvector (Vector storage)

Open Source License

MIT License

Runtime Configuration

  • LM (Language Model) settings
  • Modal secrets configuration
  • Authentication mode (Dev vs Entra)
  • Neon DB connection
