A secure, cloud-sandboxed Recursive Language Models (RLM) framework built with DSPy and Modal for long-context code and document processing, offering Web UI, CLI, HTTP API, WebSocket, and MCP Server interfaces.
## Overview
Fleet-RLM is a Recursive Language Models (RLM) implementation built on the DSPy declarative programming framework and the Modal cloud sandbox. It originates from MIT/Stanford RLM research (arXiv:2512.24601), is maintained by the Qredence organization, and is currently at version v0.4.8 with 422+ commits under active development.
## Core Features
### Web UI Interaction (`fleet web`)
- Browser-based conversation with RLM-powered agents
- Runtime configuration (LM / Modal) directly in UI
- Access at: http://localhost:8000
### Secure Sandbox Execution
- Modal-based cloud sandbox for secure code execution
- Recursive long-context task processing that breaks through model context window limits
- Isolated runtime environment for untrusted code
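The recursive strategy can be sketched in a few lines. The helpers below are purely illustrative: `summarize` is a stand-in for a bounded-context model call, whereas Fleet-RLM's actual implementation delegates each chunk to a Modal-sandboxed sub-agent.

```python
# Illustrative sketch of recursive long-context processing. `summarize`
# stands in for a single bounded-context model call; Fleet-RLM instead
# delegates each chunk to a sandboxed sub-agent.

def summarize(text: str) -> str:
    """Placeholder for one model call with a bounded context."""
    return text[:40]

def recursive_process(text: str, window: int = 100) -> str:
    """Split text that exceeds the window, process each chunk,
    then recursively process the merged intermediate results."""
    if len(text) <= window:
        return summarize(text)
    chunks = [text[i:i + window] for i in range(0, len(text), window)]
    merged = " ".join(recursive_process(chunk, window) for chunk in chunks)
    return recursive_process(merged, window)

result = recursive_process("x" * 1000, window=100)
```

However long the input, the final answer is always produced by a single call that fits in the window; only the recursion depth grows.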
### Document Analysis
- PDF ingestion (MarkItDown with pypdf fallback)
- Intelligent processing of large documents
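The fallback chain can be sketched as a loader that prefers MarkItDown and degrades to pypdf. This is assumed behavior based on the feature list; the actual ingestion code may differ.

```python
# Sketch of a MarkItDown-first, pypdf-fallback PDF loader (assumed
# behavior; Fleet-RLM's real ingestion code may differ).

def extract_pdf_text(path: str) -> str:
    try:
        from markitdown import MarkItDown  # preferred: Markdown-aware output
        return MarkItDown().convert(path).text_content
    except ImportError:
        pass
    try:
        from pypdf import PdfReader  # fallback: plain text per page
        reader = PdfReader(path)
        return "\n".join(page.extract_text() or "" for page in reader.pages)
    except ImportError as exc:
        raise RuntimeError("Install markitdown or pypdf for PDF ingestion") from exc
```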
### Execution Observability
- Streaming output of execution events and traces
- WebSocket execution stream (`/api/v1/ws/execution`) for real-time monitoring of agent execution
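A client consuming the stream would parse each message and dispatch on its event type. Here is a minimal sketch with a hypothetical payload shape; the actual schema of `/api/v1/ws/execution` may differ.

```python
import json

# In practice `raw` would arrive over a WebSocket connection to
# ws://localhost:8000/api/v1/ws/execution. The event shape below is
# hypothetical; consult the API docs for the real schema.
raw = '{"type": "tool_call", "step": 2, "tool": "python", "output": "done"}'

event = json.loads(raw)
if event["type"] == "tool_call":
    line = f"step {event['step']}: {event['tool']} -> {event['output']}"
```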
### Multiple Interface Support
- CLI: `fleet` / `fleet-rlm` / `rlm-modal` commands
- HTTP API: FastAPI REST service
- MCP Server: `fleet-rlm serve-mcp --transport stdio` (Claude Desktop integration)
- WebSocket: Real-time chat and execution streams
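For the Claude Desktop integration, MCP servers are registered under the `mcpServers` key of `claude_desktop_config.json`. The entry below is a sketch generated from the command above; verify the command path against your installation.

```python
import json

# Sketch of a Claude Desktop MCP registration for Fleet-RLM. The
# "mcpServers" key is standard Claude Desktop config; the entry details
# should be checked against your installed fleet-rlm binary.
config = {
    "mcpServers": {
        "fleet-rlm": {
            "command": "fleet-rlm",
            "args": ["serve-mcp", "--transport", "stdio"],
        }
    }
}
snippet = json.dumps(config, indent=2)
```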
### Multi-tenant Persistence
- Neon database + RLS (Row-Level Security) isolation
- Tenant-aware run/step/artifact/memory persistence
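Conceptually, RLS isolation means every tenant-scoped table carries a Postgres row-level security policy like the one below. Table, column, and session-variable names here are hypothetical; Fleet-RLM's actual schema may differ.

```python
# Hypothetical Postgres RLS setup for a tenant-scoped "runs" table,
# e.g. as applied in a migration. All identifiers are assumptions.
RLS_POLICY = """
ALTER TABLE runs ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON runs
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
"""
```

With such a policy in place, every query automatically sees only rows whose `tenant_id` matches the current session's tenant, so the application code needs no per-query filtering.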
## Architecture Design
Four-layer architecture:
| Layer | Core Modules | Responsibility |
|---|---|---|
| Orchestration | `react/agent.py`, `streaming.py`, `tools*.py` | ReAct loop orchestration, tool dispatch, trace and stream event generation |
| Execution | `core/interpreter.py`, `driver.py`, `driver_factories.py` | Modal remote code execution, execution config, sandbox protocol handling |
| Service | `server/routers/*`, `deps.py`, `config.py` | HTTP/WebSocket transport, auth, runtime settings, session lifecycle |
| Persistence | `db/*`, `migrations/` | Tenant-aware persistence, RLS isolation |
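To make the layer boundaries concrete, here is a toy sketch (not actual Fleet-RLM code) of how a request flows: orchestration drives the loop and emits trace events from each layer, and the service layer forwards them to the client as a stream.

```python
from dataclasses import dataclass
from typing import Iterator

# Toy model of the four-layer hand-off; names and messages are
# illustrative only, not Fleet-RLM's real event types.

@dataclass
class TraceEvent:
    layer: str
    message: str

def orchestrate(question: str) -> Iterator[TraceEvent]:
    """Orchestration layer: run the ReAct loop, yielding trace events."""
    yield TraceEvent("orchestration", f"ReAct loop started for: {question}")
    yield TraceEvent("execution", "code executed in Modal sandbox")
    yield TraceEvent("persistence", "run/step persisted under tenant id")

def stream(question: str) -> list[str]:
    """Service layer: forward each event to the WebSocket/HTTP client."""
    return [f"[{e.layer}] {e.message}" for e in orchestrate(question)]

events = stream("sum the first 12 Fibonacci numbers")
```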
## Installation & Quick Start

Recommended (CLI tool):

```bash
uv tool install fleet-rlm
fleet web
```

Standard installation:

```bash
uv pip install fleet-rlm
fleet web
```

From source:

```bash
uv sync --extra dev --extra server
uv run fleet web
```
Additional commands:

```bash
fleet-rlm run-basic --question "What are the first 12 Fibonacci numbers?"
fleet-rlm serve-api --port 8000
fleet-rlm serve-mcp --transport stdio
fleet-rlm chat
```
## Typical Use Cases
- Long-context Code Tasks: Code analysis/generation beyond model context window
- Document Intelligence: Large PDF document ingestion and analysis
- AI Agent Development: Building agents with recursive delegation capabilities
- Secure Code Execution: Running untrusted code in an isolated sandbox
- MCP Integration: MCP server integration with Claude Desktop and similar clients
## Key Dependencies
- DSPy 3.1.3 (Declarative LM programming framework)
- Modal 1.3.2+ (Cloud sandbox)
- FastAPI + Uvicorn (HTTP service)
- WebSockets (Real-time communication)
- SQLModel + SQLAlchemy + asyncpg (Database)
- LiteLLM (Multi-model support)
- pgvector (Vector storage)
## Open Source License
MIT License
## Runtime Configuration
- LM (Language Model) settings
- Modal secrets configuration
- Authentication mode (Dev vs Entra)
- Neon DB connection