Monty

A minimal, secure Python interpreter written in Rust for AI agents, featuring microsecond startup latency and state snapshotting.

Overview

Monty is an experimental Python interpreter developed by the Pydantic team, with its core written entirely in Rust. It is not a CPython replacement but a runtime built for AI agent scenarios: it lets Python code generated by an LLM run in a tightly controlled, isolated environment.

Problems Solved

Security Risks: Running LLM-generated code directly with exec() or subprocess poses serious security threats; Monty blocks access to the filesystem, network, and environment variables by default.
Performance Bottleneck: Traditional sandbox solutions (Docker, Pyodide, WASM) have high startup latency (milliseconds to seconds); Monty starts in microseconds (0.06 ms in the comparison table below).
State Persistence Difficulty: Traditional interpreters struggle to pause mid-execution and save their state; Monty supports snapshotting at external function call boundaries.

Core Capabilities

Execution Core

Rust-based interpreter: Does not depend on CPython; the core logic is written entirely in Rust and exposed to Python through PyO3 bindings
Python subset support: Supports a reasonable subset of Python syntax, sufficient for agent logic
Type checking integration: Built-in ty type checker with support for modern Python type hints
Sync/Async support: Code can be invoked synchronously or asynchronously
Stdout/Stderr capture: Standard output and error streams are captured and returned to the caller

Security & Isolation

Complete environment isolation: No filesystem, network, or environment variable access by default
External function calls: Only developer-authorized functions can be invoked
Resource limit tracking: Tracks memory allocation, stack depth, and execution time, and cancels execution automatically when a limit is exceeded
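As a rough illustration of two of the ideas above (an allow-list of external functions and budget tracking), here is a minimal pure-Python sketch. It does not use Monty and says nothing about its internals: LimitTracker, LimitExceeded, and call_external are hypothetical names invented for this example, and a simple step counter stands in for the memory and stack-depth accounting a real interpreter would perform.

```python
import time


class LimitExceeded(Exception):
    """Raised when a (hypothetical) execution budget is exhausted."""


class LimitTracker:
    """Counts units of work and elapsed time against fixed budgets."""

    def __init__(self, max_steps, max_duration_secs):
        self.max_steps = max_steps
        self.max_duration_secs = max_duration_secs
        self.steps = 0
        self.started = time.monotonic()

    def tick(self):
        """Call once per unit of work; raises when a budget is exceeded."""
        self.steps += 1
        if self.steps > self.max_steps:
            raise LimitExceeded("step budget exceeded")
        if time.monotonic() - self.started > self.max_duration_secs:
            raise LimitExceeded("time budget exceeded")


def call_external(name, args, allowed):
    """Dispatch to a host function only if it was explicitly registered."""
    if name not in allowed:
        raise PermissionError(f"external function {name!r} is not authorized")
    return allowed[name](*args)


# Only "add" is authorized; anything else is rejected by name.
allowed = {"add": lambda a, b: a + b}
print(call_external("add", (2, 3), allowed))

try:
    call_external("open", ("/etc/passwd",), allowed)
    denied_message = None
except PermissionError as exc:
    denied_message = str(exc)
print(denied_message)

# A runaway loop is cancelled once the step budget runs out.
tracker = LimitTracker(max_steps=1000, max_duration_secs=5.0)
try:
    while True:
        tracker.tick()
    limit_message = None
except LimitExceeded as exc:
    limit_message = str(exc)
print(limit_message)
```

The design point is that both checks live on the host side: sandboxed code can only name a function, and the host decides whether that name maps to anything callable.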

State Management & Persistence

Execution snapshots: Complete interpreter state serialized to bytes at external function call points
State restoration: Execution can be resumed from a snapshot, including in a different process or on a different machine
Pre-compiled serialization: Parsed code objects can be dumped and loaded to avoid repeated parsing
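To make the idea of snapshotting at external function call boundaries more concrete, here is a toy sketch in plain Python. It is not Monty's implementation or bytecode: the instruction set, PROGRAM, and the start, dump, and resume helpers are all assumptions made up for illustration. The point it shows is that if the whole interpreter state is plain data (program counter, stack, variables), it can be serialized to bytes whenever the program pauses to ask the host for an external result.

```python
import json

# Toy instruction set (hypothetical, invented for this sketch):
#   ("load", name)            push a variable onto the stack
#   ("store", name)           pop the stack into a variable
#   ("call_ext", name, argc)  pause and ask the host to run a function
#   ("len",)                  replace the top of the stack with its length
# This program corresponds to: data = fetch(url); len(data)
PROGRAM = [
    ("load", "url"),
    ("call_ext", "fetch", 1),
    ("store", "data"),
    ("load", "data"),
    ("len",),
]


def step_until_pause(state):
    """Run until the program ends or reaches an external call.

    Returns ("done", result) or ("paused", (function_name, args)).
    """
    while state["pc"] < len(PROGRAM):
        op, *operands = PROGRAM[state["pc"]]
        state["pc"] += 1
        if op == "load":
            state["stack"].append(state["vars"][operands[0]])
        elif op == "store":
            state["vars"][operands[0]] = state["stack"].pop()
        elif op == "len":
            state["stack"].append(len(state["stack"].pop()))
        elif op == "call_ext":
            name, argc = operands
            args = [state["stack"].pop() for _ in range(argc)][::-1]
            return "paused", (name, args)
    return "done", state["stack"].pop()


def start(inputs):
    """Create a fresh state and run to the first pause or to completion."""
    state = {"pc": 0, "stack": [], "vars": dict(inputs)}
    return state, step_until_pause(state)


def dump(state):
    """Serialize the complete interpreter state to bytes."""
    return json.dumps(state).encode("utf-8")


def resume(snapshot, return_value):
    """Rebuild the state from bytes and continue with the host's result."""
    state = json.loads(snapshot.decode("utf-8"))
    state["stack"].append(return_value)
    return step_until_pause(state)


state, (status, request) = start({"url": "https://example.com"})
snapshot = dump(state)  # bytes; could be stored or sent to another machine
final_status, result = resume(snapshot, "hello world")
print(status, request)       # paused ('fetch', ['https://example.com'])
print(final_status, result)  # done 11
```

Because resume works only from the bytes, nothing requires it to run in the same process that produced the snapshot.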

Performance Comparison

Solution	Language Completeness	Security	Startup Latency	Snapshot Support
Monty	Partial	Strict	0.06ms	Easy
Docker	Full	Good	195ms	Medium
Pyodide	Full	Weak	2800ms	Difficult
starlark-rust	Very Limited	Good	1.7ms	None
WASI/Wasmer	Near Full	Strict	66ms	Medium
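To put the startup figures in the table on a common scale, the short calculation below converts them to microseconds and expresses each as a multiple of Monty's figure. The numbers are copied from the table as given; nothing here measures anything, and the variable names are only illustrative.

```python
# Startup latencies in milliseconds, copied from the comparison table.
startup_ms = {
    "Monty": 0.06,
    "Docker": 195,
    "Pyodide": 2800,
    "starlark-rust": 1.7,
    "WASI/Wasmer": 66,
}

baseline = startup_ms["Monty"]
# Ratio of each solution's startup latency to Monty's.
ratios = {name: ms / baseline for name, ms in startup_ms.items()}

for name, ms in startup_ms.items():
    print(f"{name:<14} {ms * 1000:>10.0f} us  {ratios[name]:>8.0f}x")
```

On these figures, 0.06 ms is 60 microseconds, and Docker's 195 ms is about 3250 times that.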

Multi-language Bindings

Language	Package Name
Python	`pydantic-monty` (PyPI)
JavaScript/TypeScript	`@pydantic/monty` (npm)
Rust	Direct crate reference

Installation

Python

uv add pydantic-monty
# or
pip install pydantic-monty

JavaScript/TypeScript

npm install @pydantic/monty

Usage Examples

Basic Execution

import pydantic_monty

m = pydantic_monty.Monty('1 + 2')
print(m.run())  # Output: 3

With Input Variables and Resource Limits

import pydantic_monty

m = pydantic_monty.Monty('x * y', inputs=['x', 'y'])
# Abort the run if it takes longer than one second.
limits = pydantic_monty.ResourceLimits(max_duration_secs=1.0)
result = m.run(inputs={'x': 2, 'y': 3}, limits=limits)

External Functions and Snapshot Recovery

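import pydantic_monty

code = "data = fetch(url); len(data)"
m = pydantic_monty.Monty(code, inputs=['url'], external_functions=['fetch'])

# start() runs until the first external function call (fetch) and pauses there.
progress = m.start(inputs={'url': 'https://example.com'})
# Serialize the paused interpreter state to bytes.
snapshot_data = progress.dump()

# Later, possibly in another process: restore and supply fetch's return value.
restored = pydantic_monty.MontySnapshot.load(snapshot_data)
result = restored.resume(return_value='hello world')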

Rust Usage

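use monty::{MontyRun, MontyObject, NoLimitTracker, PrintWriter};

fn main() {
    // Python source to execute; `x` is supplied as an input below.
    let code = r#"
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)
fib(x)
"#;

    // Parse the code once, declaring `x` as its only input.
    let runner = MontyRun::new(code.to_owned(), "fib.py", vec!["x".to_owned()], vec![]).unwrap();
    // Run with x = 10 and no resource limits; printed output goes to stdout.
    let result = runner.run(vec![MontyObject::Int(10)], NoLimitTracker, &mut PrintWriter::Stdout).unwrap();
}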

Core API

Monty(code, inputs, external_functions): Main entry class
MontyRun: Rust-side runner wrapper
MontySnapshot: Mid-execution snapshot object
ResourceLimits: Resource limit configuration object
.run(inputs, limits, external_functions): Synchronous execution
.start(inputs): Start execution, pause at external function
.resume(return_value): Resume execution
.dump() / .load(): Serialization and deserialization

Current Limitations

Limited standard library: Only sys, typing, asyncio supported; dataclasses and json coming soon
No third-party library support: Cannot import Pydantic, NumPy, etc.
Syntax limitations: No class definitions or match statements yet (planned)

Use Cases

AI Agent tool calling (LLM writes code instead of generating JSON)
Serverless or stateless computing requiring fast startup
Scenarios balancing untrusted code execution with strict resource control
Long task interruption and recovery (e.g., Human-in-the-loop interaction)

PydanticAI (planned integration with Monty for code-mode)
References: Cloudflare Codemode, Anthropic Programmatic Tool Calling, Hugging Face Smol Agents