Sandcastle

A production-ready AI agent workflow orchestrator with DAG-based workflow definitions, multi-provider model routing, pluggable code sandboxes, and automated governance. It ships with a Python SDK, a CLI, and a built-in MCP server.

Overview

Sandcastle is a production-grade AI workflow orchestration framework that addresses the orchestration, execution, and governance challenges of running AI agents in production. Its core is a DAG (directed acyclic graph) workflow engine: multi-step tasks are defined in YAML and executed automatically.

Core Positioning: "Stop babysitting your AI agents. Sandcastle runs your agent workflows so you don't have to."

Author: Tomas Pflanzer (@gizmax)

Version: v0.12.0 (Website) / v0.10.0 (PyPI)

Key Features

Workflow Engine

DAG Orchestration: YAML-defined multi-step pipelines with dependency management, parallel branching, and inter-step data passing
Parallel Execution: Concurrent execution of steps at the same DAG level, with parallel_over for list fan-out
Mixed Step Types: 9 step types, including Workflow-as-Step (nested workflow calls)
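The idea of running steps "at the same DAG level" concurrently can be sketched with a topological grouping. This is a minimal illustration and not Sandcastle's actual scheduler: the `dag_levels` helper, the step names, and the dependency map (standing in for whatever the YAML definition provides) are all assumptions.

```python
from collections import defaultdict


def dag_levels(deps):
    """Group steps into levels; a step's dependencies all sit in earlier levels.

    `deps` maps each step name to the list of steps it depends on
    (hypothetical structure, not Sandcastle's YAML schema).
    """
    indegree = {step: len(parents) for step, parents in deps.items()}
    children = defaultdict(list)
    for step, parents in deps.items():
        for parent in parents:
            if parent not in deps:
                raise ValueError(f"unknown dependency: {parent!r}")
            children[parent].append(step)

    # Steps with no dependencies form the first level.
    level = sorted(step for step, degree in indegree.items() if degree == 0)
    levels = []
    placed = 0
    while level:
        levels.append(level)
        placed += len(level)
        next_level = []
        for step in level:
            for child in children[step]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_level.append(child)
        level = sorted(next_level)

    # Any step left unplaced is part of a cycle, so the graph is not a DAG.
    if placed != len(deps):
        raise ValueError("cycle detected: workflow is not a DAG")
    return levels


# Hypothetical four-step workflow: two steps fan out from one, then join.
example_deps = {
    "scrape": [],
    "enrich": ["scrape"],
    "score": ["scrape"],
    "report": ["enrich", "score"],
}
example_levels = dag_levels(example_deps)
print(example_levels)  # [['scrape'], ['enrich', 'score'], ['report']]
```

Steps within one inner list have no dependencies on each other, so an executor could hand each level to a thread pool or async task group before moving to the next.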

Multi-Provider Model Routing

Supported Models: Claude (Opus/Sonnet/Haiku), OpenAI (Codex/Codex-mini), MiniMax (M2.5), Google Gemini (via OpenRouter)
Fine-grained Control: Specify different models per step (e.g., Opus for planning, Haiku for execution)
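Per-step model selection amounts to an optional override on each step with a workflow-level fallback. The sketch below illustrates that idea only; the `Step` class, the `resolve_model` helper, and the choice of "sonnet" as a default are hypothetical and do not reflect Sandcastle's real schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    """Hypothetical step record; `model` is None when the step sets no override."""
    name: str
    model: Optional[str] = None


def resolve_model(step, default_model):
    """Return the step's own model if set, otherwise the workflow default."""
    return step.model if step.model is not None else default_model


# Opus for planning, Haiku for execution, and the default for everything else.
steps = [Step("plan", model="opus"), Step("execute", model="haiku"), Step("summarize")]
assignments = {step.name: resolve_model(step, default_model="sonnet") for step in steps}
print(assignments)  # {'plan': 'opus', 'execute': 'haiku', 'summarize': 'sonnet'}
```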

Pluggable Sandbox Backends

Backend	Use Case	Characteristics
E2B	Cloud Default	Zero infrastructure setup
Docker	Self-hosted	Local container isolation
Local	Dev/Testing	Host subprocess, no isolation
Cloudflare Workers	Edge	Global distributed low latency
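A pluggable backend design usually means one small interface with interchangeable implementations selected by name. The following is a rough sketch of that pattern, assuming a hypothetical `SandboxBackend` protocol and `get_backend` registry; only a "local" backend is shown, and like the Local row in the table it runs code as a host subprocess with no isolation.

```python
import subprocess
import sys
from typing import Protocol


class SandboxBackend(Protocol):
    """Hypothetical interface every backend would implement."""

    def run_python(self, code: str, timeout: float = 10.0) -> str: ...


class LocalBackend:
    """Runs code in a host subprocess: no isolation, for dev/testing only."""

    def run_python(self, code: str, timeout: float = 10.0) -> str:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # raise CalledProcessError on a non-zero exit code
        )
        return result.stdout


# Registry keyed by backend name; other backends would be registered here.
BACKENDS = {"local": LocalBackend}


def get_backend(name: str) -> SandboxBackend:
    """Look up a backend by name and instantiate it."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unsupported sandbox backend: {name!r}") from None


output = get_backend("local").run_python("print(2 + 3)")
print(output.strip())  # 5
```

Because callers depend only on the interface, swapping the backend becomes a configuration change rather than a code change.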

Production Governance

Human Approval Gates: Pause workflows for manual review with timeout auto-actions
AutoPilot Self-Optimization: A/B tests different models and prompts, scores the variants with LLM-as-Judge auto-evaluation, and deploys the best variant
Policy Engine: Declarative rules for PII detection, sensitive-information blocking, and dynamic approval injection
Cost-Latency Optimizer: SLO-based dynamic model routing
Budget Guardrails: Cost limits per run, per tenant, and globally
Time Machine: Replay a run or fork it from any step
Real-time Event Streaming: Live updates over Server-Sent Events (SSE)
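To make the budget guardrail idea concrete, here is a minimal sketch of a check that rejects a charge if it would exceed any of the per-run, per-tenant, or global limits. The `BudgetGuard` class, the `BudgetExceeded` exception, and the dollar amounts are illustrative assumptions, not Sandcastle's implementation; a real system would also need to track spend separately for each run and tenant.

```python
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a charge would push any scope over its limit."""


@dataclass
class BudgetGuard:
    """Hypothetical guard tracking one run inside one tenant."""
    limits_usd: dict  # scope name -> limit in USD
    spent_usd: dict = field(default_factory=dict)

    def charge(self, cost_usd: float) -> None:
        # Check every scope first so a rejected charge records nothing.
        for scope, limit in self.limits_usd.items():
            if self.spent_usd.get(scope, 0.0) + cost_usd > limit:
                raise BudgetExceeded(f"{scope} budget of ${limit:.2f} would be exceeded")
        for scope in self.limits_usd:
            self.spent_usd[scope] = self.spent_usd.get(scope, 0.0) + cost_usd


guard = BudgetGuard(limits_usd={"run": 1.0, "tenant": 50.0, "global": 500.0})
guard.charge(0.5)
guard.charge(0.5)  # exactly at the run limit, still allowed
try:
    guard.charge(0.25)  # would bring the run total to 1.25
    blocked = False
except BudgetExceeded:
    blocked = True
print(blocked, guard.spent_usd["run"])  # True 1.0
```

Checking before recording keeps the tracked spend consistent when a charge is refused, which matters if the caller retries with a cheaper model.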

Integrations & Extensions

MCP Server: Built-in Model Context Protocol server that works with Claude Desktop, Cursor, and Windsurf
Tool Connectors: 12 built-in integrations (Slack, Jira, GitHub, HubSpot, Salesforce, Zendesk, Notion, Teams, Gmail, Google Drive, PostgreSQL, Webhooks)
Scheduled Execution: Cron-based scheduling

Installation & Quick Start

Local Mode (Recommended for Getting Started)

# Install
pip install sandcastle-ai

# Interactive setup wizard
sandcastle init

# Start service (API + Dashboard)
sandcastle serve

Access: Dashboard and API at http://localhost:8080

Required API Keys: ANTHROPIC_API_KEY, E2B_API_KEY
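Before starting the service, it can help to confirm that both keys are actually set. The check below is a small convenience sketch; the `missing_keys` helper is hypothetical and not part of the Sandcastle CLI or SDK.

```python
import os

# The two keys named above as required for local mode.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "E2B_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Example with an explicit mapping instead of the real environment.
example_missing = missing_keys({"ANTHROPIC_API_KEY": "sk-ant-placeholder"})
print(example_missing)  # ['E2B_API_KEY']
```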

Production Deployment

git clone https://github.com/gizmax/Sandcastle.git
cd Sandcastle

# Configure environment
cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-...
E2B_API_KEY=e2b_...
SANDBOX_BACKEND=e2b
DATABASE_URL=postgresql://...
REDIS_URL=redis://...
EOF

# One-command startup
docker compose up -d

API & SDK

Python SDK

from sandcastle import SandcastleClient

client = SandcastleClient(base_url="http://localhost:8080", api_key="sc_...")

# Run workflow and wait for completion
run = client.run(
    "lead-enrichment",
    input={"target_url": "https://example.com"},
    wait=True,
)
print(run.status)          # "completed"
print(run.total_cost_usd)  # 0.12

# Fork the run from the "score" step (for example, after a failure there) with a different model
new_run = client.fork(run.run_id, from_step="score", changes={"model": "opus"})

CLI Tool

sandcastle run lead-enrichment -i target_url=https://example.com --wait
sandcastle logs <run-id> --follow
sandcastle schedule create lead-enrichment "0 9 * * *"
sandcastle ls runs --status completed

REST API

curl -X POST http://localhost:8080/api/workflows/run \
  -H "Content-Type: application/json" \
  -d '{"workflow": "lead-enrichment", "input": {"target_url": "https://example.com"}}'
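The same request can be built from Python with only the standard library. This sketch mirrors the curl call above (same endpoint, header, and JSON body); the `build_run_request` helper is hypothetical, and the shape of the response is not documented here, so the example stops at constructing the request.

```python
import json
import urllib.request


def build_run_request(base_url, workflow, workflow_input):
    """Build a POST request equivalent to the curl example above."""
    body = json.dumps({"workflow": workflow, "input": workflow_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/workflows/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_run_request(
    "http://localhost:8080",
    "lead-enrichment",
    {"target_url": "https://example.com"},
)
print(request.get_method(), request.full_url)
# With the service running, send it with: urllib.request.urlopen(request)
```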

Use Cases

Domain	Typical Applications
Marketing	Blog-to-social conversion, SEO audits, competitor analysis
Sales	Lead enrichment & scoring, outreach sequence generation, CRM sync
Support	Ticket classification & prioritization, knowledge base updates, sentiment analysis
HR	Resume screening, onboarding checklist generation
Legal	Contract review & risk identification
Ops	Technical debt audits, incident post-mortems

Important Notes

⚠️ License Inconsistency: The GitHub repository lists BSL 1.1 (converting to Apache 2.0 in 2030), while the PyPI package lists the MIT License. Confirm which license actually applies before adopting Sandcastle.