A production-ready AI agent workflow orchestrator with DAG workflow definitions, multi-provider model routing, pluggable code sandboxes, and automated governance, accessible through a Python SDK, a CLI, and an MCP server.
Overview#
Sandcastle is a production-grade AI workflow orchestration framework that addresses the execution and governance challenges of running AI agents in production. Its workflow engine executes multi-step tasks defined in YAML as a DAG (directed acyclic graph).
Core Positioning: "Stop babysitting your AI agents. Sandcastle runs your agent workflows so you don't have to."
Author: Tomas Pflanzer (@gizmax)
Version: v0.12.0 (Website) / v0.10.0 (PyPI)
Key Features#
Workflow Engine#
- DAG Orchestration: YAML-defined multi-step pipelines with dependency management, parallel branching, and inter-step data passing
- Parallel Execution: Concurrent execution of steps at the same DAG level, with `parallel_over` for list fan-out
- Mixed Step Types: 9 step types, including Workflow-as-Step (nested workflow calls)
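
To make "steps at the same DAG level" concrete, here is a minimal sketch of how a scheduler could group steps into levels and fan a step out over a list. It is not Sandcastle's implementation: `dag_levels`, `run_over`, and the step names are hypothetical, and the only assumption is that each step declares the steps it depends on.

```python
from concurrent.futures import ThreadPoolExecutor


def dag_levels(depends_on: dict[str, list[str]]) -> list[list[str]]:
    """Group steps into levels; every step in a level can run concurrently."""
    remaining = {step: set(deps) for step, deps in depends_on.items()}
    unknown = {d for deps in remaining.values() for d in deps} - remaining.keys()
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    levels: list[list[str]] = []
    done: set[str] = set()
    while remaining:
        # A step is ready once all of its dependencies have completed.
        ready = sorted(s for s, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("cycle detected: not a DAG")
        levels.append(ready)
        done.update(ready)
        for step in ready:
            del remaining[step]
    return levels


def run_over(items, fn):
    """Fan one step out over a list, in the spirit of parallel_over."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(fn, items))  # results keep the input order


# Hypothetical workflow: two steps depend on "scrape" and can run side by side.
steps = {
    "scrape": [],
    "enrich": ["scrape"],
    "score": ["scrape"],
    "report": ["enrich", "score"],
}
levels = dag_levels(steps)
lengths = run_over(["a.com", "bb.com"], len)
```

Here `enrich` and `score` land in the same level because both depend only on `scrape`, so a scheduler is free to run them concurrently.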
Multi-Provider Model Routing#
- Supported Models: Claude (Opus/Sonnet/Haiku), OpenAI (Codex/Codex-mini), MiniMax (M2.5), Google Gemini (via OpenRouter)
- Fine-grained Control: Specify different models per step (e.g., Opus for planning, Haiku for execution)
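
Per-step model selection amounts to a lookup with a fallback. The sketch below illustrates the idea only; the `model` field, the `resolve_model` helper, and the default value are assumptions rather than Sandcastle's actual schema.

```python
DEFAULT_MODEL = "sonnet"  # assumed workflow-level default


def resolve_model(step: dict, default: str = DEFAULT_MODEL) -> str:
    """Return the step's own model if it sets one, else the default."""
    return step.get("model") or default


# Hypothetical steps: Opus for planning, Haiku for execution, default otherwise.
steps = [
    {"id": "plan", "model": "opus"},
    {"id": "execute", "model": "haiku"},
    {"id": "summarize"},
]
assignments = {step["id"]: resolve_model(step) for step in steps}
```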
Pluggable Sandbox Backends#
| Backend | Use Case | Characteristics |
|---|---|---|
| E2B | Cloud Default | Zero infrastructure setup |
| Docker | Self-hosted | Local container isolation |
| Local | Dev/Testing | Host subprocess, no isolation |
| Cloudflare Workers | Edge | Globally distributed, low latency |
Production Governance#
- Human Approval Gates: Pause workflows for manual review with timeout auto-actions
- AutoPilot Self-Optimization: A/B tests different models and prompts, scores the variants with LLM-as-Judge evaluation, and deploys the best one
- Policy Engine: Declarative rules for PII detection, sensitive info blocking, dynamic approval injection
- Cost-Latency Optimizer: SLO-based dynamic model routing
- Budget Guardrails: Per-run/tenant/global cost limits
- Time Machine: Replay or fork from any step
- Real-time Event Streaming: SSE live updates
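
As an illustration of the budget guardrails listed above, the following sketch enforces cost limits at several scopes at once. It is a hypothetical model, not Sandcastle's code: `BudgetGuard`, `BudgetExceeded`, and the scope names are invented for the example.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push any scope over its limit."""


@dataclass
class BudgetGuard:
    # Limit per scope in USD, e.g. {"run": 1.0, "tenant": 50.0, "global": 500.0}
    limits_usd: dict[str, float]
    spent_usd: dict[str, float] = field(default_factory=dict)

    def charge(self, cost_usd: float) -> None:
        # Check every scope first so a rejected charge leaves no partial update.
        for scope, limit in self.limits_usd.items():
            if self.spent_usd.get(scope, 0.0) + cost_usd > limit:
                raise BudgetExceeded(f"{scope} budget of ${limit:.2f} would be exceeded")
        for scope in self.limits_usd:
            self.spent_usd[scope] = self.spent_usd.get(scope, 0.0) + cost_usd


guard = BudgetGuard({"run": 0.25, "tenant": 5.0, "global": 100.0})
guard.charge(0.125)  # first step
guard.charge(0.125)  # second step reaches the per-run limit exactly
```

A third `guard.charge(0.125)` would raise `BudgetExceeded` for the `run` scope, which is the point at which an orchestrator could stop the run or route it to an approval gate.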
Integrations & Extensions#
- MCP Server: Built-in Model Context Protocol server that works with Claude Desktop, Cursor, and Windsurf
- Tool Connectors: 12 built-in integrations (Slack, Jira, GitHub, HubSpot, Salesforce, Zendesk, Notion, Teams, Gmail, Google Drive, PostgreSQL, Webhooks)
- Scheduled Execution: Cron-based scheduling
Installation & Quick Start#
Local Mode (Recommended for Getting Started)#
# Install
pip install sandcastle-ai
# Interactive setup wizard
sandcastle init
# Start service (API + Dashboard)
sandcastle serve
Access: Dashboard and API at http://localhost:8080
Required API Keys: ANTHROPIC_API_KEY, E2B_API_KEY
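
Before starting the service, it can help to confirm that both keys are set. The check below is a small standalone sketch and not part of Sandcastle; `missing_keys` is a hypothetical helper, and only the two variable names come from the list above.

```python
import os

REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "E2B_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required API keys are set.")
```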
Production Deployment#
git clone https://github.com/gizmax/Sandcastle.git
cd Sandcastle
# Configure environment
cat > .env << 'EOF'
ANTHROPIC_API_KEY=sk-ant-...
E2B_API_KEY=e2b_...
SANDBOX_BACKEND=e2b
DATABASE_URL=postgresql://...
REDIS_URL=redis://...
EOF
# One-command startup
docker compose up -d
API & SDK#
Python SDK#
from sandcastle import SandcastleClient

client = SandcastleClient(base_url="http://localhost:8080", api_key="sc_...")

# Run workflow and wait for completion
run = client.run(
    "lead-enrichment",
    input={"target_url": "https://example.com"},
    wait=True,
)
print(run.status)          # "completed"
print(run.total_cost_usd)  # 0.12

# Fork from failed step
new_run = client.fork(run.run_id, from_step="score", changes={"model": "opus"})
CLI Tool#
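# Run a workflow with one input and wait for it to finish
sandcastle run lead-enrichment -i target_url=https://example.com --wait
# Follow the logs of a run
sandcastle logs <run-id> --follow
# Schedule the workflow with a cron expression (every day at 09:00)
sandcastle schedule create lead-enrichment "0 9 * * *"
# List completed runs
sandcastle ls runs --status completed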
REST API#
curl -X POST http://localhost:8080/api/workflows/run \
  -H "Content-Type: application/json" \
  -d '{"workflow": "lead-enrichment", "input": {"target_url": "https://example.com"}}'
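
The same request can be built from Python with only the standard library. This is a sketch of the curl call above, nothing more: `build_run_request` is a hypothetical helper, the response format is not documented here, and any authentication header your deployment needs would have to be added.

```python
import json
import urllib.request


def build_run_request(workflow, input_data, base_url="http://localhost:8080"):
    """Build the POST request that the curl example sends."""
    body = json.dumps({"workflow": workflow, "input": input_data}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/workflows/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_run_request("lead-enrichment", {"target_url": "https://example.com"})

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```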
Use Cases#
| Domain | Typical Applications |
|---|---|
| Marketing | Blog-to-social conversion, SEO audits, competitor analysis |
| Sales | Lead enrichment & scoring, outreach sequence generation, CRM sync |
| Support | Ticket classification & prioritization, knowledge base updates, sentiment analysis |
| HR | Resume screening, onboarding checklist generation |
| Legal | Contract review & risk identification |
| Ops | Technical debt audits, incident post-mortems |
Important Notes#
⚠️ License Inconsistency: The GitHub repository lists BSL 1.1 (converting to Apache 2.0 in 2030), while the PyPI package lists the MIT License. Confirm which license actually applies before adopting the project.