Open Computer Use

An open-source, full-stack framework for autonomous computer agents that control browsers, terminals, and desktop apps through natural language inside Docker-based VMs. It is maintained by coasty-ai under the Apache 2.0 license and reports a score of 82% on the OSWorld benchmark.

Project Overview

Open Computer Use is an open-source, full-stack framework for autonomous virtual computer agents. It lets AI models operate browsers, terminals, and desktop applications the way a human user would. The project describes itself as "The Open Framework for autonomous virtual computer agents at scale" and can be fully self-hosted.

Core Capabilities

Browser Agent

Search-first strategy (Google Search API)
Intelligent web navigation and auto form filling
Element detection and smart clicking
Multi-tab parallel management
Screenshot verification and visual feedback

Terminal Agent

Command execution in isolated environments
File operations (read/write/edit/delete)
Script execution and package installation
Real-time streaming output
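
The combination of command execution and real-time streaming output can be illustrated with a minimal sketch using only the Python standard library. This is not the project's own implementation: the function name `stream_command` is hypothetical, and the real Terminal Agent runs commands inside the Docker VM rather than on the host.

```python
import subprocess
import sys
from typing import Iterator


def stream_command(argv: list[str]) -> Iterator[str]:
    """Run argv and yield each output line as soon as the process writes it.

    Hypothetical helper for illustration; not part of Open Computer Use.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so output ordering is preserved
        text=True,
        bufsize=1,  # line-buffered on the reading side of the pipe
    )
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, argv)


if __name__ == "__main__":
    # Each line is printed as it arrives instead of after the process exits.
    for line in stream_command([sys.executable, "-c", "print('hello'); print('world')"]):
        print(line)
```

Yielding lines from a generator lets the caller forward each one to a client (for example over a WebSocket) without waiting for the command to finish.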

Desktop Agent

CV-based UI element detection
Mouse/keyboard control, window management
OCR text extraction
Linux desktop support (Windows and macOS VM support is planned)

Multi-Agent System

AI Planner for task decomposition
Sequential execution with context passing
Error handling and automatic retry
User interaction confirmation mechanism
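
The four points above (a plan of steps, sequential execution with context passing, retry on error, and user confirmation) can be sketched as a small loop. This is an illustrative sketch under stated assumptions, not the framework's actual code: `Step`, `run_plan`, the `step_N` context keys, and the choice to skip (rather than abort on) a declined step are all hypothetical, and the plan is passed in directly instead of being produced by an AI planner.

```python
from dataclasses import dataclass
from typing import Callable

Context = dict[str, str]
Agent = Callable[[str, Context], str]


@dataclass
class Step:
    agent: str                        # name of the agent that runs this step
    instruction: str                  # natural-language instruction for that agent
    needs_confirmation: bool = False  # ask the user before running


def run_plan(
    steps: list[Step],
    agents: dict[str, Agent],
    confirm: Callable[[Step], bool],
    max_retries: int = 2,
) -> Context:
    """Run steps in order, passing earlier results to later steps via context."""
    context: Context = {}
    for index, step in enumerate(steps):
        if step.needs_confirmation and not confirm(step):
            # Assumption: a declined step is recorded and skipped, not fatal.
            context[f"step_{index}"] = "skipped by user"
            continue
        last_error: Exception | None = None
        for _attempt in range(max_retries + 1):
            try:
                context[f"step_{index}"] = agents[step.agent](step.instruction, context)
                break
            except Exception as exc:  # retry on any agent failure
                last_error = exc
        else:
            # The retry loop finished without a successful break.
            raise RuntimeError(
                f"step {index} failed after {max_retries + 1} attempts"
            ) from last_error
    return context
```

Each agent receives the accumulated context, so a terminal step can, for example, act on a value that an earlier browser step stored.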

Performance Metrics (Official)

OSWorld benchmark: #1 rank, 82% score
Average task completion time: ~45 seconds
Concurrent sessions per server: 50+
Tool call latency: <500 ms
VM startup time: ~15 seconds
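
As a rough back-of-the-envelope reading of these figures, the session count and task time imply a per-server throughput. The calculation below is only a sketch: it assumes every session is busy all the time, treats the approximate figures as exact, and the function name `tasks_per_hour` is hypothetical.

```python
def tasks_per_hour(
    concurrent_sessions: int,
    seconds_per_task: float,
    startup_seconds: float = 0.0,
) -> float:
    """Idealized throughput of one server when every session is always busy."""
    return concurrent_sessions * 3600 / (seconds_per_task + startup_seconds)


# 50 sessions, ~45 s per task, VMs already running
print(tasks_per_hour(50, 45))      # 4000.0
# Same, but paying the ~15 s VM startup before every task
print(tasks_per_hour(50, 45, 15))  # 3000.0
```

Real throughput depends on task mix, model latency, and idle time between tasks, so these numbers are an optimistic estimate rather than a measured result.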

Use Cases

Research & Data Collection: Web scraping, competitive analysis, academic paper collection
Testing & QA: Automated UI testing, cross-browser testing, regression testing
DevOps & Automation: Server configuration, deployment automation, log analysis
E-commerce Operations: Price monitoring, order management, inventory tracking
Business Intelligence: Report generation, dashboard monitoring, KPI tracking

System Architecture

Frontend (Next.js 15) → Backend API (FastAPI) → Docker VM (Ubuntu 22.04 + XFCE)

The frontend provides the chat UI, model selection, and VM management. The backend is the orchestration layer: it handles AI planning, multi-agent execution, and WebSocket communication. The Docker VM contains a Chrome browser, a terminal, and a VNC server.

Installation & Deployment

Prerequisites: Node.js 20+, Python 3.10+, Docker & Docker Compose, Supabase account, AI provider API keys

git clone https://github.com/coasty-ai/open-computer-use.git
cd open-computer-use
cp .env.example .env
cp backend/.env.example backend/.env
npm install
cd backend && python -m venv venv && source venv/bin/activate && pip install -r requirements.txt
# return to the repository root (assuming docker-compose.yml lives there)
cd ..
docker-compose up --build

Access: Frontend http://localhost:3000, Backend http://localhost:8001
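
After startup, a quick way to confirm that both services are listening is a small reachability check. The sketch below uses only the standard library; the helper name `is_reachable` is hypothetical, and it checks only that something answers HTTP on each port, not that the application is healthy.

```python
import http.client
import urllib.error
import urllib.request

# Local services should be reached directly, so ignore any configured HTTP proxy.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if a server answers HTTP at url, whatever the status code."""
    try:
        with _OPENER.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        return True  # the server responded, just not with a 2xx status
    except (OSError, http.client.HTTPException):
        return False  # refused, timed out, or not speaking HTTP


if __name__ == "__main__":
    # URLs taken from the access line above.
    for name, url in [
        ("Frontend", "http://localhost:3000"),
        ("Backend", "http://localhost:8001"),
    ]:
        status = "up" if is_reachable(url) else "not reachable"
        print(f"{name} ({url}): {status}")
```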

Key Configuration

Required Environment Variables: Supabase config (URL, Anon Key, Service Role), Security keys (ENCRYPTION_KEY, CSRF_SECRET)
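
A missing key is easier to diagnose before the stack starts than after. The following sketch checks the two security keys named above; it is an assumption-laden example, not part of the project. The helper `missing_variables` is hypothetical, and the Supabase variable names are not given here, so they are left out and would need to be added to the tuple using the names from `.env.example`.

```python
import os
from typing import Iterable, Mapping

# Only the names stated in this document; Supabase variable names are not
# listed here and must be taken from the project's .env.example.
REQUIRED_SECRETS = ("ENCRYPTION_KEY", "CSRF_SECRET")


def missing_variables(
    env: Mapping[str, str],
    required: Iterable[str] = REQUIRED_SECRETS,
) -> list[str]:
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables(os.environ)
    if missing:
        print("Missing required variables: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```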

AI Provider Support: OpenAI, Anthropic, Google, xAI, Mistral, Azure, Perplexity, OpenRouter (100+ models)

BYOK (bring your own key) Mode: All API keys are encrypted at rest, and users keep full control over their AI costs and usage

Pending Verification

The 82% OSWorld benchmark score requires independent verification
Windows/macOS VM support is planned for Q1 2026
No associated academic papers were found