
Open Computer Use

Added Feb 24, 2026
Category: Agent & Tooling
Open Source
Tags: Python | TypeScript | Workflow Automation | Docker | Large Language Models | Next.js | FastAPI | Multimodal | AI Agents | Agent Framework | Browser Automation | Natural Language Processing | Agent & Tooling | Model & Inference Framework | Developer Tools & Coding | Automation, Workflow & RPA

An open-source full-stack framework for autonomous computer agents that control browsers, terminals, and desktop apps via natural language inside Docker VMs. Maintained by coasty-ai under the Apache 2.0 license; the project reports an 82% score on the OSWorld benchmark.

Project Overview

Open Computer Use is an open-source full-stack framework for autonomous virtual computer agents. It lets AI operate browsers, terminals, and desktop applications the way a human would. Positioned as "The Open Framework for autonomous virtual computer agents at scale", it can be fully self-hosted.

Core Capabilities

Browser Agent

  • Search-first strategy (Google Search API)
  • Intelligent web navigation and auto form filling
  • Element detection and smart clicking
  • Multi-tab parallel management
  • Screenshot verification and visual feedback

Terminal Agent

  • Command execution in isolated environments
  • File operations (read/write/edit/delete)
  • Script execution and package installation
  • Real-time streaming output
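
The real-time streaming output listed above can be pictured with a minimal Python sketch. It is not the project's implementation: `stream_command` is a hypothetical helper, and it runs the command on the local machine rather than inside an isolated Docker VM.

```python
import subprocess
import sys
from typing import Iterator, List


def stream_command(argv: List[str]) -> Iterator[str]:
    """Run a command and yield its output line by line as it is produced.

    Hypothetical helper for illustration only; stderr is merged into stdout.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,  # line-buffered on our side of the pipe
    )
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Always release the pipe and reap the child process.
        proc.stdout.close()
        proc.wait()


if __name__ == "__main__":
    demo = [sys.executable, "-c", "print('hello'); print('world')"]
    for line in stream_command(demo):
        print(line)
```

A generator fits here because the caller (for example, a WebSocket handler) can forward each line as soon as it arrives instead of waiting for the command to finish.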

Desktop Agent

  • CV-based UI element detection
  • Mouse/keyboard control, window management
  • OCR text extraction
  • Platform support: Linux desktop (Windows and macOS VM support is listed as planned)

Multi-Agent System

  • AI Planner for task decomposition
  • Sequential execution with context passing
  • Error handling and automatic retry
  • User interaction confirmation mechanism
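
Sequential execution with context passing and automatic retry, as listed above, might look roughly like the Python sketch below. The `Step` class, `execute_plan`, and the retry count are all hypothetical names chosen for illustration; the plan is written by hand here, whereas the project uses an AI planner to produce it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, str]


@dataclass
class Step:
    """One unit of work in a plan (hypothetical structure)."""
    name: str
    run: Callable[[Context], str]  # receives the results of earlier steps


def execute_plan(steps: List[Step], max_retries: int = 2) -> Context:
    """Run steps in order, storing each result in a shared context.

    A failing step is retried up to max_retries times; after that the
    last exception propagates to the caller.
    """
    context: Context = {}
    for step in steps:
        for attempt in range(max_retries + 1):
            try:
                context[step.name] = step.run(context)
                break  # step succeeded, move on to the next one
            except Exception:
                if attempt == max_retries:
                    raise
    return context


if __name__ == "__main__":
    plan = [
        Step("search", lambda ctx: "three results"),
        Step("summary", lambda ctx: "summary of " + ctx["search"]),
    ]
    print(execute_plan(plan))
```

The second step reads `ctx["search"]`, which is the context-passing part: each step sees everything produced before it.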

Performance Metrics (Official)

  • OSWorld benchmark: ranked #1 with an 82% score
  • Average task completion ~45 seconds
  • 50+ concurrent sessions per server
  • Tool call latency <500ms
  • VM startup time ~15 seconds

Use Cases

  • Research & Data Collection: Web scraping, competitive analysis, academic paper collection
  • Testing & QA: Automated UI testing, cross-browser testing, regression testing
  • DevOps & Automation: Server configuration, deployment automation, log analysis
  • E-commerce Operations: Price monitoring, order management, inventory tracking
  • Business Intelligence: Report generation, dashboard monitoring, KPI tracking

System Architecture

Frontend (Next.js 15) → Backend API (FastAPI) → Docker VM (Ubuntu 22.04 + XFCE)

The frontend provides the chat UI, model selection, and VM management. The backend is the orchestration layer for AI planning, multi-agent execution, and WebSocket communication. The Docker VM contains a Chrome browser, a terminal, and a VNC server.

Installation & Deployment

Prerequisites: Node.js 20+, Python 3.10+, Docker & Docker Compose, Supabase account, AI provider API keys

git clone https://github.com/coasty-ai/open-computer-use.git
cd open-computer-use
cp .env.example .env
cp backend/.env.example backend/.env
npm install
cd backend && python -m venv venv && source venv/bin/activate && pip install -r requirements.txt
docker-compose up --build

Access: Frontend http://localhost:3000, Backend http://localhost:8001
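
Once the stack is up, a quick reachability check against the two addresses above can confirm that both services answer. This is a small Python sketch using only the standard library; `is_reachable` is a hypothetical helper, and it only tests that something responds over HTTP, not that the application is healthy.

```python
import http.client
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just not with a 2xx status.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False


if __name__ == "__main__":
    # URLs taken from the listing above.
    for name, url in [("Frontend", "http://localhost:3000"),
                      ("Backend", "http://localhost:8001")]:
        status = "up" if is_reachable(url) else "not reachable"
        print(f"{name} ({url}): {status}")
```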

Key Configuration

Required Environment Variables: Supabase config (URL, Anon Key, Service Role), Security keys (ENCRYPTION_KEY, CSRF_SECRET)
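
A pre-flight check for the required variables can catch a missing key before the containers start. The Python sketch below is an assumption-laden illustration: `ENCRYPTION_KEY` and `CSRF_SECRET` are named above, but the Supabase variable names are hypothetical placeholders, so check `.env.example` in the repository for the real ones.

```python
import os
from typing import List, Mapping

# Named in the listing above.
SECURITY_VARS = ["ENCRYPTION_KEY", "CSRF_SECRET"]

# Hypothetical placeholder names: the listing only says
# "URL, Anon Key, Service Role". Replace with the names in .env.example.
SUPABASE_VARS = ["SUPABASE_URL", "SUPABASE_ANON_KEY", "SUPABASE_SERVICE_ROLE_KEY"]


def missing_vars(env: Mapping[str, str], required: List[str]) -> List[str]:
    """Return the required names that are unset or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_vars(os.environ, SECURITY_VARS + SUPABASE_VARS)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All required variables are set.")
```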

AI Provider Support: OpenAI, Anthropic, Google, xAI, Mistral, Azure, Perplexity, OpenRouter (100+ models)

BYOK (bring your own key) mode: all API keys are encrypted at rest, and users keep full control over AI costs and usage.

Pending Verification

  • The 82% OSWorld benchmark score has not been independently verified
  • Windows/macOS VM support planned for Q1 2026
  • No associated academic papers found
