
Ghost OS

Added Apr 23, 2026
Category: Agent & Tooling
Open Source
Tags: Python · Workflow Automation · Desktop Application · Model Context Protocol · Multimodal · AI Agents · Agent & Tooling · Docs, Tutorials & Resources · Automation, Workflow & RPA · Protocol, API & Integration · Computer Vision & Multimodal

Full computer-use system for AI agents on macOS, exposing 29 MCP tools for structured perception, visual grounding, synthetic input, and self-learning Recipe workflows.

Ghost OS is a full computer-use system for AI agents on macOS. It exposes the macOS Accessibility API, visual models, and synthetic input as 29 standard MCP (Model Context Protocol) tools, enabling any MCP-compatible AI agent to "see" and control native Mac applications.

Structured Perception with Cascading Fallback

The system prioritizes reading structured UI element data (buttons, text fields, labels, positions, available actions) via the macOS Accessibility API, with response times of 50–500ms. When AX Tree information is insufficient (e.g., when Chrome flattens web elements into AXGroups), it automatically falls back to the Chrome DevTools Protocol for DOM queries, then to the local ShowUI-2B visual model for pixel-level grounding, and finally to CGEvent coordinate input. This AX Tree → CDP → ShowUI-2B → CGEvent cascade keeps the system robust across application scenarios.
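The cascade described above can be sketched as a simple priority chain: try each perception backend in order and take the first one that resolves the query. This is an illustrative sketch only; the function and backend names below are hypothetical, not Ghost OS's actual internals.

```python
# Hypothetical sketch of the AX Tree -> CDP -> ShowUI-2B -> CGEvent cascade.
# Handler names are illustrative; Ghost OS's real internals are in Swift.

def locate_element(query, handlers):
    """Try each perception backend in priority order; return the
    first backend that resolves the query, or None if all fail."""
    for name, handler in handlers:
        result = handler(query)
        if result is not None:
            return name, result
    return None

# Stub backends: AX Tree succeeds only for native controls,
# CDP only for web content, ShowUI-2B as a pixel-level catch-all.
def ax_tree(q):    return {"role": "AXButton"} if q == "native" else None
def cdp(q):        return {"node": "button#send"} if q == "web" else None
def showui_2b(q):  return {"x": 412, "y": 230}  # always grounds visually

CASCADE = [("ax", ax_tree), ("cdp", cdp), ("showui", showui_2b)]

print(locate_element("native", CASCADE))  # ('ax', {'role': 'AXButton'})
print(locate_element("web", CASCADE))     # ('cdp', {'node': 'button#send'})
```

Each fallback trades speed for generality: the AX Tree query is fastest, while pixel-level grounding is slowest but works on anything rendered to the screen.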

Self-Learning Recipe System

A core feature introduced in v2.2.0. Invoking ghost_learn_start / ghost_learn_stop captures user operations via a CGEvent tap plus AX Tree context, and Claude synthesizes the raw event sequences into parameterized JSON Recipes. Recipes support parameterized replay (e.g., the gmail-send recipe accepts recipient, subject, and body parameters), enabling a "frontier model learns once, small model runs forever" cost-optimization pattern. Recipes are local JSON files — auditable and shareable across teams.
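Parameterized replay amounts to substituting caller-supplied values into a stored step list before execution. The Recipe shape and tool names below are assumptions for illustration; Ghost OS's actual Recipe JSON schema may differ.

```python
import json
from string import Template

# Hypothetical shape of a stored Recipe (the real Ghost OS schema may
# differ). "$name" placeholders are filled in at replay time.
GMAIL_SEND = {
    "name": "gmail-send",
    "params": ["recipient", "subject", "body"],
    "steps": [
        {"tool": "click", "target": "Compose"},
        {"tool": "type", "field": "To", "text": "$recipient"},
        {"tool": "type", "field": "Subject", "text": "$subject"},
        {"tool": "type", "field": "Body", "text": "$body"},
        {"tool": "click", "target": "Send"},
    ],
}

def bind_recipe(recipe, **params):
    """Substitute caller-supplied parameters into a Recipe's steps."""
    missing = set(recipe["params"]) - params.keys()
    if missing:
        raise ValueError(f"missing parameters: {missing}")
    raw = json.dumps(recipe["steps"])
    return json.loads(Template(raw).substitute(params))

steps = bind_recipe(GMAIL_SEND, recipient="ada@example.com",
                    subject="Hi", body="Hello from a Recipe")
print(steps[1]["text"])  # ada@example.com
```

Because the Recipe is plain JSON, it can be diffed, code-reviewed, and checked into a shared repository like any other configuration file.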

Complete Tool Coverage

29 MCP tools cover perception (context, state, find, read, inspect, screenshot, annotate), actions (click, type, hover, drag, long_press), navigation (scroll, press, hotkey), window management (window, focus), wait/synchronization (wait with URL/element/title change conditions), Recipe management (CRUD + execution), learning control, and visual grounding (ground, parse_screen, element_at).
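Because these are standard MCP tools, an agent invokes them as JSON-RPC 2.0 "tools/call" requests (the wire format defined by the MCP specification). The tool name "click" comes from the list above; the argument names are illustrative guesses, not the actual parameter schema.

```python
import json

# An MCP tool invocation is a JSON-RPC 2.0 request with method
# "tools/call" (per the MCP spec). Argument names here are assumed
# for illustration; consult the server's tool schema for the real ones.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "click",
        "arguments": {"element": "Send button"},
    },
}

wire = json.dumps(request)      # what actually goes over stdio/HTTP
decoded = json.loads(wire)
print(decoded["method"])            # tools/call
print(decoded["params"]["name"])    # click
```

Any MCP-compatible client (Claude Code, Cursor, etc.) generates these requests automatically; the JSON is shown here only to make the protocol layer concrete.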

Local Privacy

The ShowUI-2B visual model (~3.0 GB) runs locally on Apple Silicon via MLX. All data stays on-device.

Runtime Environment

Requires macOS 14+ (Sonoma) and Apple Silicon. One-click installation via Homebrew, with ghost setup automatically handling permissions, MCP configuration, Recipe installation, and visual model download. Verified compatible with Claude Code, Cursor, VS Code, and Claude Desktop. Core dependencies include AXorcist (macOS accessibility engine) and ShowUI-2B (visual grounding model). Primarily written in Swift (92.1%), approximately 7,000 lines of code, under the MIT license.
