An AI-native browser automation tool based on the WebDriver BiDi protocol, empowering both AI agents (e.g., Claude Code, Gemini CLI) and human developers with web interaction, testing, and data extraction through a single lightweight binary.
Positioning#
Vibium addresses limitations of traditional browser automation tools, namely dependence on CSS selectors, complex environment setup, and the lack of native AI agent integration, by building on the WebDriver BiDi standard protocol.
Core Features#
- AI-Native Integration: Installable as an Agent Skill for Claude Code, Codex, Gemini, etc.; also runnable as an MCP Server
- Semantic Element Locating: Find elements via visible text, form labels, placeholders, ARIA roles — no CSS selectors needed
- Page Mapping: vibium map maps interactive elements to @e1, @e2 references; vibium diff map tracks changes
- Zero Config: Single-command install, auto-downloads Chrome, runs in visible mode by default
- Lightweight: Single ~10MB binary with no runtime dependencies
- Dual Async/Sync APIs: JS/TS, Python, and Java client libraries all offer both async and sync APIs
- Capture & Recording: Screenshots (with element annotations), PDF export, JavaScript execution, session recording & playback
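To make the @e1, @e2 reference scheme more concrete, here is a minimal Python sketch of how interactive elements might be numbered, looked up by visible text, and diffed between two page states. It is illustrative only: Element, build_map, diff_maps, and find_by_text are hypothetical names, and Vibium's actual data model and output format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A simplified interactive element (hypothetical shape, not Vibium's)."""
    role: str   # ARIA role, e.g. "button" or "textbox"
    label: str  # visible text, form label, or placeholder


def build_map(elements):
    """Assign sequential references (@e1, @e2, ...) in document order."""
    return {f"@e{i}": el for i, el in enumerate(elements, start=1)}


def diff_maps(before, after):
    """Report elements that appeared or disappeared between two maps."""
    old, new = set(before.values()), set(after.values())
    order = lambda e: (e.role, e.label)
    return {
        "added": sorted(new - old, key=order),
        "removed": sorted(old - new, key=order),
    }


def find_by_text(page_map, text):
    """Semantic lookup: return refs whose label contains the text."""
    needle = text.lower()
    return [ref for ref, el in page_map.items() if needle in el.label.lower()]


login_page = build_map([
    Element("textbox", "Email"),
    Element("textbox", "Password"),
    Element("button", "Sign In"),
])
after_login = build_map([
    Element("link", "Dashboard"),
    Element("button", "Sign Out"),
])

print(find_by_text(login_page, "sign in"))      # ['@e3']
changes = diff_maps(login_page, after_login)
print([e.label for e in changes["added"]])      # ['Sign Out', 'Dashboard']
```

The point of the sketch is that an agent can act on short, stable references and on visible labels instead of CSS selectors, and can detect what changed after an action by comparing two maps.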
Architecture#
┌──────────────────────────────────────┐
│             LLM / Agent              │
│  (Claude Code, Codex, Gemini, etc.)  │
└──────────────────────────────────────┘
         ▲ CLI (Bash)       ▲ MCP (stdio)
         ▼                  ▼
┌───────────────────────────────────┐
│           Vibium binary           │
│ ┌──────────────┐   ┌────────────┐ │
│ │ CLI Commands │   │ MCP Server │ │
│ └──────┬───────┘   └──────┬─────┘ │
│        └────────▲─────────┘       │
│           ┌─────▼──────┐          │  BiDi   ┌────────────────┐
│           │ BiDi Proxy │◄─────────┼────────►│ Chrome Browser │
│           └────────────┘          │         └────────────────┘
└───────────────────────────────────┘
                  ▲
                  │ WebSocket BiDi :9515
                  ▼
┌──────────────────────────────────────┐
│           Client Libraries           │
│       (js/ts | python | java)        │
│ ┌─────────────────┐   ┌────────────┐ │
│ │    Async API    │   │  Sync API  │ │
│ └─────────────────┘   └────────────┘ │
└──────────────────────────────────────┘
The core binary is written in Go and embeds both the CLI and MCP Server entry points. The BiDi Proxy acts as middleware, bridging upper-layer commands and the underlying Chrome browser via WebDriver BiDi (WebSocket, default port 9515). Client libraries connect to the Vibium binary over WebSocket BiDi rather than directly to the browser.
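For readers unfamiliar with WebDriver BiDi, the following Python sketch shows the general shape of the JSON frames that travel over such a WebSocket: commands carry an id, a method, and params, and replies are matched back by id. The frame layout follows the WebDriver BiDi specification; BiDiFramer is a hypothetical helper, the context id is a placeholder, and whether Vibium's proxy accepts raw frames in exactly this form is an assumption. No network connection is made.

```python
import itertools
import json


class BiDiFramer:
    """Builds WebDriver BiDi command frames and pairs responses by id."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}  # command id -> method name awaiting a response

    def command(self, method, **params):
        """Return the JSON text for one command frame."""
        cmd_id = next(self._ids)
        self.pending[cmd_id] = method
        return json.dumps({"id": cmd_id, "method": method, "params": params})

    def handle(self, raw):
        """Classify an incoming frame as an event, an error, or a success."""
        msg = json.loads(raw)
        if msg.get("type") == "event":
            return ("event", msg["method"], msg["params"])
        method = self.pending.pop(msg["id"])
        if msg.get("type") == "error":
            return ("error", method, msg.get("message"))
        return ("success", method, msg.get("result"))


framer = BiDiFramer()
frame = framer.command(
    "browsingContext.navigate",
    context="ctx-1",  # placeholder; real context ids come from the browser
    url="https://example.com",
    wait="complete",
)
print(frame)

# Simulated reply, as a server might send it back over the WebSocket.
reply = json.dumps({
    "type": "success",
    "id": 1,
    "result": {"navigation": "nav-1", "url": "https://example.com"},
})
print(framer.handle(reply))
```

Because every frame is plain JSON with an id, a proxy sitting between clients and the browser can forward, log, or rewrite traffic without the client needing to know which side produced a reply.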
Installation & Usage#
CLI / Agent Skill
npm install -g vibium
npx skills add https://github.com/VibiumDev/vibium --skill vibe-check
MCP Server
claude mcp add vibium -- npx -y vibium mcp
gemini mcp add vibium npx -y vibium mcp
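Both commands above register a stdio server that is launched as npx -y vibium mcp. Many MCP clients can also read an equivalent JSON entry under an mcpServers key; the Python sketch below generates one from the same launch command. The mcp_server_entry helper is hypothetical, and the config file name and location vary by client, so treat the output as a template rather than a drop-in file.

```python
import json
import shlex


def mcp_server_entry(command_line):
    """Split a launch command into the command/args pair MCP configs use."""
    command, *args = shlex.split(command_line)
    return {"command": command, "args": args}


config = {"mcpServers": {"vibium": mcp_server_entry("npx -y vibium mcp")}}
print(json.dumps(config, indent=2))
```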
Language Clients
npm install vibium # JavaScript/TypeScript
pip install vibium # Python
Java (Gradle): implementation 'com.vibium:vibium:26.3.18'
CLI Core Commands
vibium go https://example.com # Navigate
vibium map # Map interactive elements
vibium click @e1 # Click
vibium diff map # View changes
vibium find text "Sign In" # Semantic find
vibium fill @e2 "hello@example.com" # Fill form
vibium screenshot -o page.png # Screenshot
vibium pdf -o page.pdf # Export PDF
vibium eval "document.title" # Execute JS
vibium wait text "Success" # Wait for text
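The commands above can be chained from a script. The Python sketch below assembles a short flow using only the subcommands listed here and runs it through subprocess. It rests on several assumptions: that browser state persists between separate CLI invocations, that vibium exits non-zero on failure, and that @e1 and @e2 happen to refer to the intended elements (real references come from the vibium map output). The helpers vibium_cmd and run_flow are hypothetical.

```python
import shlex
import shutil
import subprocess


def vibium_cmd(*args):
    """Build the argv list for one vibium CLI invocation."""
    return ["vibium", *args]


# A form-submission flow assembled only from the commands listed above.
# The @e1/@e2 refs are placeholders; real refs come from `vibium map`.
FORM_FLOW = [
    vibium_cmd("go", "https://example.com"),
    vibium_cmd("map"),
    vibium_cmd("fill", "@e2", "hello@example.com"),
    vibium_cmd("click", "@e1"),
    vibium_cmd("wait", "text", "Success"),
    vibium_cmd("screenshot", "-o", "page.png"),
]


def run_flow(steps, dry_run=True):
    """Run each step in order, stopping at the first failure.

    With dry_run=True, or when vibium is not on PATH, the commands are
    only printed, so the sketch is safe to execute anywhere.
    """
    live = not dry_run and shutil.which("vibium") is not None
    executed = []
    for argv in steps:
        print("$", shlex.join(argv))
        if live:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(argv, check=True)
        executed.append(argv)
    return executed


run_flow(FORM_FLOW)
```

Calling run_flow(FORM_FLOW, dry_run=False) would execute the steps for real when the binary is installed; passing argv lists rather than a shell string avoids quoting problems with values such as "Sign In".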
Use Cases#
- AI Agent browser skill extension
- End-to-end web test automation
- Web data extraction and page archiving
- Automated form filling
- MCP server integration in AI coding tools
Capability Boundaries#
- Supported: Page navigation, semantic element locating, form filling, click interactions, screenshots (with annotations), PDF export, JS execution, session recording/playback, page element mapping and diffing
- Not supported (inferred): Non-Chrome browser automation, distributed cluster execution, mobile browser control
Roadmap Vision#
- Act (Vibium): Current — browser automation via BiDi
- Think (Cortex): Planned — SQLite-backed memory/navigation planning layer
- Sense (Retina): Planned — Chrome extension for passive browser activity recording
Project Overview#
- Current version: 26.3.18 (10 releases)
- Primary languages: Go (38.7%), JavaScript (21.4%), Python (17.8%), TypeScript (11.4%), Java (9.6%)
- Supported platforms: Linux x64, macOS x64 & arm64, Windows x64
- Development activity: 416 commits, 26 open issues, 13 open pull requests
- License: Apache 2.0