Native AI edge runtime for macOS that aggregates local (MLX) and cloud models via a unified API, featuring built-in MCP server, memory systems, and automation toolchains for local-first AI agent infrastructure.
Project Overview#
Osaurus is a native AI edge runtime for macOS developed by Dinoki Labs, designed specifically for Apple Silicon (M1+). Acting as an always-on runtime, it efficiently runs local models (Llama, Qwen, Gemma, Mistral, etc.) using the MLX framework while seamlessly connecting to cloud services like OpenAI, Anthropic, and OpenRouter.
Core Capabilities#
Model Runtime#
- Local Inference: Optimized inference on Apple Silicon via MLX Runtime
- Cloud Aggregation: Support for Anthropic, OpenAI, OpenRouter, Ollama, LM Studio
- API Compatibility: OpenAI (/v1/chat/completions), Anthropic (/messages), and Ollama (/chat) format endpoints
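The three endpoint formats above expect differently shaped request bodies. The sketch below shows the shapes as defined by the public OpenAI, Anthropic, and Ollama APIs; that Osaurus mirrors those field names exactly on its local routes is an assumption to verify against your running server.

```python
# Request-body shapes for the three compatibility formats.
# Field names follow the public OpenAI / Anthropic / Ollama APIs;
# exact parity with Osaurus's routes is an assumption.
import json

BASE = "http://127.0.0.1:1337"  # default Osaurus port
prompt = "Hello!"

requests_by_format = {
    f"{BASE}/v1/chat/completions": {   # OpenAI format
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
    },
    f"{BASE}/messages": {              # Anthropic format (requires max_tokens)
        "model": "llama-3.2-3b-instruct-4bit",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    },
    f"{BASE}/chat": {                  # Ollama format
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
    },
}

for url, body in requests_by_format.items():
    print(url, json.dumps(body))
```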
MCP Protocol Support#
- MCP Server: Expose built-in tools to clients like Cursor, Claude Desktop
- Remote MCP Providers: Connect to external MCP servers and aggregate tools
Agent System#
- Multi-Agent Support: Create task-specific assistants, each with its own prompt, model, and tool configuration
- Four-Layer Memory Architecture: User Profile, Working Memory, Conversation Summaries, Knowledge Graph
- Hybrid Search: BM25 keyword matching combined with vector-embedding retrieval
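To make the hybrid-search idea concrete, here is a minimal, self-contained sketch of fusing a BM25 keyword score with cosine similarity over embeddings. It is illustrative only: the toy character-bigram "embedding" stands in for a real model, and the 50/50 weighting is an assumption, not Osaurus's actual implementation.

```python
# Toy hybrid retrieval: BM25 keyword score fused with cosine similarity.
# Illustrative only -- not Osaurus's actual ranking code or weights.
import math
from collections import Counter

docs = [
    "schedule a daily backup task",
    "watch the downloads folder for new files",
    "summarize the conversation history",
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Plain BM25 over whitespace tokens."""
    tokenized = [d.split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.split():
            df = sum(1 for t in tokenized if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

def embed(text):
    """Toy 'embedding': character-bigram counts (stand-in for a real model)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, docs, alpha=0.5):
    """Rank docs by alpha * normalized BM25 + (1 - alpha) * normalized cosine."""
    bm25 = bm25_scores(query, docs)
    qv = embed(query)
    dense = [cosine(qv, embed(d)) for d in docs]
    def norm(xs):
        hi = max(xs) or 1.0
        return [x / hi for x in xs]
    fused = [alpha * s + (1 - alpha) * d
             for s, d in zip(norm(bm25), norm(dense))]
    return sorted(zip(fused, docs), reverse=True)

print(hybrid_search("watch folder for files", docs)[0][1])
```

The normalization step matters: BM25 scores and cosine similarities live on different scales, so each signal is rescaled to [0, 1] before mixing.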
Tools & Plugins#
- 20+ Native Plugins: Filesystem, Browser, Git, Search, Mail, Calendar, Vision, etc.
- Install from central registry or create custom plugins
Automation Features#
- Schedules: Execute AI tasks at scheduled times
- Watchers: Trigger tasks on folder changes
- Work Mode: Autonomous multi-step task execution
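The watcher pattern above can be sketched conceptually: observe a folder, report files that appeared since the last check, and dispatch a task for each one. This is an illustration of the pattern, not Osaurus's implementation — its watchers are configured in the app and can rely on native filesystem events rather than polling.

```python
# Conceptual sketch of a folder watcher: poll a directory and hand each
# newly appeared file to a task callback. Illustrative only -- Osaurus's
# own watchers are configured in the app, not via this code.
import os
import time

def new_files(path, seen):
    """One polling step: return files not yet seen, plus the new snapshot."""
    current = set(os.listdir(path))
    return sorted(current - seen), current

def watch_folder(path, on_new_file, interval=2.0):
    """Run forever, invoking on_new_file(name) whenever a file appears."""
    seen = set(os.listdir(path))
    while True:
        added, seen = new_files(path, seen)
        for name in added:
            on_new_file(name)  # e.g. hand the file to an AI task
        time.sleep(interval)
```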
Voice Features#
- Local real-time transcription via FluidAudio
- VAD mode with wake word activation
- Global hotkey for voice transcription to any app
Installation#
```shell
# Homebrew installation
brew install --cask osaurus

# Or download from GitHub Releases
# https://github.com/osaurus-ai/osaurus/releases
```
Quick Start#
```shell
osaurus serve                            # Start server (default port 1337)
osaurus ui                               # Open menu bar UI
osaurus run llama-3.2-3b-instruct-4bit   # Interactive chat
```
MCP Client Configuration#
```json
{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}
```
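Adding that entry by hand risks clobbering servers already in the client's config. A small sketch of a safer merge, assuming the commonly documented macOS location for Claude Desktop's config (`~/Library/Application Support/Claude/claude_desktop_config.json`) — verify the path for your client:

```python
# Merge the Osaurus entry into an MCP client's JSON config without
# clobbering existing mcpServers entries. The Claude Desktop path in the
# comment below is the commonly documented macOS location; verify it.
import json
import os

def add_osaurus_server(config_path):
    """Insert (or overwrite) the 'osaurus' entry under mcpServers."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config.setdefault("mcpServers", {})["osaurus"] = {
        "command": "osaurus",
        "args": ["mcp"],
    }
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config

# Example target (Claude Desktop on macOS):
# ~/Library/Application Support/Claude/claude_desktop_config.json
```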
Code Integration Example#
```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:1337/v1", api_key="osaurus")
response = client.chat.completions.create(
    model="llama-3.2-3b-instruct-4bit",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
Use Cases#
- Personal AI Hub: Unified tool and model gateway for MCP clients
- Automation Workflows: Scheduled tasks, file monitoring, auto-processing
- Multi-Model Development: Switch between different LLMs via single endpoint
- Privacy-Sensitive Assistant: Process sensitive data with local models
System Requirements#
- macOS 15.5+
- Apple Silicon (M1 or newer)
- Xcode 16.4+ (for building from source)
Environment Variables#
- OSU_PORT: Custom service port
- OSU_MODELS_DIR: Custom model storage path (default: ~/MLXModels)
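A launcher or client script might honor these variables as sketched below, falling back to the defaults the document states (port 1337, ~/MLXModels). The `osaurus_settings` helper is hypothetical, for illustration only.

```python
# Hypothetical helper: resolve Osaurus settings from environment variables,
# falling back to the documented defaults (port 1337, ~/MLXModels).
import os

def osaurus_settings(env=None):
    env = os.environ if env is None else env
    port = int(env.get("OSU_PORT", "1337"))
    models_dir = os.path.expanduser(env.get("OSU_MODELS_DIR", "~/MLXModels"))
    return {"base_url": f"http://127.0.0.1:{port}/v1", "models_dir": models_dir}

print(osaurus_settings({"OSU_PORT": "8080"}))
```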