Native AI edge runtime for macOS that aggregates local (MLX) and cloud models via a unified API, featuring built-in MCP server, memory systems, and automation toolchains for local-first AI agent infrastructure.
Project Overview#
Osaurus is a native AI edge runtime for macOS developed by Dinoki Labs, designed specifically for Apple Silicon (M1+). Acting as an always-on runtime, it efficiently runs local models (Llama, Qwen, Gemma, Mistral, etc.) using the MLX framework while seamlessly connecting to cloud services like OpenAI, Anthropic, and OpenRouter.
Core Capabilities#
Model Runtime#
- Local Inference: Optimized inference on Apple Silicon via MLX Runtime
- Cloud Aggregation: Support for Anthropic, OpenAI, OpenRouter, Ollama, LM Studio
- API Compatibility: OpenAI (/v1/chat/completions), Anthropic (/messages), and Ollama (/chat) format endpoints
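The three endpoint formats above expect differently shaped request bodies. The sketch below shows the shapes as defined by the public OpenAI, Anthropic, and Ollama APIs; that Osaurus mirrors those field names exactly on its local routes is an assumption to verify against your running server.

```python
# Request-body shapes for the three compatibility formats.
# Field names follow the public OpenAI / Anthropic / Ollama APIs;
# exact parity with Osaurus's routes is an assumption.
import json

BASE = "http://127.0.0.1:1337"  # default Osaurus port
prompt = "Hello!"

requests_by_format = {
    f"{BASE}/v1/chat/completions": {   # OpenAI format
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
    },
    f"{BASE}/messages": {              # Anthropic format (requires max_tokens)
        "model": "llama-3.2-3b-instruct-4bit",
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    },
    f"{BASE}/chat": {                  # Ollama format
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": prompt}],
    },
}

for url, body in requests_by_format.items():
    print(url, json.dumps(body))
```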
MCP Protocol Support#
- MCP Server: Expose built-in tools to clients like Cursor, Claude Desktop
- Remote MCP Providers: Connect to external MCP servers and aggregate tools
Agent System#
- Multi-Agent Support: Create task-specific assistants, each with its own prompt, model, and tool configuration
- Four-Layer Memory Architecture: User Profile, Working Memory, Conversation Summaries, Knowledge Graph
- Hybrid Search: BM25 keyword matching combined with vector-embedding retrieval
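To make the hybrid-search idea concrete, here is a minimal, self-contained sketch of fusing a BM25 keyword score with cosine similarity over embeddings. It is illustrative only: the toy character-bigram "embedding" stands in for a real model, and the 50/50 weighting is an assumption, not Osaurus's actual implementation.

```python
# Toy hybrid retrieval: BM25 keyword score fused with cosine similarity.
# Illustrative only -- not Osaurus's actual ranking code or weights.
import math
from collections import Counter

docs = [
    "schedule a daily backup task",
    "watch the downloads folder for new files",
    "summarize the conversation history",
]

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Plain BM25 over whitespace tokens."""
    tokenized = [d.split() for d in docs]
    avgdl = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.split():
            df = sum(1 for t in tokenized if term in t)
            if df == 0:
                continue
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            denom = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores

def embed(text):
    """Toy 'embedding': character-bigram counts (stand-in for a real model)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_search(query, docs, alpha=0.5):
    """Rank docs by alpha * normalized BM25 + (1 - alpha) * normalized cosine."""
    bm25 = bm25_scores(query, docs)
    qv = embed(query)
    dense = [cosine(qv, embed(d)) for d in docs]
    def norm(xs):
        hi = max(xs) or 1.0
        return [x / hi for x in xs]
    fused = [alpha * s + (1 - alpha) * d
             for s, d in zip(norm(bm25), norm(dense))]
    return sorted(zip(fused, docs), reverse=True)

print(hybrid_search("watch folder for files", docs)[0][1])
```

The normalization step matters: BM25 scores and cosine similarities live on different scales, so each signal is rescaled to [0, 1] before mixing.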
Tools & Plugins#
- 20+ Native Plugins: Filesystem, Browser, Git, Search, Mail, Calendar, Vision, etc.
- Install from central registry or create custom plugins
Automation Features#
- Schedules: Execute AI tasks at scheduled times
- Watchers: Trigger tasks on folder changes
- Work Mode: Autonomous multi-step task execution
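The watcher pattern above can be sketched conceptually: observe a folder, report files that appeared since the last check, and dispatch a task for each one. This is an illustration of the pattern, not Osaurus's implementation — its watchers are configured in the app and can rely on native filesystem events rather than polling.

```python
# Conceptual sketch of a folder watcher: poll a directory and hand each
# newly appeared file to a task callback. Illustrative only -- Osaurus's
# own watchers are configured in the app, not via this code.
import os
import time

def new_files(path, seen):
    """One polling step: return files not yet seen, plus the new snapshot."""
    current = set(os.listdir(path))
    return sorted(current - seen), current

def watch_folder(path, on_new_file, interval=2.0):
    """Run forever, invoking on_new_file(name) whenever a file appears."""
    seen = set(os.listdir(path))
    while True:
        added, seen = new_files(path, seen)
        for name in added:
            on_new_file(name)  # e.g. hand the file to an AI task
        time.sleep(interval)
```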
Voice Features#
- Local real-time transcription via FluidAudio
- VAD mode with wake word activation
- Global hotkey for voice transcription to any app
Installation#
```shell
# Homebrew installation
brew install --cask osaurus

# Or download from GitHub Releases
# https://github.com/osaurus-ai/osaurus/releases
```
Quick Start#
```shell
osaurus serve                            # Start server (default port 1337)
osaurus ui                               # Open menu bar UI
osaurus run llama-3.2-3b-instruct-4bit   # Interactive chat
```
MCP Client Configuration#
```json
{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}
```
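Adding that entry by hand risks clobbering servers already in the client's config. A small sketch of a safer merge, assuming the commonly documented macOS location for Claude Desktop's config (`~/Library/Application Support/Claude/claude_desktop_config.json`) — verify the path for your client:

```python
# Merge the Osaurus entry into an MCP client's JSON config without
# clobbering existing mcpServers entries. The Claude Desktop path in the
# comment below is the commonly documented macOS location; verify it.
import json
import os

def add_osaurus_server(config_path):
    """Insert (or overwrite) the 'osaurus' entry under mcpServers."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config.setdefault("mcpServers", {})["osaurus"] = {
        "command": "osaurus",
        "args": ["mcp"],
    }
    with open(config_path, "w") as f:
        json.dump(config, f, indent=2)
    return config

# Example target (Claude Desktop on macOS):
# ~/Library/Application Support/Claude/claude_desktop_config.json
```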
Code Integration Example#
```python
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:1337/v1", api_key="osaurus")
response = client.chat.completions.create(
    model="llama-3.2-3b-instruct-4bit",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```
Use Cases#
- Personal AI Hub: Unified tool and model gateway for MCP clients
- Automation Workflows: Scheduled tasks, file monitoring, auto-processing
- Multi-Model Development: Switch between different LLMs via single endpoint
- Privacy-Sensitive Assistant: Process sensitive data with local models
System Requirements#
- macOS 15.5+
- Apple Silicon (M1 or newer)
- Xcode 16.4+ (for building from source)
Environment Variables#
- OSU_PORT: Custom service port
- OSU_MODELS_DIR: Custom model storage path (default: ~/MLXModels)
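A launcher or client script might honor these variables as sketched below, falling back to the defaults the document states (port 1337, ~/MLXModels). The `osaurus_settings` helper is hypothetical, for illustration only.

```python
# Hypothetical helper: resolve Osaurus settings from environment variables,
# falling back to the documented defaults (port 1337, ~/MLXModels).
import os

def osaurus_settings(env=None):
    env = os.environ if env is None else env
    port = int(env.get("OSU_PORT", "1337"))
    models_dir = os.path.expanduser(env.get("OSU_MODELS_DIR", "~/MLXModels"))
    return {"base_url": f"http://127.0.0.1:{port}/v1", "models_dir": models_dir}

print(osaurus_settings({"OSU_PORT": "8080"}))
```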