NagaAgent

A four-service collaborative AI desktop assistant framework with streaming tool calling, GRAG knowledge graph memory, Live2D avatar, and voice interaction

NagaAgent is a desktop-oriented AI assistant framework using a four-service microservice architecture (API Server, Agent Server, MCP Server, Voice Service).

Core Features#

Streaming Tool Calling#

Independent of OpenAI Function Calling API, uses tool code blocks in LLM text output to embed JSON tool descriptions
Compatible with DeepSeek, Qwen, OpenAI, Ollama and all OpenAI-compatible APIs
Supports regex extraction + json5 fault-tolerant parsing + full-width character auto-normalization
Maximum 5 rounds of tool calling loops (configurable), supports parallel execution

GRAG Knowledge Graph Memory#

Automatically extracts quintuples (subject, subject_type, predicate, object, object_type) from conversations into Neo4j
Supports structured output (Pydantic models) + JSON fallback extraction
3 asyncio worker coroutines consuming task queues, SHA-256 deduplication
Dual storage: local files + Neo4j graph database

Live2D Avatar#

Renders Cubism Live2D models via pixi-live2d-display + PixiJS WebGL
4-channel orthogonal animation system: posture, motion, expression, tracking
SSAA supersampling anti-aliasing, lip sync (60FPS extracting 5 parameters)

Voice Interaction#

TTS: Edge-TTS, supports mp3/aac/wav/opus/flac formats
ASR: FunASR local server with VAD endpoint detection and WebSocket real-time streaming
Real-time voice dialogue: Full-duplex WebSocket voice interaction based on Qwen Omni

Four-Service Architecture#

Service	Port	Responsibility
API Server	8000	Dialogue, streaming tool calling, document upload, auth proxy, memory API, config management, Skill marketplace
Agent Server	8001	Background intent analysis, OpenClaw integration, task scheduling and compressed memory
MCP Server	8003	MCP tool registration/discovery/parallel scheduling
Voice Service	5048	TTS + ASR + real-time voice

Built-in Agents: weather/time, app launcher, game guides, online search, web scraping, browser automation, visual analysis, MQTT IoT control, document extraction. Supports dynamic registration, parallel scheduling, and one-click installation of community Skills via Skill Workshop.

Requirements & Installation#

Python version: >=3.11, <3.12
Platform support: Windows / macOS / Linux

git clone https://github.com/RTGS2017/NagaAgent.git
cd NagaAgent
python setup.py  # Auto-detect environment, create virtual environment, install dependencies

Configuration Example#

{
  "api": {
    "api_key": "your-api-key",
    "base_url": "https://api.deepseek.com/v1",
    "model": "deepseek-v3.2"
  }
}

Startup Commands#

python main.py             # Full startup
python main.py --headless  # Headless mode without GUI

Use Cases#

Personal desktop assistant: voice interaction, schedule management, document processing, system control
Knowledge management: automatically build personal knowledge graphs, long-term memory of user preferences and facts
Gaming assistant: guide Q&A, damage calculation, team recommendations
IoT control: control smart home devices via MQTT protocol
Browser automation: web operation automation based on Playwright
Multi-Agent collaboration: parallel task scheduling via MCP tool system

Core Features#

Streaming Tool Calling#

GRAG Knowledge Graph Memory#

Live2D Avatar#

Voice Interaction#

Four-Service Architecture#

MCP Tool Ecosystem#

Requirements & Installation#

Configuration Example#

Startup Commands#

Use Cases#

Related Projects

oh-my-codex

Ironcurtain

vibe-remote

STAY UPDATED