A learnable, configurable, and pluggable Omni-Avatar Assistant framework built on LiveKit, featuring real-time interaction, multimodal memory, user persona, and external tool integration.
## Project Overview
AlphaAvatar is an Apache-2.0 open-source project that aims to build a universal virtual assistant. It is a Python-based Agent framework that integrates real-time audio/video interaction (via LiveKit), LLM inference, long-term memory, user personas, and virtual characters into a single digital-assistant solution with "self-evolution" capabilities.
Core Value: Lowering the barrier to building real-time voice/video AI Agents with long-term memory and personalized interaction capabilities.
Use Cases:
- Real-time voice/video virtual companionship and assistance
- Intelligent customer service or educational tutoring Agents with long-term memory
- Multimodal interaction research (speech recognition, speaker diarization, Live2D character animation)
- Intelligent Q&A systems with external knowledge base (RAG) and deep web search
## Core Capabilities & Plugins
The project features a plugin-based design with two main categories: AlphaAvatar core plugins and tool plugins.
### AlphaAvatar Core Plugins
| Plugin | Status | Description |
|---|---|---|
| 🧠 Memory | Implemented | Self-improving memory module that captures and retrieves Assistant–User, Assistant–Tool, and Assistant self-memories |
| 🧬 Persona | Implemented | Automatic full-modality user persona construction, with speaker-ID verification and real-time profile matching |
| 😊 Virtual Character | Implemented | Real-time rendered virtual character, integrated with AIRI Live2D |
| 💡 Reflection | Planned | Optimizer for automatic internal knowledge base construction |
| 🗺️ Planning | Planned | Long-term planning for sequential and reliable actions |
| 🤖 Behavior | Planned | Behavior logic and process flow controller |
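The plugin split above suggests a narrow interface per capability. As a hypothetical sketch of how a Memory plugin could expose capture and retrieval behind an abstract base class (none of these class or method names come from AlphaAvatar itself, and a toy in-memory store stands in for the real vector database):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    """One captured memory: which relationship it concerns, and its content."""
    scope: str   # e.g. "assistant-user", "assistant-tool", "assistant-self"
    text: str


class MemoryPlugin(ABC):
    """Hypothetical plugin interface: capture interactions, retrieve by query."""

    @abstractmethod
    def capture(self, record: MemoryRecord) -> None: ...

    @abstractmethod
    def retrieve(self, query: str, limit: int = 5) -> list[MemoryRecord]: ...


class InMemoryMemory(MemoryPlugin):
    """Toy implementation using substring matching instead of vector search."""

    def __init__(self) -> None:
        self._records: list[MemoryRecord] = []

    def capture(self, record: MemoryRecord) -> None:
        self._records.append(record)

    def retrieve(self, query: str, limit: int = 5) -> list[MemoryRecord]:
        hits = [r for r in self._records if query.lower() in r.text.lower()]
        return hits[:limit]


if __name__ == "__main__":
    mem = InMemoryMemory()
    mem.capture(MemoryRecord("assistant-user", "User prefers concise answers"))
    mem.capture(MemoryRecord("assistant-tool", "Tavily search returned 3 results"))
    print([r.text for r in mem.retrieve("concise")])
```

In the actual project, the retrieval step would go through Qdrant embeddings rather than substring matching; the point of the interface is that planned plugins (Reflection, Planning, Behavior) can slot in behind similarly small contracts.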
### Tool Plugins
| Plugin | Status | Description |
|---|---|---|
| 🔍 DeepResearch | Implemented | Network access and deep search via the Tavily API, supporting quick search, deep search, and web-to-PDF |
| 📖 RAG | Implemented | Document knowledge access via RAG Anything with DeepResearch page indexing |
## Installation & Quick Start
### Requirements
- Python 3.11+
- Dependencies: LiveKit Server, OpenAI API Key, Qdrant (cloud or self-hosted), Tavily API Key (optional)
### Install from PyPI (Stable)

```bash
uv venv .my-env --python 3.11
source .my-env/bin/activate
pip install alpha-avatar-agents
```
### Install from GitHub (Latest)

```bash
git clone --recurse-submodules https://github.com/AlphaAvatar/AlphaAvatar.git
cd AlphaAvatar
uv venv .venv --python 3.11
source .venv/bin/activate
uv sync --all-packages
```
### Environment Variables

```bash
export LIVEKIT_API_KEY=<your API Key>
export LIVEKIT_API_SECRET=<your API Secret>
export LIVEKIT_URL=<your LiveKit server URL>
export OPENAI_API_KEY=<your OpenAI API Key>
export QDRANT_URL='https://xxxxxx.us-east.aws.cloud.qdrant.io:6333'
export QDRANT_API_KEY=<your QDRANT API Key>
export TAVILY_API_KEY=<your TAVILY API Key> # Optional
```
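Before launching, it can help to fail fast on missing configuration rather than hitting an opaque error mid-session. A minimal sketch of such a check (the variable names come from the list above; the helper itself is not part of AlphaAvatar):

```python
import os

# Required by the core pipeline; TAVILY_API_KEY is only needed for DeepResearch.
REQUIRED = [
    "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET", "LIVEKIT_URL",
    "OPENAI_API_KEY", "QDRANT_URL", "QDRANT_API_KEY",
]


def check_env(env=os.environ) -> list[str]:
    """Return the names of required variables that are missing or empty."""
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = check_env()
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    print("Environment OK")
```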
### Run in Development Mode

```bash
alphaavatar download-files
alphaavatar dev examples/pipline_openai_airi.yaml
# or
alphaavatar dev examples/pipline_openai_tools.yaml
```
## Architecture Design
- Core Framework: Built on LiveKit Agents for real-time interaction streams
- Modular Design: `avatar-agents` (core Agent logic & orchestration) + `avatar-plugins` (plugin implementations)
- Context Manager: Core routing component that distributes real-time interaction data to plugin models
- Data Storage: Qdrant vector database for Memory and Persona Embedding storage
- Multimodal Pipeline: LiveKit AV stream → STT → Context Manager (Persona/Memory) → LLM → Tools (DeepResearch/RAG) → TTS → Audio stream + Live2D drive
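The pipeline above can be pictured as a chain of stage functions, each mapping one intermediate representation to the next. This is an illustrative toy only (stub functions stand in for the real STT, Context Manager, LLM, and TTS components), not the project's actual orchestration code:

```python
from typing import Callable


def stt(audio: bytes) -> str:
    """Stub speech-to-text: pretend the audio decodes to a transcript."""
    return audio.decode("utf-8")


def context_manager(transcript: str) -> str:
    """Stub Persona/Memory enrichment: prepend retrieved context."""
    return f"[persona+memory] {transcript}"


def llm(prompt: str) -> str:
    """Stub LLM: return a canned reply."""
    return f"reply to: {prompt}"


def tts(text: str) -> bytes:
    """Stub text-to-speech: encode the reply back to 'audio' bytes."""
    return text.encode("utf-8")


def run_pipeline(audio: bytes,
                 stages: list[Callable] = [stt, context_manager, llm, tts]) -> bytes:
    """Thread the input through each stage in order."""
    out = audio
    for stage in stages:
        out = stage(out)
    return out


if __name__ == "__main__":
    print(run_pipeline(b"hello avatar"))
```

In the real system the stages are streaming and asynchronous (LiveKit AV frames in, audio plus Live2D drive signals out), but the data flow follows the same linear shape.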
## CLI Commands
- `alphaavatar download-files`: Initialize and download required resource files
- `alphaavatar dev <config_path>`: Start the Agent in development mode with the specified YAML config
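As a rough illustration of how a two-command CLI of this shape can be structured with `argparse` subcommands (a hypothetical stand-in, not AlphaAvatar's actual implementation):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="alphaavatar")
    sub = parser.add_subparsers(dest="command", required=True)

    # `download-files` takes no arguments.
    sub.add_parser("download-files", help="Initialize and download resource files")

    # `dev` requires a YAML config path.
    dev = sub.add_parser("dev", help="Start the Agent in development mode")
    dev.add_argument("config_path", help="Path to the YAML pipeline config")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.command)
```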
## Version History
| Date | Version | Key Updates |
|---|---|---|
| 2026-01 | v0.3.1 | Added tool calls made during user–Assistant interactions to the Memory module |
| 2026-01 | v0.3.0 | Added DeepResearch support via the Tavily API |
| 2025-12 | v0.2.0 | Added AIRI Live2D-based virtual character display |
| 2025-11 | v0.1.0 | Added automatic memory extraction, plus automatic user persona extraction and matching |
## Project Vision
Build a universal assistant capable of recognizing users through multimodal streaming input. It should possess self-memory, autonomous reflection, and iterative self-evolution for real-time interaction. The assistant will seamlessly integrate with mainstream external tools to solve practical problems efficiently.