AReaL is a large-scale asynchronous reinforcement learning training system for large reasoning and agentic models. It provides flexible, high-performance training solutions that scale from single nodes to 1,000+ GPUs.
One-Minute Overview
AReaL is an open-source, fully asynchronous reinforcement learning system developed by Tsinghua University and Ant Group, designed for training the reasoning capabilities of large language models and LLM-based agents. It features industry-leading speed and stability and supports multiple training algorithms and model architectures, which makes it well suited to researchers and enterprises building high-performance AI agents.
Core Value: Through algorithm-system co-design, AReaL delivers stable, efficient asynchronous RL training that significantly enhances agent performance.
Getting Started
Installation Difficulty: Medium. Requires a Python environment. Both local and cluster deployment are supported; cluster setup requires additional configuration.
# Launch a local single-node training run (after installation)
python3 -m areal.launcher.local \
    examples/math/gsm8k_rl.py \
    --config examples/math/gsm8k_grpo.yaml
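If the launch needs to be scripted, for example to sweep over several configs, the command above can be assembled in Python. This is a minimal sketch: `build_launch_command` is a hypothetical helper, not part of AReaL, and it only reproduces the module name, script path, and `--config` flag shown above.

```python
import shlex
from typing import List


def build_launch_command(entry_script: str, config_path: str) -> List[str]:
    """Return the argv list for the local launcher command shown above.

    Hypothetical helper for illustration; it is not an AReaL API.
    """
    return [
        "python3", "-m", "areal.launcher.local",
        entry_script,
        "--config", config_path,
    ]


command = build_launch_command(
    "examples/math/gsm8k_rl.py",
    "examples/math/gsm8k_grpo.yaml",
)

# Print the command as a single shell-safe string for inspection.
print(shlex.join(command))

# To actually start the run from an AReaL checkout, one could pass the list
# to subprocess.run(command, check=True).
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.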
Is this suitable for me?
- ✅ Need to train high-performance reasoning agents (mathematics, coding, search, etc.)
- ✅ Want to train RL models on multi-GPU clusters with asynchronous methods
- ❌ Need simple rapid prototyping (consider AReaL-lite instead)
- ❌ Not familiar with distributed training systems
Core Capabilities
1. Flexible Multi-Turn Agent Workflows
- Customize multi-turn agentic rollout workflows within a single file and integrate them with other agentic tooling frameworks.
  Real Value: Quickly customize and experiment with different agent behavior patterns without complex refactoring.
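To make "multi-turn rollout workflow" concrete, the following is a minimal, framework-agnostic sketch of one episode loop. It does not use AReaL's workflow API: `Turn`, `Episode`, `run_multi_turn_episode`, and the toy policy and environment are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Turn:
    role: str      # "user" or "assistant"
    content: str


@dataclass
class Episode:
    turns: List[Turn] = field(default_factory=list)
    reward: float = 0.0


def run_multi_turn_episode(
    policy: Callable[[List[Turn]], str],
    env_step: Callable[[str], Tuple[str, float, bool]],
    prompt: str,
    max_turns: int = 4,
) -> Episode:
    """Alternate policy actions and environment feedback until done."""
    episode = Episode(turns=[Turn("user", prompt)])
    for _ in range(max_turns):
        action = policy(episode.turns)            # model generates a reply
        episode.turns.append(Turn("assistant", action))
        observation, reward, done = env_step(action)
        episode.reward += reward                  # accumulate per-turn reward
        if done:
            break
        episode.turns.append(Turn("user", observation))
    return episode


# Toy stand-ins: the "policy" guesses 1, 2, 3, ... and the "environment"
# rewards the guess "3".
def toy_policy(history: List[Turn]) -> str:
    return str(sum(1 for t in history if t.role == "assistant") + 1)


def toy_env_step(action: str) -> Tuple[str, float, bool]:
    if action == "3":
        return "correct", 1.0, True
    return "try again", 0.0, False


episode = run_multi_turn_episode(toy_policy, toy_env_step, "Guess the number.")
```

In a real setup the policy callable would wrap an inference backend and the environment step would wrap a tool call or verifier. The loop structure is the part that a single-file workflow typically customizes.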
2. Industry-Leading Scalability
- Through algorithm-system co-design, AReaL delivers stable, fully asynchronous RL training with industry-leading speed.
  Real Value: Scale from single nodes to 1,000+ GPUs, significantly reducing large-scale training time and resource requirements.
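The idea behind fully asynchronous RL is that rollout generation and training run concurrently instead of taking turns. The sketch below illustrates that decoupling with `asyncio`. It is a toy model, not AReaL's implementation: the names `Sample`, `PolicyState`, `rollout_worker`, `trainer`, and the `max_staleness` filter are assumptions, and bounding staleness this way is just one common approach to limiting off-policy data.

```python
import asyncio
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    policy_version: int   # version of the weights that generated this sample
    reward: float


class PolicyState:
    def __init__(self) -> None:
        self.version = 0


async def rollout_worker(state: PolicyState, queue: asyncio.Queue, worker_id: int) -> None:
    """Generate samples continuously without waiting for the trainer."""
    while True:
        version_at_start = state.version
        await asyncio.sleep(0)  # stand-in for LLM generation
        await queue.put(Sample(version_at_start, float(worker_id)))


async def trainer(state: PolicyState, queue: asyncio.Queue, num_steps: int,
                  batch_size: int, max_staleness: int) -> List[int]:
    """Consume samples, dropping those generated by weights that are too old."""
    used_staleness: List[int] = []
    for _ in range(num_steps):
        batch: List[Sample] = []
        while len(batch) < batch_size:
            sample = await queue.get()
            staleness = state.version - sample.policy_version
            if staleness <= max_staleness:
                batch.append(sample)
                used_staleness.append(staleness)
        state.version += 1  # stand-in for a gradient step plus weight sync
    return used_staleness


async def run(num_workers: int = 2, num_steps: int = 3, batch_size: int = 4,
              max_staleness: int = 1) -> Tuple[int, List[int]]:
    state = PolicyState()
    queue: asyncio.Queue = asyncio.Queue(maxsize=8)
    workers = [asyncio.create_task(rollout_worker(state, queue, i))
               for i in range(num_workers)]
    try:
        staleness = await trainer(state, queue, num_steps, batch_size, max_staleness)
    finally:
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
    return state.version, staleness


final_version, staleness_log = asyncio.run(run())
```

Because workers never block on a training step, generation hardware stays busy. The cost is that some samples come from slightly older weights, which is why the trainer tracks the policy version of each sample.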
3. Multi-Algorithm Support
- Supports RL algorithms including GRPO, GSPO, PPO, and DAPO, as well as RLHF reward modeling and SFT.
  Real Value: Select the most suitable training algorithm for each task and dataset to improve training effectiveness.
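As an illustration of what distinguishes one of the listed algorithms, GRPO replaces a learned value baseline with a group-relative one: several responses are sampled for the same prompt, and each response's advantage is its reward normalized by the group's mean and standard deviation. The sketch below shows only that normalization step in plain Python. `group_relative_advantages` is a hypothetical helper, not AReaL code, and the use of the population standard deviation and the `eps` value are assumptions, since implementations differ on both.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize rewards within one prompt's group of sampled responses.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    """
    baseline = mean(rewards)
    scale = pstdev(rewards) + eps  # eps avoids division by zero when all rewards match
    return [(r - baseline) / scale for r in rewards]


# Four sampled answers to one math prompt: two correct (1.0), two wrong (0.0).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
print(advantages)
```

Correct answers receive positive advantages and incorrect ones negative, and a group in which every response gets the same reward contributes zero advantage.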
4. Multi-Model Compatibility
- Supports large language models such as Qwen2/3 and Gemma3, as well as vision-language models.
  Real Value: No framework switching is needed when adapting to different types and sizes of models.
Technology Stack & Integration
- Development Language: Python
- Main Dependencies: PyTorch, Megatron or FSDP (training), Ray (cluster launcher), vLLM or SGLang (inference)
- Integration Method: API/Library
Ecosystem & Extensions
- Model Support: Supports mainstream large models including Qwen series, Gemma, MoE models, and vision-language models
- Training Backends: Supports multiple parallelization strategies via Megatron and PyTorch FSDP
- Inference Backends: Compatible with vLLM and SGLang inference frameworks
- Agent Ecosystem: Provides examples for math, search, and tool-integrated agents
Maintenance Status
- Development Activity: High, with planned minor releases weekly and major releases monthly
- Recent Updates: Active, with continuous feature additions and optimizations (AReaL-lite, NPU support, etc.)
- Community Response: Active, with GitHub Discussions and WeChat group support
Documentation & Learning Resources
- Documentation Quality: Comprehensive (installation guide, quickstart, CLI configs, async RL explanation, MoE fine-tuning, agentic RL, etc.)
- Official Documentation: https://github.com/inclusionAI/AReaL#documentation
- Example Code: Rich examples available (math, multi-turn, LoRA, VLM, reasoning, search agents, etc.)