## One-Minute Overview
Trinity-RFT is a general-purpose framework for reinforcement fine-tuning of large language models, consisting of three coordinated components: Explorer, Trainer, and Buffer. It enables AI application developers, reinforcement learning researchers, and data engineers to efficiently train and optimize LLM-powered agents.
Core Value: Modular architecture supports flexible RFT modes, works without GPUs, and provides rich data pipelines and algorithm support.
## Quick Start
Installation Difficulty: Medium. Requires Python 3.10-3.12; the GPU backend needs CUDA ≥ 12.8 and at least 2 GPUs, while the Tinker backend supports environments without GPUs.
```bash
# Install with the Tinker CPU backend (suitable for users without GPUs)
pip install -e ".[tinker]"

# Install with GPU support
pip install -e ".[vllm,flash_attn]"
```
Is this suitable for me?
- ✅ AI Application Development: Train LLM agents for specific domains to enhance professional capabilities
- ✅ RL Research: Design, implement and validate new RL algorithms
- ✅ Data Engineering: Create RFT datasets and build data pipelines
- ❌ Simple Classification Tasks: This framework targets reinforcement fine-tuning, not ordinary supervised fine-tuning
- ❌ Single Machine Use: While it supports a CPU mode, optimal performance requires a distributed training environment
## Core Capabilities
### 1. Flexible RFT Modes - Meeting Diverse Training Needs
- Supports synchronous/asynchronous, online/offline, on-policy/off-policy RL
- Inference and training can run independently across devices for improved sample and time efficiency

User Value: Users can choose the optimal training mode based on computing resources and task requirements
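To make the decoupling concrete, here is a minimal, self-contained Python sketch (not Trinity-RFT's actual API; the `Buffer`, `explorer`, and `trainer` names are illustrative) of an explorer filling a shared buffer while a trainer consumes batches from it, so the two sides can run at independent rates:

```python
import queue
import threading

class Buffer:
    """FIFO experience store shared by the explorer and the trainer."""
    def __init__(self):
        self._q = queue.Queue()

    def put(self, experience):
        self._q.put(experience)

    def get_batch(self, size):
        # Blocks until enough experiences have been produced.
        return [self._q.get() for _ in range(size)]

def explorer(buffer, n_rollouts):
    # Stand-in for model inference: each rollout yields one experience.
    for i in range(n_rollouts):
        buffer.put({"prompt": f"task-{i}", "response": f"answer-{i}", "reward": 1.0})

def trainer(buffer, n_steps, batch_size):
    losses = []
    for _ in range(n_steps):
        batch = buffer.get_batch(batch_size)
        # Stand-in for a gradient step on the batch.
        losses.append(sum(e["reward"] for e in batch) / len(batch))
    return losses

buf = Buffer()
t = threading.Thread(target=explorer, args=(buf, 8))
t.start()  # asynchronous mode: exploration and training overlap in time
losses = trainer(buf, n_steps=2, batch_size=4)
t.join()
print(losses)  # [1.0, 1.0]
```

Running the explorer in a separate thread mimics the asynchronous mode; calling it to completion before training would correspond to a synchronous or offline setup.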
### 2. Agentic RL Support - Training Complex Multi-step Tasks
- Supports both concatenated and general multi-step agentic workflows
- Can directly train agent applications developed using frameworks like AgentScope

User Value: Simplifies the path from development to training, making complex agent training straightforward
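The general multi-step workflow idea can be sketched as follows. This is a hypothetical illustration (the `run_workflow` function and toy environment are not Trinity-RFT's API): an agent interacts with an environment for several turns, and the whole trajectory becomes one training sample, in contrast to single-turn concatenated rollouts:

```python
def run_workflow(policy, env, max_steps=3):
    """Roll out a multi-step episode and return it as one training sample."""
    trajectory = []
    observation = env["initial"]
    for step in range(max_steps):
        action = policy(observation)                # one model call per step
        observation = f"{observation} -> {action}"  # environment transition
        trajectory.append({"step": step, "action": action})
    reward = env["judge"](trajectory)               # reward over the full episode
    return {"turns": trajectory, "reward": reward}

# Toy policy and environment for illustration only.
def toy_policy(obs):
    return f"act({len(obs)})"

toy_env = {"initial": "start", "judge": lambda traj: float(len(traj) == 3)}

episode = run_workflow(toy_policy, toy_env)
print(episode["reward"])  # 1.0 — all three steps completed
```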
### 3. Full-lifecycle Data Pipelines - Improving Data Quality and Efficiency
- Enables pipeline processing of rollout tasks and experience samples
- Supports active data management (prioritization, cleaning, augmentation) throughout the RFT lifecycle

User Value: Enhances training effectiveness and model performance through data preprocessing and optimization
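As a rough sketch of what such active data management looks like (stage names here are illustrative, not Trinity-RFT's actual operators), an experience pipeline can chain cleaning, prioritization, and augmentation stages:

```python
def clean(experiences):
    """Drop malformed or empty samples."""
    return [e for e in experiences if e.get("response")]

def prioritize(experiences):
    """Order samples so high-reward experiences are trained on first."""
    return sorted(experiences, key=lambda e: e["reward"], reverse=True)

def augment(experiences):
    """Tag each sample; a real pipeline might rewrite or expand it."""
    return [{**e, "source": "pipeline"} for e in experiences]

def pipeline(experiences, stages=(clean, prioritize, augment)):
    for stage in stages:
        experiences = stage(experiences)
    return experiences

raw = [
    {"response": "a", "reward": 0.2},
    {"response": "",  "reward": 0.9},  # malformed: cleaned out
    {"response": "b", "reward": 0.7},
]
processed = pipeline(raw)
print([e["reward"] for e in processed])  # [0.7, 0.2]
```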
## Tech Stack & Integration
- Development Language: Python 3.10-3.12
- Main Dependencies: PyTorch, Ray, vLLM, verl, Data-Juicer
- Integration Method: Library/API framework
## Ecosystem & Extensions
- Algorithm Support: Multiple RL algorithms including PPO, GRPO, CHORD, REC series
- Framework Compatibility: Compatible with Huggingface and ModelScope model/dataset ecosystems
- Visualization Tools: Provides web interface for configuration and supports Wandb/TensorBoard/MLFlow monitoring
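The core idea behind GRPO, one of the supported algorithms listed above, is to replace a learned value critic with group-relative advantages: several responses are sampled for the same prompt, and each is scored against the group's own reward statistics. A minimal sketch (normalization details vary between implementations):

```python
from statistics import mean, pstdev

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: normalize each reward by the mean and
    standard deviation of its own group of sampled responses."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled answers to one prompt, rewarded 1.0 if correct.
advs = grpo_advantages([1.0, 0.0, 1.0, 0.0])
print(advs)  # roughly [1.0, -1.0, 1.0, -1.0]
```

Correct answers get positive advantages and incorrect ones negative, without any extra value network.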
## Maintenance Status
- Development Activity: Actively developed with frequent releases
- Recent Updates: v0.4.1 released in January 2026 with continuous feature improvements
- Community Response: Clear contribution guidelines; community engagement is welcome
## Commercial & Licensing
License: Apache-2.0
- ✅ Commercial Use: Allowed
- ✅ Modification: Allowed
- ⚠️ Restrictions: None beyond standard Apache-2.0 conditions (retain license and notices)
## Documentation & Learning Resources
- Documentation Quality: Comprehensive
- Official Documentation: Included in repository
- Example Code: Rich tutorials and examples, including a GRPO quick start on GSM8k