
Trinity-RFT

Added Jan 27, 2026
Model & Inference Framework
Open Source
Python, PyTorch, Large Language Models, Reinforcement Learning, CLI, Model & Inference Framework, Developer Tools & Coding, Model Training & Inference

Trinity-RFT is a general-purpose, flexible and user-friendly framework for LLM reinforcement fine-tuning (RFT). It decouples RFT into three coordinated components: Explorer, Trainer, and Buffer, enabling users with different backgrounds to train LLM-powered agents for specific domains.

One-Minute Overview#

Trinity-RFT is a general-purpose framework for reinforcement fine-tuning of large language models, consisting of three coordinated components: Explorer, Trainer, and Buffer. It enables AI application developers, reinforcement learning researchers, and data engineers to efficiently train and optimize LLM-powered agents.
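The three-component split can be pictured with a toy producer/consumer loop. The sketch below is illustrative only: the class and method names (`Experience`, `Buffer.put`, `Explorer.explore`, `Trainer.train_step`) are hypothetical and are not Trinity-RFT's actual API, and the model output and reward are random stand-ins.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional
import random


@dataclass
class Experience:
    """One rollout result: a prompt, the sampled response, and its reward."""
    prompt: str
    response: str
    reward: float


class Buffer:
    """FIFO store that decouples the producer (explorer) from the consumer (trainer)."""

    def __init__(self) -> None:
        self._items = deque()

    def put(self, exp: Experience) -> None:
        self._items.append(exp)

    def get_batch(self, size: int) -> list:
        batch = []
        while self._items and len(batch) < size:
            batch.append(self._items.popleft())
        return batch

    def __len__(self) -> int:
        return len(self._items)


class Explorer:
    """Generates rollouts; a real explorer would query an LLM inference engine."""

    def __init__(self, buffer: Buffer, seed: int = 0) -> None:
        self.buffer = buffer
        self.rng = random.Random(seed)

    def explore(self, prompts: list) -> None:
        for prompt in prompts:
            response = f"answer-{self.rng.randint(0, 9)}"  # stand-in for model output
            reward = self.rng.random()                      # stand-in for a reward function
            self.buffer.put(Experience(prompt, response, reward))


class Trainer:
    """Consumes experiences; a real trainer would run a gradient update here."""

    def __init__(self, buffer: Buffer) -> None:
        self.buffer = buffer
        self.steps = 0

    def train_step(self, batch_size: int) -> Optional[float]:
        batch = self.buffer.get_batch(batch_size)
        if not batch:
            return None  # nothing to train on yet
        self.steps += 1
        return sum(e.reward for e in batch) / len(batch)


buffer = Buffer()
explorer = Explorer(buffer)
trainer = Trainer(buffer)

explorer.explore([f"task-{i}" for i in range(8)])
mean_reward = trainer.train_step(batch_size=4)
print(f"steps={trainer.steps}, mean reward={mean_reward:.3f}, left in buffer={len(buffer)}")
```

Because the explorer and trainer only share the buffer, either side could in principle be swapped out or run on a different device without touching the other, which is the point of the decoupling described above.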

Core Value: A modular architecture supports flexible RFT modes, runs in no-GPU environments through the Tinker backend, and provides rich data pipelines and broad algorithm support.

Quick Start#

Installation Difficulty: Medium. Requires Python 3.10-3.12. The GPU version needs CUDA ≥ 12.8 and at least 2 GPUs; the Tinker backend covers no-GPU environments.

# Install with the Tinker backend (for no-GPU environments); run from a source checkout of the repository
pip install -e ".[tinker]"

# Install with GPU support
pip install -e ".[vllm,flash_attn]"
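Before installing, it can help to confirm that the interpreter falls inside the supported Python range. The helper below is a small sketch; `python_supported` is a hypothetical name, not something shipped with Trinity-RFT.

```python
import sys


def python_supported(version: tuple = sys.version_info[:2]) -> bool:
    """Return True when (major, minor) is within the 3.10-3.12 range listed above."""
    return (3, 10) <= tuple(version) <= (3, 12)


if not python_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside 3.10-3.12")
```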

Is this suitable for me?

  • ✅ AI Application Development: Train LLM agents for specific domains to enhance professional capabilities
  • ✅ RL Research: Design, implement and validate new RL algorithms
  • ✅ Data Engineering: Create RFT datasets and build data pipelines
  • ❌ Simple Classification Tasks: The framework targets reinforcement fine-tuning rather than basic model fine-tuning
  • ❌ Single Machine Use: Although CPU mode is supported, optimal performance requires a distributed training environment

Core Capabilities#

1. Flexible RFT Modes - Meeting Diverse Training Needs#

  • Supports synchronous/asynchronous, online/offline, on-policy/off-policy RL
  • Inference and training can run independently across devices for improved sample and time efficiency

User Value: Users can choose the optimal training mode based on computing resources and task requirements
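One way to see the on-policy/off-policy distinction is to track how stale the rollout policy is at each training step. The simulation below is a sketch under a stated assumption: `sync_interval` is an illustrative parameter name meaning "refresh the explorer's weights every N trainer steps", not a confirmed Trinity-RFT setting.

```python
def simulate_policy_lag(total_steps: int, sync_interval: int) -> list:
    """Return, per training step, how many updates behind the rollout policy is.

    sync_interval=1 refreshes the explorer's weights before every step
    (on-policy); larger values let the explorer keep sampling with older
    weights (increasingly off-policy).
    """
    if sync_interval < 1:
        raise ValueError("sync_interval must be >= 1")
    explorer_version = 0  # weights the explorer currently samples with
    trainer_version = 0   # number of updates the trainer has applied
    lags = []
    for step in range(total_steps):
        if step % sync_interval == 0:
            explorer_version = trainer_version  # weight synchronization
        lags.append(trainer_version - explorer_version)
        trainer_version += 1                    # one gradient update
    return lags


on_policy = simulate_policy_lag(total_steps=6, sync_interval=1)
off_policy = simulate_policy_lag(total_steps=6, sync_interval=3)
print(on_policy)   # [0, 0, 0, 0, 0, 0]
print(off_policy)  # [0, 1, 2, 0, 1, 2]
```

A lag of zero means every batch was sampled from the weights being updated; a larger interval trades that freshness for fewer synchronization stalls.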

2. Agentic RL Support - Training Complex Multi-step Tasks#

  • Supports both concatenated and general multi-step agentic workflows
  • Can directly train agent applications developed using frameworks like AgentScope

User Value: Simplifies the process from development to training, making complex agent training straightforward

3. Full-lifecycle Data Pipelines - Improving Data Quality and Efficiency#

  • Enables pipeline processing of rollout tasks and experience samples
  • Supports active data management (prioritization, cleaning, augmentation) throughout the RFT lifecycle

User Value: Enhances training effectiveness and model performance through data preprocessing and optimization
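Cleaning and prioritization of experience samples can be sketched as two small passes over a batch. Everything here is hypothetical: the `Sample` type, the duplicate/empty-response filter, and the "highest reward first" priority rule are placeholders chosen for illustration, not the operators Trinity-RFT or Data-Juicer actually provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    prompt: str
    response: str
    reward: float


def clean(samples: list) -> list:
    """Drop empty responses and exact duplicate (prompt, response) pairs."""
    seen = set()
    kept = []
    for s in samples:
        text = s.response.strip()
        key = (s.prompt, text)
        if not text or key in seen:
            continue
        seen.add(key)
        kept.append(s)
    return kept


def prioritize(samples: list, top_k: int) -> list:
    """Keep the top_k samples by reward (a placeholder priority rule)."""
    return sorted(samples, key=lambda s: s.reward, reverse=True)[:top_k]


raw = [
    Sample("2+2?", "4", 1.0),
    Sample("2+2?", "4", 1.0),  # exact duplicate, removed by clean()
    Sample("3+3?", "", 0.0),   # empty response, removed by clean()
    Sample("5+1?", "7", 0.0),
    Sample("1+1?", "2", 1.0),
]
cleaned = clean(raw)
selected = prioritize(cleaned, top_k=2)
print(len(cleaned), [s.prompt for s in selected])
```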

Tech Stack & Integration#

Development Language: Python 3.10-3.12
Main Dependencies: PyTorch, Ray, vLLM, verl, Data-Juicer
Integration Method: Library/API framework

Ecosystem & Extensions#

  • Algorithm Support: Multiple RL algorithms, including PPO, GRPO, CHORD, and the REC series
  • Framework Compatibility: Compatible with the Hugging Face and ModelScope model/dataset ecosystems
  • Visualization Tools: Provides a web interface for configuration and supports Wandb/TensorBoard/MLFlow monitoring
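As a concrete example of one listed algorithm, GRPO scores each sampled response relative to the other responses drawn for the same prompt, replacing a learned value baseline with the group's own statistics. The function below sketches only that advantage step; it is not Trinity-RFT's implementation, and the use of the population standard deviation and the `eps` value are assumptions (implementations differ on both).

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards: list, eps: float = 1e-6) -> list:
    """Normalize each reward by the mean and standard deviation of its group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; eps guards against a zero-variance group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four responses sampled for one prompt, scored 1.0 (correct) or 0.0 (incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])
```

Responses that beat the group average receive a positive advantage and the rest a negative one; a group where every reward is identical yields advantages of zero and so contributes no learning signal.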

Maintenance Status#

  • Development Activity: Actively developed with frequent releases
  • Recent Updates: v0.4.1 released in January 2026 with continuous feature improvements
  • Community Response: Clear contribution guidelines; community engagement is welcome

Commercial & Licensing#

License: Apache-2.0

  • ✅ Commercial Use: Allowed
  • ✅ Modification: Allowed
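  • ⚠️ Restrictions: None beyond the standard Apache-2.0 conditions (retain the license and copyright notices, and mark modified files)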

Documentation & Learning Resources#

  • Documentation Quality: Comprehensive
  • Official Documentation: Included in repository
  • Example Code: Rich tutorials and examples including GRPO quick start on GSM8k
