A unified, efficient fine-tuning framework for 100+ LLMs and VLMs, published at ACL 2024. It integrates various fine-tuning methods (LoRA, QLoRA, Full) and training algorithms (DPO, PPO), enabling efficient training on consumer-grade GPUs. Featuring a Web UI (LLaMA Board), it significantly lowers the barrier from data preparation to model deployment.
## One-Minute Overview
LLaMA Factory is an "all-in-one" toolkit for Large Language Model (LLM) fine-tuning, designed to make tuning models as easy as using pre-trained ones. Whether you are building a domain-specific chatbot or conducting academic research, it helps you complete the task via a visual interface or command-line tools.
Core Value: It enables the fine-tuning of the latest and most powerful open-source models (like Qwen3, Llama 3, DeepSeek) on limited hardware resources (e.g., a single consumer-grade GPU) and offers Day-0 support for cutting-edge models.
## Quick Start
Installation Difficulty: Low - Supports one-click pip installation or Docker deployment, with comprehensive example configs provided.
```bash
# Clone repo and install
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .
```
Is this suitable for me?
- ✅ Fine-tune locally: Supports training 70B models with <24GB VRAM.
- ✅ Need latest models: Offers Day-0/Day-1 adaptation for models like Qwen3, Gemma 3.
- ✅ Avoid complex code: Provides a Gradio-based Web UI for mouse-click training.
- ❌ Pre-training from scratch: While supported, its main strength lies in fine-tuning.
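Training runs are driven by a single YAML config passed to the CLI. Below is a minimal LoRA SFT sketch; the field names follow the configs shipped in `examples/`, but the model path, dataset name, and output directory here are illustrative placeholders, not defaults:

```yaml
# Minimal LoRA SFT config sketch (field names follow the examples/ configs;
# model_name_or_path, dataset, and output_dir are placeholders).
model_name_or_path: meta-llama/Meta-Llama-3-8B-Instruct
stage: sft
do_train: true
finetuning_type: lora
lora_target: all
dataset: identity
template: llama3
output_dir: saves/llama3-8b/lora/sft
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

A config like this would be launched with `llamafactory-cli train <config>.yaml`, or assembled interactively in the Web UI via `llamafactory-cli webui`.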
## Core Capabilities
### 1. Extensive Model Support - Eliminates Selection Anxiety
Supports 100+ models, including LLaMA 3/4, Qwen2/3, Mistral, DeepSeek, GLM, Phi, covering both text and vision-language modalities. Actual Value: No need to adapt codebases for different models; one unified tool for all mainstream open-source models.
### 2. Resource Efficiency - Lowers Hardware Barriers
Significantly reduces VRAM usage via advanced algorithms like GaLore, LoRA+, and QLoRA. For example, fine-tuning a 7B model with QLoRA requires only 4GB VRAM. Actual Value: Empowers individual developers and SMEs to train LLMs without expensive enterprise servers.
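The headline numbers follow from simple weight-storage arithmetic. A back-of-the-envelope sketch (a deliberate simplification: it ignores activations, optimizer state, the KV cache, and the small LoRA adapter overhead):

```python
def weight_vram_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate VRAM needed just to hold the model weights."""
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

# A 7B-parameter model at fp16 vs. 4-bit and 2-bit quantization.
print(weight_vram_gb(7e9, 16))  # 14.0 GB: beyond many consumer GPUs
print(weight_vram_gb(7e9, 4))   # 3.5 GB: in line with the ~4GB QLoRA figure
print(weight_vram_gb(7e9, 2))   # 1.75 GB
```

Full 16-bit fine-tuning would additionally need gradients and Adam optimizer states (roughly 2-4x the weight size), which is why quantized, parameter-efficient methods are what make consumer-GPU training feasible.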
### 3. Cutting-Edge Integration - Stays Ahead
Typically supports the latest released models (e.g., DeepSeek R1, Qwen3) on Day 0 or Day 1 of release. Actual Value: Helps researchers access and adapt the newest model capabilities immediately.
### 4. Full-Stack Workflow - One-Stop Experience
Integrates data synthesis, training, evaluation, model exporting, and OpenAI-style API deployment. Actual Value: Eliminates the need to switch between different tools, significantly boosting R&D efficiency.
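Because the deployment step exposes an OpenAI-style API, any standard OpenAI client can talk to a fine-tuned model. A stdlib-only sketch of building such a request (the host, port, and model name are assumptions for illustration; the server would be started separately with `llamafactory-cli api <config>.yaml`):

```python
import json
import urllib.request

def chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /v1/chat/completions request for a local server."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Assumed local endpoint and model name - substitute your own deployment.
req = chat_request("http://localhost:8000", "llama3-lora-sft", "Hello!")
print(req.full_url)  # http://localhost:8000/v1/chat/completions
```

Sending the request (e.g. with `urllib.request.urlopen(req)`) would return the familiar chat-completions JSON, so existing OpenAI-based application code needs only a base-URL change.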
## Tech Stack & Integration
- Languages: Python
- Key Dependencies: PyTorch, Transformers, PEFT, TRL, Gradio
- Integration: CLI (Command Line Interface), Web UI, Python SDK
## Documentation & Learning Resources
- Documentation Quality: Comprehensive (Official docs, blog, and online course available)
- Official Docs: https://llamafactory.readthedocs.io/
- Example Code: Rich (Dozens of scenario configs in `examples/`)
- Online Trial: LLaMA Factory Online (No local setup required)