A minimal, hackable experimental harness for training LLMs on a single GPU node, covering all stages from pretraining to a ChatGPT-like UI.
## One-Minute Overview
nanochat is a simple experimental harness for training LLMs on a single GPU node. It can train a model of roughly GPT-2 capability for approximately $72 (about 3 hours on an 8xH100 node).
## Quick Start
- Training (8xH100): execute `runs/speedrun.sh`.
- CPU/MPS: execute `runs/runcpu.sh`.
- Chat UI: run `python -m scripts.chat_web`.
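The steps above can be sketched as a single session; the clone URL is an assumption, while the script paths are those listed in the Quick Start:

```shell
# Hypothetical end-to-end session (repository URL assumed, not stated above).
git clone https://github.com/karpathy/nanochat.git
cd nanochat

# Full pipeline on an 8xH100 node: tokenization through finetuning (~3 hours).
bash runs/speedrun.sh

# Or, on a machine without CUDA (CPU / Apple MPS), the slower fallback:
bash runs/runcpu.sh

# Serve the ChatGPT-like web UI and open the printed local URL.
python -m scripts.chat_web
```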
## Key Capabilities
- Covers all major LLM stages: tokenization, pretraining, finetuning, evaluation, and inference.
- Includes a familiar ChatGPT-like web UI.
## Tech Stack & Integration
- Language: Python
- Framework: PyTorch
- License: MIT
## Maintenance
- Status: Active
- Last Commit: 2026-02-05
## License
MIT