Official code repository for the O'Reilly book "Hands-On Large Language Models". Features 12 core chapters and bonus content covering Tokens, Transformers, RAG, and Fine-tuning. Includes nearly 300 illustrations and runnable Jupyter Notebooks optimized for Colab and local environments.
Project Overview#
Hands-On Large Language Models is the official code repository for the O'Reilly publication of the same name, authored by Jay Alammar (Director and Engineering Fellow at Cohere) and Maarten Grootendorst (Senior Clinical Data Scientist at IKNL). The project takes a visual-first approach to teaching, featuring nearly 300 custom illustrations, and every example ships as a runnable Jupyter Notebook.
Core Chapter Structure (12 Chapters)#
- Chapter 1: Introduction to Language Models
- Chapter 2: Tokens and Embeddings
- Chapter 3: Looking Inside Transformer LLMs
- Chapter 4: Text Classification
- Chapter 5: Text Clustering and Topic Modeling
- Chapter 6: Prompt Engineering
- Chapter 7: Advanced Text Generation Techniques
- Chapter 8: Semantic Search and RAG (Retrieval-Augmented Generation)
- Chapter 9: Multimodal Large Language Models
- Chapter 10: Creating Text Embedding Models
- Chapter 11: Fine-tuning Representation Models
- Chapter 12: Fine-tuning Generation Models
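The path the early chapters follow, from raw text to token IDs to embedding vectors, can be sketched in a few lines. This is a toy illustration, not code from the book: the whitespace "tokenizer" and the tiny random embedding matrix are stand-ins for the real subword tokenizers and learned embeddings covered in Chapters 2 and 3.

```python
import numpy as np

# Toy vocabulary standing in for a real subword vocabulary (BPE, WordPiece, etc.)
vocab = {"hands": 0, "on": 1, "large": 2, "language": 3, "models": 4}

def tokenize(text):
    # Whitespace splitting as a stand-in for a real tokenizer
    return [vocab[w] for w in text.lower().split()]

# A small random embedding matrix: one 8-dimensional vector per token ID.
# In a real model these vectors are learned during training.
rng = np.random.default_rng(42)
embeddings = rng.normal(size=(len(vocab), 8))

ids = tokenize("Hands on large language models")
vectors = embeddings[ids]      # look up one vector per token
print(ids)                     # [0, 1, 2, 3, 4]
print(vectors.shape)           # (5, 8)
```

The two-step pattern shown here (tokenize, then embed) is exactly the pipeline that real libraries follow, just with learned weights and far larger vocabularies.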
Bonus Content#
- Mamba Architecture: Visual guide to selective state space models
- Quantization: Model compression and acceleration techniques
- Mixture of Experts (MoE): Sparse expert model architectures
- Reasoning LLMs: Reasoning-enhanced language models (including DeepSeek-R1)
- Stable Diffusion: Illustrated guide to image generation principles
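The core idea behind the quantization bonus chapter, trading precision for memory, can be shown with a minimal sketch. This is simplified symmetric int8 quantization for illustration, not the bonus chapter's exact code: scale float weights into the int8 range, round, and multiply the scale back to recover approximate values.

```python
import numpy as np

def quantize_int8(weights):
    # Symmetric quantization: one scale maps the largest |weight| to 127.
    # Assumes at least one nonzero weight.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights from int8 values
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 2.4], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q)                           # int8: 4 bytes instead of 16
print(np.abs(w - w_hat).max())     # rounding error, bounded by scale / 2
```

Production schemes add per-channel scales, zero points for asymmetric ranges, and calibration, but the compression mechanism is the same.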
Installation and Running#
Method 1: Google Colab (Recommended)#
- Open the Table of Contents in the repository README
- Click the "Open in Colab" badge next to each chapter
- Colab's free tier provides a T4 GPU (16 GB VRAM)
- No local environment configuration needed
Method 2: Local Conda Environment#
# Clone repository
git clone https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.git
cd Hands-On-Large-Language-Models
# Option A: create the environment directly from the provided definition
conda env create -f environment.yml
conda activate thellmbook
# Option B: create a fresh environment and install dependencies with pip
conda create -n thellmbook python=3.10
conda activate thellmbook
pip install -r requirements.txt
# Optional: install CUDA-enabled PyTorch (CUDA 11.8 builds)
pip install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
System Requirements#
- Python 3.10.*
- Microsoft Visual C++ 14.0+ (Windows)
- NVIDIA GPU + CUDA drivers (recommended)
Environment Verification#
import os
import sys
import torch

# Check Python version and GPU availability
print(f"Python: {sys.version.split()[0]}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")

# Check the active conda environment
print(f"Current env: {os.environ.get('CONDA_DEFAULT_ENV')}")
Repository Structure#
Hands-On-Large-Language-Models/
├── .setup/ # Environment configuration guides
│ ├── conda/ # Detailed Conda installation guide
│ └── images/ # Configuration screenshots
├── bonus/ # Bonus topic chapters
├── chapter01/ … chapter12/ # Core chapter Notebooks (one directory per chapter)
├── images/ # Book illustration resources
├── environment.yml # Conda environment definition
├── requirements.txt # Full dependencies (locked versions)
└── requirements_min.txt # Minimal dependencies
Technical Coverage#
| Technical Level | Key Topics |
|---|---|
| Fundamentals | Tokenization, Embeddings, Self-Attention, Feedforward Networks |
| Representation Learning | BERTopic, Sentiment Analysis, Zero-shot Classification |
| Generation & Interaction | Prompt Templating, Temperature, Top-k/Top-p Sampling |
| Retrieval Augmentation | Vector Search, FAISS, RAG Pipeline, Context Window |
| Advanced Architecture | CLIP, Multimodal Embeddings, Contrastive Learning |
| Model Customization | LoRA, QLoRA, PEFT, Supervised Fine-Tuning |
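The sampling controls listed in the Generation & Interaction row can be sketched concisely. This is an illustrative decoding function, not the book's code: it applies temperature scaling to the logits, then samples only from the nucleus (top-p) of the resulting distribution.

```python
import numpy as np

def top_p_sample(logits, temperature=0.7, p=0.9, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    scaled = logits / temperature               # temperature: <1 sharpens, >1 flattens
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                        # softmax
    order = np.argsort(probs)[::-1]             # most likely token first
    cum = np.cumsum(probs[order])
    # Smallest prefix of tokens whose cumulative probability reaches p
    keep = order[: np.searchsorted(cum, p) + 1]
    kept = probs[keep] / probs[keep].sum()      # renormalize the nucleus
    return rng.choice(keep, p=kept)

logits = np.array([2.0, 1.0, 0.5, -1.0, -3.0])
token = top_p_sample(logits)
print(token)  # index of a token sampled from the nucleus
```

Top-k sampling is the same idea with a fixed cutoff (`order[:k]`) instead of a probability-mass threshold.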
Industry Endorsements#
- Andrew Ng (Founder, DeepLearning.AI): "valuable resource for anyone looking to understand the main techniques behind how Large Language Models are built"
- Nils Reimers (Cohere ML Director, creator of sentence-transformers): "Its highly-visual coverage of generative, representational, and retrieval applications of language models empowers readers to quickly understand, use, and refine LLMs"
Publication Information#
- Publisher: O'Reilly Media
- Publication Year: 2024
- ISBN: 978-1098150969