An interactive open-access textbook on Machine Learning Systems engineering from Harvard University, integrating the TinyTorch framework with hands-on edge deployment labs, covering the full spectrum from ML fundamentals to system optimization.
CS249r Book (MLSysBook.ai) is a Machine Learning Systems engineering textbook project led by Prof. Vijay Janapa Reddi at Harvard University, designed to bridge the gap between ML algorithms and systems engineering. The current version is v0.5.1 (Early Access Preview); a hardcover edition is planned for publication by MIT Press in 2026.
- Description: Introduction to Machine Learning Systems — Principles and Practices of Engineering Artificially Intelligent Systems
- Institution: Harvard University (harvard-edge)
- Author/Maintainer: Prof. Vijay Janapa Reddi
- Primary Languages: JavaScript (81.2%), Python (13.9%), TeX (2.4%), HTML, Lua, Shell
| Layer | Component | Purpose | Status |
|---|---|---|---|
| READ | Textbook | Theory, concepts, and best practices (ML ↔ Systems Engineering bridge) | ✅ Available |
| BUILD | TinyTorch Framework | Build ML framework from scratch (NumPy only) | ✅ Available |
| DEPLOY | Hardware Kits | Deploy on real constrained devices (Arduino, Raspberry Pi, etc.) | ✅ Available |
| EXPLORE | Software Co-Labs | Controlled experiments on latency, memory, energy, cost | 🔄 2026 |
| PROVE | AI Olympics | Cross-track competition and benchmarking | 🔄 2026 |
| Part | Theme | Chapters |
|---|---|---|
| I. Foundations | Core Concepts | Introduction, ML Systems, DL Primer, Architectures |
| II. Design | Building Blocks | Workflow, Data Engineering, Frameworks, Training |
| III. Performance | Acceleration | Efficient AI, Optimizations, HW Acceleration, Benchmarking |
| IV. Deployment | Deployment | MLOps, On-device Learning, Privacy, Robustness |
| V. Trust | Trustworthy & Sustainable | Responsible AI, Sustainable AI, AI for Good |
| VI. Frontiers | Cutting-edge | Emerging trends and future directions |
| Part | Module # | Build Content |
|---|---|---|
| I. Foundations | 01-08 | Tensors, activations, layers, losses, dataloader, autograd, optimizers, training |
| II. Vision | 09 | Conv2d, CNNs for image classification |
| III. Language | 10-13 | Tokenization, embeddings, attention, transformers |
| IV. Optimization | 14-20 | Profiling, quantization, compression, acceleration, benchmarking, capstone |
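To make the module progression concrete, here is a hedged sketch of the kind of component learners build in the Foundations modules: a dense layer plus an activation, NumPy only. The class and method names (`Dense`, `forward`, `relu`) are illustrative, not TinyTorch's actual API.

```python
import numpy as np

class Dense:
    """A minimal fully connected layer, NumPy only (illustrative)."""
    def __init__(self, in_features, out_features, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights; the real modules cover proper initialization.
        self.W = rng.normal(0, 0.1, size=(in_features, out_features))
        self.b = np.zeros(out_features)

    def forward(self, x):
        return x @ self.W + self.b

def relu(x):
    return np.maximum(x, 0.0)

x = np.ones((2, 4))           # batch of 2 samples, 4 features each
layer = Dense(4, 3)
out = relu(layer.forward(x))  # shape (2, 3), all entries >= 0
print(out.shape)
```

The later modules layer autograd, optimizers, and a training loop on top of primitives like these.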
Workflow: `src/*.py` → `modules/*.ipynb` → `tinytorch/*.py`, driven by the `tito` CLI (23 subcommands)
- Arduino Nicla Vision (STM32H7, ultra-low-power vision)
- Seeed XIAO ESP32S3 (WiFi vision)
- Grove Vision AI V2 (no-code rapid prototyping)
- Raspberry Pi (complex edge AI pipelines)
| Lab | Build Content | Skills |
|---|---|---|
| Setup | Hardware & environment config | Toolchain, flashing, debugging |
| Image Classification | CNN image recognition | Model deployment, inference |
| Object Detection | Real-time object detection | YOLO, bounding boxes |
| Keyword Spotting | Audio wake word detection | DSP, MFCC features |
| Motion Classification | IMU gesture recognition | Sensor fusion, time series |
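The keyword-spotting lab's DSP front end can be sketched in pure NumPy: split the audio into overlapping frames, apply a window, and take an FFT magnitude spectrogram. MFCCs add a mel filterbank and DCT on top of this; frame sizes and function names below are illustrative assumptions, not the lab's actual code.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Windowed FFT magnitude spectrogram (illustrative KWS front end)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequency bins for real input
    return np.abs(np.fft.rfft(frames, axis=1))

sr = 16000                             # 16 kHz, common for keyword spotting
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)   # 1 second of a 440 Hz test tone
spec = spectrogram(tone)
print(spec.shape)                      # (n_frames, frame_len // 2 + 1)
```

On a microcontroller this stage typically runs on fixed-point FFT routines, which is exactly the kind of constraint the lab explores.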
| ML Concept | Systems Concept | Learning Points |
|---|---|---|
| Model Parameters | Memory Constraints | How to fit large models on resource-constrained devices |
| Inference Latency | Hardware Acceleration | How GPU/TPU/NPU execute neural networks |
| Training Convergence | Compute Efficiency | How mixed precision and optimization reduce costs |
| Model Accuracy | Quantization & Pruning | How to compress models while maintaining performance |
| Data Requirements | Pipeline Infrastructure | How to build efficient data loading and preprocessing |
| Model Deployment | MLOps Practices | How to monitor, version, and update production models |
| Privacy Constraints | On-device Learning | How to train and adapt models without uploading data |
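Two of the pairings above (parameter memory vs. constraints, and quantization vs. accuracy) can be illustrated in a few lines of NumPy. This is a generic symmetric int8 post-training quantization sketch, not the book's code; the helper names are assumptions.

```python
import numpy as np

def footprint_mb(n_params, bytes_per_param):
    """Raw parameter storage in megabytes."""
    return n_params * bytes_per_param / 1e6

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization (illustrative)."""
    scale = np.abs(w).max() / 127.0       # map the range to [-127, 127]
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.05, size=(512, 512)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale      # dequantize to inspect the error

print(footprint_mb(w.size, 4))            # float32 storage: ~1.05 MB
print(footprint_mb(w.size, 1))            # int8 storage:    ~0.26 MB
print(float(np.abs(w - w_hat).max()))     # error bounded by one quant step
```

The 4x storage reduction is what lets a model that overflows a microcontroller's flash in float32 fit in int8, at the cost of a bounded per-weight rounding error.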
| Year | Milestone | Achievement |
|---|---|---|
| 1958 | Perceptron | Rosenblatt's single-layer binary classifier |
| 1969 | XOR Crisis | Minsky & Papert show single-layer perceptrons cannot solve XOR |
| 1986 | Backpropagation | Multi-layer network training |
| 1998 | CNN Revolution | Convolutional image classification |
| 2017 | Transformer | Self-attention language generation |
| 2018+ | MLPerf | Production-grade optimization benchmarks |
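In the spirit of the MLPerf milestone above, a minimal latency-measurement harness looks like the sketch below: warm up first, time repeated runs, and report median and tail latency rather than a single number. The workload and repetition counts are illustrative assumptions.

```python
import time
import numpy as np

def bench(fn, warmup=3, runs=20):
    """Return (p50, p95) latency in milliseconds for fn()."""
    for _ in range(warmup):               # warm caches and the allocator
        fn()
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        times.append((time.perf_counter() - start) * 1e3)  # ms
    return np.percentile(times, [50, 95])

a = np.random.rand(256, 256)
b = np.random.rand(256, 256)
p50, p95 = bench(lambda: a @ b)           # matmul as a stand-in workload
print(f"p50={p50:.3f} ms, p95={p95:.3f} ms")
```

Production benchmarks like MLPerf add standardized models, datasets, and accuracy targets on top of this basic timing discipline.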
```bash
# Read online
open https://mlsysbook.ai

# Download formats
curl -O https://mlsysbook.ai/pdf   # PDF
curl -O https://mlsysbook.ai/epub  # EPUB
```
```bash
cd book
./binder setup
./binder doctor
./binder build          # Build HTML book
./binder preview intro  # Hot-reload preview of a chapter
./binder pdf            # Build PDF
./binder epub           # Build EPUB
```
```bash
cd tinytorch
pip install -r requirements.txt
tito --help
```
```
cs249r_book/
├── book/            # Textbook source (Quarto Markdown)
├── tinytorch/       # TinyTorch framework & curriculum (600+ test cases)
├── kits/            # Hardware experiment labs
├── labs/            # General lab resources
├── _brand/          # Brand & design assets
├── binder/          # Root binder scripts
├── pyproject.toml   # Python project config
├── CITATION.bib     # Academic citation
└── LICENSE.md       # License declaration
```
- University Course Textbook: ML Systems, Edge AI, Embedded Intelligence courses
- Self-learner Advancement Path: From DL theory to framework implementation and production deployment
- Engineering Training Material: Systematically fill ML systems knowledge gaps for engineering teams
- Research Reference: MLSys research introduction and literature leads