Andrej Karpathy just dropped nanochat, a compact open-source project that demolishes the barrier to entry for training conversational AI. In roughly 8,000 lines of code, you get everything needed to train, evaluate, and deploy a ChatGPT-style model for about $100 in compute. This is the real deal: a complete, end-to-end pipeline you can actually understand.
What nanochat Delivers
nanochat expands on Karpathy’s earlier nanoGPT project, but this time you’re getting the full stack. We’re talking about a complete LLM pipeline that runs on a single 8×H100 GPU node and finishes in approximately 4 hours using the included speedrun.sh script.
The Complete Pipeline
Here’s what you’re getting out of the box:
Tokenizer Training: A Rust-based BPE tokenizer with a 65,536-token (2^16) vocabulary that handles the heavy lifting of text processing; a toy sketch of the BPE loop follows this list.
Pretraining: Leverages FineWeb-EDU, a curated collection of educational web content that provides clean, high-quality training data.
Mid-Training: Adds specialized capabilities through dialogue datasets, multiple-choice questions, and tool-use examples.
Supervised Fine-Tuning (SFT): Refines the model on chat and reasoning datasets to create natural conversational abilities; a loss-masking sketch follows this list.
Reinforcement Learning (GRPO): Optional enhancement that uses the GSM8K dataset to sharpen mathematical reasoning; an advantage-computation sketch follows this list.
Comprehensive Evaluation: Tests performance across ARC-E, MMLU, GSM8K, and HumanEval benchmarks to validate model quality; a multiple-choice scoring sketch follows this list.
Built-In Inference: Includes both a web UI and a CLI so you can chat with your model as soon as training finishes.
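To make those stages concrete, here are a few short Python sketches. First, the tokenizer. nanochat's actual tokenizer is implemented in Rust, but the BPE training loop it performs is simple enough to sketch in a few lines: start from raw bytes, then repeatedly merge the most frequent adjacent token pair until the vocabulary is full. A toy illustration, not the project's code:

```python
from collections import Counter

def train_bpe(text: str, vocab_size: int):
    """Toy byte-level BPE trainer: repeatedly merge the most frequent
    adjacent token pair. A sketch only, not nanochat's Rust tokenizer."""
    ids = list(text.encode("utf-8"))        # start from raw bytes: ids 0..255
    merges = {}                             # (left, right) -> new token id
    next_id = 256
    while next_id < vocab_size:
        pairs = Counter(zip(ids, ids[1:]))  # count adjacent pairs
        if not pairs:
            break
        pair = max(pairs, key=pairs.get)    # most frequent pair wins
        merges[pair] = next_id
        # replace every occurrence of the pair with the new token
        out, i = [], 0
        while i < len(ids):
            if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
                out.append(next_id)
                i += 2
            else:
                out.append(ids[i])
                i += 1
        ids = out
        next_id += 1
    return merges

merges = train_bpe("the cat sat on the mat, the cat sat", vocab_size=260)
print(merges)  # frequent byte pairs get promoted to new tokens
```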
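Next, supervised fine-tuning. A standard trick in chat SFT is to compute the next-token loss only on the assistant's tokens, so the model learns to answer rather than to imitate the user. A minimal PyTorch sketch with made-up token ids; nanochat's actual chat format and special tokens will differ:

```python
import torch
import torch.nn.functional as F

IGNORE = -100  # cross_entropy skips positions with this target

def sft_targets(token_ids, is_assistant):
    """Next-token targets that only penalize the assistant's tokens.
    is_assistant[i] marks whether token i belongs to the assistant's
    reply; user/system tokens are masked out of the loss."""
    targets = torch.tensor(token_ids[1:] + [IGNORE])      # shift left by one
    mask = torch.tensor(is_assistant[1:] + [False])       # mask follows shift
    return torch.where(mask, targets, torch.full_like(targets, IGNORE))

# toy example: 5-token conversation, last 2 tokens are the assistant's
tokens = [10, 11, 12, 13, 14]
roles  = [False, False, False, True, True]
targets = sft_targets(tokens, roles)
# at train time: loss = F.cross_entropy(logits.view(-1, V), targets, ignore_index=IGNORE)
```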
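The optional RL stage uses GRPO, which sidesteps a learned value network by scoring each sampled answer against the mean reward of its own group. A sketch of the advantage computation following the common formulation; nanochat's variant may simplify parts of it, such as the standard-deviation normalization:

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Group-relative advantages for GRPO.

    `rewards` holds one scalar per sampled completion of the *same*
    prompt (e.g. 1.0 if the GSM8K answer is correct, else 0.0). Each
    sample is scored against the group mean, so no value network is
    needed; eps guards against a zero std when all rewards are equal.
    """
    baseline = rewards.mean()
    return (rewards - baseline) / (rewards.std() + eps)

# 4 completions of one math problem; two got the right answer
adv = grpo_advantages(torch.tensor([1.0, 0.0, 1.0, 0.0]))
print(adv)  # positive for correct samples, negative for wrong ones
```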
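Finally, evaluation. Benchmarks like ARC-E and MMLU are multiple-choice, and a common scoring protocol is to pick the option the model assigns the highest length-normalized log-likelihood. A sketch built around a hypothetical `loglikelihood` helper, not nanochat's actual harness:

```python
def pick_answer(loglikelihood, question: str, options: list[str]) -> int:
    """Pick the option whose text the model finds most likely given the
    question, using length-normalized log-likelihood. `loglikelihood`
    is a hypothetical stand-in: (context, continuation) -> float."""
    scores = [
        loglikelihood(f"{question}\nAnswer:", f" {opt}") / max(len(opt), 1)
        for opt in options
    ]
    return max(range(len(options)), key=scores.__getitem__)

# smoke test with a fake scorer that strongly prefers "4"
fake = lambda ctx, cont: 0.0 if cont.strip() == "4" else -1.0
print(pick_answer(fake, "2 + 2 = ?", ["4", "22", "five"]))  # -> 0
```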
The entire system deliberately avoids heavyweight frameworks and configuration complexity. Every component is designed to be readable, modifiable, and instructive.
Why This Matters
nanochat represents a step change in LLM accessibility. This isn’t a toy repo or a simplified demo: it’s a legitimate training pipeline that produces a small but genuinely functional conversational model.
For Education: This serves as the capstone project for Karpathy’s upcoming LLM101n course, giving students hands-on experience with every stage of LLM development.
For Rapid Prototyping: Small teams can fork the codebase and train their own customized conversational models without enterprise-scale infrastructure or budgets.
For Reproducible Research: Every training run generates a markdown report card, making it trivial to track experiments and share results; a sketch of the idea follows.
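As a rough illustration of that idea (not nanochat's actual report format), a report card can be as simple as dumping run metrics into a markdown table; the metric names and numbers below are made up:

```python
def write_report(metrics: dict[str, float], path: str = "report.md") -> None:
    """Dump run metrics as a small markdown table. A hypothetical
    sketch of a 'report card', not nanochat's actual report layout."""
    lines = ["# Run report", "", "| Metric | Value |", "|---|---|"]
    lines += [f"| {name} | {value:.4f} |" for name, value in metrics.items()]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

# purely illustrative numbers, not real results
write_report({"ARC-E": 0.5, "MMLU": 0.5, "GSM8K": 0.5, "HumanEval": 0.5})
```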
Karpathy summarized it perfectly on X: “You boot up a GPU, run one script, and in four hours you can talk to your own LLM.”
The Bottom Line
nanochat removes the mystery from modern LLM development. You get:
Transparency: Every line of code is visible and understandable. No black boxes, no hidden configuration files, no proprietary frameworks.
Affordability: At roughly $100 per training run (an 8×H100 node at the ~$24/hour such nodes commonly rent for, times four hours, comes to about $96), experimentation becomes practical. You can iterate, test new ideas, and learn from failures without breaking the budget.
Completeness: This isn’t just model training. You’re getting tokenization, evaluation, deployment, and a chat interface: everything you need to go from raw data to live inference.
Hackability: The minimal codebase means you can modify any component without wrestling with framework abstractions or dependency hell.
If you’re serious about understanding how ChatGPT-style systems actually work, nanochat is your executable tutorial. It’s the kind of project that transforms theoretical knowledge into practical capability.
Repository: github.com/karpathy/nanochat
Tagline: “The best ChatGPT that $100 can buy.”
This is what democratized AI looks like: no gatekeeping, no hand-waving, just clean code and clear documentation. Clone it, run it, break it, fix it, and make it yours.