NanoChat: Build Your Own ChatGPT Clone in 8,000 Lines of Code

How Andrej Karpathy turned AI education into a hands-on experiment in simplicity

Introduction

Two days ago, Andrej Karpathy released NanoChat, a minimal and fully working ChatGPT-style chatbot that anyone can build, train, and run.

In only about 8,000 lines of code, NanoChat shows how modern conversational models really work. It includes everything: tokenization, dataset processing, model training, inference, and even a basic web interface for chatting.

Karpathy describes it as “a minimal, hackable, full-stack LLM implementation that anyone can study and extend.”

For learners, hobbyists, and researchers, it feels like a rare window into the inner workings of large language models. Instead of dealing with complex frameworks, you can actually read and understand the entire pipeline.

From nanoGPT to NanoChat

Back in 2023, Karpathy introduced nanoGPT, a small and educational framework for training a transformer model from scratch. It became one of the most popular learning resources in AI, thanks to its simple and elegant design.

NanoChat takes that same philosophy and extends it across the full chatbot workflow. Now you get the tokenizer, data pipeline, pretraining, supervised fine-tuning, inference loop, and an interactive chat UI all in one place.

It is essentially a ChatGPT clone designed for learning. Every part is visible, editable, and understandable. That makes it perfect for people who want to explore how real AI systems work without the barriers of proprietary code or giant infrastructure.

How NanoChat Works

NanoChat is organized into a few core stages that mirror the real-world structure of conversational AI models. Short, illustrative code sketches for several of these stages follow the list below.

  1. Tokenizer and Vocabulary
    The tokenizer is implemented in Rust for both speed and clarity. It turns text into small units called tokens that the model can process. You can train your own tokenizer or reuse a provided one.
  2. Pretraining and Midtraining
    The model starts by learning from a large general dataset of text, roughly 24 GB in size. After that, it undergoes “midtraining” on problem-solving and dialogue-based datasets that help it reason better and communicate naturally.
  3. Supervised Fine-Tuning (SFT)
    This step teaches the model to respond like a chatbot instead of just predicting text. It uses curated conversational data where human instructions and answers are provided.
  4. Inference and Chat Interface
    Once trained, NanoChat runs a lightweight backend with key-value caching to speed up generation. The chat interface is intentionally simple, just HTML and JavaScript, so users can test it instantly.
  5. Training Cost and Efficiency
    Karpathy showed that a usable chatbot can be trained in around four hours on an 8×H100 GPU setup for roughly $100. Even smaller models trained for shorter periods work surprisingly well for learning and experimentation.
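
Step 1 above glosses over what "turning text into tokens" actually looks like. The snippet below is not NanoChat's Rust tokenizer; it is a deliberately tiny Python illustration of the same idea, with a byte-level vocabulary and a single hard-coded merge, just to show the kind of integer sequence the model actually consumes.

```python
# Toy illustration of tokenization (NOT NanoChat's actual Rust tokenizer).
# Real BPE learns thousands of merges from data; here we hard-code one merge
# on top of a byte-level vocabulary to show the idea.

MERGES = {(ord("t"), ord("h")): 256}  # pretend BPE learned to merge "t" + "h"

def encode(text: str) -> list[int]:
    """Map text to integer token IDs: raw UTF-8 bytes, then apply merges."""
    ids = list(text.encode("utf-8"))           # byte values 0..255
    out, i = [], 0
    while i < len(ids):
        pair = tuple(ids[i:i + 2])
        if pair in MERGES:                     # greedily merge known pairs
            out.append(MERGES[pair])
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def decode(ids: list[int]) -> str:
    """Invert the mapping back to text."""
    inverse = {v: bytes(k) for k, v in MERGES.items()}
    return b"".join(inverse.get(i, bytes([i])) for i in ids).decode("utf-8")

print(encode("the model"))           # [256, 101, 32, 109, 111, 100, 101, 108]
print(decode(encode("the model")))   # "the model"
```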
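
Step 2 is, at its core, next-token prediction. The sketch below shows the generic shape of one pretraining step in PyTorch; it assumes a hypothetical `model` that maps token IDs to logits and is not NanoChat's actual training loop.

```python
import torch.nn.functional as F

# Minimal sketch of one next-token-prediction step (not NanoChat's code).
# `model` is assumed to map a (batch, seq_len) tensor of token IDs to
# (batch, seq_len, vocab_size) logits; any decoder-only transformer fits.

def pretrain_step(model, optimizer, batch_tokens):
    inputs = batch_tokens[:, :-1]             # tokens the model sees
    targets = batch_tokens[:, 1:]             # the same sequence shifted by one
    logits = model(inputs)                    # (batch, seq_len - 1, vocab_size)
    loss = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten batch and time
        targets.reshape(-1),
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Midtraining keeps this same objective and simply swaps in a different data mixture, which is why the two stages are described together above.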
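
Step 3 reuses that same loss but applies it selectively: the model should only be graded on the assistant's tokens, not on the user's instructions. A common way to express this (again an illustration, not NanoChat's exact code) is to mask non-assistant positions with the `ignore_index` that PyTorch's cross-entropy already understands.

```python
import torch.nn.functional as F

# Sketch of supervised fine-tuning with loss masking (illustrative only).
# `tokens` is a rendered conversation as token IDs; `is_assistant` is a
# boolean tensor marking which positions belong to the assistant's replies.

def sft_loss(model, tokens, is_assistant):
    inputs = tokens[:, :-1]
    targets = tokens[:, 1:].clone()
    mask = is_assistant[:, 1:]                # align the mask with shifted targets
    targets[~mask] = -100                     # -100 is ignored by cross_entropy
    logits = model(inputs)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=-100,
    )
```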
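
Step 4 mentions key-value caching, which is what makes generation feel fast: instead of re-running attention over the entire conversation for every new token, the model reuses the keys and values it has already computed and only processes the newest token. The loop below shows that idea with a hypothetical `model(ids, cache)` interface; NanoChat's actual engine differs in the details.

```python
import torch

# Sketch of greedy decoding with a key-value cache (illustrative only).
# `model(ids, cache)` is assumed to return logits for the given positions
# plus an updated cache holding keys/values for everything seen so far.

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens, eos_id):
    ids = prompt_ids                           # shape (1, prompt_len)
    logits, cache = model(ids, None)           # prefill: process the prompt once
    for _ in range(max_new_tokens):
        next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy pick
        if next_id.item() == eos_id:
            break
        ids = torch.cat([ids, next_id], dim=1)
        # Decode step: only the single new token is fed in; the cache
        # supplies keys and values for all earlier tokens.
        logits, cache = model(next_id, cache)
    return ids
```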

Why It Matters

NanoChat is not designed to compete with large proprietary models. Its value lies in how transparent and teachable it is.

Most AI systems today feel like black boxes: massive, complex, and difficult to understand. NanoChat flips that idea by showing that a fully functional chatbot can be built with clarity, structure, and just enough complexity to make it real.

It proves that you can grasp every stage of a language model without needing enterprise-scale infrastructure or obscure dependencies.

For the AI community, this represents something more meaningful than another open-source release. It is an educational artifact, one that invites everyone to learn how the technology actually works.

Lessons from NanoChat

  • Clarity matters. A smaller, well-structured codebase can teach more than a massive one filled with abstractions.
  • AI should extend human understanding, not replace it. By learning how models are built, we learn how to use them more responsibly.
  • You can learn complex ideas by studying simple systems. NanoChat teaches how transformers, tokenization, and fine-tuning fit together in practice.
  • Open-source projects accelerate real learning. They allow thousands of people to experiment, share findings, and grow the field collectively.

What You Can Do with NanoChat

NanoChat is more than a demonstration. You can:

  • Train a domain-specific chatbot using your own dataset (one possible data layout is sketched after this list).
  • Study how different layers and components in a transformer interact.
  • Test fine-tuning or optimization strategies.
  • Use it as a teaching tool in courses or workshops about AI fundamentals.
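
For the first item above, most of the work is preparing data. One common convention (not necessarily NanoChat's exact format, and with hypothetical file and field names) is a JSONL file with one conversation per line, which you then render into token IDs with the tokenizer before fine-tuning.

```python
import json

# Hypothetical example of a domain-specific conversation dataset in JSONL form.
# Each line is one conversation: an ordered list of role/content turns.
conversations = [
    [
        {"role": "user", "content": "What does our refund policy cover?"},
        {"role": "assistant", "content": "Refunds are available within 30 days of purchase..."},
    ],
]

with open("my_domain_chats.jsonl", "w", encoding="utf-8") as f:
    for convo in conversations:
        f.write(json.dumps(convo, ensure_ascii=False) + "\n")
```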

It offers an ideal environment for experimentation. Since the entire stack is readable, you can see the effects of every change you make.

A New Way to Learn AI

The release of NanoChat is part of a broader movement in AI education: learning by building.

Instead of treating large language models as magic, it invites you to open the hood, inspect every part, and rebuild it yourself.

For students, researchers, and independent developers, it shows that building something meaningful with AI no longer requires a massive lab or million-dollar budget. What it really takes is curiosity, patience, and a willingness to explore.

NanoChat’s strength is not in its scale, but in its accessibility. It makes the complex feel achievable.

Conclusion

NanoChat is a reminder that great ideas don’t need massive codebases to have impact. It’s small, educational, and empowering.

Anyone can now run their own mini ChatGPT and see how tokenization, pretraining, and fine-tuning all come together in one unified system.

If you’ve ever wanted to understand how LLMs work behind the scenes, this project is one of the clearest and most rewarding ways to start.

Try it, modify it, and experiment. The best way to learn is to build, and NanoChat gives you everything you need to do exactly that.
