Karpathy's Autoresearch: AI Agents Train LLMs Overnight

March 10, 2026

Category: Practical Open Source Projects

Tags: AI Agents, LLM Training, Autoresearch, Karpathy, nanochat

Karpathy's Autoresearch: Let AI Agents Revolutionize Your Model Training

Andrej Karpathy's autoresearch repository (20.6k stars) takes much of the manual work out of small-scale LLM research: an AI agent edits the training code, runs short experiments, and keeps only the changes that improve the validation score, iterating overnight without human intervention.

The Revolutionary Concept

Instead of researchers manually tweaking hyperparameters, architecture, and optimizers, autoresearch hands control to AI agents. The workflow:

Agent edits train.py (GPT model, Muon+AdamW optimizer, training loop)
Runs 5-minute training (fixed wall-clock budget)
Evaluates on val_bpb (bits per byte, lower = better)
Keeps improvements, discards failures
Repeats ~100x overnight

Wake up to optimized models and detailed experiment logs.
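
The keep-or-discard loop above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: `greedy_search`, `toy_propose`, and `toy_evaluate` are hypothetical names, a plain dict stands in for the contents of train.py, and a cheap formula stands in for a 5-minute training run that reports val_bpb. How the real project stores or reverts edits is not covered here.

```python
import random
from typing import Callable, List, Tuple

Config = dict  # hypothetical stand-in for "the current contents of train.py"


def greedy_search(
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    baseline: Config,
    n_trials: int,
) -> Tuple[Config, float, List[Tuple[int, float, bool]]]:
    """Keep a candidate only if it strictly lowers the score (lower = better)."""
    best = dict(baseline)
    best_score = evaluate(best)
    history = []
    for trial in range(n_trials):
        candidate = propose(best)    # stands in for the agent's edit
        score = evaluate(candidate)  # stands in for a timed run + val_bpb
        kept = score < best_score
        if kept:
            best, best_score = candidate, score
        history.append((trial, score, kept))  # log every trial, kept or not
    return best, best_score, history


# Toy stand-ins (assumptions) so the sketch runs in milliseconds.
_rng = random.Random(0)


def toy_propose(cfg: Config) -> Config:
    # Randomly rescale a single made-up hyperparameter.
    return {"lr": cfg["lr"] * _rng.uniform(0.5, 1.5)}


def toy_evaluate(cfg: Config) -> float:
    # Made-up score with its minimum at lr = 0.03.
    return 1.0 + (cfg["lr"] - 0.03) ** 2


if __name__ == "__main__":
    best, score, history = greedy_search(
        toy_propose, toy_evaluate, {"lr": 0.1}, n_trials=100
    )
    n_kept = sum(kept for _, _, kept in history)
    print(f"best lr={best['lr']:.4f} score={score:.6f} kept={n_kept}/100")
```

Because a candidate is only accepted when it beats the current best, the best score can never get worse over the night. The discarded trials still land in the history, which is what makes the experiment log useful afterwards.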

Minimal 4-File Setup

uv sync
uv run prepare.py  # Download data + train tokenizer
uv run train.py    # Manual test (~5 min)

Core files:
- prepare.py – Data prep + utilities (fixed)
- train.py – Agent's playground (model + training)
- program.md – Agent instructions (human-editable)

Production-Ready Design Choices

✅ Single editable file keeps diffs reviewable
✅ Fixed 5-min budget = fair architecture comparisons
✅ Self-contained – PyTorch + minimal deps
✅ Vocab-independent metric (val_bpb)
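
To see why bits per byte does not depend on the vocabulary, it helps to write the metric out. The sketch below uses the standard definition (summed negative log-likelihood in nats, divided by ln 2 times the number of UTF-8 bytes); it is not taken from the repository's evaluation code, and the two "models" are made-up numbers chosen only for illustration.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


text = "hello world"
n_bytes = len(text.encode("utf-8"))

# Hypothetical model A: one token per byte, uniform over 256 values.
n_tokens_a = n_bytes
nll_a = n_tokens_a * math.log(256)

# Hypothetical model B: a coarse tokenizer that emits 2 tokens,
# each assigned probability 2**-44.
n_tokens_b = 2
nll_b = n_tokens_b * 44 * math.log(2)

bpb_a = bits_per_byte(nll_a, n_bytes)
bpb_b = bits_per_byte(nll_b, n_bytes)

print(f"per-token loss: A={nll_a / n_tokens_a:.3f} nats, B={nll_b / n_tokens_b:.3f} nats")
print(f"bits per byte:  A={bpb_a:.3f}, B={bpb_b:.3f}")
```

The per-token losses differ by a wide margin because the two tokenizers split the text differently, yet both models spend the same number of bits per byte of text. Normalizing by bytes is what lets an agent change the tokenizer or vocabulary size and still compare runs on one scale.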

Quick Start for H100 Users

# 1. Install (Python 3.10+)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

# 2. Prep data (~2 min)
uv run prepare.py

# 3. Test run (~5 min)
uv run train.py