Karpathy's Autoresearch: AI Agents Train LLMs Overnight
Karpathy's Autoresearch: Let AI Agents Revolutionize Your Model Training
Manual hyperparameter tweaking may be on its way out. Andrej Karpathy's autoresearch repository (20.6k stars) takes a different approach: AI agents autonomously run training experiments and improve LLMs overnight, without human intervention.
The Revolutionary Concept
Instead of researchers manually tweaking hyperparameters, architecture, and optimizers, autoresearch hands control to AI agents. The workflow:
- Agent edits train.py (GPT model, Muon+AdamW optimizer, training loop)
- Runs 5-minute training (fixed wall-clock budget)
- Evaluates on val_bpb (bits per byte, lower = better)
- Keeps improvements, discards failures
- Repeats ~100x overnight
Wake up to optimized models and detailed experiment logs.
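The keep-or-discard loop above can be sketched in a few lines of Python. This is a toy illustration, not the repository's code: `run_search`, `toy_propose`, and `toy_evaluate` are hypothetical names, the "agent" just rescales a single learning rate, and the "training run" is a synthetic score standing in for val_bpb.

```python
import math
import random


def run_search(initial, propose, evaluate, n_rounds=100):
    """Greedy keep-or-discard search: a candidate replaces the current best
    only if it scores a strictly lower val_bpb."""
    best = initial
    best_bpb = evaluate(initial)
    history = []
    for step in range(n_rounds):
        candidate = propose(best)
        bpb = evaluate(candidate)
        kept = bpb < best_bpb
        if kept:
            best, best_bpb = candidate, bpb
        # Log every attempt, including the discarded ones.
        history.append({"step": step, "val_bpb": bpb, "kept": kept})
    return best, best_bpb, history


def toy_propose(current, rng):
    # Hypothetical stand-in for the agent editing train.py: rescale one knob.
    return {"lr": current["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])}


def toy_evaluate(candidate):
    # Hypothetical stand-in for a 5-minute training run: a synthetic score
    # that is lowest at lr = 0.003. No real training happens here.
    return 1.0 + abs(math.log10(candidate["lr"] / 0.003))


rng = random.Random(0)
best, best_bpb, history = run_search(
    {"lr": 0.01}, lambda c: toy_propose(c, rng), toy_evaluate, n_rounds=100
)
print(f"best lr={best['lr']:.5f}  val_bpb={best_bpb:.4f}  "
      f"kept {sum(h['kept'] for h in history)} of {len(history)} candidates")
```

With a fixed 5-minute budget, 100 rounds come to about 500 minutes of training, or roughly 8.3 hours before any overhead, which is where the "~100x overnight" figure comes from.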
Minimal 4-File Setup
uv sync
uv run prepare.py # Download data + train tokenizer
uv run train.py # Manual test (~5 min)
Core files:
- prepare.py: Data prep + utilities (fixed)
- train.py: Agent's playground (model + training)
- program.md: Agent instructions (human-editable)
Production-Ready Design Choices
- Single editable file keeps diffs reviewable
- Fixed 5-min budget = fair architecture comparisons
- Self-contained: PyTorch + minimal deps
- Vocab-independent metric (val_bpb)
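Bits per byte divides the model's total cross-entropy by the byte length of the text rather than by the token count, which is why the metric stays comparable when the tokenizer or vocabulary changes. Below is a minimal sketch of the standard conversion, assuming the loss is summed in nats. The function name is illustrative, and the source does not show how train.py actually computes val_bpb.

```python
import math


def bits_per_byte(total_loss_nats, total_bytes):
    """Convert cross-entropy summed over a validation set (in nats) into
    bits per byte of the underlying text."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_loss_nats / (math.log(2) * total_bytes)


# The same 1000-byte text under two hypothetical tokenizers.
# Coarse vocab: 250 tokens at 4 bits each. Fine vocab: 500 tokens at 2 bits each.
coarse = bits_per_byte(250 * 4 * math.log(2), 1000)
fine = bits_per_byte(500 * 2 * math.log(2), 1000)
print(coarse, fine)
```

The per-token losses differ by a factor of two, yet both tokenizations spend the same total number of bits on the same bytes, so they score the same. A per-token loss would have made the finer tokenizer look better for no real reason.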
Quick Start for H100 Users
# 1. Install (Python 3.10+)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync
# 2. Prep data (~2 min)
uv run prepare.py
# 3. Test run (~5 min)
uv run train.py
Spin up Claude/Codex:
"Hi, read program.md and kick off a new experiment!"
Smaller Hardware? Try These Forks
- MacOS: miolini/autoresearch-macos
- MacOS MLX: trevin-creator/autoresearch-mlx
- Windows RTX: jsegov/autoresearch-win-rtx
Pro tips for low-compute: TinyStories dataset, vocab_size=1024, DEPTH=4, MAX_SEQ_LEN=256.
Why This Changes Everything
- Democratizes research: Single GPU → frontier progress
- Platform-optimized: Finds best model for your hardware
- Agent-programmable: Edit program.md to add multi-agent swarms
- MIT licensed: Fork, extend, contribute
GitHub Repo (20.6k stars): The future of AI research has arrived.