Karpathy's Autoresearch: Let AI Agents Revolutionize Your Model Training

Andrej Karpathy's autoresearch repository (20.6k stars) automates the inner loop of LLM research: an AI agent edits the training script, runs short experiments, and keeps only the changes that lower validation bits per byte, all overnight and without human intervention.

The Revolutionary Concept

Instead of researchers manually tweaking hyperparameters, architecture, and optimizers, autoresearch hands control to AI agents. The workflow:

Agent edits train.py (GPT model, Muon+AdamW optimizer, training loop)
Runs 5-minute training (fixed wall-clock budget)
Evaluates on val_bpb (bits per byte, lower = better)
Keeps improvements, discards failures
Repeats ~100x overnight
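
The workflow above is essentially greedy hill climbing on val_bpb. Here is a minimal sketch of that keep-or-discard logic, not the repository's actual code: `propose_edit` and `run_training` are hypothetical stand-ins for "the agent edits train.py" and "run the 5-minute training and report val_bpb", and the log format is an assumption.

```python
from typing import Callable, List, Optional, Tuple


def autoresearch_loop(
    baseline_bpb: float,
    propose_edit: Callable[[int], str],    # hypothetical: agent's change to train.py
    run_training: Callable[[str], float],  # hypothetical: fixed-budget run, returns val_bpb
    n_experiments: int = 100,
) -> Tuple[float, List[dict]]:
    """Greedy loop: keep an edit only if it lowers val_bpb."""
    best_bpb = baseline_bpb
    log: List[dict] = []
    for step in range(n_experiments):
        edit = propose_edit(step)
        bpb: Optional[float] = None
        try:
            bpb = run_training(edit)
        except Exception:
            pass  # a crashed run is treated as a failed experiment
        # Lower bits per byte is better; ties and crashes are discarded.
        kept = bpb is not None and bpb < best_bpb
        if kept:
            best_bpb = bpb
        log.append({"step": step, "edit": edit, "val_bpb": bpb, "kept": kept})
    return best_bpb, log
```

Each experiment is compared against the best result so far rather than the original baseline, so improvements compound over the night.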

Wake up to optimized models and detailed experiment logs.

Minimal Setup: Three Core Files

uv sync
uv run prepare.py  # Download data + train tokenizer
uv run train.py    # Manual test (~5 min)
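
The "~5 min" figure comes from a fixed wall-clock budget rather than a fixed step count. A rough sketch of what a time-budgeted loop can look like, assuming a hypothetical `step_fn` that performs one optimizer step (the real train.py may structure this differently):

```python
import time
from typing import Callable


def train_with_budget(
    step_fn: Callable[[int], None],           # hypothetical: one optimizer step
    budget_seconds: float = 300.0,            # 5 minutes
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Run steps until the wall-clock budget is spent; return steps completed."""
    start = clock()
    steps = 0
    while clock() - start < budget_seconds:
        step_fn(steps)
        steps += 1
    return steps
```

With this design a faster model simply gets more steps in the same time, which is why the budget compares changes by what they achieve per unit of wall-clock time on a given machine.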

Core files:

prepare.py – Data prep + utilities (fixed; the agent does not edit it)
train.py – Agent's playground (model + training)
program.md – Agent instructions (human-editable)

Practical Design Choices

✅ Single editable file keeps diffs reviewable
✅ Fixed 5-min budget = fair architecture comparisons
✅ Self-contained – PyTorch + minimal deps
✅ Vocab-independent metric (val_bpb)
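
Why is val_bpb vocab-independent? Bits per byte divides the total cross-entropy by the number of raw bytes in the text, not the number of tokens, so changing the tokenizer does not change the denominator. A small sketch of the standard conversion from nats to bits per byte (the function name and the numbers are illustrative, not taken from the repository):

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert summed cross-entropy (in nats) over a text into bits per byte."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


# Same 1000-byte text, two hypothetical tokenizers:
# A splits it into 250 tokens at a mean loss of 2.0 nats/token,
# B splits it into 500 tokens at a mean loss of 1.0 nats/token.
bpb_a = bits_per_byte(250 * 2.0, 1000)
bpb_b = bits_per_byte(500 * 1.0, 1000)
assert bpb_a == bpb_b  # per-token loss differs, bits per byte does not
```

Per-token loss would make tokenizer B look twice as good here, even though both models compress the text equally well.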

Quick Start for H100 Users

# 1. Install uv, then the project dependencies (Python 3.10+)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

# 2. Prep data (~2 min)
uv run prepare.py

# 3. Test run (~5 min)
uv run train.py

Then spin up Claude or Codex in the repo and prompt it:

"Hi, read program.md and kick off a new experiment!"

Smaller Hardware? Try These Forks

MacOS: miolini/autoresearch-macos
MacOS MLX: trevin-creator/autoresearch-mlx
Windows RTX: jsegov/autoresearch-win-rtx

Pro tips for low-compute: TinyStories dataset, vocab_size=1024, DEPTH=4, MAX_SEQ_LEN=256.
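
To make those settings concrete, here is a small sketch that collects them in one place. The names vocab_size, DEPTH, and MAX_SEQ_LEN come from the tip above; the dataclass, the dataset identifier string, and the helper method are hypothetical, and where each constant actually lives in the repository or a given fork is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LowComputeConfig:
    dataset: str = "TinyStories"  # hypothetical identifier string
    vocab_size: int = 1024
    DEPTH: int = 4
    MAX_SEQ_LEN: int = 256

    def attention_entries_per_head(self) -> int:
        """Attention-score entries per sequence per head, summed over layers.

        Assumes full (non-windowed) attention in every layer.
        """
        return self.DEPTH * self.MAX_SEQ_LEN ** 2


cfg = LowComputeConfig()
print(cfg.attention_entries_per_head())  # 4 * 256**2 = 262144
```

Under the full-attention assumption, the attention-score term grows quadratically with sequence length, so halving MAX_SEQ_LEN cuts it by four, while halving DEPTH only cuts it by two.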

Why This Changes Everything

Democratizes research: A single GPU is enough to run the full experiment loop
Platform-optimized: Because the budget is wall-clock time, the search favors whatever trains best on your hardware in 5 minutes
Agent-programmable: Edit program.md to add multi-agent swarms
MIT licensed: Fork, extend, contribute

GitHub Repo (20.6k ⭐) – The future of AI research has arrived.