AutoResearch Explorer

by Karpathy

Autonomous AI Research

Give an AI agent a small LLM training setup. Let it experiment overnight. Wake up to a better model. Explore the full source code below.

AutoResearch progress chart
Time budget per run: 5 min
Experiments per hour: ~12
Experiments overnight: ~100
Single metric (lower = better): val_bpb
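
The page names val_bpb as the single metric but does not show how it is computed. As a minimal sketch, assuming the usual definition of bits per byte (summed cross-entropy in nats, converted to bits and divided by the number of UTF-8 bytes in the evaluated text), the calculation might look like this. The function name `bits_per_byte` and its arguments are illustrative assumptions, not taken from prepare.py.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (in nats) to bits per byte.

    Hypothetical helper for illustration; the project's own evaluation
    code may organize this differently.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Dividing by ln(2) converts nats to bits; dividing by the byte count
    # normalizes by the size of the raw text rather than the token count.
    return total_nll_nats / (math.log(2) * total_bytes)


# Sanity check: a model that spreads probability uniformly over all
# 256 byte values pays ln(256) nats per byte, which is 8 bits per byte.
n_bytes = len("hello world".encode("utf-8"))
uniform_bpb = bits_per_byte(n_bytes * math.log(256), n_bytes)
```

Because the loss is normalized by bytes rather than tokens, a score of this kind stays comparable even when an experiment changes the tokenizer or vocabulary size.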

How It Works

1. Agent reads code

The AI agent reads train.py, prepare.py, and program.md to understand the full training setup.

2. Modifies train.py

It forms a hypothesis (a new architecture, different hyperparameters, or optimizer tweaks) and edits train.py to test it.

3. Runs experiment

Training runs for exactly 5 minutes on a single GPU. The result is reported as val_bpb, the validation score in bits per byte.

4. Keep or discard

If val_bpb improves, the change is kept; otherwise it is reverted. The agent then repeats the loop indefinitely, with no human intervention.
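
The four steps above amount to a greedy keep-or-discard search. The sketch below illustrates that control flow under stated assumptions: `research_loop`, `propose`, `evaluate`, and the toy stand-ins are hypothetical names, a configuration dict stands in for an edit to train.py, and a cheap formula stands in for the 5-minute training run. It also stops after a fixed number of experiments so that it terminates, whereas the agent described here keeps going.

```python
import random
from typing import Callable, List, Tuple

Config = dict
Record = Tuple[Config, float, bool]  # (candidate, val_bpb, kept?)


def research_loop(
    baseline: Config,
    propose: Callable[[Config], Config],
    evaluate: Callable[[Config], float],
    n_experiments: int,
) -> Tuple[Config, float, List[Record]]:
    """Greedy loop: keep a change only if it lowers the metric."""
    best_config = dict(baseline)
    best_bpb = evaluate(best_config)
    history: List[Record] = []
    for _ in range(n_experiments):
        candidate = propose(best_config)  # step 2: modify
        bpb = evaluate(candidate)         # step 3: run the experiment
        kept = bpb < best_bpb             # step 4: keep or discard
        if kept:
            best_config, best_bpb = candidate, bpb
        # A discarded candidate is simply dropped, which plays the role
        # of reverting the edit.
        history.append((candidate, bpb, kept))
    return best_config, best_bpb, history


def toy_evaluate(config: Config) -> float:
    """Toy stand-in for a training run: lowest score at lr = 0.02."""
    return 1.0 + 1000.0 * (config["lr"] - 0.02) ** 2


def make_toy_propose(rng: random.Random) -> Callable[[Config], Config]:
    """Toy stand-in for the agent's hypothesis: rescale the learning rate."""
    def propose(config: Config) -> Config:
        new = dict(config)
        new["lr"] = config["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])
        return new
    return propose


start = {"lr": 0.001}
best_config, best_bpb, history = research_loop(
    start, make_toy_propose(random.Random(0)), toy_evaluate, n_experiments=100
)
```

Because a candidate is only accepted when it strictly lowers the score, the best result never gets worse over the night; the trade-off is that a greedy loop of this kind can settle into a local optimum.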

Source Code Explorer

train.py

Model, optimizer & training loop — the file the agent edits

prepare.py

Data prep, tokenizer, dataloader & evaluation — read-only

program.md

Agent instructions — the "research org code"

README.md

Project overview and quick start guide

train.py