Give an AI agent a small LLM training setup. Let it experiment overnight. Wake up to a better model. Explore the full source code below.
The AI agent reads train.py, prepare.py, and program.md to understand the full training setup.
It makes a hypothesis — new architecture, different hyperparameters, optimizer tweaks — and edits train.py.
Training runs for exactly 5 minutes on a single GPU. The result is measured in val_bpb (validation bits per byte); lower is better.
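Bits per byte is just the validation cross-entropy rescaled: loss in nats, normalized per byte of raw text, converted to bits. A back-of-envelope sketch (the function name is illustrative, not from the repo):

```python
import math

def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    # Cross-entropy summed over the eval set in nats, normalized per byte
    # of raw text and converted from nats to bits (divide by ln 2).
    return total_loss_nats / (total_bytes * math.log(2))

# e.g. 1.2e6 nats of loss over 1e6 bytes of text -> about 1.73 bpb
print(bits_per_byte(1.2e6, 1_000_000))
```

Measuring per byte rather than per token keeps the metric comparable even if the agent changes the tokenizer or vocabulary.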
If val_bpb improved, the change is kept. If not, it's reverted. The agent loops forever, autonomously.
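The loop above is plain greedy hill-climbing: propose, measure, keep or revert. A toy Python sketch, where `propose_edit` and `run_training` are stand-ins for the real agent and the 5-minute training run (here they just perturb and score a single hypothetical learning-rate knob):

```python
import random

def propose_edit(state: dict) -> dict:
    # Toy stand-in for the agent's hypothesis: perturb one "hyperparameter"
    # instead of actually editing train.py.
    new = dict(state)
    new["lr"] = state["lr"] * random.choice([0.5, 1.0, 2.0])
    return new

def run_training(state: dict) -> float:
    # Toy stand-in for a budgeted training run: pretend val_bpb is
    # minimized at lr = 1e-3.
    return 1.0 + abs(state["lr"] - 1e-3) * 100

def improvement_loop(state: dict, steps: int = 20) -> tuple[dict, float]:
    best_bpb = run_training(state)           # baseline measurement
    for _ in range(steps):
        candidate = propose_edit(state)      # hypothesis: tweak the setup
        bpb = run_training(candidate)        # budgeted training run
        if bpb < best_bpb:                   # lower bits per byte is better
            state, best_bpb = candidate, bpb # keep the change
        # otherwise the candidate is discarded, i.e. the edit is reverted
    return state, best_bpb
```

In the real system the "state" is the text of train.py itself and the revert is a file restore, but the control flow is the same.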
train.py: Model, optimizer & training loop (the file the agent edits)
prepare.py: Data prep, tokenizer, dataloader & evaluation (read-only)
program.md: Agent instructions (the "research org code")
README.md: Project overview and quick start guide