Give an AI agent a small LLM training setup. Let it experiment overnight. Wake up to a better model. Explore the full source code below.
The AI agent reads train.py, prepare.py, and program.md to understand the full training setup.
It makes a hypothesis — new architecture, different hyperparameters, optimizer tweaks — and edits train.py.
Training runs for exactly 5 minutes on a single GPU. The result is measured in val_bpb (validation bits per byte); lower is better.
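Bits per byte is just the validation cross-entropy rescaled: loss in nats, normalized per byte of raw text, converted to bits. A back-of-envelope sketch (the function name is illustrative, not from the repo):

```python
import math

def bits_per_byte(total_loss_nats: float, total_bytes: int) -> float:
    # Cross-entropy summed over the eval set in nats, normalized per byte
    # of raw text and converted from nats to bits (divide by ln 2).
    return total_loss_nats / (total_bytes * math.log(2))

# e.g. 1.2e6 nats of loss over 1e6 bytes of text -> about 1.73 bpb
print(bits_per_byte(1.2e6, 1_000_000))
```

Measuring per byte rather than per token keeps the metric comparable even if the agent changes the tokenizer or vocabulary.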
If val_bpb improved, the change is kept. If not, it's reverted. The agent loops forever, autonomously.
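The loop above is plain greedy hill-climbing: propose, measure, keep or revert. A toy Python sketch, where `propose_edit` and `run_training` are stand-ins for the real agent and the 5-minute training run (here they just perturb and score a single hypothetical learning-rate knob):

```python
import random

def propose_edit(state: dict) -> dict:
    # Toy stand-in for the agent's hypothesis: perturb one "hyperparameter"
    # instead of actually editing train.py.
    new = dict(state)
    new["lr"] = state["lr"] * random.choice([0.5, 1.0, 2.0])
    return new

def run_training(state: dict) -> float:
    # Toy stand-in for a budgeted training run: pretend val_bpb is
    # minimized at lr = 1e-3.
    return 1.0 + abs(state["lr"] - 1e-3) * 100

def improvement_loop(state: dict, steps: int = 20) -> tuple[dict, float]:
    best_bpb = run_training(state)           # baseline measurement
    for _ in range(steps):
        candidate = propose_edit(state)      # hypothesis: tweak the setup
        bpb = run_training(candidate)        # budgeted training run
        if bpb < best_bpb:                   # lower bits per byte is better
            state, best_bpb = candidate, bpb # keep the change
        # otherwise the candidate is discarded, i.e. the edit is reverted
    return state, best_bpb
```

In the real system the "state" is the text of train.py itself and the revert is a file restore, but the control flow is the same.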
train.py: Model, optimizer & training loop (the file the agent edits)
prepare.py: Data prep, tokenizer, dataloader & evaluation (read-only)
program.md: Agent instructions (the "research org code")
README.md: Project overview and quick start guide