Build Your Own autoresearch — Applying Autonomous Experimentation to Any Domain
Apply the autoresearch pattern to text classification, image classification, and RAG pipelines. Includes a universal experiment runner and program.md template.

Karpathy's autoresearch is an autonomous experimentation system built for LLM pretraining. In Part 1 we covered the overall architecture, and in Part 2 we dug into the agent's experimentation strategy and result analysis. If you've read this far, one question is probably on your mind:
"Can I use this for my own problem?"
In this post, we extract the core patterns from autoresearch and apply them to three domains: text classification, image classification, and RAG pipelines. At the end, we provide a general-purpose experiment runner and a program.md template you can adapt immediately.
Series: Part 1: Architecture | Part 2: Experiment Strategy | Part 3 (this post)
Extracting the Core Pattern from autoresearch
The structure running through all of autoresearch is surprisingly simple. Three files, a five-step loop, and a handful of design principles. Extract these, and you can apply the pattern to any ML task.
The 3-File Architecture
Here's autoresearch's file structure broken down by role:
| File | Role | Modified by |
|---|---|---|
| prepare.py | Fixed infrastructure (data, evaluation, utilities) | Human (once) |
| train.py | Experimentation target (model, hyperparameters, training loop) | Agent (every experiment) |
| program.md | Agent protocol (experiment rules, evaluation criteria) | Human (meta-optimization) |
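
To make the division of responsibility concrete, here is a minimal sketch that encodes the table as data and checks whether a given actor is allowed to touch a given file. This is not part of autoresearch itself: the names `FileRole`, `FILE_ROLES`, `may_modify`, and `violations` are hypothetical helpers introduced purely for illustration, and the "human"/"agent" labels are a simplification of the "Modified by" column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileRole:
    role: str
    modified_by: str  # "human" or "agent" (simplified from the table)


# Mirrors the table above. The dictionary is an illustrative assumption,
# not a structure that autoresearch ships.
FILE_ROLES = {
    "prepare.py": FileRole("fixed infrastructure", "human"),
    "train.py": FileRole("experimentation target", "agent"),
    "program.md": FileRole("agent protocol", "human"),
}


def may_modify(actor: str, filename: str) -> bool:
    """Return True only if the table assigns this file to the given actor."""
    entry = FILE_ROLES.get(filename)
    return entry is not None and entry.modified_by == actor


def violations(actor: str, changed_files: list[str]) -> list[str]:
    """List the changed files that the actor is not allowed to touch."""
    return [f for f in changed_files if not may_modify(actor, f)]


if __name__ == "__main__":
    # An agent that edited both files has overstepped on prepare.py.
    print(violations("agent", ["train.py", "prepare.py"]))  # ['prepare.py']
```

A check like this could run before each experiment is accepted, so that an agent edit to the evaluation code in prepare.py is rejected rather than silently changing what the metric measures.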