Claude Autoresearch: AI That Improves Code Forever
Discover Claude Autoresearch, the Claude Code skill inspired by Karpathy's autoresearch that turns AI into a relentless improvement engine. Set a goal and metric, then watch Claude autonomously iterate: review, change, test, commit or revert, repeat forever. From bug hunting and security audits to shipping workflows and adversarial reasoning, 10 powerful commands handle code, content, marketing, and more. Install in seconds via plugin marketplace—no AGI needed, just goals + metrics + loops.
Claude Autoresearch: Turn Claude Code into a Relentless Improvement Engine
"Set the GOAL → Claude runs the LOOP → You wake up to results"
What if your AI could autonomously improve anything measurable (code, content, metrics, processes) without you babysitting it? Claude Autoresearch (3.1k ⭐) makes this a reality using Karpathy's formula: constraint + mechanical metric + autonomous iteration = compounding gains.
From 630 Lines of Python to Claude's Universal Loop
Karpathy showed a simple Python script could run 100 ML experiments overnight. Claude Autoresearch generalizes this to ANY domain:
- Code: Test coverage → 90%, bundle size → 50% smaller
- Performance: API p95 → <100ms
- Security: Autonomous STRIDE + OWASP audits
- Shipping: Universal PR/deployment/content workflows
- Docs: Auto-generate/update/validate documentation
The 8-Phase Autonomous Loop
LOOP (FOREVER):
1. Review git history + results log
2. Pick the next change
3. Make ONE focused change
4. Git commit (experiment: prefix)
5. Run mechanical verification
6. IMPROVED → keep | WORSE → revert
7. Log TSV results
8. Repeat
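The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `run_loop`, `Result`, and the four callables are hypothetical names, the loop is bounded by `max_iterations` so the sketch terminates, and the `discard` status label is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Result:
    iteration: int
    metric: float
    delta: float
    status: str  # "baseline", "keep", or "discard" (the last label is assumed)


def run_loop(
    measure: Callable[[], float],
    apply_change: Callable[[List[Result]], None],
    commit: Callable[[str], None],
    revert: Callable[[], None],
    max_iterations: int,
    higher_is_better: bool = True,
) -> List[Result]:
    """Bounded sketch of the review / change / commit / verify / keep-or-revert loop."""
    best = measure()
    log = [Result(0, best, 0.0, "baseline")]
    for i in range(1, max_iterations + 1):
        apply_change(log)                      # review the log, make one focused change
        commit(f"experiment: iteration {i}")   # commit before verifying
        value = measure()                      # mechanical verification
        delta = value - best
        improved = delta > 0 if higher_is_better else delta < 0
        if improved:
            best = value
            log.append(Result(i, value, delta, "keep"))
        else:
            revert()                           # roll back the failed experiment
            log.append(Result(i, value, delta, "discard"))
    return log
```

In a real setup, `commit` and `revert` would presumably wrap git commands and `measure` would run the Verify command. Here they are plain callables so the control flow can be read and tested on its own.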
Among the 8 Critical Rules that ensure relentless progress:
- One change per iteration (atomic)
- Mechanical verification only (no subjectivity)
- Auto-rollback failures
- Git as memory
- Simplicity wins (less code = better)
10 Battle-Tested Commands
| Command | Use Case |
|---|---|
| /autoresearch | Core optimization loop |
| /autoresearch:plan | Goal → config wizard |
| /autoresearch:security | Autonomous security audit |
| /autoresearch:ship | Ship PRs/deployments/content |
| /autoresearch:debug | Hunt ALL bugs scientifically |
| /autoresearch:fix | Crush errors until zero remain |
| /autoresearch:scenario | Explore 12 dimensions of edge cases |
| /autoresearch:predict | 5-expert swarm analysis |
| /autoresearch:learn | Autonomous docs engine |
| /autoresearch:reason | Adversarial refinement (v1.9.0) |
Install in 30 Seconds
Plugin (Recommended):
/plugin marketplace add uditgoenka/autoresearch
/plugin install autoresearch@autoresearch
First Run:
/autoresearch
Goal: Increase test coverage from 72% to 90%
Scope: src/**/*.test.ts
Verify: npm test -- --coverage | grep "All files"
Walk away. Claude iterates autonomously. Every improvement stacks.
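To make "mechanical metric" concrete, here is a rough Python sketch of turning the line selected by the Verify command into a number. It assumes the "All files" row is pipe-separated with the statement percentage as its first numeric column, as in the text coverage table many Jest setups print. The function names and the sample row are hypothetical.

```python
import re
from typing import Optional


def parse_all_files_coverage(line: str) -> Optional[float]:
    """Return the first percentage on an 'All files' summary row, or None."""
    if not line.strip().startswith("All files"):
        return None
    cells = [cell.strip() for cell in line.split("|")]
    for cell in cells[1:]:
        # Accept plain numbers such as "72" or "72.4"
        if re.fullmatch(r"\d+(\.\d+)?", cell):
            return float(cell)
    return None


def goal_reached(line: str, target: float) -> bool:
    """True when the parsed coverage meets or exceeds the target."""
    value = parse_all_files_coverage(line)
    return value is not None and value >= target


# Made-up sample row, shaped like a typical coverage summary line
sample = "All files      |    72.4 |    61.9 |    70.0 |    72.4 |"
current = parse_all_files_coverage(sample)
```

The point is that the loop needs a single comparable number per iteration. Any command whose output can be reduced to one this way can serve as the Verify step.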
Real-World Power: Command Chains
# Full quality pipeline
/autoresearch:reason --chain predict,scenario,debug,fix
# Security → Ship
/autoresearch:security --fix --chain ship
# Docs after changes
/autoresearch:learn --mode update
Guard: Regression Protection
Goal: Reduce API p95 response time to <100ms
Verify: npm run bench:api | grep "p95"
Guard: npm test # Safety net
Metric improves + guard passes = keep. Guard fails = rework.
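The Verify-plus-Guard decision can be sketched as a small Python function. This is an illustration of the rule as described here, not the skill's internal logic: `decide` and `guard_passes` are hypothetical names, and treating a zero exit code as "guard passed" is an assumption.

```python
import subprocess
from typing import Sequence


def guard_passes(command: Sequence[str]) -> bool:
    """Run the guard command; assume exit code 0 means the safety net held."""
    return subprocess.run(list(command), capture_output=True).returncode == 0


def decide(previous: float, current: float, guard_ok: bool,
           lower_is_better: bool = True) -> str:
    """Combine the metric comparison with the guard result."""
    improved = current < previous if lower_is_better else current > previous
    if not improved:
        return "revert"   # the metric got worse or stayed flat
    if not guard_ok:
        return "rework"   # the metric improved but the guard broke
    return "keep"         # the metric improved and the guard passed
```

For the latency example, `decide(120.0, 95.0, guard_passes(["npm", "test"]))` would keep the change only if the test suite still exits cleanly.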
Why It Works (The Science)
- Atomic changes → Clear cause/effect
- Git memory → Learns from every experiment
- Mechanical gates → No human bias
- Unbounded iteration → Compounding gains
- Auto-rollback → Never worse than start
Domains Beyond Code
- Marketing: CTR → 3x, conversion rate ↑
- Sales: Email open rates, reply rates
- Content: Engagement scores, readability
- HR: Policy compliance metrics
- DevOps: Deployment success rate
TSV Results Tracking:
iteration commit metric delta status
0 a1b2c3d 85.2 0.0 baseline
1 b2c3d4e 87.1 +1.9 keep
3 c3d4e5f 88.3 +1.2 keep
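A log in this shape is easy to post-process. The Python sketch below reads the rows shown above and sums the kept deltas. It assumes the file is tab-separated with exactly these five column names. `read_results` and `total_gain` are hypothetical helper names.

```python
import csv
import io

# The sample rows from above, joined with real tab characters
SAMPLE_LOG = "\n".join(
    "\t".join(row)
    for row in [
        ("iteration", "commit", "metric", "delta", "status"),
        ("0", "a1b2c3d", "85.2", "0.0", "baseline"),
        ("1", "b2c3d4e", "87.1", "+1.9", "keep"),
        ("3", "c3d4e5f", "88.3", "+1.2", "keep"),
    ]
)


def read_results(text):
    """Parse the TSV log into a list of typed dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    rows = []
    for row in reader:
        rows.append({
            "iteration": int(row["iteration"]),
            "commit": row["commit"],
            "metric": float(row["metric"]),
            "delta": float(row["delta"]),  # float() accepts a leading "+"
            "status": row["status"],
        })
    return rows


def total_gain(rows):
    """Sum the deltas of kept experiments, rounded to one decimal place."""
    return round(sum(r["delta"] for r in rows if r["status"] == "keep"), 1)


rows = read_results(SAMPLE_LOG)
gain = total_gain(rows)
```

Because every row carries a commit hash, the same file doubles as an index into git history for any experiment worth revisiting.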
🚀 Get Started Today
Install Claude Autoresearch and experience autonomous improvement. No AGI needed—just goals, metrics, and loops that never quit.
Creators: Udit Goenka (AI Product Expert) + contributors. MIT License. 127 commits. v1.9.0 released Apr 2026.