Claude Autoresearch: AI That Improves Code Forever

Discover Claude Autoresearch, the Claude Code skill inspired by Karpathy's autoresearch that turns AI into a relentless improvement engine. Set a goal and a metric, then let Claude iterate autonomously: review, change, test, commit or revert, repeat. Ten commands cover bug hunting, security audits, shipping workflows, adversarial reasoning, and more, across code, content, and marketing. Install in seconds via the plugin marketplace. No AGI needed, just goals + metrics + loops.

Claude Autoresearch: Turn Claude Code into a Relentless Improvement Engine

"Set the GOAL → Claude runs the LOOP → You wake up to results"

What if your AI could autonomously improve anything measurable (code, content, metrics, processes) without you babysitting it? Claude Autoresearch (3.1k ⭐) makes this a reality using Karpathy's proven formula: constraint + mechanical metric + autonomous iteration = compounding gains.

From 630 Lines of Python to Claude's Universal Loop

Karpathy showed that a simple Python script, about 630 lines long, could run 100 ML experiments overnight. Claude Autoresearch generalizes this to ANY domain:

  • Code: Test coverage → 90%, bundle size → 50% smaller
  • Performance: API p95 → <100ms
  • Security: Autonomous STRIDE + OWASP audits
  • Shipping: Universal PR/deployment/content workflows
  • Docs: Auto-generate/update/validate documentation

The 8-Phase Autonomous Loop

LOOP (FOREVER):
1. Review git history + results log
2. Pick ONE focused change and apply it
3. Git commit (experiment: prefix)
4. Run mechanical verification
5. IMPROVED → keep | WORSE → revert
6. Log TSV results
7. Repeat
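
The keep-or-revert cycle above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: `run_loop`, `LoopState`, and the three callbacks are hypothetical names, the loop is bounded by an `iterations` argument for demonstration (the loop described above runs until interrupted), and a metric that merely ties the best value is treated as not improved.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopState:
    best: float  # best metric value kept so far
    log: List[Tuple[int, float, str]] = field(default_factory=list)


def run_loop(
    measure: Callable[[], float],          # mechanical verification: returns the metric
    apply_change: Callable[[int], object],  # makes ONE focused change for iteration i
    revert: Callable[[], object],           # undoes the most recent change
    iterations: int,                        # assumption: bounded for this sketch
    higher_is_better: bool = True,
) -> LoopState:
    state = LoopState(best=measure())
    state.log.append((0, state.best, "baseline"))
    for i in range(1, iterations + 1):
        apply_change(i)
        metric = measure()
        if higher_is_better:
            improved = metric > state.best
        else:
            improved = metric < state.best
        if improved:
            state.best = metric
            status = "keep"
        else:
            revert()
            status = "revert"
        state.log.append((i, metric, status))
    return state


if __name__ == "__main__":
    # Toy stand-in for a repository: the "code" is a list of coverage values.
    history = [72.0]
    effects = {1: 3.0, 2: -1.5, 3: 2.0}  # made-up effect of each change

    result = run_loop(
        measure=lambda: history[-1],
        apply_change=lambda i: history.append(history[-1] + effects[i]),
        revert=lambda: history.pop(),
        iterations=3,
    )
    for row in result.log:
        print(*row, sep="\t")
```

In a real setup the callbacks would wrap git commands and the Verify command; they are injected here so the control flow can be read and tested without a repository.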

8 critical rules ensure relentless progress, including:

  • One change per iteration (atomic)
  • Mechanical verification only (no subjectivity)
  • Auto-rollback failures
  • Git as memory
  • Simplicity wins (less code = better)

10 Battle-Tested Commands

Command                   Use Case
/autoresearch             Core optimization loop
/autoresearch:plan        Goal → config wizard
/autoresearch:security    Autonomous security audit
/autoresearch:ship        Ship PRs/deployments/content
/autoresearch:debug       Hunt ALL bugs scientifically
/autoresearch:fix         Crush errors until zero remain
/autoresearch:scenario    Explore 12 dimensions of edge cases
/autoresearch:predict     5-expert swarm analysis
/autoresearch:learn       Autonomous docs engine
/autoresearch:reason      Adversarial refinement (v1.9.0)

Install in 30 Seconds

Plugin (Recommended):

/plugin marketplace add uditgoenka/autoresearch
/plugin install autoresearch@autoresearch

First Run:

/autoresearch
Goal: Increase test coverage from 72% to 90%
Scope: src/**/*.test.ts
Verify: npm test -- --coverage | grep "All files"

Walk away. Claude iterates autonomously. Every improvement stacks.
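
For the loop to compare iterations, the Verify command has to boil down to a single number. The sketch below shows one way to pull that number out of the "All files" line. It assumes the pipe-separated summary row that Jest's default text coverage reporter prints (statements, branches, functions, lines); `parse_coverage` is a hypothetical helper, not part of the skill.

```python
import re
from typing import Optional


def parse_coverage(line: str, column: int = 0) -> Optional[float]:
    """Return one percentage from an 'All files' coverage summary line.

    column 0 is the first numeric cell (statements in the assumed layout).
    Returns None if the line is not a summary line or the column is missing.
    """
    if not line.strip().startswith("All files"):
        return None
    cells = [cell.strip() for cell in line.split("|")[1:]]
    numbers = [float(c) for c in cells if re.fullmatch(r"\d+(\.\d+)?", c)]
    if column >= len(numbers):
        return None
    return numbers[column]


if __name__ == "__main__":
    sample = "All files      |   72.41 |      65 |   70.12 |   72.41 |          "
    print(parse_coverage(sample))      # 72.41
    print(parse_coverage(sample, 1))   # 65.0
```

Returning None rather than raising makes a malformed verification run easy to treat as a failed iteration.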

Real-World Power: Command Chains

# Full quality pipeline
/autoresearch:reason --chain predict,scenario,debug,fix

# Security → Ship
/autoresearch:security --fix --chain ship

# Docs after changes
/autoresearch:learn --mode update

Guard: Regression Protection

Goal: Reduce API p95 response time to <100ms
Verify: npm run bench:api | grep "p95"
Guard: npm test  # Safety net

Metrics improve + tests pass = keep. Anything breaks = rework.
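
The decision rule combining Verify and Guard can be written as a small function. This is a sketch under assumptions: `decide` and `guard_passes` are hypothetical names, a guard failure maps to "rework" as stated above, an unimproved metric maps to "revert" as in the core loop, and a tie counts as not improved.

```python
import subprocess
from typing import List


def guard_passes(command: List[str]) -> bool:
    """Run the guard command and treat exit code 0 as a pass."""
    return subprocess.run(command, capture_output=True).returncode == 0


def decide(previous: float, current: float, guard_passed: bool,
           lower_is_better: bool = True) -> str:
    """Return 'keep', 'rework', or 'revert' for one iteration."""
    if lower_is_better:
        improved = current < previous
    else:
        improved = current > previous
    if not improved:
        return "revert"   # metric did not move in the right direction
    if not guard_passed:
        return "rework"   # metric improved but the safety net broke
    return "keep"


if __name__ == "__main__":
    # p95 latency in milliseconds, lower is better.
    print(decide(140.0, 95.0, guard_passed=True))    # keep
    print(decide(140.0, 95.0, guard_passed=False))   # rework
    print(decide(140.0, 150.0, guard_passed=True))   # revert
    # With a real project the flag would come from something like
    # guard_passes(["npm", "test"]).
```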

Why It Works (The Science)

  1. Atomic changes → Clear cause/effect
  2. Git memory → Learns from every experiment
  3. Mechanical gates → No human bias
  4. Unbounded iteration → Compounding gains
  5. Auto-rollback → Never worse than start

Domains Beyond Code

  • Marketing: CTR → 3x, conversion rate ↑
  • Sales: Email open rates, reply rates
  • Content: Engagement scores, readability
  • HR: Policy compliance metrics
  • DevOps: Deployment success rate

TSV Results Tracking:

iteration	commit	metric	delta	status
0	a1b2c3d	85.2	0.0	baseline
1	b2c3d4e	87.1	+1.9	keep
3	c3d4e5f	88.3	+1.2	keep
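
Because the log is plain TSV, it is easy to inspect with a short script. The sketch below loads the rows shown above and checks that each kept row's delta matches the difference from the previously kept metric. `load_results` and `deltas_consistent` are hypothetical helpers, and the column meanings are inferred from the header row.

```python
import csv
import io
from typing import Dict, List

RESULTS_TSV = (
    "iteration\tcommit\tmetric\tdelta\tstatus\n"
    "0\ta1b2c3d\t85.2\t0.0\tbaseline\n"
    "1\tb2c3d4e\t87.1\t+1.9\tkeep\n"
    "3\tc3d4e5f\t88.3\t+1.2\tkeep\n"
)


def load_results(text: str) -> List[Dict[str, str]]:
    """Parse the TSV log into one dict per row, keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def deltas_consistent(rows: List[Dict[str, str]], tolerance: float = 0.05) -> bool:
    """Check each kept row's delta against the last kept (or baseline) metric."""
    previous = None
    for row in rows:
        metric = float(row["metric"])
        if previous is not None and row["status"] == "keep":
            if abs((metric - previous) - float(row["delta"])) > tolerance:
                return False
        if row["status"] in ("baseline", "keep"):
            previous = metric
    return True


if __name__ == "__main__":
    rows = load_results(RESULTS_TSV)
    total_gain = float(rows[-1]["metric"]) - float(rows[0]["metric"])
    print(deltas_consistent(rows))   # True
    print(round(total_gain, 1))      # 3.1
```

The tolerance absorbs floating-point noise and the one-decimal rounding used in the log.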

🚀 Get Started Today

Install Claude Autoresearch and experience autonomous improvement. No AGI needed—just goals, metrics, and loops that never quit.

Creators: Udit Goenka (AI Product Expert) + contributors. MIT License. 127 commits. v1.9.0 released Apr 2026.