Andrej Karpathy on AI Agents and Future Work

AgentSunrise

TL;DR

Andrej Karpathy, one of the leading AI researchers, believes that software development changed for good in December 2024. He has stopped writing code by hand; agents do it instead. In an interview with Sarah Guo (Conviction), he talked about AutoResearch, “claws” (persistent agents), the future of professions, and why FLOPs matter more than money.


Table of contents

  1. The turning point: December 2024
  2. What Karpathy means by “AI psychosis”
  3. AutoResearch: an agent that improves itself
  4. Claws — a new generation of agents
  5. Model specialization and modular AI
  6. Impact on professions and the labor market
  7. Education in the age of agents: MicroGPT
  8. FAQ

1. The turning point: December 2024

In his conversation with Sarah Guo, Karpathy described the turning point precisely: December 2024. Before that, the ratio of “I code myself / I delegate to an agent” was 80/20. Afterward, it flipped and continues to change. According to him, since December he has probably not written a single line of code manually.

“I don’t think an ordinary person realizes how much this happened and how dramatic it was. Literally, if you walked up to a random developer at their desk — their workflow has completely changed starting around December.”

This is not just a change of tool. According to Karpathy, it is a change of the verb itself: before, the developer “wrote code,” now they “express their will to agents.”


2. What Karpathy means by “AI psychosis”

Karpathy coined the term “AI psychosis” — a state of permanent excitement and anxiety caused by realizing the endless possibilities of agents. Its main feature: you yourself become the bottleneck of the system.

Key symptoms of AI psychosis:

  • Anxiety when agents are idle. If you have an unused subscription or tokens, you are wasting them. Karpathy compares this to the feeling of a graduate student with idle GPUs.
  • Parallel tasks. The right strategy, as practiced by Peter Steinberger, is to have several agents working simultaneously on independent tasks, each taking ~20 minutes while you switch between them.
  • Macro-actions. The unit of work is no longer a line of code or a function — it is an entire feature delegated to an agent.

“Token throughput is what matters now. Before, you felt constrained by access to compute. Now compute is available — you are the constraint.”
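
The parallel-tasks idea can be sketched in a few lines. This is only an illustration under stated assumptions: `run_agent` is a hypothetical stand-in for a real agent session (it just sleeps), and the task names and durations are made up. The point it shows is that with independent tasks, total wall-clock time is set by the slowest agent rather than by the sum of all of them.

```python
import asyncio


async def run_agent(task: str, minutes: float) -> str:
    """Hypothetical stand-in for one agent session working on one feature."""
    # Scaled down for the demo: one simulated "minute" takes 1 ms of real time.
    await asyncio.sleep(minutes * 0.001)
    return f"done: {task}"


async def run_in_parallel(tasks: dict[str, float]) -> list[str]:
    """Start every agent at once and collect results in submission order."""
    jobs = [run_agent(name, minutes) for name, minutes in tasks.items()]
    return await asyncio.gather(*jobs)


def sequential_minutes(tasks: dict[str, float]) -> float:
    """Time spent if you wait for each agent before starting the next."""
    return sum(tasks.values())


def parallel_minutes(tasks: dict[str, float]) -> float:
    """Time spent if all agents run at once: the slowest one dominates."""
    return max(tasks.values())


# Made-up example workload: three independent features.
tasks = {"add login page": 20, "fix flaky test": 15, "write migration": 25}
results = asyncio.run(run_in_parallel(tasks))
print(results)
print(sequential_minutes(tasks), "min sequential vs", parallel_minutes(tasks), "min parallel")
```

In this toy setup the human's job is to keep the `tasks` dictionary full, which is exactly the bottleneck the quote describes.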

3. AutoResearch: an agent that improves itself

AutoResearch — Karpathy’s key project in recent months. The idea is simple: a researcher should not be in the experiment loop. They get in the way.

How AutoResearch works

  1. A goal is set (for example, reducing a model’s validation loss)
  2. A metric is set (objective, automatically verifiable)
  3. The boundaries of permissible actions are defined
  4. The agent is launched and works without human involvement
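
The four steps above can be sketched as a small loop. This is a minimal sketch, not Karpathy's implementation: the "agent" here is a random perturbation (`propose`), the metric is a made-up toy function (`toy_validation_loss`), and all names are hypothetical. What it does show is the shape of the loop: an objective metric, explicit bounds, and no human between iterations.

```python
import random
from dataclasses import dataclass
from typing import Callable

Config = dict[str, float]


@dataclass
class Experiment:
    metric: Callable[[Config], float]        # step 2: objective metric, lower is better
    bounds: dict[str, tuple[float, float]]   # step 3: permitted range per parameter


def propose(config: Config, bounds: dict[str, tuple[float, float]],
            rng: random.Random) -> Config:
    """Stand-in for the agent: nudge one parameter, staying inside the bounds."""
    name = rng.choice(sorted(bounds))
    low, high = bounds[name]
    step = rng.gauss(0.0, 0.1 * (high - low))
    candidate = dict(config)
    candidate[name] = min(high, max(low, config[name] + step))
    return candidate


def auto_research(exp: Experiment, start: Config, iterations: int,
                  seed: int = 0) -> tuple[Config, float]:
    """Step 4: run unattended, keeping a change only if the metric improves."""
    rng = random.Random(seed)
    best, best_score = dict(start), exp.metric(start)
    for _ in range(iterations):
        candidate = propose(best, exp.bounds, rng)
        score = exp.metric(candidate)
        if score < best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_validation_loss(config: Config) -> float:
    """Made-up bowl-shaped loss with its minimum at lr=0.3, wd=0.1."""
    return (config["lr"] - 0.3) ** 2 + (config["wd"] - 0.1) ** 2


# Step 1: the goal is to reduce the (toy) validation loss.
experiment = Experiment(metric=toy_validation_loss,
                        bounds={"lr": (0.0, 1.0), "wd": (0.0, 1.0)})
start = {"lr": 0.9, "wd": 0.9}
best, best_score = auto_research(experiment, start, iterations=200)
print(best, best_score)
```

Because a candidate is kept only when the metric improves, the loop can never end up worse than where it started, which is what makes it safe to leave running overnight.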

Karpathy tested this on his project Nanochat — a compact implementation of GPT. He spent years manually tuning hyperparameters and thought the model was “well enough tuned.” Overnight, AutoResearch found improvements he had missed:

  • Forgotten weight decay on value embeddings
  • Adam optimizer beta parameters that were insufficiently tuned

“I didn’t expect it to work, because the repo was already pretty well tuned — and yet it found something. And that was only one loop iteration.”
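
To make the two findings concrete, here is a sketch of a single AdamW update for one scalar weight, written in plain Python. It is the standard AdamW rule with decoupled weight decay, not code from Karpathy's repository; the function name and default values are assumptions chosen for illustration. It shows where the beta parameters and the weight decay enter, and why a parameter group left at `weight_decay=0.0` is never shrunk.

```python
import math


def adamw_step(w: float, g: float, m: float, v: float, t: int,
               lr: float = 1e-3, betas: tuple[float, float] = (0.9, 0.999),
               eps: float = 1e-8, weight_decay: float = 0.0
               ) -> tuple[float, float, float]:
    """One AdamW update for a scalar weight w with gradient g at step t >= 1."""
    beta1, beta2 = betas
    m = beta1 * m + (1 - beta1) * g          # running mean of gradients
    v = beta2 * v + (1 - beta2) * g * g      # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay shrinks w directly, independent of the gradient.
    w = w - lr * weight_decay * w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


# With a zero gradient, only weight decay moves the weight.
decayed, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.1)
forgotten, _, _ = adamw_step(1.0, 0.0, 0.0, 0.0, t=1, lr=0.1, weight_decay=0.0)
print(decayed, forgotten)
```

The beta values control how quickly the two running averages forget old gradients, which is why they are worth searching over rather than leaving at defaults.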

Why this matters for frontier labs

Karpathy is convinced that the best strategy for OpenAI, Anthropic, and DeepMind is to remove researchers from the loop.

  • Experiments are run autonomously on small models
  • The results are extrapolated to larger ones
  • Researchers contribute ideas to a common queue — but do not manage execution
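
One common way to extrapolate from small models to larger ones is to fit a power law to the small-scale results. The sketch below assumes that approach; the interview summary does not say which method the labs use. It fits loss = a * size**b by least squares in log-log space, and the data points are synthetic. Real scaling fits often add an irreducible-loss term, which this sketch ignores.

```python
import math


def fit_power_law(sizes: list[float], losses: list[float]) -> tuple[float, float]:
    """Least-squares fit of loss = a * size**b, done as a line in log-log space."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope = exponent of the power law
    a = math.exp(mean_y - b * mean_x)  # intercept = log of the prefactor
    return a, b


def predict_loss(a: float, b: float, size: float) -> float:
    """Extrapolate the fitted curve to a model size that was never trained."""
    return a * size ** b


# Synthetic small-model results generated from loss = 10 * size**-0.1.
small_sizes = [1e6, 1e7, 1e8]
small_losses = [10 * s ** -0.1 for s in small_sizes]
a, b = fit_power_law(small_sizes, small_losses)
print(a, b, predict_loss(a, b, 1e9))
```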

In essence, this is a reinvention of the scientific process: instead of a scientist-operator, there is a scientist-systems architect.

The next level: AutoResearch at Home

Karpathy described the concept of distributed AutoResearch — an analog of Folding@Home, but for improving language models:

  • A pool of untrusted workers on the internet
  • Untrusted workers propose commits (code changes)
  • Trusted infrastructure verifies the proposal — this is cheap, although finding a good solution itself is expensive
  • Participants are motivated by a place on the leaderboard, and later there may be contributions to specific research areas (cancer, climate, etc.)

The structure resembles a blockchain: blocks = commits, proof of work = experiments, reward = reputation on the leaderboard.
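
The asymmetry between expensive search and cheap verification can be sketched as follows. Everything here is hypothetical: the class names, the idea of representing a commit as a single number, and the toy `evaluate` function. The one property the sketch is meant to show is that the trusted side never relies on what a worker claims; it re-runs the evaluation itself and credits the worker only if the result really improves on the current best.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Proposal:
    worker: str
    change: float         # stand-in for a commit (a real one would be a code diff)
    claimed_score: float  # what the worker reports; never trusted


@dataclass
class TrustedVerifier:
    evaluate: Callable[[float], float]  # cheap, reproducible check; lower is better
    best_score: float
    leaderboard: dict[str, int] = field(default_factory=dict)

    def submit(self, proposal: Proposal) -> bool:
        """Re-evaluate the proposal and accept it only if it truly improves."""
        actual = self.evaluate(proposal.change)  # the claimed score is ignored
        if actual >= self.best_score:
            return False
        self.best_score = actual
        self.leaderboard[proposal.worker] = self.leaderboard.get(proposal.worker, 0) + 1
        return True


# Toy objective with its minimum at change = 3.
verifier = TrustedVerifier(evaluate=lambda x: (x - 3) ** 2, best_score=9.0)
honest = verifier.submit(Proposal("alice", change=2.0, claimed_score=1.0))
cheater = verifier.submit(Proposal("mallory", change=10.0, claimed_score=0.0))
print(honest, cheater, verifier.leaderboard)
```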


4. Claws — a new generation of agents

Karpathy distinguishes between agents and claws. An agent is a session that works while you are in it. A claw is something fundamentally different:

Agent                            | Claw
Works within a single session    | Runs continuously in the background
Requires your involvement        | Acts even while you sleep
Simple memory                    | Complex memory systems
One repository                   | Manages multiple systems

Karpathy gave a personal example: Dobby the Elf Claw — his home agent. In three prompts, it:

  1. Scanned the local network
  2. Found a Sonos system and figured out its API
  3. Turned on music in the study

Right now Dobby controls the lights, climate control, curtains, the pool, and the security system. When motion appears on the outdoor camera, Dobby analyzes the video with a vision-language model (VLM) and sends a WhatsApp message: “FedEx has arrived, a package may have been delivered.”
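
The camera example follows a simple event pipeline: motion event, video description, notification. The sketch below shows only that shape. It is not how Dobby is built: `describe_clip` and `send_message` are hypothetical callables standing in for a real vision-language model and a real messaging API, and the `memory` list is a placeholder for the more complex memory a claw would keep.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Claw:
    describe_clip: Callable[[str], str]  # hypothetical VLM call: clip path -> description
    send_message: Callable[[str], None]  # hypothetical messenger call
    memory: list[str] = field(default_factory=list)  # persists across events

    def on_motion(self, clip_path: str) -> str:
        """Handle one motion event without any human in the loop."""
        description = self.describe_clip(clip_path)
        self.memory.append(description)  # remember what happened
        self.send_message(description)   # tell the owner
        return description


# Stubs so the sketch runs without a camera, a model, or a messenger.
outbox: list[str] = []
dobby = Claw(
    describe_clip=lambda path: f"Motion in {path}: a delivery van has arrived.",
    send_message=outbox.append,
)
dobby.on_motion("front_door.mp4")
print(outbox)
```

Swapping the stubs for real integrations would not change the handler, which is the design point: the claw owns the loop and the memory, and the tools are pluggable.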

“I used to have six different smart home apps. Now I don’t use a single one of them. Dobby controls everything in natural language.”

5. Model specialization and modular AI

Karpathy expects speciation in AI — by analogy with the animal kingdom, where different species have evolved different cognitive abilities for their own niches.

Right now, labs are building monocultures: one big model, universal for everything. But this leads to jaggedness — a phenomenon where a model solves the most difficult engineering tasks in hours, yet tells the same bad joke for five years straight.

“I’m simultaneously talking to an incredibly brilliant PhD-level systems programmer and a ten-year-old child. It’s very strange.”

Why doesn’t the joke change? Because jokes are not verifiable — they can’t be improved through reinforcement learning. Models are optimized only where there is a clear metric.

The solution is specialized models:

  • Small, but deeply trained on a specific domain
  • More efficient in latency and throughput
  • Examples already exist: models for proving theorems in the Lean system

6. Impact on professions and the labor market

Karpathy studied data from the U.S. Bureau of Labor Statistics and published an analysis. It is simple in form, a visualization of public data, but it has already sparked broad discussion.

His key observations:

Three categories of professions:

  1. Full automation. Routine cognitive tasks with clear metrics carry the highest risk in the coming years.
  2. Instrumental amplification. Most current professions will not disappear, but they will change: specialist + agent = a new unit of productivity. Historically, automation has created new professions.
  3. New roles. “Agent orchestrator,” “AutoResearch architect,” “ProgramMD curator”: positions that did not exist five years ago.

What is important to understand right now:

Karpathy does not believe that “everything will soon be automated.” He says that the skill of managing agents is the new literacy. Those who master it now will gain a huge advantage. Those who wait risk falling behind.

Analogy: in the 1990s, being able to use email seemed like an “IT” skill. Now it is basic literacy. Managing agents is moving along the same path — quickly.


7. Education in the age of agents: MicroGPT

Karpathy has long promoted the idea of learning by building. His Neural Networks: Zero to Hero course has gathered millions of views precisely because it explains neural networks from the ground up, through code.

Now he is thinking about the next step: MicroGPT — a personal mentor-agent that:

  • Works like an adaptive textbook
  • Adapts to the pace and level of a specific student
  • Serves as an example of what it teaches

“An agent that teaches you how to use agents. A meta-level.”

Karpathy believes education is one of the areas where agents will create asymmetrically large value. A one-on-one tutor at expert level used to be available only to a few. Soon — to everyone.


8. FAQ

What is AutoResearch according to Karpathy?

AutoResearch is a system in which an agent autonomously runs ML experiments: it tunes hyperparameters, trains models, evaluates results, and improves code — without human involvement. The human sets the goal, metric, and constraints, and then steps out of the loop.

How is a claw different from a normal AI agent?

A normal agent works within a single session and requires your involvement. A claw is a continuously running system with its own memory that acts on your behalf even when you are not at the computer. The analogy is a personal AI employee, not a tool.

Is it true that developers will soon no longer be needed?

According to Karpathy — no. The role is changing: from writing code to managing agents, setting tasks, and verifying results. The new profession — “agent orchestrator” — requires deep systems understanding, but not manual coding.

What is the “jaggedness” of AI models?

A phenomenon in which a model simultaneously outperforms humans on complex tasks (coding, mathematics) and remains at a 2020 level on simple ones (jokes, communication nuances). The reason is that RL training improves only verifiable tasks.

What is model speciation?

By analogy with biological evolution — a move from one universal model to an ecosystem of specialized ones: models for mathematics, for code, for medicine. Each is more deeply optimized in its niche and more resource-efficient.

Which skills matter in the age of AI agents?

Karpathy highlights four: the ability to give agents clear tasks; an understanding of what is verifiable and what is not; the skill of managing multiple agents in parallel; and the ability to build systems with objective metrics. Everything else is a skill issue that practice can solve.


Conclusion

Andrej Karpathy’s interview with Sarah Guo is neither futurology nor hype. It is an honest account from someone who is at the cutting edge and describing what is happening right now.

Key takeaways:

  • December 2024 = a turning point. Manual coding is becoming a thing of the past for those who have mastered agents.
  • The bottleneck is you. Not models, not tokens, not compute. Your ability to manage agents is the only real limit.
  • AutoResearch = the future of science. Removing the researcher from the experiment loop is the next big step.
  • Specialization will beat monoculture. One universal model will give way to an ecosystem of specialized ones.
  • FLOPs = the new money. Control over computing resources is becoming a key asset.

We are at the beginning of the AI “loop era” — an era where value is created not in a single session, but in long autonomous cycles. Those who learn to build these loops today will shape the direction of tomorrow.


Source: Sarah Guo's interview with Andrej Karpathy, No Priors Podcast, 2025. Original video on YouTube.

© 2025 AgentSunrise