Andrej Karpathy: Software Is Changing (Again)

Talk Summary

Andrej Karpathy asserts that software is undergoing fundamental change: after remaining largely consistent for some 70 years, it has now entered its third paradigm in quick succession.

He categorises this evolution into three paradigms: 

  • Software 1.0: Traditional human-written code (e.g., C++, Python).
  • Software 2.0: Neural networks, where the "program" is the network's weights, tuned via data and optimisers (e.g., the AlexNet image recogniser; Hugging Face serves as its GitHub equivalent).
  • Software 3.0: Large Language Models (LLMs) programmed by natural language prompts, effectively making English a programming language.
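
To make the contrast concrete, here is a minimal illustrative sketch (not from the talk): the same task written as Software 1.0 (explicit rules in Python) and as Software 3.0 (an English prompt, with the LLM acting as the interpreter). The word lists and prompt text are invented for illustration.

```python
# Software 1.0: sentiment classification via hand-written rules (toy example).
def sentiment_v1(text: str) -> str:
    positives = {"good", "great", "love"}
    negatives = {"bad", "awful", "hate"}
    words = set(text.lower().split())
    score = len(words & positives) - len(words & negatives)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

# Software 3.0: the "program" is a natural-language prompt; an LLM executes it.
sentiment_v3_prompt = (
    "Classify the sentiment of the following text as positive, "
    "negative, or neutral. Reply with one word.\n\nText: {text}"
)

print(sentiment_v1("I love this talk"))  # positive
```

In the 1.0 version, improving behaviour means editing code; in the 3.0 version, it means editing English, which is Karpathy's point that natural language has become a programming interface.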

Karpathy illustrates this transition with the Tesla Autopilot, where Software 2.0 (neural networks) progressively replaced Software 1.0 (C++ code) by absorbing functionalities like image stitching across cameras.

He posits that LLMs behave like new kinds of computers and exhibit properties akin to utilities, fabs, and especially operating systems. As utilities, LLMs involve significant capital expenditure (capex) for training (like building a grid) and operational expenditure (opex) for serving intelligence via APIs with metered access. 

The centralisation of state-of-the-art LLMs means their outages can lead to a global "intelligence brownout". As operating systems, LLMs are complex software ecosystems, comparable to Windows, macOS, or Linux, where LLM apps run on different underlying models (e.g., Cursor on GPT, Claude, or Gemini). 

The current state of LLM computing is likened to the 1960s era of operating systems, where compute is expensive, centralised in the cloud, and accessed by "thin clients" via time-sharing, suggesting the "personal computing revolution" for LLMs has yet to fully emerge.

A unique aspect of LLMs is their "flipped" technology diffusion, where consumers are early adopters, using them for everyday tasks like boiling an egg, while governments and corporations lag in adoption.

Karpathy describes LLMs as "stochastic simulations of people" with an emergent human-like psychology. They possess superhuman encyclopaedic knowledge and memory, akin to a savant, but also significant cognitive deficits such as hallucination, "jagged intelligence" (superhuman in some areas, basic errors in others), and "anterograde amnesia" (lack of native long-term learning or context retention across sessions). These limitations necessitate careful programming to leverage their strengths while mitigating their weaknesses.

Andrej then explores opportunities in "partial autonomy apps", which integrate LLM capabilities into traditional interfaces. Key features of these apps include LLMs managing context, orchestrating multiple LLM calls, providing application-specific Graphical User Interfaces (GUIs) for human auditing, and offering an "autonomy slider" that allows users to control the level of AI intervention. The goal is to make the "generation-verification loop" between AI and humans as fast as possible, with GUIs being crucial for efficient visual auditing. 
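
The "autonomy slider" idea can be sketched in a few lines. Everything below is a hypothetical illustration, not code from the talk: `call_llm` is a stand-in for any model API, and the diff-size threshold is an invented policy showing how a slider might decide what gets auto-applied versus routed to a human for verification.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    diff_size: int   # e.g., number of lines the model wants to change
    content: str

def call_llm(prompt: str) -> Suggestion:
    # Stand-in: a real app would orchestrate one or more model calls here.
    return Suggestion(diff_size=3, content=f"edit for: {prompt}")

def partial_autonomy_step(prompt: str, autonomy: float) -> str:
    """autonomy in [0, 1]: 0 = suggest only, 1 = apply almost anything."""
    suggestion = call_llm(prompt)
    max_auto_apply = int(autonomy * 50)  # higher slider -> larger diffs auto-applied
    if suggestion.diff_size <= max_auto_apply:
        return "applied"          # small change: apply it, keep the loop fast
    return "needs_human_review"   # large change: send to the GUI for visual audit

print(partial_autonomy_step("rename variable", autonomy=0.5))  # applied
print(partial_autonomy_step("rename variable", autonomy=0.0))  # needs_human_review
```

The design choice mirrors the talk's advice: small, verifiable increments flow through automatically, while anything large lands in an application-specific GUI where a human can audit it quickly.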

The speaker stresses the need to "keep the AI on the leash," advocating for incremental changes over large, unmanageable outputs and precise prompting to ensure successful verification. He draws parallels to self-driving technology, warning against over-optimism about fully autonomous agents and emphasising that autonomy takes time and requires humans in the loop. The "Iron Man suit" analogy is used to advocate for building "partial autonomy products" (augmentations) over purely autonomous "Iron Man robots".

Finally, Karpathy highlights that natural language programming makes "everyone a programmer" through "vibe coding," which he sees as a "gateway drug" to software development. However, he notes that while AI-assisted "vibe coding" is becoming easier, the "devops" aspects like authentication, payments, and deployment remain challenging and time-consuming, often requiring manual "clicking stuff" in browsers. This leads to the call to "build for agents": making digital infrastructure more agent-friendly, for example with llms.txt files that give LLMs a readable summary of a domain, documentation written in markdown, and "click" instructions replaced with curl commands.
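
As an illustration of the llms.txt convention Karpathy references, a site might expose a short markdown summary at its root. The file below is entirely invented (hypothetical service, paths, and URLs), just to show the shape of the format:

```markdown
# Example Service

> One-paragraph, LLM-readable summary of what this domain offers and who it is for.

## Docs

- [API reference](https://example.com/docs/api.md): endpoints, parameters, auth
- [Quickstart](https://example.com/docs/quickstart.md): setup in five steps
```

In the same spirit, documentation aimed at agents would replace a human instruction like "click Settings, then API Keys" with an equivalent command an agent can run directly, e.g. `curl -H "Authorization: Bearer $TOKEN" https://example.com/api/keys` (hypothetical endpoint).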

He suggests tools that convert human-friendly content (like GitHub repos) into LLM-friendly formats (e.g., Gitingest, DeepWiki).

In conclusion, he encourages those entering the industry to embrace this unique time to rewrite and adapt software for the new AI era, focusing on partial autonomy and agent-friendly infrastructure.
