Blaise Agüera y Arcas of Google Research examines intelligence as a self-organising principle linking life, mind, and machine. These excerpts sketch his central claim: intelligence is predictive, and social – a network of minds, biological, cultural, and artificial.

Agüera y Arcas, Blaise. What Is Intelligence. San Francisco: Antikythera, 2025. https://whatisintelligence.antikythera.org. Reproduced with the permission of the author (citation style retained from the original).

Preface

AI is probably the biggest story of our lifetimes. Its rapid development, starting in the early 2020s, has generated a mixed outpouring of excitement, anxiety, and denial.

Everyone wants to weigh in. Even authors writing about history, language, biology, sociology, economics, the arts, and psychology feel the need to add a chapter about AI to the end of their books. Indeed, AI will surely affect all of these far-flung domains, and many more – though exactly how, nobody can say with any confidence.

This book is a bit different. It’s less about the future of AI than it is about what we have learned in the present. At least as of this writing, in January 2025, few mainstream authors claim that AI is ‘real’ intelligence.1 I do. Gemini, Claude, and ChatGPT aren’t powered by the same machinery as the human brain, of course, but intelligence is ‘multiply realisable’, meaning that, just as a computer algorithm can run on any kind of computer, intelligence can ‘run’ on many physical substrates. In fact, although our brains are not like the kinds of digital computers we have today, I think the substrate for intelligence is computation, which implies that a sufficiently powerful general-purpose computer can, by definition, implement intelligence. All it takes is the right code.

Thanks to recent conceptual breakthroughs in AI development, I believe we now know, at least at a high level, what that code does. We understand the essence of an incredibly powerful trick, although we’re still in the early days of making it work. Our implementations are neither complete, nor reliable, nor efficient – a bit like where we were with general computing when the ENIAC first powered up, in 1945, or where we were with aviation when the Wright brothers made their first powered flight, in 1903.

It’s an old cliché in AI that airplanes and birds can both fly, but do so by different means.2 This truism has at times been used to motivate misguided approaches to AI. Still, the point stands. Bird flight is a biological marvel that in some respects remains poorly understood, even today. However, we figured out the basic physics of flight – not how animals fly, but how it’s possible for them to do so – in the eighteenth century, with the discovery of Bernoulli’s equation. Working airplanes took another century and a half to evolve.

Similarly, while there is still a great deal about the brain that we don’t understand, the idea that prediction is the fundamental principle behind intelligence was first put forward by German polymath Hermann von Helmholtz in the 1860s.3 Many neuroscientists and AI researchers have elaborated on Helmholtz’s insight since then, and built models implementing aspects of the idea,4 but only recently has it become plausible to imagine that prediction really is the whole story. I will argue that, understood in full and interpreted broadly, the prediction principle may explain not only intelligence, but life itself.5

My definitions of life and intelligence are different from others in the literature. On the other hand, definitions in the literature also differ dramatically from each other. The definitions I propose are not designed to litigate particular instances (like the aliveness of viruses, or the intelligence of chatbots) but rather follow from the picture I’ll present of how and why life and intelligence arise.

I define life as a self-modifying, computational state of matter that arises through evolutionary selection for the ability to persist through time by actively constructing itself, whether by growing, healing, or reproducing. Yes, everything alive is a computer! Chapter 1 will explain why self-construction requires computation, and why, even more fundamentally, the concept of function or ‘purpose’, which is central to life, is inherently computational.

Intelligence, in turn, is the ability to model, predict, and influence one’s future. Modelling and prediction are computational too, so they can only take place on a computational platform; thus intelligence needs life. Likewise, life needs intelligence, because the ability to persist through time depends on predicting and influencing the future – in particular, on ensuring that the entity doing the predicting will continue to exist in the future. Hence all living organisms are, to one degree or another, intelligent. Every one of them is smart enough to have persisted through time in a complex, ever-changing environment.

That environment is each other: life and intelligence are inherently social. Every living thing is made of simpler cooperating parts, many of which are themselves alive. And every intelligence evolves in relation to other intelligences, potentially cooperating to create larger, collective intelligences. Cooperation requires modelling, predicting, and influencing the behaviour of the entities you’re cooperating with, so intelligence is both the glue that enables life to become complex and the outcome of that increasing complexity.

The feedback loop evident here explains the ubiquity of ‘intelligence explosions’, past and present. These include, among many others, the sudden diversification of complex animal life during the ‘Cambrian explosion’ 538.8 million years ago, the rapid growth of hominin brains starting about four million years ago, the rise of urban civilisation in the past several thousand years, and the exponential curve of AI today.

That predictive modelling is intelligence explains why recent large AI models really are intelligent; it’s not delusional or ‘anthropomorphic’ to say so. This doesn’t mean that AI models are necessarily human-like (or subhuman, or superhuman). In fact, understanding the curiously self-referential nature of prediction will let us see that intelligence is not really a ‘thing’. It can’t exist in isolation, either in the three pounds of neural tissue in our heads or in the racks of computer chips running large models in data centres.

Intelligence is defined by networks, and by networks of networks. We can only understand what intelligence really is by changing how we think about it – by adopting a perspective that centres dynamic, mutually beneficial relationships rather than isolated minds. The same is true of life.

Developing these ideas will require weaving together insights from many disciplines. As we go along, I’ll introduce concepts in probability, machine learning, physics, chemistry, biology, computer science, and other fields. When they’ve been most relevant to shaping our (sometimes mistaken) beliefs, I’ll also briefly review the intellectual histories of key ideas, from seventeenth-century ‘mechanical philosophy’ to debates about the origin of life, and from cybernetics to neuroscience.

You, dear reader, may be an expert in one or more of these fields, or in none. Few people are expert in all of them (I’m certainly not), so no specialised prior knowledge is assumed. On the other hand, even if you’re an AI researcher or neuroscientist with little patience for pop science, I hope you will find new and surprising ideas in this book. A general grasp of mathematical concepts like functions and variables will be helpful (with bonus points for knowing about vectors and matrices), but there will be no equations. (Well ... almost none.) A general understanding of how computer programming works will be useful in a few places, but isn’t required. If you find fundamental questions about the nature of life and intelligence interesting enough to still be reading, rest assured: you are my audience.

[excerpted from the Preface]

Introduction

The perspective I’ll offer is not easily reduced to a philosophical ‘-ism’. The footsteps I’m closest to following, though, are those of Alan Turing and his equally brilliant contemporary, John von Neumann, both of whom could be described as proponents of ‘functionalism’.6 They had a healthy disregard for disciplinary boundaries, and understood the inherently functional character of living and intelligent systems. They were also both formidable mathematicians who made major contributions to our understanding of what functions are.

Functions define relationships, rather than insisting on particular mechanisms. A function is what it does. Two functions are equivalent if their outputs are indistinguishable, given the same inputs. Complex functions can be composed of simpler functions.7
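
To make these three points concrete, here is a minimal sketch in Python (my own illustration, not drawn from the book): two different mechanisms realising the same function, judged equivalent purely by their input/output behaviour, and then composed into a more complex function.

    # Two mechanisms, one function: each maps x to 2x.
    def double_by_addition(x: int) -> int:
        return x + x

    def double_by_shifting(x: int) -> int:
        return x << 1  # a different mechanism, same input/output relationship

    # Functional equivalence: indistinguishable outputs for the same inputs.
    assert all(double_by_addition(n) == double_by_shifting(n) for n in range(-1000, 1000))

    # Composition: a more complex function built from simpler ones.
    def quadruple(x: int) -> int:
        return double_by_addition(double_by_shifting(x))

    assert quadruple(3) == 12

Nothing about the relationship ‘maps x to 2x’ privileges either mechanism; that is the sense in which a function is multiply realisable.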

The functional perspective is mathematical, computational, and empirically testable – hence, the Turing Test.8 It’s not ‘reductive’. It embraces complexity and emergent phenomena. It doesn’t treat people like ‘black boxes’, nor does it deny our internal representations of the world or our felt experiences. But it stipulates that we can understand those experiences in terms of functional relationships within the brain and body – we don’t need to invoke a soul, spirit, or any other supernatural agency. Computational neuroscience and AI, fields Turing and von Neumann pioneered, are both predicated on this functional approach.

It’s unsurprising, in this light, that Turing and von Neumann also made groundbreaking contributions to theoretical biology, although these are less widely recognised today. Like intelligence, life and aliveness are concepts that have long been associated with immaterial souls. Unfortunately, the Enlightenment backlash against such ‘vitalism’, in the wake of our growing understanding of organic chemistry, led to an extreme opposite view, still prevalent today: that life is just matter, like any other. One might call this ‘strict materialism’. But it leads to its own paradoxes: how can some atoms be ‘alive’, and others not? How can one talk about living matter having ‘purpose’, when it is governed by the same physical laws as any other matter?

Thinking about life from a functional perspective offers a helpful route through this philosophical thicket. Functions can be implemented by physical systems, but a physical system does not uniquely specify a function, nor is function reducible to the atoms implementing it.

Consider, for example, a small object from the near future with a few openings in its exterior, the inside of which is filled with a dense network of carbon nanotubes. What is it, you ask? Suppose the answer is: it’s a fully biocompatible artificial kidney with a working lifetime of a hundred years. (Awesome!) But there’s nothing intrinsic to those atoms that specifies this function. It’s all about what this piece of matter can do, in the right context. The atoms could be different. The kidney could be implemented using different materials and technologies. Who cares? If you were the one who needed the transplant, I promise: you wouldn’t care. What would matter to you is that functionally, it’s a kidney. Or, to put it another way, it passes the Kidney Turing Test.

Many biologists are mortally afraid of invoking ‘purpose’ or ‘teleology’, because they do not want to be accused of vitalism. Many believe that, for something to have a purpose, it must have been made by an intelligent creator – if not a human being, then God. But as we shall see, that’s demonstrably not the case.

And we have to think about purpose and function when it comes to biology, or engineering, or AI. How else could we understand what kidneys do? Or hope to engineer an artificial kidney? Or a heart, a retina, a visual cortex, even a whole brain? Seen this way, a living organism is a composition of functions. Which means that it is, itself, a function!

What is that function, then, and how could it have arisen? Let’s find out.

[excerpted from the Introduction]

Daisy World

All definitions of life are controversial, most seem to require caveats or exceptions, and none has advanced our understanding of biology. Frustrated with this state of affairs, biochemist Andrew Ellington has gone so far as to assert that ‘there is no such thing as life. Life is not a meaningful scientific term. It is a term for poets, not scientists.’9

Ellington’s position may sound extreme, but it’s more defensible than the dogmatic definition in many biology textbooks: that living organisms have ‘the ability to eat, metabolise, excrete, breathe, move, grow, reproduce, and be responsive to external stimuli.’10 By this definition, viruses aren’t alive, which might seem reasonable. But neither are mules or worker bees, since they are sterile!

Or consider a leaner definition popular among astrobiologists: ‘life is a self-sustaining chemical system capable of Darwinian evolution.’11 Better, but by this definition, parasites are not alive, since they aren’t self-sustaining. Then again, is any life form truly self-sustaining? (Not if it eats.) How can a ‘self’ be meaningfully circumscribed? And while life on Earth is based on chemistry, should we really insist on this as a requirement? Functionalism would argue otherwise.

So, here’s my own attempt:
Life is self-modifying computronium12 arising from selection for dynamic stability13; it evolves through the symbiotic composition of simpler dynamically stable entities.

Notice that my definition doesn’t stipulate any specific function. However, in defining life computationally, it is functional. And in a world where the Second Law [of Thermodynamics] holds, dynamic stability implies that those functions must include some combination of self-construction, self-repair, or reproduction. All of these are computations that require code to be able to read and write code, hence the ‘self-modifying’ qualifier.
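
As a toy illustration of code that reads and writes code (my own sketch, not the author’s bff experiments), consider a minimal Python quine: a program whose entire output is its own source. It captures only the bare read-your-own-code-and-write-it-back step; it neither varies nor repairs itself, so it sits at the very bottom of the ladder described here.

    # A minimal quine: the program treats its own source as data and writes it back out.
    # Reproduction in software reduces to exactly this kind of reading and writing of code.
    s = 's = %r\nprint(s %% s)'
    print(s % s)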

[excerpted from 'Daisy World', Chapter 1]

After defining life as ‘self-modifying computronium’, Agüera y Arcas shows that any substrate capable of computation can sustain life. Because dynamic stability demands distinct interacting parts, life is necessarily ‘lumpy’, forming nested selves from cells to ecosystems and planets. Furthermore, he argues that Earth, too, is a self-regulating system – alive in a homeostatic, non-conscious sense. Extending this to technology, he sees human–machine networks as a new evolutionary layer of life’s computation. The boundary between natural and artificial thus collapses: all living and evolving systems are computational expressions of dynamic stability.

Élan Vital

If life is computational, then are human-engineered (that is, ‘artificial’) computers alive? They are certainly purposive, or we couldn’t talk about them being broken or buggy. But hardware and software are, in general, unable to reproduce, grow, heal, or evolve on their own, because engineers learned long ago that self-modifying code (like bff,14 or DNA) is hard to understand and debug.15 Thus, phones don’t make baby phones, or (alas) heal from their frequent injuries. Apps don’t write new versions of themselves.

And yet: there are more phones in the world this year than last year, and their cracked screens do often get repaired; apps acquire new features, become obsolete, and eventually reach end-of-life, replaced by new ones; and AI models improve from month to month. Electronic components and computer code also exhibit the same kind of compositionality we’ve seen in bff and DNA. It certainly looks as if technology is reproducing and evolving!16 Debating its aliveness is thus a bit like the debate over whether viruses or parasites are alive.

If we zoom back out, putting technology and humans in the frame together, we can see that this larger, symbiotic ‘us’ is certainly reproducing, growing, and evolving.17 The emergence of technology, and the mutually beneficial – if sometimes fraught – relationship between people and tech is nothing more or less than our own most recent major evolutionary transition. Technology, then, is not distinct from nature or biology, but merely its most recent evolutionary development.

And what about that age-old inanimate stuff – rocks and rivers, mountains and beaches, clouds and storms? Water molecules in themselves are clearly not capable of general computation. Yet in the context of the hydrologic cycle, clouds, rainstorms, and rivers certainly serve critical ecological functions, and are profoundly shaped by life. Likewise, our planet’s metal and sand are shaped into steam engines and computer chips, just as calcium we ingest turns into bones and ionic currents. All of these materials and structures comprise parts of a grand network of Gaian interdependency. Why do we draw boundaries around certain networks of functions and insist that they are ‘alive’, while asserting the surrounding functions are not?

This way lies vitalism, a view espoused in various forms by a long line of philosophers from Posidonius of Apameia (circa 135–51 BCE, and undoubtedly reflecting a much older tradition) to Henri Bergson (1859–1941). Some modern thinkers, too, defend the vitalist position, such as Jane Bennett:

The quarantines of matter and life encourage us to ignore the vitality of matter and the lively powers of material formations [...]. By vitality I mean the capacity of things – edibles, commodities, storms, metals – not only to impede or block the will and designs of humans but also to act as quasi agents or forces with trajectories [...] or tendencies of their own. [... Our] analyses of political events might change if we gave the force of things more due.18

We resist such ideas because we tend to reserve the notion of agency only for ourselves. The idea of agency in a molecule or a storm, let alone an abstraction like money, seems especially far-fetched. We also tend to think in terms of a hierarchy in which ‘we’ (for whatever value of ‘we’) are at the top, and agency must surely diminish for anything ‘lower’ – a view reminiscent of the medieval Great Chain of Being.19 When we (hesitantly) extend ‘agency’ to the nonhuman, we tend to do so only for things that act obviously, individually, and on fast timescales, rather than in the aggregate and on slower, more evolutionary ones. It might be time to re-examine these ideas more holistically.

My purpose here is not to follow in Bennett’s footsteps – though I do find her project worth taking seriously. Language is, necessarily, imprecise, no matter what definitions we pick.20 This doesn’t mean that words are useless, though. When our language can become more rigorous and scientifically grounded, and when we use it to describe consistent patterns uniting a wide range of observations, we can start to see through ideological thickets.

I hope I have explained both clearly and rigorously how the phenomena that give rise to the complexifying dynamics of life apply much more broadly than to the entities we normally think of as ‘alive’ or ‘agential’. Accordingly, we could expand our definitions of these existing words, or adopt new ones, or do a bit of each. Personally, I would find some broadening of the old everyday words helpful.

That would hardly break new ground. Many traditional, nominally ‘prescientific’ worldviews embrace notions of aliveness, agency, and even personhood that are far broader than the modern Western ones. This seems a likely case of convergent evolution in languages and ideas, motivated by the common need among traditional societies to take symbiosis seriously to secure their own survival, and to flourish.

Taking a broader view is practical as much as it is spiritual: encouraging richer modelling of agency in ‘nature’ enhances a society’s dynamic stability, since all things in ‘nature’, ourselves included, are so mutually interdependent.

Thus it can be useful to take the view of an animal, plant, or river at times, even if they can’t take ours, the better to care for them – and for ourselves. The same goes for the Earth as a whole. That, ultimately, is the best reason to consider adopting, or at least adapting, a more inclusive view of the animate. Potawatomi writer and biologist Robin Wall Kimmerer makes this case eloquently in her book Braiding Sweetgrass.21

In agreeing with Kimmerer, I am not encouraging superstition. When scientists castigate animist beliefs as superstitious, they typically appeal to the materialist discoveries of the Enlightenment, which show that the atoms that make up our bodies are no different from the atoms that make up rocks or air. This is true. Atoms are atoms; they all obey the same rules. Hence, the Enlightenment model of a clockwork universe, governed by dynamical laws.

Yet as Schrödinger pointed out in 1944, our understanding of these laws – which he played such a central role in developing – remains incomplete. The laws as they stand do not account for the computationally complex, dynamically stable, symbiotic phenomena that comprise so much of our experience on Earth – indeed, without which there would be no such thing as experience at all. There would be no life, or purpose, or minds, or agency.

As we both embrace scientific rigour and start to figure out those ‘other laws of physics hitherto unknown’,22 we should perhaps be less surprised to find the shoe on the other foot. What the poet Dylan Thomas called ‘the force that through the green fuse drives the flower’23 drives all atoms, not just the ones we presume to be alive.

[from Chapter 1]

Killer App

Babbage struggled to drum up financial backing for his Analytical Engine.24 The market for Jacquard-woven textiles was obvious, but who needed industrially mass-produced ‘analytical formulæ’? To sell the idea of mathematical tables as a popular commodity, he resorted to folksy examples, such as ‘the amount of any number of pounds from 1 to 100 lbs. of butchers’ meat at various prices per lb.’25

Price tables for the village butcher? Hardly. A moment’s reflection will make it clear that small tradesmen wouldn’t have been a viable customer base for a hulking industrial machine like the Analytical Engine.

State administration, which Prony had bet on, was closer to the mark; the information needs of bureaucratic governments were on the rise.26 Still, it was too early. The French government defunded Prony’s project long before it was complete.

The real killer app was warfare. The British Army and Navy would have been Babbage’s obvious backers, and ultimately it was their lack of investment that doomed his enterprise.

The artillery table was already a paradigmatic product of human computation by the turn of the nineteenth century. A new table was needed for every big gun, including corrections for factors like altitude, wind, and barometric pressure. With every major world conflict from the Napoleonic Wars (1803–1815) onward, gunnery became increasingly important, and, with it, tabulation. And doing the calculations by hand took a long time.

By World War I, the first fully industrialised large-scale conflict, both the Allies and the Central Powers were making extensive use of complex tables. Artillery fire was often planned days in advance, and its accuracy became crucial for supporting infantry advances. Computation had become a bottleneck in warfare, and, more than any other single factor, this was what finally motivated serious investment in automatic computing between the World Wars.

As weapons production for World War II began ramping up, the University of Pennsylvania’s Moore School of Electrical Engineering hired at least two hundred women to work on artillery tables. Their methods would have been largely familiar to Babbage or Prony. But, spurred by the war effort, technology was advancing at breakneck speed. Six of the Moore School’s women were selected to become the programmers of the ENIAC.27 This first fully general, programmable computer had been designed to automate artillery tabulation.28

By the time the ENIAC became operational, in December 1945, priorities had changed. The Germans and the Japanese had been defeated, but Cold War brinkmanship picked up right where World War II had left off. The new computer’s first substantial program was a simulation of the ‘Super Problem’, exploring the feasibility of a hydrogen bomb.29 The math required was a lot harder than calculating ballistic trajectories, and the machine’s thousands of vacuum tubes burned out frequently, requiring near-daily repair. This incentivised rapid improvements in the hardware, kickstarting what would later be known as Moore’s Law.30

The early computers were a long way from anything we have today. It’s sobering to consider how many years of capital-intensive incubation within the military-industrial complex were needed before the technology had become sufficiently cheap, reliable, and miniaturised to rouse real interest from the private sector, let alone the village butcher.

IBM’s 701 mainframe, announced to the public on May 21, 1952 and originally dubbed the ‘Defense Calculator’, became the first computer commercially available in the US. There’s an apocryphal quote, usually attributed to Thomas J. Watson Jr., IBM’s president, from a stockholders meeting in 1953, to the effect that he believed there was a worldwide market for only five computers. This isn’t quite true; Watson really said that, when IBM had drawn up the plans for the 701 and toured them across the country to ‘some twenty concerns’ that they thought ‘could use such a machine’, they had assumed they’d only get five orders and were astonished to get eighteen.31

At first, general-purpose computers were not designed to work in real time. Like the human computers they replaced, they supported the war effort using batch processing. Such computing was a stately, offline affair. You submitted your job, and went to get a coffee ... or, more likely, take a nap while technicians nursed the job along, replacing tubes, clearing punch-card-reader jams, and swapping out spools of magnetic tape. Running on the ENIAC, bff would have taken centuries to achieve full-tape replication. By 1952, on the IBM 701, it would still have taken years, running nonstop, at a cost of millions of (today’s) dollars.32

The ’50s did see some early experiments in interactive architectures. Project Whirlwind, operational in 1951, was initially designed for flight simulation, and later became the heart of the US Air Force’s SAGE air-defense system.33 Real-time computing had become important due to the development of radar and related radio-signaling technologies, effectively allowing one machine to physically detect another machine – at least one of which might be moving through space at high speed and with lethal intent.

Identification Friend or Foe (IFF) systems soon followed, using encrypted signaling to allow radar dots to be annotated when a bogey was ‘one of ours’. This in turn created an incentive to hack an enemy’s IFF system, spoofing the ‘friend’ signal to sneak into enemy territory. A game of technical one-upmanship ensued, not unlike that of the Cambrian explosion.

Still more Cambrian was the rapidly increasing speed, precision, and deadliness of the military hardware. Jet fighters, missiles, and anti-missile defenses proliferated. GPS was invented in large part to make autonomous weapons guidance possible, as it was obvious that high-speed warfare would soon render keeping a human ‘in the loop’ impossible; the precision and response time needed were superhuman. Even the G-forces incurred by aerial maneuvers soon began to exceed human endurance.34 Robots would need to close the sensorimotor loops of these new weapons systems.

[from Chapter 3]

Agüera y Arcas discusses the functional and social origins of computation: machines built to automate calculation for state and military needs. Over time, warfare and defense applications turned calculation into competition – each side trying to predict the other’s moves more accurately and faster. From artillery tables to radar, IFF systems, and autonomous weapons, computation became about anticipation under pressure, a feedback loop where predicting others’ predictions meant survival.

This arms race between systems formalised what Agüera y Arcas later calls intelligence’s ‘killer app’: adversarial prediction driving recursive complexity. Each escalation – faster targeting, encrypted signals, counter-signals, and real-time feedback – pushed machines (and humans) toward deeper modeling of the other’s models. The writer generalises this pattern beyond war: competition, evolution, and even conversation become domains where intelligence emerges through the need to out-predict an opponent.

Matryoshka Dolls

But theory of mind35 is far more than just a valuable skill for navigating hidden barbs and avoiding faux pas at teatime. There is good evidence that it is the very stuff of intelligence; it is, thus, at the heart of this book’s main argument.

We have arrived at that heart. My contention is that theory of mind:

  • Powers the ‘intelligence explosions’ observed in our own lineage, the hominins, and in other brainy species;

  • Gives us the ability to entertain counterfactual ‘what-ifs’;

  • Motivates, and is enhanced by, the development of language;

  • Allows us to make purposive decisions beyond ‘autopilot mode’;

  • Underwrites free will;

  • Operates both in social networks and within individual brains;

  • Results automatically from symbioses among predictors; and

  • Is the origin and mechanism of consciousness.

In a sense, theory of mind is mind.

[excerpted from ‘Matryoshka Dolls’, Chapter 5]

Intelligence Explosion

These are big claims. Let’s begin with the more established ones.

In the 1970s, Dian Fossey, the world’s leading expert on gorilla behaviour, invited British neuropsychologist Nicholas Humphrey to spend a few months at her research station in the Virunga Mountains of Rwanda. Reflecting later on what he had seen, Humphrey wrote, ‘[O]f all the animals in the forest the gorillas seemed to lead much the simplest existence – food abundant and easy to harvest (provided they knew where to find it), few if any predators (provided they knew how to avoid them) ... little to do in fact (and little done) but eat, sleep and play. And the same is arguably true for natural man.’36

These observations flew in the face of the usual explanation for evolving high intelligence – that it’s all about being a brilliant hunter, or otherwise ‘winning’ at playing a brutal, Hobbesian game of survival in a tough environment. But if not to hunt (or evade hunters), why bother with intelligence? Couldn’t one live the easy life of a gorilla without incurring the high cost of a big brain? The fossil record suggests not: primate brains in many lineages, including those of gorillas and humans, haven’t shrunk over time, but, rather, have grown dramatically.

Humphrey’s explanation: ‘[T]he life of the great apes and man [...] depend[s] critically on the possession of wide factual knowledge of practical technique and the nature of the habitat. Such knowledge can only be acquired in the context of a social community [...] which provides both a medium for the cultural transmission of information and a protective environment in which individual learning can occur. [...][T]he chief role of creative intellect is to hold society together.’37

There are now many variations on this basic theory – some, like Humphrey, emphasising cooperation and division of labour, others competition and Machiavellian politics.38 These correspond fairly well to the ‘love’ versus ‘war’ explanations for the emergence of intelligence generally, as described in chapter 3.

In the 1990s, evolutionary psychologist Robin Dunbar and colleagues used comparative brain-size measurements across species to advance the closely related ‘social brain hypothesis’, which holds that the rapid increases in brain size evident in hominins and cetaceans (whales and dolphins), among others, arise from mentalising one-upmanship.39

Underpinning all of these related hypotheses is the observation that, among highly social animals like us, theory of mind is a powerfully adaptive trait. Being better able to get inside others’ heads increases the odds of finding a mate, building coalitions, securing resources from friends and family, getting help raising young,40 avoiding violence (or being on the winning side of it), climbing in prestige, and amassing fans or followers. So, unsurprisingly, people with better theory of mind tend to live longer and have greater reproductive success.41 That means Darwinian selection will be at work.

Strong theory of mind is correlated with both larger numbers of friends and larger brains, particularly in brain areas associated with mentalising – above all, the frontal cortex.42 Thus, startlingly, we can see evidence of the evolutionary pressure on social intelligence even among modern humans.

My guess is that the extra cortical volume of highly social people is dedicated not only to general mentalising skills, but also to rich learned representations – we could even call them simulations – of their many specific family members, friends, colleagues, and acquaintances.43 After all, mentalising requires not just imagining another person in the abstract, but modelling their particular life experience, what they know and don’t know, their quirks and values, the ways they express themselves – in short, everything that comprises a personality, outlook, and umwelt. You probably bring such knowledge to bear effortlessly to play out hypothetical social situations in your head, drawing on a cast of hundreds of people you know.

This task is even harder than it appears at first glance, because there are infinite reflections in the social hall of mirrors: all of those people are themselves modelling others, including you. And, of course, their models of you include models of them, and of others. These relationships are important, because they powerfully affect behaviour. Who shared their meat with whom after the last hunt, and in front of whom? Who is sleeping with whom? Who is gossiping about it, and who doesn’t know? Who is beefing with whom, and over what (or whom)?

Even if your model of second-order relationships (i.e., who is friends with whom, and what you know about those interactions) is not as rich as your model of first-order relationships (your friends), the sheer number of higher-order terms in your model explodes. If you have twenty classmates who all know each other (and you), then you need to keep track not only of your twenty relationships with them, but of all of their relationships with each other and with you, which is another 20×20=400 pieces of information. Third-order relationships climb into the thousands.

The numbers get truly mind-boggling when you consider that our acquaintances can easily number in the hundreds; family, school, and work environments tend to involve cohorts of people who all know each other; and our theories of mind can go up to sixth order, or beyond.44 You can imagine, then, that even if many corners are cut, the amount of brain volume needed to do social modelling might grow both as a function of your number of friends and as a function of your ability to model higher-order relationships.
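
A rough back-of-the-envelope sketch (my own loose formalisation, not the author’s; it assumes roughly n^k items at modelling order k for n acquaintances who all know one another) shows how quickly the bookkeeping explodes:

    # Rough growth of social-modelling 'terms': with n acquaintances who all know
    # each other (and you), order-k modelling tracks on the order of n**k items.
    def relationship_terms(n: int, order: int) -> int:
        return n ** order

    n = 20  # the classroom example above
    for k in (1, 2, 3):
        print(f"order {k}: ~{relationship_terms(n, k):,} pieces of information")
    # order 1: ~20     (your own relationships)
    # order 2: ~400    (their relationships with each other and with you)
    # order 3: ~8,000  (who knows what about whose relationships)

    # Hundreds of acquaintances and sixth-order modelling, taken literally:
    print(f"~{relationship_terms(200, 6):,}")  # ~64,000,000,000,000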

Indeed, when Dunbar and colleagues set out to find the relationship between the brain sizes and the social-group sizes of brainy animals, they found that as troop size increases, the amount of brain volume dedicated to the cortex also increases.45 Mentalising order, which Dunbar refers to as ‘intentionality level’, appears to be limited by cortical volume; behavioural studies suggest that monkeys can only operate at level one, while nonhuman apes have level-two intentionality. By extrapolation, archaic humans and Neanderthals may only have been able to achieve level-four intentionality, which is at ‘the lower end of the normal distribution for modern human adults, and at about the same intellectual level as young teenagers.’46 Finally, the slope of the relationship between cortical volume and troop size is considerably steeper for apes than for monkeys, consistent with the idea that modelling higher levels of intentionality requires a greater investment of cognitive resources per troop member.

Findings relating brain size to social group size, and social group size to Darwinian fitness, are themselves a hall of mirrors, revealing a profound self-similarity – and feedback loop – between brains and social groups. If you have a slightly larger brain than your friends and family, and are able to model more relationships more reliably, you will have a Darwinian advantage, so on average will have slightly more descendants than those who are less socially adept. But that means that your descendants will themselves become harder to model socially; that is, everyone else’s model of them must now be more complex, including higher-level intentionality and more relationships. And remember, everybody is trying to predict everybody else, but not be fully predictable themselves!

So it’s an arms race, not unlike that of the Cambrian explosion – though (usually) friendlier. Everyone is getting a bigger brain to model everyone else, and everyone else is getting harder to model at the same time, because ... well, their brains are getting bigger. A social intelligence explosion ensues: a rapid (by evolutionary standards) increase in brain volume in your species.

When social modelling becomes such an important component of everyone’s life, the effect on individual incentives is dramatic. Lone operators, like leopards, are content to fend for themselves; their umwelt consists mainly of their territory and their prey, while other leopards are, most of the time, unwelcome intruders. For a modern human, though, being cast out or excluded from the community becomes a severe punishment – or even a death sentence. Our world consists largely of other people, and most of us would be unable to survive at all without continual mutual aid.

At the same time, sociality is a fraught business. We try to ‘win’ at modelling others without being fully modelled ourselves; we compete for mating opportunities and for attention; we strive for dominance and prestige. These dynamics once again illustrate how competition and cooperation can be interwoven to such a degree that it can be hard to tell which is which.

Selection pressure also operates at the level of social groups. If one group sports slightly bigger brains and greater social prowess, the group itself can grow larger, and will thus tend to outcompete the smaller (and smaller-brained) group. Such may have been the fate of certain of our now-extinct hominin kin.

Societies have a kind of collective intelligence, and a rich body of work in social anthropology tells us that collective intelligence exhibits a scaling law not unlike that of individual brains. Theory of mind, incidentally, is important both for effective teaching and effective learning, which implies that it enhances cultural evolution at the group level too.47 So, bigger societies can create and evolve more complex technologies, and thereby develop greater adaptability and resilience.48 Greater scale, in other words, can support greater intelligence, and greater intelligence improves dynamic stability, both at the individual level and at the group level.

[from Chapter 5]

Agüera y Arcas critiques the popular ‘before–during–after’ story of AI – narrow, general, and superintelligence – as misleading. He argues that ‘general’ ability already emerged when unsupervised, sequence-predicting language models began performing arbitrary tasks through in-context learning. This was a continuous performance climb, not a discrete leap. Debates about the exact moment of AGI miss the point: intelligence evolves by degree. Even supposedly narrow systems learn universal representations, implying that every specialised predictor contains generality in miniature. From this view, AI’s ‘transition’ started as early as the 1940s with cybernetic sequence learners – the first machines to predict. The history of AI is therefore part of life’s broader story of predictive evolution.

Transitions

The emergence of AI marks what theoretical biologists John Maynard Smith and Eörs Szathmáry have termed a ‘major evolutionary transition’ or MET49 – a term we first encountered in chapter 1.50 Maynard Smith and Szathmáry describe three characteristic features of METs:

  • Smaller entities that were formerly capable of independent replication come together to form larger entities that can only replicate as a whole.

  • There is a division of labour among the smaller entities, increasing the efficiency of the larger whole through specialisation.

  • New forms of information storage and transmission arise to support the larger whole, giving rise to new forms of evolutionary selection.

Szathmáry and others have since proposed a few changes (such as adding the endosymbiosis of plastids, leading to plant life), but the larger point is that the list of major transitions is short, and each item on it represents a momentous new symbiosis with planetary-scale consequences. Any meaningful periodisation of life and intelligence on Earth must focus on big transitions like these.

That the transitions appear to be happening at increasing frequency is not just an artifact of the haziness of the distant past, but of their inherent learning dynamics, as Turchin described. Increasingly powerful predictive models are, as we have seen, also increasingly capable learners. Furthermore, in-context learning shows us how all predictive learning also involves learning to learn. So, as models become better learners, they will more readily be able to ‘go meta’, giving rise to an MET and producing an even more capable learner. This is why cultural evolution is so much faster than genetic evolution.

Max Bennett argues that ‘the singularity already happened’51 when cultural accumulation, powered by language and later by writing, began to rapidly ratchet human technology upward over the past several thousand years. This is a defensible position, and it doesn’t map well to the last MET on Maynard Smith and Szathmáry’s list, since humans have existed (and have been using language) for far longer than a few thousand years. Hence Bennett’s ‘cultural singularity’ doesn’t distinguish humans from nonhuman primates, but, rather, is associated with urbanisation and its attendant division of labour. Therefore, this recent transition is neither an immediate consequence of language nor an inherent property of humanity per se, but a distinctly modern and collective phenomenon. It is posthuman in the literal sense that it postdates our emergence as a species.

The Pirahã, for instance, who still maintain their traditional lifeways in the Amazon, are just as human as any New Yorker, but possess a degree of self-sufficiency radically unlike that of New Yorkers. They can ‘walk into the jungle naked, with no tools or weapons, and walk out three days later with baskets of fruit, nuts, and small game.’52 According to Daniel Everett,

The Pirahãs have an undercurrent of Darwinism running through their parenting philosophy. This style of parenting has the result of producing very tough and resilient adults who do not believe that anyone owes them anything. Citizens of the Pirahã nation know that each day’s survival depends on their individual skills and hardiness. When a Pirahã woman gives birth, she may lie down in the shade near her field or wherever she happens to be and go into labour, very often by herself.

Everett recounts the wrenching story of a woman who struggled to give birth on the beach of the Maici river, within earshot of others, but found that her baby wouldn’t come. It was in the breech position. Despite her screams over the course of an entire day, nobody came; the Pirahã went so far as to actively prevent their Western guest from rushing to help. The woman’s screams grew gradually fainter, and in the night, both mother and baby eventually died, unassisted.

In this and other similar stories, the picture that emerges is not of a cruel or unfeeling people – in one more lighthearted episode, the Pirahã express horrified disbelief at Everett for spanking his unruly preteen – but of a society that is at once intensely communitarian and individualistic. They readily share resources, but there is no social hierarchy and little specialisation. Everyone is highly competent at doing everything necessary to survive, starting from a very young age. The corollary, though, is that everyone is expected to be able to make do for themselves.

The Pirahã are, of course, a particular people with their own ways and customs, not a universal stand-in for pre-agrarian humanity. However, the traits I’m emphasising here – tightly knit egalitarian communities whose individuals are broadly competent at survival – are frequently recurring themes in accounts of modern hunter-gatherers. It seems a safe bet that this was the norm for humanity throughout the majority of our long prehistory.

We’re justified in describing as METs the transitions from traditional to agrarian, and then to urban, lifeways. During the agrarian revolution, a new network of intensely interdependent relationships arose between humans, animals, and plants; then, with urbanisation, machines entered the mix and human labour diversified much further.

New York (and the modern, globalised socio-technical world in general) is a self-perpetuating system whose individuals are no longer competent in the ways the Pirahã are. Urban people have become, on one hand, hyper-specialised, and, on the other, de-skilled to the point where they can’t survive on their own, any more than one of the cells in your body could survive on its own. It’s not just language, but written texts, schools and guilds, banking, complex systems of governance, supply-chain management, and many other information-storage and transmission mechanisms that add the evolvable ‘DNA’ needed to organise and sustain urban societies.

It seems to me, though, that this MET is still not the last on the list. By 1700, significant human populations had urbanisation, division of labour, and rapid cultural evolution. Then came the first Industrial Revolution, as introduced in chapter 1: a symbiosis between humans and heat engines, resulting in a hydrocarbon metabolism that unleashed unprecedented amounts of free energy, much like the endosymbiosis of mitochondria. This allowed human and livestock populations to explode, enabled a first wave of large-scale urbanisation, and drove unprecedented technological innovation. As Karl Marx and Friedrich Engels noted in 1848,

The bourgeoisie, during its rule of scarcely one hundred years, has created more massive and more colossal productive forces than have all preceding generations together. Subjection of Nature’s forces to man, machinery, application of chemistry to industry and agriculture, steam-navigation, railways, electric telegraphs, clearing of whole continents for cultivation, canalization of rivers, whole populations conjured out of the ground – what earlier century had even a presentiment that such productive forces slumbered in the lap of social labor? 53

[excerpted from 'Transitions', Chapter 10]

Vulnerability

Humans had been working hard, and working together, for thousands of years. It was not ‘social labour’, but coal that had lain slumbering under the ground. Mining was hard work, but the coal itself did an increasing proportion of that work.54 And over time, the coal produced ever more workers.

The conjuring of enormous new populations out of the ground – quite literally, flesh out of fossil fuel – manifested as a population explosion that had become obvious by 1800. This prompted Thomas Malthus and his Chinese contemporary, Hong Liangji, to worry for the first time about global overpopulation.55

It also created an unprecedented symbiotic interdependency between biology and machinery. Romanticism, the idealisation of rural living, and the utopian communities of the nineteenth century can all be understood as a backlash against that growing dependency, an assertion that we could live the good life without advanced technology and urbanisation. But at scale, we couldn’t.

A second Industrial Revolution arose from the electrification Marx and Engels mentioned in passing.56 From telegraphs, we progressed to telephony, radio, TV, and beyond, all powered by the electrical grid. In some ways this paralleled the development of the first nervous systems, for, like a nerve net, it enabled synchronisation and coordination over long distances. Trains ran on common schedules; stocks and commodities traded at common prices; news broadcasts pulsed over whole continents.

The second Industrial Revolution culminated in another dramatic jump in human population growth: the ‘baby boom’. While the baby boom had multiple proximal causes, including sanitation and antibiotics, it depended on the resources and information flows made possible by electricity and high-speed communication.

This additional layer of symbiotic dependency between people and technology generated a second wave of Malthusian population anxiety.57 Accordingly, the ‘back to the land’ movements of hippie communes in the ’60s had much in common with nineteenth-century Romanticism. Beyond concerns about the Earth’s ultimate carrying capacity, the sense of precariousness was not unjustified. Dependency is vulnerability.

Consider the effects of an ‘Electromagnetic Pulse’ (EMP) weapon. Nuclear bombs produce an EMP, which will fry any non-hardened electronics exposed to it by inducing powerful electric currents in metal wires. Some experts are concerned that North Korea may already have put such a weapon into a satellite in polar orbit, ready to detonate in space high above the United States.58 At that altitude, the usual destructive effects of a nuclear explosion won’t be felt on the ground, but a powerful EMP could still reach the forty-eight contiguous states, destroying most electrical and electronic equipment. Then what?

For the Pirahã, an EMP would be a non-event. For the US in 1924, it wouldn’t have been a catastrophe either. Only half of American households had electricity, and critical infrastructure was largely mechanical. As of 2024, though, everything relies on electronics: not just power and light, but public transit, cars and trucks, airplanes, factories, farms, military installations, water-pumping stations, dams, waste management, refineries, ports ... everything, worldwide. With these systems down, all supply chains and utilities rendered inoperable, mass death would quickly ensue. An EMP would reveal, horrifyingly, how dependent our urbanised civilisation has become on electronic systems. We have become not only socially interdependent, but collectively cybernetic.

AI may represent yet a further major transition, because earlier cybernetic systems – such as the control systems of dams, or the electronics in cars – implement only simple, local models, analogous to reflexes or the distributed nerve nets in animals like Hydra. Prior to the 2020s, all of the higher-order modelling and cognition took place in people’s brains, although we did increasingly use traditional computing for information storage and fixed-function programming.

Now, though, we’re entering a period in which the number of complex predictors – analogous to brains – will rapidly exceed the human population. AIs will come in many sizes, both smaller and larger than human brains. They will all be able to run orders of magnitude faster than nervous systems, communicating at near lightspeed.59

The emergence of AI is thus both new and familiar. It’s familiar because it’s an MET, sharing fundamental properties with previous METs. AI marks the emergence of more powerful predictors formed through new symbiotic partnerships among pre-existing entities – human and electronic.60 This makes it neither alien to nor distinct from the larger story of evolution on Earth. I’ve made the case that AI is, by any reasonable definition, intelligent; AI is also, as Sara Walker has pointed out, just another manifestation of the long-running, dynamical, purposive, and self-perpetuating process we call ‘life’.61

So, is AI still a big deal? Yes. Whether we count eight, a dozen, or a few more, there just haven’t been that many METs over the last four and a half billion years, and although they’re now coming at a much greater clip, every one of them has been a big deal. This final chapter of the book attempts to make as much sense as possible, from the vantage point of the mid-2020s, of what this AI transition will be like and what lies on the other side. What will become newly possible, and what might it mean at planetary scale? Will there be winners and losers? What will endure, and what will likely change? What new vulnerabilities and risks, like those of an EMP, will we be exposed to? Will humanity survive?

Keep in mind, though, that none of this should be framed in terms of some future AGI or ASI threshold; we already have general AI models, and humanity is already collectively superintelligent. Individual humans are only smart-ish. A random urbanite is unlikely to be a great artist or prover of theorems; probably won’t know how to hunt game or break open a coconut; and, in fact, probably won’t even know how coffeemakers or flush toilets work. Individually, we live with the illusion of being brilliant inventors, builders, discoverers, and creators. In reality, these achievements are all collective.62 Pretrained AI models are, by construction, compressed distillations of precisely that collective intelligence. (Feel free to ask any of them about game hunting, coconut-opening, or flush toilets.) Hence, whether or not AIs are ‘like’ individual human people, they are human intelligence.

[from Chapter 10]

1 Holloway 2023; Agrawal, Gans, and Goldfarb 2024. A few recent books I find compelling but that claim AI is not ‘real’ include Godfrey-Smith 2020; Seth 2021; Smith-Ruiu 2022; Christiansen and Chater 2022; Lane 2022; Humphrey 2023; M. Bennett 2023; K. J. Mitchell 2023; Mollick 2024.

2 Feigenbaum and Feldman 1963.

3 See von Helmholtz 1925, though as always when it comes to intellectual priority, an argument can be made that it goes back earlier – here, for instance, to Immanuel Kant. A modern, more mathematical articulation of the prediction principle was proposed by the early cyberneticists, as will be discussed in ‘Behavior, Purpose, and Teleology’, chapter 3; see Rosenblueth, Wiener, and Bigelow 1943.

4 Dayan et al. 1995; Rao 2024.

5 Anil Seth and Andy Clark have also made a vigorous case for prediction as fundamental to intelligence, though their books don’t explicitly make the connection to predictive modelling in AI; Seth 2021; A. Clark 2023.

6 A computational position held by Alan Turing and John von Neumann emphasising that what something 'is' is defined by what it does, which can be independent of implementation, hence multiply realisable; the corresponding philosophical perspective, championed by Hilary Putnam and others, conceives mental states in terms of their role in a system or their behavioural effects, thus opening the possibility that different kinds of systems, including non-biological ones, could manifest the same types of mental states.

7 For more technical treatments of functions as described in this book, see Fontana 1990; Wong et al. 2023.

8 Computing pioneer Alan Turing anticipated this dilemma as far back as his classic 1950 paper ‘Computing Machinery and Intelligence’, one of the founding documents of what we now call AI (Turing 1950). He concluded that the appearance of intelligence under human questioning and the reality of intelligence could not justifiably be separated; sustained and successful ‘imitation’ was the real thing. Hence the ‘Imitation Game’, now called the ‘Turing Test’ in his honour.

9 Ellington 2010.

10 Walker 2024.

11 Walker 2024.

12 A substance that computes with massive parallelism; the term was originally coined in the 1980s by physicist and computer scientist Norman Margolus to describe a hypothetical material optimised at the atomic or molecular level to perform computation as efficiently as the laws of physics allow, though I use it more broadly to describe the organised, computational state of matter characterising life.

13 Used to describe systems that cycle through a loop of states (in dynamical systems theory, a stable limit cycle), thus preserving their identity through time even though they are in constant flux; typical of living systems.

14 bff refers to a particular version of 'an esoteric programming language invented thirty years earlier by a Swiss physics student and amateur juggler, Urban Müller.'

15 Computer viruses are a notable exception.

16 The case for technology being an evolutionary product of hierarchically composed functions is thoughtfully made, with many examples, by Arthur 2010.

17 As astronomer Robert Jastrow put it, ‘We are the reproductive organs of the computer’ (Jastrow 1981); the same holds for technology in general. Samuel Butler made this case in a visionary 1863 essay entitled ‘Darwin Among the Machines’, a perspective revisited and updated at the end of the twentieth century by science historian George Dyson; Butler 1863; G. Dyson 1998.

18 J. Bennett 2010.

19 Lovejoy 1936.

20 Enfield 2024.

21 Kimmerer 2013.

22 Schrödinger 1944.

23 Thomas 1934.

24 Charles Babbage was a 19th-century British mathematician, philosopher, and inventor who is often called the ‘father of the computer’.

25 Babbage 1864.

26 Scott 1998; Soll 2011.

27 They were: Kay McNulty, Betty Snyder, Marlyn Wescoff, Ruth Lichterman, Betty Jean Jennings, and Fran Bilas. Fritz 1996; Light 1999.

28 Typically of ‘firsts’, there are other contenders. Working in relative isolation, Konrad Zuse, a German civil engineer, completed his electromechanical Z3 computer in 1941. Zuse’s project was also motivated by war; his earlier S1 and S2 were special-purpose machines for computing aerodynamic corrections to the wings of radio-controlled flying bombs, and after 1939 he was funded by the Nazi government. While Zuse didn’t design the Z3 with Turing completeness in mind, and it didn’t natively support conditional jumps, it can be programmed cleverly to simulate them. This arguably gives the Z3 priority over the ENIAC, per Rojas 1998.

29 Fitzpatrick 1998.

30 G. E. Moore 1965; Schaller 1997.

31 IBM 2007.

32 Renting the 701 in 1952 cost between $12,000 and $18,000 per month. IBM 2007.

33 Waldrop 2018.

34 De Monchaux 2011.

35 The ability to attribute mental states like beliefs, intents, desires, emotions, knowledge, etc., to oneself and others and to understand that others have beliefs, desires, emotions, and knowledge that are different from one’s own.

36 Humphrey 1976.

37 The basic idea had been floated by other researchers in the 1950s (Chance and Mead 1953) and again in the ’60s (Jolly 1966), though to less effect.

38 Whiten and Byrne 1997; de Waal 2007.

39 Dunbar 1998.

40 Sarah Blaffer Hrdy has convincingly argued that, among humans and certain other brainy species, it ‘takes a village’ to raise an infant, in the sense that the mother alone can’t provide all the needed calories; the grandparents, siblings, and babysitters who help out are called ‘alloparents’. In alloparental species, prospective mothers without the needed social support disproportionately elect infanticide or, among humans, abortion. If they do have the baby, its odds of survival increase appreciably with alloparental help; Hrdy 2009.

41 Pawłowski, Lowen, and Dunbar 1998; Holt-Lunstad et al. 2015.

42 Powell et al. 2012.

43 There’s likely no sharp boundary between general skills and specific knowledge, in this or any other domain, because every specific thing you learn is represented in terms of your existing conceptual vocabulary, and in turn extends that vocabulary. One can think of it almost like a compression algorithm, in which one’s experiences so far make up the ‘dictionary’ with which subsequent knowledge and experiences are compressed. This may (at least partly) explain why time seems to pass more quickly as we age: we’re compressing our life experiences more efficiently – alas.

44 Street et al. 2024.

45 Dunbar 1992.

46 Dunbar 2016.

47 Gopnik and Meltzoff 1993; Ziv et al. 2016.

48 Muthukrishna et al. 2014.

49 A key point in the evolution of life that results in a significant increase in complexity through symbiotic inter-dependence, such as the shift from single-celled to multicellular organisms, or from asexual to sexual reproduction; proposed by biologists John Maynard Smith and Eörs Szathmáry.

50 Szathmáry and Smith 1995.

51 M. Bennett 2023.

52 Colapinto 2007.

53 Marx and Engels 1888 [1848].

54 Recall that the first application of Newcomen’s steam engine was pumping the water out of a flooded coal mine; today, mining is highly automated, and the machines that do most of the physical work are fuel-powered.

55 Hong Liangji (洪亮吉) 1793; Malthus 1798; Agüera y Arcas 2023.

56 Since the turn of the twenty-first century, some commentators have, in rapid succession, declared third and fourth Industrial Revolutions relating to computers, 3D printing, remote sensing, the Internet of Things, and various other developments; Rifkin 2008; Schwab 2017. While all of these technologies and many more have indeed been transformative (we could add, for instance, container ships, cellular networks, and high-frequency financial trading), none meet the MET criteria. I believe that AI will soon meet that bar, though it seems a misnomer to place it under the industrial paradigm by calling it yet another ‘Industrial Revolution’.

57 Ehrlich 1968; Meadows et al. 1972.

58 Jacobsen 2024.

59 This account focuses on prediction rather than energy (as with the first Industrial Revolution), though prediction, computation, and energy are all related at a deep level. We have far to go in making AI models more energy-efficient, but it’s notable that even today, AI models may be more energy-efficient than brains simply due to the speed with which they operate; Tomlinson et al. 2024.

60 Computing pioneer J. C. R. Licklider was among the first to foresee what he called ‘man-computer symbiosis’; Licklider 1960.

61 Walker 2023.

62 Sloman and Fernbach 2018.