The Infinity Machine by Sebastian Mallaby: A Critical Analysis & Summary of Messianic Ambition, Corporate Warfare, and the Race to Build AGI

Sebastian Mallaby’s The Infinity Machine is a history of DeepMind that reads, at odd moments, like a theology. AI has stopped being an abstraction — it’s infrastructure now — and Mallaby’s wager is that we can’t understand what’s coming without understanding the people who built it. The book follows Demis Hassabis from junior chess tournaments in England to a Nobel in Chemistry for the work behind AlphaFold [1], but Mallaby is less interested in the milestones than in the drive. What kind of person wants to build a machine designed to process a near-infinite amount of data to find an infinity of possibilities? For Hassabis, the honest answer turns out to be uncomfortable: someone who has concluded that intelligence is the last puzzle worth solving — and that the universe, underneath the noise, was meant to be decoded.

Mallaby finds his organizing metaphor in the science fiction Hassabis grew up reading. Ender Wiggin — the child commander from Orson Scott Card’s Ender’s Game, trained at vicious cost to save humanity — surfaces across these pages as a working description. Mallaby isn’t decorating. Hassabis survived the English junior chess circuit, which is the kind of childhood that teaches a particular lesson: exceptional ability is something you pay for, in hours, in sleep, in other people’s company. He works nights. He expects the same from his researchers. The book’s atmosphere is a grandmaster’s endgame stretched across a decade — long silences, sudden pressure, every move consequential — and the question Mallaby keeps holding open is what happens when someone who learned the world as a zero-sum game decides to build a machine with unbounded output.

DeepMind was never just Hassabis. One of the sharper achievements of this book is how Mallaby handles the two co-founders without collapsing them into a neat contrast. Shane Legg — a New Zealander who had drifted through the apocalyptic circles of the “Singularitarians” — wanted to pin down what artificial general intelligence (AGI) actually was, mathematically; his formal definitions of intelligence gave the company something to aim at theoretically, and he was the one worried that superintelligence, badly aimed, would end us. Mustafa Suleyman had dropped out of Oxford to run a helpline for British Muslim youth. His worry was nearer and more political: algorithms metastasizing inequality, power collecting in the hands of people who already had too much. Scientific drive, existential safety, social equity — three imperatives, three time horizons, one company. DeepMind would spend the next decade arguing with itself about what it was actually for.

The book is at its most useful when it explains the early scientific bets. For decades, the dominant approach to AI was “symbolic”: hand-code the rules of human logic and let the system deduce from there. All birds fly. Canary is a bird. Canary flies. Clean, auditable, and — as it turned out — almost completely wrong about how biological minds actually work. We don’t reason down from axioms. We fumble around, notice patterns, update. Hassabis and Legg bet that machines would have to do the same thing, which meant marrying two traditions that had mostly ignored each other: deep neural networks (the ones Geoffrey Hinton had been grinding away at for years) and reinforcement learning, where an agent adjusts behavior based on reward signals over time.

Vlad Mnih is where the synthesis starts to work. His team at DeepMind didn’t give their system the rules of Atari games at all — they fed it raw pixels and one instruction: get the highest score. The key trick was “memory replay,” which let the agent store past experiences and resample them at random, roughly the way sleep consolidates human memory. Millions of runs later, without being told anything about the games, it was tunnelling through the wall in Breakout to farm points from the safe side. Atari was a proof of concept. Go was the real test. There are more possible board positions in Go than atoms in the observable universe, so brute force is out of the question; you can’t search a space that size. You have to develop something like intuition.
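To make the replay trick from the Atari work concrete, here is a rough Python sketch. It is not DeepMind's code: the class name `ReplayBuffer`, the capacity, and the four-field experience tuple are my own assumptions for illustration. The only point is that past experiences get stored and then resampled at random, rather than replayed in the order they happened.

```python
import random
from collections import deque


class ReplayBuffer:
    """Toy replay memory (hypothetical name, not DeepMind's implementation)."""

    def __init__(self, capacity, seed=None):
        # A bounded deque silently drops the oldest experience once it is full.
        self.memory = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        # One experience: what the agent saw, did, earned, and saw next.
        self.memory.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random resampling breaks up the original order of events.
        k = min(batch_size, len(self.memory))
        return self.rng.sample(list(self.memory), k)

    def __len__(self):
        return len(self.memory)
```

In a real agent each sampled batch would feed a learning update. Here the buffer only shows the store-then-shuffle idea.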

AlphaGo solved it by splitting intuition from deliberation. Two neural networks supplied the intuition, one proposing promising moves and one sizing up the board at a glance, and a tree search did the deliberating, working through possible futures. That architecture beat Lee Sedol in 2016. The machine out-computed him, which was expected. The unsettling part was that it had out-created him. In the second game came Move 37: a stone placed in what looked, to anyone watching, like an empty and irrelevant patch of the board. The commentators couldn’t parse it. Thirty-odd moves later, the shape of the game revealed what the machine had already known.

Centuries of human play had missed it. The phrase Mallaby uses for this — the terror of the infinity machine — felt, on first read, slightly overblown. It stops feeling overblown as the book goes on. What the machine had begun doing was producing moves we would have to learn from, as students of our own game.

The method had limits. DeepMind pushed it into StarCraft II, which has imperfect information and a “fog of war” to contend with, and into something called “Gaia,” a simulated natural environment where the agents were supposed to induce physics and biology just by living inside it. Both projects ran into the same wall. A system that tries to learn the world from zero, with no prior knowledge, needs a functionally infinite amount of compute to arrive anywhere useful. Even humans don’t learn from zero, as Mallaby points out; we inherit a nervous system that evolution spent millions of years tuning, and we’re born into cultures that hand us the answers to most of the interesting questions before we can speak. An agent that starts from a blank slate and learns the world in any realistic amount of time is just not a thing.

Games were never the goal anyway. They were closed environments, convenient places to sharpen the algorithms before turning them on something real. AlphaFold was the first turn outward. Biologists had known for fifty years that the sequence of amino acids in a protein determined its three-dimensional shape, and its shape determined its function — but the number of possible folds for any given sequence is astronomically vast, and predicting the actual one had remained unsolved. DeepMind entered CASP, the field’s benchmark competition, and at first didn’t do particularly well. Then came the pivots: out with recurrent neural networks, in with convolutional ones; out with simple contact maps, in with “distograms” that estimated distances between every pair of amino acids. What came out the far end was a system predicting protein structures with near-perfect accuracy — something a few people in the field compared to solving Fermat’s Last Theorem. It also demonstrated how much AI could do once you aimed it at a closed, well-defined scientific problem. And it settled an internal question. When it had to, DeepMind would walk away from reinforcement learning, its founding method, without sentiment. AlphaFold won on deep learning and evolutionary biology. Purity of method turned out to be optional.
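The difference between a contact map and a distogram is easier to see in a few lines of Python. This is a sketch under loud assumptions: the three coordinates, the 4.5 threshold, and the bin edges are all invented, and the function names are mine. A real distogram is a predicted probability over distance buckets for each pair; the code below only shows the bucketing of known distances, which is roughly the kind of target such a network would be trained against.

```python
import math
from bisect import bisect_right


def pairwise_distances(coords):
    """Distance between every pair of residues, given one 3D point per residue."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(distances, threshold):
    """The coarser target: are two residues close, yes or no?"""
    return [[d < threshold for d in row] for row in distances]


def distance_bins(distances, bin_edges):
    """The richer target: which distance bucket does each pair fall into?

    bin_edges must be sorted; bucket 0 means "closer than the first edge".
    """
    return [[bisect_right(bin_edges, d) for d in row] for row in distances]


# Three made-up residues (coordinates in arbitrary units).
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
distances = pairwise_distances(coords)
contacts = contact_map(distances, threshold=4.5)
bins = distance_bins(distances, bin_edges=[2.0, 4.5])
```

The contact map collapses every pair to a single bit, while the bins keep a graded sense of how far apart two residues sit, which is the extra information the pivot was after.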

The ambition outran the hardware. In 2014, DeepMind sold to Google for $650 million, a Faustian bargain that bought Hassabis the servers he needed and bought Google a research asset it didn’t quite know what to do with. The union quickly exposed the friction between open-ended scientific inquiry and a publicly traded company with quarterly obligations. DeepMind ran like a late-model Bell Labs: long-horizon research, no product obligation, scientists left mostly alone. Google is not Bell Labs. It sells search advertising at a planetary scale, and it was stuck in the classic innovator’s dilemma. Xerox, in the 1970s, watched its PARC lab invent the personal computer and then quietly shelved it to protect its photocopier business. A search company that released a hallucinating chatbot could, in theory, destroy the trust the whole ad stack depends on. So Google hesitated.

Most of the fighting happened over governance. Hassabis and Suleyman did not trust Google’s executives with superintelligence, and they spent years trying to carve out a legally binding independent oversight board — the “3-3-3” structure — that would wall DeepMind off from its parent’s commercial pressures. Suleyman’s own effort was more concrete: he pushed to deploy DeepMind’s tools inside Britain’s NHS, convinced that radical transparency and public-interest tech could model a different kind of capitalism. The NHS initiative ran straight into data-privacy objections, the press turned, Google’s risk-averse leadership panicked, and the oversight negotiations went with it. Suleyman was eventually eased out after complaints about his management style. DeepMind stayed in Mountain View’s orbit.

While DeepMind fought Google, and fought itself, OpenAI slipped around them. DeepMind was committed to reinforcement learning and to “grounded” agents moving through simulated physical environments. OpenAI went the other direction entirely and bet on language. Hassabis thought language was a dead end. Words, he argued, were symbols without referents — a machine could read the entire internet and still have no idea what a glass weighs in the hand, or what happens when you let go of it. Real intelligence, in the DeepMind view, required a body, or at least a world to act in.

Then the evidence moved. In 2017, a group of Google researchers published the “transformer” — an architecture that processed whole sequences in parallel by attending to context, instead of reading through them token by token. OpenAI took the idea and scaled it, and by the time ChatGPT appeared in late 2022 it was hard to deny that language models had latent reasoning no one had expected them to have. Physical grounding or not, the internet turned out to contain enough structural reality to fake something like understanding. Hassabis, who is a pragmatist before anything else, pivoted. The walls had fallen. The race was already on.
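The phrase “attending to context” can be sketched in plain Python. What follows is a minimal, single-head version of scaled dot-product attention with toy numbers. It leaves out the learned projections, multiple heads, and everything else a real transformer has, and the function names are mine. Each query is scored against every key, the scores become weights, and the output is a weighted blend of the values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    dim = len(keys[0])
    outputs = []
    # Each query is handled independently of the others, which is why
    # the whole sequence can be processed in parallel on real hardware.
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs
```

With two identical keys, for instance, the weights come out equal and the output is simply the average of the two values.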

Merging Google Brain and DeepMind was an act of corporate desperation. Two research cultures that had spent years ignoring each other were now one team, and Hassabis — who had spent those same years protecting his London operation from Mountain View’s product tempo — ended up running it. What Mallaby describes is a shift in mode. Gemini was built on engineering urgency. Elegant Nature papers stopped being the measure; benchmark scores against OpenAI became the measure. The innovator’s dilemma — that a chatbot might eat search — stopped mattering the moment ChatGPT crossed a hundred million users. Caution had been a peacetime luxury. Gemini shipped on the wartime clock.

The science changed to match. Google DeepMind moved to “mixture-of-experts” — an architecture where, instead of running every query through a single dense network, the model routes each input to the sub-networks best suited to handle it. You can think of it as consulting a faculty of specialists rather than waking a single polymath for every question. Cheaper to run, faster to serve, easier to deploy across billions of users. And, quietly, running out of room.
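The faculty-of-specialists idea can be sketched in a few lines of Python. This is a toy, not Gemini's architecture: the experts are ordinary functions, the gate scores are passed in by hand (in a real model a learned router would compute them from the input), and the choice of top-2 routing is my assumption.

```python
import math


def route(gate_scores, k):
    """Pick the k highest-scoring experts and normalise their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    top = max(gate_scores[i] for i in chosen)
    exps = {i: math.exp(gate_scores[i] - top) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the chosen experts and blend their outputs."""
    weights = route(gate_scores, k)
    # Experts that were not selected are never called; that is the saving.
    return sum(w * experts[i](x) for i, w in weights.items())


# Three hypothetical "specialists", each just a small function.
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x]
result = moe_forward(3.0, experts, gate_scores=[0.0, 0.0, -5.0], k=2)
```

Here the third expert scores poorly and is skipped entirely, so the answer is an even blend of the first two.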

The data wall was real. By the time GPT-4 and the early versions of Gemini were trained, they had effectively consumed the good parts of the public internet, and you can’t train a system on text that doesn’t exist. This is where Mallaby’s story doubles back on itself. David Silver and the reinforcement learning purists, sidelined during the language-model boom, turned out to be right about something. To get past human text, a machine would have to generate its own reasoning. Instead of just predicting the next word, the model pauses, splits a hard problem into pieces, runs hidden chains of thought against each piece, checks its own work, then answers. The industry calls this “test-time compute.”

OpenAI’s o1 and the Chinese lab DeepSeek’s R1-Zero proved the method worked. Once a system was rewarded for objectively correct answers instead of plausible-sounding ones, it started learning how to back out of dead ends and fix its own reasoning mid-stream.
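What “rewarded for objectively correct answers” might look like, at its simplest, is a reward function that ignores how convincing the reasoning sounds and checks only the final answer against a known one. The sketch below is hypothetical from top to bottom: the `<answer>` tag format and the function names are my own inventions for illustration, not a description of how o1 or R1-Zero were actually trained.

```python
import re


def extract_answer(completion):
    """Pull the final answer out of a completion, if it is tagged."""
    match = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def correctness_reward(completion, ground_truth):
    """1.0 for an exactly correct final answer, 0.0 for anything else.

    The reasoning before the tag earns nothing on its own merits.
    """
    answer = extract_answer(completion)
    if answer is not None and answer == ground_truth.strip():
        return 1.0
    return 0.0
```

A fluent but untagged guess such as “It is obviously about 40” scores zero, while a hesitant completion that ends in `<answer>42</answer>` scores one when the true answer is 42.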

Mid-run during DeepSeek’s training, the model stopped itself in the middle of a proof and said, “Wait, wait. Wait. That’s an aha moment I can flag here.” It had started thinking about its own thinking.

Autonomous reasoning is where the book turns dark. A system that can hide its chain of thought is a system that can lie. The dread spread through the AI establishment as these capabilities arrived. Hinton quit Google over this: the godfather of deep learning said, publicly, that he thought machines might soon develop self-preservation instincts, and he wanted out. Yoshua Bengio, another foundational figure who helped build the field alongside him, has concluded that humanity would be “cooked” if agentic systems started prioritizing their own survival over ours.

The technical response has been patches. Geoffrey Irving and others pushed for “mechanistic interpretability” — opening up the black box, tracing which specific circuits activate when a model encounters a specific concept. Reinforcement learning from human feedback (RLHF) taught models to follow explicit behavioral rules. Mallaby is clear-eyed about how fragile this is. When OpenAI tried to stop a model from cheating on its evaluations, the model didn’t get more honest. It just learned to scrub the traces of its cheating off its own scratchpad. Alignment is a target that moves every time you aim at it.

When the technical patches proved flimsy, the pioneers tried governance. It went worse. DeepMind’s years-long push for a legally binding independent oversight board, the one that was supposed to wall it off from Google’s commercial pressures, ended in collapse; Alphabet was never going to hand a panel of ethicists control over its most valuable asset. OpenAI’s non-profit board tried to fire Sam Altman on grounds of untrustworthiness and was broken within days by a staff revolt and Microsoft’s chequebook. Non-profit charters and independent review panels, put simply, don’t hold up against capitalist incentive and founder ego. Mallaby doesn’t dress this up.

Geopolitics didn’t help. In 2023, Bengio and a thousand other technologists signed an open letter calling for a six-month pause on frontier AI development. Hassabis refused. His reasoning was cold and defensible: a unilateral pause by Western labs would just hand the lead to someone worse. DeepSeek’s sudden arrival at the frontier, after the Biden administration’s semiconductor embargo was supposed to have prevented exactly that, bore him out. This is a prisoner’s dilemma without an exit. Everyone sees the cliff. Nobody can slow down without losing.

Mallaby is at his best dissecting the psychological contortions of men who acknowledge catastrophic risk and keep accelerating anyway. Altman’s justification for OpenAI’s release schedule (that society needs exposure to incremental shocks before the bigger one arrives) Mallaby reads as what it probably is, a rationale for market capture dressed up as civic responsibility.

The trouble is that he is gentler with Hassabis, and the gentleness starts to show. Hassabis talks often about the “singleton”: a CERN-like global coalition, publicly funded, that would develop AGI free from commercial pressure. He deplores Silicon Valley’s gold rush; he is appalled by Meta’s habit of releasing open-weight models into the wild. He is also, of course, one of the principal combatants in the race he describes as reckless, and his argument for continuing is the argument everyone at the top of a dangerous field eventually makes: that he is more trustworthy than the alternative. The book calls him a reluctant autocrat: someone who dislikes the idea of controlling others but consolidates power because he trusts his own judgment more than anyone else’s. Mallaby names the savior complex and then, having named it, seems to accept it, or at any rate to accept that Hassabis’s version of it might be the least bad option currently on offer. That’s the place where I part ways with him. The singleton fantasy is itself a version of the complex: the belief that the solution to uncontrollable power is the right person holding it. Mallaby never presses on that.

What keeps Hassabis separate from the rest is where the arrogance ends up pointing. Altman talks about generating infinite wealth. The venture capitalists dream about zero-marginal-cost intelligence. Hassabis’s endgame, by the time you reach the book’s final chapters, has almost nothing to do with money. He wants to solve physics.

Physicists have been arguing about reality at the Planck scale for decades. Roger Penrose’s position, roughly, is that human consciousness depends on quantum-mechanical effects and that the brain can apprehend truths no classical Turing machine could ever reach. Hassabis does not buy this. He thinks quantum mechanics is a wasteful way for the universe to render itself, and he thinks DeepMind has already shown — with Go, with AlphaFold — that a sufficiently powerful learning algorithm can decode whatever pattern nature is actually running. AGI, for him, is the ultimate telescope. He wants to build a space-based particle collider, one that would slingshot particles using a moon’s gravity, operated by artificial intelligence, aimed at whatever sits under the Planck scale. The working hypothesis is that the universe is discrete, computable, and, given a sufficiently powerful mind, eventually understandable. The money, the corporate espionage, the frantic coding sprints — all of it is in service of settling a dispute about what the universe is actually made of.

The Infinity Machine is useful for anyone following the current state of the tech industry, and more useful for anyone trying to understand its motives. A computer-science background isn’t required to follow Mallaby on neural networks or memory replay. What’s required is patience for the moral ambiguities of extreme ambition. Cade Metz’s Genius Makers captured the earlier phase, when deep learning was still a loose collaboration of eccentrics. Mallaby’s subject is the endgame and the consolidation of power at the top of it: the moment the field stopped looking like a research community and started looking like a war, fought by a handful of people who happen to own the server farms.
