Pivot - Demis Hassabis on AI, Game Theory, Multimodality, and the Nature of Creativity | Possible

Episode Date: April 12, 2025

How can AI help us understand and master deeply complex systems, from the game Go, which has 10 to the power 170 possible positions, to proteins, which, on average, can fold in 10 to the power 300 possible ways? This week, Reid and Aria are joined by Demis Hassabis. Demis is a British artificial intelligence researcher and co-founder and CEO of the AI company DeepMind. Under his leadership, DeepMind developed AlphaGo, the first AI to defeat a human world champion in Go, and later created AlphaFold, which solved the 50-year-old protein folding problem. He's considered one of the most influential figures in AI. Demis, Reid, and Aria discuss game theory, medicine, multimodality, and the nature of innovation and creativity. For more info on the podcast and transcripts of all the episodes, visit https://www.possible.fm/podcast/  Listen to more from Possible here. Learn more about your ad choices. Visit podcastchoices.com/adchoices

Transcript
Hi, everyone. This is Pivot from New York Magazine and the Vox Media Podcast Network. I'm Kara Swisher, and today we're sharing an episode of Possible, hosted by one of our recent guests, Reid Hoffman. Join Reid and his co-host, Aria Finger, as they sit down with the co-founder and CEO of Google DeepMind, Demis Hassabis, one of the most influential
Starting point is 00:01:39 figures in AI. They'll dive into game theory, medicine, multimodality, the nature of innovation, and how board games and video games shape our understanding of the future of AI. Enjoy the episode and remember you can find it and subscribe to Possible wherever you listen to podcasts. AI is going to affect the whole world. It's going to affect every industry. It's going to affect every country. It's going to be the most transformative technology ever, in my opinion.
Starting point is 00:02:06 So if that's true, and it's going to be like electricity or fire, then I think it's important that the whole world participates in its design. I think it's important that it's not just a hundred square miles of patch of California. I do actually think it's important that we get these other inputs, the broader inputs, not just geographically, but also different subjects, philosophy, social sciences, economists, not just the tech companies, not just the scientists involved in deciding how this gets built and what it gets used for. Hi, I'm Reid Hoffman. And I'm Aria Finger.
Starting point is 00:02:49 We want to know how, together, we can use technology like AI to help us shape the best possible future. With support from Stripe, we ask technologists, ambitious builders, and deep thinkers to help us sketch out the brightest version of the future and we learn what it'll take to get there. This is possible. In the 13th century, Sir Galahad embarked on a treacherous journey in pursuit of the elusive Holy Grail.
Starting point is 00:03:20 The grail, known in Christian lore as the cup Christ used at the Last Supper, had disappeared from King Arthur's table. The Knights of the Round Table swore to find it. After many trials, Galahad's pure heart allowed him the unique ability to look into the grail and observe divine mysteries that could not be described by the human tongue. In 2020, a team of researchers at DeepMind successfully created a model called AlphaFold that could predict how proteins will fold. This model helped answer one of the holy grail questions of biology. How does a long line of
Starting point is 00:03:58 amino acids configure itself into a 3D structure that becomes the building block of life itself. In October 2024, three scientists involved with AlphaFold won a Nobel Prize for these efforts. This is just one of the striking achievements spearheaded by our guest today. Demis Hassabis is a British artificial intelligence researcher, co-founder, and CEO of the AI company DeepMind. Under his leadership, DeepMind developed AlphaGo, the first AI to defeat a human world champion in Go,
Starting point is 00:04:32 and later created AlphaFold, which solved the 50-year protein folding problem. He is considered one of the most influential figures in AI. Reid and I sat down for an interview with Demis in which we talked about everything from game theory to medicine to multimodality and the nature of innovation and creativity. Here's our conversation with Demis Hesabis.
Starting point is 00:04:56 Demis, welcome to possible was awesome dining with you at Queens. It was kind of a special moment in all kinds of ways. And, you know, I think I'm going to start with a question that kind of came from your Babbage theater lecture and also from the fireside chat that you did with Muhammad Al-Ariyan, which is share with us the moment where you went from thinking, chess is the thing that I have spent my childhood doing, to what I want to do is start thinking about thinking. I want to accelerate the process of thinking and that computers are a way to do that.
Starting point is 00:05:42 How did you arrive at that? What age were you? What was that turn into metacognition? Well, yeah. Well, first of all, thanks for having me on the podcast. Chess for me is where it all started actually in gaming. And I started playing chess when I was four, very seriously, all through my childhood, playing for most of the England junior teams, captaining a lot of the teams. And for a long while, my main aim was to become a professional
Starting point is 00:06:12 chess player, a grandmaster, maybe one day, possibly a world champion. And that was my whole childhood really. Every spare moment, not at school, I was going around the world playing chess against adults in international tournaments. And then around 11 years old, I sort of had an epiphany really that although I love chess and I still love chess today, is it really something that one should spend your entire life on? Is it the best use of my mind? So that was one thing that was troubling me a little bit. But then the other thing was, as we were going to training camps with the England chess team, we started to use early
Starting point is 00:06:49 chess computers to try and improve your chess. And I remember thinking that, of course, we were supposed to be focusing on improving the chess openings and chess theory and tactics. But actually, I was more fascinated by the fact that someone had programmed this inanimate lump of plastic to play very good chess against me. And I was fascinated by how that was done. And I really wanted to understand that and then eventually try and make my own chess programs. I mean, it's so funny. I was saying to read before this, my seven year old school just won the New York State chess championship. So they have a long way to go before they get to you. But he takes it on faith, like, oh, yeah, mom,
Starting point is 00:07:29 I'm just going to go play chess kid on the computer. Like, I'll go play against the computer a few games, which, of course, was sort of a revelation sort of decades ago. And I remember when I was in middle school, it was obviously the Deep Blue versus Gary Kasparov. And this was like a man versus machine moment. And one thing that you've gestured at about this moment is that it illustrated, like in this case, based on Grandmaster data, it was like brute force versus like a self-learning system. Can you say more about that dichotomy? Yeah, well, look, first of all,
Starting point is 00:08:04 I mean, it's great. Your son's playing chess and I think it's fantastic. I'm a big advocate for teaching chess in schools as a part of the curriculum. I think it's fantastic training for the mind, just like doing maths or programming would be. And it's certainly affected the way I approach problems and problem solve and visualize solutions and plan. It teaches you all these amazing meta skills dealing with pressure.
Starting point is 00:08:25 So you sort of learn all of that as a young kid, which is fantastic for anything else you're going to do. And as far as Deep Blue goes, you're right, most of these early chess programs and then Deep Blue became the pinnacle of that were these types of expert systems, which at the time was the favored way of approaching AI, where actually it's the programmers that solve the problem, in this case playing chess, and then they encapsulate that solution in a set of heuristics and rules, which guides a brute force search towards, in this case, making a good chess move. I always had this, although I was fascinated by these Ailey chess programs that they could do that, I was also slightly disappointed by them.
Starting point is 00:09:05 And actually, by the time it got to Deep Blue, I was already studying at Cambridge in my undergrad. I was actually more impressed with Kasparov's mind because I'd already started studying neuroscience than I was with the machine because he was this brute of a machine. All it can do is play chess, and then Kasparov can play chess at roughly the same level, but also can do all the other amazing things that humans can do. I thought, doesn't that speak to the wonderfulness of the human mind? It also, more importantly, means something was missing from very fundamental, from Deep Blue and these expert system approaches to AI, very clearly. Because Deep Blue did not seem, even though it was a pinnacle of AI at the time, Blue did not seem, even though it was
Starting point is 00:09:45 a pinnacle of AI at the time, it did not seem intelligent. And what was missing was its ability to learn new things. So for example, it was crazy that Deep Blue could play chess to world champion level, but it couldn't even play tic-tac-toe. You'd have to reprogram. Nothing in the system would allow it to play tic-tac-toe. So that's odd. That's very different to a human grandmaster who obviously play a simpler game trivially. And then also it was not general, in the way that the human mind is.
Starting point is 00:10:11 And I think those are the hallmarks, that's what I took away from that match is those are the hallmarks of intelligence and they were needed if we wanted to crack AI. And go a little bit into the deep learning, which obviously is part of the reason why deep mind was name board is because part of, I think, that what was seen to be completely contrarian
Starting point is 00:10:31 hypothesis that you guys played out with self play and kind of learning system was that this learning approach was the right way to generate these significant systems. So say a little bit about having the hypothesis, what the trek through the desert looked like, and then what finding the Nile ended up with. Yes. Well, look, of course, we started DeepMind in 2010
Starting point is 00:10:53 before anyone was working on this in industry, and there was barely any work on it in academia. And we partially named the company DeepMind, the deep part, because of deep learning. It was also a nod to deep thought in Hitchhiker's Guide's Galaxy and Deep Blue and other AI things. But it was mostly around the idea we were better on these learning techniques, deep learning and hierarchical neural networks. They just sort of been invented in seminal work by Jeff Hinton and colleagues in 2006. So it's very,
Starting point is 00:11:23 very new. And reinforcement learning, which has always been a specialty of DeepMind, and the idea of learning from trial and error, learning from your experience, and then making plans and acting in the world. And we combine those two things, really. We sort of pioneered doing that, and we called it deep reinforcement learning, these two approaches and deep learning to kind of build a model of the environment or what you were doing in this case a game and then the reinforcement learning to do the
Starting point is 00:11:53 planning and the acting and actually accomplish and be able to build agent systems that could accomplish goals in the case of games is maximizing the score winning the game. And we felt that that was actually the entirety of what's needed for intelligence. The reason that we were pretty confident about that is actually from using the brain as an example. Basically, those are the two major components of how the brain works. The brain is a neural network. It's a pattern matching and structure finding system,
Starting point is 00:12:27 but then it also has reinforcement learning and this idea of planning and learning from trial and error and trying to maximize reward, which is actually in the human brain and the animal brain, the mammal brain is the dopamine system implements that, a form of reinforcement learning called TD learning. That gave us confidence that if we pushed hard enough in this direction, even though no one was really doing that, that eventually this should work, right? Because we have the existence proof of the human mind. And of course, that's why I also studied neuroscience, because when you're in the desert, like you say, you need any source of water or any evidence that you might get out of the desert. There's even a mirage in the distance is a useful thing to understand
Starting point is 00:13:07 in terms of giving you some direction when you're in the midst of that desert. And of course, AI was itself in the midst of that because several times this had failed. The expert system approach basically had reached a ceiling. I could easily hog the entire interview, so I'm trying not to. So one of the things that the learning system obviously ended up creating was solving what was previously considered an insoluble problem. There were even people who thought that computers couldn't, like classical computational techniques couldn't solve go, but in the classic move 37, it demonstrated originality,
Starting point is 00:13:47 creativity that, that was beyond, you know, the thousands of years of go play and books and the hundreds of years of very serious play. What was that moment of move 37 like for, for understanding where AI is and what do you think the next move 37 is? Will Barron Well, look, the reason Go was considered to be and ended up being so much harder than chess, so it took another 20 years, even us with AlphaGo. And all the approaches that have been taken with chess, these expert systems, uh, uh, approaches had failed with go, right? Um, basically couldn't even be a professional, let alone a world champion. And the reason was two main reasons.
Starting point is 00:14:32 One is the complexity of go is so enormous. You know, it's one way to measure that is there are 10 to the power, 170 possible positions, right? Far more than atoms in the universe. There's no way you can brute force a solution to go, right? It's impossible. But even harder than that is that it's such a beautiful, esoteric, elegant game. It's sort of considered art, an art form in Asia, really, right? And it's because it's both beautiful aesthetically, but also it's all about patterns rather than sort of brute calculation, which
Starting point is 00:15:06 chess is more about. And so even the best players in the world can't really describe to you very clearly what are the heuristics they're using. They just kind of intuitively feel the right moves, right? They'll sometimes just say that this move, why did you play this move? Well, it felt right, right? And then it turns out their intuition of their brilliant player, their intuition is brilliant and fantastic. And it's an amazingly beautiful and effective move. But that's very difficult then to encapsulate in a set of heuristics and rules that to direct how a machine should play go. And so that's why all of these kind of deep blue methods didn't work. Now, we got around that by having the system learn for
Starting point is 00:15:47 itself what are good patterns, what are good moves, what are good motifs and approaches, and what are valuable and high probability of winning positions are. So it learned that for itself through experience, through seeing millions of games and playing millions of games against itself. So that's how we got AlphaGo to be better than world champion level. But the additional exciting thing about that is that it means those kinds of systems can actually go beyond what we as the programmers or the system designers know how to do. No expert system can do that because of course it's strictly limited by what we already know and can describe to the machine.
Starting point is 00:16:31 But these systems can learn for themselves. And that's what we resulted in Move 37 in Game 2 of the famous World Championship match, the challenge match we had against Lisa Dole in Seoul in 2016. And that was a truly creative move. Go has been played for thousands of years. It's the oldest game humans have invented, and it's the most complex game. And it's been played professionally for hundreds of years in places like Japan. And even still, even despite all of that exploration by brilliant human players, this move 37 was something
Starting point is 00:17:06 never seen before. And actually, worse than that, it was thought to be a terrible strategy. In fact, if you go and watch the documentary, which I recommend, it's on YouTube now, of AlphaGo, you'll see the professional commentators nearly fell off their chairs when they saw Move 37 because they thought it was a mistake. They thought the computer operator, Aja, had misclicked on the computer because it was so unthinkable that someone would play that. Then, of course, in the end, it turned out 100 moves later, that move 37, the stone, the piece that was put down on the board, was in exactly the
Starting point is 00:17:40 right place to be decisive for the whole game. Now it's studied as a great classic of the Go history of Go, that game and that move. Of course, then even more exciting for that is, that's exactly what we hoped these systems would do because the whole point of me and my whole motivation my whole life of working on AI was to use AI to accelerate scientific discovery. innovation my whole life of working on AI was to use AI to accelerate scientific discovery. And it's those kinds of new innovations, albeit in a game, is what we were looking for from our systems. And, you know, that I think is a awesome rendition of kind of why it is these learning systems are, you know, even now doing original discovery. What do you think the next move 37 might be for kind of opening
Starting point is 00:18:29 our minds to what is the way that AI can add a whole lot to the kind of quality of human thought, human existence, human science? Yeah. Well, look, I think there'll be a lot of move 37s in almost every area of human endeavor. Of course, the thing I've been focusing on since then is mostly being how can we apply those types of AI techniques, those learning techniques, those general learning techniques to science. Big areas of science, I call them root node problems. Problems where if you think of the tree of all knowledge that's out there in the universe, can you unlock some root nodes that unlock entire branches or
Starting point is 00:19:11 new avenues of discovery that people can build on afterwards? For us, protein folding and alpha fold was one of those. It was always top of my list. I have a mental list of all these types of problems that I've come across throughout my life and just being genuinely interested in all areas of science. Sort of thinking through which ones would be suitable would both be hugely impactful but also suitable for these types of techniques. I think we're going to see a new golden era of these types of new strategies, new ideas in very important areas of human endeavor. I would say one thing to say though is that we haven't fully cracked creativity yet.
Starting point is 00:19:56 I don't want to claim that. I often describe there's three levels of creativity and I think AI is capable of the first two. So first one would be interpolation. So you give it a million pictures of cats, an AI system, a million pictures of cats, and you say, create me a prototypical cat. And it will just average all the million cats' pictures that it's seen. And that prototypical one won't be in the training set. So it will be a unique cat. But that's not very interesting from a creative point of view. It's just an averaging. But the second thing would be what I call extrapolation. So that's more like AlphaGo, where you've played 10 million games of Go, you've looked at a few million human games of Go, but then you come up with, you extrapolate from
Starting point is 00:20:40 what's known to a new strategy never seen before, like move 37. Okay, so that's very valuable already. I think that is true creativity. But then there's a third level which I call it kind of invention or out of the box thinking, which is not only can you come up with a move 37, but could you have invented Go? Or another measure I like to use is if if we went back to the time of Einstein in 1900, early 1900s, could an AI system actually come up with general relativity with the same information that Einstein had at the time? Clearly, today, the answer is no to those things. It can't invent a game as great as Go, and it wouldn't be able to invent general relativity just from
Starting point is 00:21:26 the information that Einstein had at the time. And so there's still something missing from our systems to get true out-of-the-box thinking. But I think it will come, but we just don't have it yet. I think so many people outside of the AI realm would be surprised be surprised. It sort of all starts with gaming, but that's sort of gospel for what we're doing. It's like, that's how we created these systems. And so switching gears from board games to video games, can you give us just like the elevator pitch explanation
Starting point is 00:21:57 for what exactly makes an AI that can play StarCraft 2, like AlphaStar, so much more advanced and fascinating than the one that can play StarCraft II like AlphaStar, so much more advanced and fascinating than the one that can play chess or Go. Yeah, with AlphaGo, we sort of cracked the pinnacle of board games, right? So Go was always considered the Mount Everest, if you like, of games AI for board games. But there are even more complex games by some measures if you take on board the most complex strategy games that you can play on computers. StarCraft II is acknowledged to be the classic of the genre of real-time strategy games. It's a very complex game. You've got to build up your base and your units and other
Starting point is 00:22:37 things. Every game is different. The board game is very fluid and you've got to move many units around in real time. The way we cracked that was to add this additional level in of a league of agents competing against each other, all seeded with slightly different initial strategies. Then you get a survival of the fittest. You have a tournament between them all, so it's a multi-agent setup now, and the strategies that win out in that tournament go to the next, you know, the next epoch, and then you generate some other new strategies around that and you keep doing that for many generations. You're kind of both having this idea of self-play that we had in AlphaGo, but you're adding in this multi-agent competitive, almost evolutionary dynamic
Starting point is 00:23:21 in there, and then eventually you get an agent or a series or a set of agents that are the Nash distribution of agents. No other strategy dominates them, but they dominate the most number of other strategies. Then you have this Nash equilibrium and then you pick out the top agents from that. That succeeded very well with this type of very open-ended kind of gameplay. So it's quite different from what you get with chess or Go, where the rules are very prescribed and the pieces that you get are always the same. And it's sort of a very ordered game. Something like StarCraft is much more chaotic. So it's sort of interesting to have to deal with that. It has hidden information too. You can't see the whole map at once. You have to explore it. So it's not a perfect information game, which is another thing we
Starting point is 00:24:08 wanted our systems to be able to cope with, is partial information situations, which is actually more like the real world, right? Very rarely in the real world do you actually have full information about everything. Usually you only have partial information and then you have to infer everything else in order to come up with the right strategies. Part of the game side of this is, I presume you've heard that there's this kind of theory of homo-ludens that we're game players. Is that informing the kind of thinking about how games is both strategic, but also kind of framing for like science acceleration, framing for kind of the serendipity of innovation. Is in addition to the kind of the fitness function,
Starting point is 00:24:58 the kind of evolution of self play, the ability to play scale compute, are there other deeper elements to the game playing nature that allows this thinking of thinking? Well, look, I'm glad you brought up Home of Ludens and it's a wonderful book and it basically argues that games playing is actually a fundamental part of being human, right? In many ways, that's the act of play. What could be more human than that? Then, of course, it leads into creativity, fun. All of these things get built on top of that. I've always loved them as a way to practice and train your own mind
Starting point is 00:25:41 in situations that you might only ever get a handful of times in real life, but they're usually very critical. What company to start, what deal to make, things like that. So I think games is a way to practice those scenarios. And if you take games seriously, then you can actually simulate a lot of the pressures one would have in decision-making situations. And Going back to earlier, that's why I think chess is such a great training ground for kids to learn because it does teach them about all of these situations. Of course, it's the same for AI systems too. It was the perfect proving ground for our early AI system ideas, partly because they
Starting point is 00:26:23 were invented to be challenging and fun for humans to play. Of course, there are different levels of gameplay. We could start with very simple games like Atari games and then go all the way up to the most complex computer games like StarCraft and continue to challenge our system. We were in the sweet spot of the S-curve. It's not too easy, it's trivial, or too hard. You can't even see if you're making any progress. You want to be in that maximum sort of part of the S curve where you're making almost exponential progress. And we could keep picking harder and harder games as our systems got improved. And then the other nice feature
Starting point is 00:27:00 about games is because they're some kind of microcosm of the real world, they've usually been boiled down to very clear objective functions, right? So winning the game or maximizing the score is usually the objective of a game. And that's very easy to specify to a reinforcement learning system or an agent based system. So you can, it's perfect for hill climbing against, right? And measuring ELO scores,
Starting point is 00:27:26 ratings and exactly where you are. And then finally, of course, you can calibrate yourself against the best human players. So you can sort of calibrate what your agents are doing in their own tournaments. In the end, even with the StarCraft agent, we had to eventually challenge a professional grandmaster at StarCraft to make sure that our systems hadn't overfitted somehow to their own tournament strategies. It actually needed to be, oh, we grounded it with, oh, it can actually be a genuine human Grandmaster StarCraft player. The final thing is, of course, you can generate as much synthetic data as you want with games,
Starting point is 00:28:01 too, which is coming into vogue right now again about data limitations and with large language models and how many tokens left in the world and has it read everything in the world. Obviously for things like games, you can actually just play the system against itself and generate lots more data from the right distribution. Can you double click on that for a moment? Like you said, it is in Vogue to talk about, are we running out of data? Do we need synthetic data? Like, where do you stand on that issue? Well, I've always been a huge proponent of simulations and simulations and AI. And, you know,
Starting point is 00:28:38 it's also interesting to think about what the real world is, right, in terms of a computational system. And so I've always been involved with trying to build very realistic simulations of things. Now of course that interacts with AI because you can have an AI that learns a simulator of some real world system just by observing that system or all the data from that system. I think the current debate is to do with these large foundation models now pretty much use the whole internet. And so then once you've tried to learn from those, what's left? That's all the language that's out there.
Starting point is 00:29:16 Of course, there's other modalities like video and audio. I don't think we've exhausted all of that kind of multimodal tokens. But even that will reach some limit. So then the question that comes of like, can you generate synthetic data? And I think that's why you're seeing quite a lot of progress with maths and coding, because in those domains,
Starting point is 00:29:35 it's quite easy to generate synthetic data. The problem with synthetic data is, are you creating data that is from the right distribution, the actual distribution, right? Does it mimic the kind of real distribution? And also, are you generating data that's correct, right? And, of course, for things like maths, for coding, and for things like gaming, you can actually test the final data and verify if it's correct, right, before you feed it in as input into the training data
Starting point is 00:30:07 for a new system. It's very amenable certain areas, in fact, turns out the more abstract areas of human thinking that you can verify and prove that it's correct. that unlocks the ability to create a lot of synthetic data. dress for Friday's fundraiser. Okay, all right, where are my keys? In my pocket, let's go. First, pick up dress, then prepare for that big presentation, walk dog, then, okay, inhale. One, two, three, four, exhale, one, two, three, four.
Starting point is 00:31:02 Ooh, who knew a driver's seat could give such a good massage? Wow, this is so nice. Oops, that was my exit. Oh well, that's fine. I've got time. After the meeting, I gotta remember to schedule flights for our girls' trip, but that's for later. remember to schedule flights for our girls trip. But that's for later.
Starting point is 00:31:27 Sun on my skin, wind in my hair. I feel good. Turn the music up. Your all new Nissan Murano is more than just a tool to get you where you're going. It's a refuge from life's hustle and bustle. It's a place to relax, to reset, in the spaces between items on your to-do lists.
Starting point is 00:31:50 Oh wait, I got a message. Could you pick up wine for dinner tonight? Yep, I'm on it. I mean, that's totally fine by me. Play Celebrity Memoir Book Club. It's been reported that 1 in 4 people experience sensory sensitivities, making everyday experiences like a trip to the dentist especially difficult. In fact, 26% of sensory-sensitive individuals avoid dental visits entirely. In Sensory Overload, a new documentary produced as part of Sensodyne's Sensory Inclusion
Starting point is 00:32:36 Initiative, we follow individuals navigating a world not built for them, where bright lights, loud sounds, and unexpected touches can turn routine moments into overwhelming challenges. Burnett-Grant, for example, has spent their life masking discomfort in workplaces that don't accommodate neurodivergence. I've only had two full-time jobs where I felt safe, they share. This is why they're advocating for change. Through deeply personal stories like Burnett's, Sensory Overload highlights the urgent need for spaces,
Starting point is 00:33:09 dental offices, and beyond that embrace sensory inclusion. Because true inclusion requires action with environments where everyone feels safe. Watch Sensory Overload now streaming on Hulu. Support for the show comes from Mercury. What if banking did more? now streaming on Hulu. it's landing your fundraise. The truth is, banking can do more. Mercury brings all the ways you use money into a single product that feels extraordinary to use. Visit mercury.com to join over 200,000 entrepreneurs
Starting point is 00:33:55 who use Mercury to do more for their business. Mercury, banking that does more. that does more. So one of the things that is also in addition to the frequent discussion around data, how do we get more? But one of the questions is, in order to do AI, is it important to actually have it embedded in the world? Yeah. Well, interestingly, if we talked about this five years ago,
Starting point is 00:34:31 or certainly 10 years ago, I would have said that some real-world experience, maybe through robotics, usually when we talk about embodied intelligence, we're meaning robotics, but it could also be a very accurate simulator, right? Like some kind of ultra realistic game environment, would be needed to fully understand, say, the physics of the world around you, right? And the physical context around you. And there's actually a whole branch of neuroscience
Starting point is 00:35:00 that is predicated on this. It action in perception. This is the idea that one can't actually fully perceive the world unless you can also act in it. The kinds of arguments go is like, how can you really understand the concept of the weight of something, for example, unless you can pick things up and compare them with each other and then you get this idea of weight. Can you really get that notion just by looking at things? It seems hard, certainly for humans. I think you need to act in the world. This is the idea that acting in the world is part of your learning. You're kind of like an active learner. In fact, reinforcement learning is like that
Starting point is 00:35:40 because the decisions you make give you new experiences, but those experiences depend on the actions you took, but also those are the experiences that you'll then subsequently learn from. In a sense, reinforcement learning systems are involved in their own learning process because they're active learners. I think you can make a good argument that that's also required in the physical world. Now it turns out, I'm not sure I believe that anymore because now with our systems, especially our video models, if you've seen VO2, our latest video models, completely state of the art which we released late last year, and it's kind of shocked even me that even though
Starting point is 00:36:24 we're building this thing, that it can sort of basically by watching YouTube videos, a lot of YouTube videos, it can figure out the physics of the world. There's a sort of funny Turing test of, in some sense, Turing test in verb commas of video models, which is, can you chop a tomato? Can you show a video of a knife chopping a tomato with the fingers and everything in the right place and the tomato doesn't magically spring back together or the knife goes through the tomato without cutting it, et cetera? And Vio can do it. And if you think through the complexity of the physics to understand what you've got to keep consistent and so on, it's pretty amazing. It's hard to argue that it doesn't understand something about physics and the physics of the world.
Starting point is 00:37:07 And it's done it without acting in the world, and certainly not acting as a robot in the world. So it's not clear to me there is a limit now with just sort of passive perception. Now, the interesting thing is that I think this has huge consequences for robots as an embodied intelligence as an application because the types of models we've built, Gemini and also now VO, and we'll be combining those things together at some point in the future,
Starting point is 00:37:36 is we've always built Gemini, our foundation model, to be multimodal from the beginning. The reason we did that and we still lead on all the multimodal benchmarks is beginning. And the reason we did that, and we still lead on all the multimodal benchmarks, is because for twofold. One is we have a vision for this idea of a universal digital assistant, an assistant that goes around with you on the digital devices, but also in the real world, maybe on your phone or a glasses device and actually helps you in the real world, like recommend things to you, help you navigate around, help with physical things in the world like cooking, stuff like that. For that to work, you obviously need to understand the context that you're in.
Starting point is 00:38:19 It's not just the language I'm typing into a chat bot. You actually have to understand the 3D world I'm living in right i think to be really good assistant you need to do that. I'm the second thing is of course exactly what you need for robotics as well. I'm really star first big sort of gemini robotics work which is cause a bit of a star and that's the beginning of showing what we showcasing what we can do with these multimodal models that do understand physics of the world, with a little bit of robotics fine-tuning on top to do with the actions, the motor actions, and the planning a robot needs to do. It looks like it's going to work. Actually now, I think these general models are actually going to transfer to the embodied
Starting point is 00:39:01 robotic setting without too much extra special casing or extra data or extra effort, which is probably not what most people, even the top roboticists, would have predicted five years ago. I mean, that's wild. And thinking about benchmarks and what we're going to need these digital assistants to do, when we look under the hood of these big AI models,
Starting point is 00:39:25 well, some people would say it's attention. So the trade-offs is thinking time versus output quality. We need them to be fast, but of course, we need them to be accurate. And so talk about what is that trade-off and how is that going in the world right now? Well, look, of course, we pioneered all that area of thinking systems because
Starting point is 00:39:47 that's what our original gaming systems all did, right? Go, AlphaGo, but actually most famously AlphaZero, which was our follow-up system that could play any two-player game. And there you always have to think about your time budget, your compute budget you've got to actually do the planning part, right? So the model you can pre-train, just like we do with our foundation models today. You can play millions of games offline, and then you have your model of chess or your model of Go, whatever it is.
Starting point is 00:40:12 But at test time, at runtime, you've only got one minute to think about your move. One minute times how many computers you've got running. That's still a limited compute budget. So what's very interesting today is there's trade-off between do you use a more expensive larger base model, foundation model, right? So in our case, we have different size names like Gemini Flash or Pro or even bigger, which is Ultra. But those models are more costly to run.
Starting point is 00:40:42 So they take longer to run. But they're more accurate and they're more capable. So you can run a bigger model with a shorter number of planning steps, or you can run a very efficient smaller model that's slightly less powerful, but you can run it for many more steps. Currently, what we're finding is it's roughly about equal, but of course what we want to find is some, is some, the Pareto frontier of that, right? Like actually the exact right trade off of the size of the model and the expense of that running that model versus the amount of thinking time you want to,
Starting point is 00:41:15 and thinking steps that you're, you're able to do per unit of compute time. And I think that's, that's actually fairly cutting edge research right now that I think all the leading labs are probably experimenting on. I think there's not a clear answer to that yet. All the major labs, DeepMind, others, are all working intensely on coding assistants. There's a number of reasons. Everything from like, A, it's one of the things that accelerates productivity across the whole front. It has a kind of good fitness function.
Starting point is 00:41:46 It's also of course, one of the ways that, you know, everyone is going to be handsome productivity is having a software, you know, kind of copilot agent for helping. There's just a ton of reasons. Now one of the things that gets interesting here is as you're building these, you know, obviously there's a tendency to start with these computer languages that have been designed for humans. What would be computer languages that would be designed for AIs or an agentic world or designed for this hybrid process of a human plus an AI?
Starting point is 00:42:18 Is that a good world to start looking at those kind of computer languages? How would it change our theory of computation, linguistics, etc. I think we are entering a new era in coding, which is going to be very interesting. And, you know, as you say, all the leading labs are pushing on this frontier for many reasons. It's easy to create synthetic data, so that's another reason that everyone's pushing on this vector. And I think we're going to move into a world where, you know, sometimes it's called vibe coding, where you're basically coding with natural language, really. Right. And then and we've seen this before with computers, right. I remember when I first started programming,
Starting point is 00:42:57 you know, in the 80s, we were doing assembler. And then of course, you know, that seems crazy now like why would you do machine code? You start with C, and then you get Python, and so on. And really, one could see it as the natural evolution of going higher and higher up the abstraction stack of programming languages and leaving the more and more of the lower level implementation details to the compiler, in a sense. And now, one could just view this
Starting point is 00:43:22 as the natural final step. Well, we just use natural language, and then everything is super high-level programming language. I think eventually that's maybe what we'll get to. The exciting thing there is that, of course, it will make accessible coding to a whole new range of people, creatives, right? Who normally would, you know, designers, game designers, app writers, that would normally would not have been able to implement their ideas without the help of, you know, teams of programmers. So that's going to be pretty exciting, I think, from a creativity point of view. But it may also be very good, certainly in the next few years, for coders as well. I think this in general with
Starting point is 00:44:07 these AI tools is, I think that the people that are going to get most benefit out of them initially will be the experts in that area who also know how to use these tools in precisely the right way, whether that's prompting or interfacing with your existing code base. There's going to be this sort of interim period where I think the current experts who embrace these new tools, whether that's filmmakers, game designers, or coders, are going to be superhuman in terms of what they're able to do. I see that with some film directors and film designer friends of mine who are able to create pitch decks, for
Starting point is 00:44:48 example, for new film ideas in a day on their own. But it's very high quality pitch deck that they can pitch for a $10 million budget for. Normally they would have had to spend a few tens of thousands of dollars just to get to that pitch deck, which is a huge risk for them. So it becomes, I think there's going to be a whole new incredible set of opportunities. And then there's the question of like, if you think about creative, the creative arts, whether there'll be new ways of working, much more fluid. Instead of doing Adobe Photoshop or something, you actually co-creating this thing with this
Starting point is 00:45:24 fluid responsive tool. That could feel more like Minority Report or something, I imagine, with the interface and there's this thing swirling around you. It will require people to get used to a very new workflow to take maximum advantage of that. But I think when they do, it will be probably incredible for those people. They'll be like 10X more productive. I want to go back to the world of multimodal that we were talking about before with robots in the real world.
Starting point is 00:45:58 Right now, most AI doesn't need to be multimodal in real time because the Internet is not multimodal. For our listeners, that means absorbing many types of input, AI doesn't need to be multimodal in real time because the internet is not multimodal. And for our listeners, that means absorbing many types of input, voice, text, vision at once. And so can you go deeper in what you think the benefits of truly real time multimodal AI will be and like, what are the challenges to get to that point? I think first of all, we live in a multimodal world, right?
Starting point is 00:46:25 We have our five senses, and that's what makes us human. If we want our systems to be brilliant tools or fantastic assistants, I think in the end, they're going to have to understand the world, the spatial temporal world that we live in, not just our linguistic, maths world, right? Abstract thinking world. I think that they'll need to be able to act in and plan in and process things in the real world and understand the real world. I think that the potential for robotics is huge. I don't think it's had its chat GPT or its AlphaFold moment yet, say in science and language, right? Or alpha go moment.
Starting point is 00:47:05 I think that's to come, but I think we're close. And as we talked about before, I think in order for that to happen, I think that the shortest path I see that happening on now is these general multimodal models being eventually good enough, and maybe we're not very far away from that, to sort of install on a robot, perhaps a humanoid robot with the cameras. Now there's additional challenges of you've got to fit it locally or maybe on the local chips to have the latency fast enough and so on. But as we all know, just wait a couple of years and those systems that stay with the art today will fit on a little mobile chip tomorrow. So I think it's very exciting, multimodal from that point of view,
Starting point is 00:47:46 robotics, assistance. And then finally, I think also for creativity, I think we're the first model in the world, Gemini 2.0, that you can try now in AI Studio, that allows native image generation. So not calling a separate program in this separate model, in our case, Imogen 3, which you can try separately, but actually Gemini itself natively coming up in the chat flow of images. And I think people seem to be really enjoying using that. So it's sort of like you're now talking to a multimodal chatbot, right? And so you can get it to express emotions in pictures or you can give it a
Starting point is 00:48:26 picture and then tell it to modify it and then continue to work on it with word descriptions. Can you remove that background? Can you do this? So this goes back to the earlier thing we said about programming or any of these creative things in a new workflow. I think we're just seeing the glimpse of that if you try out this new Gemini 2 experimental model of how that might look in image creation. And that's just the beginning. Of course, it will work with video and coding and all sorts of things. So, in the land of the real world, multimodal, one of the things that frequently people speculate is geolocation of AI work. Obviously, in the US, we intensely track everything that's happening on the West Coast. We also intensely track DeepMind and then somewhat last, Mistral and others.
Starting point is 00:49:19 What's some of the stuff that's really key for the world to understand what's coming out of Europe. What's the benefit of having there be multiple major centers of innovation and invention, you know, not just within the West Coast, but also obviously DeepMind in London and Mistral and Paris and others. And what are some of the things to, for people to pay attention to, why it's important and what's happening, especially within the UK and European AI ecosystem? We started DeepMind in London and still headquartered here for several reasons.
Starting point is 00:49:55 I mean, this is where I grew up, that's what I know. It's where I had all the contacts that I had. But the competitive reasons were that we felt that the talent in the UK and in Europe was the coming out of universities was the equivalent of the top US ones. You know, Cambridge, my alma mater and Oxford, they're up there with the MITs and Harvard's and the Ivy Leagues, right? I think they're sort of, you know, they're always in the top 10 there together on the university world tables.
Starting point is 00:50:24 But if you, this is certainly true in 2010, if you were coming, say you had a PhD in physics out of Cambridge and you didn't want to work in finance at a hedge fund in the city, but you wanted to stay in the UK and be intellectually challenged, there were not that many options for you, right? There are not that many deep tech startups. So we were the first really, and to prove that could be done. And actually we were a big draw for the whole of Europe. So we got the best people from the technical universities in Munich and in Switzerland and so on. And for a long while,
Starting point is 00:50:53 that was a huge competitive advantage. And also salaries were cheaper here and then in the West coast and you weren't competing against the big incumbents. Right? And also it was conducive. The other reason I chose to do that was I knew that AGI, which was our plan from the beginning, you know, solve intelligence and then use it to solve everything else. That was our, where we articulated our mission statement. And I still like that framing of it. It was a 20 year mission. And if you want a 20 year mission,
Starting point is 00:51:23 and we're now 15 years in, and I think we're sort of on track, unbelievably, which is strange for any 20 year mission, but you don't want to be too distracted on the way in a deep technology, deep scientific mission. One of the issues I find with Silicon Valley is lots of benefits, obviously, contacts and support systems and funding and amazing things and the amount of talent there, the density of talent. But it is quite distracting, I feel. Everyone and their dog is trying to do a startup that they think is going to change the world,
Starting point is 00:51:55 but it's just a photo app or something. And then the cafes are filled with this. Of course, it leads to some great things, but it's also a lot of noise if one actually wants to commit to a long-term mission that you think is the most important thing ever, and you don't want to be too, you know, you and your staff and want to be too distracted and like, oh, I could make a, maybe I could make a hundred million though, if I jumped and did this, you know, quickly did this gaming app or something. Right. And, and I think that's sort of the, the, the milieu
Starting point is 00:52:20 that you're in, uh, in the Valley, at least, at least back then, maybe this is less true now. There's probably more mission-focused startups now. But I also wanted to prove it could be done elsewhere. And then the final reason I think it's important is that AI is going to affect the whole world. It's going to affect every industry. It's going to affect every country. It's going to be the most transformative technology ever, in my opinion. So if that's true, and it's going to be like electricity or fire, more impactful than even the internet or mobile, then I think it's important that the whole world participates in its design and with the different value systems that we think are out there, there are philosophies
Starting point is 00:53:06 that are good philosophies. From democratic values, Western Europe, US, I think it's important that it's not just a hundred square miles of a patch of California. I do actually think it's important that we get these other inputs, the broader inputs, not just geographically, but also, and I know you agree with this, Reed, different subjects, philosophy, social sciences, economists, academia, civil society, not just the tech companies, not just the scientists involved in deciding how this gets built and what it gets used for. And I feel that I've always felt that very strongly from the beginning. And I think having some European involvement and some UK
Starting point is 00:53:49 involvement at the top table of the innovation is a good thing. So Demis, one of the areas of AI that when anyone asks me, like, hey, Aria, I know you're interested in AI, but like, well, you can write my emails. Why is it so special? I just say, no, think about what it can do in medicine. I always talk about Alpha Fold. I tell them about what Reed is doing. Like, I'm just so excited for those breakthroughs. Can you give us just a little bit? You had this seminal
Starting point is 00:54:14 breakthrough in Alpha Fold, and what is it going to do for the future of medicine? I've always felt that, like, what are the most important things AI can be used for? I think there are two. One is human health. That's number one, trying to solve and cure terrible diseases. Then number two is to help with energy, sustainability, and climate, the planet's health, let's call it. There's human health, and then there's a planet's health.
Starting point is 00:54:43 Those are the two areas that we have focused on in our science group, which I think is fairly unique amongst the AI labs actually, in terms of how much we pushed that from the beginning. And then protein folding specifically was this canonical for me. I sort of came across it when I was an underground in Cambridge 30 years ago and it's always stuck with me is this fantastic puzzle that would unlock so many possibilities. The structure of proteins, everything in life depends on proteins and we need to understand the structure so we know their function. If we know the function then we can understand what goes wrong in disease and we can design drugs and molecules that will bind to the right part of the surface of the protein if you know
Starting point is 00:55:24 the 3D structure. It's a fascinating problem. It goes right part of the surface of the protein if you know the 3D structure. So it's a fascinating problem. It goes to all of the computational things we were discussing earlier as well. Can you see through this forest of possibilities all these different ways a protein could fold? Some people estimate, Leventhal very famously in the 1960s, estimated an average protein can fold in 10 to 300 possible ways. How do you enumerate those in astronomical possibilities? Yet it is possible with these learning systems. That's what we did with AlphaFold. Then we spun out a company, Isomorphic, and I know Reid's very
Starting point is 00:55:57 interested in this area too with his new company, of if we can reduce the time it takes to discover a protein structure from, it used to take a PhD student, their entire PhD as a rule of thumb to discover one protein structure. So four or five years. And there's 200 million proteins known to science and we folded them all in one year. So we did a billion years of PhD time in one year is another way you can think of it. And then gave it to the world freely to use. And two million researchers around the world have used it. And we spun out a new company, Isomorphic,
Starting point is 00:56:32 to try and go further downstream now and develop the drugs needed and try and reduce that time. I mean, it's just amazing. I mean, Demis, there's a reason they give you the Nobel Prize. Thank you so much for all of your work in this area. It's truly amazing. And now to rapid fire. Is there a movie, song or book,
Starting point is 00:56:57 that fills you with optimism for the future? There's lots of movies that I've watched that have been super inspiring for me. Things like even like Blade Runner, There's lots of movies that I've watched that have been super inspiring for me. Things like even like Blade Runner is probably my favorite sci-fi movie, but maybe it's not that optimistic. So, if you want an optimistic thing, I would say the Culture series by Ian Banks. I think that's the best depiction of a post-AGI universe where you know, AIs and you've basically got societies of AIs and humans
Starting point is 00:57:29 and kind of alien species actually and sort of maximum human flourishing across the galaxy. That's kind of amazing, compelling future that I would hope for humanity. What is a question that you wish people asked you more often? hope for humanity. What is a question that you wish people asked you more often? The questions I sort of often wonder why people don't discuss a lot more, including with me, some of the really fundamental properties of reality that actually drove me in the beginning when I was a kid to think about building AI to help us sort of this ultimate tool for science.
Starting point is 00:58:04 So for example, you know, I don't understand why people don't worry more about what is time, what is gravity, or basically the fundamental fabric of reality, which is sort of staring us in the face all the time, all these very obvious things that impact us all the time, and we don't really have any idea how it works. I don't know know why that it doesn't trouble people more. It troubles me. And, uh, and, and, you know, I'd love to have more debates with people, uh, about those things, but, uh, actually most people don't seem to, you know, they seem to sort of shy away from those topics.
Starting point is 00:58:38 Where do you see progress or momentum outside of your industry that inspires you? That's a tough one because AI is so general. It's almost touching what industry is outside of the AI industry. I'm not sure there's many. Maybe the progress going on in quantum is interesting. I still believe AI is going to get built first and then will maybe help us perfect our quantum systems but I have ongoing bets with some of my quantum friends like Hartmut Neven on they're going to build quantum systems first and then that will help us accelerate AI.
Starting point is 00:59:13 So I always keep a close eye on the advances going on with quantum computing systems. Final question. Can you leave us with a final thought on what is possible over the next 15 years if everything breaks humanity's way and what's the first step to get there? Well what I hope for next 10-15 years is what we're doing in medicine to really have new breakthroughs and I think maybe in the next 10-15 years we can actually have a real crack at solving all disease. Right. That's, that's the mission of isomorphic. And I think with AlphaFold, we showed what the potential was, um, to sort of do what I like to call science at digital speed and why can that also
Starting point is 00:59:55 be applied to finding medicines? Um, and so my hope is 10, 15 years time, we'll, we'll look back on the medicine we have today, a bit like how we look back on medieval times and how we used to do medicine then. And that would be, I think, the most incredible benefit we could imagine from AI. Possible is produced by Wondermedia Network. It's hosted by Aria Finger and me, Reid Hoffman. Our showrunner is Sean Young. Possible is produced by Katie Sanders, Edie Allard, Sarah Schleed, Vanessa Handy, Aliyah Yates, Paloma Moreno-Himenez, and Malia Agudelo. Jenny Kaplan is our executive producer and editor. Special thanks to Surya Yalamanchili,
Starting point is 01:00:50 Sayida Sepiyeva, Fanasi Dilos, Ian Alice, Greg Beato, Parth Patil, and Ben Relis. And a big thanks to Leila Hajaj, Alice Talbert, and Denise Owusu-Afriaye. hospitality, media, and the broader food system with its highly anticipated awards. To learn more, visit the 2025 James Beard Awards Hub at jamesbeard.org slash awards. And be sure to watch the James Beard Awards
Starting point is 01:01:33 from Chicago on June 16th at 5.30 p.m. Eastern, live on Eater. Start a business that sells decorative plates. Find out you have to track expenses. Use Intuit QuickBooks to auto-track expenses so you can keep spinning, uh, selling those plates. Manage and grow your business all in one place. Intuit QuickBooks. Your way to money.
