Pivot - Demis Hassabis on AI, Game Theory, Multimodality, and the Nature of Creativity | Possible
Episode Date: April 12, 2025. How can AI help us understand and master deeply complex systems, from the game Go, which has 10 to the power 170 possible positions, to proteins, which, on average, can fold in 10 to the power 300 possible ways? This week, Reid and Aria are joined by Demis Hassabis. Demis is a British artificial intelligence researcher and the co-founder and CEO of the AI company DeepMind. Under his leadership, DeepMind developed AlphaGo, the first AI to defeat a human world champion in Go, and later created AlphaFold, which solved the 50-year-old protein folding problem. He's considered one of the most influential figures in AI. Demis, Reid, and Aria discuss game theory, medicine, multimodality, and the nature of innovation and creativity. For more info on the podcast and transcripts of all the episodes, visit https://www.possible.fm/podcast/
Transcript
Hi, everyone. This is Pivot from New York Magazine and the Vox Media Podcast Network.
I'm Cara Swisher, and today we're sharing an episode of Possible hosted by one of our
recent guests, Reid Hoffman. Join Reid and his co-host, Aria Finger, as they sit down
with the co-founder and CEO of Google DeepMind, Demis Hassabis, one of the most influential
figures in AI. They'll dive into game theory, medicine, multimodality, the nature of innovation, and
how board games and video games shape our understanding of the future of AI.
Enjoy the episode and remember you can find it and subscribe to Possible wherever you
listen to podcasts.
AI is going to affect the whole world.
It's going to affect every industry.
It's going to affect every country.
It's going to be the most transformative technology ever, in my opinion.
So if that's true, and it's going to be like electricity or fire,
then I think it's important that the whole world participates in its design.
I think it's important that it's not just a hundred-square-mile patch of California.
I do actually think it's important that we get these other inputs, the broader inputs,
not just geographically, but also from different subjects, philosophy, social sciences,
economics, not just the tech companies, not just the scientists, involved in deciding how
this gets built and what it gets used for.
Hi, I'm Reid Hoffman. And I'm Aria Finger.
We want to know how, together, we can use technology like AI to help us shape the best
possible future.
With support from Stripe, we ask technologists, ambitious builders, and deep thinkers to help
us sketch out the brightest version of the
future and we learn what it'll take to get there.
This is Possible.
In the 13th century, Sir Galahad embarked on a treacherous journey in pursuit of the
elusive Holy Grail.
The grail, known in Christian lore as the cup Christ used at the Last Supper, had disappeared
from King Arthur's table.
The knights of the Round Table swore to find it.
After many trials, Galahad's pure heart granted him the unique ability to look into the grail
and observe divine mysteries that could not be described by the human tongue.
In 2020, a team of researchers at DeepMind
successfully created a model called AlphaFold that could predict how proteins will fold.
This model helped answer one of the holy grail questions of biology: how does a long line of
amino acids configure itself into a 3D structure that becomes a building block of life itself?
In October 2024, Demis Hassabis and John Jumper of DeepMind shared in the Nobel Prize in
Chemistry for these efforts.
This is just one of the striking achievements spearheaded by our guest today.
Demis Hassabis is a British artificial intelligence researcher, co-founder, and CEO of the AI
company DeepMind.
Under his leadership, DeepMind developed AlphaGo,
the first AI to defeat a human world champion in Go,
and later created AlphaFold,
which solved the 50-year-old protein folding problem.
He is considered one of the most influential figures in AI.
Reid and I sat down for an interview with Demis
in which we talked about everything from game theory
to medicine to multimodality
and the nature of innovation and creativity.
Here's our conversation with Demis Hassabis.
Demis, welcome to Possible. It was awesome dining with you at Queens.
It was kind of a special moment in all kinds of ways.
And, you know, I think I'm going to start with a question that kind of came from your
Babbage theatre lecture and also from the fireside chat that you did with Mohamed El-Erian,
which is share with us the moment where you went from thinking,
chess is the thing that I have spent my childhood doing,
to what I want to do is start thinking about thinking.
I want to accelerate the process of thinking and that computers are a way to do that.
How did you arrive at that? What age were you?
What was that turn into metacognition?
Well, yeah.
Well, first of all, thanks for having me on the podcast.
Chess for me is where it all started actually in gaming.
And I started playing chess when I was four, very seriously, all through my childhood,
playing for most of the England junior teams,
captaining a lot of the teams. And for a long while, my main aim was to become a professional
chess player, a grandmaster, maybe one day, possibly a world champion. And that was my whole
childhood really. Every spare moment, not at school, I was going around the world playing chess against adults in international tournaments.
And then around 11 years old, I sort of had an epiphany really that although I love chess
and I still love chess today, is it really something that one should spend your entire
life on?
Is it the best use of my mind?
So that was one thing that was troubling me a little bit.
But then the other thing was, as we were going to training camps with the England chess team, we started to use early
chess computers to try and improve your chess. And I remember thinking that, of course, we were
supposed to be focusing on improving the chess openings and chess theory and tactics. But actually,
I was more fascinated by the fact that someone had programmed this inanimate lump of plastic to play very good chess against me. And I was fascinated by
how that was done. And I really wanted to understand that and then eventually try and
make my own chess programs.
I mean, it's so funny. I was saying to Reid before this, my seven-year-old's school just
won the New York State chess championship. So they have a long way to go before they get to you.
But he takes it on faith, like, oh, yeah, mom,
I'm just going to go play ChessKid on the computer.
Like, I'll go play against the computer a few games,
which, of course, was sort of a revelation sort of decades ago.
And I remember when I was in middle school,
it was obviously the Deep Blue versus Garry Kasparov match. And this was like
a man versus machine moment. And one thing that you've gestured at about this moment is that it
illustrated, like in this case, based on Grandmaster data, it was like brute force versus
like a self-learning system. Can you say more about that dichotomy? Yeah, well, look, first of all,
I mean, it's great.
Your son's playing chess and I think it's fantastic.
I'm a big advocate for teaching chess in schools as a part of the curriculum.
I think it's fantastic training for the mind, just like doing maths or programming would
be.
And it's certainly affected the way I approach problems and problem solve and visualize solutions
and plan.
It teaches you all these amazing meta skills dealing with pressure.
So you sort of learn all of that as a young kid, which is fantastic for anything else you're going to do.
And as far as Deep Blue goes, you're right. Most of these early chess programs, and Deep Blue became the pinnacle of that, were these types of expert systems,
which at the time were the favored way of approaching AI, where actually it's the programmers that
solve the problem, in this case playing chess, and then they encapsulate that solution in
a set of heuristics and rules, which guides a brute force search towards, in this case,
making a good chess move.
Although I was fascinated that these early chess programs could do that, I was also slightly disappointed by them.
And actually, by the time it got to Deep Blue, I was already studying at Cambridge in my undergrad.
I was actually more impressed with Kasparov's mind, because I'd already started studying neuroscience,
than I was with the machine, because it was this brute of a machine. All it could do was play chess,
whereas Kasparov could play chess at roughly the same level, but could also do all the other
amazing things that humans can do. I thought, doesn't that speak to the wonderfulness of the
human mind? It also, more importantly, means something very fundamental was clearly missing
from Deep Blue and these expert system approaches to AI. Because even though Deep Blue was a
pinnacle of AI at the time, it did not seem intelligent. And what was missing was the
ability to learn new things. So, for example, it was crazy that Deep Blue could play chess
to world champion level, but it couldn't even play tic-tac-toe. You'd have to reprogram it;
nothing in the system would allow it to play tic-tac-toe. That's odd, and very different
from a human grandmaster, who could obviously
play a simpler game trivially.
And then also it was not general,
in the way that the human mind is.
And that's what I took away from that match: those are
the hallmarks of intelligence, and
they were needed if we wanted to crack AI.
And go a little bit into the deep learning,
which obviously is part of the reason why DeepMind
was so named. Part of what I think
was seen as a completely contrarian
hypothesis that you guys played out, with self-play
and this kind of learning system, was that this learning
approach was the right way to generate
these significant systems.
So say a little bit about having the hypothesis,
what the trek through the desert looked like,
and then what finding the Nile ended up with.
Yes. Well, look, of course, we started DeepMind in 2010
before anyone was working on this in industry,
and there was barely any work on it in academia.
And we partially named the company DeepMind,
the deep part, because of deep learning.
It was also a nod to Deep Thought in
The Hitchhiker's Guide to the Galaxy, and to Deep Blue and other AI things. But it was mostly around the idea
that we were betting on these learning techniques, deep learning and hierarchical neural networks. They
had just been invented in seminal work by Geoff Hinton and colleagues in 2006, so it was very,
very new. And reinforcement learning, which has always
been a specialty of DeepMind, and the idea of learning
from trial and error, learning from your experience,
and then making plans and acting in the world.
And we combined those two things, really.
We sort of pioneered doing that, and we
called it deep reinforcement learning: the deep learning to build a model of the environment, or
of whatever you were doing, in this case a game, and then the reinforcement learning to do the
planning and the acting, so you can build agent systems that
accomplish goals, which in the case of games means maximizing the score and winning the game.
And we felt that that was actually the entirety of what's needed for intelligence.
The reason that we were pretty confident about that is actually from using the brain as an
example.
Basically, those are the two major components of how the brain works.
The brain is a neural network.
It's a pattern matching and structure finding system,
but then it also has reinforcement learning and this idea of planning and learning from
trial and error and trying to maximize reward, which in the human brain and the
animal brain, the mammalian brain, is implemented by the dopamine system, a form of reinforcement
learning called TD learning. That gave us confidence that if we pushed hard enough in this direction, even though no one was really doing that, eventually this
should work, right? Because we have the existence proof of the human mind. And of course, that's why
I also studied neuroscience, because when you're in the desert, like you say, you need any source
of water or any evidence that you might get out of the desert. Even a mirage in the distance
is a useful thing to understand,
in terms of giving you some direction
when you're in the midst of that desert.
And of course, AI was itself in the midst of that
because several times this had failed.
The expert system approach basically had reached a ceiling.
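The TD learning Demis mentions can be sketched in a few lines. This is a hedged, minimal illustration rather than anything from DeepMind's actual systems: tabular TD(0) value learning on a hypothetical five-state random walk, where the TD error, the gap between predicted and observed value, drives each update, which is the role the dopamine analogy refers to.

```python
import random

# Minimal TD(0) value learning on a toy 5-state chain (illustrative only).
# States 0..4; stepping right off state 4 ends the episode with reward 1.0,
# stepping left off state 0 ends it with no reward.
N_STATES = 5
ALPHA, GAMMA = 0.1, 0.9    # learning rate and discount (illustrative choices)
V = [0.0] * N_STATES       # value estimates, initialised to zero

def step(state):
    """Random walk: move left or right; reward only on exiting the right end."""
    nxt = state + random.choice([-1, 1])
    if nxt >= N_STATES:
        return None, 1.0   # terminal, with reward
    if nxt < 0:
        return None, 0.0   # terminal, no reward
    return nxt, 0.0

random.seed(0)
for _ in range(5000):                      # many episodes of trial and error
    s = N_STATES // 2                      # start in the middle
    while s is not None:
        s2, r = step(s)
        target = r + (GAMMA * V[s2] if s2 is not None else 0.0)
        V[s] += ALPHA * (target - V[s])    # the TD error drives the update
        s = s2

print([round(v, 2) for v in V])
```

After a few thousand episodes, the value estimates rise toward the rewarding end of the chain, purely from experience, which is the trial-and-error value learning being described.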
I could easily hog the entire interview,
so I'm trying not to. One of the things the learning system obviously ended up doing was solving what was previously considered an insoluble problem.
There were even people who thought that classical computational techniques couldn't solve Go. But with the classic Move 37, it demonstrated originality,
creativity, that was beyond, you know, the thousands of years of Go play and books and the
hundreds of years of very serious play. What was that moment of Move 37 like for understanding
where AI is, and what do you think the next Move 37 is?
Well, look, the reason Go was considered
to be, and ended up being, so much harder than chess, it took another 20 years, even for us with AlphaGo,
is that all the approaches that had been taken with chess, these expert system approaches, had failed with Go, right?
They basically couldn't even beat a professional, let alone a world champion.
And there were two main reasons.
One is that the complexity of Go is so enormous.
One way to measure that is there are 10 to the power 170 possible
positions, right?
Far more than atoms in the universe.
There's no way you can brute force a solution
to Go, right? It's impossible. But even harder than that is that it's such a beautiful, esoteric,
elegant game. It's sort of considered art, an art form in Asia, really, right? And it's because
it's both beautiful aesthetically, but also it's all about patterns rather than sort of brute calculation, which
chess is more about. And so even the best players in the world can't really describe
to you very clearly what are the heuristics they're using. They just kind of intuitively
feel the right moves, right? If you ask them, why did you play
this move, they'll sometimes just say, well, it felt right. And then it turns out that,
for a brilliant player, their intuition is brilliant and fantastic.
And it's an amazingly beautiful and effective move. But that's very difficult then to encapsulate
in a set of heuristics and rules to direct how a machine should play Go. And so that's
why all of these kinds of Deep Blue methods didn't work. Now, we got around that by having the system learn for
itself what are good patterns, what are good moves, what are good motifs and approaches,
and which positions are valuable, with a high probability of winning. So it learned that for
itself through experience, through seeing millions of games and playing millions of games against itself.
So that's how we got AlphaGo to be better than world champion level.
But the additional exciting thing about that is that it means those kinds of systems can
actually go beyond what we as the programmers or the system designers know how to do.
No expert system can do that because of course it's strictly limited by what we already know
and can describe to the machine.
But these systems can learn for themselves.
And that's what resulted in Move 37 in Game 2 of the famous
challenge match we had against Lee Sedol in Seoul in 2016.
And that was a truly creative move.
Go has been played for thousands of years. It's the oldest game humans have invented,
and it's the most complex game. And it's been played professionally for hundreds of years in
places like Japan. And even still, even despite all of that exploration by brilliant human players,
this move 37 was something
never seen before.
And actually, worse than that, it was thought to be a terrible strategy.
In fact, if you go and watch the documentary, which I recommend, it's on YouTube now, of
AlphaGo, you'll see the professional commentators nearly fell off their chairs when they saw
Move 37 because they thought it was a mistake.
They thought the computer operator, Aja, had misclicked on the computer because it was so unthinkable
that someone would play that. Then, of course, in the end, it turned out 100 moves later,
that move 37, the stone, the piece that was put down on the board, was in exactly the
right place to be decisive for the whole game. Now that game and that move are studied
as a great classic of the history of Go. Of course, what was even more exciting is that
that's exactly what we hoped these systems would do, because the whole point, my whole
motivation my whole life of working on AI, was to use AI to accelerate scientific discovery.
And it's those kinds of new innovations, albeit in a game, that we were looking for from
our systems.
And, you know, that I think is an awesome rendition of kind of why it is these learning systems
are, you know, even now doing original discovery. What do you think the next Move 37 might be for kind of opening
our minds to the ways that AI can add a whole lot to the quality of human
thought, human existence, human science?
Yeah. Well, look, I think there'll be a lot of move 37s in almost every area of human endeavor.
Of course, the thing I've been focusing on since then is mostly how we can apply
those types of AI techniques, those general learning techniques,
to science.
Big areas of science, I call them root node problems.
Problems where if you think of the tree of all knowledge that's out there in the universe, can you unlock some root nodes that unlock entire branches or
new avenues of discovery that people can build on afterwards? For us, protein folding and
AlphaFold was one of those. It was always top of my list. I have a mental list of all these types
of problems that I've come across throughout my life and just being genuinely interested in all areas of science.
Sort of thinking through which ones would both be hugely impactful
but also suitable for these types of techniques.
I think we're going to see a new golden era of these types of new strategies, new ideas in very
important areas of human endeavor.
I would say one thing to say though is that we haven't fully cracked creativity yet.
I don't want to claim that.
I often describe three levels of creativity, and I think AI is capable of the first two. So the first one would be interpolation.
So you give it a million pictures of cats, an AI system, a million pictures of cats,
and you say, create me a prototypical cat. And it will just average all the million cats'
pictures that it's seen. And that prototypical one won't be in the training set. So it will
be a unique cat. But that's not very interesting from a creative point of view. It's just an averaging. But the second thing would be what
I call extrapolation. So that's more like AlphaGo, where you've played 10 million games of Go,
you've looked at a few million human games of Go, but then you come up with, you extrapolate from
what's known to a new strategy never seen before, like move 37. Okay, so that's very valuable
already. I think that is true creativity. But then there's a third level, which I call invention,
or out-of-the-box thinking, which is: not only can you come up with a Move 37, but could you
have invented Go? Or another measure I like to use is, if we went back to the time of Einstein in the
early 1900s, could an AI system actually come up with general relativity with the same information
that Einstein had at the time?
Clearly, today, the answer is no to those things.
It can't invent a game as great as Go, and it wouldn't be able to invent general relativity just from
the information that Einstein had at the time.
And so there's still something missing from our systems to get true out-of-the-box thinking.
But I think it will come, but we just don't have it yet.
I think so many people outside of the AI realm would be surprised that it sort of all starts with gaming,
but that's sort of gospel for what you're doing.
It's like, that's how you created these systems.
And so switching gears from board games to video games,
can you give us just, like, the elevator pitch explanation
for what exactly makes an AI that can play StarCraft II,
like AlphaStar, so much more advanced and fascinating than
the one that can play chess or Go?
Yeah, with AlphaGo, we sort of cracked the pinnacle of board games, right? So Go was
always considered the Mount Everest, if you like, of games AI for board games. But there
are even more complex games by some measures if you take on board the most complex strategy games that you can play on
computers. StarCraft II is acknowledged to be the classic of the genre of real-time strategy
games. It's a very complex game. You've got to build up your base and your units and other
things. Every game is different. The gameplay is very fluid and you've got to move
many units around in real time. The way we cracked that
was to add this additional level in of a league of agents competing against each other, all seeded
with slightly different initial strategies. Then you get a survival of the fittest. You have a
tournament between them all, so it's a multi-agent setup now, and the strategies that win out in that tournament go to the next, you know, the next epoch, and then you generate some
other new strategies around that and you keep doing that for many generations.
You're kind of both having this idea of self-play that we had in AlphaGo, but
you're adding in this multi-agent competitive, almost evolutionary dynamic
in there, and then eventually you get an agent, or a set of agents, that are
the Nash distribution of agents: no other strategy dominates them, but they dominate the largest number
of other strategies. Then you have this Nash equilibrium, and then you pick out the top agents
from that. That succeeded very well with this type of very open-ended kind of gameplay.
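The league idea can be sketched as a toy loop. Everything here is illustrative and not AlphaStar's actual setup: agents are reduced to a single "aggression" number, the payoff function is invented, and the selection rule simply keeps the strategies that win the most round-robin games, then reseeds the league with perturbed copies of the survivors.

```python
import random

# Toy sketch of a "league": a population of agents plays a round-robin
# tournament, the strategies that win most survive, and the next generation
# is seeded with slightly varied copies of them.
random.seed(1)

def play(a, b):
    """Hypothetical matchup: each agent is an 'aggression' level in [0, 1].
    Mildly aggressive play beats both very passive and very aggressive play."""
    score = lambda x: x * (1.0 - x)          # peaks at 0.5
    return 1 if score(a) >= score(b) else 0  # 1 means agent a wins

league = [random.random() for _ in range(8)]  # seed with varied strategies

for generation in range(20):
    # Round-robin tournament: every agent plays every other agent.
    wins = [sum(play(a, b) for b in league if b is not a) for a in league]
    # Survival of the fittest: keep the agents that dominate the most others.
    ranked = [a for _, a in sorted(zip(wins, league), reverse=True)]
    survivors = ranked[: len(league) // 2]
    # Re-seed the league: survivors plus perturbed copies of them.
    league = survivors + [min(1.0, max(0.0, a + random.gauss(0, 0.05)))
                          for a in survivors]

best = max(league, key=lambda a: a * (1.0 - a))
print(round(best, 2))
```

Over the generations the surviving strategies drift toward the robust middle of this invented payoff, which is the evolutionary, multi-agent dynamic being described, in miniature.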
So it's quite different from what you get with chess or Go, where the rules are very prescribed
and the pieces that you get are always the same. And it's sort of a very ordered game.
Something like StarCraft is much more chaotic. So it's sort of interesting to have to deal with that.
It has hidden information too. You can't see the whole map at once. You have to explore it. So it's not a perfect information game, which is another thing we
wanted our systems to be able to cope with, is partial information situations, which is actually
more like the real world, right? Very rarely in the real world do you actually have full information
about everything. Usually you only have partial information and then you have to infer everything else in order to come up with the right strategies.
Part of the game side of this is, I presume you've heard of this kind of theory
of Homo ludens, that we're game players.
Is that informing the kind of thinking about how games are both strategic, but also kind of a framing for, like, science acceleration,
a framing for kind of the serendipity of innovation?
In addition to the kind of the fitness function,
the kind of evolution of self-play,
the ability to scale compute,
are there other, deeper elements to the game-playing nature that allow this thinking about thinking?
Well, look, I'm glad you brought up Homo Ludens. It's a wonderful book, and it basically
argues that game playing is actually a fundamental part of being human, right? In many ways,
that's the act of play. What could be
more human than that? Then, of course, it leads into creativity, fun. All of these things
get built on top of that. I've always loved them as a way to practice and train your own mind
in situations that you might only ever get a handful of
times in real life, but they're usually very critical. What company to start, what deal
to make, things like that. So I think games is a way to practice those scenarios. And
if you take games seriously, then you can actually simulate a lot of the pressures one
would have in decision-making situations. And going back to earlier, that's why I think chess is such a great training ground for
kids to learn because it does teach them about all of these situations.
Of course, it's the same for AI systems too.
It was the perfect proving ground for our early AI system ideas, partly because they
were invented to be challenging and fun for
humans to play. Of course, there are different levels of gameplay. We could start with very
simple games like Atari games and then go all the way up to the most complex computer
games like StarCraft and continue to challenge our system. We were in the sweet spot of the
S-curve: not so easy that it's trivial,
nor so hard that you can't even see if you're making any progress. You want to be in that maximum
sort of part of the S-curve where you're making almost exponential progress. And we could keep
picking harder and harder games as our systems improved. And then the other nice feature
about games is because they're some kind of microcosm of the real world, they've usually been boiled down to very
clear objective functions, right?
So winning the game or maximizing the score is usually
the objective of a game.
And that's very easy to specify to a reinforcement learning
system or an agent based system.
So it's perfect for hill climbing against, right?
And measuring ELO scores,
ratings and exactly where you are. And then finally, of course, you can calibrate yourself
against the best human players. So you can sort of calibrate what your agents are doing
in their own tournaments. In the end, even with the StarCraft agent, we had to eventually
challenge a professional grandmaster at StarCraft to make sure that our systems
hadn't overfitted somehow to their own tournament strategies.
It needed to be grounded: oh, it can actually beat a genuine
human grandmaster StarCraft player.
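The Elo-style calibration he refers to can be written down compactly. This sketch uses the conventional chess formula, the 400-point scale and a K factor of 32; those constants are standard choices, not something specified in the conversation.

```python
# Minimal Elo rating sketch: expected score plus the post-game update rule.
def expected_score(r_a, r_b):
    """Expected score (win probability estimate) for player A against B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, score_a, k=32):
    """Return both players' new ratings; score_a is 1 (win), 0.5, or 0."""
    e_a = expected_score(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

# An underrated agent that keeps beating a 1600-rated opponent climbs past it:
agent, human = 1400.0, 1600.0
for _ in range(30):
    agent, human = update(agent, human, 1.0)   # agent wins every game
print(round(agent), round(human))
```

Note that the update is zero-sum: rating points flow from loser to winner, which is what makes the scale useful for calibrating agents in their own tournaments against external human benchmarks.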
The final thing is, of course, you can generate as much synthetic data as you want with games,
too, which is coming into vogue right now again with large language models: data limitations,
how many tokens are left in the world, and has it read everything
in the world.
Obviously for things like games, you can actually just play the system against itself and generate
lots more data from the right distribution.
Can you double click on that for a moment?
Like you said, it is in vogue to talk about: are we running out of data? Do we need synthetic data? Like, where do you stand on that issue?
Well, I've always been a huge proponent of simulations in AI. And, you know,
it's also interesting to think about what the real world is, right, in terms of a computational
system. And so I've always been involved with trying to build very realistic simulations of things.
Now of course that interacts with AI because you can have an AI that learns a simulator
of some real world system just by observing that system or all the data from that system.
I think the current debate is to do with the fact that these large foundation models now pretty much use
the whole internet.
And so then once you've tried to learn from those, what's left?
That's all the language that's out there.
Of course, there's other modalities like video and audio.
I don't think we've exhausted all of that kind
of multimodal tokens.
But even that will reach some limit.
So then the question comes up of, like,
can you generate synthetic data?
And I think that's why you're seeing quite a lot of progress
with maths and coding, because in those domains,
it's quite easy to generate synthetic data.
The problem with synthetic data is,
are you creating data that is from the right distribution,
the actual distribution, right?
Does it mimic the kind of real distribution?
And also, are you generating data that's correct, right? And, of course, for things like maths,
for coding, and for things like gaming, you can actually test the final data and verify
if it's correct, right, before you feed it in as input into the training data
for a new system.
Certain areas are very amenable to this. In fact, it turns out that in the more abstract areas of human
thinking, where you can verify and prove that the data is correct, that unlocks the ability to create a lot of synthetic data.
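The generate-then-verify loop he describes for maths and code can be sketched as follows. The "model" here is a stand-in, a deliberately noisy arithmetic answerer invented for illustration; the point is that a verifier filters the proposals, so every example that reaches the training set is correct by construction.

```python
import random

# Toy sketch of verified synthetic data: propose candidate training examples,
# keep only those an exact checker can confirm. Here the generator sometimes
# produces wrong answers, simulating model mistakes.
random.seed(42)

def propose_example():
    """Hypothetical generator: a question plus a sometimes-wrong answer."""
    a, b = random.randint(1, 99), random.randint(1, 99)
    answer = a + b
    if random.random() < 0.3:              # simulate a model mistake
        answer += random.choice([-1, 1])
    return f"{a} + {b}", answer

def verify(question, answer):
    """Checker: re-derive the ground truth and compare."""
    return eval(question) == answer        # safe here: we built the string

dataset = []
while len(dataset) < 100:                  # keep only verified pairs
    q, ans = propose_example()
    if verify(q, ans):
        dataset.append((q, ans))

print(len(dataset), dataset[0])
```

The same shape applies wherever a cheap, exact verifier exists, which is why maths, code, and games are the domains where synthetic data is making the fastest progress.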
So, one of the things, in addition to the frequent discussion around data and how we get more of it: one of the questions is, in order to do AI, is it important to actually have it embedded in the world?
Yeah. Well, interestingly, if we talked about this five years ago,
or certainly 10 years ago,
I would have said that some real-world experience,
maybe through robotics,
usually when we talk about embodied intelligence,
we're meaning robotics,
but it could also be a very accurate simulator, right? Like some kind of ultra realistic game
environment, would be needed to fully understand, say, the physics of the world around you,
right? And the physical context around you. And there's actually a whole branch of neuroscience
that is predicated on this, called action in perception. This is the idea that one can't actually fully perceive the world unless you can also act in it. The kind of argument goes: how can you really understand the concept of the weight of something, for example, unless you can pick things up and compare them with each other, and from that get this idea of weight? Can you really
get that notion just by looking at things? It seems hard, certainly for humans. I think
you need to act in the world. This is the idea that acting in the world is part of your
learning. You're kind of like an active learner. In fact, reinforcement learning is like that
because the decisions you make give you new experiences, but those experiences
depend on the actions you took, but also those are the experiences that you'll then subsequently
learn from. In a sense, reinforcement learning systems are involved in their own learning
process because they're active learners. I think you can make a good argument that that's
also required in the physical world.
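The active-learning loop described here, where a system's experiences depend on the actions it took and then become its training data, can be sketched as a toy reinforcement-learning example (the bandit setup and all numbers are invented purely for illustration):

```python
import random

# Toy multi-armed bandit: an active learner's training data depends on
# the actions it chooses, so it participates in its own learning process.
true_means = [0.2, 0.5, 0.8]   # hidden reward probability of each arm
estimates = [0.0, 0.0, 0.0]    # the agent's learned value estimates
counts = [0, 0, 0]             # how often each arm has been pulled

random.seed(0)
for step in range(2000):
    # Epsilon-greedy: mostly exploit current beliefs, sometimes explore.
    if random.random() < 0.1:
        arm = random.randrange(3)
    else:
        arm = max(range(3), key=lambda a: estimates[a])
    # The experience (reward) observed depends on the action taken.
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    # Incremental mean update: learn only from self-generated experience.
    estimates[arm] += (reward - estimates[arm]) / counts[arm]

print(estimates)  # estimates concentrate near the true means of the arms it sampled
```

The point of the sketch is that the data distribution the learner sees is not fixed in advance: which arm it pulls determines which rewards it ever observes.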
Now it turns out, I'm not sure I believe that anymore because now with our systems, especially
our video models. If you've seen Veo 2, our latest video model, completely state of the art, which we released late last year, it's kind of shocked even me, even though we're building this thing, that it can basically figure out the physics of the world just by watching a lot of YouTube videos. There's a sort of funny Turing test, in inverted commas, of video models, which is: can you chop a tomato? Can you show a
video of a knife chopping a tomato with the fingers and everything in the right place and the tomato doesn't magically spring back together or the knife
goes through the tomato without cutting it, et cetera? And Veo can do it. And if you think
through the complexity of the physics to understand what you've got to keep consistent and so on,
it's pretty amazing. It's hard to argue that it doesn't understand something about physics and the physics of the world.
And it's done it without acting in the world,
and certainly not acting as a robot in the world.
So it's not clear to me there is a limit now with just
sort of passive perception.
Now, the interesting thing is that I
think this has huge consequences for robots as an
embodied intelligence application, because the types of models we've built, Gemini and also now Veo, and we'll be combining those things together at some point in the future, we've always built Gemini, our foundation model, to be multimodal from the beginning. The reason we did that, and we still lead on all the multimodal benchmarks, is twofold. One is we have a vision for this idea of a
universal digital assistant, an assistant that goes around with you on the digital devices,
but also in the real world, maybe on your phone or a glasses device and actually helps you in the real world, like recommend things to
you, help you navigate around, help with physical things in the world like cooking, stuff like
that.
For that to work, you obviously need to understand the context that you're in.
It's not just the language I'm typing into a chat bot.
You actually have to understand the 3D world I'm living in, right? I think to be a really good assistant you need to do that.
And the second thing is, of course, that's exactly what you need for robotics as well.
We recently released our first big Gemini Robotics work, which has caused a bit of a stir, and that's the beginning of showcasing what we can do with these multimodal models that do understand the physics of the world,
with a little bit of robotics fine-tuning on top to do with the actions, the motor actions,
and the planning a robot needs to do.
It looks like it's going to work.
Actually now, I think these general models are actually going to transfer to the embodied
robotic setting without too much extra special
casing or extra data or extra effort, which is probably not
what most people, even the top roboticists,
would have predicted five years ago.
I mean, that's wild.
And thinking about benchmarks and what
we're going to need these digital assistants to do,
when we look under the hood of these big AI models,
well, some people would say it's attention. One of the trade-offs is thinking time versus output quality.
We need them to be fast, but of course,
we need them to be accurate.
And so talk about what is that trade-off
and how is that going in the world right now?
Well, look, of course, we pioneered all that area
of thinking systems because
that's what our original gaming systems all did, right?
Go, AlphaGo, but actually most famously AlphaZero, which was our follow-up system that could
play any two-player game.
And there you always have to think about your time budget, the compute budget you've got to actually do the planning with, right?
So the model you can pre-train, just like we do with our foundation models today.
You can play millions of games offline, and then you have your model of chess or your
model of Go, whatever it is.
But at test time, at runtime, you've only got one minute to think about your move.
One minute times how many computers you've got running.
That's still a limited compute budget. So what's very interesting today is there's trade-off
between do you use a more expensive larger base model,
foundation model, right?
So in our case, we have different size names
like Gemini Flash or Pro or even bigger, which is Ultra.
But those models are more costly to run.
So they take longer to run.
But they're more
accurate and they're more capable. So you can run a bigger model with a shorter number
of planning steps, or you can run a very efficient smaller model that's slightly less powerful,
but you can run it for many more steps. Currently, what we're finding is it's roughly about equal,
but of course what we want to find is the Pareto frontier of that, right? Like actually the exact right trade-off of the size of the model and the expense of running that model versus the amount of thinking time and thinking steps that you're able to do per unit of compute time. And I think that's actually fairly cutting-edge research right now that I think all the leading
labs are probably experimenting on.
I think there's not a clear answer to that yet.
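The trade-off he describes, a bigger model taking fewer thinking steps versus a smaller model taking many, can be made concrete with a toy calculation. The costs and per-step quality numbers below are invented for illustration, not Gemini's actual characteristics:

```python
# Toy version of the thinking-time trade-off: under a fixed compute budget,
# a bigger model affords fewer planning steps, a smaller one affords more.
BUDGET = 60.0  # seconds of compute available per decision

configs = {
    "large":  {"cost_per_step": 6.0, "step_quality": 0.30},
    "medium": {"cost_per_step": 2.0, "step_quality": 0.15},
    "small":  {"cost_per_step": 0.5, "step_quality": 0.05},
}

def expected_quality(cfg):
    """Steps affordable under the budget, plus a diminishing-returns quality score."""
    steps = int(BUDGET // cfg["cost_per_step"])
    q = 0.0
    for _ in range(steps):
        q += (1.0 - q) * cfg["step_quality"]  # each step closes part of the remaining gap
    return steps, q

for name, cfg in configs.items():
    steps, q = expected_quality(cfg)
    print(f"{name}: {steps} steps, quality {q:.3f}")
```

With different assumed numbers a different point on the curve wins, which is exactly why mapping the Pareto frontier of model size versus thinking steps is an empirical research question.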
All the major labs, DeepMind, others, are all working intensely on coding assistants.
There's a number of reasons.
Everything from like, A, it's one of the things that accelerates productivity across the whole
front. It has a kind of good fitness function.
It's also, of course, one of the ways that, you know, everyone is going to enhance their productivity: having a software, you know, kind of copilot agent helping.
There's just a ton of reasons.
Now one of the things that gets interesting here is as you're building these, you know,
obviously there's a tendency to start with these computer languages that
have been designed for humans.
What would be computer languages that would be designed for AIs or an agentic world or
designed for this hybrid process of a human plus an AI?
Is that a good world to start looking at those kind of computer languages?
How would it change our theory of computation, linguistics, et cetera?
I think we are entering a new era in coding, which is going to be
very interesting. And, you know, as you say, all the leading labs are pushing on
this frontier for many reasons. It's easy to create synthetic data, so
that's another reason that everyone's pushing on this vector.
And I think we're going to move into a world where, you know, sometimes it's called
vibe coding, where you're basically coding with natural language, really. Right. And
then and we've seen this before with computers, right. I remember when I first started programming,
you know, in the 80s, we were doing assembler. And then of course, you know, that seems crazy
now, like, why would you do machine code? Then you get C, and then you get Python, and so on.
And really, one could see it as the natural evolution
of going higher and higher up the abstraction stack
of programming languages and leaving more and more
of the lower level implementation details
to the compiler, in a sense.
And now, one could just view this
as the natural final step. Well, we just use natural language,
and then everything is super high-level programming language.
I think eventually that's maybe what we'll get to. The exciting thing there is that, of course,
it will make coding accessible to a whole new range of people, creatives, right? You know, designers, game designers, app writers, who normally would not have been able to implement their ideas without the help of, you know, teams of programmers.
So that's going to be pretty exciting, I think, from a creativity point of view.
But it may also be very good, certainly in the next few years, for coders as well. I think this in general with
these AI tools is, I think that the people that are going to get most benefit out of
them initially will be the experts in that area who also know how to use these tools
in precisely the right way, whether that's prompting or interfacing with your existing
code base. There's going to be this sort of interim period
where I think the current experts who embrace these new tools,
whether that's filmmakers, game designers, or coders,
are going to be superhuman in terms of what they're able to do.
I see that with some film directors and film designer friends of mine who are able to create pitch decks, for
example, for new film ideas in a day on their own.
But it's a very high-quality pitch deck that they can use to pitch for a $10 million budget.
Normally they would have had to spend a few tens of thousands of dollars just to get to
that pitch deck, which is a huge risk for them.
So it becomes, I think there's going to be a whole new incredible set of opportunities.
And then there's the question of, if you think about the creative arts, whether there'll be new ways of working, much more fluid.
Instead of doing Adobe Photoshop or something, you're actually co-creating this thing with this
fluid responsive
tool. That could feel more like Minority Report or something, I imagine, with the interface
and there's this thing swirling around you. It will require people to get used to a very
new workflow to take maximum advantage of that. But I think when they do,
it will be probably incredible for those people.
They'll be like 10X more productive.
I want to go back to the world of multimodal
that we were talking about before with robots in the real world.
Right now, most AI doesn't need to be multimodal in real time
because the Internet is not multimodal.
And for our listeners, that means absorbing many types of input, voice, text, vision, at once.
And so can you go deeper in what you think the benefits of truly real time multimodal
AI will be and like, what are the challenges to get to that point?
I think first of all, we live in a multimodal world, right?
We have our five senses, and that's what makes us human.
If we want our systems to be brilliant tools or fantastic assistants, I think in the end,
they're going to have to understand the world, the spatial temporal world that we live in,
not just our linguistic, maths world, right?
Abstract thinking world. I think that they'll need to be
able to act in and plan in and process things in the real world and understand the real world.
I think that the potential for robotics is huge. I don't think it's had its ChatGPT or its AlphaFold moment yet, as happened in language and science, right? Or its AlphaGo moment.
I think that's to come, but I think we're close. And as we talked about before, I think in order
for that to happen, I think that the shortest path I see that happening on now is these general
multimodal models being eventually good enough, and maybe we're not very far away from that,
to sort of install on a robot, perhaps a humanoid robot
with the cameras. Now there's additional challenges of you've got to fit it locally or maybe on the
local chips to have the latency fast enough and so on. But as we all know, just wait a couple of
years and those systems that are state of the art today will fit on a little mobile chip tomorrow.
So I think it's very exciting, multimodal from that point of view,
robotics, assistance. And then finally, I think also for creativity, I think we're the first model
in the world, Gemini 2.0, that you can try now in AI Studio, that allows native image generation.
So it's not calling a separate program, a separate model, in our case Imagen 3, which you can try separately, but actually Gemini itself natively coming up with images in the chat flow.
And I think people seem to be really enjoying using that.
So it's sort of like you're now talking to a multimodal chatbot, right?
And so you can get it to express emotions in pictures or you can give it a
picture and then tell it to modify it and then continue to work on it with word descriptions.
Can you remove that background? Can you do this? So this goes back to the earlier thing
we said about programming or any of these creative things in a new workflow. I think
we're just seeing the glimpse of that if you try out this new Gemini 2 experimental model of how that might look in image creation. And that's just the
beginning. Of course, it will work with video and coding and all sorts of things.
So, in the land of the real world and multimodal, one of the things people frequently speculate about is the geography of AI work.
Obviously, in the US, we intensely track everything that's happening on the West Coast.
We also intensely track DeepMind, and then somewhat less so, Mistral and others.
What's some of the stuff that's really key for the world to understand that's coming out of Europe?
What's the benefit of having there be multiple major centers of innovation and invention,
you know, not just within the West Coast, but also obviously DeepMind in London and Mistral in Paris and others.
And what are some of the things to, for people to pay attention to, why it's important and what's happening,
especially within the UK and European AI ecosystem?
We started DeepMind in London
and still headquartered here for several reasons.
I mean, this is where I grew up, that's what I know.
It's where I had all the contacts that I had.
But the competitive reasons were that we felt that the talent coming out of universities in the UK and in Europe was the equivalent of the top US ones.
You know, Cambridge, my alma mater, and Oxford, they're up there with the MITs and Harvards and the Ivy Leagues, right? I think, you know, they're always in the top 10 together in the world university rankings.
But if you, this is certainly true in 2010, if you were coming, say you had a PhD in physics
out of Cambridge and you didn't want to work in finance at a hedge fund in the city, but
you wanted to stay in the UK and be intellectually challenged, there were not that many options
for you, right?
There are not that many deep tech startups.
So we were the first really, and to prove that could be done. And
actually we were a big draw for the whole of Europe. So we got the best people from
the technical universities in Munich and in Switzerland and so on. And for a long while,
that was a huge competitive advantage. And also salaries were cheaper here than on the West Coast, and you weren't competing against the big incumbents, right? And also it was conducive. The other reason I chose to do that was I knew that AGI,
which was our plan from the beginning, you know,
solve intelligence and then use it to solve everything else.
That was how we articulated our mission statement.
And I still like that framing of it.
It was a 20 year mission.
And if you're on a 20-year mission, and we're now 15 years in, and I think we're sort of on track, unbelievably, which is strange for any 20-year mission, you don't want to be too distracted along the way in a deep technology, deep scientific mission.
One of the issues I find with Silicon Valley is lots of benefits, obviously, contacts and
support systems and funding and amazing things and the amount of talent there, the density of talent.
But it is quite distracting, I feel.
Everyone and their dog is trying to do a startup that they think is going to change the world,
but it's just a photo app or something.
And then the cafes are filled with this.
Of course, it leads to some great things, but it's also a lot of noise if one actually
wants to commit to a long-term
mission that you think is the most important thing ever, and you don't want you and your staff to be too distracted, like, oh, maybe I could make a hundred million if I jumped and quickly did this gaming app or something, right? And I think that's sort of the milieu that you're in, in the Valley, at least back then; maybe this is less true now.
There's probably more mission-focused startups now. But I also wanted to prove it could be done
elsewhere. And then the final reason I think it's important is that AI is going to affect
the whole world. It's going to affect every industry. It's going to affect every country.
It's going to be the most transformative technology ever, in my opinion.
So if that's true, and it's going to be like electricity or fire, more impactful than even
the internet or mobile, then I think it's important that the whole world participates
in its design, with the different value systems that are out there, the philosophies that are good philosophies, from democratic values, Western Europe, the US. I think it's important that it's not just a hundred square miles of a patch of California.
I do actually think it's important that we get these other inputs, the broader inputs,
not just geographically, but also, and I know you agree with this, Reed, different subjects, philosophy, social sciences,
economists, academia, civil society, not just the tech companies, not just the scientists
involved in deciding how this gets built and what it gets used for. I've always felt that very strongly from the beginning.
And I think having some European involvement and some UK
involvement at the top table of the innovation is a good thing.
So Demis, one of the areas of AI: when anyone asks me, like, hey, Aria, I know you're interested in AI, but, well, it can write my emails, why is it so special? I just say, no, think about what it can do in medicine.
I always talk about AlphaFold. I tell them about what Reid is doing. Like, I'm just so excited for those breakthroughs. Can you give us just a little bit? You had this seminal breakthrough with AlphaFold, and what is it going to do for the future of medicine?
I've always felt that, like, what are the most important things AI can be used for?
I think there are two.
One is human health.
That's number one, trying to solve and cure terrible diseases.
Then number two is to help with energy, sustainability, and climate, the planet's health, let's call
it.
There's human health, and then there's a planet's health.
Those are the two areas that we have focused on in our science group, which I think is fairly unique amongst the AI labs actually,
in terms of how much we pushed that from the beginning. And then protein folding specifically was this canonical problem for me. I sort of came across it when I was an undergrad in Cambridge 30 years ago and it's always stuck with me as this fantastic puzzle that would
unlock so many possibilities. The structure of proteins, everything in life depends on
proteins and we need to understand the structure so we know their function. If we know the
function then we can understand what goes wrong in disease and we can design drugs and
molecules that will bind to the right part of the surface of the protein if you know
the 3D structure.
So it's a fascinating problem.
It goes to all of the computational things we were discussing earlier as well.
Can you see through this forest of possibilities all these different ways a protein could fold?
Some people estimate, Levinthal very famously in the 1960s estimated, that an average protein can fold in 10 to the power 300 possible ways. How do you enumerate those astronomical possibilities? Yet it is possible with these learning systems. That's
what we did with AlphaFold. Then we spun out a company, Isomorphic, and I know Reid's very
interested in this area too with his new company, to see if we can reduce the time it takes to discover a protein structure. It used to take a PhD student their entire PhD, as a rule of thumb, to discover one protein structure.
So four or five years. And there's 200 million proteins known to science and we folded them
all in one year. So we did a billion years of PhD time in one year is another way you
can think of it. And then gave it to the world freely to use.
And two million researchers around the world have used it.
And we spun out a new company, Isomorphic,
to try and go further downstream now
and develop the drugs needed and try and reduce that time.
I mean, it's just amazing.
I mean, Demis, there's a reason they give you the Nobel Prize.
Thank you so much for all of your work in this area.
It's truly amazing.
And now to rapid fire.
Is there a movie, song or book,
that fills you with optimism for the future?
There's lots of movies that I've
watched that have been super inspiring for me.
Things like Blade Runner, which is probably my favorite sci-fi movie, but maybe it's not that optimistic.
So, if you want an optimistic thing, I would say the Culture series by Iain Banks.
I think that's the best depiction of a post-AGI universe, where you've basically got societies of AIs and humans, and kind of alien species actually, and sort of maximum human flourishing across the galaxy. That's the kind of amazing, compelling future that I would hope for humanity.
What is a question that you wish people asked you more often?
I sort of often wonder why people don't discuss a lot more, including with me, some of the really fundamental properties of reality, the things that actually drove me in the beginning, when I was a kid, to think about building AI as this sort of ultimate tool for science.
So for example, you know, I don't understand why people don't worry more about what is time, what is gravity, or basically
the fundamental fabric of reality, which is sort of staring us in the face all the time,
all these very obvious things that impact us all the time, and we don't really have
any idea how it works. I don't know why it doesn't trouble people more.
It troubles me.
And, you know, I'd love to have more debates with people about those things, but actually most people seem to sort of shy away from those topics.
Where do you see progress or momentum outside of your industry that inspires you?
That's a tough one, because AI is so general. It's hard to say what industry is even outside of the AI industry.
I'm not sure there's many.
Maybe the progress going on in quantum is interesting.
I still believe AI is going to get built first and then will maybe help us perfect our quantum systems, but I have ongoing bets with some of my quantum friends, like Hartmut Neven, that they're going to build quantum systems first and that will help us accelerate AI.
So I always keep a close eye on the advances going on with quantum computing systems.
Final question. Can you leave us with a final thought on what is possible over the next 15 years if everything breaks humanity's way, and what's the first step to get there?
Well, what I hope for in the next 10 to 15 years is for what we're doing in medicine to really produce new breakthroughs, and I think maybe in that time we can actually have a real crack at solving all disease, right? That's the mission of Isomorphic.
And I think with AlphaFold, we showed what the potential was: to do what I like to call science at digital speed. Why can't that also be applied to finding medicines? And so my hope is that in 10 or 15 years' time, we'll look back on the medicine we have today a bit like how we look back on medieval times and how we used to do medicine then.
And that would be, I think, the most incredible benefit we could imagine from AI.
Possible is produced by Wondermedia Network. It's hosted by Aria Finger and me, Reid Hoffman.
Our showrunner is Sean Young. Possible is produced by Katie Sanders, Edie Allard, Sarah Schleed,
Vanessa Handy, Aliyah Yates, Paloma Moreno Jimenez, and Malia Agudelo. Jenny Kaplan is our executive producer and editor.
Special thanks to Surya Yalamanchili,
Sayida Sepiyeva,
Fanasi Dilos, Ian Alice,
Greg Beato, Parth Patil, and Ben Relis.
And a big thanks to Leila Hajaj,
Alice Talbert, and Denise Owusu-Afriaye. hospitality, media, and the broader food system with its highly anticipated awards.