Big Technology Podcast - Google DeepMind CEO Demis Hassabis + Google Co-Founder Sergey Brin: Scaling AI, AGI Timeline, Simulation Theory
Episode Date: May 21, 2025
Demis Hassabis is the CEO of Google DeepMind. Sergey Brin is the co-founder of Google. The two leading tech executives join Alex Kantrowitz for a live interview at Google's I/O developer conference to discuss the frontiers of AI research. Tune in to hear their perspective on whether scaling is tapped out, how reasoning techniques have performed, what AGI actually means, the potential for an intelligence explosion, and much more. Tune in for a deep look into AI's cutting edge featuring two executives building it. --- Enjoying Big Technology Podcast? Please rate us five stars ⭐⭐⭐⭐⭐ in your podcast app of choice. Want a discount for Big Technology on Substack? Here's 25% off for the first year: https://www.bigtechnology.com/subscribe?coupon=0843016b Questions? Feedback? Write to: bigtechnologypodcast@gmail.com
Transcript
All right, buddy, we have an amazing crowd here today.
We're going to be live streaming this, so let's hear you.
Make some noise so everybody can hear that you're here.
Let's go.
I'm Alex Kantrowitz.
I'm the host of Big Technology Podcast, and I'm here to speak with you about the Frontiers of AI
with two amazing guests.
Demis, the CEO of DeepMind, is here.
Google DeepMind.
Good to see you, Demis.
Good to see you, too.
And we have a special guest, Sergey Brin, the co-founder of Google, is also here.
All right. So this is going to be fun. Let's start with the frontier models. Demis, this is for you.
With what we know today about frontier models, how much improvement is there left to be unlocked?
And why do you think so many smart people are saying that the gains are about to level off?
I think we're seeing incredible progress. We've all seen it today.
the amazing stuff we showed in the keynote.
So I think we're seeing incredible gains with the existing techniques, pushing them to the
limit.
But we're also inventing new things all the time as well.
But I think to get all the way to something like AGI may require one or two more
new breakthroughs.
And I think we have lots of promising ideas that we're cooking up and we hope to bring
into the main Gemini branch.
All right.
And so there's been this discussion about scale.
You know, does scale solve all problems, or does it not?
So, I want to ask you, in terms of the improvement that's available today, is scale still the star, or is it a supporting actor?
I think I've always been of the opinion you need both. You need to scale to the maximum the techniques that you know about.
You want to exploit them to the limit, whether that's data or compute scale. And at the same time, you want to spend a bunch of effort on what's coming next, maybe six months, a year down the line.
So you have the next innovation that might give a 10x leap,
in some way to kind of intersect with the scale.
So you want both, in my opinion.
But I don't know, Sergey, what do you think?
I mean, I agree it takes both.
You can have algorithmic improvements and simply compute improvements.
Better chips, more chips, more power, bigger data centers.
I think that historically, if you look at things like the n-body problem and
simulating gravitational bodies and things like that, as you plot it, the algorithmic advances
have actually beaten out the computational advances, even with Moore's law.
If I had to guess, I would say the algorithmic advances are probably going to be
even more significant than the computational advances.
But both of them are coming up now, so we're kind of getting the benefits of both.
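(For context, a minimal sketch of the kind of computation Sergey is alluding to: the naive direct-sum gravity update below is O(n²) in the number of bodies, while tree methods such as Barnes-Hut cut that to O(n log n), the sort of algorithmic advance that has historically outpaced raw hardware gains. This Python example is illustrative only; all names are our own.)

    import numpy as np

    def gravity_step(pos, vel, mass, dt=0.01, G=1.0, eps=1e-3):
        # Naive direct-sum n-body step: O(n^2) pairwise forces.
        # Tree codes (e.g., Barnes-Hut) approximate distant clusters
        # and reduce this to O(n log n) -- an algorithmic, not
        # hardware, speedup.
        acc = np.zeros_like(pos)
        for i in range(len(mass)):
            diff = pos - pos[i]  # vectors from body i to every body
            # Softening term eps avoids the singular self-interaction
            # (diff == 0 contributes nothing to the sum).
            dist3 = (np.sum(diff ** 2, axis=1) + eps ** 2) ** 1.5
            acc[i] = G * np.sum(mass[:, None] * diff / dist3[:, None], axis=0)
        vel = vel + acc * dt
        pos = pos + vel * dt
        return pos, vel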
And Demis, do you think the majority of your improvement is coming from
building bigger data centers and using more chips?
There's talk about how the world will be just wallpapered with data centers.
Is that your vision?
Well, no.
Look, I mean, we're definitely going to need a lot more data centers.
It's amazing that, you know, it still amazes me from a scientific point of view.
We turn sand into thinking machines.
It's pretty incredible.
But actually, it's not just for the training.
It's now we've got these models that everyone wants to use, you know.
And actually, we're seeing incredible demand for 2.5 Pro, and we're really excited about Flash,
how performant it is for the incredibly low cost.
I think the whole world's going to want to use these things.
And so we're going to need a lot of data centers for serving.
And also for inference time compute.
You saw deep think today, 2.5 Pro deep think.
The more time you give it, the better it will be.
And certain tasks, very high value, very difficult tasks,
it will be worth letting it think for a very long time.
And we're thinking about how to push that even further.
But again, that's going to require a lot of chips at runtime.
OK, so you brought up test time compute.
We've been about a year into this reasoning paradigm.
And you and I have spoken about it twice in the past as something that you might be able
to add on to traditional LLMs to get gains.
So I think this is a pretty good time for me to ask: what's happening?
Can you help us contextualize the magnitude of improvement we're seeing from reasoning?
No, we've always been big believers in what we're now calling this thinking paradigm.
If you go back to our very early work on things like AlphaGo and AlphaZero, our agent work
on playing games, they all had this type of attribute of a thinking system on top of
a model.
And actually you can quantify how much difference that makes if you look at a game like chess
or Go.
You know, we had versions of AlphaGo and AlphaZero with the thinking turned off, so it was
just the model telling you its first idea.
And, you know, it's not bad, it's maybe like master level, something like that.
But then if you turn the thinking on, it's way beyond world champion level.
You know, it's like a 600-Elo-plus difference between the two versions.
So you can see that in games, let alone for the real world, which is way more complicated,
and I think the gains will be potentially even bigger by adding this thinking type of paradigm on top.
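(As a rough illustration of the "thinking off" versus "thinking on" distinction Demis describes, here is a hedged Python sketch on a toy subtraction game. The real systems pair a learned policy/value network with Monte Carlo tree search; the stand-ins below, a noisy heuristic value and a small negamax search, are our own assumptions for illustration.)

    import random

    # Toy game: players alternately take 1-3 stones; whoever takes the
    # last stone wins. Positions with stones % 4 == 0 are lost for the
    # player to move.

    def legal_moves(stones):
        return [m for m in (1, 2, 3) if m <= stones]

    def model_value(stones):
        # Noisy stand-in for a learned value network, scored from the
        # point of view of the player about to move.
        return (-1.0 if stones % 4 == 0 else 1.0) + random.gauss(0, 0.8)

    def first_idea(stones):
        # "Thinking off": take the model's single first suggestion.
        return max(legal_moves(stones), key=lambda m: -model_value(stones - m))

    def search(stones, depth=6):
        # "Thinking on": negamax lookahead using the same noisy model at
        # the leaves; extra inference-time compute buys playing strength.
        if stones == 0:
            return -1.0, None  # previous player took the last stone and won
        if depth == 0:
            return model_value(stones), None
        best_v, best_m = float("-inf"), None
        for m in legal_moves(stones):
            v, _ = search(stones - m, depth - 1)
            if -v > best_v:
                best_v, best_m = -v, m
        return best_v, best_m

(With enough depth, the search plays essentially perfectly even though the underlying model is noisy, which is the flavor of the 600-Elo gap Demis cites.)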
Of course, the challenge is that your models, and I talked about this earlier in the talk,
need to be a kind of world model, and that's much harder than building a model of a simple game, of course.
And it has errors in it, and those can compound over
longer-term plans.
So, but I think we're making really good progress on all that, all those fronts.
Yeah, look, I mean, as Demis said, DeepMind really pioneered a lot of this reinforcement
learning work with what they did with AlphaGo and AlphaZero, as you mentioned.
It showed, as I recall, that it would take something like 5,000 times as much training to match what
you were able to do with, still a lot of training, plus the inference-time compute
that you were doing with Go.
So it's obviously a huge advantage.
And obviously, like most of us, we get some benefit by thinking before we speak.
And although...
Not always.
I always get reminded to do that.
But I think that the AIs obviously are much stronger once you add that capability.
And I think we're just at the tip of the iceberg right now.
in that sense.
It's been less than a year
that these models
have really been out.
Especially if you think about how,
during its thinking process,
an AI can also use a bunch of tools,
or even other AIs,
to improve what the final output is.
So I think it's going to be
an incredibly powerful paradigm.
Deep Think is very interesting.
I'm going to describe it,
and I hope I'm describing it right.
It's basically a bunch of
parallel reasoning processes
working and then checking each other,
so it's like reasoning on steroids.
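(Google hasn't published how Deep Think works internally, but the description above matches self-consistency-style schemes, so here is a hedged sketch under that assumption. sample_chain is a hypothetical function that calls a model at nonzero temperature and returns a (reasoning, answer) pair.)

    from collections import Counter
    from concurrent.futures import ThreadPoolExecutor

    def parallel_reasoning(question, sample_chain, n=16):
        # Sample many independent reasoning chains in parallel...
        with ThreadPoolExecutor(max_workers=n) as pool:
            chains = list(pool.map(sample_chain, [question] * n))
        # ...then let the chains "check each other" by voting on the
        # final answer; agreement doubles as a rough confidence score.
        votes = Counter(answer for _reasoning, answer in chains)
        answer, count = votes.most_common(1)[0]
        return answer, count / n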
Now, Demis, you mentioned that the industry
needs a couple more advances to get to AGI.
Where would you put this type of mechanism?
Is this one of those that might get the industry closer?
I think so.
I think it's maybe part of one, shall I say.
And there are others, too, that we need to, you know,
maybe this can be part of improving reasoning,
where does true invention come from,
where, you know, you're not just solving a math conjecture,
you're actually proposing one or hypothesizing a new theory in physics.
You know, I think we don't have systems yet that can do that type of creativity.
I think they're coming.
And these types of paradigms might be helpful in that, things like thinking,
and then probably many other things.
I mean, I think we need a lot of advances on the accuracy of the world models that we're building.
I think you saw that with Veo, the potential of Veo 3, and how, it amazes me,
how it can intuit the physics of the light and the gravity.
As someone, you know, I used to work on computer games,
not just the AI, but also graphics engines in my early career.
And remember having to do all of this by hand, you know,
and program all of the lighting and the shaders and all of these things.
Incredibly complicated stuff we used to do in early games.
And now it's just intuiting it within the model.
It's pretty astounding.
I saw you shared an image of a frying pan with some onions and some oil.
Hope you all like that.
There was no subliminal messaging about that?
No, not really.
Not really.
Just a maybe a subtle message.
Okay.
So we've said the word, or the acronym, AGI a couple of times.
There's, I think, a movement within the AI world right now to say, let's not say AGI anymore.
The term is so overused as to be meaningless.
But Demis, it seems like you think it's important.
Why?
Yeah, I think it's very important, but, I mean, maybe I need to write something about this, also with Shane Legg,
our chief scientist, who was one of the people who invented the term 25 years back.
I think there's sort of two things that are getting a little bit conflated.
One is, like, what can a typical person do, an individual do,
and we can, you know, we're all very capable, but we can only do, however capable we are,
there's only a certain slice of things that one is expert in, right?
Or, you know, you could say, what can, like, 90% of humans do?
That's obviously going to be economically very important.
and I think from a product perspective also very important.
So it's a very important milestone.
So maybe we should say that's like, you know, typical human intelligence.
But what I'm interested in, and what I would call AGI,
is really a more theoretical construct,
which is what is the human brain as an architecture able to do, right?
And that's, the human brain is an important reference point
because it's the only evidence we have maybe in the universe
that general intelligence is possible.
And there, it would have to be able to,
you would have to show your system was capable of doing the range of things,
even the best humans in history were able to do
with the same brain architecture.
Not one brain, but the same brain architecture.
So what Einstein did, what Mozart was able to do,
what Marie Curie, and so on.
And that, it's clear to me today's systems don't have that.
And then the other reason why I think
the hype today on AGI is sort of overblown
is that our systems are not consistent enough
to be considered fully general yet.
They're quite general, so they can do, you know,
thousands of things.
You've seen many impressive things today.
But every one of us has experience with today's
chatbots and assistants: you can easily, within a few minutes, find some obvious flaw with them.
Some high school math thing that it doesn't solve, you know, some basic game it can't play.
It's not very difficult to find those holes in the system.
And for me, for something to be called AGI, it would need to be consistent, much more consistent across the board than it is today.
It should take, like, a couple of months for maybe a team of experts to find
a hole in it, an obvious hole in it.
Whereas, you know, today it takes an individual minutes
to find that.
Sergey, this is a good one for you.
Do you think that AGI is going to be reached by one company
and it's game over?
Or could you see Google having AGI, open AI having AGI,
Anthropic having AGI, China having AGI?
Wow, that's a great question.
I mean, I guess I would suppose that one company or country
or entity will reach AGI first.
Now, it is a little bit of a, you know, kind of a spectrum.
It's not like a completely precise thing, so it's conceivable that there will be more than
one roughly in that range at the same time.
After that what happens, I mean, I think it's very hard to foresee, but you could certainly
imagine there's going to be multiple entities that come through.
And in our AI space, you know, we've seen whatever, when we make a certain
kind of advance, like other companies are quick to follow, and vice versa.
When other companies make certain advances, it's a kind of a constant leapfrog.
So I do think there's an inspiration element that you see, and that would probably encourage
more and more entities across that threshold.
Dennis, what do you think?
Well, I think it is important for the field to agree on a definition
of AGI, so maybe we should try and help that to coalesce.
Assuming there is one, you know, there probably will be some organizations that get there first.
And I think it's important that those first systems are built reliably and safely.
And I think after that, if that's the case, you know, we can imagine using them to shard off many systems that have safe architectures sort of provably built underneath them.
And then you could have, you know, personal AGIs and all sorts of things happening.
But, you know, as Sergey says, it's pretty difficult to predict,
sort of to see beyond the event horizon, what that's going to be like.
Right, so we talked a little bit about the definition of AGI, and a lot of people have said AGI must be about knowledge, right, the intelligence of the brain.
What about the intelligence of the heart? Demis, briefly, does AI have to have emotion to be considered AGI? Can it have emotion?
I think it will need to understand emotion. I don't know if, I think it will be a sort of almost a design decision if we wanted to mimic emotions.
I think there's no, I don't see any reason why it couldn't in theory, but it might be different,
or it might be not necessary, or in fact not desirable, for them to have the sort of emotional
reactions that we do as humans.
So I think, again, it's a bit of an open question as we get closer to this AGI timeframe
and, you know, that sort of event, which I think is more on a five-to-ten-year time scale.
So I think we have a bit of time, not much time, but some time to research those kinds of
questions.
Thinking about how the time frame might be shrunk, I wonder if it's going to be the creation of
self-improving systems. And last week, I almost fell out of my chair reading this headline
about something called AlphaEvolve, which is an AI that helps design better algorithms
and even improve the way LLMs train. So, Demis, are you trying to cause an intelligence
explosion?
No, not an uncontrolled one. Look, I think it's an interesting
first experiment, it's an amazing
system, a great team that's working
on that, where it's interesting now to
start pairing other types of
techniques, in this case evolutionary
programming techniques, with the
latest foundation models, which are getting
increasingly powerful. And I actually want to see
in our exploratory work a lot
more of these kind of combinatorial
systems and sort of pairing
different approaches together.
And you're right, that is one of the things, a self-improvement,
someone discovering a kind of self-improvement
loop would be
one way where things might accelerate further
than they're even going today.
And we've seen it before with our own work,
with things like AlphaZero learning chess and Go,
and any two-player game, from scratch,
in less than 24 hours, starting from random,
with self-improving processes.
So we know it's possible.
But again, those are in quite limited game domains,
which are very well described.
So the real world is far messier and far more complex.
So it remains to be seen
if that type of approach can work in a more general way.
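(A schematic, hedged sketch of the AlphaZero-style self-improvement loop Demis describes; play_game and train are assumed stubs standing in for self-play game generation and network training, not real APIs.)

    def self_improvement_loop(model, play_game, train,
                              generations=100, games_per_gen=500):
        # Each generation: the current model plays itself, the finished
        # games become training data, and the retrained model replaces
        # its teacher. In AlphaZero, this loop went from random play to
        # superhuman chess and Go in under 24 hours.
        for _ in range(generations):
            data = []
            for _ in range(games_per_gen):
                data.extend(play_game(model))  # (state, move, outcome) records
            model = train(model, data)         # learn from its own games
        return model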
Sergey, we've talked about some very powerful systems.
And it's a race.
It's a race to develop these systems.
Is that why you came back to Google?
I mean, I think as a computer scientist,
it's a very unique time in history.
Like, honestly, anybody who's a computer scientist
should not be retired right now, should be working on AI.
That's what I would just say.
I mean, there's just never been a greater sort of problem, an opportunity, a greater cusp of technology.
So I don't, I wouldn't say it's because of the race, although we fully intend that Gemini will be the very first AGI.
I should clarify that.
But to be immersed in this incredible technological revolution, I mean, it's unlike, you know, I went through sort of the Web
1.0 thing, which was very exciting, and whatever, we had mobile, we had this, we had that. But I think
this is scientifically far more exciting, and I think ultimately the impact on the
world is going to be even greater. As much as, you know, the web and mobile phones have
had a lot of impact, I think AI is going to be vastly more transformative.
So what do you do day-to-day?
I think I torture people, like...
It was amazing, by the way, he tolerated me crashing this fireside.
I'm in that, you know, I'm across the street, you know, pretty much every day.
And they're just people who are working on the key Gemini text models, on the pre-training, on the post-training.
Mostly those, I periodically delve into some of the multimodal work.
Veo 3, as you've all seen.
But I tend to be pretty deep in the technical details, and that's a luxury I really enjoy, fortunately, because guys like Demis are, you know, minding the shop.
And, yeah, that's just where, you know, my scientific interest is.
It's deep in the algorithms and how they can evolve.
Okay.
Let's talk about the products a little bit, some that were introduced recently.
I just want to ask you a broad question about agents, Demis.
Because when I look at other tech companies building agents,
what we see in the demos is usually something that's contextually aware,
has a disembodied voice, and is something
you often interact with on a screen.
When I see DeepMind and Google demos,
oftentimes it's through the camera.
It's very visual.
There was an announcement about smart glasses today.
So talk a little bit about whether that's the right read:
why is Google so interested in having an assistant or a companion that sees the world as you see it?
Well, it's for several reasons, several threads come together.
So as we talked about earlier, we've always been interested in agents.
That's actually the heritage of DeepMind; we started with agent-based systems in games.
We are trying to build AGI, which is a full general intelligence.
Clearly, that would have to understand the physical environment, the physical world around you.
And two of the massive use cases for that, in my opinion, are a truly useful assistant that can come around with you in your daily life, not just stuck on your computer or one device.
We want it to be useful in your everyday life for everything.
And so it needs to come around you and understand your physical context.
And then the other big thing is I've always felt for robotics to work, you sort of want what you saw with Astra on a robot.
And I've always felt that the bottleneck in robotics isn't so much the hardware,
although obviously there are many, many companies
working on fantastic hardware,
and we partner with a lot of them,
but it's actually the software intelligence
that I think is always what's held robotics back.
But I think we're in a really exciting moment now
where finally, with these latest versions,
especially Gemini 2.5,
and more things that we're going to bring in,
this kind of Veo technology and other things,
I think we're going to have really exciting algorithms
to make robotics finally work
and sort of realize its potential,
which could be enormous.
And then in the end, AGI needs to be able to do all of those things.
So for us, and that's why you can see, we always had this in mind.
That's why Gemini was built from the beginning, even the earliest versions, to be multimodal.
And that made it harder at the start, because it's harder to make things multimodal than just text only.
But in the end, I think we're reaping the benefits of those decisions now,
and I see many of the Gemini team here in the front row who made the correct decisions.
Those were the harder decisions, but we made the right ones,
and now you can see the fruits of that with all of what you've seen today, actually.
Sergey, I've been thinking about whether to ask you a Google Glass question.
Oh, fire away.
What did you learn from Glass that Google might be able to apply today
now that it seems like smart glasses have made a reappearance?
Wow, yeah, a great question.
I learned a lot.
I mean, that was, I definitely feel like I made a lot of mistakes with Google Glass, I'll be honest.
I am still a big believer in the form factor.
So I'm glad that we have it now.
And now it looks like normal glasses, doesn't have the thing in front.
I think there was a technology gap, honestly.
Now in the AI world, the things that these glasses can do to help you out
without constantly distracting you, that capability is much higher.
There's also just, I just didn't know anything about consumer
supply chains, really, and how hard it would be to build that and have it be at a
reasonable price point, managing all the manufacturing and so forth. This time we
have great partners that are helping us build it. So that's another step forward.
What else can I say? I do have to say I miss the airship with the wingsuiting skydivers
for the demo. Honestly, it would have been even cooler here at Shoreline Amphitheatre than it was
up in Moscone back in the day, but maybe we'll have to, we should probably polish the product
first this time. We'll do it that way around this time. Make sure it's ready and available and then we'll
do a really cool demo. So that's probably a smart move. Yeah, what I will say is, I mean, look,
we've got obviously an incredible history of glass devices and smart devices. We can bring all
those learnings to today, and I'm very excited about our new glasses, as you saw.
What I'm always talking to Shahram and the team about is that, I mean, I don't
know if everyone would agree, but I feel like the universal assistant is the killer app for
smart glasses, and I think that's what's going to make it work. Apart from the fact that
the hardware technology has also moved on and improved a lot, I feel like this is
the actual killer app, the natural killer app for it.
Okay, briefly on video generation.
I sat in the audience in the keynote today and was like fairly blown away by the level
of improvement we've seen from these models.
And I mean you had filmmakers talking about it in the presentation.
I want to ask you, Demis, specifically about model quality.
If the internet fills with video that's been made with artificial intelligence, does that then
go back into the training and lead to a lower-quality model than if you were training just
on human-generated content?
Yeah, look, you know, there's a lot of worries about this so-called model collapse.
I mean, video is just one thing, but it applies in any modality, text as well.
There's a few things to say about that.
First of all, we're very rigorous with our data quality management and curation.
We also, at least for all of our generative models, attach SynthID to them,
so that there's this invisible "AI-made" watermark
that is very robust; it's held up now
for a year, 18 months, since we released it.
And all of our images and videos are embedded with this watermark.
So we can detect, and we're releasing tools
to allow anyone to detect these watermarks
and know that that was an AI-generated image or video.
And of course, that's important to combat deep fakes
and misinformation, but it's also, of course,
you could use that to filter out if you wanted to,
whatever was in your training data.
So I don't actually see that as a big problem.
Eventually, we may have video models that are so good,
you could put them back into the loop as a source
of additional data, synthetic data, it's called.
And there, you've just got to be very careful
that you're actually creating from the same distribution
that you're trying to model,
that you're not distorting that distribution somehow,
and that the quality is high enough.
We have some experience of this in a completely different domain
with things like AlphaFold, where there wasn't actually enough real experimental data
to build the final AlphaFold. So we had to build an earlier version that then predicted
about a million protein structures, and then we selected, it had a confidence level on that,
we selected the top 300 to 400,000 and put them back in the training data. So there's lots
of, it's very cutting-edge research to, like, mix synthetic data with real data. So there are
also ways of doing that. But in terms of the video generation stuff, you can just
exclude it if you want to, at least with our own work, and hopefully the other
gen-media companies follow suit and put robust watermarks in.
Also, obviously, first and foremost, to combat deep fakes and misinformation.
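(A hedged sketch of the two filters Demis mentions: folding high-confidence synthetic samples back into training, per the AlphaFold recipe, and excluding watermarked AI-generated media from scraped data. generate, confidence, and detect_watermark are assumed stubs, not real APIs.)

    def build_training_set(real_data, generate, confidence, detect_watermark,
                           n_candidates=1_000_000, threshold=0.9):
        # AlphaFold-style loop: an earlier model predicts on the order
        # of a million candidates, and only the high-confidence ones
        # are kept and put back into the training data.
        synthetic = [s for s in (generate() for _ in range(n_candidates))
                     if confidence(s) >= threshold]
        # For web-scraped media, the filter runs the other way:
        # drop anything carrying an AI-made watermark (e.g., SynthID).
        curated = [x for x in real_data if not detect_watermark(x)]
        return curated + synthetic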
Okay. We have four minutes. I got four questions left.
We now move to the miscellaneous part of my questions. Let's see how many we can get through
and as fast as we can get through them. Let's go to Sergey with this one. What does the web
look like in 10 years?
What does the web look like in 10 years? I mean...
You've got one minute.
Boy, I think 10 years,
because of the rate of progress in AI, is so far beyond anything we can see.
Best guess.
Not just the web.
I don't know, I don't think we really know what the world looks like in 10 years.
Okay. Demis?
Well, I think that's a good answer.
I do think the web, I think in the nearer term, the web is going to change quite a lot if you think about an agent-first web.
Like, does it really need to, you know, it doesn't necessarily need to see renders and things like we do as humans using the web.
So I think things will be pretty different
in a few years.
OK.
This is kind of an over-under question.
AGI before 2030 or after 2030?
2030, boy, you really kind of put it on that fine line.
I'm going to say before.
Before?
Yeah.
Demis?
I'm just after.
Just after.
OK.
No pressure, Demis.
Exactly.
I have to go back and get working harder.
Is that?
I can ask for it.
He needs to deliver it.
Yeah, exactly.
Stop sandbagging.
We need that next week.
That's true.
I'll come to the review.
All right, so would you hire someone that used AI in their interview?
Demis?
Oh, in their interview?
Depends how they used it.
I think using today's tools, probably not.
But I think it depends how they would use it, actually.
I think that's probably the answer.
Sergey?
I mean, I never interviewed at all, so, I don't know.
I feel it would be hypocritical for me to judge people exactly how they interview.
Yeah, I haven't either, actually.
So, I've never done a job interview.
Okay.
So, Demis, I've been reading your tweets.
You put a very interesting tweet up where there was a prompt that created some sort of natural scene.
Oh, yeah.
Here was the tweet.
"Nature to simulation at the press of a button.
It does make you wonder," with a couple of emojis. And people ran with that and wrote
some headlines saying Demis thinks we're in a simulation.
Are we in a simulation?
Not in the way that, you know, Nick Bostrom and people talk about.
I think, though, this, so I don't think this is some kind of game, even though I wrote
a lot of games, I do think that ultimately underlying physics is information theory.
So I do think we're in a computational universe, but it's not just a straightforward.
simulation. I can't answer you in one minute. But I think the fact that these systems are able
to model real structures in nature is quite interesting and telling. And I've been thinking a lot
about our work we've done with AlphaGo and Alpha Fold in these types of systems. I've spoken
a little about it. Maybe at some point I'll write up a scientific paper about what I think that
really means in terms of what's actually going on here in reality.
Sergey, you want to make a headline?
Well, I think that argument applies recursively, right?
If we're in a simulation, then by the same argument, whatever beings are making the simulation
are themselves in a simulation, for roughly the same reasons, and so on and so forth.
So I think you're going to have to either accept that we're in an infinite stack of simulations
or that there's got to be some stopping criteria.
And what's your best guess?
I think that we're taking a very anthropocentric view,
like when we say simulation in the sense that some kind of conscious being
is running a simulation that we are then in
and that they have some kind of semblance of desire and consciousness
that's similar to us.
I think that's where it kind of breaks down for me.
So I just don't think that we're really equipped to reason
about sort of one level up in the hierarchy.
Okay, well, Demis, Sergey, thank you so much.
This has been such a fascinating conversation.
Thank you.
Thank you all.
All right.
Alex.
Thank you.
Pleasure.