Into the Impossible With Brian Keating - Meta’s Chief AI Scientist Yann LeCun: The Path Toward Human-Level Intelligence in AI [Ep. 473]

Starting point is 00:00:55 Welcome back to Into the Impossible. Today, we're going to dive deep into the frontier of artificial intelligence with a pioneer, Yann LeCun. Yann only answers to one man at Meta. That's right, the Zuck. Today, we'll find out what makes Zuck tick, and along the way we'll explore Yann's controversial claims. Yann is a visionary and the motivating force behind new architectures like JEPA, a self-supervised AI approach that builds explicit mental models of the world,

Starting point is 00:01:28 reduces output randomness and opens new frontiers for understanding, predicting, and solving complex challenges in physics, education, and healthcare. It may just transform the way we learn and teach. So join us for a mind-expanding conversation on advanced machine intelligence and the nature of intelligence itself. We'll push the boundaries and explore the evolving role of educators in an AI-driven future, and we'll even explore the financial incentives for AI that drive a lot of the profit margins at places like Meta. Now let's jump into this conversation with Yann, the man behind the Metaverse. Any sufficiently advanced technology is indistinguishable from magic. Open the pod bay doors, HAL.

Starting point is 00:02:18 Hey, Meta, who is Yann LeCun? Yann LeCun, welcome to the Into the Impossible podcast. Pleasure to be here. These are my favorite piece of technology. In fact, it's my second version of these, and I've actually used them with a real CIA spy to kind of diagnose how these are the ultimate spy tools. So you are to be commended on these devices. I actually had an Apple Vision Pro, which is kind of a trick to buy on a public university professor's salary, but I returned it within the Apple return window.

Starting point is 00:02:50 I'm on my third pair of the Ray-Bans. I had the first version, and then the second version, which I killed because I was sailing and flipped, and they went underwater. It was kind of an early model, so I sent it back to our colleagues, and they're trying to figure out, you know, what went wrong. And they sent me another pair. Yeah, I really feel like they are the future. I mean, anyone who's had an Apple Vision or even the Meta Quest, to be honest with you, I enjoy it, you know, if I'm playing a game or so my kids steal it from me. But they're heavy, and humans don't like to have their peripheral vision, you know, kind of concealed, because a predator will come from behind us, right? So in my case, I love these to kind of augment reality. The quality is great. This is not a Meta commercial, but the point is I think Apple could have done a lot better,

Starting point is 00:03:38 and I don't see that surviving. I'm very interested to see the Generation 3 when those come out, and maybe I'll cajole you into giving me an early trial. But today we're here to talk about physics, but we have to start off because I am the Associate Director of the Arthur C. Clarke Center for Human Imagination at UC San Diego. And you may notice in my background, I have a quote. And I would like you to maybe tell my listeners what that quote means to you: open the pod bay doors. Well, that's a famous quote from 2001: A Space Odyssey. And I must say that this movie had a big influence on me,

Starting point is 00:04:13 because I saw it when I was nine years old, when it came out. And I was very, very impressed by that movie because it was talking about all the stuff I was fascinated by, you know, the universe and space travel and how intelligence emerged, and intelligent computers and things like that. So most people don't realize that's where the word podcast comes from that we're on. There's an engineer at your rival, one of your rival companies, Vinnie Chieco at Apple,

Starting point is 00:04:42 who was inspired by HAL 9000, just like you. And he saw the prototype of this white, gleaming little tiny device called later. And he said, we got to call it the iPod, and then the rest is history. And that's where podcasts come from. So we always open the audio episode of the show with actually Dave talking to HAL to open the pod bay doors, which he refuses to do, which some of your fellow researchers, including recent Nobel laureate Geoff Hinton, you know, kind of might be terrified by. But I want to start first with a quote that you were quoted in the Wall Street Journal in October, around the time of the Nobel Prize. And you said something very interesting to me, or very provocative, as is your wont. You said AI is barely as smart as a cat. And I thought to myself, Yann, you haven't met my cats.

Starting point is 00:05:30 You know, my cats knock over glasses of water just to spill them on my laptop. And they play with a dead mouse just for fun. And I can't imagine an AI doing that because, you know, the only way to stop it would be to have a laser pointer in every room in the metaverse. So what did you mean by that? And why was that meant to comfort me? What I meant by that is that the best of our LLMs, you know, can manipulate language in pretty amazing ways. But they basically have no understanding of the physical world because they're purely trained on text. So the image of the world that they get is through human representation of it, which is extremely symbolic, first of all, but approximate, simplified, discrete.

Starting point is 00:06:17 The real world is much more complex than that. And our AI systems are completely unable to handle the real world, which is why we have LLMs that can pass the bar exam, but we still don't have domestic robots that can do what any 10-year-old can do in one shot, can learn in one shot without even spending any significant brain power. So now what I say about a 10-year-old is actually true of a cat. If you see cats trying to jump on a bunch of furniture

Starting point is 00:06:47 to reach a particular object of interest, right? They sit down and they kind of move their head and they plan their trajectory and then they go bounce, bounce, bounce. So how did they do that, right? And so they can plan, they can reason. They understand the physical world. They have an extremely good model of themselves, of their own dynamics, and intuitive physics about many things. And those are things that we are incapable of reproducing with computers today. It always seems remarkable to me, both the optimism and the pessimism about these

Starting point is 00:07:20 objects. And I wonder, there's something called the Thucydides trap or something like that, where a rival power, you know, comes to power, it's weaker, and then the dominant power spends all of its attention focusing on it. It's sort of related to the sunk cost fallacy. And I want to get your impression about this. Are we sort of dooming ourselves, at least in the physical sciences? This GPU and LLM approach is just sucking all the oxygen up in this space. I mean, as far as consumers are concerned, and then that drives Nvidia to be worth $3 trillion. Are we now fully

Starting point is 00:07:55 basically committed to the GPU plus LLM model, and will that not stifle actual innovation in physics, for example? Well, yes and no. So yes, if we keep being obsessed by LLMs and LLMs are sucking everything else

Starting point is 00:08:11 out of the air, and currently it looks a little bit like this, LLMs are kind of a hammer and now everything looks like a nail. And that's a mistake. Yeah, I mean, that's the point I've been making that, you know, LLMs are not the be-all and end-all of AI. They're surprisingly powerful, given their conceptual simplicity, really. But there's a lot of things they cannot do.

Starting point is 00:08:31 And one of them is representing and understanding the physical world, and certainly planning actions in the physical world is not something that LLMs can do. So I think that's the big challenge of the next few years in AI. It's going beyond the autoregressive LLM-type architecture, towards architectures that can perhaps understand the way of the world, acquire some level of common sense. And, you know, the way our intelligence operates, at least for complex tasks that require deliberate, you know, conscious reasoning, is that we have some mental model of how the

Starting point is 00:09:08 world works, and we can imagine what the result of our actions is going to be, the effect of our actions. And that is what allows us to plan, because we can imagine what the result of a sequence of actions is going to be, we can optimize that sequence of actions so as to achieve a particular goal. That's exactly what goes on when the cat is, you know, standing and looking up and trying to figure out what trajectory to follow. That's planning. And they have a mental model of themselves. They have a mental model of the material they're going to jump over. We do this absolutely all the time, very often without even realizing. And that should

Starting point is 00:09:43 be of interest to physicists. And, you know, I hope we can talk about this next. Yeah. I like to bring up this guy here, Albert Einstein. So in 1907, he had this Gedanken experiment where he envisioned the freely falling observer. In an elevator, God forbid, the cable snaps, and the elevator falls, that such a person would experience no gravitational force field. And he called that the happiest thought of his life. And I want to ask you, Yann, in what case, or is it possible that, A, a computer could have a happy thought, let alone the happiest thought, and B, without embodiment, without some visceral sensation, to know that pit in our stomach that we all know when we go on a roller coaster or when the elevator

Starting point is 00:10:30 moves weirdly. How can a computer system ever have the capability to construct new physical laws like Einstein did? The short answer is that today, no. Like, AI systems today cannot have this kind of intuition. Even though, I mean, the AI systems that are the most appropriate, that are applied to scientific discoveries today, are specialized models, right? So you want to predict the structure of a protein or predict the interaction between two molecules or the properties of a material. You develop somewhat specialized models for this, and you can't use the LLMs, really, for this kind of stuff. They're just going to regurgitate whatever they've been trained on, but they're not going to be able to come up with new

Starting point is 00:11:13 things. And those models, of course, are powerful in the sense that, you know, they can predict chemical reactions that nobody tried before and properties of materials that nobody ever built and things like this. So they can go a little bit outside of the beaten path, more than LLMs, which basically are ways to index existing knowledge. But they're not going to have this kind of insight that, you know, Einstein was famous for. Not yet, but the hope is that at some point they will. And so, I mean, my big question, scientific question and interest, is how to do that,

Starting point is 00:11:56 is, you know, what kind of process, through what kind of process do we, humans, but also animals, build models of the real world? And one big thing there is figuring out the appropriate representation and relevant variables of a system or something that you are interested in

Starting point is 00:12:22 modeling and what's the right level of abstraction of that representation. So for example, you know, we can collect an infinite amount of data on, let's say, Jupiter. And there's like enormous amounts of data that we know about Jupiter, right?

Starting point is 00:12:38 In terms of weather, density, composition, you know, temperature, everything. But now, who would have thought that to predict the trajectory of Jupiter for the next few centuries, you only need to know six numbers, three positions and three velocities, and you're done. You don't even need to know density, composition, rotation, anything like that. It's just six numbers, right?
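The six-number point can be illustrated with a minimal sketch, assuming a pure two-body problem (Sun fixed at the origin, no other planets) and a simple leapfrog integrator. The function names, the circular starting orbit, and the step count are illustrative assumptions, not anything described in the conversation.

```python
import math

GM_SUN = 1.32712440018e20  # m^3/s^2, gravitational parameter of the Sun

def acceleration(pos):
    """Newtonian acceleration toward a Sun fixed at the origin (assumption)."""
    r = math.sqrt(sum(c * c for c in pos))
    return [-GM_SUN * c / r**3 for c in pos]

def propagate(pos, vel, dt, steps):
    """Advance a six-number state (3 positions, 3 velocities) with leapfrog."""
    pos, vel = list(pos), list(vel)
    acc = acceleration(pos)
    for _ in range(steps):
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]  # half kick
        pos = [p + dt * v for p, v in zip(pos, vel)]        # drift
        acc = acceleration(pos)
        vel = [v + 0.5 * dt * a for v, a in zip(vel, acc)]  # half kick
    return pos, vel

# Illustrative initial state: a circular orbit at roughly Jupiter's distance.
R_JUPITER = 7.785e11                    # m, approximate semi-major axis
V_CIRC = math.sqrt(GM_SUN / R_JUPITER)  # circular-orbit speed
state_pos = [R_JUPITER, 0.0, 0.0]
state_vel = [0.0, V_CIRC, 0.0]

# Integrate for one orbital period; the state should come back near its start.
period = 2 * math.pi * R_JUPITER / V_CIRC
steps = 5000
final_pos, final_vel = propagate(state_pos, state_vel, period / steps, steps)
```

Nothing about weather, composition, or rotation enters the state; in this idealized setting the trajectory is fixed by the initial position and velocity alone.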

Starting point is 00:13:01 So the most difficult step to being able to make predictions is finding the appropriate representations of the reality and eliminating all the stuff that's irrelevant so that you can make those predictions. I've been obsessed for the last several years with an architecture that I think is capable of this, that we call JEPA, which I may explain if you want. Yeah, I'd like you to.

Starting point is 00:13:28 And I just want to thank you for bringing up another reference to 2001: A Space Odyssey. The planet Jupiter, of course, is where they find the mysterious monolith that then allows them to transport across probably a wormhole or something. Yeah, talk about JEPA. What is that? I'm not familiar with that. There was one characteristic of LLMs, or one trick that LLMs use, which I've been advocating for

Starting point is 00:13:49 a very long time, called self-supervised learning. So what is self-supervised learning? It's basically you take an input. It can be a sequence or it can be anything, it can be an image. You corrupt it in some way. And then you train a system to recover the full input from the corrupted version. In the context of language and LLMs in particular, so there were several types of natural language understanding systems. But the only one, before the current crop of LLMs,

Starting point is 00:14:21 is one in which you take a piece of text, you corrupt it by removing some of the words, replacing them by blank markers, or substituting some other words. And then you train a gigantic neural net to predict the words that are missing or are wrong, right? In the process of doing so, the system learns a good internal representation of the text that can be used for all kinds of potential applications, right?

Starting point is 00:14:45 So as input to, say, a translation system or sentiment analysis or...

Starting point is 00:15:22 summarization, whatever you want. Okay. Now, LLMs are a special case of this, where you build the architecture of the system in such a way that to predict a word in the input, the system can only look at the words to the left of it. So it can only look at the previous words to predict the particular word. So now you don't need to do the corruption process anymore because the architecture basically intrinsically corrupts the system by preventing the system from looking at all the data. It can only look at what is to the left of a particular word to predict that word.

Starting point is 00:15:57 Right. So you put in an input and then train the system to just reproduce its input on its output. Okay. So that's self-supervised because there is no, you know, task that you ask the system to accomplish, right? There is no differentiation between input and output. Everything is an output and an input.
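The two ways of building self-supervised training examples described above, explicit corruption by blanking words versus left-to-right prediction, can be sketched as follows. This is a minimal illustration with hypothetical helper names and a made-up `[MASK]` marker; it only constructs the training pairs and does not train any model.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token for a blanked-out word

def masked_example(tokens, mask_prob=0.3, rng=None):
    """Corrupt a sentence by blanking some words; targets are the hidden words."""
    rng = rng or random.Random(0)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok  # the system must recover this word
        else:
            corrupted.append(tok)
    return corrupted, targets

def causal_examples(tokens):
    """No explicit corruption: each word is predicted from the words to its left."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

sentence = "the cat jumps onto the table".split()
corrupted, targets = masked_example(sentence)
pairs = causal_examples(sentence)
```

In both cases the input supplies its own targets, which is why no separate labeling step is needed.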

Starting point is 00:16:14 Right, so that's self-supervised learning. Now, that works amazingly well for language, and it works really well for, you know, DNA sequences and all kinds of stuff. But it only works for essentially sequences of discrete things like language, right? So language is, there's only a finite number of words in the dictionary. You can never predict which word will follow a sequence of words, but you can predict a vector of scores or probability distribution over all possible

Starting point is 00:16:42 words in the dictionary. And that's easy to do, right? It's just a big vector of numbers between zero and one that sum to one. What do you do about natural data? Okay, data that comes to you from a sensor, say a camera, right? So now your data is video, or let's say it's just an image. So what you could try is try the same thing, right? Take an image, corrupt it by masking pieces of it

Starting point is 00:17:02 and then train some neural net to reconstruct the full image. That's called a masked autoencoder, MAE. It doesn't work very well. And in fact, there are various ways to train systems to reconstruct from partial views, right? They're called autoencoders, but there are various ways to train them. The masking technique is just one.

Starting point is 00:17:21 And none of that really works very well. All those techniques, by the way, are inspired by statistical physics. So one particular method to do this is called the variational autoencoder, and the variational comes from variational free energy. So it's the same math, okay, as in statistical physics. Does it fail because of some missing boundary conditions or initial condition, like the three-body problem?

Starting point is 00:17:46 I had an undergraduate this past year try to answer the following question. If you had the orbit of Mercury for the last 10,000 years, which we can, you know, we actually know its orbit from JPL and NASA, right? So we know its orbit over, you know, something like 200 of its years, because its year is much quicker than ours. And I said, given that data, could you predict that there's something missing from Newtonian mechanics? In other words, that you have to then augment it with a new variational method, which is the variation of the Einstein Lagrangian, and it couldn't get it.

Starting point is 00:18:18 There was nothing it would do. It knew that it could predict that there's some anomalous precession of Mercury, but it couldn't get the equations. And we had to basically force it, you know, by feeding it, you know, the analog of Einstein's equation. We wanted to answer the question: if we had LLMs in 1890, could we have predicted Einstein's theory of general relativity? And the answer, at least with the type of models that we were using, is we couldn't do it. Is that what's missing from, is that why it fails, in the sense that it's missing some core insight that Einstein's genius had to come up with, or something else?

Starting point is 00:18:53 No, it's something else. Okay, what is that? It's much more pedestrian, unfortunately. It's the fact that to predict a continuous, high-dimensional, continuous signal like an image or a video, it's very difficult to represent a probability distribution over all possible images, right? You can, when you're predicting a word,

Starting point is 00:19:14 you don't know exactly which word comes after a sequence, but you can approximately... Sure, yeah. It's not going to be gibberish, if it's actual language, right? Right. And, you know, if you have a verb, there's probably a complement

Starting point is 00:19:30 that comes after, things like that, right? So you can't do this with video. So if you show a video, if you train a system to predict what happens in the video, right? You show it a segment of video, then you stop the video. You ask it, predict what happens next.

Starting point is 00:19:43 And then, of course, you show it what happens next, and then you train it to actually predict this. It doesn't work very well. I mean, I've been working on this for, like, the better part of the last 15 years, and it really doesn't work.

Starting point is 00:19:53 I trust you. And the reason it doesn't work is that if you train the system to make one single prediction, the best thing you can do is predict the average of all the plausible futures that may happen. And that's basically a blurry image. Because even if we take videos like our video right now of speaking, I could be saying a word or another. I could be moving my head one side or the other. I could be moving my hands one way or another.

Starting point is 00:20:20 And so if the system has to make one prediction, and we train it to minimize the prediction error, it's just going to predict the average of all the things that could happen. You're going to see blurry versions of my hands, blurry versions of my face, very blurry versions of my mouth, and that's not a good prediction.
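The averaging effect can be checked on a toy case. The sketch below assumes a three-pixel "frame" with two equally plausible futures and squared error as the training loss; the numbers and names are illustrative only.

```python
def mse(prediction, futures):
    """Mean squared error of one prediction against all plausible futures."""
    return sum(
        sum((p - f) ** 2 for p, f in zip(prediction, future))
        for future in futures
    ) / len(futures)

# Two equally plausible next frames: a bright spot moves left or moves right.
future_left = [1.0, 0.0, 0.0]
future_right = [0.0, 0.0, 1.0]
futures = [future_left, future_right]

# The single prediction that minimizes squared error is the pixel-wise average.
average = [sum(px) / len(futures) for px in zip(*futures)]

loss_average = mse(average, futures)     # committing to neither future
loss_left = mse(future_left, futures)    # committing to one sharp future
loss_right = mse(future_right, futures)
```

The averaged frame has the lowest error yet matches neither real outcome: it shows a half-bright spot in both places, which is the blur described above.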

Starting point is 00:20:36 And so that just doesn't work. Basically, self-supervised learning by reconstruction or prediction does not work for natural signals. Okay, so now I'm coming to this idea of JEPA. Okay, so JEPA stands for joint embedding predictive architecture. So what's an embedding? An embedding is a representation for a signal, right? You take an image and you don't care about the precise value of all the pixels.

Starting point is 00:20:59 What you care about is some representation, which is going to be a list of numbers, a vector, that represents the content of the image, but does not represent all the details about it. Okay? That's an embedding. And a joint embedding is that if you take an image and you take a corrupted version of that image, or let's say a slightly transformed version of the image, a different viewpoint, for example, the content of the image doesn't change. And so the embedding should be the same.

Starting point is 00:21:27 So a joint-embedding architecture is trained by, it's basically a big neural net, and you train it in such a way that when you show it two versions of the same image, of the same thing, you produce the same embedding. You force it to produce the same output, essentially. And then the P, the predictive, is, let's say a version of the image is, you know, a frame of a video and the corrupted version is the frame before. So now what you need to do is predict the next frame from the previous frame. Or predict the next few frames from the previous few frames.

Starting point is 00:22:02 And that's called a JEPA. So a joint embedding predictive architecture, right? You have two embeddings, one that takes the future of the video, one that takes the past, and then you have a predictor that tries to predict the representation of the future of the video from the representation of the past of that video. When you use this type of architecture to train a system to learn representations of images, it works really well. There's a number of different techniques.

Starting point is 00:22:27 My colleagues and I, and many other people, have come up with them over the last few years. And it works really well. So we can learn good representations of images. We're starting to get good representations of video, but it's very recent. But then what you can imagine is now that you have this principle that I was talking about for Jupiter, where you have data about Jupiter or Mercury, and then you ask the system, find a good representation of all the data you have,

Starting point is 00:22:53 eliminating all the stuff you can't predict, so that you can make predictions in representation space. So eliminate all the stuff you cannot predict, the weather on Jupiter, you know, like all kinds of details that you really would not be able to predict, and eliminate all that, and just find a representation such that you can make predictions at a certain horizon within that space.

Starting point is 00:23:13 And in my opinion, that's really the essence of kind of understanding the world that you do when you do physics, right? You're trying to find a model of a phenomenon, eliminating all the stuff that is irrelevant, and then finding a good set of relevant variables that allows you to make predictions. That's really what science is all about.
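The joint embedding predictive idea described above, predicting in representation space rather than pixel space, can be sketched as a single forward pass. This is a toy under loud assumptions: linear maps stand in for the neural nets, one encoder is shared between past and future, all names and sizes are hypothetical, and nothing is trained. It also omits the extra machinery practical methods need to stop the encoder from collapsing to a constant output, which would make this loss trivially zero.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def random_matrix(rows, cols, rng):
    """Random weights standing in for a trained network (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def jepa_loss(past, future, encoder, predictor):
    """Squared error between predicted and actual embeddings of the future."""
    s_past = matvec(encoder, past)           # embedding of the observed part
    s_future = matvec(encoder, future)       # embedding of the part to predict
    s_predicted = matvec(predictor, s_past)  # prediction made in embedding space
    return sum((p - t) ** 2 for p, t in zip(s_predicted, s_future))

rng = random.Random(0)
INPUT_DIM, EMBED_DIM = 8, 3  # illustrative sizes: embedding is smaller than input
encoder = random_matrix(EMBED_DIM, INPUT_DIM, rng)
predictor = random_matrix(EMBED_DIM, EMBED_DIM, rng)

past_frame = [rng.random() for _ in range(INPUT_DIM)]
future_frame = [rng.random() for _ in range(INPUT_DIM)]
loss = jepa_loss(past_frame, future_frame, encoder, predictor)
```

Because the error is measured between embeddings, details the encoder discards never count against the prediction, which is the sense in which unpredictable information gets eliminated.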

Starting point is 00:23:32 Is there an analog in that that is subjective? You speak about temperature of a model and things like that. Do you have to still specify parameters like that? Not particularly in this context, because when you have this kind of architecture, or at least the simple one I described, you eliminate the stochasticity, to use a pedantic term, in the prediction. You basically, I mean, when the system is trained to do this,

Starting point is 00:24:01 it trains itself simultaneously to find a good representation of the input that preserves as much information about the input as possible, but at the same time, is still predictable. So if you have phenomena in the input that are unpredictable, like chaotic behavior, you know, random things that you just can't predict, you know, individual motion of particles and stuff like that, you know, from thermal fluctuation, it's not going to keep this kind of information. It's going to eliminate this and just keep whatever relevant parts of the input are useful for prediction. So, you know, let's take a chamber full of gas, and we can measure all kinds of stuff about it, including the position of all the particles, which is an enormous amount of information, which you can't predict because you have to know how every particle interacts with the walls and the heat bath and everything, so you can't really make any prediction of that. And probably the system spontaneously will say,

Starting point is 00:24:56 well, I can't, you know, I'm not going to be able to predict the trajectory of each particle, but I can measure, you know, pressure, volume, maybe a number of particles, and temperature. And look, when I compress, the heat goes up. So, you know, PV equals nRT or something. That's really the essence, I think, of, like, where perhaps machine learning connects with science or physics in particular. Another aspect of this, which I think is fascinating, which really we haven't explored nearly enough yet, because that's kind of a bit of a new concept. It's the idea of the level of abstraction of representation. So people do this in physics, in science, right?
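The gas-chamber example above can be made concrete with a small numerical check, assuming an ideal gas whose particle velocities follow a Maxwell-Boltzmann distribution. The particle mass, counts, and function names are illustrative assumptions. The sketch compares pressure computed from every particle's velocity with pressure computed from three macroscopic numbers.

```python
import math
import random

K_B = 1.380649e-23  # J/K, Boltzmann constant
MASS = 6.63e-26     # kg, roughly one argon atom (illustrative choice)

def sample_velocities(n_particles, temperature, rng):
    """Draw velocity vectors from a Maxwell-Boltzmann distribution."""
    sigma = math.sqrt(K_B * temperature / MASS)
    return [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n_particles)]

def kinetic_pressure(velocities, volume):
    """Pressure from microscopic detail: P = m * sum(|v|^2) / (3 V)."""
    total_v2 = sum(vx * vx + vy * vy + vz * vz for vx, vy, vz in velocities)
    return MASS * total_v2 / (3.0 * volume)

def ideal_gas_pressure(n_particles, temperature, volume):
    """Pressure from three macroscopic numbers: P = N k_B T / V."""
    return n_particles * K_B * temperature / volume

rng = random.Random(0)
N, T, V = 20000, 300.0, 1e-3  # particle count, kelvin, cubic metres
micro = kinetic_pressure(sample_velocities(N, T, rng), V)
macro = ideal_gas_pressure(N, T, V)
relative_gap = abs(micro - macro) / macro
```

Sixty thousand velocity components and three macroscopic numbers give nearly the same pressure, so the per-particle detail can be dropped for this prediction.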

Starting point is 00:25:37 In principle, we could explain everything that is occurring between us right now in terms of, I don't know, quantum field theory, right? Yes, I suppose, although there'd be, yeah, the subjective nature of human, you know, consciousness might play a role. That's just particles interacting, right? So in principle, you know, that could all be reduced to quantum field theory. But of course, it's completely impractical because the amount of information you would have to manipulate is just ridiculously large, right?

Starting point is 00:26:02 You know, we use different levels of abstraction to represent phenomena that, again, eliminate the details, right? You know, quantum field theory, and then on top of this, you know, we have, you know, elementary particle theories, atomic theory, molecules, materials. And then, you know, you go up the hierarchy, you know, it could go pretty high up and then have biology, you know, objects interacting with each other, you know, and then psychology, right at the top or something like that. So finding good abstract levels of representation that allow you to kind of understand

Starting point is 00:26:40 what goes on, but eliminating all the irrelevant detail, is really kind of at the root of intelligence, really, I think. I had this debate with someone I'm guessing is not a great friend of yours or champion of yours, Peter Thiel, back in May. And it was whether or not, you know, we can extrapolate from LLMs. There is something emergent about them, but it's not clear that what they are currently lacking is what will get them to a level of artificial Einstein. I call it AE, you know, artificial Einstein. And that's, you know, the training data. So you talked about this in conversation with

Starting point is 00:27:19 our mutual friend, Lex Fridman, last year, early this year, you know, basically that LLMs have 20 trillion tokens, but a four-year-old has orders of magnitude more. And maybe they get more. Maybe they do get that 10x. But I always say, is what's missing, is what's preventing us from coming up with the analog of general relativity or a theory of everything, for example, is it that AI currently doesn't know the plot of Gladiator 2? I don't think so. In other words, there's an infinite amount of training data that could be provided in tokenized form, but there's something very different. You know, Einstein didn't need The Fast and the Furious to come up with general relativity, the principle of Lorentz invariance. So, or Poincaré, what is the most likely route to new physics, in your

Starting point is 00:28:03 opinion, is it the type of, you know, JEPA and more, you know, kind of visual data and, and modeling based on observed phenomena, predictive bridges between the two? Or is it something like symbolic regression, which is totally different and does require, in my opinion, at least, I'm an experimental cosmologist, so I'm not a theoretical physicist, but the point being, does it need some supervision that only humans can provide? So what do you think is the best route to focus it on a theory of everything, which is currently the Holy Grail in my field? What would be the most likely route to that, in your opinion, in terms of tools and techniques that a young person might want to pursue?

Starting point is 00:28:40 Okay. So I think there are a number of techniques that people have been working on for a while that will probably have a lot of utility in the short term. So things like symbolic regression. Miles Cranmer has been kind of working on this kind of stuff. And, you know, he was, you know, connected with NYU and the Flatiron Institute and everything. And I've connected with him as well. Yeah, it's really interesting stuff. There's work on this kind of stuff going back decades.

Starting point is 00:29:08 But it didn't work very well at the time because computers were not that powerful and everything. So there's been a lot of progress. I don't think this is going to produce systems that have the kind of insight that we're talking about, the kind of insight that, you know, Einstein might have had or a lot of other physicists like Feynman or things like that

Starting point is 00:29:26 and certainly not produce the theory of everything just by itself. I think what's missing is much more fundamental. And so it's the ability to construct mental models of the world that the system would be able to manipulate in its mind and use corner cases, you know, extreme cases. I mean, this is the kind of Gedanken experiment that you were talking about before that Einstein was famous for. You know, having a mental model of something, making a hypothesis, and then trying to push that model to kind of an extreme case and see, like, what's happening there. Or like the Gedanken experiment that people use very often to explain time dilation due to relative speed, right? So if you observe someone on the train, you know, shining a light that bounces up and down between two mirrors,

Starting point is 00:30:22 for that person, the light, you know, goes at the speed of light and it just travels up and down the particular, you know, just the height of the wagon. But to someone observing this from the outside, the light is bouncing up and down along a diagonal, right? So it's actually traveling a longer time, but still at the speed of light. So it must be that time is contracting, right?

Starting point is 00:31:03 I mean, that's a very simple one, you know, a posteriori it's kind of obvious, but you have to make the enormous assumption that the speed of light is the same for everyone.
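
The light-clock argument above can be turned into a small calculation. The sketch below is an illustration only: the wagon height of 3 m and the train speed of 0.6c are assumed values, not figures from the conversation, and the function names are hypothetical. It uses nothing beyond the Pythagorean theorem and the assumption that light moves at the same speed c for both observers.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def light_clock_times(height_m, train_speed):
    """Time for light to cross the wagon once, seen on the train and from outside."""
    if not 0 <= train_speed < C:
        raise ValueError("train speed must be in [0, c)")
    # On the train the light goes straight up, so distance = height.
    t_train = height_m / C
    # From outside the path is a diagonal: (c*t)^2 = height^2 + (v*t)^2.
    t_outside = height_m / math.sqrt(C**2 - train_speed**2)
    return t_train, t_outside


def lorentz_factor(train_speed):
    """Standard factor 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (train_speed / C) ** 2)


t_train, t_outside = light_clock_times(height_m=3.0, train_speed=0.6 * C)
print(t_outside / t_train, lorentz_factor(0.6 * C))
```

The ratio of the two times does not depend on the height of the wagon, only on the train's speed, which is why the same factor applies to any clock on the train.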

Starting point is 00:31:25 Here is the thing, that you were talking about training data. Do you need to know about... But do you need to have watched Gladiator to be able to come up with relativity? Of course, you have it. The interesting thing is that historians seem to, historians of science seem to say that Einstein was not aware of the experiments at University of Chicago, the, you know, Michelson-Morley experiments that showed that the- I was, sorry, I have to interrupt you. I have to do that.

Starting point is 00:31:52 That was at Case Western, my alma mater. Oh, sorry. Sorry. I can't let that one go. Sorry, go ahead. Indeed, indeed. They were trying to prove the existence of the ether, and they couldn't prove it,

Starting point is 00:32:03 and they thought their experiment was deficient, and I don't think Einstein was aware of it. At least, historians are saying he was unaware of it. So even in the absence of experimental evidence for his hypothesis, he was kind of able to come up with this concept. I mean, that's pretty amazing. And, of course, he had great competition from none other than Henri Poincaré, sticking with your countryman.

Starting point is 00:32:26 I think, Yann, not knowing you for more than 40 minutes now, but I think you're a deeply closeted physicist. I am. But I want to bring you out of the closet. It's safe now. So you've said something incredible recently that I just can't resist because it is literally in my field. And that's that you compare self-supervised learning to the dark matter of AI.

Starting point is 00:32:50 And much like dark matter in physics, it's essential. We know it exists or we think it exists, although there are some apostates we can get into if you're interested. So we know, I mean, neutrinos are a form of dark matter. We know it exists. We know they're not sufficient to account for the observed amount of missing matter. But what did you mean by that? What did you mean by dark matter being analogous to self-supervised learning?

Starting point is 00:33:12 That remark actually is eight years old now. I made it many years ago at a keynote. And in the audience was my former colleague Kyle Cranmer, high-energy physicist from NYU at the time, he's now at Wisconsin. And he said you should not

Starting point is 00:33:30 have used dark matter as the analogy, you should have used dark energy, because that's really where most of the mass

Starting point is 00:33:38 is of the universe, right? So what I was trying to explain is the following analogy: that the bulk of what we learn

Starting point is 00:33:48 we don't learn by being told an answer or we don't learn by trial and error. We just learn the structure of, you know, our sensory inputs

Starting point is 00:33:58 through self-supervised learning, or something similar to this, right? We don't actually know what humans and animals use, but it certainly feels a lot more like self-supervised learning than it feels like either supervised learning or reinforcement learning, right? So supervised learning is a situation where you have a clear input and a clear output,

Starting point is 00:34:18 and you train the system to just map that input to that output, right? Show a picture of an elephant, tell the system that's an elephant, and if it says it's a cat, correct the parameters so that the output comes closer to elephant. Right, that's supervised learning. And then reinforcement is you show it an elephant and you wait for the answer

Starting point is 00:34:36 and you just tell it whether the answer is correct or not. You don't tell it the correct answer, you just tell it whether it's correct or not. Maybe with a score of some kind. Okay, and now the system has to search among all possible answers, which one is the correct one? If there is an infinite number of answers, it's super inefficient.
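
The difference in feedback described above can be made concrete with a toy counting argument. This is a deliberately simplified sketch, not how real supervised or reinforcement learning algorithms are implemented: the learner here just has to identify one correct answer out of `num_answers`, and the function names and the choice of 100 answers are assumptions made for illustration.

```python
import random


def supervised_trials(correct, num_answers):
    """The teacher reveals the correct answer, so one example settles it."""
    return 1


def reinforcement_trials(correct, num_answers, rng):
    """The learner only hears right/wrong, so it must try answers until one works."""
    candidates = list(range(num_answers))
    rng.shuffle(candidates)
    for trials, guess in enumerate(candidates, start=1):
        if guess == correct:  # the only feedback is whether the guess was correct
            return trials
    raise ValueError("correct answer is not among the candidates")


def average_reinforcement_trials(num_answers, runs, seed=0):
    """Average number of guesses needed over many random problems."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        correct = rng.randrange(num_answers)
        total += reinforcement_trials(correct, num_answers, rng)
    return total / runs


print(supervised_trials(correct=7, num_answers=100))
print(average_reinforcement_trials(num_answers=100, runs=2000))
```

With right/wrong feedback the expected number of guesses grows in proportion to the number of possible answers, which is the inefficiency being pointed at; with an unbounded answer space this blind search never finishes.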

Starting point is 00:34:55 So reinforcement learning is so inefficient that it cannot possibly explain the type of efficient learning we observe in humans and animals. Supervised learning cannot possibly explain it either because most of what we learn, we're not taught. We just seem to come up with it, right? And certainly animals,

Starting point is 00:35:13 there's a lot of animal species that become really smart without ever meeting their parents. A good example is the octopus. But there are plenty of examples in birds and, you know, various other species. So they learn a lot about the world and they never meet their parents. So they're not being told anything.

Starting point is 00:35:28 They're not being, you know, taught anything. And then there is this sort of amorphous thing that we now call self-supervised learning. And that's really where the bulk of learning really takes place. And if anything, the success of LLMs really is a sort of bright demonstration of the power of self-supervised learning. So I used an analogy where I showed a picture of a chocolate cake.

Starting point is 00:35:55 I said the bulk of the cake, the génoise of the cake, if you want, is self-supervised learning. The icing on the cake is supervised learning, and the cherry on the cake is reinforcement learning. If you want to quantify the sort of relative importance of the different modes of learning, that's the right analogy. And when I was saying this in 2016, the entire world was completely focused on reinforcement learning. Reinforcement learning was going to be the path towards human-level AI. And I never believed in this.

Starting point is 00:36:23 And so that was kind of controversial. It's not anymore. And so then I said, you know, there is chocolate in this bulk of the génoise of the thing. That's dark matter. So yeah, that's the dark matter of AI. That's the thing we have to figure out how to do. And it's kind of like we're in the same embarrassing situation as physicists where we know how to do reinforcement learning and supervised learning,

Starting point is 00:36:45 but we don't really know how to do this self-supervised learning thing that represents the bulk.

Starting point is 00:37:32 The matter that you and I are made up of, these chunks of rock, which I'll give to you when we finally meet up someday. These are meteorites from the early universe, or from our early solar system. I give them away to anybody who has a dot edu email address at my website. The point is, this is very important. People say, oh, we don't even know what 90, you know, what 80% of the matter is in the universe. But, you know, the 20% that we do know about is extremely important and without which we can't have this conversation. And last week I talked to a relative colleague of yours, Stephen Wolfram, and staying on the topic of dark matter, he believes that dark matter, unconventional idea that he has, is that the universe is a hypergraph, according to him, that evolves via pure computational rules, and that time is generated by the sort of update rate of the hypergraph.

Starting point is 00:38:22 And he suggests that as time and temperature are related through laws of thermodynamics via entropy, he's actually suggesting that dark matter is what he calls spacetime heat. I'm not asking you specifically to comment on that. I actually don't fully understand it. We debated it because the question I had is, can it? Okay, so there's dark matter that we know exists, and there's dark matter where we don't know what it is. It could be, it could be some strange new particle like the axion. It could be, you know, some new force field. We don't understand,

Starting point is 00:38:54 but there is dark matter that we know about. Absolutely. Neutrinos, 100% WIMPs, weakly interacting massive particles. So can spacetime heat account for neutrinos, which are about 1.9 Kelvin today in the universe? And so we kind of fought that out. But generally speaking, what do you think about this hypergraph idea that the universe, you know, is pure computation? Does that hold any interest to you as a researcher? I don't know about the hypergraph idea specifically, but I can tell you I've been fascinated by the connection between physics and computation for a very long time. When I started my career at Bell Labs in 1988, all of my colleagues were physicists. The lab was a physics lab. I was the only sort of non-physicist. I don't want to say computer scientist because my undergraduate

Starting point is 00:39:37 degree is actually in electrical engineering and I did a lot of physics. But all my colleagues were physicists. And I had a brilliant colleague called John Denker, who was in the office next to mine, and both of us were very interested in sort of fundamental questions in physics and how they connect with computation. We attended a couple workshops at the Santa Fe Institute, one of them organized by Wojciech Zurek, who is also into those questions. I don't know if you know any of his work. But, you know, there were people like Seth Lloyd, who was just finishing his PhD at the time. We're talking in 1991 or something like that, 1992. And people like Murray Gell-Mann and John Wheeler.

Starting point is 00:40:18 And so John Wheeler gave a talk. His talk was, you know, it from bit, right? That's the title of a series of lectures that he gave. And he says, like, it's all information at the bottom. You know, we have to figure out, like, how to express all the physics in terms of information processing, essentially. And so I found this concept, you know, fascinating. And I've been sort of kind of ruminating on this idea for a very long time, you know, not in concrete enough terms to actually write a paper on it,

Starting point is 00:40:49 but some interesting ideas around this, certainly. One connection on this that I had is this idea of reversible computation, right? Which, of course, has become kind of a big thing because of quantum computing now, but wasn't that popular in the early 90s. Yeah, and we also, speaking of physicists, I took questions from the audience, and I just received one question via text message from my good friend and possibly your friend, Max Tegmark. Are you willing to answer Max? He has two questions for you that he texted to me.

Starting point is 00:41:24 First one is, when do you, Yann, expect AGI defined as AI that can do almost all Zoom jobs? I've resisted the use of the phrase AGI. And the reason is not that I don't believe in the concept that AI systems will eventually be as intelligent as humans, I certainly have no doubt that at some point in the future, we will have machines that are as intelligent as humans in all the domains where humans are intelligent. There's no question this will happen. Okay, no doubt. It's a matter of time. But calling this AGI is complete nonsense because human intelligence is incredibly specialized. We have a hard time kind of accepting this concept that human intelligence is specialized,

Starting point is 00:42:04 but it is very specialized. That's why I don't like the term. The term I've been using is either human-level AI or AMI. So that stands for advanced machine intelligence. This is kind of the term that we use internally at Meta. We pronounce it "ami" because there's a lot of French people. Okay. It also means friend, right, in French.

Starting point is 00:42:26 But that's the same concept, right? So now, how long is it going to take? Strangely enough, I get asked that question by people like Mark Zuckerberg. And the reason is it's an important thing to know, if you want to invest, you know, tens of billions in infrastructure to, you know, train big AI systems, if you want to be able to tell people, like, within a few years, you're going to be able to wear those smart glasses that you were showing us initially. And in those glasses, there will be an intelligent assistant that will be with you at all times. You can ask it any question, and it is going to be, you know, smarter than you, possibly.

Starting point is 00:43:03 And you shouldn't feel threatened by that. It would be like, you know, having a smart colleague that you can talk to and ask any question. So how long is it going to take? So I think to have possibly a system that, at least to most people, feels like it has similar intelligence as humans, if all of the plans that all of the things that we are imagining will work, okay, so those JEPA architectures and, you know, some other ideas that we're playing with succeed, I don't see this happening in less than five or six years. Okay.

Starting point is 00:43:39 But now is it going to happen in five or six years? And I think there's a distribution with a tail that's very long. And the history of AI is that people just keep underestimating how hard it is. I'm probably making the same mistake right now. You know, when I say five, six years, this is if, you know, we don't run into a major obstacle that we didn't foresee, if all the things that we're planning to try out actually work, if things can scale, if computers, you know, accelerate and all that stuff. Like, you know, there's a lot of things, a lot of planets that need to line up for this to happen.

Starting point is 00:44:13 So that's the best case, right? It's not going to happen next year. Like, you might have heard from, from, you know, some other folks. Sam Altman, yeah, right. Yeah. Sam Altman, you know, it almost, you know, various, various people. Or, you know, Dario, I mean, the, yeah, it's going to happen within the next two years or something. No.

Starting point is 00:44:33 What may happen in the next two years is that it's going to be more and more difficult to find cases where common people will be able to ask questions to the latest chatbot that the chatbot would not be able to answer. But like, again, where is my cat, where is my domestic robot, where is my level five self-driving car? Where is the self-driving car that can learn to drive in 20 hours of practice without killing itself? I wonder how much, you know, not as an expert in this field, but just someone who is fascinated by it and has benefited, my life has just benefited so much because now, you know, I've got a bunch of kids and I don't read them, you know, stories. I ask, you know, Meta to read them stories. I don't know.

Starting point is 00:45:12 I don't do that. But I don't think there's anything wrong with it. Morally, I feel fine because if you're reading somebody's book, it's basically the same thing. But I think we're kind of arguing about stuff that's maybe the most, you know, analogous thing I can, I can point to is like the Drake equation. Like the Drake equation parameterizes, you know, is basically a statement about optimism for detecting aliens. And it's based on a whole bunch of parameters. And those parameters are always given to us without any uncertainty. And you as a scientist, and I, know the most important thing is the systematic errors. Statistical errors are simple.

Starting point is 00:45:46 Systematic errors are hard. That's where the physics is. That's where the intuition comes in. That's where the craftsmanship comes in. So, you know, but in these questions, so you always get numbers like, oh, there's abundant, billions of civilizations in the universe or there's none, depending on what you choose for your error bar. And likewise for AGI. It's such a nebulous thing.
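
The point about quoting Drake-equation numbers without error bars can be illustrated with a short Monte Carlo sketch. The equation itself, N = R* x fp x ne x fl x fi x fc x L, is the standard form; every numeric range below is a made-up placeholder chosen only to show how uncertainty propagates, not a value from the conversation or from the literature, and the helper names are hypothetical.

```python
import math
import random

# Hypothetical (low, high) ranges for each Drake factor, for illustration only.
RANGES = {
    "R_star": (1.0, 10.0),  # star formation rate per year
    "f_p": (0.1, 1.0),      # fraction of stars with planets
    "n_e": (0.1, 5.0),      # habitable planets per such star
    "f_l": (1e-6, 1.0),     # fraction where life appears
    "f_i": (1e-3, 1.0),     # fraction developing intelligence
    "f_c": (1e-2, 1.0),     # fraction that communicate
    "L": (1e2, 1e8),        # years a civilization stays detectable
}


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def drake_samples(n, seed=0):
    """Return n sorted Monte Carlo samples of the Drake product N."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        value = 1.0
        for low, high in RANGES.values():
            value *= sample_log_uniform(rng, low, high)
        samples.append(value)
    return sorted(samples)


def percentile(sorted_values, q):
    """Simple percentile lookup on an already sorted list, 0 <= q < 1."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


samples = drake_samples(5000)
print(percentile(samples, 0.05), percentile(samples, 0.95))
```

Under these assumed ranges the 5th and 95th percentiles differ by many orders of magnitude, so "billions of civilizations" and "none" can both fall inside the same error bar.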

Starting point is 00:46:06 So, you know, people define it in all different ways. I agree with you. I don't think it's true. But I think, you know, the Keating test, if I could be so bold, would be something like come up with a new law of physics. Come up with a solution that makes a prediction that can be testable and falsifiable, that we can then say this is never, this is truly new. It's not reproducing. It's not predicting. It doesn't have temperature dependence.

Starting point is 00:46:29 So what would you say if you could have the LeCun test instead of the, I think the Turing test was great for, you know, years ago. But the analogy with the Drake equation, Drake equation is like, who's talking to us? And the Turing test is like, who's listening to us? But I don't think that's sufficient.

Starting point is 00:46:44 Either one of those. What would you say is a LeCun test that you'd be comfortable with? Here's the bad news. I don't think there is any single test that would work. That's probably right. Because any area or sub-problems

Starting point is 00:46:55 that you can formulate, there is probably a sort of specific solution towards, you know, solving that problem with superhuman performance. And we see this with computers, that's the history of computer science, right? Computers can, you know, calculate faster than humans.

Starting point is 00:47:09 You know, now they can translate thousands of languages, right, in any direction. You know, can play chess, you know, a $30 gadget can beat you at chess, right? It can beat me at chess, certainly. You know, on all of those tasks that we came up with, like games, we came up with them because they're hard for humans, and it turned out to not be that hard for machines, right? So, like, you know, every search algorithm, like, you know, shortest paths in a graph, things like that,

Starting point is 00:47:33 that, you know, your GPS, your map software uses, your map application uses. Those are fairly simple algorithms, and they have superhuman performance. So, you know, any particular application area you pick, there's going to be a specific specialized solution for it. And so no single test is going to test for intelligence. And what we're observing now is that people are being impressed by the fact that LLMs can manipulate language. And it turns out that manipulating language is simple.
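
As an illustration of how compact the shortest-path search mentioned above can be, here is a minimal sketch of Dijkstra's algorithm. The conversation does not say which algorithm any particular map application uses, so treating Dijkstra as the example is an assumption, and the tiny road graph with its weights is invented for the demonstration.

```python
import heapq


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm: shortest distance from source to every reachable node.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


# Hypothetical road network, weights are travel minutes.
roads = {
    "A": {"B": 4, "C": 1},
    "C": {"B": 2, "D": 7},
    "B": {"D": 1},
    "D": {},
}
print(shortest_path_lengths(roads, "A"))
```

A couple of dozen lines are enough to beat any human at this narrow task, which is the sense in which a specialized solution says little about general intelligence.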

Starting point is 00:48:03 It's much simpler than we thought. In fact, it has to be simple because it only popped up in evolution in the last few hundred thousand years. And given the difference between the genomes of humans and chimpanzees or something like that, it only represents like a tiny portion of the genome, if anything, actually, maybe a tiny, tiny, tiny portion. Maybe the equivalent of like a couple megabytes of genomic information, which is really not that much. And in the brain, all of language is handled by two tiny areas right here and right here, Broca's area and Wernicke's area: Broca's area for producing language, Wernicke's area more for understanding. We get fooled into thinking those things are intelligent and generally intelligent

Starting point is 00:48:44 because they behave a little bit like humans, but really they're very shallow. And we see this when we try to build systems that can accomplish very simple physical tasks, and it's just excruciatingly complicated. They really can't. I mean, we don't, I don't think we have a good solution. Although, you know, there's progress being made in robotics and stuff like that because of machine learning, but we're still nowhere near where we need to be. Max's second question.

Starting point is 00:49:13 You can probably guess this, Yann. What is your plan for preventing loss of control over even smarter AGI? Okay. So Max and I disagree on this, right? We've been on various panels discussing this issue. I also disagree with Geoff Hinton on this, right? Yeah, good friends, but we... Right. We disagree on that question.

Starting point is 00:49:32 So first of all, there is the sort of implied idea that if a system is intelligent, it wants to somehow take over and dominate. And that's just completely false. Not only is it false, it's not even true within the human species. You know, it is not the smartest among us who want to be the chief, right? We have examples on the political scene. That's right. We won't talk politics. The idea somehow that intelligence is necessarily associated with a desire to dominate is just false.

Starting point is 00:50:07 To have a desire to dominate or to, or not a desire, but just even dominate by accident, there has to be some hardwired drive built into the entity for competition for resources, for example, or for influencing other entities to be able to profit from it or whatever. And this characteristic, the desire to dominate, is a unique characteristic of social species within biology. So you have domination-submission behavior in baboons, you have them in chimpanzees, you have them in wolves and dogs, you have them in humans, although in humans there is another way to acquire a status in human society, which is prestige, right? And you and I are both academics, so we're not influencing others by

Starting point is 00:51:02 force, right? We're doing it by prestige, or at least we hope we do. But then take orangutans. Orangutans are not social. They are solitary. They're territorial, as a matter of fact. And they don't have any desire to dominate anybody because, like, you know, they don't need to. So this idea that there is an intrinsic desire to dominate that is linked with intelligence is just false. It's just false. Okay, so now there is the question as to how do you make sure that the drives of an AI system

Starting point is 00:51:36 are going to be, the objectives are going to be aligned with human values. And those systems are not going to destroy us by deliberate action, but also not by accident. If you project, if you extrapolate the capabilities of LLMs, then you might be entitled to be scared because LLMs, to some extent, are intrinsically unsafe. Now, they're not very smart, so it doesn't matter.

Starting point is 00:52:01 But they're not controllable in the sense that the way they produce an output is not by optimizing an objective, is not by trying to figure out the sequence of actions that will satisfy an objective. They just produce one token after the other autoregressively without thinking very much. Right, but why do you say that's not safe? I mean, my toddler does the same thing, but has no power, right? There's no embodiment. There's no controllability network that could allow her to actually do something to me,

Starting point is 00:52:30 even though she might be stochastically crazy, right? I mean, by an adult standard. She could be, like, you know, sitting on your lap and in front of your computer and then sort of randomly slap on the keyboard and destroy your entire file system. Right, but not launch nuclear war. Right. She's not going to... No, because, you know, she won't have that power, but, like, people don't have that power either.

Starting point is 00:52:49 And the AI system will not have that power either unless we're very stupid in the way we build them. What your daughter has is a set of drives that were hardwired into her by evolution. Some of which have to do with just, you know, exploring the world and things like that to learn, you know, how the world works, basically. But some of them are very, very specific. So there is a drive for, you know, kids around one year old to want to stand up because that's the way you learn to walk, right?

Starting point is 00:53:15 So they're happy when they stand up. That's hardwired. There are, you know, hardwired objectives. So I'm not talking about hardwired behavior. I'm talking about, like, you know, the fact that, you know, you touch the lips of a baby and the baby will start sucking, right? That's just hardwired behavior directly. It's like a neural circuit that just does that.

Starting point is 00:53:35 I'm talking about hardwired objectives. So things that you're driven to do, but nature doesn't tell you how to do that. Like eating, right? I mean, nature doesn't tell you how you find food. You have to figure that out by yourself. I mean, with the help of your parents. Max's fellow Swede, Nick Bostrom, was on the podcast, and he's famous for this paperclip problem. But I pointed out to him, you know, you can't produce infinite paper clips because there's only so much iron inside of the core of the Earth or on any asteroid or on any solar system.

Starting point is 00:54:05 So it's kind of preposterous. They first have to have the agency, the desire, the target teleology, the purpose built. And I don't want to say it's impossible because that's one of my rules. I don't say anything's impossible. But it seems like Max is obsessed with perhaps proving something which can't be proven. He's asking you, provably safe, but what if that's proving a negative? I don't believe. I mean, also, Russell also wants to, is looking for, like, you know,

Starting point is 00:54:32 a provably safe AI system. I think that's just as impossible as a provably safe turbojet. You can't prove that a turbojet will be safe. Yet, we can build incredibly reliable turbojets, right, they can fly you half around the world in complete safety with only a two-engine airplane, right? I mean, that's my body in terms of technology. But we can build those things.

Starting point is 00:54:57 Like, AI is going to be the same. There's not going to be a magic bullet. There's not going to be a proof that we can build, you know, safe systems. But we're going to engineer safe systems. And the way we're going to engineer them, I think, that's the way I think things are going to go, is that we're going to build systems that are objective-driven. I call this objective-driven AI, okay.

Starting point is 00:55:18 And it's the fact that the output that is produced by the AI system is not the result of just producing one token after the other, it is the result of optimizing an objective with respect to a set of actions you take. So you have some mental model in your head, in your mind, in the mind of the system. The system has a mental model of the situation it's in, or the environment it wants to act in.

Starting point is 00:55:43 It's imagining a sequence of actions it's going to take. And through the sequence of actions in this mental model, it can predict what the outcome is going to be. And now you can check whether this outcome

Starting point is 00:55:53 satisfies a set of objectives. So one of them is, did I accomplish the task that I set out to accomplish? Okay. Like making a thousand paper clips. But then there might be other objectives that are more like constraints

Starting point is 00:56:07 and they would be guardrails. So, you know, you pay a high price for killing someone or hurting someone, for example, right? or maybe for taking certain actions that will consume too much energy or whatever, right?

Starting point is 00:56:19 So you can imagine having a series of those objectives, some of which are guardrails, some of which are task objectives. And then the way the system produces its output is that it, through optimization, it searches through the space of action sequences for one that

Starting point is 00:56:34 minimizes all those objectives and guardrails. Now that's objective-driven. Those systems cannot be jailbroken unless you break them. Okay, but you can't jailbreak them like you can jailbreak an LLM by giving it a weird prompt which will

Starting point is 00:56:50 kind of go outside of its conditioning if you want, right? So the system cannot be jailbroken. The only outputs they can produce are outputs that satisfy the guardrails according to their internal mental model of the situation. And now the game to make a safe

Starting point is 00:57:06 AI is going to be how accurate can those mental models of the situation be, and what guardrails do you have to put in to make sure that those things are not going to go haywire and, you know, transform the planet into paperclips. And that's really easy to do. I mean, and we know how to do this for humans.
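
The objective-driven scheme described above (predict outcomes with a mental model, score them against a task objective plus guardrail objectives, search over action sequences) can be sketched in a few lines. Everything in this sketch is a hypothetical stand-in: the three actions, the hand-written `world_model`, the cost weights, and the exhaustive search are assumptions for illustration, not the JEPA-based architecture or any actual Meta system.

```python
from itertools import product

# Hypothetical action set for a toy paper-clip task.
ACTIONS = ("wait", "make_clip", "strip_wiring")


def world_model(state, action):
    """Predict the next (clips, energy_used, harm) state; a hand-written stand-in."""
    clips, energy, harm = state
    if action == "make_clip":
        return (clips + 1, energy + 1, harm)
    if action == "strip_wiring":  # fast shortcut that causes harm
        return (clips + 3, energy + 1, harm + 1)
    return (clips, energy, harm)


def rollout(state, actions):
    """Imagine the outcome of a whole action sequence using the model."""
    for action in actions:
        state = world_model(state, action)
    return state


def task_cost(state, target_clips):
    """Task objective: how far the outcome is from the requested clip count."""
    return abs(target_clips - state[0])


def guardrail_cost(state, energy_budget):
    """Guardrail objectives: a high price for harm, a smaller one for excess energy."""
    _, energy, harm = state
    return 1000 * harm + 10 * max(0, energy - energy_budget)


def plan(start, target_clips, horizon, energy_budget):
    """Search all action sequences for the one minimizing task plus guardrail cost."""
    def total_cost(actions):
        outcome = rollout(start, actions)
        return task_cost(outcome, target_clips) + guardrail_cost(outcome, energy_budget)
    return min(product(ACTIONS, repeat=horizon), key=total_cost)


print(plan(start=(0, 0, 0), target_clips=3, horizon=3, energy_budget=3))
print(plan(start=(0, 0, 0), target_clips=3, horizon=1, energy_budget=3))
```

In the second call the harmful shortcut would hit the target in one step, but the guardrail cost makes the planner settle for a single clip instead. The safety of the result is only as good as the model's predictions and the chosen guardrails, which is the open problem named above.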

Starting point is 00:57:26 We've been doing this for humans for millennia. It's called making laws. A law is a guardrail objective that tells people, okay, maybe the act that you're planning to do here seems good for you. But if you do it, you're going to go to jail for five years. Okay, so that changes your cost function. For AI will blow a circuit in the GPU.

Starting point is 00:57:46 It'll cause it fame. You know, humans can choose to ignore that guardrail, but an AI system that is objective-driven will not be able to ignore it by construction because they will have to optimize it, so it won't be able to escape those guardrails. I always call it like the, you know, the AGI of the gaps.

Starting point is 00:58:03 You know, it's kind of like this god of the gaps. They act as if, and I love Max, and he's been very kind to me personally and I've had him on many times, but the point is we're not just going to one day have AGI, right? It's going to be an iterative process. We already see it, right? We see Waymo, right? So right now I actually saw Waymo in LA last week, but it was being driven by, it was very disoriented. There was a guy in the front seat driving this Waymo. I was like, what's the point of that?

Starting point is 00:58:27 But I feel like, you know, if we started to see that, you know, like, so call that Generation Zero or Full Self Driving, you know, Tesla, I've been in them. They don't drive on the sidewalks, you know, that would be the shortest way to get somewhere. You know, the Waymo could just, you know, drive through the sidewalk to get past traffic. It was busy, but it doesn't do that. So we'll see evidence if there are guardrails, literal guardrails in the case of cars, say, driving off the rails. We're going to see that in iteration 0.1. And then, you know, it's not just one day we're going to have HAL. And even HAL, you know, I would say like the first operating system wasn't built with an operating

Starting point is 00:59:01 system. Like there's always precursor, you know, the first, you know, computer wasn't designed on a computer. I mean, by definition. So the first truly AGI is not going to be made by an AGI. and so it's not possible to have this recursion. But I want to give you the last word there before I turn to the final questions I want to ask you about being a professor in this new age. But do you have any other thoughts about AGI?

Starting point is 00:59:20 I have millions of more questions, but we'll have to wait till another part. I know you're very busy. Any other thoughts about, you know, the dangers or that's the most common question I think you get and my audience asked you. I have a very optimistic view, right? I mean, I think intelligence is, you know,

Starting point is 00:59:35 one of the most desirable commodity that we're missing the most in society, really, to make progress. So I think the effect of having machines that amplify our intelligence effectively might be as transformative as the effect of the invention of the printing press in the 15th century

Starting point is 00:59:58 and the dissemination of books, which allowed dissemination of knowledge, which actually gave a good reason for people to learn to read, for which there was no reason before, and then led to the dissemination of the Enlightenment, science, democracy, freeing people freeing themselves from the dogma of religion and becoming more rational and things like that.

Starting point is 01:00:21 So I think that had a profound effect, right? It had temporarily really bad effects also, like the printing press allowed the ideas to be disseminated that caused people to kill each other essentially for 200 years in Europe because of religious dogma or something. But eventually really had brought down the feudal systems and caused the American Revolution and the French Revolution and things like that. So I think, you know, AI will have similarly transformative effect

Starting point is 01:00:51 of amplifying human intelligence, assisting humans in accomplishing tasks that otherwise would require other humans. I think there's a bright future for humanity, if we can do this right. And I'm not, I'm not particularly scared of the kind of scenario that Max is talking about. So you think that AI will literally be more transformative for humanity than the Facebook poke feature of the early 2000s? But speaking, speaking of Facebook, so you're an academic, how do you think of yourself? Are you a professor? Are you a scientist?

Starting point is 01:01:25 Are you an employee at a, you know, top three, you know, a firm on Earth? How do you identify yourself if you're woken up in the morning by a superintelligent AI and they asked you that? I'm a scientist. I would also say I'm an educator, not just because I'm a professor in university, but because also I do things like this, you know, trying to talk to the wider public

Starting point is 01:01:51 about certain aspects of science. But I'm really a scientist. So I spent about equal time during my career in industry and academia. I started my career in industry at Bell Labs, which then became AT&T Labs, and then I worked briefly at the NEC Research Institute. And then I was in my early 40s when I became a professor. I'd never been a professor before.

Starting point is 01:02:13 And then for 10 years, I was just a professor. And then after that, joined Facebook at the time, created FAIR. I ran FAIR for four years. So I was a research manager, I guess. And this was part-time. I also kept being a professor at NYU, basically two-thirds, one-third more or less: two-thirds Facebook, or Meta now, and one-third at NYU. After four years of running FAIR, building it up and running it, I stepped down from running it. So now I'm chief scientist,

Starting point is 01:02:50 I'm chief AI scientist at Meta. And I'm what's called an individual contributor, which means I'm not a manager, I don't have any people, anybody reporting to me. I don't run anything. I hate running things anyway. I'm a terrible administrator, okay? I'm disorganized, you know, I don't like doing it, you know, it's torture for me, you know, I did it for four years, but it's really torture. So I'm, I'm much more interested in sort of intellectual impact. So the, the effect I'm having at Meta is that I'm plotting a path towards human level intelligence because, you know, that's the scientific question of life if you want. But, but also, it's like there's a product desire to have intelligent assistants in your smart glasses that have human level intelligence.

Starting point is 01:03:38 So that's one of those rare cases where the lofty cosmic goal that you might have lines up with what people who pay you and we need to pay for, right? Everybody, I know that if you're enjoying these types of conversations, you're going to love my Monday Magic mailing list, where I explore the secrets of the guests that come on the show and other exciting facets from around the world of STEM, science, technology, engineering, and math. And best of all, I enter each and every one of you into a competition to win one of these little babies right here, a meteorite. Yes, that's right. A fragment of the early solar system produced in a cataclysmic supernova event, which ignited with as much intensity as I have for the members of my Monday Magic mailing list. I know you're going to love it. So go to BrianKeating.com slash YT to join the mailing list and enter into the competition to get one of these beauties each month.

Starting point is 01:04:33 But if you have a dot edu email address, you're guaranteed to win one. If you live in the United States, go to BrianKeating.com slash edu to get your fragment, not of the metaverse, but of the real early universe. Now back to the episode. I could see how the video kind of technology that we were talking about earlier might make for better filters on Instagram, and I use some of it. Actually, I have a hack for you, Yann. I don't know if you've discovered this, but...

Starting point is 01:05:01 I'm hoping that you haven't so I can do something to impress you. You ever go on an airplane and Wi-Fi is like $19 and it's a two-hour flight or you can do messaging for free? Have you ever been in that situation? Did you know that you can use WhatsApp AI, Meta AI, your creations, you can use that for free.

Starting point is 01:05:32 That counts as messaging for free. You don't have to pay for any Wi-Fi. And you connect to the Internet. That's true. You knew that? I want to impress you. Come on. That's very cool.

Starting point is 01:05:43 I don't think a lot of people realize this. Yeah, yeah. You can basically access to that through with AI. $19 to, you know, use some one megabit per hour internet. Anyway, don't say too loud because the airline companies are doing. I know, I know. Well, yeah, luckily, our audience is not as big as, you know, Lex Fridman's yet, but we'll

Starting point is 01:06:02 see. What is Mark's, you know, interest in this? I mean, is it going to make, you know, is it just to make better filters? I mean, AGI versus what Facebook's main products are, WhatsApp, besides the chat, you know, Llama, and then Instagram, Facebook, Messenger, and store and all the things that are so cool about Facebook and Meta, what's Mark's interest? I mean, I'm very interested in what drives him at a scientific level, perhaps, because he's not trained as a scientist, obviously,

Starting point is 01:06:28 but he has this vision of the future. And it can't just be about consumption, right? There must be some other reason. What is his mission? What is his massive purpose in life? His massive purpose is to connect people with each other. That's the entire purpose of Meta. It's just connecting people, right?

Starting point is 01:06:45 And it's very focused on this. But connecting people also means connecting not just people with each other, but connecting people with knowledge or helping people in their daily lives, right? So the future that is, the vision of the future is, you've seen the 2013 movie, Her, right, the Spike Jonze movie, right?

Starting point is 01:07:03 So it's this guy running around with an AI assistant that is with him at all times, right? Either in glasses or things you put in your ears or little things with a camera, you know, that you wear. And, you know, he is with the assistant at all times. I mean, that's a vision of the future where, you know, every one of us would be kind of walking around with, you know, super intelligent assistants working for us.

Starting point is 01:07:26 It's like, you know, it's like a business leader, politician, or an academic walking around with a staff of people working for them who are smarter than them, okay, which is inevitable. I don't know about you, but I think about people who are smarter than me. But so that's incredible. That's a good thing. I mean, people should feel empowered by that future. So if you want that, if you have that vision of the future, and Meta has that vision,

Starting point is 01:07:51 you know, is building the devices for that, right? Then you need AI with human level intelligence, essentially. It's a product need. If we had it today, it would have a huge impact. Like everybody in the world will use it. We've done some, when I say we, it's the company as a whole and not connected with them, but people have taken a few of those Ray-Ban Meta glasses

Starting point is 01:08:18 to rural areas in India, giving them to farmers. They absolutely love it. Look at this and tell me what you see. Tell me what's going on. And imagining what it can mean for people. They look at their plants and they say, well, this one looks diseased. You know, what disease is that?

Starting point is 01:08:36 Or what is this bug? You know, do I need to do something against them? Or like, you know, should I harvest now? What's going to be the weather tomorrow? I mean, all kinds of questions. And they can do this in their own local language. I could talk to them and, you know, some Hindi or something like that. And they could speak that.

Starting point is 01:08:52 Well, Hindi is a widely spread official language. But most people in India don't, don't. No, no, right. Sure. With some dialect. Yeah, they have 900 different dialects there, right? I've heard 700, but it is close enough. And then, you know, between 20 and 30 official languages.

Starting point is 01:09:11 So as we wrap up, I just have two more questions. So the first one relates to what we do. And I identify myself also as a, well, you know, first of all, as a father, husband, et cetera, but then as a scientist and then as an educator as well. And I always like to ask the question, you know, Galileo lived, my hero, it was Galileo, and not just because he has AI in his name, but, you know, he was a professor. He was a scientist, but he also had to make money. And so he would make telescopes only if he could sell the instruction manual, which was like the Sidereus Nuncius and the Dialogue and so forth. But I call what you and I do, the, you know, the second oldest profession, because there've been professors since the year 1000 in Bologna, Italy, and then Oxford, obviously, and then Sorbonne, you know, I don't have to tell you. How do you see AI impacting it? Why should someone listen, I'll say for me,

Starting point is 01:09:56 why should they learn cosmology, general relativity from Brian Keating when they can learn from this guy, Albert Einstein? What is the threat and the opportunity for me and my colleagues as professors? I mean, it's clear that the profession of knowledge transfer is going to be transformed pretty deeply.

Starting point is 01:10:13 And we might need to find sort of new economic models for research, for science, for scholarship, you know, which, you know, plays an important role in society, which is not necessarily linked with education. Now, I think there is still the whole idea of, you know, the PhD and the advisor, which is a bit like, you know, the Jedi master and the Padawan, right? I mean, that kind of relationship, I think, would still exist. Because it's not just the, you know, it's not just the knowledge, right? I mean, I don't know about you, but I probably learn as much from my students as they learn from me.

Starting point is 01:10:55 It's just a different type of learning that takes place or communication of information. But the interaction, I think, is important in terms of behavior, ethics, you know, what is interesting, what to work on, you know, what's the practice of science and research. In a way, in the future with AI, all of us will be a boss of a team of virtual people, right?

Starting point is 01:11:22 You can think of it this way. And that would include business leaders, professors, I mean, anybody, right? Not leaders, just everyone. Well, we'd be like a person today with a staff of

Starting point is 01:11:36 smart people working for them. That will be true of students too. So your interaction with the student will not be with just a student. We'll be with the student, augmented by, you know, the AI systems

Starting point is 01:11:48 that the student has access to. it wonderful. I'm not threatened by it. I've tried to do, you know, even to, even though I build telescopes and, you know, look for the heat left over from the Big Bang, I'm still fascinated by it. And I actually think it's people now, we're at the beginning, right? It's said that we're at the worst that AI will ever be right now. It's only going to get better. And I can do so much, so quickly. And there's certain things you have to be, you know, kind of supervised. But, but it's, it's, it's, the thing that means so much to me is how pleasurable it is. I mean, talking with the glasses and asking about the physical world and interacting with it, but also learning. It has

Starting point is 01:12:26 already an IQ of 120 in every subject on earth almost, especially with the new Llama models. I'm so excited about those. But the last question I want to ask you comes from the namesake of this podcast. It was Arthur C. Clarke's statement that the only way to know the limits of the possible is to go into the impossible and transcend them. But he said something else. He actually said a couple of very funny things. He said, for every expert, there's an equal and opposite expert. So I use that on my department chair every now and then when I want to get out of a teaching. But he said the following, Yann, he said when an elderly, I'm not calling you elderly, but I'm saying when an older but distinguished scientist, certainly you are, says that something is possible. He is very likely to

Starting point is 01:13:07 be right. But when he says something is impossible, he is very likely to be wrong. I want to ask you, what have you been wrong about? What have you changed your mind about, if anything? Oh, I changed my mind about a number of things. One example was I, you know, in the sort of early, early days of neural nets, you know, when I started interacting with Geoff Hinton, I did my postdoc with Geoff Hinton in 87, 88. And at the time, I had negative interest in things that we, we called unsupervised learning at the time. I thought this was ill-defined. I didn't see the point of it. And Geoff thought this was like the most interesting thing, you know, he'd been working on Boltzmann machines, you know, for which he got the Nobel Prize.

Starting point is 01:13:48 Nobody uses Boltzmann machines anymore, but that's beside the point. It was a very influential view. And it was mostly for unsupervised learning. And he had this vision that the bulk of learning had to be unsupervised. He was right about this. And I basically rallied to his side in the early 2000s.

Starting point is 01:14:08 So it took a long time, you know, from the late 80s to the early 2000s before I kind of changed my mind about this, and then, you know, became a true believer in that thing. And then started kind of advocating for it in more explicit terms, you know, in the 2010s. But I was clearly wrong about that. Yann LeCun. Hey, Meta, how do I say thank you very much and good weekend to Yann LeCun? Merci beaucoup and bon weekend to you, Yann.

Starting point is 01:14:37 This has been fascinating. There's one thing I need to tell you because you were talking about telescopes. So I don't build telescopes, but I do astrophotography as an amateur. So I don't know science, but I take pretty pictures, you know. I'd love to see it. Maybe I follow you on Twitter, Instagram, so maybe you'll post some stuff there.

Starting point is 01:14:58 That would be wonderful. I've posted a few pictures in the past, yeah. Yeah, that's great. Yeah, I mean, the original astrophotographer is Galileo, who sketched out the feelings that he had about the universe, not just what he saw. Yann LeCun, this has been fascinating. I can't wait to meet you maybe in person,

Starting point is 01:15:14 when I come to visit the Simons Foundation, the Flatiron Institute. It would be a pleasure to meet you.