Big Technology Podcast - NVIDIA's Plan To Build AI That Understands The Real World — With Rev Lebaredian

Episode Date: February 5, 2025

Rev Lebaredian is the Vice President, Omniverse & Simulation Technology at NVIDIA. He joins Big Technology Podcast for a conversation about NVIDIA's push to develop AI that understands the dynamics of... the real world, including physics. In this conversation, we cover how NVIDIA is building this technology, what it might be useful for (things like robotics and building common sense into AI models), how it will change labor, and even potentially warfare. We also cover how well AI video generators today understand the real world. Tune in for the first few minutes, where we discuss Lebaredian's perspective on DeepSeek and Jevons Paradox.

Transcript
Starting point is 00:00:00 Let's talk about NVIDIA's push to generate AI that understands the real world with technology that can influence the future of robotics, labor, cars, Hollywood, and more. We're joined by the company's VP of Omniverse and Simulation Technology right after this. Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond. Today, we're joined by Rev Lebaredian. He's the vice president of Omniverse and Simulation Technology at NVIDIA, and he joins us for a fascinating conversation about what may well be the next stage of AI progress: the pursuit of world models that provide common sense to AIs. Rev, I'm so happy to see you here. We actually spent some time at your headquarters a couple months back,
Starting point is 00:00:45 and I'm really glad that you're here today and to introduce you to the Big Technology audience. Welcome to the show. Thank you for having me. All right, before we jump into world models, obviously we're having this conversation in the wake of the DeepSeek revolution. I don't know what you want to call it. And everyone is talking about NVIDIA now. You're in a quiet period, so we're not going to go into financials.
Starting point is 00:01:05 But I can and do want to ask you about the technology side of this, specifically about Jevin's Paradox. I keep hearing Invidia, Jevon's Paradox. Jevon's Paradox. What is Jevins Paradox and what do you think about it? My understanding of what Jevins Paradox is essentially an economic kind of principle that As you reduce the cost of something of running it, you create more demand for it because it unlocks essentially more uses of that technology when it becomes more economically feasible to use that.
Starting point is 00:01:44 I think that really does apply in this case in the same way that it applies to almost every other important computing innovation over the last 40, 50 years, or at least as long as I've been alive. You know, at the inception of Nvidia in 1993, Nvidia selected, very carefully selected the very first computing problem to address in order to create the conditions by which we could continue innovating and keep growing that market.
Starting point is 00:02:23 And this was the problem of computer graphics and particularly rendering within computer graphics, generating these images. The reason we selected it is because it's an endless problem. No matter how much compute you throw at it, no matter how much innovation we throw at it, you always want more. And throughout the time I've been at Nvidia,
Starting point is 00:02:47 which is now 23 years, many times I've heard, well, graphics are good enough. Rendering is good enough. And so soon, NVIDIA's big GPUs and more computing power, it's not gonna be necessary. We'll just get consumed by SOCs or integrated into another chip as integrated graphics and it'll disappear.
Starting point is 00:03:11 But that never happened because the fundamental problem of simulating the physics of light and matter was endless. We see this in almost every important computing domain. AI is one of these things. I mean, can we really say that we have now reached the point where our computers are intelligent enough or the intelligence we create is good enough? And so it's just going to shrink. We're not going to have any more use for more compute power there.
Starting point is 00:03:46 I don't think so. I think intelligence is something that is probably, the most endless of all, all computing problems. If we can throw more compute at the problem, we can make more intelligence and do it better and better. So making, making AI more efficient will just increase its economic value in, in many, many of the applications we want to apply it to and increase demand. And can we talk about the progression of AI models becoming more efficient. I know it's like a hot topic right now, but it does seem to me that over the past couple years, we've definitely seen models become more and more efficient. So what can you
Starting point is 00:04:29 tell us about? We'll just talk about large language models on this front, the efficiency gains that we've seen over time with them. I mean, this isn't new. This has been happening for the past 10, 12 years or so, essentially since we first discovered deep learning on our GPUs with AlexNet. If you look at the computational curve, what our GPUs can do in terms of tensor operations, the AI kind of math that we need to do, over the last 10 years,
Starting point is 00:05:10 we've had essentially a million X performance increase. And that increase isn't just from the raw hardware. It's also through many layers of the software algorithms. So we're getting these benefits, these speedups continuously at a very rapid rate exponentially by compounding many layers, all the different layers at which this computing happens. From the fundamental hardware, the chips themselves, at systems level, networking, system software, algorithms, frameworks, and so on. So what we've seen here with DeepSeek is a great advancement that's on the same curve that
Starting point is 00:05:57 we've been on for a decade now. Okay. And 23 years at Nvidia, I'm going to save a question to ask you about that as we get later on or towards the end of the interview because I'm very curious what your experience has been being at Nvidia for so long, especially given that, you know, the company's technology, at least from the outside world. was viewed as in favor and then people question it and back in favor, if people question, obviously we see what's going on now.
Starting point is 00:06:22 We're leaving through a mini cycle at this point. So I'm very curious about your experience, but I want to talk about the technology first. And let me just bring you into a conversation that we had here on the show with Jan Lecun, who's met as chief AI scientists, really right after chat cheap T came out. And one of the things that Jan did was he said, go ask chat GPT, what happens if you let go of a piece of paper
Starting point is 00:06:46 with your left hand and I typed it in it gave a very convincing answer it was completely wrong because with text you don't have the common sense about physics and tries you might to teach a model physics with text you can't there's just not enough literature that describes what happens when you drop a paper with a hand and therefore the models are limited and yon's point here was basically like if you want to get to truly intelligent machines you need to build a something into the AI that teaches common sense, that teaches physics, and you need to look beyond words to do that. And so now I turn it over to you, Rev, because I do think that right now within NVIDIA, a big initiative is to build a picture of the world to teach AI models that common
Starting point is 00:07:36 sense that Yon had mentioned was lacking. And I have some follow-ups about it, but I want to hear first a little bit about what you're doing and whether your efforts are geared towards solving the problem that Jan brought up. Well, what Jan said is absolutely true, and it makes intuitive sense, right? If an AI has only been trained on words, on text that we've digitized, how can it possibly know about concepts from our physical world, like what the color red really is, what it means to hear sound? what it, what it means to, to feel, uh, felt. You know, it can't, it can't know those things because it never experienced them. When, um, when we train a model, essentially what we're doing is we're providing life experience to that model and it's, and it's pulling apart patterns or
Starting point is 00:08:37 it's discerning patterns from all of the experience that we give it. And what's, what was really, really amazing about GPT, the advancements with LLMs, you know, starting with the Transformer, is that we could take this really, really complex set of rules that humans had no way of actually defining directly in a clear and robust manner, the rules of language. And we were able to pull that out of a corpus of data. we took all of this text, all these books, and whatever information you could scrape from the internet about that. And somehow this model figured out what all the patterns of language are in many different languages and could then, because it understands the fundamental rules
Starting point is 00:09:35 of language, do some amazing things. It could generate new text, it could style some text that you give it in a different way. It can translate text from one form to another, from one language to another. It can do all of this awesome stuff. But it lacks any information about our world other than what's been described in those words. And so the next step is, the next step in AI is for us to take the same fundamental technology we have, this machine we have, where we can feed it, life, and it figures out what the patterns and the rules are and feed it with actual data about our physical world and about how our world works so that it could apply that same learning to the rules of physics. Instead of the rules of grammar, the rules of language, it's going to understand how the physical world around us works.
Starting point is 00:10:34 And our thesis is that from all the AIs we're going to create into the future, the most valuable ones are going to be the ones that can interact with our physical world, the world that we experience around us, the world created out of atoms. Today, the AIs that we're creating are largely about our world of knowledge, our world of information, ones and zeros, things that you could easily represent inside a computer in the digital. world. But if we can apply the same AI technology to the physical world around us, then essentially we unlock robotics. We can have these agents with this intelligence and even superintelligence in specific tasks. Do amazing things in the world around us, which is if you look at global markets, if you look at all of the commerce happening in the world and GDP, the world of knowledge, information technology is somewhere between $2 to $5 trillion a year, but everything else, transportation, manufacturing, supply chain warehouse and logistics, creating drugs,
Starting point is 00:11:59 all the stuff in the physical world, that's about $100 trillion. So the application of this kind of AI to the physical world is going to bring more value to us. So it's interesting. It's not just basically inputting that real world knowledge into LLMs, right, so they can get the question about dropping the paper with a hand, correct? It is also something that you're working on is building the foundation. for robots to go out into our world and operate within it? So, yes, it's not inputting it in the same way that we do for these text models. We're not just going to describe with words what happens when you drop a piece of paper.
Starting point is 00:12:48 We're going to give these models other senses during the learning process. So they'll watch videos of paper dropping. We can also give it more accurate, specific information in the 3D realm. Because we can simulate these physical worlds inside a computer today, we have physics simulations of worlds, we can pull ground truth data about the position and orientation and state of things inside that 3D world
Starting point is 00:13:28 and use that as another mode of input into these models. And so what we'll end up with is a world foundation model that was trained on many different modes of data, essentially different sense. It can see, it can hear, it can touch and feel and do many of the things we can do or many things other animals or even things no creature can do because we can provide it with sensors that don't exist inside the natural world. And it can, from that, kind of decipher what are the actual combined rules of the world?
Starting point is 00:14:09 And this encoding of the knowledge of how the physical world works can then be the basis for us to build agents inside the real world, to build the brains of these agents, otherwise known as physical robots. Right. And so this is your recently announced Cosmos project. So talk a little bit about like what Cosmos is. I mean, obviously it's a world foundational model. But where, how long you've been building it and what type of companies and developers might use it and what they might use it for? We've been we've been working towards Cosmos.
Starting point is 00:14:50 for probably about 10 years, we envisioned that eventually this new technology that had formed with deep learning, that that was going to be the critical technology necessary for us to create robot brains. And that is ultimately what's going to unlock this incredible amount of value for us. So we started working towards this a long time ago. We realized early on that the big problem we were going to have is in order to train such a model to train a robot brain to understand the physical world and to work within it, we're going to have to give it experience. We're going to have to give it the data that represents the physical world. And capturing this data from the real world
Starting point is 00:15:48 is not really an easy thing to do. It's very expensive and in some cases, very dangerous. For example, for self-driving cars, which is a type of robot, it's a robot that can autonomously, you know, on its own, figure out how to get from point A to point B by controlling this physical being, a car, by braking and accelerating and steering,
Starting point is 00:16:14 how are we gonna ensure that a self-driving car really understands when a child runs into the street as it's barreling down the street, that it should stop? And how can we be sure that it's actually gonna do that without actually doing that in the real world? We don't wanna go capture data of a child running across the street. Well, we can do that by simulating it
Starting point is 00:16:39 inside a computer. And so we realized this early on. So we set about applying all of the technologies we'd been working on up into that point with computer graphics and for video games and video game engines and physics inside these worlds to create a system to do world simulation that was physically accurate
Starting point is 00:17:03 so that we could then train these AIs. And so we call that operating system, if you will, Omniverse. It's a system to create these physics simulations, which we then use to train AIs that we could test in that same simulation before we put them out in the real world. So we use it for our self-driving cars and other robots out there. So building Cosmos actually starts first with simulating the world. And so we've been building that stack and those computers for quite a while. Once the transformer model was introduced and we started seeing the amazing things large language models can do and the chat GPT moment came,
Starting point is 00:17:57 we understood that this essentially unlocked the one thing that we needed to really push forward in robotics, which is the ability to have this kind of general intelligence about a really complex set of things, complex set of rules. And so we set about building what is Cosmos today essentially a few years ago, using all of the technology we had built before with simulation and AI training. And what Cosmos is, it's actually a few things. It's a collection of some open weight models that we've made freely available. Along with it, we also provide essentially all of the tooling and pipelines you need to create a new World Foundation model.
Starting point is 00:18:56 So we give you the World Foundation models that we've started training, which are world class, especially for the purposes of building physical AI. And we also have a what's called a tokenizer, which are AIs themselves, that are world class. It's a critical element of building world foundation models. And then we have curation pipelines. The data that you select and curate to feed into the training of your World Foundation model is critical.
Starting point is 00:19:30 and just selecting the right data requires a lot of AI in it of itself. So we released all of this stuff and we put it out there in the open so that the whole community can join us in building physical AI. And so who's going to use it? Is it going to be robotics developers? Is it going to be somebody that's building, let's say, LM-based application, but just wants them to be a little smarter? Both? It will be all of them, yes. It's a, we, we feel that we're, as a, as, as the industry, the world is right at the beginnings of this physical AI revolution.
Starting point is 00:20:12 And no one company, no one organization is going to be able to build everything that's, that, that we need. So, so we're building it out there in the open, uh, to encourage others to come build on top of what we've built and come build it with us. And this is going to be essentially anybody that has an application that involves the physical world. And so that's definitely robotics companies are part of this and robotics in the very general sense. That includes self-driving car companies, Robotaxi companies, and as well as companies building robots that are in our factories and warehouses. Anybody that wants to make intelligent robots that have perception and operate
Starting point is 00:21:01 autonomously inside the real world, they want this. But it's not only about robots in the way we think about them as these agents that move around. We have sensors that we're placing in our spaces, in our cities, in urban environments, inside buildings. These sensors, need to understand what's happening in that world, maybe for security reasons, for coordinating other robots, changing the climate and energy efficiency of our buildings and data centers. So there's many applications of physical AI that are broader than what we generally think of as what you imagine when you say a robotic application. There's going to be thousands and thousands of companies that build these physical AIs, and this is just the beginning.
Starting point is 00:22:06 Now, you mentioned that the transformer model was an important development on this path, and that obviously was the thing that underpinned a lot of the real innovation we've seen with large language models. Can the real world AI learn from the knowledge base that has been sort of turned into these AI models with text? Like if you're, if you have a model that's trying to understand the world with common sense, do they take text as an input? They take all of it as input. How does it work with them with text? I mean, it's very interesting because it seems like that's like when we talk about the progression towards general intelligence, that is a very, you know, kind of amazing application of being able to read something and then sort of intuit what it means in a physical space, don't you think? Yeah, I think the way I think about it, and I think this is right, is these AIs learn the same way we do. When you're brought into this world, you don't know who is mommy, who is daddy.
Starting point is 00:23:12 You don't even know how to see yet. You don't have depth perception. You can't see color or understand what it is. You don't know language. You don't know these things. but you learn by being bombarded with all of this information simultaneously through the many different senses. So when your mommy looks at you and says, I'm mommy pointing, you're getting multiple modes of information coming at you, including essentially that text that's coming through an audio form there. And then eventually when you learn how to read, you learn how to read because a teacher points at letters and then words and sounds them out.
Starting point is 00:23:56 So you have this association that you build between the information that you understand, like mommy, and the letters that mean that thing. AIs learned in the same way. When we train them, if you give them all of these modes of information associated with each other at the same time, it'll associate them together. That's how image generators work today. When you go generate an image using a text prompt, the reason why it can generate, you know, an image of a red ball in a grass field in an overcast day is because when it was trained, there was an association of some text along with the images that were fed into it. It knew that during the training process that these words were related to that image. And so we can gather that understanding from that association. What we're trying
Starting point is 00:25:06 to do with World Foundation models is take it to the next level by giving it more, of information and richer information, but part of that will still include text. We'll feed in the text along with the video and other ground truth information from the physical state of the world. Yeah, so this is going to be a multi-part question, and I apologize, but I don't really know another way to ask it. So what are the other modes of information that you're feeding in there? And do you really need to go through this simulation process?
Starting point is 00:25:41 And I'll tell you, you know, it all sounds like a worthwhile endeavor to me, and I'm sure it is. But I also see video models today. And that is something that's really surprised me when we've seen the video generation models, is that they really have an understanding of physics. Like, they know just as an image, like an image generation is not moving, right? So you know that, let's say, the guy sits on the chair. But video, you could see people walking through a field and you watch the grass move. And that means that those models inherently have a concept of how physics works, I think.
Starting point is 00:26:16 And I'm going to run it by you because you're the expert here. But like, again, and Jan's going to come on the show in a couple weeks. So maybe this is just in my mind because I'm gearing up and thinking about our last conversation. But I'm going to put this to you also. Maybe I'll ask what your answers. I'll ask him to weigh in on your answers on this. But the thing that he always talked about is a human mind is able to sort of see infinite possibilities and accept that. It doesn't break us. So if you have a pencil and you hold it up,
Starting point is 00:26:44 you know it's going to fall, but you know it could fall in infinite ways, but it's still going to fall. For an AI that's been trained on different scenarios, it's very difficult for them to understand that that pencil might fall in infinite ways when asked to generate it. However, they've been doing a very good job with the video generators of like showing that they understand that. So just to sort of reiterate, what different modes of information are you using? And why do we need this broader simulation environment or this Cosmos tool if we are getting such good results from video generation already? All very, very good question.
Starting point is 00:27:21 So first off, we use many, many modes. The primary one, though, for training Cosmos is video, just like the video generation models. But along with that, there's text. We also feed it extra information and labels that we can gather from. data, particularly when we do, when we train, when we generate the data synthetically. If you use a simulator to generate the videos, you have perfect information about everything that's going on in every pixel in that video.
Starting point is 00:28:01 We know how far each object is in each pixel. We know the depth. We know what the object is in each pixel. You can segment out all of that stuff. Traditionally, what we've done for perception training for autonomous vehicles. So we've used humans to go through and label all that information from hours and hours of video that's been collected. it's inaccurate and not complete. So from simulation, we can get perfect information
Starting point is 00:28:42 about the actual videos themselves. Now, that being said, your question about, these video models seem to really know physics and know it well. I think it is pretty amazing, you know, how much physics they do know, and it's kind of surprising we're here at this point. Like, had you asked me five years ago, would we be able to generate videos with this much physics plausibility at this stage?
Starting point is 00:29:13 I wasn't sure, actually, because I continually had been wrong for years prior to that. I didn't expect to see image classification in my lifetime until we saw it with AlexNet. But I would have bet against it. And so we're pretty far along. That being said, there's a lot of flaws in the physics we see. So you see this in the video. One of the basic things is object permanence. If you direct the video to move the camera, point away and come back,
Starting point is 00:29:48 objects that were there at the beginning of the video are no longer there or they're different, right? And so that is such a fundamental violation of the laws of physics. it's kind of hard to say, well, these models currently understand physics well. And there's a whole bunch of other things in there. You know, my life's work has been primarily computer graphics and specifically rendering, which is a 3D rendering is essentially a physics simulation. It's the simulation of how light interacts with matter and eventually reaches a sensor of some sort. We simulate what a camera would do in a 3D world and what image it would gather
Starting point is 00:30:35 from the world. When I look at a lot of these videos that are generated, I see tons and tons of flaws because when we do those simulations in rendering, we're attuned to seeing when shadows are wrong and reflections are wrong and these sorts of things. To the untrained eye. It looks plausible. It looks correct. But I think people can still kind of feel something is wrong, you know, when it's AI generated, when it's not, in the same way that for decades now, since we introduced computer graphics to visual effects in the movies, you know, when some, you don't know what it is, but if the rendering's not great in there, it just feels CG, it feels wrong. We still have that kind of uncanny valley thing going on.
Starting point is 00:31:27 That all being said, I think we're going to rapidly get better and better. So the models today have an amazing amount of knowledge about the physical world, but they're maybe at like 5, 10% of what they should understand. We need to get them to 90, 95%. Right. Yeah, I just saw a video of a tidal wave hitting some island. And I looked at it. It was like super realistic.
Starting point is 00:31:54 It was, of course, it was on Instagram because that's all Instagram is right now. 3D generate, I mean, AI generated video. And it took me a second. And it's more frequently taking me a minute to be like, oh, that's AI generated. And sometimes I have to look in the comments and just sort of trust the wisdom of the crowds on that front. But you might not be the best judge of it as well. Humans, I mean, we're not particularly good at knowing whether the physics will really
Starting point is 00:32:18 be accurate or not. This is why movies directors can take such license with the physics when they do explosions and all kinds of other fun stuff like tidal waves in there. Yeah. Well, it's just like some comedian made this joke. They're like, Neil deGrasse Tyson likes to come out after these movies like gravity and talk about how they're like scientifically incorrect. And some comedians like, yeah, well, how about the fact that George Clooney and Sandra Bullock are
Starting point is 00:32:49 the astronauts? That didn't bother you at all. But it is interesting that we can watch these videos, watch these movies and fully believe at least in the moment that they're real. Like we can allow ourselves to like sort of lose ourselves in the moment. Exactly. And just be like, yep, I'm in this story. I feel emotion right now watching, you know, George Clooney in a spaceship,
Starting point is 00:33:09 even though I know he's no astronaut. And I think for that purpose, I mean, I worked on movies. Before I was at NVIDIA, that's what I did, computer graphics for visual effects. That is a perfectly legitimate use of that technology. It's just that that level of simulation is not sufficient for building physical AI that are going to be the underpinnings or the fundamental components of a robot brain. I don't want my self-driving car or my robot operating heavy machinery in a factory to be trained on physics that doesn't match the real one. world. Even if it looks right to us, if if it's not right, then it's not going to behave correctly. And that's dangerous. So it's a it's a different purpose. That's why what we're doing
Starting point is 00:34:07 with Cosmos, it's it is really a different class of AI than video generators. You can use it to generate videos, but the purpose is different. It's not about generating beautiful imagery or interesting imagery as for art. This is about simulating the physical world, using AI to create the simulation. Rev, I want to ask you one more follow-up question about not the flaws, but the video generator's ability to get things right.
Starting point is 00:34:48 And then we're going to move on from this topic. But it is just surprising and interesting for me to hear, You and Demis Asabas, the CEO of Google DeepMind, who was just on, who commented on this, talk about how these video generators have been surprisingly good at understanding physics. And Jan also, basically in our conversations previously, effectively saying that it's very difficult for AI to solve these problems. I won't say they've solved it. But everybody's surprised they've gotten to this point.
Starting point is 00:35:16 So what is your best understanding of how they've been, though flawed, like this good? You know, this is the trillion-dollar question, I guess. You know, we've been betting now for years that if we just throw more compute and more data at the problem, that these scaling laws are going to give us a level of intelligence that's really, really meaningful. that will be like step function changes in capabilities. There's no way for us to know for sure. It's very hard to predict that. It feels like we're on an X, we are on an exponential curve with this,
Starting point is 00:36:05 but which part of the exponential curve we're on, we can't tell. So we don't know how fast that's going to happen. Honestly, I've been surprised at how, how well these transformer models have been able to extract the laws of physics to this level by this point in time? I have, at this point, I believe in a few years, we're going to get to a level of physics understanding with our AIs that's going to unlock the majority of the applications we need to apply them in in robotics.
Starting point is 00:36:50 Let me ask you one more question about this. Then we're going to take a break and talk about some of the societal implications of putting robotics, let's say, in the workforce and in, I don't know, in all different areas of our lives. There's definitely a sizable portion of the population that is going to be surprised, maybe not our listeners, but a sizable portion of the population that would be surprised to hear that Nvidia itself is building these world foundational models, releasing weights to help others build. on top of them. The perception, I think, from some on the outside is, hey, isn't Nvidia just the company that makes those chips? So what do you say to that, Rev? Well, yeah, that's been the perception. It's been the perception since I started in video 23 years ago. And it's never been true that we just build chips. Chips are very, very important part of what we do. They're the foundation that we build on. But when I joined the company, there were about a thousand
Starting point is 00:37:50 people, 1,000 employees at the time. The grand majority of them were engineers, just like today, the majority of our employees are engineers. And the majority of those engineers are software engineers. I myself am a software engineer. I wouldn't know the first thing about making a chip. And so our form of computing, accelerated computing, the form of computing we invented is a full-stack problem. It's not just a chip. It's not just a chip that we throw over the fence and leave it to others to figure out how to make use of it.
Starting point is 00:38:29 It doesn't work unless we have these layers of software, and these layers of software have to have algorithms that are harmonized with the architecture of our chips and our systems. So we have to go, In these new markets that we enter, what Jensen calls zero billion dollar industries, we have to actually go invent these new things kind of top to bottom because they don't exist yet and nobody else is going to likely to do it. So we build a lot of software and we build a lot of AI these days because that's what's necessary in order to build the computers to power all of this stuff. We did this with LLMs early on. Many, many years ago, we trained at the time what was the largest model in terms of number of parameters for an LLM.
Starting point is 00:39:27 It was called Megatron. And because we did that, we build our computers, our chips and computers and the system software and the frameworks and pipelines and everything. We were able to tune them to do these large-scale things and we put all of that, all of that software out there, which was then used to create all the LLMs we enjoy today. Had we had not done that, I don't think we would have had chat GPT. And so this is essentially the same thing. We're creating a new market, a new capability that doesn't exist. we see this as being an endeavor that is greater than NVIDIA. We need many, many others to participate in this.
Starting point is 00:40:18 But there are some things that we're uniquely positioned to contribute, given our scale and our particular expertise. And so we're going to go do that. And then we're going to make that freely available to others so they can build on it. Yeah. For those wondering why NVIDIA has such a hold in the market right now, I think you just heard the response. So I do want to take a break, and then I want to talk about the implications for society
Starting point is 00:40:43 when we have, let's say, humanoid robots doing labor in that part of the economy that we simply haven't really put AI into yet, and what it means when it's many more trillions of dollars than the knowledge work. So we're going to do that when we're back right after this. Hey, everyone. Let me tell you about the Hustle Daily Show, a podcast filled with business, tech news, and original stories to keep you in the loop on what's trending. More than 2 million professionals read The Hustle's daily email
Starting point is 00:41:13 for its irreverent and informative takes on business and tech news. Now, they have a daily podcast called The Hustle Daily Show where their team of writers break down the biggest business headlines in 15 minutes or less and explain why you should care about them. So, search for The Hustle Daily Show and your favorite podcast app, like the one you're using right now. And we're back here on Big Technology Podcast with Rev. Liberdian. he's the vice president of omniverse and simulation technology at nVIDIA rev i want to just ask you the
Starting point is 00:41:42 question that obviously has been bouncing around my mind since we started talking about the fact that you're going to enable robotics to be able to sort of a take over i don't know it's take over the right word take over a lot of what we do currently in the workforce i mean what do you think the labor implications are here because yeah if you're if you've spent your entire life you know working at a certain manual task and next thing you know someone uses the you know the cosmos platform or you're new i think it's like a grout it's called what is it called grute that's our project for humanoid robots yeah building and training human robots exactly brains so all right so group you know that some company uses grute to start to put a humanoid work for uh human labor in let's say a factory or
Starting point is 00:42:35 even as a care robot, and I'm a nurse, and all of a sudden, some root-built robot is now helping take care of the elderly. What are the labor implications of that? Well, first and foremost, I think we need to understand that this is a really hard problem. It's not like overnight, we're going to have robots replace everything humans do everywhere.
Starting point is 00:43:02 It's a very, very difficult problem. We're just now at an inflection point where we can finally, we see a line of sight to building the technology we needed to unlock the possibility of these kind of general purpose robots. And that's, we can now build a general purpose robot brain. 20 years ago, that was not true. We could have built the physical robot, the actual body of a robot, but it would have been useless because we couldn't give it a brain that would let it operate in the world in a general purpose manner. We couldn't interact with it or program it in a useful way to do anything. So that's what's been unlocked here. I talked to a lot of, CEOs and executives for companies in the industrial sector, in manufacturing, in warehousing,
Starting point is 00:44:11 to companies to retail companies. In all of these companies I talk to in every geography, there's a recurring theme. There's a demographic problem the whole world is facing. We don't have as many young people who want to do the jobs that the older people who are retiring now have been doing. If you go to an automotive factory in Detroit or in Germany, go look around, most of the factory workers are aging and they're quickly retiring. And these CEOs that I'm talking to, their biggest concern is all of that knowledge they have on how to operate those factories and work in them. It's going to be lost. The young people don't want to come and do these jobs. And so we have to solve that problem.
Starting point is 00:45:13 If we're going to maintain, not just grow our economy, but just maintain where the economy is at and produce the same amount of things, we need to find some solutions. to this problem. We don't have enough workers. We've been seeing it in transportation. There's not enough truck drivers in the world to go deliver all the stuff that's moving around in our supply chains. We can't hire enough of them. And there's less and less young people that want to do that job every year. So we need to have self-driving trucks. We need to have self-driving cars to solve that problem. So I think before we talk about replacing jobs that humans want to do, we should first be talking about using these robots to fill in the gap that's being left by humans because they don't want to do it anymore. Right. And there could be specialization. Like, take nursing, for example,
Starting point is 00:46:12 the nurse that injects me with a vaccine or the nurse that puts medication in my IV, maybe we keep that human for a while, even though, you know, they make mistakes too, but I'd feel a lot more comfortable if that was human. The nurse that takes me for a walk down the hall after I've gotten a knee replacement, that could be a robot. Maybe better it's a robot. We'll see how this plays out. We're, we believe that the first place we're going to see general purpose robots like the humanoid robots really take off is in the industrial sector because of two things. One, demand is great there because we have the shortage of workers. And also because it makes more sense to have them adopted in these spaces where a company just decides to put them in there and
Starting point is 00:47:07 mostly warehouses and factories are kind of unseen. I think the last place we're going to start seeing humanoid show up is in our homes in your kitchen. Don't tell Jeff Bezos that. Well, they will show up there, and I think it's going to be uneven. It'll depend on even geographically. They'll probably show up in a kitchen in somebody's home in Japan before they show up in a kitchen in somebody's home in Munich, in Germany. And I think that's a cultural thing. You know, I personally don't even want another human in my kitchen. I like being in my kitchen and preparing stuff myself.
Starting point is 00:47:51 My wife and I are always in each other's space there, so we get kind of annoyed. So having a humanoid robot would be kind of weird. I don't even want to hire somebody else to do that. We kind of do that ourselves. So that's a kind of personal decision. I think things like jobs like caring for our elderly and health care, those are very human. human professions, you know, there's a lot of, a lot of what the care is. It's not really about the physical thing that they're doing. It's about the emotional connection with another human.
Starting point is 00:48:29 And for that, I don't think robots are going to take that away from us anytime soon. Well, the question is, do we have enough care professionals to take those jobs? That's the one that really seems in danger. And so what's likely to happen is it'll be a combination. The care professionals we do have will do the things that require EQ, that require empathy, that requires, you know, really understanding the other human you're taking care of. And then they can instruct the robots around them to assist them to do all of the more mundane things like cleaning and and maybe giving the shots and IVs. I don't know. How long the way is that future, Rev?
Starting point is 00:49:15 How long do you think? You know, I wouldn't venture to guess on that kind of interaction in a hospital or care situation quite yet. I believe it's going to happen in the industrial sector first. And I believe that it's within a few years. We're going to see it. We're going to see humanoid robots widely used in the most advanced. manufacturing and warehousing wild okay i want to ask you about hollywood before we go um i guess i have this question rattling in my mind which is are we just going to see like movies not that movies
Starting point is 00:49:58 that look real but are computer generated like we have computer generated movies now the cg i but they all look uh pretty cg i but i imagine well they don't all look cg i some of them look right pretty amazing. Somewhat real. But I'm curious, like, do you think that, like, is Hollywood going to move to a area where it's super real and just simulated? Go ahead. Absolutely.
Starting point is 00:50:24 I mean, well, was it a year or two ago when the last planet of the apes came out? I went to go see it with my wife. Now, my wife and I have been together since I worked at Disney in the mid-90s, working on visual effects and rendering. I had a startup company doing rendering, and she was a part of that. So she has a good eye, and she's been around computer graphics and rendering for decades now. When we went to go see Planet of the Apes, even though obviously those apes were not real, at one point she turned around and said, that's all CG, right?
Starting point is 00:51:07 She couldn't quite believe it. I think what Weta did there is amazing. It's indistinguishable from real life, except for the fact that the apes were talking. Like, other than that, it's indistinguishable. The problem with that, though, is to do that level of CG in the traditional way that we've done it requires an incredible amount of artistry and skills. that only a few studios in the world can do with the teams that they have and the pipelines they've built. And it's incredibly expensive to produce that. What we're building with AI, with generative AI, and particularly with World Foundation models,
Starting point is 00:52:00 that once we get to the point where they really understand the depth of the physics that they need to, to produce something like Planet of the Aves. Once we have that, of course, of course they're going to use those technologies to produce the same images because it's going to be a lot faster and it's going to be a lot less expensive to do the same things.
Starting point is 00:52:24 It's already starting to happen. Rev, I know we're getting close to time. Do I have time for two more questions? Absolutely. Okay. So the more I think about robotics, the more I think about sort of what the application in war might be. I know that, like, you can't think of every permutation
Starting point is 00:52:44 when you're developing the foundational technology, but we are living in a world where war is becoming much more roboticized, and it's sort of, like, remarkable that we have some wars going on or people are still fighting in trenches. So I'm just curious if you've given any thought to, like, how robotics might be applied in warfare and whether there's a way to prevent some of, like, the bad uses that might come about because of it.
Starting point is 00:53:11 You know, I'm not really an expert in warfare, so I don't feel that I'm the best person to talk about how it might be is or not, but I can say this. This isn't the first time where a new technology has been introduced that is so powerful that not only can we imagine great uses of it
Starting point is 00:53:38 that are beneficial to people, but also really, really scary, devastating consequences of it being used, particularly in warfare. And somehow we've managed to not have that kind of devastation. And in general, the world has gotten better and better, more peaceful and safer, despite what it might feel like today, by almost any measure,
Starting point is 00:54:03 We have less lives lost through wars and these sorts of tragedies than ever before in mankind's history. The big one, of course, everybody always talks about, is nuclear technology. I mean, I grew up. I was a little kid in the 80s. This was kind of the height of the Cold War, the end of it. But every day, I remember thinking, thinking, you know, it might happen. We might have some ICBMs arrive in Los Angeles at any point. And it hasn't happened because somehow the general understanding by everyone collectively,
Starting point is 00:54:52 such that this would be so bad for everyone that we put together systems, even though we had intense rivalry and even enemies between the Soviet Union and the U.S., we somehow figured out that we should create a system that prevents that sort of thing. We've done the same with biological weapons and chemical weapons. Largely, they haven't been used, even though the technologies existed there. And so I think that's a good indicator. of how we should deal with this new technology, this new powerful technology of AI,
Starting point is 00:55:36 and a reason for us to be optimistic that it's possible to actually have this technology and not have it be so devastating. We can set up rules and conventions that say, even though it's possible to use AI in this way, that we shouldn't and we should all agree on that, And anybody that skirts the line on that, you know, there should be ramifications to it to disincentivize them from using it that way. Yeah, I hope you're right on that.
Starting point is 00:56:09 It seems like it's something that we're going to, as a society, deal with more and more as this stuff becomes more advanced. All right. So last one for you, you've been at Nvidia. We've talked about a couple of times, 23 years. I already teased this. So I want to, I just want to ask you, you know, the technology's been in favor. it's not been in favor, you know, you're at the top of the world right now, even though, you know, there was some hiccup last week, but whatever, it doesn't seem like it's going to be
Starting point is 00:56:35 a long-term issue. Just what's, what is like one insight you can tell us, you know, that you can draw from your time at NVIDIA about the way that the technology world works? About, well, first I can tell you about how NVIDIA works. Yeah, and the reason I'm here, I've been here for 23 years, and this will be the last job. I ever have. I'm positive of it. When I joined Nvidia, that wasn't the plan. I thought I'd be here one year, two years max, and now it's been 23 years. When I hit my 20 year mark, Jensen at our next company meeting had rattled off a bunch of stats on how long various groups have been here, how many people had been there for a year, two years, and so on. When you got to 20, there were more
Starting point is 00:57:25 than 650 people that were at 20-year. Now, earlier I had said when I joined the company, there were about 1,000 people. So this means that most of the people that were there when I started, when I started Nvidia, were still there after 20 years. I wasn't as special as I thought I was when I hit my 20-year mark. And so this is actually a very strange thing about NVIDIA, we have people that that have been here a long time and haven't left. It's strange in general for most companies, but particularly for Silicon Valley tech companies. People move around a lot. And I believe the reason why we've stayed here through all of our trials and tribulations and whatnot is because fundamentally what Jensen has built here is a company,
Starting point is 00:58:20 where people come to do their lives work, and we really mean it. Like, you feel it when you're here. This is more than just about making some money or having a job. You come here to do great work and to do your life's work. And so the idea of leaving just,
Starting point is 00:58:41 it feels painful to me, and I think it is to many others. That's what's actually, I think behind why, despite the fact that NVIDIA's had its ups and downs. And you can go back to look at our stock chart going back to like the mid-2000s. We introduced Kuta in 2006, and that was a really important thing, and we stuck to it. The analysts and nobody wanted us to keep sticking to it, but we kept investing in it. And our stock price took a huge hit, and it was flat.
Starting point is 00:59:19 there for a long time flat or dropping and then it finally happened AI was born on our GPU that's what we were waiting for and we went we went all in on that and we've had ups and down since then we'll we'll continue to have ups and downs but I think the trend is going to still be up into the right because this is an amazing place where where people who want to do their life's work, the best people in the world at what we do, wanted their life's work, they come here and they stay here. Yep. Well, Rev, look, it's always such a pleasure to speak with you. I really enjoyed our time together at NVIDIA headquarters. That was a really fun day. We did some cool demos,
Starting point is 01:00:03 and I appreciate that. And I'm just thrilled to get a chance to speak with you about this technology today. It is fascinating technology. It is cutting edge. Obviously brings up a lot of questions, some of which we got to today. I'm sure we could have talked for three hours. and I hope to keep the conversation up. So thanks for coming on the show. Thank you for inviting me and I hope we do talk for three hours one day. That'll be great.
Starting point is 01:00:25 All right, everybody. Thank you for listening. Ron John and I will be back to break down the news on Friday. Already a lot of news this week with Open AI's deep research coming out. I just paid $200 for chat GPT, which is a lot more than I ever thought I would for a month. But that's where we are today. So we're going to talk about that and more on Friday.
Starting point is 01:00:41 Thanks for listening and we'll see you next time on Big Technology Podcast.
