Big Technology Podcast - NVIDIA's Plan To Build AI That Understands The Real World — With Rev Lebaredian
Episode Date: February 5, 2025
Rev Lebaredian is the Vice President, Omniverse & Simulation Technology at NVIDIA. He joins Big Technology Podcast for a conversation about NVIDIA's push to develop AI that understands the dynamics of... the real world, including physics. In this conversation, we cover how NVIDIA is building this technology, what it might be useful for (things like robotics and building common sense into AI models), how it will change labor, and even potentially warfare. We also cover how AI video models today possess a solid understanding of the real world. Tune in for the first few minutes where we discuss Lebaredian's perspective on DeepSeek and Jevons Paradox.
Transcript
Let's talk about NVIDIA's push to generate AI that understands the real world with technology that can influence the future of robotics, labor, cars, Hollywood, and more.
We're joined by the company's VP of Omniverse and Simulation Technology right after this.
Welcome to Big Technology Podcast, a show for cool-headed, nuanced conversation of the tech world and beyond.
Today, we're joined by Rev Lebaredian. He's the vice president of Omniverse and Simulation Technology at NVIDIA,
for a fascinating conversation about what may well be the next stage of AI progress,
the pursuit of world models that provide common sense to AIs.
Rev, I'm so happy to see you here.
We actually spent some time at your headquarters a couple months back,
and I'm really glad that you're here today and to introduce you to the big technology audience.
Welcome to the show.
Thank you for having me.
All right, before we jump into world models,
obviously we're having this conversation in the wake of the DeepSeek revolution.
I don't know what you want to call it.
And everyone is talking about NVIDIA now.
You're in quiet period, so we're not going to go into financials.
But I can and do want to ask you about the technology side of this, specifically about Jevons Paradox.
I keep hearing NVIDIA, Jevons Paradox.
Jevons Paradox.
What is Jevons Paradox and what do you think about it?
My understanding of Jevons Paradox is that it's essentially an economic principle that
as you reduce the cost of running something, you create more demand for it, because it
unlocks essentially more uses of that technology when it becomes more economically feasible
to use it.
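A minimal arithmetic sketch of that principle, with made-up numbers; the elasticity and cost figures below are illustrative assumptions, not values from the conversation:

```python
# Minimal, purely illustrative sketch of Jevons Paradox: when the unit cost of a
# resource (here, "AI inference") drops, demand can grow so much that total
# consumption of the underlying resource rises. All numbers are hypothetical.

def demand(unit_cost: float, elasticity: float = 1.5, k: float = 1_000.0) -> float:
    """Constant-elasticity demand curve: usage grows as cost falls."""
    return k * unit_cost ** (-elasticity)

for unit_cost in [1.00, 0.50, 0.10]:          # cost per unit of compute, in dollars
    usage = demand(unit_cost)                  # units of compute demanded
    total_spend = unit_cost * usage            # total spent on compute, in dollars
    print(f"cost/unit=${unit_cost:.2f}  usage={usage:,.0f} units  total spend=${total_spend:,.0f}")

# With elasticity > 1, cutting the unit cost 10x grows usage by more than 10x,
# so total spending on compute goes up, not down.
```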
I think that really does apply in this case in the same way that it applies to almost every
other important computing innovation over the last 40, 50 years, or at least as long as
I've been alive.
You know, at the inception of Nvidia in 1993,
Nvidia selected, very carefully selected
the very first computing problem to address
in order to create the conditions by which
we could continue innovating and keep growing that market.
And this was the problem of computer graphics
and particularly rendering within computer graphics,
generating these images.
The reason we selected it is because it's an endless problem.
No matter how much compute you throw at it,
no matter how much innovation we throw at it,
you always want more.
And throughout the time I've been at Nvidia,
which is now 23 years, many times I've heard,
well, graphics are good enough.
Rendering is good enough.
And so soon,
NVIDIA's big GPUs and more computing power,
it's not gonna be necessary.
It'll just get consumed by SoCs or integrated into another chip
as integrated graphics and it'll disappear.
But that never happened because the fundamental problem
of simulating the physics of light and matter was endless.
We see this in almost every important computing domain.
AI is one of these things.
I mean, can we really say that we have now reached the point where our computers are intelligent enough
or the intelligence we create is good enough?
And so it's just going to shrink.
We're not going to have any more use for more compute power there.
I don't think so.
I think intelligence is something that is probably
the most endless of all computing problems. If we can throw more compute at the problem,
we can make more intelligence and do it better and better. So making AI more efficient
will just increase its economic value in many of the applications we want to apply
it to and increase demand. And can we talk about the progression of AI models becoming more
efficient? I know it's a hot topic right now, but it does seem to me that over the past
couple of years, we've definitely seen models become more and more efficient. So what can you
tell us about, we'll just talk about large language models on this front, the efficiency
gains that we've seen over time with them? I mean, this isn't new. This has been happening
for the past 10, 12 years or so, essentially since we first discovered deep learning
on our GPUs with AlexNet.
If you look at the computational curve,
what our GPUs can do in terms of tensor operations,
the AI kind of math that we need to do,
over the last 10 years,
we've had essentially a million X performance increase.
And that increase isn't just from the raw hardware.
It's also through many layers of the software algorithms.
So we're getting these benefits, these speedups continuously at a very rapid rate exponentially
by compounding many layers, all the different layers at which this computing happens.
From the fundamental hardware, the chips themselves, at systems level, networking, system software,
algorithms, frameworks, and so on.
So what we've seen here with DeepSeek is a great advancement that's on the same curve that
we've been on for a decade now.
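A toy illustration of how speedups compound across layers of the stack. The per-layer factors below are invented for illustration; only the compounding structure reflects the point being made:

```python
# Toy illustration of compounding speedups across the computing stack.
# The per-layer gain factors are made up; the point is that large gains at
# each layer multiply into a very large end-to-end improvement over a decade.

hypothetical_gains_over_a_decade = {
    "chip architecture / process":                         100.0,
    "system & networking scale-out":                        20.0,
    "system software & libraries":                          10.0,
    "algorithms (e.g. better attention, lower precision)":  50.0,
}

total = 1.0
for layer, factor in hypothetical_gains_over_a_decade.items():
    total *= factor
    print(f"{layer:<55s} x{factor:>6.0f}  (cumulative x{total:,.0f})")

print(f"\nEnd-to-end speedup: roughly x{total:,.0f}")  # ~1,000,000x in this toy example
```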
Okay.
And 23 years at Nvidia, I'm going to save a question to ask you about that as we get
later on or towards the end of the interview because I'm very curious what your experience
has been being at Nvidia for so long, especially given that, you know, the company's
technology, at least from the outside world,
was viewed as in favor, then people questioned it, then it was back in favor,
then people questioned it again, and obviously we see what's going on now.
We're living through a mini cycle at this point.
So I'm very curious about your experience,
but I want to talk about the technology first.
And let me just bring you into a conversation that we had here on the show
with Yann LeCun, who's Meta's chief AI scientist,
really right after ChatGPT came out.
And one of the things that Yann did was he said,
go ask ChatGPT what happens if you let go of a piece of paper
with your left hand. And I typed it in, and it gave a very convincing answer. It was completely wrong,
because with text you don't have the common sense about physics. Try as you might to teach a
model physics with text, you can't. There's just not enough literature that describes what happens
when you drop a paper with a hand, and therefore the models are limited. And Yann's point here was
basically, like, if you want to get to truly intelligent machines, you need to build
something into the AI that teaches common sense, that teaches physics, and you need to look beyond
words to do that. And so now I turn it over to you, Rev, because I do think that right now within
NVIDIA, a big initiative is to build a picture of the world to teach AI models that common
sense that Yann had mentioned was lacking. And I have some follow-ups about it, but I want to hear
first a little bit about what you're doing and whether your efforts are geared towards
solving the problem that Yann brought up.
Well, what Yann said is absolutely true, and it makes intuitive sense, right?
If an AI has only been trained on words, on text that we've digitized, how can it possibly know about concepts from our physical world, like what the color red really is, what it means to hear sound?
what it means to feel felt. You know, it can't know those things
because it never experienced them. When we train a model, essentially what we're doing
is we're providing life experience to that model, and it's pulling apart patterns, or
it's discerning patterns, from all of the experience that we give it. And what was really,
really amazing about GPT, the advancements with LLMs, you know, starting with the Transformer,
is that we could take this really, really complex set of rules that humans had no way of
actually defining directly in a clear and robust manner, the rules of language.
And we were able to pull that out of a corpus of data.
We took all of this text, all these books, and whatever information you could scrape from
the internet about that. And somehow this model figured out what all the patterns of language
are in many different languages and could then, because it understands the fundamental rules
of language, do some amazing things. It could generate new text, it could style some text
that you give it in a different way.
It can translate text from one form to another, from one language to another.
It can do all of this awesome stuff.
But it lacks any information about our world other than what's been described in those words.
And so the next step is, the next step in AI is for us to take the same fundamental technology we have, this machine we have, where we can feed it life,
and it figures out what the patterns and the rules are and feed it with actual data about our physical world and about how our world works so that it could apply that same learning to the rules of physics.
Instead of the rules of grammar, the rules of language, it's going to understand how the physical world around us works.
And our thesis is that from all the AIs we're going to create into the future, the most valuable ones are going to be the ones that can interact with our physical world, the world that we experience around us, the world created out of atoms.
Today, the AIs that we're creating are largely about our world of knowledge, our world of information, ones and zeros, things that you could easily represent inside a computer in the digital
world. But if we can apply the same AI technology to the physical world around us, then essentially
we unlock robotics. We can have these agents, with this intelligence and even superintelligence
in specific tasks, do amazing things in the world around us. If you look at global
markets, if you look at all of the commerce happening in the world and GDP, the world of
knowledge, information technology, is somewhere between $2 and $5 trillion a year, but everything
else, transportation, manufacturing, supply chain, warehousing and logistics, creating drugs,
all the stuff in the physical world, that's about $100 trillion.
So the application of this kind of AI to the physical world is going to bring more value to us.
So it's interesting.
It's not just basically inputting that real world knowledge into LLMs, right, so they can get the question about dropping the paper with a hand, correct?
Is it also something that you're working on, building the foundation
for robots to go out into our world and operate within it?
So, yes, it's not inputting it in the same way that we do for these text models.
We're not just going to describe with words what happens when you drop a piece of paper.
We're going to give these models other senses during the learning process.
So they'll watch videos of paper dropping.
We can also give it more accurate, specific information
in the 3D realm.
Because we can simulate these physical worlds inside a computer today,
we have physics simulations of worlds,
we can pull ground truth data about the position and orientation
and state of things inside that 3D world
and use that as another mode of input into these models.
And so what we'll end up with is a world foundation model
that was trained on many different modes of data,
essentially different senses.
It can see, it can hear, it can touch and feel and do many of the things we can do
or many things other animals or even things no creature can do
because we can provide it with sensors that don't exist inside the natural world.
And it can, from that, kind of decipher what are the actual combined rules of the world?
And this encoding of the knowledge of how the physical world works can then be the basis for us to build agents inside the real world,
to build the brains of these agents, otherwise known as physical robots.
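As a rough sketch of what "many different modes of data" can look like at the data level, here is one hypothetical training-sample layout; the field names and shapes are invented for illustration and are not NVIDIA's actual schema:

```python
# Hypothetical layout of one multimodal training sample for a world foundation
# model. Field names and shapes are invented for illustration; they are not
# NVIDIA's actual data schema.
from dataclasses import dataclass
from typing import List

@dataclass
class WorldModelSample:
    video_frames: List[bytes]        # RGB frames from a camera (real or simulated)
    audio_waveform: List[float]      # microphone signal, if available
    text_description: str            # e.g. "a sheet of paper falls from a hand"
    # Ground truth that only a simulator can provide cheaply and exactly:
    object_poses: List[List[float]]  # per-frame [x, y, z, qx, qy, qz, qw] for an object
    depth_maps: List[List[float]]    # per-pixel distance to the camera, per frame
    contact_forces: List[float]      # simulated touch / force-torque readings

sample = WorldModelSample(
    video_frames=[b"<frame0>", b"<frame1>"],
    audio_waveform=[0.0, 0.01, -0.02],
    text_description="a sheet of paper falls from a hand",
    object_poses=[[0.0, 1.2, 0.0, 0, 0, 0, 1], [0.0, 1.1, 0.02, 0, 0, 0, 1]],
    depth_maps=[[1.5, 1.6], [1.5, 1.6]],
    contact_forces=[0.0, 0.0],
)
print(sample.text_description, "-", len(sample.video_frames), "frames")
```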
Right.
And so this is your recently announced Cosmos project.
So talk a little bit about like what Cosmos is.
I mean, obviously it's a world foundational model.
But how long have you been building it, and what type of companies and developers might use it, and what might they use it for?
We've been working towards Cosmos
for probably about 10 years. We envisioned that eventually this new technology that had formed
with deep learning was going to be the critical technology necessary for us to create
robot brains. And that is ultimately what's going to unlock this incredible amount of value for us.
So we started working towards this a long time ago. We realized early on that the big problem
we were going to have is in order to train such a model to train a robot brain to understand
the physical world and to work within it, we're going to have to give it experience. We're going to have to
give it the data that represents the physical world.
And capturing this data from the real world
is not really an easy thing to do.
It's very expensive and in some cases, very dangerous.
For example, for self-driving cars,
which is a type of robot,
it's a robot that can autonomously, you know,
on its own, figure out how to get from point A to point B
by controlling this physical being, a car,
by braking and accelerating and steering,
how are we gonna ensure that a self-driving car
really understands when a child runs into the street
as it's barreling down the street,
that it should stop?
And how can we be sure that it's actually gonna do that
without actually doing that in the real world?
We don't wanna go capture data of a child running across the street.
Well, we can do that by simulating it
inside a computer.
And so we realized this early on.
So we set about applying all of the technologies
we'd been working on up into that point
with computer graphics and for video games
and video game engines and physics inside these worlds
to create a system to do world simulation
that was physically accurate
so that we could then train these AIs.
And so we call that operating system, if you will,
Omniverse. It's a system to create these physics simulations, which we then use to train
AIs that we could test in that same simulation before we put them out in the real world.
So we use it for our self-driving cars and other robots out there.
So building Cosmos actually starts first with simulating the world.
And so we've been building that stack and those computers for quite a while.
Once the transformer model was introduced and we started seeing the amazing things large language models can do and the ChatGPT moment came,
we understood that this essentially unlocked the one thing that we needed to really push forward in robotics,
which is the ability to have this kind of general intelligence about a really complex set of things,
complex set of rules.
And so we set about building what is Cosmos today essentially a few years ago, using all of the
technology we had built before with simulation and AI training.
And what Cosmos is, it's actually a few things.
It's a collection of some open weight models that we've made freely available.
Along with it, we also provide essentially all of the tooling and pipelines you need to create a new World Foundation model.
So we give you the World Foundation models that we've started training, which are world class,
especially for the purposes of building physical AI.
And we also have what's called a tokenizer,
which is an AI itself, that is world class.
It's a critical element of building world foundation models.
And then we have curation pipelines.
The data that you select and curate to feed into the training
of your World Foundation model is critical.
and just selecting the right data requires a lot of AI in and of itself.
So we released all of this stuff and we put it out there in the open so that the whole community can join us in building physical AI.
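At a very high level, the pieces described here, curation pipelines, a tokenizer, and open-weight base models you can post-train, fit together roughly like the sketch below. Every function name is a placeholder standing in for a real component; none of these are actual Cosmos APIs:

```python
# High-level sketch of how the pieces described above fit together when building
# or post-training a world foundation model. Every function here is a placeholder;
# none of these are actual Cosmos APIs.

def curate(raw_clips):
    """Data curation: filter, deduplicate, and balance raw video for training."""
    return [clip for clip in raw_clips if clip["quality"] >= 0.8]

def tokenize(clip):
    """Video tokenizer: compress frames into tokens the model consumes."""
    return [hash((clip["id"], i)) % 4096 for i in range(clip["num_frames"])]

def post_train(base_model, token_sequences):
    """Fine-tune an open-weight base model on domain data (e.g. your robot's warehouse)."""
    base_model["seen_sequences"] += len(token_sequences)
    return base_model

raw_clips = [
    {"id": "forklift_01", "num_frames": 8, "quality": 0.9},
    {"id": "blurry_cam", "num_frames": 8, "quality": 0.3},
]
base_model = {"name": "world-foundation-base", "seen_sequences": 0}

curated = curate(raw_clips)                      # low-quality clip is dropped
tokens = [tokenize(clip) for clip in curated]    # each clip becomes a token sequence
model = post_train(base_model, tokens)
print(model)                                     # {'name': 'world-foundation-base', 'seen_sequences': 1}
```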
And so who's going to use it? Is it going to be robotics developers?
Is it going to be somebody that's building, let's say, an LLM-based application, but just wants it to be a little smarter?
Both?
It will be all of them, yes.
We feel that we, as an industry, as the world, are right at the
beginning of this physical AI revolution.
And no one company, no one organization is going to be able to build everything that we
need.
So we're building it out there in the open to encourage others to come build on top of
what we've built and come build it with us. And this is going to be essentially anybody that
has an application that involves the physical world. And so that's definitely robotics companies
are part of this and robotics in the very general sense. That includes self-driving car companies,
Robotaxi companies, and as well as companies building robots that are in our factories and
warehouses. Anybody that wants to make intelligent robots that have perception and operate
autonomously inside the real world, they want this. But it's not only about robots in the way
we think about them as these agents that move around. We have sensors that we're placing in our
spaces, in our cities, in urban environments, inside buildings. These sensors
need to understand what's happening in that world, maybe for security reasons, for coordinating
other robots, changing the climate and energy efficiency of our buildings and data centers.
So there's many applications of physical AI that are broader than what we generally think
of as what you imagine when you say a robotic application.
There's going to be thousands and thousands of companies that build these physical AIs, and this is just the beginning.
Now, you mentioned that the transformer model was an important development on this path, and that obviously was the thing that underpinned a lot of the real innovation we've seen with large language models.
Can the real world AI learn from the knowledge base that has been sort of turned into these AI models with text?
Like if you're, if you have a model that's trying to understand the world with common sense, do they take text as an input?
They take all of it as input.
How does that work with them, with text?
I mean, it's very interesting because it seems like that's like when we talk about the progression towards general intelligence, that is a very, you know, kind of amazing application of being able to read something and then sort of intuit what it means in a physical space, don't you think?
Yeah, I think the way I think about it, and I think this is right, is these AIs learn the same way we do.
When you're brought into this world, you don't know who is mommy, who is daddy.
You don't even know how to see yet.
You don't have depth perception.
You can't see color or understand what it is.
You don't know language.
You don't know these things.
but you learn by being bombarded with all of this information simultaneously through the many different senses.
So when your mommy looks at you and says, I'm mommy, while pointing, you're getting multiple modes of information coming at you, including essentially that text that's coming through an audio form there.
And then eventually when you learn how to read, you learn how to read because a teacher points at letters and then words and sounds them out.
So you have this association that you build between the information that you understand, like mommy, and the letters that mean that thing.
AIs learned in the same way.
When we train them, if you give them all of these modes of information associated with each other at the same time, it'll associate them together.
That's how image generators work today.
When you go generate an image using a text prompt, the reason why it can generate, you know, an image of a red ball in a grass field on an overcast day is because
when it was trained, there was an association of some text along with the images that
were fed into it. It knew during the training process that these words were related
to that image. And so we can gather that understanding from that association. What we're trying
to do with World Foundation models is take it to the next level by giving it more modes
of information and richer information, but part of that will still include text.
We'll feed in the text along with the video and other ground truth information from the physical
state of the world.
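A bare-bones sketch of the "association" idea: during training, each visual example is paired with its text, so the model picks up the joint statistics rather than either mode alone. This is a conceptual toy, not any particular model's training procedure:

```python
# Conceptual toy: learning an association between words and visual attributes by
# counting how often they co-occur in paired (caption, image-attributes) data.
# This is not any particular model's training procedure, just the pairing idea.
from collections import defaultdict

paired_data = [
    ("a red ball on green grass",  {"dominant_color": "red",  "setting": "grass"}),
    ("a red kite in a blue sky",   {"dominant_color": "red",  "setting": "sky"}),
    ("a blue ball on green grass", {"dominant_color": "blue", "setting": "grass"}),
]

cooccurrence = defaultdict(int)
for caption, attributes in paired_data:
    for word in caption.split():
        for value in attributes.values():
            cooccurrence[(word, value)] += 1

# The word "red" ends up more strongly associated with the visual attribute "red"
# than the word "blue" is, simply because the two modes were presented together.
print(cooccurrence[("red", "red")], cooccurrence[("blue", "red")])   # 2 1
```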
Yeah, so this is going to be a multi-part question, and I apologize, but I don't really know
another way to ask it.
So what are the other modes of information that you're feeding in there?
And do you really need to go through this simulation process?
And I'll tell you, you know, it all sounds like a worthwhile endeavor to me, and I'm sure it is.
But I also see video models today.
And that is something that's really surprised me when we've seen the video generation models,
is that they really have an understanding of physics.
Like, you know, an image, an image generation, is not moving, right?
So you know that, let's say, the guy sits on the chair.
But video, you could see people walking through a field and you watch the grass move.
And that means that those models inherently have a concept of how physics works, I think.
And I'm going to run it by you because you're the expert here.
But like, again, Yann's going to come on the show in a couple weeks.
So maybe this is just in my mind because I'm gearing up and thinking about our last conversation.
But I'm going to put this to you also.
Maybe I'll ask what your answers are.
I'll ask him to weigh in on your answers on this.
But the thing that he always talked about is a human mind is able to sort of see infinite possibilities
and accept that. It doesn't break us. So if you have a pencil and you hold it up,
you know it's going to fall, but you know it could fall in infinite ways, but it's still going
to fall. For an AI that's been trained on different scenarios, it's very difficult for them to
understand that that pencil might fall in infinite ways when asked to generate it. However,
they've been doing a very good job with the video generators of like showing that they understand
that. So just to sort of reiterate, what different modes of information are you using? And why do we need
this broader simulation environment or this Cosmos tool if we are getting such good results from
video generation already?
All very, very good questions.
So first off, we use many, many modes.
The primary one, though, for training Cosmos is video, just like the video generation
models.
But along with that, there's text.
We also feed it extra information and labels that we can gather from the
data, particularly when we generate the data synthetically.
If you use a simulator to generate the videos, you have perfect information about everything
that's going on in every pixel in that video.
We know how far each object is in each pixel.
We know the depth.
We know what the object is in each pixel.
You can segment out all of that stuff.
Traditionally, what we've done for perception training for autonomous vehicles
is we've used humans to go through and label all that information from hours and hours of video that's been collected.
It's inaccurate and not complete.
So from simulation, we can get perfect information
about the actual videos themselves.
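To make "perfect information about every pixel" concrete, here is a tiny illustration of what a simulator can emit alongside each RGB frame; the 4x4 "frame," labels, and values are invented for illustration:

```python
# Tiny illustration of the per-pixel ground truth a simulator can emit alongside
# an RGB frame: exact depth and an object-ID (segmentation) label for every pixel,
# with no human labeling needed. The 4x4 "frame" and values are invented.

WIDTH, HEIGHT = 4, 4
OBJECT_IDS = {0: "road", 1: "child", 2: "car"}

# Per-pixel segmentation: which object each pixel belongs to.
segmentation = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 2],
    [0, 0, 2, 2],
]
# Per-pixel depth in meters: exact distance from the simulated camera.
depth = [
    [12.0, 12.1, 12.2, 12.3],
    [12.0,  6.5,  6.5, 12.3],
    [12.0,  6.4,  6.4,  3.2],
    [12.0, 12.1,  3.1,  3.0],
]

for y in range(HEIGHT):
    for x in range(WIDTH):
        label = OBJECT_IDS[segmentation[y][x]]
        print(f"pixel({x},{y}): {label:<5s} at {depth[y][x]:4.1f} m")
```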
Now, that being said, your question about,
these video models seem to really know physics
and know it well.
I think it is pretty amazing, you know,
how much physics they do know, and it's kind of surprising
we're here at this point.
Like, had you asked me five years ago, would we be able to generate videos with this much physics plausibility at this stage?
I wasn't sure, actually, because I continually had been wrong for years prior to that.
I didn't expect to see image classification in my lifetime until we saw it with AlexNet.
But I would have bet against it.
And so we're pretty far along.
That being said, there's a lot of flaws in the physics we see.
So you see this in the video.
One of the basic things is object permanence.
If you direct the video to move the camera, point away and come back,
objects that were there at the beginning of the video are no longer there or they're different, right?
And so that is such a fundamental violation of the laws of physics,
it's kind of hard to say, well, these models currently understand physics well.
And there's a whole bunch of other things in there.
You know, my life's work has been primarily computer graphics and specifically rendering,
and 3D rendering is essentially a physics simulation.
It's the simulation of how light interacts with matter and eventually reaches a sensor of some
sort. We simulate what a camera would do in a 3D world and what image it would gather
from the world. When I look at a lot of these videos that are generated, I see tons and tons of
flaws because when we do those simulations in rendering, we're attuned to seeing when shadows
are wrong and reflections are wrong and these sorts of things. To the untrained
eye, it looks plausible. It looks correct. But I think people can still kind of feel something is
wrong, you know, when it's AI generated and when it's not, in the same way that for decades now,
since we introduced computer graphics to visual effects in the movies, you know,
you don't know what it is, but if the rendering's not great in there, it just feels CG,
it feels wrong. We still have that kind of uncanny valley thing going on.
That all being said, I think we're going to rapidly get better and better.
So the models today have an amazing amount of knowledge about the physical world,
but they're maybe at like 5, 10% of what they should understand.
We need to get them to 90, 95%.
Right.
Yeah, I just saw a video of a tidal wave hitting some island.
And I looked at it.
It was like super realistic.
It was, of course, it was on Instagram because that's all Instagram is right now.
3D generated, I mean, AI generated video.
And it took me a second.
And it's more frequently taking me a minute to be like, oh, that's AI generated.
And sometimes I have to look in the comments and just sort of trust the wisdom of the
crowds on that front.
But you might not be the best judge of it as well.
Humans, I mean, we're not particularly good at knowing whether the physics will really
be accurate or not.
This is why movie directors can take such license with the physics when they do explosions
and all kinds of other fun stuff like tidal waves in there.
Yeah.
Well, it's just like some comedian made this joke.
They're like, Neil deGrasse Tyson likes to come out after these movies like Gravity
and talk about how they're, like, scientifically incorrect.
And the comedian's like, yeah, well, how about the fact that George Clooney and Sandra Bullock are
the astronauts?
That didn't bother you at all.
But it is interesting that we can watch these videos, watch these movies and fully believe
at least in the moment that they're real.
Like we can allow ourselves to like sort of lose ourselves in the moment.
Exactly.
And just be like, yep, I'm in this story.
I feel emotion right now watching, you know, George Clooney in a spaceship,
even though I know he's no astronaut.
And I think for that purpose, I mean, I worked on movies.
Before I was at NVIDIA, that's what I did, computer graphics for visual effects.
That is a perfectly legitimate use of that technology.
It's just that that level of simulation is not sufficient for building physical AI that are going to be the underpinnings or the fundamental components of a robot brain.
I don't want my self-driving car or my robot operating heavy machinery in a factory to be trained on physics that doesn't match the real
world. Even if it looks right to us, if it's not right, then it's not going to behave
correctly. And that's dangerous. So it's a different purpose. That's why what we're doing
with Cosmos, it is really a different class of AI than video generators. You can use it to
generate videos, but the purpose is different. It's not about generating beautiful imagery
or interesting imagery as for art.
This is about simulating the physical world,
using AI to create the simulation.
Rev, I want to ask you one more follow-up question
about not the flaws,
but the video generator's ability to get things right.
And then we're going to move on from this topic.
But it is just surprising and interesting
for me to hear
you and Demis Hassabis, the CEO of Google DeepMind, who was just on, who commented on this,
talk about how these video generators have been surprisingly good at understanding physics.
And Yann also, basically in our conversations previously, was effectively saying that it's very difficult for AI to solve these problems.
I won't say they've solved it.
But everybody's surprised they've gotten to this point.
So what is your best understanding of how they've been, though flawed, like this good?
You know, this is the trillion-dollar question, I guess.
You know, we've been betting now for years that if we just throw more compute and more data at the problem,
that these scaling laws are going to give us a level of intelligence that's really, really meaningful.
that will be like step function changes in capabilities.
There's no way for us to know for sure.
It's very hard to predict that.
It feels like we are on an exponential curve with this,
but which part of the exponential curve we're on,
we can't tell.
So we don't know how fast that's going to happen.
Honestly, I've been surprised at
how well these transformer models have been able to extract the laws of physics to this level
by this point in time. At this point, I believe in a few years, we're going to get
to a level of physics understanding with our AIs that's going to unlock the majority of the
applications we need to apply them to in robotics.
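The "scaling laws" being referred to are usually written as a power law in compute: loss falls smoothly as compute grows, which looks like a straight line on log-log axes. A minimal sketch with made-up constants (the real values depend on the model family and data, and, as noted above, we can't tell in advance which part of the curve we're on):

```python
# Minimal sketch of a compute scaling law of the form  loss(C) = a * C**(-alpha).
# The constants a and alpha are hypothetical, chosen only to show the shape.

a, alpha = 10.0, 0.05           # hypothetical constants
for exponent in range(20, 27):  # compute budgets from 1e20 to 1e26 FLOPs
    compute = 10.0 ** exponent
    loss = a * compute ** (-alpha)
    print(f"compute=1e{exponent}: loss~{loss:.3f}")
```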
Let me ask you one more question about this.
Then we're going to take a break and talk about some of the societal implications of putting robotics, let's say, in the workforce and in, I don't know, in all different areas of our lives.
There's definitely a sizable portion of the population that is going to be surprised, maybe not our listeners, but a sizable portion of the population that would be surprised to hear that Nvidia itself is building these world foundational models, releasing weights to help others build
on top of them. The perception, I think, from some on the outside is, hey, isn't Nvidia just
the company that makes those chips? So what do you say to that, Rev? Well, yeah, that's been the
perception. It's been the perception since I started at NVIDIA 23 years ago. And it's never been true
that we just build chips. Chips are very, very important part of what we do. They're the foundation
that we build on. But when I joined the company, there were about a thousand
people, 1,000 employees at the time. The grand majority of them were engineers, just like today,
the majority of our employees are engineers. And the majority of those engineers are software
engineers. I myself am a software engineer. I wouldn't know the first thing about making a chip.
And so our form of computing, accelerated computing, the form of computing we invented
is a full-stack problem.
It's not just a chip.
It's not just a chip that we throw over the fence
and leave it to others to figure out how to make use of it.
It doesn't work unless we have these layers of software,
and these layers of software have to have algorithms
that are harmonized with the architecture of our chips and our systems.
So we have to go,
in these new markets that we enter, what Jensen calls zero billion dollar industries, we have to actually go invent these new things kind of top to bottom, because they don't exist yet and nobody else is likely to do it.
So we build a lot of software and we build a lot of AI these days because that's what's necessary in order to build the computers to power all of this stuff.
We did this with LLMs early on.
Many, many years ago, we trained at the time what was the largest model in terms of number of parameters for an LLM.
It was called Megatron.
And because we did that, we built our computers, our chips and computers and the system software and the frameworks and pipelines and everything.
We were able to tune them to do these large-scale things, and we put all of that software out there, which was then used to create all the LLMs we enjoy today.
Had we not done that, I don't think we would have had ChatGPT.
And so this is essentially the same thing.
We're creating a new market, a new capability that doesn't exist.
We see this as being an endeavor that is greater than NVIDIA.
We need many, many others to participate in this.
But there are some things that we're uniquely positioned to contribute,
given our scale and our particular expertise.
And so we're going to go do that.
And then we're going to make that freely available to others so they can build on it.
Yeah.
For those wondering why NVIDIA has such a hold in the market right now,
I think you just heard the response.
So I do want to take a break, and then I want to talk about the implications for society
when we have, let's say, humanoid robots doing labor in that part of the economy
that we simply haven't really put AI into yet, and what it means when it's many more
trillions of dollars than the knowledge work.
So we're going to do that when we're back right after this.
Hey, everyone.
Let me tell you about the Hustle Daily Show, a podcast filled with business, tech news,
and original stories to keep you in the loop on what's trending.
More than 2 million professionals read The Hustle's daily email
for its irreverent and informative takes on business and tech news.
Now, they have a daily podcast called The Hustle Daily Show
where their team of writers break down the biggest business headlines
in 15 minutes or less and explain why you should care about them.
So, search for The Hustle Daily Show in your favorite podcast app,
like the one you're using right now.
And we're back here on Big Technology Podcast with Rev Lebaredian.
He's the Vice President of Omniverse and Simulation Technology at NVIDIA. Rev, I want to just ask you the
question that obviously has been bouncing around my mind since we started talking about the fact that
you're going to enable robotics to be able to sort of take over, I don't know if take over is
the right word, take over a lot of what we do currently in the workforce. I mean, what do you think
the labor implications are here? Because, yeah, if you've spent your entire life, you know,
working at a certain manual task, and the next thing you know, someone uses, you know, the Cosmos platform
or your new, I think it's like GR00T it's called, what is it called? GR00T, that's our project for
humanoid robots. Yeah, building and training humanoid robots. Exactly, brains. So, all right, so GR00T. You know
that some company uses GR00T to start to put a humanoid in for, uh, human labor in, let's say, a factory, or
even as a care robot, and I'm a nurse,
and all of a sudden, some GR00T-built robot
is now helping take care of the elderly.
What are the labor implications of that?
Well, first and foremost,
I think we need to understand that this is a really hard problem.
It's not like overnight,
we're going to have robots replace everything humans do everywhere.
It's a very, very difficult problem.
We're just now at an inflection point where we can finally, we see a line of sight to building the technology we needed to unlock the possibility of these kind of general purpose robots.
And that's, we can now build a general purpose robot brain.
20 years ago, that was not true. We could have built the physical robot, the actual body of a robot,
but it would have been useless because we couldn't give it a brain that would let it operate in the world
in a general purpose manner. We couldn't interact with it or program it in a useful way to do anything.
So that's what's been unlocked here. I talk to a lot of
CEOs and executives for companies in the industrial sector, in manufacturing, in warehousing,
to retail companies. In all of these companies I talk to in every geography,
there's a recurring theme. There's a demographic problem the whole world is facing.
We don't have as many young people who want to do the jobs that the older people who are retiring now have been doing.
If you go to an automotive factory in Detroit or in Germany, go look around, most of the factory workers are aging and they're quickly retiring.
And these CEOs that I'm talking to, their biggest concern is all of that knowledge they have on how to operate those factories and work in them.
It's going to be lost.
The young people don't want to come and do these jobs.
And so we have to solve that problem.
If we're going to maintain, not just grow our economy, but just maintain where the economy is at and produce the same amount of things, we need to find some solutions
to this problem. We don't have enough workers. We've been seeing it in transportation. There's not
enough truck drivers in the world to go deliver all the stuff that's moving around in our supply chains.
We can't hire enough of them. And there's less and less young people that want to do that job
every year. So we need to have self-driving trucks. We need to have self-driving cars to solve that
problem. So I think before we talk about replacing jobs that humans want to do, we should first
be talking about using these robots to fill in the gap that's being left by humans because they
don't want to do it anymore. Right. And there could be specialization. Like, take nursing, for example,
the nurse that injects me with a vaccine or the nurse that puts medication in my IV, maybe we
keep that human for a while, even though, you know, they make mistakes too, but I'd feel a lot
more comfortable if that was human. The nurse that takes me for a walk down the hall after I've
gotten a knee replacement, that could be a robot. Maybe better it's a robot. We'll see how this
plays out. We believe that the first place we're going to see general purpose robots like
the humanoid robots really take off is in the industrial sector because of two things. One,
demand is great there because we have the shortage of workers. And also because it makes more sense
to have them adopted in these spaces where a company just decides to put them in there, and
mostly warehouses and factories are kind of unseen. I think the last place we're going to start
seeing humanoids show up is in our homes, in your kitchen. Don't tell Jeff Bezos that.
Well, they will show up there, and I think it's going to be uneven.
It'll depend on even geographically.
They'll probably show up in a kitchen in somebody's home in Japan before they show up in a kitchen in somebody's home in Munich, in Germany.
And I think that's a cultural thing.
You know, I personally don't even want another human in my kitchen.
I like being in my kitchen and preparing stuff myself.
My wife and I are always in each other's space there, so we get kind of annoyed.
So having a humanoid robot would be kind of weird.
I don't even want to hire somebody else to do that.
We kind of do that ourselves.
So that's a kind of personal decision.
I think things like jobs like caring for our elderly and health care, those are very
human professions. You know, a lot of what the care is, it's not really about
the physical thing that they're doing. It's about the emotional connection with another human.
And for that, I don't think robots are going to take that away from us anytime soon.
Well, the question is, do we have enough care professionals to take those jobs? That's the one
that really seems in danger. And so what's likely to happen is it'll be a
combination. The care professionals we do have will do the things that require EQ, that require
empathy, that require, you know, really understanding the other human you're taking care of.
And then they can instruct the robots around them to assist them to do all of the more mundane
things like cleaning and maybe giving the shots and IVs. I don't know.
How far away is that future, Rev?
How long do you think?
You know, I wouldn't venture to guess on that kind of interaction in a hospital or care situation quite yet.
I believe it's going to happen in the industrial sector first.
And I believe that it's within a few years.
We're going to see it.
We're going to see humanoid robots widely used in the most advanced
manufacturing and warehousing. Wild. Okay, I want to ask you about Hollywood before we go. Um, I guess I have
this question rattling in my mind, which is, are we just going to see, like, movies, not that movies
that look real, but are computer generated? Like, we have computer generated movies now, the CGI, but
they all look, uh, pretty CGI. But I imagine, well, they don't all look CGI, some of them look, right, pretty
amazing.
Somewhat real.
But I'm curious, like, do you think that, like, is Hollywood going to move to an area
where it's super real and just simulated?
Go ahead.
Absolutely.
I mean, well, was it a year or two ago when the last Planet of the Apes came out?
I went to go see it with my wife.
Now, my wife and I have been together since I worked at Disney in the mid-90s, working on visual
effects and rendering.
I had a startup company doing rendering, and she was a part of that.
So she has a good eye, and she's been around computer graphics and rendering for decades now.
When we went to go see Planet of the Apes, even though obviously those apes were not real,
at one point she turned around and said, that's all CG, right?
She couldn't quite believe it.
I think what Weta did there is amazing.
It's indistinguishable from real life, except for the fact that the apes were talking.
Like, other than that, it's indistinguishable.
The problem with that, though, is to do that level of CG in the traditional way that we've done it requires an incredible amount of artistry and skills
that only a few studios in the world can do with the teams that they have and the pipelines they've built.
And it's incredibly expensive to produce that.
What we're building with AI, with generative AI, and particularly with World Foundation models,
once we get to the point where they really understand the depth of the physics that they need
to produce something like Planet of the Apes.
Once we have that, of course,
of course they're going to use those technologies
to produce the same images
because it's going to be a lot faster
and it's going to be a lot less expensive
to do the same things.
It's already starting to happen.
Rev, I know we're getting close to time.
Do I have time for two more questions?
Absolutely.
Okay.
So the more I think about robotics,
the more I think about sort of what the application in war might be.
I know that, like, you can't think of every permutation
when you're developing the foundational technology,
but we are living in a world where war is becoming much more roboticized,
and it's sort of, like, remarkable that we have some wars going on
or people are still fighting in trenches.
So I'm just curious if you've given any thought to, like,
how robotics might be applied in warfare
and whether there's a way to prevent some of, like,
the bad uses that might come about because of it.
You know, I'm not really an expert in warfare,
so I don't feel that I'm the best person
to talk about how it might be used or not,
but I can say this.
This isn't the first time
where a new technology has been introduced
that is so powerful
that not only can we imagine great uses of it
that are beneficial to people,
but also really, really scary, devastating consequences
of it being used, particularly in warfare.
And somehow we've managed to not have that kind of devastation.
And in general, the world has gotten better and better,
more peaceful and safer,
despite what it might feel like today,
by almost any measure,
we have fewer lives lost through wars and these sorts of tragedies than ever before in mankind's history.
The big one, of course, everybody always talks about, is nuclear technology.
I mean, I grew up.
I was a little kid in the 80s.
This was kind of the height of the Cold War, the end of it.
But every day, I remember thinking, you know, it might happen.
We might have some ICBMs arrive in Los Angeles at any point.
And it hasn't happened because somehow the general understanding by everyone collectively
was that this would be so bad for everyone that we put together systems.
Even though we had intense rivalry, and were even enemies, between the Soviet Union and the U.S.,
we somehow figured out that we should create a system that prevents that sort of thing.
We've done the same with biological weapons and chemical weapons.
Largely, they haven't been used, even though the technologies existed there.
And so I think that's a good indicator
of how we should deal with this new technology,
this new powerful technology of AI,
and a reason for us to be optimistic
that it's possible to actually have this technology
and not have it be so devastating.
We can set up rules and conventions that say,
even though it's possible to use AI in this way,
that we shouldn't, and we should all agree on that.
And anybody that skirts the line on that, you know, there should be ramifications to disincentivize them from using it that way.
Yeah, I hope you're right on that.
It seems like it's something that we're going to, as a society, deal with more and more as this stuff becomes more advanced.
All right.
So last one for you, you've been at Nvidia,
we've talked about it a couple of times, 23 years.
I already teased this.
So I just want to ask you, you know, the technology's been in favor,
it's not been in favor, you know, you're at the top of the world right now, even though,
you know, there was some hiccup last week, but whatever, it doesn't seem like it's going to be
a long-term issue. Just what is, like, one insight you can tell us, you know, that you can
draw from your time at NVIDIA about the way that the technology world works?
About, well, first I can tell you about how NVIDIA works. Yeah, and the reason I'm here,
I've been here for 23 years, and this will be the last job
I ever have. I'm positive of it. When I joined Nvidia, that wasn't the plan. I thought I'd be here
one year, two years max, and now it's been 23 years. When I hit my 20 year mark, Jensen at our next
company meeting had rattled off a bunch of stats on how long various groups have been here,
how many people had been there for a year, two years, and so on. When he got to 20, there were more
than 650 people that were at 20 years. Now, earlier I had said when I joined the company,
there were about 1,000 people. So this means that most of the people that were there when I
started at Nvidia were still there after 20 years. I wasn't as special as I thought
I was when I hit my 20-year mark. And so this is actually a very strange thing about
NVIDIA, we have people that have been here a long time and haven't left. It's strange
in general for most companies, but particularly for Silicon Valley tech companies. People move
around a lot. And I believe the reason why we've stayed here through all of our trials and
tribulations and whatnot is because fundamentally what Jensen has built here is a company,
where people come to do their lives work,
and we really mean it.
Like, you feel it when you're here.
This is more than just about making some money
or having a job.
You come here to do great work
and to do your life's work.
And so the idea of leaving just,
it feels painful to me,
and I think it is to many others.
That's what's actually,
I think behind why, despite the fact that NVIDIA's had its ups and downs.
And you can go back to look at our stock chart going back to like the mid-2000s.
We introduced CUDA in 2006, and that was a really important thing, and we stuck to it.
The analysts and nobody wanted us to keep sticking to it, but we kept investing in it.
And our stock price took a huge hit, and it was flat
there for a long time, flat or dropping, and then it finally happened. AI was born on our
GPU. That's what we were waiting for, and we went all in on that. And we've had ups and
downs since then. We'll continue to have ups and downs, but I think the trend is going to still
be up and to the right, because this is an amazing place where people who want to do their
life's work, the best people in the world at what we do, who want to do their life's work, they come
here and they stay here.
Yep. Well, Rev, look, it's always such a pleasure to speak with you. I really enjoyed our time
together at NVIDIA headquarters. That was a really fun day. We did some cool demos,
and I appreciate that. And I'm just thrilled to get a chance to speak with you about this
technology today. It is fascinating technology. It is cutting edge. Obviously brings up a lot
of questions, some of which we got to today. I'm sure we could have talked for three hours.
and I hope to keep the conversation up.
So thanks for coming on the show.
Thank you for inviting me
and I hope we do talk for three hours one day.
That'll be great.
All right, everybody.
Thank you for listening.
Ranjan and I will be back to break down the news on Friday.
Already a lot of news this week with OpenAI's Deep Research coming out.
I just paid $200 for ChatGPT,
which is a lot more than I ever thought I would for a month.
But that's where we are today.
So we're going to talk about that and more on Friday.
Thanks for listening and we'll see you next time on Big Technology Podcast.