Everyday AI Podcast – An AI and ChatGPT Podcast - EP 233: Robots Among Us - How NVIDIA is building the future of robotics
Episode Date: March 21, 2024

Awesome Stuff From Our Partner, NVIDIA - Register for the FREE virtual NVIDIA GTC Conference or buy tickets to the in-person event and fill out this form here: https://www.youreverydayai.com/nvidia-giv...eaway/

AI + Robots: is it just science fiction? Or could the intersection of AI and robotics change our daily lives WAY sooner than we'd expect? We find out with Amit Goel, the Director of Robotics at NVIDIA, a global leader in the robotics industry.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Jordan and Amit questions on AI and robotics
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Timestamps:
01:30 About Amit and NVIDIA robotics
04:11 Large language models consume data to teach robots.
06:20 NVIDIA's role in robotics at GTC conference.
10:38 Public perception of robotics and humanoid future.
14:47 Robots and AI replacing human tasks concern.
15:49 Robotics tasks, automation, and training the workforce.
20:44 Keynote discusses faster computing's impact on industries.
23:51 Large language models enable natural language interaction.
27:03 Excited about improving existing robots with AI.

Topics Covered in This Episode:
1. Robotics and AI Advancements
2. Public Perception Towards Robotics
3. Deploying Tasks to Robots
4. Use of Generative AI in Robotics
5. NVIDIA's Role and Impact in Robotics

Keywords: Robotics, NVIDIA, GTC conference, Amit Goel, AI, large language models, generative AI, simulation technology, computing power, embedded products, Yaskawa, digital twin, industrial robots, manufacturing, logistics, Boston Dynamics, humanoids, human-robot interaction, automation, IT systems, operations, traditional robotics, Graphics, ChatGPT, Blackwell, data, real physical implementation, robot-specific language, Groot model, public perception.
Transcript
This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips.
Listen daily for practical advice to boost your career, business, and everyday life.
It seems like in the past couple of weeks, robotics have been everywhere.
And that can't even be truer than this week, as we've seen at GTC here at the GTC conference with NVIDIA.
I'm extremely excited today to talk about the robotics industry and what's been going on and some recent announcements from NVIDIA with the man himself.
But more on that in a second.
A reminder if you're joining us live: thank you very much. Please make sure to check out the show notes, whether you're on the livestream or the
podcast, because it is not too late to register for free for the NVIDIA GTC conference. So you can do that.
You can go watch sessions, win a GPU. You know, there's DLI credits. You can go out and learn a lot of things.
So make sure to check that out in the show notes and in the newsletter as well. All right. And you can always get that information on our website too at
YourEverydayAI.com. All right, enough of that. Let's get into why you showed up today, why you're
listening to the podcast. It's to hear about the future of robotics. And there is so much going on,
especially it seems like in the past couple of weeks and the past couple of days here at
NVIDIA GTC. So with that, please help me welcome our guest for today. Amit Goel, the director
of robotics at NVIDIA. Amit, welcome to the Everyday AI show.
Thanks, Jordan. My pleasure to be here today.
Yeah. Can you tell us just a little bit about what you do in your role at NVIDIA?
Sure. So I head up our ecosystem team for robotics and product management for our embedded products that go into the robots.
Been at NVIDIA for about 13 years. So my team is working with all the partners that you see that are building the next generation of AI robots.
And I'm sure, you know, over the course of 13 years working at Nvidia, there's been a fair share of excitement and new advancements.
But how do you feel about the state of robotics today, at least from an outsider, from an everyday person like myself?
It seems like the sector is just exploding.
Is that how you feel as well, or has it always just felt, you know, like this?
No, I think, you know, with the coming of AI, robotics has been making steady progress.
But over the last six to 12 months, we've seen sort of a shift that has happened,
an inflection point, if you may,
where people are starting to realize that there is a lot that can be done.
And a lot of that can be attributed to the large language
models and generative AI.
The fact that you can have a lot of data and use that
to make new tasks for robots is just a turning point.
The robots have been there for a while.
AI has been helping,
but they have typically still been single-function robots, right?
Amazon talks a lot about their robots in the warehouses
that are picking and packaging things for you.
But that robot, if you ask it to make a coffee for you, can't do it, right?
So that is the change that has happened.
The change where now with the large models,
you can really train a robot on multiple modalities,
understand from what you and I are doing, right?
You don't need a PhD to be programming a robot.
That's what is going to really unlock the capabilities of robots for everyone.
If you're watching now, what Amit just said is amazingly accurate
in putting everything into perspective and why now.
But, you know, I want to dig in even deeper because what, you know,
people may or may not realize robotics is not new, right?
Robotics have been used across many industries and verticals for decades.
Why, though, that inflection point now, you know, you said over the last couple of months,
is it just because of large language models or are there other factors that are kind of
bringing the robotics industry to this inflection point?
I would say it's a confluence of several things that has made this possible, right?
At the heart of it is, of course, this ability of large language models to consume a huge
amount of data and learn from it without somebody actually having to annotate everything, right?
You may have seen the work from NVIDIA Research called Eureka, where we taught a hand to
spin a pen. In the past, even just telling the robot what it needs to do was so hard.
Now you can type, I want you to move this pen with your fingers.
And all the complex math underneath it is taken care of by a large model.
Right. So that's one thing.
The second thing that has happened is that in order to develop these robots and test these robots,
you can't really do it in the physical world. These robots are expensive.
In the physical world, you have to be careful and safe around them.
So simulation has come a long way. We started working on this platform called Omniverse,
which is designed for simulating things,
and we announced Isaac Lab.
This is where you can safely and very quickly test these robots.
So that's the second thing that came on.
And the third thing was, in general, the computing power, right?
To run these big Groot models, to run all the AI capabilities,
you need to have the computing power in the robot.
You cannot be running its brain somewhere else in the cloud,
because it needs to just make a decision if it needs to stop moving its hand or not.
So I think the culmination of those three things, the large language models,
a great simulation environment to test these robots before you put them in the real world.
And the third, all the horsepower that you need to do the processing.
It all came together.
And so that's what has kicked off this, you know, a big bang of new generation of robots.
Yeah.
And it's definitely that, you know, even if we look out here on the GTC floor, you see robotics everywhere.
Maybe that you wouldn't expect if you were at the GTC conference 10 years ago.
You maybe might not expect this.
But I do want to zoom out a little bit and even talk specifically about NVIDIA's role in all this because you're not technically making the robots.
So tell us a little bit about how NVIDIA works.
It seems like there's so many moving parts, both literally and physically,
right, or metaphorically, I guess. But how does
NVIDIA's role play into the grander scheme?
That's a great question.
I mean, when people look at all our demos and things that we show
for our technology, they often get confused.
Like, look, NVIDIA is making a robot, we're making a humanoid,
we're making a mobile robot.
That's not what we do.
Our ambition is to be the foundry, be that platform,
where the robotics developers, the people who are making the hardware, the people who are making
the end application can come to this foundry with their use case, with their data, with their
robot hardware, and take back the intelligence for those robots.
So that's NVIDIA's role, being that underlying platform on which people can bring
their stuff, build the final thing, take it back and deploy it in the real world.
What does that actually mean?
Right?
Like, you know, when we talk about Isaac Lab, you know, in, I guess, inside or underneath the
Nvidia Omniverse, what does that actually mean, you know, for, you know, the world of manufacturing,
production, et cetera, how does that change what companies and enterprises can do if they
bring their simulation into Isaac Lab?
That's a great question.
For example, if you see the announcements that we had this week at GTC, I'll highlight a few.
We had an announcement with Yaskawa, industrial robots that you will see in a lot of manufacturing, a lot of logistics places.
And typically the way those robots were programmed was you get the robot, somebody goes there,
spend several weeks, months programming them, and then you have one application done.
Now, with Isaac Lab and the work that we're doing with Yaskawa, they can create a model, a digital
twin of their robot in Omniverse. They can use the foundation models, these models that already know
how to pick things, but they don't really know how to pick things for that particular robot,
right? So that's the adaptation that they need to do specific to their hardware. So
these are things we call fine-tuning, right? So they can create a digital twin, they can fine-tune
them, and take out of it a working skill, a working task. And that's what we demonstrated here
with Yaskawa. They worked with us, brought their robot in, all the secret sauce of their robot,
trained it in the lab, right, in the virtual lab, and then deployed it to the real world.
We did similar things with Boston Dynamics.
They have probably the best team to make some of the amazing robots,
their mechanical design, the dynamics,
the software that runs on the robot is fantastic.
But every new skill to be added to the robots
took a long time.
And together with them, we announced an end-to-end platform.
You can take the Boston Dynamics robot.
They worked with us to create a simulated digital
twin of their robot. And now you can make it run on the sand. You can create a snow environment.
You can create a forest environment. Learn all those things. And when it's learned, take this brain
back, put it on an Nvidia Jetson, right? To put it on the robot and you have your skill. So that's
essentially what it looks like. Bring your hardware, bring your use case, create the environment,
test it, take it back to the real world.
Yeah. I think the only downside of that is we
may get fewer of those viral videos that we always see of the Boston Dynamics, you know,
robot dog, you know, running across all different surfaces. But, you know, I'm really interested
because even it seems public perception around robotics, it seems, is also changing at the same time.
What does it, what does it mean as we even transition from, you know, general robotics to now
what, you know, people are saying, humanoids? Is it more of, you know, that single use, you know,
Amazon warehouse, that's still general robotics.
But when a humanoid type robot can do anything, is that, you know, when it's a humanoid,
can you explain, you know, what that even means as we kind of look forward into a future of
these humanoids?
Yeah.
If you look around you, you don't see many robots, right?
And even with the ones that you see, probably the Roomba in your house, the challenge has been
human-robot interaction.
You and I cannot even explain what's going on in the Roomba's mind;
it just keeps going around the same place sometimes over and over again.
So in order for robots and people to coexist,
there are two important things.
They need to operate in shared spaces.
The world was created for us, for people.
Most of the world has been created for us.
The factories are still created for a lot of robots.
The new warehouses are created for robots, but majority of the world is created for humans.
So they need to exist in a form factor that they can live around people.
And second thing that is important is they should be able to engage with people.
You should be able to understand what the robot is doing, and the robot should be able to understand what you want it to do.
Right. So that's essentially why humanoid form factor is so interesting.
We have a lot of data on humans, right? You and I doing our things, there is an ample amount of data for that.
And for generative AI, the one thing that you need is a lot of data. So there is a lot of data for humans that helps with the humanoid form factor.
Again, these are designed with the dexterity and the capabilities so they can exist around people in the shared space.
And last piece is with the large models, with this generative AI, you can really engage with them.
We've seen some of the recent work where you can tell the robot what to do.
You can even ask how it did it.
So you can have more trust and comfort around these machines.
Yeah, you talked about the need for, you know, a space for humans and robots to kind of coexist.
I think maybe for people like you and I, maybe that's exciting, right?
But for some people, it might make them feel uncomfortable.
I mean, should people feel afraid of coexisting, you know, with humanoids?
And what does that look like?
Or maybe what can you say to those people who do feel a little uneasy, maybe sharing that space?
I think, you know, if you look at the data right now,
there are just so many jobs that we don't
have people to work on.
And all of these robots will need human supervisors.
So instead of actually doing some of that work,
anybody who's worried, they should not be,
because they would be actually telling what these robots should do.
So you are still, we'll still have human supervisors
determining the task, determining what the robots are expected to do.
So there is just a lot of need for those skills.
The tasks that, you know, robots have been great at, you know,
dull, dangerous and dirty, we'll be able to offload to them and still
keep monitoring them and control them and tell them what is needed.
You know, aside from those, you know, dangerous and dirty, which I love, I
just love the alliteration there. It makes it so easy to see where robots can go in the future.
But I also assume that there are just more, you know, repetitive tasks that, right now,
you know, humans are doing. So, and maybe that's the reason why a lot of people, you know,
are scared of AI or are not fully on board with robotics. You know, I guess in that situation,
whether you're talking about manufacturing, production plants, et cetera, especially those, you know,
kind of like low hanging, you know, you might think of them as low hanging fruit where
these humanoids can go in and be placed there. For those humans in those industries, where should
they even be focusing their time and attention right now? Because you can sit around and say,
oh, well, you know, what happens if a humanoid is out here doing this in two years or you can
do something about it? So where would you focus or where would you recommend people, you know,
maybe in those industries, start focusing their time and attention on?
I think the first thing is to
identify these tasks, right? That's the number one thing to do, to identify what are the
pieces of your operations that you could automate and start preparing for it, right? You know,
bringing automation is not just bringing in a robot and putting it out there. You need to prepare
your operations, your IT systems to work with those robots, and you need
to prepare your workforce.
So I think just the first thing is,
knowing what these robots are going to be capable of, right?
Not assuming that they'll do everything,
all of your job; that's not going to happen anytime soon.
But they will be really good at, you know,
picking things up from a cart,
loading it up somewhere else, right?
We're seeing what's happening in warehousing,
picking things, putting in the packages, shipping them out.
So there are a lot of these things which are going
to happen in the near term, right?
So identifying those tasks,
and then working with this ecosystem to help them define those tasks
and preparing the environment for those things.
So I think those are some of the areas where they should focus on
and not be worried about it at all.
Yeah.
You know, the Nvidia CEO, your CEO Jensen yesterday,
said in a session that the ChatGPT moment for robotics
is near, it's soon.
What's your thoughts on that?
And what do you think that moment will actually be for the robotics industry or
humanoid industry?
What is that ChatGPT moment, you think?
Yeah.
What was different about ChatGPT versus prior AI was who could use it.
I think the capabilities were of course amazing, right?
But how accessible it was was the important piece
as well. So for me, when I think about the ChatGPT moment of robotics, I think about,
you know, somebody who does not have a robotic background, if they feel comfortable that I can
use a robot, I can program a robot, I can tell what I wanted to do. I think that would be,
that would be the big moment where you'd say, okay, you know, we've got there, right? Instead of that
handful of specialist roboticists, PhDs and master's degree people who can really work with these
robots, when most people would feel comfortable using these robots, operating these robots,
telling them what to do, I think that would be, in my mind, the moment.
What do you think still has to be done or still has to be accomplished in order to get there?
Because, you know, at least for me when I was listening to the keynote earlier this
week at GTC. And again, as a reminder, if you're listening live, you can sign up and rewatch that
keynote. Great, great two hours. You won't regret it. But it seems like all of the pieces
to get to what you just described are either there or were recently announced. So what are still
the obstacles to get to that moment?
I think we've seen, like from the AV industry, for example,
right? The physical world is a lot more complex.
There are a lot of variations, and there are real implications of things not working out.
So, you know, while we have the key technology pieces, getting it to that mass deployment will require work,
especially because these are physical things that have to exist around us and be doing stuff.
I think there is a lot to be discovered on actually, you know, what happens when you put them in the real
world. So that's that aspect. The second is the data, right? In order to realize these
large language model based and really generative AI based robots, we need a lot of data.
And a lot of work is happening right now on how can we bring that data? And again, you know,
we are seeing people creating a big farm of teleoperated robots so they can just keep them doing
things and collect that data. There is work happening on learning from imitation, right? How are we doing
those things and translating that? So I think there's a lot of work that needs to happen on the
data side, a lot of work that has to happen on real physical implementation, the gaps that we
will see when we put it on the physical system. And then, yeah, I think, you know, those are
the two big things that we still need to solve. But I think, as you said, all the key building
blocks are there to get us to that promised land.
Yeah, and I think, you know, in his keynote, you know, Jensen talked about, you know,
even this transition from Hopper to Blackwell and what that means, you know, being able to,
I think the example that he gave is, you know, training a, you know, 1.8 trillion parameter
GPT-4 model in, you know, I think a fourth of the time or something like that.
What does that mean, though, for other industries such as robotics, right?
Like when we look over the course of the coming months or
quarters or year, when this, you know, these GPU chips and the Jetson, you know, what does that mean for
your industry and what you maybe were not capable to do before just because of compute?
What does that open up? What do these announcements open up?
It's just a question of iterations,
right? The way to get to perfection for this robot is to iterate quickly. And if you had to wait
six to 12 months to train a policy,
and now you can do that in six weeks,
that just means you can iterate faster.
And that's exactly what we are starting to see now, right,
with some of these early partners who are adopting these technologies,
that the pace has picked up, right?
Two years has come down to two months,
and over time, it'll come down to two weeks.
So you can iterate faster, faster, faster.
And that's the big impact
this technology has. And the second thing it'll do is also it'll bring down the cost, right?
In order to, in order to really deploy these, build these capabilities at scale,
you need to do it in a way that is also, you know, cost effective, right? So if it's just taking
you two years, there is no way it can be cost effective, right? But if it's happening in two months
and two days, it's going to be cost effective. So the scale will come and it'll allow people to
iterate fast, it'll allow people to scale their learning, and the third piece is
it'll help to bring the cost down to something that is more commercially viable in the long term.
Yeah, even with that speed of development and deployment, let's say you and I are having this
exact same conversation next year at GTC.
What do you hope we're talking about next year?
Maybe just for NVIDIA or maybe just for robotics and humanoids in general, where do you hope we are at this point next year?
Well, I think we saw a lot of humanoid robots today that were standing and doing some stuff.
But I think next year when we are here, we might be talking about why are they not doing something.
Okay, okay.
So I don't know.
I think, you know, I don't want to be forecasting things here.
But the pace in which innovation is happening is amazing.
And I'm super excited about what it is going to unlock for robotics and looking forward to that.
Yeah.
And I do want to get back to something that we talked about earlier.
You know, we kind of talked about why now and why this inflection point.
And one of those factors was large language models.
It's generative AI.
You know, I think a lot of people have seen some demos recently,
the Figure and ChatGPT kind of collaboration.
Is that what you kind of see happening in the near future?
Just everyday people having the ability, you know, like you said,
you don't need necessarily a master's degree in deep learning or robotics.
Is that kind of what we should be looking at as maybe being the near future,
the technology is kind of already there by being able to have a robot or a humanoid help
with everyday tasks just by using natural language.
I think that's absolutely correct, Jordan.
The way generative AI is actually helping this next generation of robots is in multiple ways, right?
One is, of course, how you are engaging with the robot.
How are you telling it the tasks to do?
How are you building that human robot interaction?
but in the background, there is a lot of work that's happening on,
well, what does it mean for a robot when you say,
hey, go pick up, find that can and put it in the trash,
what does it mean in the robot language world, right?
So behind the words, I think there is a lot of research that is happening.
There's a lot of core engineering that has to happen to translate that into a robot-specific language.
And this is where we announced the Groot model, right?
The foundation model where it can take text, it can take video demonstrations: hey, I want you to
stack this thing this way, right? So there's a lot of that work that needs to happen in the background
for translating that into a robot-specific language. And then also the way generative AI is
helping is creating all this world. The human world is big and complex. How are you going to create all
these environments in simulation? So generative AI is going to help there as well with the generative
capabilities to create environments, create settings where you can just say, I'm going to
deploy this in a kitchen with, you know, shelves this high, you know, all these instruments
should be there. And boom, it creates an environment for you to go play with. So I think
there's just a lot of things that are happening on the core technology side as well to bring that
future forward.
Yeah. You know, speaking of that future forward, you know, you're in a very unique
position as the director of robotics here at Nvidia and even Nvidia's place in the grand scheme of
things, right? So, so you get a little bit of taste of, you know, what your hardware partners are doing,
you know, other partners you're working with in cloud and the AI side. But what's the thing that
excites you, right? Like once we wrap up the craziness of the GTC conference, what are you going to be
excited to be working on and what are you going to be paying attention to in the coming weeks and
months?
I think for me in the coming weeks and months, the exciting part is the traditional robotics,
like the industrial robot arm, the collaborative robots, the AMRs that are already there today.
They have a lot of potential, right? The new things with humanoids and, you know, other embodiments will come.
But we also announced two key things, right, Isaac Manipulator and Isaac Perceptor,
which applies to the robots of today, the robots that are operating today.
So in the coming months, that is what I'm most excited about,
is how we will make these existing robots smarter.
How can we make them easy for people to use?
How can we help these companies go from hundreds and thousands to tens of thousands of robots
with AI and with the tools that we are providing.
So I think that's what I'm most excited about making those tangible things happen in the near term
with the existing robots.
And so we've talked about so many things on today's show from new announcements that you
had here at Nvidia GTC to partnerships that you just unveiled to the future of robotics
and humanoids.
But maybe as we wrap this up, what is the one big takeaway
that you want people to walk away from this conversation with, you know,
specifically on where Nvidia is focusing and doing their work right now.
What's that one big takeaway message you have for people?
I think the one takeaway would be that we are building the foundry and the foundation
for the world's roboticists to come and build, realize their dreams.
We're building all, it takes a long, long time, a lot of key technologies.
As Jensen said, the soul of Nvidia, right?
Graphics and AI and simulation, right?
All physics.
These three pieces are core building blocks of robotics.
We've been investing in it for so many years.
So now I think we have that solid foundation on which, you know, the new generation of robots can be built.
It's such an exciting time, not just for the industry, but how it's ultimately going to, you know, play out in the real world and the business, you know, sector and everything else.
So, Amit, thank you so much for joining the Everyday AI show.
We really appreciate your time.
Thanks for having me.
I really enjoyed this conversation.
All right.
And, hey, as a reminder, there is so
much more. Make sure to not only check out the show notes of the live stream or the podcast,
but also the newsletter. So go to YourEverydayAI.com, sign up for the newsletter and we'll
have information in there as well where you can still register for the Nvidia GTC conference.
Don't worry. It's not too late. If you slept through the first two and a half days,
you can go back to some of the sessions that we were talking about here. You can go back,
watch those replays, sign up. You can do it for free as well as enter into our giveaway to win a GPU as
well as DLI credit.
So you've got to go check that out.
And you got to also continue to join us.
We have a lot more here from the GTC conference.
Thank you for tuning in.
We hope to see you back later today and tomorrow and every day for more everyday AI.
Thanks y'all.
And that's a wrap for today's edition of Everyday AI.
Thanks for joining us.
If you enjoyed this episode, please subscribe and leave us a rating.
It helps keep us going.
For a little more AI magic, visit YourEverydayAI.com and sign up for our daily
newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
