Everyday AI Podcast – An AI and ChatGPT Podcast - EP 233: Robots Among Us - How NVIDIA is building the future of robotics

Starting point is 00:00:00 This is the Everyday AI Show, the Everyday Podcast where we simplify AI and bring its power to your fingertips. Listen daily for practical advice to boost your career, business, and everyday life. It seems like in the past couple of weeks, robotics have been everywhere.

Starting point is 00:00:53 And that couldn't be truer than this week, as we've seen here at the NVIDIA GTC conference. I'm extremely excited today to talk about the robotics industry, what's been going on, and some recent announcements from NVIDIA with the man himself. But more on that in a second. A reminder if you're joining us live: thank you very much. Please make sure to check out the show notes, even if you're on the podcast, because it is not too late to register for free for the NVIDIA GTC conference. So you can do that. You can go watch sessions, win a GPU. You know, there's DLI credits. You can go out and learn a lot of things. So make sure to check that out in the show notes and in the newsletter as well. All right. And you can always get that information on our website too, at YourEverydayAI.com. All right, enough of that. Let's get into why you showed up today, why you're

Starting point is 00:01:48 listening to the podcast. It's to hear about the future of robotics. And there is so much going on, especially, it seems, in the past couple of weeks and the past couple of days here at NVIDIA GTC. So with that, please help me welcome our guest for today, Amit Goel, the director of robotics at NVIDIA. Amit, welcome to the Everyday AI show. Thanks, Jordan. My pleasure to be here today. Yeah. Can you tell us just a little bit about what you do in your role at NVIDIA? Sure. So I head up our ecosystem team for robotics and product management for our embedded products that go into the robots. I've been at NVIDIA for about 13 years. So my team is working with all the partners that you see that are building the next generation of AI robots.

Starting point is 00:02:32 And I'm sure, you know, over the course of 13 years working at NVIDIA, there's been a fair share of excitement and new advancements. But how do you feel about the state of robotics today? At least to an outsider, an everyday person like myself, it seems like the sector is just exploding. Is that how you feel as well, or does it always just feel like this? No, I think, you know, with the coming of AI, robotics has been making steady progress. But over the last six to 12 months, we've seen sort of a shift that has happened, an inflection point, if you will, where people are starting to realize that there is a lot that can be done.

Starting point is 00:03:21 And a lot of that can be attributed to the large language models and generative AI. The fact that you can have a lot of data and use that to make new tasks for robots is just a turning point. The robots have been there for a while. AI has been helping, but they have typically still been single-function robots, right? Amazon talks a lot about their robots in the warehouses

Starting point is 00:03:49 that are picking and packaging things for you. But that robot, if you ask it to make a coffee for you, it can't do it, right? So that is the change that has happened. Now, with the large models, you can really train a robot on multiple modalities and have it learn from what you and I are doing, right? You don't need a PhD to be programming a robot. That's what is going to really unlock the capabilities of robots for everyone.

Starting point is 00:04:22 If you're watching now, what Amit just said is amazingly accurate in putting everything into perspective, and why now. But, you know, I want to dig in even deeper, because what people may or may not realize is that robotics is not new, right? Robotics have been used across many industries and verticals for decades. Why, though, that inflection point now? You know, you said over the last couple of months. Is it just because of large language models, or are there other factors that are kind of bringing the robotics industry to this inflection point?

Starting point is 00:04:54 I would say it's a confluence of several things that has made this possible, right? At the heart of it is, of course, this ability of large language models to consume a huge amount of data and learn from it without somebody actually having to annotate everything, right? You may have seen the work from NVIDIA Research called Eureka, where we taught a robot hand to spin a pen. In the past, even just telling the robot what it needs to do was so hard. Now you can type, I want you to move this pen with your fingers, and all the complex math underneath it is taken care of by a large model. Right. So that's one thing.

Starting point is 00:05:44 The second thing that has happened is that in order to develop these robots and test these robots, you can't really do it in the physical world. These robots are expensive. In the physical world, you have to be careful and safe around them. So simulation has come a long way. We started working on this platform called Omniverse, which is designed for simulating things, and we announced Isaac Lab. This is where you can safely and very quickly test these robots. So that's the second thing that came along.

Starting point is 00:06:20 And the third thing was, in general, the computing power, right? To run these big GR00T models, to run all the AI capabilities, you need to have the computing power in the robot. You cannot be running its brain somewhere else in the cloud, because it just needs to make a decision on whether to stop moving its hand or not. So I think it was the culmination of those three things: the large language models; a great simulation environment to test these robots before you put them in the real world; and third, all the horsepower that you need to do the processing.

Starting point is 00:06:56 It all came together. And so that's what has kicked off this, you know, big bang of a new generation of robots. Yeah. And it's definitely that, you know, even if we look out here on the GTC floor, you see robotics everywhere, which maybe you wouldn't have expected if you were at the GTC conference 10 years ago. But I do want to zoom out a little bit and even talk specifically about NVIDIA's role in all this, because you're not technically making the robots. So tell us a little bit about how NVIDIA works.

Starting point is 00:07:30 It seems like there's so many moving parts, both literally and physically, right, or metaphorically, I guess. But how does NVIDIA's role play into the grander scheme? That's a great question. I mean, when people look at all our demos and things that we show for our technology, they often get confused. Like, look, NVIDIA is making a robot, we're making a humanoid, we're making a mobile robot.

Starting point is 00:07:57 That's not what we do. Our ambition is to be the foundry, to be that platform, where the robotics developers, the people who are making the hardware, and the people who are making the end application can come with their use case, their data, and their robot hardware, and take back the intelligence for those robots. So that's NVIDIA's role: being that underlying platform on which people can bring their stuff, build the final thing, take it back, and deploy it in the real world. What does that actually mean?

Starting point is 00:08:35 Right? Like, you know, when we talk about Isaac Lab, I guess inside or underneath NVIDIA Omniverse, what does that actually mean for the world of manufacturing, production, et cetera? How does that change what companies and enterprises can do if they bring their simulation into Isaac Lab? That's a great question. For example, if you see the announcements that we had this week at GTC, I'll highlight a few. We had an announcement with Yaskawa, whose industrial robots you will see in a lot of manufacturing and logistics places.

Starting point is 00:09:20 And typically the way those robots were programmed was: you get the robot, somebody goes there and spends several weeks or months programming it, and then you have one application done. Now, with Isaac Lab and the work that we're doing with Yaskawa, they can create a model, a digital twin of their robot, in Omniverse. They can use the foundation models, these models that already know how to pick things but don't really know how to pick things with that particular robot, right? So that's the adaptation that they need to do, specific to their hardware. This is what we call fine-tuning, right? So they can create a digital twin, they can fine-tune the models, and take out of it a working skill, a working task. And that's what we demonstrated here

Starting point is 00:10:10 with Yaskawa. They worked with us, brought their robot in, with all the secret sauce of their robot, trained it in the lab, right, in the virtual lab, and then deployed it to the real world. We did similar things with Boston Dynamics. They have probably the best team for making some of these amazing robots. Their mechanical design, the dynamics, the software that runs on the robot are fantastic. But every new skill to be added to the robots took a long time.

Starting point is 00:10:45 And together with them, we announced an end-to-end platform. You can take the Boston Dynamics robot. They worked with us to create a simulation digital twin of their robot. And now you can make it run on sand. You can create a snow environment. You can create a forest environment. It learns all those things. And when it has learned, you take this brain back, put it on an NVIDIA Jetson, right, put it on the robot, and you have your skill. So that's essentially what it looks like. Bring your hardware, bring your use case, create the environment, test it, take it back to the real world. Yeah. I think the only downside of that is we

Starting point is 00:11:24 may get fewer of those viral videos that we always see of the Boston Dynamics, you know, robot dog running across all different surfaces. But, you know, I'm really interested, because it seems public perception around robotics is also changing at the same time. What does it mean as we transition from, you know, general robotics to what people are now calling humanoids? Is that single-use Amazon warehouse robot still general robotics, while a humanoid-type robot can do anything? Can you explain what that even means as we kind of look forward into a future of

Starting point is 00:12:06 these humanoids? Yeah. If you look around you, you don't see many robots, right? And even with the ones that you do see, probably the Roomba in your house, the challenge has been human-robot interaction. You and I cannot even explain what's going on in the Roomba's mind. It just keeps going around the same place, sometimes over and over again. So in order for robots and people to coexist,

Starting point is 00:12:42 there are two important things. They need to operate in shared spaces. Most of the world has been created for us, for people. The factories are still created for a lot of robots. The new warehouses are created for robots, but the majority of the world is created for humans. So they need to exist in a form factor in which they can live around people. And the second thing that is important is they should be able to engage with people.

Starting point is 00:13:14 You should be able to understand what the robot is doing, and the robot should be able to understand what you want it to do. Right. So that's essentially why the humanoid form factor is so interesting. We have a lot of data on humans, right? You and I doing our things, there is an ample amount of data for that. And for generative AI, the one thing that you need is a lot of data. So there is a lot of data on humans that helps with the humanoid form factor. Again, these are designed with the dexterity and the capabilities so they can exist around people in the shared space. And the last piece is, with the large models, with this generative AI, you can really engage with them. We've seen some of the recent work where you can tell the robot what to do. You can even ask how it did it.

Starting point is 00:14:05 So you can have more trust and comfort around these machines. Yeah, you talked about the need for, you know, a space for humans and robots to kind of coexist. I think for people like you and me, maybe that's exciting, right? But for some people, it might make them feel uncomfortable. I mean, should people feel afraid of coexisting, you know, with humanoids? And what does that look like? Or maybe what can you say to those people who do feel a little uneasy, maybe, sharing that space? I think, you know, if you look at the data right now,

Starting point is 00:14:45 there are just so many jobs where we don't have people to do the work. And all of these robots will need human supervisors. So anybody who's worried should not be, because instead of actually doing some of that work, they would be telling these robots what to do. So we'll still have human supervisors determining the task, determining what the robots are expected to do.

Starting point is 00:15:14 So there is just a lot of need for those skills. The tasks that, you know, robots have been great at, the dull, dangerous, and dirty ones, we'll be able to offload to them and still keep monitoring them and controlling them and telling them what is needed.

Starting point is 00:16:28 You know, aside from those, you know, dangerous and dirty tasks, which I love, I just love the alliteration there. It makes it so easy to see where robots can go in the future. But I also assume that there are just more, you know, repetitive tasks that a lot of humans are doing right now. So, and maybe that's the reason why a lot of people, you know,

Starting point is 00:17:07 are scared of AI or are not fully on board with robotics. You know, I guess in that situation, whether you're talking about manufacturing, production plants, et cetera, especially those, you know, you might think of them as low-hanging fruit, where these humanoids can go in and be placed. For those humans in those industries, where should they even be focusing their time and attention right now? Because you can sit around and say, oh, well, you know, what happens if a humanoid is out here doing this in two years, or you can do something about it. So where would you recommend people, you know, maybe in those industries, start focusing their time and attention? I think the first thing is to

Starting point is 00:17:52 identify these tasks, right? The number one thing to do is to identify which pieces of your operations you could automate, and start preparing for it, right? You know, bringing in automation is not just buying a robot and putting it out there. You need to prepare your operations and your IT systems to work with those robots, and you need to prepare your workforce. So I think the first thing is knowing what these robots are going to be capable of, right? Not assuming that they'll do everything,

Starting point is 00:18:27 all of your job. That's not going to happen anytime soon. But they will be really good at, you know, picking things up from a cart and loading them up somewhere else, right? We're seeing what's happening in warehousing: picking things, putting them in packages, shipping them out. So there are a lot of these things which are going to happen in the near term, right?

Starting point is 00:18:47 So identifying those tasks, and then working with this ecosystem to help define those tasks and prepare the environment for those things. So I think those are some of the areas they should focus on, and not be worried about it at all. Yeah. You know, the NVIDIA CEO, your CEO, Jensen, said yesterday in a session that the ChatGPT moment for robotics

Starting point is 00:19:15 is near, it's soon. What are your thoughts on that? And what do you think that moment will actually be for the robotics industry or humanoid industry? What is that ChatGPT moment, you think? Yeah. What was different about ChatGPT versus prior AI was who could use it. I think the capabilities were of course amazing, right?

Starting point is 00:19:43 But how accessible it was was the important piece as well. So for me, when I think about the ChatGPT moment of robotics, I think about, you know, somebody who does not have a robotics background. If they feel comfortable that I can use a robot, I can program a robot, I can tell it what I want it to do, I think that would be the big moment where you'd say, okay, you know, we've got there, right? Instead of that handful of specialist roboticists, PhDs, and master's-degree people who can really work with these robots, when most people feel comfortable using these robots, operating these robots, telling them what to do, I think that would be, in my mind, the moment.

Starting point is 00:20:34 What do you think still has to be done or still has to be accomplished in order to get there? Because, you know, at least for me, when I was listening to the keynote earlier this week at GTC, and again, as a reminder, if you're listening live, you can sign up and rewatch that keynote. Great, great two hours. You won't regret it. But it seems like all of the pieces to get to what you just described are either there or were recently announced. So what are still the obstacles to get to that moment? I think we've seen, like from the AV industry, for example, right, that the physical world is a lot more complex. There are a lot of variations, and there are real implications of things not working out.

Starting point is 00:21:23 So, you know, while we have the key technology pieces, getting to that mass deployment will take more, especially because these are physical things that have to exist around people and be doing stuff. I think there is a lot to be discovered on, actually, you know, what happens when you put them in the real world. So that's that aspect. The second is the data, right? In order to realize these large-language-model-based, and really generative-AI-based, robots, we need a lot of data. And a lot of work is happening right now on how we can bring in that data. And again, you know, we are seeing people creating big farms of teleoperated robots so they can just keep them doing things and collect that data. There is work happening on learning from imitation, right? How are we doing

Starting point is 00:22:20 those things, and translating that? So I think there's a lot of work that needs to happen on the data side, and a lot of work that has to happen on real physical implementation, the gaps that we will see when we put it on the physical system. And then, yeah, I think, you know, those are the two big things that we still need to solve. But I think, as you said, all the key building blocks are there to get us to that promised land. Yeah, and I think, you know, in his keynote, Jensen talked about, you know, even this transition from Hopper to Blackwell and what that means. I think the example that he gave is, you know, training a, you know, 1.8-trillion-parameter

Starting point is 00:23:01 GPT-4 model in, you know, I think a fourth of the time or something like that. What does that mean, though, for other industries such as robotics, right? Like when we look over the course of the coming months or quarters or year, with these GPU chips and the Jetson, what does that mean for your industry and what you maybe were not able to do before just because of compute? What does that open up? What do these announcements open up? It's just a question of iterations, right? The way to get to perfection for this robot is to iterate quickly. And before, you had to wait six to 12 months to train a policy.

Starting point is 00:23:47 If you can do that in six weeks, that just means you can iterate faster. And that's exactly what we are starting to see now, right, with some of these early partners who are adopting these technologies, that the pace has picked up, right? Two years has come down to two months, and over time, it'll come down to two weeks. So you can iterate faster, faster, faster.

Starting point is 00:24:09 And that's the big impact this technology has. The second thing it'll do is bring down the cost, right? In order to really build and deploy these capabilities at scale, you need to do it in a way that is also, you know, cost-effective, right? So if it's taking you two years, there is no way it can be cost-effective, right? But if it's happening in two months or two days, it's going to be cost-effective. So the scale will come. It'll allow people to iterate fast, it'll allow people to scale their learning, and third, it'll help to bring the cost down to something that is more commercially viable in the long term.

Starting point is 00:24:57 Yeah, even with that speed of development and deployment, let's say you and I are having this exact same conversation next year at GTC. What do you hope we're talking about next year? Maybe just for NVIDIA or maybe just for robotics and humanoids in general, where do you hope we are at this point next year? Well, I think we saw a lot of humanoid robots today that were standing and doing some stuff. But I think next year when we are here, we might be talking about why they are not doing something. Okay, okay. So I don't know.

Starting point is 00:25:34 I think, you know, I don't want to be forecasting things here. But the pace in which innovation is happening is amazing. And I'm super excited about what it is going to unlock for robotics and looking forward to that. Yeah. And I do want to get back to something that we talked about earlier. You know, we kind of talked about why now and why this inflection point. And one of those factors was large language models. It's generative AI.

Starting point is 00:26:04 You know, I think a lot of people have seen some demos recently, like the Figure and ChatGPT kind of collaboration. Is that what you kind of see happening in the near future? Just everyday people having the ability, you know, like you said, you don't necessarily need a master's degree in deep learning or robotics. Is that kind of what we should be looking at as maybe being the near future, since the technology is kind of already there: being able to have a robot or a humanoid help with everyday tasks just by using natural language?

Starting point is 00:26:42 I think that's absolutely correct, Jordan. Generative AI is actually helping this next generation of robots in multiple ways, right? One is, of course, how you are engaging with the robot. How are you telling it the tasks to do? How are you building that human-robot interaction? But in the background, there is a lot of work that's happening on, well, what does it mean for a robot when you say, hey, go find that can and put it in the trash,

Starting point is 00:27:16 what does it mean in the robot language world, right? So behind the words, I think there is a lot of research that is happening. There's a lot of core engineering that has to happen to translate that into a robot-specific language. And this is where we announced the GR00T model, right, the foundation model that can take text, videos, and demonstrations: hey, I want you to stack this thing this way, right? So there's a lot of that work that needs to happen in the background for translating that into a robot-specific language. And then the other way generative AI is helping is in creating all these worlds. The human world is big and complex. How are you going to create all

Starting point is 00:28:00 these environments in simulation? So generative AI is going to help there as well, with the generative capabilities to create environments, to create settings, where you can just say, I'm going to deploy this in a kitchen with, you know, shelves this high, and all these instruments should be there. And boom, it creates an environment for you to go play with. So I think there are just a lot of things that are happening on the core technology side as well to bring that future forward. Yeah. You know, speaking of that future, you know, you're in a very unique position as the director of robotics here at NVIDIA, and even NVIDIA's place in the grand scheme of things, right? So you get a little bit of a taste of, you know, what your hardware partners are doing,

Starting point is 00:28:46 you know, other partners you're working with in cloud and on the AI side. But what's the thing that excites you, right? Like, once we wrap up the craziness of the GTC conference, what are you going to be excited to be working on, and what are you going to be paying attention to in the coming weeks and months? I think for me, in the coming weeks and months, the exciting part is the traditional robotics, like the industrial robot arms, the collaborative robots, the AMRs that are already there today. They have a lot of potential, right? The new things with humanoids and, you know, other embodiments will come. But we also announced two key things, right, Isaac Manipulator and Isaac Perceptor, which apply to the robots of today, the robots that are operating today.

Starting point is 00:29:40 So in the coming months, that is what I'm most excited about: how we will make these existing robots smarter. How can we make them easy for people to use? How can we help these companies go from hundreds and thousands to tens of thousands of robots with AI and with the tools that we are providing? So I think that's what I'm most excited about: making those tangible things happen in the near term with the existing robots. And so we've talked about so many things on today's show, from new announcements that you

Starting point is 00:30:14 had here at NVIDIA GTC, to partnerships that you just unveiled, to the future of robotics and humanoids. But maybe as we wrap this up, what is the one big takeaway that you want people to walk away from this conversation with, you know, specifically on where NVIDIA is focusing and doing its work right now? What's that one big takeaway message you have for people? I think the one takeaway would be that we are building the foundry and the foundation for the world's roboticists to come and build and realize their dreams.

Starting point is 00:30:53 It takes a long, long time and a lot of key technologies. As Jensen said, the soul of NVIDIA, right? Graphics and AI and simulation, right? All physics. These three pieces are core building blocks of robotics. We've been investing in them for so many years. So now I think we have that solid foundation on which, you know, the new generation of robots can be built. It's such an exciting time, not just for the industry, but for how it's ultimately going to, you know, play out in the real world and the business, you know, sector and everything else.

Starting point is 00:31:40 So, Amit, thank you so much for joining the Everyday AI show. We really appreciate your time. Thanks for having me. I really enjoyed this conversation. All right. And, hey, as a reminder, there is so much more. Make sure to not only check out the show notes of the livestream or the podcast, but also the newsletter. So go to YourEverydayAI.com, sign up for the newsletter, and we'll

Starting point is 00:32:02 have information in there as well on where you can still register for the NVIDIA GTC conference. Don't worry. It's not too late. If you slept through the first two and a half days, you can go back to some of the sessions that we were talking about here. You can go back, watch those replays, sign up. You can do it for free, as well as enter our giveaway to win a GPU as well as DLI credits. So you've got to go check that out. And you've got to also continue to join us. We have a lot more here from the GTC conference.

Starting point is 00:32:29 Thank you for tuning in. We hope to see you back later today and tomorrow and every day for more Everyday AI. Thanks, y'all.

Starting point is 00:33:02 And that's a wrap for today's edition of Everyday AI. Thanks for joining us. If you enjoyed this episode, please subscribe and leave us a rating. It helps keep us going. For a little more AI magic, visit YourEverydayAI.com and sign up for our daily

Starting point is 00:33:33 newsletter so you don't get left behind. Go break some barriers and we'll see you next time.
