Science Friday - How AI Advances Are Improving Humanoid Robots

Episode Date: September 25, 2025

Robots are just about everywhere these days: circling the grocery store, cleaning the floor at the airport, making deliveries. Not to mention the robots on the assembly lines in factories. But how far are we from having a human-like robot at home? For example, a robot housekeeper like Rosie from “The Jetsons.” She didn’t just cook and clean, she bantered and bonded with the Jetsons. Stanford roboticist Karen Liu joined Host Ira Flatow to talk about how AI is driving advances in humanoid robotics at a live show at the Fox Theatre in Redwood City, California.

Guest: Dr. Karen Liu is a professor of computer science at Stanford University.

Transcripts for each episode are available within 1-3 days at sciencefriday.com.

Transcript
Starting point is 00:00:01 Hi, this is Flora Lichtman, and you're listening to Science Friday. Today on the podcast, a conversation from our live show in Redwood City, California. Ira's talking AI and what it means for humanoid robots. The conversation part is, it feels very amazing, but it's actually the easiest part. You know, robots are just about everywhere these days. Maybe you've spotted one circling the grocery store, cleaning the floor at the airport, making deliveries, not to mention the robots on the assembly lines in factories. But what about having your own robot in your own home?
Starting point is 00:00:48 How far are we from having a human-like robot at home? Joining me now to talk about the latest advances in humanoid robots is my guest, Dr. Karen Liu, professor of computer science at Stanford University, based, of course, in Stanford, California. Welcome to Science Friday. Thank you, Ira. Nice to be here today. Now, you developed and trained humanoid robots.
Starting point is 00:01:17 What is the definition of a humanoid? How do you define one? Well, there's really no universal consensus on the definition of humanoid. But I think most people would agree that, you know, humanoid should have a similar morphology to humans. That means you should have two legs for locomotion, two hands for manipulations,
Starting point is 00:01:38 and you need to have an egocentric vision system so you can navigate the world and interact with objects in the world. Now, I understand you're not just talking about it. You've actually brought one with you, right? We actually built a humanoid from scratch at Stanford. Toddy. Toddy, as we call it, is a toddler bot because it's modeled off of a human toddler. It's very small, but it has a lot of features, almost like a full-size robot. It can talk, it can listen, better than toddlers, and it can also see the world. All right, let's bring in the math.
Starting point is 00:02:15 Let's bring out Toddy the toddler robot. Toddy, are you excited to be here tonight? He-he, I'm so happy to be here, too, buddy. Toddy, can you tell us how old you are? I'm one and a half years old. Yuppie. Toddy, do you know that you are a robot? He-he.
Starting point is 00:02:41 Yep, I'm a robot. Wiggle-Wiggly. I have to ask you something I ask all my two-year-old grandchildren. Toddy, do you have to go potty? No, no potty for me. I'm all goody. Thank you for coming, Toddy. Good to see you.
Starting point is 00:03:03 Thank you. Nice to see you, Toddy. Nice to see you too, buddy. Thank you. Isn't that great? That's terrific. Tell us what are some of the advances in robotics that make something so humanoid like that? Yeah.
Starting point is 00:03:25 So there are so many things. You know, AI is definitely one thing. I mean, you've probably heard of that. You know, ChatGPT, large language models. They fundamentally change the way robots think about the world, reason about the world. But there's also something that is, you know, less mentioned, an often overlooked factor,
Starting point is 00:03:46 which is the advances in hardware, in robotics hardware. And this is not so much about one single technology breakthrough, but rather a convergence of all the enabling technologies. So, for example, 3D printing: the toddler bot you just saw is completely made of 3D-printed materials, and all the motors are commercially available. So this is something that you can actually just build from scratch at home. And I think that is one of the reasons you start to see that this is a very welcoming environment for robotics hobbyists to do
Starting point is 00:04:23 this. How do you teach it to be intelligent like that? Yeah, so that's a really good question. The conversation part is, it feels very amazing, but it's actually the easiest part. Really? Because of ChatGPT, because of large language models, if we have an interface that can communicate with ChatGPT, with a large model, fast enough, then you can, you know, basically talk directly to ChatGPT. And so for what you just did, we didn't do any scripting, right? I had no idea what Ira was going to ask. I was a little bit worried about that. I was more worried than you were.
Starting point is 00:05:01 But I think the hardest part is something that feels very easy to do, for example, balance. For example, walking forward, locomotion, manipulation. Those are things that we take for granted, but it's very difficult to train robots to do that reliably. Is that the hardest part then, to get it to walk and to mimic how humans do things? Yeah, so walking is hard.
Starting point is 00:05:26 Manipulation is another very difficult task. And the reason walking is really challenging for bipeds, meaning robots with two legs, is that this is what we call an underactuated system. Underactuation means that, you know, we could actuate our joints, our motors. Well, we don't have motors, but humanoids use motors. We could actuate the motors locally,
Starting point is 00:05:51 but the whole system, the whole dynamic system, we cannot accelerate it from point A to point B without using external forces. So the way a humanoid can walk is that it figures out how to exert torques at its hip joints and knee joints and ankle joints in a very coordinated way so that it can push the ground,
Starting point is 00:06:16 get exactly the forces it wants from the ground, and accelerate its center of mass forward to the right location that you want it to be. And that's a lot of computation. I'll bet. I imagine the hands are just as difficult, right? Yeah, hands, that's a different thing. And the reason manipulation is really hard is because we really want to be able to manipulate
Starting point is 00:06:37 anything in the world in a very general way, right? If you just build a robot hand or robot arm to do one thing in the factory, in an industrial setting, that's actually not too hard. But the reason it is hard today is because manipulation means that you need to handle all kinds of diversity in the world and do so with the same brain, the same policy. That's what we call it. Well, what do you think the limitations are of what you can teach Toddy or robots to do? One of the biggest limitations is that, you know, if you are going for the machine learning or AI approach, we need a lot of data to teach robots, just like the
Starting point is 00:07:28 way we teach, you know, large language models. So my colleagues would say we have this 100,000-year data deficiency compared to training large language models. What it means is that, you know, in order to train an LLM, or large language model, it takes an amount of text training data that would take an individual, a person, 100,000 years to read. And if you think about the largest data set we have in robotics, it's about 10,000 hours. 10,000 hours. Yeah, that's the amount of data we have today. Right.
Starting point is 00:08:01 So data is definitely really challenging for robotics. And another really difficult thing is that we're not just learning to understand text or images; we need to understand the mapping between what you see and what you do, right? Because this is the most crucial part of robotics: decision making. Now, we have lots of people lining up. I'll try to hit both sides of the room.
Starting point is 00:08:28 Let me start on this side first. Yes. So you've talked about data to teach the robot. I'm a science teacher who teaches life science, and I think a lot about how our senses give us feedback, which is like a huge part of learning how to walk, for instance. What kinds of feedback does the robot take? Yeah, so using data to teach robots, we call it imitation learning or supervised learning,
Starting point is 00:08:56 is only one way. Robots can also learn from trial and error, learn from their own experiences of interacting with the world, and this is what we call reinforcement learning. In that case, we will ask robots just to try things. Here's your task, you try different actions, and then we look at the results. We'll give you a score to let you know you're doing well or you're not doing well, and the robot will take that as a signal to decide, oh, what I just did was pretty bad.
Starting point is 00:09:27 I got a score. Maybe I should reduce the probability of doing the same thing again. So that's kind of like, you know, getting feedback from a human. But in order to make this process automatic, we don't have a human, you know, keep telling robots what score they get. Instead, we would design what we call a reward function. It's an automatic way to assess the performance of the robot. Next question.
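The trial-and-error loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used for the robot in the show: the three action names, the reward values, and the simple policy-gradient style update are all made up for the example.

```python
import math
import random

# Hypothetical actions for a toy "try to move forward" task.
ACTIONS = ["lean_back", "stand_still", "step_forward"]

def reward_function(action):
    """Automatic scorer with no human in the loop. The values are invented."""
    return {"lean_back": -1.0, "stand_still": 0.0, "step_forward": 1.0}[action]

def softmax(prefs):
    """Turn preference numbers into action probabilities."""
    highest = max(prefs)
    exps = [math.exp(p - highest) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, learning_rate=0.1, seed=0):
    rng = random.Random(seed)
    prefs = [0.0] * len(ACTIONS)  # start with no preference for any action
    for _ in range(steps):
        probs = softmax(prefs)
        chosen = rng.choices(range(len(ACTIONS)), weights=probs)[0]  # try something
        score = reward_function(ACTIONS[chosen])                     # get a score
        # A bad score lowers the probability of repeating the chosen action,
        # and a good score raises it.
        for j in range(len(ACTIONS)):
            indicator = 1.0 if j == chosen else 0.0
            prefs[j] += learning_rate * score * (indicator - probs[j])
    return dict(zip(ACTIONS, softmax(prefs)))

policy = train()
print(policy)
```

After training, nearly all of the probability should sit on the action the reward function scores highest, which is the "reduce the probability of doing the same thing again" idea applied over many trials.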
Starting point is 00:09:54 Over here on this side. So to follow up on the concept of the reinforcement learning, I read as a young boy Isaac Asimov's books, and among them were rules of robots. And so how does this reinforcement learning coordinate with that and the ethical use of robots to be sure they're not going to harm us? Yeah, so, you know, talking about the ethical aspect of robotics, the reward function I was just talking about is still designed by a person, right?
Starting point is 00:10:29 So, you know, if we believe there are certain values and beliefs that we want a robot to learn from, we have to encode them into the reward function. And that is definitely not an easy thing to do because, you know, how do you turn ethics into a mathematical function, into an equation? Do you think robots are going to get smarter than us and will be the robots someday? Well, you know, robots can appear very smart in certain ways, but sometimes they are also not so smart. So, for example, going back to that reward function again, my colleague would say,
Starting point is 00:11:06 you know, if you ask a robot to cook dinner for you and you write it down into the reward function, the robot would probably do that, cook dinner for you, but along the way, it kills your cat. And the robot says, well, in the reward function, you say, you know, I get a good reward for cooking the dinner, and I don't get any punishment for killing the cat, you know, why not, right? So the problem is that you can never, it's kind of like arguing with a toddler. But you didn't say I cannot do that, right? There are always, you know, corner cases that you cannot capture in your reward function. You mean, I don't want to kill any cats, no. What happens when a robot fails?
Starting point is 00:11:41 It can fail in many different ways, right? Can it actually teach itself what it did wrong without you having to tell it? Does it learn from its mistakes? Yeah, that is actually a really great question, Ira, because I would say this is the major deficiency of a robot today compared to humans. Humans are really, really good at
Starting point is 00:12:05 self-awareness, knowing how well they do in a particular task, during the skill learning time and after the skill is deployed to the real world. If you ask a small child to put together a block structure, and if the child fails at the task, the child immediately knows what went wrong. He knows, oh, maybe I didn't set a foundation sturdy enough, or I didn't align this particular piece, well, precisely enough,
Starting point is 00:12:38 or a dog just nudged the table and, you know, caused the structure to collapse. They know immediately. And this ability, this innate ability of understanding the source of error, is something that robots don't have. So a robot will fail, and it has no idea why it fails. It just, well, assumes, I'm not going to do that same thing again because I got a bad score.
Starting point is 00:13:03 But if a robot could reflect on its behavior, maybe it would figure out, oh, it's because my sensor was not calibrated right, or I estimated the floor to be not as slippery as it really is. And knowing that will help it to accelerate learning a lot. It's not going to say, oh, darn, I did something. That's right. Yeah. I'm not going to go that way.
Starting point is 00:13:31 After the break, more of Ira's conversation with roboticist Karen Liu about the current state of humanoid robots. Yeah, so if you want something like Toddy, you can have it today. You can? Oh, yeah. You have a background in computer animation, correct? How has that helped you in your robotics? What did you bring to that?
Starting point is 00:14:03 That's right. So I actually started as a computer graphics person and then kind of worked my way into robotics. But if you think about computer animation, you probably imagine there are animators making those keyframes, those poses, and then they interpolate those keyframes and make a nice-looking, smooth animation.
Starting point is 00:14:23 And that is something that you can do for, let's say, making movies. But if you want to do video games, then you probably will need a process, an automatic process that can generate motion. Let's just call it a motion generator, which understands the high-level commands, like moving forward, make a jump on the platform, you know, when you play video games. And if you take it one step further, if this motion generator, instead of outputting a pose or a sequence of poses, outputs joint torques, then what you can do is you take those joint torques together with the gravity and contact forces, whatever the external forces are, put them into a physics engine, and let the physics engine figure out the poses. And this is what we call physics-based animation.
Starting point is 00:15:13 And at this point, you are actually very close to robotics. The only difference is whether you have human animators giving those high-level commands or you have an AI model to do that. So if you replace the humans with an AI model and let this AI model make high-level commands based on the task, based on the observation, then your character is basically a virtual robot, and you're ready to deploy it to the real world.
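The physics-based animation pipeline described above (a motion generator outputs joint torques, and a physics engine turns torques plus gravity into poses) can be illustrated with a toy example. The sketch below is a set of assumptions rather than anything from the show: it uses a single hinged link instead of a full character, a simple proportional-derivative controller as the hypothetical motion generator, and a few lines of numerical integration as a stand-in for a real physics engine. All constants are invented.

```python
import math

GRAVITY = 9.81  # m/s^2
MASS = 1.0      # kg, point mass at the end of the link (made-up value)
LENGTH = 0.5    # m, length of the link (made-up value)

def motion_generator(command, angle, velocity):
    """Hypothetical motion generator: maps a high-level command to a joint torque."""
    target = {"rest": 0.0, "raise": math.pi / 4}[command]  # target joint angle in radians
    kp, kd = 40.0, 6.0  # controller gains, chosen by hand for this toy
    return kp * (target - angle) - kd * velocity

def physics_step(angle, velocity, torque, dt):
    """Stand-in for a physics engine: one hinged link hanging under gravity."""
    inertia = MASS * LENGTH ** 2
    gravity_torque = -MASS * GRAVITY * LENGTH * math.sin(angle)
    acceleration = (torque + gravity_torque) / inertia
    velocity += acceleration * dt  # semi-implicit Euler integration
    angle += velocity * dt
    return angle, velocity

def simulate(command, steps=3000, dt=0.001):
    """Return the pose (joint angle) at every time step."""
    angle, velocity = 0.0, 0.0
    poses = []
    for _ in range(steps):
        torque = motion_generator(command, angle, velocity)
        angle, velocity = physics_step(angle, velocity, torque, dt)
        poses.append(angle)
    return poses

poses = simulate("raise")
print(f"final angle: {poses[-1]:.3f} rad, commanded: {math.pi / 4:.3f} rad")
```

Note that nothing in the code writes the pose directly. The link settles a little below the commanded angle because gravity pulls against the torque, which is the kind of detail the physics step works out on its own.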
Starting point is 00:15:42 Now, I understand that Toddy, who was out before, actually has a 3D-printed body, which makes me think, how far along until we can actually get our own home robot, do you think? Yeah, so if you want something like Toddy, you can have it today. You can? Oh, yeah. And we have them in the lobby for sale out there?
Starting point is 00:16:03 No, but almost as good as that. We have open-sourced everything. We open-sourced the hardware design, the algorithms, and also a step-by-step instruction manual. So if you have a 3D printer at home and you want to build Toddy, all you need to do is just order those commercially available motors and then just follow the instructions. Sure. If you can bake a cake, you can do that, right?
Starting point is 00:16:31 Yeah, we have the recipe. But in general, when might we see home robots, maybe not just Toddy, but other robots? It is a really difficult question, and I get this question all the time, right? And I always say five years. Everything is always five years away, right? Because beyond five years, then, like, you know, it's not my responsibility. The reason I think it is possible, but not just, you know, tomorrow, is because right now we're still building a lot of building blocks, really,
Starting point is 00:17:05 like locomotion and manipulation. Before they are completely reliable, robust, and safe, I don't think it is ready to build a product out of it. But I also feel like the deployment of humanoids is going to come in stages, kind of like, you know, autonomous vehicles. So at the very beginning, maybe the robots are not going to be able to do anything you want in a household, maybe just do laundry or, you know, clean your dishwasher. I'd settle for that. I know. Those are the two things I want.
Starting point is 00:17:40 Dr. Liu, this is fascinating. Thank you for taking time to be with us today. Dr. Karen Liu, Professor of Computer Science. Thanks. Thank you so much. Thanks, of course, in Stanford, California. Thanks for listening. Don't forget to rate and review us
Starting point is 00:18:00 wherever you listen. It really does help us get the word out and get the show in front of new listeners. Today's episode was produced by Shoshannah Buxbaum. I'm Flora Lichtman. Thanks for listening.
