a16z Podcast: The IQ and EQ of Robots
Episode Date: November 21, 2018, with Boris Sofman (@bsofman), Dave Touretzky (@DaveTouretzky), and Hanne Tidnam (@omnivorousread). We're just now beginning to truly see the first 'real' robots in the home, from Roombas to toys to... companions to... well, much more. How are humans beginning to forge relationships with these robotic devices (/entities!) -- and how will those relationships develop? What do we learn as we begin to forge relationships and interact with robotic toys like Cosmo and Vector -- about robots, and about ourselves? And what do these learnings teach us about the possibility of adding a "personality wrapper" to new technologies? In this episode of the a16z Podcast, CEO and cofounder of Anki Boris Sofman, and Research Professor of Computer Science at CMU Dave Touretzky, discuss with a16z's Hanne Tidnam where we are in the human-robotic future, the history of robotics that has brought us here, and the next big breakthroughs -- in hardware, software, perception, navigation, and manipulation -- that will bring in the next waves of innovation for robots.

The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Hi, and welcome to the a16z podcast. I'm Hanne, and in this episode we talk about robotics in the home becoming a reality. Boris Sofman, CEO and co-founder of Anki, and Dave Touretzky, professor of computer science at CMU, discuss with me the evolution of robotics: where we are in manipulation, perception, navigation, and, most importantly, in the human relationships we will increasingly form with these new robot entities that will be in our homes. So you guys talk about this as sort of the very first real robot companion. Let's go back and look at the first iteration, which was Cosmo. Why start with a toy? What does that represent for where we are now, and ultimately where we're trying to get to in robotics?
Yes.
When we started Anki, we always realized that in order to get into some of these really advanced applications,
we had to take a bottom-up approach.
And we couldn't just skip to this holy grail that's 10 years away.
We had to think about what are the applications where we could really reinvent the quality of what's possible.
Well, what is the holy grail?
So there's all of these types of challenges that involve really deep breakthroughs
in human robot interface or manipulation or AI.
There's the large-scale, diverse, humanoid or like very competent robot for home or for manufacturing or for other things.
There's applications that involve very high-end manipulation in an environment.
Where manipulation is today is probably where autonomous driving was five or six years ago,
where you can already see the capability starting to take form,
but the reliability and cost constraints are still pretty prohibitive.
But you can kind of extrapolate where it's going.
For us, we started with entertainment because that was a way to create innovative applications
and actually build the muscle early on that we can then carry over into these other spaces down
the road. Cosmo was a way to use the toy space, which hadn't seen much innovation at all,
but for the first time bringing a robot to life where the level of creativity that you're allowed
to have in a toy is much higher and you can build a lot of the blueprints for these products
that start to have deeper applications and appeal.
That's interesting that you feel there's a wider range of
creativity there than in something where we already have expectations.
Part of the holy grail is mobile manipulation. There aren't a lot of robots like that. A landmark in the history of
consumer robotics was the Sony Aibo robot dog back in the late 1990s, early 2000s. It cost about
$2,000 back then, which was about the cost of a decent laptop. And this was the only little mobile
robot you could buy that could see. It had a built-in camera. You could program it in C++, and it could move around, so it was a mobile manipulator. Unfortunately, Sony sold fewer than 200,000 of these over the six-year life of the product line. In January 2006, they left the robotics business altogether. And so the idea of a consumer-affordable mobile manipulator just died.
And was that because the technology kind of stalled out? Because there were problems that were not surmountable at that time?
We ended up being able to mass-produce this sort of capability in a way that even five years earlier would have probably cost multiples of that amount, or just been absolutely impossible. We basically became, for our early products, scavengers of the smartphone industry.
So you have computation, you have memory, you have cameras that now suddenly cost, you know,
50 cents versus, you know, $6. You have motors and accelerometers and different types of sensors
that allow you to actually do these sort of things that you couldn't do before.
Right. And also the supply chain. And the supply chain. And so we could in effect turn this
into a software problem, where now, because of the capability set, everything becomes driven
by our ability to expand the capabilities with software.
The really hard question is, how do you go from no manipulation to manipulation in a small step?
So if you look at the most popular robot ever in the history of robotics, it's the Roomba vacuum cleaner, because they found this one task that a robot can do poorly and still make you happy.
Right?
I mean, it doesn't matter if it's not as good as you'd do it yourself, yeah, right?
It's just, it's better than nothing and it's good enough most of the time that that was a success, right?
But there aren't a lot of things like that, right?
So if I ask the robot to cook me dinner and it does the same quality of job as the Roomba does vacuuming my floor...
Right, over the course of like eight hours.
We're pretty far off.
Yeah. So when people think about home applications, right, there's the Roomba that's down on the floor. And then there's Rosie from The Jetsons, right, which is the full humanoid doing everything.
I was going to count how long until a Jetsons reference came up in this podcast.
You got us.
So the problem is what is the thing that you could do that involves manipulation,
but that doesn't require a half million dollar humanoid and technology that doesn't exist
yet?
Exactly.
Right?
And so something that a robot could do with minimal manipulation skills that would actually
be useful as opposed to just occasionally amusing.
What if I was just happy if the robot could make me a sandwich?
And it doesn't have to be big enough to get the stuff out of the fridge?
If I take the stuff out of the fridge and just throw it on the countertop, if the robot could just take it from there, right?
Maybe that would be...
Slowly drag a slice of bread over to a lettuce.
But it wouldn't have to be a full-scale humanoid.
There's got to be something robots could do that would be useful enough that we tolerate them.
So we're talking about this as, in many ways, the sort of first, quote-unquote, real robot personality in the home. Let's break that down.
What are some of the utilities?
We're not washing dishes yet, right?
They're not picking up toys.
We wanted to leverage this unique mix of AI
and cognizance of the environment
and the personal interaction with people in the home
and really think of this as a first mass-market home robot that can actually provide a mix of entertainment and companionship, but with elements of utility.
We modeled this in a lot of ways
after the sort of pets you have in your home.
A pet cat, a pet dog, and a pet robot.
And the pet robot allows us to reinvent the dimensions that are related to how you interact with a pet,
but the companionship elements still hold.
So he gets excited when he sees you in the morning.
He'll get really animated when there's people around him.
He'll explore and kind of wander around his environment, interacting intelligently with things around you.
The relationship.
You can even pet him.
Like literally you can pet a robot for the first time.
But you don't have to clean the litter box.
Yeah.
So you've recently launched a different robot, which is a multi-generational toy.
How is it different?
What is evolving?
From a technical standpoint, there was a huge, huge leap there, because we now de-tethered away from a mobile device and put the equivalent of a tablet inside of this robot's head so that he can be always on and alive 100% of the time, go back to his charger to recharge, wake up, and that allows him to initiate interactions in a way that previously would not have been possible without somebody pulling him out and launching an app. We can use the strengths of the fact that we have cloud connectivity and a robot, and start getting into deeper voice interface — elements of functionality that now leverage the character aspect and the warmth of it, and the fact that he can actually see and understand the environment and recognize you: personal delivery of information
in a way that can be initiated by him. What happens down the road is actually starting to
more intentionally dial into all of the digital pipelines in your life, whether it's a smart
home, your own calendar, and particularly when you start adding a layer of mobility that lets you
actually move around the home. Right now, there's nothing that actually stitches a home together
in that way. Everything is just an end device in a room.
Once you start getting cognizance of what's around you, who's in the home, and being able to move around in it,
you can do a range of things, from security and monitoring to tying together smart home features — like recognizing that your window's open but your AC's on, turning off lights when you're gone, being able to remotely check on things like pets, building a blueprint of your home for furniture shopping or real estate.
We're already thinking about how you start adding mobility, where that becomes the next big barrier. You de-tether it from a phone to get to Vector; now the next big barrier is truly being able to exist in an indoor environment, whether it's a home or workplace, and be much more intelligent about how you interact with people and the sort of service you can provide.
So like, "Vector, go check and see if the cat is out or in."
So with every one of our robots, we've realized that we're basically making an operating
system for robotics, applications, and technologies.
And so we've unlocked APIs for each of our products where people can access very easily
these technologies that have millions of lines of code behind the scenes.
We started with Cosmo, accessible by everybody, from a seven-year-old using a graphical interface to do face detection with a robot, through teenagers, and even PhD students in computer science. Now, instead of having to be an expert in robotics
in order to do something like path planning, you have access to this API of technologies. And we're
really interested in how that can become an enabler for kids of various stages to actually
learn how to break through and become interested in robotics in a way that wouldn't have
been possible earlier. Has what you think about changed at all with Vector? You know, the ways that
this kind of programming has evolved? With Vector, the capabilities just skyrocket because, basically, instead of just the limited functionality that Cosmo had, which was deferred to the mobile device, you now have onboard the ability to see, hear, feel, and compute, with a quad-core CPU that would have been prohibitively expensive before. So just exponentially more options. Right. And so now the sort of programs you can write expand. We've already released to our early customers an SDK that allows them to interface with Vector through Python. And then down the road we'll probably create a Scratch
interface like we did with Cosmo as well.
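To make that concrete, here is a minimal sketch of what a program against the Vector Python SDK can look like. It assumes the published `anki_vector` package, its `Robot` context manager, and its `behavior` interface; exact method names and availability depend on the SDK version you have installed, so treat this as illustrative rather than authoritative.

```python
# Minimal sketch using the anki_vector Python SDK (assumes the package is
# installed and the robot has been paired, e.g. via `python -m anki_vector.configure`).
import anki_vector
from anki_vector.util import degrees

def main():
    # Robot() opens a connection to Vector for the duration of the block.
    with anki_vector.Robot() as robot:
        # Drive off the charger so the robot is free to move.
        robot.behavior.drive_off_charger()
        # Use the built-in text-to-speech to have Vector talk.
        robot.behavior.say_text("I am ready to explore")
        # Turn in place 90 degrees -- a tiny taste of the motion API.
        robot.behavior.turn_in_place(degrees(90))

if __name__ == "__main__":
    main()
```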
So for me, what's most interesting is having the robot interact with physical things.
Physical things are really hard to interact competently with.
We built a three-story robot dollhouse with a working elevator.
And the students in my cognitive robotics class were working on that and they'll work on it again this year.
That's so cool.
Getting the robot to navigate through the dollhouse, use the elevator to move from one floor to another — eventually we want to have multiple robots in there interacting with each other. These are very hard technical problems.
But when you solve them, it's really fascinating to see a robot interact effectively with the world.
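For a sense of the kind of navigation primitives students work with in a class like that, here is a small sketch built on the official Cozmo Python SDK (the `cozmo` package, where the product name is spelled with a z). It only strings together documented motion calls; the distances are made up for illustration, and nothing here implements the dollhouse or elevator logic itself.

```python
# Sketch of simple Cozmo navigation primitives: drive, turn, announce.
# Assumes the official `cozmo` SDK is installed and the phone/USB link is active.
import cozmo
from cozmo.util import degrees, distance_mm, speed_mmps

def explore_room(robot: cozmo.robot.Robot):
    robot.say_text("Starting my patrol").wait_for_completed()
    # Drive a short square path; distances are illustrative only.
    for _ in range(4):
        robot.drive_straight(distance_mm(150), speed_mmps(50)).wait_for_completed()
        robot.turn_in_place(degrees(90)).wait_for_completed()
    robot.say_text("Back where I started").wait_for_completed()

# run_program handles connecting to the robot and running the function.
cozmo.run_program(explore_room)
```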
I think there's just something more about being able to work with a real robot versus something digital on a screen. If you're a kid learning to program, there's something amplifying about that. Just the learning aspect of it — the moment you can intentionally manipulate something, it's a sign of really deep intelligence. And so that's a big focus of Vector's: how do we identify interesting things in the environment and just, like, poke at them, examine them the way that you would? That's a surprisingly hard problem, but when you do it successfully, there's a perceived
intelligence there that just amplifies your appreciation of the robot and everything else that it
might be able to do. One of the articles mentioned recently that Vector feels like a beachhead
into something bigger. And it does seem like, you know, we're getting airdropped in this sort of little taste of, like, the fantasy Jetsons robot, right? Like this personality that actually responds to you and interacts with you. So how are we going to start developing this relationship with this being, this creature? I don't even know what the word is — this robot.
But like, it is an entity.
That's right.
Yeah.
What are some of the learnings of how we're starting to develop relationships?
I think one of the pieces that most people underestimate is how critical this human robot interface challenge is
and how novel and unique of a dynamic this becomes.
Until you experience it, it's hard to understand how that feels and how important it is.
It's the same sort of underlying inherent desire that we have to speak face-to-face with somebody
versus just on a telephone.
We always thought that this would be the magic even behind Cosmo,
but it turned out to be even stronger than we realized.
The way we approached it is we actually have an animation studio inside the company.
Oh, you do?
We do, with folks from Pixar, DreamWorks, and these sorts of backgrounds.
And so they literally are using the Maya software suite that you would use to animate digital film and digital video games.
But we've rigged up a version of Cosmo and Vector and all of our robots in there where they're actually physically animating these characters with the same level of detail that you would see in a movie.
But the output is spliced to where it's a physical character coming to life in the real world.
How fascinating.
And it's this merger where you have these people who come from this world where they're used to controlling a story on rails from start to finish into every minute detail.
But you get thrown into a spontaneous environment with all these constraints and unknowns and limited degrees of freedom in the robot.
But you have the benefit of being physical where everything's amplified in terms of the personal impact of what that does.
So it's a whole new kind of physical manifestation of storytelling.
And they have to partner with the roboticists and the AI team, the systems team, where they have to leverage the knowns of the environment and the intelligence of what's around you — like, "I just recognized somebody I know," or "I almost rolled off the edge and I got scared." And so we have this sophisticated personality engine that takes context from the real world and probabilistically outputs one of these, like, giant short films, if you will, of reactions that end up getting stitched together into a lifelike robot that feels alive.
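As an illustration only: the description above suggests a context-triggered, probabilistic mapping from perceived events to pre-authored animation clips. The sketch below is a hypothetical toy version of that idea, not Anki's actual engine; all the names (events, clip names, weights) are invented for the example.

```python
import random

# Hypothetical personality-engine sketch: each real-world event maps to a set of
# hand-authored animation clips with weights; one is sampled and "played".
ANIMATION_LIBRARY = {
    "saw_known_face":  [("greet_excited", 0.7), ("greet_shy", 0.3)],
    "near_table_edge": [("recoil_scared", 0.8), ("peek_over_edge", 0.2)],
    "lost_game":       [("grumpy_slam_cube", 0.5), ("sad_slump", 0.5)],
}

def react(event: str) -> str:
    """Pick one animation for an event, weighted by the authored probabilities."""
    clips = ANIMATION_LIBRARY.get(event)
    if not clips:
        return "idle_look_around"  # fallback behavior for unknown context
    names, weights = zip(*clips)
    return random.choices(names, weights=weights, k=1)[0]

# Example: stitching reactions together as events arrive from perception.
for ev in ["saw_known_face", "near_table_edge", "lost_game"]:
    print(ev, "->", react(ev))
```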
And we've been really surprised by, for example, eye contact.
We knew that this would be important where you make eye contact, you get excited,
say, hey, Hanne, like, you know, and then you get really, really happy.
Kids just melt with that.
But we didn't realize how powerful that was.
And in one of our tests where we were just optimizing parameters, we ramped up the frequency
and length of eye contact by like a second.
And our average session length went up from 29 minutes to 40 minutes.
Oh my gosh.
So it's like just that one parameter.
And so you get these like really subtle interactions where it's a sort of data that you
can only really learn about at large scale once you're actually fielding these robots.
What are some of the engagement patterns, or the dials that you have learned need to go up or down, when you're watching these human-robot interactions develop?
This is where we're still learning.
So Vector's always on and alive.
So how do you find a balance between him waking up and getting excited and coming out but
not being disruptive and/or annoying?
Chill out.
Go to your back.
Yeah.
And how do you take the cues, where if he gets, like, picked up and put back onto the charger, that's a cue to, like, be quiet, right? And so you want him to interpret that and just kind of grump a little bit, but then settle down, right?
We have an audience that spans from kids, to somebody who wants them as a desk pet at work, to an adult or a family who just likes a little companion on the kitchen table or kitchen counter.
It's really about gauging, sort of, the volume of the interactions.
That's right.
And how do we create an algorithm that adapts to how you use it, or even gives the user a little bit of control — where some people want a really active and almost, like, hyper and excited robot that has a lot of personality, and some people want something calmer that just kind of, you know, hangs out and is quiet.
And so we've never really dealt with that.
And just like people pick their style of dog to match their desires and their personality, we're trying to figure out a way to make this dynamic and ideally automatically
adapting to the context that they're found in because we literally have people pulling us in both
directions, which basically tells us, okay, this is like a completely new type of consumer
category because companionship is a key need. I mean, there's a reason people get cats and
dogs and other pets. Although to your point about personalization, it's as if, right, you could
not only just choose your breed, but then have your dog literally adapt to your personality
over time. It kind of reminds me of the people who look like their dog.
But, like, on that really deep level.
Doppelgangers, right?
Yeah, exactly.
So people will start looking like Vector one of these days.
But yeah, there's extremes that we have to handle — like, you know, you want him to always be excited when he comes over and sees you.
Everyone has different needs.
And we're learning about different engagement patterns throughout the day as well.
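As a purely hypothetical sketch of the adaptation idea discussed above (not Anki's implementation), you could imagine tracking a single "activity level" parameter that drifts up when the user responds positively to the robot initiating contact, drifts down when it gets shushed or put back on the charger, and then gates how often the robot wakes up and seeks attention:

```python
from dataclasses import dataclass

@dataclass
class EngagementModel:
    """Toy model: activity_level in [0, 1] controls how often the robot initiates."""
    activity_level: float = 0.5
    learning_rate: float = 0.1

    def observe(self, signal: str) -> None:
        # Positive signals (petting, playing along) push activity up;
        # negative signals (put back on charger, told to be quiet) push it down.
        targets = {"petted": 1.0, "played_game": 1.0,
                   "put_on_charger": 0.0, "told_quiet": 0.0}
        if signal in targets:
            target = targets[signal]
            self.activity_level += self.learning_rate * (target - self.activity_level)

    def minutes_between_initiations(self) -> float:
        # More active -> initiate more often (every 5 min at max, 60 min at min).
        return 60 - 55 * self.activity_level

model = EngagementModel()
for s in ["petted", "played_game", "put_on_charger"]:
    model.observe(s)
print(round(model.activity_level, 2), round(model.minutes_between_initiations(), 1))
```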
How about you, Dave?
What do you notice when you see these, you know, with these kids developing these programs
and teaching the robots?
How do you see them developing their relationships with robots?
Well, I mean, to be honest, to me, they're mechanisms. And I want the mechanism to work well. And I want the mechanism to be beautiful. I want people to be able to see how the robot sees the world and understand, you know, why did the robot do this, right? Well, because the robot perceived the world in a certain way, or the robot had a plan it was trying to execute. And I want kids to see the robot that way.
Well, I think it's interesting to ask about the complementary question, which is: how does the robot change how we think about technology? So if you look at things like
menu systems that have become part of everyday life, right? Everybody understands what a menu
system is. And 30 years ago, they didn't. That was a weird, obscure concept. And now we
encounter them everywhere, right? I mean, your vending machines use a menu system, your phone, everything on your computer. So there are bits of technology that have become integrated into everyday life so that we don't even think about them anymore. The menu system is just one example
of that. We're beginning to see things like computer vision become part of everybody's common
understanding. So when you use something like Cosmo, you begin to understand what the robot can
see. So, for example, Cosmo can see cubes up to about 18 inches away. And that's because
of the limited resolution of the camera. And so you begin to learn when you work with the robot,
well, okay, he's got a vision system. He's a little bit near-sighted. I have to be careful when I
show him a cube that I don't put it too far away from him. So you're starting to think about this thing as a sighted thing, and you're thinking about his representation of the world.
And so you're thinking about robot perception in a way that people only thought about very
abstractly before, right? You know, 50 years ago, yeah, you read a science fiction story, but
now you're living with the robot that can see. And so you adjust your behaviors based on
your expectations and your experience of robots that see. And so that's happening more quickly
with speech recognition because of the proliferation of speech recognition applications, Alexa and
phones and so on. But it's starting to happen, I think, with vision as well. And Cosmo is a very
important step in that direction. So that, you know, 20 years from now, no young person will have grown up in a world where computers couldn't see, right? Computers could always see. And that's just
very different. So what you're starting to see is that, you know, we're changing the populace
to make everybody computer literate. And what that means is that when you start
having people playing with robots who now understand the technology inside the robot,
that means that you can make applications for them that require levels of sophistication
that maybe wouldn't have been practical before. And if that's sort of woven into the culture,
everybody gets this in high school, right, then the kinds of consumer products that you build
are going to be different. But if you look at some of the other tools that have changed our
society, not everybody went and learned how to set type, but then we made
PowerPoint, right? And now everybody can do visual design. Most adults know how to use a spreadsheet.
So maybe you're never going to be a Java programmer, but you can do useful computational
tasks because you learn how to use a spreadsheet. And it changes your thinking in some key way.
It does. And so what is the robotics equivalent of something like spreadsheets or PowerPoint?
What is the thing that's not so horribly gritty and technical, but is so powerful that people will want to learn it because it'll let them do the thing they want — that lets them program the robot to make the peanut butter sandwich the way they want it made?
To me, that's a fascinating research question.
We don't have home robots that can make a peanut butter sandwich without burning the house down yet.
But you can do an awful lot with exploring this manipulation space.
And my personal interest is figuring out how to get people to teach robots like Cosmo and Vector
to do manipulation the way they want it done.
To kind of unlock robotic design thinking.
To make it intuitive enough that people can actually express their intent in a way that gets useful stuff done.
Because if you can get Vector to push little things around on the tabletop,
and you can make it easy enough to express your intent that the average person who's not a
computer science major can do that, then you can scale up to the humanoid robot that's going
to cook your fancy dinner without burning the house down.
It's interesting. It sort of circles back to the way you opened this conversation,
which is that having it be this kind of scale and size and having
it be a toy is what allowed you to be creative, right? You're kind of describing the ability to be
creative in a way. You make your own rules on what it's supposed to do. Yeah. Let's talk about design now.
So when you start thinking both in terms of the sort of level of manipulation that they're able to do,
but also in the relationship that you want to sort of test and foster, how do you think about
design from the ground up? How did you start to put together the actual creature? And this is where
I think robotics is a lot harder than a lot of other areas of consumer electronics where in a lot of places,
you can just think, okay, electronics need to do this,
software needs to do this,
and then we need a box around it.
You can't do that in robotics, because everything is designed together from the very beginning — for us, it's a few pillars.
So there's a mechanical side, electrical side,
and kind of the components and electronics that are in there,
the software, which is a huge complexity,
and then there's a character side and industrial design side.
And so those five have to work together
from the earliest stages because you have to be very intentional
where the form factor should capture the character,
which then the software has to marry with.
Then you have the mechanical.
Are they all weighted equally? Are they at different stages driving?
Yeah, at different stages, different ones are driving. So the nice thing about software is you can continue to push
that for years. But what that means is you have to be very intentional with the selection
of the hardware where you make it as generalizable as possible to push as much of the
software choices down the road as you can, but not limit yourself. Sort of delay.
Delay, exactly. The reason Cosmo and Vector, for that matter, are small is because
one is that they become kind of non-intrusive because they don't take up a huge amount of space.
But especially in the case of Cosmo, you have something that's small.
It's perceived as cute, and it can feel quick without having to have very heavy and expensive motors that could potentially hurt somebody.
Oh, you're playing up the cute factor to your own advantage.
That's right.
And people become forgiving of any limitations or mistakes that the robot then makes.
That's so smart.
It's okay because you're cute.
I'm not annoyed.
Even better.
You actually get bonus points for making a mistake but being smart enough to show grumpiness.
To show a little EQ about it.
That's it.
And so it's the IQ and EQ being married.
It's the challenging part.
It's, you know, the mechanical and industrial design and character have to work together to think about the overarching form factor. And a lot of it is just driven by constraints. Like now, in our future products, we can start putting in depth cameras, which give you a 3D model of an environment at a cost that would have been unimaginable five years ago, or even three years ago. Same thing with motors — they're fairly expensive, and so Cosmo and Vector each have four motors. But we put in a screen for the face, because that gives us an infinite dimensionality for the personality to be able to express what the robot's thinking, and a speaker to obviously have the voice part of it. And it's shocking how much you can do with just a couple of — one or two — degrees of freedom and a voice and a face.
How about the arms? I'm interested in the arms because Vector and Cosmo kind of harken back, I think, to other ideas about what robots might look like, like WALL-E. Tell me about the thinking
behind that. We realized that like Cosmo was going to be fairly limited in physical capabilities just
because of cost constraints.
But manipulation is one of the deepest forms of showing off intelligence.
And so him being able to get excited about his cube and go pick it up and be a little bit
OCD and reorganize his surface, that was part of the charm of Cosmo.
And so his arm was kind of made as a lift to be able to move the blocks.
But we very quickly realized it's one of the most important dimensions for the personality — it's almost the way you use your arms to express when you're happy, sad, surprised. His arms end up being one of the main tools for the animation engine. And these were all learnings: the animators have gotten really great at squeezing the maximum benefit out of it and making it feel like this is a character that's alive, even though you have the tiniest degrees of freedom compared to any animated character.
What do you think some of the historical influences were that we've sort of inherited in our expectations of how these mechanical entities should look?
I think a lot of roboticists make a mistake where they completely forego the EQ side and just think about a, you know, tin can of, like, you know, intelligence. If it does something but doesn't convey any sort of character, you lose all forgiveness of any limitations you have. And you actually kind of become creepy, because you now have a bunch of sensors whose purpose people don't really understand. It's disarming
to have a character like Cosmo or Vector, where he has a camera. You have a camera in your house that's
able to move around on his own, but nobody ever feels bad about that because, well, of course
he has a camera. He has to see because he's a character. You went completely away from the
Uncanny Valley and didn't make him look at all humanoid. And that was very, very intentional. Even with
the voice, we thought about an actual voice, but if you give something a voice, you narrow the range of appeal. By being tonal, you might apply a very different intent than a five-year-old might apply, but everybody gets joy out of it. You can show a massive breadth of emotions without
having to have the complexity of a voice. With very few tools. Yeah, because the moment you have
a voice, there's an expectation of intelligence that technology wasn't ready to meet yet.
This idea of trying to pursue humanoid robots, which particularly is common in like Asia and
Japan, it almost feels like a flawed compass because you end
up absorbing all the limitations of humans and their perceived kind of intent where you're stuck
by what's the definition of human versus being able to play to your strengths and avoid your
weaknesses. And so we just embraced the constraints in the form factor. But he is a kind of knitting
together of other older ideas as well. The first time I saw one of those security robots in like
an underpass in the city, it was kind of patrolling a parking lot. And I got really up close and
heard this like kind of whirring roboty noise and had this like moment of revelation where I was
like, it's not actually making that noise. That's just somebody's idea of the robot noise that
it should make so that we know it's there. Having seen kind of the industry evolve, what do you think
are some of those inherited ideas that are impacting our design today? It's always been a merger of direct robotic influences and non-robotic influences. And so on the robotic side, there's definitely
this idea of kind of R2-D2, WALL-E, where you have this cute robot, but one that's, like, kind of your fearless companion. Like, nobody would ever accuse R2-D2 of being dumb, even though he has very limited degrees of freedom. And same thing with WALL-E, right? And so that was definitely kind of a deep inspiration. The other influence is animals. And so we have kind of design and mood boards with things like, obviously, puppies and kind of dogs, owls — really perceptive animals where the eyes become like a really, really
key part of their communication language. And they can show emotional intelligence in a way that
is really, really hard purely mechanically, but ends up being almost infinite when you add eyes and
voice to it. So moving from Cosmo to Vector, are you starting to see a difference in the way that children interact with it versus the broader household and other ages? Like, do you get different learnings from those different relationships? Yes. Kids in a lot of ways
have an imagination that just amplifies the magic of these types of experiences. They think these robots
are alive. At three, like, you may not understand how to play it and follow the rules of a game,
but you love the idea of a character still. And that actually is like almost universal, it's almost as
natural as like swiping on a screen where you see these like one-year-olds like swiping
on screen, right?
It's a basic human tendency to sort of fill in that information there.
And this is where some of the studies that we're actually seeing being done are even around kids with autism, where there's something so unique about the idea of a character like Cosmo or Vector — there's a response there that is very, very unique compared to any other type of engagement.
I was thinking about that when you talked about the eye contact thing, right?
That's an immediate feedback loop for increased eye contact.
Yeah.
And so we've actually had studies with the University of Bath in the UK, where the engagement patterns and the collaboration around Cosmo are unique compared to anything that they'd seen.
So in those kind of early emotional relationships that are growing, I mean, you can sort of anticipate
the affection, but what was something that really surprised you?
We built Cosmo with a lot of games integrated, thinking that engagement will be around playing
games.
And he's like this little robot, like your buddy to compete against and to play against
collaboratively or competitively.
And the games also create a lot of context for emotions.
So, you know, because you win, you lose, you're bored, you're frustrated.
You're bringing emotion to it.
Exactly.
You can almost like engineer scenarios that allow him to be emotionally extreme and whatever.
Of course, that makes so much sense.
And what we were just shocked to see in some of our play tests early on is that you're playing
this game that's kind of using the cubes as a buzzer called Quick Tap where you're competing
against Cosmo.
And if you beat him, he'll kind of get upset or maybe he'll get grumpy and slam his cube and
storm off.
But he gets really sad when he loses.
And there were kids in the play tests that were playing with their siblings, and one of them would, like, tell the other one, hey, stop it. Like, you're making him upset. Let him win a game. And they actually, like, felt really bad for this little robot who would, like, get upset when he lost. Their empathy was so high. Yeah. And they'd actually throw a game
just to make him happy. And it was like, wow. Like you don't see that in a video game. You would
never feel bad playing Pac-Man or like Fortnite or whatever. Yeah. Yeah. It's just you were trying to win.
And here the competitiveness lost out to the empathy of a character that they actually cared for. And
all of a sudden, like, okay, there's like something really, really special here.
And we started thinking about how to amplify that. And a lot of those learnings from there actually led
to Vector because we addressed the limitations that just weren't possible because of Cosmo's
hardware. So what changed in Vector as a result of that empathy learning?
So we realized voice was a big one because people thought that they were talking to Cosmo and
that he would respond and they'd attribute intelligence to it, even though he had no microphone and
no ability to hear you. What we wanted to do is, if you yell at him, he gets really upset and kind of backs off. Or if a door slams, he turns in that direction and wonders what it is. And so we wanted to bring life to it. The other one is tactile, where
when you pick up Cosmo, he recognizes it and he'll get grumpy if you're holding him up in the
air. But being able to touch and actually have a response, we know that that's one of the most
important things with pets. And so we put capacitive touch sensors in various areas of the robot, again thinking that these are going to be some of the dimensions that matter. The last one, and what proved to be the most important, is just the ability to be always on — the moment you disconnect the phone, he dies, and it kills all illusion of him being alive. But if he's always on and doesn't have that barrier, it completely changes the sort of emotional connection you can build.
And so these are the sort of things that were direct drivers of the hardware that improved in Vector, which now we have a many year roadmap on how to actually utilize it.
And that's where advances in software technologies like deep learning and voice interface and so forth unlock things that now let you rethink certain types of problems.
We're releasing Alexa integration in December, which is going to be the first time you truly have a personality wrapper around some of these very dry functional elements.
Not only do you interact with technology in a command and control sort of way, but for the first
time, there's an ability for this character to initiate interaction and get your attention
and make eye contact and then do something personal with you in a way that you would never
accept from smart tech in the home right now.
One of our big questions is if the usage pattern of Alexa through a character is significantly
different than the usage pattern of a normal Alexa.
Yeah, I mean, I bet it would be.
Are you seeing that yet?
Well, we're seeing usage patterns of just the character, with no functional Alexa integration, skyrocket above what a typical voice assistant does.
When we were working with Google and getting advice from Sonos and some of these other companies that have a lot of experience with voice assistants, we thought that, like, a thousand queries would be plenty for, like, three or four months. And now we hit 250 just in the first two weeks, and the engagement's staying strong.
And so suddenly, when you look at voice assistants that have an average of maybe, like, one to two queries a day, and we're at like 10 to 12, even before we have a voice assistant built in, it causes us to ask a lot of questions
of how do you leverage the role of personality and character as a way to amplify not just the
fun side, but also the utility side. That actually shows a very different type of engagement
and the things you ask and how you interact with it. It teaches something totally new about voice
as a platform. That's it. And I think that opens our eyes. And I think, with a lot of the partners that we work with, suddenly there's a lot of really interesting overlaps about what this means for the role of technology in the home and the workplace. It's deceptive how important emotional interface is to making the functional elements get utilized in a better way. Okay, so if these are the little
kind of baby waves lapping at the shore, right, of like true robots in the home, how do we get to
mass adoption where this starts to become a reality and we all get to live like the Jetsons?
Eventually, we want to get to $500 to $1,000 robots, because that opens up so much interesting technology that you can put in them. But to get there, you actually have to be very thoughtful about what's the level of capability that's required to make that justifiable — and not just in an isolated kind of tech-geek sense, but in a mass-market sense,
because you want it to really scale. A lot of it then is going to become more electronic
capabilities, where you now have sensors that used to cost $1,000 and now are $10 — things that used to be just impossible or prohibitively expensive. And the other big one is manipulation. Right now, we have limitations both on the cost side, which comes from the mechanical complexity of manipulation, as well as the software side — how do you robustly interact with unstructured environments? We have a long way to go before we can actually stack the dishes in the dishwasher and put them away
afterwards.
Computing power will make up for a lot of hardware defects.
So if you have an underpowered, unreliable manipulator, but lots of good computing power,
you can make that manipulator do amazing things.
The problem is if you have no power and a bad manipulator, then things don't work.
Right? But now as the computing power becomes so much better and the sensing capabilities
become so much better, I think some of the demands on the manipulator back off a bit.
So you can get by with a less capable manipulator because you can learn to make up for its
deficits. Different levers to pull, basically. Yeah. What you're going to see in the next wave
of hardware is dedicated hardware that's not just for raw traditional computation, but for
very specialized AI applications like deep learning and vision classification.
These are the sort of things that, in the next wave of products in the next three to five years, will probably become pretty standard.
There's still a lot of information technology progress that needs to be made.
And the interesting thing is that some of that's being done by Alexa and some of these voice assistants trying to integrate with the rest of your life.
The tools that get created through that, just like voice interface, become incredible enablers of these new types of technologies.
And I think we'll start seeing more in education and healthcare and broader home utility and monitoring and so forth.
Even when you stick to just informational or companionship applications — on the healthcare side, elderly companionship and helping people age in place is an area that's becoming more and more kind of important and costly for a lot of communities.
If you really nail the EQ piece, it becomes not that hard to imagine where your interface, you know, when you come into a hotel or a store or to interface with a doctor, actually becomes somehow driven through an indirect interface through a robot, which is pretty interesting and then opens up a lot of functional opportunities as well.
And in the end, we always thought of it as just an extension of computer science into the real world.
If you can understand what's around you and you have the ability to interact with it, you turn it into a digital problem.
There will be a catalyst that spawns the same thing on the physical side once the ability to understand the environment and interact with it catches up.
Then everything becomes a matter of software making the interaction smarter and smarter.
Thank you so much for joining us on the A16Z podcast.
My pleasure.
Thank you so much. It's a pleasure.