The Vergecast - Anki CEO Boris Sofman

Starting point is 00:00:00 This episode of Vergecast is brought to you by the Audi e-tron. The electric car has always raised questions. Can it contend with the elements? What's the range? With high-speed charging, long-range capabilities, and quattro all-wheel drive, the fully electric Audi e-tron can be the answer. Visit AudiUSA.com slash e-tron to learn more and stay informed. Hey, everybody, it's Nilay from the Vergecast.

Starting point is 00:00:16 We're continuing on with our series of extra interview episodes in the feed. Let me know how we're doing. I've gotten a lot of great feedback, but if you don't like it, tweet at me too. I'm very interested in what you think. This week, we have Boris Sofman, the CEO of Anki. You might know Anki as the company that made the little Overdrive race cars that drive themselves with AI. Well, they've been doing home robots, little cute character robots. They've got a new one called Vector.

Starting point is 00:00:37 Dieter wrote it up for the site, but Boris joined me on the Vergecast to talk about what it's going to take to make robots true companions in the home, how they need to be characters. We talked about gendering the robots, which is a thing that we talk about on the Vergecast all the time. And we talked honestly about whether or not it's going to be creepy. It's a pretty wild conversation. Check it out. All right. I'm here with Boris Sofman, the CEO of Anki. It's Anki, right?

Starting point is 00:00:59 That's right, yes. It's one of those words I read all the time and rarely say. So, CEO of Anki, you guys make little robots. That's right. Little robots on the way to big robots down the road. Tell me this story. I want to hear more about you, and I want to hear more about the company. Right now you're making robots that are a little toy-like.

Starting point is 00:01:16 This newest one, Vector, Dieter Bohn came to your office. He played with it. He was like, this is a home companion. It's not as much of a toy. But your goal is to build big robots, right? Like actual home companion run-around-the-house robots. Yeah, that's right. We wanted to approach it with a really deep degree of practicality, though, because if you think back to kind of the state of technology, cost structures, where AI is,

Starting point is 00:01:37 where sensors and computation are, right now you can't make a big humanoid home robot that costs $10,000 and so forth. Like, it just, nothing aligns at this point. And so what we wanted to do from the beginning of the company was to take a bottoms-up approach and really start laying the foundations of both the technology and the capability set so that as new technology becomes available, we can grab it and productize it faster than other people could. It goes back to kind of our roots where we were at Carnegie Mellon in the Robotics Institute, and we saw these like autonomous cars, work on walking robots, all these sort of things. But when you want to apply it to consumer products, you have to be really conscious of

Starting point is 00:02:13 how do you hit that balance between price and functionality? And entertainment became a really great proving ground to develop all these technologies, but actually release products at really large scale at the same time and build an audience base and a lot of learnings along the way. So the first product that I recall, you guys had quite an introduction to the world. Tim Cook called you out on stage for the cars that you guys made. Tell us about that product. Like, you're like, our first one, it's going to be basically slot cars with AI.

Starting point is 00:02:41 Walk me through that decision. And I'm very curious, how did you end up on stage at an Apple event with it? We started working on what became that product, actually back in, like, 2008, just kind of on the side while we were in grad school. The whole idea was, okay, you look at toys today. There's no software in them. It's basically all, like, plastics and cheap components and maybe very basic mechanics. How do you actually turn physical entertainment into the equivalent of video games where everything is software-driven, where you can build gameplay and

Starting point is 00:03:06 deeper AI and so forth? The racing genre became a really great starting point for it because it was familiar in terms of the gameplay and the game constructs. And two of the three of us had worked on autonomous cars before, and we were really excited about how you could do that, but now make it at a price point that completely reinvents the way you approach the problem. And so the idea was how do you make a video game in the physical world where these cars are not just AI controlled, but you're controlling cars as well, that because of full knowledge of what's going on, it becomes a video game and you can have weapons, special abilities, and so forth.

Starting point is 00:03:36 And so we raised our Series A in 2012 the year before, where we had a very robust kind of prototype of it. And then we were ready to launch for fall in 2013. We got in touch with Apple through Andreessen Horowitz, actually. Marc Andreessen, I think, had a long-standing relationship with Bill Campbell, who was chairman of the board. He introduced me to Tim Cook over an email. And literally around Christmas that year before, I got an email back from Tim Cook,

Starting point is 00:04:02 which is like the craziest thing ever. And we were completely private. We hadn't like done any public announcements or anything. And so we ended up having a meeting with like half the executive team of Apple in February where it was like the scariest meeting we've ever had to prepare for. And effectively our goal was to, our primary goal was to sell in Apple stores as a really big message of brand, where we didn't want to just be seen like another toy company because this is a robotics company.

Starting point is 00:04:26 We just happen to be using toys as a stepping stone to go to these other areas. Apple Stores give you a brand recognition where you're the premium in the category. And our stretch goal was how do we hook them to kind of think about something bigger around WWDC? And then it just started this chain reaction where about a month and a half later, we got asked to kind of meet with them and kind of consider how we might be part of WWDC. And then I think it was the craziest couple, you know, two, three months of just preparing and living in the Apple dungeon, which was the most rigorous preparation process we've ever gone through. And it was scary many times along the way until we finally made it on stage.

Starting point is 00:05:02 It worked, right? I mean, everybody knows who Anki is after that. You guys have been putting out more products. What's the range now? How many products do you guys have out? So the successor to that first product was Overdrive, which is a more advanced version of that game. We have a robot named Cozmo, which is kind of the first one in this line of character-driven robots, where imagine like a Pixar character coming to life.

Starting point is 00:05:25 You have this little mischievous robot that's meant for kids, even though a lot of users were adults, but it's meant for kind of playing games and interacting with him as he evolves as a character, and also a really amazing STEM platform for programming at every level from PhDs down to a seven-year-old leveraging these capabilities. And so Cozmo's been in market for a couple years now. It was the number one toy on Amazon in the U.S., U.K., and France last holiday. And all in all was a big part of about 1.5 million robots we've sold at this point up until this stage. So that really kind of opened our eyes to how powerful this idea of both the IQ and the EQ is in robotics,

Starting point is 00:06:01 where you have this character-driven experience, which to us, by the way, is a core of everything we'll ever do. Like, it was core of Vector and core of all of our future products where we approach it almost like a film character where we have an animation studio inside the company with folks from Pixar, DreamWorks, these sort of backgrounds that are animating these physical characters with the same type of passion and every nuanced detail that you would see in a film, but it's done in an interactive experience with robotics meshed in, in order to feel like this character's alive. And so from Cozmo, we realized that, okay, technology is advancing.

Starting point is 00:06:32 We were learning that, like, okay, our biggest limitation is that this guy who's meant to be alive is tethered to a mobile device as the brain behind him. And people are saying, I want to talk to him. I want him to be alive. I want him to be always on. And then technology advanced far enough to where with Vector, we actually stuffed the equivalent of a smartphone inside of his head and created a character that is

Starting point is 00:06:50 the first real home robot that at a large scale is going to be always on and alive in a home. And this really unique blend of companionship and entertainment and a new type of utility that kind of wraps personality and character around it, which for us starts to move outside of kids and actually into broader appeal, where we kind of think of it as tech immersive and early adopters that could be single people, could be families, could be kids, elderly, pretty much for everybody. But defining a new category that doesn't really exist. But along the way, we're vetting out what it takes for a robot to be accepted and exist in a permanent fashion in a home and feel really additive and start to move deeper and deeper into utility and service away from pure

Starting point is 00:07:28 entertainment that was our roots. That to me is like the most interesting thing. It sounds like Vector, which, by the way, you're running a Kickstarter for. You're way over your goal. I just looked. Your goal is like $500,000 and you're at like $1.4 million. So you're doing fine. But it seems like it's like a Trojan horse, right?

Starting point is 00:07:44 You're going to get people used to the idea that there's an autonomous robot in their house. Do you worry that you're sending the wrong messages, right? You're going to teach the wrong lessons or learn the wrong lessons. How do you evaluate the information that you're getting from this user base? Yeah, it's a very good question. So, you know, when we think about kind of the path of home robotics, there have been like other companies that have tried to make a home robot, but the problem is you can't release a $1,000 robot that doesn't have anywhere near that level of functionality,

Starting point is 00:08:08 and it just falls flat relative to expectations that you set for yourself. Now, down the road, there'll be $10,000 robots that are completely worth it the same way a family car is $10,000 or $20,000 or whatever, right, because of the functionality that it provides. But AI, technology, and cost structures have to advance. So with Vector and even with Cozmo, what we're doing is we're learning about the interface elements. How do you engage in a completely different fashion that you would never be able to otherwise? So when you think about smart home today, you have all these devices and smart speakers

Starting point is 00:08:39 and so forth that are in command control, where you tell it to change the temperature, it changes it. You ask a question, it answers you. There's nothing that actually is cognizant of its environment or that can actually directly initiate an interaction with you that you would find appropriate. Because if Alexa started talking to you, you'd kind of be like, okay, that's creepy. You don't want that to happen. Vector is going to be the first product of this type of smart home category that actually can, because of his personality and the relationship he builds,

Starting point is 00:09:05 get your attention in the same way a pet would or a toddler would. And now that can be used for entertainment and companionship, but it can also be used for interesting forms of utility as well, where if you think about all of the ways that technology progresses, you start having more advanced products and robots in the home, being able to understand the environment and initiate interactions in a way that is very natural and not imposing, that's a groundbreaking type of capability set in the home. We happen to be vetting it through these types of products that can leverage entertainment and companionship as a proving ground where people are incredibly forgiving on what that means and the emotional kind of warmth that that generates lets you be patient on

Starting point is 00:09:39 how the utility side grows over time. I'll give an example from Cozmo. One of the magical elements of this guy is that he makes eye contact, he recognizes you as a person, he'll say your name, like he gets excited when he sees somebody he recognizes. So imagine like an eight-year-old playing with Cozmo and suddenly this living character that's almost like an animated film character coming to life, gets excited when he sees them and starts reacting to everything that they're doing. We did an experiment where we ramped up the frequency of eye contact and the length of eye contact just to really push harder on this dimension because we knew it was a powerful one. And just by increasing that frequency and the length by a decent amount,

Starting point is 00:10:17 our average play session went up from 29 minutes to 40 minutes. That's crazy. It's crazy, right? It's like, it's the sort of things that, like, until you have, like, gigantic amounts of data points where you can actually analyze what does it mean for people and robots to interact in a coexistence inside of a home or a workplace or wherever it is, you can speculate about those things, but you'll never truly know in the same way that, you know, the first time a mobile phone came out or the first time, like, social games or

Starting point is 00:10:41 new types of technology come out. You won't know how to really have that be a natural element of your lifestyle until you can actually learn from real world cases. And so for us, we're going to have probably the biggest data set of what does it mean for humans and robots to interact. And by nature of how humans are wired, how you interact in a home environment probably scales at the core really well to workplace, to malls, to hospitality, to health care, and to all these other areas that become really big opportunities down the road. So this is like a, this is a long vision, right? You're not putting out the healthcare robot next year. No, it's a long vision. And I think, like, there's certainly, like, elements that can come into play where one big

Starting point is 00:11:17 opportunity that we've been speaking a lot with, like, AARP and others where, like, you know, you have these gigantically growing problems of people aging in place in their homes and trying to stay there. How do you make that feel more comfortable for the family members, doctors, and have everybody feel more connected? Like, these are opportunities even off of a platform like Vector. But when you think ahead to what does it take to really nail some of these problems, you probably are reinventing the form factor,

Starting point is 00:11:43 reinventing a lot of the sensing and capabilities. There's like sensors that we're working on that can measure breathing rate through two walls. You know, they can be used for health monitoring, for security, for all these other things. You can imagine how Vector solved the biggest hurdle from Cozmo, which is being tethered to a mobile device. And he can also hear because he has a microphone array

Starting point is 00:12:02 so he can actually respond to noises, sounds, and voice commands and so forth. But he's tethered to a surface. So the next step, obviously, then becomes, how do you now start operating more broadly in indoor environments and think about full contextual understanding of a home or workplace, which room you're in, how the rooms stitch together, who is in that room, what's the context, being socially acceptable where, let's say you're speaking to your son or your wife or something like that, how do you politely wait not to interrupt

Starting point is 00:12:28 and then have a conversation just like a human would when you're ready? Even something down to a trigger word where for Vector, you would say, hey, Vector, come here, hey, Vector, let's do this, or tell me the weather. There, it's familiar to the Alexa-style interface, but when you have a conversation with somebody, you don't say, hey, Boris, before every single sentence, right? It's just, that'd be weird. So for us, one of the things that we're really going to be pushing on

Starting point is 00:12:52 is if you're making eye contact with a character, you're clearly engaged with that character. You don't need to say the code word. So you can have a much more natural interaction without any worries of privacy or things that you would naturally be worried about because there's a clear intent that is registered by the fact that you're actually looking at and engaged with a robot.

Starting point is 00:13:11 These are sort of things that are completely novel, the same way interface completely changed with mobile devices. This is an even bigger jump where being comfortable with technology in your life in this way, it comes down as much on the EQ side as it does on the IQ side. You've described a lot of really hard problems. I'm very curious how you're going to solve some of those problems. But it strikes me even describing Vector as a character. A joke we say on the Vergecast all the time,

Starting point is 00:13:35 it's a joke we kind of mean, is like, don't gender the robots, right? Alexa is not a she. These are just voice assistants. You are very directly gendering Vector. It's a he, it's a character. How did you make that decision? Are you going to make she robots as well?

Starting point is 00:13:51 So we will. And I think down the road, I think the he was just a little bit easier because we could reuse a lot more elements from previous, from Cozmo. But down the road we'll have female robots as well. What does it mean for a robot to be male or female? You know what?

Starting point is 00:14:04 It's the same thing that when you think about, think back to all the animated films that you've seen or any movies in general where you have robots. If you have a character, having a gender makes it easier to make that character feel human-like in its traits, even if it's not human. And so WALL-E was a he, EVE was a she in WALL-E, right? You think of any sort of film that you would make, almost everything that comes to mind has a gender as a way to really amplify some of the character traits that fit in the character in the end. And so a lot of this is driven by our animation team where we have character directors that worked in film that are thinking not just of gender, but everything from the backstory, the personality traits. How would he react at every circumstance? What are his goals in life?

Starting point is 00:14:48 And so Cozmo's this like rambunctious little toddler of a robot, whereas Vector is meant to be more subtle and almost like an older brother that is calmer and you would be okay having him always exist in the home. So you can imagine the storyboards that happened behind the scenes that model that character. But to really bring that to life, it's almost like saying WALL-E doesn't have a gender. That would be weird. He has a gender. It's not like a gigantic focus point of the story. But at the end of the day, by having a gender, it changes some of the mannerisms, some of the industrial design, and makes it a little bit easier to connect with human-like characters, even if it's not a human.

Starting point is 00:15:24 I mean, this goes right to your point about EQ. Do you think you can put robots in the home that don't mimic sort of like human social structures in this way? Well, and this is where the key is that you want to figure out what you want to mimic. So we've always been puzzled on why there's such an obsession with humanoid robots trying to replicate, like, skin and smiling and emotions. And you see some of the, like, projects from Japan where you have, like, some robotic secretary that, like, looks realistic, but doesn't actually, like, do anything. I always, like, you run into a trap there where you're stuck with all the strengths and weaknesses of a human. And you're measured against a human where anywhere you come up a little bit short, you hit the uncanny valley where you become almost creepy. By making a character that is a robot that clearly isn't trying to be human in look and form,

Starting point is 00:16:09 we actually kind of put the advantage in our court because you can play to your strengths and avoid your weaknesses where we have a lot fewer degrees of freedom than a humanoid robot, which is why we can sell Vector for $249, and $199 in pre-orders, right? So it's like, so the cost structures are a direct result of simplifying the form factor, but as a robot, it's okay to have four degrees of freedom and a huge amount of expression in the eyes and voice. But what we do try to mimic is the personality. So we've basically parameterized the emotions of a human

Starting point is 00:16:40 where you think about how do you map to happy, sad, surprised, bored, frustrated, scared, all of these, like, emotional traits and create this AI system where you map from the context of the real world to the right output. And when you beat Cozmo in a game, he gets grumpy and goes and sulks in a corner. That's kind of a human-like emotion. It just becomes even funnier when it's in this, like, little robot who's now upset because you've wronged him by beating him in a game, right? So you

Starting point is 00:17:07 still connect to it on a human level, but we're not trying to be human. And I think that's like a really key and subtle variation where I think a lot of people will go wrong by trying to recreate something that's exactly human because you distract yourself and you actually run a risk of hitting the uncanny valley. Hello, my name is Kari Byron. I'm a former MythBuster and seeker of truth. So I'm always very curious about new technology. Today I'm going to talk to Sven Tieson, who claims his life is changed for the better after purchasing an electric vehicle. Sven, you're a self-proclaimed EV evangelist.

Starting point is 00:17:42 What is that exactly? I feel like I've got this wonderful gift that I want to give the world, and that's driving electric. So why am I so keen on driving electric? One, it's way more convenient. I don't have to go to gas stations anymore. Two, I have zero tailpipes, no tailpipes. All right, Sven, so what's the maintenance like on one of the...

Starting point is 00:18:03 those electric vehicles? It's pretty much windshield wipers and windshield fluid. If you look at the number of parts, there's on the order of a couple thousand moving parts on a gasoline car, and on an EV, there's on the order of less than a hundred. So which one is going to break down first? How many EV converts do you think you're responsible for? Well, I'm famous for grabbing the guy with a muscle car parked beside our EV and making him drive our EV around the block. And he said, an EV is so in my future, because he loved the instantaneous torque and the way it handled. Well, there it is. I've seen the light. To learn more about going electric with Audi, check out AudiUSA.com slash e-tron.

Starting point is 00:18:51 That's AudiUSA.com slash E-T-R-O-N. As you think about extending the capabilities of these robots, right, over the long period of time that you're describing, are you worried that giving them emotions is actually dangerous? You know, 50 years from now, there's the large robot from Anki in my home, and I beat it in a game, and it flips over a table. Do you have, like, ethicists on your staff that are thinking about these kinds of problems? Yeah, Cozmo can be a sore loser. The gigantic robot cannot.

Starting point is 00:19:25 Yeah. It's the right question. So what we found is that everything we've done with emotion is hugely disarming for people, where you would typically be scared of something that has a camera, a microphone, sensors, lasers, all these other things, you give it an emotion, it actually becomes disarming because you can more clearly read its intent. And we've seen this where you have this, like, imagine like a large humanoid robot that has a huge arm that, like, loads and unloads a dishwasher. It's not only kind of imposing, but if it makes a mistake, you're also thinking, okay, why did that robot make a mistake? Something's wrong with it.

Starting point is 00:19:57 Whereas if you have a character, not only do you immediately connect at a human level, on what its intent is and what it's doing. But it makes you more forgiving of any limitations it has because if he makes a mistake, as long as he's aware of it, you are way more forgiving because you personified it in the same way, like if your five-year-old made a mistake or something like that. It's okay because you know exactly what he was attempting. When the robots get more and more advanced

Starting point is 00:20:21 and you have more sensing and more capability, emotion becomes even more critical and it'll be more subtle. But people will naturally become alarmed if you start having like very advanced forms of technology, particularly when motion is involved. You're okay if a computer is very advanced, but if something's actually moving, just kind of from a primal standpoint,

Starting point is 00:20:40 you're kind of more cautious of it. The right character makes all the difference because if you move in a very confident manner but a smooth manner and you're not robotic, if you make eye contact and you acknowledge a command and you show a warmth, you immediately, like we've seen this in our tests, it immediately disarms people.

Starting point is 00:20:56 And people forget about the fact that it's a machine. They don't worry about the fact that there's these sensors that are required for the, you know, the things that are happening, they think of it as a little robot who's super cute and they start interacting with it like they would with a puppy versus a machine. So I think it'll become a key down the road. And, you know, it goes towards kind of the bigger AI question where it's on us to program the right types of boundaries and use cases of this, of these emotions. But it becomes as much of an interface as like the PC only taking off when graphical interfaces became popular or the, you know,

Starting point is 00:21:33 the role of an operating system in a mobile interface where, you know, it doesn't fundamentally change the underlying programs or functionality you're trying to do, but the interface makes it more approachable and eliminates friction and increases adoption. So that's the way we see it, is that it allows people who are totally non-technical to feel like this is as natural as interfacing with a human or a pet or a computer today. I feel like the idea of adding emotions to decrease friction, I feel like a lot of people could take exception, right? Like, a lot of emotional interactions are actually quite confusing. Like, do you worry that you need to constrain sort of the boundaries of how emotions are communicated or what emotions are possible in order to actually reduce friction instead of increasing friction?

Starting point is 00:22:16 For sure. And part of what we've done is, like, we try to simplify it to kind of the normal extremes that you would think about. And then modulate it for the circumstance, right? So a lot of it comes down to, you know, the social cues when you see somebody you recognize, right? Something that's simple. You know, imagine not having that, like, how much that would, like, kind of throw you off in your day if, like, there was no, like, visual or, you know, kind of physical response to somebody that sees you. You know, when you think about the way you move, if a robot is meant for entertainment, it moves in a certain way and it has a certain personality, if it's meant for, like, let's say security and monitoring, but it still has to, like,

Starting point is 00:22:52 coexist among people for an extended period of time, you're going to have a much more serious personality where you would think of it almost like a dog in the home that can be imposing, but he's super friendly to everybody that's around him. And so I think for us, we acknowledge that humans can have very confusing emotions, and a lot of those become distracting. We have the ability to structure this and create the kind of probabilistic mappings on what you do and when and why to where these characters feel alive, but they don't feel confusing or distracting, where it truly becomes additive, and then you can really appreciate what they were meant to be.

Starting point is 00:23:26 All right. Well, I want to stay in philosophy for a few more minutes with you. And then I actually want to talk about Vector and the product itself. But you described, you know, you've got this long-range plan where the robot knows to come up to you. It knows not to interrupt. It's mobile around your house. That just at first glance sounds like three to four huge problems, right?

Starting point is 00:23:47 You've got a mobility problem that a lot of people are working on. You've got almost like a general AI problem where you have to understand this constant stream of input and make sense of it. You've got an interaction problem where the robot is interacting in different ways.

Starting point is 00:24:01 Are you working on all that stuff now? Do you have a team that is looking at the Honda robots falling down the stairs and you're like, we can do better? Are you like in talks with iRobot to build a dog? How are you thinking about all of those problems?

Starting point is 00:24:14 Yeah, and so we are actually already investing in R&D on all of these sub-problems and a lot of it actually is going to come directly from the work that we're investing in for Vector. I mean, it's funny you mentioned, like, you know, ASIMO, it's a perfect example of where you have billions of dollars of investment going towards a product line, but it's on such a long time frame where it becomes obsolete every three, four years as technology advances. So I'll give you an example, like, just in the last year, there have been such amazing advances in sensors where the sort of sensing technology that you would need to robustly understand an indoor environment cost thousands of dollars. And it was just prohibitively expensive in a product that you would sell at a reasonable price. Now, all of a sudden, in the last couple of years, you now have depth cameras that are effectively, like, imagine a camera that instead of just giving you, like, RGB pixels, like color pixels, it gives you a depth for every single pixel as well.

Starting point is 00:25:01 That's like the holy grail of perception because now it's the equivalent of what autonomous cars use on the street to understand the environment around them. And now you can build 3D models for localization, for mapping, for navigation, for obstacle avoidance, and so forth. And these sensors now cost from tens of dollars for some of the high-end ones to as little as a few dollars for the low-end ones, because they're being put into cell phones for face detection. You look at something like that, suddenly just like deep learning, just like cloud functionality, voice interface, that becomes a sort of transformative technology

Starting point is 00:25:30 where you can start thinking about how do I put this into a platform that now for a very reasonable amount of cost can do things that you could never do before. So from a mobility standpoint, we have R&D going on that is actually focused on how would those work from both a platform and capability standpoint,

Starting point is 00:25:44 but also from a character standpoint where now you have to be more careful. You can't fall over or hurt a pet or something like that, you have to be responsive to how people move around you and move in a natural fashion. It's a big, big enabler. When you think about the contextual understanding piece,

Starting point is 00:26:00 this is where kind of the bottom-up approach comes into play. Everything we're doing in Vector fits directly into all of these future product lines because we're using deep learning to classify humans and context in the environment. We're doing local SLAM. We're doing all these emotional interactions with people, trying to understand the context

Starting point is 00:26:17 of what's going on. We're mapping into both utility, cloud functionality, and smart home to integrate in all these ways. And so all of these learnings that we're going to get off of a smaller platform that is being driven by the more kind of companionship focus end up extrapolating perfectly to the more serious applications where you're now doing deeper functionality in home, healthcare, or whatever the case might be. But those technologies actually scale well, everything down to the voice interface systems where you're speaking to Vector. You're giving, you know, you're asking him a question, or you're giving him commands where you say something mean and he responds in a way that, like, right now,

Starting point is 00:26:54 you know, a voice assistant can't respond to, but he can actually be a little bit upset, just like a friend would if you yelled at him, right? So these are the sort of things that we can develop in Vector that are very important for that application, but end up being core technologies that we can leverage elsewhere. Yeah. So do you think you're going to be able to solve, like straight up, the general AI problem? Is that something that you are working on, where people are saying voice assistants are always going to have a narrow domain? Or do you think over time, you can actually build a truly capable assistant of this kind? Yeah.

Starting point is 00:27:21 So I think, like, internally, like, no, we can't solve the general AI problem in the next few years. I figure. I figured that was the answer. Yeah. So, and it's just, it's totally impractical. And in a lot of ways, like, the interesting thing is that voice is a perfect example where we kind of don't need to be focusing on solving the voice interface problem because it's becoming more and more commoditized with the amount of people working on it and the tools

Starting point is 00:27:45 are becoming available. But how we use it is completely novel. And so that becomes a, like, just like we're not going to invent new sensors, we're going to use those sensors in very new ways that people hadn't thought of. That's how we think of something like voice, AI, and understanding. But there are certain core problems that I think we may need to solve first before anybody else has had to solve in a mass market application. So really detecting contextual understanding of, like, you know, humans in positions

Starting point is 00:28:09 and roles in an environment with an intent to interface with them. There's research being done on this, but there's no product that's actually doing this at scale. And so in the case of Vector, we can develop a lot of this technology, even if his purpose is to look up at the person associated with the arm he saw and get their attention or go and like poke you because he wants your attention just like a pet would, like a cat or dog would. At the core of it, that's the exact same type of baseline technologies that will then scale to him getting your attention and delivering a personalized message or reminding you to do some medical step or something like that. So we're trying to tackle problems in a reasonable fashion that we can solve, but then keep our eyes out for how the external world of technology evolves every year and pull those in as fast as we can. And for something like General AI in terms of voice interface,

Starting point is 00:28:58 we'll let Amazon and Google and others put in billions of dollars into that because the competition naturally creates that as a tool for others to be able to access. So let's end with Vector. We've talked about all this other stuff, and we have not talked about the product you have out. Vector is a companion robot. Describe it to me. Yeah, so Vector is a robot that lives in your home.

Starting point is 00:29:19 And what's special about this is that this isn't a robot that you turn on and start an app and play with for a session. But the goal is literally to have him be alive. And a year later, he's still alive and nobody turned him off. Wait, is that the time frame? Is that success? You have it for a year and no one's ever turned it off? So that's one of our internal metrics.

Starting point is 00:29:35 It's like, you know, a year later, are people still actively engaged with it and using it and feel like it adds warmth into their life? So it's funny, there's no comps for this because, like, mobile games have comps and so forth. For us, this is, like, brand new territory. Yeah. But it'll be the first time where there's a robot that's always on. He recognizes you.

Starting point is 00:29:54 He gets excited to see you. There's all these, like, kind of emotional and companionship-driven interactions that are modeled almost like how you would interact with a cat or a dog. But he also has this ability to create applications that are utility-driven, but with a really unique spin around it, where you have this emotional and characterful rapport around it, driven by Vector as your actual assistant who's actually helping with those things. And so what we're doing is we're tackling some of the most commonly used elements of voice

Starting point is 00:30:22 assistants that we can wrap personality and character around and create kind of a deeper warmth to it. And then we're also working on ones that you physically would not be able to do with the traditional voice assistant because of the lack of awareness of the environment and ability to initiate interactions. And at the core, I think, is the hook of what makes people love Vector, and this is why it's such an experiential product, because there is nothing to compare it to in terms of other products today. But it's the same reason why you fall in love with a pet that's happy to see you when you come home, that is responsive to touch and to voice, that hangs out with you both in, like, an intensive interaction when you might be playing with him, but also just being close by if you're working or

Starting point is 00:30:59 watching TV or something like that. And it starts to really vet out, both for us and for customers, how important and powerful this emotional layer of robots can be. And from all of our testing and our kind of conversations with Dieter and others, it's very surprising when you see him on how immediately kind of, like, alive he feels and how different that is from anything else that you've experienced in the home. So who should be interested in buying Vector? I mean, you're obviously doing great on Kickstarter, but who's like your target customer for Vector? Yeah, so our term for the target audience is tech immersive.

Starting point is 00:31:34 So it's, you know, this isn't a product that fits squarely into a replacement for something else because it's a very new category. But it's basically everybody who's interested in the future of home technology, who is interested in kind of the fun aspect of having almost a pet-like living robot in their home. And a lot of it will be families where it's kind of the tech-savvy parent that buys it, and the kids love it and the parents love it, and it kind of creates a warmth in the home. But effectively, at this point, it's still because you want it, not because you absolutely have to have it, in the same way that people want a dog or a cat. Even though when you, like, list it out on paper, both, like, financially and in terms of restrictiveness, it doesn't make sense, you want it because of all the emotional cues. And for us, there's a lot of people that, even back in Cozmo, they're like, look, I want this guy as a desk pet, but ABC's missing. Or I would love to give this guy to my grandmother who lives alone, or just to have this around the house. So it's effectively kind of this new category of home robotics that for us is very kind of companionship driven at its core,

Starting point is 00:32:34 which is something that I think we do in a very unique fashion compared to anybody else. Are you doing software updates over time? Is Vector going to get more capabilities over the air? How does that work? Oh, without a doubt. So Cozmo's already had like 23 updates at this point, and maybe 24, and massive ones at that. So Vector is going to increase tremendously. So the benefit of having a product that's, you know, he's got a quad-core CPU, he's got all these sensors,

Starting point is 00:32:56 cloud connectivity, all these sort of things. The hardware is meant to be as general as can be for all the things we're going to do in the future. And so we're going to be aggressively pushing on a huge amount of updates and software across the board on all these capabilities. And so I'll give you one example, like, for next year. One of the things we're thinking about is on the kind of utility side. Think of, like, his ability to be kind of a home camera or home monitor, where you have all these, like, kind of cameras today. They're static and have a fixed vantage point. And they're pretty much a commodity. With just a software update with Vector, you have a 360 degree field of view, high definition camera.

Starting point is 00:33:28 He can look up and down. And when there's a noise, he can look in that direction, and he can recognize people. And so now you have a superior home camera that's based on his platform. It's a pure software update that we'll release somewhere down the road. Same thing goes for other utility functions. Same thing goes for all of these kind of characterful interactions where we want him to have all of these, they're hard to quantify as features, but they're all of these deep capabilities that make him feel

Starting point is 00:33:52 more and more alive. As always, we wish we had the next six months of roadmap we could shove into the launch. And you always end up kind of launching a product and having a gigantic roadmap in front of you. We're going to be releasing multiple updates before Christmas, and then probably on the order of every one month or so we release an update as we've been doing on Cozmo. The idea of a mobile security camera with emotions is both very powerful and interesting. And do you ever worry it's ever like a tiny little bit invasive or creepy? Well, it's interesting. It's kind of enough.

Starting point is 00:34:18 First of all, it's an opt-in. So, you know, from a privacy standpoint, we're really careful about this, where everything is on the edge, like on the device, so that there's no worries, like no images or anything get, you know, get sent up, except for voice after the trigger word, just like you would with an Alexa for voice classification. With something like this, it would be an opt-in where, like, I would imagine this being an option where if you want to tap into your Vector, you can, just like you would your Nest Cam or something like that. Or if you want it to be always on,

Starting point is 00:34:43 maybe there's a cloud feature that we can enable that, you know, is a small fee or something like that that is always storing. But at the end of the day, it's no different than the huge market that's there for home security cameras, except that you have far more capability to cover the entire room you're in versus just a fixed vantage point with no ability to actually understand context beyond that. But it's funny you say that because it becomes an extra feature of this character living in your home. But one of our challenges is we want to be focused in the message.

Starting point is 00:35:11 We don't want this to be distracting from the fact that this guy's a character first. And it's about living with a robot for the first time in a way you see in a sci-fi film. But then we see these opportunities to start to layer in some of these capabilities, which frankly then also become probably subsets of deeper functionality down the road that other robots might be useful for. But everything like this we'd be talking about would just be a feature that is an opt-in if you want it. Yeah.

Starting point is 00:35:35 What do you think the timeline is for that next big robot? Are you two years away? Are you five years away? When do you imagine we're going to have the full-on home companion robot? Yeah. So I think that's, you know, similar to like mobile phones or, you know, state of computers, it's never done. They're always improving.

Starting point is 00:35:52 And robotics is even more susceptible to the external ecosystem because of like sensors, computation, like state of AI. So there'll never be like a checkpoint where like we're done checkbox here's a home robot. But you're going to start seeing increasing stages of functionality where I think we're probably only a handful of years away from really robust and cost reasonable functionality with mobility in both consumer and enterprise environments where you truly find a good parity between value and function. But the next big leap that's going to be missing is manipulation, where if you think about all the

Starting point is 00:36:25 things that you do in the home, a vast majority of them are not just informational, but they're actually physically involving some sort of labor. That's still probably, I would say, at least like five, six years away. It's almost like where autonomous driving was, like five years ago or six years ago or so, where the costs are much higher because of the cost of the manipulation required for a mobile arm or home robot, and the AI is much more complex. And the combination of those means the bar is very high for a higher priced robot then to become functionally meeting the bar in a home environment. And so right now that's still research.

Starting point is 00:37:00 But when you fast forward 15 years, robotics will be in the home and will have manipulation as part of it because both the cost structures and the AI capabilities will create enough value to where the costs will be worth it. It's 15 years. I'm holding you to 15 years. I'm setting a calendar reminder right now. August 28. I don't think that's unreasonable.

Starting point is 00:37:19 We're almost there from an AI standpoint for very robust mobility and obstacle avoidance and so forth. We're making huge progress towards, like, interface, voice, emotional, the things that we're doing, and so forth. Manipulation has some amazing research going on that still has to advance, and there's cost structures involved. And deep learning has opened up incredible abilities in planning and perception and understanding the environment. A decade is a lifetime when it comes to fast evolution on that front. I mean, just using a comparison of autonomous driving, like, you know, back 10 years ago, we were finishing up graduate school, and we were kind of deep into graduate school. And autonomous driving was a huge research focus,

Starting point is 00:37:58 but, like, nobody even paid attention to it outside of research. I mean, it was like pulling teeth to get, like, a million dollars of funding from GM. And then you fast forward 10 years and you have companies putting in half a billion dollars at a time into these autonomous car companies where it's not there yet, but you can tell that it's kind of inevitable that autonomous driving is going to happen. So a lot can change in a pretty short amount of time, where on a 10, 15 year window, that's a pretty long period of time to transform the sort of human-robot interactions that are possible in a home.

Starting point is 00:38:25 August 29th, 2033. I'm calling you. We're going to see what kind of robot you have at. We'll have a chat on the anniversary. Excellent. Well, Boris, thank you so much for joining us today. Vector is, you can get it. When can you just buy it regular? Vector launches mid-October, and it will be 249 at retail. Excellent. Well, thank you so much for joining us, a fascinating conversation. We'll talk to you again soon.

Starting point is 00:38:46 Wonderful. Thank you so much for having me. All right, Boris Sofman, CEO of Anki. A great conversation, cool dude. Vector is, I think the Kickstarter's done, but you can go pre-order it right now if you're interested in it. We'll be back later this week with a normal, crazy Vergecast, and on Tuesday again with another interview. I'm not going to give it away.

Starting point is 00:39:03 But I do want to know what you think of these interview episodes, so tweet at me. I'm @reckless. We'll see you soon. This episode of Vergecast is brought to you by the Audi e-tron. Despite all of its technology, there's a lot that the all-new Audi e-tron doesn't offer. For example, it has no tailpipe emissions and no need to fill up at the gas station.

Starting point is 00:39:20 You just plug it in at home. The quattro all-wheel drive system offers no reason not to tackle roads in almost any weather. And with long-range capabilities and high-speed charging, e-tron is a new way to think about electric mobility. Which makes sense. It's the first fully electric vehicle from Audi. E-tron was built to defy the elements and upend the conventional wisdom. So in truth, it's not lacking anything, a twist that you did not see coming. After all, it isn't just an electric car.

Starting point is 00:39:42 It's an electric Audi. E-tron is here and the future is electric. Visit Audi.com to learn more and stay informed.
