a16z Podcast: The IQ and EQ of Robots
Episode Date: November 21, 2018, with Boris Sofman (@bsofman), Dave Touretzky (@DaveTouretzky), and Hanne Tidnam (@omnivorousread). We're just now beginning to truly see the first 'real' robots in the home, from Roombas to toys to... companions to... well, much more. How are humans beginning to forge relationships with these robotic devices (/entities!) -- and how will those relationships develop? What do we learn as we begin to forge relationships and interact with robotic toys like Cosmo and Vector -- about robots, and about ourselves? And what do these learnings teach us about the possibility of adding a "personality wrapper" to new technologies? In this episode of the a16z Podcast, CEO and cofounder of Anki Boris Sofman, and Research Professor of Computer Science at CMU Dave Touretzky, discuss with a16z's Hanne Tidnam where we are in the human-robotic future, the history of robotics that has brought us here, and the next big breakthroughs -- in hardware, software, perception, navigation, and manipulation -- that will bring in the next waves of innovation for robots.

The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/. Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.
Transcript
The content here is for informational purposes only, should not be taken as legal, business, tax, or investment advice, or be used to evaluate any investment or security, and is not directed at any investors or potential investors in any a16z fund. For more details, please see a16z.com/disclosures.
Hi, and welcome to the a16z podcast. I'm Hanne, and in this episode we talk about robotics in the home becoming a reality. Boris Sofman, CEO and co-founder of Anki, and Dave Touretzky, professor of computer science at CMU, discuss with me the evolution of robotics: where we are in manipulation, perception, navigation, and, most importantly, in the human relationships we will increasingly form with these new robot entities that will be in our homes. So you guys talk about this as sort of the very first real robot companion. Let's go back and look at the first iteration, which was Cosmo. Why start with a toy? What does that represent for where we are now, and ultimately where we're trying to get to in robotics?
Yes.
When we started Anki, we always realized that in order to get into some of these really advanced applications,
we had to take a bottom-up approach.
And we couldn't just skip to this holy grail that's 10 years away.
We had to think about what are the applications where we could really reinvent the quality of what's possible.
Well, what is the holy grail?
So there's all of these types of challenges that involve really deep breakthroughs
in human robot interface or manipulation or AI.
There's the large-scale, diverse, humanoid or like very competent robot for home or for manufacturing or for other things.
There's applications that involve very high-end manipulation in an environment.
Where manipulation is today is probably where autonomous driving was five or six years ago,
where you can already see the capability starting to take form,
but the reliability and cost constraints are still pretty prohibitive.
But you can kind of extrapolate where it's going.
For us, we started with entertainment because that was a way to create innovative applications
and actually build the muscle early on that we can then carry over into these other spaces down
the road. Cosmo was a way to use the toy space, which hadn't seen much innovation at all,
but for the first time bringing a robot to life where the level of creativity that you're allowed
to have in a toy is much higher and you can build a lot of the blueprints for these products
that start to have deeper applications and appeal.
That's interesting that you feel there's a wider range of
creativity there than in something where we already have expectations.
Part of the holy grail is mobile manipulation. There aren't a lot of robots like that. A landmark in the history of
consumer robotics was the Sony Aibo robot dog back in the late 1990s, early 2000s. It cost about
$2,000 back then, which was about the cost of a decent laptop. And this was the only little mobile
robot you could buy that could see. It had a built-in camera. You could program it in C++, and it could move around, so it was a mobile manipulator. Unfortunately, Sony sold fewer than 200,000 of these over the six-year life of the product line. In January 2006, they left the robotics business altogether. And so the idea of a consumer-affordable mobile manipulator just died.
And was that because the technology kind of stalled out? Because there were problems that were not surmountable at that time?
We ended up being able to mass-produce this sort of capability in a way that even five years earlier would have probably cost multiples of that amount, or just been absolutely impossible. We basically became, for our early products, scavengers of the smartphone industry.
So you have computation, you have memory, you have cameras that now suddenly cost, you know,
50 cents versus, you know, $6. You have motors and accelerometers and different types of sensors
that allow you to actually do these sort of things that you couldn't do before.
Right. And also the supply chain. And the supply chain. And so we could in effect turn this
into a software problem, where now, because of the capability set, everything becomes driven
by our ability to expand the capabilities with software.
The really hard question is, how do you go from no manipulation to manipulation in a small step?
So if you look at the most popular robot ever in the history of robotics, it's the Roomba vacuum cleaner, because they found this one task that a robot can do poorly and still make you happy.
Right?
I mean, it doesn't matter if it's not as good as you'd do it yourself, yeah, right?
It's just, it's better than nothing and it's good enough most of the time that that was a success, right?
But there aren't a lot of things like that, right?
So if I ask the robot to cook me dinner and it does the same quality of job as the Roomba does vacuuming my floor...
Right, over the course of like eight hours.
We're pretty far off.
Yeah. So when people think about home applications, right, there's the Roomba that's down on the floor. And then there's Rosie from The Jetsons, right, which is the full humanoid doing everything.
I was going to count how long until a Jetsons reference came up in this podcast.
You got us.
So the problem is what is the thing that you could do that involves manipulation,
but that doesn't require a half million dollar humanoid and technology that doesn't exist
yet?
Exactly.
Right?
And so something that a robot could do with minimal manipulation skills that would actually
be useful as opposed to just occasionally amusing.
What if I was just happy if the robot could make me a sandwich?
And it doesn't have to be big enough to get the stuff out of the fridge?
If I take the stuff out of the fridge and just throw it on the countertop, if the robot could just take it from there, right?
Maybe that would be...
Slowly drag a slice of bread over to a lettuce.
But it wouldn't have to be a full-scale humanoid.
There's got to be something robots could do that would be useful enough that we tolerate them.
So we're talking about this as, in many ways, the sort of first, quote-unquote, real robot personality in the home. Let's break that down.
What are some of the utilities?
We're not washing dishes yet, right?
They're not picking up toys.
We wanted to leverage this unique mix of AI
and cognizance of the environment
and the personal interaction with people in the home
and really think of this as a first mass-market home robot that can actually provide a mix of entertainment and companionship, but with elements of utility.
We modeled this in a lot of ways
after the sort of pets you have in your home.
A pet cat, a pet dog, and a pet robot.
And the pet robot allows us to reinvent the dimensions that are related to how you interact with a pet,
but the companionship elements still hold.
So he gets excited when he sees you in the morning.
He'll get really animated when there's people around him.
He'll explore and kind of wander around his environment, interacting intelligently with things around you.
The relationship.
You can even pet him.
Like literally you can pet a robot for the first time.
But you don't have to clean the litter box.
Yeah.
So you've recently launched a different robot, which is a multi-generational toy.
How is it different?
What is evolving?
From a technical standpoint, there was a huge, huge leap there, because we now de-tethered away from a mobile device and put the equivalent of a tablet inside of this robot's head so that he can be always on and alive 100% of the time, go back to his charger to recharge, wake up, and that allows him to initiate interactions in a way that previously would not have been possible without somebody pulling him out and launching an app. We can use the strengths of the fact that we have cloud connectivity and a robot, and start getting into deeper voice interface — elements of functionality that now leverage the character aspect and the warmth of it, and the fact that he can actually see and understand the environment and recognize you: personal delivery of information
in a way that can be initiated by him. What happens down the road is actually starting to
more intentionally dial into all of the digital pipelines in your life, whether it's a smart
home, your own calendar, and particularly when you start adding a layer of mobility that lets you
actually move around the home. Right now, there's nothing that actually stitches a home together
in that way. Everything is just an end device in a room.
Once you start getting cognizance of what's around you, who's in the home, and being able to move around in it,
you can do a range of things, from security and monitoring to tying together smart home features — like recognizing that your window's open but your AC's on, turning off lights when you're gone, being able to remotely check on things like pets, building a blueprint of your home for furniture shopping or real estate.
We're already thinking about how you start adding mobility, where that becomes the next big barrier. You de-tether it from a phone to get to Vector; now the next big barrier is truly being able to exist in an indoor environment, whether it's a home or workplace, and be much more intelligent about how you interact with people and the sort of service you can provide.
So like, "Vector, go check and see if the cat is out or in."
So with every one of our robots, we've realized that we're basically making an operating
system for robotics, applications, and technologies.
And so we've unlocked APIs for each of our products where people can access very easily
these technologies that have millions of lines of code behind the scenes.
We started with Cosmo, accessible by everybody, from a seven-year-old using a graphical interface to do face detection with a robot, through teenagers, and even PhD students in computer science. Now, instead of having to be an expert in robotics
in order to do something like path planning, you have access to this API of technologies. And we're
really interested in how that can become an enabler for kids of various stages to actually
learn how to break through and become interested in robotics in a way that wouldn't have
been possible earlier. Has what you think about changed at all with Vector? You know, the ways that
this kind of programming has evolved? With Vector, the capabilities just skyrocket because, basically, instead of just the limited functionality that Cosmo had, which was deferred to the mobile device, you now have onboard the ability to see, hear, feel, and compute, with a quad-core CPU that would have been prohibitively expensive before. So just exponentially more options. Right. And so now the sort of programs you can write expand. We've already released to our early customers an SDK that allows them to interface with Vector through Python. And then down the road we'll probably create a Scratch
interface like we did with Cosmo as well.
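To make that concrete, here is a minimal sketch of what a program against the Vector Python SDK can look like. It assumes the published `anki_vector` package, its `Robot` context manager, and its `behavior` interface; exact method names and availability depend on the SDK version you have installed, so treat this as illustrative rather than authoritative.

```python
# Minimal sketch using the anki_vector Python SDK (assumes the package is
# installed and the robot has been paired, e.g. via `python -m anki_vector.configure`).
import anki_vector
from anki_vector.util import degrees

def main():
    # Robot() opens a connection to Vector for the duration of the block.
    with anki_vector.Robot() as robot:
        # Drive off the charger so the robot is free to move.
        robot.behavior.drive_off_charger()
        # Use the built-in text-to-speech to have Vector talk.
        robot.behavior.say_text("I am ready to explore")
        # Turn in place 90 degrees -- a tiny taste of the motion API.
        robot.behavior.turn_in_place(degrees(90))

if __name__ == "__main__":
    main()
```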
So for me, what's most interesting is having the robot interact with physical things.
Physical things are really hard to interact competently with.
We built a three-story robot dollhouse with a working elevator.
And the students in my cognitive robotics class were working on that and they'll work on it again this year.
That's so cool.
Getting the robot to navigate through the dollhouse, use the elevator to move from one floor to another — eventually we want to have multiple robots in there interacting with each other. These are very hard technical problems.
But when you solve them, it's really fascinating to see a robot interact effectively with the world.
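For a sense of the kind of navigation primitives students work with in a class like that, here is a small sketch built on the official Cozmo Python SDK (the `cozmo` package, where the product name is spelled with a z). It only strings together documented motion calls; the distances are made up for illustration, and nothing here implements the dollhouse or elevator logic itself.

```python
# Sketch of simple Cozmo navigation primitives: drive, turn, announce.
# Assumes the official `cozmo` SDK is installed and the phone/USB link is active.
import cozmo
from cozmo.util import degrees, distance_mm, speed_mmps

def explore_room(robot: cozmo.robot.Robot):
    robot.say_text("Starting my patrol").wait_for_completed()
    # Drive a short square path; distances are illustrative only.
    for _ in range(4):
        robot.drive_straight(distance_mm(150), speed_mmps(50)).wait_for_completed()
        robot.turn_in_place(degrees(90)).wait_for_completed()
    robot.say_text("Back where I started").wait_for_completed()

# run_program handles connecting to the robot and running the function.
cozmo.run_program(explore_room)
```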
I think there's just something more about being able to work with a real robot versus something digital on a screen. If you're a kid learning to program, there's something amplifying about that. Just the learning aspect of it — the moment you can intentionally manipulate something, it's a sign of really deep intelligence. And so that's a big focus of Vector's: how do we identify interesting things in the environment and just, like, poke at them, examine them the way that you would? That's a surprisingly hard problem, but when you do it successfully, there's a perceived
intelligence there that just amplifies your appreciation of the robot and everything else that it
might be able to do. One of the articles mentioned recently that Vector feels like a beachhead
into something bigger. And it does seem like, you know, we're getting airdropped in this sort of little taste of, like, the fantasy Jetsons robot, right? Like this personality that actually responds to you and interacts with you. So how are we going to start developing this relationship with this being, this creature? I don't even know what the word is — this robot.
But like, it is an entity.
That's right.
Yeah.
What are some of the learnings of how we're starting to develop relationships?
I think one of the pieces that most people underestimate is how critical this human robot interface challenge is
and how novel and unique of a dynamic this becomes.
Until you experience it, it's hard to understand how that feels and how important it is.
It's the same sort of underlying inherent desire that we have to speak face-to-face with somebody
versus just on a telephone.
We always thought that this would be the magic even behind Cosmo,
but it turned out to be even stronger than we realized.
The way we approached it is we actually have an animation studio inside the company.
Oh, you do?
We do, with folks from Pixar, DreamWorks, and these sorts of backgrounds.
And so they literally are using the Maya software suite that you would use to animate digital film and digital video games.
But we've rigged up a version of Cosmo and Vector and all of our robots in there where they're actually physically animating these characters with the same level of detail that you would see in a movie.
But the output is spliced to where it's a physical character coming to life in the real world.
How fascinating.
And it's this merger where you have these people who come from this world where they're used to controlling a story on rails from start to finish into every minute detail.
But you get thrown into a spontaneous environment with all these constraints and unknowns and limited degrees of freedom in the robot.
But you have the benefit of being physical where everything's amplified in terms of the personal impact of what that does.
So it's a whole new kind of physical manifestation of storytelling.
And they have to partner with the roboticists and the AI team, the systems team, where they have to leverage the knowns of the environment and the intelligence of what's around you — like, "I just recognized somebody I know," or "I almost rolled off the edge and I got scared." And so we have this sophisticated personality engine that takes context from the real world and probabilistically outputs one of these, like, giant short films, if you will, of reactions that end up getting stitched together into a lifelike robot that feels alive.
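As an illustration only: the description above suggests a context-triggered, probabilistic mapping from perceived events to pre-authored animation clips. The sketch below is a hypothetical toy version of that idea, not Anki's actual engine; all the names (events, clip names, weights) are invented for the example.

```python
import random

# Hypothetical personality-engine sketch: each real-world event maps to a set of
# hand-authored animation clips with weights; one is sampled and "played".
ANIMATION_LIBRARY = {
    "saw_known_face":  [("greet_excited", 0.7), ("greet_shy", 0.3)],
    "near_table_edge": [("recoil_scared", 0.8), ("peek_over_edge", 0.2)],
    "lost_game":       [("grumpy_slam_cube", 0.5), ("sad_slump", 0.5)],
}

def react(event: str) -> str:
    """Pick one animation for an event, weighted by the authored probabilities."""
    clips = ANIMATION_LIBRARY.get(event)
    if not clips:
        return "idle_look_around"  # fallback behavior for unknown context
    names, weights = zip(*clips)
    return random.choices(names, weights=weights, k=1)[0]

# Example: stitching reactions together as events arrive from perception.
for ev in ["saw_known_face", "near_table_edge", "lost_game"]:
    print(ev, "->", react(ev))
```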
And we've been really surprised by, for example, eye contact.
We knew that this would be important where you make eye contact, you get excited,
say, hey, Hanne, like, you know, and then you get really, really happy.
Kids just melt with that.
But we didn't realize how powerful that was.
And in one of our tests where we were just optimizing parameters, we ramped up the frequency
and length of eye contact by like a second.
And our average session length went up from 29 minutes to 40 minutes.
Oh my gosh.
So it's like just that one parameter.
And so you get these like really subtle interactions where it's a sort of data that you
can only really learn about at large scale once you're actually fielding these robots.
What are some of the engagement patterns, or the dials that you have learned need to go up or down, when you're watching these human-robot interactions develop?
This is where we're still learning.
So Vector's always on and alive.
So how do you find a balance between him waking up and getting excited and coming out but
not being disruptive and/or annoying?
Chill out.
Go to your back.
Yeah.
And how do you take the cues, where if he gets, like, picked up and put back onto the charger, that's a cue to, like, be quiet, right? And so you want him to interpret that and just kind of grump a little bit, but then settle down, right?
We have an audience that spans from kids, to somebody who wants them as a desk pet at work, to an adult or a family who just likes a little companion on the kitchen table or kitchen counter.
It's really about gauging, sort of, the volume of the interactions.
That's right.
And how do we create an algorithm that adapts to how you use it, or even gives the user a little bit of control — where some people want a really active and almost, like, hyper and excited robot that has a lot of personality, and some people want something calmer that just kind of, you know, hangs out and is quiet.
And so we've never really dealt with that.
And just like people pick their style of dog to match their desires and their personality, we're trying to figure out a way to make this dynamic and ideally automatically
adapting to the context that they're found in because we literally have people pulling us in both
directions, which basically tells us, okay, this is like a completely new type of consumer
category because companionship is a key need. I mean, there's a reason people get cats and
dogs and other pets. Although to your point about personalization, it's as if, right, you could
not only just choose your breed, but then have your dog literally adapt to your personality
over time. It kind of reminds me of the people who look like their dog.
But, like, on that really deep level.
Doppelgangers, right?
Yeah, exactly.
So people will start looking like Vector one of these days.
But yeah, there's extremes that we have to handle — like, you know, you want him to always be excited when he comes over and sees you.
Everyone has different needs.
And we're learning about different engagement patterns throughout the day as well.
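As a purely hypothetical sketch of the adaptation idea discussed above (not Anki's implementation), you could imagine tracking a single "activity level" parameter that drifts up when the user responds positively to the robot initiating contact, drifts down when it gets shushed or put back on the charger, and then gates how often the robot wakes up and seeks attention:

```python
from dataclasses import dataclass

@dataclass
class EngagementModel:
    """Toy model: activity_level in [0, 1] controls how often the robot initiates."""
    activity_level: float = 0.5
    learning_rate: float = 0.1

    def observe(self, signal: str) -> None:
        # Positive signals (petting, playing along) push activity up;
        # negative signals (put back on charger, told to be quiet) push it down.
        targets = {"petted": 1.0, "played_game": 1.0,
                   "put_on_charger": 0.0, "told_quiet": 0.0}
        if signal in targets:
            target = targets[signal]
            self.activity_level += self.learning_rate * (target - self.activity_level)

    def minutes_between_initiations(self) -> float:
        # More active -> initiate more often (every 5 min at max, 60 min at min).
        return 60 - 55 * self.activity_level

model = EngagementModel()
for s in ["petted", "played_game", "put_on_charger"]:
    model.observe(s)
print(round(model.activity_level, 2), round(model.minutes_between_initiations(), 1))
```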
How about you, Dave?
What do you notice when you see these, you know, with these kids developing these programs
and teaching the robots?
How do you see them developing their relationships with robots?
Well, I mean, to be honest, to me, they're mechanisms. And I want the mechanism to work well. And I want the mechanism to be beautiful. I want people to be able to see how the robot sees the world and understand, you know, why did the robot do this, right? Well, because the robot perceived the world in a certain way, or the robot had a plan it was trying to execute. And I want kids to see the robot that way.
Well, I think it's interesting to ask about the complementary question, which is: how does the robot change how we think about technology? So if you look at things like
menu systems that have become part of everyday life, right? Everybody understands what a menu
system is. And 30 years ago, they didn't. That was a weird, obscure concept. And now we
encounter them everywhere, right? I mean, your vending machines use a menu system, your phone, everything on your computer. So there are bits of technology that have become integrated into everyday life so that we don't even think about them anymore. The menu system is just one example
of that. We're beginning to see things like computer vision become part of everybody's common
understanding. So when you use something like Cosmo, you begin to understand what the robot can
see. So, for example, Cosmo can see cubes up to about 18 inches away. And that's because
of the limited resolution of the camera. And so you begin to learn when you work with the robot,
well, okay, he's got a vision system. He's a little bit near-sighted. I have to be careful when I
show him a cube that I don't put it too far away from him. So you're starting to think about this thing as a sighted thing, and you're thinking about his representation of the world.
And so you're thinking about robot perception in a way that people only thought about very
abstractly before, right? You know, 50 years ago, yeah, you read a science fiction story, but
now you're living with the robot that can see. And so you adjust your behaviors based on
your expectations and your experience of robots that see. And so that's happening more quickly
with speech recognition because of the proliferation of speech recognition applications, Alexa and
phones and so on. But it's starting to happen, I think, with vision as well. And Cosmo is a very
important step in that direction. So that, you know, 20 years from now, no young person will have grown up in a world where computers couldn't see, right? Computers could always see. And that's just
very different. So what you're starting to see is that, you know, we're changing the populace
to make everybody computer literate. And what that means is that when you start
having people playing with robots who now understand the technology inside the robot,
that means that you can make applications for them that require levels of sophistication
that maybe wouldn't have been practical before. And if that's sort of woven into the culture,
everybody gets this in high school, right, then the kinds of consumer products that you build
are going to be different. But if you look at some of the other tools that have changed our
society, not everybody went and learned how to set type, but then we made
PowerPoint, right? And now everybody can do visual design. Most adults know how to use a spreadsheet.
So maybe you're never going to be a Java programmer, but you can do useful computational
tasks because you learn how to use a spreadsheet. And it changes your thinking in some key way.
It does. And so what is the robotics equivalent of something like spreadsheets or PowerPoint?
What is the thing that's not so horribly gritty and technical, but is so powerful that people will want to learn it because it'll let them do the thing they want — that lets them program the robot to make the peanut butter sandwich the way they want it made?
To me, that's a fascinating research question.
We don't have home robots that can make a peanut butter sandwich without burning the house down yet.
But you can do an awful lot with exploring this manipulation space.
And my personal interest is figuring out how to get people to teach robots like Cosmo and Vector
to do manipulation the way they want it done.
To kind of unlock robotic design thinking.
To make it intuitive enough that people can actually express their intent in a way that gets useful stuff done.
Because if you can get Vector to push little things around on the tabletop,
and you can make it easy enough to express your intent that the average person who's not a
computer science major can do that, then you can scale up to the humanoid robot that's going
to cook your fancy dinner without burning the house down.
It's interesting. It sort of circles back to the way you opened this conversation,
which is that having it be this kind of scale and size and having
it be a toy is what allowed you to be creative, right? You're kind of describing the ability to be
creative in a way. You make your own rules on what it's supposed to do. Yeah. Let's talk about design now.
So when you start thinking both in terms of the sort of level of manipulation that they're able to do,
but also in the relationship that you want to sort of test and foster, how do you think about
design from the ground up? How did you start to put together the actual creature? And this is where
I think robotics is a lot harder than a lot of other areas of consumer electronics where in a lot of places,
you can just think, okay, electronics need to do this,
software needs to do this,
and then we need a box around it.
You can't do that in robotics, because everything is designed together from the very beginning — for us, it's a few pillars.
So there's a mechanical side, electrical side,
and kind of the components and electronics that are in there,
the software, which is a huge complexity,
and then there's a character side and industrial design side.
And so those five have to work together
from the earliest stages because you have to be very intentional
where the form factor should capture the character,
which then the software has to marry with.
Then you have the mechanical.
Are they all weighted equally? Are they at different stages driving?
Yeah, at different stages, different ones are driving. So the nice thing about software is you can continue to push
that for years. But what that means is you have to be very intentional with the selection
of the hardware where you make it as generalizable as possible to push as much of the
software choices down the road as you can, but not limit yourself. Sort of delay.
Delay, exactly. The reason Cosmo and Vector, for that matter, are small is because
one is that they become kind of non-intrusive because they don't take up a huge amount of space.
But especially in the case of Cosmo, you have something that's small.
It's perceived as cute, and it can feel quick without having to have very heavy and expensive motors that could potentially hurt somebody.
Oh, you're playing up the cute factor to your own advantage.
That's right.
And people become forgiving of any limitations or mistakes that the robot then makes.
That's so smart.
It's okay because you're cute.
I'm not annoyed.
Even better.
You actually get bonus points for making a mistake but being smart enough to show grumpiness.
To show a little EQ about it.
That's it.
And so it's the IQ and EQ being married.
It's the challenging part.
It's, you know, the mechanical and industrial design and character have to work together to think about the overarching form factor. And a lot of it is just driven by constraints. Like now, in our future products, we can start putting in depth cameras, which give you a 3D model of an environment at a cost that would have been unimaginable five years ago, or even three years ago. Same thing with motors — they're fairly expensive, and so Cosmo and Vector each have four motors. But we put in a screen for the face, because that gives us an infinite dimensionality for the personality to be able to express what the robot's thinking, and a speaker to obviously have the voice part of it. And it's shocking how much you can do with just a couple of — one or two — degrees of freedom and a voice and a face.
How about the arms? I'm interested in the arms because Vector and Cosmo kind of harken back, I think, to other ideas about what robots might look like, like WALL-E. Tell me about the thinking
behind that. We realized that like Cosmo was going to be fairly limited in physical capabilities just
because of cost constraints.
But manipulation is one of the deepest forms of showing off intelligence.
And so him being able to get excited about his cube and go pick it up and be a little bit
OCD and reorganize his surface, that was part of the charm of Cosmo.
And so his arm was kind of made as a lift to be able to move the blocks.
But we very quickly realized it's one of the most important dimensions for the personality — it's almost the way you use your arms to express when you're happy, sad, surprised. His arms end up being one of the main tools for the animation engine. And these were all learnings: the animators have gotten really great at squeezing the maximum benefit out of it and making it feel like this is a character that's alive, even though you have the tiniest degrees of freedom compared to any animated character.
What do you think some of the historical influences were that we've sort of inherited in our expectations of how these mechanical entities should look?
I think a lot of roboticists make a mistake where they completely forego the EQ side and just think about a, you know, tin can of, like, you know, intelligence. If it does something but doesn't convey any sort of character, you lose all forgiveness of any limitations you have. And you actually kind of become creepy, because you now have a bunch of sensors whose purpose people don't really understand. It's disarming
to have a character like Cosmo or Vector, where he has a camera. You have a camera in your house that's
able to move around on his own, but nobody ever feels bad about that because, well, of course
he has a camera. He has to see because he's a character. You went completely away from the
Uncanny Valley and didn't make him look at all humanoid. And that was very, very intentional. Even with
the voice, we thought about an actual voice, but if you give something a voice, you narrow the range of appeal. By being tonal, you might apply a very different intent than a five-year-old might apply, but everybody gets joy out of it. You can show a massive breadth of emotions without
having to have the complexity of a voice. With very few tools. Yeah, because the moment you have
a voice, there's an expectation of intelligence that technology wasn't ready to meet yet.
This idea of trying to pursue humanoid robots, which particularly is common in like Asia and
Japan, it almost feels like a flawed compass because you end
up absorbing all the limitations of humans and their perceived kind of intent where you're stuck
by what's the definition of human versus being able to play to your strengths and avoid your
weaknesses. And so we just embraced the constraints in the form factor. But he is a kind of knitting
together of other older ideas as well. The first time I saw one of those security robots in like
an underpass in the city, it was kind of patrolling a parking lot. And I got really up close and
heard this like kind of whirring roboty noise and had this like moment of revelation where I was
like, it's not actually making that noise. That's just somebody's idea of the robot noise that
it should make so that we know it's there. Having seen kind of the industry evolve, what do you think
are some of those inherited ideas that are impacting our design today? It's always been a merger of direct robotic influences and non-robotic influences. And so on the robotic side, there's definitely
this idea of kind of R2-D2, WALL-E, where you have this cute robot, but one that's, like, kind of your fearless companion. Like, nobody would ever accuse R2-D2 of being dumb, even though he has very limited degrees of freedom. And same thing with WALL-E, right? And so that was definitely kind of a deep inspiration. The other influence is animals. And so we have kind of design and mood boards with things like, obviously, puppies and kind of dogs, owls — really perceptive animals where the eyes become like a really, really
key part of their communication language. And they can show emotional intelligence in a way that
is really, really hard purely mechanically, but ends up being almost infinite when you add eyes and
voice to it. So moving from Cosmo to Vector, are you starting to see a difference in the way that children interact with it versus the broader household and other ages? Like, do you get different learnings from those different relationships? Yes. Kids in a lot of ways
have an imagination that just amplifies the magic of these types of experiences. They think these robots
are alive. At three, like, you may not understand how to play it and follow the rules of a game,
but you love the idea of a character still. And that actually is like almost universal, it's almost as
natural as like swiping on a screen where you see these like one-year-olds like swiping
on screen, right?
It's a basic human tendency to sort of fill in that information there.
And this is where some of the studies that we're actually seeing being done are even around kids with autism, where there's something so unique about the idea of a character like Cosmo or Vector — there's a response there that is very, very unique compared to any other type of engagement.
I was thinking about that when you talked about the eye contact thing, right?
That's an immediate feedback loop for increased eye contact.
Yeah.
And so we've actually had studies with the University of Bath in the UK, where the engagement patterns and the collaboration around Cosmo are unique compared to anything that they'd seen.
So in those kind of early emotional relationships that are growing, I mean, you can sort of anticipate
the affection, but what was something that really surprised you?
We built Cosmo with a lot of games integrated, thinking that engagement will be around playing
games.
And he's like this little robot, like your buddy to compete against and to play against
collaboratively or competitively.
And the games also create a lot of context for emotions.
So, you know, because you win, you lose, you're bored, you're frustrated.
You're bringing emotion to it.
Exactly.
You can almost like engineer scenarios that allow him to be emotionally extreme and whatever.
Of course, that makes so much sense.
And what we were just shocked to see in some of our play tests early on is that you're playing
this game that's kind of using the cubes as a buzzer called Quick Tap where you're competing
against Cosmo.
And if you beat him, he'll kind of get upset or maybe he'll get grumpy and slam his cube and
storm off.
But he gets really sad when he loses.
And there were kids in the play tests that were playing with their siblings, and one of them would, like, tell the other one, hey, stop it. Like, you're making him upset. Let him win a game. And they actually, like, felt really bad for this little robot who would, like, get upset when he lost. Their empathy was so high. Yeah. And they'd actually throw a game
just to make him happy. And it was like, wow. Like you don't see that in a video game. You would
never feel bad playing Pac-Man or like Fortnite or whatever. Yeah. Yeah. It's just you were trying to win.
And here the competitiveness lost out to the empathy of a character that they actually cared for. And
all of a sudden, like, okay, there's like something really, really special here.
And we started thinking about how to amplify that. And a lot of those learnings from there actually led
to Vector because we addressed the limitations that just weren't possible because of Cosmo's
hardware. So what changed in Vector as a result of that empathy learning?
So we realized voice was a big one because people thought that they were talking to Cosmo and
that he would respond and they'd attribute intelligence to it, even though he had no microphone and
no ability to hear you. What we wanted to do is, if you yell at him, he gets really upset and kind of backs off. Or if a door slams, he turns in that direction and wonders what it is. And so we wanted to bring life to it. The other one is tactile, where
when you pick up Cosmo, he recognizes it and he'll get grumpy if you're holding him up in the
air. But being able to touch and actually have a response, we know that that's one of the most
important things with pets. And so we put capacitive touch sensors in various areas of the robot, again thinking that these are going to be some of the dimensions that matter. The last one, and what proved to be the most important, is just the ability to be always on — the moment you disconnect the phone, he dies, and it kills all illusion of him being alive. But if he's always on and doesn't have that barrier, it completely changes the sort of emotional connection you can build.
And so these are the sort of things that were direct drivers of the hardware that improved in Vector, which now we have a many year roadmap on how to actually utilize it.
And that's where advances in software technologies like deep learning and voice interface and so forth unlock things that now let you rethink certain types of problems.
We're releasing Alexa integration in December, which is going to be the first time you truly have a personality wrapper around some of these very dry functional elements.
Not only do you interact with technology in a command and control sort of way, but for the first
time, there's an ability for this character to initiate interaction and get your attention
and make eye contact and then do something personal with you in a way that you would never
accept from smart tech in the home right now.
One of our big questions is if the usage pattern of Alexa through a character is significantly
different than the usage pattern of a normal Alexa.
Yeah, I mean, I bet it would be.
Are you seeing that yet?
Well, we're seeing usage patterns of just the character, with no functional Alexa integration, skyrocket above what a typical voice assistant does.
When we were working with Google and getting advice from Sonos and some of these other companies that have a lot of experience with voice assistants, we thought that, like, a thousand queries would be plenty for, like, three or four months. And now we hit 250 just in the first two weeks, and the engagement's staying strong.
And so suddenly, when you look at voice assistants that have an average of maybe, like, one to two queries a day, and we're at like 10 to 12, even before we have a voice assistant built in, it causes us to ask a lot of questions
of how do you leverage the role of personality and character as a way to amplify not just the
fun side, but also the utility side. That actually shows a very different type of engagement
and the things you ask and how you interact with it. It teaches something totally new about voice
as a platform. That's it. And I think that opens our eyes. And I think, with a lot of the partners that we work with, suddenly there's a lot of really interesting overlaps about what this means for the role of technology in the home and the workplace. It's deceptive how important emotional interface is to making the functional elements get utilized in a better way. Okay, so if these are the little
kind of baby waves lapping at the shore, right, of like true robots in the home, how do we get to
mass adoption where this starts to become a reality and we all get to live like the Jetsons?
Eventually, we want to get to $500 to $1,000 robots, because that opens up so much interesting technology that you can put in them. But to get there, you actually have to be very thoughtful about what's the level of capability that's required to make that justifiable — and not just in an isolated kind of tech-geek sense, but in a mass-market sense,
because you want it to really scale. A lot of it then is going to become more electronic
capabilities, where you now have sensors that used to cost $1,000 and now are $10 — things that used to be just impossible or prohibitively expensive. And the other big one is manipulation. Right now, we have limitations both on the cost side, which comes from the mechanical complexity of manipulation, as well as the software side — how do you robustly interact with unstructured environments? We have a long way to go before we can actually stack the dishes in the dishwasher and put them away
afterwards.
Computing power will make up for a lot of hardware defects.
So if you have an underpowered, unreliable manipulator, but lots of good computing power,
you can make that manipulator do amazing things.
The problem is if you have no power and a bad manipulator, then things don't work.
Right? But now as the computing power becomes so much better and the sensing capabilities
become so much better, I think some of the demands on the manipulator back off a bit.
So you can get by with a less capable manipulator because you can learn to make up for its
deficits. Different levers to pull, basically. Yeah. What you're going to see in the next wave
of hardware is dedicated hardware that's not just for raw traditional computation, but for
very specialized AI applications like deep learning and vision classification.
These are the sort of things that, in the next wave of products in the next three to five years, will probably become pretty standard.
There's still a lot of information technology progress that needs to be made.
And the interesting thing is that some of that's being done by Alexa and some of these voice assistants trying to integrate with the rest of your life.
The tools that get created through that, just like voice interface, become incredible enablers of these new types of technologies.
And I think we'll start seeing more in education and healthcare and broader home utility and monitoring and so forth.
Even when you stick to just informational or companionship applications — on the healthcare side, elderly companionship and helping people age in place is an area that's becoming more and more kind of important and costly for a lot of communities.
If you really nail the EQ piece, it becomes not that hard to imagine where your interface, you know, when you come into a hotel or a store or to interface with a doctor, actually becomes somehow driven through an indirect interface through a robot, which is pretty interesting and then opens up a lot of functional opportunities as well.
And in the end, we always thought of it as just an extension of computer science into the real world.
If you can understand what's around you and you have the ability to interact with it, you turn it into a digital problem.
There will be a catalyst that spawns the same thing on the physical side once the ability to understand the environment and interact with it catches up.
Then everything becomes a matter of software making the interaction smarter and smarter.
Thank you so much for joining us on the A16Z podcast.
My pleasure.
Thank you so much. It's a pleasure.