Imaginary Worlds - Stuck in the Uncanny Valley
Episode Date: March 22, 2018

The holy grail for many animators is to create digital humans that can pass for the real thing -- in other words, to cross the "uncanny valley." The problem is that the closer they get to realism, the more those almost-real humans repulse us. Blame evolution for that. I talk with Hal Hickel from ILM, who brought Peter Cushing to life on Rogue One; Marianne Hayden, who worked on games like The Last of Us and Uncharted for Naughty Dog studios; Vladimir Mastilovic from 3Lateral studios, who worked on Hellblade: Senua's Sacrifice; and SVA instructor Terrence Masson about what it takes to cross that valley.

Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
You're listening to Imaginary Worlds,
a show about how we create them and why we suspend our disbelief.
I'm Eric Molinsky.
So, way back in the 20th century, when I was studying animation,
our instructors used to tell us to avoid animating realistic human beings.
Because what's the point? I mean, you can animate anything you want.
And more importantly, you can't do it.
No one can cross the Uncanny Valley.
Now, in case you don't know what that means,
the Uncanny Valley was first proposed in 1970
by a roboticist named Masahiro Mori.
He predicted that in the future,
as robots look more and more like humans,
they would actually repulse us
because they're so close to human beings,
but not quite there.
The reason why this
uncanny valley exists is because we are biologically wired to read human faces. If anything is slightly
off, we will notice it right away. And it's because we have a primal instinct to avoid people that may
appear psychotic or diseased. That's why to me some of the scariest zombie movies are not the ones with a
lot of makeup, but just the ones where the people are just slightly off. Now, robotics is still not
there yet. I don't think anyone will mistake the animatronics at a Disney World ride for a real
person. But computer animation, I mean, nobody saw that coming. I mean, I remember how shocked
people were by the digital dinosaurs in Jurassic Park.
And those special effects are ancient by today's standards.
And speaking of Steven Spielberg, his new movie Ready Player One is coming out,
and it's based on the novel by Ernest Cline, which takes place in the future,
where the Internet has evolved into a virtual reality universe,
where everybody interacts with each other through their digital avatars.
This is the Oasis, a whole virtual universe.
You can do anything, be anyone, without going anywhere at all.
And when I read the book, I imagined these avatars to be completely realistic.
But when I watched the trailers, the avatars were so digital looking.
It completely triggered the Uncanny Valley in me to the point where I'm just not going to see the movie.
I mean, I'm just sticking with the version of Ready Player One that was in my imagination when I read the book.
Which made me wonder, why is it so hard to cross the uncanny valley?
And why is it still the holy grail of computer animation? Are we getting closer? What's holding
us back? And what happens when we cross it? Strap on your VR headsets.
That's after the break.
Now, this episode is going to get very technical, but it's actually really personal for me because even though I left animation, I've often wondered if I'd stayed, what would I be working on?
And how would my job evolve with the technology? Because, you know, when I studied animation, I mean, we were filming pencil drawings
with these clunky video cameras,
you know, in rooms surrounded by thick black curtains.
So I was curious,
how is the Uncanny Valley being taught today?
Well, I visited Terrence Masson,
who runs the computer animation department
at the School of Visual Arts in New York.
And when I showed up on a Friday afternoon,
he was lecturing about a dozen students at their computer terminals.
I have this
example of why CG
characters so often look
bad and why it pisses me off. It really
does. Now Terrence has worked
in the industry for decades. He was an animator
on the Star Wars prequels.
He worked at major video game companies.
He even helped Trey Parker and Matt
Stone build the software for South Park.
Thanks, everybody.
Thank you.
So after the lecture, we sat in his office,
and he explained that the big breakthrough in digital humans came about 10 years ago
with a technology called subsurface scattering.
Subsurface scattering basically simulates all the inconsistent splotches, freckles, colors, and hairs in the human skin.
But it can also simulate something much more subtle and important.
Everybody has done this, including my five-year-old as a kid.
You put a flashlight behind your fingers, and your fingers kind of glow red, right?
Because the light enters the skin, it scatters,
it bounces around, and then it reflects back out having picked up some of that color of blood.
So to be able to do that in CG was basically a huge necessary leap forward to accurately render
flesh, skin. And before that, it just looked plastic. So this is sort of, you're talking
about like the, almost the internal light that a person
has that's coming from inside of you as a living person, as opposed to being a corpse
that makes your skin luminescent.
Is that what you're saying?
Yeah, exactly right.
It's the thing that makes us look healthy.
Yeah.
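Terrence's flashlight demo maps onto a cheap trick real-time renderers use to fake subsurface scattering: "wrap" the diffuse lighting so it bleeds a little past the shadow boundary, and tint that leaked light toward blood red. This is a minimal illustrative sketch of the idea, not any particular engine's implementation; the constants and function name are made up for the example.

```python
def sss_diffuse(n_dot_l, wrap=0.4, tint=(1.0, 0.25, 0.2)):
    """Cheap subsurface-scattering approximation ("wrap lighting").

    n_dot_l: cosine between the surface normal and light direction (-1..1).
    wrap:    how far light wraps past the terminator (0 = plain Lambert).
    tint:    color the wrapped light picks up inside the skin -- reddish,
             standing in for blood, the flashlight-behind-fingers effect.
    Returns an RGB diffuse factor.
    """
    # Plain Lambertian term: clamped cosine.
    lambert = max(n_dot_l, 0.0)
    # Wrapped term: still positive a little past the shadow boundary.
    wrapped = max((n_dot_l + wrap) / (1.0 + wrap), 0.0)
    # The extra light that "leaked" through the skin gets the blood tint.
    leak = max(wrapped - lambert, 0.0)
    return tuple(lambert + leak * c for c in tint)

# Facing the light: no leak, plain white diffuse.
print(sss_diffuse(1.0))   # (1.0, 1.0, 1.0)
# Just past the terminator: plain Lambert would be black,
# but the wrapped light glows faintly red.
print(sss_diffuse(-0.1))
```

Without the tinted leak term, the skin goes dead black at the shadow edge, which is part of why older CG faces read as plastic.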
But subsurface scattering can make the uncanny valley worse because if a digital human looks
photorealistic, then our expectations are
sky high once it starts moving. So I asked Terrence, what are the basic problems with the uncanny
valley? And he just started ticking them off. First of all, no two people form their words in the same
way. Everyone's mouth shapes are idiosyncratic, and that requires a lot of extra work from animators
who are sometimes crunched for time. Secondly, if a digital human is seen in a close-up or a medium shot,
very often the animators will only animate what we see on screen.
But the human body never stops moving.
Otherwise, it's just somebody on a stick, you know, a severed torso on a stick.
But he thinks the big problem is the eyes.
In fact, his favorite term is eye darts,
which is when your eyes dart back and forth
while you're thinking.
Animators often forget to put these in.
And then they'll do, you can tell someone said,
well, you better have the eyes move.
So they'll just, they'll kind of look at something else
in the room and then come back to the center position.
And without that kind of animated thought
going on within the character's mind,
it just goes dead.
It's funny, once you pointed this out, I started noticing this all the time.
I mean, as people are speaking, their eyes are darting back and forth, almost like micro movements.
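In animation terms, eye darts are just a stream of small gaze keyframes: hold a fixation, then snap to a new one. Here is a toy sketch of how a procedural version might be generated; the timing and amplitude ranges are illustrative guesses, not measured human data.

```python
import random

def eye_dart_keys(duration=5.0, seed=42):
    """Generate gaze keyframes: hold a fixation, then dart to a new one.

    Returns a list of (time_seconds, yaw_degrees, pitch_degrees).
    Darts are small and quick (saccades are near-instant), and the
    offsets are scaled down so the gaze stays biased toward center.
    All ranges here are made-up stand-ins for real tuning.
    """
    rng = random.Random(seed)
    keys, t = [], 0.0
    yaw = pitch = 0.0
    while t < duration:
        keys.append((round(t, 3), yaw, pitch))
        t += rng.uniform(0.2, 0.8)             # hold the fixation
        yaw = rng.uniform(-8.0, 8.0) * 0.6     # dart to a small new offset,
        pitch = rng.uniform(-4.0, 4.0) * 0.6   # biased toward center

    return keys

keys = eye_dart_keys()
```

The point Terrence makes is exactly what this sketch lacks: real eye darts follow what the character is thinking about, so random fixations are only a baseline an animator would then motivate by hand.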
And to demonstrate this point, Terrence showed his students footage of the most eye-darty person he could think of, the director Martin Scorsese.
That's the thing, if the actor, he or she is moving and they're on a level...
When he talks about working with someone on set, about a film, or about something he's recalling from 50 years ago, you can watch his eyes, and he's actually imagining what he's talking about in his mind, and his eyes are darting around, following those ideas and tracing what he's recalling. It brings it all alive.
He's going to raise that gun. He's going to say, I saw you coming. And I really knew he had something when he said, I... I know you're talking to me, because I'm the only one here.
Now, we tend to think of the uncanny valley with movies, but
there is a much bigger imperative to get them right in video games. Now, if you haven't played a video game since Atari, they've gotten a lot more complex.
In fact, the way that a lot of the major games work, the big budget games,
is you control a character who is running, shooting, jumping, kicking, punching, you know, video game stuff.
And then when you reach your goal, you can kind of put down your game controller
because a cinematic or a cutscene will come up, which is like a computer animated movie that will illustrate the next story beat.
And that's where you see the characters up close as we get all the subtle acting.
And for a long time, that's where you'd be really trapped in the uncanny valley.
So animators today are not just trying to move game animation out of the uncanny valley, but they're also trying to blend those two.
So you don't just put down your game controller and watch this cut scene.
Everything is blended together.
So you really feel like you're controlling a character in a movie.
And that puts a lot of pressure on animators like Vladimir Mastilovic.
Yeah, I mean, in movies, it's much easier.
He runs a game studio in Serbia called 3Lateral.
And he says the big difference in a movie is that whatever you see is what the director wants you to see.
But in a lot of video games...
You can't control the camera.
You can't control, you know, from which angle the player will view the characters.
So the challenge is orders of magnitude greater.
Now, his company specializes in scanning real
actors to create digital doubles. And his company was part of a team that collaborated on a game
called Hellblade Senua's Sacrifice. Senua is the name of the main character. She's an eighth
century Celtic warrior, goes on an epic journey. But she's also mentally ill. She thinks that her
psychosis are supernatural voices. In the game, these voices manifest themselves as mythical
characters or doppelgangers of her. It's a trippy game with a very spooky atmosphere,
and the attention to detail is gritty. All of her suffering will have been for nothing.
Now until Vladimir
explained this to me, I hadn't really thought about it because
when you watch a movie, you just put it on
and it plays. But a game is recreated
every time you play it.
So the more acting a character does, the more memory and processing power it requires from the game console. It's usually the
high and intense emotion, which is also hardware intense, because that's
where you need the most, the biggest number of facial shapes and the code that runs the
faces.
And so if you want really subtle, complicated acting on a human face in a video game, you
also have to figure out how to compress all that data.
That's why we have this very elaborate level of detail systems, which completely invisibly
to the player shed off a lot of weight when it comes to the asset itself.
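The level-of-detail idea Vladimir describes can be sketched very simply: the farther a face is from the camera, the fewer facial shapes and updates it gets, and the player never notices the switch. The tiers, counts, and rates below are made-up illustrations of the concept, not real engine numbers.

```python
def face_lod(distance_m):
    """Pick a facial rig budget from camera distance (illustrative tiers).

    Close-ups get the full set of facial blendshapes; distant or
    background characters run a stripped-down face. The thresholds
    and budgets here are hypothetical stand-ins for what a real
    engine would tune per platform.
    """
    tiers = [
        (2.0, dict(blendshapes=600, update_hz=60)),   # cinematic close-up
        (8.0, dict(blendshapes=150, update_hz=30)),   # conversation range
        (25.0, dict(blendshapes=30, update_hz=15)),   # mid-distance
        (float("inf"), dict(blendshapes=0, update_hz=0)),  # crowd/background
    ]
    for max_dist, budget in tiers:
        if distance_m <= max_dist:
            return budget

print(face_lod(1.0))    # full budget for a close-up
print(face_lod(100.0))  # nothing for a background character
```

The "invisible" part is the hard engineering: the swap between tiers has to happen between frames, without a visible pop on the character's face.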
Now, of course, they don't create these digital humans from scratch.
They're usually based on actors wearing motion capture suits, otherwise known as mocap for
short.
And I'm sure you've seen the behind the scenes
footage where, you know, and the actor is wearing like a full body black leotard and they've got
ping pong balls all over their body and dots on their faces. And there's a kind of a rig on their
heads that has a camera mounted that's pointed directly back at their faces. And then that
performance gets transferred onto the digital character. And Vladimir's company actually invented a technology that allows that transfer to happen instantly.
So the actor performs and in real time, the digital character is reflecting that performance.
And he says they need to do all this because the animators would never think to add things like...
The specific way of how the skin on the lips gets stuck with the upper lip or the lower lip or how the eyelid unfolds against the skin that covers it or the jiggle of the iris.
So, you know, you would say that that amount of detail is crazy.
But for some reason, we perceive all that.
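One way real-time retargeting like Vladimir describes can work is to treat each scanned expression as a basis pose, and estimate, every frame, how much of each pose explains the captured marker positions. This is a toy projection version of that idea; production systems solve all the weights jointly, and the pose names and numbers here are invented for the example.

```python
def solve_weights(neutral, examples, captured):
    """Estimate expression weights that move markers from `neutral`
    toward `captured`, as a blend of `examples` (name -> marker coords).

    A toy version of live mocap retargeting: project the captured
    marker displacement onto each example expression's displacement,
    independently, and clamp to the valid 0..1 range.
    """
    delta = [c - n for c, n in zip(captured, neutral)]
    weights = {}
    for name, pose in examples.items():
        basis = [p - n for p, n in zip(pose, neutral)]
        norm = sum(b * b for b in basis)
        w = sum(d * b for d, b in zip(delta, basis)) / norm if norm else 0.0
        weights[name] = min(max(w, 0.0), 1.0)  # clamp to a valid weight
    return weights

# Hypothetical flattened marker coordinates for two scanned expressions.
neutral = [0.0, 0.0, 0.0, 0.0]
examples = {"smile": [1.0, 0.5, 0.0, 0.0], "jaw_open": [0.0, 0.0, 1.0, 0.2]}
# The actor is halfway into a smile:
print(solve_weights(neutral, examples, [0.5, 0.25, 0.0, 0.0]))
# -> {'smile': 0.5, 'jaw_open': 0.0}
```

Run per frame on the head-rig camera's tracked dots, something in this spirit is what lets the digital character mirror the actor with no offline cleanup pass.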
So another game studio that's doing really incredible hyper-realistic human beings is Naughty Dog in Santa Monica, California.
Naughty Dog's two big franchises are the Uncharted games, which are like Indiana Jones adventures, and The Last of Us, which is a zombie-type adventure, but the zombies are really monstrous.
Now, there are two main characters in The Last of Us.
There's a young girl named Ellie and this older guy that's mentoring and protecting her named Joel.
And the trailer for the second game in the series just came out recently.
It takes place a few years later.
Ellie is now a teenager, and we see her in a room.
There's blood on the floor, blood dripping down her face. We don't know what's going on, what happened,
but she's calmly playing the guitar.
And the close-up on her hands is incredibly realistic.
And we cut to her face, blood dripping down,
and she says to Joel,
I'm going to find and I'm going to kill every last one of them.
And it's weird because she doesn't look like a real person.
She looks like a hyper-realistic digital person.
But she felt more real and alive than really any video game character I've seen.
I talked with Marianne Hayden, who's an animator at Naughty Dog Studios, and I asked her, why are digital humans looking so much better?
She says it's because the motion capture has gotten so much better.
You know, the first couple games of Uncharted, you can look at the motion capture data.
It kind of looks jagged around the edges, even though it's been touched by an animator.
It's just much more precise now, picking up a lot of the nuances that maybe we
didn't see before. Now, Marianne went to the same animation school that I went to, CalArts.
And, you know, our program really stressed creative freedom. And I asked her, does she feel
less creative freedom working with motion capture? She said, no. You know, first of all, the actors
never look exactly like the characters. So there's a lot of tweaking there. Also, they have to
caricature the movements that the actors do to make them feel more real,
which is a weird kind of trick of animation that shouldn't make sense, and yet it does.
And sometimes they'll sort of stitch together different performances the actor gave to create
an original acting moment that was invented by the animator.
Some people feel like we're losing animation when we have the motion capture.
I just think it's just a base, such a base to start at as an animator
and having the eye of an artist take it one step further.
But what amazes me about the characters in the Naughty Dog games are the eyes.
Like this scene between the characters Drake and Elena from the Uncharted video games, those are the Indiana Jones-type adventures. And I've seen previous versions of Drake and Elena, which are very plastic-looking. But in the latest Uncharted game, they're hyper-real, and their acting is so subtle. Like in this scene, they're having a difficult conversation. They're shifting their weight, they're having trouble making eye contact. And their eyes are doing those kind of micro eye darts.
Come on, wait.
Elena, wait!
I don't get you.
Look, I wanted to tell you.
You know what? Enough.
No, I wanted to, but how could I?
I don't know. Just say it.
I had to protect you.
That is bullshit, Nate.
You just didn't have the nerve to face me again.
I knew you would react like this.
And Marianne says the motion capture suits do track the eyes of the actors.
But there still needs to be an extra layer of love put into the eye movement.
And it's not just the eyes moving, because when the eyes move, your eyebrows move, and your cheeks move, and your nose moves sometimes.
If you're smiling and looking around, like your whole face is alive. And if part of that isn't
captured in the data, then we have to go back and add that in. And I think that part, getting it as perfect as it can be, taking that original performance and then amplifying it so that it crosses the uncanny valley, that's the tricky part. That's what our job is. And, of course, telling a good story. I think the more immersed you are, hopefully, the less you pay attention to
the fact that these aren't really real people but they feel like they're real because you're emotionally invested
and visually invested.
If it looks really great,
it plays really great,
and you're enjoying it,
then I think you're not in that valley anymore.
But a game studio is still limited
by the processing power of the game console,
time, money,
the amount of people you can hire.
You know who isn't limited by very much?
Industrial Light & Magic.
In 2016, they took a big leap into the uncanny valley with the movie Rogue One.
And if you haven't seen Rogue One, spoilers ahead.
The movie takes place right before the original Star Wars, A New Hope, and it was about how the Rebels stole the plans to blow up the Death Star. And the filmmakers wanted to bring
back characters from the 1977 film. Darth Vader was easy. They just got James Earl Jones to do
the voice, and they have a new guy in the suit. But they also needed Darth Vader's right-hand man,
Grand Moff Tarkin. The actor who played him, Peter Cushing, died in 1994.
So they had an actor on the set, Guy Henry, who played Tarkin in a motion capture suit.
And then they animated a digital version of Peter Cushing on top of Guy Henry's performance.
But to make things even more challenging, this digital version of Tarkin
was sharing the screen with real flesh-and-blood actors.
We've heard word of rumors circulating through the city.
Apparently, you've lost a rather talkative cargo pilot.
If the Senate gets wind of our project, countless systems will flock to the rebellion.
When the battle station is finished, Governor Tarkin, the Senate will be of little concern.
Hal Hickel was the animation supervisor on that character.
You know, I think we had shots of Tarkin in Rogue that were great and totally convincing.
And then there were others that were less so.
And he mentioned all the issues we've talked about before.
Eye darts that feel motivated, mouth shapes that were specific to Peter Cushing and not Guy Henry,
animating the full body, even if it's not in the shot.
But they had another problem.
We've only seen Tarkin in the original Star Wars,
which had this harsh 1970s-style lighting.
The lighting in Rogue One was more subtle and modern.
And so we would put our CG Tarkin into these shots
in that lighting, the Rogue One lighting.
And we'd work on it, we'd work on it, we'd work on it.
And we'd feel like, boy, we're so close.
But it just doesn't, it feels like Peter Cushing's cousin or something.
What's the problem?
And so we did an experiment where we took the same animation, same texture, same model that we'd been working with.
And all we changed was the lighting.
And we lit it like one of these shots we've been staring at from 1977. And boom, it was an instant improvement
in likeness. It was like, oh, that's Tarkin. That looks great. That's Peter Cushing. But the problem
was when you light him that way, then he doesn't match with anything else in the shot and it looks
pasted on. They eventually managed to find a balance between the two styles of lighting.
But the other thing that's difficult about digital humans is the sort of
swamp of opinions that you find yourself in. Because, you know,
you assemble the best team you possibly can, and you
are all together every day reviewing the work and looking at it, but
what you find often happening is, you know, you're all in there looking at it, and somebody goes,
you know what it is? The forehead's too high. And then somebody else goes, no, no, the forehead's fine, we checked it. You know what it is? I think the nose is just not quite long enough, or whatever, you know. And then somebody else is like, no, no, the nose is fine, it's... look at the cheeks. I swear, you know,
it was rare that everyone would walk in and there'd be a consensus on exactly what the issue was. And
identifying what those final little percents of believability are and realism are is just the
very hardest thing. Now, I have nothing but reverence for the skills of the crew at ILM. I
mean, they are some of the best animators in the world. But I have to admit
this, Tarkin didn't quite work for me. I mean, he looked great, but maybe this is my animation
training, but I could feel the decisions that the animators were making. Like I could see when they
decided his eyebrows should crinkle right here, or he should blink and turn his head. Now Hal did
get that kind of feedback from some of his colleagues,
and he also got a lot of praise from people in the industry.
But amongst general populace moviegoers,
like I've given a bunch of talks since the movie came out,
and some of them have been to completely non-industry people.
And I would say the vast majority of those folks,
I've had tons of them come up to me and say,
oh, I thought you recast it or something, you know. So that tells me that we got most of the way there. But if Grand Moff Tarkin
was hard enough to animate, recreating the young Princess Leia was even harder, and she was only in
one scene. The architecture of Tarkin's face, it's kind of hard to describe, but it kind of just gave us more to work with.
Whereas her face is just, especially at that age, 18 or 19, I think,
and it's just this perfect form with, like, flawless skin.
And as soon as you started moving things and lighting it,
just, like, anything that was even the tiniest bit off was glaring.
Your Highness, the transmission we received.
What is it they've sent us?
Help.
Now, while they were making the film,
they tried to schedule time with Carrie Fisher to do motion capture,
but she was not available.
Shortly beforehand, the producer Kathleen Kennedy did show
her the footage of the young Princess Leia. You know, Kathy showed her the shot when it was done,
before the movie came out. Hal was really nervous about it. Finally, word came back.
She loved it. That really made us all feel good. That was like the thing we were biting our nails
about the most, to be frank. I mean, we knew it had to be the capper on the film, and that was enough pressure. But honestly, the thing we cared about the most was how Carrie would feel about it. So I was curious, are there any digital humans that
completely blew him away? And he did not hesitate for a second. He said, Blade Runner 2049.
And if you have not seen Blade Runner 2049,
sorry, this is going to be another spoiler.
So that movie, of course, is a sequel to Blade Runner from 1982.
And they wanted to recreate the character of Rachel,
who is a replicant, which means she could be recreated at any moment.
In this case, the animators were able to get the original actress, Sean Young, to do the motion capture
performance. Don't you love me? But that doesn't necessarily mean it's going to work because Jeff
Bridges, Robert Downey Jr., Kurt Russell, Michael Douglas have all played younger versions of
themselves in flashbacks and movies. And those were cool special effects, but when you're watching
those scenes, you could tell it was a special effect.
I asked Hal, in his professional opinion, what did they do differently in recreating the young Rachel?
And he says when you're 99% almost all the way there with the uncanny valley, even he can't put his finger on what they did right.
It was the absence of, you know, she comes on screen and you go, oh, that's cool.
But it was the absence of the but.
It was just like, wow, it's her.
That's amazing.
This whole experience has left him feeling kind of frustrated.
You kind of get to a point of what's the point?
And he says, you know, his company, ILM, gets contacted all the time to resurrect actors
who are no longer alive. I'm very squeamish about that. I mean, people have asked me in some of the
talks I've given some pretty pointed questions about the morality of even, you know, what we
did with Tarkin. And with Tarkin, I felt really assured because he only did in that film the same
things he did in New Hope, which is to, you know, stand around on the Death Star and bark at people about, you know, firing the Death Star laser.
On the other hand, if someone came in and said, you know, we want to hire you guys to do a TV commercial and we want to put Jimmy Stewart in it, I'd have to just decline.
But recreating celebrities is not just for the pros anymore. There's a new app called Deep Fake where you can try this at home. And in fact, a bunch of people used the app to reanimate that scene of Princess Leia from Rogue One.
There are all these articles saying, look, these people did it just as good.
But they really, really didn't. I'm sorry.
It just does not even look close.
But Deep Fake is being used to create fake sex tapes of celebrities.
And even more disturbing, it's being used to create digital versions of politicians saying things they never said.
That is a whole new scary level of fake news.
But Terrence Masson is still optimistic about where this technology is going and how it could be used.
Eventually it'll become so much automatic and so cheap.
That's going to be the endgame, is that it'll be available in real-time augmented reality and virtual reality and photorealistic and hyperreal.
In other words, he's talking about the world of Ready Player One.
Yeah, yeah, but really cheap and just everywhere.
And yeah, everywhere and everything.
You know, when I studied animation,
our teachers would often use a phrase called the illusion of life,
which came from these two Disney animators,
Frank Thomas and Ollie Johnston,
who said when you animate your first character,
even if it's just a bouncing ball,
you're going to be so amazed
at the illusion of life that you created. You're going to be hooked on animation forever. And I
remember that feeling 20 years ago, thinking, oh my god, I created something that looks like it's
alive. I'm going to do this forever. And I didn't do it forever, but I remember that feeling. And in fact, talking with these animators reminded me how great that was.
And it made me miss it again.
And this, you know, this desire to create life, it's what makes us human, not just for the survival of the species.
And whether we think this technology is good or scary or both, we will never stop wanting to create the illusion of life.
Even if the life we create is only good at convincing us, it's human.
Well, that's it for this week. Thank you for listening. Special thanks to Terrence Masson,
Vladimir Mastilovich, Marianne Hayden, Hal Hickel, and all the other experts that I talked with that I just didn't have room to include.
Imaginary Worlds is part of the Panoply Network.
Stephanie Billman is my assistant producer.
You can like the show on Facebook.
I tweet at emolinski and Imaginary Worlds pod.
My website is imaginaryworldspodcast.org. Panoply.