Making Sense with Sam Harris - #53 — The Dawn of Artificial Intelligence
Episode Date: November 24, 2016. Sam Harris speaks with computer scientist Stuart Russell about the challenge of building artificial intelligence that is compatible with human well-being. If the Making Sense podcast logo in your player is BLACK, you can SUBSCRIBE to gain access to all full-length episodes at samharris.org/subscribe.
Transcript
To access full episodes of the Making Sense podcast, you'll need to subscribe at SamHarris.org. There you'll find our private RSS feed to add to your favorite podcatcher, along with other subscriber-only content. We don't run ads on the podcast, and therefore it's made possible entirely through the support of our subscribers. So if you enjoy what we're doing here, please consider becoming one.
Today I'll be speaking with Stuart Russell. He is a professor of computer science and engineering
at UC Berkeley. He's also an adjunct professor of neurological surgery at UC San Francisco.
He is the author of the most widely read textbook on the subject of AI, Artificial
Intelligence, A Modern Approach. And over the course of these 90 minutes or so, we explore the
topics that you may have heard me raise in my TED Talk. Anyway, Stuart is an expert in this field
and a wealth of information, and I hope you find this conversation as useful as I did.
I increasingly think that this is a topic that will become more and more pressing every day,
and if it doesn't for some reason, it will only be because scarier things have distracted us from it. So things are going well if we worry more and more
about the consequences of AI, or so it seems to me. And now I give you Stuart Russell.
I'm here with Stuart Russell. Stuart, thanks for coming on the podcast.
You're welcome.
Our listeners should know you've been up nearly all night working on a paper relevant to our
topic at hand. So double thank you for doing this.
No problem. I hope I will be coherent.
Well, you've got now nearly infinite latitude not to be. So perhaps you can tell us a little
bit about what you do. How do you describe your job at this point? So I'm a professor at Berkeley, a computer scientist,
and I've worked in the area of artificial intelligence for about 35 years now,
starting with my PhD at Stanford. For most of that time, I've been what you might call a mainstream AI researcher.
I work on machine learning and probabilistic reasoning, planning, game playing, all the things that AI people work on.
And then for the last few years I've focused on something that has concerned me for a long time. I wrote a textbook in 1994 that included a section of a chapter on what happens if we succeed in AI,
meaning what happens if we actually build machines
that are more intelligent than us.
What does that mean?
So that was sort of an intellectual question,
and it's become a little bit more urgent in the last few years as progress is accelerating and the resources going into AI have grown enormously.
So I'm really asking people to take the question seriously.
What happens if we succeed?
As you know, I've joined the chorus of people who really in the last two years have begun worrying out loud about the consequences of AI or the consequences of us not building it with
more or less perfect conformity to our interests. And one of the things about this chorus is that it's mostly made up of
non-computer scientists, and therefore people like myself or Elon Musk or even physicists like
Max Tegmark and Stephen Hawking are seemingly dismissed with alacrity by computer scientists
who are deeply skeptical of these worried noises we're making.
And now you are not so easily dismissed because you have really the perfect bona fides of a
computer scientist. So I want to get us into this territory. And I don't actually know that you are
quite as worried as I have sounded publicly. So if there's any difference between your take and mine, that would be interesting to explore. But at some point I'd also like you to express
the soundest basis for this kind of skepticism, the view that we are crying wolf in a way that is
unwarranted. But before we get there, I just want to ask you a few questions to get our bearings.
The main purpose here is also just to educate our listeners about what artificial intelligence is and what its implications are,
whether everything goes well or less than well. So a very disarmingly simple
question here at first: what is a computer? Well, pretty much everyone these days has a computer, but doesn't necessarily understand what it is. The way it's presented to the public, whether it's your smartphone or your laptop, is as something that runs a bunch of applications, and the applications do things like edit Word documents, allow face-to-face video chat, and so on.
And what people may not understand is that a computer is a universal machine,
that any process that can be described precisely
can be carried out by a computer,
and every computer can simulate every other computer.
And this property of universality means that intelligence itself is something that a computer can, in principle, emulate.
And this was realized, among others, by Ada Lovelace in the 1840s when she was working with Charles
Babbage. They had this idea that the machine they were designing might be a universal machine,
although they couldn't define that very precisely. And so the immediate thought is,
well, if it's universal, then it can carry out the processes of intelligence,
as well as ordinary mechanical calculations. So a computer is really anything you want
that you can describe precisely enough to turn into a program.
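To make that idea of universality slightly more concrete, here is a minimal sketch in Python, with an instruction set and program invented purely for illustration: one machine (the Python interpreter) simulating another, hypothetical machine whose programs are just lists of precisely described steps.

```python
# A minimal sketch of universality (illustrative only): a Python program,
# itself one kind of machine, simulating another, made-up machine whose
# "programs" are lists of precisely described instructions.

def run(program, x):
    """Interpret a list of (op, arg) instructions on a single register."""
    reg = x
    for op, arg in program:
        if op == "add":
            reg += arg
        elif op == "mul":
            reg *= arg
        else:
            raise ValueError(f"unknown instruction: {op}")
    return reg

# This hypothetical program computes 3*x + 1 on the simulated machine.
affine = [("mul", 3), ("add", 1)]
print(run(affine, 5))  # -> 16
```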
So relate the concept of information to that. These sound like very simple questions,
but they're disconcertingly deep, I'm aware. I think everyone understands that out there is a world,
the real world, and we don't know everything about the real world. So it could be one way,
or it could be another. In fact, there are a gazillion different ways the world could be: all the cars that are out there parked could be parked in different places and I wouldn't even know it. So there are many, many ways the world could be. And information is just something that tells you a little bit more about which way the real world is out of all the possibilities it could be.
And as we get more and more information about the world, typically through our eyes and ears, and increasingly through the Internet, that information helps to narrow down the ways the real world could be.
And Claude Shannon, who was an electrical engineer at MIT,
figured out a way to actually quantify the amount of information.
So if you think about a coin flip,
if I can tell you which way that coin is going to come out, heads or tails, then that's one bit of information; it gives you the answer to a binary choice between two things. And from information theory we have wireless communication, we have the Internet, we have all the things that allow computers to talk to each other through the physical medium. So information theory has been, in some sense, the complement or the handmaiden of computation, allowing the whole information revolution to happen.
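As a rough worked example of Shannon's measure (my own illustration, not something from the conversation): learning the outcome of an event with probability p conveys -log2(p) bits, so a fair coin flip is exactly one bit.

```python
import math

def bits(p):
    """Information (in bits) gained on learning that an event of probability p occurred."""
    return -math.log2(p)

print(bits(0.5))      # fair coin flip             -> 1.0 bit
print(bits(1 / 6))    # one roll of a fair die     -> ~2.58 bits
print(bits(1 / 52))   # one card drawn from a deck -> ~5.70 bits
```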
Is there an important difference between what you just described,
computers and the information they process, and minds?
Let's leave consciousness aside for the moment, but if I asked you,
what is a mind, would you have answered that question differently?
So I think I would, because the word mind carries with it this notion, as you say, of consciousness. With the word mind, you can't really put aside the notion of consciousness.
Except if you're talking about the unconscious mind, you know, like all the unconscious cognitive processing we do, does mind seem a misnomer there without consciousness?
It might. Yeah, unconscious mind is kind of like saying artificial grass. It isn't grass, but it kind of is like grass.
So just to give you a quote, John Haugeland has written a lot about AI. He's a philosopher, and he describes this notion of, you know, perceptual experience.
And I actually think this is an incredibly important thing because without that, nothing has moral value.
There are lots of complicated physical processes in the universe, you know, stars exploding and rivers and glaciers melting and all kinds of things like that. But
none of that has any moral value associated with it. The things that generate moral value are
things that have conscious experience. So that's a really important topic, but AI has nothing to say about it
whatsoever. Well, not yet. I guess we're going to get there in terms of if consciousness is at
some level just an emergent property of information processing, if in fact that is the punchline at
the back of the book of nature, well then we need to think about the implications of building
conscious machines, not just intelligent machines. But you introduced a term here which we should define. You talked about
strong versus weak AI, and I guess the more modern terms are narrow versus general
artificial intelligence. Can you define those for us?
Right. So the words strong and weak have actually changed their meaning over time. So strong AI was, I believe, a phrase introduced by John Searle, referring to the claim that a machine with the full functional capabilities of intelligence is in all probability going to be a conscious device, that the functional properties of intelligence and consciousness are inseparable.
And so strong AI is sort of the super ambitious form of AI,
and weak AI was about building AI systems that have the capabilities you want them to have, but don't necessarily have the consciousness or the first-person experience. And then I think a number of people, both inside and outside the field, have used strong and weak AI in various different ways. Nowadays the more common phrase is general AI, meaning building AI systems that have capabilities comparable to or greater than those of humans, without any opinion being given on whether there's consciousness or not.
And then narrow AI, meaning AI systems that don't have the generality.
They might be very capable, like AlphaGo is a very capable Go player, but it's narrow in the sense that it can't do anything else.
So we don't think of it as general-purpose intelligence. Right. And given that consciousness is something that we just don't have a philosophical handle on, let alone a scientific handle, I think for the time being we'll just have to put it to one side, and the discussion is going to have to focus on capabilities, on the functional properties of intelligent systems.
Well, there's this other term in this area, which strikes me as a term that names almost nothing possible, and that's human-level AI. And that is,
you know, it's often put forward as kind of the nearer landmark to super-intelligent AI or
something that's beyond human. But it seems to me that even our narrow AI at this point, you know, the calculator in your
phone or anything else that gets good enough for us to dignify it with the name intelligence,
very quickly becomes superhuman even in its narrowness. So the phone is a better calculator
than I am or will ever be. And if you imagine building a system that is a true general intelligence, whose learning is not confined to one domain as opposed to another, but which is much more like a
human being in that it can learn across a wide range of domains without having the learning in
one domain degrade its learning in another. Very quickly, if not immediately, we'll be talking about superhuman AI, because presumably this
system is not going to be a worse calculator than my phone, right? It's not going
to be a worse chess player than Deep Blue. At a certain point, it's going to very quickly be
better than humans at everything it can do. So is human-level AI a mirage, or is there some serviceable way to think about that concept?
So I think human-level AI is just a notional goal. And I basically agree with you that if we
can achieve the generality of human intelligence, then we will probably exceed on many dimensions the actual capabilities of humans.
So there are things that humans do that we really have no idea how to do yet.
For example, what humans have done collectively in terms of creating science, we don't know how to get machines to do something like that.
We don't know how to get machines to do something like that.
I mean, we can imagine that theoretically it's possible.
You know, somewhere in the space of programs,
there exists a program that could be a high-quality scientist, but we don't know how to make anything like that.
So it's possible that we could have human-level capabilities on sort of all the
mundane intellectual tasks that don't require these really creative reformulations of our
whole conceptual structure that happen from time to time in science. And this is sort of what's happening already, right?
I mean, as you say, in areas where computers become competent, they quickly become super competent.
And so we could have super competence across all the mundane areas, like the ability to read a book and answer sort of the kinds of questions that, you know, an undergraduate could answer by reading
a book, we might see those kinds of capabilities. But it might be then quite a bit more work,
which we may not learn how to do, to get it to come up with the kinds of answers that, you know, a truly creative and deep-thinking human could come up with from looking at the same material. But this is something that at the moment is very speculative. I mean, what we do see is the beginning of generality.
So you'll often see people in the media claiming,
oh, well, you know, computers can only do
what they're programmed to do.
They're only good at narrow tasks.
But when you look at, for example, DQN,
which was Google DeepMind's first system
that they demonstrated,
so this learned to play video games
and it learned completely from scratch.
So it was like a newborn baby opening its eyes for the first time. It has no idea what kind of a world it's in. It
doesn't know that there are objects or that things move or there's such a thing as time
or good guys and bad guys or cars or roads or bullets or spaceships or anything, just like a newborn baby. And then within a few hours of messing around with a video game,
essentially through a camera, so it's really just looking at the screen.
It doesn't have direct access to the internal structures of the game at all.
It's looking at the screen.
Very much the way a human being is interfacing with a game.
Yeah, exactly.
The only thing it knows is that it wants to get more points.
And so within a few hours, it's able to learn a wide range of games.
So most of the games that Atari produced,
it reaches a superhuman level of performance in a few hours,
entirely starting from nothing.
And it's important to say that it's the same algorithm playing all the games. This is not like Deep Blue, which is the best chess player but can't play tic-tac-toe and will never play tic-tac-toe. This is a completely different approach. Correct. Yep.
This is one algorithm, you know, it could be a driving game, it could be Space Invaders,
it could be Pac-Man, it could be an undersea game, Seaquest, with submarines.
So in that sense, when you look at that, if your baby did that, woke up the first day in the hospital,
and by the end of the day was beating everyone, beating all the doctors at Atari Video Games, you'd be pretty terrified.
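For readers curious what "learning from nothing but the screen and the score" looks like in code: DQN couples a convolutional network with Q-learning plus several training tricks, so the sketch below is only the bare tabular Q-learning update that DQN generalizes, with placeholder action names and no claim to match DeepMind's implementation.

```python
import random
from collections import defaultdict

# Bare-bones tabular Q-learning, the update rule that DQN generalizes with a
# deep network. States and actions here are placeholders for illustration.

alpha, gamma, epsilon = 0.1, 0.99, 0.1          # learning rate, discount, exploration rate
actions = ["left", "right", "fire", "noop"]
Q = defaultdict(float)                          # Q[(state, action)] -> estimated future score

def choose_action(state):
    """Epsilon-greedy: usually exploit the best-known action, occasionally explore."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Nudge Q toward the observed reward plus the discounted best future estimate."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```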
So it's demonstrating generality up to a point, right?
There are certain characteristics of video games that don't hold for the real world in general. One of the main things being that, in a video game, the idea is that you can see everything on the screen. But in the real world, at any given point, there's tons of the real world that you can't see. But it still matters.
And then also with video games, they tend to have very short horizons because you're supposed to play them in the pub when you're drunk or whatever.
So they typically, unlike chess, they don't require deep thought about the long-term consequences of your choices.
But other than those two things, which are certainly important, something like DQN and
various other reinforcement learning systems are beginning to show generality.
And we're seeing with the work in computer vision that the same basic technology, the convolutional deep networks
and their recurrent cousins,
that these technologies
with fairly small modifications,
not really conceptual changes,
just sort of minor changes
in the details of the architecture,
can learn a wide range of tasks
to an extremely high level,
including recognizing thousands of
different categories of objects in photographs, doing speech recognition, learning to even
write captions for photographs, learning to predict what's going to happen next in a video
and so on and so forth.
So arguably, you know, if there is going to be an explosion of capabilities that feeds on itself, I think we may be seeing the beginning of it.
Now, what are the implications with respect to how people are designing these systems? So if I'm not mistaken, most, if not all, of these deep learning approaches,
or more generally machine learning approaches, are essentially black boxes in which you can't
really inspect how the algorithm is accomplishing what it is accomplishing. Is that the case? And
if so, or wherever it is the case, are there implications there that we need to be
worried about? Or is that just a novel way of doing business, which doesn't raise any special
concerns? Well, I think it raises two kinds of concerns. One, maybe three. So one is a very
practical problem that when it's not working, you really don't know why it's not working.
And there is a certain amount of blundering about in the dark. Some people call this graduate student descent, which is a very nerdy joke. Gradient descent, you know, walking down a hill, is a way to find the lowest point.
And so graduate student descent, meaning that you're trying out different system designs,
and in the process, you're using up graduate students at a rapid rate.
And that's clearly a drawback.
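For anyone who hasn't seen the "walking downhill" picture written down, here is a minimal gradient descent sketch; the example function and step size are arbitrary choices for illustration, not anything from Russell's work.

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Walk downhill: repeatedly step against the gradient of the function."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Example: minimize f(x) = (x - 3)^2, whose gradient is 2*(x - 3).
print(gradient_descent(lambda x: 2 * (x - 3), x0=0.0))  # -> approximately 3.0
```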
You know, and in my research, I've generally favored techniques where the design of a system
is derived from the characteristics of the problem
that you're trying to solve.
And so the function of each of the components is clearly understood, and you can show that the system is going to do what it's supposed to do for the right reasons.
In the black box approach, there are people who just seem to have great intuition about how
to design the architecture of these deep learning networks so that they produce good performance.
I think there are also practical questions from the legal point of view, that there are a lot of areas, for example,
medical diagnosis or treatment recommendations, recommending for or against parole for
prisoners, approving credit or declining credit applications, where you really want a clear explanation of why the
recommendation is being made.
And without that, people simply won't accept the use of the system.
And one of the reasons for that is that a black box could be making decisions that are biased, racially biased, for example, and without the ability to explain itself, you can't trust that the system is unbiased.
And then there's a third set of reasons,
which I think is what's behind your question,
about why we might be concerned with systems that are entirely black box.
Since we can't understand how the system is reaching its decisions or what it's doing, that gives us much less control. So as
we move towards more and more capable and perhaps general intelligent systems, the fact that we
really might have no idea how they're working or what
they're thinking about, so to speak, that would give you some concern. Because one of the reasons the AI community often gives for why they're not worried, the reason the people who are skeptical about there being a risk give, is that, well, we designed these systems; obviously we would design them so that they did what we want. But if they are completely opaque black boxes and you don't know what they're doing, then that sense of control and safety disappears.
Let's talk about that issue of what Bostrom called the control problem. I guess we could
call it the safety problem as
well. And many people listening will have watched my TED Talk where I spend 14 minutes worrying about this, but perhaps you can briefly sketch the concern here. What is the concern about general AI getting away from us? How do you articulate that?
So you mentioned earlier that this is a concern that's being articulated by non-computer scientists.
And Bostrom's book, Superintelligence, was certainly instrumental in bringing it to the attention of a wide audience, people like Bill Gates and Elon Musk and so on. But the fact is
that these concerns have been articulated by
the central figures in computer science and AI. So I'm actually going to...
Going back to I. J. Good and von Neumann.
Well, and Alan Turing himself.
Right.
So people, a lot of people may not know about this, but I'm just going to read a little quote.
So Alan Turing gave a talk on BBC Radio, Radio 3, in 1951.
So he said, if a machine can think, it might think more intelligently than we do.
And then where should we be?
Even if we could keep the machines in a
subservient position, for instance, by turning off the power at strategic moments, we should,
as a species, feel greatly humbled. This new danger is certainly something which can give us
anxiety. So that's a pretty clear statement that, you know, if we achieve super intelligent AI, we could have a serious problem.
Another person who talked about this issue was Norbert Wiener.
So Norbert Wiener was one of the leading applied mathematicians of the 20th century.
He was the founder of a good deal of modern control theory and automation.
He's often called the father of cybernetics.
So he was concerned because he saw Arthur Samuel's checker playing program in 1959,
learning to play checkers by itself, a little bit like the DQN that I described, learning to play video games.
But this is 1959,
so more than 50 years ago, learning to play checkers better than its creator. And he saw clearly in this the seeds of the possibility of systems that could out-distance human beings in
general. And he was more specific about what the problem is. So Turing's
warning is in some sense, the same concern that gorillas might've had about humans. If they had
thought, you know, a few million years ago when the human species branched off from the
evolutionary line of the gorillas, if the gorillas had said to themselves, you know, should we create
these human beings, right? They're going to be much smarter than us. You know, it kind of makes me worried, right? And then they would have been
right to worry because as a species, they sort of completely lost control over their own future
and humans control everything that they care about. So Turing is really talking about this
general sense of unease about making something smarter than you.
Is that a good idea?
And what Wiener said was this.
If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively, we had better be quite sure that the purpose put into the machine is the purpose which we really desire.
So this is 1960.
Nowadays, we call this the value alignment problem.
How do we make sure that the values that the machine is trying to optimize are, in fact,
the values of the human who is trying to get the machine to do something or the values
of the human race in general?
And so Wiener actually points to the sorcerer's apprentice story as a typical example of when you give a goal to a machine,
in this case fetch water, if you don't specify it correctly,
if you don't cross every T and dot every I and make sure you've
covered everything, then machines being optimizers, they will find ways to do things that you don't
expect. And those ways may make you very unhappy. And this story goes back to King Midas, 500-and-whatever BC, where he got exactly what he asked for, which is that everything he touches turns to gold, which is definitely not what he wanted. He didn't want his food and water to turn to gold or his relatives to turn to gold, but he got what he said he wanted.
And all of the stories with the genies are the same thing, right? You give a wish to a genie, the genie carries out your wish very literally. And then, you know, the third wish is always, can you undo the first two
because I got them wrong. And the problem with super intelligent AI is that you might not
be able to have that third wish. Or even a second wish. Yeah. So if you get it wrong,
and you might wish for something very benign sounding like, you know, could you cure cancer?
But you might not have told the machine that, as well as wanting cancer cured, you also want human beings to be alive.
So a simple way to cure cancer in humans is not to have any humans.
A quick way to come up with a cure for cancer is to use the entire human race as guinea pigs for millions of different
drugs that might cure cancer.
So there's all kinds of ways things can go wrong.
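As a toy illustration of that kind of misspecified objective (entirely invented here, not any real system): if the stated objective counts only remaining cancer cases, a literal-minded optimizer is indifferent to everything we forgot to write down, such as keeping patients alive.

```python
# Toy illustration of objective misspecification (invented for illustration,
# not a real system). The stated objective counts only cancer cases, so the
# optimizer is indifferent to everything we forgot to include.

outcomes = {
    "develop_safe_drug":    {"cancer_cases": 10, "people_alive": 100},
    "mass_experimentation": {"cancer_cases": 0,  "people_alive": 40},
    "eliminate_patients":   {"cancer_cases": 0,  "people_alive": 0},
}

def stated_objective(o):
    return -o["cancer_cases"]                      # what we told the machine

def intended_objective(o):
    return -o["cancer_cases"] + o["people_alive"]  # what we actually want

print(max(outcomes, key=lambda a: stated_objective(outcomes[a])))    # -> 'mass_experimentation' (ties with 'eliminate_patients')
print(max(outcomes, key=lambda a: intended_objective(outcomes[a])))  # -> 'develop_safe_drug'
```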
And, you know, governments all over the world try to write tax laws that
don't have these kinds of loopholes, and they fail over and over and over again. And they're only competing
against ordinary humans, you know, tax lawyers and rich people. And yet they still fail despite
there being billions of dollars at stake. So our track record of specifying objectives and constraints completely, so that we are sure to be happy with the results, is abysmal. And unfortunately,
we don't really have a scientific discipline for how to do this. So generally, we have all
these scientific disciplines, AI, control theory, economics, operations research, that are
about how do you optimize an objective?
But none of them are about, well, what should the objective be so that we're happy with
the results?
So that's really, I think, the modern understanding, as described in Bostrom's book and other
papers, of why a super intelligent machine could be problematic.
It's because if we give it an objective
which is different from what we really want,
then we're basically like creating a chess match
with a machine, right?
Now there's us with our objective
and it with the objective we gave it
which is different from what we really want.
So it's kind of like having a chess match for the whole world. And we're not too good at beating machines at chess. That's a great image, a chess match for the whole world.
I want to drill down on a couple of things you just said there, because I'm hearing the skeptical
voice even in my own head, even though I think I have smothered
it over the last year of focusing on this. But it's amazingly easy, even for someone like me,
and this was really kind of the framing of my TED Talk, where I was just talking about
these concerns and the value alignment problem, essentially. But the real message of my talk was that it's very
hard to take this seriously emotionally, even when you are taking it seriously intellectually.
There's something so diaphanous about these concerns, and they seem so far-fetched,
even though you can't give an account, or I certainly haven't heard anyone give an
account, of why, in fact, they are far-fetched when you look closely at them. So, like, you know,
the idea that you could build a machine that is super intelligent and give it the instruction to
cure cancer or fetch water and not have anticipated that one possible solution to that problem was to
kill all of humanity or to fetch the water from
your own body. It just seems that we have an assumption that things couldn't conceivably go
wrong in that way. And I think the most compelling version of pushback on that front has come to me
from people like David Deutsch, who you probably know. He's one of the fathers of quantum computing, or of the concept at least, a physicist at Oxford who's been on the podcast. He argues,
and this is something that I don't find compelling, but I just want to put it forward,
and I've told him as much. He argues that superintelligence entails an ethics. If we've built a superintelligent system,
we will have given it our ethics to some approximation, but it will have a better
ethics than ourselves, almost by definition. And to worry about the values of any intelligent
systems we build is analogous to worrying about the values of our descendants
or our future teenagers, where they might have different values, but they are an extension of
ourselves. And now we're talking about an extension of ourselves that is more intelligent than we are
across the board. And I could be slightly misrepresenting him here, but this is close
to what he advocates, that there is something about that
that should give us comfort, almost in principle, that there's just no... Obviously, we could
stupidly build a system that's going to play chess for the whole world against us that is malicious,
but we wouldn't do that. And what we will build is, by definition, going to be a more intelligent extension of the best of our ethics.
I mean, that's a nice dream.
But as far as I can see, it's nothing more than that.
There's no reason why the capability to make decisions successfully is associated with any...
If you'd like to continue listening to this conversation,
you'll need to subscribe at SamHarris.org.
Once you do, you'll get access to all full-length episodes of the Making Sense podcast,
along with other subscriber-only content,
including bonus episodes and AMAs
and the conversations I've been having on the Waking Up app.
The Making Sense podcast is ad-free and relies entirely on listener support.
And you can subscribe now at SamHarris.org.