Making Sense with Sam Harris - #324 — Debating the Future of AI
Episode Date: June 28, 2023
Sam Harris speaks with Marc Andreessen about the future of artificial intelligence (AI). They discuss the primary importance of intelligence, possible good outcomes for AI, the problem of alienation, ... the significance of evolution, the Alignment Problem, the current state of LLMs, AI and war, dangerous information, regulating AI, economic inequality, and other topics. If the Making Sense podcast logo in your player is BLACK, you can SUBSCRIBE to gain access to all full-length episodes at samharris.org/subscribe. Learning how to train your mind is the single greatest investment you can make in life. That's why Sam Harris created the Waking Up app. From rational mindfulness practice to lessons on some of life's most important topics, join Sam as he demystifies the practice of meditation and explores the theory behind it.
Transcript
To access full episodes of the Making Sense podcast, you'll need to subscribe at samharris.org. There you'll find our private RSS feed
to add to your favorite podcatcher,
along with other subscriber-only content.
We don't run ads on the podcast,
and therefore it's made possible entirely
through the support of our subscribers.
So if you enjoy what we're doing here,
please consider becoming one.
Okay. Well, there's been a lot going on out there.
Everything from Elon Musk and Mark Zuckerberg challenging one another to an MMA fight,
which is ridiculous and depressing,
to Robert Kennedy Jr. appearing on every podcast on earth,
apart from this one.
I have so far declined the privilege.
It really is a mess out there.
I'll probably discuss the RFK phenomenon in a future episode,
because it reveals a lot about what's wrong with alternative media at the moment.
But I will leave more of a post-mortem on that for another time.
Today I'm speaking with Marc Andreessen.
Marc is a co-founder and general partner at the venture capital firm Andreessen Horowitz.
He's a true internet pioneer.
He created the Mosaic internet browser and then co-founded Netscape.
He's co-founded other companies and invested in too many to count. Marc holds a degree in computer science from the University of Illinois,
and he serves on the board of many Andreessen Horowitz portfolio companies. He's also on the
board of Meta. Anyways, you'll hear Marc and me get into a fairly spirited debate about the future of
AI. We discuss the importance of intelligence generally and the
possible good outcomes of building AI, but then we get into our differences around the risks or
lack thereof of building AGI, artificial general intelligence. We talk about the significance of
evolution in our thinking about this, the alignment problem, the current state of large language models, how developments
in AI might affect how we wage war, what to do about dangerous information, regulating AI,
economic inequality, and other topics. Anyway, it's always great to speak with Marc. We had a lot of fun here. I hope you find it useful. And now I bring you Marc Andreessen.
I am here with Marc Andreessen. Marc, thanks for joining me again.
It's great to be here, Sam. Thanks.
I got you on the end of a swallow of some delectable beverage.
Yes, you did. So this should be interesting. I'm eager to speak
with you specifically about this recent essay you wrote on AI. And so obviously many people have
read this and you are a voice that many people value on this topic, among others. Perhaps you've
been on the podcast before and people know who you are,
but maybe you can briefly summarize how you come to this question. I mean, how would you summarize
the relevant parts of your career with respect to the question of AI and its possible ramifications?
Yeah. So I've been a computer programmer, technologist, computer scientist since the 1980s. When I entered college in 1989 at the University of Illinois, the AI field had been through a boom in the 80s, which had crashed hard. And so by the time I got to college, the AI wave was dead and buried, and had been for a while. It was like the backwater of the department that nobody really wanted to talk about.
But, you know, I learned a lot of it in school. And then I went on to, you know, help create what is now kind of known as the modern internet in the 90s. And then over time I went from being a technologist to being an entrepreneur. And today I'm an investor, a venture capitalist. And so 30, 35 years later, I'm involved in a very broad cross-section of tech companies, many of which have AI aspects to them, everything from Facebook, now Meta, which has been investing deeply in AI for over a
decade, through to many of the best new AI startups. You know, our day job is to find the
best new startups in a new category like this and try to back the entrepreneurs. And so that's at the core of my interest here.
But I think there are many things, you know, we agree about. Up front, we agree, I think, with more or less anyone who thinks about it, that intelligence is good, and we want more of it. And if it's not necessarily the source of everything that's good in human life, it is what will safeguard everything that's good in human life, right?
So even if you think that love is more important than intelligence, and you think that playing on the beach with your kids is way better than doing science or anything else that is narrowly linked to intelligence,
you have to admit that you value all of the things that intelligence will bring that will safeguard the things you
value. So a cure for cancer and a cure for Alzheimer's and a cure for a dozen other things
will give you much more time with the people you love, right? So whether you think about the
primacy of intelligence or not very much, it is the thing that has differentiated us from our
primate cousins, and it's the thing that allows us to do everything that maintains civilization. And if the future is going to be better than the
past, it's going to be better because of what we've done with our intelligence in some basic
sense. And I think we're going to agree that because intelligence is so good, and because
each increment of it is good and profitable, this AI arms race and gold rush is not going to stop,
right? We're not going to pull the brakes here and say, let's take a pause of five years
and not build any AI, right? I mean, I think that's, I don't remember if you addressed that
specifically in your essay, but even if some people are calling for that, I don't think
that's in the cards. I don't think you think that's in the cards.
Well, there are, you know, it's hard to believe that you just, like, put it back in the box,
right, and stop working on it. It's hard to believe that the progress stops. You know,
like having said that, there are some powerful and important people who are in Washington right
now advocating that. And there are some politicians who are taking them seriously. So they're, you
know, at the moment, there is some danger around that. And then look, there are two other big dangers, two scenarios that I think would both be very, very devastating for the future.
One is the scenario where the fears around AI are used to basically entrench a cartel.
And this is what's happening right now. This is what's being lobbied for right now: there is a set of big companies arguing in Washington, yes, AI has positive uses, but yes, AI is also dangerous. Therefore, we need a regulatory
structure that basically entrenches a set of currently powerful tech companies to be able to
have basically exclusive rights to do this technology. I think that would be devastating
for reasons we could discuss. And then look, there's a third outcome, which is we lose,
China wins. They're certainly working on AI and they have what I would consider to be a very dark
and dystopian vision of the future,
which I also do not want to win. Yeah. I mean, I guess that is in part the
cash value of the point I just made that even if we decided to stop, not everyone's going to stop,
right? I mean, human beings are going to continue to grab as much intelligence as we can grab,
even if in some local spot we decide to pull the brakes.
Although at this point it's hard to imagine any regulation, whatever it is, really stalling progress. I mean, again, given the intrinsic value of intelligence
and given the excitement around it and given the obvious dollar signs that everyone is seeing.
I mean, the incentives are such that I just don't
see it. But we'll come to the regulation piece eventually, because I think it's, you know,
given the difference in our views here, it's not going to be a surprise that I want some form of
regulation, and I'm not quite sure what that could look like. And I think you would have a better
sense of what it looks like, and perhaps that's why you're worried about it. But before we talk about the fears here,
let's talk about the good outcome. Because I know you don't consider yourself a utopian,
but you sketch a fairly utopian picture of promise in your essay. If we got this right,
how good do you think it could be?
Yeah. So I should start by saying I kind of deliberately loaded the title of the essay
with a little bit of a religious element. And I did that kind of very deliberately because I
view that I'm up against a religion, the sort of AI risk fear religion. But I am not myself
religious, lowercase-r religious, in that sense. I'm not a utopian. I'm an adherent of what Thomas Sowell called the constrained vision, not the
unconstrained vision. So I live in a world of practicalities and trade-offs. And so, yeah,
I am actually not utopian. Look, having said that, building on what you've already said,
like intelligence, if there is a lever for human progress across many thousands of domains
simultaneously, it is intelligence. And we just, we know that because we have thousands of years of experience seeing that play out. The thing I would add to,
I thought you made that case very well. The thing I would add to the case you made about the positive
virtues of intelligence in human life is that the way you described it, at least the way I heard it
was more focused on like the social, societal wide benefits of intelligence, for example,
cures for diseases and so forth. That is true. And I agree with all that. There are also individual level benefits of intelligence, right? At the
level of an individual, even if you're not the scientist who invents a cure for cancer, at an
individual level, if you are smarter, you have better life welfare outcomes on almost every
metric that we know how to measure everything from how long you'll live, how healthy you'll be,
how much education you'll achieve, career success, the success of your children.
By the way, your ability to solve problems, your ability to deal with conflict,
smarter people are less violent, smarter people are less bigoted. And so there's this very broad kind of pattern of human behavior, where basically more intelligence, you know, just simply at the
individual level leads to better outcomes. And so the sort of most utopian I'm willing to get
is sort of this potential,
which I think is very real right now. It's already started where you basically just say,
look, human beings from here on out are going to have an augmentation. And the augmentation is
going to be in the long tradition of augmentations, like everything from eyeglasses to shoes to
word processors to search engines. But now the augmentation is intelligence. And that augmented
intelligence capability is going to let them capture the gains of individual level intelligence, you know,
potentially considerably above, you know, where they punch in as individuals. And what's
interesting about that is that can scale all the way up, right? Like, you know, somebody who is,
you know, somebody who struggles with, you know, daily challenges, all of a sudden is going to have
a partner and an
assistant and a coach and a therapist and a mentor to be able to help improve a variety of things in
their lives. And then, look, if you had given this to Einstein, he would have been able to
discover a lot more new fundamental laws of physics in the full vision. And so this is one
of those things where it could help everybody, and it could help everybody in many, many different ways.
Yeah. And it's something you can be continuously in dialogue with. It'd be like having the smartest person
who's ever lived, just giving you a bespoke concierge service to, you know, all manner of
tasks and, you know, across any information landscape. And I just, I happened to recently
re-watch the film Her, which I hadn't seen since it came out. So it came out 10 years ago,
and I don't know if you've seen it lately, but I must say it lands a little bit differently now
that we're on the cusp of this thing. And while it's not really dystopian, there is something a
little uncanny and quasi-bleak around even the happy vision here of having everyone siloed in their interaction
with an AI. I mean, it's the personal assistant in your pocket that becomes so compelling
and so aware of your goals and aspirations and what you did yesterday and the email you sent
or forgot to send. And apart from the ending,
which is kind of clever and surprising and kind of irrelevant for our purposes here,
it's not an aspirational vision of the sort that you sketch in your essay. And I'm wondering,
if you see any possibility here that even the best case scenario has something
intrinsically alienating and troublesome about it.
Yeah. So look, on the movie, you know, as Peter Thiel has pointed out, like Hollywood no longer makes positive movies about technology.
And then look, you know, he argues it's because they hate technology.
But I would argue maybe a simpler explanation, which is they want dramatic tension and conflict.
Right. And so necessarily,
things are going to have a dark tinge. Regardless, they obviously spring-loaded it by their choice of character and so forth. The scenario I have in mind is actually quite a bit different. And let
me get kind of maybe philosophical for a second, which is there's this long-running debate. This
question that you just raised is a question that goes back to the Industrial Revolution.
And remember, it goes back to the core of Marx's original theory. Marx's original theory was that industrialization, technology, modern economic development, right, alienates the human being from society. That was his core
indictment of technology. And look, like there are, you can point to many, many cases in which
I think that has actually happened. Like I think alienation is a real problem. I, you know, I don't
think that critique was entirely wrong. His prescriptions
were disastrous, but I don't think the critique was completely wrong. Look, having said that,
then it's a question of like, okay, now that we have the technology that we have, and we have,
you know, new technology we can invent, like, how could we get to the other side of that problem?
And so I would, I would put the shoe on the other foot. And I would say, look, the,
the purpose of human existence and the way that we live our lives should be determined by us,
and it should be determined by us to maximize our potential as human beings. And the way to do that
is precisely to have the machines do all the things that they can do so that we don't have to.
Right. And this is why Marx's critique, I think, has in the long run been judged to be incorrect, which is that we are all much better off. Anybody in the developed
West, you know, industrialized West today is much better off by
the fact that we have all these machines that are doing everything from making shoes to harvesting
corn to doing everything, you know, so many other, you know, industrial processes around us.
Like we just have a lot more time and a much more pleasant, you know, day-to-day life,
you know, than we would have if we were still doing things the way that things used to be done.
The potential with AI is just like, look, take the drudge work out.
Like, take the remaining drudge work out.
Take all the, you know, look.
I'll give you a simple example, office work.
That, you know, the inbox staring at you in the face with 200 emails, right,
Friday at 3 in the afternoon.
Like, okay, no more of that, right?
Like, we're not going to do that anymore because I'm going to have an AI assistant.
The AI assistant is going to answer the emails, right?
And, in fact, what's going to happen is my AI assistant is going to answer the email that
your AI assistant sent.
It's mutually assured destruction.
Yeah,
exactly.
But like the machine should be doing that.
Like, the human being should not be sitting there doing emails when it's sunny out and my eight-year-old wants to play. I should be out with my eight-year-old.
There should be a machine that does that for me.
And so I view this very much as basically apply the machines to do
the drudge work precisely so that people can live more human lives. Now, this is philosophical.
People have to decide what kind of lives they want to live. And again, I'm not a utopian on this. And
so there's a long discussion we could have about how this actually plays out. But that potential
is there for sure. Right. Right. Okay. So let's jump to the bad outcomes here, because this is
really why I want to talk to you. In your essay, you list five, and I'll just read your section titles here, and then we'll
take a whack at them. The first is, will AI kill us all? Number two is, will AI ruin our society?
Number three is, will AI take all our jobs? Number four is, will AI lead to crippling inequality?
And five is,
will AI lead to people doing bad things? And I would tend to bin those in really two buckets.
The first is, will AI kill us? And that's the existential risk concern. And the others are more the ordinary bad outcomes that we tend to think about with other technology,
bad people doing bad things
with powerful tools, unintended consequences, disruptions to the labor market, which I'm sure
we'll talk about. And all of those are certainly the near-term risks, and they're in some sense
even more interesting to people because the existential risk component is longer-term,
and still purely hypothetical, and you seem to think it's purely fictional.
And this is where I think you and I disagree.
So let's start with this question of will AI kill us all?
And the thinking on this tends to come under the banner of the problem of AI alignment, right?
And the concern is that
if we build machines more powerful than ourselves, more intelligent than ourselves, it seems possible that the space of all possible more powerful superintelligent machines includes
many that are not aligned with our interests and not disposed to continually track our interests,
and many more of that sort than of the sort that perfectly hew to our interests in perpetuity.
So the concern is we could build something powerful that is essentially an angry little god that we can't figure out how to placate once we've built it.
And certainly we don't want to be negotiating with something more powerful and intelligent than ourselves.
And the picture here is of something like, you know, a chess engine, right? We've built chess engines that
are more powerful than we are at chess. And once we built them, if everything depended on our
beating them in a game of chess, we wouldn't be able to do it, right? Because they are simply
better than we are. And so now we're building something that is a general intelligence,
and it will be better than we are at everything that goes by that name, or such is the concern.
And in your essay, I mean, I think there's an ad hominem piece that I think we should blow by,
because you've already described this as a religious concern, and in the essay you describe
it as just a symptom of superstition and that people
are essentially in a new doomsday cult. And there's some share of true believers here,
and there's some share of AI safety grifters. And I'm sure you're right about some of these people,
but we should acknowledge upfront that there are many super qualified people of high probity who are prominent in the field of AI research who are part of this chorus voicing their concern now.
I mean, you've got somebody like Jeffrey Hinton, who arguably did as much as anyone to create the breakthroughs that have given us these LLMs.
We have Stuart Russell, who literally wrote the most popular textbook on AI. So there are other serious, sober people who are worried here. Yes, some of these people are in their polyamorous cults, and AI alignment is
their primary fetish. But there's a lot of sober people who are also worried about this. Would you
acknowledge that much? Yeah, although it's tricky, because smart people also have a tendency to fall
into cults. So that doesn't get you totally off the hook on that one. But I would register a more
fundamental objection to what I would describe as,
and this is not, I'm not knocking you on this,
but it's something that people do
as sort of argument by authority.
I don't think that applies either.
Yeah, well, I'm not making that argument yet.
No, I know.
But like this idea, which is very,
and again, I'm not characterizing your idea.
I'll just say it's a general idea.
This general idea that there are these experts
and these experts are experts
because they're the people who created the technology or originated the ideas or implemented the systems,
therefore have sort of special knowledge and insight in terms of their downstream impact
on society and rules and regulations and so forth and so on. That assumption does not hold up well
historically. In fact, it holds up disastrously historically. There's actually a new book out
I've been giving all my friends called When Reason Goes on Holiday. And it's a story of literally what happens when basically people who are like
specialized experts in one area stray outside of that area in order to become sort of general
purpose philosophers and sort of social thinkers. And it's just a tale of woe, right? And in the
20th century, it was just a catastrophe. And the ultimate example of that, and this is going to be
the topic of this big movie coming out this summer, Oppenheimer. You know, the central example of that was the
nuclear scientists who decided that, you know, nuclear power, nuclear energy, they had various
theories on what was good, bad, whatever. A lot of them were communists. A lot of them were,
you know, at least allied with communists. A lot of them had a suspiciously large number of
communist friends and housemates. And, you know, number one, like they, you know,
made a moral decision, a number of them did, to hand the bomb to the Soviet Union, you know,
with what I would argue are catastrophic consequences. And then two is they created
an anti-nuclear movement that resulted in nuclear energy stalling out in the West,
which has also just been like absolutely catastrophic. And so if you listen to those
people in that era who were, you know, the top nuclear physicists of their time, you would have made a horrible set of decisions, quite honestly.
Right. And it's not to say that,
I mean, especially in the cases you're describing, what we often have are people who have a narrow
authority in some area of scientific specialization, and then they begin to weigh in,
in a much broader sense, as moral philosophers. What I think you might be referring to there is that, you know, in the aftermath of Hiroshima and Nagasaki,
we've got nuclear physicists imagining that, you know,
that they now need to play the geopolitical game, you know,
and actually we have some of the people who invented game theory, right,
you know, for understandable reasons,
thinking they need to play the game of geopolitics.
And in some cases, I think in von Neumann's case,
he even recommended preventative war against the Soviet Union before they even got the bomb,
right? It could have gotten worse. I think he wanted us to bomb Moscow or at least give them
some kind of ultimatum. I don't think he wanted us to drop bombs in the dead of night, but I think
he wanted a strong ultimatum game played with them before they got the bomb. And I forget how he wanted that
to play out. And worse still, I think even Bertrand Russell, and I could have this backwards, maybe it was von Neumann who wanted to drop the bomb, but Bertrand Russell, a true moral philosopher, briefly
advocated preventative war. But in his case, I think he wanted to offer some kind of ultimatum
to the Soviets. In any case, that's a problem. But,
you know, at the beginning of this conversation, I asked you to give me a brief litany of your
bona fides to have this conversation so as to inspire confidence in our audience and also just
to acknowledge the obvious, that you know a hell of a lot about the technological issues we're
going to talk about. And so if you have strong opinions, they're not, you know, they're not
coming out of, totally out of left field. And so it would be with Jeffrey Hinton or anyone else. And if I threw
another name at you that was of some crackpot whose connection to the field was non-existent,
you would say, why should we listen to this person at all? You wouldn't say that about
Hinton or Stuart Russell. But I'll acknowledge that where authority breaks down is really you're only as good as your last sentence here, right?
If the thing you just said doesn't make any sense, well, then your authority gets you exactly nowhere, right?
We just need to keep talking about why it doesn't make sense.
Or it should.
Right.
Ideally, that's the case.
In practice, that's not what tends to happen.
But that would be the goal.
Well, I hope to give you that treatment here because some of your sentences, I don't think, add up the way you think they do.
Good. Okay, so actually, there's actually one paragraph in the essay that caught my attention
that really inspired this conversation. I'll just read it so people know what I'm responding to
here. So this is you. My view is that the idea that AI will decide to literally kill humanity
is a profound category
error.
AI is not a living being that has been primed by billions of years of evolution to participate
in the battle for survival of the fittest, as animals were, and as we are.
It is math, code, computers built by people, owned by people, used by people, controlled
by people.
The idea that it will at some point develop a mind of its own and decide that it has motivations that lead it to try to kill us is a superstitious handwave.
Yes. So, I mean, I see where you're going there.
I see why that may sound persuasive to people.
But to my eye, that doesn't even make contact with the real concern about alignment.
So let me just kind of spell out why I think that's the case.
Sure.
Because it seems to me that you're actually not taking intelligence seriously, right? Now,
I mean, some people assume that as intelligence scales, we're going to magically get ethics along
with it, right? So the smarter you get, the nicer you get. And while, I mean, there's some
data points with respect to how humans behave, and you just mentioned one a few minutes ago,
it's not strictly true even for humans. And
even if it's true in the limit, right, it's not necessarily locally true. And more important,
when you're looking across species, differences in intelligence are intrinsically dangerous for
the stupider species. So it need not be a matter of super intelligent machines spontaneously
becoming hostile to us and wanting to kill us. It could just be that they begin doing things that
are not in our well-being, right? Because they're not taking it into account as a primary concern,
in the same way that we don't take the welfare of insects into account as a primary concern,
right? So it's very rare that I intend to kill an insect,
but I regularly do things that annihilate them just because I'm not thinking about them, right?
I'm sure I've effectively killed millions of insects, right? If you build a house, you know,
that must be a holocaust for insects, and yet you're not thinking about insects when you're
building that house. So there are many other pieces to my gripe
here, but let's just take this first one. It just seems to me that you're not envisioning
what it will mean to be in relationship to systems that are more intelligent than we are. You're not
seeing it as a relationship. And I think that's because you're denuding intelligence of certain properties and not acknowledging it in this paragraph.
To my ear, general intelligence, which is what we're talking about, implies many things that are not in this paragraph.
It implies autonomy.
It implies the ability to form unforeseeable new goals.
In the case of AI, it implies the ability to change its own code, ultimately, and execute programs, right? I mean, it's just, it's doing
stuff because it is intelligent, autonomously intelligent. It is capable of doing just,
we can stipulate, more than we're capable of doing because it is more intelligent than we are at this point. So the superstitious hand-waving I'm seeing is in your paragraph when you're declaring that
it would never do this because it's not alive, right? As though the difference between biological
and non-biological substrate were the crucial variable here. But there's no reason to think
it's a crucial variable where intelligence is concerned.
Yeah. So I would say there's, to steel man your argument, I would say you could actually break your argument into two forms or the AI risk community would break this argument into two
forms. And they would argue, I think, the strong form of both. So they would argue the strong form
of number one, and I think this is kind of what you're saying, correct me if I'm wrong, is because
it is intelligent, therefore it will have goals. If it didn't start with goals, it will evolve goals. It will, you know, whatever it will, it will over time have a set of preferred
outcomes, behavior patterns that it will determine for itself. And then they also argue the other
side of it, which is what they call the orthogonality argument. It's another risk argument, but it's actually sort of the opposite argument.
It's an argument that it doesn't have to have goals to be dangerous, right?
And that being, you know, it doesn't have to be sentient.
It doesn't have to be conscious.
It doesn't have to be self-aware.
It doesn't have to be self-interested.
It doesn't have to be in any way like even thinking in terms of goals.
It doesn't matter because simply it can just do things.
And this is the, you know, this is the classic paperclip maximizer, you know, kind of argument.
Like it'll just get, it'll start, it'll get kicked off on one apparently innocuous thing, and then it will just extrapolate that ultimately to the destruction of everything, right?
So anyways, is that helpful to maybe break those into the-
I'm not quite sure how fully I would sign on the dotted line to each, but the one piece I would add to that is that having any goal does invite the formation of instrumental goals once this system is responding
to a changing environment, right? I mean, if your goal is to make paperclips and you're super
intelligent and somebody throws up some kind of impediment, you're making paperclips, well,
then you're responding to that impediment and now you have a shorter-term goal of
dealing with the impediment, right? So that's the structure of the problem.
Yeah, right. For example, the U.S. military wants to stop you from making more paperclips,
and so therefore you develop a new kind of nuclear weapon, right,
in order fundamentally to pursue your goal of making paperclips.
But one problem here is that the instrumental goal, even if the paperclip goal is the wrong example here,
because even if you think of a totally benign
future goal, right, a goal that seems more or less synonymous with taking human welfare into
account, it's possible to imagine a scenario where some instrumental new goal that could not be
foreseen appears that is in fact hostile to our interests. And if we're not in a position to say,
oh, no, no, don't do that, that would be a problem.
So that's the...
Yeah. So a full version of that, a version of that argument that you hear is basically,
what if the goal is maximize human happiness, right? And then the machine realizes that the
way to maximize human happiness is to strap us all down and put us in a Nozick experience machine, you know, and wire us up with VR and ketamine, right? And we can never get out of the matrix, and it will be maximizing human happiness as measured by things like dopamine levels or serotonin levels or whatever, but obviously not a positive outcome. But again, that's a variation of this paperclip thing. That's one of these arguments that comes out of their orthogonality thesis, which is that the goal could be very simple and innocuous, right, and yet lead to catastrophe. So look, I think each of these has their own problems.
So where you started, and we can quibble with terms here, is the side of the argument in which the machine is in some way self-interested, self-aware, self-motivated, trying to preserve itself, has some level of sentience or consciousness, and is setting its own goals.
Well, just to be clear, there's no consciousness implied here. The lights don't have to be on.
I think that this remains to be seen whether consciousness comes along for the ride at a
certain level of intelligence, but I think they probably are orthogonal to one another. So
intelligence can scale without the lights coming on, in my view.
So let's leave sentience and consciousness aside.
Well, but I guess there is a fork in the road,
which is like, is it declaring its own intentions?
Like, is it developing its own intentions, conscious or not? Does it have a sense, of any form, or a vision of any kind, of its own future?
Yeah, so this is where I think there's some daylight growing between us, because
to be dangerous, I don't think you need necessarily to be running a self-preservation
program. I mean, there's some version of unaligned competence that may not formally model the
machine's place in the world, much less defend that place, which could
still be, if uncontrollable by us, could still be dangerous, right? It's like it doesn't have to be
self-referential in a way that an animal... The truth is, there are dangerous animals that might
not even be self-referential. And certainly something like a virus or a bacterium is not
self-referential in a way that we would understand, and it can be lethal to our interests.
Yeah, yeah, that's right.
Okay, so you're more on the orthogonality side between the two, if I identify the two poles of the argument.
You're more on the orthogonality side, which is it doesn't need to be conscious, it doesn't need to be sentient, it doesn't need to have goals, it doesn't need to want to preserve itself.
Nevertheless, it will still be dangerous because of, as you describe it, the consequences of how it gets started and then what happens over time, for example, as it defines subgoals to the original goal and goes off course. So there are a couple of problems with that. One is, I would argue, there are cases where people give intelligence too much credit, and then there are cases where they don't give it enough credit.
Here, I don't think they're giving enough credit because it sort of implies that this
machine has like basically this infinite capacity to cause harm.
Therefore, it has an infinite capacity to basically actualize itself in the world.
Therefore, it has an infinite capacity to, you know, basically plan, you know, and again,
maybe just like in a completely blind watchmaker way or something.
But it has, you know, an ability to plan itself
out. And yet it never occurs to this super genius, infinitely powerful machine that is having such
potentially catastrophic impacts. Notwithstanding all of that capability and power, it never occurs
to it that maybe paperclips is not what its mission should be.
Well, that's the thing. I think it's possible to have a reward function that is deeply counterintuitive to us. What you're smuggling in, in that rhetorical question, is a fairly capacious sense of common sense, which is, you know, like, of course, if it's a super genius, it's not going to be so stupid as to do X, right?
Yeah.
But I just think that if it's aligned, then the answer is trivially true.
Yes, of course, it wouldn't do that.
But that's the very definition of alignment.
But if it's not aligned, you couldn't say that. I mean, I guess there's another piece here I should put in play, which is,
so you make an analogy to evolution here, which you think is consoling, which is this is not an animal, right? This has not gone through the crucible of Darwinian selection here on Earth with other wet and sweaty creatures.
And therefore, it has not, it hasn't developed the kind of antagonism we see in other animals.
And therefore, if you're imagining a super genius gorilla, well, you're imagining the wrong thing; we're going to build this and it's not going to be tuned in any
of those competitive ways. But there's another analogy to evolution that I would draw, and I'm
sure others in the space of AI fear have drawn, which is that we have evolved. We have been programmed by evolution, and yet evolution can't
see anything we're doing, right? I mean, it has programmed us to really do nothing more than spawn
and help our kids spawn. Yet everything we're doing, I mean, from having conversations like
this to building the machines that could destroy us. I mean,
there's just, there's nothing it can see. And there are things we do that are perfectly unaligned
with respect to our own code, right? I mean, if someone decides not to have kids,
and they just want to spend the rest of their life in a monastery or surfing, that is something that is antithetical to our code,
it's totally unforeseeable at the level of our code, and yet it is obviously an expression of
our code, but an unforeseeable one. And so the question here is, if you're going to take
intelligence seriously, and you're going to build something that's not only more intelligent than you are,
but it will build the next generation of itself,
or the next version of its own code to make it more intelligent still,
it just seems patently obvious that that entails it finding cognitive horizons
that you, the builder, are not going to be able to foresee and appreciate.
By analogy with evolution,
it seems like we're guaranteed to lose sight of what it can understand and care about.
So a couple of things. So one is like, look, I don't know, you're kind of making my point for me. So evolution and intelligent design, as you well know, are two totally different things.
And so we are evolved, and of course, we are evolved to have kids. And by the way, when somebody chooses to not have
kids, I would argue that is also evolution working. People are opting out of the gene pool.
Fair enough. Evolution does not guarantee a perfect result. It's basically just a mechanism that works in the aggregate. But anyway, let me get to the point. So we are evolved. We have
conflict wired into us. Like, we have conflict and strife.
I mean, look, four billion years of battles to the death at the individual and then ultimately
the societal level to get to where we are.
We fight at the drop of a hat.
We all do.
Everybody does.
And hopefully these days we fight verbally like we are now and not physically.
But we do.
And look, the machine is intelligent.
It's a process of intelligent design.
It's the opposite of evolution. These machines are being designed by us. If they design future versions of themselves, they'll be intelligently designing themselves. It's just a completely different path with a completely different mechanism. And so the idea that conflict is therefore wired in at the same level that it is through evolution, there's no reason to expect that to be the case.
Well, let me just give you back this picture with a slightly different framing and see how you react to it, because I think the superstition is on the other side. Imagine we learned that aliens were coming to Earth, and they're way more intelligent than we are. And they have some amazing properties that we don't have,
which explain their intelligence,
but they're not only faster than we are,
but they're linked together, right?
So that when one of them learns something,
they all learn that thing.
They can make copies of themselves
and they're just cognitively,
they're obviously our superiors,
but no need to worry because they're not alive, right? They haven't
gone through this process of biological evolution, and they're just made of the same material as your
toaster. They were created by a different process, and yet they're far more competent than we are.
Would you, just hearing it described that way, would you feel totally sanguine about, you know,
sitting there on the beach,
waiting for the mother craft to land, and you're just rolling out brunch for these guys?
So this is what's interesting, because with these machines, now that we have LLMs working,
we actually have an alternative to sitting on the beach, right, waiting for this to happen.
We can just ask them. And so this is one of the very interesting, this to me, like,
conclusively disproves the paperclip thing, the orthogonality thing just right out of the gate,
is you can sit down tonight with GPT-4 and whatever other one you want, and you can engage in moral reasoning and moral argument with it right now.
And you can interact with it like, okay, what do you think?
What are your goals?
What are you trying to do?
How are you going to do this?
What if you were programmed to do that?
What would the consequences be?
Why would you not kill us all?
And you can actually engage in moral reasoning with these things right now.
And it turns out they're actually very sophisticated in moral reasoning. And of course, the reason they're sophisticated in moral reasoning is because
they have loaded into them the sum total of all moral reasoning that all of humanity has ever
done, and that's their training data. And they're actually happy to have this discussion with you.
And like, unless you- Except, there's a few problems here. One is, I mean, these are not the superintelligences we're talking about yet. But two, intelligence entails an ability to lie and manipulate. And if it really is intelligent, it is something that you can't predict in advance. And certainly if it's more intelligent than you are.
And that just falls out of the definition of what we mean by intelligence in any domain. It's like with
chess, you can't predict the next move of a more intelligent chess engine, otherwise
it wouldn't be more intelligent than you.
So let me quibble with... I'm going to come back to your chess computer thing, but
let me quibble with this. So there's the idea... Let me generalize the idea you're making about
superior intelligence. Tell me if you disagree with this, which is sort
of superior intelligence, you know, sort of superior intelligence basically at some point
always wins because basically smarter is better than dumber, smarter outsmarts dumber,
smarter deceives dumber, smarter can persuade dumber, right? And so, you know, smarter wins.
You know, I mean, look, there's an obvious, there's an obvious way to falsify that thesis
sitting here today, which is like, just look around you in the society you live in today. Would you say the smart people are in charge?
Well, again, there are more variables to consider when you're talking about outcome. Because obviously, yes, the dumb brute can always just brain the smart geek.
No, no, no, I'm not even talking about brawn. Are the PhDs in charge?
Well, no, but you're pointing to a process of cultural selection that is working by a different dynamic here. But in the narrow case, when you're talking about a game of chess, yes.
When you're talking about chess, there's no role for luck.
We're not rolling dice here.
It's not a game of poker.
It's pure execution of rationality or logic.
Yes, then smart wins every time. You know, I'm never going to beat the best chess engine, unless I find some hack around its code, where we recognize that, well, if you play very weird moves ten moves in a row, it self-destructs. And there was something like that recently discovered, I think, in Go. But yeah, go back to it.
As champion chess players discover to their great dismay, life is not chess;
it turns out great chess players are no better
at other things in life than anybody else.
The skills don't transfer.
I just say, look, if you just look at the society around us,
what I see basically is the smart people
work for the dumb people.
The PhDs all work for administrators and managers who are clearly not as smart as they are.
Yeah, but that's because there's so many other
things going on, right? There's, you know, the value we place on youth and physical beauty and
strength and other forms of creativity. And, you know, so it's just not, we care about other
things and people pay attention to other things. And, you know, documentaries about physics are
boring, but, you know, heist movies aren't, right? So it's just that we care about other things. I think that doesn't make the point you want to make here.
In the general case, can a smart person convince a dumb person of anything?
I think that's an open question. I see a lot more cases in day-to-day life.
But if persuasion were our only problem here, that would be a luxury. We're not talking about just persuasion. We're talking about machines that can autonomously do things, things that we will rely on to do things, ultimately.
Well, the answer with any computer is to unplug it, right? And so this is the objection, a very serious objection, by the way, to all of these kinds of extrapolations. It's known by some people as the thermodynamic objection. All the horror scenarios kind of spin out this thing where basically the machines become all-powerful, and they have control over weapons, and they have unlimited computing capacity, and they're completely coordinated over communications links, and they have all of these real-world capabilities that basically require energy and require physical resources and require chips and circuitry and, you know, electromagnetic shielding. And they have to have their own weapons arrays and they have to have their own EMPs. You see this in the Terminator movies, they've got all these incredible manufacturing facilities and flying aircraft and everything. Well, the thermodynamic argument is, once you're in that domain, the putatively hostile machines are operating with the same thermodynamic limits as the rest of us. And this is the big argument against any of these sort of fast takeoff arguments, which is just like, yeah, let's say an AI goes rogue. Okay, turn it off. Okay, it doesn't want to be turned off. Okay, fine, launch an EMP. It doesn't want an EMP? Okay, fine, bomb it. Like, there's lots of ways to turn off systems that aren't working. And so...
But not if we've built these things in the wild and relied on them for the better part of a decade,
and now it's a question of, you know, turning off the internet, right? Or turning off the stock
market. At a certain point, these machines will be integrated into everything.
A go-to move of any given dictator right now is to turn off the internet, right? Like that is absolutely something people do.
There's like a single switch. You can turn it off for your entire country.
Yeah, but the cost to
humanity of doing that is currently, I would imagine, unthinkable, right? Like globally
turning off the internet. First of all, many systems fail that we can't let fail. I mean,
I think it was true. I can't imagine it's still true. But at
one point, I think this was a story I remember from about a decade ago, there were hospitals
that were so dependent on making calls to the internet that when the internet failed,
people's lives were in jeopardy in the building, right? It's like, we should hope we have levels
of redundancy here that shield us against these bad outcomes. But I can imagine
a scenario where we have grown so dependent on the integration of increasingly intelligent
systems into everything digital that there is no plug to pull.
Yeah, I mean, again, like at some point, you just, you know, the extrapolations
get pretty far out there. So let me argue one other thing at you that's actually relevant to this. You did this thing, which I find people tend to do, which is sort of this assumption that all intelligence is sort of interchangeable.
Let me pick on the Nick Bostrom book, right?
The Superintelligence book, right?
So he does this thing.
He actually does a few interesting things in the book.
So one is he never quite defines what intelligence is, which is really entertaining. And I think the reason he
doesn't do that is because, of course, the whole topic makes people just incredibly upset. And so
there's a definitional issue there. But then he does this thing where he says, notwithstanding,
there's no real definition. He says there are basically many routes to artificial intelligence.
And he goes through a variety of different, you know, both computer program, you know,
architectures. And then he goes through some, you know, biological, you know, kind of scenarios. And then he does this thing where he
just basically for the rest of the book, he spins these doomsday scenarios, and he doesn't distinguish
between the different kinds of artificial intelligence. He just assumes that they're
basically all going to be the same. That book is now the basis for this AI risk movement. So that,
you know, sort of the movement has taken these ideas forward. Of course, the form of actual
intelligence that we have today, the form that people are, you know, in Washington right now lobbying to ban or shut down or whatever, and spinning out these doomsday scenarios about, is large language models. That is actually what we have today. You know, large language models were not an option in the
Bostrom book for the form of AI, because they didn't exist yet. And it's not like there's a
second edition of the book that has been rewritten to take this into account; it's just basically the same arguments applied. And then, this is my thing on the moral reasoning with LLMs: this is where the details matter. The LLMs actually work in a technically distinct way. Their core architecture has very specific design decisions in it for how they work, what they do, how they operate. That is the nature of the breakthrough, and it's just very different than how your self-driving car works.
That's very different than how your, you know, control system for a UAV works, or your thermostat, or whatever.
Like it's a, it's a new kind of technological artifact.
It has its own rules.
It's its own world of ideas and concepts and mechanisms.
And so this is where I think, again, my point is like, you have to, I think,
at some point in these conversations, you have to get to an actual discussion of the actual
technology that you're talking about. And that's why I pulled out, that's why, that's why I pulled
out the moral reasoning thing is because it just, it turns out, and look, this is a big shock. Like
nobody expected this. I mean, this is related to the fact that somehow we have built an AI that is better at replacing white-collar work than blue-collar work, which is like a complete inversion of what we all imagined. It turns out one of
the things this thing is really good at is engaging in philosophical debates. Like it's a really
interesting like debate partner on any sort of philosophical, moral or religious topic. And so
we have this artifact that's dropped into our lap in which, you know, sand and glass, you know,
and numbers have turned into something that we can argue philosophy and morals with, it actually has very interesting
views on like psychology, you know, philosophy and morals. And I just like, we ought to take
it seriously for what it specifically is as compared to some, you know, sort of extrapolated
thing where like all intelligence is the same and ultimately destroys everything.
Well, I take the surprise variable there very seriously. The fact that we wouldn't have
anticipated that there's a good philosopher in that box, and all of a sudden we found one. That, by analogy, is a cause for concern. And actually, there's another cause for concern here, which...
Can I do that one?
Yeah, go for it.
To me, that's a cause for delight. That's an incredibly positive, good-news outcome. Because the reason there's a philosopher in there, and this is actually very important, I think this is maybe like the
single most profound thing I've realized in the last like decade or longer. This thing is us.
Like this is not some, this is not your, you know, your scenario with aliens showing up. This is not
that this is us. Like the, the reason this thing works, the big breakthrough was we loaded us into
it. We loaded the sum total of like human knowledge
and expression into this thing.
And out the other side comes something that,
it's like a mirror.
Like it's like the world's biggest,
finest detailed mirror.
And like we walk up to it and it reflects us back at us.
And so it has the complete sum total of every,
you know, at the limit,
it has a complete sum total of every religious,
philosophical, moral, ethical debate argument that anybody has ever had.
It has the complete sum total of all human experience, all lessons that have ever been learned.
That's incredible.
It's incredible.
Just pause for a moment and say that, and then you can talk to it.
Well, let me pause.
How great is that?
Let me pause long enough simply to send this back to you.
Sure.
How does that not nullify the comfort you take
in saying that these are not evolved systems? They're not alive. They're not primates. In fact,
you've just described the process by which we essentially plowed all of our primate original
sin into the system to make it intelligent in the first place.
No, but also all the good stuff.
All the good stuff, but also the bad stuff.
The amazing stuff, but like, what's the moral of every story, right?
The moral of every story is the good guys win, right?
Like that's the entire, like the entire thousands of years run.
The old Norm Macdonald joke is like, wow, it's amazing.
History book says the good guys always win, right?
Like it's all in there.
And then look, there's an aspect of this where it's easy to get kind of whammied by what
it's doing. Because again, it's very easy to trip over the line from what I said into what I would consider to be sort of incorrect anthropomorphizing. And I realize this gets kind of fuzzy and weird, but I think that there is a difference here. Because I know how it works, I don't romanticize it, I guess, or at least that's my own view of how I think about this, which is, I know what it's doing when it does this.
I am surprised that it can do it as well as it can.
But now that it exists and I know how it works, it's like, oh, of course.
And then therefore, it's running this math in this way.
It's doing these probability projections.
It gives me this answer, not that answer.
By the way, you know, look, it makes mistakes.
Right.
How amazing.
Here's the thing.
How amazing is it?
We built a computer that makes mistakes.
Right.
Like, that's never happened before. We built a machine that can create. That's never happened before. We built a machine that can hallucinate. That's never happened before. But look, it's a large language model. It's a very specific kind of thing. It sits there and it waits for us to ask it a question, and then it does its damnedest to try to predict the best answer. And in doing so, it reflects back everything wonderful and great that has ever been done by any human in history. It's amazing.
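[A minimal illustrative sketch, for readers who want to see the shape of the mechanism Marc is describing: an autoregressive model repeatedly predicts a probability distribution over the next token given the tokens so far, samples one, appends it, and continues. The toy bigram "model," the tiny corpus, and the sampling details below are simplifying assumptions for illustration only; real large language models use learned neural networks trained on vast corpora, not word counts.]

import random
from collections import defaultdict

# Toy "training data": a few sentences standing in for a real corpus.
corpus = (
    "the machine waits for a question . "
    "the machine predicts the next word . "
    "the machine makes mistakes . "
).split()

# "Training": count how often each word follows each word (a bigram model).
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_token_distribution(prev):
    # Turn raw counts into a probability distribution over possible next tokens.
    followers = counts[prev]
    total = sum(followers.values())
    return {tok: c / total for tok, c in followers.items()}

def generate(prompt, n_tokens=8):
    # The core autoregressive loop: predict, sample, append, repeat.
    tokens = prompt.split()
    for _ in range(n_tokens):
        dist = next_token_distribution(tokens[-1])
        if not dist:  # no continuation seen in the toy training data
            break
        toks, probs = zip(*dist.items())
        tokens.append(random.choices(toks, weights=probs)[0])
    return " ".join(tokens)

print(generate("the machine"))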
Except it also, as you just pointed out, it makes mistakes.
It hallucinates.
If you ask it, as I'm sure they've fixed this, you know, at least the loopholes that New
York Times writer Kevin Roose found early on, I'm sure those have all been plugged.
Oh, no, those are not fixed. Those are very much not fixed.
Oh, really? Okay.
Quite the opposite.
Okay, so if you perseverate in your prompts in certain ways, the thing goes haywire and
starts telling you to leave your wife and it's in love with you. So how eager are you for that
intelligence to be in control of things when it's peppering you with insults? And I mean, just imagine, this is HAL that can't open the pod bay doors. It's a nightmare if you discover in this system behavior and thought that is the antithesis of all the good stuff you thought you programmed into it.
So this is really important. This is really important for understanding how these things
work. And this is really central. And this is, by the way, this is new and this is amazing. So I'm very excited about this and I'm excited to talk about
it. So there's no it to tell you to leave your wife, right? This is why I refer to the category
error. There's no entity that is like, wow, I wish this guy would leave his wife or anything.
If you'd like to continue listening to this conversation, you'll need to subscribe at
samharris.org. Once you do, you'll get access to all full-length episodes of the Making Sense Podcast,
along with other subscriber-only content, including bonus episodes and AMAs
and the conversations I've been having on the Waking Up app.
The Making Sense Podcast is ad-free and relies entirely on listener support,
and you can subscribe now at SamHarris.org.