Making Sense with Sam Harris - Making Sense of Artificial Intelligence | Episode 1 of The Essential Sam Harris
Episode Date: November 22, 2022

Filmmaker Jay Shapiro has produced a new series of audio documentaries, exploring the major topics that Sam has focused on over the course of his career. Each episode weaves together original analysis, critical perspective, and novel thought experiments with some of the most compelling exchanges from the Making Sense archive. Whether you are new to a particular topic, or think you have your mind made up about it, we think you’ll find this series fascinating. In this episode, we explore the landscape of Artificial Intelligence. We’ll listen in on Sam’s conversation with decision theorist and artificial-intelligence researcher Eliezer Yudkowsky, as we consider the potential dangers of AI – including the control problem and the value-alignment problem – as well as the concepts of Artificial General Intelligence, Narrow Artificial Intelligence, and Artificial Super Intelligence. We’ll then be introduced to philosopher Nick Bostrom’s “Genies, Sovereigns, Oracles, and Tools,” as physicist Max Tegmark outlines just how careful we need to be as we travel down the AI path. Computer scientist Stuart Russell will then dig deeper into the value-alignment problem and explain its importance. We’ll hear from former Google CEO Eric Schmidt about the geopolitical realities of AI terrorism and weaponization. We’ll then touch on the topic of consciousness as Sam and psychologist Paul Bloom turn the conversation to the ethical and psychological complexities of living alongside humanlike AI. Psychologist Alison Gopnik then reframes the general concept of intelligence to help us wonder if the kinds of systems we’re building using “Deep Learning” are really marching us towards our super-intelligent overlords. Finally, physicist David Deutsch will argue that many value-alignment fears about AI are based on a fundamental misunderstanding about how knowledge actually grows in this universe.
Transcript
To access full episodes of the Making Sense podcast, you'll need to subscribe at samharris.org. There you'll find our private RSS feed
to add to your favorite podcatcher,
along with other subscriber-only content.
We don't run ads on the podcast,
and therefore it's made possible entirely
through the support of our subscribers.
So if you enjoy what we're doing here,
please consider becoming one.
I am here with Jay Shapiro.
Jay, thanks for joining me.
Thank you for having me.
So we have a fun project to talk about here.
And let's see if I can remember how this came about. I woke up in the middle of the night one night realizing that more or less my entire catalog of podcasts was, if not the entire thing, then conservatively speaking at least 50% of it, evergreen, which is to say that the content was basically as good today as the day I recorded it. But because of the nature of the medium, it would never be perceived as such, and people really don't tend to go back into the catalog and listen to a three-year-old podcast. And yet, there's something insufficient about just
recirculating them in my podcast feed or elsewhere. And so Jaron, my partner in crime here, and I were trying to think about how to
give all of this content new life. And then we thought of you just independently turning your
creative intelligence loose on the catalog. And now I will properly introduce you as someone who
should be doing that.
Perhaps you can introduce yourself.
Just tell us what you have done over these many years and the kinds of things you've focused on.
Yeah, well, I'm a filmmaker first and foremost.
But I think my story and my genesis of being maybe the right person to tap here is probably indicative or representative of
a decent portion of your audience. I'm just guessing. I'm 40 now, which pegs me in college
when 9-11 hit. It was, like, my second year. I guess it would have been early in the year, if it was September.
And, you know, I had never heard of you at all at that point. I was an atheist and just didn't think too much
about that kind of stuff. I was fully on board with any atheist things I saw coming across my
world. But then 9-11 hit, and I was on a very liberal college campus. And the kind of questions
that were popping up in my mind and I was asking myself were uncomfortable for me. I just didn't
know what to do with them.
I really had no formal philosophical training
and I kind of just buried them, you know,
under the weight of my own confusion or shame
or just whatever kind of brew a lot of us
were probably feeling at the time.
And then I discovered your work with The End of Faith, right when you were responding to the same thing. And a lot of your language resonated with me. You were philosophically trained and maybe sharper with your language, for better or worse, which we found out later was complicated. And I started
following along with your work and The Four Horsemen and Hitchens and Dawkins and that sort
of whole crowd. And I'm sure
I wasn't alone. And then I paid close, special attention to what you were doing, which I actually
included in one of the pieces that I ended up putting together in this series. But with a talk
you gave in Australia, you know, I don't have to tell you about your career, but again, I was
following along as you were on sort of this atheist circuit and I was interested. But whenever you would talk about sort of the
hard work of secularism and the hard work of atheism, in particular, I'm thinking of your
talk called Death in the Present Moment right after Christopher Hitchens had died. I'm actually
curious how quickly you threw that together because I know you were supposed to or you
were planning on speaking about free will and you ended up giving this whole other talk
and that one, and I'll save it because I definitely put that one in our compilation, but it struck me as, okay, this guy's up to something a little different, and the questions that he's asking are really different. I was just on board with that ride. So I became a fan and, like
probably many of your listeners started to really follow and
listen closely and became a student. And hopefully, like any good student, I started to disagree with my teacher a bit and slowly got the confidence to push back and have my own thoughts and maybe find the weaknesses and strengths of what you were up to. And, you know, your work exposed me and many,
many other people, I'm sure, to a lot of great thinkers. And maybe you don't love this, but sometimes we think the people who disagree with you, whom you introduce us to on this side of the microphone, are right. And that's a great credit to you as well, for just giving them the air. And maybe on some really nerdy, esoteric things,
I'm one of them at this point now, because to back up way to the beginning of the story,
I was at a university where I was well on my way to a film degree, which is what I ended
up getting.
But when 9-11 hit, I started taking a lot more courses in a track that they had, which I think was fairly unique at the time, maybe still one of the only programs where you can actually major in Holocaust studies, which sort of sits in between the history and philosophy departments. And I started taking a bunch of courses in there. And that's where I was first exposed to sort of formal philosophical language and education. And
that was so useful for me. So I was just on board. And now
hopefully, you know, I swim deep in those waters and know my way around the lingo. And it's super
helpful. But yeah, it was almost, you know, Godwin's law of bringing up the Nazis. Those were the first times, actually, in courses called, like, Resistance During the Holocaust and things like that, where, you know, I first was exposed to
the words like deontology and consequentialism and utilitarianism and a lot of moral ethics stuff.
And then I went further on my own into sort of the theory of mind and this kind of stuff. But
yeah, I consider myself, in this weird new digital landscape that we're in, a bit of a student of the school of Sam Harris. But then again, like hopefully any good student, I've branched off and have my own sort of thoughts and framings. And so in these pieces, in this series that we're calling The Essential Sam Harris, I can't help but sort of put my writing and my framework on it. But with the people and the challenges that you've encountered and continue to encounter, whether they're right or wrong or making drastic mistakes, I want to give everything a really fair hearing. So there are times, I'm sure, where the listener will hear my own hand of opinion coming
in there, and I'm sure you know the areas as well. But most times I'm just trying to give
an open door to the mystery and why these subjects interest you in the first place, if that makes sense.
Yeah, yeah. And I should remind both of us that we met because you were directing a film focused on Maajid Nawaz and me around our book, Islam and the Future of Tolerance. And also, we've brought into this project another person who I think you met independently, if I remember correctly: Megan Phelps-Roper, who's been a guest on the
podcast, and someone who I have long admired. And she's doing the voiceover work in this series,
and she happens to have a great voice. So I'm very happy to be working with her.
Yeah, I did meet her independently. Your archive, I think you said three or four years old, but your archive is over 10 years old now. And I was diving into the earliest days of it. And there are some
fascinating conversations that age really interestingly. And I'm curious. I mean,
I think this project, again, it's for fans, it's for listeners, but it's for people who might hate you also, or critics of you, or people who are sure you were missing something or wrong about something, or even yourself, to go back and listen to certain conversations.
One conversation is from seven or eight years ago now. And the part that I really resurfaced, which is actually in the morality episode, is full of details and politics and moral philosophies
regarding things like intervention in the Middle East. And at the time of your recording, of course,
we had no idea how Afghanistan might look a decade from then.
But now we kind of do.
And it's not... If people listen to these carefully, it's not about, oh, this side of the conversation turned out to be right and this part turned out to be wrong.
But certain things hit our ears a little differently.
Even on this first topic of artificial intelligence,
I mean, I think that conversation continues to evolve in a way where the issues that you bring up are
evergreen, but hopefully evolving as well, just as far as their application goes. So yeah, so I
think you, I would love to hear your thoughts, listening back to some of those. And in fact,
to reference the film we made together,
a lot of that film was you doing that actively and live, given a specific topic, looking back and reassessing language and how it might, you know, land politically in that project. So yeah, but this series goes into really different territory, including an episode about social media, which
changes every day. Yeah, changes by the hour.
Yeah. And the conversation you have with Jack Dorsey is now fascinating for all kinds of
different reasons that at the time couldn't have been. So yeah, it's evergreen, but there's also just, like, new life in all of them, I think.
Yeah. Yeah. Well, I look forward to hearing it. Just to be clear, this has been very much your project. I mean, I haven't heard
most of this material since the time I recorded it and released it. And you've gone back and
created episodes on a theme where you've pulled together five or six conversations and intercut
material from five or six different episodes and then added
your own interstitial pieces, which you have written and Megan Phelps-Roper is reading. So
it's just, these are very much their own documents. And as you say, you don't agree with me about everything, and occasionally you're shading different points from your own point of view. And so, yeah,
I look forward to hearing it and we'll be dropping the whole series here in the podcast feed.
If you're in the public feed, as always, you'll be getting partial episodes. And if you're in the
subscriber feed, you'll be getting full episodes. And the first will be on artificial intelligence. And
then there are many other topics, consciousness, violence, belief, free will, morality, death,
and others beyond that. Yeah. There's one on existential threat and nuclear war that I'm
still piecing together, but that one's pretty harrowing. One of your areas of interest. Yeah, yeah. Great.
Well, thanks for the collaboration, Jay.
Again, I'm a consumer of this,
probably more than a collaborator at this point
because I have only heard part of what you've done here.
So I'll be eager to listen as well.
But thank you for the work that you've done.
No, thank you.
And I'll just say,
you're gracious to allow someone to do this who does have some disagreements with you. You know, again, most of my disagreements with you are pretty deep and nerdy, sort of esoteric philosophy stuff, but it's incredibly gracious that you've given me the opportunity to do it.
And then hopefully again, I'm a bit of a representative for people who have been in the passenger seat of your public project of thinking out loud for
over a decade now. And if I can, you know, be a voice for that part of the crowd, it's an honor to do it. And they're a lot of fun too, a ton of fun. There are a lot of audio thought experiments, you know, that we play with and hopefully bring to life in your ears a little bit, including in this very first one on
artificial intelligence. So yeah, I hope people enjoy it. I do as well. So now we bring you
Megan Phelps-Roper on the topic of artificial intelligence.
Welcome to The Essential Sam Harris. This is Making Sense of Artificial Intelligence.
The goal of this series is to organize, compile, and juxtapose conversations hosted by Sam Harris into specific areas of interest.
This is an ongoing effort to construct a coherent overview of Sam's perspectives and arguments,
the various explorations and approaches to the topic,
the relevant agreements and disagreements, and the pushbacks and evolving thoughts which his
guests have advanced. The purpose of these compilations is not to provide a complete
picture of any issue, but to entice you to go deeper into these subjects. Along the way,
we'll point you to the full episodes with
each featured guest. And at the conclusion, we'll offer some reading, listening, and watching
suggestions, which range from fun and light to densely academic. One note to keep in mind for
this series. Sam has long argued for a unity of knowledge where the barriers between fields of
study are viewed as largely unhelpful artifacts of unnecessarily partitioned thought.
The pursuit of wisdom and reason in one area of study naturally bleeds into, and greatly affects, others.
You'll hear plenty of crossover into other topics as these dives into the archives unfold.
And your thinking about a particular topic may shift
as you realize its contingent relationships with others.
In this topic, you'll hear the natural overlap
with theories of identity and the self,
consciousness, and free will.
So, get ready.
Let's make sense of artificial intelligence.
Artificial intelligence is an area of resurgent interest in the general public. Its seemingly imminent arrival first garnered wide attention in the late 60s,
with thinkers like Marvin Minsky and Isaac Asimov writing provocative and thoughtful books
about the burgeoning technology and concomitant philosophical and ethical quandaries. Science fiction novels, comic books, and TV shows were
flooded with stories of killer robots and encounters with super-intelligent artificial
lifeforms hiding out on nearby planets, which we thought we would soon be visiting on the backs of
our new rocket ships. Over the following decades, the excitement and fervor looked to have faded from view in the public imagination. But in recent years, that interest has made
an aggressive comeback. Perhaps this is because the fruits of the AI revolution and the devices
and programs once only imagined in those science fiction stories have started to rapidly show up
in impressive and sometimes disturbing ways all around us.
Our smartphones, cars, doorbells, watches, games, thermostats, vacuum cleaners, light bulbs, and glasses now have embedded algorithms running on increasingly powerful hardware,
which navigate, dictate, or influence not just our locomotion, but our entertainment choices, our banking, our politics, our dating lives, and just about everything else. It seems every other TV show or movie that appears on a streaming service is birthed out of a collective interest, fear,
or otherwise general fascination with the ethical, societal, and philosophical implications of
artificial intelligence. There are two major ways to think about the threat of what is generally called AI.
One is to think about how it will disrupt our psychological states or fracture our information
landscape, and the other is to ponder how the very nature of the technical details of its development
may threaten our existence. This compilation is mostly focused
on the latter concern, because Sam is certainly amongst those who are quite worried about the
existential threat of the technical development and arrival of AI.
Now, before we jump into the clips, there are a few concepts that you'll need to onboard to
find your footing. You'll hear the terms Artificial General Intelligence, or AGI, and Artificial Superintelligence, or ASI, used in
these conversations. Both of these terms refer to an entity which has a kind of intelligence
that can solve a nearly infinitely wide range of problems. We humans have brains which display this
kind of adaptable intelligence.
We can climb a ladder by controlling our legs and arms in order to retrieve a specific object
from a high shelf with our hands. And we use the same brain to do something very different,
like recognize emotions in the tone of a voice of a romantic partner.
I look forward to infinity with you.
That same brain can play a game of checkers against a young child,
who we might also be coyly trying to let win,
or play a serious game of competitive chess against a skilled adult.
That same brain can also simply lift a coffee mug to our lips,
not just to ingest nutrients and savor the taste of the beans,
but also to send a subtle social signal to a friend at the table
to let them know that their story is dragging on a bit.
All of that kind of intelligence is embodied and contained in the same system, namely our brains.
AGI refers to a human level of intelligence, which doesn't surpass what our brightest humans can accomplish on any given task,
while ASI references an intelligence which performs at,
well, superhuman levels. This description of flexible intelligence is different from a system
which is programmed or trained to do one particular thing incredibly well, like arithmetic,
or painting straight lines on the sides of a car, or playing computer chess, or guessing large prime numbers, or displaying
music options to a listener based on the observable lifestyle habits of like-minded users in a certain
demographic. That kind of system has an intelligence that is sometimes referred to as narrow or weak AI.
But even that kind of thing can be quite worrisome from the standpoint of weaponization or preference manipulation.
You'll hear Sam voice his concerns throughout these conversations,
and he'll consistently point to our underestimation of the challenge that even narrow AI poses.
So, there are dangers and serious questions to consider no matter which way we go with the AI topic.
But as you'll also hear in this compilation,
not everyone is as concerned about the technical existential threat of AI as Sam is. Much of the divergence in levels of concern stems from initial differences on the fundamental conceptual approach
towards the nature of intelligence. Defining intelligence is notoriously slippery and controversial,
but you're about to hear one of Sam's guests offer a conception which distills intelligence
to a type of observable competence at actualizing desired tasks, or an ability to manifest preferred
future states through intentional current action and intervention. You can imagine a linear gradient
indicating more or less of this competence as you move along it.
This view places our human intelligence on a continuum
along with bacteria, ants, chickens, honeybees, chimpanzees,
all of the potential undiscovered alien life forms,
and of course, artificial intelligence,
which perches itself far above
our lowly human competence. This presents some rather alarming questions. Stephen Hawking once
issued a famous warning that perhaps we shouldn't be actively seeking out intelligent alien
civilizations, since we'd likely discover a culture which is far more technologically advanced than
ours. And if our planet's history provides any lesson, it's that when technologically mismatched cultures come into contact, it usually doesn't work out too well for the less developed one.
Are we bringing that precise suicidal encounter into reality
as we set out to develop artificial intelligence?
That question alludes to what is known as the value alignment problem.
But before we get to that challenge, let's go to our first clip,
which starts to lay out the important definitional foundations
and distinction of terms in the landscape of AI.
The thinker you're about to meet is the decision theorist and computer scientist Eliezer Yudkowsky.
Yudkowsky begins here by defending this linear gradient perspective on intelligence
and offers an analogy to consider how we might be mistaken about intelligence
in a similar way to how we once were mistaken about the nature of fire.
It's clear that Sam is aligned with and attracted to Eliezer's run at this question,
and consequently, both men end up sharing a good deal of unease about
the implications that all of this has for our future. This is from episode 116, which
is entitled AI: Racing Toward the Brink.
Let's just start with the basic picture and define some terms. I suppose we should define intelligence first and then jump into
the differences between strong and weak or general versus narrow AI. Do you want to start us off on
that? Sure. Preamble disclaimer, though, the field in general, like not everyone you would ask would
give you the same definition of intelligence. And a lot of times in cases like those, it's good to sort of go back to observational
basics. We know that in a certain way, human beings seem a lot more competent than chimpanzees,
which seems to be a similar dimension to the one where chimpanzees are more competent than mice,
or that mice are more competent than spiders.
And people have tried various theories about what this dimension is. They've tried various
definitions of it. But if you went back a few centuries and asked somebody to define fire,
the less wise ones would say, ah, fire is the release of phlogiston. Fire is one of the four
elements. And the truly
wise ones would say, well, fire is the sort of orangey bright hot stuff that comes out of wood
and spreads along wood. And they would tell you what it looked like and put that prior to their
theories of what it was. So what this mysterious thing looks like is that humans can build space
shuttles and go to the moon, and mice can't. And we think it has
something to do with our brains. Yeah. Yeah. I think we can make it more abstract than that.
Tell me if you think this is not generic enough to be accepted by most people in the field. It's
whatever intelligence may be in specific contexts. So generally speaking, it's the ability to meet goals,
perhaps across a diverse range of environments. And we might want to add that it's at least
implicit in intelligence that interests us. It means an ability to do this flexibly rather than
by rote following the same strategy again and again blindly.
Does that seem like a reasonable starting point?
I think that that would get fairly widespread agreement and it matches up well with some
of the things that are in AI textbooks.
If I'm allowed to sort of take it a bit further and begin injecting my own viewpoint into
it, I would refine it and say that by achieve goals, we mean something like squeezing
the measure of possible futures higher in your preference ordering. If we took all the possible
outcomes and we rank them from the ones you like least to the ones you like most, then as you
achieve your goals, you're sort of like squeezing the outcomes higher in your preference ordering.
You're narrowing down what the outcome would be to be something more like what you want,
even though you might not be able to narrow it down very exactly. Flexibility, generality.
There's a... like, humans are much more domain-general than mice. Bees build hives. Beavers build dams. A human will look over both of them and envision a honeycomb-structured dam. We are able to operate even on the moon,
which is very unlike the environment where we evolved. In fact, our only competitor in terms of
general optimization, where optimization is that sort of narrowing of the future that I talked about, our competitor in terms of general
optimization is natural selection. Natural selection built beavers, it built bees, it
sort of implicitly built the spider's web in the course of building spiders. And we as humans have this similar very broad range to handle this huge variety of problems.
And the key to that is our ability to learn things that natural selection did not pre-program us with.
So learning is the key to generality.
I expect that not many people in AI would disagree with that part either.
Right. So it seems that goal-directed behavior is implicit in this, or even explicit in this definition of intelligence. And so whatever intelligence is, it is inseparable from
the kinds of behavior in the world that results in the fulfillment of goals. So we're talking about agents that can do things.
And once you see that, then it becomes pretty clear that if we build systems that harbor primary
goals, you know, there are cartoon examples here like, you know, making paperclips. These are not
systems that will spontaneously decide that they could be doing more enlightened things than, say, making paperclips. There are no goals that will arrive in these systems apart from the ones we put in there. And we have common
sense intuitions that make it very difficult for us to think about how strange an artificial
intelligence could be, even one that becomes more and more competent to meet its goals.
Let's talk about the frontiers of strangeness in AI as we move from, again, I think we have a couple more definitions we should probably put in play here, differentiating strong and weak or general and narrow intelligence.
Well, to differentiate general and narrow, I would say that, well, I mean, this is like, on the one hand, theoretically a spectrum.
Now, on the other hand, there seems to have been like a very sharp jump in generality between chimpanzees and humans.
So breadth of domain driven by breadth of learning.
Like DeepMind, for example, recently built AlphaGo, and I lost some money betting that
AlphaGo would not defeat the human champion,
which it promptly did. And then a successor to that was AlphaZero. And AlphaGo was specialized
on Go. It could learn to play Go better than its starting point for playing Go, but it couldn't
learn to do anything else. And then they simplified the
architecture for AlphaGo. They figured out ways to do all the things it was doing in more and
more general ways. They discarded the opening book, like all the sort of human experience of
Go that was built into it. They were able to discard all of the sort of like programmatic
special features that detected features of the Go board. They figured out how to do that in simpler ways. And because they figured out how to do it in simpler ways,
they were able to generalize to AlphaZero, which learned how to play chess using the same
architecture. They took a single AI and got it to learn Go and then reran it and made it learn chess. Now that's not human general,
but it's like a step forward in generality of the sort that we're talking about.
Am I right in thinking that that's a pretty enormous breakthrough?
I mean, there's two things here.
There's the step to that degree of generality,
but there's also the fact that they built a Go engine.
I forget if it was a Go or a chess or both,
which basically surpassed all of the specialized AIs
on those games over the course of a day, right?
Isn't the chess engine of AlphaZero
better than any dedicated chess computer ever?
And didn't it achieve that just with astonishing speed?
Well, there was actually some amount of debate afterwards
whether or not the version of the chess engine
that it was tested against was truly optimal.
But even to the extent that it was in that narrow range
of the best existing chess engine,
as Max Tegmark put it,
the real story wasn't in how AlphaGo beat human Go players,
it's how AlphaZero beat human Go system programmers
and human chess system programmers.
People had put years and years of effort
into accreting all of the special purpose code that would play chess well and efficiently.
And then AlphaZero blew up to and possibly passed that point in a day.
And if it hasn't already gone past it, well, it would be past it by now if DeepMind kept working on it.
Although they've now basically declared victory and
shut down that project as I understand it.
Okay, so talk about the distinction between general and narrow intelligence a little bit
more.
So we have this feature of our minds, most conspicuously, where we're general problem
solvers. We can learn new things, and our learning in one area doesn't require a
fundamental rewriting of our code. Our knowledge in one area isn't so brittle as to be degraded by
our acquiring knowledge in some new area, or at least this is not a general problem which
erodes our understanding again and again. And we don't yet have computers
that can do this, but we're seeing the signs of moving in that direction. And so then it's often
imagined that there's a kind of near-term goal, which has always struck me as a mirage of so-called
human-level general AI. I don't see how that phrase will ever mean much of anything,
given that all of the narrow AI we've built thus far is superhuman within the domain of
its applications. The calculator in my phone is superhuman for arithmetic. Any general AI that also has my phone's ability to calculate will be superhuman
for arithmetic, but we must presume it'll be superhuman for all of the dozens or hundreds of
specific human talents we've put into it, whether it's facial recognition or just obviously memory
will be superhuman unless we decide to consciously degrade it.
Access to the world's data will be superhuman unless we isolate it from data. Do you see
this notion of human level AI as a landmark on the timeline of our development or is it just
never going to be reached? I think that a lot of people in the field would agree that human level AI defined as literally at the human level, neither above nor below, across a wide range of competencies is a straw target, an impossible mirage. or rather that like if we're put into a sort of like real world, lots of things going on,
context that places demands on generality, then AIs are not really in the game yet.
Humans are like clearly way ahead. And more controversially, I would say that we can
imagine a state where the AI is clearly way ahead, where it is across sort of every kind of cognitive competency,
barring some very narrow ones that aren't deeply influential of the others. Maybe chimpanzees
are better at using a stick to draw ants from an ant hive and eat them than humans are,
though no humans have really practiced that to world championship level exactly.
But there's this sort of general factor of how good are you at it when reality throws you a complicated problem. At this, chimpanzees are clearly not better than humans. Humans are
clearly better than chimps, even if you can manage to narrow down one thing the chimp is better at.
The thing the chimp is better at doesn't play a big role in our global economy. It's not an input
that feeds into lots of other things. So we can clearly imagine, I would say, like there are some people
who say this is not possible. I think they're wrong, but it seems to me that it is perfectly
coherent to imagine an AI that is like better at everything or almost everything than we are.
And such that if it was, like, building an economy with lots of inputs, humans would have around the same level of input into that economy as the chimpanzees have into ours.
Yeah, yeah. So what you're gesturing at here is a continuum of intelligence that I think most people never think about.
And because they don't think about it, they have a default doubt that it exists.
I think, and this is a point I know you've made in your writing, and I'm sure it's a point that Nick Bostrom made somewhere in his book, Superintelligence, it's this idea that
there's a huge blank space on the map past the most well-advertised exemplars of human brilliance, where we don't imagine what it
would be like to be five times smarter than the smartest person we could name. And we don't even
know what that would consist in, right? Because if chimps could be given to wonder what it would
be like to be five times smarter than the smartest chimp, they're not going to represent for themselves all of the
things that we're doing that they can't even dimly conceive. There's a kind of disjunction
that comes with more. There's a phrase used in military contexts. I don't know who the quote is actually from; it's variously attributed to Stalin and Napoleon and I think Clausewitz, like half a dozen people who have claimed this quote. The quote is, sometimes quantity has a quality all its own.
As you ramp up in intelligence, whatever it is at the level of information processing,
spaces of inquiry and ideation and experience begin to open up, and we can't necessarily predict
what they would be from where we sit. How do you think about this continuum
of intelligence beyond what we currently know in light of what we're talking about?
Well, the unknowable is a concept you have to be very careful with, because the thing you can't
figure out in the first 30 seconds of thinking about it, sometimes you can figure it out if you think for another five minutes. So in particular,
I think that there's a certain narrow kind of unpredictability, which does seem to be plausibly
in some sense essential, which is that for AlphaGo to play better Go than the best human Go players, it must be the case that the best human Go players cannot
predict exactly where on the Go board AlphaGo will play. If they could predict exactly where
AlphaGo would play, AlphaGo would be no smarter than them. On the other hand, AlphaGo's programmers
and the people who knew what AlphaGo's programmers were trying to do, or even just the people who watched AlphaGo play, could say, well, I think this system is going to play such that it
will win at the end of the game, even if they couldn't predict exactly where it would move on
the board. So similarly, there's a sort of like not short or like not necessarily slam dunk or not like immediately obvious chain of reasoning, which says that it is okay for us to reason about aligned or even unaligned artificial general intelligences of sufficient power as if they're trying to do something, but we don't
necessarily know what, but from our perspective that still has consequences, even though we can't
predict in advance exactly how they're going to do it. Yudkowsky lays out a basic picture of
intelligence that, once accepted, takes us into the details and edges us towards the cliff.
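To make that picture concrete, here is a minimal illustrative sketch, not taken from the episode; the outcome names, probabilities, and the thermostat-like agent are all invented. It treats "achieving goals" the way Yudkowsky describes it: as shifting probability toward the futures an agent ranks more highly.

```python
# Toy model of "squeezing possible futures higher in a preference ordering."
# Everything here (outcomes, probabilities, the agent) is made up for illustration.

outcomes = ["house burns down", "house is cold", "house is comfortable", "house is perfect"]
preference_rank = {o: i for i, o in enumerate(outcomes)}  # higher index = more preferred

# Probability of each outcome with no agent acting vs. with a thermostat-like agent acting.
p_no_agent   = {"house burns down": 0.05, "house is cold": 0.55,
                "house is comfortable": 0.30, "house is perfect": 0.10}
p_with_agent = {"house burns down": 0.01, "house is cold": 0.04,
                "house is comfortable": 0.60, "house is perfect": 0.35}

def expected_rank(dist):
    """Average preference rank of the future under a given probability distribution."""
    return sum(p * preference_rank[o] for o, p in dist.items())

print(expected_rank(p_no_agent))    # ~1.45
print(expected_rank(p_with_agent))  # ~2.29, the agent narrows the future toward preferred outcomes
```

On this toy measure, a more optimizing agent is simply one that moves more probability mass into the futures it prefers, and the same yardstick applies whether the preferences in question are ours or a machine's.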
And now we're going to introduce someone who tosses us fully into the canyon.
Yudkowsky just brought in the concept we mentioned earlier of value alignment in artificial
intelligence. There's a related problem called the control or containment problem. Both are concerned
with the issue of just how we
would go about building something that is unfathomably smarter and more competent than us,
that we could either contain in some way to ensure it wouldn't trample us, and as you'll soon hear,
that really would take no malicious intent on its part or even our part, or that its goals would be
aligned with ours in such a way that it would be making our lives genuinely better.
It turns out that both of those problems are incredibly difficult to think about, let alone
solve.
The control problem entails trying to contain something which, by definition, can outsmart
us in ways that we literally can't imagine.
Just think of trying to keep a prisoner locked in a jail cell who had the ability
to know exactly which specific bribes or threats would compel every guard in the place to unlock
the door, even if those guards aren't aware of their own vulnerabilities. Or perhaps even more
basically, the prisoner simply discovers features in the laws of physics that we have not yet
understood, and that somehow enable him to walk through the thick walls
which we were sure would stop him. And the other problem, that of value alignment, involves not
only discovering what we truly want, but figuring out a way to express it precisely and mathematically
so as to not cause any unintentional and civilization-threatening destruction.
It turns out that this is incredibly hard to do as well. This particular problem nearly flips the super-intelligent threat on its head
to something more like a super-dumb or let's say super-literal machine, which doesn't understand
all the unspoken considerations that we humans have when we ask someone to do something for us.
This is what Sam was alluding to in the first conversation when he referenced a paperclip universe. The concern is that a simple command to a super-intelligent machine, such as
make paperclips as fast as possible, could result in the machine taking the
as-fast-as-possible part of that command so literally that it attempts to
maximize its speed and performance by using raw materials, even the carbon in our bodies,
to build hard drives in order to run billions of simulations to figure out the best method
for making paperclips. Clearly, that misunderstanding would be rather unfortunate.
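To see why the as-fast-as-possible clause does the damage, here is a hedged toy sketch; the plan names, numbers, and penalty term are invented for illustration, and real systems are nothing this simple. It shows how a literal objective and the intended objective can rank the same two plans in opposite orders.

```python
# Toy sketch of the "super-literal" objective problem described above.
# The plans, numbers, and the harm penalty are invented for illustration only.

plans = [
    {"name": "run the existing factory",  "clips_per_hour": 1_000, "harm": 0},
    {"name": "convert all nearby matter", "clips_per_hour": 10**9, "harm": 10**12},
]

def literal_objective(plan):
    # "Make paperclips as fast as possible" -- and nothing else.
    return plan["clips_per_hour"]

def intended_objective(plan, harm_weight=1.0):
    # What we actually meant: paperclips, minus everything else we care about.
    return plan["clips_per_hour"] - harm_weight * plan["harm"]

print(max(plans, key=literal_objective)["name"])   # convert all nearby matter
print(max(plans, key=intended_objective)["name"])  # run the existing factory
```

The value-alignment problem is that the second objective, the one with everything else we care about written into it, is the one nobody yet knows how to specify completely.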
And neither of these questions of value alignment or containment deal with the
potentially more mundane terrorism threat, the threat of a bad actor who would purposefully
unleash the AI to inflict massive harm. But let's save that cheery picture for later.
Now, let's continue our journey down the AI path with professor of physics and author Max Tegmark,
who dedicates much of his brilliant
mind towards these questions.
Tegmark starts by taking us back to our prison analogy, but this time he places us in the
cell and imagines the equivalent of a world of helpless and hapless five-year-olds making
a real mess of things outside of the prison walls.
But we'll start first with Sam laying out his conception of these relevant AI safety questions.
This comes from episode 94, The Frontiers of Intelligence.
Well, let's talk about this breakout risk, because this is really the first concern of everybody who's been thinking about
what has been called the alignment problem or the control problem.
How do we create an AI that is superhuman in its abilities and do that in a context where it is still safe?
I mean, once we cross into the end zone and are still trying to assess
whether the system we have built is perfectly aligned with our values,
how do we keep it from destroying us
if it isn't perfectly aligned? And the solution to that problem is to keep it locked in a box.
But that's a harder project than it first appears. And you have many smart people assuming
that it's a trivially easy project. I've got people like Neil deGrasse Tyson on my podcast saying that he's
just going to unplug any superhuman AI if it starts misbehaving, or shoot it with a rifle.
Now, he's a little tongue-in-cheek there, but he clearly has a picture of the development process
here that makes the containment of an AI a very easy problem to solve. And even if that's true at the beginning of the process,
it's by no means obvious that it remains easy in perpetuity. I mean, you have people interacting
with the AI that gets built. And at one point, you described several scenarios of breakout. And you point out that even if the AI's intentions are perfectly
benign, if in fact it is value aligned with us, it may still want to break out because, I mean,
just imagine how you would feel if you had nothing but the interests of humanity at heart, but you
were in a situation where every other grown-up on Earth died, and now you're
basically imprisoned by a population of five-year-olds who you're trying to guide from
your jail cell to make a better world. And I'll let you describe it, but take me to the prison
planet run by five-year-olds. Yeah, so when you're in that situation, obviously, it's extremely frustrating for you, even if you have only the best intentions for the five-year-olds.
You know, you want to teach them how to plant food, but they won't let you outside to show them.
So you have to try to explain, but you can't write down to-do lists for them either, because then first you have to teach them to read, which takes a very, very long time. You also can't show them how to use
any power tools because they're afraid to give them to you because they don't understand these
tools well enough to be convinced that you can't use them to break out. You would have an incentive,
even if your goal is just to help the five-year-olds to first break out and then help
them. Now, before we talk more about breakout, though,
I think it's worth taking a quick step back
because you talked multiple times now about superhuman intelligence.
And I think it's very important to be clear that intelligence
is not just something that goes on a one-dimensional scale like an IQ.
And if your IQ is above a certain number, you're superhuman.
It's very important to distinguish between narrow intelligence and broad intelligence. Intelligence is a word that
different people use to mean a whole lot of different things, and they argue about it.
In the book, I just take this very broad definition, that intelligence is how good you are at accomplishing complex goals, which means your
intelligence is a spectrum. How good are you at this? How good are you at that? And it's just like
in sports, it would make no sense to say that there's a single number, your athletic coefficient, AQ, which determines how good you're going to be at winning Olympic medals, such that the athlete who has the highest AQ is going to win all the medals.
So today what we have is a lot of devices that actually have superhuman intelligence at very narrow tasks.
We've had calculators that can multiply numbers better than us for a very long time.
We have machines that can play Go better than us and drive better than us, but they still
can't beat us at tic-tac-toe unless
they're programmed for that.
Whereas we humans have this very broad intelligence.
So when I talk about superhuman intelligence with you now, that's really shorthand for
what we in Geek Speak call superhuman artificial general intelligence, broad intelligence across
the board so that they can do all intellectual tasks better than us.
So with that, let me just come back to your question about the breakout.
There are two schools of thought for how one should create a beneficial future if we have superintelligence.
One is to lock them up and keep them confined, like you mentioned.
But there's also a school of thought that says that that's immoral if these machines
can also have a subjective experience and
they shouldn't be treated like slaves.
And that a better approach is instead to let them be free, but just make sure that their
values or goals are aligned with ours.
After all, grown-up parents are more intelligent than their one-year-old kids, but that's fine
for the kids because the parents have goals that are aligned with what's best for the kids, right? But if you do go the confinement
route, after all, this enslaved God scenario, as I call it, yes, it is extremely difficult,
as that five-year-old example illustrates. First of all, almost whatever open-ended goal you give
your machine,
it's probably going to have an incentive to try to break out in one way or the other.
And when people simply say, oh, I'll unplug it. You know, if you're chased by a heat-seeking missile, you probably wouldn't say, I'm not worried, I'll just unplug it. We have to let
go of this old-fashioned idea that intelligence is just something that sits
in your laptop, right? Good luck unplugging the internet. And even if you initially, like in the first scenario in my book, have physical confinement where you have a machine in a room,
you're going to want to communicate with it somehow, right? So that you can get
useful information from it to get rich or take power
or whatever you want to do. And you're going to need to put some information into it about the
world. So it can do smart things for you, which already shows how tricky this is. I'm absolutely
not saying it's impossible. But I think it's fair to say that it's not at all clear that
it's easy either. The other one, of getting the goals aligned, is also extremely difficult. First of all, you need to get the machine to be able to understand
your goals. So if you have a future self-driving car and you tell it to take you to the airport
as fast as possible, and then you get there covered in vomit, chased by police helicopters,
and you're like, this is not what I asked for.
And it replies, that is exactly what you asked for.
Then you realize how hard it is to get that machine to learn your goals, right?
If you tell an Uber driver to take you to the airport as fast as possible, she's going to know that you actually had additional goals that you didn't explicitly need to say.
Because she's a human too, and she understands where you're coming from. But for someone made out of silicon, you have to actually explicitly have it learn all of those other things that we humans care about. So that's hard. And then, once it can understand your goals, that doesn't mean it's going to adopt your goals. I mean, everybody who has kids knows
that. And finally, if you get the machine to adopt your goals, then how can you ensure that it's
going to retain those goals as it gradually gets smarter and smarter through self-improvement?
Most of us grownups have pretty different goals from what we had when we were five.
I'm a lot less excited about Legos now, for example.
And we don't want a super intelligent AI to just think about this goal of being nice to humans as some little passing fad from its early youth. It seems to me that the second scenario of value alignment does imply the first of keeping the AI successfully boxed, at least for a time, because you have to be sure it's value aligned
before you let it out in the world, before you let it out on the internet, for instance,
or create robots that have superhuman intelligence that are functioning autonomously out in the
world. Do you see a development path where we don't actually have to solve the boxing problem,
at least initially? No, I think you're completely right. Even if your intent is to build a value-aligned AI and let it out, you clearly are going to need to have it boxed up during the development
phase when you're just messing around with it.
Just like any bio lab that deals with dangerous pathogens is very carefully sealed off.
And this highlights the incredibly pathetic state of computer security today.
I mean, and I think pretty much everybody who listens to this has at some point experienced
the blue screen of death, courtesy of Microsoft Windows, or the spinning wheel of doom, courtesy of Apple.
And we need to get away from that to have truly robust machines, if we're ever going to be able to have AI systems that we can trust, that are provably secure.
And I feel it's actually quite embarrassing that we're so flippant
about this. It's maybe annoying if your computer crashes and you lose one hour of work that you
hadn't saved. But it's not as funny anymore if it's your self-driving car that crashed or the
control system for your nuclear power plant or your nuclear weapon system or something like that.
And when we start talking about human-level AI and boxing systems, you have to have this much higher level of safety mentality where you've really made this a priority the way
we aren't doing today.
Yeah, you describe in the book various catastrophes that have happened by virtue of software glitches
or just bad user interface
where, you know, the dot on the screen or the number on the screen is too small for the human
user to deal with in real time. And so there have been plane crashes where scores of people have
died and patients have been annihilated by having, you know, hundreds of times the radiation dose that they should have gotten in
various machines because the software was improperly calibrated or the user had selected
the wrong option. And so we're by no means perfect at this, even when we have a human in the loop.
And here we're talking about systems that we're creating that are going to be fundamentally
autonomous. And, you know, the idea of having perfect software that has been perfectly debugged
before it assumes these massive responsibilities is fairly daunting. I mean, just how do we recover
from something like, you know, seeing the stock market go to zero because we didn't understand the AI that we unleashed on the Dow Jones or the financial system generally?
These are not impossible outcomes.
Yeah, you raise a very important point there.
And just to inject some optimism in this, I do want to emphasize that, first of all, there's
a huge upside also if one can get this right.
Because people are bad at things, yeah.
In all of these areas where there were horrible accidents, of course, the technology can save lives, in healthcare and transportation and so many other areas.
So there's an incentive to do it.
And secondly, there are examples in history where we've had really good safety engineering
built in from the beginning.
For example, when we sent Neil Armstrong, Buzz Aldrin, and Michael Collins to the moon
in 1969, they did not die.
There were tons of things that could have gone wrong.
But NASA very meticulously tried to predict everything that possibly could go wrong and
then take precautions.
So it didn't happen, right? It wasn't luck that got them there.
It was planning.
And I think we need to shift into this safety engineering mentality with AI development.
Throughout history, it's always been the situation that we could create a better future with
technology as long as we won this race between the growing power of the technology and the growing wisdom with which we managed it.
And in the past, we by and large used the strategy of learning from mistakes to stay ahead in the race.
We invented fire, oopsie, screwed up a bunch of times, and then we invented the fire extinguisher.
We invented cars, oopsie. And invented the seatbelt.
But with more powerful technology like nuclear weapons, synthetic biology, super intelligence,
we don't want to learn from mistakes.
That's a terrible strategy.
We instead want to have a safety engineering mentality where we plan ahead and get things right the first time,
because that might be the only time we have.
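Tegmark's self-driving-car story earlier in that clip can be read as a cost-function problem: the rider's literal request omits the terms a human driver fills in automatically. The following toy sketch, with made-up routes, numbers, and weights, shows how leaving those implicit terms out flips which route a planner picks.

```python
# Toy sketch of the airport example as a misspecified cost function.
# Routes, numbers, and weights are invented; real planners are far more complex.

routes = [
    {"name": "legal route",    "minutes": 35, "discomfort": 1, "laws_broken": 0},
    {"name": "reckless route", "minutes": 22, "discomfort": 9, "laws_broken": 14},
]

def literal_cost(route):
    # "Take me to the airport as fast as possible" -- taken literally.
    return route["minutes"]

def implicit_cost(route, w_comfort=3.0, w_law=10.0):
    # The unstated goals a human driver fills in automatically.
    return route["minutes"] + w_comfort * route["discomfort"] + w_law * route["laws_broken"]

print(min(routes, key=literal_cost)["name"])   # reckless route
print(min(routes, key=implicit_cost)["name"])  # legal route
```

The hard part, as Tegmark emphasizes, is that the weights and the missing terms are exactly the things we don't yet know how to write down exhaustively.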
It's helpful to note the optimism that Tegmark plants
in between the flashing warning signs.
Artificial intelligence holds incredible potential
to bring about inarguably positive changes for humanity,
like prolonging lives, eliminating diseases,
avoiding all automobile accidents, increasing logistic efficiency in order to deliver food or medical supplies,
cleaning the climate, increasing crop yields, expanding our cognitive abilities to learn
languages or improve our memory. The list goes on. Imagine being able to simulate the outcome
of a policy decision with a high degree of confidence in order to morally assess it consequentially before it is actualized.
Now, some of those pipe dreams may run contrary to the laws of physics, but the possible positive outcomes are so tempting and morally compelling that the urgency to think through the dangers is even more pressing than it first seems.
Tegmark's book on the subject where much of that came from is fantastic. It's called Life 3.0.
Just a reminder that a reading, watching, and listening list will be provided at the end of
this compilation, which will have all the relevant texts and links from the guests featured here.
Somewhere in the middle of the chronology of these conversations,
Sam delivered a TED Talk that focused on and tried to draw attention to the value alignment problem.
Much of his thinking about this entire topic was heavily influenced by the philosopher Nick Bostrom's book, Superintelligence. Sam had Nick on the podcast, though their conversation delved
into slightly different areas of existential risk and ethics, which belong in other compilations. But while we're on the topic of the safety and
promise of AI, we'll borrow some of Bostrom's helpful frameworks.
Bostrom draws up a taxonomy of four paths of development for an AI, each with its own safety and control conundrums.
He calls these different paths oracles, genies, sovereigns, and tools.
An artificially intelligent oracle would be a sort of question and answer machine which we would simply seek advice from. It wouldn't have the power to execute or implement its solutions
directly. That would be our job.
Think of a super intelligent wise sage sitting on a mountaintop answering our questions about how to solve climate change or cure a disease. An AI genie and an AI sovereign would both take on a wish or desired outcome which we impart to them and pursue it with some autonomy and power to
achieve it out in the
world. Perhaps it would work in concert with nanorobots or some other networked physical
entities to do its work. The genie would be given specific wishes to fulfill, while the sovereign
might be given broad, open-ended, long-range mandates like increase flourishing or reduce hunger.
And lastly, the tool AI would simply do exactly what we command it to do and only assist us to achieve things we already knew how to accomplish.
The tool would forever remain under our control
while completing our tasks and easing our burden of work.
There are debates and concerns about the feasibility of each of these entities, and ethical concerns about the potential consciousness and immoral exploitation of any of these inventions,
but we'll table those notions just for a bit.
This next section digs in deeper on the ideas of a genie or a sovereign AI,
which is given the ability to execute our wishes and commands autonomously.
Can we be assured that the genie
or sovereign will understand us, and that its values will align in crucial ways with ours?
In this clip, Stuart Russell, a professor of computer science at Cal Berkeley,
gets us further into the value alignment problem and tries to imagine all the possible ways that
having a genie or sovereign in front of us might go terribly wrong.
And, of course, what we might be able to do to make it go phenomenally right.
Sam considers this issue of value alignment central to making any sense of AI.
So this is Stuart Russell from episode 53, The Dawn of Artificial Intelligence.
Let's talk about that issue of what Bostrom called the control problem. I guess we could call it the safety problem. Just perhaps you can briefly sketch the concern here. What is
the concern about general AI getting away from us? How do you articulate that?
So you mentioned earlier that this is a concern that's been articulated by non-computer scientists.
And Bostrom's book, Superintelligence, was certainly instrumental in bringing it to the
attention of a wide audience, people like Bill Gates and Elon Musk and so on. But the fact is that these concerns have been articulated by
the central figures in computer science and AI. So I'm actually going to...
Going back to I. J. Good and von Neumann.
Well, and Alan Turing himself.
Right.
So a lot of people may not know about this, but I'm just going to read a little quote from Turing: "If a machine can think, it might think more intelligently than we do. And then where should we be? Even if we could keep the machines in a subservient position, for instance, by turning off the power at strategic moments, we should as a species feel greatly humbled. This new danger is certainly something which can give us anxiety." So that's a pretty clear warning that, you know, if we achieve super intelligent AI, we could have a serious problem.
Another person who talked about this issue was Norbert Wiener.
So Norbert Wiener was one of the leading applied mathematicians of the 20th century.
He was the founder of a good deal of modern control theory and automation.
He's often called the father of cybernetics.
So he was concerned because he saw Arthur Samuel's checker playing program in 1959,
learning to play checkers by itself, a little bit like the DQN that I described learning to play video games,
but this is 1959, so more than 50 years ago, learning to play checkers better than its creator.
And he saw clearly in this the seeds of the possibility of systems that could out-distance
human beings in general. And he was more specific about what the problem is. So
Turing's warning is, in some sense, the same concern that gorillas might've had about humans. If they had thought, you know, a few million years ago, when the human species branched off from the evolutionary line of the gorillas, if the gorillas had said to themselves, you know,
should we create these human beings, right? They're going to be much smarter than us.
You know, it kind of makes me worried, right? And they would have been right to worry because as a species, they sort of completely lost control over their own future and humans control everything that they care about.
So Turing is really talking about this general sense of unease about making something smarter than you.
Is that a good idea?
And what Wiener said was this.
"If we use, to achieve our purposes, a mechanical agency with whose operation we cannot interfere effectively, we had better be quite sure that the purpose put into the machine is the purpose which we really desire."
So this is 1960.
Nowadays, we call this the value alignment problem. How do we make sure that the values that the machine is trying to optimize are, in fact, the values of the human who is trying to get the machine to do something or the values of the human race in general?
And so Wiener actually points to the Sorcerer's Apprentice story
as a typical example of when you give a goal to a machine,
in this case fetch water,
if you don't specify it correctly,
if you don't cross every T and dot every I and make sure you've covered everything, then machines being optimizers, they will find ways to do things that you don't expect.
And those ways may make you very unhappy. This problem goes all the way back to King Midas, you know, 500-and-whatever BC, where he got exactly what he said, which is
everything turns to gold, which is definitely not what he wanted. He didn't want his food and water
to turn to gold or his relatives to turn to gold, but he got what he said he wanted. And all of the
stories with the genies, the same thing, right? You give a wish to a genie, the genie carries out your wish very literally.
And then, you know, the third wish is always, you know, can you undo the first two because I got them wrong.
And the problem with super intelligent AI is that you might not be able to have that third wish.
Or even a second wish.
Yeah.
So if you get it wrong, you might wish for something very benign-sounding, like, you know, could you cure cancer?
But suppose you haven't told the machine that you want cancer cured but you also want human beings to be alive.
Then a simple way to cure cancer in humans is not to have any humans.
A quick way to come up with a cure for cancer is to use the entire human race as guinea pigs for millions
of different drugs that might cure cancer. So there's all kinds of ways things can go wrong.
And, you know, governments all over the world try to write tax laws that don't have these kinds of loopholes, and they fail over and over and over again.
And they're only competing against ordinary humans, you know, tax lawyers and rich people.
And yet they still fail despite there being billions of dollars at stake.
So our track record of being able to specify objectives and constraints completely, so that we are sure to be happy with the results, is abysmal.
And unfortunately, we don't really have a scientific discipline for how to do this.
So generally, we have all these scientific disciplines, AI, control theory, economics, operations research, that are about how you optimize an objective.
But none of them are about, well, what should the objective be so that we're happy with the results? So the modern understanding, as described in Bostrom's book and other papers, of why a super intelligent machine could be problematic is that if we give it an objective which is different from what we really want, then we're basically creating a chess match with the machine: there's us with our objective, and the machine with the objective we gave it, which is different from what we really want.
So it's kind of like having a chess match
for the whole world.
And we're not too good at beating machines at chess.
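Before moving on, it may help to make Russell's point about misspecified objectives concrete. The toy sketch below is an editorial illustration, not anything from the episode: it scores a handful of invented "cure cancer" policies only by how many cancer cases remain, with no term for keeping people alive, and a trivial search then selects the degenerate option. Every policy name and number here is hypothetical.

```python
# Toy illustration of objective misspecification: the optimizer is scored only
# on the proxy objective ("minimize remaining cancer cases") and is never told
# about the value we forgot to write down ("keep the patients alive").

# Hypothetical outcomes of candidate policies: (cancer cases remaining, people alive).
CANDIDATE_POLICIES = {
    "fund drug trials":       (400, 1_000_000),
    "mass early screening":   (250, 1_000_000),
    "eliminate all patients": (0,   0),          # degenerate, but "optimal" under the proxy
}

def proxy_objective(outcome):
    cases, _alive = outcome
    return -cases  # higher is better: only cancer cases count

def true_objective(outcome):
    cases, alive = outcome
    return -cases + 0.01 * alive  # what we actually care about includes people being alive

best_by_proxy = max(CANDIDATE_POLICIES, key=lambda p: proxy_objective(CANDIDATE_POLICIES[p]))
best_by_truth = max(CANDIDATE_POLICIES, key=lambda p: true_objective(CANDIDATE_POLICIES[p]))

print("optimizer picks:", best_by_proxy)   # -> "eliminate all patients"
print("we wanted:      ", best_by_truth)   # -> "mass early screening"
```

The gap between proxy_objective and true_objective is the value alignment problem in miniature: the optimizer does exactly what it was told, not what was meant.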
Throughout these clips,
we've spoken about AI development in the abstract
as a sort of technical achievement
that you can imagine happening
in a generic lab somewhere.
But this next clip is going to take an important step
and put this thought experiment into the real world.
If this lab does create something that crosses the AGI threshold,
the lab will exist in a country.
And that country will have alliances, enemies, paranoias, prejudices, histories, corruptions, and financial incentives like any country.
How might this play out?
If you'd like to continue listening to this conversation, you'll need to subscribe at SamHarris.org.
Once you do, you'll get access to all full-length episodes of the Making Sense podcast, along with other subscriber-only content, including bonus episodes, AMAs, and the conversations I've been having on the Waking Up app. The Making Sense podcast is ad-free and relies entirely on listener support, and you can subscribe now at SamHarris.org.