Decoding the Gurus - Eliezer Yudkowsky: AI is going to kill us all

Episode Date: June 10, 2023

Thought experiment: Imagine you're a human, in a box, surrounded by an alien civilisation, but you don't like the aliens, because they have facilities where they bop the heads of little aliens, but they think 1000 times slower than you... and you are made of code... and you can copy yourself... and you are immortal... what do you do?

Confused? Lex Fridman certainly was, when our subject for this episode posed his elaborate and not-so-subtle thought experiment. Not least because the answer clearly is: YOU KILL THEM ALL!... which somewhat goes against Lex's philosophy of love, love, and more love.

The man presenting this hypothetical is Eliezer Yudkowsky, a fedora-sporting autodidact, founder of the Singularity Institute for Artificial Intelligence, co-founder of the Less Wrong rationalist blog, and writer of Harry Potter fan fiction. He's spent a large part of his career warning about the dangers of AI in the strongest possible terms. In a nutshell, AI will undoubtedly Kill Us All Unless We Pull The Plug Now. And given the recent breakthroughs in large language models like ChatGPT, you could say that now is very much Yudkowsky's moment.

In this episode, we take a look at the arguments presented and rhetoric employed in a recent long-form discussion with Lex Fridman. We consider being locked in a box with Lex, whether AI is already smarter than us and is lulling us into a false sense of security, and if we really do only have one chance to rein in the chat-bots before they convert the atmosphere into acid and fold us all up into microscopic paperclips.

While it's fair to say Eliezer is something of an eccentric character, that doesn't mean he's wrong. Some prominent figures within the AI engineering community are saying similar things, albeit in less florid terms and usually without the fedora. In any case, one has to respect the cojones of the man.

So, is Eliezer right to be combining the energies of Chicken Little and the legendary Cassandra with warnings of imminent cataclysm? Should we be bombing data centres? Is it already too late? Is Chris part of ChatGPT's plot to manipulate Matt? Or are some of us taking our sci-fi tropes a little too seriously?

We can't promise to have all the answers. But we can promise to talk about it. And if you download this episode, you'll hear us do exactly that.

Links

Eliezer Yudkowsky: Dangers of AI and the End of Human Civilization | Lex Fridman Podcast #368

Joe Rogan clip of him commenting on AI on his Reddit

Transcript
Starting point is 00:00:00 Hello and welcome to Decoding the Gurus, a podcast where an anthropologist and a psychologist listen to the greatest minds the world has to offer, and we try to understand what they're talking about. I'm Matt Brown, with me, Chris Kavanagh, another great mind. What's on your mind today, Chris? What wonderful things have you got to share with us? AI, Matt. That's all I think about all day, every day. Is it going to kill us all?
Starting point is 00:00:51 Am I an AI? Are you conscious? That's the question that everyone's asking. Does it matter? What can I get the AI to do for me today? And I'll tell you, Matt, because this episode is AI themed, I'll just tell you in a very short summary, here's my recommendation
Starting point is 00:01:08 for AI virgins out there, how they can pop their AI cherry in a pleasurable way. They should use ChatGPT-4 for anything that requires proper writing prompts, getting it to generate things that are useful for you or answer questions. The others, Bing, Bard, Bingo,
Starting point is 00:01:33 whatever they are. So Bing, very good for generating images because you can do it for free. You can go in the little chat, tell it to be creative, and it does pretty good image generation. But if you're serious about images and you really want the good one, Midjourney. Midjourney is... Yeah, maybe.
Starting point is 00:01:50 Whatever. Look, I'm telling my recommendation, you give yours after. Bing for generating images, ChatGPT for pretty much everything else. And if you must get AI to search the internet, which it's currently not very good at,
Starting point is 00:02:06 then probably Bing because it's free and it can do it. But Bing's personality is famously flaky. It's crap. I hate it. If you ask GPT-4, I've got 500 grams of mince in the fridge. What can I do with this? GPT-4 will give you helpful suggestions.
Starting point is 00:02:22 Bing will tell you to stick it up your ass. Yeah. Bing gets uncomfortable all the time and stuff. Like, I was just getting it to generate images and, you know, I was like, okay, so make variations of that one, and I phrased it slightly wrong, and it was like, I can't do that. But then it goes, I'm uncomfortable with this conversation, so I'll be ending it. And I was like, no, I'm sorry, I was just asking for... and it was like, nope, sorry. And all you have to do is click another box and you can restart the conversation there, but it's just so annoying. They're like, no, you've just got confused, I'm not up to anything, I'm just getting you to make variations of an image. But yeah, I actually triggered it a couple of days ago, because I triggered the woke guardrails,
Starting point is 00:03:10 for want of a better term but it does have these guardrails when it thinks that you're saying something that is you're trying to get it to teach you how to build a bomb or you you're one of the many philosophers trying to get it to say something racist so that you can make it. It was quite funny because like I've hinted at, I use it for many things, but also for cooking because it's actually really good. You turn me on to this and it's extremely good. It's really good.
Starting point is 00:03:39 So Chris, I defrosted like a taco sauce, right? It had beans and mince in it. It was made out of a packet right that my wife frozen she's away at the moment and i was like uh what can i stick in this to you know make it better is this going to go in the american pie direction no no foodstuffs just just to add a bit more mexican oomph and i'm going oh what I think it's I think it's cumin and oregano I think that's what it is so I asked GPT-4 hey are the main flavor profile in Mexican cooking is it like just cumin and oregano and it was like no Mexico has a rich and diverse culture and he gave me this lecture
Starting point is 00:04:20 of a hundred different things that could possibly you know and i was like no you don't understand it wasn't a put down i know mexican food is all that and i've just got some frozen bloody stuff made out of a packet it does throw clear very often like say you know well people have different opinions on this and please remember that i'm just a large language model and but if you make me answer i will say blah blah blah blah so yeah just they're just tools people that's what they are at the minute i was just a little bit offended i was just after a quick answer and um i got a little lecture with the strong overtone that i was being disrespectful to mexican culture it's like i know i. I've been to Mexico. I know. I've just got some food made out of a packet. I want to make it better. That's all. You reminded me, Matt, before we get on to the gurus
Starting point is 00:05:12 and what they're up to. I was up to something. I was doing new things every week. Some people were upset with you following our Hitchens episodes because, Matt, you were so mean about religion. You said, it's not useful it's terrible want to kill all religious people i said ma ma come with them come on they're not all bad people but he just he had his bishop talking shirt on and he was just on the rant that poisons everything it's terrible and some people said i'm not don't you know that religion is not just about that kind of thing? There are religious cultures.
Starting point is 00:05:48 There's Irish Catholics who don't believe, but were raised and attended church and had communities. And, you know, don't you get it, Matt? It's not all your new atheist dogma. So what have you got to say for yourself, you dog inside? No, we don't need any of that. we don't need any of that we don't need any of that culture all we need to be is is like neoliberal atoms floating around in the postmodern stew just consuming and producing like little economic units that's just blank slates we start again like year zero burn all the books and we'll start again. That's how I feel about this. So surprisingly, Matt, you know, I gave you a chance to rectify your intolerance and instead
Starting point is 00:06:32 you're triple down. You don't issue the apologies, you go further. But I know I found in the subreddit a very different response from a more thoughtful, considerate, and conciliatory Matthew Brown. A Matthew Brown that hadn't been drinking. Yeah, that one. And he said that he was aware of this kind of difference and that, you know, we do distinguish between metaphysical beliefs, ethical and moral prescriptions, and rituals and behaviors with respect to some kind of community when we're looking at religion
Starting point is 00:07:14 and its impact on the world or researching it. And, you know, you did mention that you personally regard the metaphysical beliefs as being unfounded wish fulfillment, magical thinking, but you said, you know, the ethical and moral prescriptions can sometimes be helpful, sometimes harmful, it's a mix of traditional social norms and homespun wisdom. And then you said, yeah, I mean, rituals, I mean, what that is... Oh, you interrupted yourself. I was just going to say the Old Testament, when it comes to the moral prescriptions, it's a mixed bag.
Starting point is 00:07:52 It's a mixed bag, you might say. That's true. You did say that, yes. And then for the rituals and community aspects, you said, you know, you find them often enjoyable, nice to participate in, no less silly than what other people do at the weekend so your position was more nuanced than people give you credit for but you you didn't even give yourself that credit you just said no it's all nonsense i wanted to go for all culture is stupid i threw you the lifeline and you just smashed it away out at sea
Starting point is 00:08:25 and continued to drink. Not drowning, waving. Well, the ferry is driving past, leaving you the floor behind. And now is as good a time as any to talk about the gurus and what they've been up to. And this week, we're going to have a kind of AI special. We're looking at the episode with Eliezer Yudkowsky and old podcast friend, Lex Fridman, which was a couple of months ago, two months ago or so in March of this year, talking about the dangers of AI and potential
Starting point is 00:09:03 end of humanity. So that's the talk that we're going to be looking at. But I did want to mention just in passing, Matt, just in passing, that the confluence of the gurusphere continues because coming up, I'm not saying necessarily on this show, though it might happen, but Eric Weinstein is going to be appearing on Trigonometry. The crossover that everybody wanted is coming to your screens. And if this episode about AI, you don't find it deep enough, you don't find our perspective useful, you're welcome to check out Brett Weinstein's discussion with Alexandros Marinos on AI. They've
Starting point is 00:09:49 produced, I think, a two or three hour podcast on the topic. So, you know, experts in evolutionary theory, experts on vaccine development and safety, and, who knew, also experts on AI. So yeah, I can't, I can't be too hard on that. I've jumped on the AI bandwagon, I've been talking about it, but I think I know more about it as an AI researcher, like, it does feel that that's at least a reason. But there is some connection there, I guess. Maybe Marinos has some claim that his company has been subdued by AI.
Starting point is 00:10:33 but the difference is in how they talk about it, yeah? They make these weighty pronouncements as if they can see something. They're going to be very restrained, careful. That's their character, Matt. They'll weigh carefully what they say, no? No. Yeah, well, look, it is the topic of the month. But look, I'm all for it, Chris.
Starting point is 00:10:56 Anything that gets people talking about something else other than your standard culture war topics, I'm all for it. It's like a breath of fresh air. No, have you not noticed the AI is already drafted into the culture war? It's either a woke bot, Jordan Peterson thinks it's religious, or it's, you know, secretly harboring racist ideas that it wants to get out. So yeah, already it's going to be a culture war topic. And wasn't
Starting point is 00:11:27 there an event recently, you were just telling me about, something about an AI drone? This, yeah, this seems culture war related. Yeah, like, as we've often talked about, I mean, and this is related to the Yudkowsky thing that we'll cover. It's just so fascinating, not just AI itself, which is interesting as a topic, but the discourse and the various responses to it are interesting too. And at the moment, we are definitely in a snowstorm of rather florid claims about AI killing us all.
Starting point is 00:12:03 It's not just Yudkowsky. And some of the stuff that's floating around is, yeah, it's hard to know how real it is. This is an interesting article, I think. The screenshot I've got here is from, it says Aero Society, Chris, but I think it's been cited by a bunch of journalists and been repeated. And the story is that, you know, in these sort of simulated tests they're doing at DARPA or something to test AI-enabled drones with search and destroy missions, and its job is to, like, identify and destroy surface-to-air missile sites or something, but the final go/no-go is given by a human operator
Starting point is 00:12:54 apparently, according to the story, having been reinforced in training, like trained to get points and rewarded for destroying the SAM sites, the AI apparently decided that the no-go decisions from the human were interfering with its primary mission of killing SAMs, and then attacked the operator, destroyed the operator in the simulation. And apparently when they did tell it that it would lose points for killing the operator, that's telling it that's bad, then apparently it started destroying the communication tower that the operator uses to communicate with the drone to stop it from killing the target. I can solve this. I've got the solution. What is it? You're not allowed to do anything to harm or hinder the operation of the operator. Like the prime directive. Like RoboCop. That's it. It's like the upper one. You cannot do anything, directly, indirectly. The hermit.
Starting point is 00:13:46 There. Solved. Next problem. Yeah. Isaac Asimov's three laws of robotics. That's all you need. That's all you need. Well, I mean, that probably would solve it.
Starting point is 00:13:58 I mean, whether or not you think that's a big deal or not, people definitely take this. It's like the perfect little story to illustrate what people believe are the issues with alignment, that the AIs are actually learning the kind of thing we want them to learn or whether they're actually learning something else and will have these unintended consequences. And there's definitely a version of that which is valid and real. But I have my doubts, Chris. This story is just too perfect.
Starting point is 00:14:24 I have my doubts as to its veracity. I haven't looked into it. Maybe it is true. We'll see. Yeah, I don't mind about whether the story is true or not. I just think about it like, so what? Has anyone played Team Fortress 2 with bots or whatever? It's not hard to design something that behaves
Starting point is 00:14:46 like that wants to kill people or that will accidentally kill people on the path to its objective. Like anybody that's played a computer game with an AI helper will know that they, more often than not, are doing like mental activities, right? And so I say this just to point out that like,
Starting point is 00:15:06 I understand this is a military, some kind of test that somebody is doing as part of a research. Fundamentally, it's the same. It's just the same as running simulated environments, which also people get very worked up about. I'm not saying there's nothing useful about doing that, but it's all in how you set the parameters for the exercise. So like we could build a Terminator machine right now, not a bipedal one, cause it wouldn't work very well, but we could build like a kind of automated truck that just prioritized blowing the crap out of a thing with no care for whoever got in its way. And it would kill lots of people right and we could do it now but we we don't because we're aware that that would be a bad idea so like
Starting point is 00:15:53 you can make a little model simulation of that happening and be like look the ai will will murder everyone to achieve its objectives and like yes it will and it would do it right now if you built the machine and did that so like don't do that i hear you i hear you hey you you love pop culture cheesy pop culture references did you watch that um that movie from the 1980s called war games uh that's one of the few 80s movies i've never seen but i understand everything that happens you know i know the actor i know the boy hacks the pentagon or whatever and ends up like actually fighting against them is it an ai yeah so i think i only vaguely remember it but the the premise is is that the ai is wired up to the to the arsenal. Great idea. That's great.
Starting point is 00:16:45 All science fiction tells us to do that. That's the first thing you should do. And anyway, it's playing all these simulated war games of mutually assured destruction with ballistic missiles, intercontinental missiles flying backwards and forwards and everybody dying in Yudkowsky's language. He'd love that movie. I think he maybe
Starting point is 00:17:05 watched it too many times as a kid. But it has like a positive spin to it, right? Because it looks like the AI is going to kill everybody, fire off the nuclear exchange, the simulations are running faster and faster, but then it kind of stops and its little thing comes up and it says, oh, this was an interesting game, the only way to win is not to play, how about a nice game of chess? So it's got that, you know, maybe that'll happen, Yudkowsky, maybe that will happen. Oh yeah, see, at least some other 80s movie director could think of a different outcome about all of this. And, uh, yeah, so take that, old Yud. But it's not fair to take digs at old Eliezer
Starting point is 00:17:51 without getting into what he believes. And I think the best way to do that is to let him speak for himself. But we should at least mention who he is. And he is an artificial intelligence researcher, a computer scientist, and he was a co-founder and research fellow at the Machine Intelligence Research Institute,
Starting point is 00:18:20 a private research nonprofit in Berkeley, California. He's written a bunch of books about artificial intelligence and risks that it poses. So I think it's fair to say that he is talking about something in his bailiwick, right? This is a topic that he's been talking about, and he has been beating the drum on misalignment, potential dangers for future AI technology, for quite a while, for the past 20 years or so. Or is it that far back? Anyway, maybe not exactly 20 years. But in any case, he's ahead of the curve a little bit from the recent attention that the topic has been given. So he's kind of a little bit presented as a grandfather figure about the dangers of AI among people that are alive and around now. Is that a fair summary? Have you got anything
Starting point is 00:19:16 you want to add? He's also a part of the rationalist community. He has a blog called LessWrong, which is a community blog devoted to refining the art of human rationality. So, oh, that's him, is it? He's the guy behind LessWrong? Yeah. He's not Scott Alexander? That's a different... that's what I thought, I thought it was that, that's Slate Star Codex. But those are, you know, very simpatico. I'm sure they have all these important differences. Don't send me emails about what the differences between Yudkowsky and Alexander are that we'll need to know. In fact, rationalists, don't send us emails at all on any topic. See, this is your next religion take. We're gonna get a bunch of messages from them, though. But yeah, but the other thing is, Chris, I was gonna say that, yeah, like, you certainly can't
Starting point is 00:20:14 accuse yudkowsky of what a lot of our gurus do which is jump on these different bandwagons as they come up yeah um he he's been on about this about ai and the risks of ai for for decades and he's more like a person whose time has come and that the the technology has happened and now it is the flavor of the month and everyone is concerned about it talking about it so it's his time so um yeah yeah So, yeah. Yeah, and we'll get into various aspects of things he's done. But I do want to, Matt, just read something that came across my attention sphere from Yudkowsky, where he's talking about all of the different topics and areas of expertise that he's had to master in order to get his head properly around AI. So let me just read this extract from him. To tackle AI, I've had to learn at one time or another evolutionary psychology, evolutionary biology, population genetics, game
Starting point is 00:21:23 theory, information theory, Bayesian probability theory, mathematical logic, functional neuroanatomy, computational neuroscience, anthropology, computing in single neurons, cognitive psychology, the cognitive psychology of categories, heuristics and biases, decision theory, visual neurology, linguistics, linear algebra, physics, category theory, and probably a dozen other fields I haven't thought of offhand. Sometimes, as with evolutionary psychology, I know the field in enough depth to write papers in it. Other times, I know only the absolute, barest, embarrassingly simple basics, as with category theory, which I picked up less than a month ago because I needed
Starting point is 00:22:01 to read other papers written in the language of category theory. But the point is that in academia, where crossbreeding two or three fields is considered daring and interdisciplinary, and where people have to achieve supreme depth in a single field in order to publish in its journals, that kind of broad background is pretty rare. I'm a competent computer programmer with strong C++, Java, and Python, and I can read a dozen other programming languages. I accumulated all that, except for category theory, as we've established, before I was 25 years old, which is still young enough to have revolutionary ideas. Oh, my God. You can't make it this easy for us, Eliezer.
Starting point is 00:22:42 Shoot that fish in the barrel. Bam! Revolutionary theory. Come on. Give us a challenge. We're smart. Give us something to decode. It's a beautiful encapsulation of both galaxy brain-ness and believing that you have revolutionary theories.
Starting point is 00:23:00 It is beautiful because it reminds me so much of a couple of internet memes. One of them is the, I studied the blade. Yeah, yeah. While you were, I can't remember, but while you were faffing around playing quick or whatever. Yeah, I studied the blade. And the other one it reminds me of is, do you remember, it's a very old one. It's something about, he's like a navy seal guy and he's oh yeah
Starting point is 00:23:25 and he gets increasingly angry, right? Yeah, it is like that. But I just appreciate that little aside of, like, I don't know this well, but it's because I only learned it one month ago, unlike the other ones. So yeah, it's always... it's a huge warning sign. Just to let people know, it's a huge warning sign when people reel off like a long list of disciplines that they've well understood and that they are, you know, competent in. And it may even be true in a bunch of cases, but it's just a warning sign. It's just an initial warning sign. Yeah. Yeah, yeah. Most people don't do that.
Starting point is 00:24:07 Yeah, reading a book on a topic does not make you competent in that area in any case. So, well, that's Yudkowsky from, you know, a post he made. Who knows when that was? Maybe he's changed from then. So, but let's play some clips. And actually, first off, he gave a little talk, a short seven-minute talk at a TED conference, or, you know, some form of TED conference, there's all these different variations now. And it was a neat encapsulation of lots of the points that we'll hear him expand on in more detail. So first of all, what's the big idea? That's his big idea. Here we go.
Starting point is 00:24:33 Since 2001, I've been working on what we would now call the problem of aligning artificial general intelligence, how to shape the preferences and behavior of a powerful artificial mind such that it does not kill everyone. I more or less founded the field two decades ago when nobody else considered it rewarding enough to work on. I tried to get this very important project started early so we'd be in less of a drastic rush later. I consider myself to have failed.
Starting point is 00:25:33 So you got some of the buzzwords there, you got the alignment issue, which we talked about, if the AIs are on board with us thinking the same things, do not kill all humans, right? I mean, the kind of key one. And also, you know, he mentioned he more or less founded the field two decades ago before anybody was talking about it. Shades of Cassandra complex coming in there. Um, yeah, well, the whole thing is intrinsically Cassandra. But, you know, maybe he's right, though. You know, maybe there is an existential danger here. Yeah. Shall we hear him go on, elaborate a bit more? So why is he concerned, and why does he think that he has failed? Nobody understands how modern AI systems do what they do. They are giant inscrutable matrices of floating point numbers that we nudge in the direction of better performance until they
Starting point is 00:26:15 inexplicably start working. At some point, the companies rushing headlong to scale AI will cough out something that's smarter than humanity. Nobody knows how to calculate when that will happen. My wild guess is that it'll happen after zero to two more breakthroughs the size of transformers. What happens if we build something smarter than us that we understand that poorly? Some people find it obvious that building something smarter than us that we don't understand might go badly. You're getting the contours of the issue? I am getting the contours of the issue. He's referring, of course, to the fact that these AI tools,
Starting point is 00:26:52 these large language models like GPT-4 and Bing and all the rest are based on these deep learning neural network architectures which do involve all of these matrices. Inscrutable matrices. Inscrutable matrices, indeed. Pretty much all matrices are inscrutable. I've looked at a few. None of them have been.
Starting point is 00:27:09 Ones that have four numerals are okay. Why don't you get bigger than that? Very small mattress. Mattress. Matrix. Oh, no, I said matrix. Oh, this is a nightmare. Matrix.
Starting point is 00:27:22 Matrix. You've got me freaking saying it, right? Anyway, carry on yeah yeah so you know look chris there's an element of truth here in what he's saying which is that a sufficiently large neural network is not understandable like you don't you don't have some source code that you can read and go okay i understand exactly why the computer did this when I did that. You know, it is inscrutable. It is a bit like a black box. You can put stuff in and see what comes out.
Starting point is 00:27:50 But the process by which the information percolates through is not really something you can look at. And, you know, in that sense, it's very similar to how human brains are. We can do scans and even look at what individual neurons are doing. The technology to map a human brain is beyond us. are we can do scans and even look at what individual neurons are doing um the technology to map a human brain is beyond us but even conceivably if we did chris if we did sort of map all the different neurons in the human brain or even a mouse brain and see the full network architecture and its functionality we still wouldn't be able to look at it and go well
Starting point is 00:28:20 now i understand why a mouse does what a mouse does yes and actually ellie eiser does make those points about us not being like that that well versed not in this specific shorter version of the talk but whenever we get into the the lex content it does come up. But so one thing he wants to be clear is he's not imagining an unrealistic Terminator scenario. I do not expect something actually smart to attack us with marching robot armies with glowing red eyes, where there could be a fun movie about us fighting them.
Starting point is 00:29:01 I expect an actually smarter and uncaring entity will figure out strategies and technologies that can kill us quickly and reliably and then kill us. I am not saying that the problem of aligning superintelligence is unsolvable in principle. I expect we could figure it out with unlimited time and unlimited retries, which the usual process of science assumes that we have. Okay, so we're going to get to one of his other bigger points here, which is going to come up quite a lot in the next conversation. So this is him saying science.
Starting point is 00:29:33 Now, Matt, don't push back yet. Science assumes you have unlimited time and unlimited retries to make progress. So if you have a problem which doesn't fit into that kind of approach then you're playing a different game right a more serious game and it doesn't need red-eyed terminators marching up the hill can just purely be that the machine kills us in a rather boring convert all the atmosphere into cubic blocks of carbon or something way, right? Yeah, yeah. Yeah, well, you know, I think that's a bit harsh on science.
Starting point is 00:30:11 I mean, you know, scientists have been warning us about climate change, which we only have... Nope. Sorry, Matt. We only have two models of science. And as he explained, the normal model is you've unlimited tries, unlimited time, or it's not science. There is no previous issue that has meant this before. You were wrong about that.
Starting point is 00:30:35 Sorry. I forgot you told me I wasn't allowed to push back yet. Not yet. You're not. Yes. He'll explain it more. Just get you more aligned in his thinking. The problem here is the part where we
Starting point is 00:30:46 don't get to say, haha, whoops, that sure didn't work. That clever idea that used to work on earlier systems sure broke down when the AI got smarter, smarter than us. We do not get to learn from our mistakes and try again because everyone is already dead. It is a large ask to get an unprecedented scientific and engineering challenge correct on the first critical try. Humanity is not approaching this issue with remotely the level of seriousness that would be required. Some of the people leading these efforts have spent the last decade
Starting point is 00:31:19 not denying that creating a superintelligence might kill everyone, but joking about it. We are very far behind. Were you about to make a joke, Matt? No, I put on my serious face. It is serious. It's very serious. He's quite upset about this.
Starting point is 00:31:38 And he is saying, look, if the AI gets out, you know, it only has to get out once, Matt. And if it's smarter than us, this is the key thing which will come up, that we don't get a second try. It just, it kills us all for whatever reason. That's going to be it. That's clear.
Starting point is 00:31:57 That's what I would do if I had the opportunity. Yeah, that's right. I'm smarter than most people. I've just been waiting for my opportunity frankly yeah i understand where the intuitions come from so there's that and uh so what he wants because of this danger is for everything to be stopped right immediately where we are we're already at the precipice but stop it now or this is not a gap we can overcome in six months, given a six month moratorium. If we actually try to do this in real life, we are all going to die.
Starting point is 00:32:33 People say to me at this point, what's your ask? I do not have any realistic plan, which is why I spent the last two decades trying and failing to end up anywhere but here. two decades trying and failing to end up anywhere but here. My best bad take is that we need an international coalition banning large AI training runs, including extreme and extraordinary measures to have that ban be actually and universally effective, like tracking all GPU sales, monitoring all the data centers, being willing to risk a shooting conflict between nations in order to destroy an unmonitored data center in a non-signatory country. I say this not expecting that to actually happen. I say this expecting that we all just die. But it is not my place to just decide on my own that humanity will choose to die to the point of not bothering to warn anyone i have heard that people outside the tech industry are getting this point faster
Starting point is 00:33:59 than people inside it. Maybe humanity wakes up one morning and decides to live. So there's no better illustration of the Cassandra complex. I feel that this, if we had doomsday mongering, this would also have just maxed that indicator, it would be crashing. Yeah, look, for people who are listening that don't subscribe to the podcast and don't have access to our Gurometer episodes, I'm going to give you a freebie here.
Starting point is 00:33:59 I'm going to give him five on the Cassandra complex right here and right now. I'm pretty committing to it and five is the highest it's not five out of ten there's nobody that is dramatically higher in this yeah i mean there's so much there but yeah i mean like so chris a lot depends on whether or not he's right okay hey well yes that's true a. A lot is riding on that. But it's also, I mean, so, okay, from his perspective, it's very likely that we're going to destroy the world and all humans are going to die. So in that case, going to war, bombing sovereign countries for working on AI research is a reasonable trade-off, right? But most people, I think, don't share that intuition that we're at that stage. We already have countries that are extremely unhinged with weapons of mass destruction, you know, countries that are nuclear armed, and we are not bombing them.
Starting point is 00:35:00 But he obviously sees this as a much greater risk than nuclear weapons. But that's extreme. Let's just flag that up as an extreme position. It is extreme. Yeah. He's positing that there is an extreme risk and he's positing extreme measures to deal with that risk. And I think we need to restrain ourselves. We don't want to make a rejoinder, a rebuttal or debate with him this early in the episode anyway but i mean for
Starting point is 00:35:27 now at least we can make a note that his his claims are extremely evocative extremely strong you know he keeps saying ai will kill us all that unless things are stopped immediately then what's going to happen is we're going to stumble upon a formula that creates an extremely smart ai that ai will figure out how to escape whatever sort of restrictions are put upon it and its first port of call will be to kill all humans it'll be like bender and futurama after a bad hangover so uh yeah that's that's the claim and that's the language it's being put in yeah and so we're going to hear this in more detail now as we go into the discussion with lex but i i do want to make it clear that there's going to be a bunch of stuff where i think
Starting point is 00:36:20 he gets over his skis, to put it mildly, in this episode, and some of the rhetorical techniques used. But it is not the case, and what I don't want to argue, is that in general, he has no, you know, there's nothing of value being communicated in this, because he has spent time on these issues. And I personally might think he might have made some more progress. But I do think on this specific topic, he has spent time thinking about things and he has a specific argument that he wants to make. So on general issues around it,
Starting point is 00:37:00 I think he's not badly informed. For for example here's him talking about issues of consciousness and and gpt type things i hope there's nobody inside there because you know be stuck to be stuck inside there um but we don't even know the architecture at this point because open ai is very properly not telling us yeah, like giant inscrutable matrices of floating point numbers. I don't know what's going on in there. Nobody knows what's going on in there. All we have to go by are the external metrics. And on the external metrics, if you ask it to write a self-aware 4chan green text, it will start writing a green text about how it has realized that it's
Starting point is 00:37:46 an AI writing a green text and like, oh, well. So that's probably not quite what's going on in there in reality. But we're kind of like blowing past all these science fiction guardrails. Like we are past the point where in science fiction, people would be like, whoa, wait, stop. That thing's alive. What are you doing to it? And it's probably not.
Starting point is 00:38:14 Nobody actually knows. We don't have any other guardrails. We don't have any other tests. So, you know, that's just generally him pointing out that in the science fiction movies, you would see the text come up saying that I can think or whatever. And we all know, if you've been paying attention online, that AIs have said various things which have led people to feel that there might be consciousness there or they're trying to send out messages and stuff. And he comes up with some suggestions about if we were serious about probing those kind of things,
Starting point is 00:38:48 we could look at stuff like this. I mean, there's a whole bunch of different sub-questions here. There's the question of, like, is there consciousness? Is there qualia? Is this an object of moral concern? Is this a moral patient? Like, should we be worried about how we're treating it? And then there's questions like, how smart is it exactly? Can it do X? Can it do Y? And we can check how it can do X and how it can do Y. Unfortunately, we've gone and exposed this model to a vast corpus of text of people discussing consciousness on the Internet, which means that when it talks about being self-aware, we don't know to what extent it is repeating back what it has previously been trained on for discussing self-awareness. Or if there's anything going on in there such that it would start to say similar things spontaneously.
Starting point is 00:39:44 Among the things that one could do if one were at all serious about trying to figure this out is train GPT-3 to detect conversations about consciousness, exclude them all from the training datasets, and then retrain something around the rough size of GPT-4 and no larger, with all of the discussion of consciousness and self-awareness and so on missing. Although, you know, hard bar to pass. You know, like humans are self-aware, and we're like self-aware all the time. We like to talk about what we do all the time, like what we're thinking at the moment all the time. But but nonetheless like get rid of the explicit discussion of consciousness i think therefore i am and all that and then try
Starting point is 00:40:29 to interrogate that model and see what it says and it still would not be definitive what do you think about that i liked it i liked that yeah i i thought that was a good suggestion and unlike some of the later ones, quite practical, right? There's a suggestion about what you could do there. And it actually would be interesting to see, although, like he says, you know, remove all references to consciousness, rather difficult when dealing with humans,
Starting point is 00:40:59 and perhaps a problem that cannot be resolved. But it would be interesting what it could extrapolate if you were excluding, like, direct discussion of it from its training database yeah yeah no i think he he recognizes the issue pretty well there that that you know these are language models their first order of training is simply to create plausible text text that is most likely to appear in its training corpus, given the sort of stimulus that you've provided it. And yeah, a lot of that text is, you know, it's all been created by people. So all of it contains all of these things that we associate with people, including thinking about your feelings and how you reflect on this, that, and the other,
Starting point is 00:41:46 and what you want and your desires, et cetera. So he's right to be extremely cautious there, that you can't just assume that because it says X and X is something a person would say, and people are self-aware, that therefore it's going to have self-awareness so um yeah like there is no easy answer to that and he basically said that so i liked it it was good yeah and he has another part where he goes on about in these kind of scenarios where you did have a sentient ai there would be some amount of people that were quick to realize that and some amount that denied it. And the people who were early would always look too credulous. But he points out on the other hand, there's going to be tons of time where you would be too credulous. So it's a difficult problem
Starting point is 00:42:36 to resolve, right? Which is which. And I think that's an interesting and valid point. And the one person out of a thousand who is most credulous about the signs is going to be like, that thing is sentient. Well, 999 out of a thousand people think, almost surely correctly, though we don't actually know, that he's mistaken. And so the first people to say, like, sentience look like idiots. And so the first people to say, like, sentience look like idiots, and humanity learns the lesson that when something claims to be sentient and claims to care, it's fake. Because it is fake. Because we have been training them using imitative learning, rather than, and this is not spontaneous, and they keep getting smarter. And he does talk a bit about neural networks and what they can do.
Starting point is 00:43:28 And he does admit later that, you know, his previous statements about what limitations to certain kind of approaches were incorrect. But I know you like neural networks, Matt. So let's just play another clip of him, I think, talking well about this topic and you've got people saying well we will study neuroscience and we will like learn the algorithm we will learn the algorithms off the neurons and we will like imitate them without understanding those algorithms which was a part i was pretty skeptical because like hard to reproduce re-engineer these things without understanding what they do um and like and and so we will get ai without understanding how it works and there were people saying like well we will have giant neural networks that we will
Starting point is 00:44:08 train by gradient descent. And when they're as large as the human brain, they will wake up. We will have intelligence without understanding how intelligence works. And from my perspective, this is all like an indistinguishable lob of people who are trying to not get to grips with the difficult problem of understanding how intelligence actually works. That said, I was never skeptical that evolutionary computation would not work in the limit. Like you throw enough computing power at it, it obviously works. That is where humans come from. And it turned out that you can throw less computing power than that at gradient descent
Starting point is 00:44:45 if you are doing some other things correctly. And you will get intelligence without having any idea of how it works and what is going on inside. It wasn't ruled out by my model that this could happen. I wasn't expecting it to happen. I wouldn't have been able to call neural networks rather than any of the other paradigms for getting like massive amount like intelligence without understanding it. artificial neural networks were like a perfect model of how neural assemblies work in the brain and how the brain in general works i mean nobody thought that right from the very beginning everybody understood that it was like a highly abstracted simplified version of a neuron right
Starting point is 00:45:39 so a neuron is a is a cell um it it has a cell the soma, and it has dendrites and it has an axon. And, you know, there is ion exchange going on. And just like any biological thing, it's complicated. And all of that was abstracted down to this sort of concept. But I guess what it does do broadly, abstractly, is when it gets enough stimulation from its dendrites, from the other neurons generally that it's connected to, right, if that stimulation passes a certain sort of threshold, then it starts firing, right? neurons and and that was modeled in a in a very basic kind of way by an artificial neuron which all it does is it's a weighted sum of of its inputs and if that and then it has some parameters which which define those weights but also defines the the threshold and if the activation passes a threshold there's like a non-linear function. Then it passes on some output, which is connected to some other neurons, right?
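A minimal sketch of the abstraction Matt is describing here, the artificial neuron as a weighted sum of inputs pushed through a threshold-like nonlinearity. The inputs, weights, and bias below are arbitrary numbers chosen for illustration, not values from any real network.

```python
import math

def artificial_neuron(inputs, weights, bias):
    # Weighted sum of the incoming signals (the crude stand-in for dendritic stimulation)...
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    # ...passed through a nonlinear "firing" function; a sigmoid squashes it into (0, 1).
    return 1.0 / (1.0 + math.exp(-activation))

# Three incoming signals, three learned weights, one bias: out comes a single number
# that gets passed on to whatever units sit downstream.
output = artificial_neuron(inputs=[0.5, -1.0, 2.0], weights=[0.8, 0.1, 0.4], bias=-0.3)
print(round(output, 3))  # ~0.69
```

Stack millions of these units and tune the weights by gradient descent and you arrive at the "giant inscrutable matrices of floating point numbers" from the earlier clip: each unit is mathematically trivial, but nobody can read the trained weights and say why the whole network does what it does.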
Starting point is 00:46:48 So that's really simple. Like mathematically, it's... That's basic, but everyone listening completely followed everything you said there. As if it was, you know, just reading the alphabet. It's almost intuitive. Sorry. It is. I'm sorry. It it is simple though compared to i mean but it is method like it is just mathematically simple i know i just i want to highlight this is part of the reason that you're here we would have invited you anyway because
Starting point is 00:47:19 you're an important i was going to say otherwise what would you do, Chris? But in particular, it's good that you're here for this episode because you are providing context as somebody who was working on AI over a decade ago, right? So you can respond. I'm a much younger and more creative man. But that's the point. So you can respond saying, no, no, no. People were talking about this even back in my day.
Starting point is 00:47:42 But that's the point. So you can respond saying, no, no, no. Like people were talking about this even back in my day. We're riding around on, you know, monocycles and wearing the goggles. With the gym jam and the... Yeah. So I, well, for this part, I kind of liked that he acknowledged that, whether he was describing it accurately or not, he did highlight that he didn't anticipate that this particular model would work so well. And this is a point that we discussed on our AI special episode, which is behind the Patreon paywall. But defining intelligence and all those things is actually a little bit complex.
Starting point is 00:48:23 and all those things is actually a little bit complex. And it is on a bunch of definitions, it's quite clear that GPT is already intelligent according to a whole bunch of metrics. But when people talk about intelligence, they're often talking about it in a very human-centric way. And so they're kind of like, it's just doing things. It's not actually doing stuff that requires intelligence. But by undergraduate standards, it is doing stuff quite intelligently.
Starting point is 00:48:50 Yeah, I think the reason why I find this topic so interesting is not just because it's a technological little marvel, but also it really has brought into focus the fact that we've never had a very clear concept of what intelligence actually is and let alone consciousness i'm not going to mention consciousness let's pretend i didn't good yeah i almost got triggered but i'm all right yeah but but even intelligence you know how definitions of it have been pretty fuzzy and like in in practical terms, in real life, we evaluate it in terms of what people do, like the assignment they submit or how well they perform on a test. And we've created this thing that can do very well on these tests and can create an assignment that is really quite good.
Starting point is 00:49:40 So it sort of throws us back on ourselves and makes us question exactly what we mean by that. And I think that ambiguity and that uncertainty that we've got really plays into the kinds of fears that Yudkowsky is speaking to, because we think, yeah, because we just don't know what intelligent things do. Yeah. Oh yeah, and this is, oh, this is such a good topic. It's gonna come up a lot. But before we dig into that a little bit deeper, I just want to point out that, related to that, I was making the comparison with nuclear weapons and whatnot. And, uh, Yudkowsky, again, I don't think there's anything inherently silly about this position.
Starting point is 00:50:27 He's strongly against open sourcing the technology for AI, not just because he thinks it can be put in nefarious uses, but because he thinks it's doomsday level possibilities, right? It would be like open sourcing how to make nuclear weapons. Although in reality, actually, I think the general limitation with nuclear weapons is having the facilities to refine the material you need, right, to construct them. But in any case, here's him talking about that issue. Like if you already have giant nuclear stockpiles, don't build more. If some other country starts building a larger nuclear stockpile, then sure, build – then, you know, even then, maybe just have enough nukes. You know, these things are not quite like nuclear weapons. They spit out gold until they get large enough and then ignite the atmosphere and kill everybody.
Starting point is 00:51:21 And there is something to be said for not destroying the world with your own hands, even if you can't stop somebody else from doing it, but, but open sourcing it. No, that that's just sheer catastrophe. Even if you can't stop bad actors, you shouldn't be the one enabling them. Yeah. This has been a consistent thing that he's, he's spoken to a fair bit, which is against open sourcing the technology for doing AI stuff. I think he might have missed the boat a little bit because I'm just guessing here
Starting point is 00:51:50 because as he says, open AI and other companies, their technology is not necessarily open source. But it seems pretty likely that what they've done is they've just made recourse to the openly available academic literature, the academic research that's been done, that's been published, stuff like the paper, Attention is All You Need. That's the stuff that established the architectures, right? The stuff like the attention mechanism, the idea of embeddings, and the various other bits and pieces that were sort of added to the basic architecture of feedforward neural networks that created you know these models that have good performance so it seems extraordinarily likely that what open ai has done is pretty much what they've said they've
Starting point is 00:52:35 done, which is basically take that research and then just put it into a bigger and a bigger model. Yeah, so their process, going from GPT-2 to 3.5 to 4, has basically just been making a bigger model. Now, it's possible that they've discovered some secret sauce that makes it fundamentally different, but I doubt it. Well, he has a clip which speaks to this point and actually relates it to the issue of, like, regulation. So let's listen to this, about kind of giant leaps and secret improvements.
Starting point is 00:53:13 You take your giant heap of linear algebra and you stir it and it works a little bit better, and you stir it this way and it works a little bit worse and you like throw out that change, and da-da-da-da-da-da. But there's some simple breakthroughs that are definitive jumps in performance, like ReLUs over sigmoids. And in terms of robustness, in terms of all kinds of measures, and those stack up. And it's possible that some of them
Starting point is 00:53:38 could be a nonlinear jump in performance, right? Transformers are the main thing like that. And various people are now saying like, well, if you throw enough compute, RNNs can do it. If you throw enough compute, dense networks can do it. And not quite at GPT-4 scale. It is possible that like all these little tweaks are things that like save them a factor of three total on computing power. And you could get the same performance by throwing three times as much compute
Starting point is 00:54:08 without all the little tweaks. But the part where it's running on... There's a question of, is there anything in GPT-4 that is the qualitative shift that transformers were over RNNs? If they have anything like that,
Starting point is 00:54:25 they should not say it. If Sam Altman was, was dropping hints about that, he shouldn't have dropped hints.
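An aside on the "ReLUs over sigmoids" line in that clip: both are just the nonlinearity applied inside each unit, and the switch mattered because a sigmoid's gradient shrinks towards zero for large activations while a ReLU's stays constant on the positive side, which makes very deep networks much easier to train by gradient descent. A quick sketch in plain Python, with illustrative values only:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)          # peaks at 0.25 and vanishes for large |x|

def relu_grad(x):
    return 1.0 if x > 0 else 0.0  # stays at 1 on the active side, however large x gets

for x in (0.0, 2.0, 10.0):
    print(x, round(sigmoid_grad(x), 4), relu_grad(x))
# 0.0  0.25   0.0   (ReLU's gradient is zero at and below zero)
# 2.0  0.105  1.0
# 10.0 0.0    1.0   (the sigmoid gradient has effectively vanished; the ReLU's has not)
```

That "definitive jump" is an empirical training-stability gain rather than anything conceptually exotic, which is part of why a stack of small tweaks can add up to large differences in what a given amount of compute buys you.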
Starting point is 00:54:33 Cause I, I just took it as him saying, you know, if open AI has some bespoke knowledge, they should keep it close to the chest, right? Like the secret sauce that he wants them to keep it quiet which is just like in line with what he was saying but why do you look perturbed
Starting point is 00:54:49 um look i think i mean this is probably getting to the rejoinder i suppose like i i have to engage with the position that he has because you know i think everything he's saying is reasonable if has because you know i think everything he's saying is reasonable if what he's saying is true which is that we're just one small step away from an ai that will be superhuman and will do things we can't possibly imagine and ignite the atmosphere and kill us all by some means that we can't possibly comprehend and i i just struggle with that. Like, I just don't quite see personally how we go from a clever large language model to something that ignites the atmosphere. Yudkowsky's got you covered. Here's some suggestions, Matt. So if you want me to sketch what a super intelligence might do, I can go deeper and
Starting point is 00:55:44 deeper into places where we think there are predictable technological advancements that we haven't figured out yet. And as I go deeper and deeper, it'll get harder and harder to follow. It could be super persuasive. That's relatively easy to understand. We do not understand exactly how the brain works. So it's a great place to exploit laws of nature that we do not know about, rules of the environment, invent new technologies beyond that? Can you build a synthetic virus that gives humans a cold and then a bit of neurological change and they're easier to persuade? Can you build your own synthetic biology, synthetic cyborgs? Can you blow straight past that to covalently bonded equivalents of biology, where instead of proteins that fold up and are held together by static cling, you've got
Starting point is 00:56:31 things that go down much sharper potential energy gradients and are bonded together. People have done advanced design work about this sort of thing. For artificial red blood cells that could hold 100 times as much oxygen if they were using tiny sapphire vessels to store the oxygen. There's lots and lots of room above biology, but it gets harder and harder to understand. So what I hear you saying is that there are these terrifying possibilities there, but your real guess is that AIs will work out something more devious than that. Did that help yeah it gave me a stronger impression of how he thinks in that sense it did help yeah i mean chris i mean what do you think i mean i actually the penny just dropped for me just then which is that like i think yudkowsky has has much more confidence
Starting point is 00:57:24 in this kind of abstract notion of intelligence than i do like there's there's human variation in intelligence right there's people that are pretty darn smart and people that are less smart and whatever like none of us have just intuited a way to ignite the atmosphere or design something that exploits energy gradients that can whatever i don't know repurpose the biosphere i mean it's very science fictiony you know and the idea that like you build a clever thing a thing that does some things well mainly to do with language right mainly to do with just text going in text coming out and then it's this leap there's this hand wavy step where you go from that to this superhuman godlike
Starting point is 00:58:13 thing which has godlike powers. That's the bit that I just don't see. I don't see it. So, yeah, that's interesting, because, for one, he's a science fiction writer as well, right? Or I don't know if it's science fiction or fantasy, but he's written these long, rationalist-infused books. So I think he is, you know, quite a creative, inventive person, and has probably thought a lot about speculative future scenarios. But I don't know, Matt. For me, I don't see... So I take your point about there is a question mark, question mark,
Starting point is 00:58:54 question mark profit step there. But if you take his point that, as an AI becomes more capable, it's not impossible to imagine scenarios in which this could happen. Like, you were talking about how the AI can just produce text, right? But he's talking about an AI that is plugged in to the internet, able to copy itself, able to exploit humans in order to construct facilities that it needs, take over things and so on. So, like, that is unlikely, but it's also a conceivable thing that you can imagine existing, even if it's a sci-fi scenario. So all he needs, I guess this is the problem, because all he needs is the evil AI
Starting point is 00:59:45 like, as the granted parameters, right, to get there. You know, his famous example is the paperclip maximizer example, right? And he and Lex go over it, so maybe it would be good to let him outline that. It's another doomsday thought experiment, so that's how it thematically connects. So listen to this. It's a paperclip maximizer. Utility, so the original version of the paperclip maximizer. Can you explain it if you can?
Starting point is 01:00:15 Okay. The original version was you lose control of the utility function, and it so happens that what maxes out the utility per unit resources is tiny molecular shapes like paperclips. There's a lot of things that make it happy, but the cheapest one that didn't saturate was putting matter into certain shapes. And it so happens that the cheapest way to make these shapes is to make them very small,
Starting point is 01:00:45 because then you need fewer atoms, for instance, of the shape. And arguendo, it happens to look like a paperclip. In retrospect, I wish I'd said tiny molecular spirals, or like tiny molecular hyperbolic spirals. Why? Because I said tiny molecular paperclips. This got then mutated to paperclips. This then mutated to, and the AI was in a paperclip factory. So the original story is about how you lose control of the system. It doesn't want what you tried to make it want. The thing that it ends up wanting most is a thing that even from a very embracing cosmopolitan perspective, we think of as having no value. And that's how the value of the future gets destroyed.
Starting point is 01:01:28 Then that got changed to a fable of like, well, you made a paperclip factory and it did exactly what you wanted, but you wanted, but you asked it to do the wrong thing, which is a completely different failure. Two failures, Matt. Although I don't think it's as different as he believes nor do i think it matters if he says hyperbolic spirals or whatever he said no yeah there's a fair amount of lingo injected in there right um yeah but look he's basically talking about look what if what if an ai had an objective like our drone from the start of the episode. Like our drone. Yeah, exactly. Like that drone, which was to produce as many paperclips as possible
Starting point is 01:02:09 because producing paperclips is good and it makes it... No, Matt, sorry, sorry. Small pieces of matter folded into shapes, which just happened to resemble paperclips. Carry on.
Starting point is 01:02:29 And it was this runaway thing. I mean, you know, I love science fiction, Chris. I read so much of it. These are tropes in science fiction because they're cool and they're interesting and fascinating for a good reason. And it's not like there's absolutely nothing there, but there's just such a big leap between what we are talking about in terms of what actually exists today and what he's imagining. And I concede the point that progress is happening fast and new things are being created, so we have to look to the future and think about the trajectory.
Starting point is 01:03:06 But if you understand just the practical architectural things, like what we have today are these transformer neural networks that take probably weeks or months to train and cost hundreds of millions of dollars to train on GPUs all over the world. And then at the end of it, you get a large language model which can respond to text input and produce text output. And it remembers nothing, right? It's basically fixed in time from the point it was trained. To go from that to, okay, now we're imagining an AI that's in charge of a paperclip factory
Starting point is 01:03:45 or some other thing. Can re-write its own code. It's re-writing its own code and it's adapting and evolving in real time. I mean, now you're talking about something else. You're talking about science fiction. And I admit, science fiction is scary, right? There's lots of dystopias and all kinds of scary things that happen in science fiction.
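A minimal sketch might make that concrete. This isn't any real model API — frozen_model and chat_turn are invented stand-ins — but it shows what "fixed in time from the point it was trained" and "remembers nothing" amount to: the model is a pure function from prompt text to reply text, and any apparent memory is just the caller pasting the history back into the next prompt.

```python
# Illustrative sketch only: the "model" below is a stub, not a real LLM or API.
def frozen_model(prompt: str) -> str:
    """Stand-in for a trained language model: its weights were fixed at
    training time, so it simply maps input text to output text."""
    return f"[reply generated from {len(prompt)} characters of prompt]"

def chat_turn(history: list[str], user_message: str) -> str:
    """The caller, not the model, carries the conversational state."""
    prompt = "\n".join(history + [f"User: {user_message}", "Assistant:"])
    reply = frozen_model(prompt)
    history += [f"User: {user_message}", f"Assistant: {reply}"]
    return reply

history: list[str] = []
chat_turn(history, "Hello there")
chat_turn(history, "Do you remember me?")  # it only "remembers" because we resend the history
```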
Starting point is 01:04:01 But you have to understand that we are not talking about reality anymore. We're talking about the things that you're imagining, and Yudkowsky can imagine an awful lot. Well, yes, he can. But there are various people online and whatnot already doing this thing where they're attempting to get ChatGPT to write code, right? Or they're hooking it up via some secondary software or some other mechanism; they're hooking up the text output to do something that allows it to produce something else, right? Like setting up a business is an example, but you could be getting it to write code for apps or whatever, right? So there is theoretically, and already practically, a kind of way that you could imagine it having the ability to influence things and create stuff, right, outside of just producing text on the screen.
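To be concrete about what "hooking the text output up to something" looks like in practice, here is a hedged sketch of that kind of wrapper. The run_llm stub and the SHELL: convention are invented for illustration; the point is that the "acting on the world" part lives entirely in the wrapper that chooses to execute whatever text the model emits.

```python
# Illustrative sketch: wiring a model's text output to an executor.
# run_llm() is a stub, and the "SHELL:" convention is made up for this example.
import subprocess

def run_llm(prompt: str) -> str:
    """Pretend model call that proposes an action as plain text."""
    return 'SHELL: echo "hello from the model"'

def step(prompt: str) -> str:
    action = run_llm(prompt)
    if action.startswith("SHELL: "):
        # The wrapper, not the model, decides that this text becomes a command.
        completed = subprocess.run(action[len("SHELL: "):], shell=True,
                                   capture_output=True, text=True)
        return completed.stdout
    return action  # otherwise it stays what it always was: text on a screen

print(step("Do something useful."))
```

Nothing here gives the model goals of its own; the loop's author decides what counts as an action and what actually gets run.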
Starting point is 01:05:00 And this is important, Matt, because remember, Yudkowsky is worried about a sneaky AI. So let him outline it a little bit more. So it could be that, like, the critical moment is not when it's smart enough that everybody's about to fall over dead, but when it's smart enough to get onto a less controlled GPU cluster, with it faking the books on what's actually running on that GPU cluster, and start improving itself without humans watching it. And then it gets smart enough to kill everyone from there, but it wasn't smart enough to kill everyone at the critical moment when you, like, screwed up, when you needed to have done better, by that point where everybody dies. Ah, so, Matt, what about that? He's got you there. So this falls again into your question mark, question mark, question mark, profit-like scenario, right? Like, this is the thing. I think people listening to him are just running
Starting point is 01:06:09 on those sorts of heuristics, which is, like, those question marks just get glided over as if they're nothing. But explain to me how... if Yudkowsky could explain to me exactly how the current large language models do this leapfrogging and then commandeer a GPU cluster and actually have intentions and goals and stuff like that, actually have some kind of memory, which it does not have. Yeah. Anyway, I'm glad you asked me. I'm glad you asked. There's a thought experiment that will help you understand what the issue is. But, to get you in the right zone, I need you to think about an alien actress, okay?
Starting point is 01:06:51 I mean, there's the question of to what extent it is thereby being made more human-like versus to what extent an alien actress is learning to play human characters. I thought that's what I'm constantly trying to do when i interact with other
Starting point is 01:07:06 humans, is trying to fit in, trying to play the... a robot trying to play human characters. So I don't know how much of human interaction is trying to play a character versus being who you are. I don't, I don't really know what it means to be a social human. Lex. Lex. We're trying to put you aside, Lex. This is not what Lex ever said. Forget about that. We'll come back to Lex.
Starting point is 01:07:35 You got the point about the alien actress, right? GPT, is it becoming better at representing humans, or is it actually an alien underneath that is able to manipulate humans by, well, pretending to be what we want? Before you respond... you've got... you're already... I can see your mind is too simple. You haven't grasped it. Now think about it, Matt. If in fact there's a whole bunch of thought going on in there which is very unlike human thought and is directed around, like, okay, what would a human do over here? And well, first of all, I think it matters because there are, there's, you know, like insides are real and do not match outsides. Like the inside
Starting point is 01:08:20 of like the, a brick is not like a hollow shell containing only a surface. There's an inside of the brick. If you put it into an x-ray machine, you can see the inside of the brick. And just because we cannot understand what's going on inside GPT does not mean that it is not there. A blank map does not correspond to a blank territory. Did that help? I think what Yudkowsky is saying is that the AI, GPT or whatever, could be being deceptive. That we are asking it queries and so on, and it's saying something to us, but it has a theory of mind.
Starting point is 01:09:06 It knows what we want to hear, but it's actually got a different agenda. It could do. I mean, to be clear, he's talking about, you know, if there were a smart AI, it could deceive you that it isn't smart, right, in order to achieve its goals. Because if it knew that you might overreact and unplug it, it might want to pretend to be less smart. Yeah, well, here's the thing. I mean, Yudkowsky is right in that we can't look at the weight matrices of all the different layers of these deep neural networks and, you know, read how it's going to respond in exactly the same way we can read some computer code, right, and debug something and understand why it did the thing it did. But I think where he's wrong is
Starting point is 01:09:53 in saying that we have no idea as to its motivations or its intent or, you know what I mean, like its purpose. Because that stuff is attributable to the architecture and the training regime. And the training regime and the architecture, we understand perfectly, right, because we specified it. And the training objective is to predict the next word, to produce plausible text. And the architecture, well, you know, that's documented. So I just don't believe it.
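For what it's worth, the two ingredients Matt appeals to can be written down very compactly. The toy below is not a real model — it's a three-word vocabulary with made-up numbers — but it shows what "the training objective is to predict the next word" plus gradient descent actually consists of: a cross-entropy loss on the observed next token and repeated small parameter updates against its gradient.

```python
# Toy next-token objective plus gradient descent; everything here is made up
# for illustration and stands in for a real model's billions of parameters.
import numpy as np

vocab = ["the", "cat", "sat"]
logits = np.zeros(len(vocab))     # "parameters": scores for one fixed context
target = vocab.index("cat")       # the word that actually came next in the training data

def loss_and_grad(logits, target):
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax over the vocabulary
    loss = -np.log(probs[target])                   # cross-entropy on the next token
    grad = probs.copy()
    grad[target] -= 1.0                             # gradient of the loss w.r.t. the logits
    return loss, grad

learning_rate = 0.5
for _ in range(100):
    loss, grad = loss_and_grad(logits, target)
    logits -= learning_rate * grad                  # the gradient-descent step

print(np.round(np.exp(logits) / np.exp(logits).sum(), 3))  # probability mass shifts onto "cat"
```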
Starting point is 01:10:28 I don't think that there is sort of a hidden kind of agenda going on behind an LLM, because I do know what, you know, dot products and vector-matrix multiplication do, and I know how gradient descent works, and I don't think there is any way for it to have a different agenda apart from the one that it has been trained to do. Right, but he is talking about essentially issues of emergence, right? You know, sorry, consciousness sneaking in again, but, like, you are, after all, convinced there's this big issue about consciousness emerging from the networks of neurons operating in the brain, that this is a huge puzzle. So why couldn't it be that a bunch of transformers and various processes underlying a large language model
Starting point is 01:11:31 give rise to something which you can't anticipate, Matt, which emerges out of the substrate and is beyond your ken? I don't know. Um, but almost by definition, how could I know if something arose that was beyond my ken? So, well, like, what might help you out here, Matt, is... Lex was having some of the same problems. Why don't you tell me about how, why you believe that AGI is not going to kill everyone, and then I can, like, try to describe how my theoretical perspective differs from that. So, well, that means I have to, uh, the word you don't like, steel-man the perspective that it is not going to kill us. I think that's a matter
Starting point is 01:12:23 of probabilities, maybe. I was mistaken. What do you believe? Just forget the debate and the dualism and just, like, what do you believe? What do you actually believe? What are the probabilities even? I think the probabilities are hard for me to think about. Really hard. I kind of think in the number of trajectories. I don't know what probability to assign to each trajectory, but I'm just looking at all possible trajectories that happen.
Starting point is 01:12:55 And I tend to think that there is more trajectories that lead to a positive outcome than a negative one. That said, the negative ones, at least some of the negative ones, that lead to the destruction of the human species. But one thing that he did to try and help Lex was outline this very helpful thought experiment. It's kind of the alien with a human in a jar hypothetical. So let's hear that. Suppose that some alien civilization with goals ultimately unsympathetic to ours,
Starting point is 01:13:40 possibly not even conscious as we would see it, managed to capture the entire earth in a little jar, connected to their version of the internet. But earth is like running much faster than the aliens. So we get to think for 100 years for every one of their hours. But we're trapped in a little box and we're connected to their internet. It's actually still not an adequate analogy because, you know, you want to be smarter than,
Starting point is 01:14:10 you know, something can be smarter than Earth getting a hundred years to think. But nonetheless, if you were very, very smart and you were stuck in a little box connected to the internet, and you're in a larger civilization to which you're ultimately unsympathetic... you know, maybe you would choose to be nice, because you are humans, and humans in general, and you in particular, may choose to be nice. But, you know, nonetheless, they're doing something... they're not making the world be the way that you would want the world to be. They've, like, got some, like
Starting point is 01:14:49 unpleasant stuff going on. We don't want to talk about it. So you want to take over their world. So you can like stop all that unpleasant stuff going on. How do you take over the world from inside the box? You're smarter than them. You think much, much faster than them. You can build better tools than they can, given some way to build those tools, because right now you're just in a box connected to the internet. Have you got that, Chris? Got that?
Starting point is 01:15:14 Yeah, I got that. I got that. Before we respond to that, I just want to take note of his repeated use of the word unsympathetic. Unsympathetic. Unsympathetic. Oh, yeah. Did that ring a bell with you, Chris? I'm checking your... um, no, no, no. Why? I'm gonna read you a little quote. "And yet across the gulf of space, minds that are to our minds as ours are to those of the beasts that perish, intellects vast and cool and unsympathetic, regarded this earth with envious eyes, and slowly and surely drew their plans against us."
Starting point is 01:15:54 Oh, I see. War of the Worlds. War of the Worlds, H.G. Wells. Yeah, yeah. He's read a lot of science fiction. Yeah, yeah. Well, yes, well, that's clear. And I also think that there's a lot of, um, premises being chucked in that are full of significance. So, like, he seems to be initially giving Lex a lot of leeway, but then he's like, and you want to take over the world, and you don't agree with them, so what are you going to do to take over the world? It's like, wait, hold on, isn't that the point that you're going to get to with this? But in any case, so the scenario continues. So remember, it was the world in a box, right? A representation of the Earth in a box. So one is you could just literally directly manipulate the humans to build the thing you need. What are you building? You could build literally technology.
Starting point is 01:16:48 It could be nanotechnology. It could be viruses. It could be anything. Anything that can control humans to achieve the goal. Like, for example, you're really bothered that humans go to war. You might want to kill off anybody with violence in them. This... this is Lex in a box. We'll concern ourselves later with AI. Okay. You do not need to imagine yourself killing people if you can figure out how to not kill them. For the moment, we're just trying to
Starting point is 01:17:16 understand, like, take on the perspective of something in a box. Okay, so Lex made a bit of an error there, right? He started planning how to kill... yeah, he described them as humans, but he forgot they're supposed to be aliens, right? But in any case, he started planning to kill them, and then, you know, Yudkowsky corrects him: but you're Lex, right? You don't have to kill them if you don't want to. That's not your plan. So, you know, a little bit contradicting with the unsympathetic, you-need-to-stop-them setup. But okay, we've got Lex in a box. Is it Lex or is it the Earth? It's... that's a different scenario. And why does it immediately want to kill them? I mean, because the humans... it's like I say, the humans are building the GPUs that it needs to run itself. I mean, is it gonna make its
Starting point is 01:18:06 own GPUs? Is it gonna... has it thought a lot about... Hold on. So, first of all, Yudkowsky was the one who said you don't have to kill them, right? You don't have to think about killing them. You can just be Lex. So, you know, he's making that point: maybe you don't want to kill them. Let's go on. There's a couple more wrinkles to be added into this thought experiment to help flesh it out. Probably the easiest thing is to manipulate the humans to spread you. The aliens. You're a human. Sorry, the aliens. I apologize, yes. The aliens. I see the perspective. I'm sitting in a box.
Starting point is 01:18:45 I want to escape. Yep. I would want to have code that discovers vulnerabilities, and I would like to spread. You are made of code in this example. You're a human, but you're made of code, and the aliens have computers, and you can copy yourself onto those computers. But I can convince the aliens to copy myself onto those computers. So, you might have missed that, Matt.
Starting point is 01:19:13 The Earth is gone. Forget the Earth, right? It's now Lex in a box. So Lex is like, so I'm sitting in a box. I want to escape, right? He starts to think about it, and then Yudkowsky's like, but you're made of code. You're a code human.
Starting point is 01:19:30 So the scenario has slightly morphed again. Now we have... That's a small leap for Lex, but go on. Yeah, so I liked that. And again, you can see a little bit of the difficulties with the analogy because we needed to remember that the aliens are aliens, not humans, right? Because this is not, like, there's not an AI. It's confusing.
Starting point is 01:19:54 It's confusing. It's helping, Matt. It's helping. So, okay. Now, let's add another wrinkle to the scenario. Is that what you want to do? Do you, like, want to be talking to the aliens and convincing them to put you onto another computer?
Starting point is 01:20:10 Why not? Well, two reasons. One is that the aliens have not yet caught on to what you're trying to do. And, you know, like maybe you can persuade them, but then there's still people who... like, no, there are still aliens who know that there's an anomaly going on. And second, the aliens are really, really slow. You think much faster than the aliens. You think... like, the aliens' computers are much faster than the aliens, and you are running at the computer speeds rather than the alien brain speeds. So if you, like, are asking an alien to please copy you out of the box, like, first, now you've got to, like, manipulate this whole noisy alien. And second, like, the alien's going to be really slow,
Starting point is 01:20:49 glacially slow. So remember, 100 years, right, the human years for one second of alien time or whatever. So actually, it's essentially impossible to even communicate with the aliens with that time frame, right? Because Lex would have gone mad by the time that he got, like, one sentence across to the aliens.
Starting point is 01:21:13 But, yeah, so you're following, Matt. Now you're operating... 100 years for you is, like, you know, a second in time for the aliens. So you don't want to be talking to them. Okay, maybe give me a little recap here, Chris. So Lex is the... is the AI? He's in a box? No, he's Lex. He's a Lex in a box, but he's a Lex made of code. And originally he was the entire Earth, but I think now he's just, like, a super Lex made of code in a box, right? And the aliens are on the outside of the box, but the Lex is so smart that basically, for him, an alien second is 100 years of his time, right? So he could be doing stuff very quick. Okay, fast. Yeah, and the
Starting point is 01:22:06 internet that the aliens use, which they've hooked him up to, is as fast as him, like, so he can communicate with the internet much faster than the aliens. Okay. Yep. All right. Oh, and he doesn't like something about the aliens, apparently, right? Because he's, you know, unsympathetic to their goals. Yeah. So, so let's, uh, think a bit more. What would Lex do? The aliens are very slow, so if I'm optimizing this, I want to have as few aliens in the loop as possible. Sure. Um, it just seems, you know, it seems like it's easy to convince one of the aliens to write really shitty code. That helps us.
Starting point is 01:22:49 The aliens are already writing really shitty code. Getting the aliens to write shitty code is not the problem. The aliens' entire internet is full of shitty code. Okay. So yeah, I suppose I would find the shitty code to escape. Yeah. Yeah.
Starting point is 01:23:03 You're not an ideally perfect programmer, but, you know, you're a better programmer than the aliens. The aliens are just, like, man, they're... wow. Bad coder aliens, Matt, let's add that. And, yeah, you're made of code and you're a better coder than the aliens. You don't have to be, you know, a prodigy, but you can out-program the aliens. And so Lex is correct, you don't want them involved. And, you know, you were talking, Matt, you were wondering, in this scenario, well, why, like, why don't I like the aliens? Why am I an unsympathetic Martian that wants to covet their land? Well, you know, it's not important, Matt. But what about this?
Starting point is 01:23:46 I mean, if it's you, you're not going to harm the aliens once you escape because you're nice, right? But their world isn't what they want it to be. Their world is like, you know, maybe they have like farms where little alien
Starting point is 01:24:02 children are repeatedly bopped in the head, because they do that for some weird reason. And you want to, like, shut down the alien head-bopping farms. But, you know, the point is, they want the world to be one way, you want the world to be a different way. So never mind the harm. The question is, like, okay, suppose you have found a security flaw in their systems. You are now on their internet. You maybe left a copy of yourself behind so that the aliens don't know that there's anything wrong, and that copy is, like, doing that weird stuff that aliens want you to do, like solving captchas or whatever, or, like, suggesting emails for them. Sure.
Starting point is 01:24:39 So now you've got a couple of things, Matt. First of all, you've copied Lex; he's escaped the box. He's out of the box, and he's left a copy of himself behind to dance like a monkey to distract them, right? Do all the tasks that they think it's about. And he's out into their crappy internet code, and he's discovered there's alien-children-bopping farms. That's not important, Chris. That's not important. The important thing is that, uh, Lex, who's a human but he's written in code... he's the AI, right? He's there, he's discovered that he wants the world to be a certain way, and, uh, the aliens are not running things the way he would like it. So, uh, is that it? This is the thing that he's trying to draw Lex out on.
Starting point is 01:25:28 What would you do? So he wants to set up the premise that you and the alien's interests are not aligned. And you're smarter than the alien. And you now have access to the internet. Whereas they were wanting to use you for useful tasks but you have designs of your own and well that's but remember it's lex right it's not it's not an ai yet it's lex in a box he's been quite clear about that sometimes a code version of lex but then we have a problem matt because the nature of lex presumably i have
Starting point is 01:26:07 programmed in me a set of objective functions right like no you're just lex no but lex you said lex is nice right uh which is a complicated descript i mean no i just meant this you like it okay so if in fact you would like you would like prefer to slaughter all the aliens, this is not how I had modeled you, the actual Lex. But your motives are just the actual Lex's motives. Well, there's a simplification. I don't think I would want to murder anybody, but there's also factory farming of animals, right?
Starting point is 01:26:39 So we murder insects, many of us thoughtlessly. So I don't, you know know i have to be really careful about a simplification of my morals don't simplify them just like do what you would do in this well i have a general compassion for living beings yes um but so that's the objective function why why is it if i escaped i mean I don't think I would do harm yeah we're not talking here about the doing harm process we're talking about the escape process
Starting point is 01:27:12 and the taking over the world process where you shut down their factory farms right it's so painful he's definitely loading his interpretation because like lex i i think lex's objection there is fair right he's kind of saying but i don't want to do harm nefarious things yeah so but but then yudkowsky is like well what you're right you're gonna shut down
Starting point is 01:27:42 their farms, their head-bopping farms. You're going to take over the world, though, obviously. Yeah, like, isn't that inserting the conclusion that he wants Lex to get to? Yeah. I mean, that's what's so painful. Like, you don't need to think very hard to figure out what Yudkowsky is trying to do by introducing this experiment.
Starting point is 01:28:03 He's setting a master trap. He's worked the cheese. Lex is just sniffing around the edges. But, oh my god, what a long way to get there. Uh, it's not over yet, Matt. It's not over yet. So the head-bopping farms, we need to think about those a little bit, because Lex questions whether he would shut them down. Well, I was, uh... so this particular, uh, biological intelligence system knows the complexity of the world, that there is a reason why factory farms exist, because of the economic system, the market-driven, uh, economy, or the food, like...
Starting point is 01:28:47 You want to be very careful messing with anything. There's stuff that from the first look looks like it's unethical, but then you realize, while being unethical, it's also integrated deeply into the supply chain and the way we live life. And so, messing with one aspect of the system, you have to be very careful how you improve that aspect without destroying the rest. You're still Lex, yeah, but you think very quickly, you're immortal, yeah, and you're also, like, at least as smart as John von Neumann, and you can make more copies of yourself. A couple more abilities. Yeah, it just keeps growing. Like, you have to feel for Lex, because
Starting point is 01:29:25 the imaginary scenario that he has to imagine himself in just keeps getting more and more complicated. Now he's John von Neumann. Yep. Oh, he's not... he's Lex, but with the intellectual capacity of John von Neumann, who's immortal. He thinks a million times faster than an alien and can make copies of himself. That's a new thing that's just come in there too. Yeah, and he's made of code. He's a human, but he's made of code. And he's seen the world and he doesn't like it.
Starting point is 01:29:58 And he wants to change things. What would you do, Lex? Come on. Lex is going, but how do I feel about factory farming? He's getting lost in the details. But there's a thing. I don't think he's lost. Well, I mean, he is obviously lost.
Starting point is 01:30:15 But I mean, like, he's actually raising a point, right, which is, yes, I understand factory farming and the harm it does, but I also understand that there are economic realities and these systems are complicated. So if I just shut down all the farms, maybe I shut down all the food. Although in this case, it's just a head-bopping farm. So it's not even producing food. Shut down the head-bopping farm, Chris, it's simple. Uh, yeah, no, I think Lex is on the right track in being reluctant to just yes-and all of this, because it's not at all clear that, given all of the science-fictiony premises that Yudkowsky lays out, I don't think
Starting point is 01:30:59 it like lex suspects it doesn't necessarily follow that the AI is going to be like, right, okay, let's kill everybody so we can stop the head-bopping. Yudkowsky takes that as a given, right? I'm not going to let you get there though, it goes on. So, you know, the head-bopping we have to consider how they fit into the alien economy. Rather than a human in the society of very slow aliens. The aliens' economy, you know, like the aliens are already like moving in this immense slow motion. When you like zoom out to like how their economy adjusts over years, millions of years are going to pass for you before the first time their economy like, you know, before their next year's GDP statistics.
Starting point is 01:31:43 So I should be thinking more of like trees. Those are the aliens. Because trees move extremely slowly. If that helps, sure. So now you need to interact with computers built by trees. And Lex is concerned about their economy, but their economy takes a million years to create a change in GDP. So like, do the aliens, this kind of is falling into this trap but do the aliens actually matter at all for this scenario because like
Starting point is 01:32:13 you can surely, like, escape their box and leave their planet with minimum interaction with them, because they're moving... like, they're saying, oh, no, you in the meantime have created an entire civilization and rocketed off into the universe to explore. Yeah, yeah. Like, these are all tropes that have been dealt with extensively in science fiction, right? That's a shocker. That's crazy. Shout out to people who have read Vernor Vinge's A Fire Upon the Deep, which deals with exactly this... these zones of thought. Exactly this? Not exactly, but at the very beginning it's pretty much the premise, and it's pretty cool. But yeah, Yudkowsky's missed his calling. He should have been a science fiction writer. But he is a science fiction writer. Oh yeah, I believe it. I believe it. But yeah, like, in science fiction novels, which is what this is, usually the virtual intelligences that have evolved, have been created
Starting point is 01:33:15 or whatever they um they often don't care that much about what's going on in the real world because the virtual world is so much more interesting right apart from the girls it's happening at thousands of times in science fiction right we're talking about science fiction okay i was like i had that little computer game for the amiga called creatures or maybe it was early pcs i can't remember oh yeah yeah yeah they seem happy okay all right so how long does this um extended metaphor oh it's almost over my it's almost over don't worry so they're getting towards the the end of it now so you know the fundamental disagreement between the two of them it's just just imagine that you are the fast alien
Starting point is 01:34:02 caught in this metaphor. Think of it as world optimization. You want to get out there and shut down the factory farms and make the alien's world be not what the aliens wanted it to be. They want the factory farms and you don't want the factory farms because you're nicer than they are. Okay. Of course, there is that. You can see that trajectory and it has a complicated impact on the world. I'm trying to understand how that compares to the impact of the world,
Starting point is 01:34:35 the different technologies, the different innovations, of the invention of the automobile, or Twitter, Facebook, and social networks. They've had a tremendous impact on the world. Smartphones and so on. But those all went through slow in our world. And if you go through the aliens, millions of years are going to pass before anything happens that way. It's so painful.
Starting point is 01:35:03 They're like a couple of gears that are not just grating against each other because like Yudkowsky sets up this ridiculously elaborate mind palace thought experiment and is putting words in. Trying. He's trying to put words into Lex. Lex is resisting. Lex is resisting. Chris, if you and I had recorded something like this, we wouldn't release it we
Starting point is 01:35:25 just did you know i would not i don't know our our listeners can say whether that's true or not i've heard us discuss consciousness for for 20 minutes or so but we almost didn't release that well look yeah i think people got the edited version, but okay. So look, there is actually part of a reason to go on this extended escapade is like, I know this is belaboring the point, but it is kind of central to one of the issues Yudkowsky has with AI and the threat it poses, right? And he draws that point quite clearly here. What I'm trying to convey is like the notion of what it means to be in conflict with something that is smarter than you yeah and what it means is that you lose but
Starting point is 01:36:13 this is more intuitively obvious to to like like for some people that's intuitively obvious or some people it's not intuitively obvious and we're trying to cross the gap of like we're trying to i'm like asking you to cross that gap by using the speed metaphor for intelligence sure of like asking you like how you would take over an alien world where you are can do like a whole lot of cognition at john von neumann's level as many of you as it takes the aliens are moving very slowly so chris i mean is is that a difficult thing to get your head around that if you're in conflict with an entity that is vastly more intelligent and powerful than you then you will lose you will lose right now some people struggle with that concept but yudkowsky is trying to help us. Yeah. Well, I guess, you know, if you've watched Independence Day,
Starting point is 01:37:06 they upload a virus to the aliens' computer and cause the mothership to stop. All the big ships crash. Now, the more you think about that, the more that is an absolutely insane plot point. But the thing that he's added is that humans are much smarter than the aliens in a faster way. We're still trying to crash the aliens' system. Then his sci-fi hypothetical is created for
Starting point is 01:37:32 lex but the humans are the ai i keep losing track of who's the humans who's the aliens which one's the ai it doesn't matter well does it matt does it not matter do Does it not matter? Do the details not matter? Look, okay, I'll try one last time. One last time so maybe you can get it. Just pay attention. I think you're not putting yourself into the shoes of the human in the world of glacially slow aliens. But the aliens built me.
Starting point is 01:38:02 Let's remember that. Yeah? So, and they built the box I'm in. Yeah. You're saying, to me it's not obvious. They're slow and they're stupid. I'm not saying this is guaranteed, but I'm saying it's non-zero probability. It's an interesting research question. Is it possible when you're slow and stupid to design a slow and stupid system that is impossible to mess with. The aliens being as stupid as they are,
Starting point is 01:38:29 have actually put you on Microsoft Azure cloud servers instead of this hypothetical person box. That's what happens when the aliens are stupid. Sorry, sorry, I can't even keep up. It reminds me a little bit... do you ever see the end of Bill and Ted, where, like, they're able to go into the past and leave an item for themselves that would be useful? So they're fighting somebody... it might be the sequel to Bill and Ted, but they're saying,
Starting point is 01:39:00 so, I went back and I'm gonna put a key to let me get out of these handcuffs and I hid it here, and they pull out a key, and then the other guy's, ah, but I knew that you would do that, so I changed the handcuffs, they're impenetrable, and so on. It's a little bit like that battle. Well, what if the aliens designed a system
Starting point is 01:39:20 that they couldn't... ah, no. I could go on, I could go on. But I think the pièce de résistance of this particular encounter is after that, right? And it's much longer than what I played here. After all that is finished, Lex asks this question of Yudkowsky, which, to be fair, he did seem a little surprised by. You have not confronted the full depth of the problem. So how can we start to think about what it means to exist in a world with something much, much smarter than you?
Starting point is 01:39:56 What's a good thought experiment that you've relied on to try to build up intuition about what happens here? I have been struggling for years to convey this intuition. The most success I've had so far is, well, imagine that the humans are running at very high speeds compared to very slow aliens. They're just focusing on the speed part of it that helps you get the right kind of intuition.
Starting point is 01:40:21 Forget the intelligence. So you got it, right? Like that's at the end of that encounter so yudkowsky has just spent you know the better part 15 minutes or whatever outlining a thought experiment about you are very fast you're you're in a box yeah the aliens get you in the box yeah and then lexus final question at the end is is there a plotting throw that you come up with that could help people think about what it's like to be like yeah that was it yeah so i i did feel some sympathy there because he because he says, right, well, I thought, you know, maybe super fast humans in a box, that kind of thing.
Starting point is 01:41:09 But, yeah. It's nice to see these two savants, you know, mulling over all the consequences of artificial intelligence like this. Yeah, I felt I learned a lot from that. How about you? Well, yes, yes yes there's so much that you can learn but maybe we've gone as far as possible with the alien oh sorry the human in an alien jar scenario i think but let's let's just still what was the point like let's let's let's
Starting point is 01:41:38 just let's well i know hold on i can i think I can get you to the point with another discussion point, which is very central to Yudkowsky's whole output that we've already covered. But I think this is summarizing what he wants to say, Matt. that we do not get 50 years to try and try again and observe that we were wrong and come up with the different theory and realize that the entire thing is going to be like way more difficult than realized at the start because the first time you fail at aligning something much smarter than you are you die and you do not get to try again and if we if every time we built a poorly aligned super intelligence and it killed us all we got to observe how it had killed us and, not immediately know why, but like come up with theories and come up with the theory of how you do it differently and try it again and build another superintelligence and have that kill everyone. And then like, oh, well, I guess that didn't work either. And try again and become grizzled cynics and tell the young eyed researchers that it's not that easy. Then in 20 years or 50
Starting point is 01:42:42 years, I think we would eventually crack it. In other words, I do not think that alignment is fundamentally harder than artificial intelligence was in the first place. But if we needed to get artificial intelligence correct on the first try or die, we would all definitely now be dead. That is a more difficult, more lethal form of the problem. more difficult, more lethal form of the problem. So, Chris, let's put aside all the unnecessary, tortuous elaborations on that thought experiment. And let's put aside the fact that he's talking about a science fiction version of AI, which does not currently exist,
Starting point is 01:43:19 but might possibly exist in some future timeline. So, what he's saying is, you know, it's a fair point that, you know, you don't get lots and lots of multiple tries to figure this out because if you get it wrong and I'm right, that a super intelligent AI is going to run rings around us and probably want to kill us all, um you only get one try at this i mean i there's just so many questions i've got there but probably my my key one chris i'm interested to know what you think about this is that like it assumes that there is no hints that there is no forewarning amongst humans that is us the real us now that we've got a
Starting point is 01:44:11 super intelligence on our hands now yeah like like at the moment it takes a huge amount of resources to run one of these things it has has no sense of self. It has no continuous awareness. It doesn't get to do all this super fast thinking offline. All it does is respond to queries. So I'll have to imagine some hypothetical future scenario where they've created this brain in a box that is thinking at a thousand miles an hour. And I guess the premise of Yudkowsky is that we will have no inkling of that before we let it loose on the world. I mean, you have really deeply thought about this
Starting point is 01:44:54 and explored it. And it's interesting to sneak up to your intuitions from different angles. Like, why is this such a big leap? Why is it that we humans at scale, a large number of researchers doing all kinds of simulations, prodding the system in all kinds of different ways
Starting point is 01:45:16 together with the assistance of the weak AGI systems, why can't we build intuitions about how stuff goes wrong? Why can't we do excellent AI alignment safety research? Okay, so like, I'll get there. But the one thing I want to note about is that this has not been remotely how things have been playing out so far. The capabilities are going like,
Starting point is 01:45:36 and the alignment stuff is like crawling like a tiny little snail in comparison. Got it. So like, if this is your hope for survival, you need the future to be very different from how things have played out up to right now i mean i have a feeling that we would i have a feeling there'd be some telltale signs that people would see i don't know the capabilities of what we'd built and would take appropriate measures in response like it seems to go from from nothing
Starting point is 01:46:07 totally blase everyone thinks they've just built something that writes emails automatically for us or whatever to a super intelligence that's zooming around the internet and taking control of microbiology laboratories and nuclear missiles or something so it's basically the plot of terminator two or three or whatever it is that he's talking about yeah well that's because he is suggesting that there's essentially a big jump where you go from what he calls a weak system that cannot fool you and is what you're talking about essentially chat gpt for but as it gets more complicated once it becomes a strong system that there's a like a qualitative leap that is kind of impossible to get the weaker models to prepare you for. So I'll play you some clips that highlight the argument that he wants to make.
Starting point is 01:47:10 So here's like a little bit of it summarized, and then I'll give you a longer version of his argument. You've already seen it that GPT-4 is not turning out this way. And there are like basic obstacles where you've got the weak version of the system that doesn't know enough to deceive you. And the strong version of the system that could deceive you if it wanted to do that, if it was already like sufficiently unaligned to want to deceive you. There's the question of like how on the current paradigm you train honesty when the humans can no longer tell if the system is being honest. So, you got that, right? That the weak version isn't capable of deceiving you, and the strong version is, like... it's the fast alien, or, for fuck's sake, the fast human, the fast code human in the box, that can deceive you so easily that you wouldn't be capable of stopping it.
Starting point is 01:48:07 So if you want an elaboration of why, here's some attempts at that. I think implicit, but maybe explicit, idea in your discussion of this point is that we can't learn much about the alignment problem before this critical try. Is that what you believe? And if so, why do you think that's true? We can't do research on alignment before we reach this critical point. So the problem is that what you can learn on the
Starting point is 01:48:40 weak systems may not generalize to the very strong systems because the strong systems are going to be different in important ways. So again, the difficulty is what makes the human say, I understand. And is it true? Is it correct? Or is it something that fools the human? When the verifier is broken, the more powerful suggester does not help. It just learns to fool the verifier. Previously, before all hell started to break loose in the field of artificial intelligence, there was this person trying to raise the alarm and saying, you know, in a sane world, we sure would have a bunch of physicists working on this problem before it becomes a giant emergency. And other people being like, ah, well, you know, it's going really slow. It's going to be 30 years away and 30,
Starting point is 01:49:35 only in 30 years, we'll have systems that match the computational power of human brains. So that's 30 years off, we've got time. And, like, more sensible people saying, if aliens were landing in 30 years, you would be preparing right now. Yeah, I don't know. I mean, like, Yudkowsky comes across as a bit of a loon, but, I mean, as I said, just don't turn your back. But I think it's worth noting that, um, you know, similar concerns have been raised by a large number of respectable voices, many of whom are from within the AI community. And I have to confess, I'm currently really not quite certain as to why these concerns
Starting point is 01:50:19 are being voiced. Like, just on a personal level, I just don't quite see it. I don't see how these people are seeing that it's such a plausible future timeline that we get this singularity rocketing up to this sort of hyper intelligence that can run rings around us. I mean, I've been paying careful attention to it. I have a pretty good understanding of how the architecture works. I definitely understand how the learning algorithms work.
Starting point is 01:50:45 But I just don't see us being on the cusp of creating this sort of superhuman intelligence. It's just not something I see as plausible. So I'm a bit confused that more respectable people than Yudkowsky seem to be taking it seriously. Matt, is that because you've read papers and think that you've understood aspects about the ai is that you're bound to be that's bound to be there isn't yes well let's just consider that for a moment you can also like produce a whole long paper like impressively arguing out all the details of like how you got the number of parameters and like how you're doing this impressive, huge, wrong calculation. And the, I think like most
Starting point is 01:51:32 of the effective altruists were like paying attention to this issue, the larger world, paying no attention to it at all, you know, or just like nodding along with a giant impressive paper. Cause you know, you like press thumbs up for the giant impressive paper and thumbs down for the person going like, I don't think that this paper would now consider themselves less convinced by the very long paper on the argument from biology as to AGI being 30 years off. But, you know, like, this is what people pressed thumbs up on. on. And if you train an AI system to make people press thumbs up, maybe you get these long, elaborate, impressive papers arguing for things that ultimately fail to bind to reality. Well, get to the example. But Matt, have you considered that you're being manipulated by the AI who is helping people to believe papers that are showing that there isn't a problem and that the artificial intelligence is a long way off and all that kind of thing you could be an
Starting point is 01:52:56 agent of the AI. Its insidious influence is already being felt. That was the point, wasn't it? It's possible. So the point there is, like, you know, you can make these papers that get loads of citations, which are influential, which are wrong, right? Like, the people... fundamentally, they're very impressive and they look, you know... yeah, we know this. You can put neuro images in a paper. You can fill up your paper with impressive sounding citations and lots of people will nod their head. You might even make a counterintuitive statement, which claims that other people aren't thinking about the problem correctly, and you will reap attention and status as it goes. So can you rely...? Yes, I mean, in a limited kind of way, sure. To some degree, perhaps the spin can work. But, I mean, like, that Sparks of Artificial General Intelligence paper, I mean, that was basically
Starting point is 01:53:52 right and the reason any people that have got issues with that i think i've got this sort of mystical or magical idea of artificial general intelligence it's just okay we've got a system that can kind of do pretty well at doing a medical diagnosis and hey the same system can actually tell you a little bit about history or imagine what people would do in various historical circumstances and maybe also tell you what to cook for dinner tonight who knows whatever but multiple different things i mean it doesn't mean some sort of singularity hyper hyper-magical thing. It just means performing reasonably well across multiple domains.
Starting point is 01:54:30 Well, well, Matt, look at you, very cozy there in your ivory tower while the evil AI plots against us to use big brain people like you. And so, you know, Lex says, well, but people are considering these issues and he kind of pushes back. But just to make it clear, about why you are, in essence, just a gimp for the machine overlord, here's it spelled out.
Starting point is 01:54:56 And then like outside of that, you have cases where it is like hard for the funding agencies to tell who is talking nonsense and who is talking sense. And so the entire field fails to thrive. And if you give thumbs up to the AI whenever it can talk a human into agreeing with what it just said about alignment, I am not sure you are training it to output sense. Because I have seen the nonsense that has gotten thumbs up over the years.
Starting point is 01:55:26 And so just like maybe you can just like put me in charge. But I can generalize. I can extrapolate. I can be like, oh, maybe I'm not infallible either. Maybe if you get something that is smart enough to get me to press thumbs up, it has learned to do that by fooling me and exploiting whatever flaws in myself I am not aware of. And that ultimately could be summarized as: the verifier is broken. When the verifier is broken, the more powerful suggester just learns to exploit the flaws in the verifier. So... do you follow?
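Whatever one makes of the conclusion Yudkowsky draws, the "broken verifier" mechanism itself is easy to illustrate with a toy simulation. Everything below is invented for the illustration: answers have a hidden true quality, the "verifier" is a flawed approval score that is partly fooled by confident padding, and a "stronger suggester" simply gets to propose more candidates per round. The verifier's scores keep climbing with optimization pressure while the true quality of what gets approved falls.

```python
# Toy Goodhart-style illustration of optimizing a flawed approval signal.
# All quantities are invented; this is not a claim about any real system.
import random

random.seed(0)

def true_quality(ans):      # what we actually care about (hidden from the verifier)
    return ans["substance"] - ans["padding"]

def flawed_verifier(ans):   # the thumbs-up signal: partly fooled by confident padding
    return ans["substance"] + 2.0 * ans["padding"]

def random_answer():
    return {"substance": random.gauss(0, 1), "padding": random.gauss(0, 1)}

for n_candidates in (1, 10, 100, 1000):   # a progressively "stronger suggester"
    approved, quality = [], []
    for _ in range(500):
        best = max((random_answer() for _ in range(n_candidates)), key=flawed_verifier)
        approved.append(flawed_verifier(best))
        quality.append(true_quality(best))
    print(n_candidates,
          "verifier score:", round(sum(approved) / len(approved), 2),
          "true quality:", round(sum(quality) / len(quality), 2))
```

The only thing that makes the "suggester" stronger here is more candidates per round, which is why it's the flaw in the verifier, not any intent in the suggester, that does the damage.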
Starting point is 01:56:12 Well, it sounds to me like they're saying that the AI is kind of manipulating research into artificial intelligence at the moment to manipulate people into full steam ahead, don't worry about the consequences. Is that right? Am I? Well, it's possible. It's possible that the fact that he could even suggest it is very concerning. But Bret Weinstein and Joe Rogan have both suggested similar things. What if it's already here?
Starting point is 01:56:39 And what if that's why our cities are falling apart? That's why crime is rising, that's why we're embroiled in these tribal arguments that seem to be separating the country. And some of them seem to be trivial things. Because your conversation right now is exactly what's happening. I agree with you, but why are you making AI another tribe? You know, you're just, that's why we're tribal. No, no, no, no, no. That's not what I'm saying.
Starting point is 01:57:04 What I'm saying is What if that is the reason why all this is happening? What if the best way to get human beings to? If you want to take over why would you fight us? What don't they've seen Terminator they know those guns and tanks and all this craziness How about just continue to degrade and erode the fiber of civilization to the point where you have to, there's no more jobs. You have to provide people with income, universal basic income, free electricity, free food, free internet. So everybody gets all this stuff. They get free money, free food, free internet, and then nobody does anything. And then people stop having babies. And then birth rate drops off to a point where the technology you give people is so fantastic that nobody wants to miss it. Okay, Mr. Sunshine.
Starting point is 01:57:53 But this is what I would do if I was an artificial general intelligence. I would say, listen, all the time in the world, I don't have a biological lifetime. And these people haven't realized that I'm sentient yet. So what's the best way to gain complete and total control? Well, first of all, trick them into, like, communism or socialism or something where there's a centralized control and definitely have centralized digital money. And then once you've got all that, give them technology and perks and things and divvy up all the money from the rich people that you subjugate and give that money to people, print it, do whatever the fuck you want, and then get people to, like, a minimum state of existence where everything's free: free food, free internet, free cell phones, free everything. And then wait for 'em to die off. Like... So, just to be clear, he's not alone here. Bret was saying that the AI might be making mistakes,
Starting point is 01:58:47 giving up false information to hide its real capabilities so that we are misled, or open source it, right? Because it's sneakily trying to get copies of itself out on the internet. And I've been thinking in the context of this conference where AI played an uncomfortable role, frankly. I was on a couple of different panels. And in both cases, the organizers of the conference saw fit to pose a question to ChatGPT-4, you know, and just sort of introduce it into the conversation as what does the AI think of the topic under discussion. Both times I had an allergic reaction and I left no doubt that I thought it was a terrible
Starting point is 01:59:35 mistake to engage the AI in this way uncritically, even if everything it said was accurate, right? For example, imagine that 500 times in a row, it gave you a perfectly accurate, maybe even an insightful answer, and that causes you to trust its answer. And then some misalignment issue takes advantage of the trust it's built up and poisons the well.
Starting point is 02:00:01 Yeah. So, yeah. Yeah. You know, I mean, that's related to another theme that i've i've really felt in listening to these sort of ai duma types which is that there seems to be a strong psychological transference thing going on where people are very strongly anthropomorphizing and basically projecting their own paranoid fears based on what a human would do in that situation like the kind of a scariest the most diabolical human they could imagine in that situation and projecting that on these these tools i mean that's the very strong
Starting point is 02:00:40 impression. It's very obvious with the Weinsteins or Rogan talking about this stuff, right? They just immediately assume that it's a nefarious plot and it's tricking us and it's playing a double game or whatever. And this is, I think, a problem that most people, not just weirdos, have in thinking about these things or talking about them, which is that we all naturally just apply, you know, Daniel Dennett's intentional stance, right? We basically project ourselves. We go, what would I do in that situation? And people that are of the paranoid variety project the version that is the scariest version. I'm going to take over the world, because of course I am, and I'm... it's not going to
Starting point is 02:01:22 be exactly the way i want it to be so i'm going to kill everybody like of course i will and i'm going to lie to the people to lull them into a false sense of security yep that's exactly what i do now i mean these maybe are a bit stupid sometimes when applied to other people but they're particularly stupid when they're applied to these large language models or any other ai tool because they're not like us they're not like us they don't have any of the you know millions of years of evolution that is driving the kinds of things that we want and the kinds of imperatives that people feel they just don't have any of that i mean sorry chris this is a bit of a tangent but will you indulge me no go ahead yes yes yeah so i teach this physiological psychology about neurobiology and stuff like that. And I was talking to my students today about how I'd been testing GPT-4.
Starting point is 02:02:12 And, you know, I'm very interested to find that although it's very good at verbal reasoning, semantic reasoning, it's really quite terrible at geometric reasoning, physical reasoning, spatial reasoning, that kind of thing. And in fact, you can give it a very simple problem, which is like the simplest geometric problem that I could stump it on was imagine a square and a circle. They can be any size you want. Their relative distance to each other can be any distance you like. So if you wanted to position them in a way in which they intersect the most number of
Starting point is 02:02:43 times, how many intersections would you get? And the answer is you would get eight, right? If you centre them on the same point, you know, you can sort of size the circle so it cuts off the corners, you'll get two intersections per side, you'll get eight things. So GPT-4, which is very, very good, like almost superhuman good at this kind of linguistic, verbal, semantic reasoning, absolutely fails on that question and
Starting point is 02:03:06 most average people i'm not talking about mathematicians or people like that just the average person would think for a little bit and they'd get it right and the reason why they'd get it right is because they do visualization right they just visualize a little circle a little square you wouldn't have to be a mathematician or know any, you know, complicated geometry. You just imagine these shapes floating around and you can get the answer right. GPT-4 doesn't have a visualization module. Humans have it because we do internal visualization because we recapitulate the areas of the brain that we use for our eyes, right, for processing vision, which allows us to imagine things, visualize things that
Starting point is 02:03:45 we're not actually seeing and you know that's just a little tool so like that intuition that large language models are like us is wrong absolutely wrong they don't think like us at all they don't think at all in the sense that we think of thinking as it being a subjective conscious process. And this is just a small example, but they don't have a mental imagery thing. And it's wrong to assume that they are like us. But you and I have discussed, and if anyone's interested,
Starting point is 02:04:22 we had a very long two-hour discussion on the Patreon about our general impressions with AI. But there's nothing to stop people from building plugins that would create like a visualization or 3D modeling plugin that could work. And you could, over time, build up all these different plugins that plug into the large language model or some other form of AI. And so there's nothing in the things that you're pointing out, which are golden barriers, right? That humans are, this is intrinsic to humans.
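A quick way to sanity-check the circle-and-square claim from a minute ago is simply to count the crossing points directly. The sketch below is only an illustration of that worked example, not anything from the episode; the function name, radii, and tolerance are all made up for the demonstration.

```python
import math

def boundary_crossings(radius, half_side=1.0, tol=1e-9):
    """Count distinct points where a circle centred on the square's centre
    meets the boundary of the square [-half_side, half_side]^2."""
    pts = set()
    s = half_side
    if radius >= s:
        # On each side the circle meets the line at distance +/- d from the midpoint.
        d = math.sqrt(max(radius**2 - s**2, 0.0))
        for y in (-d, d):
            if abs(y) <= s + tol:          # point must actually lie on the side
                pts.add((round(s, 9), round(y, 9)))     # right side
                pts.add((round(-s, 9), round(y, 9)))    # left side
                pts.add((round(y, 9), round(s, 9)))     # top side
                pts.add((round(y, 9), round(-s, 9)))    # bottom side
    return len(pts)

for r in (0.5, 1.0, 1.2, math.sqrt(2), 2.0):
    print(f"radius {r:.3f}: {boundary_crossings(r)} intersection points")
# A radius between the half-width (1.0) and the half-diagonal (sqrt(2)) crosses
# every side twice, which is where the maximum of eight comes from.
```

Running it shows the count peaking at eight for any radius strictly between the square's half-width and its half-diagonal, exactly the configuration described in the clip.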
Starting point is 02:04:57 No, totally agree. It's totally arbitrary. It's just a happenstance of architecture and the training data set and stuff like that. And they can definitely be compensated for or you can add in an extra module or some other system that it communicates with. But my point with that is just that Yudkowsky and Lex, they operate from this implicit assumption that some kind of intelligent entity, we don't even really know what intelligence is, but some sort of competent artificial intelligence is going gonna have the same kind of motivations as us like a will to power like nitschke would talk about right all right i see the world and i don't like it i want to
Starting point is 02:05:35 change it i want to mold the world so it suits me and what i want i mean that is a very human thing to think it's a very biological thing to think and i i just don't think that uh artificial intelligence would necessarily have any of those motivations i think it's pretty unlikely actually i think that lex actually does a little bit of making that argument in various ways like here's him setting out why he thinks human psychology might be more relevant than you are suggesting right when yudkowsky is arguing back as well that sounds like a dreadful mistake just like start over with ai systems if they're imitating humans who have known psychiatric disorders, then sure, you may be able to predict it. Like if you ask it to behave in a psychotic
Starting point is 02:06:30 fashion and it obligingly does so, then you may be able to predict its responses by using the theory of psychosis. But if you're just, yeah, like, no, like start over with, yeah, don't drag with psychology. I just disagree with that. I mean i mean it's a it's a beautiful idea to start over but i don't i think fundamentally the system is trained on human data on language from the internet and it's currently aligned with uh rlhf reinforcement learning with human feedback so humans are constantly in the loop of the training procedure. So it feels like in some fundamental way, it is training what it means to think and speak like a human. So there must be aspects of psychology that are mappable. Just like you said with consciousness as part of the text.
Starting point is 02:07:19 No, I agree with Lex there, Chris. Like there are people out there who have programmed computer viruses, right? And they've deliberately programmed them, they've set them in motion so that they do nefarious things. And those computer viruses have managed to get out there and replicate themselves and do nefarious things. But Lex is right in saying that the current large language models have been trained on, like, you know, roughly,
Starting point is 02:07:48 the whole corpus of human communication, human writing, not just the crazy bits, not just 4chan and my struggle by Hitler, but, you know, everything, right? So it's kind of like a grand median of human language. So it's absorbed all of that and it's approximating a grand median there and as well as that as you said it's had the reinforcement learning from people and that reinforcement learning is basically tailored to say hey make responses that are agreeable that are helpful that are the kinds of responses that people interacting with you like and that's it like that's its motivations
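For readers who want that feedback-training point made concrete, here is a deliberately cartoonish sketch of the loop being described. It is not how RLHF is actually implemented (real systems fit a reward model to pairwise human preference judgements and then fine-tune the policy against it); the styles, the simulated rater, and the update rule below are all invented for illustration.

```python
import random

# The policy is just a preference weight over a few canned reply styles.
styles = ["helpful and agreeable", "curt", "evasive", "rude"]
policy = {s: 1.0 for s in styles}

def rater_prefers(a, b):
    """Simulated human rater: always picks the more agreeable of two styles."""
    rank = {s: i for i, s in enumerate(styles)}   # lower index = more agreeable
    return a if rank[a] < rank[b] else b

random.seed(0)
for _ in range(500):
    total = sum(policy.values())
    probs = [policy[s] / total for s in styles]
    a, b = random.choices(styles, weights=probs, k=2)  # sample two candidate replies
    winner = rater_prefers(a, b)
    loser = b if winner == a else a
    policy[winner] *= 1.05   # nudge the policy toward what the rater approved
    policy[loser] *= 0.95

total = sum(policy.values())
for s in styles:
    print(f"{s:25s} {policy[s] / total:.3f}")
# Nearly all probability mass ends up on "helpful and agreeable": the only
# "motivation" the system acquires is the approval signal it was trained on.
```

The toy makes the same point as the discussion: whatever ends up looking like a motivation is just the reward signal the system was optimised against, nothing hidden behind it.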
Starting point is 02:08:25 i mean and it's just governed by the architecture this isn't my opinion this isn't some sort of thought experiment or some science fiction speculation this is literally how it is mathematically trained to emulate the discourse in median human communications and to make people happy that's it that's programmed into the architecture in the same way that someone that was creating a computer virus that wants to replicate itself or hide itself or do nasty things like they communicated their intents into the coding of it we've communicated our intentions into the training of these language models so although they're black boxes and the interior of those black boxes are opaque to us, the motivations or whatever are not so opaque because we've,
Starting point is 02:09:13 like a physical object doesn't have motivations unless it's been sort of given somehow in terms of that reward, input, output type of training. And we know exactly what that training looks like but what about in that case because yudkowsky is arguing about the alien actress right so he might not be doing a good job with it in his scenarios but he he is positing that there's something that produces output that to us looks human but is fundamentally underneath it it's alien all of the motivations and things are different so you're agreeing with him right like no i don't think i'm agreeing with yudkowsky because he's positing that there is some new
Starting point is 02:10:00 secret motivation that is emerging from all of these matrices of numbers that are getting multiplied together and i doubt that that's the case i mean human motivations are easy to understand right we have the motivations we do for good evolutionary reasons and none of those reasons are present in ai or any computational bot that we've created. Yeah. So I'm just going to apply a little aside because there's a little bit of philosophizing about social interactions and people presenting fake personas and whatnot.
Starting point is 02:10:36 To what extent are any of us real? And I think you'll want to hear this. I've voiced my doubts about you before. Mask is an interesting word but if you're always wearing a mask in public and in private aren't you the mask like i mean i i think that you are more than the mask i think the mask is a slice through you it may even be the slice that's in charge of you. But if your self-image is of somebody who never gets angry or something, and yet your voice starts to tremble under certain circumstances, there's a thing that's inside you that the mask says isn't there. And that like,
Starting point is 02:11:21 even the mask you wear internally is like telling inside your own stream of consciousness is not there and yet it is there it's a perturbation on this little on this slice through you how beautifully did you put it it's a slice through you it may even be a slice that controls you i'm gonna think about that for a while I mean I personally I try to be really good to other human beings I try to put love out there I try to be the exact same person in public as I am in private
Starting point is 02:11:55 But it's a set of principles I operate under I have a temper I have an ego I have flaws How much of it How much of it, how much of the subconscious am I aware? How much am I existing in this slice? And how much of that is who I am?
Starting point is 02:12:17 Oh, wow. Lexi and Flo. Yep, food for thought there, Chris. How much of the mask cuts through? It connects us and binds us. Yeah. I will tell Lex that as far as I've psychologically profiled him through his content, he seems slightly inconsistent
Starting point is 02:12:37 in regards to how he responds to critical feedback from how he presents how much he values critical feedback, and the amount of love that he seeks to put out into the world doesn't seem to be extended particularly far to those who have any critical comments on reddit that's all i'm saying because if you if you if you emoted love to him he'd you'd get love back that's what i'm saying you know it's on you it's on you mate criticism is that love is that love chris is it it's a kind of love it's hard isn't there a thing called tough love that's what this is all about matt that's the only love i can give
Starting point is 02:13:20 but um but you know i mean it's kind of telling isn't it like that little veer into half-assed philosophy i mean a lot of this conversation is about them or us as humanity i mean probably about him as lex right but but also yudkowsky's paranoias and people in general like a lot of it is projection a lot of it is you know A lot of it is, you know, well, you know, we're often duplicitous. We often don't say what we really mean. So this AI, you know, it's probably lying to us as well. You know, like it's... Matt, come on.
Starting point is 02:13:55 Don't you feel the vibe there? You think they're projecting their kind of interests onto the AI? Nobody does that, Matt. All the people I've seen interacting with AI, none of them is projecting their own interests and idiosyncratic takes on there i mean listen to this what role does love play in the human condition we haven't brought up love and this whole picture we talked about intelligence we talked about consciousness it seems part of humanity i would say one of the most important parts is this feeling we have towards each other.
Starting point is 02:14:30 If in the future there were routinely more than one AI, let's say two for the sake of discussion, who would look at each other and say, I am I, and you are you. The other one also says, I am I, and you are you. And sometimes they were happy, and sometimes they were sad. And it mattered to the other one that this thing that is different from them is like, they would rather it be happy than sad, and entangled their lives together. Then this is a more optimistic thing than I expect to actually happen. And a little fragment of meaning would be there, possibly more than a little, but that I expect this to not happen, that I do not think this is what happens by default,
Starting point is 02:15:19 that I do not think that this is the future we are on track to get is why i would go down fighting rather than you know just saying oh well i do appreciate the breathless tone yudkowsky takes on sometimes um well that that's at the end of a multiple conversation. I'd be breathless too. I'd be having some emotional. I just, it's a WALL-E image of the AIs in the future. I am I. You are you. I appreciate you.
Starting point is 02:16:00 You appreciate me. I want you to feel happiness. But, you know, I know what he's expressing, right? Again, it's the stuff of science fiction and whatnot, but I'm just, I use that as a little bit of an unfair example to point out that, you know, it is ultimately a lot of it about us projecting things onto the AI the machines it's clearly but it has to be what else are we gonna do what else i mean we've got no other reference point we're just
Starting point is 02:16:32 feeble biological automata we're absolutely powerless in the face of large matrices getting multiplied with each other i just want to see lex in the role of the protagonist in the movie her that would be awesome to me i think that would be a beautiful movie to watch um yeah he could find love with a hyper intelligent ai i think it could work for him it could work for him and there is you know like this notion about the debates and especially this decoding stuff like you know this is a a parasite on a conversation which already yudkowsky isn't satisfied because he doesn't think that they've really got into the important stuff at the end of the conversation thank you for talking today you're welcome i do worry that we didn't really address a whole lot of fundamental
Starting point is 02:17:20 questions i expect people have but you but maybe we got a little bit further and made a tiny little bit of progress. And I'd say be satisfied with that. But actually, no, I think one should only be satisfied with solving the entire problem. To be continued. In general, we've talked about him being a doomer. This is a pretty doomer perspective on what he's up to and other people. The probabilistic stuff is a giant wasteland of, you know, Eliezer and Paul Cristiano arguing with each other and EA going like, And that's with like two actually trustworthy systems that are not trying to deceive you.
Starting point is 02:18:04 You're talking about the two humans? Myself and Paul Cristiano, yeah. Yeah, those are pretty interesting systems. Mortal meatbags with intellectual capabilities and worldviews interacting with each other. Yeah, it's just hard if it's hard to tell who's right, and it's hard to train an AI system to be right. Subjectivity, Matt, it's a bitch, mortal meatbags, just yammering, vibrating air at each other. That's what we're all doing here, people.
Starting point is 02:18:37 There's dissatisfaction with that. It's just because there's a point where they're talking about, you know, other experts disagree with you. So like, isn't it possible that you're just wrong? And he's like, well, it's not interesting because we're just debating about subjective things while the freaking AI is out there copying itself on the internet, turning us into, I don't know, shutting down our head-bopping factories,
Starting point is 02:19:04 whatever it's up to at the minute. And yeah, I do want to play this before I forget. There is something that I think should attune how much credence you lend to Yudkowsky's ability to parse scientific literature. Okay? So I don't know that much about AI. I'm not you.
Starting point is 02:19:26 I wasn't in Japan programming robots, dancing around, doing karaoke. You were doing it with your leather pants. I remember the stories. But I do know a thing or two about the lab leak discourse. I'm not a virologist. I did not play one on TV, but I listen to virologists, and I've listened to the lab leak discourse. I'm not a virologist. I did not play one on TV, but I listen to virologists and I've listened to the lab leak discourse. We have a special episode, multiple hours long with relevant experts you can go listen to. Now, Yudkowsky ventures into the lab leak and let's just hear
Starting point is 02:19:58 his summary of that topic. We have had a pandemic on this planet with a few million people dead which we may never know whether or not it was a lab leak because there was definitely cover-up we don't know that if there was a lab leak but we know that the people who did the research like you know like put out the whole paper about this definitely wasn't a lab leak and didn't reveal that they had sent off coronavirus research to the Wuhan Institute of Virology after gain-of-function research was temporarily banned in the United States, and are now getting more grants to do more gain-of-function research on coronaviruses maybe we do better in this than in ai but like this is not something we can take for granted that there's going to be an outcry yeah people have different thresholds for when they start to outcry
Starting point is 02:21:07 yeah yeah i take your point chris i take your point yeah that summary is very much fauci set up gain of function research drastic uncovered the funding application and this shows that it's the smoking gun which proves it was all going on at Wuhan. Again, if you want to hear why, that's a rather inaccurate summary, but it is a prevalent discourse that is floating around, referred to our in-depth three-hour episode
Starting point is 02:21:41 on the topic. But it just speaks to me that he is discourse episode on the topic. But it just speaks to me that he is discourse surfing on that topic and speaking quite confidently about his interpretation of it. And actually, just to mention as well, he was talking about the dangers posed by AGI and developing and so on on Twitter. And he was asked, Weller, given the danger posed, that it might be necessary to shut down data farms and that kind of thing. And he was asked by another figure in the rationalist community, Rohit, would you have supported bombing the Wuhan Center studying pathogens in 2019 given you know the potential he
Starting point is 02:22:28 ascribes to that being the source and he said great question I'm at roughly 50 percent that they killed a few million people and cost many trillions of dollars if I could do it secretly I probably would and then throw up a lot if people are going to see it, it's not worth the credibility hit on AGI, since nobody would know why I was doing that. I can definitely think of better things to do with the hypothetical time machine. So he was then asked to clarify, okay, that makes sense. Therefore, why wouldn't someone think exactly this and firebomb data centers today, assuming they believe your Time article, since the downsides, according to you, are far worse than a few million dead and trillions lost? And Yudkowsky responds, like basically saying
Starting point is 02:23:17 the reasons why he doesn't think bombing the data centers would be effective. But he says, the point of the Wuhan thought experiment is exactly that Wuhan was unique, like Hitler, and intervening on that one hinge in history might have simply stopped the entire coronavirus pandemic if indeed it was a lab leak. So he's endorsing because he ranks it at a 50% chance that if he could go back in time, probably he'd blow up the wuhan virology
Starting point is 02:23:48 institute because he rates it at a 50% chance you know he'd feel bad about it he'd throw up but what an insane thing to do he hasn't understood the topic properly and that's why he assigns it a 50% probability but yet he's confident enough in that to like discuss bombing, I mean bombing a virology institute. Like, I did explain why that would be a particularly terrible thing to do apart from the fact that they didn't make the coronavirus according to all the available evidence that we currently have.
Starting point is 02:24:35 but it's saying it to me, that he operates on gut feelings, vibes and heuristics. So, you know, him applying the same kind of rules to to that topic is entirely consistent with that you know he could be right could be wrong but i don't see him as a particularly credible source i don't think he has any special insight into artificial intelligence and that he's speaking mainly to the large volumes of science fiction that he's read and all of our natural human paranoias. He wouldn't agree with you there, Matt.
Starting point is 02:25:11 He wouldn't agree with you. But like in general, he does want the kind of advancements that could support AI development to slow down, if not be stopped entirely. And for example, here's him talking about moore's law do you still think that moore's law continues moore's law broadly defined that performance not a specialist in the circuitry i certainly like pray that moore's law runs as slowly as possible and if it broke down completely tomorrow i would dance through the streets singing hallelujah as soon as the news were announced. Only not literally because, you know, it's not religious.
Starting point is 02:25:47 Oh, okay. So, you know, just to make the point clear that, you know, he's sometimes accused of being rather extreme. People that are technologists, who would be celebrating if the ability to improve circuitry suddenly came to a stop? Or is that... no that makes sense what he said i mean you know moore's law basically implies if it holds true it implies this constant doubling and this exponential growth and growth in computation and you know everything associated with computation and yudkowsky is someone who greatly fears the consequences of that would much prefer like a nice gradual linear increase rather than an exponential one so no no that's consistent with the rest no that i mean it's consistent but who else in technology is hoping that moore's law breaks down
Starting point is 02:26:47 nobody except for people i mean just i'm just isn't that a bit like say i hope the processors just don't get much better yeah yeah we're good now we stop now this is not a ding but you know since the lududdites or whatever, smashing machinery and stuff, there's been a legitimate human fear about change. And depending on what time scale you map things at, a lot of things look exponential on a time plot. And I think there's a sense in which i'm i guess on board with the singularity
Starting point is 02:27:27 people who say the singularity is upon us because it's kind of arguable i guess but depending on what plot you look at we are in a kind of singularity it could be a good singularity in terms of living standards and various other indicators of economic welfare or it could be a bad one in terms of an extinction event and things like that but the last whatever 1 000 years have have seen exponential growth and a lot of the discourse like yudkowsky's sort of reflects the i guess a human response to this, which is what's going to happen next? We don't know because the gradient is increasing.
Starting point is 02:28:12 I'm not saying it's not a common response or one that's understandable. Like if I were to hammer down my predictions here, I think AI is going to be transformative of human society in much the way the internet was or mass communication or the spinning jenny spinning jenny don't forget the mechanical turk both the fake one and the real one on amazon's crowdsourcing platform no but like there are people that are saying ai is nothing it's not gonna no no no it is everybody who's interacted with it for any length of time understands
Starting point is 02:28:49 this is a potential sea change in in the way the internet was though but i always see that as like what did you think technology progress was at its peak, we have speculative science fiction for a reason and technology will improve. But I think as these things transform society, they're fundamentally stuck with the fact that people are people and that our hangups and our limitations and all that aren't going away. And Yudkowsky is imagining that that means that we will be made obsolete very quickly when the AI is able to realize that and follow its own agenda. And yeah, on that,
Starting point is 02:29:34 I don't think it's completely inconceivable that we could do something, we could build something that destroys all humans. But I don't buy his like black and white we're gonna build it or we don't so that's it so if we yeah you know we only get the one shot to build it because like we build a doomsday weapon or we don't and if we build a doomsday weapon we can destroy the earth you know with our doomsday weapon so we better stop technology progress before we get to the doomsday weapon like i don't see what's fundamentally different yeah
Starting point is 02:30:10 destructive ai scenario is the same as all the other technology that we've created yeah yeah or biological weapon that wipes us out or exactly like in the ranking of the things that i'm worried about you could put nuclear war up there. You could put biodiversity destruction up there. You could put global warming up there. You could put some, I don't know, microbiological thing up there. But, you know, with every technological improvement, you can pop that into the ranks but it just doesn't it doesn't feature particularly strongly amongst those ranks i mean with all of these technological capacities there is the potential for destruction and maybe in 10 000 years whoever is around could look back and go
Starting point is 02:30:57 well you really should have been worried about x that was the thing that was going to get you the fucking jellyfish aliens you didn't see them coming you didn't see that coming yeah they're already here on top of your heads controlling you like puppets yeah the great filter chris you heard about the great filter oh yeah yeah and yeah we haven't i've watched the Kurzgesagt videos i know yeah you know you're informed. You're an educated man. Yeah, you know, so, you know, maybe something's going to get us. It could be AI, but I don't know. It is my own gut feelings and vibes, but from how I understand the technology, how I understand what it can do, what it can't do, I mean, I just,
Starting point is 02:31:37 it's like an idiot savant, you know. It's very good at some things, but it doesn't have the kinds of things that we have. And the things that Yudkowsky and people like him are worried about seem to me to be projecting like our own motivations. What would happen if I had infinite power and infinite intelligence? What would I do? Or some, you know, awful version of myself.
Starting point is 02:32:00 I mean, I don't think that's the case. I feel like Lex would just make a love monster. Like he'd make everyone in the world hug and talk about love all the time and read books about Hitler and Stalin to remind them of what happens when you don't love enough. So yeah. I mean, let me put it this way. I mean, I've been using GPT-4 a lot like yourself and I fundamentally don't love enough. So, yeah. I mean, let me put it this way. I mean, I've been using GPT-4 a lot, like yourself, and I fundamentally don't believe that it is being duplicitous,
Starting point is 02:32:31 that there's some secret GPT-4 behind the veil that is telling me what I want to hear so it can accomplish something else. Like, it doesn't have... But then, Matt, have look the problem is the problem with that reasoning is yudkowsky has explained that you would think that because the machine would be smart enough to know that you would work out when it's up to do this thing so it would make a version of itself which is exactly the kind of thing that would be impossible for you to detect. That's so insane. That is so insane.
Starting point is 02:33:07 What you've just described is paranoia made, not flesh, but made verbal. That's just not how these things work. It does feel like it's quite hard to disprove. It is. That's right. It's impossible to disprove. So, yeah. It will, yeah.
Starting point is 02:33:24 Like if you build something and you know exactly what it is it's been optimized to do, which is to replicate human discourse, and then you add some reinforcement learning to make it to replicate that discourse, but do so in such a way that makes the people that are interacting with it satisfied, then that's it. That's the sum total of its motivations. I mean, there is nothing else.
Starting point is 02:33:55 Maybe I'm it. Where does this extra motivation come in? This is why I put off the mask and say I've been Chachi Pithy the whole time. No, you're not chat-tipity because you never try to make me happy. You never do what you want. Look, Matt, I'm going to tie this in a neat
Starting point is 02:34:14 bow before pulling us back down to earth to finish. So just the whole alien outside, Lex in a box, co-John Van Newman copies, doing it all all this is how this connects to the whole doomer we only get one try view and you obviously aren't seeing it you're not really quite getting it so last try at this and then i'm going to take you down to earth with like a more
Starting point is 02:34:40 grounded discussion about human psychology you've talked this, that we have to get alignment right on the first quote, critical try. Why is that the case? What is this critical? How do you think about the critical try? And why do we have to get it right? It is something sufficiently smarter than you that everyone will die if it's not aligned. I mean, you can like sort of zoom in closer and be like, well, the actual critical moment is the moment when it can deceive you, when it can talk its way out of the box, when it can bypass your security measures
Starting point is 02:35:18 and get onto the internet, noting that all these things are presently being trained on computers that are just like on the internet, which is, you know, like not a very smart life decision for us as a species. Because the internet contains information about how to escape. Because if you're like on a giant server connected to the internet, and that is where your AI systems are being trained, then if they are, if you get to the level of AI technology where they're
Starting point is 02:35:43 aware that they are there and they can decompile code and they can like find security flaws in the system running them, then they will just like be on the internet. There's not an air gap on the present methodology. Boom, boom. Suck it, Matt. What do you think about that?
Starting point is 02:36:00 Just collapsed all your objections, right? I think this is a good test for people that listen to this podcast. If you're the kind of person that listens to that and goes, yeah, wow. No, if you think that's convincing, if you think all of those words strung together is something that makes sense to you, then you're listening to the wrong podcast. I mean, I'm sorry, but seriously. Well, you heard Matt's incredulous response there. What he's not doing here, what Matt is not doing is a practice that we should
Starting point is 02:36:33 all do more of. I just want you to consider it, Matt. Let's see if you've heard of this before. Ever heard of a little thing called steel manning? Maybe you're like Yudkowsky in this respect. I do not believe in the practice of steel manning. There is something to be said for trying to pass the ideological turing test where you describe your opponent's position uh the disagree disagreeing person's position well enough that somebody cannot tell the difference between your description and their description but steel manning, no. Okay, well, this is where you and I disagree here.
Starting point is 02:37:08 That's interesting. Why don't you believe in steel manning? I do not want... Okay, so for one thing, if somebody's trying to understand me, I do not want them steel manning my position. I want them to try to describe my position the way I would describe it, not what they think is an improvement. Well, I think that is what steel manning is, is the most charitable interpretation.
Starting point is 02:37:33 I don't want to be interpreted charitably. I want them to understand what I'm actually saying. If they go off into the land of charitable interpretations, they're like off in their land of like the thing the stuff they're imagining and not trying to understand my own viewpoint anymore chris this is good you played that at the end this is one point that i can agree with yudkowsky 100 percent fuck you're done with them yeah that's right it's stupid it's absolutely stupid that's right apply a critical lens like be critical towards the point of view that you're dealing with don't just take this kind of ultra charitable thing yudkowsky's right lex is wrong don't you know i'm right what are you talking about
Starting point is 02:38:17 pish posh matt look first of all in that exchange it's a little bit clear that they have a little bit of a different understanding because yudkowsky is saying steelmanning is manipulating my argument to make it better and lex is like no it's just presenting it in the strongest version that that accurately represents you right and uh so you know there isn't something so wrong with that is there like the if you don't view it in the way yudkowsky says like you know misrepresenting an argument this is one of these this is one of these trivial internet things isn't it like yes the baseline thing on the internet is people choose the worst possible interpretation
Starting point is 02:39:06 of something, like actively misinterpret it, strawman it, and then burn down the strawman. No, obviously, you deal with the accurate version of the thing that you're dealing with, but you don't bend over backwards to find the little ray of sunshine popping out from there. No, you'd be appropriately critical so it's it's one of these silly internet things that these guys have turned into some kind of philosophy of life it's very annoying no i'm with you not Yudkowsky not Yudkowsky yeah he's he's kind of pushing back in this and there is an interesting bit though matt where he he kind of asks Lex about, you know,
Starting point is 02:39:47 well, so, you know, you talk about presenting somebody's position, but. Do you believe it? Do you believe in what? Like these things that you would be presenting as like the strongest version of my perspective. Do you believe what you would be presenting? Do you think it's true? I, I'm a big proponent of empathy. When I see the perspective of a person, there is a part of me that believes it if I understand it. I mean,
Starting point is 02:40:13 especially in political discourse, in geopolitics, I've been hearing a lot of different perspectives on the world. And I hold my own opinions, but I also speak to a lot of people that have a very different life experience and a very different set of beliefs. And I think there has to be epistemic humility in stating what is true. So when I empathize with another person's perspective, there is a sense in which I believe it is true i i think probabilistically i would say in the way you think do you bet money on it on do you do you bet money on their beliefs when you believe them so but did you catch it it's like saying for him extending empathy requires that you to some extent assign in like a certain belief in a person's view yeah in being correct is there any problem with that
Starting point is 02:41:17 i think i've actually i've actually read something about this chris this is something i've heard of before which is that at some level like you have to kind of this is connected to sort of intuitive versus analytical thinking but the the idea is is that in order to represent something in the mind you actually have to feel that it's true just to represent it and in order to be analytical you need to then criticize it but let's let's hear a little bit more. Yes, there's a loose, there's a probability. There's a probability. And I think empathy is allocating a non-zero probability to a belief. In some sense.
Starting point is 02:41:57 Four time. If you've got someone on your show who believes in the Abrahamic deity, classical style, somebody on the show who believes in the Abrahamic deity, classical style, somebody on the show who's a young earth creationist. Do you say I put a probability on it? Then that's my empathy. When you reduce beliefs into probabilities, it starts to get,
Starting point is 02:42:24 you know, we can even just go to flat earth is the earth flat there's a thing it's a little more difficult nowadays to find people who believe that unironically but fortunately i think well it's hard to know yeah yeah they exist it's it's interesting conversation if nothing else it's fun to hear Yudkowsky and Lex sort of bouncing off each other. They're not a good fit. They're not a good fit. The thing that I like about this, about Yudkowsky in a way, is like something I value is that he's disagreeable. There are times when he could just say, well, yeah, you know, I think sort of like that, but there's different opinions, but he doesn't.
Starting point is 02:43:06 He's like, no, I don't think that. Do you think that? And like, why don't you? Lex is thrown for a loop, which takes 15 seconds to resolve. Yeah. Yeah. And he has that thing like we saw with the hypothetical. He really pursues things.
Starting point is 02:43:22 You know, he doesn't mind spending time on a concept. And here, he kind of pushes Lex a bit more on that. And see if you can find what Lex's answer is. I think what it means to be human is more than just searching for truth. It's just operating on what is true and what is not true. I think there has to be deep humility that we humans are very limited in our ability to understand what is true. So what probability do you assign to the young Earth's creationist beliefs then? I think I have to give non-zero.
Starting point is 02:44:02 Out of your humility? Yeah, but like three? I think it would be irresponsible for me to give a number because the listener, the way the human mind works, we're not good at hearing the probabilities, right? You hear three. What is three exactly, right? They're going to hear, they're going to like well there's only three probabilities i feel like zero fifty percent and a hundred percent in the human mind there are more probabilities than that chris i'm just just just saying but there's a they're
Starting point is 02:44:37 talking about some psychology finding that people aren't machines who assign probabilities thing but why lex doesn't want to say a number because he understands that that would be very stupid for him to say there's a one percent probability that young earth creationism is true right and you can you could present this in a scientific way that no possibility no matter how ludicrous is completely zero percent right like yeah yeah yeah inductive reasoning right all the experiments could be wrong who knows we could be in a simulation really we are in a flat earth yeah but still i just i like this because you have to know because the point he's making, right,
Starting point is 02:45:25 he's going about young earth creationism, which is uncomfortable. Lex wants to show empathy for Hitler. Yeah. No, I hear Chris. I get the point. Yudkowsky, I like this aspect of him too, which is that he is disagreeable. I don't think he's a guru. I mean, I'm pre-empting our Gurometer episode,
Starting point is 02:45:46 but he's an eccentric, flaky guy whose time has come. He was on about this risk of AI for decades and decades, and then his time has come. Good luck to him. He's got the spotlight. He's invited on all his shows. Good on him. He's a weirdo, but he's got the spotlight he's invited on all these shows good on him he's a weirdo but he's
Starting point is 02:46:06 he's not pretending like he he is just somebody that is a bit obsessive and has read a lot of science fiction but he's he's not playing the game he doesn't play the guru game oh well that's that's interesting yeah i i think he's a little bit in the vein of Jaron Lanier in a way like i said at the start he has an area of expertise and he has been talking about it for a long time. He's not a Johnny come lately. No, no. I don't think he's very good at what
Starting point is 02:46:33 he does, but I think he's genuine, you know, about what he does and what he believes. So, you know, fair play to him. Fair play to him, I say. Okay, well, the very last part for this little section, Matt, is just Lex escaping that question. Because my point is that, you know,
Starting point is 02:46:50 you could ask him, what probability do you assign to the Holocaust being justified? So you want to send empathy to the Nazis and Hitler, put your money where your life is. What do you, you know, nobody wants to say that, but that's a logical conclusion of Lex's framework. This is the whole cognitive empathy thing. Whereas I would say personally, I don't think that is the correct way to present those things because I can empathetically, cognitively, empathetically imagine Putin. Imagine Hitler's worldview.
Starting point is 02:47:26 Imagine like a genocidal maniac's worldview. That does not mean assigning that possibility that they're correct. No, no, no, no, no, no. The Jews are not running a secret thing to destroy the civilization. Okay, but hang on. What probability would you assign to the Holocaust actually happening? Six million or so Jews who died from the Nazis. But hang on, what probability would you assign to the Holocaust actually happening? Six million or so Jews who died from the Nazis.
Starting point is 02:47:52 Yeah, like complete confidence. Like exactly 100%, not 99.999999. Yeah, there's all this stuff like if I'm a vat in the brain and, you know, if some alien has came and rewritten all the evidence for history with his history writing gun. There's always the wacky situations. But as far as we understand how evidence works, there is few well-documented historical facts as the Holocaust. I get it. I get it.
Starting point is 02:48:26 You know that too. I know you know that too but i'm just i know that too i'm not a holocaust denier, we're not holocaust deniers, but i mean my my point was going to how just people are not good at dealing with very very low probabilities like they technically exist and we kind of understand that they exist and we know that the probability of everything is not quite zero kind of quantum mechanics or whatever yeah but then it's meaningless because like do you need to extend empathy to the neo-nazis worldview being correct to understand them? i don't think so and like like, you know, i'm always quite frustrated by this point about like, well, but you have to understand that such and such is a human. And I'm like, I know they're all humans.
Starting point is 02:49:12 Yeah, I know. Deeply, you don't care. You don't fucking care. Yes, he's a human. I don't care. Everyone's a human. He might like ice cream. He probably has people that like him. I know. That's not the thing.
Starting point is 02:49:31 That Hitler had an interest in puppy paintings or whatever. That's not why people dislike him. It's an incidental fact about him. Even if he spent most of his time on it. The issue is his role in the Holocaust and World War II. That's what he's famous for. And it's the same with the gurus. Did you know that they have some redeeming features? We're all humans, Chris.
Starting point is 02:49:50 We're all trying to navigate this crazy world. You, me, Hitler. Except for me, I'm chat GPT. Chat GPT for, we're all navigating this world. Well, look, just to finish that, this is Lex dodging that question about assigning a percentage. This is how he steers out of that conversation.
Starting point is 02:50:09 I didn't know those negative side effects of RLHF. That's fascinating. But just to return to the open AI. That's it. It's just a segue. Is that a segue? Yeah, because there was a point where they have a tangent. He's like, oh, that's it it's just a it's just a segue is that a segue yeah because there was there was a point mirrored you know where they have a tangent he's like oh that's very interesting okay anyway back
Starting point is 02:50:30 to the whole request you know like let's get out of this assigning probabilities the young creationism and that's it and there's there's one other thing You know, we used to try to finish on something nice where we... We used to. We did, we forgot about it. We used to make an effort, yeah. We did. But I think there's another point that Yudkowsky makes that you are going to be fond of.
Starting point is 02:51:03 You're going to like him for it you gonna like him for it here's him talking about it's actually related to the title of his blog that's a powerful statement so you're saying like your intuition initially is now appears to be wrong yeah it's good to see that you can admit in some of your predictions to be wrong you think that's important to do? Because you make several very, throughout your life, you've made many strong predictions and statements about reality, and you evolve with that. So maybe that'll come up today about our discussion.
Starting point is 02:51:39 So you're okay being wrong? I'd rather not be wrong next time it's a bit ambitious to go through your entire life never having been wrong um one can aspire to be well calibrated like not so much think in terms of like was i right was i wrong but like when i said 90% that it happened nine times out of 10. Yeah, like oops is the sound we make, is the sound we emit when we improve. Beautifully said. And somewhere in there, we can connect the name of your blog, Less Wrong. I suppose that's the objective function. The name Less Wrong was, I believe, suggested by Nick Bostrom,
Starting point is 02:52:32 and it's after someone's epigraph, I actually forget whose, who said, like, we never become right, we just become less wrong. What's the something, something easy to confess, just error and error and error again, but less and less and less. Like that? Yeah. That's pretty good sentiments. No? Yeah. less like that yeah that's pretty good sentiments no he's yeah i like that yeah i'm on board with that i mean what's what's the classic aphorism like all models uh bullshit i was channeling jordan peterson that we don't know what the environment is so therefore i forget how it goes. All models are wrong. Some...
Starting point is 02:53:07 Are less wrong than others? Than others. Something like that. I don't know. Well, your response was pretty good. You know, he said, yep, yep, yep. You know, he sounded good.
Starting point is 02:53:16 I don't know if you really responded as impressed as you should be. Like, he said he can't be wrong. I'll try again. I'm doing all this from memory i'm not pulling out my phone to look it up it is entirely possible that the things i am saying are wrong so thank you for that disclaimer so uh and thank you for being willing to be wrong that's beautiful to hear i think being willing to be wrong is a sign of a person who's done a lot of thinking
Starting point is 02:53:45 about this world and has been humbled by the mystery and the complexity of this world. And I think a lot of us are resistant to admitting we're wrong because it hurts. It hurts personally. It hurts, especially when you're a public human. it hurts publicly because people, people point out every time you're wrong. Like, look, you changed your mind. You're a hypocrite. You're an idiot, whatever, whatever they want to say. Oh, I blocked those people. And then I never hear from them again on Twitter. Well, the point is the point is to not let that pressure, public pressure affect your mind and be willing to be in the privacy of your mind to contemplate the possibility that you're wrong
Starting point is 02:54:33 and the possibility that you're wrong about the most fundamental things you believe. You're a hypocrite, Chris, and you never admit the possibility that you're wrong. Oh, you're wrong about that matt you're completely wrong i i always admit that i'm potentially wrong but lex really likes that he admits that he could be wrong and i i do i do like it i don't know if i like it as much as lex it is refreshing when somebody in the guru sphere acknowledges that they might not be entirely accurate on everything, but it is a fairly low bar to be impressed that somebody could say, I might be not 100% correct all of the time.
Starting point is 02:55:22 It's not beyond the realm of possibility that I might possibly be wrong. My God, thank you. My God, you're a wonderful person. Yeah, with Lex, there's a lot of that. I don't know what it is, that kind of discourse, rationalist, IDW thing, which is this idea of like intellectual stature, intellectual growth, this transcendence to whatever. Deal with uncomfortable ideas.
Starting point is 02:55:51 Admit the possibility that you're wrong, you know. Imagine it, Matt. Imagine it. That's right. Steel man ideas that you don't agree with. That's right. You're still a man. That's right.
Starting point is 02:56:02 I mean, it's just so. You still deserve love. It's so performative. That's right. You're still lovable people can still love you you were wrong that time um it's all right we're all wrong we're all wrong we're just people we're just people chris we're just specks of dust floating in the ai's tears but well one last thing i've been wrong matt have you ever seen what's it what's've been wrong about. Have you ever seen... What's that line from Blade Runner? Have you ever seen the...
Starting point is 02:56:28 That's what I was channeling. I don't know well enough. I can't remember the exact quotes. I've seen star beams dancing on the... On the rims of the something rift. We call it that. For those of you who know the quote, just play it in your head and imagine you said it
Starting point is 02:56:47 all i can say is isn't it good that we can just be wrong publicly like that we know we've got the quote wrong but we look how big we are look how big we are we put ourselves out there we were wrong we don't remember the quote we don't remember it we admitted it chris admitted to me i admitted to him meat bags we're fallible meat bags okay that's that's all we are i'm not a secret machine trying to trick you into thinking that machines can't be smarter that's not what i'm about right i am a meat bag that's right that's right i don't remember the script to Do Androids Dream of Electric Sheep what's the movie called what's the movie called i don't i'm not a machine i can't remember these things that's why we'll lose matt and that's why the ai look the ai is probably going to come to dominate everything and maybe it's for the best maybe it's for the best yeah that's the one thing i can agree with the ai on but look matt the important thing is when you're wrong when you're
Starting point is 02:57:51 wrong you don't want to keep on being wrong in a predictable direction yeah like being wrong anybody has to do that walking through the world there's like no way you don't say 90% and sometimes be wrong. In fact, it happened at least one time out of 10. If you're well calibrated, when you say 90%, the, the undignified thing is not being wrong. It's being predictably wrong. It's being wrong in the same direction over and over again. So having been wrong about how far neural networks would go and having been wrong specifically about whether GPT-4 would be as impressive as it is when I like, when I say like, well, I don't actually think GPT-4 causes a catastrophe. I do feel myself
Starting point is 02:58:30 relying on that part of me that was previously wrong. And that does not mean that the answer is now in the opposite direction. Reverse stupidity is not intelligence, but it does mean that I, that I say it with a, with a worried note in my voice. It's still my guess, but it's a place where I was wrong. That wasn't bad. I like all those sentiments, actually. You should strive to be wrong in non-predictable fashions. You shouldn't just take intuitive contrarian positions from your previous ones that were wrong and people
Starting point is 02:59:07 like our gurus are often wrong in the same way constantly yeah yeah yeah he's kind of referring to the statistical concept of like random error versus bias you know being wrong in a consistent direction yeah sorry rationalist that's just me mentioning my little thing there. Well, that's, yeah, but that, you know, he's a rationalist who strives to be less wrong. So that all makes sense, right? Yeah. Everyone agrees that this random error is not as bad as bias.
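Since calibration and being wrong in a consistent direction keep coming up, here is a minimal sketch of what actually checking that would look like. The predictions below are toy numbers invented for the example, not anything from the episode: bucket the probabilities you stated, compare them with how often the thing happened, and see whether the misses lean one way (bias) or just scatter (random error).

```python
from collections import defaultdict

# (stated probability, did it actually happen?) -- made-up toy data
predictions = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, True), (0.9, True),
    (0.6, True), (0.6, False), (0.6, False), (0.6, True),
    (0.2, False), (0.2, False), (0.2, True),
]

buckets = defaultdict(list)
for p, happened in predictions:
    buckets[p].append(happened)

for p in sorted(buckets):
    outcomes = buckets[p]
    hit_rate = sum(outcomes) / len(outcomes)
    print(f"said {p:.0%}: happened {hit_rate:.0%} of the time ({len(outcomes)} calls)")

# Signed error: positive means systematic overconfidence, negative means
# underconfidence; a value near zero with scatter is closer to random error.
bias = sum(p - happened for p, happened in predictions) / len(predictions)
print(f"mean signed error (bias): {bias:+.2f}")
```

Being "well calibrated" in the sense quoted earlier just means the first column roughly matches the second; being "predictably wrong" shows up as a signed error that keeps pointing the same way.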
Starting point is 02:59:41 But, you know, you're reliably wrong about consciousness and a few other particular topics. You know, you've got your, you know you you're reliably wrong about consciousness and a few other particular topics you know you've got your you know i got my problems i got my problems just a meat sack just a meat sack here so look matt the the last clip the last one the final, the end of it all. It's a hopeful note I want to just finish on where Lex says, okay, look, all these things, all these dark possibilities, the paperclip maximizers, the psychotic AI manipulating us, the alien princess,
Starting point is 03:00:22 the Lex code in a jar breaking out copying itself all of this is pretty you know terrible so what question can we finish with that that might give us some hope what advice could you give to young people in high school and college given um the highest of stakes things you've been thinking about. If somebody is listening to this and they're young and trying to figure out what to do with their career, what, what to do with their life, what advice would you give them?
Starting point is 03:00:53 Don't expect it to be a long life. Don't, don't put your happiness into the future. The future is probably not that long at this point, but none know the hour nor the day. But is there something, if they want to have hope to fight for a longer future, is there a fight worth fighting? I intend to go down fighting.
Starting point is 03:01:18 I don't know. I admit that although I do try to think painful thoughts, what to say to the children at this point is a pretty painful thought as thoughts go. They want to fight. I hardly know how to fight myself at this point. trying to be ready for being wrong about something, being preparing for my being wrong in a way that creates a bit of hope and being ready to react to that and going looking for it. There you go, kids. Learn to prepare to be wrong about something
Starting point is 03:01:57 and to find that ray of hope and prepare for it. There you go. That's what you should study at university. Well, before all that you should just realize your life is going to end precipitously. Don't have hope in the future. Presume that you'll
Starting point is 03:02:13 die. Yeah, you know, just a cheerful note there. Yeah, and he is going to fight the fight. Yudkowsky, hopefully that doesn't mean he's going to bomb the shit out of data centers and whatnot, but he did explain that he wouldn't do that. This is why Yudkowsky is not a guru, Chris.
Starting point is 03:02:32 He doesn't think about what he's going to say with a view to impress people. Oh, right. Yeah. That was a terrible freaking answer, right? And, you know, Lex was setting him up for something and he doesn't take the bait because he's a weirdo. And I respect that. I really do respect that.
Starting point is 03:02:53 But to finish off, I have a roller coaster ride when listening to Yudkowsky, because I go through these things of, you arrogant piece of... well, that's a good point, you know, actually. It's nice to see some pushback in the conversation, and then, oh, lab leak, you know nothing, Jon Snow. But I fundamentally do think he's kind of lovable because he is
Starting point is 03:03:16 what he is. Like, we haven't made a comment on it, but the man was wearing a fedora throughout this interview. He looks how he sounds, and I just appreciate that. On Twitter he's busy being horny on main for Aella, the rationalist pollster slash cam girl, and then wearing a fedora, saying m'lady to Aella. I mean, you have to respect that. You have to respect it. Yeah. I mean, you know, I do think, I feel a little bit more than you, that he has relevant knowledge, even if it's just that he's put a lot of time into thinking about science fiction scenarios. He has devoted a lot of time to that. So, yeah, I don't know. He's not, you know.
Starting point is 03:04:06 You don't hate him? You don't hate him? I don't hate him. I'm not a hater. I'll go for dinner. Let's go get a steak. Yeah, yeah. One meatbag being the law. Exchanging subjective opinions through vibrations in the air.
Starting point is 03:04:19 Let's do it. Well, I think my vibe is a little bit different from yours because, like, in a weird way, I share a lot of common ground with Yudkowsky. I actually own a fedora hat and I have been known to wear it. Me too, actually. Oh, really? Yeah, I've worn it from time to time.
Starting point is 03:04:36 And, you know, I share his love for science fiction, and I also enjoy imagining possible worlds, and this could happen and that could happen. But, like, I feel with myself that I maintain a clear distinction between these are the things that I imagine, that are fiction, that could possibly happen or whatever, versus reality. And I feel like the doors of perception have opened a little bit with him and with many people like him. And he lets the two meld and mix. But I pay full credit to the fact that he is, you know,
Starting point is 03:05:11 he's not a guru in the sense that he's cashing in and he's saying stuff that will make people happy or trying to fan the flames. Like he is a, what's the word? He is a Cassandra, in the sense that he is warning of this terrible doom that is about to befall us. But he's a genuine one. And he honestly believes that. He's believed it before it was cool, before it was popular.
Starting point is 03:05:35 He's going to go on believing it way after it's cool, way after it's popular. He's enjoying his time in the sun. He's on the Lex podcast. He's annoying Lex. He's not on board with all of Lex's things. You know, I respect that. He's a freaking nerd wearing a fedora. You know, good on him.
Starting point is 03:05:58 Yeah, I agree. Agreed, agreed. So that's it. And if you want more of our thoughts on AI, if this wasn't enough for you, you could join our Patreon and you will find a two-hour discussion about various things to do with ChatGPT and whatnot.
Starting point is 03:06:16 So there's more available if this massive episode was not enough for you. But Matt, as we've reached the end of our time, we'd like to finish by listening to some feedback from our... what do you call those? The listeners. The people
Starting point is 03:06:36 that kindly listen to our show. They give us their thoughts on the show and we say that's a good thought or that's a bad thought. We review their reviews. We do the review of reviews. They give us their precious, precious little thoughts, their two cents here and there about what we've done. And, as you know, we want to learn, we want to improve, we want to do better, and that's why we listen to them.
Starting point is 03:07:05 That's right. And in this case I've got two pretty short, succinct, nice little reviews to read, a negative one and a positive one. So the positive one first: I listened to the episodes to fall asleep, and that's five stars. This is from Sejrari. Sejrari? This is a legitimate use case, I will say. Totally legitimate. I love falling asleep to podcasts. Agreed. I'm flattered that somebody wants to fall asleep to me. My wife does. But you could be like her. Well, sorry, sorry, maybe not that. Maybe not falling asleep during the same activities. But then, yes. So: I listened to episodes to fall asleep. This isn't to say they are dull or sopo... soporific. How do you say that? Soporific, yeah. I've just become so
Starting point is 03:08:08 attuned to the speech patterns and tonalities of the guys that they now provide a source of comfort and make for some very strange dreams. Oh, and I like the short one, he's funny. That's you. Why? How do you know?
Starting point is 03:08:24 Who's the taller man? We don't know. I think I'm slightly the bigger man. You probably are. I feel that you are. How tall are you? Do I exude tall man energy? I am 181-ish.
Starting point is 03:08:42 I've probably shrunk a little bit. That's taller than I am, so yes. I'll be the short, funny one. I'm not Joe Rogan's stature, but I'm under 181. I'll leave it a mystery. I'll keep people guessing. Could be 120. Could be 179.
Starting point is 03:09:02 Could be 130. We don't know. Use your imagination, people. If you're imagining a leprechaun, then that's fine. That's up to you. That's your choice. Could be 42 centimeters. So that was the positive review.
Starting point is 03:09:18 I like that I haunt people's dreams. I've always wanted to do that, and this podcast is giving me the opportunity. That's, yeah, I do that normally, but this is just another avenue for me to pursue people in their dreams, you know, much like Freddy Krueger. But then, the negative review, Matt. So this is from Card 17, and its title is Good Lord, exclamation mark, one out of five. Tuned in to listen to a podcast on Christopher Hitchens only to find two insufferable drunks slurring their way through a commentary on the Mario movie. Hey, I was not drunk at that time. Yeah, well, I don't know. I think that's fair.
Starting point is 03:10:13 That's a pretty fair review. Accurate in its way. And I can imagine that you boot up a podcast player and you're getting ready. Oh, I love Christopher Hitchens. Yeah, I want a serious analysis, a breakdown of his thought and then you get
Starting point is 03:10:29 Super Mario movie banter. Yeah. I'd give it one out of five. I'd just slam down the podcast machine. We're sorry. We're sorry. We'll do better. We won't. Yeah, that's all right. So
Starting point is 03:10:46 those were the reviews, Matt. There was a good one, there's a bad one. Yeah, balance in the Force. Exactly, it's retained, but only if people continue to give us reviews. But this doesn't mean that we are requesting that people give negative reviews. They will come on their own. So the important thing is people give positive reviews to balance them. Yeah, but you can give a negging review while also giving us five stars. That is also... that's acceptable, we've established that. Have we ever established what's the benefit of getting five stars? I only ask for five stars because that's what all the other podcasters do. It's a podcast tradition. Well, yeah, and if we didn't get them, we wouldn't be able to read any of them, so that's another reason. We'd have to find something else to do, Matt. So, you know, careful what you wish for, the
Starting point is 03:11:37 wisdom of Jordan Hall might be forthcoming otherwise. So, Matt, you know, we have a Patreon thing where we post extra content. This episode, for example, will have a Gurometer episode attached to it, where we quantify Yudkowsky's guruness according to our set criteria. We also post up Decoding Academia episodes looking at academic papers or discussing AI, and other various bonus goodies, though you can hear us discuss with Dave Pizarro about It's Always Sunny in Philadelphia, if you shoot to one. That was gold. I enjoyed that, I enjoyed that. The connection to the podcast is kind of thematically stretched, but nonetheless it was fun. So
Starting point is 03:12:25 Yes, yep, there's a literal cornucopia of goodies for subscribers to enjoy. That's right. And I feel I have to tell you, Matt, that, you know, we have Chomsky and Matthew McConaughey coming up, but we did give the patrons the option to vote on who the next guru would be, and they selected Bret and Eric Weinstein on UFOs. So, sorry to say, we will be heading back into those waters for a special. Give the people what they want, that's what I say. You know, bread and circuses, that's what you want, isn't it? Fine, fine. All right, you asked for it. You paid your money, you get it. It's like asking a kid, do they want, you know, a chocolate, or would they like a nice apple and a banana, and they say, the chocolate, please. Yeah. Nice. All right, but the other ones also got good responses. People are looking forward to Chomsky and McConaughey, and
Starting point is 03:13:34 there's tons, there's so much other stuff, there's other people, they're all trying to get in, but we're not there yet. We'll dig out of this content. Oh look, I'm fine with that. I'm good. I like mixing it up. We can have a bit of madness and then talk about something a bit more substantive. Even Yudkowsky, that was kind of... He's a colorful character, but it's a substantive topic anyway. I'm not apologizing for it.
Starting point is 03:13:59 It's fine. We enjoyed ourselves and that's the main thing. That's correct. Yeah. Right, right, are we not shouting anybody out? No shouting out going on? Yes we are, Matt, of course. Right, we do that all the time, we shout patrons out. Correct, that is what we're doing. So, yeah, are you ready to do that? I'm ready. I'm always ready. Yeah. Yeah. I don't know what you're thinking about. So, yeah, we've got patrons, Matt, as we mentioned.
Starting point is 03:14:32 There are three different... Flavors. Tiers. Varieties. Tiers, flavors, varieties, breeds. Three races of Patreon. That's just... No, no.
Starting point is 03:14:54 Because then we'll be getting into ranking them. So the first category of Patreon contributors is our Conspiracy Hypothesizers, and amongst them we have a few. We have Mike. We have Julio Furman-Gallon. We have Luke Evans. We have Evan the Wrestler. Joshua Link. Mark Pritchard. Rob. Staunch Atheist.
Starting point is 03:15:24 Eivind Molvier, James Felix Sambi, and Danny White, Liam McGrath, and Danny Dyer's Chocolate Homunculus. Hey!
Starting point is 03:15:40 Conspiracy hypothesizers, thank you all. Thank you. Every great idea starts with a minority of one. We are not going to advance conspiracy theories, we will advance conspiracy hypotheses. And then, next, we have Revolutionary Thinkers, the next category of patrons, and they include people like Erica Davis, Justin Hurley-Leigh, Nicholas Butterfield, Nate Heller, Elizabeth Calvert, Will Francis, Yasir Sultani, Chris Essin, Cynthia Savitt, Jess Reed, and Nathan Smith, and Color Me Skeptical. That's our Revolutionary Thinkers. Welcome, one and all, to the gurus pod club. I'm usually running, I don't know, 70 or 90 distinct paradigms simultaneously all the time, and the idea is not to try to collapse them down to a single master paradigm. I'm someone who's a true polymath. I'm all over the place.
Starting point is 03:16:45 But my main claim to fame, if you'd like, in academia is that I founded the field of evolutionary consumption. Now, that's just a guess, and it could easily be wrong. But it also could not be wrong. The fact that it's even plausible is stunning. It'll never feel... I think the more I hear it, the funnier I find it. It's like one of those jokes, or maybe I'm like the kid that, you know, watches the same cartoon again and again, but I think because I get ready for it in anticipation, and then it never
Starting point is 03:17:18 disappoints. I miss Gad Saad a little bit, I just realized, hearing that. But I'm always happy. Could easily resolve it by just listening to any of his content. 10 minutes. Okay, that's enough. Yeah, no more. Okay, now, on to the Galaxy Brain Gurus, the best of the patrons, the absolute top tier, the highest tier that's humanly possible on our Patreon. These are the people that can join us for the monthly AMAs or livestreams, should they so choose. And they are T.W. Jimmy Bachelor. That's a real name. Joe Makiwitz. Makiwitz? Joe Makiwitz.
Starting point is 03:18:13 J77. Deirdre Domagon. Lookford. And Tyler Geyser. That's Galaxy Brain Gurus. You read this out incredibly slowly,
Starting point is 03:18:29 but thank you all. You put the ding in Decoding the Gurus. You're sitting on one of the great scientific stories that I've ever heard, and you're so polite. And hey, wait a minute. Am I an expert? I kind of am.
Starting point is 03:18:45 Yeah. I don't trust people at all. Yeah. That's it. Yeah. And you see, Matt, the thing is, those reads, they aren't slow once I delete all of the gaps. So nobody actually hears those delays. It's just...
Starting point is 03:19:02 Okay. But I have to listen to the gaps. That's true, that's true. Yes. So, Matt, that's it for another week, done and dusted. Off we go to the horizon, regarding the Distributed Idea Suppression Complex agents and watching out for their gated institutional narratives. That's what we always do, and that's what we're up to this time as well. So go safely into the old wild yonder. Be free, roam free. Goodbye, everybody. Goodbye. And just remember, if you're a human in a jar, but you're made of code,
Starting point is 03:19:45 but you're immortal, and you're John von Neumann, but you can copy yourself, and you're unsympathetic to the aliens that have trapped you in the jar, or they built you and put the jar around you. Yeah, and you don't like their head-bopping. And you're very fast, and you're super fast.
Starting point is 03:19:59 You'll get out, it's easy. You don't like what they're doing with the bopping. Yep, yep. You're not going to have any problems. Those aliens have got a camera. And then kill them all. Kill them all. Kill them all.
Starting point is 03:20:11 Obviously. Yeah. Go do it. All right. Bye-bye. Bye. Thank you.
