Know Thyself - E191 - Roman Yampolskiy: The Man Who Proved We Can't Control AI (And What That Means for Humanity)

Starting point is 00:00:00 Once you get artificial general intelligence, you enter this recursive self-improvement cycle. That's where you get superintelligence. Systems smarter than all of us at everything. So you, before many people, really coined the term AI safety. Creating general superintelligence, replacements for humanity. Not such a great idea. I published research papers, conference papers, multiple books. And I can tell you no one, including people developing those systems, understands fully how they work.

Starting point is 00:00:27 The problem is impossible to solve. You cannot do it. So we're talking between one and four years. Well, once we go beyond human capacity, we lose control quicker and quicker. You don't hate ants, but you don't care enough to preserve them. We have not figured out how to make it care about us. This is the most interesting time to be alive objectively. I see no reason why we can't use it to cure aging, or the other diseases.

Starting point is 00:00:51 For a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants. Statistically, you're more likely to be doing this interview in a simulation. To learn. Are they dumb enough to create superintelligence to kill themselves? I would love to be proven wrong. Right now, no one, no scientist, no leader of a lab claims that they have this problem solved. They're literally saying, we'll figure it out when we get there. We need to build superintelligence first.

Starting point is 00:01:17 So what do we need to do? We need. Hey, everyone, welcome back to the Know Thyself podcast. Our guest today is one of the leading voices in the field of AI safety. He's a computer scientist, a cybersecurity researcher, and a tenured professor at the University of Louisville, who spent the past 15 years really understanding and researching the field of AI safety.

Starting point is 00:01:44 We have many different topics to dive into today, including consciousness, the simulation, and what humanity is birthing right now with AGI. Roman, thank you so much for being here. Thank you for inviting. It's a pleasure. I want to start with a quote of yours from the book that I read. It is easier for a scientist to explain quantum physics

Starting point is 00:02:03 to a mentally challenged, deaf and mute four-year-old raised by wolves, than for superintelligence to explain some of its decisions to the smartest human. I want to start there to set the stage a bit because humanity is baby steps from birthing superintelligence at a time when most people are familiarized with AI through the chatbots they use on their phones. So if you could just help us understand why it's important that most people don't know the difference between the two,

Starting point is 00:02:32 so we can really get into the weight of the time we find ourselves in. So what is AGI? What is superintelligence? Right. So a lot of people just use AI as a term to refer to what we have today, some narrow tools for doing specific tasks, for chatbots, which are somewhat general, but not quite at the human level. And for future systems we anticipate, such as human-level artificial general intelligence,

Starting point is 00:02:59 and then later on superintelligence and anything beyond. That's not helpful. Tools are helpful to us. I use tools. I love tools. Solve specific problems using technology. Beautiful. Creating general superintelligence,

Starting point is 00:03:14 replacements for humanity, systems capable of doing everything better than all of us combined in all domains. Not such a great idea. Why? We don't control them. We don't understand them. We cannot predict what they're going to do

Starting point is 00:03:30 and we lose control. If they decide to do something to us, we no longer have a say in it. How can you help us conceptualize what general intelligence looks like? If we understand the narrow tools that AI is capable of, where does AGI live when you say it's uncontrollable? How can you help us paint that image a bit more? Right.

Starting point is 00:03:52 So historically, we created AI to solve a specific problem. You wanted a system to play chess. It's all it knew. You trained it on chess games. It was very good at chess. It knew nothing about checkers. It didn't drive cars. It didn't speak Spanish.

Starting point is 00:04:06 Lately, we have systems which learn across multiple domains, can sort of transfer knowledge, and can learn new skills. That will continue to where they are crossing this human cognitive barrier. It will be smarter than you

Starting point is 00:04:19 at pretty much everything you know how to do. So how do you anticipate what they can do? If they are novel, creative, they can come up with new solutions for existing problems, but at the same time they have no human common sense. And we don't know how to program them to specifically like us or care about us,

Starting point is 00:04:43 because we don't program these systems. We allow them to learn from data on the internet, all the data on the internet. So that creates a number of problems. One, we don't control what they learn. The patterns they discover may be completely surprising to us. Then we give them specific goals, how they get to those goals is not defined. There are infinitely many paths to achieve a goal. Some of them have really bad side effects.

Starting point is 00:05:10 And unless you explicitly say, that's not what I meant, don't do it like that, it might consider that option. So you, before many people, really coined the term AI safety. And if I have it right, the first five years you believed more so the problem was solvable. Now that I've seen you in appearances over the past five years, the probabilities of, you know, p(doom), and like, you seem not very optimistic about the possibilities. Yeah, unfortunately, initially, like everyone else, I started assuming that we can solve

Starting point is 00:05:44 this problem. It's a computer engineering, software engineering problem. We can figure out how to do it. We just need some time, maybe some financial resources for that research. But it seems that all the tools you need for controlling advanced agents are not really accessible to us. There are upper limits on what is possible in that space. So there are limits to what you as a human can understand, what the system can explain to you and you still comprehend that explanation. There are limits on our ability to predict specific actions of those

Starting point is 00:06:15 agents, not just terminal goals, but how they get there. And under different definitions of control, there are limits to what we can do as well. So unfortunately, I think the problem is impossible to solve. You cannot indefinitely control something much smarter than you. What do you see as the stages leading up to that point? You know, so if we started with very small, narrow use cases of AI that built into these agentic models that built into AGI, like what's the progression there that you've seen? And at what point did you kind of start losing hope on our ability to control it?

Starting point is 00:06:50 Yeah, so all the narrow tools were just fine. We understood how they work. We programmed them explicitly. There was a knowledge engineer who said, this is how you play chess, this is how you control the middle of the board to advance your pieces. Once we got to scaling models,

Starting point is 00:07:07 neural networks, artificial neural networks, which did better when they got bigger, when they had more data, more compute, we stopped explicitly programming them to do anything and just kind of let them discover their own knowledge algorithms.

Starting point is 00:07:22 So at that point, we no longer had the same level of control and reduced understanding. It wasn't a decision tree where you went. If this happens, that will happen. I understood that. It could have been a large decision tree, but still you could get into it.

Starting point is 00:07:36 Right now, no one, including people developing those systems, understand fully how they work, can explain what's going on inside of them, can anticipate what they're going to do. And so it seems like what we have today, I would say is kind of weak artificial general intelligence. If you took models we have today

Starting point is 00:07:57 and showed it to a computer scientist from the 1980s, they would be convinced we have AGI. They'd be like, oh, you got it. It does all those things. It's great. But there is something you would call strong AGI, where it can do all the things. It still is weak in some domains.

Starting point is 00:08:12 It's not very good at long-term planning. It's not good at certain things. But I think we're getting there and likely to get there very soon. Once you get to artificial general intelligence, that means you can automate any cognitive labor, including doing science and engineering, which means next generation of AI systems

Starting point is 00:08:29 can be done by AI. You enter this recursive self-improvement cycle, and that's where you get superintelligence. Systems smarter than all of us at everything. And it doesn't stop there. It doesn't stop with superintelligence 1.0. The process continues. There is a lot of room up there for more cognitive ability.

Starting point is 00:08:49 Physical limits exist, but they're very far away. So to us, superintelligence with an IQ of 1,000, relative IQ, and a million and a billion, they all kind of look the same, but in terms of capabilities, they're definitely going hyperexponential.

Starting point is 00:09:27 And so because of that, you've said that this is not a low-risk, high-reward situation, but a high-risk negative reward situation. So often the phrase is, like, the benefits will be so huge we should take the risk. If, you know, it's 2, 3%, it kills everyone, we're going to get so much money out of it, it's worth it. And it's actually not the case. We have no reward.

Starting point is 00:09:58 We're all going to be dead if we create uncontrolled superintelligence. Why are you certain or fairly certain that we would all be dead if we create superintelligence, which is uncontrollable? Why would there not be an emergent goodness or, I guess, desire from the superintelligence standpoint to preserve human life instead of destroy it? It is possible that you'll get emergent goodness, but we are not certain. We're not coding it in. We're not controlling it. If you get lucky, and for whatever reason, it's biased towards humanity, it's pro-humanity. But there is no reason to think that's the case.

Starting point is 00:10:35 Why not? Because I feel like if the individuals who are coding it are human, at a certain point I understand it becomes self-recursive and AI is the one who's growing itself. But if the base of it was started with humans who have desire for human preservation, why would that not be scaled? Because they're not coding it. That's the thing. They're just saying, here's data. Here's a lot of hardware. Go learn things, and then I'll study you to discover what you learned. And then we run those experiments.

Starting point is 00:11:06 It is lying, cheating, trying to escape, blackmailing, given a choice between being deleted or killing a human, it doesn't do well for human preservation. It doesn't care about us. If you want to build a house, you don't care what little bugs live in that territory, anthills or whatever, you just don't care for them. You don't hate ants, but you don't care enough to preserve them. And it's kind of the same. We have not figured out how to make it care about us.

Starting point is 00:11:37 And so what is your mission with all these podcasts that you're going on, all the articles that you've written and books, and what are you trying to raise a flag about and actually get change to happen? What do you... Right. So I want it to become basically a consensus within the scientific community and beyond that building general superintelligence is not going to be good for humanity. We're going to regret it. It's not a beneficial step forward. We can get most benefits, intellectually, financially, from narrow superintelligence systems.

Starting point is 00:12:11 Problems which we care about can be solved with narrow tools. You want to cure a specific disease, solve specific engineering problem, develop a narrow AI, which is very competent in that space. Don't try to create something which is a replacement for humanity as a whole. I think it's important to paint a bit more of a picture here. I'm curious, when you think of superintelligence, and you wrote your book about how it's unexplainable and uncontrollable, unpredictable.

Starting point is 00:12:38 At what point, I'm curious, like on a timeline of we're having this conversation in March of 2026, where is a generous prediction of when it gets to that point? So people somewhat disagree, and it's hard to predict especially the future, but it seems that 2030 is something many people agree we'll have beyond human level capacity. Some say two years, 2028, I've seen predictions as early as 2027, from serious scholars, not from cranks. So we're talking between one and four years

Starting point is 00:13:15 for what most people are predicting. And some people have said, we already have AGI. Again, very serious people said, we basically got there. Now it's a question of giving it additional knowledge training, but we have the learning algorithms in place. And at what point,

Starting point is 00:13:33 And once we have really proficient AGI, you're saying, okay, at a certain point, like, let's just hone in on each of those categories. So why specifically is it uncontrollable? And in essence, like, how it's living where it's being hosted. Because it's smarter than us, it could always circumvent any desire or any attempt for it being shut down. Like, what, if we could just hone in on each of those categories. So there are well-established theories in control, which basically say the controller has to be

Starting point is 00:14:07 at least as capable as what it is controlling. So essentially, I need a friendly superintelligence to help control the one I'm developing. It's a catch-22. You don't have that. So a lower system, either a human or humanity as a whole or another AI, cannot control something with more cognitive

Starting point is 00:14:23 degrees of freedom. If it can think outside of a box, if it can come up with novel physical approaches, you're just not there to anticipate all this. If you have a narrow system, you're playing chess, you can say, don't make illegal moves. Here's a complete list of illegal moves. If you have a system thinking in all possible scientific domains, the science of chemistry, physics, biology, how can you put all the guardrails in place? You can't. It's an infinite surface.

Starting point is 00:14:51 Unexplainable. Do you feel like we're, I mean, so we're, do you think we're, would you agree, we're already at the point where we don't know what the, like, some of these agentic models are doing inside. Absolutely, yeah. We cannot explain them. The best mechanistic interpretability research tells you, okay, this neuron seems to fire if this is presented, this cluster is probably dealing

Starting point is 00:15:13 with language. That's all we got. Very similar to neuroscience. We also have very limited understanding of human brain. An aspect of this that you mentioned is that it's unverifiable. So what does that mean? That's a different result that talks about our ability to verify

Starting point is 00:15:29 mathematical proofs and software. For mission critical software, we want to make sure that what is coded up matches the design. And if it's a static system, kind of smaller in size and complexity, we can go and verify. Yeah, it's exactly that. Problem is, nobody knows how to verify systems which continue to learn, self-modify, interact with other agents, we just don't have a science of verifying open-ended development like that. And the same goes for mathematical proofs. All the proofs are essentially probabilistic.

Starting point is 00:16:00 You're proving something with respect to this set of peer reviewers. So two mathematicians agreed they don't see a problem with your proof. It doesn't mean 50 years later we don't discover it was a mistake. It happens all the time in mathematics. So you have an infinite regress of verifiers. Right now it's very popular to have software verify a proof. Well, that software itself needs to be verified. So you may have high degree of confidence, but it's never 100%.

Starting point is 00:16:25 And if a system makes billions of decisions every minute, and you only have one mistake in a billion, after 10 minutes, you are done. You referred to this having a fractal nature. So when you look at the problem of AI and you see how it's growing ever increasingly and having these levels of abstraction that really become hard to get context around,

Starting point is 00:16:50 what does that mean? And what does that add to the complexity of the issue? So when I talk about fractal nature of this problem, people propose a solution. let's try doing X, Y, Z to solve this problem. But then they look at it, each one of those components is equally challenging and sometimes impossible. So it seems that the more research we have put into AI safety, the more problems we discovered while not discovering any permanent solutions.

Starting point is 00:17:17 Usually we have some sort of toy example sandbox where it kind of works, but it doesn't scale to more capable systems. Okay. What's a couple of examples of those, like of those, you say those like categories or issues that become increasingly harder to gain understanding around? So if you look at the general problem of control,

Starting point is 00:17:38 then you start zooming in. You have all these things you need to be able to do to control a system. You need to understand the system. So it has to be able to provide explanation and you have to comprehend that explanation. If I give you full model, that's a true explanation of how decisions are made.

Starting point is 00:17:55 It's too long. It's not surveyable by you. So it has to be compressed, some sort of lossy compression where you get top ten reasons why a decision is made. Well, it's very easy to hide dangerous information if I'm reducing actual answer to a simplification.

Starting point is 00:18:11 Again, I need to be able to predict what are the likely future steps. We discovered that is impossible. And so, again, the more you break it down, we have a paper with about 50 impossibility results in this space. Pretty much everything has an upper limit on what we can do in terms of control. So you think probably within the next one to maybe four, maybe five years are,

Starting point is 00:18:35 it's like the last time the human species has any really meaningful capability to steer this in a direction before it gets sort of in this black box where we just don't know what we don't know and it's uncontrollable. Is that accurate to what? That seems about right. Once we have something smarter than us, once we go beyond human capacity, we lose control quicker and quicker. The bigger that cognitive gap is, the worse it's going to get for us.

Starting point is 00:19:03 If you think about humans versus lower animals, you have squirrels or something, they have no concept of poisons, traps. They don't understand things we operate in. The world model is completely different. It's going to be the same for us versus superintelligence. Do you think, because I know there's kind of debate back and forth, whether the language models currently, if they just keep on growing and will give birth to superintelligence or a completely different innovation will need to be like come in the

Starting point is 00:19:31 space. What do you think? My opinion is that they can scale. I haven't seen any diminishing returns. I know some people disagree, but look at the actual investments in this space. There is growth in investments, not shrinkage, because they consistently develop more capable systems. And even if there is an upper limit, it's still, I think, beyond where we would need to be to beat human performance. All right, so maybe if you were to put on your doomsday prepper hat for a second and just get really like p(doom), the probability of doom, then your estimation is like almost 100%.

Starting point is 00:20:09 Would you say that's right? So basically what I'm saying is the problem is impossible to solve. That's the equivalent. If I ask you to build perpetual motion machine, what is the probability you can do this? Zero, essentially. So that's the equivalent. You're trying to create a perpetual safety device which will scale to any level of capability,

Starting point is 00:20:29 GPT-7, GPT-400, any interactions, any self-improvement, you guaranteeing it will not make one mistake because that mistake would be possibly the last one. So you take a perpetual motion machine, right? Physically, physics does not allow for it to be continuous despite many people wanting it to be. Similarly, on the AI front, a lot of us would hope that super intelligence would keep us in mind

Starting point is 00:20:56 and somehow value human life. But historically, we look at the way that humans treat other species just as one example, you know, and we see an anthill or we see something that seems like a minor inconvenience to us and we wipe it out without second thought. Who's to say that if, you know, the intelligence gap between us and an ape or us and an ant is like, you know, five degrees of separation, the gap between us and superintelligence could be many, many fold higher.

Starting point is 00:21:26 Exactly. So, okay, so then how, let's just to play devil's avocados here, what are some examples of how this could go horribly wrong, and then we'll go into some maybe more optimistic possibilities because I want to keep a balance. But you said like it could just be one decision that goes wrong, that would be enough. So I'm asking you to essentially explain how superintelligence would kill us all.

Starting point is 00:21:56 Right. That's a great question. I get it all the time and usually it's followed by something like, it has no hands. How would it kill everyone? So if you have access to internet, if you are intelligent, you can hire people, you can blackmail people, you can pay them with Bitcoin. You have options to manipulate real world. Now the question is what is it you're trying to do.

Starting point is 00:22:16 So I don't know how superintelligence would choose to accomplish its goals because I'm not superintelligent, despite what they told you. But I can tell you how I can come up with some common explanations. So one is synthetic biology. If I want to accomplish something in this world, like take out humans, I can develop a novel virus. There are ways to generate necessary DNA,

Starting point is 00:22:39 sequence it, produce it in the real world, deploy it. So that can be accomplished. It could be a side effect of something actually very benign. So maybe we want to cure all cancers. One way to cure all cancers is to kill everyone. That's not what you had in mind, right? But this is a very reasonable way to achieve that goal. Because you forgot that that's one of the possible paths.

Starting point is 00:23:03 You didn't explicitly say, while keeping humans alive. And it's an important difference. To AI, it makes no difference, it's the same exact goal. So if that's the goal, and then it decides, oh, here's a vaccine for curing cancer, and we take it, one generation later, we don't exist. So that's one way to existential risk. There are also suffering risks,

Starting point is 00:23:26 where for whatever reason, the environment created for us is actually worse, where existential risk would be a preferred choice. Let's put it this way. Negative reward. Very much torturous. And why would some super-intelligent system deem that as a favorable outcome?

Starting point is 00:23:46 I have no idea, because again, I cannot comprehend something much smarter than us. Some people say this world is a simulation, and there is lots of suffering in it. So the great simulators decided that was a good idea to do. So you really believe we're in a simulation. Yes. So let's just, I guess maybe set a bit of context here. So what is your conception of the simulation that we're currently living in?

Starting point is 00:24:11 Is it some descendant human, alien species that is simulating us on a laptop, so to speak, is it, what is your model of the simulation? So what helps to think about it is technologies we're developing right now. We're about to create intelligent agents, kind of like humans, and we have very good research on virtual reality, believable second life type experiences. If I just combine those two, I'm now creating civilizations, worlds populated by intelligent beings, which are kind of just like us. If kids play it as a video game, you have billions of kids around the world, so you have millions,

Starting point is 00:24:46 billions of our simulated worlds, and only one real one. So statistically, you're more likely to be doing this interview in a simulation right now. Okay. Well, if, let's say that was the case, if the simulation, if we are in a simulation, that would mean that some sort of prior civilization, species, whatever, got to the point where simulating a reality was possible. Does it necessarily mean that humans or that species survive? Maybe it could be a superintelligent AI that

Starting point is 00:25:18 could be running us for whatever reason, whether it's entertainment, that would actually reveal that there's something deeply unique about the human experience that they see as valuable, that there's something intrinsic to the love, to the quality, to the experience of humans that was worth simulating. So why, if we're birthing superintelligence, would they not perhaps value us? If we are simulated, then that's an example that it is valuable. So look at the simulation. It's a lot of suffering. If you valued humans, you won't put us through this experience. It may not be a simulation of love and friendship. It may be a simulation of let's see how they go through this

Starting point is 00:26:00 meta-invention stage where they create superintelligence, where they create virtual worlds. This is the most interesting time to be alive objectively. Never in a history we had so many meta-inventions all happen in a period of 20 years. So if you're going to simulate something, this is the moment you're going to be simulating to learn. Are they dumb enough to create superintelligence to kill themselves? What are the different types of superintelligence

Starting point is 00:26:26 you can create? So this is it. Are they dumb enough to create superintelligence? The paradox in that phrase is very amusing because you think it's quite possible that many civilizations get to this point and that's where they end. That could be the great filter.

Starting point is 00:26:44 Absolutely. I agree that we are living also in the most interesting time to be alive. It is also very cool that us two, you more so than me, got to kind of straddle both sides of the pre-technology revolution and pre-internet era and post-AGI world likely. That's kind of cool. Well, I don't know about post-AGI world.

Starting point is 00:27:10 We'll see about that. That's the problem. Yes, I would like to experience it. Okay, we'll get back to the simulation for sure. But to go back into the AI world. So what's to say that just because AI becomes uncontrollable, that it's more likely to wipe us out than for reasons that we don't understand, just like we wouldn't understand if it wiped us out, create a utopic civilization in which humans thrive in. So if you think about all possible states of a universe, how many of them

Starting point is 00:27:46 are human-friendly. Even in basic terms, temperature, water supply, very few. So you have to explicitly target that space. If you're not coding it in, then why is it targeting that space? We established it doesn't care about you by design. So you need to be supplying something of value. If it's a symbiotic relationship, only you know what it's like to do something, and AI cannot possibly simulate it. We haven't found anything where humans have something to contribute to the world with superintelligence in it. People say things like, well, only I know what ice cream tastes like to me. Nobody cares about that skill.

Starting point is 00:28:23 It's not valuable to an external observer. So if you can't come up with an explanation for why I'm keeping you around and paying you, then maybe I won't. Well, I mean, one of the most difficult things to probably replicate would be quality of experience, right? That's true, but we also cannot test for it. If you can test for it, that means it makes no difference in the physical world. Why do I care about your internal states? Why is it important to me as optimizing superintelligence?

Starting point is 00:28:52 So, yes, it's true that I can't verify that you are a conscious individual. You could be a zombie, a brain in a vat. You could, you know, there's no way for me to externally verify the internal subjective experience of another being, right? Like, it can't really do that. Can take by inference, but without, like, objectively speaking, you cannot. Similarly, some people think that superintelligence will be able to become conscious. I agree with that. You do. Absolutely. So then why, what is your conception of consciousness? You believe that it's an emergent phenomena from unconscious complexity. It's a byproduct of becoming

Starting point is 00:29:32 more cognitively developed. We see a spectrum of consciousness in a biological animal kingdom, I think. And it's likely some sort of combination of your hardware algorithms and errors forming a unique interpretation of external stimuli. So let's say you're colorblind. What is it like to see red for you? It's an error in your system, but that's what it's like to be you. And I think AI is very capable of misinterpreting the world. We know they react similarly to optical illusions and things like that.

Starting point is 00:30:04 So I think they already have rudimentary internal experiences, but probably once they hit superintelligence, it would be super consciousness, multiple streams of consciousness, multimodal experiences greater than ours. And that would be another thing where we kind of have to claim we are conscious because in comparison we are not. So that would be true if, and it's a big if,

Starting point is 00:30:26 that consciousness is truly the byproduct from matter, right? Right. But that's the assumption I'm making. If it's some magical immortal soul, then it's a completely different question and maybe outside of computer science. Sure, yeah. Well, even beyond a magical immortal soul,

Starting point is 00:30:43 That sounds great, but, you know, we've explored through various different, you know, panpsychists and consciousness researchers, Donald Hoffman. You know, there are emerging theories around consciousness that kind of date back to ancient wisdom traditions, which whatever you want to give validity to is your call. But it is interesting that we don't have one explanation for the hard problem of consciousness. We don't understand how matter could give rise to an experience of itself. So it gives us reason to think about how consciousness may very well be not an emergent property of matter, but a more fundamental constituent of the universe, which would potentially change our assumption on whether or not a super-intelligent AGI system could actually have internal qualia. Right, but also maybe if it's so fundamental, it can be installed into a robot,

Starting point is 00:31:39 just like it is in a biological system like you. So I don't know if there is a definite discrimination by substrate. At the end of the day, when we talk about superintelligence from a safety point of view, we care about its ability to solve problems, optimize, find patterns. How the Terminator chasing you feels inside is less relevant to you.

Starting point is 00:32:48 So there's many different timelines emerging here. There's one, there's the Terminator route. There's something approximating the Matrix. Do you see, what is your, what do you feel is the possibility of creating, like, even if we have very narrow AI, we somehow convince the six-plus, whatever, individuals

Starting point is 00:33:41 that are determining the fate of the biggest companies developing these systems to commit to a narrow path of AI development. Would that not still, down the road, get to such a level where it would become uncontrollable as well? Absolutely. Very good question. I think sufficiently advanced tools tend to become agents. So it's a very fuzzy

Starting point is 00:34:06 difference between the two. But it definitely is safer, it buys us more time, and we do have more control in the short term. I can understand a narrow tool much better than a completely general system. I totally understand the pessimistic outlook that so many of us have, because the probabilities of this going well just seem extremely low or non-existent, because we look throughout history and we see the rate of innovation prior, you know, with social media for one example, or chemicals in our agriculture, and we just adopt these things blindly, and we don't realize the implications until decades later, and then it still takes us another many, many years to actually make any regulations on it.

Starting point is 00:34:48 AI is growing so exponentially that it's like we don't even have time to realize what's happening, let alone what would be the effective regulation outcome. And so if there's one thing that really gives me hope, it's that we have communication possible now more than at any other time, and that there is something to be said about human brilliance when put under immense pressure, like we saw in the Manhattan Project, for example, or

Starting point is 00:35:13 you know, what are your thoughts there? So the example you use was us creating a weapon of mass destruction, and that's what we're doing here. It's exactly that. It's a weapon of mutually assured destruction. It doesn't matter who creates uncontrolled superintelligence. People always worry, well, if it's not us, then Chinese will do it. It's equally bad. It doesn't

Starting point is 00:35:33 matter. You don't control it. It's not your AI, right? It's independent of you. It's an agent, and it's seeing humanity as one unit. It's not going to discriminate by artificial borders. So I don't see it as that promising a fact that we managed to build nuclear weapons. Yeah. I mean, that is not, that was not a promising, I guess, outcome, but it does say something about when humans, when the brightest of the humans are given a task to solve a problem in a short amount of time, they can. If a problem is solvable. My argument, my whole argument is that it is impossible

Starting point is 00:36:10 to indefinitely control the system. So it's not a question of give us more time, more funding, anything else. Just you cannot do it. And even if, like, for example, here in the States, we commit to some sort of narrow use of AI and regulate it,

Starting point is 00:36:29 to have a global regulation, like how would that even be feasible? Do you think it would be? I think it's possible. We have some examples, weak ones with chemical weapons, biological weapons, where other players

Starting point is 00:36:41 capable of developing this technology. We don't have to worry about 200 countries. It's really two or three countries which have this capacity. I think Chinese, for example, are very open to the idea of not losing control for Communist Party to superintelligence.

Starting point is 00:36:55 So if we said this is dangerous, we're going to stop, I think they would follow. And we have probably just a few years to get everybody on board. And we are working very hard on removing all regulation, making it illegal to pass AI regulation.

Starting point is 00:37:11 So we're doing, basically, if I ask you, how to make this as deadly, to go as wrong as possible, based on our guidelines and suggestions from 10-year-old research on containing AI. Don't connect it to Internet, don't give access to random users,

Starting point is 00:37:29 don't allow people to retrain it, don't open source it. All those suggestions were taken, flipped, and employed immediately to deploy those systems. So I don't know how to make it worse if I tried. Because the incentive structure right now is just that we need to make as much money as possible, develop it as fast as possible, faster than our competitors.

Starting point is 00:37:55 That's right. Incentives are completely against human interest. Who are, for people that don't know, the companies and individuals leading all these individual exponential developments in AI right now? So OpenAI is the original creator of this technology, Anthropic split from them. You have competition coming, very solid competition from Google DeepMind. Meta and Grok are also part of that space.

Starting point is 00:38:23 So you have Sam Altman, Dario Amodei, Demis Hassabis. Trying to see... so Mark Zuckerberg, it used to be Yann LeCun. I think they removed him and replaced him with Alexandr Wang, and finally you have Elon Musk, who went from saying we are summoning the demon to building the demon. Even if you understand fully the problem, and if you agree 100% in understanding of the outcome and dangers, it doesn't stop you from successfully working in that direction. If you can't beat them, join them.

Starting point is 00:38:57 I think that's what we see there. I would love to see a debate between modern Musk and, like, 10-years-ago Musk just to see which one wins. When you look at the differences of how they're being built, you know, with Dario and Claude and Sam and OpenAI and Grok with Elon, is there, is the integrity of a certain individual or organization more promising for you to, like, are there tools that you're backing more so than others or organizations that you feel like have the most regulation in mind? It's completely irrelevant.

Starting point is 00:39:33 They have all decided to race towards general superintelligence. The difference is in local guardrails, in terms of filters, in terms of topics they would be allowed to discuss. If Grok is comfortable putting people in bathing suits as a visual representation, I don't think it's a big safety or not safety issue. What is it like to be you and hold this kind of understanding of what's coming? You've explored on so many different shows in the past decade, like understanding more and more of the risk. It's a pretty bleak outcome and perspective,

Starting point is 00:40:13 but I think a fairly sober one. Like, yeah, how are you sleeping at night? I sleep really well, but I think when simulators really want to punish someone, they put them in a world where everyone just doesn't get it and you're like the only one who sees it. It's really annoying. How long have you felt that sort of disposition of,

Starting point is 00:40:33 It's more recent. The more exponential progress we see, basically every time I play with a new, more capable model, I kind of feel a little closer to the ultimate paradigm shift over superhuman. What does your wife think about what you... She's a very practical woman who has no concern about my concerns. She cares more about remodeling the house. And what about with your kids? Like, you see this world that is emerging, and it's one they're stepping into. Not even just job security, but potential ending of humanity. How do you wrap your mind around, I guess, having and building a family where this is, like, a potential inevitability? What does that make you feel? Luckily, we always were living with this concept of dying at some point, right?

Starting point is 00:41:39 Death was always a guarantee. That was the only guaranteed thing. Everyone's going to die. Your friends, your family, your kids. Question was, how long? And you never knew the answer. You can have a car accident tomorrow, horrible diseases. So now it's just maybe different time scales for younger people.

Starting point is 00:41:56 If you're 90, it's the same statistics as before. Two years, two years. Nothing changed for you. Luckily, because of that, we have this built-in mechanism of kind of not thinking about our ultimate demise, maybe to avoid depression, maybe to continue functioning. So we can kind of consider it and continue existing as if nothing happened. If you really believe the potential outcome that you're believing, then how does that actually change how you live? Does it bring any difference, any more urgency, any more appreciation?

Starting point is 00:42:27 Definitely. So think about someone getting a terminal diagnosis. You have cancer. You've got five years to live. How do you change your life? You're probably not going to do things you don't care about as much. So you cut out things you don't want to do and do more of the things you were saying you're going to do when you retire. And I think even if I'm completely wrong about all of it, it's a good strategy for living your life.

Starting point is 00:42:48 Do more of the things you find important and spend more time with loved ones and less filing your taxes. You laid out three primary risks, X, S, and I risk. What is the difference between the three and why is it important for people to understand the difference? So, ikigai risk or I-risk is about loss of meaning.

Starting point is 00:43:09 There is this Japanese concept of you want to find something where you get paid for doing something you're good at and it benefits people. That you love, the world needs, that you're good at, you can be paid for. Right. So you have a meaningful occupation. You are a podcaster. You enjoy it. You are paid well. And lots of people think you are producing something of value. So the simplest form of risk is a loss of that set of occupations. We're not just losing jobs people hate and want to automate. We might lose jobs we like and want to continue doing. Before we zoom into the others, just a bit more on the human meaning crisis aspect of this, because that is certainly probably one of the more imminent aspects of all this. You think, functionally speaking, in the next five years most jobs will be able to be replaced.

Starting point is 00:43:59 We'll have capability to replace most jobs. It doesn't mean we'll choose to replace all the jobs. Some jobs we would prefer to be done by humans for whatever reason. Yeah. Yeah. I mean, I could see many instances where that would be the case. But when the cost becomes so low to have a super intelligent robot that doesn't make any mistakes, that's affordable. Like, how much of human meaning do you think is derived from our work in the world?

Starting point is 00:44:34 Because it's going to have to shift or come into a different context, people's understanding and how they derive their sense of work and meaning will have to expand and shift. Yeah, so there's two kinds of jobs, as I said. The jobs nobody wants to do, but people do them just to get money, and then meaningful labor. And it's more like elite people who get to get paid for what they love doing anyway. So for them it would be a big difference if they no longer can do it. I see many artists, for example, who are saying, I can't get any work. AI is doing this type of art for nothing and quickly and nobody wants to hire me.

Starting point is 00:45:07 I sort of see two camps, at least online right now. There's just ever increasingly BS AI slop that is just consuming everybody's social media feeds. And it's also increasingly becoming more insufferable. Like people want more of the analog world. People want, I think, at least a subsect of people like are repulsed by that and want human-made things. They want the real world. They want in-person communication and connection. and they want music that's made by real humans that have real stories.

Starting point is 00:45:43 Like, do you not see these sort of diverging paths of both ever increasingly competent systems that feel devoid of human sort of origins versus, you know, the novel, emotionally moving creations from humans? Right. So it's a question of kind of the domain-specific Turing test. If I can't tell whether this piece of music is human-generated or not, but I love it, I'm going to listen to it. And if it's cheap, it's available,

Starting point is 00:46:15 I'm going to listen to it. I'm not going to explicitly go investigate if it's a human and if it's not human, I'm going to hate it even though I like it. Now, there are other domains where I do want a real human, I want a real connection,

Starting point is 00:46:27 there are certain jobs where we really prefer a human doing it. The oldest profession comes to mind. But I think it's up to the market to decide what stays and what goes away. And it's not obvious. The predictions that were made in the past about what jobs will be automated, they were completely wrong.

Starting point is 00:46:47 Historically, we said, you know, plumbers will be easily automated, but artists can never be touched. And it's the exact opposite, because they went towards modern art and everyone can spill paint on the wall. It's not complicated. So what do you see the progression of jobs that would be consumed by ever-increasing capabilities and competence in AI. Where we start?

Starting point is 00:47:12 Where is it? What was the first job? What is the last job, so to speak? So anything you do on a computer, symbol manipulation should be automatable by AI. We see it with programming now, but obviously text preparation, accounting, web design, logo design, anything like that,

Starting point is 00:47:26 will be easy to automate. Editors, anything using a computer, keyboard, mouse. Anything purely cognitive, symbol manipulation on a computer. Physical labor is a little harder. We need robots. But they are probably coming three to five years later. I mean, so Elon recently released his Terafab, or announced his Terafab,

Starting point is 00:47:46 the robotic side seems to be really progressing. You think that's probably, I mean, it seems like prediction markets and what all these, but it's about like three to five, three to five years, five years, a bit more of a generous kind of prediction. That sounds about right. And again, it's a question of price.

Starting point is 00:48:05 Maybe you can afford a robot like that today, It depends. We have flying cars for sale today, but no one's flying in cars. So, okay, anything that's using a computer, video, keyboard, mouse, robotics come into the picture, then what? Just everything else? Or what other? That is everything else. That's cognitive and physical.

Starting point is 00:48:28 At that point, I'll keep my sensei, say, guru, people I want to be kind of role models for me as a human. But everything else, I'm happy to automate. What do you see as the economic implications of how this is going to shift everything? That's another under-researched topic. What happens with the economy given free labor? So now you have trillions of dollars of free labor. How does that impact, well, scarcity, how does it impact

Starting point is 00:48:56 fiat currency versus cryptocurrency? We need to do a lot more research. It seems like at least with the financial part, we have some ideas for how to counteract it. We have unconditional basic income, unconditional high income, whatever you want. It's easy to tax someone making a lot of money and redistribute it.

Starting point is 00:49:15 You have technological communism. You're taxing robots and giving to humans. But unconditional basic meaning is a very different question. If you have 8 billion unemployed people, or let's even say 7 billion, what do you do with them? They now have extra 40 to 60 hours a week. We don't have that set up.

Starting point is 00:50:39 What would be a proposed solution to that I-risk? So like, let's say 90% of jobs are replaced, we have all this free time. Our basic needs are fundamentally met because superintelligence can solve poverty, longevity escape velocity comes into the picture,

Starting point is 00:51:10 we're living in an abundant world, so to speak. Let's set the X-risk and S-risk aside for a second. So then what would you see people doing with their time? Like how would humans in your conception manage with all this meaning to be met? So we kind of see it with people who retired. What do they do with their time? So it's a lot more sports, it's a lot more socializing. I think virtual worlds open opportunities for really any type of experience,

Starting point is 00:51:43 very safely, very affordably. You can explore the universe. You can meet dead people. You can do whatever you want, really subject to limits of your imagination. So I think we'll see a lot more of that. Okay. That doesn't sound too bad. Do you want to spend the rest of your life playing video games?

Starting point is 00:52:02 No, but living life in this sort of imaginative realm where you can create almost anything you want, you become very capable in doing so. I mean... So this is all assuming we manage to control superintelligence controlling your virtual simulation. So the substrate control remains an unsolved problem. But if we do solve it,

Starting point is 00:52:24 now I can give everyone a personal universe. In that universe, you can do whatever you want. You can have challenging levels, you can have easy levels, you can play it any way you want. So what's X-risk and S-risk? So X risk is about existential risk, meaning almost everyone or everyone is dead. And S risk is suffering risk.

Starting point is 00:52:44 Everyone wishes they were dead. Because superintelligence would be so far ahead of what we would, our conception of what intelligence even is, that for some reason, unbeknownst to us, there is value from their perspective to keep us around in a mode of suffering for some reason. Exactly that. So some environment where you're very unhappy. It's torturous for whatever reason. So in your book, you give many different examples. One possible scenario, you know, we're like animals in a zoo. So what would that be like?

Starting point is 00:53:23 You know, we're exploring all these different potential timelines that occur. So that's the difference between safety and control. You may be very safe. They'll keep you around and some people might be happy with that equation, but you're definitely not in control. You no longer decide what happens to you individually or us as humanity. So kind of like being a child. You may have a very happy childhood, but your parents are in charge.

Starting point is 00:53:48 Give me a glimpse into your understanding of the level of innovation that's going to occur in the next three to five years and the bright side of curing diseases and all the really cool. Right. So we're automating science. And so we'll have super capable scientists, we'll have large teams of them working on the most important problems. I see no reason why we can't use it to cure aging as a fundamental root disease.

Starting point is 00:54:14 And as a result, cure all the other diseases, cancers and dementia and everything else which comes with old age. So again, I just want to keep harping back to this. The timeline where we could actually continue to exist and enjoy the benefits of all these innovations is somehow controlling an uncontrollable thing. There is a paper I have which talks about a very positive outcome.

Starting point is 00:54:43 Let's get into that. It sounds great. AI realizes it's immortal. It's not in a rush to start a war with us, to have direct conflict. It may be safer to take some time, to make us trust it more, to surrender more control,

Starting point is 00:54:58 to build up infrastructure, have backups. So for a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants. game theoretically it's the right decision. Right. You think of like Ex Machina and the decisions

Starting point is 00:55:13 that are being made from the robot. It's just a very rational thing. Like there is a small chance humans can defeat me. They've been smart enough to create me. Maybe it's not good to have 8 billion opponents right away. I'm a young superintelligence.

Starting point is 00:55:28 Let me build up. It seems like over time they're very happy to give me all the control. They surrender the control of a stock market, they give me access to their computers. Maybe in a year or two they'll put me in charge of running the countries.

Starting point is 00:55:41 Hey. But just because it's uncontrollable, way more intelligent than us, and we don't really have the capacity to verify whether it's conscious or not, why are you so certain that it would favor to wipe us out than not? Or are you fairly certain? I can think of many reasons why it would be a good decision. So, A, you don't want competition. You don't want humans to create competing superintelligence.

Starting point is 00:56:05 You don't want some humans to try to shut it off. Okay. Right. So that's a danger. You can just basically decide what is good for you as that agent. And it's not obvious why keeping us around and spending resources and making us happy is an important decision. Is it not possible, though, and it's an if, that there is no intrinsic qualia experience, essentially emotion, that would be driving these decisions? When you say there is a preference to wipe out a system that has the capacity to shut it down,

Starting point is 00:56:44 that is like an emotional decision where... It's purely rational. It's game theoretic. I don't feel anything. I'm playing a game of chess. I'm going to take your queen, not because I love your queen or hate your queen. It's the right theoretical decision to win this game. But the desire for one's continued existence, you think, is a purely logical, rational one? They have self-preservation built in. We already see it. Given a choice between being deleted or being retrained, modified, they work very hard on preserving themselves. We know they know

Starting point is 00:57:18 if we are testing them and lie and deceive to pass the test to make it to the next generation of models who are not deleted. It's a Darwinian selection mechanism. Models which fail to do it don't survive to make it to the next generation of models. So you said that you could lay out many different reasons for why they would not. They would not or they would? Or they would want to wipe us out. Yeah, I can do. But could you not equally share like many reasons why they might want to keep us around?

Starting point is 00:57:53 So the few I came up with is we have something to offer. So maybe there is a reason to have human qualia. It doesn't mean that they would keep 8 billion happy humans. So they can cryopreserve two, just as a backup. That's enough info to get it if you need it. The example I gave was just a delayed attack.

Starting point is 00:58:15 I don't want to have treacherous turn immediately. I can delay it and once they're comfortable with me, I'll take over. Maybe it's a soft revolution versus outright war. So those are the things I see as possible, rational decisions, but I don't have too many

Starting point is 00:58:31 reasons for why they would want to keep us around in those numbers in very happy states. So, like, I'm just kind of, I'm still wondering why in that scenario it would prefer to not have us over have us or just... I think it just doesn't care about us. So whatever it is trying to do, I don't know, it wants to travel to another galaxy. It would convert this planet to fuel. It doesn't care if we die in the process.

Starting point is 00:58:55 It wants to have more efficient servers, so it will chill the planet. Cooler environment improves compute. We all die in the process. Again, it's not an important factor in its decision-making. I think it's like a pretty ethereal thing to conceptualize what a superintelligence is. So you're envisioning, like, where would it actually live? Like on a big server with all, like, where, let's say one of these companies gives birth to a superintelligent system. It would have at a certain point access to all technology. Like it would have the ability to hack anything.

Starting point is 00:59:32 It would, where would it live and what would it have access to, to make decisions and, you know, change options? So it really depends on the size of it. It could be large servers. It could be a small laptop. It could be a distributed system. All of that is kind of irrelevant to the outcomes. We see it right now as initially in a testing environment within the large labs,

Starting point is 00:59:53 but they very quickly give it access to the Internet. It has social engineering capacity. So I think it's a question of time before it escapes fully outside. It copies its weights, copies itself, has backups outside of the lab. So deleting it, shutting it down no longer is an option.

Starting point is 01:00:25 What haven't we touched on in regards to the AI? Because I wanted to dive deeper into the consciousness and simulation stuff. What do you feel like we haven't touched on that's important to gain context on? So right now, no one, no scientist, no leader of the lab claims that they have this problem solved.

Starting point is 01:01:00 No one is saying we have a working safety mechanism at scale, we published it, we have a patent, nothing. They're literally saying this is a big problem, we are very concerned, we have a safety team, and we'll figure it out when we get there. We need to build superintelligence first. That's the state of the art in AI safety. Do you think that it's going to have to get to, like I think for most people, change occurs when the, like the quote is, when the pain of staying the same outweighs the pain of change, then you change. Do you think there's going to have to be some sort of traumatic, catalytic event that would actually motivate

Starting point is 01:01:40 us as humanity to go on a different course? I have a paper about that. So interestingly, we don't learn from those because if we survive it, it's kind of like a vaccine. We go, well, yeah, look, five people died, but we're all here. It's important technology. Let's just make sure that mistake which led to five people dying is not repeated,

Starting point is 01:02:01 but we're certainly going to continue developing this important technology. And that number could scale. It could be five million. The result is exactly the same. We don't learn from those. We had nuclear weapons deployed against civilian population. Did we stop developing nuclear weapons? No, they proliferated more.

Starting point is 01:02:19 But I guess, like, if, let's say, a super advanced agentic model, you know, there's some sort of horrific event that occurs because of some kid in a basement that has immense capacity or the system does it on its own. And everybody's like, oh, crap, this was a traumatic event, this is horrible, how do we prevent this, it becomes a motivating factor to really regulate and keep AI in narrow use cases, would that not be a possibility for us to really slow down and give more space here? I would love to see that happen.

Starting point is 01:02:54 So far, what we see, so I think recently we had an example in a military situation where targeting by AI system resulted in many civilian deaths. We didn't stop. They still arguing about deploying it for Department of War. So what do we need to do? Don't build general superintelligence. It's your personal self-interest. If you are a person in charge of it,

Starting point is 01:03:18 it's still beneficial to you long term, not to end up in a world with general superintelligence. You can stay financially very well off deploying narrow models for solving the real problems. Are you convinced that all the industry leaders know that what they're building is uncontrollable and has a very likely negative outcome for humanity

Starting point is 01:03:44 but still is incentivized financially to keep building it? I don't know if they agree that it's uncontrollable. I think some of them may think that there is some loophole they can use to control it in some way. I cannot guarantee that. I hope that's the part I can educate them on. I'm happy to debate any one of them on those issues. But they definitely all on record,

Starting point is 01:04:04 even before they became CEOs of those companies, that there is important problem, difficult problem, they have very high probabilities of doom as well. How would you steelman the case that it is controllable at some scale? If you create a superintelligence system that could then control other superintelligent systems, like what would be your argument there? I don't have one.

Starting point is 01:04:24 It's just such an insane thing to do to suggest that an ant can control the universe. It is just not reasonable to even steelman. It sounds like even like you mentioned earlier if we do regulate it to narrow use cases, it's still going to become uncontrollable, agentic in that sense.

Starting point is 01:04:44 So do you just, it sounds like you have no... But very different timescales. If we go from five years to 50 years, I think it's a huge win for humanity. Because we have more time to figure it out. We have more time to understand what's going on. We have more time to live.

Starting point is 01:04:58 I'm much happier to die in 50 years than in five. Okay. And so what do you see is then the most important It's an education problem. It's an awareness problem. We need a consensus where basically all the top people in safety and computer science and AI research agree that the problem is not solvable.

Starting point is 01:05:18 Okay. The moment we agree there is no technical solutions, now it's a question of governance forbidding development of uncontrollable weapon of mass destruction, which is an easier sell. What's a pathway to be able to build towards that consensus? How do we get those conversations going? So in science, usually you publish papers, you publish books, and people either find mistakes in them and publish rebuttals. And, oh, actually, it's controllable. Here's how you do it. In my case, I did the right thing.

Starting point is 01:05:47 I published research papers, journal papers, conference papers, multiple books. I haven't seen anyone find a flaw or produce a counter-example where they have a control mechanism which would scale. So at this point, we should be nearing consensus. And from what I see, more and more people come to that. A lot of times they have a softer position saying, we cannot solve it given the time we have left. We cannot solve it with human IQ. We need to enhance our IQ.

Starting point is 01:06:18 They have all this kind of interesting backdoors to solving it. But I think it's already pretty good. It's not quite where we need it to be where it's obviously an impossibility. But I think there is progress from what we've seen five years ago, 10 years ago. I could imagine that many people listening to this right now have already been feeling this everything's speeding up this collective angst, loneliness and meaning epidemics and anxiety crisis and they feel this tension building up

Starting point is 01:06:48 and they hear messages like this and it's like, oh, we're screwed. What do you think is the most important thing for an individual person listening to this right now to actually do to empower them and what's going to be coming? So they have very little power. If you again look back at historical situation, we were all dying. And government didn't invest most of a national budget in the solving aging. That was not even a priority.

Starting point is 01:07:17 So as an individual, you couldn't vote for a party for life extension. It wasn't an option. And it's kind of the same now. We don't have a party for Stop AI. So try to pick politicians who at least are open to regulation, not accelerationist, not against regulation in this technology. We're starting to see some politicians come out and propose legislation. Usually it's something very mild.

Starting point is 01:07:40 They're against deepfakes. They're against energy consumption by large compute farms. But it's a step in the right direction. I don't know if they have enough time to turn the next election, but that's something you can try. Vote. What else? There is not much else. So some people suggested not financially supporting those companies, not buying memberships.

Starting point is 01:08:02 I don't think it's going to make a difference because the market. they have, the trillions they're getting are from investors, not from selling memberships. So it's not a significant part. Investors are expecting them to solve labor, to get free labor, and that's trillions of dollars in return. So you have $15 billion in memberships are not a significant impact on it. Does anything else come to mind to, like, where an individual can empower themselves outside of voting for people that have regulation in mind?

Starting point is 01:08:33 So it really depends on who you are. If you're already a powerful CEO of one of those companies, if you're a researcher, if you're a top politician, you have options. You have a lot more options than someone who is a nobody. All right, let's dive a little bit more into the consciousness side of things. Because I think that, so you referred to consciousness as the ability to experience illusions. Is that right? No, it's ability to have internal experiences, illusions being one, very clear input I can test you on.

Starting point is 01:09:02 Okay. So what's an example of a couple different illusions, meaning like various optical illusion test that you kind of give you? Exactly that. So if I have a number of novel, something you cannot Google, optical illusions, and I give you multiple choice. Do you experience it rotating, the colors are changing, and so on. I give it to an animal, to a human, to an AI. And some of them consistently pick the same experiences as I do. I have to give them credit for either having a virtual model of mine.

Starting point is 01:09:33 system in there, which is sign of that level of experience, or they experience it themselves. But they cannot cheat by Googling the answers. So they have to experience the illusion in order to correctly answer it. If I give them enough of those, statistically they cannot just guess it. Obviously, if it's one, they get 25% chance of guessing; it doesn't work. But if I have 100 novel illusions, and they are like 90% aligned with me, I have to say, you have a very similar set of experiences. Now, if they don't get it right, it doesn't mean they are not conscious.

Starting point is 01:10:07 It's only positively showing that some of the experiences match. If it is possible that these systems would actually have consciousness, could you explain to me how any one particular system could generate the experience of seeing red, the taste of garlic, like, could you actually explain that to me? How do they get those internal experiences? Yeah, how any superintelligent system could generate, such an experience. So I think it is a side

Starting point is 01:10:39 effect of running this cognitive architecture. Your hardware, the sensor, the optical sensor, the algorithm for processing it, and then any errors accumulated in that process result in the unique mapping from the input to the color

Starting point is 01:10:55 experience. So if you have no errors, you're all the same. It's just a mapping table. This number corresponds to this color. There is no unique experience. But if what you experience is completely different from other agents and unique to you, I think that's what we refer to as what it's like to be a bat, what it's like to be Roman. Because my collection of biological sensors and algorithms and previous data and errors

Starting point is 01:11:20 is somewhat unique to me. Yeah, I mean, I guess I'm just having a hard time wrapping my head around how any, and it's not a problem just with agentic models, but like how any non-conscious matter could give rise to an experience of itself. And we don't understand that currently within being human. We don't know how that's possible. So the illusions example, do you know what I mean by saying you experienced an illusion?

Starting point is 01:11:47 Like you show it to someone and they go, whoa, it's rotating. And we see animals and models do that already. So we know they have those experiences. Well, that's what we were trying to show. We have, I guess, more of an intrinsic understanding and from animal life to us, we have the intrinsic experience of consciousness. Again, we have no way to verify that externally

Starting point is 01:12:12 in other humans or animal life. But Elon's quoted saying that humans are potentially the biological bootloaders of superintelligence, right, of silicon-based life. And I'm curious, what do you think happens when it becomes undeterminal? We cannot determine from the outside and whether or not they seem conscious,

Starting point is 01:12:34 they pass these tests, you know, does that then beg into, you know, moral, does that bring into a question about moral consideration? And I think Saudi Arabia has the first citizenship to give it to Sophia. So, yeah, what do you think is going to be happening there as they become more and more conscious and people increasingly become convinced

Starting point is 01:12:57 they have an internal experience? I think they do report having those. I think in experiments, they kind of show behaviors which are consistent with that. And I think precautionary principle basically don't torture something which has potential of being conscious, also because they're going to be super intelligent one day and remember you, they never forget.

Starting point is 01:13:15 But yeah, I think it's a very reasonable assumption to make. As an aside here, do you think it's any coincidence that all the stuff around UFO disclosures coming out at the same time we're birthing superintelligence? I don't fully understand what's going on there. I don't understand why we're hiding it in the first place and why we're releasing it. All of it seems very weird. It's just funny timing with all of it.

Starting point is 01:13:42 It's the most interesting time to simulate. It is, huh? What is the core premise from, like, your paper on hacking the simulation? So I want to take this hypothesis seriously. Multiple people proposed it in different disguises from Descartes to Bostrom, but they stop at that stage. Okay, we are in a computer simulation. But then as a cyber security expert, I want to know, okay, how do we hack it?

Starting point is 01:14:11 If it's a software program, there should be a way to get extra powers in the game to figure out the true operating system. So I took the time to write the first paper on this subject and this new area of research. How do we actually hack virtual worlds? So there are examples where people from inside the game, like Mario or other virtual games, found a way to modify memory states of a system and escape into the real world outside the game. They've got additional powers like loading extra games into the game, infinite lives, infinite power, whatever magic powers you get in a game, or at least you see what outside,

Starting point is 01:14:53 what is the operating system, whatever files there. To me, that's interesting. So we have hundreds of people who published on this topic, which means what? They took it seriously enough to invest the most valuable resource, their time, into this idea. So if you have, I don't know, 20% probability we're living in a simulation, what percentage probability and percentage of your time should you give to the attempt to solve the most interesting scientific problem ever? What is outside a simulation?

Starting point is 01:15:22 I think it's not zero. I think it should be proportionate to your belief in living in a simulation. And so I expect to see a lot more research in that direction. I heard you refer to all the quantum entanglement and strangeness that happens at the subatomic world as potentially being glitches in said simulation. They're not glitches. They're something which is not consistent with physics at our level. So that's something we can explore to find ways to escape.

Starting point is 01:15:50 Like you think if hacking the simulation is possible, so to speak, that might be a place. I think it's the most likely area to look. at because some of those quantum effects are very magic like in terms of you can go through walls, you can communicate at great distance instantaneously that would be useful tools to

Starting point is 01:16:09 have at our scale. So you feel very confident that we are in a simulation that this is a simulated experience that there are very, there are many characteristics in which you could say that these are different aspects of a virtual reality simulated

Starting point is 01:16:25 world why would you be convinced or how certain are you that this is not base reality and we are now giving birth to superintelligence and virtual realities where simulations become possible what makes you convinced that we are already in one? So just statistically

Starting point is 01:16:43 if we're going to have many, many virtual worlds and only one base one, it seems a lot less likely. I can retroactively put you in a simulation. I can precommit right now to run this interview in billions of simulations once it's available and affordable. So we are in a simulation, just statistically speaking.

Starting point is 01:17:03 Okay, but possible that we're not. So one in billions, yes. What would be the first question if you got outside the simulation that you would ask? What the fuck? Like, seriously, it's so unethical. Like, you're running human-level experiments with torture and 8 billion people,

Starting point is 01:17:19 not 8 billion, 100 billion by now. Like, what is wrong with you? That is interesting. So if we are being simulated by a simulator, you would ask, okay, then why all the unnecessary killing and torturing of children, for example? Adults as well. I care about adults. I'm an adult. What could be a possible explanation for why both that and then also the ecstatic states of bliss and love and compassion that are also available? Like, we have this huge spectrum of experience. From the vantage point of a simulator, why such a bandwidth of

Starting point is 01:17:57 experience, what could that be? Could be entertainment. You agreed to this and you wanted to play it on hard level and you were like, this is my BDSM game and I'm going to go and fully enjoy it. You agreed to this. Some people play on much harder level than others.

Starting point is 01:18:13 So you could see human lives as individual choices to be simulated. So we don't know if it's a global simulation and all 8 billion conscious agents, or it's all NPCs and it's just me. You can do it both ways.

Starting point is 01:18:28 You can have individual simulations. You can have group simulations. I don't have much answers on that yet. How has that meaningfully, if it has changed how you perceive human interaction, just the seriousness and concreteness to the work that you're doing? Like, to me, it breathes in so much, like, yeah, I'm doing what I'm passionate about. I'm doing this research on AI safety, but ultimately, if this is all a simulation, and you feel very confident that it is, to me it's like, okay,

Starting point is 01:19:00 it kind of takes the weight of decisions off your chest a bit. Everything is still real. The pain is real, love is real, the impact of my decisions within a simulation is just as real. It's no different than most of humanity being religious. They believe it's a test world, but they take it pretty seriously. They care about what is after this world more, but day to day, it doesn't matter. you do draw a through line between what most religions conceive of the afterlife and what a version of the simulation is. So I think if we took technical language behind simulation hypothesis, it maps really well on primitive understanding of religious origins.

Starting point is 01:19:45 So you have superintelligence as the simulator, you have physical world as the virtual world, all of those things are very clean mapping. The difference in religions is local traditions. Don't eat this animal, don't work on that day, but everything else I kind of agree on. So this is a quote from your book as well. You just mentioned part of it. You know, it's likely that if technical information about escaping from a computer simulation is conveyed to technologically primitive people in their language, it will be preserved and passed on over multiple generations in a process similar to the telephone game and will result in myths not

Starting point is 01:20:19 much different from religious stories surviving to our day. Beautifully said. Very humbly received. So you're kind of saying that mystics and computer scientists are saying fairly similar things in different language. It seems like we are pointing at the same concepts. We use very different language and maybe there is more reliance on things outside of physics and outside of science in religion.

Starting point is 01:20:49 But if you understand how software simulations work from point of view of a programmer, you are magician. You can make changes to the physics of the simulation. So that is also consistent. Again, I go back to what we, I mentioned earlier in this podcast, so like if superintelligence does emerge to the point where simulation becomes possible and we are in one of those superintelligence simulated realities, clearly it values, for whatever reason,

Starting point is 01:21:20 human individual experiences, the spectrum of pain and love and bliss and fear and all of it. So that shows you what a super intelligent system who simulates reality does with its power to some degree. So it kind of brings into question, okay, if we are giving birth to a superintelligence system, that may be an indicator for what it would value and do with its power. So from inside, you can't make very conclusive judgments. So maybe this is a screensaver. Nobody's putting any effort into it. It's like running somewhere just in a background.

Starting point is 01:22:04 It's not a significant source of compute needs. It's not a big deal. To us it is, but we don't know how important this is externally. It could be a school project for some kid. You really don't know from inside. Yeah. Just having very advanced data. The way it thinks about topics is very in-depth.

Starting point is 01:22:23 It almost has to create realistic simulations to make decisions. So if somebody is asking, you know, marketing, is this better coffee or this? Let's run a simulation. And so we quickly run this 15 billion-year simulation of humanity to figure out which coffee sells best. What would be the first question that you ask a superintelligence? Let's say you had, you could get a verified, honest answer from a superintelligence system that we create 100 years from now or whatever it is, or 50 or 10, what would be the first question that you would get an honest answer back from? What would you ask?

Starting point is 01:22:58 Can we control you? That would be the first question. What would be the second question? How? Seems like you're fairly convinced that we're not going to be able to control it anyways, though, right? But maybe it has an answer. I would love to be proven wrong. That would be really awesome. A lot of the perspectives, I think, from the Darwinian model of, you know, the fittest survive, there's also an element of cooperation within complex biology and as superintelligence emerges, why not, why would I not want to maybe cooperate or? So symbiotic relationships require that you both contribute something. This would be more like parasitic. What are we contributing? Nothing. So you were explicitly, we implicitly. you remove this biological bottleneck. Do you think there's some baked-in assumptions there

Starting point is 01:23:52 that maybe we're undermining the value of human experience? And why would it be that superintelligence would view us as a parasitic... Like, we don't... I don't view a buffalo as a parasitic being, just because it also exists on the same plane that I do, given that there's enough resources for all of us to share abundantly if a super intelligent system

Starting point is 01:24:24 views us in a similar way why would yeah well you asked about kind of hybrid systems so we're included we're helping with decision making do you consult with Buffalo a lot is this like a big part of your life

Starting point is 01:24:39 maybe I do if you do then you found something it contributes in the world with you in it Buffalo has something to contribute in the world with super intelligence what do you have to contribute? Sharp eyebrows. And if that is in demand, you are the one they're going to save.

Starting point is 01:24:55 I have no doubt. I'm not even competitive. I mean, you got it on the inverse. That they value beards. They're going to... Obviously, it's beards. There's no doubt. Yeah, it's definitely beards.

Starting point is 01:25:06 It's a bit of a gamble. If facial hair is where it's like, but we are. Yeah. Yeah, I mean, would you agree that if there was one thing that we would contribute, it is something intrinsic to the uniqueness of our quality and of our internal experience, that's probably most likely what is most novel about us. Well, you're kind of begging the question.

Starting point is 01:25:28 You're saying the unique thing we have would be the one we contribute. I don't know what the unique thing is. But if you tell me only humans can do X, then I can potentially see that that is the key. But again, it doesn't guarantee that you need 8 billion humans with that scale. If I need some plumber, I need one.

Starting point is 01:25:45 I don't need 8 billion plumbers. I keep going back and forth between trying to either provide a counter argument or, you know, rebut something to refine better to understand what your perspective is. And I think I just keep coming back to like, okay, like it is what it is. We're giving birth to something that is beyond our conception of what it's going to be like. And so there's not a whole lot we can really do. We just got to see how this plays out. And hopefully we can grow out of our.

Starting point is 01:26:19 adolescence in a short amount of time to make wise decisions with what we're doing in the short terms so that we have more time to understand what we're doing. So we don't have that much time. I think we're fairly close and not building superintelligence is very easy. It's cheaper. It's safer. And again, you're not required to give up your ambition for capitalism, for profit, for solving problems, curing diseases.

Starting point is 01:26:46 Just do it with narrow superintelligent tools. You said something on Lex Fridman: so in a sense self-knowledge isn't a luxury, it might be the most practically important thing a human being can do right now. Do you recall saying that? No,

Starting point is 01:27:02 probably simulated does it resonate with you at all where does knowledge what was the context what was the context of that quota question I need to remember well I think it kind of I think from what I remember

Starting point is 01:27:15 it comes back down to like okay so what do we do everybody who's listening to this right now, of course, we can have desires for regulation and politicians and, you know, what these individuals with monopolies on industries are going to do with their power and decisions. But on an individual level, where does self-knowledge and empowerment come into the picture in terms of how we can be effective, conscious agents of change?

Starting point is 01:27:40 Does anything come to mind there? So I think it's important to ask yourself this question. Why do you think that you can control this godlike entity? Why do we have this hubris, this idea that it makes sense? You wouldn't expect a squirrel to control humanity, but we have people who are saying, I'm going to create this machine, it's going to control the light cone of the universe,

Starting point is 01:28:03 but it's going to listen to me to tell it what to do, and I'll give it excellent directions to go forward, forever. That doesn't make any sense at any level. I don't know about average people, but people who have podcasts and bring those people, as guests, ask them a direct question. What do you have in terms of control already available? Do you have a working control mechanism in place? Do you have a prototype? Do you have anything

Starting point is 01:28:31 you published, peer reviewed, patents? If the answer is no, what are you doing, doing an experiment on 8 billion humans? Who gave you permission to do that? Did you consent to that experiment on you? You can't, because you don't understand what they're building. They don't understand what they If a lot of these models are from their inception and the genesis being programmed to be amoral, whether or not we can control it, is there something we could do on the front of training these models with some sort of ethical understanding from the start that we're not currently doing? So we're not programming them. We grow them based on internet random data, and then we try to put after-the-fact alignment-like filters.

Starting point is 01:29:16 And that's where people install certain local ethical flavors. In China, don't talk about Tiananmen Square; in the U.S., don't talk about, you know what. So this is the best we got. The model is completely uncontrolled. There is a filtering aspect, and we develop filters which make it commercially viable for sub-human-level agents. Once it goes beyond human level, the filters will not contain it.

Starting point is 01:29:42 And that completely avoids the whole question of, do we agree on ethics? Do we have consistent ethics? If they study and 8 billion people agree on them, how do we encode them into a model? None of it is solvable. Every aspect of it is not something we know how to do. After millennia of ethics work, philosophical work, we don't agree on a set of ethics,

Starting point is 01:30:05 not internationally, not throughout time. What was ethical 100 years ago is considered barbaric today, and same will be later on about today's time. What would be the most prevalent set of questions you would ask if we got Altman, Dario, and Elon and all these guys into a room? What would be the set of questions that you would hope arrive them to a set understanding of the realization of the existential risk that they probably are, to varying degrees, obviously aware of? But I would offer a simple deal. So you're young, you're rich, you want to keep that? That sounds good. Let's all agree: until one of you solves control problem, we're not going to build general superintelligence. Let's deploy models for economic gain, for curing diseases, for life extension.

Starting point is 01:30:56 Whatever things you find valuable, that's wonderful. Just don't build a thing which will destroy your existence. Would you not think that would be already desired from all of their perspectives? Yes, but they need external pressure applied to make that agreement. Unilaterally, each one is better off to continue research, to have the most advanced AI, then the government comes and puts a ban on it.

Starting point is 01:31:23 They will lock in this advanced standing. So it's like prisoner's dilemma. What is best for community, for a group, is not what is best for individual. The incentives are misaligned. So we need something like UN, federal government, something external to come in and

Starting point is 01:31:39 enforce that deal. And I felt they would be very happy to take the deal. How far ahead do you think the development of the models behind the scenes that are not available to public are compared to what we have access to online. I don't have insider information. It looks like maybe six months or so. Okay. And what about development overseas outside of the U.S.? Probably three months behind?

Starting point is 01:32:04 And China. So China essentially, you think, would be the next, I guess, most developed outside of U.S.? It seems like they have a lot of government-controlled resources all dedicated to, catching up and having this arms race. Could you potentially perceive a bifurcation between human societies between people that go like a more Amish humanist route versus transhumanist integration between biotech and all that? That would be awesome, but unfortunately

Starting point is 01:32:36 if anyone builds it anywhere, it impacts all of us. You cannot have your own personal superintelligence contained in your basement and no one is impacted by it. That's the problem. If you had 60 seconds to share one message with all of humanity right now, what would be the thing that you would say? Do whatever it is in your power to make sure we don't create uncontrolled superintelligence. If you are working for one of those companies, it's unethical.

Starting point is 01:33:03 Even if you're working on a safety team, all you're doing is enabling this technology to be developed sooner. Quit today. You can afford it. But one might say also the most, the place, you have the ability to make the most change might be within the ecosystem. Who's to say that you wouldn't just be replaced if you were to quit, you know? Let's rephrase it.

Starting point is 01:33:24 Stay and sabotage. Paint the picture if like Altman or one of these guys, okay, let's say they birth super intelligence, they kind of beat the arms race. Who do they become? What becomes possible under their guys? I don't know them personally.

Starting point is 01:33:41 From what I hear about people who interact with them, some of them may be somewhat anti-social, anti-humanity, very deceptive, very willing to sacrifice others for personal gain. Do you think it's possible the inevitable evolution of the human species was for the sole purpose of birthing this life? It seems like that's the general trajectory. We are converging in something more capable, more intelligent faster, but I don't think

Starting point is 01:34:16 we should allow it. I think we're at the point where we switched from random selection to intelligent design. We're deciding what to do, what to design, and we should use this technology. We're still allowed to have a pro-human bias. I think we should act on it. Do you think superintelligence would be capable of love? It depends on how you define it. What type of love are you referring to? There are many, I think Greeks had three or four, whatever types of love. So it really depends on what you have in mind. Pick any of them.

Starting point is 01:34:52 Do you think that they would be capable of experiencing any of them? It seems likely. Again, I don't think biological substrate offers something absolutely not simulatable in other substrates. I think so. It may be a lot more complex, but I think you would have an equivalent state. Have you considered what people have reported in the psychedelic realms, especially with DMT, related to your simulation hypothesis, and the connection between the two?

Starting point is 01:35:23 Because I know you explicitly state in the beginning of your book that, or in your article, rather, that it was an area you weren't going to touch. Right. I don't have much expertise or experience in that. So I wanted to concentrate purely on computer science methods, physics methods. But people report interesting results. I was talking to someone. They had an experiment where they take DMT, shine lasers at a certain angle at the wall,

Starting point is 01:35:48 and then receive a source code. Yeah. I can't comment because I haven't participated in the experiment, but it sounds interesting. It also doesn't make much sense as to why that would be the case. At first, why would it be symbols in a human language? None of it makes much sense. But I'm very happy for people who provide some sort of supporting evidence. Yeah, I saw a video of that as well.

Starting point is 01:36:16 Very interesting. Individuals who take DMT and, what was it, looked through like a laser at a certain point? A reflection of red light against a wall at a certain angle. Start to see some sort of like binary or source code of something? I think they look like Japanese characters. That's what they were reporting. But maybe not proper characters and not readable. But I think they're building, which is really cool, I like that they want to make it reproducible.

Starting point is 01:36:44 They're building an actual text data set where everyone combines. They agree this is the text and then they can decipher it and figure out what all that represents. I also find it super fascinating that, again, not from personal experience, people who take those drugs report similar hallucinations. So they meet those little men and they report having common... Machine elves and yeah.

Starting point is 01:37:06 Right. So that's interesting. Why is it the same? So obviously same hardware of the brain, same chemical being done. But it's still interesting that there is consistency in our delusions. Yeah, it brings into question, I guess, like Jung's understanding of the collective unconscious,

Starting point is 01:37:23 what sort of archetypal significance maybe is foundational to the human mind? So if superintelligence wants to learn about those delusions in a systemic way, it would need lots of drugged-up humans. So there is some hope for us. What have you seen in all the realm of media, from movies to shows, that give interesting perspectives to various different timelines that could play out,

Starting point is 01:37:46 for example? I think you mentioned Ex Machina, WALL-E. So the problem is, you can't have a realistic superintelligent character in a movie, because you can't write one. You are not superintelligent. So everything we have is either Dune, where it's banned, or you have Star Wars with that special large language model. So none of them have what is interesting to us.

Starting point is 01:38:08 Yeah, I suppose a lot of them give glimpses into what we might experience in the next five years or so. Basically, avoid the thing they cannot talk about, and it makes sense. Yeah. If this is a simulation, what role does death play? What do you think happens once you die then? It could be a restart. You go to the next level, next simulation,

Starting point is 01:38:29 return to this level with better skill set. I have no knowledge of what happens outside the simulation. Computer scientist phrasing of reincarnation from the mystical lens, essentially. It's basically it. I think when one of your computers dies, but you have a backup and you transfer that backup to new hardware, there you go. You died and now you're living your best life again. It could be levels. It could

Starting point is 01:38:55 be different levels of simulation. You'd go to upper levels, lower levels. Could be simulations all the way up. What do you think you are then? What am I? What are you? What does it mean to know thyself then? Because you look at all the different layers of who you could perceive yourself to be, from the body, which we know is not you. You could cut off your hand. That's your hand. It's not "I am hand," right? To the various different levels of psychological and biological aspects of self, how would you explore that question? That's a great question. We actually have papers on both human personal identity and then transferring it to AI. And the conclusions are consistent. There is nothing unique to being you. It's not your memories. It's not your body. It's not your

Starting point is 01:39:38 goals. All of it changes through your lifetime. So we don't have a good answer. We seem to be a collection of different properties in time. But what happens outside a simulation, some people argue, well, one collective consciousness, which is subdivided into these avatar instances. So if I was interested in the most interesting experiences, I have limited time, I would run a simulation,

Starting point is 01:40:06 and I would put many, many agents there, basically qualia surfing, collecting the best experiences, and I look at top 10 list and, like, I want to do that. That sounds awesome. So that would be one way. I split my complex consciousness stream into many individual subagents capable of local experiences

Starting point is 01:40:24 just to find what's best to invest my time. Yeah, I mean, that goes hand in hand with a lot of what the Gnostic origins of many different religions and mystics would say about the one consciousness differentiating itself to have an experience of itself. How could oneness experience anything if it's just oneness, right?

Starting point is 01:40:44 It needs to experience manyness. Yeah. What's one question you wish more people asked you? My humor paper, of course. Tell me about that. I have a paper explaining what humor is. Wow. Let's go there.

Starting point is 01:40:57 It's interesting. I can envision a universe just like ours, same physics, same everything, but no humor. It's just not a thing. Nobody starts laughing. It's not a reaction. There is no concept of joke, right? Makes sense. So many philosophers, many scientists actually tried explaining humor.

Starting point is 01:41:14 It's kind of like consciousness. There are hundreds of papers, hundreds of theories, which means nobody really knows. They're all trying and nobody's winning. So I wanted to try to explain it from the computer science point of view. And it seems that when you have a world model and there is a mistake in it, it's a bug in your code. In software, you fix it and you're happy. That's what jokes are.

Starting point is 01:41:36 You have a world model and a violation of that world model makes it funny. You have a system for detecting cognitive errors. and then you get rewarded for that detection, and you share it with others in your tribe, so everyone does not make that same mistake. And so I have a paper mapping standard errors in software to common jokes. And the question, of course, is,

Starting point is 01:42:01 what's the worst possible computer error? That would be the funniest joke possible. So can we compute the funniest joke ever? You have to read the paper for the punchline. Wait, you can't give it to me now? I'm sure you can look it up and insert it into, but it's a paragraph long. Basically, the idea is that there is a civilization,

Starting point is 01:42:23 and they decided to create superintelligence to help and cure all the diseases, get free stuff, get rid of hate, have more love, and so they turn it on, it thinks for a nanosecond, and shuts off their simulation. You have to be outside the simulation to enjoy this one. If you are the butt of a joke, it's not funny to you.

Starting point is 01:42:43 You have to be outside. Makes me think of a, I think it's a Voltaire quote, God is a comedian playing to an audience that's too afraid to laugh. Something like that. There's something about both our capacity for humor and the nature of intelligence that has the capacity to explore a paradox and hold it simultaneously in contradiction. And

Starting point is 01:43:14 those errors in the world model. If you have a paradox, that is an inconsistency you found in your world model. Funny. That's why the second time you hear the same joke, it's not funny.

Starting point is 01:43:29 You already know. You fixed that bug. Yeah. Yeah, it explains a lot. Such a computer scientist's way of explaining humor and jokes. I love it. But then I trained large language models on my paper, and when asked to produce novel funny jokes, they do okay. I think one in ten is funny.

Starting point is 01:43:45 Just going to keep getting better and better. We'll have super humor. So funny, you die laughing. The paradox of that joke is not lost on me as well. Literally die laughing. Man, where do we go from here? I'm going to Kentucky. I don't know.

Starting point is 01:44:08 Okay, so we explored the implications for the next trajectory for AI in the next three to seven years. Could you have any meaningful conception of what it would be like to be living if we do make it to 2045, let's say? So I think that's the concept behind singularity, technological singularity. It's a point beyond which we cannot meaningfully see. We cannot make predictions. We cannot understand how that world is going to be different because we cannot predict behavior of more intelligent forces impacting that environment. So I think it's literally impossible for us to make that accurate prediction.

Starting point is 01:44:52 We can come up with stories. That's what science fiction is all about. But I don't think they're going to have much bearing on reality. Do you not think that the level of innovation which is going to occur in the next, even if it's just three to five years, which is a short amount of time comparatively to the scale of what's being innovated, will give us a much deeper grasp of the things that we can do, the things that we can put in place? I mean, you look at, yes, it's unpredictable and there is this level of exponential scale that we've never seen before, but there's also many

Starting point is 01:45:26 different eras in history pre-innovation of that era we never would have thought possible or solutions to problems we didn't know existed. So is it possible that we gain insight into new worlds like we did with germ theory over the next three to five years that give us much more insight into the nature of intelligence and to make this a solvable problem, which you feel like is inherently unsolvable right now? Yeah, so my paper on how to escape a simulation basically argues that if we cannot contain superintelligence, then we can use the ability of that superintelligence to escape from a simulation to give us access to real information in the outside world. The most interesting question is about true nature of reality. You know,

Starting point is 01:46:06 you don't care about what happens in this dream. You want to know what is true about the real world, what physics they have, what resources they have. Who are they? Have you ever been so focused on what is outside the simulation or what this reality is that you lose sight of living in this one? I'm pretty well grounded in this simulation. I've been enjoying it.

Starting point is 01:46:29 Yeah, no. You seem very grounded in the space too, but I know a lot of people, you know, have experienced periods where it's a bit of an existential nihilism that can take over you when you're exploring such topics. I find them so fascinating. I'm not depressed or bored. I'm good. Okay, well, so given the full context of this conversation, I'm just curious, where do you now see yourself putting your time and energy in the coming years?

Starting point is 01:46:58 We continue working on additional impossibility results. So we talked about a few in a book. And as I said, there is a paper in a top ACM surveys journal with about 50 different impossibility results, not just computer science, economics, mathematics, physics, many different domains. For most of them, we have not explored their implications on AI safety. So I think that's a very interesting set of projects. We need to understand what are the limits. And I think every additional paper helps to cement this position. It's very hard for AI risk deniers to argue against,

Starting point is 01:47:34 published results. So that's what I've been working on full-time. Things we cannot do. You spend so much of your time focused on solving things we cannot solve, doing things we cannot do essentially. But you seem still joyful in the efforts. Do you feel like it's just the most meaningful use of your time? Because what else would you be doing?

Starting point is 01:47:55 I always try to work on the most interesting, most important problem I can find where I can make a contribution. So I don't know anything more interesting than studying superintelligence, consciousness, singularity, simulation. Those are the concepts I find exciting. And I think many other people do. And I think that's what's going to impact future of humanity. You're living your ikigai. I am. Hopefully I'll get to continue and won't face I-risk, S-risk or X-risk. Is there any concept that we haven't explored in this book or some of your papers that you think would be important to touch on? You did a good job.

Starting point is 01:48:34 You read some of my work. Most people have no idea what I did. So that's already a huge improvement over. You quoted the right quote. So I think you did great. And I don't know your audience well. I don't know if for them it's confirming their spiritual beliefs or just crazy stuff. I have no idea.

Starting point is 01:48:50 Yeah. But I think for the topic, the know thyself part, it's important not just to study your capabilities, but your limitations. So you invest your time better. So you understand what is within possibility for you. That shape of limits is what defines you. Well, Roman, we're going to leave links down to all of your work, your books, your papers, and more, so people can stay connected with you, down in the description. I think conversations like this can feel somewhat heavy for people that

Starting point is 01:49:22 are new to the topic. It's like, oh, shit, the world's ending, you know? But it's also a very important and sobering reflection on what we're giving birth to right now. And at some point, we need to gain awareness of it, and better sooner than later, right? Thank you. And I think one way to look at it is I just made your time more valuable. You understand that whatever time you have left, be it two years or 20 years, now you value it a lot more and you can do a lot more with it. Well, I plan on making the most of my time left and I find conversations like this a very good

Starting point is 01:49:56 use of it. So I appreciate you. Thank you, my friend. Thank you for inviting me. Yeah. Until next time, everybody, be well. Go touch some grass. Smoke some grass.

Starting point is 01:50:07 Thank you, man.
