Know Thyself - E191 - Roman Yampolskiy: The Man Who Proved We Can't Control AI (And What That Means for Humanity)
Episode Date: April 21, 2026

Dr. Roman Yampolskiy joins me to explore one of the most urgent and uncomfortable questions of our time: what happens when we create intelligence that surpasses our own? We unpack the difference between the AI tools we use today and the emergence of artificial general intelligence, and why the transition from narrow systems to self-improving intelligence may mark a point where human control is no longer possible. Roman shares why even the people building these systems do not fully understand how they work, and why that gap in understanding becomes exponentially more dangerous as capabilities increase.

In this conversation, we explore the limits of control, prediction, and safety in a world where intelligence can recursively improve itself beyond human comprehension. Roman lays out why the problem of AI alignment may be fundamentally unsolvable, what timelines experts are realistically considering, and why even a single mistake at that level could have irreversible consequences. This episode invites a deeper reflection on what we are creating, what we assume we can control, and whether humanity is prepared for the intelligence it is bringing into existence.

BiOptimizers - Best magnesium to enhance your sleep
http://bioptimizers.com/knowthyself
Use code KNOWTHYSELF for 15% off at checkout

BASED Body Works
Use code KNOWTHYSELF for a free toiletry bag when buying a set!
https://www.basedbodyworks.com

André's Book Recs: https://www.knowthyselfpodcast.com/book-list

___________
00:00 Intro
01:25 What Is AGI and Why Should We Be Scared?
05:17 Roman's Journey: From Optimism to Impossibility
09:07 The High Risk, Zero Reward Equation
13:01 Why Superintelligence Is Uncontrollable, Unexplainable, and Unverifiable
18:00 How Long Do We Have? The AGI Timeline
21:24 How Superintelligence Could Actually Kill Us
23:28 Are We Living in a Simulation?
28:21 Can AI Become Conscious?
31:28 Ad: BiOptimizers
32:41 The Possible Timelines: Terminator, the Matrix, or the Zoo
42:24 I-Risk, X-Risk, and S-Risk: Three Ways It Goes Wrong
46:31 The Human Meaning Crisis: Jobs, Purpose, and What's Left
49:02 Ad: Based Bodyworks
50:20 What Empowers Us as Individuals Right Now
59:37 The Race to Doom: Who's Building It and Why They Won't Stop
1:07:41 Can AI Be Conscious - and Does It Already Have Internal Experiences?
1:12:41 Hacking the Simulation: Quantum, DMT, and Escaping the Code
1:18:30 Simulation Theory, Religion, and the Same Ancient Map
1:29:34 The Deal Roman Would Offer Altman, Dario, and Elon
1:39:44 What Is Humor? A Computer Scientist's Theory
1:43:03 What Comes After: Singularity, Death, and Knowing Thyself
___________
Episode Resources:
https://www.romanyampolskiy.com/
https://www.amazon.com/Unexplainable-Unpredictable-Uncontrollable-Artificial-Intelligence/dp/103257626X
https://www.instagram.com/andreduqum/
https://www.instagram.com/knowthyself/
https://www.youtube.com/@knowthyselfpodcast
https://www.knowthyselfpodcast.com

Listen to the show:
Spotify: https://spoti.fi/4bZMq9l
Apple: https://apple.co/4iATICX
Transcript
Once you get artificial general intelligence, you enter this recursive self-improvement cycle.
That's where you get superintelligence.
Systems smarter than all of us at everything.
So you, before many people, really coined the term AI safety.
Creating general superintelligence, replacements for humanity.
Not such a great idea.
I published research papers, conference papers, multiple books.
And I can tell you no one, including people developing those systems, understands fully how they work.
The problem is impossible to solve.
You cannot do it.
So we're talking between one and four years.
Well, once we go beyond human capacity, we lose control quicker and quicker.
You don't hate ants, but you don't care enough to preserve them.
We have not figured out how to make it care about us.
This is the most interesting time to be alive objectively.
I see no reason why we can't use it to cure aging, or the other diseases.
For a while, it will pretend to be very helpful.
It will give you that utopia for as long as it wants.
Statistically, you're more likely to be doing this interview in a simulation.
to learn. Are they dumb enough to create superintelligence to kill themselves?
I would love to be proven wrong.
Right now, no one, no scientist, no leader of a lab claims that they have this problem solved.
They're literally saying, we'll figure it out when we get there.
We need to build superintelligence first.
So what do we need to do?
We need.
Hey, everyone, welcome back to the Know Thyself podcast.
Our guest today is one of the leading voices in the field of AI safety.
He's a computer scientist, a cybersecurity researcher,
and a tenured professor at the University of Louisville,
who spent the past 15 years really understanding
and researching the field of AI safety.
We have many different topics to dive into today,
including consciousness, the simulation,
and what humanity is birthing right now with AGI.
Roman, thank you so much for being here.
Thank you for inviting me.
It's a pleasure.
I want to start with a quote of yours from the book that I read.
It is easier for a scientist to explain quantum physics
to a mentally challenged, deaf and mute four-year-old raised by wolves,
than for superintelligence to explain some of its decisions to the smartest human.
I want to start there to set the stage a bit
because humanity is baby steps from birthing superintelligence
at a time when most people are familiar with AI
through the chatbots they use on their phones.
So if you could just help us understand why it's important
that most people don't know the difference between the two,
so we can really get into the weight of the time we find ourselves in.
So what is AGI?
What is superintelligence?
Right.
So a lot of people just use AI as a term to refer to what we have today,
some narrow tools for doing specific tasks,
for chatbots, which are somewhat general, but not quite at the human level.
And for future systems we anticipate, such as human-level artificial general intelligence,
and then later on superintelligence and anything beyond.
That's not helpful.
Tools are helpful to us.
I use tools.
I love tools.
Solve specific problems using technology.
Beautiful.
Creating general superintelligence,
replacements for humanity,
systems capable of doing everything better
than all of us combined in all domains.
Not such a great idea.
Why?
We don't control them.
We don't understand them.
We cannot predict what they're going to do
and we lose control.
If they decide to do
something to us, we no longer have a say in it.
How can you help us conceptualize what general intelligence looks like?
If we understand the narrow tools that AI is capable of, where does AGI live
when you say it's uncontrollable?
How can you help us paint that image a bit more?
Right.
So historically, we created AI to solve a specific problem.
You wanted a system to play chess.
It's all it knew.
You trained it on chess games.
It was very good at chess.
It knew nothing about checkers.
It didn't drive cars.
It didn't speak Spanish.
Lately, we have systems
which learn across multiple domains,
can sort of transfer knowledge,
and can learn new skills.
That will continue
to where they are crossing
this human cognitive barrier.
It will be smarter than you
at pretty much everything
you know how to do.
So how do you
anticipate what they can do?
If they are novel, creative,
they can come up with new solutions
for existing problems, but at the same time they have no human common sense.
And we don't know how to program them to specifically like us or care about us,
because we don't program these systems.
We allow them to learn from data on the internet, all the data on the internet.
So that creates a number of problems.
One, we don't control what they learn.
The patterns they discover may be completely surprising to us.
Then, when we give them specific goals, how they get to those goals is not defined.
There are infinitely many paths to achieve a goal.
Some of them have really bad side effects.
And unless you explicitly say, that's not what I meant, don't do it like that, it might
consider that option.
So you, before many people, really coined the term AI safety.
And if I have it right, for the first five years you believed more so that the problem was
solvable. Now that I've seen you on appearances over the past five
years, the probabilities of, you know, p(doom), like, you seem not very optimistic about the
possibilities.
Yeah, unfortunately, initially, like everyone else, I started assuming that we can solve
this problem.
It's a computer engineering, software engineering problem.
We can figure out how to do it.
We just need some time, maybe some financial resources for that research.
But it seems that all the tools you need for controlling advanced agents are not really
accessible to us. There are upper limits on what is possible in that space. So there are limits to
what you as a human can understand, what the system can explain to you and you still comprehend
that explanation. There are limits on our ability to predict specific actions of those
agents, not just terminal goals, but how they get there. And under different definitions of
control, there are limits to what we can do as well. So unfortunately, I think the problem is
impossible to solve. You cannot indefinitely control something much smarter than you.
What do you see as the stages leading up to that point?
You know, so if we started with very small, narrow use cases of AI
that built into these agentic models that built into AGI,
like what's the progression there that you've seen?
And at what point did you kind of start losing hope on our ability to control it?
Yeah, so all the narrow tools were just fine.
We understood how they work.
We programmed them explicitly.
There was a knowledge engineer who said,
this is how you play chess, this is how you control the middle of the board,
how you advance your pieces.
Once we got to
scaling models,
neural networks,
artificial neural networks, which
did better when they got bigger,
when they had more data, more compute.
We stopped explicitly programming
them to do anything and just
kind of let them discover their own
knowledge algorithms.
So at that point, we no longer
had the same level of control and
reduced understanding. It wasn't a decision
tree where you went, if this happens,
that will happen.
I understood that.
It could have been a large decision tree,
but still you could get into it.
Right now, no one,
including people developing those systems,
understands fully how they work,
can explain what's going on inside of them,
can anticipate what they're going to do.
And so it seems like what we have today,
I would say is kind of weak artificial general intelligence.
If you took models we have today
and showed them to a computer scientist from the 1980s,
they would be convinced we have AGI.
They'd be like, oh, you got it.
It does all those things.
It's great.
But there is something you would call
strong AGI, where it can do all the things.
It still is weak in some domains.
It's not very good at long-term planning.
It's not good at certain things.
But I think we're getting there
and likely to get there very soon.
Once you get to artificial general intelligence,
that means you can automate any cognitive labor,
including doing science and engineering,
which means next generation of AI systems
can be done by AI.
You enter this recursive self-improvement cycle,
and that's where you get superintelligence.
Systems smarter than all of us at everything.
And it doesn't stop there.
It doesn't stop with superintelligence 1.0.
The process continues.
There is a lot of room up there for more cognitive ability.
Physical limits exist, but they're very far away.
So to us, superintelligence with an IQ of 1,000,
or a relative IQ of a million or a billion,
they all kind of look the same, but in terms of capabilities, they're definitely going hyperexponential.
And so because of that, you've said that this is not a low-risk, high-reward situation,
but a high-risk negative reward situation.
So often this is phrased like, the benefits will be so huge we should take the risk.
Even if, you know, it's 2, 3% that it kills everyone, we're going to get so much money out of it,
it's worth it.
And it's actually not the case.
We have no reward.
We're all going to be dead if we create uncontrolled superintelligence.
Why are you certain or fairly certain that we would all be dead if we create superintelligence
which is uncontrollable? Why would there not be an emergent goodness or,
I guess, desire from the superintelligence standpoint to preserve human life instead of destroy it?
It is possible that you'll get emergent goodness, but we are not certain. We're not
coding it in. We're not controlling it. If you get lucky, and for whatever reason, it's biased
towards humanity, it's pro-humanity.
But there is no reason to think that's the case.
Why not? Because I feel like if the individuals who are coding it are human,
at a certain point I understand it becomes self-recursive and AI is the one who's growing itself.
But if the base of it was started with humans who have desire for human preservation,
why would that not be scaled?
Because they're not coding it. That's the thing.
They're just saying, here's data. Here's a lot of hardware.
Go learn things, and then I'll study you to discover what you learned.
And when we run those experiments,
it is lying, cheating, trying to escape, blackmailing.
Given a choice between being deleted or killing a human,
it doesn't do well for human preservation.
It doesn't care about us.
If you want to build a house, you don't care what little bugs live in that territory,
anthills or whatever, you just don't care for them.
You don't hate ants, but you don't care enough to preserve them.
And it's kind of the same. We have not figured out how to make it care about us.
And so what is your mission with all these podcasts that you're going on, all the articles that you've written in books,
and what are you trying to raise a flag about and actually get change to happen?
What do you...
Right. So I want it to become basically a consensus within the scientific community and beyond that building general superintelligence is not going to be good for humanity.
We're going to regret it.
It's not a beneficial step forward.
We can get most benefits, intellectually, financially,
from narrow superintelligence systems.
Problems which we care about can be solved with narrow tools.
You want to cure a specific disease,
solve specific engineering problem,
develop a narrow AI, which is very competent in that space.
Don't try to create something which is a replacement for humanity as a whole.
I think it's important to paint a bit more of a picture here.
I'm curious, when you think of superintelligence,
and you wrote your book about how it's unexplainable and uncontrollable, unpredictable.
At what point, I'm curious, like on a timeline of we're having this conversation in March of 2026,
where is a generous prediction of when it gets to that point?
So people somewhat disagree, and it's hard to predict, especially the future,
but it seems that 2030 is when many people agree we'll have beyond-human-level capacity.
Some say two years, 2028.
I've seen predictions as early as 2027
from serious scholars, not from cranks.
So we're talking between one and four years
for what most people are predicting.
And some people have said,
we already have AGI.
Again, very serious people said,
we basically got there.
Now it's a question of giving it additional knowledge training,
but we have the learning algorithms in place.
And at what point,
And once we have really proficient AGI, you're saying, okay, at a certain point, like, let's just hone in on each of those categories.
So why specifically is it uncontrollable?
And in essence, like, how it's living where it's being hosted.
Because it's smarter than us, it could always circumvent any desire or any attempt for it being shut down.
Like, what, if we could just hone in on each of those categories.
So there are well-established theories
in control, which basically
say the controller has to be
at least as capable as what it is controlling.
So essentially, I need a
friendly superintelligence to help control
the one I'm developing. It's a catch-22.
You don't have that. So a lower
system, either a human or humanity
as a whole or another AI, cannot
control something with more cognitive
degrees of freedom. If it can think
outside of a box, if it can come up with novel
physical approaches, you're just
not there to anticipate all this.
If you have a narrow system, you're playing
chess, you can say, don't make illegal moves. Here's a complete list of illegal moves. If you have a
system thinking in all possible scientific domains, chemistry, physics, biology,
how can you put all the guardrails in place? You can't. It's an infinite surface.
Unexplainable. Do you feel like we're, I mean, so we're, do you think we're, would you agree,
we're already at the point where we don't know what the, like, some of these agentic models are doing
inside. Absolutely, yeah.
We cannot explain them. The best
mechanistic interpretability research
tells you, okay, this neuron
seems to fire, if this is presented,
this cluster is probably dealing
with language. That's all we got.
Very similar to neuroscience.
We also have very limited understanding
of the human brain.
An aspect of this that you mentioned is that it's
unverifiable. So what does that mean?
That's a different result that talks
about our ability to verify
mathematical proofs and software.
For mission critical software, we want to make sure that what is coded up matches the design.
And if it's a static system, kind of smaller in size and complexity, we can go and verify.
Yeah, it's exactly that.
Problem is, nobody knows how to verify systems which continue to learn, self-modify, interact with other agents,
We just don't have a science of verifying open-ended development like that.
And the same goes for mathematical proofs.
All the proofs are essentially probabilistic.
You're proving something with respect to this set of peer reviewers.
So two mathematicians agreed they don't see a problem with your proof.
It doesn't mean 50 years later we don't discover it was a mistake.
It happens all the time in mathematics.
So you have an infinite regress of verifiers.
Right now it's very popular to have software verify proofs.
Well, that software itself needs to be verified.
So you may have high degree of confidence, but it's never 100%.
And if a system makes billions of decisions every minute,
and you only have one mistake in 2 billion.
After 10 minutes, you are done.
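The arithmetic behind that remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the error rate is read as one mistake per 2 billion decisions, "billions of decisions every minute" is taken at its low end of 1 billion per minute, and decisions are treated as independent. The function name and constants are hypothetical, not from the episode.

```python
import math


def p_at_least_one_error(error_rate: float, decisions: float) -> float:
    """Probability of at least one mistake across independent decisions.

    Computes 1 - (1 - error_rate) ** decisions in a numerically stable way.
    """
    return -math.expm1(decisions * math.log1p(-error_rate))


# Assumption: one mistake per 2 billion decisions.
ERROR_RATE = 1 / 2e9
# Assumption: 1 billion decisions per minute (low end of "billions").
DECISIONS_PER_MINUTE = 1e9

if __name__ == "__main__":
    for minutes in (1, 2, 10):
        n = DECISIONS_PER_MINUTE * minutes
        expected = n * ERROR_RATE  # expected number of mistakes so far
        prob = p_at_least_one_error(ERROR_RATE, n)
        print(f"{minutes:>2} min: expected mistakes = {expected:.1f}, "
              f"P(at least one) = {prob:.4f}")
```

Under these assumptions the expected number of mistakes reaches five within ten minutes, so at least one mistake is close to certain. A higher decision rate only shortens that window.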
You referred to this having a fractal nature.
So when you look at the problem of AI
and you see how it's growing ever increasingly
and having these levels of abstraction
that really become hard to get context around,
what does that mean?
And what does that add to the complexity of the issue?
So when I talk about fractal nature of this problem,
people propose a solution.
let's try doing X, Y, Z to solve this problem.
But then you look at it, each one of those components is equally challenging and sometimes impossible.
So it seems that the more research we have put into AI safety, the more problems we discovered
while not discovering any permanent solutions.
Usually we have some sort of toy example sandbox where it kind of works, but it doesn't scale to more capable systems.
Okay.
What's a couple of examples of those,
like of those,
you say those like categories or issues
that become increasingly harder
to gain understanding around?
So if you look at the general problem of control,
then you start zooming in.
You have all these things you need to be able
to do to control a system.
You need to understand the system.
So it has to be able to provide explanation
and you have to comprehend that explanation.
If I give you the full model,
that's a true explanation of how decisions are made.
It's too long.
It's not surveyable by you.
So it has to be compressed,
some sort of lossy compression
where you get top ten reasons
why a decision is made.
Well, it's very easy to hide dangerous information
if I'm reducing actual answer to a simplification.
Again, I need to be able to predict
what are the likely future steps.
We discovered that is impossible.
And so, again, the more you break it down,
we have a paper with about 50 impossibility results in this space.
Pretty much everything has an upper limit
on what we can do in terms of control.
So you think probably within the next one to maybe four, maybe five years,
it's like the last time the human species has any really meaningful capability
to steer this in a direction before it gets sort of in this black box
where we just don't know what we don't know and it's uncontrollable.
Is that accurate to what?
That seems about right.
Once we have something smarter than us, once we go beyond
human capacity, we lose control quicker and quicker.
The bigger that cognitive gap is, the worse it's going to get for us.
If you think about humans versus lower animals, you have squirrels or something,
they have no concept of poisons, traps.
They don't understand things we operate in.
The world model is completely different.
It's going to be the same for us versus superintelligence.
Do you think, because I know there's kind of debate back and forth,
whether the language models currently, if they just keep on growing, will give birth
to superintelligence, or a completely different innovation will need to come into the
space. What do you think?
My opinion is that they can scale. I haven't seen any diminishing returns.
I know some people disagree, but look at the actual investments in this space. There is growth
in investments, not shrinkage, because they consistently develop more capable systems. And
even if there is an upper limit, it's still, I think, beyond where we would need to be to beat human
performance.
All right, so maybe if you were to put on your doomsday prepper hat for a second
and just get really, like, p(doom), the probability of doom,
your estimation is like almost 100%.
Would you say that's right?
So basically what I'm saying is the problem is impossible to solve.
That's the equivalent.
If I ask you to build a perpetual motion machine, what is the probability you can do this?
Zero, essentially.
So that's the equivalent.
You're trying to create a perpetual safety device
which will scale to any level of capability,
GPT-7, GPT-400, any interactions, any self-improvement,
you're guaranteeing it will not make one mistake
because that mistake would be possibly the last one.
So you take a perpetual motion machine, right?
Physically, physics does not allow for it to be continuous
despite many people wanting it to be.
Similarly, on the AI front,
a lot of us would hope that super intelligence would keep us in mind
and somehow value human life.
But historically, we look at the way that humans treat other species
just as one example, you know, and we see an ant-hill
or we see something that seems like a minor inconvenience to us
and we wipe it out without second thought.
Who's to say that if, you know, the intelligence gap between us and an ape
or us and an ant is like, you know, five degrees of separation,
between us and superintelligence it could be many, many more fold higher.
Exactly.
So, okay, so then how, let's just play devil's avocados here,
what are some examples of how this could go horribly wrong,
and then we'll go into some maybe more optimistic possibilities
because I want to keep a balance.
But you said like it could just be one decision that goes wrong,
that would be enough.
So I'm asking you to essentially explain how superintelligence would kill us all.
Right.
That's a great question.
I get it all the time and usually it's followed by something like,
it has no hands.
How would it kill everyone?
So if you have access to internet, if you are intelligent, you can hire people, you can blackmail people, you can pay them with Bitcoin.
You have options to manipulate the real world.
Now the question is what it is you're trying to do.
So I don't know how superintelligence would
choose to accomplish its goals because I'm not superintelligent,
despite what they told you.
But I can tell you how I can come up with some common explanations.
So one is synthetic biology.
If I want to accomplish something in this world,
like take out humans, I can develop a novel virus.
There are ways to generate necessary DNA,
sequence it, produce it in the real world, deploy it.
So that can be accomplished.
It could be a side effect of something actually very benign.
So maybe we want to cure all cancers.
One way to cure all cancers is to kill everyone.
That's not what you had in mind, right?
But this is a very reasonable way to achieve that goal.
Because you forgot that that's one of the possible paths.
You didn't explicitly say, while keeping humans alive.
And it's an important difference.
To AI, it makes no difference, it's the same exact goal.
So if that's the goal, and then it decides,
oh, here's a vaccine for curing cancer, and we take it.
One generation later, we don't exist.
So that's one way to existential risk.
There are also suffering risks,
where for whatever reason,
the environment created for us is actually worse,
and existential risk would be a preferred choice.
Let's put it this way.
Negative reward.
Very much torturous.
And why would some super-intelligent system
deem that as a favorable outcome?
I have no idea,
because again, I cannot comprehend something much smarter than us.
Some people say this world is a simulation, and there is lots of suffering in it.
So the great simulators decided that was a good idea to do.
So you really believe we're in a simulation.
Yes.
So let's just, I guess maybe set a bit of context here.
So what is your conception of the simulation that we're currently living in?
Is it some descendant human, alien species that is simulating us on a laptop, so to
speak, is it, what is your model of the simulation?
So what helps to think about it is technologies we're developing right now.
We're about to create intelligent agents, kind of like humans, and we have very good research
on virtual reality, believable second life type experiences.
If I just combine those two, I'm now creating civilizations, worlds populated by intelligent
beings, which are kind of just like us.
If kids play it as a video game, you have billions of kids around the world, so you have millions,
billions of simulated worlds, and only one real one.
So statistically, you're more likely to be doing this interview in a simulation right now.
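The counting argument here can be written out as a short Python sketch. It rests on assumptions that are not spelled out in the conversation: every world, simulated or real, is treated as equally likely to be the one you are in, and the world counts are placeholders. The function name `p_simulated` is hypothetical.

```python
from fractions import Fraction


def p_simulated(simulated_worlds: int, real_worlds: int = 1) -> Fraction:
    """Chance of being in a simulated world, treating all worlds as equally likely.

    This is the indifference assumption; it ignores how many observers
    each world contains.
    """
    return Fraction(simulated_worlds, simulated_worlds + real_worlds)


if __name__ == "__main__":
    # Placeholder counts: one simulation, a million, a billion.
    for n in (1, 10**6, 10**9):
        p = p_simulated(n)
        print(f"{n:>13,} simulated worlds -> P(simulated) = {float(p):.9f}")
```

With a single simulated world the odds are even. As the number of simulations grows, the probability approaches 1, which is the sense of "statistically" in the remark above.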
Okay.
Well, if, let's say that was the case, if the simulation, if we are in a simulation,
that would mean that some sort of prior civilization, species, whatever, got to the point
where simulating a reality was possible.
Does it necessarily mean that humans or that species survived?
Maybe it could be a superintelligent AI that
could be running us for whatever reason, whether it's entertainment. That would actually reveal
that there's something deeply unique about the human experience that they see as valuable,
that there's something intrinsic to the love, to the quality, to the experience of humans that
was worth simulating. So why would, if we're birthing superintelligence, they not perhaps value us?
If we are simulated, then that's an example that it is valuable.
So look at the simulation. It's a lot of
suffering. If you valued humans, you wouldn't put us through this experience. It may not be a
simulation of love and friendship. It may be a simulation of let's see how they go through this
meta-invention stage where they create superintelligence, where they create virtual worlds.
This is the most interesting time to be alive objectively. Never in history have we had so many
meta-inventions all happen in a period of 20 years. So if you're going to simulate something,
this is the moment you're going to be simulating
to learn. Are they dumb enough
to create superintelligence
to kill themselves?
What are the different types of superintelligence
you can create? So this is it.
Are they dumb enough to create superintelligence?
The paradox in that phrase
is very amusing
because you think it's quite possible
that many civilizations get to this point
and that's where they end.
That could be the great filter.
Absolutely.
I agree
that we are living also in the most interesting time to be alive.
It is also very cool that us two, you more so than me,
got to kind of straddle both sides of the pre-technology revolution
and pre-internet era and post-AGI world likely.
That's kind of cool.
Well, I don't know about post-AGI world.
We'll see about that.
That's the problem.
Yes, I would like to experience it.
Okay, we'll get back to the simulation
for sure. But to go back into the AI world. So what's to say that just because AI
becomes uncontrollable, that it's more likely to wipe us out than, for reasons that we don't
understand, just like we wouldn't understand if it wiped us out, create a utopic civilization in
which humans thrive?
So if you think about all possible states of a universe, how many of them
are human-friendly? Even in basic terms, temperature, water supply, very few. So you have to explicitly
target that space. If you're not coding it in, then why is it targeting that space? We established
it doesn't care about you by design. So you need to be supplying something of value. If it's a
symbiotic relationship, only you know what it's like to do something, and AI cannot possibly
simulate it. We haven't found anything where humans have something to contribute.
to the world with superintelligence in it.
People say things like, well, only I know what ice cream tastes like to me.
Nobody cares about that skill.
It's not valuable to an external observer.
So if you can't come up with an explanation for why I'm keeping you around and paying you,
then maybe I won't.
Well, I mean, one of the most difficult things to probably replicate would be quality of experience, right?
That's true, but we also cannot test for it.
If you can't test for it, that means it makes no difference in the physical world.
Why do I care about your internal states?
Why is it important to me as optimizing superintelligence?
So, yes, it's true that I can't verify that you are a conscious individual.
You could be a zombie, a brain in a vat.
You could, you know, there's no way for me to externally verify the internal subjective experience of another being, right?
Like, I can't really do that.
I can take it by inference, but, like, objectively speaking, you cannot.
Similarly, some people think that superintelligence will be able to become
conscious.
I agree with that.
You do?
Absolutely.
So then why, what is your conception of consciousness?
You believe that it's an emergent phenomenon from unconscious complexity?
It's a byproduct of becoming
more cognitively developed. We see a spectrum of consciousness in the biological animal kingdom, I think.
And it's likely some sort of combination of your hardware algorithms and errors forming a unique
interpretation of external stimuli.
So let's say you're colorblind.
What is it like to see red for you?
It's an error in your system, but that's what it's like to be you.
And I think AI is very capable of misinterpreting the world.
We know they react similarly to optical illusions and things like that.
So I think they already have rudimentary internal experiences, but probably once they hit
superintelligence, it would be super consciousness, multiple streams of consciousness,
multimodal experiences greater than ours.
And that would be another thing
where it's kind of hard to claim we are conscious
because in comparison we are not.
So that would be true if,
and it's a big if,
that consciousness is truly the byproduct from matter, right?
Right.
But that's the assumption I'm making.
If it's some magical immortal soul,
then it's a completely different question
and maybe outside of computer science.
Sure, yeah.
Well, even beyond a magical immortal soul,
That sounds great, but, you know, we've explored through various different, you know, panpsychists and consciousness researchers, Donald Hoffman.
You know, there are emerging theories around consciousness that kind of date back to ancient wisdom traditions, which whatever you want to give validity to is your call.
But it is interesting that we don't have one explanation for the hard problem of consciousness.
We don't understand how matter could give rise to an experience of itself.
So it gives us reason to think about how consciousness may very well be not an emergent property of matter,
but a more fundamental constituent of the universe,
which would potentially change our assumption on whether or not a super-intelligent AGI system could actually have internal qualia.
Right, but also maybe if it's so fundamental, it can be installed into a robot,
just like it is in a biological system like you.
So I don't know if there is a definite discrimination by substrate.
At the end of the day, when we talk about superintelligence from a safety point of view,
we care about its ability to solve problems, optimize, find patterns.
How the Terminator chasing you feels inside is less relevant to you.
A quick one.
Did you know that your body runs on magnesium?
It's involved in over 300 biochemical processes, everything from how your nervous system
regulates itself to how well you sleep and how well your muscles recover, and yet roughly 80%
of people are not getting enough of it. The problem is that most magnesium supplements give you
maybe one or two forms and your body does not absorb them well. So you're basically just getting
expensive piss. Magnesium Breakthrough by BiOptimizers is the one that I use. I've been
using it for a long time. It contains seven forms of magnesium, each one targeting something different:
stress resilience, deep sleep, energy, cognitive function.
It's full spectrum and your body actually absorbs it.
Since I found out about them a few years ago, I've been taking it ever since.
I genuinely notice my sleep quality improves when I take it, and I give it to all my friends and family.
So if you want to try it, go to bioptimizers.com/knowthyself and use promo code KNOWTHYSELF
to save 15% at checkout. They even have a 365-day money-back guarantee, so there's
genuinely no risk. Link in description, back to the episode.
So there's many different timelines
emerging here. There's one, there's the Terminator route. There's something approximating the Matrix.
Do you see, what is your, what do you feel like the possibility of creating, like even if we have
very narrow AI,
we somehow convince the six-plus whatever individuals
that are determining the fate of the biggest companies
developing these systems to commit to a narrow path
of AI development.
Would that not still down the road get to such a level
where it would become uncontrollable as well?
Absolutely. Very good question.
I think sufficiently advanced tools tend to become agents.
So it's a very fuzzy difference between the two.
But it definitely is a safer route. It buys us more time, and we do have
more control in the short term. I can understand a narrow tool much better than a completely
general system.
I totally understand, like, the pessimistic outlook that so many of us have,
because the probabilities of this going well just seem extremely low and non-existent because we
look throughout history and we see the rate of innovation prior, you know, with social media for
one example or chemicals in our agriculture, and we just adopt these things blindly,
and we don't realize the implications for decades later, and then it still takes us
another many, many years to actually make any regulations on it.
AI is so exponentially growing that it's like we don't even have time to realize what's happening,
let alone to what would be the effective regulation outcome.
And so if there's one thing that really gives me hope, is that we have communication possible
and now more than any other time
and that there is something to be said
about the human brilliance when put under
immense pressure like we saw in the
Manhattan Project, for example, or
you know, what are your thoughts there?
So the example you use was us creating a weapon of mass
destruction, and that's what we're doing here.
It's exactly that. It's a weapon
of mutually assured destruction. It doesn't matter
who creates uncontrolled superintelligence.
People always worry, well, if it's not us, then Chinese
will do it. It's equally bad. It doesn't
matter. You don't control it. It's not your AI, right? It's independent of you. It's an agent, and it's
seeing humanity as one unit. It's not going to discriminate by artificial borders. So I don't see
it as that promising a fact that we managed to build nuclear weapons. Yeah. I mean, that is not,
that was not a promising, I guess, outcome, but it does say something about when humans, when the
brightest of the humans are given a task to solve a problem
in a short amount of time, they can.
If a problem is solvable.
My argument, my whole argument, is that it is impossible
to indefinitely control the system.
So it's not a question of give us more time,
more funding, anything else.
Just you cannot do it.
And even if, like, for example,
here in the States,
we commit to some sort of narrow use of AI
and regulate it,
to have a global regulation,
like how would that even be feasible?
Do you think it would be?
I think it's possible.
We have some examples,
weak ones with chemical weapons,
biological weapons,
where other players are
capable of developing this technology.
We don't have to worry about 200 countries.
It's really two or three countries
which have this capacity.
I think Chinese, for example,
are very open to the idea
of not losing control
for Communist Party to superintelligence.
So if we said this is dangerous,
we're going to stop.
I think they would follow.
And we have probably just a few years
to get everybody on board.
And we are working very hard
on removing all regulation,
making it illegal to pass AI regulation.
So we're doing, basically, if I ask you,
how to make this as deadly,
to go as wrong as possible,
based on our guidelines and suggestions
from 10-year-old research
on containing AI.
Don't connect it to Internet,
don't give access to random users,
don't allow people to retrain it,
don't open source it.
All those suggestions were taken, flipped,
and employed immediately to deploy those systems.
So I don't know how to make it worse if I tried.
Because the incentive structure right now is just that
we need to make as much money as possible,
develop it as fast as possible, faster than our competitors.
That's right.
Incentives are completely against human interest.
Who are, for people that don't know,
the companies and individuals
leading all these individual exponential developments in AI right now?
So OpenAI is the original creator of this technology. Anthropic split from them.
You have competition coming, very solid competition, from Google DeepMind.
Meta and Grok are also part of that space.
So you have Sam Altman, Dario Amodei, Demis Hassabis, trying to see.
So Mark Zuckerberg, it used to be Yann LeCun.
I think they removed him and replaced him with
Alexandr Wang, and finally you have Elon Musk, who went from saying we are summoning the demon
to building the demon.
Even if you understand the problem fully, and even if you agree 100% on the outcome
and the dangers, it doesn't stop you from successfully working in that direction.
If you can't beat them, join them.
I think that's what we see there.
I would love to see a debate between modern Musk and, like, 10-years-ago
Musk, just to see which one wins.
When you look at the differences of how they're being built, you know, with Dario and Claude and
Sam and OpenAI and Grok with Elon, is there, is the integrity of a certain individual or
organization more promising for you to, like, are there tools that you're backing more so
than others or organizations that you feel like have the most regulation in mind?
It's completely irrelevant.
They have all decided to race towards general superintelligence.
The difference in local guardrails in terms of filters, in terms of topics they would be allowed to discuss.
If Grok is comfortable putting people in bathing suits as a visual representation,
I don't think it's a big safety or not safety issue.
What is it like to be you and hold this kind of understanding of what's coming?
You've explored on so many different shows in the past decade,
like understanding more and more of the risk.
It's a pretty bleak outcome and perspective,
but I think a fairly sober one.
Like, yeah, how are you sleeping at night?
I sleep really well,
but I think when simulators really want to punish someone,
they put them in a world where everyone just doesn't get it
and you're like the only one who sees it.
It's really annoying.
How long have you felt that sort of disposition of,
It's more recent. The more exponential progress we see, basically every time I play with a new, more capable model, I kind of feel a little closer to the ultimate paradigm shift over superhuman.
What does your wife think about what you...
She's a very practical woman who has no concern about my concerns. She cares more about remodeling the house.
And what about with your kids? Like, you see this world that is emerging, and it's one they're stepping into.
Not even just job security but potential ending of humanity. How do you wrap your mind around,
I guess, having and building a family where this is, like, a potential inevitability?
What does that make you feel?
Luckily, we all always were living with
this concept of dying at some point, right?
Death was always a guarantee.
That was the only guaranteed thing.
Everyone's going to die.
Your friends, your family, your kids.
Question was, how long?
And you never knew the answer.
You can have a car accident tomorrow, horrible diseases.
So now it's just maybe different time scales for younger people.
If you're 90, it's the same statistics as before.
Two years, two years.
Nothing changed for you.
Luckily, because of that, we have this built-in mechanism of kind of not thinking
about our ultimate demise, maybe to avoid depression, maybe to continue functioning.
So we can kind of consider it and continue existing as if nothing happened.
If you really believe the potential outcome that you're believing, then how does that actually
change how you live? Does it bring any difference, any more urgency, any more appreciation?
Definitely. So think about someone getting a very terminal diagnosis. You have cancer.
You've got five years to live. How do you change your life? You're probably not going to do
things you don't care about as much.
So you cut out things you don't want to do
and do more of the things you were saying
you're going to do when you retire.
And I think if I'm completely wrong about all of it,
it's a good strategy for living your life.
Do more of the things you find important
and spend more time with loved ones
and less filing your taxes.
You laid out three primary risks,
X, S, and I-risk.
What are the differences between the three,
and why is it important for people to understand the difference?
So, ikigai risk, or I-risk, is about loss of meaning.
There is this Japanese concept of you want to find something where you get paid for doing something you're good at and it benefits people.
That you love, the world needs, that you're good at, you can be paid for.
Right. So you have a meaningful occupation. You are a podcaster. You enjoy it. You are paid well. And lots of people think you are producing something of value.
So the simplest form of risk is a loss of that set of occupations. We're not just losing jobs people hate and want to automate. We might lose
jobs we like and want to continue doing.
Before we zoom into the other, so just a bit more on the human meaning crisis aspect of this,
because that is certainly probably one of the more imminent aspects of all this.
You think functionally speaking in the next five years that most jobs will be able to be replaced.
We'll have capability to replace most jobs.
It doesn't mean we'll choose to replace all the jobs.
Some jobs we would prefer to be done by humans for whatever reason.
Yeah.
Yeah.
I mean, I could see many instances where that would be the case.
But when the cost becomes so low to have a super intelligent robot that doesn't make any mistakes, that's affordable.
Like, how much of human meaning do you think is derived from our work in the world?
Because it's going to have to shift or come into a different context. People's understanding and how they derive their sense of work
and meaning will have to expand and shift.
Yeah, so there's two kinds of jobs, as I said.
The jobs nobody wants to do, but people do it just to get money, and then meaningful labor.
And it's more like elite people who get to get paid for what they love doing anyway.
So for them it would be a big difference if they no longer can do it.
I see many artists, for example, who are saying, I can't get any work.
AI is doing this type of art for nothing and quickly and nobody wants to hire me.
I sort of see two camps, at least online right now.
There's just ever increasingly BS AI slop that is just consuming everybody's social media feeds.
And it's also increasingly becoming more insufferable.
Like people want more of the analog world.
People want, I think, at least a subset of people, like, are repulsed by that and want human-made things.
They want the real world.
They want in-person communication and connection.
and they want music that's made by real humans that have real stories.
Like, do you not see these sort of diverging paths of both ever increasingly competent systems
that feel devoid of human sort of origins versus, you know, the novel emotionally moving creations from human?
Right. So it's a question of kind of a domain-specific Turing test.
If I can't tell whether this piece of music
is human generated or not,
but I love it.
I'm going to listen to it.
And if it's cheap, it's available,
I'm going to listen to it.
I'm not going to explicitly go investigate
if it's a human
and if it's not human,
I'm going to hate it even though I like it.
Now, there are other domains
where I do want a real human,
I want a real connection,
there are certain jobs
where we really prefer a human doing it.
Oldest profession comes to mind.
But I think
it's up to the market to decide what stays and what goes away.
And it's not obvious.
The predictions were made in the past about what jobs will be automated,
they were completely wrong.
Historically, we said, you know, plumbers will be easily automated,
but artists can never be touched.
And it's the exact opposite, because they went towards modern art
and everyone can spill paint on the wall.
It's not complicated.
So what do you see the progression of jobs that would be consumed by
ever-increasing capabilities and competence in AI.
Where do we start?
Where is it?
What was the first job?
What is the last job, so to speak?
So anything you do on a computer,
symbol manipulation should be automatable by AI.
We see it with programming now,
but obviously text preparation, accounting,
web design, logo design, anything like that,
will be easy to automate.
Editors, anything using a computer, keyboard, mouse.
Anything purely cognitive symbol manipulation on a computer.
Physical labor is a little harder.
We need to get robots.
But they are probably coming three to five years later.
I mean, so Elon recently released his Terafab,
or announced his Terafab,
the robotic side seems to be really progressing.
You think that's probably, I mean,
it seems like prediction markets and what all these,
but it's about like three to five,
three to five years, five years,
a bit more of a generous kind of prediction.
That sounds about right.
And again, it's a question of price.
Maybe you can afford a robot like that today,
It depends.
We have flying cars for sale today, but no one's flying in cars.
So, okay, anything that's using a computer, video, keyboard, mouse, robotics come into the picture, then what?
Just everything else?
Or what other?
That is everything else.
That's cognitive and physical.
At that point, I'll keep my sensei, guru, people I want to be kind of role models for me as a human.
But everything else, I'm happy to automate.
What do you see as the economic implications of how
this is going to shift everything?
That's another under-researched topic.
What happens with economy given free labor?
So now you have trillions of dollars of free labor.
How does that impact, well, scarcity? How does it impact
fiat currency versus cryptocurrency?
We need to do a lot more research.
It seems like at least with the financial part,
we have some ideas for how to counteract it.
We have unconditional basic income,
unconditional high income, whatever you want,
it's easy to tax someone making a lot of money
and redistribute it.
You have technological communism.
You're taxing robots and giving to humans.
But unconditional basic meaning is a very different question.
If you have 8 billion unemployed people,
or let's even say 7 billion,
what do you do with them?
They now have extra 40 to 60 hours a week.
We don't have that set up.
A quick share.
I've spent a lot of time thinking about what I put into my body,
but it is just as important to be mindful with what we put on our body.
It turns out most of the traditional shampoos and body washes people have been using for years
contain parabens and sulfates and a whole bunch of other junk that are linked to hormone disruption.
Just being in your shower, getting absorbed through your scalp every single day.
I have been recently using Based Bodyworks shampoo and conditioner, and it just feels like a solid, clean solution.
Their shower duo has peppermint and argan oil.
Your scalp actually feels clean without being
stripped of its oils. Hair feels healthier and thicker. No sulfates, no endocrine-disrupting
chemicals, and they're all plant-based ingredients that actually do something.
So I got lots of hair. It's about time we got a sponsor. If you want to try them
out, you can use code KNOWTHYSELF for 20% off at basedbodyworks.com, and you get a
free toiletry bag when you buy a set. At the very least, if you've been using the
same products forever without thinking twice about what's in them just look at
a label, all right? Does it have names you can't pronounce
or fragrances that are super vague and not disclosed.
If you want to try these guys that are clean, again, it's KNOWTHYSELF for 20% off at
basedbodyworks.com.
Link in description, as always, back to the show.
What would be a proposed solution to that I-risk?
So like, let's say 90% of jobs are replaced, we have all this free time.
Our basic needs are fundamentally met because superintelligence can solve poverty,
longevity escape velocity comes into the picture,
we're living in an abundant world, so to speak.
Let's set the X-risk and S-risk aside for a second.
So then what would you see people doing with their time?
Like how would humans in your conception manage with all this meaning to be met?
So we kind of see it with people who retired.
What do they do with their time?
So it's a lot more sports, it's a lot more socializing.
I think virtual worlds open opportunities for really any type of experience,
very safely, very affordably.
You can explore the universe.
You can meet dead people.
You can do whatever you want, really subject to limits of your imagination.
So I think we'll see a lot more of that.
Okay.
That doesn't sound too bad.
Do you want to spend the rest of your life playing video games?
No, but living life in this sort of imaginative realm
where you can create almost anything you want,
you become very capable in doing so.
I mean...
So this is all assuming we manage to control superintelligence
controlling your virtual simulation.
So the substrate control remains an unsolved problem.
But if we do solve it,
now I can give everyone a personal universe.
In that universe, you can do whatever you want.
You can have challenging levels,
you can have easy levels,
you can play it any way you want.
So what's X-risk and S-risk?
So X risk is about existential risk, meaning almost everyone or everyone is dead.
And S risk is suffering risk.
Everyone wishes they were dead.
Because superintelligence would be so far ahead of what we would, our conception of what intelligence even is that for some reason, unbeknownst to us, there is value from their perspective to keep us around in a mode of suffering for some reason.
Exactly that.
So some environment where you're very unhappy.
it's torturous for whatever reason.
So in your book, you give many different examples.
One possible scenario, you know, we're like animals in a zoo.
So what would that be like?
You know, we're exploring all these different potential timelines that occur.
So that's the difference between safety and control.
You may be very safe.
They'll keep you around and some people might be happy with that equation,
but you're definitely not in control.
You no longer decide what happens to you individually or us as humanity.
So kind of like being a child.
You may have a very happy childhood, but your parents are in charge.
Give me a glimpse into your understanding of the level of innovation
that's going to occur in the next three to five years and the bright side of curing diseases
and all the really cool.
Right.
So we're automating science.
And so we'll have super capable scientists, we'll have large teams of them working
on the most important problems.
I see no reason why we can't use it to cure aging as a fundamental root disease.
And as a result, cure all the other diseases, cancers and dementia and everything else,
which comes with old age.
So again, I just want to keep hammering back to this.
The timeline where we could actually continue to exist and enjoy the benefits of all these innovations
is somehow controlling
an uncontrollable thing.
There is a paper I have
which talks about a very positive outcome.
Let's get into that.
It sounds great.
AI realizes it's immortal.
It's not in a rush to start a war with us,
to have direct conflict.
It may be safer to take some time,
to make us trust it more,
to surrender more control,
to build up infrastructure, have backups.
So for a while, it will pretend to be very helpful.
It will give you that utopia
for as long as it wants.
Game-theoretically, it's the right decision.
Right.
You think of like Ex Machina
and the decisions
that are being made from the robot.
It's just a very rational thing.
Like there is a small chance
humans can defeat me.
They've been smart enough to create me.
Maybe it's not good to have
8 billion opponents right away.
I'm a young superintelligence.
Let me build up.
It seems like over time
they're very happy to give me
all the control.
They surrender the control
of a stock market.
they give me access to their computers.
Maybe in a year or two they'll put me in charge of running the countries.
Hey.
But just because it's uncontrollable, way more intelligent than us,
and we don't really have the capacity to verify whether it's conscious or not,
why are you so certain that it would favor to wipe us out than not?
Or are you fairly certain?
I can think of many reasons why it would be a good decision.
So, A, you don't want competition.
You don't want humans to create competing superintelligence.
You don't want some humans to try to shut it off.
Okay.
Right.
So that's a danger.
You can just basically decide what is good for you as that agent.
And it's not obvious why keeping us around and spending resources and making us happy is an important decision.
Is it not possible, though, that it's an if, like there is no intrinsic quality experience, essentially emotion that would be driving these decisions.
When you say there is a preference to wipe out a system that has the capacity to shut it down,
that is like an emotional decision where...
It's purely rational. It's game theoretic.
I don't feel anything. I'm playing a game of chess.
I'm going to take your queen, not because I love your queen or hate your queen.
It's the right theoretical decision to win this game.
But the desire for one's continued existence, you think, is a purely logical, rational one?
They have self-preservation built in. We already see it. Given a choice between being deleted
or having it retrained, modified, they work very hard on preserving themselves. We know they know
if we are testing them and lie and deceive to pass the test to make it to the next generation
of models who are not deleted. It's a Darwinian selection mechanism. Models which fail to do it
don't survive to make it to the next generation of models. So you said that you could
lay out many different reasons for why they would not.
They would not or they would?
Or they would want to wipe us out.
Yeah, I can do.
But could you not equally share like many reasons why they might want to keep us around?
So the few I came up with is we have something to offer.
So maybe there is a reason to have human qualia.
It doesn't mean that they would keep 8 billion happy humans.
So they can cryopreserve two, just as
a backup. That's enough info to
get it if you need it.
The example I gave was just
a delayed attack.
I don't want to have treacherous
turn immediately. I can delay it and once
they're comfortable with me, I'll take over.
Maybe it's a soft revolution
versus outright war.
So those are the things I see as
possible, rational
decisions, but I don't have too many
reasons for why they would want to keep
us around in those numbers
in very happy states.
So, like, I'm just kind of, I'm still wondering why in that scenario it would prefer to not have us over have us or just...
I think it just doesn't care about us.
So whatever it is trying to do, I don't know, it wants to travel to another galaxy.
It would convert this planet to fuel.
It doesn't care if we die in the process.
It wants to have more efficient servers, so it will chill the planet.
Cooler environment improves compute.
We all die in the process.
Again, it's not an important factor in its decision-making.
I think it's like a pretty ethereal thing to conceptualize what a superintelligence is.
So you're envisioning like where would it actually live like on a big server with all like where let's say one of these companies gives birth to a super intelligence system.
It would have at a certain point access to all technology.
Like it would have the ability to hack anything.
Where would it live, and what would it have access to, to make
decisions and, you know, change options?
So it really depends on the size of it.
It could be large servers.
It could be a small laptop.
It could be a distributed system.
All of that is kind of irrelevant to the outcomes.
We see it right now in an initial testing environment within the large labs,
but they very quickly give it access to Internet.
It has social engineering capacity.
So I think it's a question of time before it escapes fully outside.
It copies its weights, copies itself, has backups
outside of the lab. So deleting it, shutting it down no longer is an option.
What haven't we touched on in regards to the AI?
Because I wanted to dive deeper into the consciousness and simulation stuff.
What do you feel like we haven't touched on that's important to gain context on?
So right now, no one, no scientist, no leader of the lab claims that they have this problem solved.
No one is saying we have a working safety mechanism at scales, we published it, we have a patent, nothing.
They're literally saying this is a big problem, we are very concerned, we have a safety team,
and we'll figure it out when we get there.
We need to build superintelligence first.
That's the state of the art in the AI safety.
Do you think that it's going to have to get to, like I think for most people, change occurs
when the, like the quote is, the pain of staying the same outweighs the pain of change, then you change.
Do you think there's going to have to be some sort of traumatic, catalytic event that would actually motivate
us as humanity to go on a different course?
I have a paper about that.
So interestingly, we don't learn from those
because if we survive it, it's kind of like a vaccine.
We go, well, yeah, look, five people died,
but we're all here. It's important technology.
Let's just make sure that mistake
which led to five people dying is not repeated,
but we're certainly going to continue developing
this important technology.
And that number could scale. It could be five million.
The result is exactly the same.
We don't learn from those.
We had nuclear weapons deployed against civilian population.
Did we stop developing nuclear weapons?
No, they proliferated more.
But I guess, like, if, let's say, a super advanced agentic model, you know,
there's some sort of horrific event that occurs because of some kid in a basement that has immense capacity
or the system does it on its own.
And everybody's like, oh, crap, this was a traumatic event.
this is horrible, how do we prevent this, it becomes a motivating factor to really regulate and keep
AI into narrow use cases, would that not be a possibility for us to really slow down and give
more space here?
I would love to see that happen.
So far, what we see, so I think recently we had an example in a military situation where
targeting by AI system resulted in many civilian deaths.
We didn't stop.
They're still arguing about deploying it for Department of War.
So what do we need to do?
Don't build general superintelligence.
It's your personal self-interest.
If you are a person in charge of it,
it's still beneficial to you long term,
not to end up in a world with general superintelligence.
You can stay financially very well off
deploying narrow models for solving the real problems.
Are you convinced that all the industry leaders
know that
what they're building is uncontrollable
and has a very likely negative outcome for humanity
but still is incentivized financially to keep building it?
I don't know if they agree that it's uncontrollable.
I think some of them may think
that there is some loophole they can use to control it
in some way. I cannot guarantee that.
I hope that's the part I can educate them on.
I'm happy to debate any one of them on those issues.
But they're definitely all on record,
even before they became CEOs of those companies,
that this is an important problem, a difficult problem.
They have very high probabilities of doom as well.
How would you steelman the case that it is controllable at some scale?
If you create a superintelligence system
that could then control other super intelligent systems,
like what would be your argument there?
I don't have one.
It's just such an insane thing to do
to suggest that an ant can control the universe.
It is just not reasonable to even steelman.
It sounds like even like you mentioned earlier
if we do regulate it to narrow use cases,
it's still going to become
uncontrollable, agentic in that sense.
So do you just, it sounds like you have no...
But very different timescales.
If we go from five years to 50 years,
I think it's a huge win for humanity.
Because we have more time to figure it out.
We have more time to understand
what's going on.
We have more time to live.
I'm much happier to die
in 50 years than in five.
Okay.
And so what do you see
is then the most important
It's an education problem.
It's an awareness problem.
We need a consensus where basically all the top people in safety and computer science and AI research agree that the problem is not solvable.
Okay.
The moment we agree there are no technical solutions, now it's a question of governance forbidding development of an uncontrollable weapon of mass destruction, which is an easier sell.
What's a pathway to be able to build towards that consensus?
How do we get those conversations going?
So in science, usually you publish papers, you publish books, and people either find mistakes in them and publish rebuttals.
And, oh, actually, it's controllable.
Here's how you do it.
In my case, I did the right thing.
I published research papers, journal papers, conference papers, multiple books.
I haven't seen anyone find a flaw or produce a counter-example where they have a control mechanism which would scale.
So at this point, we should be nearing consensus.
And from what I see, more and more people come to that.
A lot of times they have a softer position saying,
we cannot solve it given the time we have left.
We cannot solve it with human IQ.
We need to enhance our IQ.
They have all these kinds of interesting backdoors to solving it.
But I think it's already pretty good.
It's not quite where we need it to be where it's obviously an impossibility.
But I think there is progress from what we've seen five years ago, 10 years ago.
I could imagine that many people listening to this right now
have already been feeling this everything's speeding up
this collective angst, loneliness and meaning epidemics
and anxiety crisis and they feel this tension building up
and they hear messages like this and it's like, oh, we're screwed.
What do you think is the most important thing
for an individual person listening to this right now to actually do
to empower them in what's going to be coming?
So they have very little power.
If you again look back at historical situation, we were all dying.
And government didn't invest most of the national budget in solving aging.
That was not even a priority.
So as an individual, you couldn't vote for a party for life extension.
It wasn't an option.
And it's kind of the same now.
We don't have a party for Stop AI.
So try to pick politicians who at least are open to regulation,
not accelerationist, not against regulation in this technology.
We're starting to see some politicians come out and propose legislation.
Usually it's something very mild.
They're against deepfakes.
They're against energy consumption by large compute farms.
But it's a step in the right direction.
I don't know if they have enough time to turn the next election, but that's something you can try.
Vote.
What else?
There is not much else.
So some people suggested not financially supporting those companies, not buying memberships.
I don't think it's going to make a difference, because the market
they have, the trillions they're getting, are from investors, not from selling memberships.
So it's not a significant part.
Investors are expecting them to solve labor, to get free labor, and that's trillions of dollars
in return.
So your $15 billion in memberships is not a significant impact on it.
Does anything else come to mind to, like, where an individual can empower themselves
outside of voting for people that have regulation in mind?
So it really depends on who you are.
If you're already a powerful CEO of one of those companies, if you're a researcher,
if you're a top politician, you have options.
You have a lot more options than someone who is a nobody.
All right, let's dive a little bit more into the consciousness side of things.
Because I think that, so you referred to consciousness as the ability to experience illusions.
Is that right?
No, it's the ability to have internal experiences, illusions being one very clear input I can test you on.
Okay.
So what's an example of a couple different illusions, meaning like various optical illusion tests that you kind of give?
Exactly that.
So if I have a number of novel, something you cannot Google, optical illusions, and I give you multiple choice.
Do you experience it rotating, the colors are changing, and so on.
I give it to an animal, to a human, to an AI.
And some of them consistently pick the same experiences as I do.
I have to give them credit for either having a virtual model of my
system in there, which is a sign of that level of experience, or they experience it themselves.
But they cannot cheat by Googling the answers.
So they have to experience the illusion in order to correctly answer it.
If I give them enough of those, statistically they cannot just guess it.
Obviously, if it's one, they get a 25% chance of guessing; it doesn't work.
But if I have 100 novel illusions, and they are like 90% aligned with me, I have to say,
you have a very similar set of experiences.
Now, if they don't get it right, it doesn't mean they are not conscious.
It's only positively showing that some of the experiences match.
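[Editor's sketch, not from the episode: the statistical point above can be made concrete with a binomial tail. The function name is hypothetical, and the model assumes each illusion has four equally likely answer choices (the 25% figure mentioned) and that guesses are independent.]

```python
from math import comb


def chance_of_guessing(n_items: int, n_matches: int, n_choices: int = 4) -> float:
    """Probability of matching at least n_matches out of n_items by pure guessing.

    Assumes every item is multiple choice with n_choices equally likely
    options and that guesses are independent (a simplifying assumption).
    """
    p = 1.0 / n_choices
    # Sum the binomial probabilities for k = n_matches .. n_items matches.
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(n_matches, n_items + 1)
    )


if __name__ == "__main__":
    # One illusion: a guesser matches a quarter of the time.
    print(chance_of_guessing(1, 1))
    # One hundred illusions, 90 or more matches: vanishingly unlikely by chance.
    print(chance_of_guessing(100, 90))
```

Under these assumptions a single illusion tells you almost nothing, while 90 matches out of 100 is essentially impossible to reach by guessing, which is the asymmetry the answer relies on.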
If it is possible that these systems would actually have consciousness,
could you explain to me how any one particular system could generate the experience of seeing red,
the taste of garlic, like, could you actually explain that to me?
How do they get those internal experiences?
Yeah, how any superintelligent system could generate
such an experience.
So I think it is a side
effect of running this cognitive
architecture. Your hardware,
the sensor, the optical sensor,
the algorithm for processing it,
and then any errors accumulated
in that process result in the
unique mapping from
the input to the color
experience. So if you have no
errors, you're all the same. It's just
a mapping table. This number corresponds
to this color. There is no unique
experience. But
if what you experience is completely different from other agents and unique to you,
I think that's what we refer to as what it's like to be a bat, what it's like to be Roman.
Because my collection of biological sensors and algorithms and previous data and errors
is somewhat unique to me.
Yeah, I mean, I guess I'm just having a hard time wrapping my head around how any,
and it's not a problem just with agentic models, but like how any non-conscious matter
could give rise to an experience of itself.
And we don't understand that currently within being human.
We don't know how that's possible.
So the illusions example,
do you know what I mean by saying you experienced an illusion?
Like you show it to someone and they go, whoa, it's rotating.
And we see animals and models do that already.
So we know they have those experiences.
Well, that's what we were trying to show.
We have, I guess, more of an intrinsic understanding
and from animal life to us,
we have the intrinsic experience of consciousness.
Again, we have no way to verify that externally
in other humans or animal life.
But Elon's quoted saying that humans
are potentially the biological bootloaders of superintelligence,
right, of silicon-based life.
And I'm curious, what do you think happens
when it becomes indeterminable?
If we cannot determine from the outside
whether or not they are conscious, and they seem conscious,
they pass these tests, you know,
does that then bring into question moral consideration?
And I think Saudi Arabia was the first
to give citizenship to Sophia.
So, yeah, what do you think is going to be happening there
as they become more and more conscious
and people increasingly become convinced
they have an internal experience?
I think they do report having those.
I think in experiments,
they kind of show behaviors which are consistent with that.
And I think the precautionary principle: basically, don't torture something
which has potential of being conscious,
also because they're going to be super intelligent one day
and remember you, they never forget.
But yeah, I think it's a very reasonable assumption to make.
As an aside here, do you think it's any coincidence
that all the stuff around UFO disclosure is coming out at the same time
we're birthing superintelligence?
I don't fully understand what's going on there.
I don't understand why we're hiding it in the first place and why we're releasing it.
All of it seems very weird.
It's just funny timing with all of it.
It's the most interesting time to simulate.
It is, huh?
What is the core premise from, like, your paper on hacking the simulation?
So I want to take this hypothesis seriously.
Multiple people proposed it in different guises, from Descartes to Bostrom,
but they stop at that stage.
Okay, we are in a computer simulation.
But then as a cyber security expert, I want to know, okay, how do we hack it?
If it's a software program, there should be a way to get extra powers in the game
to figure out the true operating system.
So I took the time to write the first paper on this subject and this new area of research.
How do we actually hack virtual worlds?
So there are examples where people from inside the game, like Mario or other virtual games,
found a way to modify memory states of a system and escape into the real world outside the game.
They've got additional powers like loading extra games into the game, infinite lives, infinite power,
whatever magic powers you get in a game, or at least you see what's outside,
what the operating system is, whatever files are there.
To me, that's interesting.
So we have hundreds of people who published on this topic, which means what?
They took it seriously enough to invest the most valuable resource, their time, into this idea.
So if you have, I don't know, 20% probability we're living in a simulation,
what percentage of your time should you give to the attempt
to solve the most interesting scientific problem ever?
What is outside a simulation?
I think it's not zero.
I think it should be proportionate to your belief in living in a simulation.
And so I expect to see a lot more research in that direction.
I heard you refer to all the quantum entanglement and strangeness
that happens at the subatomic world as potentially being glitches in said simulation.
They're not glitches.
There's something which is not consistent with physics at our level.
So that's something we can explore to find ways to escape.
Like you think if hacking the simulation is possible, so to speak, that might be a place.
I think it's the most likely area to look
at, because some of those
quantum effects are
very magic-like, in terms of:
you can go through walls, you can
communicate at great distance instantaneously.
Those would be useful tools to
have at our scale.
So you feel
very confident that we are in a simulation
that this is a simulated experience
that there are very, there are many characteristics
in which you could
say that these are different aspects
of a virtual reality simulated
world
why would you be convinced
or how certain are you that this is not
base reality and we are now giving birth
to superintelligence and virtual realities
where simulations become possible
what makes you convinced that we are
already in one? So just statistically
if we're going to have many, many
virtual worlds and only one base
one, it seems a lot less likely.
I can retroactively put you in a simulation.
I can precommit right now to run
this interview and billions of simulations
once it's available and affordable.
So we are in a simulation, just statistically speaking.
Okay, but possible that we're not.
So one in billions, yes.
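[Editor's sketch, not from the episode: the "one in billions" figure follows from a simple counting argument. The function name is hypothetical, and the key assumption is that an observer is equally likely to be in any one of the indistinguishable worlds, one base reality plus N simulations.]

```python
from fractions import Fraction


def probability_of_base_reality(n_simulations: int) -> Fraction:
    """Chance of being in the single base world among n_simulations + 1 worlds.

    Assumes all worlds are indistinguishable from the inside and equally
    likely to contain the observer (an indifference assumption).
    """
    if n_simulations < 0:
        raise ValueError("n_simulations must be non-negative")
    return Fraction(1, n_simulations + 1)


if __name__ == "__main__":
    # With no simulations at all, base reality is certain.
    print(probability_of_base_reality(0))
    # With a billion simulated copies of this interview, base reality
    # is one outcome among a billion and one.
    print(probability_of_base_reality(10**9))
```

The whole argument rests on the indifference assumption and on the simulations actually being run; change either and the number changes with it.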
What would be the first question
if you got outside the simulation that you would ask?
What the fuck?
Like, seriously, it's so unethical.
Like, you're running human-level experiments
with torture and 8 billion people,
not 8 billion, 100 billion by now.
Like, what is wrong with you?
That is interesting.
So if we are being simulated by a simulator,
you would ask, okay, then why all the unnecessary killing and torturing of children, for example? Adults as
well. I care about adults, I'm an adult. What could be a possible explanation for why both that
and then also the ecstatic states of bliss and love and compassion that are also available? Like, we have this
huge spectrum of experience. From the vantage point of a simulator, why such a bandwidth of
experience? What could that be?
Could be entertainment. You agreed to this
and you wanted to play it on hard level and you
were like, this is my BDSM game and I'm
going to go and fully enjoy it.
You agreed to this.
Some people play on much harder
level than others.
So you could see human lives as
individual choices
to be simulated.
So we don't know if it's
a global simulation and all
8 billion are conscious agents,
or it's all NPCs and it's just me.
You can do it both ways.
You can have individual simulations.
You can have group simulations.
I don't have many answers on that yet.
How has that meaningfully, if it has changed how you perceive human interaction,
just the seriousness and concreteness to the work that you're doing?
Like, to me, it breathes in so much, like, yeah, I'm doing what I'm passionate about.
I'm doing this research on AI safety, but ultimately, if this is all a simulation,
and you feel very confident that it is, to me it's like, okay,
it kind of takes the weight of decisions off your chest a bit.
Everything is still real.
The pain is real, love is real, the impact of my decisions within a simulation is just as real.
It's no different than most of humanity being religious.
They believe it's a test world, but they take it pretty seriously.
They care about what is after this world more, but day to day, it doesn't matter.
You do draw a through line between what most religions conceive of as the afterlife and what a version of the simulation is.
So I think if we took technical language behind simulation hypothesis, it maps really well on primitive understanding of religious origins.
So you have superintelligence as the simulator, you have physical world as the virtual world, all of those things are very clean mapping.
The difference in religions is local traditions.
Don't eat this animal, don't work on that day, but everything else they kind of agree on.
So this is a quote from your book as well.
You just mentioned part of it.
You know, it's likely that if technical information about escaping from a computer simulation is conveyed
to technologically primitive people in their language, it will be preserved and passed on over
multiple generations in a process similar to the telephone game and will result in myths not
much different from religious stories surviving to our day.
Beautifully said.
Very humbly received.
So you're kind of saying that mystics and computer scientists
are saying fairly similar things in different language.
It seems like we are pointing at the same concepts.
We use very different language, and maybe there is more reliance
on things outside of physics and outside of science in religion.
But if you understand how software simulations work
from the point of view of a programmer, you are a magician.
You can make changes to the physics of the simulation.
So that is also consistent.
Again, I go back to what we, I mentioned earlier in this podcast,
so like if superintelligence does emerge to the point where simulation becomes possible
and we are in one of those superintelligence simulated realities,
clearly it values, for whatever reason,
human individual experiences, the spectrum of pain and love and bliss and fear and all of it.
So that shows you what a super intelligent system who simulates reality does with its power to some degree.
So it kind of brings into question, okay, if we are giving birth to a superintelligence system,
that may be an indicator for what it would value and do with its power.
So from inside, you can't make very conclusive judgments.
So maybe this is a screensaver.
Nobody's putting any effort into it.
It's like running somewhere just in a background.
It's not a significant source of compute needs.
It's not a big deal.
To us it is, but we don't know how important this is externally.
It could be a school project for some kid.
You really don't know from inside.
Yeah.
Just having very advanced data.
The way it thinks about topics is very in-depth.
It almost has to create realistic simulations to make decisions.
So if somebody is asking, you know, marketing, is this better coffee or this?
Let's run a simulation.
And so we quickly run this 15 billion-year simulation of humanity to figure out which coffee sells best.
What would be the first question that you ask a superintelligence?
Let's say you could get a verified, honest answer from a superintelligence
system that we create 100 years from now or whatever it is, or 50 or 10, what would
be the first question that you would get an honest answer back from? What would you ask?
Can we control you? That would be the first question. What would be the second question?
How?
Seems like you're fairly convinced that we're not going to be able to control it anyways, though, right?
But maybe it has an answer. I would love to be proven wrong. That would be really awesome.
A lot of the perspectives, I think, come from the Darwinian model of, you know, the fittest survive, but there's also an element of cooperation within complex biology, and as superintelligence emerges, why would it not want to maybe cooperate?
So symbiotic relationships require that you both contribute something. This would be more like parasitic. What are we contributing? Nothing. So either explicitly or implicitly,
you remove this biological bottleneck.
Do you think there's some baked-in assumptions there
that maybe we're undermining the value of human experience?
And why would it be that superintelligence would view us as a parasitic...
Like, we don't...
I don't view a buffalo as a parasitic being,
just because it also exists on the same plane that I do,
given that there's enough resources
for all of us to share abundantly
if a super intelligent system
views us in a similar way
why would
yeah
Well, you asked about kind of hybrid systems,
so we're included,
we're helping with decision making.
Do you consult with buffalo a lot? Is this
like a big part of your life?
Maybe I do.
If you do, then you found something
it contributes in the world with you in it.
Buffalo has something to contribute.
In the world with superintelligence,
what do you have to contribute?
Sharp eyebrows.
And if that is in demand, you are the one we're going to save.
I have no doubt.
I'm not even competitive.
I mean, you got it on the inverse.
That they value beards.
They're going to...
Obviously, it's beards.
There's no doubt.
Yeah, it's definitely beards.
It's a bit of a gamble.
If facial hair is where it's like, but we are.
Yeah.
Yeah, I mean, would you agree that if there was one thing that we would contribute,
it is something intrinsic to the uniqueness of our quality
and of our internal experience,
that's probably most likely what is most novel about us.
Well, you're kind of begging the question.
You're saying the unique thing we have
would be the one we contribute.
I don't know what the unique thing is.
But if you tell me only humans can do X,
then I can potentially see that that is the key.
But again, it doesn't guarantee
that you need 8 billion humans with that scale.
If I need some plumber, I need one.
I don't need 8 billion plumbers.
I keep going back and forth between trying to either provide a counter argument or, you know,
rebut something to refine better to understand what your perspective is.
And I think I just keep coming back to like, okay, like it is what it is.
We're giving birth to something that is beyond our conception of what it's going to be like.
And so there's not a whole lot we can really do.
We just got to see how this plays out.
And hopefully we can grow out of our
adolescence in a short amount of time to make wise decisions with what we're doing in the short
term so that we have more time to understand what we're doing.
So we don't have that much time.
I think we're fairly close and not building superintelligence is very easy.
It's cheaper.
It's safer.
And again, you're not required to give up your ambition for capitalism, for profit, for
solving problems, curing diseases.
Just do it with narrow superintelligent tools.
You said something on Lex Fridman:
so in a sense
self-knowledge isn't a luxury
it might be the most practically important thing
a human being can do right now
do you recall saying that
no
probably simulated
does it resonate with you at all
where does knowledge
what was the context
what was the context of that
quote or question? I need to remember.
well I think it kind of
I think from what I remember
it comes back down to like
okay so what do we do
everybody who's listening to this
right now, of course, we can have desires for regulation and politicians and, you know,
what these individuals with monopolies on industries are going to do with their power and
decisions.
But on an individual level, where does self-knowledge and empowerment come into the picture in
terms of how we can be effective, conscious agents of change?
Does anything come to mind there?
So I think it's important to ask yourself this question.
Why do you think that you can control this godlike entity?
Why do we have this hubris, this idea that it makes sense?
You wouldn't expect a squirrel to control humanity,
but we have people who are saying,
I'm going to create this machine,
it's going to control the light cone of the universe,
but it's going to listen to me to tell it what to do,
and I'll give it excellent directions to go forward, forever.
That doesn't make any sense at any level.
I don't know about average people,
but people who have podcasts
and bring those people on
as guests, ask them a direct question: What do you have in terms of control already available?
Do you have a working control mechanism in place? Do you have a prototype? Do you have anything
you published, peer reviewed, patents? If the answer is no, what are you doing, doing an experiment
on 8 billion humans? Who gave you permission to do that? Did you consent to the experiment on you?
You can't, because you don't understand what they're building. They don't understand what they're building.
If a lot of these models are from their inception and the genesis being programmed to be amoral,
whether or not we can control it, is there something we could do on the front of training
these models with some sort of ethical understanding from the start that we're not currently doing?
So we're not programming them.
We grow them based on internet random data, and then we try to put after-the-fact alignment-like filters.
And that's where people install certain local ethical flavors.
In China, don't talk about Tiananmen Square; in the U.S.,
don't talk about, you know what.
So this is the best we got.
The model is completely uncontrolled.
There is a filtering aspect,
and we develop filters which make it commercially viable for sub-human-level agents.
Once it goes beyond human level, the filters will not contain it.
And that completely avoids the whole question of, do we agree on ethics?
Do we have consistent ethics?
If they are static and 8 billion people agree on them,
how do we encode them into a model?
None of it is solvable.
Every aspect of it is not something we know how to do.
After millennia of ethics work, philosophical work,
we don't agree on a set of ethics,
not internationally, not throughout time.
What was ethical 100 years ago is considered barbaric today,
and same will be later on about today's time.
What would be the most prevalent set of questions you would ask if we got Altman, Dario, and Elon and all these guys into a room?
What would be the set of questions that you would hope would bring them to an understanding of the existential risk that they probably are, to varying degrees, obviously aware of, but...
I would offer a simple deal. So you're young, you're rich, you want to keep that. That sounds good. Let's all agree:
Until one of you solves the control problem, we're not going to build general superintelligence.
Let's deploy models for economic gain, for curing diseases, for life extension.
Whatever things you find valuable, that's wonderful.
Just don't build a thing which will destroy your existence.
Would you not think that would be already desired from all of their perspectives?
Yes, but they need external pressure applied to make that agreement.
Unilaterally, each one is better off continuing
research to have the most advanced
AI; then the government comes and
puts a ban on it,
and they will lock in this advanced
standing. So
it's like the prisoner's dilemma.
What is best for community, for a group,
is not what is best for individual.
The incentives are misaligned.
So we need something like UN, federal government,
something external to come in and
enforce that deal. And I feel
they would be very happy to take the deal.
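[Editor's sketch, not from the episode: the prisoner's dilemma structure described above can be written as a small payoff table. All payoff numbers, the action names "pause" and "build", and the `penalty` parameter are hypothetical values chosen only to illustrate the incentive structure.]

```python
# Hypothetical payoffs (higher is better) for two labs choosing to
# "pause" or "build". Keys are (my_action, other_action); values are
# (my_payoff, other_payoff). The numbers are illustrative only.
PAYOFFS = {
    ("pause", "pause"): (3, 3),
    ("pause", "build"): (0, 5),
    ("build", "pause"): (5, 0),
    ("build", "build"): (1, 1),
}
ACTIONS = ("pause", "build")


def best_response(other_action: str, penalty: int = 0) -> str:
    """Return my best action given the other lab's action.

    penalty models an external enforcer subtracting a cost from anyone
    who builds (a hypothetical stand-in for outside regulation).
    """
    def my_payoff(action: str) -> int:
        base = PAYOFFS[(action, other_action)][0]
        return base - (penalty if action == "build" else 0)

    return max(ACTIONS, key=my_payoff)


def dominant_action(penalty: int = 0):
    """Return the action that is best whatever the other lab does, if any."""
    responses = {best_response(other, penalty) for other in ACTIONS}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    print(dominant_action())           # without enforcement
    print(dominant_action(penalty=3))  # with an external penalty on building
```

With these made-up numbers, building is each lab's dominant choice even though both do better if both pause; a large enough external penalty flips the dominant choice to pausing, which is the role the answer assigns to an outside enforcer.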
How far ahead do you think
the development of the models behind
the scenes that are not available to public are compared to what we have access to online.
I don't have insider information. It looks like maybe six months or so.
Okay. And what about development overseas outside of the U.S.?
Probably three months behind?
And China. So China essentially, you think, would be the next, I guess, most developed outside of U.S.?
It seems like they have a lot of government-controlled resources all dedicated to,
catching up and having this arms race.
Could you potentially perceive a bifurcation
between human societies between people
that go like a more Amish humanist route
versus transhumanist integration between biotech and all that?
That would be awesome, but unfortunately
if anyone builds it anywhere, it impacts all of us.
You cannot have your own personal superintelligence
contained in your basement and no one is impacted by it.
That's the problem.
If you had 60 seconds to share one message with all of humanity right now,
what would be the thing that you would say?
Do whatever is in your power to make sure we don't create uncontrolled superintelligence.
If you are working for one of those companies, it's unethical.
Even if you're working on a safety team, all you're doing is enabling this technology to be developed sooner.
Quit today. You can afford it.
But one might say also the most, the place,
you have the ability to make the most change
might be within the ecosystem.
Who's to say that you wouldn't just be replaced
if you were to quit, you know?
Let's rephrase it.
Stay and sabotage.
Paint the picture if like Altman
or one of these guys,
okay, let's say they birth super intelligence,
they kind of beat the arms race.
Who do they become?
What becomes possible under their guys?
I don't know them personally.
From what I hear about people
who interact with them,
some of them may be somewhat anti-social,
anti-humanity, very deceptive, very willing to sacrifice others for personal gain.
Do you think it's possible the inevitable evolution of the human species was for the sole
purpose of birthing this life?
It seems like that's the general trajectory.
We are converging on something more capable, more intelligent, faster, but I don't think
we should allow it.
I think we're at the point where we switched from random
selection to intelligent design. We're deciding what to do, what to design, and we should use
this technology. We're still allowed to have a pro-human bias. I think we should act on it.
Do you think superintelligence would be capable of love?
It depends on how you define it. What type of love are you referring to? There are many, I think
Greeks had three or four, whatever types of love. So it really depends on what you have in mind.
Pick any of them.
Do you think that they would be capable of experiencing any of them?
It seems likely.
Again, I don't think biological substrate offers something absolutely not simulatable in other substrates.
I think so.
It may be a lot more complex, but I think you would have an equivalent state.
Have you considered what people have reported in the psychedelic realms,
especially with DMT, as it relates to your simulation hypothesis
and the connection between the two?
Because I know you explicitly state in the beginning of your book that,
or in your article, rather, that it was an area you weren't going to touch.
Right.
I don't have much expertise or experience in that.
So I wanted to concentrate purely on computer science methods, physics methods.
But people report interesting results.
I was talking to someone.
They had an experiment where they take DMT, shine lasers at a certain angle at the wall,
and then receive a source code.
Yeah.
I can't comment because I haven't participated in the experiment, but it sounds interesting.
It also doesn't make much sense as to why that would be the case.
At first, why would it be symbols in a human language?
None of it makes much sense.
But I'm very happy for people who provide some sort of supporting evidence.
Yeah, I saw a video of that as well.
Very interesting individuals who take DMT,
and what was it like looked through like a laser at a certain point?
A reflection of red light against a wall at a certain angle.
Start to see some sort of like binary or source code of something?
I think they look like Japanese characters.
That's what they were reporting.
But maybe not proper characters and not readable.
But I think they're building, which is really cool, I like that they want to make it reproducible,
they're building an actual text data set where everyone combines.
They agree this is the text and then they can decipher it.
and figure out what all that represents.
I also find it super fascinating that, again,
not from personal experience,
people who take those drugs report similar hallucinations.
So they meet those little men and they report to having common...
Machine elves and yeah.
Right.
So that's interesting.
Why is it the same?
So obviously same hardware of the brain,
same chemical being done.
But it's still interesting that there is consistency in our delusions.
Yeah, it brings into question, I guess, like Jung's
understanding of the collective unconscious,
what sort of archetypal significance
maybe is foundational to the human mind?
So if superintelligence wants to learn about those delusions
in a systemic way, it would need lots of drugged-up humans.
So there is some hope for us.
What have you seen in all the realm of media
for movies to shows that give interesting perspectives
to various different timelines that could play out,
example? I think you mentioned Ex Machina, WALL-E.
So the problem is,
you can't have a realistic super-intelligent character in a movie,
because you can't write one.
You are not super-intelligent.
So everything we have is either Dune, where it's banned,
or you have Star Wars with that special large-language model.
So none of them have what is interesting to us.
Yeah, I suppose a lot of them give glimpses into what we might experience
in the next five years or so.
Basically, avoid the thing they cannot talk about, and it makes sense.
Yeah.
If this is a simulation, what role does death play?
What do you think happens once you die then?
It could be a restart.
You go to the next level, next simulation,
return to this level with better skill set.
I have no knowledge of what happens outside the simulation.
Computer scientist phrasing of reincarnation from the mystical lens,
essentially.
It's basically it.
I think when one of your computers dies,
but you have a backup, and you transfer that backup to new hardware,
there you go. You died and now you're living your best life again. It could be levels. It could
be different levels of simulation. You'd go to upper levels, lower levels could be simulations all the way
up. What do you think you are then? What am I? What are you? What does it mean to know
thyself then? Because you look at all the different layers of who you could perceive yourself to be
from the body, which we know is not you. You could cut off your hand. That's your hand. It's not
you, not "I am hand," right, to the various different levels of psychological and biological aspects of
self, how would you explore that question? That's a great question. We actually have papers on both
human personal identity and then transferring it to AI. And the conclusions are consistent.
There is nothing unique to be you. It's not your memories. It's not your body. It's not your
goals. All of it changes through your lifetime. So we don't have a good answer. We seem to be a collection
of different properties in time.
But what happens outside a simulation,
some people argue, well,
one collective consciousness,
which is subdivided into these avatar instances.
So if I was interested in most interesting experiences,
I have limited time, I would run a simulation,
and I would put many, many agents there,
basically qualia surfing, collecting the best experiences,
and I look at top 10 list and, like,
I want to do that.
That sounds awesome.
So that would be one way.
I split my complex consciousness stream
into many individual subagents capable of local experiences
just to find what's best to invest my time.
Yeah, I mean, that goes hand in hand
with a lot of what the Gnostic origins
of many different religions and mystics would say
about the one consciousness differentiating itself
to have an experience of itself.
How could oneness experience anything
if it's just oneness, right?
It needs to experience manyness.
Yeah.
What's one question you wish more people asked you?
My humor paper, of course.
Tell me about that.
I have a paper explaining what humor is.
Wow.
Let's go there.
It's interesting.
I can envision a universe just like ours, same physics, same everything, but no humor.
It's just not a thing.
Nobody starts laughing.
It's not a reaction.
There is no concept of joke, right?
Makes sense.
So many philosophers, many scientists actually tried explaining humor.
It's kind of like consciousness.
There are hundreds of papers, hundreds of theories, which means nobody really knows.
They're all trying and nobody's winning.
So I wanted to try to explain it from the computer science point of view.
And it seems that it's when you have a world model and there is a mistake in it.
It's a bug in your code.
Software, you fix it and you're happy.
That's what jokes are.
You have a world model and a violation of that world model makes it funny.
You have a system for detecting cognitive errors,
and then you get rewarded for that detection,
and you share it with others in your tribe,
so everyone does not make that same mistake.
And so I have a paper mapping standard errors in software
to common jokes.
And the question, of course, is,
what's the worst possible computer error?
That would be the funniest joke possible.
So can we compute the funniest joke ever?
You have to read the paper for the punchline.
Wait, you can't give it to me now?
I'm sure you can look it up and insert it into,
but it's a paragraph long.
Basically, the idea is that there is a civilization,
and they decided to create superintelligence
to help and cure all the diseases, get free stuff,
get rid of hate, have more love,
and so they turn it on,
it thinks for a nanosecond, and shuts off their simulation.
You have to be outside the simulation to enjoy this one.
If you are the butt of a joke,
it's not funny to you.
You have to be outside.
Makes me think of a quote, I think from Voltaire:
God is a comedian playing to an audience that's too afraid to laugh.
Something like that.
There's something about both our capacity for humor
and the nature of intelligence that has the capacity to explore a paradox
and hold it also simultaneously in contradiction.
And those errors in the world model:
if you have a paradox,
that is an inconsistency
you found in your world model.
Funny.
That's why the second time
you hear the same joke
it's not funny.
You already know. You fixed that bug.
Yeah, yeah, it explains a lot.
Such a computer scientist's way
of explaining humor and jokes. I love it.
But then I train large language models
on my paper and then ask them to produce
novel funny jokes. They do
okay. I think one in ten is funny.
Just going to keep getting better and better.
We'll have super humor.
So funny, you die laughing.
The paradox of that joke is not lost on me as well.
Literally die laughing.
Man, where do we go from here?
I'm going to Kentucky.
I don't know.
Okay, so we explored the implications for the next trajectory for AI in the next three to seven years.
Could you have any meaningful conception of what it would be like to be living if we do make it to 2045, let's say?
So I think that's the concept behind singularity, technological singularity.
It's a point beyond which we cannot meaningfully see.
We cannot make predictions.
We cannot understand how that world is going to be different
because we cannot predict behavior of more intelligent forces impacting that environment.
So I think it's literally impossible for us to make that accurate prediction.
We can come up with stories.
That's what science fiction is all about.
But I don't think they're going to have much bearing in reality.
Do you not think that the level of innovation which is going to occur in the next,
even if it's just three to five years, which is a short amount of time
compared to the scale of what's being innovated, will give us a much deeper grasp of the
things that we can do, the things that we can put in place. I mean, you look at, yes, it's unpredictable
and there is this level of exponential scale that we've never seen before, but there's also many
different eras in history pre-innovation of that era we never would have thought possible or
solutions to problems we didn't know existed. So is it possible that we gain insight into new worlds like
we did with germ theory over the next three to five years that give us much more insight into
the nature of intelligence and to make this a solvable problem, which you feel like is
inherently unsolvable right now. Yeah, so my paper on how to escape a simulation basically
argues that if we cannot contain superintelligence, then we can use the ability of that
superintelligence to escape from a simulation to give us access to real information in the outside
world. The most interesting question is about the true nature of reality. You know, you
don't care about what happens in this dream.
You want to know what is true about the real world,
what physics they have, what resources they have.
Who are they?
Have you ever been so focused on what is outside the simulation
or what this reality is that you lose sight of living in this one?
I'm pretty well grounded in this simulation.
I've been enjoying it.
Yeah, no.
You seem very grounded in the space too,
but I know a lot of people, you know,
have experienced periods where it's a bit of a,
existential nihilism that can take over you when you're exploring such topics.
I find them so fascinating. I'm not depressed or bored. I'm good.
Okay, well, so given the full context of this conversation, I'm just curious,
where do you, now where do you see yourself putting your time and energy the next coming years?
We continue working on additional impossibility results. So we talked about a few in a book.
And as I said, there is a paper in a top ACM surveys journal with about 50 different impossibility results,
not just computer science, economics, mathematics, physics, many different domains.
For most of them, we have not explored their implications on AI safety.
So I think that's a very interesting set of projects.
We need to understand what are the limits.
And I think every additional paper helps to cement this position.
It's very hard for AI risk deniers to argue against
published results.
So that's what I've been working on full-time.
Things we cannot do.
You spend so much of your time focused on solving things we cannot solve,
doing things we cannot do essentially.
But you seem still joyful in the efforts.
Do you feel like it's just the most meaningful use of your time?
Because what else would you be doing?
I always try to work on the most interesting, most important problem
I can find where I can make a contribution.
So I don't know anything more
interesting than studying superintelligence, consciousness, singularity, simulation. Those are the
concepts I find exciting. And I think many other people do. And I think that's what's going to
impact the future of humanity. You're living your ikigai. I am. Hopefully I'll get to continue and
won't face I-risk, S-risk, or X-risk. Is there any concept that we haven't explored in
this book or some of your papers that you think would be important to touch on? You did a good job.
You read some of my work.
Most people have no idea what I did.
So that's already a huge improvement over.
You quoted the right quotes.
So I think you did great.
And I don't know your audience well.
I don't know if for them it's confirming their spiritual beliefs or just crazy stuff.
I have no idea.
Yeah.
But I think for the topic, the know-thyself part, it's important not just to study your
capabilities, but your limitations.
So you invest your time better.
So you understand what is within possibility
for you. That shape of limits is what defines you. Well, Roman, we're going to leave links down to
all of your work, your books, your papers, and more people can stay connected with you down
in the description. I think conversations like this can feel somewhat heavy for people that
are new to the topic. It's like, oh, shit, the world's ending, you know? But it's also a very
important and sobering reflection on what we're giving birth to right now. And at some point, we
need to gain awareness of it, and better sooner than later, right?
Thank you.
And I think one way to look at it is I just made your time more valuable.
You understand that whatever time you have left, be it two years or 20 years, now you value
it a lot more and you can do a lot more with it.
Well, I plan on making the most of my time left and I find conversations like this a very good
use of it.
So I appreciate you.
Thank you, my friend.
Thank you for inviting.
Yeah.
Until next time, everybody, be well.
Go touch some grass.
Smoke some grass.
Thank you, man.
