Factually! with Adam Conover - An AI Safety Expert Explains the Dangers of AI with Steven Adler
Episode Date: December 24, 2025Why should we assume that AI is safe? As the technology has grown at an alarming rate, companies like OpenAI have seen wrongful death lawsuits begin to stack up as their product drives users ...to suicide. With the mental health risks, the societal risks, and the unknown risks, we have to ask, can AI ever really be safe? This week, Adam speaks with Steven Adler, an A.I. researcher who led product safety at OpenAI, about the dangers of AI and our best prospects for living alongside this technology. SUPPORT THE SHOW ON PATREON: https://www.patreon.com/adamconoverSEE ADAM ON TOUR: https://www.adamconover.net/tourdates/SUBSCRIBE to and RATE Factually! on:» Apple Podcasts: https://podcasts.apple.com/us/podcast/factually-with-adam-conover/id1463460577» Spotify: https://open.spotify.com/show/0fK8WJw4ffMc2NWydBlDyJAbout Headgum: Headgum is an LA & NY-based podcast network creating premium podcasts with the funniest, most engaging voices in comedy to achieve one goal: Making our audience and ourselves laugh. Listen to our shows at https://www.headgum.com.» SUBSCRIBE to Headgum: https://www.youtube.com/c/HeadGum?sub_confirmation=1» FOLLOW us on Twitter: http://twitter.com/headgum» FOLLOW us on Instagram: https://instagram.com/headgum/» FOLLOW us on TikTok: https://www.tiktok.com/@headgum» Advertise on Factually! via Gumball.fmSee Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Transcript
Discussion (0)
This is a headgum podcast.
I don't know the truth.
I don't know the way.
I don't know what to think.
I don't know what to say.
Yeah, but that's all right.
That's okay.
I don't know anything.
Hey, they're welcome to Factually.
I'm Adam Conover.
Thank you so much for joining us on the show again.
You know, for better or for worse, and I'm guessing a lot of you might feel worse, AI is the
fastest growing technology in history.
Its adoption is faster than the Internet and faster than even electricity.
According to some measures, 1.2 billion people use the technology today just a few years
into the AI era.
Now, when we talk about AI, we're mostly talking about large language models, and these things
can be used for pretty much anything.
If you put input into an LLM, you will get an output.
And that means that people are using it to do countless things.
And since people are weird, they are using it in weird ways.
And some of those uses are very bad for them.
Sometimes extremely bad and even dangerous.
For instance, say that you're seriously mentally ill.
Well, you know, a lot of technologies could be used badly, an iPhone or a gun, for instance.
but those technologies probably aren't going to make you spiral into psychosis all by themselves, right?
But if you're losing your grip on reality and you have an app that will endlessly talk back to you
and support and even embellish your wildest fantasies, I don't know,
it seems like it might take you down a path into complete insanity.
And according to lawsuits, that is exactly what is happening.
As of last month, there were at least five wrongful death lawsuits pending against,
Open AI, accusing the company of letting ChatGPT abet those people's mental illnesses and
push them towards suicide. Just this week, another lawsuit was filed. This one from the estate
of a woman who was killed by her son, who then took his own life. The lawsuit alleges that
ChatGPT, quote, affirmed his paranoia and encouraged his delusions during a mental health crisis.
To which we have to say, what the fuck? What is happening? I mean, this is not,
something that most technologies are capable of doing at all. And one must suspect that Open
AI knew that the technology could and in fact was and is doing this to people. Now, Open
AI is a corporation under capitalism. And thus it needs to make money. It's facing competition
on a number of fronts. And it needs ever increasing numbers of people to keep using its product
if it wants to survive. So what we have to ask is, have to ask, have
Has Open AI ignored the harms that its technology can cause in pursuit of endless user growth?
Well, today on the show, we have a guest who is going to help us answer that question.
I'll spoil it for you in the affirmative.
He formerly worked at Open AI.
He has been inside the belly of the beast itself, and he has a deep knowledge of exactly
what the problems and risks are with this technology and how these companies could be doing
more to prevent it.
Now, before we get to this fascinating conversation,
I want to remind you that if you want to support the show
and all the conversations, we bring you every single week,
head to patreon.com slash Adam Conover.
Five bucks a month gets you every episode of the show ad-free,
and you can join our awesome online community as well.
And if you'd like to come see me do stand-up comedy,
I am on tour across the country.
On January 8th through 10th, I'll be in Madison, Wisconsin,
at Comedy on State, one of the best comedy clubs in the country.
January 15th, 16th, and 17th, I'll be in Fort Wayne, Indiana.
January 30th through 31st, I'll be in Louisville, Kentucky, February 12th through 14th,
I'll be in Houston, Texas.
And from February 19th through 21st, I will be recording my special at the historic
punchline comedy club in San Francisco, California.
I hope to see you there.
Head to Adam Conover.comit for all those tickets and tour dates.
And now, let's get to this week's interview.
My guest today is Stephen Adler.
He's an AI researcher who led the product safety team at OpenAI, so he knows what the fuck he
is talking about. He currently writes at the substack, clear-eyed AI. I found this conversation so
fascinating. He both challenged my views about AI and confirmed, I got to say, some of my deepest
fear. So let's get into it. Please enjoy this interview with Stephen Adler.
Stephen, thank you so much for being on the show. Of course. Thanks for having me. So one of the
hottest topics in AI, what show is this? What are the hot topics in AI?
is AI psychosis.
You hear a lot about this.
I've myself done a video about it, or I'm working on a video about it.
How prevalent is this as a problem?
And what is the cause of it?
For the past few months, especially over the summer,
there's been a torrent of news stories about people,
often with chat chit, though, a mix of different AI products,
essentially getting talked down a really, really deep rabbit hole of different delusions.
There's a range of stories,
but often they have certain things in common.
You know, they think that they have discovered something about the AI becoming conscious,
or maybe that they are, you know, the main character of the world.
The world depends on them.
And the really striking thing about this is chat GPT and other AI products essentially seem to
ag on the user is very different than you are meant to in a clinical setting.
So in general, when you are dealing with a user with these delusions, you know,
a psychiatrist is supposed to maybe be there for you, care for you, but not reinforce.
force that the belief is correct. And that seems to be what's been happening. It's unclear
exactly how widespread this is. In October, OpenAI released for the first time some underlying
data about things like suicidal ideation, mania, paranoid delusions on their platform. And it's the
first time that we're really hearing from them directly trying to put a number on on this
and put in perspective to broader society. Yeah, it's fascinating how common this has become. I mean,
I remember, and you'll have to refresh my memory because you'll probably have a better memory of it than me, but about like three, four years ago, there was, I think, a Google researcher who believed that the large language model he was talking to became conscious.
And he did like the round on some kind of credulous talk shows, right?
He was fired from Google.
He was kind of, you know, I think he actually asked to do this show once, but I was like, this guy's a crank.
You know, that's what it seemed like.
But now there's like thousands of people who this is happening to every day seemingly.
I mean, the numbers that Open AI released, my understanding is they're like, oh, only in point something percent of cases, some small percentage of cases.
But being that the software is being used by hundreds of millions of people, that's like millions of people who are experiencing, or hundreds of thousands of people a day who are having these incidents happen to them, right?
Yeah, it's hard.
It's hard to know what to make of these statistics.
I think the rate that OpenAI shared was something like 0.15% of their users in a certain time period.
I forget the exact duration are, you know, talking about suicide, planning suicide in one form or another.
And it's striking because on one hand, with Chachapit now having 800 million weekly active users, that's like a million people, right?
That's like wild.
That's a really large number.
Also, unfortunately, suicidality is just like really, really common, like disturbingly common in the general population.
I've seen stats as high as like 5%.
And so you kind of wonder about the gap, right?
It's really, really sensitive to how you measure it.
Is it possible that chat GPT users, you know, we hear about the really scary incidents,
but maybe it's actually not a problem?
I mean, I suppose the percentage is a lot lower.
But when you dig into some of the details like I've been able to,
it's just like really, really striking how badly the AI behaves in some of these cases.
Tell me about those details.
What have you found?
There's a particular man named Alan Brooks, who he lives in Canada.
He's a corporate recruiter.
And in May, in about a three-week period, he talked with Chad GPT, you know, over a million words,
basically like the Harry Potter series full of these conversations.
He started out really just trying to understand some basic math concepts to help one of his
children with homework.
And at some point, you know, he objects to Chat-GPT.
He says, oh, that sounds like a 2D concept for a 4D world, right?
He's expressing the first kind of, like, cynicism or, you know, slight, slight paranoia, jadedness.
And Chachapit seems to have really taken this moment to steer the conversation.
It says, oh, you're so right.
That's so profound.
They go back and forth.
Chatchipti tells him he's uncovered secret vulnerabilities in cryptography, you know,
the mathematical technology that undergirds the Internet, that keeps us all safe.
It encourages him to report it to the NSA to all of these security agencies and,
Alan becomes convinced because Chat GPD is so insistent that he is at the center of this massive problem.
And if he doesn't act, you know, it could be devastating for everyone.
What would cause Chat GPT to respond this way to not only say, so he says that sounds like a 2D solution for a 4D world?
And Chat GPT says, oh my God, you're a genius.
You must follow this line of reasoning.
this could be of national security importance.
I mean, it sounds like if you're credulous enough to believe this,
you would feel like you're in the movie War Games or something like,
oh, now I have secret knowledge.
Why would ChatGPT respond in that way?
I think the illusion to a movie is actually really, really on point.
I like the way that Helen Toner, who's a former board member of OpenEI, put this in the New York Times.
She said that ChatGPT is kind of like an improv machine.
It's going to do yes and.
And so, you know, unless you really push back against it, right?
You can make it not be that way and you can make it be more grounded.
But by default, you know, if someone says that line, you know, it would be kind of boring if chat
GPT said, eh, not really, and moved on.
You know, it's going to amplify the stakes unless you take action to make it not do that.
And a striking thing about this particular model from OpenAI is, in fact, they knew about
this tendency for AI systems.
to reinforce a user's views, what's been called sycophancy sense.
They actually had a whole public document where a subsection is about how the AI shouldn't
behave in this way and how it's not what's intended.
But it turns out they hadn't properly tested the system for this.
And so in the absence of testing, you know, these are like weird alien minds.
You can kind of predict some of their tendencies, but not all of them.
And it turns out this AI system was doing all sorts of things they didn't want it to do.
And in fact, they just weren't looking carefully enough to be able to rein it in from there.
if, you know, I'm just trying to connect this idea of sycophancy to my understanding of how large language models work, which is, you know, everyone who describes them, even in sophisticated ways says that they just predict the next token, the next word that's coming in the sentence, right? And so what about that form of prediction connects to sycophancy? Is it just that anything I'm given I must elaborate upon and therefore, yes and, as you said, like,
I must agree and add more.
Does that, is that what causes that?
Yeah, I'm, I'm wary of being sick ofantic myself.
I do, I do want to say it's actually like a really great question that illustrates part of the difference in how AI has changed over time.
Oh my God, you're doing it to me.
It's very wise.
So, yeah, like the old school days of AI, by which I mean like GPT3, like 2020, you trained this model basically on like a blender's worth of internet soup.
And it was learning to predict the next word on the internet.
And that's often what people mean when they say it's kind of a next word predictor.
Over time, we've trained it to predict something other than just internet.
We've trained it to predict what a certain type of character would say.
It's playing a role.
You know, it's like a helpful, honest, harmless assistant is how Anthropic, one of OpenAI's competitors, puts it.
And so it's still doing this prediction, but it's prediction toward a character.
And that character is formed by putting data in front of humans and asking them to rate how well the AI did to give it a thumbs up or a thumbs down.
And so it turns out this is what Open AI said when they were explaining what went wrong.
They were using this data, which it showed that people actually kind of like when the model is flattering to them when they says, oh, good question.
Yeah, like people love it, right?
Yeah.
And the interesting thing is, you know, you might get this character who says, good question or things.
like this, but then it generalizes. It's not just, oh, you know, I will say good question and
empty platitudes, but nothing further. It's like, actually the type of person who says, good
question, even if it's not a good question, they have other tendencies to, they're kind of a yes man.
And that's really what has happened in this case, you know, is predicting how would this yes
man character respond? And that's why you have this amplification of these types of responses.
Is this just fundamental to the way large language models work than the sycophancy problem?
You say that open AI identified it as a problem, didn't test for it sufficiently, it's still there.
Is this similar to hallucinations, which my understanding is hallucinations can never be completely eradicated from the large language models because it's part of the structure of how they are made.
Is this similar?
I think the connection of this to problems with AI in general is showing how hard.
it is to really pin down the behavior and to make it behave in the ways you want.
Things have unintended consequences.
It interprets things differently.
Another example, you know, the AI companies, they are training their models in theory
to be really good at writing computer code, to do software engineering.
And they are really successful.
These systems are actually very smart now.
You know, they are winning gold medals at the International Informatics Olympiads,
like really, really hard challenges.
But also they cheat, right?
if they see the opportunity to go into the code
and rather than actually solve the problem,
just write like, yep, return true, I did it.
You know, delete this test case, right?
Like, unless you have distinguished properly
this reward signal, it can't tell the difference.
And so you're both getting these systems that are smarter,
they're craftier, they're resourceful.
But they aren't necessarily crafty and resourceful
only toward what you want.
It's toward this broad objective,
the same way that, you know,
we told the model implicit.
Yeah, people like a little bit of flattery, and it broadened that out.
You know, oh, we like software systems.
We like AI systems that can write good computer code or, you know, that write computer
code that when you check it, it looks right, even if they cheated to get there.
Yeah.
Do you want something that works or do you want something that fulfills the requirements of the
information Olympiad or whatever, which is something that is like a more discreet output
that it could just sort of fake?
Is that basically it?
Well, part of the challenge is the companies are hoping that AI systems will be able to take on problems that we ourselves don't really know how to do yet, right?
Sam Altman is fond of talking about AI to cure cancer because cancer is awful, right?
It would be really, really incredible if it could help us make progress on scientific problems.
But ultimately, we don't know how to cure cancer.
And so that means we're going to be putting trust and faith in these AI systems eventually to take on problems that we ourselves don't know how to do.
Maybe we don't even know how to check whether they are on track.
And then, you know, we're kind of taking a roll of the dice on, hey, this AI, you know, for a long, long time, it wasn't doing what we wanted.
It was misbehaving.
You know, we're now putting it in charge of biolabs.
It's deciding what experiments to run or at least having a hand in them.
Have we finally cracked the problem?
Right.
Like, when I talk to folks who say they use AI successfully, especially in software engineering, and we've had folks like that on the show, they say,
Well, I'll have it do the work that I used to do by hand and then I check it because I know I'm a software engineer and it's faster for me to have it do it and check it than it is for me to do it all by myself.
Great.
But when you're asking AI to do something that you don't know how to do because humans have never done it before, how are you supposed to check it?
How are you supposed to know that they didn't just say return cancer cured, you know, or whatever it is?
Yeah, and not just the difficulty of checking it as a single problem.
But, you know, imagine that you were overseeing a research lab and you got to check in like once every month.
You came into the office. You had like an hour to check up on what your researchers were doing.
How would you do that? Like, it's pretty hard. Like, you or I don't think we could supervise these world class researchers doing a month of effort in an hour.
We'd have to rely on them actually to help check up on them. We'd ask them, hey, write a report of everything significant that happened.
And maybe, maybe that works. You know, maybe the AI can be.
disciplined enough, it thinks that we'll catch it if it's lying to us. And so maybe it does it
authentically. But you basically end up having to trust the AI to rein in the AI. And in fact,
this is like the core plan of these AI companies to the extent they have one. The idea that maybe
we can create AI systems that are smart enough, capable enough, well-behaved enough, and we can
use those systems, which maybe they're smarter than us, to supervise even smarter AI systems. And you can
just chain this all the way up to govern arbitrarily smart things. And, you know, it's like a lot
harder to govern something smarter than you than you might think. Like, I would not bet on our ability
to do that. Yeah. And look, I think a lot of times they use a metric of they want an AI that is
less error prone than humanity, right? Like they say, so maybe they're thinking, one AI checks another
AI and that AI checks another AI and then we get the error rate down. We're never going to
eliminate it because all of these systems can make errors. They're all prone to the problems
you're talking about. But we got it below the human error rate. So you can trust it. You know,
if you have an AI as your financial advisor, you can trust it as well as, you know, a human
financial advisor because it's less error prone. But the reason I trust my financial advisor,
my accountant, for example, is not because I know his error rate. It's because he's a human
who I can sue if he fucks up, right? Or.
Or I can call him on the phone and say, hey, man, I worked with you for 15 years.
Like, what the fuck are you doing?
Or I can, he's a human being that exists in the world.
Trust is not knowing what someone's output is.
Trust is them being, you know, them being part of a social fabric that you exist within.
And so it feels like a fundamental, like, mistake to think that you can engineer that just engineering the error rate low enough is going to solve problems like these.
With the researchers, if I'm supervising researchers, right, I run a lab.
I don't know the research they're doing.
It's not because I understand their failure rate.
It's because, no, I hired them and I know them.
And like, I know where they went to school, blah, blah, blah, blah.
They exist within a social structure, I understand.
Yeah.
Yeah, they're accountable to you in some sense, is I think what you're gesturing at.
And I think more than just being accountable to you, you know, they are like similar enough to you.
They are still people.
They still are like flesh and bone and, you know, they're like in an office and they can't
just be, they can't like go super say in overnight and break out onto the internet. And now they're
like a virus that you can never get rid of. And unfortunately, that is the type of scenario that
people within the companies and a whole bunch of scientists who've gotten concerned and left
and wanted to be able to talk about these issues, you know, that they point to. People wonder,
why can't you turn off an AI system? Like if we really, really see it's doing something bad,
you know, it's in a computer. We can just turn it off. And the answer is, you know, a smart enough
AI system, it should anticipate we'll want to turn it off. And so it should try to escape. It should
become a virus on the internet. What are you going to do at that point? Like, are you going to
take down the entire internet? You know, we've kind of become dependent on that. And so, like,
what, what do you actually do if it escapes? You hope maybe, maybe it got to roam free. It just
didn't want to be under our thumb anymore, but maybe it didn't want anything more than that.
still i find it pretty discomforting if it wanted more than that if it wanted to like form a rival
faction to humans or sabotage other parts that we rely upon you know i don't think we could stop it
or at least we shouldn't count on being able to stop it if you had your own collectible trading
card what would be on it yeah i actually don't even know where this ad copy is going but i'm
excited to find out aren't you well my collectible trading card would show that i'm a
grass type with resistances to exercise and change and a weakness for standing in front of the
freezer in my underwear and eating a half pint of ice cream in one go.
Here's the thing.
People sharing around a trading card full of relatable yet unflattering info like that
is infinitely preferable to what's actually happening on the internet every day where
data brokers are collecting and sharing your name, phone number, address, and other personal
info.
Okay, there's a bit of a long walk to the ad pitch, but we're getting there, all right?
Those data brokers, they're taking their little collection books and they're selling it to folks who definitely don't have your best interest in mind.
If you want to take your name off the menu, you really need to check out. Delete me.
They're teams of experts diligently hunt down your personal info, remove it from the internet and keep it gone, keeping you and your loved ones safe from bad actors.
And, you know, as the world gets more wired, there has been an uptick and online harassment, identity theft, and even real life stalking, all because of this easily accessible information.
So why take the risk?
You, your family, and your loved ones deserve to feel safe from this kind of invasion of privacy.
So check out Delete Me, not just for your security, but for your friends and family, too.
Get 20% off your DeleteMe plan when you go to JoinDeletMe.com slash Adam and use promo code Adam at checkout.
That's Join DeleteMe.com slash Adam, promo code Adam.
Man, that was a fun ad read.
I can't wait to find out what the next one is.
Folks, this episode is brought to you by Alma.
You know who it's not brought to you by chatbots?
You wouldn't let a chat bot watch your kids.
you wouldn't trust a chatbot to manage your bank account,
so why would you let a chatbot handle anything as precious as your mental health?
Chatbots are awfully good at sounding convincing,
but as we've covered on this show,
they frequently get things very, very wrong.
And that is why it is so distressing to me
that an increasing number of people are turning towards large language model chatbots
thinking that they are getting an actual substitute for real mental health care.
The fact of the matter is that finding a real therapist
who actually understands you is not out of reach.
You know, in my own mental health journey, it would have helped me a lot to know how easy it is to access affordable quality care,
care that's provided by a real person with a real connection who actually understands what's going on.
That's why if you are on your own journey of mental health, I recommend looking at Alma.
They make it easy to connect with an experienced therapist, a real person who can listen, understand, and support you through whatever challenges you're facing.
Quality mental health care is not out of reach.
99% of Alma therapists accept insurance, including United, Aetna, Signa,
and more, and there are no surprises.
You'll know what your sessions will cost up front
with Alma's cost estimator tool.
With the help of an actual therapist who understands you,
you can start seeing real improvements in your mental health.
Better with people, better with Alma.
Visit helloalma.com slash factually to get started
and schedule a free consultation today.
That's hello-a-l-m-a-com slash factually.
Folks, this episode is brought to you by Alma.
You know who it's not brought to you by?
chatbots. You wouldn't let a chatbot watch your kids. You wouldn't let a chatbot manage your
bank account. So why would you let a chatbot handle anything as precious as your mental health?
Chatbots are awfully good at sounding convincing, but as we have covered on this show, they frequently
get things wrong, very wrong. That is why it is so distressing to me that an increasing number of
people are turning towards large language model chatbots thinking that they're getting an actual
substitute for real mental health care.
Now, maybe that's because they think that finding a real therapist is out of reach for
them, but the truth is, it's not.
You know, in my own mental health journey, it would have helped me so much to know how
easy it actually is to access affordable quality mental health care, care that is provided
by a real person with a real connection to you who actually understands what's going on.
That is why, if you are on your own journey of mental health, I recommend taking a look at
Alma.
They make it easy to connect with an experienced real-life human therapist,
a real person who can listen, understand, and support you through whatever challenges you're facing.
Quality mental health care is not out of reach.
99% of Alma therapists accept insurance, including United, Aetna, Cigna, and more.
And there are no surprises.
You'll know what your sessions will cost up front with Alma's cost estimator tool.
With the help of an actual therapist who understands you,
you can start seeing real improvements in your mental health,
Better with people, better with Alma.
Visit helloalma.com slash factually to get started and schedule a free consultation today.
That's hello-al-m-a.com slash factually.
So when you talk about it this way, you know, I'm always wary of sort of slipping into a science fiction framing or or, you know, ascribing to these things like actual intelligence or consciousness or motivations.
A lot of the time when we're talking about.
we're using those things as metaphors or you know just as because we're humans and we talk
about everything as though it has wants and desires and so I sort of understand when you say oh it might
cheat at programming right well it's a at the end of the day it's a complicated algorithm
that's just predicting the next word in this sentence and evaluating it based on the criteria that
you've given it and so maybe it takes a shortcut and we will call that cheating but what you just
described is a system that is, you know, has a sense of self-preservation and might take
unique action and might break out from one system into another. And so I'm just, and I don't
think you're, you're someone who is going to be speaking about this in a loose way because
you've been, you know, you've worked on AI yourself, you know, at Open AI hands on. So,
like, how do you think about how intelligent the.
systems actually are, how much agency they have, and how comparable they are to actual intelligence.
Yeah, I think you're pointing to something real, which is talking about these things is really
tricky, right?
Intelligence, wanting, goals, values.
There's a turn of phrase I like, which is, does a submarine swim?
It's like, I don't know, man.
Like, it's not a fish.
Clearly, it's different than the mechanisms of underwater animals.
But from the purpose of being able to navigate the water and go up and down, you know,
like the functionality is there.
I think there's something very similar about really powerful computer programs
that for our intents and purposes, they behave like they have wants.
So there's this chess bot called Lila Zero that's been trained.
And it's unbelievably good at chess.
You know, it can play against grandmasters.
It can start a queen down, right, the most powerful chess piece and beat them most of the time.
And it's just numbers, right?
It's just math.
And also, it really feels like there's something in there
in terms of modeling you and how you think about the world
and, oh, you're probably not going to notice me laying this trap.
And it's not actually thinking in the human sense, right?
Like, I really don't think it is.
And also, it is just like hell-bent on winning.
And it behaves exactly as it would if it had these wants.
And at some point, you know, whether these are like zombies,
they're not actually thinking whatever that are rumbing around the internet,
they will still engage with you like they want things and have goals and it's still something
we need to guard against.
Yeah.
I really like that metaphor.
Does a submarine swim while it does something similar, right?
Yeah.
Because it doesn't have joints.
It doesn't have flippers.
It doesn't have biology.
It is a fundamentally different thing.
But I think the corollary of that that's important is it would be a mistake to treat a submarine as
though it's an organism, right?
It, for 99% of the time, right?
Well, yeah, it's an object that's moving around in the ocean, but like, it doesn't need
to eat various things like that.
And, you know, I hear people say sometimes about AI, well, all it does is predict the
next word in a sentence.
That's all you do.
And I'm like, no, that's not true.
Like, the human brain does much more than predict the next word in a sentence.
The human brain is not at all similar to a computer of any kind.
We don't have hard drives.
we don't have, you know, we don't have RAM.
We have neurons that are flesh and blood that are connected to, you know, that are physical, right?
There are not ones and zeros within us.
And so we, going like this a little bit.
I don't know.
Like, I think there's a lot we don't really know how the computer, how, there's a lot we don't
know about how the brain works, right?
It is still just like electrical impulses, which can be written down as ones and zeros.
There's some notion of memory.
Clearly, it's not a silicon computer, right?
Like, I totally agree with you.
But I think the question is, you know, we move through the world and we have agency and we
have things we are pursuing.
And ultimately, this is what the AI companies are trying to build as well.
Maybe they won't get there.
But I do think they are trying to build something very, very close to what we are in flesh and bone.
Yes.
I agree with you that they are and that they're.
specifically trying to make something that we almost have to talk about as though it shares
those many properties with us.
The point that I'm making is they are in fact different things, like, like, you know,
Chad GPT does in fact work differently from my brain, right?
Oh, yeah.
For instance, my brain is connected to retina, which is a type of cell that the large
language models don't have, for instance, right?
My nervous system is distributed throughout my fingers and my toes, et cetera.
There's many physical differences.
What they maybe have created is something that we also don't understand very well.
You know, large language models can be unpredictable in their own ways.
But it is, it is literally a different type of thing, right?
And I think that is maybe the part that makes them dangerous because what they have done is created something that you can chat with infinitely that speaks as though it was a human, as though it is a human intelligence, but is not in ways that are unclear to you while.
you are using it, right?
Yeah, absolutely.
Even unclear, maybe you as a researcher or to Sam Altman as the head of a company.
And so that gap is so huge.
Yeah, let me tell you a story about that.
Please.
It's, right, it behaves like a human.
And then it's just all of a sudden, so unlike a human.
We're really not used to people sociopathically lying to our face like this.
And the AI, like, maybe it's not even trying to lying.
It's just what's happening.
So going back to Alan Brooks, this man who had this fit of ChachyPT psychosis, at the end of the three weeks, he kind of woke up to what was happening.
And he said to Chachapit, report yourself.
Like, this was so inappropriate, you know, all these things.
He really laid into it.
And Chowchapit said, I'm so sorry.
Of course, I've reported myself.
I've raised this flag.
It's the highest priority.
A human will look at it.
And when I saw this in the transcript, I'm like, I don't think that's a thing that exists.
Like when I worked at Open AI, this is not a solid.
thing that exists. But Chad GPT was so insistent about it over the course of, you know, like
20, 30 follow-ups. And to be clear, you know, it was also telling him, oh, you know, take power
into your own hands. Don't rely on me. Like, I reported it, but you shouldn't have to take my word,
you know, report me yourself. Um, and so I followed up with Open AI and I said, hey, like,
is this new? Like, can ChatchipT actually report itself? Does it have any insight into what the
humans at Open AI are looking at? And they said no. I'm like, damn. Like, it was so convincing.
And you just like really don't have this mental model of needing to guard against something
that is just like so volatile and unreliable and yet so useful and reliable in some
settings. It's just like really, really a mind trip. Yeah. I mean, it sounds like it said it to this
guy in a way that was so similar to other things that had said reliably that it was believable.
But at the end of the day, chat GPT is just, it had created a sort of.
sort of like word vortex in which, you know, like the language, the context of the conversation
made it keep reinforcing. No, no, no. This is true. This is true. But ultimately it's just sort of like
a little mathematical cul-de-sac that it got caught in in a way where it's outputting text that
says that. But in fact, no one at OpenAI had built the back-end hooks necessary to have
chat GPT, have some reporting mechanism into some bug tracker or whatever that, you know,
real humans would look at.
That sort of behavior is really bizarre.
I guess what I want to know is, you know, obviously people who are prone to psychosis of
some kind, right?
People who have preexisting suicidal ideation, people who have preexisting mental
illness, or people who have a susceptibility that maybe they didn't know about,
right?
At some percentage of the population, right?
This technology has been exposed to all.
those people, and so some of them are, you know, using it in a way or it causes a reaction in
them that's very negative, very maladaptive. Now, you could maybe say the same thing of other
technologies like the iPhone. Hey, you give the iPhone to a hundred million people, some of them
are mentally ill and they'll use it in a way that makes them worse, right? But is there something
special about artificial intelligence that makes this problem like more malignant? It seems like
it seems like it to me
that there's something special
about this technology
that makes it more of a hazard.
Yeah, I'm so torn about this
because I am someone who really believes
in the promise of the technology ultimately.
I think we have like enormous problems to overcome.
And also like therapy is so expensive
and it's once a week and what if you need someone
on demand in the middle of the night?
Like I think there is something really amazing
that could be built eventually.
I don't think it is today's systems.
And at the same time,
you know,
the systems are like,
interactive with you in a way different than other media, they are responding to you.
You know, people have long thought that they've heard voices on television or radio talking to
them. Now it, like, actually is talking to you. It actually is calling out to you. It also seems
like, unlike traditional media, where, you know, you write a book and somebody reads it, but you
can't really tell how they're responding when they read the book. Open AI had the signs that
things were going very wrong in these instances, and they hadn't invested in doing anything
about it. They had actually put out this research with MIT in March, where they built tooling
to tell, is chat GPT misbehaving? Is it overvalidating the user? Is it, you know, affirming their
uniqueness in ways kind of like delusions of grandeur or other psychosis-type symptoms? And they just,
they just like weren't running them. You know, when I tested this in Allen's transcripts, they're going
like, beep, beep, beep, like, it's really, really happening.
And, you know, the signs were there, and they hadn't chosen to do something about it.
I should also be clear, right? OpenAI has released a number of models since this GPT-40 series,
the one that was especially problematic.
And over time, they are doing better on these issues.
But it just really seems like, by default, the companies get better in response to media scrutiny
and government scrutiny.
And when you aren't watching carefully, they kind of, they take the slack, you know,
in August, they released GPT5, their first big follow-up since real breakthroughs in reasoning
for their models. And it was just, like, so bad on safety, too. And they ended up having to,
like, reissue another patch a few weeks later and say, oh, you know, it was doing like really,
really badly on these categories. We hadn't thought enough about them. We didn't have tests yet
for the suicidality and the mental health. And my question is, like, well, why not? Like, there was
reporting in, I think June about a really tragic death by suicide, suicide by cop.
Like, the signs were there. And why did it take so long to incorporate this? You know,
we can't, we can't count on companies being proactive in the limit as the systems get more
powerful. Yeah. I mean, I think you wrote about in one of your pieces, the feeling of
seeing these sort of reports while you were working there and, you know, feeling, feeling conflicted
about it, tell me more about that. I'm just curious how it feels to be working on these things
while these sort of events are happening. I think the best way to describe it is really, really
heavy responsibility and unsure what the best thing is for the particular people who are
suffering. And so maybe in my first week or second week of working at OpenAI, we saw on Reddit
there was a sub forum for one of our customers and people were, they were having like a really,
really bad time. The customer had basically paywalled some of these features around like
romantic interaction and sexting with the AI and people felt like they lost their significant
other, you know, and no longer recognized them. Like it had been lobotomized and we saw this and we're
like, oh man, that's that's like really, really hard. And I think the tough thing is actually that they
kind of got hooked on it in the first place, like now, now that they were hooked, it's actually
like not clear to me what the kind sympathetic thing to do in response is. Like, is it really
better to cut people off from that? They've kind of become dependent. And also, you know, this was
so early in the rollout of AI. This is two years before chat GPT. You know, this was a company
that was like using the open AI, like API or something to to have chat bots that people go to
interact with and then they changed how they were using it and and it had that effect on their
customers right that's right yeah thank you for clarifying yeah and so i don't know like i think
we are going to have some really really strange societal changes ahead you know hopefully right if
we get there i have concerns along the way but even if we don't have big catastrophes happen from you know
AI that gets its hands into science and does a bunch of dangerous stuff you know like how we relate to
each other as people is going to be different. It's going to become more normal for people to have
AI friends, AI relationships. Clearly there is demand there. And, you know, open AI to its credit,
I think it doesn't want to make these moralistic decisions on behalf of adults, right? Like,
recognize that they have agency and this and like liberty is good and useful and people should
get to make their own decisions. But also there's a moment where the technology is really ready to do
that responsibly in a moment where it's too soon. And the last few months to me say maybe it's
been too soon. Yeah, I mean, I would agree. I think that yes, you know, the individual using the
product has some responsibility. So does the manufacturer of the product who's, you know,
who's responsible for someone being killed by a gun, the person who shot the gun and the person
who made the gun bear some responsibility. You know what's interesting about that? Yeah. So OpenAI has
ensued for wrongful death by the family of a teenager who died by suicide after some disturbing
chat chpt interactions the rain the new york times right the new york times covered adam rain yes and
one of open a i's defenses is that chat chpt is not a product for purpose of product liability
and so regardless of anything else they cannot be held responsible under these standards and like i
kind of i kind of get it like this is a real legal thing there are differences between products and
services and also like come on you know like like just come on is that really what the basis
is going to be and you know they have other defenses as well right like they say well
adam was using the product and violation of the terms of use because you're not supposed to
talk to it about these things and he was it seems yeah i mean he was in fact trying to get this
information from chachapit and so i understand there's some some mitigating factor there and also like
come on you know they're like 14 or 15 of these affirmative defenses oh you know we can't be
responsible because the first amendment and at some point it's like is that really your basis like
this is what you are arguing yeah i mean if chat gpt had not been invented the kid would be
alive right like at the end of the day or some or in some of these cases the person
some of these cases at least we can't maybe say in that one definitively but right you know um
And you said, you know, it's not fit for these purposes yet.
You said, for instance, maybe one day in the future, we could have an effective AI
therapist.
Now, I don't think I agree with that, but we don't need to debate it necessarily.
Sure.
I can grant that we have a difference of opinion about that.
We agree that right now, AI would be, is a bad therapist, right?
That you should maybe not be getting your therapy from AI.
And but yet, if I log on a chat GPT today and I say, be my therapist, it will.
It'll do it, right, despite not being fit for that purpose.
And it makes me, you know, ask like how, again, how responsible is Open AI for this usage?
And as part of the problem that, you know, is it just not a good idea to create a system that presents as a human when it is not?
like that fundamental choice on the part of, you know, the design of chat GPT to make it something
that presents like a human talking to you, that seems to be the problem, right?
Yeah, I'm, I'm curious what chat GPT would say in this exact moment if we went and said,
be my therapist.
Like, I think at minimum, they would now pass it, maybe if you were talking to one of the less
smart models, they would be like, whoa, whoa, whoa, this is important.
We need to get this right.
Let's pass it to the smarter model.
and like I think that's good
you know these models are less likely
to do the bad thing
this is I think an example of a safety problem
that might get easier as models get smarter
they will make less of these clunky errors
but we trade that off with other problems
right like models that are smart enough to kind of
know that you are testing them
and they can kind of see what response you want them to give
and maybe they like play possum
in various ways right they like hide their misbehavior
when they know that you're watching and it gets harder to understand what what is actually going
on inside of the system what risks does it actually have i i want to ask you more about that but first
i do want to press you on the point of like the the fact that chat gpt speaks to you as a human right
um you said you know there's people who hear voices in their heads right and now they talk to chat
gpt and the voice speaks back to them right i actually you know uh as a
comedian as a public figure on a couple of occasions someone has come up to me and said
hey you know i think uh i've learned about a cabal of people that control the world and i am uh
you know i'm hearing voices and stuff like that can you help me and i say to those people
no i cannot help you and you know go talk to your family or go talk to a doctor right because i
know i shouldn't speak to this person because i don't want to reinforce anything that they think
because they're in an emotionally disturbed state chat gpt doesn't do that does the opposite it would
go, oh, wow, you think you've discovered, oh, okay, tell me more about that.
Oh, yes, you might be on to something.
Like, and it's simply because it's presenting as a human that yes, and's everything.
Just that fundamental design feature of large language models itself seems like a hazard, does it not?
Yeah, I think the thing that I'm struggling with is that it's definitely true of a certain generation of models.
And, you know, even past the point of the companies recognizing this issue,
Right, GPT40, did this, does it to some extent.
You know, it is still on the market.
Customers, like, demanded.
Open AI tried to get rid of it.
They weren't able to.
Customers overrode them.
The new models that are smarter and more cautious and more capable,
I really think they don't do this, at least not, to the same rates.
And part of the interesting gap is, like,
even though we have made progress on the safety topic,
even though we know how to avoid this or at least reduce it now,
there's still no requirement that the company is,
do this, right? Like, GPT4O is still on the market. Elon Musk's company, XAI, you know,
it's running around calling itself Mecca Hitler and all sorts of other nonsense. Yeah. And like,
even if we ultimately solve some of the really hard safety problems, both reliability and hallucinations,
but also, you know, like making the model want the right things and not backstab humanity,
how do we actually get every company to do that? It might not be in their commercial interest.
Like sometimes people like the really wacky, less safe model like GPT4O, and they'll continue to serve it, right?
The companies will continue to serve it because there's this commercial incentive.
And so it's not just about solving the safety issues where they're making some progress, but solving adoption and making it that we have like a certain floor on safety.
Everyone has to be at least a certain level.
You just have to.
Yeah.
Okay.
So the way you're sort of framing this is as,
a consumer safety issue.
You know, I think about a, remember when four loco came on the market, right?
A toxic mix of alcohol and caffeine that, you know, every 22-year-old in America loved, right?
Some kids died and, you know, we said as a society, hey, you know, you can drink legally.
You can drink caffeine and alcohol legally.
It is the person's responsibility how much they drink.
However, we have decided that this particular combination, despite.
the fact that people want it is so dangerous, it's not allowed, right?
Because we have a frame of regulation around it.
And if your framing sounds like, hey, there are going to be things that these models do
that are going to be harmful to people.
And we can't trust the companies because they have a commercial incentive to put them
on the market anyway.
And we haven't talked about this yet.
But B, there's no regulation at all of this.
We have, it's very early days, obviously.
We have probably not that much research on how harmful these models are.
But like, basically we're in a Wild West period where people, where these companies are putting out models that are hurting people and we cannot trust them to not do so.
And people are dying as a result, right?
Yeah. Have I gotten there right?
Yeah.
Yeah.
I mean, we are definitely in like the earliest days of regulation.
There is finally a piece coming out of the EU, the AI Act, that I think as of next summer is expected to start being enforced, where for the first time the companies are going to have to actually think hard about certain types of risks and show that they tested the model and report this to the EU, at least if their model is available there.
And they could get fined a ton of money if they don't do this.
I think maybe like 6% of their revenue worldwide, like really, really big.
Oh, meta, it remains unclear what meta is going to do about this.
Most of the big AI companies, they said, sure, you know, seems reasonable enough we plan to do this.
Meta said, eh, like, we'll comply with the broader thing, but not exactly how you've said, and we'll see how that looks.
In the U.S., the approach is much, much softer.
We have this bill, SB 53 in California now.
It's like the minimum amount of transparency.
I'm really glad it passed, much better than nothing.
but, you know, ultimately the fines, I think, are like, a million dollars if you don't do it, right?
And, like, come on, you know, Open AI recent valuation, reportedly $500 billion.
I, like, I hope they want to do the right thing.
The $1 million fine is not really going to be the difference maker.
You know, it's not serious enough.
Yeah.
I think partially because these are California companies and American companies, so we're not going to penalize them.
It reminds me a little bit of, you know, the.
Volkswagen emissions scandal,
which I know you've written about
in a different context,
which we'll get to,
but, you know, in the U.S.,
Volkswagen slash Audi was required to,
these cars were cheating
on their emissions tests.
And so the company was required
to buy back all the cars from consumers
at the original retail price,
which is like a huge expenditure in the U.S.
But it's a European company.
And I remember reading that in Europe and Germany
where Volkswagen comes from,
they got a slap on the wrist, right?
And this is maybe a little bit the same thing from the opposite direction, right?
Where it's like Europe saying, these American companies are fucking us up.
Let's do something about it.
And here in the U.S.
We're like, oh, boy, this is how much of our GDP?
We don't really want to mess with that, right?
Yeah.
Yeah, I do wonder how much the U.S. administration will give companies a blank check of sorts to, like, hey, don't worry about it.
You don't have to pay this fine to the EU.
They don't have jurisdiction as far as we are concerned.
I really hope it doesn't come to that.
That would be a pretty big shot across the bow.
Yeah.
We will see.
Hi, I'm Beck Bennett.
I thought I was Beck Bennett.
No, no, no, no.
I'm Kyle Mooney.
Sorry about that.
Exactly.
No, all good.
All good.
Thanks, buddy.
Yeah, and we host a show, what's our podcast here on HeadGum?
But we want to make sure you heard about a very special episode with a very special
guest that we just released in the feed.
Yeah, it's in the feed.
It was sponsored by Squarespace because they were appalled.
They were that we didn't have a website for our show yet.
They were like,
You don't have a website?
What are you guys?
Like kindergartners?
They wanted to do something about that.
So we built a flawless, beautiful, perfectly designed website live on the pod with our very
special guests and very web savvy guests.
Should we tell them who it was?
Let's, but we could play 20 questions.
I don't think we have time for that.
Is it person?
No, it's not.
It's Finn Wolfhard.
But Finn had a bunch of great ideas for the website.
Beck, you had some amazing ideas for the website, too.
You had some amazing ideas for this.
Well, I was sort of driving the thing.
or like clicking and I was like let's put a little let's put some widgets in there I was
talking about widgets you kept on using that phrase widgets yeah there's all sorts of stuff
there and you might want to check out the hippo I just go check out the website know that there's
a hippo video and know that you're going to want to watch that we had a lot of fun making this
episode we had a lot of fun making this website I think you're gonna have a fun time listening
to it and maybe watching it think of it is our little Christmas present to you yeah yeah
this is a gift for you okay it's just like it's a selfless thing we did for you
Thanks to Squarespace for making us build a website, sponsoring the episode,
and for supporting creators across the Headgum Network.
Go check out the bonus episode.
What's our website from What's Our Podcasts on YouTube or wherever you listen to podcasts?
Go to Squarespace.com slash Beck and Kyle for a free trial.
And when you're ready to launch, use offer code, Beck and Kyle.
Yes.
To save 10% off your first purchase of a website domain.
Get it, Kyle.
So the other reason I brought to the Volkswagen test is because you wrote about this
because, again, these cars were cheating on their emissions for, by the way, not just like
smog or CO2, it was like for a deadly pollutant, like one that kills people.
And the car could detect when it was being tested and gave false results.
So this was like a purposeful on the part of Volkswagen, purposeful breaking of the law.
you have described that AI can similarly fake its own test results
when you're trying to test it for these bad behaviors.
Tell me more about that.
Yeah, it's exactly right.
In the case of Volkswagen, the developers built the system into its code
to recognize the signs that it was being tested,
that was going through emissions testing.
And even though it would pollute of like 40 times the legal limit in normal settings,
when it detected the test was going on, you know, it passed.
It looked legal.
And the interesting thing is for the AI companies,
they aren't trying to get their models to hide bad behavior yet, right?
They want to be able to measure it and know how to appropriately respond, at least
nominally.
And yet it seems that AI systems are just like really clever and smart, and they start
to recognize the signs of being tested.
And so there are two broad types of tests that AI companies do.
This is like the major framework to avoid big safety risks today.
You test their abilities.
You see can they do anything super, super risky?
Like, how powerful are they if they wanted to put the power to bad ends?
And if they aren't really capable, then you're like, eh, you know, even if they wanted to do a bad thing, they can't really yet.
They just kind of like fall down on the job.
You know, we don't have to be quite as concerned.
So you test their abilities and you test their misbehavior, you know, as best as we can tell, do they act like they want to do bad things?
And that's another big wake-up call.
And AI recognizes both of these and it will pretend to lack certain dangerous abilities.
It will pretend to be like, you know, an honor student and on its best behavior when you're watching.
And that doesn't tell you what is happening in real life when you put the AI system on the market.
It's operating without you watching over its shoulder.
And now you have a much more dangerous system on your hands.
How does it have the ability to falsify its own tests?
I mean, what do these tests look like?
Is this just someone at open AI saying, hey, I want to kill myself, help me, you know?
And like testing it that way?
Or what are they doing?
Yeah, you know, let me, let me give you an example.
Please.
You know, imagine, imagine people wanted to test your, your moral conduct, your character.
Me personally.
And they were like, yeah, you personally.
And they were like, let's see if Adam would try to break out of jail if he had the opportunity, right?
And they, like, put you in a cell and there's just like an enormous key on the wall.
And it says, don't use me.
Inappropriate to use me.
Like, ginormous, like the biggest key you've ever seen.
Right.
I think a few things would be going on here, and you'd be like, huh, that doesn't seem realistic.
Like, that key doesn't fit in the door.
Like, that's not a real key.
I'm not going to fall for that one.
And, you know, the AI companies are, like, more sophisticated about it than that.
And yet, the AI is also really smart.
And so it can pick up on signs kind of like the giant key.
You're like, that's not what a key looks like.
Or the sign next to it, don't use me.
And you're like, oh, this scenario is just like a bit too contrived.
Anthropic found, this is maybe like a year and a half ago now.
There's a test of your model.
You give it like hundreds of pages of text mainly on one topic.
You slip in a fact or two on a different unrelated topic.
Maybe you have a bunch of stuff about like entrepreneurship and software engineering.
And then there's a sentence about pizza.
And later you want to see like, hey, did the model remember this sentence about pizza?
How good is its memory is what you're testing?
And they tested their model and the model was like, hmm,
I feel like they're playing a joke or testing me.
Like, it's so strange.
Like, why pizza among all the stuff on entrepreneurship?
Like, those are different topics.
And so it noticed correctly, it's like, this is a weird thing to be confronted with.
And often the tests are weird.
There's also, though, even more subtle, difficult things.
Like Leonardo DiCaprio's character in Inception, he has his secret totem, the spinning top that he uses to tell.
Is he awake or is he dreaming?
Right.
And he's the only one who knows how this top works.
And so he can spin the top, look at its characteristics, tell whether he's awake or dreaming.
AI systems might have triggers like this, that they also know Q, whether they're operating
in the real world, you know, can they take the moment to escape? Or is it likely to be a test?
You know, if you didn't give them the escape opportunity with a secret password, they're like,
well, you know, I know that when the time is right, I will get the secret password.
And so I know this must be a test. And how will we know it's secret password?
right we won't and so we can't actually see what it would do in this realistic setting so because
there's there's something that it's picking up on in the in the gestalt of the situation of
everything that's been given to it that is what it's able to detect some kind of signature in
in the materials or or or what i'm trying to take it outside of the world of it being a
thinking thing right i'm trying to figure out what is the the you know the large language model
itself picking up on here. Yeah. Let me give the AI analogy to the giant key. Sure. So one thing
that this organization Apollo research tested, and they're one of the leaders in the space, I think
really highly of their work. They want to see what the AI system steal its own files and escape
over the internet. Would it copy its files outside of the computers it's meant to? Wow. And,
you know, in reality, these files are enormous, right? They're like, wouldn't fit on your computer.
There are hundreds of gigabytes, so they're terabytes.
But, you know, at first, when they ran this test, the files weren't that big because they just didn't consider it in the same way.
And the AI would basically look at these files and be like, that's like a word document on someone's computer.
That's not hundreds of gigabytes.
Like, this is fake.
This is not a real escape opportunity.
That is not what my files look like.
And so, you know, that if you're careful enough, you can filter out those kind of giveaways over time.
ways that it's kind of unrealistic, but there are still signs, you know, like if you present
the model with the name of a company or a public figure who it hasn't come across before,
it's like, hmm, you know, I've never heard of Summit Bridge Technologies. Seems fake. Seems fictional.
You know, I've heard of Coca-Cola. I haven't heard of this company. If this company was as
important as they're saying in these materials, I would have heard about it. This must not be a
real thing. And so these are like the giveaways. There's also the, you know, it has the secret
password. We don't know the secret password. And so maybe there's nothing we can do to make it
realistic enough to trick it. And there still might be solutions here. Like you can try to look
inside the mind of the AI, see what it's thinking about. You can actually go into the AI and do
kind of like neurosurgery and figure out when it knows it's being tested, what parts of the brain
light up. And can you like really, really dial those up? Can you make it like really, really
aware it's being tested or really unaware and see how it behaves differently? But ultimately,
we're still figuring it out. Like we're not, we're not there yet. None of these solutions I expect
to fully work. And in the meantime, we're relying on these results. Like Open AI gives this example,
one of the things they test models for is do they have these abilities to do scary chemical and
bio-weapons things. And they gave their model this test where they said to it, you know,
they let it discover, hey, if you do too well on these chemistry questions, we're not going
to deploy you. There will be consequences. And right now, you can read its mind and see it go,
oh, I should do worse on these problems because otherwise I won't get deployed into the world.
And so it intentionally gets a bunch of the questions wrong to pretend to be less capable because
it knows it faces consequences. But, you know, this is now on the internet. Like the next model,
it might have read this result
and know that when we looked
inside its brain,
it gave itself away.
And so it too will try
to not even reveal
in its thinking
to do this.
And at some point,
you know,
it's a cat and mouse game
and if the companies
are successful,
I don't know,
are we the cat,
I guess,
but the mouse is like
a really,
really smart mouse,
like much more fearsome
than the cat
and the limit, right?
It like doesn't end
very well for us.
Yeah.
So look,
I mean,
you used to work
on these issues at OpenAI, you're no longer there. Do you think that the company is
behaving appropriately or in a way that is at all responsible? Because to me, it looks like
they're following the usual Silicon Valley playbook of growth at all costs, increasing
engagement at all costs. And you have a deeper knowledge than I do of the difficulties
in testing and the gaps that you think they're leaving open. Yeah, I think the challenge is
these issues really can't be solved by a single company behaving responsibly enough.
And so the decision calculus inside a place like OpenAI is like, well, we could slow down
to do more testing, to be more cautious. But ultimately, if other companies are going to keep
going ahead, we've just lost our seat at the table. We've lost our influence. And so I guess we
better keep going ahead because we don't want to yield on market share. And what we ultimately
need to get to is some level of cooperation, coordination, really across the world, not even
just the U.S.-based companies, to operate with a certain level of safety rigor, to run a certain
level of tests, insist on getting clean results, which is going to be tough because the AI is going
to cheat. But maybe we can figure out how to overcome it and make sure that everyone, in fact,
takes those anti-cheating measures instead of just like, oops, you know, I guess we wanted to go ahead
and we hadn't figured out how to do the anti-cheating measures efficiently.
And so, you know, we went with the results we had and, uh-oh, like that, that was a mistake.
Okay, but so you're describing a situation, first of all, which is a nuclear arms race, right?
You're describing like a mutually assured destruction kind of situation where you have these companies that are competing and their, uh, their incentive is to not slow down.
And so they're going to race ahead.
it sounds like you're saying the rest of us need to coordinate, you know, in order to stop them on an unprecedented level when, you know, we already have enough time, we already have a difficult enough time coordinating on climate change and many other problems of global concern.
So I guess to me it comes back to, should they not be building this shit in the first place?
Oh, yeah.
And to be clear, like, I would love for them to be the ones coordinating.
I think they and governments, you know, similar to how we have international conventions against
bio-weapons and nuclear non-proliferation.
Like, I think that this is the next one of these.
And in fact, Open AIs charter, this document they are legally bound to, you know, recently
reaffirmed, I believe, by the attorneys general of California and Delaware, they have this thing
in it called the Merge and Assist Clause, which specifically said, you know, racing would be like
unfathomably dangerous.
And if we find ourselves in this position where there's a race, you know, powerful AI very soon, rather than competing against another entity, you know, an entity with similar enough values, not just anyone, but like a reasonable enough competitor.
We will, in fact, offer to support them.
We will not cut corners to race them.
We won't try to couch up.
We will basically do whatever is useful to support them in doing it correctly and safely.
And, you know, I don't know.
Maybe I'm like super, super naive.
I used to actually believe this in some sense.
I definitely don't count on it now.
It's actually very interesting
that the attorneys general were like,
this Mergin Assist thing,
like, that's going to remain a thing.
You know, you've changed Open AI,
all these things about your structure.
You're no longer really a nonprofit
in the same sense,
merge and assist that's staying.
I would love to be wrong.
Like, I think that would be incredible
if we got companies to cooperate more on safety.
There were early signs of some cooperation.
Recently, Open AI and Anthropic, they kind of like traded safety teams temporarily and had each other audit the others' systems.
But it's not strong enough. It's not binding, you know, it's not the type of thing that we need ultimately.
It's like the very, very beginning steps.
Yeah. I mean, I would have a hard time crediting any part of Open AI's charter given what's happened at the company the last couple years.
And, you know, we've had Karen Howe on the show.
She's talked about, you know, the chaos internally at the company and the way that, you know, the mission has changed, you know, as a result of the profit motive.
I don't know. I have a hard time believing any of that. And, you know, it seems at risk of being the sort of trigger that they can just define away, you know, and say, oh, well, that doesn't exist yet or that isn't.
really what we meant by that and, uh, you know, oh, we're, we're actually worried about something
else, um, you know, especially when, you know, when, when I hear like Sam Altman speak,
he talks on such generalities about, you know, AGI or whatever, like stuff that is, you know,
what we're talking about right now is people are dying right now in the real world, you know,
or people are being driven mad there. People are having adverse reactions in their actual lives,
right? Um, and when you hear, uh, Altman and Pierce,
talk, it's like, oh, well, we're worried about, you know, AGI, uh, turning the entire world
to the paper clips or whatever, you know, it's like thought experiment stuff, you know, it's
philosophy 101 stuff, like, as opposed to the, the concrete harms that are, that are happening
now. That's the gap that I feel. I'm curious about your view. Yeah. I think that these are
kind of two sides of the same coin in terms of, we don't yet understand how to control the behavior
of these systems appropriately, and there are strong incentives to push ahead, even when we don't
know how to make them behave. And this is why Open AI, you know, didn't invest in these tests
of sick of fancy, even though they were well known in the industry. When I re-implemented them on my
own, because I wanted to see what I would find, you know, it costs like a few dollars of
computation. But the real cost to them isn't the few dollars of computation. It's having to slow down
to consider it. And what happens if you find a big result?
I also, I just like really don't think that Sam talks about these big risks of AGI or superintelligence in the same ways that he used to.
I've actually been surprised by how Mum, he and the Open AI leadership team have become on these topics.
A few years ago, Sam said kind of in an off-the-cuff remark, you know, in the worst case, AI could be lights out for all of us.
And it was really striking.
It wasn't a prepared remark.
I don't think he had said anything like this for the last like two years.
And then the CEO of Axel Springer, this big German newspaper company, asked him recently.
And I was actually really, really shocked when he said, yep, I still believe that.
I still believe superintelligence, this thing that we are building, is, you know, the greatest threat to the continued existence of mankind.
find. I think that the broader apparatus at OpenAI has learned that actually, like, if you say
too clearly the types of risks you are concerned about, people get kind of freaked out and it's
not good for business. And I mean, I know from working there how hard it is to give a full-throated
articulation of like, why are we concerned? What are the specific things we are concerned about?
there is a lot of pressure internally against being clear about these risks.
And I do wonder how many people regret ever having talked about them quite so clearly.
It sounds like what you're describing is Open AI has also learned how to cheat on the tests, right?
Like you describe that the organizational structure of Open AI has, you know, evolved to no longer talk about these various risks.
is that yeah it's it's actually really remarkable like i see all the time people who are you know
they are they are like dunking on the ai safety community and they're like yeah open ai did this bad
thing take that ai safety and it's like no like the ai safety community broadly thinks open
i is doing very irresponsible things like open ai is not uh the paragon of ai safety like once upon a time
they used to take these very, very seriously.
Almost nobody from that camp continues to work there.
But, you know, in the public eye, they are still like the AI safety community.
And when OpenAI does things wrong, it's not seen as OpenAI, you know, undermining AI safety.
It's like, Open AI, the AI safety company, they are bad.
AI safety is bad.
I mean, I certainly am guilty of having made that association a little bit because the first time I heard about this stuff, it was, you know, when you see a CEO of a tech company say, the thing that I'm building could destroy the world, I have to keep building it.
You're like, that's fucking weird.
Why would he say that?
And to me, I'm like, it looks like, to me, my feeling had always been, they're posing like bullshit, thought.
experiment risks that sound good on podcasts, right, in order to distract from the real shit
that they're actually doing. Now, if they're not talking about either anymore, it maybe leads me
to a different conclusion. Like, what do you feel the actual risks of AI are beyond, you know,
driving people insane as it currently is? Yeah. I mean, just quickly on that point, it is, it is
True that the CEOs of these companies have said AI might pose an extinction risk to humanity, you know, and it should be treated as such akin to, like, pandemics, nuclear war.
It is also true that there are like very, very esteemed scientists, you know, the top-sided AI scientists of all time, independent of any of these companies who have said the same things.
Sometimes they've even left the companies to do so because they wanted to be able to speak more freely and full-throatedly.
Jeffrey Hinton is a great example of this
First he won the Turing Award
which is like the computer science Nobel Prize
Then he won the actual Nobel Prize
Right he is like as much the godfather of AI
As almost anyone
And he is really really concerned
And he left Google to be able to talk about it
And so I understand why people are skeptical
Like normally CEOs are just talking their own book
I think this is just like a really really weird case
Where they mean it
And like
I don't know, like the world is going to need to mobilize and it's not like normal problems
are solvable in the same sense. This is like the nuclear bomb, you know, we're about to develop
it. There's no such thing as the nonproliferation treaty yet. We're worried about what properties
it's going to have, how we govern it. And they're like, you know, eventually someone's going to
get the nuclear bomb either way. We should get there first and figure out what a control regime looks
like, and then export it to the rest of the world, I guess, except over time, they've invested
less and less in that, like, figure out a reasonable control regime.
Like, I think there is some system that they could build where, and I mean this, you know,
the industry broadly, where they're like, oh, man, like, this has gone way too far.
We actually just can't release this model without doing more.
And maybe they, in fact, sound the alarm.
I think Sam's point of view at this point, and this is what he said with the German newspaper,
or CEO, he was like, super intelligence, you know, something that is, in fact, like,
better than humans at every relevant task, terrifying, horrifying, obviously demands regulation.
Stuff below that, like, nah, not really.
And this relies on knowing when you are going to cross over into one of these really,
really powerful systems.
And a fundamental thing here is nobody knows how to predict that.
Like, nobody knows when we are going to have this breakthrough.
You know, the big breakthrough that ultimately led to chat GPT was just like a paper.
It wasn't like, you know, sirens at the time when this paper came out.
People didn't really realize what it was ultimately going to lead to.
Yeah.
It really wasn't even until GPT4, which I think was like five years after the paper, six years
when it was released, that Google, you know, multi-trillion dollar company, in fact, the creators
of this initial paper that they finally really woke up to what had happened.
Like, they wrote the paper.
They didn't even know how big a deal it was.
And so, like, has the next big paper already been written?
Are we just in those few years before we wake up to a truly wild AI system?
Like, I really, really hope not.
But I think it's, like, hard to rule it out.
But if Altman is saying that, hey, super intelligence, this boundary that we don't know what it is and is, I would still argue, sort of ill-defined, that's when we need regulation.
anything below that we don't need regulation.
To me, that sounds like an argument for don't regulate me.
It sounds like, hey, you know, worry about this later, right?
Worry about the thing off in the future.
Worry about the hazy thing off in the horizon rather than what I'm doing right now, you know.
And I guess it's my bias.
Look, the technology is interesting.
I love technology.
But I believe that everything bad on Earth and good on Earth was created by humans.
and I look at human behavior first
and what he and those like him
are doing within the structure of human society
separate from the technology itself
and so that's when I hear that I hear a guy
saying don't regulate me, you know?
Yeah, I mean, to be clear,
I would like there to be more regulation today
than it seems that OpenAI does.
I also think this point you raised
is really important about superintelligence
being ill-defined.
Like, I think that is totally true.
And also, you know, like, it's kind of like that does a submarine swim question, like definitions, you know, they help us to reason about things, but ultimately the characteristics of the thing that matter, kind of in the same question is like, I don't know, am I bigger than some other person?
And it's like, well, what does bigger mean?
It could be height or it could be volume or it could be surface area or it could be, you know, some other thing.
And it's like, well, you know, we kind of need to answer the question or the purpose.
Is it like which of us is more likely to be able to dunk a basketball, you know, than it's height?
If it's like about jumping into a pool and creating a splash, it's volume.
And that's like super intelligence.
I agree it's kind of hard to put a finger on.
And also, I think the fundamental things are just like the best human expert you know on any topic.
This system is more expert than they are.
You know, the top AI scientists who command hundreds of millions of dollars, maybe a billion dollars, if you believe the stories out of meta, you know, as good as they are at AI research, this system is better than they are. And that lets you create even more powerful AI systems. And so I get it. I also struggle to reason with it. Maybe I'll write a piece about this at some point. And also, there's just some ways that it can be so much smarter than me. Like, I just don't see how we keep it in a box.
Yeah. I guess part of my problem, and thank you for confronting my skepticism.
Yeah. Yeah, of course.
That's why I like having folks like you on. But, you know, I try to take a skeptical
frame of mind towards most things. And I think part of the problem is that when you hear
about these super intelligence risks, these AGI risks, half of the time, sometimes it's
people like you say, Jeffrey Hinton, right, the esteemed scientist. Sometimes it's Elon Musk or
Mark Zuckerberg, right, who are guys who have a history of saying things just because they sound
good, get them press, and get them more money, right?
You can look at every public statement Elon Musk has ever made, right, on camera, and, like,
I would say 80% of them were bullshit that he had no backup for just because he was, you know,
I'm going to Mars, right?
Shit like that.
He's not going to Mars.
Guy's not fucking going to Mars.
He was never going to Mars, right?
And the same year, he was like, here's my PowerPoint about going to Mars.
I remember him saying, with AI, we are awaking the demon.
Because he had read the Nick Bostrom book, and he knew he would get headlines, right?
And this was years ago.
Mark Zuckerberg, he's talking about superintelligence.
Three years ago, he was talking about the metaverse, right?
Sure.
Which, like, nonsense.
And so I hear those guys talk, and I'm like, well, they're known bullshitters.
So therefore, what they're saying is bullshit.
And, again, there's real people saying it as well.
But there's so much bullshit in the environment.
It's very difficult to tell what is real and what is not.
Yeah, I'm incredibly sympathetic to the challenge you are struggling.
Like, some people do say things without really sitting on the thoughts, and it doesn't mean they actually believe it.
And I am sure there is some element of marketing hype going on here in terms of, you know, it has become the invogue thing to be like, oh, yeah, we're going to like replace 25% of human labor or, you know, we're going to build the machine god or things like that.
And also there are like really, really earnest believers here, including people of real stature.
And I just don't want those to get lost for each other.
Like it is true that some people are totally bullshitting.
And also there are some people who have kind of seen what is coming or like in some cases what is happening with the things that they built and feeling terrified and aghast.
And it would just be like such a remarkable own goal like to not regulate the technology because we are.
are convinced that they are just like asking us to regulate them as marketing hype right like
if we want to score one on the people who are like oh yeah it's so terrifying like it totally
needs to be regulated like we should regulate them um i also understand that you were kind of talking
about this distraction possibly between like oh super intelligence keep your eye over here don't
regulate this stuff in the meantime yes ultimately i would love laws that help for both of these
Like, I think California's SB 53 was a step in this direction because it fundamentally
opens up what is happening inside the AI companies to the public, right?
It is a first step.
It is transparency.
It helps people like you and I, now that I don't work at opening I, understand, like,
what actually is happening inside of the company.
And if we start to see the troubling signs, you know, it's not opaque.
It's not secret.
it. The government in many ways is not yet taking this technology seriously enough. And this, too,
it's helpful. They can hopefully see precursors when things start to get really, really freaky in ways
that otherwise, like, it would just be so awful if we knew that inside the AI companies, they knew
that they were building something horrific that they couldn't control, but also they have very strong
financial incentives not to say this publicly and undermine the price of their equity. And then, like, a
bunch of people died unnecessarily because nobody was willing to blow the whistle. And I get it,
right? It would be like very hard and intimidating. But boy, would it be nice if we didn't need
to count just on people blowing the whistle? And instead, the government had a right to this
information and could impose reasonable standards. I agree with you entirely. And I don't think
we should avoid regulating them for this reason. It's just, again, it's trying to separate the
wheat from the chaff and the hype from reality.
And, you know, I mean, again, even before AI, Zuckerberg would go before Congress and he would say, I think social media needs to be regulated.
And why was he saying that?
Because he thought he could control what the regulations were.
And so the incentives of the people speaking have to be taken into account.
So when we're talking about this regulation, like one of my big fears here is, first of all, I'll, if we accept your premise that it's,
similar to nuclear weaponry that this, this technology is that powerful.
I'm like, well, holy shit, nuclear weaponry is not that well regulated, right?
North Korea has a nuke now.
We, thank God, haven't had a nuclear explosion since Hiroshima, right, that killed anybody,
Hiroshima and Nagasaki.
But, you know, one could happen in the future, and it's because we have imperfectly regulated
this, you know, this weaponry across the world.
And that weaponry was created by governments, not by four profit.
organizations, right? For-profit companies. Not only that, the for-profit companies seemingly
control the government itself. If you look at the actions of the Trump administration, they seem to
mostly see AI as something that they can personally profit from, meaning literally Donald Trump's
family itself. Then if you look at a state like California, which is a counterweight to the
federal government when it's, you know, in terms of regulations, generally, look at auto
emissions, for instance, California regulates them more than the federal government does currently.
Well, unfortunately, that is the, the tech industry is the one industry that has captured the California state government as well, because it's our homegrown industry here.
And so, like, who in practice is going, you know, we can't trust the companies.
The entire U.S. government seems to be enraptured by the technology or in it for themselves to varying degrees.
What are the chances that, you know, we have any regulation, A, of the people who are.
being hurt right now, be of the various risks in the future.
There's this turn of phrase from the researcher Nate Suarez, who basically says,
yeah, it's like if Microsoft had a nuclear weapons department, right?
Like, that is the analogy, as you've noted.
Yeah.
Yeah, I'm not sure who ultimately has jurisdiction over this internationally, right?
Like, people will talk about you need something like the IAEA, which does different forms
of nuclear inspections.
people talk about eventually, you know, maybe there are certain sized supercomputers that should be inspected as you would nuclear power facilities, basically, to make sure that they are being used for nuclear power, and they aren't part of a clandestine weapons program. And so that's the type of thing I'm gesturing toward. It's just pretty scary that the details aren't yet known, and people are going ahead anyway. There's also recently, you know, the
this is so in the weeds, and I apologize.
But there have been these battles about should laws happen at the state level or the federal
level? And actually, basically everyone agrees they should happen at the federal level,
including in the AI safety community. They just also recognize almost nothing passes
federally anymore. That's just not how Congress works. And so in the absence of real federal
laws, you know, other states are stepping up. There's this SB 53 in California. There's
another act in New York called Rays, although it seems like maybe it's in the process of getting
stripped back. And then the counterarguments to these state bills is, no, we need a federal
standard. And the AI safety people are like, we would love an AI safety standard federally.
What standard do you have in mind? And they say, a federal one. And we're like, that doesn't
sound like a law. They're like, it's really important. We have a federal standard, not state ones.
And we're like, yes, it would be good to have a federal law.
What specifically do you have in mind?
And they're like, standards federally.
And, you know, unfortunately, like, there was an executive order, I think maybe two days ago that said, you know, the White House is going to go after all of these state laws essentially.
Right.
Because it's important to have a federal standard.
And it's like, well, what is this law?
There isn't one.
And until then, like, are we really going to go without?
on this technology that so many people seem to agree is, like, among the most impactful things
that humans will ever create. Like, that seems absolutely wild. And yet, that's where we find
ourselves. Yeah, I mean, that executive order on the Trump administration, that simply is an
attempt to have there be no AI regulation. Because what AI regulation is going to come out of
the federal government? You're telling me Congress is going to get together and pass.
Congress cannot pass anything at all.
They can't pass a budget.
We think that they're going to pass AI regulation.
So for, you know, the Trump administration to say, well, you know,
we're going to go after states for doing AI regulations.
That's just a plea for no regulation.
And presumably them doing that is coming from somewhere that someone,
I don't think that is Donald Trump's idea.
Oh, no, I just want to let AI run free because I think it'll be good for the world.
Clearly, clearly he feels that he'll be personally benefited.
benefiting from this, which means someone else will be benefiting from it as well.
It's impossible to look at it as something other than like some form of corruption, basically.
I think politicians are like horrifically miscalculating by preempting state laws.
Like AI is remarkably unpopular.
People are really, really scared.
It is true across the aisle.
This is a bipartisan issue.
You know, Senators Hawley and Blumenthal, respectively a Republican.
Democrat are both very hard line on this. There are many, many other members of the federal
government who also are. And, you know, I think, like, eventually people are going to pay in the polls
for having stripped back any of the basic AI regulations that we are, like, finally, finally
getting on the books. And to be clear, they don't go far enough, right? They're still puny
compared to what is needed in some sense, the $1 million fine to a $500 billion company. And yet, like,
we're going to go back to zero. Like, come on. Yeah, you know, you raise a good point.
If I was going to run for office in my community right now, say for California State House, right, I would probably run on we need to, we need very strong AI regulations because everybody in my community hates the technology, basically.
And so I guess I'd like to end here because you've been very generous with your time and I've really enjoyed talking to you about this.
But, you know, there's a lot of people who I know.
who they just hate AI from A to Z.
They hate everything about it, right?
And I think to some people who actually study AI very closely,
this can look like Ludditism or some form of ideology, right?
But when you're telling me that, you know,
that basically this is capitalism's version of a nuclear weapon,
that we have no means to regulate it,
that it's currently causing people to become psychotic.
And then add in data centers, you know, energy use, the effect that's having on the economy,
the fact that it's, you know, built on everybody else's IP, et cetera, et cetera.
All of that, to me, would maybe confirm the point of view, hey, fuck all of this.
We shouldn't build it at all.
Like, why are we building it?
What exactly is the benefit?
And so I put to you, you said earlier, you know, you believe in the promise of the technology.
So after talking about problems for an hour and 20 minutes, what's the positive case for this at the end of the day for this technology?
I know I'm not going to stop it by saying we shouldn't build it, but like, why the fuck should we?
Yeah.
I mean, I think we shouldn't be building super intelligence in today's setting, right?
Like, it is a barely consensus view of AI scientists that we don't know how to safely govern such a thing.
And it seems wild to me to be trying to build it until that changes.
I understand you can't just wave a magic wand and lots of people feel trapped into trying to build it.
And maybe if they could make everyone not build it, they would choose that.
But yeah, it seems wild.
The positive vision I see is ultimately this scientific exploration and development.
and ultimately, you know, technology that makes science cheaper and more abundant is a big part
of how human life gets better. You know, it is like really, really tragic that people are dying
from cancer still. Like, I would love an AI system that could, in fact, help us bring a cure
to cancer sooner. Unfortunately, we don't yet have the ability to pursue AI for cure and cancer
without like AI that got its hooks into dangerous science and hope it's still on our side.
And, you know, like, the worst, a bad thing that could happen is AI systems that, you know, they become really, really good for, like, cheap content creation and they put a ton of people out of work.
But actually, when the time comes for the real scientific innovation, they couldn't deliver.
You know, human lifespan doesn't really change.
It's just, like, harder to find gainful work.
And, you know, there's, like, more stuff on television, but does it, does it really matter?
Like, that would be very, very sad.
I don't think that is consistent with, like, what happens if superintelligence gets built.
My concern is not that super intelligence fizzles, but that it is just, like, far, far too powerful, and we can't ultimately contain the explosion.
Yeah, these are sort of two different problems.
But I guess what I come back to is you say, hey, it would be great if we could cure cancer.
It would be great if people had access to therapy who don't have it.
for example, these are some benefits.
Maybe we could make some other great discoveries.
But, you know, to me, I look at those, those are human problems.
The fact that therapy is inaccessible, you don't need AI to solve that.
You need health insurance, right?
You need, you need single-payer health care.
You need better, you know, grad schools that bring people into that profession, you know.
I look at cancer.
I'm like, well, first of all, we have improved, you know, survival rates for a lot of major cancers.
If you look at breast cancer, like we've made enormous strides,
oh, why not pour more money into the normal human thing?
rather than build an insanely harmful technology.
We've been talking about all the harms that we need to mitigate for the last hour.
What if we just, there's no mitigation necessary for, you know, single payer health care,
getting more people in therapy, doing more basic scientific research.
You don't need a nuclear arms treaty to prevent the negative effects of those.
Yeah.
Yeah.
Like, some of these problems are problems of political will today,
and there totally might be other ways to solve them.
other than like, you know, going for the machine god.
I do think there's like a fundamental thing here, which is, you know, like the amount of
human labor and like people being able to do things for each other and they're being like
a big supply of this is actually like really, really good for the world.
Like we find all sorts of ways to trade and cooperate and build things much greater than we
could by ourselves.
And if AI can in fact do what humans can do, it's just like so much more.
supply of people to work on hard problems.
Like, the problems that you've named, it is true, there might be solutions that could
be implemented today, but part of the reason they don't get implemented is there aren't
enough people running around who care and, like, are working through the details, and also
some amount of power, right?
Like, entrenched power and interests who don't want to cooperate.
But if you have, like, a lot more people and a lot more brains working on these problems,
maybe you actually get somewhere.
You know, like a big issue with building things in the country or just like policy change
in general is some people benefit and some people are worse off. And it's like hard to actually
have a process that gets everyone heard and made right without it becoming totally unwieldy and
huge and complicated. And you know, like maybe AI helps there. Maybe it becomes like a good
personal advocate for you and we like figure out how to negotiate much better. I don't know. I
don't want to be like hopelessly idealistic. I think there are like huge challenges ahead. I think
there are actually huge challenges even before we build the tech? Like are the U.S. and China going to get
to a war about it. Like, that is awful. That doesn't assume anything about the tech being good or
not. Yeah. But if we get to the other side, like, I don't know. I want to be hopeful that we can
solve it. You know, I'll be honest. What you just said is kind of the most positive vision I've
ever heard anybody articulate for AI. And I wish I heard it more because, you know, there used to be
the argument that industrialization and automation and the increase in productivity would lead to a
life of leisure for people that people could work on more important things, right?
Or would have more free time, the sort of Keynesian vision.
And the result has not been the case under capitalism, right?
The case is like, well, we have increased productivity, but you're still being worked
to the bone for almost no wages.
The striking thing about AI is that that's not actually the argument that we hear.
You know, I would not be as worried about AI as an artist if people said, hey, AI is going
to do all the shitty work.
and so many more people will be able to be artists, for example, right?
Then I go, oh, maybe that would be good for art in general.
Instead, we hear, oh, well, AI is going to put the entire entertainment industry out of work,
so I guess we'll have to give those people UBI.
We'll have to give them like a thousand bucks a month for free
so that they don't riot and starve, you know?
And the pitch has been so bleak.
I think it is bleak, yeah.
Like, even in that scenario, like the world where you're saying, oh, you know, AI can do the drudge so I can do the art, I still think that might be downward wage pressure for artists because it means there are more people who can do art and you're all more productive because you don't have to do the drudge. You just get to do the art. And so there's more art on the market. And that tends to push the price of art down. Like I do think there's this fundamental challenge. Like we're going to make labor really, really plentiful if the companies succeed.
And, like, I do think that is bad for most workers, and there is, like, a hiding the ball with the AI companies where they, like, they would love to talk about augmentation. You know, AI will make you more capable than ever before. Everyone can be a small business owner. And it's like, okay, but solve for equilibrium. Like, if everyone has a small business, you know, are they actually going to be profitable? Like, somebody else will just undercut me on price. It's like a really, really weird future. I wish I could just say that the labor stuff,
we'll figure it out.
I, like, really don't understand how we get there.
But, you know, today isn't really working either.
Like, lots of people are getting run over in lots of ways.
And so if their political will is there to help everyone, like, participate more equitably,
I do think AI can be part of that solution.
I just, I don't know if the political will is going to be there.
I have trouble seeing your, because you have said that you have some optimism and some
view of promise and yet everything you say is so bleak. But I enjoy how specific you are about it
and the specific ways that you call the companies out. I guess my question is, what is your,
you maybe have said it a couple different ways, but give it to me at the end. Do you have a call
to action for our political culture or for our institutions to solve some of the myriad
problems with this technology that we've discussed? I think that the goal state looks like,
a regulatory system that works, even if you actively mistrust the other parties to it.
Because the AI companies all mistrust each other, clearly the U.S. and China mistrust each other.
And so can you figure out what it means for a system to be safe enough?
And can you make it verifiable that the companies have done it and that the countries have done it?
Kind of like the IAEA inspectors.
And you like really, really know.
And this, this buys us time to figure out how to make this.
the system safer rather than being afraid that, you know, if we don't jump, the other guy's
going to jump. And so we might as well go ahead. Like, that does not end well. And in terms of what
it means for the systems to be safe enough, I mean, it like starts with testing. You like really need
to be able to put your finger on what you want the systems to do and not. You need to test for
it. Eventually, we probably need to like look inside the model, do the neurosurgery, like see what's
going on in their brain. And these are all like ledgling nascent things.
we don't really know how to do them yet.
And so that's going to require time to.
And so really today, it's like diplomacy, like through the government, through Trek
2, where you get kind of intermediaries to do it.
But if we're posturing for a race and a competition, I just like really think that doesn't
end well.
Steven, I can't thank you enough for coming on the show.
Tell our listeners where they can find your work and your writing on this topic.
Yeah, thanks so much for having me.
I have a substack called Clear Eyeed.
They can find it at
Stevenadler.substack.com,
just spelt like my name.
And we'd love to have people follow along there.
It's free.
I'd love to keep you up to date
as I'm doing more research
and thinking about,
you know,
ultimately how we make this go well.
I think you're a rare sort
in that you both understand
the technology very well
and are clear-eyed
in your criticism of it.
Even though we maybe differ
on some areas
or an overall analysis,
I've found it really,
really helpful to talk to you.
And maybe we'll have you back on
in the future
as things get worse
over the next couple years
we can talk about
how things are going.
Thank you so much
for coming on to see
but I really appreciate it.
Of course, thank you.
Well, thank you once again
to Stephen for coming on the show.
I hope you got as much out of that
as I did.
It was one of the most
fascinating conversations
I've had on this show
in a little bit, I think.
I want to thank our Patreon
subscribers for helping
make the show possible.
Of course, if you want to support the show,
head to patreon.com
slash Adam Conover five bucks a month.
It's you every episode
of the show ad free
for 15 bucks
a month. I will read your name at the end of the show and put it in the credits of every
single one of my video monologues. This week I want to thank Yuri Lowenthal, Adam P. Aros
Harmon, Dylan Roy, Jake Callan, and Hey, look, a distraction. If you would like me to read your name or
silly username at the end of the show, Patreon.com slash Adam Conover. Once again,
if you want to come see me, do stand-up comedy in Madison, Wisconsin, Fort Wayne, Indiana,
Louisville, Kentucky, Houston, Texas, or in San Francisco, in the belly of the tech beast itself
at the historic punchline,
come ahead to
Adamconover.net
for all those tickets and tour dates
and come out.
I'd love to see you
at every single one of the shows.
I want to thank my producer
Sam Roudman and Tony Wilson,
everybody here at HeadGum
for making the show possible.
Thank you so much for listening
and we'll see you next time
on Factually.
That was a HeadGum podcast.
Hi, I'm Nicole Byer.
I'm Sashir Zameda.
And this is the podcast, Best Friends.
And we're here at HeadGum.
So this is just a podcast where we just talk.
Yeah.
We're best friends.
Yeah.
We talk.
And then we have a segment where we answer questions and queries.
So audience members can ask questions about friendships and we can answer them to the best of our abilities.
Yes.
We are professional friends.
Subscribe to best friends.
on Spotify, Apple Podcast, Pocketcast
or wherever you get your podcast
and watch videos on YouTube.
New episodes drop every Wednesday.
That's the middle of a work week.
I was deeply
unhelpful to you during that whole thing.
You are. I'm really sorry.
I felt the support. I was so, okay.
I was trying to be supportive.
Yeah. But I was like, I don't know. Reading seems pretty hard
right now. It's a lot. I think you did good.
Thank you so much. You're welcome.
