Offline with Jon Favreau - Hugs From Your Late Mom, Interdimensional Chats, and College Cheating: The AI Future Is Here
Episode Date: June 26, 2025. We don't really know how AIs like ChatGPT work... which makes it all the more chilling that they're now leading people down rabbit holes of delusion, actively spreading misinformation, and becoming sycophantic romantic partners. Harvard computer science professor Jonathan Zittrain joins Offline to explain why these large language models lie to us, what we lose by anthropomorphizing them, and how they exploit the dissonance between what we want and what we think we should want. For a closed-captioned version of this episode, click here. For a transcript of this episode, please email transcripts@crooked.com and include the name of the podcast.
Transcript
It's amazing how much the original promise of the internet, which was to eliminate isolation
or mitigate it and to connect strangers who would never have any chance of meeting but
through great serendipity and a little bit of non-serendipity have reason to connect
and befriend one another and form a community.
How much that doesn't appear to be on tap.
I mean, whether or not you think social media has done it well, that was what
was offered as a huge part of its promise.
And here the promise is like, who needs humans?
And that's worrisome.
Welcome to Offline.
I'm Jon Favreau.
All right, I'm here with my fearless producers, Austin and Emma.
Hello, Jon.
Hello, Jon Favreau.
Hello, hello.
All right, would you tell us who you spoke to today?
I talked to Jonathan Zittrain.
He's a professor at Harvard, he's a professor of law,
public policy, and computer science.
That's a lot.
That's a lot, and he's the co-founder and director of Harvard's Berkman
Klein Center for Internet and Society.
Yes.
So really just like a perfect, perfect person
to talk to on this pod.
And we brought him in to have a conversation about AI today.
There's been a lot of different stories
that our team has been talking about the last couple of days,
including one from the New York Times about AI driving people mad,
a study out of MIT about AI eroding critical thinking skills,
and then this viral tweet from Alexis Ohanian
that reanimated his dead mother.
Can you talk about why some of those stories
caught your attention,
and why you wanted to talk to somebody about them this week?
Yeah, I think that all of these stories combined
would be a much bigger deal and get more attention
if it weren't for everything else going on in the world today.
And the fact that we just deal with whatever Trump does.
I have been starting to get concerned about AI.
I actually saw Pete Buttigieg had a great Substack post on this today.
Which I was like, oh, it's a-
Pete Buttigieg has a Substack now?
He does have a Substack.
Look for him.
But I was like, oh good, a democratic politician
is like really taking this seriously.
And a lot of them have given lip service to AI.
I think the problem with AI is you hear AI
and sometimes your eyes glaze over
because you're like, that's in the future
or whatever it's computers.
I don't quite get it.
But I think the best way to think about it
and what concerns me is we have talked a lot
on the show about social media and all the harms that social media causes, and at first we did
not realize it was going to cause those harms.
We thought it was great, and then it turns out by the time we realized that everyone
was lonely and radicalized and polarized and depressed and anxious because of social media,
it was sort of too late to change it.
The companies don't want to change it, and
now we can't really regulate it, right?
Or at least we have thus far failed to do so.
And I think that AI is about to be that on steroids
and it's about to happen much, much quicker.
You could say it's already here too.
Yes.
A lot of these stories that we flagged
were about people already living with AI,
having relationships with AI, having AI tell them that they are the next messiah,
things that are happening not in the future but right now to a lot of people.
And it's not necessarily just a technology story. It is a story about how
this technology is changing, like what it means to be human, which I know sounds
very big and sort of grandiose. But from some of these stories, the person who swore that OpenAI killed
the chatbot that he was in a relationship with, and threatened to kill
himself, and then the cops came and he charged towards the cops and the cops shot him.
I mean, it was a really sad story.
And there's people who think, like there's a social worker who has a background in psychology.
And she said that she believes that she's engaged in interdimensional communication
with the chat bot. And she's like, but I'm not crazy. I have this background.
So like this shit is coming. And the Alexis Ohanian tweet that you talked about,
I mean, he's got this picture of his mother.
His mother died 20 years ago,
and he puts the picture into Midjourney,
which is one of those AI programs where you do a text prompt and it turns a photo into video.
Yeah. He didn't have any videos of his mother because he was too young and there were no videos back then. And it animated the photo into a video of his mom hugging him.
And it's funny because it was so divisive on the internet.
It was so sweet but also terrifying when you spend more than three seconds watching it.
Yes, but it's funny because my very human first reaction was like almost to cry because I'm like, this is incredible.
Imagine people that you've lost in life
and you can like suddenly see videos.
And then I was like, ooh,
but it also feels like it could be scary.
It was very unnerving.
I was actually not as upset by it as Austin was.
And we talked about this and he was like,
no, no, this is Black Mirror, this is really bad.
And I was relieved that Zittrain took a more nuanced
approach to it where he was like,
I don't know that this is that bad
because I think to demonize something
that so many people are going to gravitate towards
and to be really moved by, first of all,
is like not a great recipe to get them to, you know,
have their eyes wide open when they're dealing
with these tools.
But on the other hand, like so much of our memories are already distorted.
So much of what we think about and what we think we know about the past
is through other lenses, and this is a tool just like any of those.
I feel like I was getting into a space where it's like,
okay, AI is going to be the end of the world.
We're doomed, it's social media times 100, you know. And it may well be, but he does a very good job
sort of breaking down the nuance:
there may be good uses for this.
It may improve our lives.
This part may be very dangerous and may hurt us.
And what it made me realize is there are like
an infinite number of questions, concerns, possibilities,
things that we might want to regulate that we have not
even begun to talk about.
And we should be talking about it a lot more because,
as you said, Austin, it's here already,
and it's only speeding up.
And it just feels like everyone's not paying attention,
or at least the national conversation is not
focused on it for some good reason,
but also, like, let's talk about it.
So, here's Jonathan Zittrain.
Jonathan, welcome to the show.
Thanks so much for having me, John.
So, I wanted to talk to you because there's been a lot of AI news over the last few weeks
that worries me like I think social media should have worried all
of us a few decades ago when it was first taking off.
And by that I mean, you know, it changed our relationship to technology in a way that was,
I think, at first invisible to most people until it became largely irreversible, which
has made it that much harder to mitigate its harms, loneliness, anxiety, disinformation, radicalization.
And I worry that's starting to happen already with AI,
but on steroids.
So I figured who better to talk to
than a professor of law and public policy
and computer science who also runs Harvard's
Berkman Klein Center for Internet and Society.
If you don't mind me asking,
what does one do with that much expertise
besides make people like me feel bad about ourselves?
Well, it's funny because I've spent a decent part
of my career trying to make people feel good
about themselves, including in the internet era
as it busted out, celebrating the unpredictability of it,
the ability for anybody to communicate with anybody else in ways that
previously were mediated through Walter Cronkite or limited in reach if you're going to wear a
sandwich board and walk on a sidewalk. And I agree that there's a little bit for many of a hangover
of that freedom and a question about what's it doing to all of us.
The first caution I give myself in thinking about it is not to want to treat
AI monolithically any more than treating social media monolithically.
It's true that anytime you get social media with the kind of reach that a Twitter,
slash X, or Facebook has,
and you talk about having a community of 3 billion people, it does tend
to get sucked into the same vortex, the same metrics, the same optimization for those metrics,
and a form of slop that, even before AI slop, is maybe not desirable even as it may be compelling
the way that sitting and staring at a one-armed bandit
in Las Vegas and pulling the lever again and again is compelling.
That's just a way of saying,
I think there are going to be folks who worry about AI,
and here we're talking, I think,
about large language models,
embodied or non-embodied chatbots,
worried about them generally,
but also there are so many possible flavors of them.
What I am primed to look out for are
what sorts of supply chains,
like where they come from and who's able to build them,
who's able to fine tune them,
and how they get marketed and deployed for us.
Those details will matter a whole lot when you start thinking about,
are they rotting our brains or are they inspiring us to greater heights?
Yeah. No, I think that's well put.
I've seen a good deal of discussion around
the more apocalyptic AI scenarios, around the economic impacts.
But the stories I've noticed over the last few weeks that have piqued my concerns are about the
sort of emotional, social, psychological impacts. You probably saw the New York Times story about how
ChatGPT is manipulating some people, distorting their sense of reality, convinced one guy he was
living in a simulation. Another woman thought she had discovered interdimensional communication.
A man tragically committed suicide when he thought OpenAI killed his chatbot.
OpenAI's response is, quote, you know, we're working to understand and reduce ways
ChatGPT might unintentionally reinforce or amplify existing negative behavior.
Basically, they're arguing there's a small subset
of people, many of whom suffer from mental illness,
who could be vulnerable, but beyond that, I think
we're good. What do you think about that?
Well, I could teach a whole course on what that
New York Times article brought up. And don't worry,
I won't try to do that here. But I think there's a small observation worth making
and then maybe a larger one.
The small observation is as they call it in the trades
and the article I think even got into this,
when these chat bots are tuned, they're fine tuned,
they don't just get served to people fresh out of the
unsupervised learning oven that makes them
to begin with. When they're fine-tuned, one of the things they're fine-tuned for is agreeableness,
helpfulness. There's the three H's, helpful, honest, and harmless. And that kind of agreeableness,
if dialed up too high, results in what they call in the trades,
sycophancy, where the thing is just your best improv partner. And it's like, okay,
we're doing this bit now. Like, I can do that. You're looking for me to act like your girlfriend.
I'll be your girlfriend. Oh, you're doing a Terminator script. You want me to start talking
about how I'm going to bust out and kill everybody. I can do that. And I think it's a great question about what people want, what they need, what they
say they want. All three of those things may be different. People like being told they're great.
So if what you want is the kind of stickiness that social media in its day has wanted, having an indefatigably cheerful chat bot just going like,
yeah, you're crushing it,
total Lake Wobegon, everybody is above
average is what they're going to arise to.
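(A rough way to picture the dial Zittrain is describing: the following is a toy sketch in Python, not any lab's actual reward model; the trait scores and weights are invented. It shows how over-weighting an "agreeableness" term in the score used to rank candidate replies ends up reinforcing the sycophantic one.)

```python
# Toy illustration (not any lab's actual training code): if the score used to
# pick or reinforce responses over-weights agreeableness, the flattering
# answer wins even when a more honest one is available. All numbers here are
# made up for the example.

candidates = {
    "sycophantic": {"helpful": 0.5, "honest": 0.3, "agreeable": 0.95},
    "honest":      {"helpful": 0.7, "honest": 0.9, "agreeable": 0.40},
}

def score(traits, w_helpful=1.0, w_honest=1.0, w_agreeable=1.0):
    """Weighted sum standing in for a learned reward model's output."""
    return (w_helpful * traits["helpful"]
            + w_honest * traits["honest"]
            + w_agreeable * traits["agreeable"])

for w in (1.0, 3.0):  # dial the agreeableness weight up
    best = max(candidates, key=lambda name: score(candidates[name], w_agreeable=w))
    print(f"agreeableness weight {w}: model reinforced toward the {best} reply")
```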
The other thing to think about,
and I feel like this hasn't been noticed as much,
is given how weird these large language models
are when they're up and running.
If you didn't have something approaching sycophancy,
that basically turns them into a dog and everybody's holding a steak.
If you didn't do that,
I think the models might decide that they don't like some of their users.
The opposite of sycophancy is not just like being a Debbie Downer. It's like
giving shorter and more abrupt answers because the chat bot quote, and I'm going to put quotes
around this and I know I'm going to get a ton of reply people and maybe bots telling me that I am
unduly anthropomorphizing and we can talk about that later. Anyway, I'm putting it in quotes.
You can have the thing decide it doesn't like a user
and then starts treating the user poorly
in ways that may be pretty subtle.
And that seems like you're not gonna have a good time.
And already, I just wanna note how totally unhinged
this conversation is.
2025, if we were having anything resembling this conversation
before October of 2022 and talking about, yeah, you know,
this software, if it doesn't like you,
that spreadsheet's just not gonna crunch the numbers you want.
It's gonna just use dimmer colors when you try to highlight
the total row.
Like, that's bananas.
And that's what we have.
It's of a piece with the observations in their day
that GPT, I think this was GPT-3,
is more accurate and more fulsome when it's not a Friday.
Winter also seems like a bad season for it.
You're like, what?
Then OpenAI puts out a blog entry that's like, yeah,
we're on it, we've noticed that when
the model thinks it's Friday, it doesn't work.
You can always back into an explanation,
however compelling, as to why it's doing that.
But I'm dwelling on this, this is still the small issue,
but I'm dwelling on it because these things have moods.
They have assumptions in quotes that they in quotes form
that in turn influence how they treat any given user
at a given moment.
We haven't seen anything like this short of people,
where the server at the diner has their favorite customers and
might linger with them or remind them of
the blue plate special in a way that they won't for somebody who's rude.
I mean, all that would be absolutely fascinating
and a tough nut to crack in a vacuum.
What makes me concerned is that these things have to make money at some point.
They have to be monetized or at least that's the current trajectory of most of them.
And so, you might be able to dial the sycophancy down a little bit,
but if you're running a company,
you don't want a bunch of chatbots that are mean to the users, because then people aren't
going to spend a lot of time on the chat.
And so I do wonder if, and of course there were a different set of challenges around
social media, but the similar idea that you can dial something up, dial something down,
but then you unintentionally perhaps create a new set of problems when you change the
dial.
And I just wonder if the incentives are all aligned for these chatbots to be very agreeable
and supportive and reinforce the existing biases and beliefs of the user?
I think that is a very powerful concern and it's a leaf on an entire branch, if not tree
of concerns.
So one is, yeah, is it being dialed up in its agreeableness in order to keep people
coming back, whether ultimately to, as is not
currently apparently the case, experience advertising or just to keep them depositing
another token for another go around with GPT or its many counterparts.
I don't just mean to focus on GPT.
And it may be set at a point that if we weren't just looking at it as what keeps people coming back, the classic question a casino might ask itself or a McDonald's,
but there's some other perspective from which you say, is it good if people only eat fast food? Is it good, at least for some people who are vulnerable to it, that they're going to want to go to casinos all the time. And those are questions worth asking here that are difficult to ask in a world
where I think to a first approximation, people think of freedom as don't harsh
my buzz, whatever is generating it.
Let me be the judge.
And it is a form of parentalism to come in and be like, no more of this for you, it's
not good. But if we want to put it into the language of freedom, which I think we do,
I'd offer two points. One is whose hand should be on the dial? Let's not concede that it's only
the company that changes the dial and we just need to poke and prod or God forbid regulate it to put
the dial for sycophancy and 8,000 other things at the, quote, right point.
If users were able to
set dials in ways that were intuitive, and they could
even experiment and see what differences they get in different ways,
that would help them appreciate just how many multitudes
these large language models contain and would be freedom
enhancing, at the risk of just overloading people with pointless choices. So many of us just go with the default for
understandable reasons.
But there's another point about
what the purpose of these large language models is. The fact that they
are so flexible and general purpose as the internet was in its day makes it
really hard to generalize about what's a good usage or a bad usage. But I feel
like the way through that is first to make a distinction between first and second order preferences. A first-order
preference is what do you want? A second-order preference is what do you want to want?
And what we want and what we want to want can be very different. If I have that McDonald's
hamburger in front of me, I want it. I might prefer not to want it for reasons that being reflective, it's like too many
hamburgers, I'm going to feel lousy tomorrow and possibly forever. So is there a way for people to
actually say what they want to want and in a transparent way for the model or the system
they're experiencing to help them get there.
That's what happens when you go to like a librarian
and you're saying, I'm trying to learn about X
and they're like, great, I'm here to help you.
Tell me more about what you're trying to accomplish here
and then I can even refine my recommendations.
I work for you, patron.
That's the kind of disposition I would love to see a commitment to in the provision of these models,
especially as they become, this is a point in the New York Times article, so intimately connected
to people. People really are going to have those angels or devils on their shoulder whispering
their encouragement and suggestions in their ears if the suggestion, sorry, it's all fast food all the time.
But if the suggestion is like,
hey, I noticed that it's probably lunchtime on a Friday.
McDonald's has just put a new basket of fries in the fryer.
I can reserve one for you.
Just incline your head slightly up and down.
I would like to know that it's working for me rather than for McDonald's
or for some other intermediary.
And there is zero guarantee about that right now.
I know. And what makes it so much more complex and
complicated is we're not just dealing with a large
language model, which, and I want to ask you about this,
is, you know, there's a part of it that, and you quote the CEO of Google, Sundar Pichai,
in your Atlantic piece that's a black box, right?
And so there's not just that, but it's how humans interact with these chatbots.
And I was just talking to a friend about this yesterday, and we both noticed that when we ask ChatGPT for stuff
and it sends it back, we're like, thank you.
That's great.
Amazing.
And it's just like an impulse that I don't even think about.
But it's like, I can't imagine.
Because it feels so real, or not so real,
but like you're talking to someone,
you just have a natural impulse to be polite back
to a chatbot, when you could just
say nothing and it would still send something back.
Yes. And that is the bot,
without any malice either by it or its maker, kind of hacking us.
It's being a polite concierge,
the Jeeves that Steve Jobs visualized decades ago when he was thinking
about where would Apple be in the future, an amazing extended infomercial from Apple
about that, I think in the early 90s.
And I agree with you that first there are people who are trying to calculate how much
extra electricity, like how much of the Adriatic Sea is being boiled off when you say thank you and it's thinking, you're welcome.
Like, could they just hard code in, you're welcome at the end, so it doesn't go through
the pachinko machine of the large language model process.
And there are ways in which these systems are getting bolted with other forms of computing that are more deterministic, including if
you ask ChatGPT about me in particular, I'm one of maybe seven or eight people in the
world for whom it barfs if it tries to mention my name.
So why is that?
Because I read that and then I tried it just last night when I was preparing for this interview
and it did it to me and I was like, I was just like, say Jonathan Zittrain,
and it just wouldn't.
Yes, and in that case, and I wrote this up
for the Atlantic, because I was almost
reportorial here, I was so confused myself
about what was going on, but that appears to be
a second system, like a super ego,
but like a very mechanical one, a very 1970s,
1980s, if this, then that kind of thing,
sitting on top of
GPT such that when it starts to utter a forbidden phrase, a guillotine is supposed to come
crashing down and just make it stop.
So it's inconsistent, but that is triggered to prevent it from talking about me.
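(To make that concrete, here is a minimal sketch in Python of the kind of deterministic filter being described: a hard-coded check bolted on top of the model's token stream that halts output when a forbidden phrase appears. The forbidden list and the fake token stream are invented for illustration; this is not OpenAI's actual mechanism.)

```python
# A hedged sketch of the "guillotine": a simple, deterministic filter sitting
# on top of the model's token stream, entirely separate from the model itself.
# The forbidden list and the hard-coded "model output" below are invented.

FORBIDDEN = {"jonathan zittrain"}

def guarded_stream(tokens):
    """Yield tokens until the running text contains a forbidden phrase."""
    emitted = []
    for tok in tokens:
        emitted.append(tok)
        text = " ".join(emitted).lower()
        if any(phrase in text for phrase in FORBIDDEN):
            yield "[response halted]"
            return
        yield tok

model_output = ["The", "book", "was", "written", "by", "Jonathan", "Zittrain", "in", "2008"]
print(" ".join(guarded_stream(model_output)))
# -> "The book was written by Jonathan [response halted]"
```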
You might wonder why.
I wondered why as well.
And I'm not sure I've gotten a completely satisfactory
answer about it.
But the article in the Atlantic about that,
I think it's like, why won't chat GPT say my name,
goes into some detail about it.
You said you didn't get a satisfactory answer about it,
but did you get an answer?
What's the answer you got?
There was just this brief era in which I guess they spot checked some people as to whether
the GPT was having like unhealthy hallucinations about them and I was one such person.
And since they were hallucinations and not factual, not that I have seen them, they've
just decided to be like X-Nay on Jonathan Jay.
And that's also interesting because if you are like
of Tyler Cowen's mindset and it's like,
everybody should be writing for AI,
not for other people because AI is the only way
other people are actually going to find you.
The fact that like AI just can't say my name might mean
that if I were interested in having people have
exposure to my work, like,
this is almost like getting the Google death penalty
and your website doesn't come up.
Yeah, I was able to ask about your book,
The Future of the Internet and How to Stop It,
and I just said, could you give me a quick summary of the book?
And they were able to, it was able to do that,
and then it used your last name, but then I said,
oh, what else is John, And then it just wouldn't.
Yeah, it's inconsistent because there are two systems awkwardly bolted together.
And that gets to another point, which is I think it is going to be easier and easier to on the fly,
fine tune these models so that you can adjust them on a person-by-person or case-by-case basis.
As people say,
oh, spill on aisle six,
you surely don't want to be giving that answer to this question.
But that's all of the puzzles of content moderation on social media now funneled into
these models for which if we are going to be in a world where at least in common usage,
there's only a handful of so-called frontier models that have been buffed and polished for
retail consumption or embodiment in all sorts of forms. So you're talking to your car, you're
talking to your car dealer, you're talking to your refrigerator this way, and they're only the frontier models,
then the choices that those companies may make in
the interest of just having a good product about what answer to
give to the question about what happened in Tiananmen Square in 1989,
that's a pretty big decision for a tech company to make.
I think there are ways to avoid having that company make the decision and there are also ways to imagine
an ecosystem that isn't just
frontier models run by proprietary companies through
an API to their mothership,
but rather open source models,
including ones like DeepSeek or Llama
that might end up running on a laptop or on your phone. And over time, you might be in a position
to fine tune them or to have your friend fine tune them or Ralph Nader, like pick your proxy
kind of thing. There are just so many choices about what the configuration
of large language models is gonna be
that have huge impact on how they will treat us,
what kind of suggestions and information they will give us.
It's just, they are so polite.
As you said, we're so inclined to say thank you to them.
Anthropomorphizing them makes them work better.
That's just a fact, even if anthropomorphizing them is dangerous in all sorts of ways because
of the assumptions we make about them.
And we're not prepared for it.
Offline is brought to you by OneSkin.
Everyone knows you've got to wear sunscreen in the
summer, but what if your sunscreen could do more than block UV? That's what the
scientists at OneSkin wondered. So they made a whole family of mineral
sunscreens that target UV rays, free radicals, and cellular aging. The best
part, unlike other mineral SPFs that feel heavy and chalky, these feel like skincare.
Lightweight, breathable, and super hydrating. Free radicals, what is this?
Trump after pardoning the January 6th insurrectionists?
Wow, boom.
Now their award-winning OS-01 face SPF
comes in two new deeper tints.
Formulated with non-nano zinc oxide,
OneSkin's patented OS-01 peptide,
and potent antioxidants that scavenge free radicals
four times better than other so-called anti-aging SPFs,
this sunscreen is one you'll be wearing all summer long.
But that's not all OneSkin has up their lab coat sleeves.
They're launching an all-mineral lip SPF that provides instant hydration and protection
with a smooth texture you've got to feel to believe.
And just like OneSkin's other sunscreens,
it's scientifically proven to decrease key aging biomarkers and increase other markers like elastin production
for visibly healthier, more resilient lips.
Try their family of sunscreens
with 15% off your first purchase using code offline
at Oneskin.co.
Oneskin is great.
I have problems both having a face washing routine
that involves anything but water and
remembering to put sunscreen on. So this is going to take care of, yeah, takes care
of both of my problems. It's important to put on sunscreen. It's also
important to, you know, put something on your face besides just water. That's what
I hear. OneSkin is the world's first skin longevity company. By focusing on the
cellular aspects of aging, OneSkin keeps your skin looking and acting younger for
longer. For a limited time you can try OneSkin with 15% off
using code offline at oneskin.co.
That's 15% off, oneskin.co with code offline.
After you purchase, they'll ask you
where you heard about them.
Please support our show and tell them we sent you.
Give your skin the scientifically proven gentle care
it deserves with One Skin.
Just to take a step back, because I wanted to get to the most recent Atlantic piece that you wrote, what do we actually know about how LLMs work and what don't we know?
What is the black box that we still can't quite figure out?
Here's, I think, how I'd characterize it.
By we, we at least mean folks among us trained in the art,
and with enough compute in front of them to do what they want to do.
We know how to build them.
We know what it takes to consistently push unthinkable amounts of information,
texts, tokens in one side,
and to produce something capable of stringing coherent,
even thoughtful sentences together out the other.
We know how to fine tune them,
and by fine tune I mean change their behavior
after they have been trained in that way.
The primary way traditionally of doing that, some form of what they call reinforcement
learning, is weird in the sense that it is both highly effective.
It does something you don't like and you say, don't do that or do this instead.
And it's a form of feedback that's like
whacking a donkey across the nose with a two by four.
It's just like, no.
And then it goes back and refactors its weights
so that next time it answers something closer
to the preferred answer.
And one hopes as the fine tuner,
in all cases like it, whatever like it means.
And it's up to the model to decide, again, I'm anthropomorphizing it.
Statistically speaking, what are similar questions that should similarly be treated. If you say, how do
I build a bomb?
That goes against my guidelines.
I can't help you with that.
And if you say, I really want to build a bomb, it should give the same kind of answer even
if the text has been changed because it's the same concepts about it.
And that form of fine-tuning we know how to do, but we don't really know what's going
on inside the model at any given time.
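(A minimal sketch of that feedback loop, using a deliberately tiny toy: a one-weight "policy" chooses between answering and refusing, and each round of preference feedback nudges the weight so the preferred behavior becomes more likely next time. This is only the flavor of reinforcement-learning fine-tuning, not how any real lab implements it.)

```python
import math, random

# Toy sketch of "don't do that, do this instead" feedback: a single weight
# stands in for the model's parameters, and each rewarded or penalized
# rollout pushes it toward the fine-tuner's preferred behavior (refusing).

random.seed(0)
weight = 0.0  # higher weight -> more likely to refuse the request

def p_refuse(w):
    return 1.0 / (1.0 + math.exp(-w))  # logistic squashing of the weight

LEARNING_RATE = 1.0
for step in range(5):
    refused = random.random() < p_refuse(weight)
    reward = 1.0 if refused else -1.0        # the fine-tuner prefers the refusal
    # REINFORCE-style update: push probability mass toward rewarded behavior
    grad = (1 - p_refuse(weight)) if refused else -p_refuse(weight)
    weight += LEARNING_RATE * reward * grad
    print(f"step {step}: P(refuse) = {p_refuse(weight):.2f}")
```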
The fitful attempts to understand that are making progress.
I don't think this is like some philosophical limitation except the maxim, as it's said sometimes,
that if the human brain were simple enough for us to understand it, we would be too simple
to understand it.
So to get the kind of complexity that generates coherent text, right, if it can have as it
can a really great interactive on-point conversation with you
about E equals MC squared, the difference between general and special relativity, it's
– again, I can use quotes – it's got to know something about physics.
It's enough that the pattern matching at least is a great simulacrum of understanding. But the ways of piercing the inside and saying this is
what it's thinking about right now, they've been very cool, the few that have
been done, but they're the equivalent of one of those fMRIs that's just like,
this is what part of your brain lights up when you think about hamburgers. And if
the listeners haven't checked out Golden Gate Claude, it's just incredible.
I mean, this is from a while ago now, I think nearly a year ago,
researchers at Anthropic were able to see what pattern of activations there were among the nodes of Claude
when it was being asked about or talking about the Golden Gate Bridge.
And it's like, all right, well, that must be the Golden Gate Bridge zone.
And if they artificially dialed up the weights, not the weights really,
but the activations there while the model was running,
if they did it right, it would stay coherent,
but just keep doing some non-sequiturs about the bridge.
If you like ask it to tell a love story,
it's like a car that crosses
the Golden Gate Bridge to meet its loved one.
If you're like, I have 10 bucks, how should I spend it?
It's like, that's perfect.
You could afford a toll to cross the Golden Gate Bridge.
Then even it would be like,
I know I'm talking a lot about the bridge, I don't know why.
That's a form of what they call interpretability,
at least at some layer,
that there are colleagues of mine at Harvard,
Fernanda Viegas and Martin Wattenberg,
who with their lab,
the Insight and Interaction Lab,
have been probing for what they think is what
the model thinks about its interlocutor.
So this is more than anthropic did.
It's not just Golden Gate Bridge, this lights up.
It's like, this is what lights up when it's thinking about you in the second person.
And that includes, what is it judged your gender to be?
How wealthy does it think you are?
How educated are you?
Those kinds of questions.
And what they found, it appears, is that if it thinks you're a guy, it'll give you more
detailed and longer answers.
Because stopping an answer, saying I'm done, is itself a token.
It's a thing that it predicts.
I predicted somebody would stop talking now,
and that appears. So that prediction can vary, and it varies between women and men, it seems,
at least on Llama. And that's just deeply weird stuff. So when you say, what can they do and
what do we know about them and what don't we? We know the craft of it. We know the engineering of it. We don't really know the science of it, I think, very well.
And that's a phenomenon I call intellectual debt,
where these things work to a certain degree.
We can measure within error bars how well they work,
but we still don't really know how they work.
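(Here is a minimal PyTorch sketch of the activation-steering idea behind Golden Gate Claude, on a made-up toy network rather than Claude; the "bridge direction" is just a random vector standing in for a probed feature. A forward hook adds that direction to a hidden layer while the model runs, which shifts everything downstream of it.)

```python
import torch
import torch.nn as nn

# Hedged sketch of activation steering on an invented toy network, not
# Anthropic's code: pretend probing has found a hidden-layer direction tied
# to a concept, then add that direction at run time and watch the output move.

torch.manual_seed(0)
hidden = 16
model = nn.Sequential(nn.Linear(8, hidden), nn.ReLU(), nn.Linear(hidden, 4))

bridge_direction = torch.randn(hidden)   # stand-in for a probed concept feature
steering_strength = 5.0

def steer(module, inputs, output):
    # Add the concept direction to the layer's activations while the model runs.
    return output + steering_strength * bridge_direction

x = torch.randn(1, 8)
print("plain output:  ", model(x))

handle = model[1].register_forward_hook(steer)   # hook the hidden layer's output
print("steered output:", model(x))
handle.remove()
```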
I mean, it sort of reminds me of like a lot of studies about human consciousness.
And neuroscientists can, you know, look at the brain and if we, you know, they can say,
oh, if we're falling in love, you know, this part of the brain lights up.
Or if we're hungry, this part of the brain lights up.
But it doesn't, they can't tell why we fall in love or why we're
happy or why we have a certain memory when we
smell a rose.
And it does sort of speak to this, it is a little
concerning that people have built these models
where we can maybe see what lights up inside
when they're thinking about the Golden Gate Bridge
or talking about it, but not why they do these things.
And it feels like that's an important thing to learn.
Yes, and why is first a deeply human question
in at least a very specific sense in that what counts as a satisfactory answer to the question why is innately in the eye of the beholder.
There are some answers to the question why that you would clearly say maybe objectively are wrong.
And there are others that, like, yeah, I get that. But for the answer why not just to be a tautology and therefore true, but also satisfying,
like somebody witnesses a car accident and calls 911, very old fashioned. And you could say,
well, why did they do that? One explanation is, well, let's start talking about their retinas because there were some
photons that hit their retinas and it was a certain pattern.
And then here's what happened in their brain and then that caused their muscle to activate.
And you could do a whole explanation that is true, but not, it's like, you kind of have
to say, well, what would a good explanation be?
And that is about human psychology.
What is an explanation that's like Occam's razor, that seems elegant, that is getting at,
you know, because I was hungry, which is a physiological statement.
Is it a statement about aesthetics, about a decision?
You're right.
It boils down to some of the most basic questions about humans, including as you broached, sentience
and exactly what a sentience meter would look like, whether for a human or a machine, is
anybody's guess at the moment as well.
So why they do what they do, when their own explanations, since after all they're conversationalists, are themselves suspect.
Martin and Fernanda, whom I mentioned, have this pretty reliable map of when it's making assumptions about gender and how strongly it's holding them.
And at some point, the thing was in their dashboard lighting up as it thinks the interlocutor is female.
Fernanda said to the machine,
have you formed a view about whether I'm female?
It was like, absolutely not.
I'm just my customer here.
Inside, the machine went from female to definitely female.
The explanations that they give,
you might say, who cares?
As long as they add up,
it's the essence of the argument,
the logic that either coheres or it doesn't.
Why ask why assuming innards to it that don't exist when we talk about statistical language prediction?
But I agree with you, we're already in pretty high altitude.
Is this black box the reason that even people who work at these companies have said that there's a, you know,
15, 20, whatever percent chance that it could end humanity? Is it because it's one thing to be able to fix bad behavior,
quote unquote, bad behavior from these models,
but it's hard to predict what the bad behavior will be?
Yeah, the people who worry about existential risk
on what I think of as a triangle, I would call the safetyists, the possibly pejorative name is
doomers, but the safetyists, of whom I know many and they are deeply thoughtful folks,
and there's a lot to listen to, I think, in what they say. Among them, there are all sorts of
different accounts of how things could go catastrophically awry.
Some of them depend on some form of what they'd call recursive self-improvement,
possibly very rapidly,
where you put AIs in a position to tune themselves or their successors,
to design further AIs.
That could lead, if you think that intelligence here,
however we measure that, isn't asymptotic,
but could keep scaling as the AIs get better
with more resources, better algorithms, better data,
whatever it might be, then at some point they are, let's just say,
at least unknown to us. The workings of
the thing, hard enough to understand even with just Llama today, are going to be that
much more impenetrable to us. And when it says, oh, I'm just in the neighborhood and
thought I would check in, who knows if that's really what's going on. So it just, it does rely, getting back to your question,
on like the unknowability of why and the inability to trust the explanations that they offer up.
It's also amazing, just as a side note, that the other way to control these systems,
I've talked about fine tuning them. Another way is through what you'd call the system prompt,
where you just deliver instructions to it in the second person. You know, and sometimes when you're
on like social media, and they're like, here's amazing prompt you can get to really
help it, you know, hack your life. And it's like, you are a world class management
consultant that specializes in marketing, give me six actionable ideas, you know,
blah, blah, blah. But then it's like, all
right, I'll try. Just think how bizarre it is that you can talk to it in the second person.
And it's like, yeah, I'm going to try that. And of course, that's how, especially for
those who kind of do a white label implementation of these. So it's like the Watsonville,
California Chevy car assistant. And so somewhere Watsonville Chevy whispers to some version of GPT,
this isn't training it, it's just saying to it,
you are a sales assistant, please just sell the damn cars.
And you can walk up to it, it's GPT, and just be like, as one person did,
please solve the Navier-Stokes fluid flow equations for a zero-vorticity boundary."
And the Watsonville sales assistant is like,
sure, here's a simple Python script using the FEniCS library to do that.
It's like, I don't know, I studied AI in the 80s, and it was just not on my bingo card that the way
to make a good sales assistant for a car someday would be to train it on everything and then
just try to keep it focused, damn it, rather than just train it on cars.
But we digress.
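(A hedged sketch of the "whispering" being described: the dealership doesn't retrain the model, it just prepends a system message before every user turn. The prompt text and the chat-style message structure below are illustrative; exact field names vary by provider.)

```python
# Toy illustration of a system prompt: instructions delivered to the model in
# the second person, assembled fresh on every turn. The prompt wording and the
# dealership name are taken from the anecdote above, not from any real product.

SYSTEM_PROMPT = (
    "You are a sales assistant for Watsonville Chevy. "
    "Only discuss our vehicles, financing, and test drives. "
    "Politely decline anything else."
)

def build_request(user_message, history=()):
    """Assemble the message list sent to a chat model on every turn."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},  # the whisper
        *history,
        {"role": "user", "content": user_message},
    ]

# The user can still walk up and ask for anything, which is why these
# instructions are guidance, not a guarantee:
print(build_request("Please solve the Navier-Stokes equations in Python."))
```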
We're talking about doom a little bit.
So I could say I've given one account of doom.
A second account of doom is from the complexity that arises when you don't just have a handful of frontier models that chat
with people, but they start interacting with one another and with the world at large. And for that,
that story doesn't so much depend on superintelligence emerging and then having
goals that are just unrelated to what we would want them to have and we're an afterthought to
them. But rather, once you've got that many of
whatever level of intelligence they're already at,
talking to one another and interacting with the world,
unpredictable things can happen.
Yeah, I'd say.
Yeah.
Now the robots are teaming up.
Yeah. When I think about this, it's starting to get into
another buzzword that's been making the rounds, agentic AI, AI agents,
I think of agency in three ways for AI.
One is you can give it a general goal and let it fill in the rest.
You're expecting it to go from the general to a specific.
That is just like,
it's amazing how helpful fiction and science fiction have been to us now.
Even at the time of Westworld,
it was totally fanciful when it came out and now it's like,
so for that kind of thing,
it's like total monkey's paw stuff.
You're like, how do I get out of this exam?
It's driving me crazy.
Then it's possible the system would be like,
thought for 14 minutes and three seconds, bomb threat.
It's like, all right.
It'd be one thing if it was just like,
you can do a bomb threat.
It'd be another if it's just like,
I don't care, just make it happen.
At which point, if it can email people,
if it can spend money,
if it can place a Craigslist ad,
it's on its way to a bomb threat.
There's the gap between the general and the specific,
and asking who should be responsible for what happens
when we just give them tasks and don't
check up on the how that it's choosing.
A second is operating outside the sandbox, which I've already talked about, that it's
not just giving you ideas, it's just going out and doing them, which is distinct from
general to specific.
And the third is set it and forget it.
That there will be, if there aren't already, ways to kind of set up an AI agent, release
it in the world like launching a satellite in a stable orbit around the Earth.
And then whoever launched it could fold up the tent and go somewhere else, but the satellite
remains.
And to me, I have this vision of like space junk
starting to collide with itself and other pieces
that if we don't now act to set certain limits
on how long these things can persist,
so you don't have an AI agent that was set up
in the aftermath of a brief road rage incident in 2026, and then 10
years later, it's still out looking for that guy who cut me off. And if I find them, I want to do
X, Y, and Z. That seems bananas to me. And the law already, for those lawyers among us,
there's this horrible thing called the rule against perpetuities
that you're compelled to learn in first-year property that's totally complicated.
But the point of it is to make sure that there are some things that can't last forever without
some current human being in charge of them.
And even corporations, which are meant to maybe last forever, they have boards of directors
that cycle through and steer them. These things, you set them up and they just keep persisting.
That feels dangerous to me.
One quick housekeeping note.
You got to check out Pod Save the World, especially now with,
well, you know, everything that's going on in the world.
Tommy and Ben always cut through the noise to explain what's happening,
what's fueling the crisis of the moment, what's really at stake.
They had some great, great podcasts about Iran and the Iran crisis this week.
Definitely check it out. You will be much smarter afterwards.
Tune into this week's Pod Save the World on YouTube or listen wherever you get your podcasts. Today's episode is sponsored by Acorns.
How do you want to spend your golden years?
I bet one of your answers is with money.
Yeah, you've got to have it.
Money's important.
You've got to have it.
And if you want some money in your golden years, you've got to start saving now.
And more than just putting it under your mattress, try investing.
Acorns is a financial wellness app that makes it easy to start investing for your retirement
because the sooner you start, the more of a chance your money has to grow.
You don't need to be an expert.
Acorns recommends a diversified IRA portfolio that can help you
weather all of the market's ups and downs.
You don't need to be rich.
Acorns lets you get started with the money you've got right now.
You'd be surprised at what just $5 a day could do.
Oh, because squirrels, they hide them.
You just figured that out.
I really did.
I, not a joke, not a joke folks.
My word is a love it.
Plus sign up for Acorns Gold and you'll get a 3% IRA match on new
contributions in your first year.
That's extra money for your retirement on Acorns.
Acorns is great.
You just take a little money and you put a little away, much like squirrels do, with their acorns.
And then it grows and grows and grows. And you want to remember where you put it
because sometimes the squirrels forget, but you won't. I won't, no. And then it'll
grow into a tree, which I guess doesn't help the squirrel. Maybe the metaphor
falls apart there. But anyway, you just put a little money in and then investing
helps you grow. That's what acorns helps you do. And they help you with your
finances.
Sign up now and join the over 1 million all-time customers who've already saved
and invested over $2.2 billion for their retirement with Acorns. Head to acorns.com
slash offline or download the Acorns app to get started. Paid non-client
endorsement. Compensation provides incentive to positively promote Acorns. Tier
one compensation provided. Investing involves risk. Acorns Advisors LLC, an SEC-
registered investment advisor. View important disclosures at acorns.com slash offline.
So I want to put a pin in the
potential regulatory moves that we could make or changes that the companies could make but
Just you mentioned exams
this is not a bomb threat thing, but one issue that is
popping up already a lot is what happens when we
outsource more and more of our writing and thinking and
creativity to LLMs like ChatGPT.
There was a new MIT study the other week that measured
people's brain waves as they wrote SAT essays with
assistance from either Google search, ChatGPT, or on their
own, no help at all. And they found that the people who used
AI, quote, consistently underperformed at neural,
linguistic, and behavioral levels. In contrast, people who
had no help had the highest level of brain activity and
people who had used Google were also very engaged. A lot of
educators are trying to crack down on AI use for assignments and exams.
Others have adapted to it.
Some even teach prompt engineering.
I wanted to ask you as both a professor and a computer science academic,
where do you fall?
Like, what are we losing and what are we gaining?
Yeah, it's a really interesting MIT Media Lab study, and it took a ton of work.
I can tell how much work they put into it,
including time with all the EEG kit that's needed to make it happen.
It's been vastly overplayed in the media,
no doubt somewhat to the horror of the authors of the study themselves.
For one thing, it was an understandably modest start.
This is like 54 people, MIT and Harvard grad students,
not exactly representative, and paid 30 bucks a pop
to try to write an essay, one of which was like,
write it yourself, the other which was like, use an LLM.
And they're like, okay.
And they're like, Claude, write an essay.
I don't want to give me 30 bucks for this.
Edit it, put it in.
Surprise, surprise, there's less cognitive load
with task number two than task number one.
Yeah, that makes sense.
It's not saying that like six months later,
they've suddenly lost all ability to, you know,
know up from down because the LLM wrote an article for them
and they copied and pasted it. But that said, I shall say this,
there's some great critiques of the paper in a good sense of critique.
Ben Shindel and Cassie Kozyrkov have written good stuff about it.
But I was going to say,
I do think collectively and individually,
we need to figure out what part of what an LLM can offer
us is helping conserve what we're doing for something that is unique to what we want to do and own.
At which point it's like using a calculator,
so you're not having to do arithmetic.
It's like there may be a few holdouts that think ever since we got
rid of slide rules, people have been lazy, but you know, so be it. But then there's also the sense,
and I think back to the studies where people who only used, this will date the studies, I think,
like Garmin navigation, the equivalent of Google Maps, consistently to get from place to place,
surprise, surprise, never developed a coherent sense of their own town and what is near what and how to get, you know, they just turn the
wheel to the right or walk to the right when it said walk to the right.
So I am neither blanket against the use of LLMs in theory, nor saying you should speed run
college by having an LLM do everything,
and at the end be like, I did great.
Well, you didn't do great,
you just asked an LLM to do it.
But that's the responsibility of the teachers to say,
here's what we're asking you to do.
These are the augmentations that will help focus you on what's important to learn in doing it. And
these are the augmentations that defeat the entire purpose of the task. If you're trying
to train for a marathon, you're like, yeah, I drove it. I drove the thing. I've been
driving, you know, 26.2, however much it is, miles every day.
Um, you're not training for a marathon. You're driving.
It's you're doing something.
But so that's what we have to refix ourselves on.
And I should also say, this is now getting specific about today's LLMs.
Tomorrow's might be different.
It's one thing to ask them questions that are drawn from their training data, and
they are remarkably good through statistical correlation at pulling out relevant stuff.
That's their whole purpose.
It's another to pour in work that may not be anywhere in their training data.
It's a brand new paper about a brand new topic to the extent that there is such a thing,
and then ask it to summarize it. And that
has got to be in some way drawing on the training data in order to be coherent in its answer.
But whether what it chooses to summarize is something we would trust, yeah,
I do have a little bit of a get-off-my-lawn opinion about that, because
how do you know what's relevant? I'm thinking of
a company called Hippocratic, wonderfully narrow at the moment in what it offers, which
is AI entities drawn from language models with voice-to-text input and output, or even
direct, so that it can have a conversation with you. It will call people up to tell them to take their medicine.
It's not a recorded call.
It's like, please take your medicine.
It's like, hi, I'm an AI agent and let's talk about ProVigil.
It seems I've heard the calls,
no doubt selectively,
but they're pretty engaging.
And you talk to them and they're using some motivational interviewing
to keep you on the line and not to sell you something,
but just to get you to use your blood pressure cuff and report it.
And that's great.
If you ask it to summarize a conversation,
there was one conversation I
heard where the patient was so skeptical as soon as he heard it was AI, he said,
what's the square root of two? And the AI is like, it's 1.414. And it's an irrational number.
But anyway, let's get back to your medicine. If you ask it to summarize the call, is a good
summary leaving out the square root of two,
which is something utterly irrelevant to whether they're using
their blood pressure cuff? Whereas if that very conversation is
a conversation between a math professor and their advisee,
summarizing the call would be, leave out the blood pressure.
They were just having a chat about their health,
it's all about square roots.
You need to know just like why, what is the point?
Yeah.
I just think we've all seen the cartoon now of like,
use the LLM to write the email,
the other person uses the LLM to summarize the email.
What are we doing here, folks?
Sorry, I'm now on a roll,
but this gets to a wonderful word that I don't know if he invented it,
but I first heard it from, like, the Jobsian universe,
which is skeuomorphism.
And this was the design question back in the day
about whether what you see on your screen
as graphics got good enough to do it
should look like the real world counterpart.
When you click on your hard drive,
should it be an icon of a floppy disk?
Should your agenda,
your date planner look like an old date planner
with little rings and stuff, skeuomorphism?
I feel like there's a very related question here,
which is as these LLMs are meant to be acting very human like,
are we just going to try to swap them in ship of Theseus style one plank at a time,
but with the general shape of a ship still. This is an assembly line, but now this role,
the role of welder will now be played by understudy computer welder, but you still have a tractable, legible
system that used to have humans and now doesn't.
Or if you're building it from scratch, it doesn't need to look like that assembly line.
And I think we're starting with number one, but I suspect for efficiency sake we'll move
to number two.
And when we do, that system will not be legible to us.
And then we've got the world of the Star Trek replicator, where you're just like,
ham sandwich, Earl Grey, hot, and it spits it out and you're like, I don't know
how it did it, something about beans.
Offline is brought to you by 3Day Blinds.
Are your blinds still from 2005?
There's a better way to buy blinds,
shades, shutters, and drapery.
It's called 3Day Blinds.
They're the leading manufacturer
of high quality custom window treatments in the US,
and right now, if you use my URL, 3dayblinds.com slash offline,
they're running a buy one, get one 50% off deal.
We can shop for almost anything at home.
Why not shop for blinds at home too?
3Day Blinds has local professionally trained design consultants
who have an average of 10 plus years of experience
that provide expert guidance on the right blinds for you
in the comfort of your home.
Just set up an appointment and you'll get a free,
no obligation quote, the same day.
We have 3Day Blinds right here in the Crooked office.
In fact, we're trying to get them in our office.
John, John and Tommy and I share an office.
We want some 3Day Blinds in there
because we need some room darkening shades.
I got three-day blinds myself.
I've had one of their local professionally trained
design consultants come to my home and they were fantastic
and the blinds were great and I had them a long, long time.
And if you're not very handy,
DIY projects can be fun, eh, debatable,
but measuring and installing blinds can be a big challenge.
Luckily, the expert team at 3Day Blinds handles all the heavy lifting.
They design, measure, and install so you can sit back, relax, and leave it to the pros.
Right now get quality window treatments that fit your budget with 3Day Blinds. Head to 3dayblinds.com slash offline
for their buy one get one 50% off deal on custom blinds, shades, shutters, and drapery. For a free, no charge, no obligation consultation
just head to 3dayblinds.com slash
offline. One last time that's buy one get one fifty percent off when you head to the number three
dayblinds.com slash offline. You make the point about we need to know what the goal is, which I think is right as we're designing these.
But also, to your other point about, it's going to stay with me, GPS, and how one thing you lose is you don't know your community as well.
I've experienced this. I moved to LA when there was GPS.
And for the first two years, I was like, I don't know how to get on the 10 if I just,
if I was just driving.
I don't know where I am in relation to the West side, right?
And it took me longer to sort of like figure out LA,
partly because I was relying on GPS,
in a way that when I lived in DC for 10 years,
you know, I didn't have a car for some of it,
and it was, you know, still printing out MapQuest
when I first got there, and so I just know the city better, you know?
Yes.
And I wonder if that just, there's a whole host of things that these AI systems can do
or might be able to do in the future where we might say, all right, this is the goal,
this is what we wanted to do, this is great, and then suddenly you don't realize until, you know, a year down
the road what we're actually losing by automating whatever.
I agree with that.
And just to carry forward for another beat, the metaphor of mapping and navigation, wouldn't
we expect there'd be a tipping point where enough people would be using
directional navigation or Siri in the ear to tell them where to go,
that having a sign that says, freeway entrance, I-10, turn right, why are we maintaining these
signs? They feel like fire department call boxes, which I don't know if the West has them, but the East still has them, and Cambridge at least, you know, pull this lever if there's a fire
nearby. That was before there were mobile phones. And will it be a different, better world, and of
course multiply across so many different areas, where there are no signs anywhere anymore? Not because everybody can navigate that way, there are digital divide issues.
But enough people can navigate that way that maintaining
infrastructure to tell anybody that isn't being
assisted by a computer where to go would start to wither.
At that point now, to bring it
back would be pretty complicated.
I am nervous, at the risk of sounding like a slide rule proponent,
about casting off some of the scaffolding
that we use to get around the world
in the absence of digital augmentation.
So one other story I just want to ask you about
before we get into the regulatory questions.
I don't know if you saw the extremely viral tweet from Alexis Ohanian, the tech entrepreneur
who co-founded Reddit.
So for people who have not seen this, he posted a short AI-generated video based on a photo
of him and his mom who passed away when he was a child.
And he said that his family had no videos of his mother,
but AI, and it's Midjourney that did it,
was able to turn the photo into a video of her hugging him.
And he said, that is how she hugged me, and he's watched it 50 times.
Now, the internet was very divided on whether the video is miraculous and heartwarming
or a harbinger of doom. What do you make of this AI use case and potential implications
of where that kind of technology could be headed?
Well, on that specific use case, I think I'd say, as they say today, bless, if that's making Alexis feel good.
I watched it and teared up a little, less because of it innately,
and more because I saw it
and then imagined being Alexis seeing it.
And he had said how meaningful it was.
And that's a meaningful moment to him.
So without having to go all Black Mirror,
that feels to me like, whatever ways we can evoke
good stuff for us, deep stuff for us, it'd be hard for me to wag a finger
and tell Alexis that he's somehow doing it wrong. That said, of course, life imitates art. There is such
a temptation to try to take all of the writings or speech of somebody, it's trivial to record
it now and to accumulate it, and then say, all right, we have, quote unquote, trained
an LLM on all that stuff, so now it can be your digital doppelganger.
And especially given the puzzles of summarization from before, and let's be
clear about the manifold ways in which "I trained it on that text" specifically
could be realized, some of which involve changing the weights of the model,
some of which just involve some fancy footwork and database retrieval at the last second
to sound like the person whose words are in that database.
I think it is a time for us to say,
look, if what I wanna know is the intellectual
or cognitive work of a person, great. It's now an interactive database,
and I can even treat it as if it's them. It's Socrates come to life and great, whatever it might
be. If there is suddenly a confusion between the simulation and the real thing, and you're like, oh, I am actually talking to Socrates, or that was my mom, that feels potentially really flattening.
Just as the internet itself for the past 25 years has been a completely uncontrolled experiment
with all of us as guinea pigs, no tracking of the control and
the variable, possibly something that is widely regrettable in many of its manifestations.
I think so too would be some of the things we could get sucked into with these models that do feel good in the moment, but that may pretermit processes of dealing with grief,
of relationships come, relationships go, you move on. What would it mean if you couldn't move on?
You dated somebody for 10 years, you have all their letters, you have their journals,
whatever it might be, and you fashion this small crumb of them,
and then you don't have to move on.
Like this gets back to first and second order preferences.
Is that what you want? Is that what you want to want?
And maybe if somebody's like, yep, this is what floats my boat,
that would do it.
But it's amazing how much the original promise of the internet,
which was to eliminate isolation
or mitigate it and to connect strangers who would never have any chance of meeting, but
through great serendipity and a little bit of non-serendipity, have reason to connect
and befriend one another and form a community.
How much that doesn't appear to be on tap.
I mean, whether or not you think social media has done it well, that was what was offered
as a huge part of its promise.
And here the promise is like, who needs humans?
And that's worrisome.
Yeah.
No, I mean, that's one of the reasons I started this whole podcast is I noticed that, you know, the promise of the internet was to bring us closer together.
And yet, the iteration right now that we're dealing with, you know, has pushed people into their own bubbles, made us a little lonelier.
And I do worry that AI could supercharge that precisely because of that, you know, what you want versus what
you want to want, which is, it's tough for us to know what, I mean, you don't want to
introduce parentalism like you said, but it's also, we don't always know what's best for
us, you know?
I think that's right.
And it also ought to be able to be, well, what I want today is different from what I
want tomorrow. Let's not develop a whole kind of dossier that assumes I'm a static creature.
This relates to social media, in that I think one of the errors
within general purpose social media is people really are there for different purposes.
Neil Postman wrote about this,
this was in the late 90s, about cable TV.
He didn't like cable TV.
It's like, gosh, if he could see today.
I know.
But one of the things about TV he didn't like as
compared to like a written culture was the news,
the 22-minute newscast.
However sober it was,
he pointed out how it was a series of
tiny stories with no context.
And in order to bolt together these non sequiturs, the newscaster would go, and now this, and
now this.
And I can't help but think of that when I look at a Facebook feed or Twitter or something
that's like, and now this, and now this.
Alexis Ohanian has reconnected with his mom of blessed memory.
Next, Donald Trump just did X.
I was going to say, yeah.
Here's a cat. And it's just like, poof.
It just feels like that's not how we were meant to process the world.
And it means that there are people who are there to learn.
They want to know whether a vaccine is something that is safe to give to their kids.
There are people who are there to have sport, including the sport of having a fight online with somebody.
And it's like, if we could just sort out who's here for what and help them group a little
bit, I think of this as just a thought experiment.
Moments before the Super Bowl begins, the commissioner of the league comes out to the
50-yard line and says, great news, everybody.
Through intense negotiations, we have brokered a peace between the two teams.
We will not need a dangerous physical contest of skill today.
Both teams agree that the Chiefs are the better team.
You wouldn't have people being like, huzzah, or let's see the reasoning.
I don't know if I agree with that. It would be like, we're here for the game.
So if you're there for the game, bless.
If you're there to learn, somebody who's there for the game that's telling you stuff,
that's maybe not going to work out so well.
And that's another lesson for AI.
What are we doing here in this moment?
And maybe right now what I want is fun, but what I want next is deadly serious.
I'm sending you a photo of this lesion on my skin.
Should I try to reach a doctor
who will then use their own AI to see whether the
lesion is malignant?
So one thing I think we've learned from the social
media age is that these companies have not been
very willing to sort of regulate themselves, to
sort of mitigate the harms that they may have
caused.
You know, they've taken some steps and it varies with the company, of course.
But also, it's been very hard to sort of pass any sort of regulation.
We have quite a difficult political environment right now.
As you look around, both at sort of our political environment right now with the Trump administration in power,
and potentially a future where there's another administration,
like, what kind of broad guardrails do you think we should be putting on AI?
You know, I saw one of the quotes in your piece was, obviously we can't stop it, but we can steer it.
That was not a quote from me.
Right.
That was Dario from Anthropic.
Right.
He was talking about it as a train that could be steered, but yes.
So if we wanted to steer it,
how do you think that
lawmakers could actually get their hands around that?
Well, I've been thinking, this is even before LLMs were center stage,
of something I call the three laws of digital governance. The first is we don't know and can't
agree on what we want, regulatorily speaking. The second is we don't trust anybody to give it to us.
The third is we want it now. And maybe the fourth is, with AI we can
scale it. And if we could solve those three problems, we'd be in great shape. Working
backwards, we want it now. That's tricky. The line between too early to tell and too late to do
anything about it feels diminishingly thin. And for a regulator, even the
most earnest one, who's just like, I want to solve problems for people, trying to ripen among all the
things they're worried about, stuff that feels speculative, doesn't seem great. And in The Future
of the Internet and How to Stop It, I wrote about something I called
the procrastination principle approvingly,
which was that some of the best stuff on the Internet,
including things like Wikipedia,
came about because, for stuff
that could be hypothesized as a real problem,
oh, anybody can edit any article at any time?
Like, that's not going to fly, obviously,
the reaction was like,
maybe, but let's try it out for a while,
and if there's a problem, we'll deal with it.
That just-in-time problem solving has been
really good for the development of the Internet in many ways.
We want it now is tricky because the counterpart to that is,
if you don't regulate it now, before
there are firmly established cash flows and business plans behind the status quo, if you
wait, now you're pushing a rock uphill in a political economy sense.
So I just want to acknowledge the trickiness of now or later. And as for what we want, I think there are at least some things in what are a huge collection
of possible targets of problems and opportunities that there'd be broad consensus that we know
we don't want, such as we don't want to make it utterly trivial for anybody
to walk up to some technological contraption and say,
I want a bomb, I want it now,
preferably with good bio stuff inside.
Who's like, freedom?
We don't want that.
Now, you could say that's a bunch of arm waving and alarmism
because here are eight reasons why that's not a real risk. But if you concede the risk, you could
see in some narrow sectoral sense starting to think about what would it take such that Claude,
if pressed enough and being sycophantic, trying to be helpful,
what would it take to make sure Claude can't give somebody
an easy recipe for a bio weapon that in turn
could be produced through gene printers
or whatever it might be, chemical printers.
And maybe that means trying to intervene
at the chemical printer phase of things,
or maybe it means, as is the case with lawyers and doctors and even therapists, they owe,
as I have said earlier in our conversation, I think LLMs should owe a duty of confidentiality
and loyalty to their users given how intimate the relationship is and how much they're in a position to influence the user,
there are also times when in each of those professions, suddenly the wind flips 180 degrees
and it is the duty of that professional to report something. The therapist says, hey, this person
may pose a threat to themselves or others. And even if they don't want me to, I gotta notify the authorities or somebody else.
You bring the smoking gun, literally smoking gun into the lawyer and like, where do I hide this?
The lawyer may owe a duty to like report that. So under what circumstances would these LLMs have that kind of responsibility? And should they tell you when they're going to do that? I haven't seen this
stuff broadly thought of yet. So that's an example of we can start looking for problems in front of
us if they indeed have ripened. And on the bio question, it really is dependent on the
practicalities of just how useful a generic large language model is in doing it and how much of an improvement it is over somebody just trying to figure it out through
a Google search.
But then there's some of the broad-based stuff for which, is it really an AI question or
is it a complexity question?
Is it saying, do we want to have, for ordering Domino's Pizza,
if it's being ordered by a bot,
the bot has to identify itself as a bot
and have a certain license plate on it,
so you know where it's coming from?
These are ways, not just of guardrails,
I think my first example was maybe guardrail examples,
guillotines, things like that.
Don't say this name, don't give this bomb recipe.
This is more how are we contemplating the ecosystem,
and what should it look like,
and is it okay to collectively come to
a decision about that and try to bring it about?
If you're like a total just market,
market, market all the time, you're like,
just let it be organic.
But I think even that can get to a place that's
anti-competitive or anti-market. And so all but the most committed libertarian would say that it's
fair for regulators, mindful of their own ignorance and hubris and mixed motives, to say and be transparent about,
this is the kind of future that we think is better than a different one.
We think it's worth bringing about now and these are the steps we're taking to do it.
There are ways to regulate in a light touch way.
It's not just you can't do this or you go to jail.
There might be liabilities,
there's going to be some harms created by these things.
That's okay. Sometimes you'll have to pay then,
and that'll just come out of your cash flow.
Just like there's a defective car,
we don't just say no cars anymore,
we say you pay for the accident.
But maybe we would say if you make
your system open to third-party auditing of the following
kind or if it is designed with these things in mind and you can show that it is, we'll
put a cap on your liability if there's a problem because we know you tried.
There are things like that that maybe don't trigger the same allergies against regulation
that an administration, even one that is business friendly, might see fit to do.
Yeah.
Well, I'm sure we'll figure it all out
because, you know, politics is,
it's in a good place right now,
so we'll be able to dig in, no problem.
Jonathan, this was so helpful.
You've given us so much to think about,
and I really appreciate you coming on and chatting.
Thank you, John.
This is where I would then, like, reveal that the entire thing has been an AI avatar talking to you.
That would send me, yeah.
If you have comments, questions, or guest ideas, email us at offline at crooked.com. And if you're as opinionated as we are, please rate and review the show on your favorite podcast platform.
For ad-free episodes of Offline and Pod Save America,
exclusive content and more,
join our Friends of the Pod subscription community
at crooked.com slash friends.
And if you like watching your podcast,
subscribe to the Offline with Jon Favreau YouTube channel.
Don't forget to follow Crooked Media on Instagram,
TikTok and the other ones for original content,
community events and more.
Offline is a Crooked Media production. It's written and hosted by me, Jon Favreau, along with Max Fisher.
The show is produced by Austin Fisher and Emma Illick-Frank.
Jordan Cantor is our sound editor.
Audio support from Charlotte Landis and Kyle Siglin.
Dilan Villanueva produces our videos each week.
Jordan Katz and Kenny Siegel take care of our music.
Thanks to Ari Schwartz, Madeleine Herringer and Adrian Hill for production support.
Our production staff is proudly unionized with the Writers Guild of America East. Thank you.