Making Sense with Sam Harris - #326 — AI & Information Integrity
Episode Date: July 6, 2023
Sam Harris speaks with Nina Schick about generative AI and information integrity. They discuss the challenges of regulating AI, authentication vs detection, fake video, hyper-personalization of information, the promise of generative design, productivity gains, disruptions in the labor market, OpenAI, and other topics. If the Making Sense podcast logo in your player is BLACK, you can SUBSCRIBE to gain access to all full-length episodes at samharris.org/subscribe. Learning how to train your mind is the single greatest investment you can make in life. That’s why Sam Harris created the Waking Up app. From rational mindfulness practice to lessons on some of life’s most important topics, join Sam as he demystifies the practice of meditation and explores the theory behind it.
Transcript
To access full-length episodes of the Making Sense podcast, you'll need to subscribe at samharris.org. There you'll find our private RSS feed
to add to your favorite podcatcher,
along with other subscriber-only content.
We don't run ads on the podcast,
and therefore it's made possible entirely
through the support of our subscribers.
So if you enjoy what we're doing here,
please consider becoming one.
Just a brief note to say that my last podcast on RFK Jr. has been jailbroken,
which is to say the audio that's now on the free feed is the same as the subscriber audio.
There's no longer a paywall.
This came in response to some heartfelt appeals that we make it a public service announcement, which we've now done, so feel free to forward that to your
friends or anyone who you think should hear it. There have also been several articles that have
come out in recent days about RFK that entirely support the points I make there. So nothing to retract or amend as far as I know. Okay.
Today I'm speaking with Nina Schick. Nina has been on the podcast before. Nina is an author
and public speaker who wrote the book Deep Fakes. She is an expert on current trends in generative AI. She advises several technology companies and frequently speaks at conferences.
She has spoken at the UN and to DARPA and many other organizations.
And in addition to generative AI, she's focused on geopolitical risk
and the larger problem of state-sponsored disinformation.
And today we speak about the challenge of regulating AI,
authentication versus the detection of fakes,
the problem of fake video in particular,
the coming hyper-personalization of information,
the good side of AI, productivity gains, etc.
We talk about possible disruptions in the labor market,
OpenAI as a company, and other topics.
Once again, it was a pleasure to speak to Nina.
Unfortunately, in only one of the seven languages she speaks.
But for better or worse, English is where I live.
And now I bring you Nina Schick.
I am here with Nina Schick.
Nina, thanks for joining me again.
It's great to be back, Sam.
So you were on, I forget when you were on.
It's a couple of years now.
Do you have the year in memory?
Yeah, it was in 2020.
So just before kind of the new wave of what we're seeing really started to emerge.
So yes, you came on to talk about your book, Deep Fakes, which was all too prescient of our current concerns. But in the meantime, we have this new phenomenon of generative AI,
which has only made the problem of deep fakes more profound, I would imagine. And I think that'll be
the main topic of today's
conversation. But before we jump in, can you just remind people what you've been doing these many
years? What have been your areas of focus and what's your background? Sure, absolutely. So I'm
half Nepalese and I'm half German. I actually grew up in Kathmandu, but eventually I came to live in the UK. So I'm based in London
right now. And I was always really interested in politics and geopolitics. So really, for the kind
of first two decades of my career, I was working in geopolitics. And it just so happened that, at the end of the 2000s and throughout the 2010s, I happened to work on a lot of kind of seismic events, if you will,
from the original annexation of Crimea by Russia, to kind of how the ground was laid for Brexit here
in the UK to kind of, I don't know if you remember, but the kind of migration crisis in Europe in
2015, which in large part was triggered by the indiscriminate bombing of civilians in Syria by President Putin,
basically the weaponization of migration, which then consequently kind of led to,
was one of the main reasons that Brexit happened as well.
And the kind of persistent feature in my geopolitical career was that technology was emerging as this macro geopolitical force and that it wasn't just shaping
geopolitics on a very lofty and high level, but that it was also shaping the individual experience
of almost every single person alive. And I saw that again, also on the ground, so to speak,
in Nepal, where, you know, technology and the internet and smartphones have changed
society immeasurably in the past few decades. So I just increasingly started becoming interested
in technology as this kind of shaping force for society. And because I had been working
on information warfare, disinformation, actual wars, and information integrity, in 2017 I was advising the former
NATO Secretary General on emerging technology threats. And he was working in the context of
a group of global leaders, which actually included at the time, the former VP Joe Biden,
because they were kind of concerned about the 2020 election in the US,
given what had happened in 2016. And it was whilst I was working for this kind of group,
which was, you know, in part looking at disinformation, but also trying to
forge and strengthen the transatlantic alliance, in the view that only if Europe and the United
States are united, can they kind of stand up against the
authoritarian forces of Putin and China, etc., etc. But that's when I first saw emerging this
so-called phenomenon of deepfakes, right? It started emerging at the end of 2017. And I
immediately sensed that this was something that would be really, really important because it was the first time we're seeing AI create new data.
Right. And the first use case, a malicious one, was non-consensual pornography.
As soon as it became possible, so as soon as these kind of research advances started leaking out of the AI research community and enthusiasts started using them on the internet,
the first thing they started to do was to generate content, i.e. deepfakes, in the form of
non-consensual pornography. And I immediately sensed, though, that this wasn't just kind of a tawdry women's issue, even though this was undeniably, in this instance, something that was
targeted against women, but almost a civil liberties issue,
because if you can now clone anybody with the right training data and basically use AI to kind
of recreate their biometrics, then this is potentially going to be a huge problem.
So that was kind of the seed. That's what led me to write my book on deep fakes and information
integrity and the inundation of synthetic content and what that would do in an
already very corrupt and corroded information ecosystem. But really what my reflection is now
is, you know, that was just really my starting point into the world of AI and generative AI,
because from that point on, when I wrote the book, I kind of stopped working with the global leaders
and on the policy side of things because I was just so
fascinated by what was happening with these kind of new developments in AI that I've just been
concentrating on that. And although the starting point was mis- and disinformation, over the past
few years, my reflection has been that it is so much more than that. Mis- and disinformation is one very important part of the story.
But if you think about generative AI, as is becoming clear now, and again,
we're only at the very beginning of the journey, I think it's really more profound than that. I
think it's almost a tipping point for human society. Well, you have waved your hands in
the direction of many emerging problems here. And over and against all of those, there's the question of what to do about it.
And regulation is one of the first words that comes to mind.
And yet regulation, speaking from a U.S. point of view, and maybe the same is true in the U.K. now politically, but in the U.S.,
regulation is a bad word for at least half of the society. And especially in this area,
it seems to be in zero-sum collision with free speech, right? So there are many people who are,
you know, center, right of center, who are especially focused on this issue. There's a kind of silencing
of dissent. There's an effort on the part of big tech and big corporations generally,
the pharmaceutical industry, their messaging during the public health calamity of COVID,
the government adjacent to all of that. There are these elite interests that seem to want
to get everyone to converge inevitably prematurely on certain canonical facts, many of which it is
feared turn out not to be facts. They turn out to be politically correct dogmas, taboos, various mind viruses that we don't want to get anchored to,
it is argued. And I'm certainly sympathetic with some of that, but like you, I've grown
increasingly worried about disinformation, misinformation, information integrity generally.
So I'm just wondering what you think about the tension there, because we're going to talk about the problem in detail now, and some part of a response to it is going to include
some form of regulation. Many people at this point, it seems to me, just don't want to even
be party to that conversation. The moment you begin going down that path, all of the conspiratorial red flags get waved or imagined, and half the audience thinks that the World Economic Forum and specific malefactors, you know, puppeteers pulling the strings of society, will be in control, or at least will be struggling to maintain control of our collective mind.
So what do you think about the tensions and trade-offs there?
Yeah, I mean, I think you've just really hit the nail on the head there when you talk about
how difficult, what a kind of quagmire, this is. And it's difficult for many reasons. Firstly,
because when you talk about regulating AI, you know, it's so vast. It's just like
talking about regulating society or regulating the economy. So we kind of have to break it down
into component parts that are easier, I guess, to conceptualize and understand.
And that's hard with generative AI in particular, because it's so nascent.
I mean, I've been following it almost from kind of day one in terms of the research breakthroughs,
which really started emerging in 2014, 2015.
But I would say that it's really only in the last 12 months
that the capability of some of these foundational models, right, and what they can do for data generation in all digital mediums, whether it's text, video, audio, every kind of form of information, and the implications that has on the kind of future of human creative and intelligent work, have become clear. I mean,
it's so profound that I've really come to see the other side in
the sense that it's no longer only about disinformation, but this is also potentially
a tremendous economic value add. This is potentially also a huge area for scientific
research and insight generation. And you're already starting to see some very interesting
use cases emerging in enterprise and in research. So I'm sure we'll get into that later. So when you talk about
regulating it, I do have some sympathy for this, if you want to call it the worldview where people
are just a little bit sick of politicians, sick of kind of sweeping
statements, because already you see the same thing happening with regards to AI, right?
Where you have a lot of political leaders who perhaps don't have a background in artificial
intelligence, but they understand that this is going to be an important factor.
And they kind of are doing a lot of grandstanding, almost saying,
well, you know, we're going to build safe AI, and we're going to put together a global agency.
And without much substance, you can see why people start to get pretty
cynical. That being said, does this need to be regulated? Absolutely. Because I can't think of
a more kind of profound technology, exponential technology that's going
to change the entire framework of society. But to start regulating it, well, I guess we need to start
breaking it down into its constituent parts. And that's so difficult because A, it's still nascent,
we don't understand the full capabilities of the technology. And B, because of the exponential acceleration and adoption. I mean, if you consider,
I almost conceive of this, I mean, obviously, this is a continuum, and it's kind of an exponential
curve. But if there was one moment that's completely changed everything, if I had to
pinpoint one moment, I would say it is the moment that ChatGPT came out, right? You can almost see the world as pre-ChatGPT and post-ChatGPT,
not because, you know, OpenAI was the first to pioneer large language models. And Yann LeCun,
the AI chief at Meta, kind of famously or infamously came out at the time and was like,
it's not that innovative. And he got absolutely panned. Because whilst he was correct in the sense that
they weren't the first to pioneer large language models,
it kind of misses the point because that changed the entire debate
in terms of both public perception, but also the market moving.
So in the past kind of six months, we've seen all of big tech,
every single big tech company fundamentally and strategically
pivot to make generative AI a core part of their strategy. The kind of emerging enterprise use
cases are truly astounding. So I think this is the calm before the storm, or this is probably
the last moment, I would say, before we really start seeing AI being integrated into almost
every type of human knowledge work. So when you're thinking of the pace and scale of change
at this rate, you know, policymakers having worked with many policymakers for many years,
they're always kind of on the back foot anyway. But faced with challenges like this, you know, it's very,
very difficult, not least because there is a huge skills gap, not only in the companies that are building the technology, we hear this all the time about the AI skills gap and so on, but also on the
regulatory side, who actually understands this? Who actually can foresee what the implications might be?
So the only kind of piece of transnational regulation that's in the works
right now is coming from the European Union. And this is kind of a gargantuan piece of legislation.
It's meant to be the kind of first regulatory blueprint, if you will, on artificial intelligence,
been in the works for years.
But until ChatGPT came out, it made no reference to generative AI or foundational models.
Now, they very quickly redrafted it because they understood that, you know, this is really
important, but that's only going to come into force in 2026.
So I think one of the consistent
reflections, and you must have, I know that you've had this reflection as well, if you consider what's
been happening over the past few years, is just how quickly all of this has unfolded. So all AI
researchers I talked to, we all knew, or they knew that this was kind of hypothetically within the
realm of the possible.
But everyone always says we didn't think we'd be here now. And just trying to keep up with the research papers that are coming out every single day, the new companies, the amount of money flowing into the space, the kind of market-moving impetus started by the tech companies by actually commercializing and productizing these tools and bringing them to a market of hundreds of millions of people, you know, yeah, regulators have a hard task on their hands.
We could probably class the spectrum of problems into two bins here. And so the deepest, which
perhaps we won't even talk about unless you especially want to touch it. I mean, it's something that I've spoken about before and will continue to cover on this podcast.
But the deepest concern is what often goes by the name of existential risk here.
Is there something fundamental about the development of AI that poses a real threat, not just to the maintenance of democracy and the maintenance of civilization, but to the further career of our species? And, you know, I'm convinced that there is a problem
here worth worrying about, and therefore there will be some regulation of a sort that, you know,
we've used for, you know, nuclear proliferation or the spread of
the tools of synthetic biology. And, you know, I don't think we've done either of those especially
well, but here it's even harder. And, you know, that's a separate conversation perhaps.
Then there are all of the piecemeal near-term and truly immediate threats of the sort that we've
just begun to speak about that go under the banner of
information integrity and cyber hacking and cyber terrorism and just any malicious use of
narrow AI that can really supercharge human conflict and confusion. And this can, short of being an
existential threat, it can be an enormous threat, which is certainly worth worrying about.
So, but then obviously there, the reason why this is such an interesting conversation is that
that's only half of the story. The other half, as you point out, is all of the good things we can expect to do and build
and enjoy on the basis of increased intelligence. Generically, intelligence is the best thing we
have, right? It's the thing that differentiates us from our primate cousins. It's the thing that
safeguards everything else we care about, even if the things we care about can't be narrowly reduced to intelligence,
things like love and friendship and creative joy, etc. All of that is safeguarded from the casual malevolence of nature by intelligence. The fact that we have cures for any illnesses is the result
of intelligence. So obviously, we want a cure for cancer, we want a cure for Alzheimer's. If AI could give us those two things, the whole experiment
would be worth it already, apart from the possibility of our destroying everything
else we care about. So let's start where you and I really left off last time with the issue of deep fakes and I guess just
fakeness in general. I mean, it wasn't until the emergence of ChatGPT that I suddenly glimpsed
the possibility that in very short order here, most of everything on the internet could be fake,
right? I mean, just most text could be fake, most image could be fake. I mean,
not, you know, not now, but, you know, maybe two years from now. I mean, just when you look at how
quickly you can produce fake faces and fake video and fake journal articles, and that's just an
amazing prospect. So tell me how you've been viewing the development of these tools in the last six to 12 months. And what are the crucial moments with respect to deep fakes and other fake material that you've noticed?
Yeah, so I think of the risks broadly in those two buckets, the kind of AGI or existential risk scenario. And I'd like to have a conversation on that with you too. So maybe we can go back to that.
But as to the second bucket, which is almost the short and medium term risks,
things that are actually materializing right now. And foremost in that bucket, in my mind,
is without a doubt, information integrity. Now, this is kind of the main thesis of my book,
which came out three years ago. And that was, you know, so much has happened since then. At the time,
generative AI hadn't even been coined as a phrase. And although people don't usually put ChatGPT and deep fakes into the same sentence, they're actually manifestations of the same phenomenon, right? The same kind of new capabilities, quote unquote, of AI to be able to generate new data. Now, the really
interesting thing is that when these capabilities started kind of coming out of the research
community at the end of 2017, people started making these non-consensual pornographic creations with them and memes and kind of visual content.
So much of the premise of my book was focused on visual media.
But of course, at the same time, concurrently, a lot of work was going into the development of large language models.
I mean, Google was really pioneering work in large language models in 2017.
However, they kept it behind closed doors, right?
It wasn't out there.
This is perhaps why nobody was really talking about text.
And although we kind of registered it, I know you've spoken about the GPT series before ChatGPT came out. We've spoken about large language models, and I knew it was kind of on my radar. It was really only when GPT-3 and GPT-3.5 and now GPT-4 and basically ChatGPT came out that you truly
understood, first of all, the significance of a large language model and what it can do to scale mis- and disinformation. I mean, it's truly incredible how convincing it is.
And we were thinking about visual content as the most convincing, you know, AI-generated
video of people saying and doing things they've never done.
Anyone can be cloned now, whether it's your voice or your face, but we hadn't really considered
or put text on an equal footing.
But of course, that makes so much sense.
You know, we're storytellers. That's something that goes back to the earliest days of civilization.
And the problem, I suppose, is that there's been a lot of thinking on kind of the visual components
of AI-generated content and what we can do to kind of combat the worst risks.
And I'll come back to that, but less on text.
And if you think about the solutions on kind of the disinformation piece, the thinking really started with deepfakes.
Initially, people started thinking, okay, what we need to do is we need to detect it,
right?
We need to be able to build an
AI content detector that can detect everything that's made by AI. So then we can definitively
say, okay, that's AI generated. And that's not, that's synthetic. That's authentic.
Turns out that in practice, building detection is really, really difficult. First, there's no one-size-fits-all detector. There are now hundreds of thousands of generative models out there, and there's never going to be one detector that can detect all synthetic content. Second, it can only give you a percentage of how likely it thinks something is generated by AI or not. So, okay, I'm 90% confident, I'm 70% confident. So there's always a chance for a false positive or a false negative.
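To make that second point concrete, here is a minimal, hypothetical sketch of what a detector's output looks like in practice; the scoring function, names, and numbers are invented for illustration and are not taken from any real detection product.

```python
# A minimal, hypothetical sketch (not any real product's API): an AI-content
# detector returns only a confidence score, so turning it into a yes/no label
# requires picking a threshold, and every threshold trades false positives
# against false negatives. The scoring function below is a toy stand-in.

def detector_score(text: str) -> float:
    """Stand-in for a real classifier; returns P(text is AI-generated)."""
    # A real detector would run a trained model here; this is illustrative only.
    return 0.72 if "regenerate response" in text.lower() else 0.35

def label(text: str, threshold: float) -> str:
    score = detector_score(text)
    verdict = "flagged as AI-generated" if score >= threshold else "not flagged"
    return f"{verdict} (score {score:.0%}, threshold {threshold:.0%})"

# Lowering the threshold catches more synthetic text (fewer false negatives)
# but mislabels more human writing (more false positives); raising it does the opposite.
print(label("An essay a student wrote by hand.", threshold=0.9))
print(label("An essay a student wrote by hand.", threshold=0.3))
```

The same trade-off is why a detector's 70% or 90% confidence can never be treated as a definitive verdict.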
And third, as you correctly point out,
and this is kind of,
was one of the points I made in my book
and actually over the past few years
has become something I've been speaking about a lot,
is that if you believe, as I do,
and as you already pointed out, that there will be some
element of AI creation going forward in all digital information, then it becomes a futile
exercise to try and detect what's synthetic, because everything is going to have some degree
of AI or synthetic nature within any piece of digital content. So the second approach,
kind of tech-led approach that's been emerging over the past few years, which is more promising,
is the idea of content provenance, right? And this is applicable to both synthetic or AI-generated
content, as well as authentic content. It's
about full transparency. So rather than being in the business of adjudicating what's true,
you know, this is real, this is not, it's about securing full transparency about the origins of
content in kind of almost the DNA of that content. So if it's authentic,
you can capture it using secure capture technology. And that will give you almost
kind of cryptographically sealed data about that piece of content, where it was created,
who it belongs to, but the same principle should also be applied to AI generated content.
And of course, not everyone's going to do that. But the difference
here is that if you are a good actor, right, so if you are an OpenAI, or you're a Stable Diffusion, or, you know, you're Coca-Cola, and you want to use AI-generated collateral in your latest marketing campaign, then you should mark your content so everyone can see that this is actually synthetic. So the technology to do that already exists. And I should point out that it's much more than a watermark, because people say, yeah, okay, it's a watermark, watermarks can be edited or removed. But this kind of authentication technology, like I said, it's about cryptographically sealing it almost into the DNA of that content so it can never be removed. It's indelible. It's
there for kind of the world to see. But the second part to this, and this is really
where it starts getting tricky, is that it's no good signing your content in this way,
in full transparency, if you're a good actor, if nobody can see that kind of nutritional label,
right? So you've signed it, you put the technology in there. But like, if I view it on Twitter, am I going to see it? Or if
I see it on YouTube, will I see it? So the second point is that we actually need to build into the
architecture of the internet, the actual infrastructure for this to kind of become
the default, right, the content credentials. And there's a nonprofit organization called the C2PA,
which is already building this open standard for the internet.
And it has a lot of interesting founding members: Microsoft, Intel, the BBC, Arm.
And I guess we will see if this kind of becomes a standard because my view, so the projection, the estimate that I make is that 90% of online content is going to be generated by AI by 2025.
That's a punchy kind of figure, but I really believe that this is probably the last moment of the internet where the majority of the information and data you see online doesn't have some degree of AI in its creation.
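To make the content-provenance idea concrete before returning to the conversation, here is a minimal sketch of signed "content credentials" in the spirit of what Nina describes: provenance metadata is bound to a hash of the content and signed, so tampering with either the media or its label is detectable. The names, key handling, and HMAC shortcut are illustrative assumptions, not the actual C2PA standard, which embeds a certificate-signed manifest in the file itself.

```python
# A minimal sketch of "content credentials": bind provenance metadata to a hash
# of the content and sign the result, so tampering with either the media or its
# label becomes detectable. Illustrative only; real systems (e.g. C2PA) embed a
# certificate-signed manifest in the file, whereas here an HMAC with a demo key
# stands in for a proper signature.
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key-not-for-real-use"  # stand-in for a signer's private key

def issue_credential(content: bytes, metadata: dict) -> dict:
    manifest = {"content_sha256": hashlib.sha256(content).hexdigest(), **metadata}
    payload = json.dumps(manifest, sort_keys=True).encode()
    signature = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return {"manifest": manifest, "signature": signature}

def verify_credential(content: bytes, credential: dict) -> bool:
    payload = json.dumps(credential["manifest"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, credential["signature"]):
        return False  # the label itself was forged or altered
    return credential["manifest"]["content_sha256"] == hashlib.sha256(content).hexdigest()

image = b"...pixels of an AI-generated marketing image..."
cred = issue_credential(image, {"creator": "ExampleBrand", "generator": "some-image-model"})
print(verify_credential(image, cred))             # True: content still matches its credential
print(verify_credential(b"edited pixels", cred))  # False: content no longer matches
```

In a real deployment the credential would travel inside the file and be surfaced by platforms as the "nutritional label" Nina mentions, rather than being checked against a shared demo key.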
So this issue of authentication rather than detecting fakes, it's an interesting flip of the approach, and it presents a kind of bottleneck. I'm wondering, does this suggest that this will be an age of new gatekeepers, where the promise at the moment, for those who
are very bullish on everything that's happening, is that this has democratized creativity and
information creation just to the ultimate degree.
But if we all get trained very quickly to care about the integrity of information and our approach to finding legitimate information, whatever its provenance,
whether it's been synthesized in some way or whether it purports to be actually human-generated,
if the approach to safeguarding the integrity is authentication rather than the detection of misinformation,
how do we not wind up in a world where you can't trust an image unless it came from Getty Images, say,
or it was taken on an iPhone, right?
Like the cryptographic authentication of information,
is this something that you imagine is going to lead to a new siloing and gatekeeping? Or is it going to be like, you know, blockchain-mediated,
and everyone will be on this, you know, all fours together dealing with content, however,
you know, people will be able to create it, you know, outside of some walled garden or inside of a major corporation, and everyone will have access to the same authentication tools?
Well, I think the scope of the problem is so vast. And in part, because over the past 30 years, we've created this kind of digital information ecosystem, wherein everybody and everything must now exist. It
doesn't really matter if you're an individual, whether you're an organization or enterprise or
a nation state, you don't really have the choice not to be engaged and be doing things within this ecosystem.
So the very possibility that the medium by which information and transactions and communications and interactions,
so all digital kind of content and information could be compromised to the extent that it becomes untrustworthy,
that's a huge problem, right? So trying to ensure that we build an ecosystem, or a part of an ecosystem, right, where we can actually trust the information you engage with online is going to be critical to society, to business, on every kind of level conceivable. Detection
will always play a role, by the way, I think. It's just that it's not the only solution.
Let me just take a very narrow but salient case now. Just imagine at some point today,
a video of Vladimir Putin claiming that he is about to use tactical nukes in the war in Ukraine emerges online.
And, you know, the New York Times is trying to figure out whether or not to write a story on it, react to it, spread it.
Clearly, there's a detection problem there, right?
We have this one video that's spreading on social media, and to human eyes, it appears totally authentic, right? I think it's uncontroversial to say that
if we're not there now, we will be there very soon and probably in a matter of months, not years,
where we'll have video of Putin or anyone else where it will literally be impossible for a person to detect some content in which somebody's biometrics are synthesized.
So it's a visual representation of a person saying and doing things they didn't do or a completely synthetic person.
They're already so sophisticated that we can't tell.
And whilst we're talking about the nuclear political scenario, there are already really malicious use cases cropping up, and this kind of vishing has become a big deal. So this is kind of phishing using people's biometrically cloned
voice, you know, the age old scam of your loved one calling you because they've had an accident
or they're in jail and they need to be bailed out. Now, imagine you get that call
and it's actually your son's voice you hear or your wife's voice or your father's voice. Are you
going to send the money? Hell yeah, you're going to send the money because you believe your loved
one is in trouble. And you can already synthesize voices with as little as three seconds of audio. When I started looking at this field back in 2017, you needed hours and hours and hours of training data to try and synthesize a single voice, so you could only do people who are really in the public eye, like you, Sam. You know, your entire podcast repertoire would, you know, have been a good basis for training data. But you don't need to be Sam Harris, you know, to do this. You just need three seconds, which you could probably scrape off Instagram, off YouTube, off LinkedIn. So you're already seeing that, right?
One question here, Nina, are we, with respect to video, are we truly there? Because the best
thing I've seen, and I think this is, you know, most people will have seen this, are these
Tom Cruise videos, which are fake, but they're somewhat gamed because, if I understand
correctly, the person who's creating them already looks a lot like Tom Cruise, and he's almost like
a Tom Cruise impersonator, and he's mapping the synthetic face of Cruise onto his own facial
acting. And so it's very compelling. It never looks truly perfect to me,
but if you weren't tipped off
that you should be paying close attention,
it would probably pass for almost everyone.
But are we to the point now
where absent some biasing scheme like that
where you have an actor at the bottom of it,
you can create video of anyone that is
undetectable as fake? It's a much more difficult challenge. And the deep Tom Cruise videos that started emerging
went viral on TikTok in 2021. You're absolutely right. Because that creator was, first of all,
he was doing a lot of VFX and AI. It's not that the AI was kind of autonomously creating it. And
he was working with an actor who was a Tom Cruise impersonator, right? So he was just mapping it onto his face,
which is why the fidelity looked highly convincing when it came out in 2021.
Video is still a harder challenge. But already now, there are consumer products on the market
where you can send 20 seconds of you talking into your cell phone. And from that,
they can create your own kind of personalized avatar. So the point is that whilst the barriers to entry on synthetic video generation are still higher, they're coming down very quickly. And like I said, the kind of market for your AI avatar is already thriving.
And that requires about 20 seconds of video.
And where do the visual foundational models fit in here, like DALL-E and Midjourney and
Stable Diffusion?
Are they the source of good deepfakes now?
Or are they not producing that sort of thing?
Yeah, so it's been a really interesting shift
because when deepfakes first started coming out in 2017,
it was more that this was now a kind of tool
that enthusiasts, ML and AI enthusiasts,
perhaps those with a background in VFX
could use to create content.
And they started doing it to basically clone
people, right? But there was no kind of model or foundational model in order to be able to do this.
Then pretty soon, I think it was in 2018, NVIDIA released this model called StyleGAN,
and that could generate endless images of human faces. So it had been trained on a vast data set of human faces.
So every time you kind of, you might have gone to that website a few years ago called thispersondoesnotexist.com.
Yeah.
It was astonishing, right?
Because every time you refreshed the page, you'd be given what looked like an entirely convincing photograph of somebody who was entirely synthetic and AI generated.
Although I remember the tell there was that they could never quite get the teeth right.
The teeth always looked somewhat deformed.
The teeth, the ears. Exactly. There was this giveaway, telltale signs, right?
So when you saw the best in class kind of productions or creations like the deep Tom Cruise videos in 2021,
there was a high level of post-production and VFX
and a bit of AI. So, you know, this wasn't still democratized to the extent where anybody could do
it. But what you've been seeing over the past 12 months is the emergence of these so-called
foundational models. Now, these are interesting because they are not task-specific,
they're general purpose. And they are trained on these vast, vast data sets. You can almost
conceive of it as the entirety of the internet. So the ones you just mentioned,
DALL-E 2, Stable Diffusion, Midjourney, they're all text-to-image generators, right? And they're so compelling because the user experience is phenomenal
because they have NLP tied into them so that when we use them, how do we get them to create
something? Well, we prompt them. We just type what we want them to create. So now all of a sudden,
you have these foundational models that can generate images of anybody or anything
So, yeah, you know, the really sophisticated deepfakes you've seen doing the rounds on the internet recently, whether it was the ones of Donald Trump, um, kind of just before his arraignment, you know, fighting off the police, those ones. Yeah, yeah, yeah. Those were created by Midjourney v5, or the ones of the Pope in the Balenciaga jacket. So there's been an incredible amping up of the capability, shall we say, because before it was quite piecemeal and you had to do this and that, but there was no foundational model where you could just type in what you wanted it to create and, boom, it would come out. Now, does that exist
for video yet? Not yet. Is it going to come? Invariably. And that's actually what ChatGPT is
as well, right? It's one manifestation of a foundational model for text. And that's one of
the reasons why it has just been so compelling. It's that user experience. Hey, I can just have
a conversation with it. I can just type in there and then it can like create anything I ask it to. And it's the same concept for the foundational models for image generation and video generation is next.
So, and I must say it lands differently now. Prescient.
Yeah.
I mean, it's like, I forgot what I thought about it 10 years ago, but, you know, it's
all too plausible now.
And it's already happening, Sam.
Yeah, yeah.
But the thing that's hard to get away from, I mean, obviously there are some benefits to this sort of thing. I mean, if you could have a
functionally omniscient agent in your ear whenever you want, many good things could come of that.
But there is something, it's a vision of bespoke information where no one is seeing or hearing the same thing anymore, right? So there's
a siloing effect where if everyone has access to an oracle, well, then that oracle can create
a bespoke reality with or without hallucinations. I mean, your preferences can be catered to
in such a way that everyone can be... I mean, to some degree, this has already happened,
but the concern gets sharpened up considerably when you think about the prospect of all of us
having an independent conversation with a superintelligence that is not constrained to get everyone to converge or agree or to even find one another interpretable, right?
I already feel like when I see, I'm no longer on social media, so I have this experience less now,
but when I was still on Twitter, I had the experience of seeing people I knew to some
degree behaving in ways that were less and less interpretable to me. I mean,
they were seeming more and more irrational. And I realized, well, I'm not looking over their
shoulder seeing their Twitter feed. I don't see the totality of what they're feeling informed by
and reacting to. And I just see, basically, from my bubble, they appear to be going crazy, and everyone is redshifted and quickly vanishing over some horizon from the place where I am currently sitting. And no doubt that I'm doing the same thing for them.
And it is an alarming picture of a balkanization of our worldview.
And it's, yeah, I guess the variable there really is the coming bespokeness
of information. And somebody, I think it was Jaron Lanier, you know, flagged this for me some years ago where he said, you know, just imagine if you went to Wikipedia and no one was actually seeing the same thing. You look up an article on anything, World War II, and that is curated purely for you.
No one is seeing that same article.
The Wikipedia ground truth as to the causes and reality of World War II was written for you, catering and pandering to your preferences and biases, and no one else
has access to that specific article. It seems that we're potentially stumbling into that world
with these tools. Oh, absolutely. This kind of hyper-personalization, or the audience of one, and you already kind of see some of the early manifestations of that. So you're talking about Her. So what we have already, of course, after ChatGPT came out, some of the kind of more nefarious things that started immediately being built, because it's going to get very weird when it comes to love, sex, and relationships, are these kind of girlfriend bots, right? For very lonely men. And these kind of chatbots can cater to their every sexual fantasy. There was actually a company called Replika, which also did these avatars, and people were using them as this kind of girlfriend, you know, being very inappropriate and using them for their fantasy. So they had to kind of change the settings, if you will, and reboot and kind of make sure that their avatar
didn't include the chatbot abilities that would kind of relate to any kind of intimate relationships.
But it's also true, of course, from the point you are making.
By the way, well done on leaving Twitter.
There's nothing left there. So
I'm sure your life is a lot better for that. But if you think about radicalization and online
radicalization, this is actually something that I already read in a paper years ago, because it was
a piece of research done by the Middlebury Institute's Center on Terrorism, Extremism, and Counterterrorism. And it was looking at an early forebear to ChatGPT, I think it was GPT-3; they had tested it and seen how it performed
as a radicalization agent, right? And we know so many people are radicalized online.
Now, imagine we're just talking right now about the capabilities of very sophisticated
chatbots that are going to become even more sophisticated and be able to fulfill your
every sexual fantasy or to be able to groom you, to radicalize you.
And the next step when we talk about the capabilities of these generative models is so-called multimodal
models, right?
Because right now, the foundational models are still kind of broken up into the type of content that they
generate. So you have large language models, you have image generators, you have audio generators,
you have the kind of growing video generators, although they're obviously not as sophisticated
as the text or image yet. But the multimodal is when you can work with all those digital mediums in one.
So hypothetically, you can, if we're going to go back to the kind of virtual girlfriend scenario,
you know, you can not only chat to her, but you can see her, you can have photos generated of her.
Similarly, if we go back to the grooming kind of scenario, you know, you're being shown video,
you're being shown audio. Whatever your worldview is can be entrenched. So these are some of the darkest kind of manifestations of this hyper-personalization. On the kind of more benign side, I think people in the entertainment world are very excited about it because they're like, oh, this is the ability to create an audience of one. So if you like the Harry Potter books and you want to find out more about Dumbledore,
you can say, I want to have some more backstory generated for Dumbledore and I want to know
about where he was born and what was his mother's backstory.
And it could just hypothetically generate that for you in real time. But one area where this hyper-personalization is really promising is in medicine. And interestingly, we talked about chatbots. And there have been some interesting trials on kind of using a chatbot as almost an assistant, or a friend, or a voice for people who are anxious or depressed.
And the early kind of indicators have been that it can be helpful. Now, I don't know,
I guess it's a philosophical point, whether you think you should treat people with mental health
issues with a chatbot, you know, whether the benefits outweigh the risks, that's not for me
to decide. But in terms of like the hyper-personalization of potential medical treatment
for people based on their own data, that's something that I am really interested in.
Yeah. Yeah. I mean, you add, you know, full genome sequencing to that. Exactly. It's very interesting. On the multimodal model front, do you have a
sense of how far away we are from the all-encompassing tool where you could sit down and say,
give me a 45-minute documentary proving that the Holocaust didn't happen using all kinds of archival footage of Hitler and everyone else,
and make it in the style of a Ken Burns documentary. And with that prompt, it'll spit out a totally compelling 45-minute video
with essentially perfect fake sourcing of archival imagery and all the rest.
So I think to have a 40-minute completely synthetically generated video,
we're still a way off that.
But having said that, you know, I've been kind of working with the research community.
If you'd like to continue listening to this conversation,
you'll need to subscribe at SamHarris.org.
Once you do, you'll get access to all full-length episodes
of the Making Sense podcast,
along with other subscriber-only content,
including bonus episodes and AMAs
and the conversations I've been having on the Waking Up app.
The Making Sense podcast is ad-free
and relies entirely on listener support.
And you can subscribe now at SamHarris.org.