Making Sense with Sam Harris - #326 — AI & Information Integrity

Episode Date: July 6, 2023

Sam Harris speaks with Nina Schick about generative AI and information integrity. They discuss the challenges of regulating AI, authentication vs detection, fake video, hyper-personalization of information, the promise of generative design, productivity gains, disruptions in the labor market, OpenAI, and other topics. If the Making Sense podcast logo in your player is BLACK, you can SUBSCRIBE to gain access to all full-length episodes at samharris.org/subscribe. Learning how to train your mind is the single greatest investment you can make in life. That’s why Sam Harris created the Waking Up app. From rational mindfulness practice to lessons on some of life’s most important topics, join Sam as he demystifies the practice of meditation and explores the theory behind it.

Transcript
Starting point is 00:00:00 Welcome to the Making Sense podcast. This is Sam Harris. Just a note to say that if you're hearing this, you're not currently on our subscriber feed, and will only be hearing the first part of this conversation. In order to access full episodes of the Making Sense podcast, you'll need to subscribe at samharris.org. There you'll find our private RSS feed to add to your favorite podcatcher, along with other subscriber-only content. We don't run ads on the podcast, and therefore it's made possible entirely through the support of our subscribers. So if you enjoy what we're doing here, please consider becoming one.
Starting point is 00:00:53 Just a brief note to say that my last podcast on RFK Jr. has been jailbroken, which is to say the audio that's now on the free feed is the same as the subscriber audio. There's no longer a paywall. This came in response to some heartfelt appeals that we make it a public service announcement, which we've now done, so feel free to forward that to your friends or anyone who you think should hear it. There have also been several articles that have come out in recent days about RFK that entirely support the points I make there. So nothing to retract or amend as far as I know. Okay. Today I'm speaking with Nina Schick. Nina has been on the podcast before. Nina is an author and public speaker who wrote the book Deepfakes. She is an expert on current trends in generative AI. She advises several technology companies and frequently speaks at conferences.
Starting point is 00:01:49 She has spoken at the UN and to DARPA and many other organizations. And in addition to generative AI, she's focused on geopolitical risk and the larger problem of state-sponsored disinformation. And today we speak about the challenge of regulating AI, authentication versus the detection of fakes, the problem of fake video in particular, the coming hyper-personalization of information, the good side of AI, productivity gains, etc.
Starting point is 00:02:20 We talk about possible disruptions in the labor market, OpenAI as a company, and other topics. Once again, it was a pleasure to speak to Nina. Unfortunately, in only one of the seven languages she speaks. But for better or worse, English is where I live. And now I bring you Nina Schick. I am here with Nina Schick. Nina, thanks for joining me again.
Starting point is 00:02:48 It's great to be back, Sam. So you were on, I forget when you were on. It's a couple of years now. Do you have the year in memory? Yeah, it was in 2020. So just before kind of the new wave of what we're seeing really started to emerge. So yes, you came on to talk about your book, Deepfakes, which was all too prescient of our current concerns. But in the meantime, we have this new phenomenon of generative AI,
Starting point is 00:03:18 which has only made the problem of deep fakes more profound, I would imagine. And I think that'll be the main topic of today's conversation. But before we jump in, can you just remind people what you've been doing these many years? What have been your areas of focus and what's your background? Sure, absolutely. So I'm half Nepalese and I'm half German. I actually grew up in Kathmandu, but eventually I came to live in the UK. So I'm based in London right now. And I was always really interested in politics and geopolitics. So really, for the kind of first two decades of my career, I was working in geopolitics. And it just so happened, at the end of the 2000s. And throughout the 2010s, I happened to work on a lot of kind of seismic events, if you will,
Starting point is 00:04:05 from the original annexation of Crimea by Russia, to kind of how the ground was laid for Brexit here in the UK to kind of, I don't know if you remember, but the kind of migration crisis in Europe in 2015, which in large part was triggered by the indiscriminate bombing of civilians in Syria by President Putin, basically the weaponization of migration, which then consequently kind of led to, was one of the main reasons that Brexit happened as well. And the kind of persistent feature in my geopolitical career was that technology was emerging as this macro geopolitical force and that it wasn't just shaping geopolitics on a very lofty and high level, but that it was also shaping the individual experience of almost every single person alive. And I saw that again, also on the ground, so to speak,
Starting point is 00:05:01 in Nepal, where, you know, technology and the internet and smartphones have changed society immeasurably in the past few decades. So I just increasingly started becoming interested in technology as this kind of shaping force for society. And because I had been working on information warfare, disinformation, actual wars, and information integrity. In 2017, I was advising the former NATO Secretary General on emerging technology threats. And he was working in the context of a group of global leaders, which actually included at the time, the former VP Joe Biden, because they're kind of concerned about the 2020 election in the US, given what had happened in 2016. And it was whilst I was working for this kind of group,
Starting point is 00:05:52 which would, you know, in part, we're looking at disinformation, but they're also trying to forge and strengthen the transatlantic alliance, in the view that only if Europe and the United States are united, can they kind of stand up against the authoritarian forces of Putin and in China, etc, etc. But that's when I first saw emerging this so-called phenomenon of deepfakes, right? It started emerging at the end of 2017. And I immediately sensed that this was something that would be really, really important because it was the first time we're seeing AI create new data. Right. And the first use case, so malicious, was a non-consensual pornography. As soon as it became possible, so as soon as these kind of research advances started leeching out of the AI research community and enthusiasts started using them on the internet,
Starting point is 00:06:45 the first thing they started to do was to generate content, i.e. deepfakes, in the form of non-consensual pornography. And I immediately sensed, though, that this wasn't just kind of just a tawdry women's issue, even though this was undeniably, in this instance, something that was targeted against women, but almost a civil liberties issue, because if you can now clone anybody with the right training data and basically use AI to kind of recreate their biometrics, then this is potentially going to be a huge problem. So that was kind of the seed. That's what led me to write my book on deep fakes and information integrity and the inundation of synthetic content and what that would do in an
Starting point is 00:07:25 already very corrupt and corroded information ecosystem. But really what my reflection is now is, you know, that was just really my starting point into the world of AI and generative AI, because from that point on, when I wrote the book, I kind of stopped working with the global leaders and on the policy side of things because I was just so fascinated in what was happening in this kind of new developments in AI that I've just been concentrating on that. And although the starting point was mis- and disinformation, over the past few years, my reflection has been that it is so much more than that. Mis and disinformation is one very important part of the story. But if you think about generative AI as what is becoming clear now, and again,
Starting point is 00:08:11 we're only at the very beginning of the journey, I think it's really more profound than that. I think it's almost a tipping point for human society. Well, you have waved your hands in the direction of many emerging problems here. And over and against all of those, there's the question of what to do about it. And regulation is one of the first words that comes to mind. And yet regulation, speaking from a U.S. point of view, maybe the same is true in the U.K. now politically. But in the U.S., regulation is a bad maybe the same is true in the U.K. now politically, but in the U.S., your regulation is a bad word for at least half of the society. And especially in this area, it seems to be in zero-sum collision with free speech, right? So there are many people who are,
Starting point is 00:08:58 you know, center, right of center, who are especially focused on this issue. There's a kind of silencing of dissent. There's an effort on the part of big tech and big corporations generally, the pharmaceutical industry, messaging into the public health calamity of COVID, the government adjacent to all of that. There are these elite interests that seem to want to get everyone to converge inevitably prematurely on certain canonical facts, many of which it is feared turn out not to be facts. They turn out to be politically correct dogmas, taboos, various mind viruses that we don't want to get anchored to, it is argued. And I'm certainly sympathetic with some of that, but like you, I've grown increasingly worried about disinformation, misinformation, information integrity generally.
Starting point is 00:10:00 So I'm just wondering what you think about the tension there, because we're going to talk about the problem in detail now, and some part of a response to it is going to include some form of regulation. Many people at this point, it seems to me, just don't want to even be party to that conversation. The moment you begin going down that path, all of the conspiratorial red flags get waved or imagined, and half the audience thinks that the World Economic Forum and specific malefactors, you know, puppeteers pulling the strings of society, will be in control, or at least will be struggling to maintain control of our collective mind. So what do you think about the tensions and trade-offs there? Yeah, I mean, I think you've just really hit the nail on the head there when you talk about how difficult what a kind of quagmire this is. And it's difficult for many reasons. Firstly, because when you talk about regulating AI, you know, it's so vast. It's just like talking about regulating society or regulating the economy. So we kind of have to break it down
Starting point is 00:11:14 into component parts that are easier, I guess, to conceptualize and understand. And with generative AI in particular, because it's so nascent. I mean, I've been following it almost from kind of day one in terms of the research breakthroughs, which really started emerging in 2014, 2015. But I would say that it's really only in the last 12 months that the capability of some of these foundational models, right? And what they can do for data generation in all digital medium, whether it's text, video, audio, every kind of form of information and the implications that has on the kind of future of human creative and intelligent work. I mean,
Starting point is 00:12:02 it's so profound that I've really come to see the other side in the sense that it's no longer only about disinformation, but this is also potentially a tremendous economic value add. This is potentially also a huge area for scientific research and insight generation. And you're already starting to see some very interesting use cases emerging in enterprise and in research. So I'm sure we'll get into that later. So when you talk about regulating it, I do have some sympathy for this, if you want to call it the worldview where people are just a little bit sick of politicians, sick of kind of sweeping statements, because already you see the same thing happening with regards to AI, right?
Starting point is 00:12:50 Where you have a lot of political leaders who perhaps don't have a background in artificial intelligence, but they understand that this is going to be an important factor. And they kind of are doing a lot of grandstanding, almost saying, well, you know, we're going to build safe AI, and we're going to put together a global agency. And without much substance, you can see why people start to get pretty cynical. That being said, does this need to be regulated? Absolutely. Because I can't think of a more kind of profound technology, exponential technology that's going to change the entire framework of society. But to start regulating it, well, I guess we need to start
Starting point is 00:13:30 breaking it down into its constituent parts. And that's so difficult because A, it's still nascent, we don't understand the full capabilities of the technology. And B, because of the exponential acceleration and adoption. I mean, if you consider, I almost conceive of this, I mean, obviously, this is a continuum, and it's kind of an exponential curve. But if there was one moment that's completely changed everything, if I had to pinpoint one moment, I would say it is the moment that ChatGPT came out, right? You can almost see the world as pre-ChatGPT and post-ChatGPT, not because, you know, OpenAI was the first to pioneer large language models. And Yan LeCun, the AI chief at Meta, kind of famously or infamously came out at the time and was like, it's not that innovative. And he got absolutely panned. Because whilst he was correct in the sense that
Starting point is 00:14:25 they weren't the first to pioneer large language models, it kind of misses the point because that changed the entire debate in terms of both public perception, but also the market moving. So in the past kind of six months, we've seen all of big tech, every single big tech company fundamentally and strategically pivot to make generative AI a core part of their strategy. The kind of emerging enterprise use cases are truly astounding. So I think where this is the calm before the storm, or this is probably the last moment, I would say, before we really start seeing AI being integrated into almost
Starting point is 00:15:06 every type of human knowledge work. So when you're thinking of the pace and scale of change at this rate, you know, policymakers having worked with many policymakers for many years, they're always kind of on the back foot anyway. But faced with challenges like this, always kind of on the back foot anyway. But faced with challenges like this, you know, it's very, very difficult, not least because there is a huge skills gap, not only in the companies that are building the technology, we hear this all the time about the AI skills gap and so on, but also on the regulatory side, who actually understands this? Who actually can foresee what the implications might be? And given, so the only kind of piece of transnational regulation that's in the works right now is coming from the European Union. And this is kind of a gargantuan piece of legislation.
Starting point is 00:15:59 It's meant to be the kind of first regulatory blueprint, if you will, on artificial intelligence, been in the works for years. But until ChatGPT came out, it made no reference of generative AI or foundational models. Now, they very quickly redrafted it because they understood that, you know, this is really important, but that's only going to come into force in 2026. So I think one of the consistent reflections, and you must have, I know that you've had this reflection as well, if you consider what's been happening over the past few years, is just how quickly all of this has unfolded. So all AI
Starting point is 00:16:38 researchers I talked to, we all knew, or they knew that this was kind of hypothetically within the realm of the possible. But everyone always says we didn't think we'd be here now. And just trying to keep up with the research papers that are coming out every single day, the new companies, the amount of money flowing into the space, the kind of market moving impetus started by the tech companies by actually commercializing and productizing these tools and bringing it to market of hundreds of millions of people, you know, yeah, regulators have a hard task on their hands. We could probably class the spectrum of problems into two bins here. And so the deepest, which perhaps we won't even talk about unless you especially want to touch it. I mean, it's something that I've spoken about before and will continue to cover on this podcast. But the deepest concern is what often goes by the name of existential risk here. Is there something fundamental about the development of AI that poses a real threat to the not just the maintenance of democracy and the maintenance of civilization,
Starting point is 00:17:46 but to the further career of our species. And, you know, I'm convinced that there is a problem here worth worrying about, and therefore there will be some regulation of a sort that, you know, we've used for, you know, nuclear proliferation or the spread of the tools of synthetic biology. And, you know, I don't think we've done either of those especially well, but here it's even harder. And, you know, that's a separate conversation perhaps. Then there are all of the piecemeal near-term and truly immediate threats of the sort that we've just begun to speak about that go under the banner of information integrity and cyber hacking and cyber terrorism and just any malicious use of
Starting point is 00:18:37 narrow AI that can really supercharge in a human conflict and confusion. And this can, short of being an existential threat, it can be an enormous threat, which is certainly worth worrying about. So, but then obviously there, the reason why this is such an interesting conversation is that that's only half of the story. The other half, as you point out, is all of the good things we can expect to do and build and enjoy on the basis of increased intelligence. Generically, intelligence is the best thing we have, right? It's the thing that differentiates us from our primate cousins. It's the thing that safeguards everything else we care about, even if the things we care about can't be care about, even if the things we care about can't be narrowly reduced to intelligence,
Starting point is 00:19:33 things like love and friendship and creative joy, etc. All of that is safeguarded from the casual malevolence of nature by intelligence. The fact that we have cures for any illnesses is the result of intelligence. So obviously, we want a cure for cancer, we want a cure for Alzheimer's. If AI could give us those two things, the whole experiment would be worth it already, apart from the possibility of our destroying everything else we care about. So let's start where you and I really left off last time with the issue of deep fakes and I guess just fakeness in general. I mean, it wasn't until the emergence of ChatGPT that I suddenly glimpsed the possibility that in very short order here, most of everything on the internet could be fake, right? I mean, just most text could be fake, most image could be fake. I mean, not, you know, not now, but, you know, maybe two years from now. I mean, just when you look at how
Starting point is 00:20:33 quickly you can produce fake faces and fake video and fake journal articles, and that's just an amazing prospect. So tell me how you've been viewing the development of these tools in the last six to 12 months. And what are the crucial moments with respect to deep fakes and other fake material that you've noticed? of the risks broadly in those two buckets, the kind of AGI or existential risk scenario. And I'd like to have a conversation on that with you too. So maybe we can go back to that. But as to the second bucket, which is almost the short and medium term risks, things that are actually materializing right now. And foremost in that bucket, in my mind, is without a doubt, information integrity. Now, this is kind of the main thesis of my book, which came out three years ago. And that was, you know, so much has happened since then. At the time, generative AI hadn't even been coined as a phrase. And although people don't usually put chat GPT
Starting point is 00:21:38 and deep fakes into the same sentence, they're actually manifestations of the same phenomenon, right? The same kind of new capabilities, new quote unquote, of AI to be able to generate new data. Now, the really interesting thing is that when these capabilities started kind of coming out of the research community at the end of 2017, people started making these non-consensual pornographic creations with them and memes and kind of visual content. So much of the premise of my book was focused on visual media. But of course, at the same time, concurrently, a lot of work was going into the development of large language models. I mean, Google was really pioneering work in large language models in 2017. However, they kept it behind closed doors, right?
Starting point is 00:22:28 It wasn't out there. This is perhaps why nobody was really talking about text. And although we kind of registered, I know you've spoken about GPT series before ChatGPT came out. We've spoken about large language models, and I knew it was kind of on my radar. out and we've spoken about large language models and I knew it was kind of on my radar. It was really only when GPT-3 and GPT-3.5 and now GPT-4 and basically chat GPT came out that you truly understand the, first of all, the significance of a large language model and what it can do to scale mis- and disinformation. I mean, it's truly incredible and how convincing it is. And we were thinking about visual content as the most convincing, you know, AI-generated
Starting point is 00:23:12 video of people saying and doing things they've never done. Anyone can be cloned now, whether it's your voice or your face, but hadn't really considered or put text on an equal footing. But of course, that makes so much sense. You know, we're storytellers. That's something that goes back to the earliest days of civilization. And the problem, I suppose, is that there's been a lot of thinking on kind of the visual components of AI-generated content and what we can do to kind of combat the worst risks. And I'll come back to that, but less on text.
Starting point is 00:23:50 And if you think about the solutions on kind of the disinformation piece around that really started thinking around was the thinking started with deep face. Initially, people started thinking, okay, what we need to do is we need to detect it, right? We need to be able to build an AI content detector that can detect everything that's made by AI. So then we can definitively say, okay, that's AI generated. And that's not, that's synthetic. That's authentic. Turns out that in practice, building detection is really, really difficult because first,
Starting point is 00:24:22 there's no one size fits all detector there are now hundreds of thousands of generative models out there and there's never going to be one size fits all detector that can detect all synthetic content second it can only give you a percentage of how likely it thinks that is generated by ai or not so okay i okay, I'm 90% confident. I'm 70% confident. So always has a chance for a false negative, false positive or false negative. And third, as you correctly point out, and this is kind of, was one of the points I made in my book
Starting point is 00:24:55 and actually over the past few years has become something I've been speaking about a lot, is that if you believe, as I do, and as you already pointed out, that there will be some element of AI creation going forward in all digital information, then it becomes a futile exercise to try and detect what's synthetic, because everything is going to have some degree of AI or synthetic nature within any piece of digital content. So the second approach, kind of tech-led approach that's been emerging over the past few years, which is more promising,
Starting point is 00:25:34 is the idea of content provenance, right? And this is applicable to both synthetic or AI-generated content, as well as authentic content. It's about full transparency. So rather than being in the business of adjudicating what's true, you know, this is real, this is not, it's about securing full transparency about the origins of content in kind of almost the DNA of that content. So whether if it's authentic, you can capture it using secure capture technology. And that will give you almost kind of a cryptographically sealed data about that piece of content, where it was created, who it belongs to, but the same principle should also be applied to AI generated content.
Starting point is 00:26:22 And of course, not everyone's going to do that. But the difference here is that if you are a good actor, right, so if you are a open AI, or you're a stable diffusion, or you know, you're your Coca Cola, and you want to use AI generated collateral in your latest marketing campaign, then you should mark your content so everyone can see that this is actually synthetic so the technology to do that already exists and i should point out that it's much more than a watermark because people are yeah okay it's a watermark watermarks can be edited or removed but this kind of authentication technology like i said it's about cryptographically sealing it almost into the dna of that content so it can never be removed. It's indelible. It's
Starting point is 00:27:06 there for kind of the world to see. But the second part to this, and this is really where it starts getting tricky, is that it's no good signing your content in this way, in full transparency, if you're a good actor, if nobody can see that kind of nutritional label, right? So you've signed it, you put the technology in there. But like, if I view it on Twitter, am I going to see it? Or if I see it on YouTube, will I see it? So the second point is that we actually need to build into the architecture of the internet, the actual infrastructure for this to kind of become the default, right, the content credentials. And there's a nonprofit organization called the C2PA, which is already building this open standard for the internet.
Starting point is 00:27:51 And a lot of interesting founding members, Microsoft, Intel, BBC, ARM. And I guess we will see if this kind of becomes a standard because my view, so the projection, the estimate that I make is that 90% of online content is going to be generated by AI by 2025. That's a punchy kind of figure, but I really believe that this is probably the last moment of the internet where the majority of the information and data you see online doesn't have some degree of AI in its creation. So this issue of authentication rather than detecting fakes, it's an interesting flip of the approach and it presents a kind of bottleneck. of the approach, and it presents a kind of bottleneck. I'm wondering, does this suggest that this will be an age of new gatekeepers, where the promise at the moment, for those who are very bullish on everything that's happening, is that this has democratized creativity and information creation just to the ultimate degree. But if we all get trained very quickly to care about the integrity of information and our approach to finding legitimate information, whatever its provenance,
Starting point is 00:29:17 whether it's been synthesized in some way or whether it purports to be actually human-generated, if the approach to safeguarding the integrity is authentication rather than the detection of misinformation, how do we not wind up in a world where you can't trust an image unless it came from Getty Images, say, or it was taken on an iPhone, right? Like the cryptographic authentication of information, is this something that you imagine is going to be lead to a new siloing and gatekeeping? Or is it going to be like, you know, blockchain mediated, and everyone will be on this, you know, all fours together dealing with content, however,
Starting point is 00:29:57 you know, people will be able to create it, you know, outside of some walled garden or inside of a major corporation, and everyone will have access to the same authentication tools? of the problem is so vast. And in part, because over the past 30 years, we've created this kind of digital information ecosystem, wherein everybody and everything must now exist. It doesn't really matter if you're an individual, whether you're an organization or enterprise or a nation state, you don't really have the choice not to be engaged and be doing things within this ecosystem. So the very possibility that the medium by which information and transactions and communications and interactions, so all digital kind of content and information could be compromised to the extent that it becomes untrustworthy, that's a huge problem, right? So trying to ensure that we build an ecosystem where we can actually trust the information or a part of an ecosystem, right? We can actually trust the information you
Starting point is 00:31:19 engage with online is going to be critical to society, to business, on every kind of level conceivable. Detection will always play a role, by the way, I think. It's just that it's not the only solution. Let me just take a very narrow but salient case now. Just imagine at some point today, a video of Vladimir Putin claiming that he is about to use tactical nukes in the war in Ukraine emerges online. And, you know, the New York Times is trying to figure out whether or not to write a story on it, react to it, spread it. Clearly, there's a detection problem there, right? We have this one video that's spreading on social media, and to human eyes, it appears totally authentic, right? It's like we have this one video that's spreading on social media and to human eyes, it appears totally authentic, right? I think it's uncontroversial to say that if we're not there now, we will be there very soon and probably in a matter of months, not years,
Starting point is 00:32:17 where we'll have video of Putin or anyone else where it will literally be impossible for a person to detect some content in which somebody's biometrics are synthesized. So it's a visual representation of a person saying and doing things they didn't do or a completely synthetic person. They're already so sophisticated that we can't tell. And whilst we're talking about the nuclear political scenario, it's already cropping up really malicious use cases and kind of the vishing has become a big deal. So this is kind of phishing using people's biometrically cloned voice, you know, the age old scam of your loved one calling you because they've had an accident or they're in jail and they need to be bailed out. Now, imagine you get that call and it's actually your son's voice you hear or your wife's voice or your father's voice. Are you
Starting point is 00:33:30 going to send the money? Hell yeah, you're going to send the money because you believe your loved one is in trouble. And you can already synthesize voices with up to three seconds of audio. When I started looking at this field back in 2017, you need hours and hours and hours of training data to try and synthesize a single voice so you could only do people who are really in the public eye like you sam you know your your entire podcast repertoire would you know have been a good basis for training data but you don't need to be sam harris and you know to to do this you just need three seconds which you could probably scrape off Instagram, off YouTube, off LinkedIn. So you're already seeing that, right?
Starting point is 00:34:09 One question here, Nina, are we, with respect to video, are we truly there? Because the best thing I've seen, and I think this is, you know, most people will have seen this, are these Tom Cruise videos, which are fake, but they're somewhat gamed because, if I understand correctly, the person who's creating them already looks a lot like Tom Cruise, and he's almost like a Tom Cruise impersonator, and he's mapping the synthetic face of Cruise onto his own facial acting. And so it's very compelling. It never looks truly perfect to me, but if you weren't tipped off that you should be paying close attention,
Starting point is 00:34:50 you'd probably pass for almost everyone. But are we to the point now where absent some biasing scheme like that where you have an actor at the bottom of it, you can create video of anyone that is undetectable as fake? It's a much more difficult challenge. And the deep Tom that started emerging went viral on TikTok in 2021. You're absolutely right. Because that creator was, first of all, he was doing a lot of VFX and AI. It's not that the AI was kind of autonomously creating it. And
Starting point is 00:35:24 he was working with an actor who was a Tom Cruise impersonator, right? So he was just mapping it onto his face, which is why the fidelity looked highly convincing when it came out in 2021. Video is still a harder challenge. But already now, there are consumer products on the market where you can send 20 seconds of you talking into your cell phone. And from that, they can create your own kind of personalized avatar. So the point is that whilst it's still the barriers to entry on synthetic video generation are still higher, they're coming down very quickly. And like I said, there's the kind of market for your AI avatar is already thriving. And that requires about 20 seconds of video.
Starting point is 00:36:09 And where do the visual foundational models fit in here, like DALI and MidJourney and Stable Diffusion? Are they the source of good deepfakes now? Or are they not producing that sort of thing? Yeah, so it's been a really interesting shift because when deepfakes first started coming out in 2017, it was more that this was now a kind of tool that enthusiasts, ML and AI enthusiasts,
Starting point is 00:36:38 perhaps those with a background in VFX could use to create content. And they started doing it to basically clone people, right? But there was no kind of model or foundational model in order to be able to do this. Then pretty soon, I think it was in 2018, NVIDIA released this model called StyleGAN, and that could generate endless images of human faces. So it had been trained on a vast data set of human faces. So every time you kind of, you might have gone to that website a few years ago called thispersondoesnotexist.com. Yeah.
Starting point is 00:37:14 It was astonishing, right? Because every time you refreshed the page, you'd be given what looked like an entirely convincing photograph of somebody who was entirely synthetic and AI generated. Although I remember the tell there was that they could never quite get the teeth right. The teeth always looked somewhat deformed. The teeth, the ears. Exactly. There was this giveaway, telltale signs, right? So when you saw the best in class kind of productions or creations like Deep Tome in 2021, there was a high level of post-production and VFX and a bit of AI. So, you know, this wasn't still democratized to the extent where anybody could do
Starting point is 00:37:53 it. But what you've been seeing over the past 12 months is the emergence of these so-called foundational models. Now, these are interesting because they are not task-specific, their general purpose. And they are trained on these vast, vast data sets. You can almost conceive of it as the entirety of the internet. So the ones you just mentioned, DALI 2, Stable Diffusion, Mid-Journey, they're all text-to-image generators, right? And they're so compelling because the user experience is phenomenal because they have NLP tied into them so that when we use them, how do we get them to create something? Well, we prompt them. We just type what we want them to create. So now all of a sudden, you have these foundational models that can generate images of anybody or anything
Starting point is 00:38:47 so yeah you know the the really sophisticated deep fakes you've been seeing during the rounds on the internet recently whether it was donald trump's um kind of just before his arraignment you know fighting off those ones yeah yeah yeah those were created by mid-journey v5 or the ones of the pope in the balenciaga jacket so there's been an incredible amping up of the capability shall we say because before it's quite piecemeal and you have to do this and that but like there was no foundational model where you could just type in what you wanted it to create and boom, it would come out. Now, does that exist for video yet? Not yet. Is it going to come? Invariably. And that's actually what ChatGPT is as well, right? It's a one manifestation of a foundational model for text. And that's one of
Starting point is 00:39:37 the reasons why it has just been so compelling. It's that user experience. Hey, I can just have a conversation with it. I can just type in there and then it can like create anything I ask it to. And it's the same concept for the foundational models for image generation and video generation is next. So, and I must say it lands differently now. Prescient. Yeah. I mean, it's like, I forgot what I thought about it 10 years ago, but, you know, it's all too plausible now. And it's already happening, Sam. Yeah, yeah.
Starting point is 00:40:15 But it's, and the thing that's, it's hard to get away from, I mean, obviously there's some benefits to this sort of thing. I mean, if you could have a functionally omniscient agent in your ear whenever you want, many good things could come of that. But there is something, it's a vision of bespoke information where no one is seeing or hearing the same thing anymore, right? So there's a siloing effect where if everyone has access to an oracle, well, then that oracle can create a bespoke reality with or without hallucinations. I mean, your preferences can be catered to in such a way that everyone can be... I mean, to some degree, this has already happened, but the concern gets sharpened up considerably when you think about the prospect of all of us
Starting point is 00:41:12 having an independent conversation with a superintelligence that is not constrained to to get everyone to converge or agree or to even find one another interpretable, right? I already feel like when I see, I'm no longer on social media, so I have this experience less now, but when I was still on Twitter, I had the experience of seeing people I knew to some degree behaving in ways that were less and less interpretable to me. I mean, they were seeming more and more irrational. And I realized, well, I'm not looking over their shoulder seeing their Twitter feed. I don't see the totality of what they're feeling informed by and reacting to. And I just see, basically, from my bubble, they appear to be going crazy. See, basically from my bubble, they appear to be going crazy, and everyone is redshifted and quickly vanishing over some horizon from the place where I am currently sitting. And no doubt that I'm doing the same thing for them.
Starting point is 00:42:15 And it is an alarming picture of a balkanization of our worldview. And it's, yeah, I guess the variable there really is the coming bespokeness of information. And somebody, I think it was Jaron Lanier, I think it was Jaron Lanier, you know, flagged this for me some years ago where he said, you know, just imagine if you went to Wikipedia and no one was actually seeing. You look up an article on anything, World War II, and that is curated purely for you. No one is seeing that same article. The Wikipedia ground truth as to the causes and reality of World War II was written for you, catering and pandering to your preferences and biases, and no one else has access to that specific article. It seems that we're potentially stumbling into that world
Starting point is 00:43:13 with these tools. Oh, absolutely. This kind of hyper-personalization or the audience of one, and you already kind of see some of the early manifestations of that so you you're talking about her so we have already of course after chat gpt came out some of the kind of more nefarious things that started immediately being built because it's going to get very weird when it comes to love sex and relationships are these kind of girlfriend bots right for very lonely men and these kind of chat bots can cater to their every sexual fantasy there was actually a company called replica which also did these avatars and they people were using them as this kind of girlfriend you know being very inappropriate and using them for their their fantasy. So they had to kind of change the settings, if you will, and reboot and kind of make sure that their avatar
Starting point is 00:44:09 didn't include the chatbot abilities that would kind of relate to any kind of intimate relationships. But it's also true, of course, from the point you are making. By the way, well done on leaving Twitter. There's nothing left there. So I'm sure your life is a lot better for that. But if you think about radicalization and online radicalization, this is actually something that I already read in a paper years ago, because it was a piece of research done by the Middlesbrough Institute of Terrorism. And it was looking at how an early forebear to chat GPT-4, I think it was GPT-3, they had tested it and seen how it performed
Starting point is 00:44:55 as a radicalization agent, right? And we know so many people are radicalized online. Now, imagine we're just talking right now about the capabilities of very sophisticated chatbots that are going to become even more sophisticated and be able to fulfill your every sexual fantasy or to be able to groom you, to radicalize you. And the next step when we talk about the capabilities of these generative models is so-called multimodal models, right? Because right now, they're still kind of broken up the foundational models into the type of content that they generate. So you have large language models, you have image generators, you have audio generators,
Starting point is 00:45:34 you have the kind of growing video generators, although they're obviously not as sophisticated as the text or image yet. But the multimodal is when you can work with all those digital medium in one. So hypothetically, you can, if we're going to go back to the kind of virtual girlfriend scenario, you know, you can not only chat to her, but you can see her, you can have photos generated of her. Similarly, if we go back to the grooming kind of scenario, you know, you're being shown video, you're being shown audio whatever your worldview is can be entrenched so these are some of the darkest kind of manifestations of this hyper personalization on the kind of more benign side i think people in the entertainment world
Starting point is 00:46:19 are very excited about it because they're like oh this is the ability to create an audience of one. So if you like the Harry Potter books and you want to find out more about Dumbledore, you can say, I want to have some more backstory generated for Dumbledore and I want to know about where he was born and what was his mother's backstory. And it could just hypothetically generate that for you in real time. But one area where this hyper personalization is really promising is in medicine. And interestingly, we talked about chatbots. And there have been some interesting trials on kind of using a chatbot as almost a assistant, or a friend or a voice to people who are anxious or depressed. And the early kind of indicators have been that it can be helpful. Now, I don't know, I guess it's a philosophical point, whether you think you should treat people with mental health issues with a chatbot, you know, whether that is the benefits outweigh the risks, that's not for me to decide. But in terms of like the hyper-personalization of potential medical treatment
Starting point is 00:47:31 for people based on their own data, that's something that I am really interested in. Yeah. Yeah. I mean, you add, you know, full genome sequencing to that. Exactly. It's very interesting. On the multimodal model front, do you have a sense of how far away we are from the all-encompassing tool where you could sit down and say, give me a 45-minute documentary proving that the Holocaust didn't happen using all kinds of archival footage of Hitler and everyone else, and make it in the style of a Ken Burns documentary. And with that prompt, it'll spit out a totally compelling 45-minute video with essentially perfect fake sourcing of archival imagery and all the rest. So I think to have a 40-minute completely synthetically generated video, we're still a way off that.
Starting point is 00:48:33 But having said that, you know, I've been kind of working with the research community. If you'd like to continue listening to this conversation, you'll need to subscribe at SamHarris.org. Once you do, you'll get access to all full-length episodes of the Making Sense podcast, along with other subscriber-only content, including bonus episodes and AMAs and the conversations I've been having on the Waking Up app.
Starting point is 00:48:57 The Making Sense podcast is ad-free and relies entirely on listener support. And you can subscribe now at SamHarris.org.
