How to Be a Better Human - How to stay grounded in an increasingly artificial world (from The TED AI Show)

Episode Date: May 27, 2024

Today, we’re sharing the first episode of the newest TED Audio Collective Podcast – The TED AI Show. Now before you think, “wait, isn’t artificial intelligence the opposite of being human?”... know that we are wondering that too! That’s what’s nice about The TED AI Show. It asks: how is AI shaping human stuff? Join creative technologist Bilawal Sidhu as he sits down with Sam Gregory, a human rights activist and technologist, for some real talk on deepfakes, how AI is challenging our sense of what’s real and what’s fiction, and how to maintain our sense of self in this rapidly evolving world. We hope you enjoy this episode. We'll be back with more How to Be a Better Human next week. You can listen to The TED AI Show anywhere you get your podcasts. Hosted on Acast. See acast.com/privacy for more information.

Transcript
Starting point is 00:00:00 Hi, everyone. Thanks for listening to How to Be a Better Human. This is Chris Duffy here. And this week, instead of a new episode of our show, we are going to play an episode of the newest podcast in the TED Audio Collective family, which is the TED AI Show. Now, I know what you might be thinking. TED AI Show, that doesn't really sound like something that's about humans at all. It's about technology, about artificial intelligence. Chris, isn't AI maybe the opposite of becoming a better human? And I got to say, I'm not sure that I disagree with you. It's a field that I have a lot of skepticism about and a lot of fear. And also, I'm interested. I want to know
Starting point is 00:00:35 where does AI live up to the hype and where does it fall short? And I think the thing that is so interesting about this new podcast is that they ask those questions too, right? They're very focused on how is AI shaping human stuff? How is it shaping human relationships and human work and culture and art? What is it going to do to our lives in 10 years? Is this really a thing that is going to change everything or is it overhyped? How do we use it in a way that serves our best interests and is ethical? The host of TED AI, Bilawal Sidhu, is a creative technologist, and he sits down with people involved in building this technology, with artists, with journalists, and with so many more interesting, intelligent people to talk about the thrilling and often terrifying future that our technologies are leading us into.
Starting point is 00:01:20 I hope that you enjoy this episode. And if you do, you can find more episodes from the TED AI Show wherever you're listening to this. Either way, we will be back with new episodes of How to Be a Better Human next week. And I also just want to say this is a real human voice. You are hearing my actual human voice as I record this. Who knows how long that will last? I guess we will have to listen to the TED AI Show to find out.
Starting point is 00:01:42 OK, without any further ado, here is an episode of TED AI. Enjoy. Okay, picture this. We're in Miami. There's sun, sand, palm trees, and a giant open air mall right on the water. Pretty nice, right? But then, late one Monday night in January of this year, things get weird. Cop cars swarm the mall, like dozens of them. I'm talking six city blocks shut down, lights flashing, people everywhere, and no one knows what's going on. The news footage hits the internet, and naturally, the hive mind goes into overdrive. Speculation and conspiracy theories are flying, and one idea takes hold: aliens. Folks are dissecting grainy helicopter footage in the comments,
Starting point is 00:02:32 zooming in, analyzing it frame by frame to find any evidence of aliens. So I thought: I'm a TikToker. What if I brought this online fever dream to life and shared it with the masses? Using the latest AI tools, I created a video. Towering shadowy figures silently materializing amidst the flashing lights of the police cars. An alien invasion in the middle of Miami's Bayside Marketplace. Just a bit of fun, I thought.
Starting point is 00:03:01 Some got the joke, a Miami twist on Stranger Things. But I watched as other people flocked into my comment section to declare this as bona fide evidence that aliens do in fact exist. Now you might be wondering, what actually happened? A bunch of teenagers got into a fight at the mall, the police showed up to break it up, and that's all it took to trigger this mass hysteria. It was too easy. Too easy to make people believe something happened that never actually
Starting point is 00:03:33 happened. And that's kind of terrifying. I'm Bilal Sadoo, and this is the TED AI Show, where we figure out how to live and thrive in a world where AI has changed everything. So I've been making visual effects, blending realities on my computer since I was a kid. I remember watching a show called Mega Movie Magic, which revealed the secrets behind movies' special effects. which revealed the secrets behind movies' special effects. I learned about CGI and practical effects in movies like Star Wars, Godzilla, and Independence Day. I was already into computer graphics, but seeing how they could create visuals indistinguishable from reality was a game changer.
Starting point is 00:04:22 It sparked a lifelong passion to blend the physical and digital worlds. Several years ago, I started my TikTok channel. I'd upload my own creations and share them with hundreds and thousands and now millions of viewers. I mean, just five years ago, if I wanted to make a video of giant aliens invading a mall in Miami, it would have taken me a week
Starting point is 00:04:38 and at least five pieces of software. But this aliens video, it took me just a day to make using tools like MidJourney, RunwayML, and Adobe Premiere. Tools that anyone with a laptop can access. Since ChatGPT came on the scene in late 2022, there's been a lot of talk about the Turing test, where a human evaluator tries to figure out if the person at the other end of a text chat is a machine or another human. But what about the visual Turing test, where machines can create images that are indistinguishable from reality?
Starting point is 00:05:14 And now OpenAI has come out with Sora, a video generation tool that will create impressively lifelike video from a single text prompt. It's basically like ChatGPT or DALL-E, but instead of text or images, it generates video. And don't get me wrong, there are other video generation tools out there, but when I first saw Sora, the realism blew my socks off. I mean, with these other programs, you can make short videos like a couple seconds long, but with Sora, we're talking minute-long videos. The 3D consistency with those long dynamic camera moves definitely impressed me. There's so much high-frequency detail, and the scene is just brimming with life. And if we can just punch in a single text prompt
Starting point is 00:05:57 into Sora and it'll give us full-on video that's visually convincing to the point that some people could mistake it for something real? Well, you could imagine some of the problems that might stem from that. So we're at a turning point. Not only have we shattered the visual Turing test, we're reshattering it every day. Images, audio, video, 3D, the list goes on. I mean, you've probably seen the headlines.
Starting point is 00:06:22 AI-generated nude photographs of Taylor Swift circulating on Twitter. A generated video of Volodymyr Zelensky surrendering to the Russian army. A fraudster successfully impersonating a CFO on a video call to scam a Hong Kong company out of tens of millions of dollars. And as bad as the hoaxes and the fakes and the scams are, there's a more insidious danger. What if we stop believing anything we see?
Starting point is 00:06:47 Think about that. Think about a future where you don't believe the news. You don't trust the video evidence you see in court. You're not even sure that the person on the other end of the Zoom call is real. This isn't some far-flung future. In fact, I'd argue we're living in it now. So given that we're in this new world, where we're constantly shattering and reshattering the visual Turing test, how do we protect our own sense of reality?
Starting point is 00:07:15 I reached out to Sam Gregory to talk me through what we're up against. Sam is an expert on generative AI and misinformation, and is the executive director of the human rights network Witness. His organization has been working with journalists, human rights advocates, and technologists to come up with solutions that help us separate the real from the fake. Sam, thank you for joining us. I have to ask you, as we're seeing these AI tools proliferate just over the last two years, are we correspondingly seeing a massive uptick of these visual hoaxes? The vast majority are still these shallow fakes because anyone can make a shallow fake. It's trivially easy, right, just to take an image, grab it out of Google search and claim it's from another place. What we're seeing, though, is this uptick that's happening in a whole range
Starting point is 00:08:05 of ways people are using this generative media for, for deception. So you see images sometimes deliberately shared to deceive people, right? Someone will share an image claiming it's, you know, of an event that never happened. And then, you know, we're seeing a lot of audio because it's so trivially easy to make, right? A few seconds of your voice and you can churn out endless, endless cloned voice. We're not seeing so much video, right? And that's, you know, a reflection that, you know, really doing complex video recreation is still not quite there, right? Yeah, video is significantly harder, at least for the moment, and I personally hope that it would stay pretty hard for a while. Though some of these generations are getting absolutely wild.
Starting point is 00:08:51 I had a bit of an existential moment looking at this one video from Sora. It's the underwater diver video. There's a diver swimming underwater, you know, investigating this historic, almost archaeological spaceship that's crashed into the seabed. And it looked absolutely real. And I was thinking through what that would have taken for me to do the old-fashioned way. And I was just gasping that this was just a simple prompt that produced this immaculate one-minute video. I'm kind of curious, have you had such a moment yourself? It's funny because I was literally showing that video to my colleagues and I didn't cue them up that it was made with Sora because I wanted to see whether they clicked that it was
Starting point is 00:09:38 an AI generated video because I think it's a fascinating one. It's kind of on the edge of possibility. There's definitely a kind of a moment that's happening now for me. And it's really interesting because, you know, we first started working on this like five or six years ago and we were just doing what we described as prepare, don't panic and really trying to puncture people's hype, particularly around video deep fakes, because people kept implying that they were really easy to do and that we were surrounded by them. And the reality was it wasn't easy to fake, you know, convincing video and to do that at scale. So it's certainly for me, Sora has been a click moment in terms of the possibility here, even though it feels like a black box and
Starting point is 00:10:15 I'm not quite sure how they've done it and how accessible this is actually going to be and how quickly. So related to this, a lot of these visual hoaxes tend to be whimsical, even innocuous, right? In other words, they don't cause serious harm in the real world and are almost akin to pranks. But some of these visual hoaxes can be a lot more serious. Can you tell me a little bit about what you're seeing out there? The most interesting examples right now are happening in election context globally, and they're typically people having words put in their mouths. In the recent elections in Pakistan, in Bangladesh, you had candidates saying, boycott the vote or vote for the other party, right? And they're quite compelling
Starting point is 00:10:57 at a first glance, particularly if you're not very familiar with how AI can be used. And they're often deployed right before an election. So those are clearly, in most cases, malicious. They're designed to deceive. And then you're also seeing ones that are kind of these leaked conversation ones, so they're not visual hoaxes. And so you've got really, you know, quite deceptive uses happening there,
Starting point is 00:11:18 either directly just with audio or at the intersection of audio with animated faces or audio with the ability to make a lip sync with a with a video if i if i wanted to ask you to zoom in on one single example that's disturbed you the most something that exemplifies what you are the most worried about what would it be i'm gonna pick one that is uh it's actually a whole genre and i'm gonna describe this genre because i think it's the one that people are familiar with but once you you start to think about it, you realize how easy it is to do this. And that is pretty much everyone has seen Elon Musk selling a crypto scam, right?
Starting point is 00:11:52 Often paired up with a newscaster, your favorite newscaster, or your favorite political figure. In every country in which I work, people have experienced that. They've seen that video where it's like the newscaster says, Hey, Elon, come on and explain how you follow this new crypto scam or come on, political candidate, and explain why you're investing in this crypto scam. For anyone who hasn't seen it, these are just videos with a deepfake Elon Musk trying to guilt you into buying crypto as a part of their Bitcoin giveaway program. Bitcoin giveaway program. And so the reason I point to that is not because it has massive human rights impacts or massive news impacts, but it's just, this is so commodified, but we have this sort of bigger question of how it plays into our overarching understanding of what we trust, right? Does this undermine people's confidence in almost any way in which they
Starting point is 00:12:41 experience audio or video or photos that they encounter online, does it just reinforce what they want to believe? And for other people, just let them believe that nothing can be trusted. We're going to take a quick break. When we come back, we're going to talk with Sam about how we can train ourselves to better distinguish the real from the unreal using a little system he calls SIFT. More on that in just a minute. We're back with Sam Gregory of Witness. Before the break, we were talking about how these fake videos are starting to erode our trust in everything we see. And yeah, maybe you can find flaws in a lot of these videos, but some of them are really, really good.
Starting point is 00:13:29 And nobody's zooming in at 300% looking for those minor imperfections, especially when they're scrolling through a feed, right? Like before their morning commute or something. Yeah, and you're hitting on the thing that I think, you know, the news media has often done a disservice to people about how to think about spotting AI, right? We put such an emphasis on kind of like, you know, you should have spotted the Pope, you know, had his ring finger on the wrong hand in that puffer jacket image, right?
Starting point is 00:13:56 Or didn't you see that his hair didn't look quite right on the hairline? Or didn't you see he didn't blink at the regular rate? And it's just so cruel almost to us as consumers to expect us to spot those things. We don't do it. I don't look at every TikTok video in my For You page and go, like, let me just look at this really carefully and make sure if someone's trying to deceive me. And so we've done a disservice often because people point out these glitches and then they expect people to spot them.
Starting point is 00:14:21 And it creates this whole culture where we distrust everything we look at. And we try and apply this sort of personal forensic skepticism and it doesn't lead us to great places. All right, I want to talk about mitigation. How do we prepare and what can we do right now? When we first started saying prepare, don't panic, it was five or six years ago and it was in the first deepfakes hype cycle,
Starting point is 00:14:44 which was like the 2018 elections when everyone was like, deepfakes are going to destroy the elections. And I don't think there was a single deepfake in the 2018 US elections of any note. Now, let's fast forward to now, right? 2024. When we look around the world, the threat is clear and present now, and it's escalating. So prepare is about acting, listening to the right voices and thinking about how we balance out creativity, expression, human rights, and do that from a global perspective because so much of this conversation often is also very US- or Europe-centric. So what can we do now? The first part of it is who are we listening to about this?
Starting point is 00:15:21 And I often get frustrated in AI conversations could get this very abstract discussion around AI harms and AI safety. And it feels very different from the conversation I'm having with journalists and human rights defenders on the ground who are saying, I got targeted with a non-consensual sexual deepfake. I got my footage dismissed as faked by a politician because he said it could have been made by AI. So as we prepare, the first thing is who do we listen to, right? And we should listen to the people who actually are experiencing this. And then we need to think, what is it that we need to help people understand how AI is being used? This kind of question of the recipe. And I use the recipe analogy because I think we're not in a world where it's AI or not. It's even in the photos we take on our iPhones, we're already combining AI and human, right?
Starting point is 00:16:06 The human input, then the AI modifications that make our photos look better. So we need to think, how do we communicate that AI was used in the media we make? We need to show people how AI and human were involved in the creation of a piece of media, how it was edited and how it's distributed. The second part of it is around access to detection. And the thing that we've seen is there's a huge gap in access to
Starting point is 00:16:29 the detection tools for the people who need it most, like journalists and election officials and human rights defenders globally. And so they're kind of stuck. They get this piece of video or an image and they are doing the same things that we're encouraging ordinary people to do. Look for the glitches, you know, take a guess, drop it in an online detector. And all of those things are as likely to give a false positive or a false negative as they are to give a reliable result that you can explain. So you've got those two things. You've got an absence of transparency explaining the recipe.
Starting point is 00:16:58 You've got gaps in access to detection. And neither of those will work well unless the whole of the AI pipeline plays its part in making sure the signals of that authenticity and the ability to detect is retained all the way through. So those are the three key things that we point to is transparency done right, detection available to those who need it most, and the importance of having an AI pipeline where the responsibility is shared across the whole AI industry. I think you covered like three questions beautifully right here. So a key challenge is telling what content is generated by humans versus synthetically generated by machines. And one of the efforts you're involved in is the appropriately named Content Authenticity
Starting point is 00:17:40 Initiative. Could you talk a bit about how does that play into a world where we will have fake content purporting to be real? Yes. So about five years ago, there were a couple of initiatives founded by a mix of companies and media entities, and Witness joined those early on to see how we could bring a human rights voice to them. And one of them was something called the Content Authenticity Initiative that Adobe kicked off. And another was something called the Coalition for Content Provenance and Authenticity. The shorthand for that is C2PA. So let me explain a little more about what C2PA is. It's basically a technical standard for showing what we might describe as the provenance of an image or a video or another piece of media. And provenance is basically the trail of how it was created, right? This is a standard that's been increasingly adopted by platforms in the last couple of months using Google and Meta, adopted as a way they're
Starting point is 00:18:29 going to show to people how the media they encounter online, particularly AI-generated or edited media, was made. It's also a direction that governments are moving in. Some key things that we point to around standards like the C2PA is, you know, the first thing is they are not a foolproof way of showing whether something was made with AI, made with a human. What I mean by that is they tell you information, but, you know, we know that people
Starting point is 00:18:52 can remove that metadata, for example. They can strip out the metadata. And we also know that some people may not add this in for a range of reasons. So we're creating a system that allows additional signals of trust or additional pieces of information, but no one confirmation of authenticity or reality. I think that's really important that we be clear that this is, in some sense, a harm reduction approach. It's a way to give people more information, but it's not going to be conclusive in a kind of sort of silver bullet like way.
Starting point is 00:19:25 not going to be conclusive in a kind of sort of silver bullet like way. And then the second sort of thing that we need to think about these is, you know, we need to really make sure that this is about the how of how media was made, not the who of who made it. Otherwise, we open a backdoor to surveillance. We open a backdoor to the ways this will be used to target and criminalize journalists and people who speak out against governments globally. Beautifully said, especially in the last point, I noticed Tim Sweeney had some interesting remarks about all of the content authenticity initiatives happening as kind of described it as sort of surveillance DRM, where you cannot upload a piece of content, right? Like if people like you aren't pushing on this direction, we may well end up in a world where you cannot upload imagery onto the internet without
Starting point is 00:20:03 having your identity tied to it. And I think that would be a scary world indeed. The thing that we have consistently pushed back on in systems like C2PA and is on the idea that identity should be the center of how you're trusted online. It's helpful. Right. And in many times I want people to know who I am. to know who I am. But if we start to premise trust online in individual identity as the center and require people to do that, that brings all kinds of risks that we already have a history of understanding from social media, right? That's not to say we shouldn't think about things like proof of personhood, right? Like how do we understand that someone who created media was a human may be important, right? As we enter an AI generated world, that's not the same as knowing
Starting point is 00:20:44 that it was Sam who made it, not a generic human who made it, right? So I think that's really important. It's a slippery slope indeed, and really good point on sort of the distinction between validating you're a human being versus, you know, validating you are Sam Gregory. That's a very subtle but, you know, crucial distinction. Let's move over to fears and hopes. You know, back in 2017, you felt the fear around deepfakes were overblown. Clearly now it is far more of a clear and present danger. Where do you stand now? What are your hopes and fears at the moment?
Starting point is 00:21:18 So we've gone from a scenario in 2017 where the primary harm was the one that people didn't discuss. That was gender-based violence. And the harm everyone discussed, political usage, was non-existent, to a scenario now where the gender-based violence has got far worse, right? And targets everyone from public figures to teenagers in schools all around the world. And the political usage is now very real. And the third thing is you have people realizing there's this incredibly good excuse for a piece of compromising media, which is just to say, hey, that was faked, or hey, plausibly, I can deny that piece of media by saying that it was faked. And so those three are sort of the core fears that I experience now that have translated into reality. Now,
Starting point is 00:22:05 in terms of hopes, I don't think we've acted yet on those three core problems sufficiently, right? We need to address those and we need to make sure that, you know, we criminalize the ways in which people target primarily women with non-consensual sexual deepfakes, which are escalating. In the second area of fears, which is the fears around their misuse in politics and to undermine news footage and human rights content, I think that's where we need to lean into a lot of the approaches like the authenticity and provenance infrastructures like the C2PA, the access to detection tools for the journalists who need it most, and then smart laws that can help us rule out some usages, right? And make sure that it is clear that some uses are unacceptable. And then the third
Starting point is 00:22:51 area, that's the hardest one, because we just don't have the research yet about what is the impact of this constant sort of drip, drip, drip of you can't believe what you see and hear. We can only reach an 84% probability that it's real or false, which is not great for public confidence. But we also don't know how this plays into this broader societal trust crisis we have, where already people want to lean into kind of almost plausible believability on stuff they care about, or just plausibly ignoring anything that challenges those beliefs. I think you brought up a really good point about it's almost like the world is fracturing into the multiverse of madness, I like to call it,
Starting point is 00:23:28 where people are looking for whatever validation to sort of confirm their beliefs. At the same time, it can result in people being jaded, right, where they're just going to be detached. Well, I don't trust anything. And so I'm curious, how do you see consumers' behaviors changing in this world where the visual Turing test gets shattered over and over again for all sorts of different, more complex domains? Are people going to get savvier? What do you think is going to happen to society in such a world? So we have to hope that we walk a fine line. We're going to need to be more skeptical of audio and images and video that we encounter online. But we're going to have to do that with a skepticism that's supported by
Starting point is 00:24:10 signals that help us. What I mean by that is, if we enter a world where we're just like, hey, everyone, everything could be faked. It's getting better every day. Hey, look out for the glitch. Then we enter a world where people's skepticism quite rightly will accelerate because all of us will experience like on a daily basis being deceived, right? And I think that's very legitimate for us to then feel like we can't trust anything. Right. In the ideal world, everyone's labeling what's real or fake. But when that's not happening, what do people do? I always go back to, you know, basic media literacy. I use an acronym called SIFT that was invented by an academic called Mike Caulfield. And SIFT is S-I-F-T. S stands for
Starting point is 00:24:52 stop, right? Because it's basically stop before you're emotionally triggered, right? Whenever you see something that's too good to be true. I stands for investigate the source, which is like, who shared this? Is it someone I should trust? The F stands for investigate the source, which is like, who shared this? Is it someone I should trust? The F stands for find alternative coverage, right? Did someone already write about this and say, wait, that's not the Pope in a puffer jacket. In reality, that's an AI image. And then the fourth part of that, which is getting complicated is T for trace the original, which used to always be a great way of doing it in the shallow fake era, because you'd find that an image had been recycled, but it's getting harder now. So when I look at the knife edge we've got to walk, it's to help people do SIFT
Starting point is 00:25:32 in an environment that is structured to give them better signals of how AI was used, and where the law has set parameters about what is definitely not acceptable, and where all the companies, all the players in that AI pipeline are playing their part to make sure that we can see the recipe of how AI and human was used and that it's as easy as possible to detect when AI was used to manipulate or create a piece of imagery, audio or video. I really like SIFT. I think that's also very good advice for people when they come across something that is indeed
Starting point is 00:26:03 too good to be true. Very often we will be like, oh, well, that's interesting and go about our day. The devices we use every day aren't foolproof, right? They've got vulnerabilities. There is this game of whack-a-mole that happens with patching those vulnerabilities. And now we've got these cognitive vulnerabilities almost. And, you know, on the detection side, the tools are going to need to keep improving because people are going to find ways to use the detectors to create new generators that evade them. Right. And so that game of whack-a-mole will continue.
Starting point is 00:26:31 But that isn't to say that all hope is lost. We can adapt and we can still have an information landscape where we can all thrive together. That's the future I want. we can all thrive together. That's the future I want. The way we describe it at Witness, we talk about fortifying the truth, which is that we need to find ways to defend that there is a reality out there.
Starting point is 00:26:51 Thank you so much, Sam. I will certainly sleep easier at night knowing there are people like you out there making sure we can tell the difference between the real and unreal. Thank you so much for joining us. Sam Gregory and I had this conversation in mid-March, and a few days later, there was another development. YouTube came out with a new rule. If you have AI-generated
Starting point is 00:27:13 content in your video and it's not obvious, you have to disclose its AI. This move from YouTube is an important one, the kind Sam and his colleagues at Witness have been advocating for. one, the kind Sam and his colleagues at Witness have been advocating for. It shifts the onus onto creators and platforms and away from everyday viewers, because ultimately it's unfair to make all of us become AI detectives scrutinizing every video for that missing shadow or impossible physics, especially in a world where the visual Turing test is continually being shattered. And look, I'm not going to sugarcoat this. This is a huge problem, and it's going to be difficult for everyone. Folks like Sam Gregory have their work cut out for them,
Starting point is 00:27:53 and massive organizations like TikTok, Google, and Meta do too. But listen, I'm going to be back here this week, and the week after that, and the week after that, helping you figure out how to navigate this new world order, how to live with AI, and yes, thrive with it too. We'll be talking to researchers, artists, journalists, academics, who can help us demystify the technology as it evolves. Together, we're going to figure out how to navigate AI before it navigates us.
Starting point is 00:28:23 This is the TED AI Show. I hope you'll join us. The TED AI Show is a part of the TED Audio Collective and is produced by TED with Cosmic Standard. Our producers are Ella Fetter and Sarah McRae. Our editors are Ben Bencheng and Alejandra Salazar. Our showrunner is Ivana Tucker, and our associate producer is Ben Montoya. Our engineer is Asia Pilar Simpson. Our technical director is Jacob Winnick, and our executive producer is Eliza Smith. Our fact checker is Christian Aparta. And I'm your host, Bilal Velsadu. See y'all in the next one.
