Today, Explained - How an AI pope pic fooled us

Starting point is 00:00:00 Today Explained, Sean Rameswaram. This past Saturday, against my better judgment, I was scrolling the feeds, liking the tweets, and then I saw something glorious. Cool Pope, Papa Francis, in an epic white puffer. He's got the cross dangling over the jacket. The jacket has this inexplicable built-in white belt. He looks like he's on his way to save humanity from eternal damnation, and he's going to be warm as hell while he does it.

Starting point is 00:00:29 The tweet I saw captioned it, the boys in Brooklyn could only hope for this level of drip. Chef's kiss, I sent it to all my Catholics. And then, on Sunday, I spent less time online, so it wasn't until Monday morning that I found out that the image was fake. AI got me, and millions of others. How we all got fooled by Prada Pope, ahead on Today Explained.

Starting point is 00:01:08 It's the Pope. He looks fantastic. He's wearing a big white puffer jacket that looks like it was made by Balenciaga, Moncler, something like that. He looks like he just stepped off a runway. And of course, the kicker is that it's a fake image, that it was made using AI. It's not a real thing.

Starting point is 00:01:44 James Vincent is a senior reporter at The Verge. Today Explained asked him why this AI-generated picture of the Pope matters. Because I think it is a minor milestone in terms of a fake AI image tricking a lot of people. I saw this being shared by a lot of people going, wow, the Pope looking drippy as fuck. And people who were taken in by it. Now, there have been lots of AI fakes that have been circulated. And they always, or not always, but a lot of them have taken in some people. But I think this was the first time I saw lots of people thinking that

Starting point is 00:02:19 this just happened to be a real image. Were you fooled by it? Be honest. Yes. Okay. So the first time I saw it, it was on Friday night, and I was, you know, scrolling through my phone idly for whatever reason. And I went past it, and I just saw it in passing and was like, wow, Pope looking real, he's looking real good today. And then I saw it again. And I was like, okay, this is obviously, like, lots of people are paying attention to this image. So I took a closer look. And I am happy to say I did spot that it was a fake. But I say this as someone who looks at AI-generated images day in, day out. So I'm definitely better attuned to seeing what makes something fake.

Starting point is 00:02:56 Let's help the people out here. What eventually tipped you off that it was fake? How can we spot a fake image? So when it comes to AI-generated images, there are a few different tells, and they kind of all, in my mind, relate to one of the same underlying traits. AI is good at generating surfaces,

Starting point is 00:03:17 not the underlying system. Now, that sort of sounds like quite an abstract thing to say, but if you think about the data it's trained on, it's not being fed 3D models of people or clothing or, you know, architecture. It's being fed 2D representations of it. And so it learns what the surface of the image looked like. This means that when you zoom in, when you look at the details, they often reveal some inconsistency, some blur or smear. Now, a very famous example of this is that AI image generators are not very good at producing hands.

Starting point is 00:03:49 They create things that look like flesh, that look like fingers, but sometimes the joints are in the wrong place, or there's a missing knuckle, or there's a finger missing, or something like this. And this is because there's no internal representation. These systems haven't gone through anatomy classes. So if you actually, to use a specific thing, yeah, you zoom in on his hand and he has this sort of slight indistinct claw of a hand

Starting point is 00:04:13 and it's grasping something that looks like it could be the lid of a coffee cup but isn't quite a coffee cup. And this is the sort of telltale detail that gives away an AI-generated image. They don't do systems in a way. So other things that were wrong with the images was his jacket was sort of weirdly folded. When you looked at it up close, you were like, well, it's sort of got a piece of fabric that dips in and out of another piece of fabric. And then he was wearing a crucifix, but the crucifix had this sort of image of Jesus that looked like Jesus had been sculpted in clay

Starting point is 00:04:44 and then sat on. Which the Pope would not do. Famously, the Pope has a lot of respect for the image of Jesus Christ. This is what gives AI images away. They're also not very good at text, for example. If you see anything that has a written word in it, they're not very good at doing sort of big background shots. They often have a single image in the front of the frame, like this one, and no detail in the background

Starting point is 00:05:15 because it's quite a lot of moving parts to work out all that stuff. The unfortunate thing is that they are getting better at all these challenges. These systems used to be very bad at generating hands. The hands are better now. As in the Pope image, one of the hands was sort of okay and one was bad. But, you know, these are challenges that will be overcome in time. So if there were these telltale signs, I mean, you're talking about pretty obvious things. The cross didn't actually have like a real Jesus on it. His one hand looked like something out of a horror movie. Why exactly

Starting point is 00:05:48 did this image hit the way it did? I mean, it spread like wildfire and a lot of people thought it was real. What was convincing, if not these things that were so unconvincing? So there's a couple of elements at play here. One is a recent improvement to the software. So the specific software used to create this image is called Midjourney, and version five of Midjourney came out in the middle of March. And it included quite a big improvement in the quality of images of people it made. So you may have seen, for example, other recent AI fakes, and I'm talking the last two weeks since the middle of March, ones of Donald Trump supposedly being arrested, ones of French President Emmanuel Macron in the protests, ones of Elon Musk

Starting point is 00:06:34 supposedly on a date with AOC. None of these things have happened. None of these, to be clear, none of those things have happened. None of those things have happened. But all those images have been created by Midjourney because it's had this bump in its quality, basically. Now, I think an important part of this is that Midjourney also has quite a specific aesthetic. It has a type of image that it's better at doing than others, and I think that type of image corresponds to celebrity images. These are often pictures, as with the one of the Pope, where they're really dramatic. They have fantastic lighting. They've got a strong contrast between the lighting and the shadows. Often the colors really pop, the fabrics look really kind of shiny or glossy.

Starting point is 00:07:14 And I think one part of that is that the systems are trained on a lot of images scraped from the internet. A lot of images on the internet, especially high-quality images from stock sites like Getty Images and Shutterstock, are of celebrities. So there is this correspondence between the training data and the images that they are good at producing. And to take that a step further, between the images we are used to seeing. So I think, you know, when you see a fake image of the Pope looking, you know, particularly fashionable, in a way, it's because we've seen pictures of the Pope looking fashionable before. You know, there's actually this strong association

Starting point is 00:07:49 between the Pope and Italian fashion. So much so, I was writing a story about this image and I was looking into the sort of the background of it. And there was a press release by the Vatican where they had to deny officially that the previous Pope wore Prada loafers. Early in his papacy, there were rumors that the pontiff favored Prada footwear. Good Italian designer, right?

Starting point is 00:08:11 And they had this amazing phrase, which is like, the pope doesn't wear Prada, he wears Christ. But because of this, I think people were sort of, they had an image in their mind of the pope as being this sort of like memefied figure, right? The pope sometimes says or does quite funny stuff and images of him are taken out of context. You know, there's a famous one where he was giving a talk. I can't remember where, but he's holding the microphone and it looks like he's holding it like he's in the middle of a freestyle rap.

Starting point is 00:08:40 Like he's dropping bars. Exactly. And, you know, it's just a funny image and it's a real image, but, you know, it turned the Pope into a meme. Yeah. And I think this brings me to sort of the final point, why we believe this one is because when it comes to fake images, if you want to believe the image is true, you will always, that will always push you towards believing it.

Starting point is 00:09:00 And I think this is something that is going to be very important when we keep in mind the future media environment. And I think a big part of the internet, Twitter, had it primed in their head. Pope's funny. Images of the Pope being funny are cool. I'm going to retweet that. And so they skipped the bit in your head where you might go, hang on a second, is that actually real? I think that's the really important thing that we're going to have to remember in the future. And there's this one dead giveaway, I think, that we did not touch on, which is that it's springtime in Vatican City.

Starting point is 00:09:31 He does not need a puffer coat. Exactly. Well, yeah, no, it's true. It's true. Do we know who made this fake image and why? Was it to dupe the internet? Yeah, BuzzFeed had an interview with the creator of it who used his first name but not his full name.

Starting point is 00:09:50 He didn't want to give it away. BuzzFeed identified him as a 31-year-old construction worker from Chicago who apparently was tripping on mushrooms when he came up with the idea for the image. And he shared these images to a Facebook group and a subreddit on Friday at about two o'clock East Coast U.S. time, and they just spread like wildfire from there. But I mean, I think that timeline seems sort of like a, you know, an insignificant detail, but I think that's really telling, the fact that these things went viral in a matter of hours. I think

Starting point is 00:10:24 that really shows that if you have the right image at the right time, it can really take off before people truly can, you know, debunk it or say what it is. Is this a new normal for the internet now? I mean, I think the thing is that the internet has always had a lot of fake news and fake images on it. And, you know, Photoshop has existed for a long time. We've always had these sorts of problems. I do think this technology is going to accelerate the frequency with which we have situations like this.

Starting point is 00:10:58 You know, in this case, it was sort of a bit of a self-fulfilling cycle in that it got talked about because it got talked about because it got talked about, and lots of people were talking about it to debunk it as well as to spread it as a real thing. But I do think in general the ease with which you can now produce this sort of fake, and not just in images, obviously, but in text and audio, means there will be more of this going around. Yeah. More with James in a minute on Today Explained.

Starting point is 00:13:41 Today Explained, we're back with James Vincent, senior reporter at The Verge.

Starting point is 00:14:06 James, when we last had you on the show, it was last year, you helped us understand that AI images are part of a broader trend in AI called generative AI. How are the other types of AI generation faring when it comes to misinformation? Well, the two other big categories are text and audio. Text hasn't been so much of a problem for misinformation that we know so far. This is the same sort of technology that's in ChatGPT. People worried it would be used for a lot of propaganda and fake reviews and stuff like that. We've not really seen that so far.

Starting point is 00:14:39 Now, the audio side is slightly different. And I think there have been more instances, relatively low level of misinformation. If you've been on TikTok recently in the last couple of months, you know, they spread across social media. There's lots of videos of audio deep fakes of Joe Biden, Donald Trump, Barack Obama,

Starting point is 00:14:59 lots of, you know, well-known figures like this. No, I'm not fat. Pikmin 3 has better graphics, gameplay, and story. It makes Pikmin 2 look like a glorified tech demo, man. Newer doesn't always mean better, Joe. You should know that. Besides, Pikmin 2 has the president, Shacho.

Starting point is 00:15:15 He's like a tiny man with a mustache who runs his own company. His whistle is a car horn, Joe. Doing things like playing video games or arguing about their favorite rap albums or whatever it might be. Drake is the biggest to ever do it. He's one of the pioneer rap pop stars. He has more hits than you've had days in office.

Starting point is 00:15:33 You want to talk about days in office one term? Listen, Drake is an undeniable force in rap. One of the biggest and greatest to ever do it. But Bro's catalog is mostly fluff. That's a very good one. It's a good one. But there's also been people who have taken these tools and created fakes of, say, Joe Biden saying some transphobic things. And then I have seen instances of these being spread on Twitter, for example, and they do fool some people. And like the sort of indicators you gave us for spotting an AI-generated image.

Starting point is 00:16:07 Are there similar indicators for audio? They are trickier. So for audio, the sort of limiting factor at the moment, and again, this is only going to be true for a short amount of time, is expression. So a lot of the AI-generated voices, they're not very good at doing highs and lows, like really enthusiastic talking. Oh, God, I'm so unhappy. You know, they're not good at doing that sort of thing. They're also not very good at accents. I've heard, you know, Irish friends trying to imitate their

Starting point is 00:16:36 voices using voice clones, and it doesn't capture their accent very well. And again, that is a factor of training data, that most of the training data is, you know, American voices and RP voices in English, that sort of thing. And what about text? Are there other indicators for text, for spotting AI in text? There's no reliable software that can distinguish AI-generated text versus human-generated text. There's nothing that can do that at the moment. However, there are some common sense rules about how you should be using these tools. So obviously, a lot of the popular text generation systems at the moment are chatbots.

Starting point is 00:17:10 That's ChatGPT, there is Bard by Google, and there is Microsoft's Bing chatbot. Now, people often use these for search. They use them for answering questions that they would usually answer by going to Google and clicking on websites. If you do that for any important information, you need to fact check and you need to source what you're doing. So anytime you get a specific answer about, I don't know, a research paper or biographical data, historical data, what year something happened, who it happened to,

Starting point is 00:17:41 you're going to want to check that with another source because there is a decent chance that they hallucinated their response. It sounds like you're saying these systems are not trained on, I don't know, responsibility. And as a result, it's only a matter of time before some fake text or audio or video or image disrupts, I don't know, democracy. How are Microsoft and OpenAI and Google responding to fears over that distinct possibility? They're not doing nothing. So I think a good example of this would be how Microsoft has handled news search within Bing. So Bing is a chatbot,

Starting point is 00:18:26 it can produce false information, but often cites its sources as well. It has little footnotes about its responses and supposedly points you to where it got that information from. And sometimes it gets the information wrong. Sometimes it points you to an incorrect source. Sometimes it sources it from a source that is untrustworthy in the same way that Google might direct you to an untrustworthy news site. But that is something that it's doing that's helpful. However, I really believe that these companies are not doing enough, especially when it comes to chatbots at the moment. Microsoft and Google and OpenAI are obviously involved as well. They're locked in this battle where suddenly AI is the hot new thing. They're competing with each other extremely ferociously.

Starting point is 00:19:06 And they are pushing out systems that I do not think have been properly safeguarded yet. These are companies that have, in the past, when it was less of a hot issue, talked about how much they care about AI safety and ethics and regulation. And now, when there is potential market share on the line, they have shown that they will happily discard those principles if it means beating the competition. Is artificial intelligence a threat to society and humanity? Well, a group of AI experts and industry execs,

Starting point is 00:19:35 including Elon Musk, believe so and have signed an open letter calling for a six-month pause on advanced AI development. That's training systems more powerful than OpenAI's newly launched ChatGPT. How much more could these companies be doing right now not to rush out new exciting products and updates, but to ensure that these products and updates are safe for gen pop? Yeah, well, it's a tricky question because the safeguards for each of the different sorts of system are different.

Starting point is 00:20:07 So, for example, with a voice cloning system, you want to know if people are cloning someone who is a private citizen, for example, because then you'd be like, well, why are they cloning this guy or this girl's voice or whatever it might be? So in that case, you might want to have that person submit some sort of a check saying, yes, I consent to having my voice cloned. When it comes to the search engines, the text search engines, they could have more safeguards like Bing is introducing in terms of, you know, footnoting their sources. But there's essentially a lot of big technological issues that just haven't been solved. Like the fact that language models can't sort fact from fiction. I think these companies should be more cautious in what they're releasing to the general public. But I think there is so much excitement

Starting point is 00:20:52 and fervor around AI at the moment that a lot of companies are just thinking it's better to push what they have out there. They'll soak up any reputational damage that comes with it, and they'll reap the benefits of being first to the user base. And no one's in charge.

Starting point is 00:21:07 There's no one agency, country, body, who's saying, here are some guardrails. There is proposed legislation in the US and in the EU that could apply to some of these systems. But unfortunately, it's a classic case where the technology has just moved so much faster than the law. You know, the congressional hearings with TikTok recently, that was not a great display of the technical literacy of the United States lawmakers. Mr. Chew, does TikTok access the home Wi-Fi network? It's not a recipe for successful legislation, unfortunately. So it means that the people who are steering the ship right now are the CEOs of these corporations. Satya Nadella, Sundar Pichai, Sam Altman. These are the people who are in charge right now, for better or worse.

Starting point is 00:21:56 And barring our listeners, say, you know, trusting those guys fully, what can we do to have our wits about us as we look at our phones and see images that are real alongside images that are fake and hear audio that is real alongside audio that was generated by AI? I've been struggling with this question myself recently because I'm often a sort of technical fix guy. I think there can be technical fixes to these problems, like having software that detects fakes and flags them as such. That's not happening right now. And I think we're in a situation where we're going to have to enter a new era of media literacy. People are going to really have to rethink their old assumptions about how they view information on the internet. Just because you see a photograph that looks incredibly real, that looks entirely real,

Starting point is 00:22:48 you are going to have to stop yourself and think, actually, is that real? Did that happen? It's going to mean having new media habits, I think. It's going to mean probably turning more to trusted sources. That sounds really nice, James. But I mean, what we've seen is that people didn't question what was put in front of them back when it was just other people lying. And now we've got sophisticated machines lying. And you're hoping that we go to trusted sources. Well, I'm going to be really pedantic.

Starting point is 00:23:22 And I'm going to say that the machines aren't lying. It's still other people lying, right? Okay, fair. The problem is the same. It is always people doing this for whatever reason they have. The machines aren't doing this to us. We're doing it to ourselves. And you're right that so far we've sort of failed these tests and now we're getting an even bigger hurdle and quite likely we're going to fail it too as well. I've seen some people talking about the fact that, you know, most of human civilization has been conducted under conditions of mistrust in a way, right? It's only a relatively small amount of time that we had these things called photographs,

Starting point is 00:24:05 which were quite hard to fake and meant that you could believe the output. And it's only a smaller amount of time that we've had this thing called digital video. And if you filmed it, it probably happened and it was hard to fake. So we have survived with a greater level of mistrust in the past. I'm sure we can do it again. I wish your confidence were contagious. I'm leaning towards humanity had a good run, maybe

Starting point is 00:24:38 the machines will do better. Well, they're having fun at least. We can have an AI-generated pope for the machines. We can have AI Catholicism. Amen. James Vincent, Saint Vincent. He's a senior reporter at The Verge.

Starting point is 00:25:00 And I wrote a book called Beyond Measure, which is a history of measurement, which I promise you is much more exciting than it sounds. Our show today was generated by Amanda Llewellyn. It was edited by birthday boy, Matthew Collette. It was fact-checked by Matthew and Avishai Artsy and Siona Petros. And it was mixed by Paul Robert Mounsey. It's Today Explained.

Starting point is 00:25:22 Be careful out there. It's only a matter of time before the AI can access the home Wi-Fi network.
