Today, Explained - How an AI pope pic fooled us
Episode Date: March 29, 2023
An AI-generated image of Cool Pope in immaculate drip went viral over the weekend and most everyone thought it was real. The Verge’s James Vincent explains how we should navigate our new internet re...ality. This episode was produced by Amanda Lewellyn, edited by Matt Collette, fact-checked by Avishay Artsy and Siona Peterous, engineered by Paul Robert Mounsey, and hosted by Sean Rameswaram. Transcript at vox.com/todayexplained. Support Today, Explained by making a financial contribution to Vox! bit.ly/givepodcasts
Transcript
Today Explained, Sean Rameswaram.
This past Saturday, against my better judgment, I was scrolling the feeds, liking the tweets,
and then I saw something glorious.
Cool Pope, Papa Francis, in an epic white puffer.
He's got the cross dangling over the jacket.
The jacket has this inexplicable built-in white belt.
He looks like he's on his way to save humanity from eternal damnation,
and he's going to be warm as hell while he does it.
The tweet I saw captioned it,
the boys in Brooklyn could only hope for this level of drip.
Chef's kiss, I sent it to all my Catholics.
And then, on Sunday, I spent less time online,
so it wasn't until Monday morning that I found out that the image was fake.
AI got me, and millions of others.
How we all got fooled by Prada Pope, ahead on Today Explained.
It's the Pope. He looks fantastic.
He's wearing a big white puffer jacket that looks like it was made by Balenciaga,
Moncler, something like that.
He looks like he just stepped off a runway.
And of course, the kicker is that it's a fake image, that it was made using AI.
It's not a real thing.
James Vincent is a senior reporter at The Verge.
Today Explained asked him why this AI-generated picture of the Pope matters.
Because I think it is a minor milestone
in terms of a fake AI image tricking a lot of people.
I saw this being shared by a lot of people going,
wow, the Pope looking drippy as fuck. And people who were taken in by it. And now there have been lots of
AI fakes that have been circulated. And they always, or not always, but a lot of them have
taken in some people. But I think this was the first time I saw lots of people thinking that
this just happened to be a real image. Were you fooled by it? Be honest.
Yes. Okay. So the first time I saw it, it was on Friday night, and I was, you know,
scrolling through my phone idly for whatever reason. And I went past it, and I just saw it
in passing and was like, wow, Pope looking real, he's looking real good today. And then I saw it again.
And I was like, okay, this is obviously like, lots of people are paying attention to this image. So
I took a closer look. And I am happy to say I did spot that it was a fake.
But I say this as someone who looks at AI-generated images day in, day out.
So I'm kind of, I'm definitely better attuned to seeing what makes something fake.
Let's help the people out here.
What eventually tipped you off that it was fake?
How can we spot a fake image?
So when it comes to AI-generated images,
there are a few different tells,
and they kind of all, in my mind,
relate to one of the same underlying traits.
AI is good at generating surfaces,
not the underlying system.
Now, that sort of sounds like quite an abstract thing to say,
but if you think about the data it's trained on,
it's not being fed 3D models of people or clothing or, you know, architecture. It's being fed 2D
representations of it. And so it learns what the surface of the image looked like. This means that
when you zoom in, when you look at the details, they often reveal some inconsistency, some blur
or smear. Now, a very famous example of this is that AI image generators
are not very good at producing hands.
They create things that look like flesh, that look like fingers,
but sometimes the joints are in the wrong place,
or there's missing a knuckle, or there's a finger missing,
or something like this.
And this is because there's no internal representation.
These systems haven't gone through anatomy classes.
So if you actually, to use a specific thing, yeah, you zoom in on his hand and he has this sort of slight indistinct claw of a hand
and it's grasping something that looks like it could be the lid of a coffee cup
but isn't quite a coffee cup.
And this is the sort of telltale detail that gives away an AI-generated image.
They don't do systems in a way. So
other things that were wrong with the image were that his jacket was sort of weirdly folded.
When you looked at it up close, you were like, well, it's sort of got a piece of fabric that
dips in and out of another piece of fabric. And then he was wearing a crucifix, but the
crucifix had this sort of image of Jesus that looked like Jesus had been sculpted in clay
and then sat on.
Which the Pope would not do.
Famously, the Pope has a lot of respect for the image of Jesus Christ.
This is what gives AI images away. They're also not very good at text, for example. If you see
anything that has a written word in it,
they're not very good at doing sort of big background shots.
They often have a single image in the front of the frame,
like this one, and no detail in the background
because it's quite a lot of moving parts to work out all that stuff.
The unfortunate thing is that they are getting better
at all these challenges.
These systems used to be very bad at generating hands.
The hands are better now. As in the Pope image, one of the hands was sort of okay and one was bad.
But, you know, these are challenges that will be overcome in time.
So if there were these telltale signs, I mean, you're talking about pretty obvious things.
The cross didn't actually have like a real Jesus on it. His one hand looked like something out of a horror movie. Why exactly
did this image hit the way it did? I mean, it spread like wildfire and a lot of people thought
it was real. What was convincing, if not these things that were so unconvincing?
So there's a couple of elements at play here. One is a recent improvement to the
software. So the specific software used to create this image, it's called Midjourney,
and version five of Midjourney came out in the middle of March. And it included quite a big
improvement in the quality of images of people it made. So you may have seen, for example,
other recent AI fakes, and I'm talking the last two weeks since the middle of March, ones of Donald Trump supposedly
being arrested, ones of French President Emmanuel Macron in the protests, ones of Elon Musk
supposedly on a date with AOC. None of these things have happened. None of these, to be clear,
none of those things have happened. None of those things have happened. But all those images have
been created by Midjourney because it's had this bump in its quality, basically. Now, I think an
important part of this is that Midjourney also has quite a specific aesthetic. It has a type of
image that it's better at doing than others, and I think that type of image corresponds to celebrity
images. These are often pictures, as with the one of the Pope, where they're really dramatic.
They have fantastic lighting. They've got a strong contrast between the lighting and
the shadows. Often the colors really pop, the fabrics look really kind of shiny or glossy.
And I think one part of that is that the systems are trained on a lot of images scraped from
the internet. A lot of images on the internet, especially high quality images from stock
sites like Getty Images and Shutterstock, are of celebrities. So there is this correspondence
between the training data and the images that they are good at producing. And to take that a
step further, between the images we are used to seeing. So I think, you know, when you see a fake
image of the Pope looking, you know, particularly fashionable, in a way, it's because we've seen pictures
of the Pope looking fashionable before.
You know, there's actually this strong association
between the Pope and Italian fashion.
So much so, I was writing a story about this image
and I was looking into the sort of the background of it.
And there was a press release by the Vatican
where they had to deny officially
that the previous Pope wore Prada loafers.
Early in his papacy, there were rumors that the pontiff favored Prada footwear.
Good Italian designer, right?
And they had this amazing phrase, which is like,
the pope doesn't wear Prada, he wears Christ.
But because of this, I think people were sort of,
they had an image in their mind of the pope as being this sort of like memefied figure, right?
The pope sometimes says or does
quite funny stuff and images of him are taken out of context. You know, there's a famous one
where he was giving a talk. I can't remember where, but he's holding the microphone and it
looks like he's holding it like he's in the middle of a freestyle rap.
Like he's dropping bars.
Exactly. And, you know, it's just a funny image and it's a real image,
but, you know, it turned the Pope into a meme.
Yeah.
And I think this brings me to sort of the final point,
why we believe this one is because when it comes to fake images,
if you want to believe the image is true,
you will always, that will always push you towards believing it.
And I think this is something that is going to be very important
when we keep in mind the future media environment.
And I think a big part of the internet, Twitter, had it primed in their head.
Pope's funny. Images of the Pope being funny are cool.
I'm going to retweet that.
And so they skipped the bit in your head where you might go, hang on a second, is that actually real?
I think that's the really important thing that we're going to have to remember in the future. And there's this one dead giveaway, I think, that we did not touch on,
which is that it's springtime in Vatican City.
He does not need a puffer coat.
Exactly.
Well, yeah, no, it's true.
It's true.
Do we know who made this fake image and why?
Was it to dupe the internet?
Yeah, BuzzFeed had an interview with the creator of it
who used his first name but not his full name.
He didn't want to give it away.
BuzzFeed identified him as a 31-year-old construction worker from Chicago
who apparently was tripping on mushrooms
when he came up with the idea for the image.
And he shared these images to a Facebook group and a subreddit
on Friday at about two o'clock East Coast U.S. time, and they just spread like wildfire from there.
But, I mean, I think that timeline seems sort of like a, you know, an insignificant detail, but I
think that's really telling. The fact that these things went viral in a matter of hours, I think
that really shows that if you have the right image at the right time,
it can really take off before people truly can, you know, debunk it or say what it is.
Is this a new normal for the internet now?
I mean, I think the thing is that the internet has always had a lot of fake
news and fake images on it.
And we, you know, Photoshop has existed for a long time.
We've always had these sorts of problems. I do think this technology is going to
accelerate the frequency with which we have situations like this.
You know, in this case, it was sort of a bit of a self-fulfilling cycle in that it got talked
about because it got talked about because it got
talked about, and lots of people were talking about it to debunk it as well as to spread it as a real
thing. But I do think, in general, the ease with which you can now produce this sort of fake, and
not just in images, obviously, but in text and audio, means there will be more of this going around.
Yeah. More with James in a minute on Today Explained.
Today Explained,
we're back with James Vincent,
senior reporter at The Verge.
James, when we last had you on the show, it was last year, you helped us understand that AI images are part of a broader trend in AI called generative AI.
How are the other types of AI generation faring when it comes to misinformation?
Well, the two other big categories are text and audio.
Text hasn't been so much of a problem for misinformation that we know so far.
This is the same sort of technology that's in ChatGPT.
People worried it would be used for a lot of propaganda and fake reviews and stuff like that.
We've not really seen that so far.
Now, the audio side is slightly different.
And I think there have been more instances,
relatively low-level, of misinformation.
If you've been on TikTok recently,
in the last couple of months,
you know, they spread across social media.
There's lots of videos of audio deepfakes
of Joe Biden, Donald Trump, Barack Obama,
lots of, you know, well-known figures like this.
No, I'm not fat.
Pikmin 3 has better graphics, gameplay, and story.
It makes Pikmin 2 look like
a glorified tech demo, man.
Newer doesn't always mean better, Joe.
You should know that.
Besides, Pikmin 2 has the president, Shacho.
He's like a tiny man with a mustache
who runs his own company.
His whistle is a car horn, Joe.
Doing things like playing video games
or arguing about their favorite rap albums or whatever it might be.
Drake is the biggest to ever do it.
He's one of the pioneer rap pop stars.
He has more hits than you've had days in office.
You want to talk about days in office one term?
Listen, Drake is an undeniable force in rap.
One of the biggest and greatest to ever do it.
But Bro's catalog is mostly fluff.
That's a very good one.
It's a good one. But there's also been people who have taken these tools and created fakes of, say, Joe Biden saying some transphobic things.
And then I have seen instances of these being spread on Twitter, for example, and they do fool some people.
And like the sort of indicators you gave us for spotting an AI-generated image,
are there similar indicators for audio?
They are trickier.
So for audio, the sort of limiting factor at the moment,
and again, this is only going to be true for a short amount of time, is expression.
So a lot of the AI-generated voices,
they're not very good at doing highs and lows,
like really enthusiastic talking. Oh, God, I'm so unhappy. You know, they're not good at doing that sort of thing.
They're also not very good at accents. I've heard, you know, Irish friends trying to imitate their
voices using voice clones, and it doesn't capture their accent very well. And again, that is a
factor of training data that most of the training data is, you know, American voices and RP voices in English, that sort of thing.
And what about text? Are there, like, other indicators for text, for spotting AI in text?
There's no reliable software that can distinguish AI generated text versus human generated text.
And there's nothing that can do that at the moment.
However, there is sort of like some common sense rules about how you should be using these tools.
So obviously, a lot of the popular text generation systems
at the moment are chatbots.
That's ChatGPT, there is Bard by Google,
and there is Microsoft's Bing chatbot.
Now, people often use these for search.
They use them for answering questions
that they would usually answer by going to Google
and clicking on websites. If you do that for any important information, you need to fact check and you need
to source what you're doing. So anytime you get a specific answer about, I don't know, a research
paper or biographical data, historical data, what year something happened, who it happened to,
you're going to want to check that with another source because there is a decent chance that they hallucinated their response.
It sounds like you're saying these systems are not trained on, I don't know, responsibility.
And as a result, it's only a matter of time before some fake text or audio or video or image
disrupts, I don't know, democracy.
How are Microsoft and OpenAI and Google responding to fears over that distinct possibility?
They're not doing nothing.
So I think a good example of this would be how Microsoft has handled news search within Bing.
So Bing is a chatbot,
it can produce false information, but often cites its sources as well. It has little footnotes about
its responses and supposedly points you to where it got that information from. And sometimes it
gets the information wrong. Sometimes it points you to an incorrect source. Sometimes it sources
it from a source that is untrustworthy in the same way that Google might direct you to an untrustworthy news site.
But that is something that it's doing that's helpful.
However, I really believe that these companies are not doing enough, especially when it comes to chatbots at the moment.
Microsoft and Google, and OpenAI is obviously involved as well.
They're locked in this battle where suddenly AI is the hot new thing. They're competing with each other extremely ferociously.
And they are pushing out systems that I do not think have been properly safeguarded yet.
These are companies that have, in the past, when it was less of a hot issue,
talked about how much they care about AI safety and ethics and regulation.
And now, when there is potential market share on the line,
they have shown that they will happily discard those principles
if it means beating the competition.
Is artificial intelligence a threat to society and humanity?
Well, a group of AI experts and industry execs,
including Elon Musk, believe so
and have signed an open letter
calling for a six-month pause on advanced AI development.
That's training systems more powerful
than OpenAI's newly launched
ChatGPT. How much more could these companies be doing right now not to rush out new exciting
products and updates, but to ensure that these products and updates are safe for gen pop? Yeah,
well, it's a tricky question because the safeguards for each of the different sorts of system are different.
So, for example, with a voice cloning system, you want to know if people are cloning someone who is a private citizen, for example, because then you'd be like, well, why are they cloning this guy or this girl's voice or whatever it might be?
So in that case, you might want to have that person submit some sort of a check saying, yes, I consent to having my voice cloned. When it comes to the search engines,
the text search engines, they could have more safeguards like Bing is introducing in terms of,
you know, footnoting their sources. But there's essentially a lot of big technological issues
that just haven't been solved. Like the fact that language models can't sort fact from fiction.
I think these companies should be more cautious
in what they're releasing to the general public.
But I think there is so much excitement
and fervor around AI at the moment
that a lot of companies are just thinking
it's better to push what they have out there.
They'll soak up any reputational damage
that comes with it,
and they'll reap the benefits
of being first to the user base.
And no one's in charge.
There's no one agency, country, body, who's saying, here are some guardrails.
There is proposed legislation in the US and in the EU that could apply to some of these
systems.
But unfortunately, it's a classic case where the technology has just moved so much faster than the law. You know, the congressional hearings with TikTok recently,
that was not a great display of the technical literacy of the United States lawmakers.
Mr. Chew, does TikTok access the home Wi-Fi network?
It's not a recipe for successful legislation, unfortunately. So it means that the people who are steering the ship right now are the CEOs of these corporations. Satya Nadella,
Sundar Pichai, Sam Altman. These are the people who are in charge right now, for better or worse.
And barring our listeners, say, you know, trusting those guys fully, what can we do to have our wits about us as we look at our phones and see images that are
real alongside images that are fake and hear audio that is real alongside audio that was generated by
AI? I've been struggling with this question myself recently because I'm often a sort of technical
fix guy. I think there can be technical fixes to
these problems, like having software that detects fakes and flags it up as such. That's not
happening right now. And I think we're in a situation where we're going to have to enter a new
era of media literacy. People are going to really have to rethink their old assumptions about how
they view information on the internet. Just because you see a photograph that looks incredibly real, that looks entirely real,
you are going to have to stop yourself and think, actually, is that real? Did that happen?
It's going to mean having new media habits, I think. It's going to mean
probably turning more to trusted sources. That sounds really nice, James.
But I mean, what we've seen is that people didn't question what was put in front of them
back when it was just other people lying.
And now we've got sophisticated machines lying.
And you're hoping that we go to trusted sources.
Well, I'm going to be really pedantic.
And I'm going to say that the machines aren't lying.
It's still other people lying, right?
Okay, fair.
The problem is the same. It is always people doing this for whatever reason they have.
The machines aren't doing this to us. We're doing it to ourselves.
And you're right that so far we've sort of failed these tests and now we're getting an even bigger hurdle and quite likely we're going to fail it too as well.
I've seen some people talking about the fact that, you know, most of human civilization has been conducted under conditions of mistrust in a way, right?
It's only a relatively small amount of time that we had these things called photographs,
which were quite hard to fake and meant that you could believe the output.
And it's only a smaller amount of time that we've had this thing called digital video.
And if you filmed it, it probably happened and it was hard to fake.
So we have survived with a greater level of mistrust in the past.
I'm sure we can do it again.
I wish your confidence were contagious.
I'm leaning towards
humanity had a good run, maybe
the machines will do better.
Well,
they're having fun at least.
We can have an AI-generated pope for the machines.
We can have AI Catholicism.
Amen.
James Vincent, Saint Vincent.
He's a senior reporter at The Verge.
And I wrote a book called Beyond Measure, which is a history of measurement,
which I promise you is much more exciting than it sounds.
Our show today was generated by Amanda Lewellyn.
It was edited by birthday boy, Matthew Collette.
It was fact-checked by Matthew and Avishay Artsy
and Siona Peterous.
And it was mixed by Paul Robert Mounsey.
It's Today Explained.
Be careful out there.
It's only a matter of time before the AI can
access the home Wi-Fi network.