How to Be a Better Human - How to stay grounded in an increasingly artificial world (from The TED AI Show)
Episode Date: May 27, 2024

Today, we’re sharing the first episode of the newest TED Audio Collective podcast – The TED AI Show. Now before you think, “wait, isn’t artificial intelligence the opposite of being human?”... know that we are wondering that too! That’s what’s nice about The TED AI Show. It asks: how is AI shaping human stuff? Join creative technologist Bilawal Sidhu as he sits down with Sam Gregory, a human rights activist and technologist, for some real talk on deepfakes, how AI is challenging our sense of what’s real and what’s fiction, and how to maintain our sense of self in this rapidly evolving world. We hope you enjoy this episode. We'll be back with more How to Be a Better Human next week. You can listen to The TED AI Show anywhere you get your podcasts.

Hosted on Acast. See acast.com/privacy for more information.
Transcript
Hi, everyone. Thanks for listening to How to Be a Better Human. This is Chris Duffy here.
And this week, instead of a new episode of our show, we are going to play an episode
of the newest podcast in the TED Audio Collective family, which is the TED AI Show.
Now, I know what you might be thinking. TED AI Show, that doesn't really sound like something
that's about humans at all. It's about technology, about artificial intelligence. Chris, isn't
AI maybe the opposite
of becoming a better human? And I got to say, I'm not sure that I disagree with you. It's a field
that I have a lot of skepticism about and a lot of fear. And also I'm interested in. I want to know
where does AI live up to the hype and where does it fall short? And I think the thing that is so
interesting about this new podcast is that they ask those questions too, right? They're very focused on how is AI shaping human stuff? How is it shaping human relationships
and human work and culture and art? What is it going to do to our lives in 10 years? Is this
really a thing that is going to change everything or is it overhyped? How do we use it in a way that
serves our best interests and is ethical? The host of TED AI, Bilawal Sidhu, is a creative technologist, and he sits down with people
involved in building this technology, with artists, with journalists, and with so many
more interesting, intelligent people to talk about the thrilling and often terrifying future
that our technologies are leading us into.
I hope that you enjoy this episode.
And if you do, you can find more episodes from the TED AI Show wherever you're listening
to this.
Either way, we will be back with new episodes of How to Be a Better Human next week.
And I also just want to say this is a real human voice.
You are hearing my actual human voice as I record this.
Who knows how long that will last?
I guess we will have to listen to the TED AI Show to find out.
OK, without any further ado, here is an episode of
TED AI. Enjoy. Okay, picture this. We're in Miami. There's sun, sand, palm trees, and a giant open
air mall right on the water. Pretty nice, right? But then, late one Monday night in January of this
year, things get weird.
Cop cars swarm the mall, like dozens of them.
I'm talking six city blocks shut down, lights flashing, people everywhere, and no one knows what's going on.
The news footage hits the internet, and naturally, the hive mind goes into overdrive.
Speculation and conspiracy theories are flying, and one idea takes hold: aliens. Folks are dissecting grainy helicopter footage in the comments,
zooming in, analyzing it frame by frame, to find any evidence of aliens. So I thought, I'm a TikToker.
What if I brought this online fever dream to life and shared it with the masses?
Using the latest AI tools, I created a video.
Towering shadowy figures silently materializing
amidst the flashing lights of the police cars.
An alien invasion in the middle
of Miami's Bayside marketplace.
Just a bit of fun, I thought.
Some got the joke, a Miami twist on Stranger Things.
But I watched as other people flocked into my comment section
to declare this as bona fide evidence that aliens do in fact exist.
Now you might be wondering, what actually happened?
A bunch of teenagers got into a fight at the mall,
the police showed up to break it up,
and that's all it took to trigger this mass
hysteria. It was too easy. Too easy to make people believe something happened that never actually
happened. And that's kind of terrifying. I'm Bilawal Sidhu, and this is the TED AI Show,
where we figure out how to live and thrive in a world where AI has changed everything.
So I've been making visual effects, blending realities on my computer since I was a kid.
I remember watching a show called Mega Movie Magic, which revealed the secrets behind movies' special effects.
I learned about CGI and practical effects in movies like Star Wars, Godzilla, and Independence Day.
I was already into computer graphics,
but seeing how they could create visuals indistinguishable from reality was a game changer.
It sparked a lifelong passion to blend the physical and digital worlds.
Several years ago, I started my TikTok channel.
I'd upload my own creations and share them with hundreds of thousands
and now millions of viewers.
I mean, just five years ago,
if I wanted to make a video of giant aliens
invading a mall in Miami,
it would have taken me a week
and at least five pieces of software.
But this aliens video,
it took me just a day to make
using tools like MidJourney,
RunwayML, and Adobe Premiere. Tools that anyone with a laptop can access.
Since ChatGPT came on the scene in late 2022, there's been a lot of talk about the Turing test,
where a human evaluator tries to figure out if the person at the other end of a text chat is a machine or another human. But what about the visual Turing test,
where machines can create images that are indistinguishable from reality?
And now OpenAI has come out with Sora, a video generation tool that will create impressively
lifelike video from a single text prompt. It's basically like ChatGPT or DALL-E, but instead of text or
images, it generates video. And don't get me wrong, there are other video generation tools out there,
but when I first saw Sora, the realism blew my socks off. I mean, with these other programs,
you can make short videos like a couple seconds long, but with Sora, we're talking minute-long
videos. The 3D consistency
with those long dynamic camera moves definitely impressed me. There's so much high-frequency
detail, and the scene is just brimming with life. And if we can just punch in a single text prompt
into Sora and it'll give us full-on video that's visually convincing to the point that some people
could mistake it for something real?
Well, you could imagine some of the problems that might stem from that.
So we're at a turning point.
Not only have we shattered the visual Turing test,
we're reshattering it every day.
Images, audio, video, 3D, the list goes on.
I mean, you've probably seen the headlines.
AI-generated nude photographs
of Taylor Swift circulating on Twitter.
A generated video of Volodymyr Zelensky surrendering to the Russian army.
A fraudster successfully impersonating a CFO on a video call
to scam a Hong Kong company out of tens of millions of dollars.
And as bad as the hoaxes and the fakes and the scams are,
there's a more insidious danger.
What if we stop believing anything we see?
Think about that.
Think about a future where you don't believe the news.
You don't trust the video evidence you see in court.
You're not even sure that the person
on the other end of the Zoom call is real.
This isn't some far-flung future.
In fact, I'd argue we're living in it now.
So given that we're in this new world, where we're constantly shattering and reshattering the visual Turing test, how do we protect our own sense of reality?
I reached out to Sam Gregory to talk me through what we're up against.
Sam is an expert on generative AI and misinformation, and is the executive director of the human rights network Witness.
His organization has been working with journalists, human rights advocates, and technologists to come up with solutions that help us separate the real from the fake.
Sam, thank you for joining us.
I have to ask you, as we're seeing these AI tools proliferate just over the last two years,
are we correspondingly seeing a massive uptick of these visual hoaxes?
The vast majority are still these shallow fakes because anyone can make a shallow fake.
It's trivially easy, right, just to take an image, grab it out of Google search and claim it's from another place. What we're seeing, though, is this uptick that's happening in, in a whole range
of ways people are using this generative media for, for deception. So you see images sometimes
deliberately shared to deceive people, right? Someone will share an image claiming it's, you
know, of an event that never happened. And then, you know, we're seeing a lot of audio because it's
so trivially easy to make, right? A few seconds of your voice and you can churn out
endless, endless cloned voice. We're not seeing so much video, right? And that's, you know,
a reflection that, you know, really doing complex video recreation is still not quite there, right?
Yeah, video is significantly harder, at least for the moment, and I personally hope that it would stay pretty hard for a while.
Though some of these generations are getting absolutely wild.
I had a bit of an existential moment looking at this one video from Sora.
It's the underwater diver video. There's a diver swimming underwater, you know, investigating this historic, almost archaeological spaceship that's crashed into the seabed.
And it looked absolutely real.
And I was thinking through what that would have taken for me to do the old fashioned way.
And I was just gasping that this was just a simple prompt that produced this immaculate one minute video.
I'm kind of curious, have you had such a
moment yourself? It's funny because I was literally showing that video to my colleagues and I didn't
cue them up that it was made with Sora because I wanted to see whether they clicked that it was
an AI generated video because I think it's a fascinating one. It's kind of on the edge of
possibility. There's definitely a kind of a moment that's happening now for me. And it's really interesting
because, you know, we first started working on this like five or six years ago and we were just
doing what we described as prepare, don't panic and really trying to puncture people's hype,
particularly around video deep fakes, because people kept implying that they were really easy
to do and that we were surrounded by them. And the reality was it wasn't
easy to fake, you know, convincing video and to do that at scale. So it's certainly for me, Sora has
been a click moment in terms of the possibility here, even though it feels like a black box and
I'm not quite sure how they've done it and how accessible this is actually going to be and how
quickly. So related to this, a lot of these visual hoaxes tend to be whimsical, even innocuous, right? In other words,
they don't cause serious harm in the real world and are almost akin to pranks. But some of these
visual hoaxes can be a lot more serious. Can you tell me a little bit about what you're seeing out
there? The most interesting examples right now are happening in election context globally,
and they're typically people
having words put in their mouths. In the recent elections in Pakistan, in Bangladesh, you had
candidates saying, boycott the vote or vote for the other party, right? And they're quite compelling
at a first glance, particularly if you're not very familiar with how AI can be used. And they're
often deployed right before an election. So those are clearly, in most cases, malicious.
They're designed to deceive.
And then you're also seeing ones
that are kind of these leaked conversation ones,
so they're not visual hoaxes.
And so you've got really, you know,
quite deceptive uses happening there,
either directly just with audio
or at the intersection of audio with animated faces
or audio with the ability to make a lip sync with a video.
If I wanted to ask you to zoom in on one single example that's disturbed you the most, something that exemplifies what you are the most worried about, what would it be?
I'm gonna pick one that is, uh, it's actually a whole genre, and I'm gonna describe this genre because I think it's
the one that people are familiar with. But once you start to think about it, you realize how easy it is to do this.
And that is pretty much everyone has seen Elon Musk selling a crypto scam, right?
Often paired up with a newscaster, your favorite newscaster, or your favorite political figure.
In every country in which I work, people have experienced that.
They've seen that video where it's like the newscaster says,
Hey, Elon, come on and explain how you follow this new crypto scam or come on, political candidate, and explain why you're investing in this crypto scam.
For anyone who hasn't seen it, these are just videos with a deepfake Elon Musk trying to guilt you into buying crypto as a part of their Bitcoin giveaway program.
And so the reason I point to that is not because it has massive human rights impacts or massive news impacts, but it's just, this is so commodified. But we
have this sort of bigger question of how it plays into our overarching understanding of what we
trust, right? Does this undermine people's confidence in almost any way in which they
experience audio or video or photos that they encounter online? Or does it just reinforce what they want to believe?
And for other people, does it just let them believe that nothing can be trusted?
We're going to take a quick break.
When we come back, we're going to talk with Sam about how we can train ourselves to better distinguish the real from the unreal using a little system he calls SIFT.
More on that in just a minute.
We're back with Sam Gregory of Witness. Before the break, we were talking about how these fake
videos are starting to erode our trust in everything we see. And yeah, maybe you can find flaws in a lot of these videos,
but some of them are really, really good.
And nobody's zooming in at 300%
looking for those minor imperfections,
especially when they're scrolling through a feed, right?
Like before their morning commute or something.
Yeah, and you're hitting on the thing that I think,
you know, the news media has often done a disservice
to people about how to think about spotting AI, right?
We put such an emphasis on kind of like, you know, you should have spotted the Pope, you know, had his ring finger on the wrong hand in that puffer jacket image, right?
Or didn't you see that his hair didn't look quite right on the hairline?
Or didn't you see he didn't blink at the regular rate?
And it's just so cruel almost to us as consumers to expect us to spot those things.
We don't do it.
I don't look at every TikTok video in my For You page and go, like, let me just look at
this really carefully and make sure if someone's trying to deceive me.
And so we've done a disservice often because people point out these glitches and then they
expect people to spot them.
And it creates this whole culture where we distrust everything we look at.
And we try and apply this sort of personal forensic skepticism
and it doesn't lead us to great places.
All right, I want to talk about mitigation.
How do we prepare and what can we do right now?
When we first started saying prepare, don't panic,
it was five or six years ago
and it was in the first deepfakes hype cycle,
which was like the 2018 elections when everyone was like, deepfakes are going to destroy the elections.
And I don't think there was a single deepfake in the 2018 US elections of any note.
Now, let's fast forward to now, right?
2024.
When we look around the world, the threat is clear and present now, and it's escalating.
So prepare is about acting, listening to the right voices, and thinking about how we balance out creativity, expression, human rights, and do that from a
global perspective because so much of this conversation often is also very US or Europe
centric. So what can we do now? The first part of it is who are we listening to about this?
And I often get frustrated in AI conversations because we get this very abstract discussion around AI harms and AI safety. And it feels very different from the conversation I'm
having with journalists and human rights defenders on the ground who are saying, I got targeted with
a non-consensual sexual deepfake. I got my footage dismissed as faked by a politician because he said
it could have been made by AI. So as we prepare, the first thing is who do we listen to, right? And we should listen to the people who actually are experiencing this.
And then we need to think, what is it that we need to help people understand how AI is being
used? This kind of question of the recipe. And I use the recipe analogy because I think we're not
in a world where it's AI or not. It's even in the photos we take on our iPhones, we're already
combining AI and human, right?
The human input, then the AI modifications
that make our photos look better.
So we need to think, how do we communicate
that AI was used in the media we make?
We need to show people how AI and human
were involved in the creation of a piece of media,
how it was edited and how it's distributed.
The second part of it is around access to detection. And the thing that we've seen is there's a huge gap in access to
the detection tools for the people who need it most, like journalists and election officials
and human rights defenders globally. And so they're kind of stuck. They get this piece of
video or an image and they are doing the same things that we're encouraging ordinary people
to do. Look for the glitches, you know, take a guess, drop it in an online detector.
And all of those things are as likely to give a false positive
or a false negative as they are to give a reliable result that you can explain.
So you've got those two things.
You've got an absence of transparency explaining the recipe.
You've got gaps in access to detection.
And neither of those will work well unless the whole of the AI pipeline
plays its part in making sure the signals of that authenticity and the ability to detect are retained all the way through.
So those are the three key things that we point to: transparency done right,
detection available to those who need it most, and the importance of having an AI pipeline where
the responsibility is shared across the whole AI industry.
I think you covered like three questions beautifully right here. So a key challenge is telling what content is generated by humans versus synthetically generated by machines.
And one of the efforts you're involved in is the appropriately named Content Authenticity
Initiative. Could you talk a bit about how does that play into a world where we will have fake content purporting to be real? Yes. So about five years ago, there were a couple
of initiatives founded by a mix of companies and media entities, and Witness joined those early on
to see how we could bring a human rights voice to them. And one of them was something called the
Content Authenticity Initiative that Adobe kicked off. And another was something called the Coalition
for Content Provenance and Authenticity. The shorthand for that is C2PA. So let me explain a little more about what C2PA is.
It's basically a technical standard for showing what we might describe as the provenance of an
image or a video or another piece of media. And provenance is basically the trail of how it was
created, right? This is a standard that's been increasingly adopted by platforms in the last couple of months, including Google and Meta, as a way they're
going to show to people how the media they encounter online, particularly AI-generated
or edited media, was made. It's also a direction that governments are moving in. Some key things
that we point to around standards like the C2PA is, you know, the first thing is they are not a
foolproof way of showing
whether something was made with AI,
or made by a human.
What I mean by that is they tell you information,
but, you know, we know that people
can remove that metadata, for example.
They can strip out the metadata.
And we also know that some people
may not add this in for a range of reasons.
So we're creating a system
that allows additional signals of trust or additional pieces of information, but no one confirmation of authenticity or reality.
I think that's really important that we be clear that this is, in some sense, a harm reduction approach.
It's a way to give people more information, but it's not going to be conclusive in a kind of sort of silver bullet like way.
And then the second sort of thing that we need to think about with these is, you know, we need to really make sure that this
is about the how of how media was made, not the who of who made it. Otherwise, we open a backdoor
to surveillance. We open a backdoor to the ways this will be used to target and criminalize
journalists and people who speak out against governments globally.
Beautifully said, especially the last point. I noticed Tim Sweeney had some interesting remarks about all of the content authenticity initiatives happening; he kind of described them as a sort of surveillance DRM, where you cannot upload a piece of content, right? Like if people like you aren't pushing on this
direction, we may well end up in a world where you cannot upload imagery onto the internet without
having your identity tied to it.
And I think that would be a scary world indeed.
The thing that we have consistently pushed back on in systems like C2PA is the idea that identity should be the center of how you're trusted online.
It's helpful, right? And many times I want people to know who I am. But if we start to premise trust online in individual identity as the center and require people to do that, that brings all kinds of risks that we already have a history of
understanding from social media, right? That's not to say we shouldn't think about things like
proof of personhood, right? Like, understanding that someone who created media was a human may be important, right, as we enter an AI-generated world. But that's not the same as knowing
that it was Sam who made it, not a generic human who made it, right? So I think that's really important.
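To make the provenance idea above a bit more concrete, here is a minimal, hypothetical sketch in Python. It is not the real C2PA or Content Authenticity Initiative tooling; the function names and manifest fields are invented for illustration. It just models the two points Sam makes: a provenance manifest records the "recipe" of how a piece of media was made and is bound to the file by a content hash, and a missing or stripped manifest means less information, not proof that something is fake.

```python
# Illustrative sketch only -- not the real C2PA API. It models a provenance
# manifest bound to a media file by a content hash, and shows why stripped
# metadata yields "no information" rather than "fake".
import hashlib
import json
from typing import Optional

def content_hash(media_bytes: bytes) -> str:
    """Hash the media bytes so the manifest can be bound to this exact file."""
    return hashlib.sha256(media_bytes).hexdigest()

def make_manifest(media_bytes: bytes, tool: str, edits: list[str]) -> str:
    """Record the 'recipe': what tool produced the file and what edits were applied."""
    manifest = {
        "claim_generator": tool,      # e.g. a camera app or an AI image tool
        "edits": edits,               # human + AI steps, in order
        "content_hash": content_hash(media_bytes),
    }
    return json.dumps(manifest)

def check_provenance(media_bytes: bytes, manifest_json: Optional[str]) -> str:
    """Return a trust *signal*, not a verdict."""
    if manifest_json is None:
        # Metadata stripped or never added: we simply know less, not that it's fake.
        return "no provenance available"
    manifest = json.loads(manifest_json)
    if manifest["content_hash"] != content_hash(media_bytes):
        return "manifest does not match this file (edited after it was recorded?)"
    return f"recipe: {manifest['claim_generator']}; edits: {', '.join(manifest['edits'])}"

if __name__ == "__main__":
    photo = b"...raw image bytes..."
    manifest = make_manifest(photo, tool="PhoneCamera 1.0", edits=["auto-HDR", "AI denoise"])
    print(check_provenance(photo, manifest))  # shows the recipe
    print(check_provenance(photo, None))      # stripped metadata -> unknown, not fake
```

A real C2PA manifest is also cryptographically signed and can chain multiple edits together; this sketch skips that and only illustrates the "how it was made, not who made it" framing.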
It's a slippery slope indeed, and really good point on sort of the distinction between
validating you're a human being versus, you know, validating you are Sam Gregory. That's a
very subtle but, you know, crucial distinction. Let's move over to fears and hopes.
You know, back in 2017, you felt the fears around deepfakes were overblown.
Clearly now it is far more of a clear and present danger.
Where do you stand now?
What are your hopes and fears at the moment?
So we've gone from a scenario in 2017 where the primary harm was the one that people didn't discuss. That was gender-based
violence. And the harm everyone discussed, political usage, was non-existent, to a scenario now where
the gender-based violence has got far worse, right? And targets everyone from public figures
to teenagers in schools all around the world. And the political usage is now very real.
And the third thing is you have people realizing there's this incredibly good excuse
for a piece of compromising media, which is just to say, hey, that was faked, or hey,
plausibly, I can deny that piece of media by saying that it was faked. And so those three
are the sort of the core fears that I experience now that have translated into reality. Now,
in terms of hopes, I don't think we've acted yet on those three core problems sufficiently, right?
We need to address those and we need to make sure that, you know, we criminalize the ways in which
people target primarily women with non-consensual sexual deepfakes, which are escalating. In the
second area of fears, which is the fears around their misuse in politics and to undermine news footage and
human rights content, I think that's where we need to lean into a lot of the approaches like
the authenticity and provenance infrastructures like the C2PA, the access to detection tools
for the journalists who need it most, and then smart laws that can help us rule out some
usages, right? And make sure that it is clear that some uses are unacceptable. And then the third
area, that's the hardest one, because we just don't have the research yet about what is the
impact of this constant sort of drip, drip, drip of you can't believe what you see and hear.
We can only reach an 84% probability that it's real or false, which
is not great for public confidence. But we also don't know how this plays into this broader societal
trust crisis we have, where already people want to lean into kind of almost plausible believability
on stuff they care about, or just plausibly ignoring anything that challenges those beliefs.
I think you brought up a really good point about
it's almost like the world is fracturing into the multiverse of madness, I like to call it,
where people are looking for whatever validation to sort of confirm their beliefs. At the same
time, it can result in people being jaded, right, where they're just going to be detached. Well,
I don't trust anything. And so I'm curious, how do you see consumers' behaviors changing in this world where the visual Turing test gets shattered over and over again for all sorts of different, more complex domains?
Are people going to get savvier?
What do you think is going to happen to society in such a world?
So we have to hope that we walk a fine line.
We're going to need to be more skeptical of audio and images and video
that we encounter online. But we're going to have to do that with a skepticism that's supported by
signals that help us. What I mean by that is, if we enter a world where we're just like, hey,
everyone, everything could be faked. It's getting better every day. Hey, look out for the glitch.
Then we enter a world where people's skepticism quite rightly will accelerate because all of us
will experience like on a daily basis being deceived, right? And I think that's very
legitimate for us to then feel like we can't trust anything. Right. In the ideal world,
everyone's labeling what's real or fake. But when that's not happening, what do people do?
I always go back to, you know, basic media literacy. I use an acronym called SIFT
that was invented by an academic called Mike Caulfield. And SIFT is S-I-F-T. S stands for
stop, right? Because it's basically stop before you're emotionally triggered, right? Whenever you
see something that's too good to be true. I stands for investigate the source, which is like,
who shared this? Is it someone I should
trust? The F stands for find alternative coverage, right? Did someone already write about this and
say, wait, that's not the Pope in a puffer jacket. In reality, that's an AI image. And then the fourth
part of that, which is getting complicated is T for trace the original, which used to always be
a great way of doing it in the shallow fake era, because you'd find that an image had been recycled, but it's getting harder now.
So when I look at the knife edge we've got to walk, it's to help people do SIFT
in an environment that is structured to give them better signals of how AI was used,
and where the law has set parameters about what is definitely not acceptable, and where all the
companies, all the players in that AI pipeline are playing their part
to make sure that we can see the recipe of how AI and human was used and that it's as
easy as possible to detect when AI was used to manipulate or create a piece of imagery,
audio or video.
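As an aside, here is a tiny, purely illustrative Python snippet that restates the SIFT checklist as data, just to keep the four steps in one place. SIFT is a habit for human readers, not an automated detector, and the wording of the prompts here is a paraphrase, not Mike Caulfield's.

```python
# A toy restatement of the SIFT heuristic described above -- illustrative only.
SIFT_STEPS = [
    ("Stop", "Pause before sharing; am I emotionally triggered by this?"),
    ("Investigate the source", "Who shared this? Do I have reason to trust them?"),
    ("Find alternative coverage", "Has anyone else reported or debunked this?"),
    ("Trace the original", "Can I find where this media first appeared?"),
]

def sift_prompts() -> None:
    """Print the four SIFT questions to walk through before trusting a post."""
    for name, question in SIFT_STEPS:
        print(f"{name}: {question}")

if __name__ == "__main__":
    sift_prompts()
```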
I really like SIFT.
I think that's also very good advice for people when they come across something that is indeed
too good to be true.
Very often we will be like, oh, well, that's interesting and go about our day.
The devices we use every day aren't foolproof, right?
They've got vulnerabilities.
There is this game of whack-a-mole that happens with patching those vulnerabilities.
And now we've got these cognitive vulnerabilities almost.
And, you know, on the detection side, the tools are going to need to keep improving because people are going to find ways to use the detectors to create new generators that evade them.
Right. And so that game of whack-a-mole will continue.
But that isn't to say that all hope is lost.
We can adapt and we can still have an information landscape where we can all thrive together.
That's the future I want.
The way we describe it at Witness,
we talk about fortifying the truth,
which is that we need to find ways to defend that there is a reality out there.
Thank you so much, Sam.
I will certainly sleep easier at night
knowing there are people like you out there
making sure we can tell the difference
between the real and unreal.
Thank you so much for joining us.
Sam Gregory and I had this conversation in mid-March, and a few days
later, there was another development. YouTube came out with a new rule. If you have AI-generated
content in your video and it's not obvious, you have to disclose that it's AI. This move from YouTube
is an important one, the kind Sam and his colleagues at Witness have been advocating for.
It shifts the onus onto creators and platforms and away from everyday viewers, because ultimately it's unfair to make
all of us become AI detectives scrutinizing every video for that missing shadow or impossible
physics, especially in a world where the visual Turing test is continually being shattered.
And look, I'm not going to sugarcoat this.
This is a huge problem, and it's going to be difficult for everyone.
Folks like Sam Gregory have their work cut out for them,
and massive organizations like TikTok, Google, and Meta do too.
But listen, I'm going to be back here this week,
and the week after that, and the week after that,
helping you figure out how to navigate this new world order,
how to live with AI, and yes, thrive with it too.
We'll be talking to researchers, artists, journalists, academics,
who can help us demystify the technology as it evolves.
Together, we're going to figure out how to navigate AI before it navigates us.
This is the TED AI Show.
I hope you'll join us.
The TED AI Show is a part of the TED Audio Collective and is produced by TED with Cosmic
Standard. Our producers are Ella Fetter and Sarah McRae. Our editors are Ben Bencheng and Alejandra
Salazar. Our showrunner is Ivana Tucker, and our associate producer is Ben Montoya.
Our engineer is Asia Pilar Simpson. Our technical director is Jacob Winnick,
and our executive producer is Eliza Smith. Our fact checker is Christian
Aparta. And I'm your host, Bilawal Sidhu. See y'all in the next one.