Y Combinator Startup Podcast - #109 - Samantha Bradshaw
Episode Date: January 23, 2019
Samantha Bradshaw is a researcher at the Computational Propaganda Project and a doctoral candidate at the Oxford Internet Institute. She’s been tracking the phenomenon of political manipulation through social media.
You can find Samantha on Twitter at @sbradshaww.
The YC podcast is hosted by Craig Cannon.
***
Topics:
0:53 - What is a bot?
2:53 - When computational propaganda began
3:53 - Changes in bot tactics since 2016
5:53 - Using bots for content creation
7:28 - WhatsApp and the upcoming Indian election
9:23 - Trends in computational propaganda
10:53 - How bots integrate into platforms
13:23 - Responsibilities of platforms to remove fake accounts
14:53 - The role of governments in media manipulation
18:18 - Fake news and selecting news that aligns with your beliefs
19:53 - Are platforms getting better or worse?
21:33 - Samantha's personal internet habits
23:03 - Sentiment around tracking in the UK vs the US
24:23 - The Mueller report and US midterms
29:18 - Canadian elections
30:18 - 2020 US elections
30:53 - Deepfakes
31:48 - Optimistic thoughts for the future
33:08 - How to help against computational propaganda
Transcript
Hey, how's it going? This is Craig Cannon, and you're listening to Y Combinator's podcast.
Today's episode is with Samantha Bradshaw. Samantha is a researcher at the Computational Propaganda
Project and a doctoral candidate at the Oxford Internet Institute. She's been tracking the
phenomenon of political manipulation through social media. You can find Samantha on Twitter at
@sbradshaww. All right, here we go. So I want to talk about two separate things. So one is
the responsibilities of these platforms and the responsibilities of governments.
And then the other thing is bots, because I think a lot of people hear the word bots,
but don't necessarily know what that means.
So maybe we should start there, and you can kind of contextualize what bots are actually doing right now.
And then we can talk about, you know, how these corporations are interacting with their users and with governments.
So what, how would you define a bot?
Yeah.
So, I mean, a bot is essentially just a script or a piece of code.
And I mean, bots can do a whole bunch of different kinds of things.
Let me start here.
So a bot, for example, could be a web scraper.
Like Google's search engine is essentially a bot.
And what it does is it goes out and it crawls the internet in an automated way
and then stores information about different pages.
And so like that's a really good bot that we need for the internet.
Because if Google wasn't able to scrape and crawl pages in an automated way,
it wouldn't be able to index them,
and we wouldn't have Google search.
Other bots, and I think the bots that are being talked a lot about in the media
and, you know, around all these questions about Russian foreign interference in elections
and manipulating the media, these are bots that people design,
and they're designed to mimic human behavior in an automated way.
So they might plug in to Twitter or into Facebook,
and then they might like, share,
retweet a whole bunch of different stories very quickly to give something a sense of
popularity. Like, ooh, lots of people are really engaging with this content. They might follow
people just to, again, sort of amplify that popularity, maybe around a particular person
rather than an idea. Some of the more sophisticated bots might actually interact with real
people. So they blend like chatbot technology. And, you know,
we all have been to those customer service pages where the thing pops up at the bottom and says,
Hi, can I help you today?
That's not actually a real person.
That's another kind of bot, but it uses a lot of natural language to interact with people.
And so that kind of technology can be plugged into some of the bots that we're seeing on social media platforms to actually respond to users in comment threads or post like a little comment about a story that's being shared.
Yeah.
And when did it become clear that
governmental agencies or non-governmental agencies were using bots to
manipulate thoughts around politics?
Yeah.
So I think this phenomenon really came into media and public attention during the 2016 election.
Yeah.
But from some of the research that I've done here at the OII and on the Computational Propaganda
Project, we know that these techniques have been experimented with by governments for a long
time. We had evidence of them going back to, you know, even the early days of social media
platforms and using these kinds of tools to shape discussions mainly at home and in more
authoritarian regime contexts as, you know, another tool of social control.
Okay. And so now that you've been studying this for two years. So you came on in 2016?
Yeah. Okay. And so then pre-election, you started doing this research,
and now there's a lot more attention paid to it.
How have bot tactics changed?
Yeah.
So it's been fascinating to study this from the, you know, sort of inception of attention to this issue.
Because even when I started this research agenda, you know, the whole focus was on Russian
interference in the U.S. election.
But, you know, I just had this inkling that it was a much broader phenomenon,
and, you know, there are all kinds of these digital techniques being used that, you know, make use of automation and algorithms and trying to game them to manipulate public opinion.
And so in terms of how I've seen this change over time, we're definitely seeing a lot of new entrants starting to experiment with this kind of technology,
and, you know, in particular around elections,
usually these new entrants might experiment with some of the more crude bots.
So things that might just like share and retweet certain kinds of stories or follow a politician.
So they don't really engage with users very much.
It's usually pretty easy to tell that these are bot accounts.
So it's trying to give the algorithm signal to show that something might be good, a person or a piece of content.
Exactly, exactly.
But they're not very sophisticated with them yet.
So we're seeing a lot of that evolve over time.
We're seeing a lot more gaming of the algorithms as well
and more of these kind of sophisticated techniques.
So not just liking and sharing,
but using specific keywords to get content trending,
to try to get things at the top of Google's search algorithm,
trying to get things at the top of YouTube
to be recommended next,
and trying to get some more of this organic reach,
which is what the search engine optimizers have been doing for decades.
We're just seeing these tools now being applied to politics.
And is it breaching into full-length content creation at this point?
We're definitely seeing content creation,
especially in more of the sophisticated operations.
I mean, a lot of the stuff that's come out about Russia's actions in the U.S.
have shown, you know, a lot of content creation.
It seems to just kind of be, you know, throwing stuff at a wall and seeing what sticks.
But, you know, they're putting a lot of resources into finding out exactly what the buttons are to push in society.
Okay.
And then creating content around those issues.
And so that must be, so is that what's happening with WhatsApp or are these individuals that are just creating content and then getting them, posting them to groups?
Yeah.
So I think it's a little bit of both.
On the one side, you do definitely have organized
actors who are involved in creating messages, figuring out what sticks and then spreading them.
In other cases, it's, you know, just individuals who might have a particular political
ideology. They just want to get messages to spread because, you know, that's what they believe in.
Other times, you know, there's a whole economic incentive behind getting content to go viral,
which is kind of what, you know, I talked about at the beginning.
If you can get people to look at your content on a website, you can generate advertising revenue, the more people that click through.
That's kind of the more pure form of using positive.
So can you contextualize how WhatsApp is being manipulated?
Because this is a very non-US trend.
Yeah.
Specifically with the Indian election coming up.
And this is another interesting point about disinformation, too, is that it's different everywhere you go,
because, you know, the people who want to manipulate public opinion,
they're going to go to where the people are and the platforms that the people are using.
So in the U.S., we tend to see a lot of these campaigns on Twitter and Facebook
because those are the platforms that the majority of people use.
In contexts like India, more people are on WhatsApp,
and they use that, you know, more than other platforms.
So that's where we're seeing a lot of these campaigns.
In terms of how WhatsApp is actually being used,
there's still not very much known about it from an academic
standpoint, just because of the nature of the platform. It's really hard to study because
it's essentially a closed platform. You can't actively monitor a lot of the communications
that happen on WhatsApp. There are certainly public groups that you can join and follow,
but that doesn't give you like a whole good look of, you know, what's happening on the platform.
And sometimes it can be hard to find the groups that you need to join to study them and stuff.
But, you know, overall, people, I mean, we looked at WhatsApp in Brazil, for example.
And in that study, we joined a bunch of different groups.
We were looking at the kinds of images and the kinds of conversations that were being shared.
And there were a lot of memes that were being used to push disinformation to people that were in those groups.
And so in that way, it's, you know, kind of similar to Facebook.
So it is kind of like the Reddit Pepe sort of stuff
we saw in 2016.
Exactly.
Yeah.
Yeah.
A lot of mobilizing images to get people riled up and, you know, getting them to feel
certain kinds of emotions.
Okay.
And so what are other creative forms that you've seen?
Because in your recap of what you saw in 2017, you said that there were something like
48 countries you found this being used in in 10 languages or something like that.
Yeah.
So, yeah, how else is it being done?
Yeah.
So we're definitely seeing a lot
more memes, a lot more bots on platforms, search engine optimization tactics. We're also seeing
more videos on YouTube and more pictures on Instagram. A lot of these platforms that haven't
necessarily been focused upon in the media, but they're very powerful platforms because
of the way that they deliver content. You know, images and video can have a much more powerful
effect on our psyche and how we, I guess, digest information.
And what we see, as opposed to what we read, tends to stick with us longer.
So if someone sees a piece of fake news, they're more likely to actively remember
it than if they read it.
So we're seeing a lot more disinformation on these kinds of platforms.
And it's not very well studied yet, but certainly needs to be.
Gotcha.
Yeah.
In terms of how bots actually work, could you break down how they are entering in these platforms at all?
I think many people are curious about like, you know, you hear bots, but then you think about, well, when I sign up for Facebook, like, I have to give all this personal information.
So how are these systems engineered to infiltrate platforms with such scale?
And further, like, do they even need that degree of scale to be effective?
Right.
So I think when we are talking about the platforms and how bots integrate, there's quite a big difference between even Facebook and Twitter.
So starting with Twitter, for example, it's relatively easy to plug into the API, to scrape information.
And to create an account on Twitter, you also don't need a real name.
So you can use any kind of fake identity to create an account, which is why
Twitter tends to have a lot more of these fake accounts that use some kind of automation.
There are also a bunch of tools that allow you to, you know, automate your activity on Twitter,
things like HootSuite and whatnot.
So you can set timers on tweets and things like that.
Yeah.
And that was a signal you guys used, right?
It was like over 50 posts a day, therefore bot or more likely to be a bot?
Exactly.
That was when we first started looking at this phenomenon back in 2015, you know, we just set a
very crude measure of, you know, over 50 tweets a day, you're probably an automated account
because most people don't, you know, write that many tweets during a day. Maybe like if they're
at a conference, yeah, or if they have nothing else to do, but that's still quite a large number of tweets.
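The crude measure described here (more than 50 tweets a day suggests automation) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual code: the function name `flag_likely_automated`, the `(account, day)` input shape, and the choice to flag an account if it exceeds the threshold on any single day are all hypothetical.

```python
from collections import Counter
from datetime import date

# Threshold taken from the conversation: more than 50 tweets in a day.
TWEETS_PER_DAY_THRESHOLD = 50


def flag_likely_automated(tweets, threshold=TWEETS_PER_DAY_THRESHOLD):
    """Return the set of accounts that exceed `threshold` tweets on any one day.

    `tweets` is an iterable of (account, day) pairs, one pair per tweet.
    Flagging on "any single day" is an assumption made for this sketch.
    """
    # Count how many tweets each account posted on each day.
    per_account_day = Counter(tweets)
    return {
        account
        for (account, _day), count in per_account_day.items()
        if count > threshold
    }


# Hypothetical sample data: acct_a posts 51 times in a day, acct_b exactly 50.
sample = (
    [("acct_a", date(2016, 11, 1))] * 51
    + [("acct_b", date(2016, 11, 1))] * 50
)
likely_bots = flag_likely_automated(sample)
```

As the conversation notes, a heavy human user (someone live-tweeting a conference, say) could also cross this line, so a rule like this only marks accounts as more likely to be automated.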
So that's Twitter. Whereas Facebook, you still need to have a real name to create an account. Sometimes
Facebook actually verifies your identity. I know this because I actually set up a Facebook
account once. And, you know, this was a while ago. And I just wanted to log into it recently. And I tried. But they wanted me to send them a piece of, like, ID, so a passport or driver's license, things like that, to verify that this account was real. And it's like, oh, I don't have anything with a fake name on it. So whatever. I'll just leave my fake account alone. It can die. But that also means that because it's so hard to create fake accounts on Facebook,
the accounts that are fake might actually be a little bit more powerful because people don't expect there to be as many fake people on Facebook as they do on Twitter.
And so they might actually have more of an impact in the circles and communities that they've managed to infiltrate.
Yeah.
And so when it comes to the responsibility of a platform to get rid of fake accounts, where do you think it lies?
Like should they do anything?
Or I think the consensus is they should.
But what do you think?
I definitely think they should.
I think for a long time there hasn't been an incentive for them to
because the more active accounts there are on these platforms,
the more they're valued on the market, right?
Because all of a sudden there's this huge user base of
people that these platforms can advertise to
and sell that advertising space to.
But as soon as you start saying,
oh, millions of people are not actually real,
they become devalued, right?
So I think for a long time,
there wasn't this incentive
to actually go through and delete all the fake accounts.
Yeah, so basically the public market
thinks more users equals more money,
therefore good, keep going.
But then on the advertiser side,
if, you know, those views are from bots, you're also not getting what you want.
Exactly. And I think this is where it has sort of flipped now because advertisers are starting
to realize, actually, these aren't real people. So why am I paying this much money to advertise
to, you know, a piece of code that's not going to buy my product? Yeah. And so now that
these Senate hearings are happening, and they're happening all over the world, you're starting to
see governments get involved. I know it happened in Germany around like a certain degree of
censorship. What do you think is the right course of action? Yeah. So I think this is a really
complicated problem that has so many dimensions to it. It's not just, you know, fake news or
that kind of disinformation. You know, we've talked about the accounts and how those are
problematic. We talked about foreign interference and like these really coordinated campaigns
against governments. There are so many different issues that are kind of connected in this,
you know, media manipulation, social media manipulation bucket. When governments go after the
content of what's being shared, I think that's a mistake. I don't think that's getting to the
underlying problem that's sort of fueling the fake content
or the disinformation to spread or to go viral in the first place.
So things like NetzDG, for example, when it was first introduced in Germany in, I think, 2017,
early 2017, someone from the AfD had posted some, you know, horribly racist comment online.
And, of course, it got taken down immediately because NetzDG essentially says Facebook or any platform
has to remove content that breaks German law within 24 hours,
or else they'll face like a 50 million euro fine.
So because this broke the hate speech law,
Facebook removed it immediately.
But they also removed all of the content
that was created around that tweet,
all of the people that were calling this person out
for making such a racist comment,
all of the criticism on it.
And so all of a sudden you start to lose the vibrancy of this,
you know,
online political sphere.
Yeah.
So I don't think going after the content is necessarily a good idea.
We're already also seeing authoritarian governments adopt this law
into their own legal systems to silence dissent and to go after journalists who are
publishing so-called fake news and whatnot.
So I think it has a lot of unintended consequences and a lot more negative consequences and, you know,
things like collateral censorship.
Yeah.
In terms of what could be done, I think, you know,
enforcing more transparency around the platforms
and their operations and their algorithms is really important.
Right now, all of these, you know, platforms,
they're just black boxes.
And we don't understand anything about how these algorithms work
and how they're tailored to deliver certain kinds of content.
Yeah.
You know, complete transparency is also not great
because if you perfectly understand what goes viral on Google, everyone will then know how to break it, right?
But at least, you know, maybe understanding the intentions of the designer and, like, tracing those kinds of processes and those meetings and having those kinds of principles be more out in the public would be one way of starting to get that kind of insight into what's happening with these algorithms and in these kind of more closed
black boxes.
But what about in a world where, say, we could get rid of all fake news, how do you avoid the
problem of us only wanting to see what we agree with?
Yeah.
I mean, and this is also part of the problem too, because, you know, fake news is not just a
digital problem, but it's one of human nature.
And there's a lot of research out there in academia that shows people do select news
and information that adheres to their own beliefs, right?
It's our, it's the selection effect.
So that's where things like education and stuff come in, you know.
And I know education is often used as the go-to, you know, solution to the problem.
But it does have a really important role here in teaching us how to be good citizens
and reminding us why democracy is important and why it's important to look at different sources
and to be able to negotiate that public consensus with one another.
You know, the system's not perfect, but I think we've been focusing on how much it's been broken for so long that, you know, the positives have sort of been lost in a lot of the conversations.
Yeah, I think we were talking about this before, but I think the lack of optimism in online communication misses the real optimism that exists in day-to-day life.
And, yeah, there's not really been a place for that yet,
or at least it doesn't do that well unless it's incredibly cheesy,
kind of like life inspirational, life hacks, that's what I think.
So in terms of these platforms in the long run, based on your research,
do you think this problem is getting, things are getting better,
or do you think that they just, you know, combust with enough, you know, bots involved?
I think we're already starting to see combustion happen, especially with Facebook.
You know, a lot of my friends this year are starting to get off these platforms and to deplatform,
which might open up the market to new kinds of models that, you know,
aren't based on advertising and, you know, aren't completely monetized by digital advertising.
Maybe we'll start to see different kinds of business models pop up and maybe create a little bit more space in the market.
Right now, it's just so consolidated, too.
Well, that's a separate, I mean, is that the breaking up of Facebook, WhatsApp, Instagram?
Is that a topic of conversation here at the OII as well?
I mean, it's something that I think we all tangentially are thinking about and what that would look like and how that would actually be done.
You know, I'm not an economist or an expert on any of this.
But, you know, I do think that the consolidation of these platforms and the fact that there's no space
for competition,
like that's kind of locked us into these systems, which has then made the problem
so much worse because now all of a sudden it's everyone that is being affected, as opposed
to, you know, just the smaller...
Right.
So you have to be 100% in or 100% out, which is tricky.
Yeah, exactly.
And so what have you, what have you been doing in terms of your personal internet habits to,
I mean, you haven't checked out completely, obviously.
I saw you have a Twitter account.
Like you're on.
Yeah.
So what do you do?
Yeah.
I mean, I keep using the excuse that, well, I study this.
So I need to be on it to see, you know, what's happening.
I might, you know, pick up on some change that I wouldn't have been able to actually see or understand if I wasn't on the platform.
But I think I'm just a little bit more conscious around, you know, the kinds of information that I'm putting out on these platforms.
I'm trying to be a little bit more conscious around,
like, apps on my mobile phone and, you know, making sure that I'm restricting access to
certain kinds of things, just practicing a lot more digital privacy and better habits around
that.
Okay.
Anything weird and ultra fringe or just kind of basic good practice?
I think I'm pretty, pretty basic when it comes to this stuff.
Yeah.
I haven't, you know, totally got the tinfoil hats on and, you know, disconnecting everything in my
life because I'm worried about these issues to that extent. But I do think that, you know,
especially these issues around privacy and data collection and how that relates to disinformation
and, you know, even targeting messages based on the data about me. Like, I think that these are
huge, that's another huge bucket of problems. Yeah. And the climate here, do you feel it's the
same as it is in the States? Because the UK's had CCTV everywhere
for quite a while now. Are people more comfortable with that tracking or less comfortable or the same?
I think it's probably, I think they're probably the same.
Okay.
You know, a lot of my British friends are still outraged when they find out that, you know,
private companies have been doing X, Y, and Z with their data.
I think, you know, the difference around what private companies are doing and, you know,
what governments do around surveillance, because everyone kind of knows what the government does.
There are all kinds of laws that have been passed and debated that's a little bit more transparent
in terms of what they're collecting, or at least they give the impression, you know, of transparency around that.
There's more of a public discussion, whereas I think there's still a shock factor around the fact that Facebook was, you know, storing private phone calls and private messages and, you know, selling that data to other marketers
without, you know, the users knowing. I think the shock factor still exists in that space.
Okay.
So to go back to the States, just a little bit.
I know this is happening everywhere, but, yeah, I can be a little America-centric or U.S.-centric.
What are your thoughts on what will come of the Mueller report?
Yeah.
I don't know, to be honest.
I hope... I'm happy that the Mueller report at least brought to public attention
a lot of the detail around what the IRA, the Internet Research Agency, was doing in the U.S.
And that was a very credible source of information around, you know, the wide range of techniques,
not just the digital, but, you know, even the real-world activities that were going on.
Whether or not it will lead to any outcome, I'm not sure.
But the fact that it has at least made this information public,
I think that's already like a win for me.
And so, you know, whatever comes out in the future,
like, you know, at least we have more information
and more knowledge around the investigation.
Okay.
And then in terms of the midterms,
have you guys crunched those numbers yet?
Yeah, so when we looked at the midterms, actually, let me go back, because we studied the 2016 elections and we studied the midterms.
And in both these studies, we were looking at what people were sharing as political information and news online on both Twitter and Facebook.
In 2016, we found that Americans on average were sharing about a one-to-one ratio of junk news to professionally produced information.
Junk news meaning fake or just not high quality?
Not high quality.
We have a five point definition that we tend to check off and you need to have like four out of the five things.
So things like counterfeit, is it mimicking a real legitimate news source?
Is it using like the Washington Post font or the BBC colors to give more credibility?
Things like do they adhere to any kind of journalistic standards?
Do they publish corrections, the kind of language that they use?
Are they using a lot of F-bombs and really, you know, hyperbolic language?
So things like that.
So it's not just fake news, but it's all of this other kind of low-quality information
that isn't necessarily helping our democracy.
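The checklist approach described here (a source counts as junk news when it meets about four of five criteria) can be sketched as a simple scoring function. This is a hedged illustration only: the function name `is_junk_news` and the dictionary keys are hypothetical, only three of the five criteria are named in the conversation, the remaining two are left as unnamed placeholders, and treating "four out of five" as "at least four" is an assumption.

```python
# Numbers taken from the conversation: five criteria, roughly four must apply.
TOTAL_CRITERIA = 5
REQUIRED_MATCHES = 4


def is_junk_news(criteria_met):
    """Classify a source from a mapping of criterion name -> bool.

    Returns True when at least REQUIRED_MATCHES criteria are met.
    The "at least" reading is an assumption made for this sketch.
    """
    if len(criteria_met) != TOTAL_CRITERIA:
        raise ValueError(f"expected {TOTAL_CRITERIA} criteria, got {len(criteria_met)}")
    return sum(bool(met) for met in criteria_met.values()) >= REQUIRED_MATCHES


# Hypothetical source. The first three keys paraphrase criteria mentioned in
# the conversation; the last two are placeholders for the unnamed criteria.
example_source = {
    "counterfeit_branding": True,          # mimics a legitimate outlet's look
    "lacks_journalistic_standards": True,  # e.g. publishes no corrections
    "hyperbolic_language": True,           # profanity, exaggerated wording
    "unnamed_criterion_4": True,
    "unnamed_criterion_5": False,
}
junk = is_junk_news(example_source)
```

Requiring several criteria at once, rather than any single one, reflects the point made in the conversation that the category covers low-quality information broadly and not just fabricated stories.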
So, yeah, in 2016, users were sharing junk news at a one-to-one ratio
to professionally produced information.
In 2018, when we redid this analysis during the midterm, the ratio of junk news actually went up.
And so it went to like 1.2 or 1.3 to 1.
So that was...
With the same amount of users?
So we controlled for the number of users.
And we also in 2016 did a study on the swing states.
And, you know, in actual swing states, people were sharing more junk news compared
to uncontested ones.
We haven't done that same analysis now for 2018 midterms, but I think it would be interesting to see if it's also geographically distributed in terms of who's sharing what.
The assumption would be that these people are more heavily targeted in terms of botnets, right?
Exactly.
Yeah.
Yeah.
I mean, it's hard to say exactly for sure who or what is sharing that information, but we can see that, you know, by number it's more concentrated in certain places
than in others.
Yeah, because I've been curious to see whether, like, the net number is just going to decline.
So you could still find instances of fake news being shared, but more people become more
skeptical and aren't checking it at all.
Did you find that in the midterm study or is that just not part of it?
So we didn't go into people's, you know, emotions or how they felt about sharing this
kind of information.
I think what we would have expected to see, too, was that it's declining,
because the platforms have been saying over and over that they've been taking steps to reduce the kind of disinformation and misinformation being spread.
And it's not.
And it's not.
You know, we've done this study in other countries as well.
So, you know, during the U.K. elections, in Germany and France and Sweden, Mexico, all of those countries have much lower levels of junk news to professionally produced news shares than the U.S.
So the U.S. is definitely a dramatic case here,
but it's still interesting nonetheless.
And what about Canada?
You said there's something coming up this year?
Yeah, well, we'd love to study the Canadian elections in 2019.
You know, I'm Canadian myself, so I'd be really intrigued just to see what's going on.
You know, Canada has always prided itself on being a very inclusive country.
And, you know, a lot of the junk news that we see in the U.S.
uses a lot of anti-immigration rhetoric and things like that.
So just out of personal interest, you know, I think I'm worried, I guess. I don't think Canada's
necessarily immune from those kinds of conversations.
And I'm already starting to see some of the kind of populist narratives appear in my own
newsfeed and in my own communities of friends.
So I think it will be a really interesting case study.
Okay.
Yeah, because I imagine you see it here now with Brexit too, right?
Oh, yeah, definitely, definitely.
Despite the vote already having happened, it's still common, right?
Yeah, oh, it's still a thing.
I think it will continue to be part of the political rhetoric in the UK for years to come.
And so with the U.S. in 2020, all signs point to this increasing?
I would think so.
The fact that we saw an increase already in 2018, you know, I don't think the platforms are going to be able to get their act together in time for 2020. And, you know, the U.S. election is where we see a lot of new innovation in these, right, you know, manipulation techniques because millions and millions of dollars go into these campaign media strategies. So there's a lot of money to play around, to experiment, to innovate. So it's going to be an interesting.
Interesting.
This is terrifying.
So do you think 2020 or 2024 will be the year of deep fakes?
I don't know.
I think, you know, there's a lot of hype about deepfakes right now.
I don't know how real it's actually going to be.
And, you know, we're already seeing, you know, a lot of the research agencies like DARPA and things like that work on being able to detect
when photos and videos have been manipulated.
So I'm a little bit more optimistic that deepfakes will not become a thing.
Maybe in like low literacy media environments, there might be, you know, more, it might have more of an impact.
But I like to remain optimistic.
Yeah, I was going to say, like, so closing up, what are your optimistic thoughts for the future?
My optimistic thoughts for the future.
Geez, I don't know if I have any today.
I've just been ranting about all of the problems and, you know, reminding myself after the Christmas break why I really care about these things.
Are there any signs of things improving?
You know, I like to think that they are.
I think a lot of governments are seriously thinking about this problem.
You know, there are a couple of people that are really educating themselves on
the issues that are at the intersection of technology and politics and society.
You know, I think a lot of the time policymakers make laws without necessarily
understanding the technology. And, you know, there's a big gap there. But it does
make me feel more optimistic when I see, you know, like Senator Warren or here in the UK,
we have Damian Collins. They're really on top of their game and taking this in and
thinking really seriously and deeply about what good regulation could look like, regulation that's not going
to just break the technology. So, you know, the fact that there is more energy around government
regulation and proper government regulation, that makes me feel a little bit more optimistic.
So if someone wanted to study what you're studying or try and help this cause, what would you
tell them to do? I would just say be nice to each other on the internet, honestly.
Because I think there's just so much anger in society right now.
You know, we're seeing more and more polarization, like especially in the U.S.
You know, that gap has been widening and widening for the past 20 years.
And I think we just need to remember that we're all humans at the end of the day.
We might have different beliefs, but that doesn't make us, you know, evil or wrong or terrible people.
We need to just be nice to each other and learn to talk to each other again.
That's a great point.
Yeah.
So checking out isn't
necessarily a net positive for the community.
Exactly.
Yeah.
Cool.
All right.
Thanks so much, Sam.
Thank you.
All right.
Thanks for listening.
So as always, you can find the transcript and the video at blog.ycombinator.com.
And if you have a second, it would be awesome to give us a rating and review wherever you find your podcasts.
See you next time.
