Y Combinator Startup Podcast - #109 - Samantha Bradshaw

Starting point is 00:00:00 Hey, how's it going? This is Craig Cannon, and you're listening to Y Combinator's podcast. Today's episode is with Samantha Bradshaw. Samantha is a researcher at the Computational Propaganda Project and a doctoral candidate at the Oxford Internet Institute. She's been tracking the phenomenon of political manipulation through social media. You can find Samantha on Twitter at S Bradshaw W. All right, here we go. So I want to talk about two separate things. So one is the responsibilities of these platforms and the responsibilities of governments. And then the other thing is bots, because I think a lot of people hear the word bots, but don't necessarily know what that means.

Starting point is 00:00:39 So maybe we should start there, and you can kind of contextualize what bots are actually doing right now. And then we can talk about, you know, how these corporations are interacting with their users and with governments. So what, how would you define a bot? Yeah. So, I mean, a bot is essentially just a script or a piece of code. And I mean, bots can do a whole bunch of different kinds of things. Let me start here. So a bot, for example, could be a web scraper.

Starting point is 00:01:05 Like Google's search engine is essentially a bot. And what it does is it goes out and it crawls the internet in an automated way and then stores information about different pages. And so like that's a really good bot that we need for the internet. Because if Google wasn't able to scrape and crawl pages in an automated way, it wouldn't be able to index them, and we wouldn't have Google search. Other bots, and I think the bots that are being talked a lot about in the media

Starting point is 00:01:34 and, you know, around all these questions about Russian foreign interference in elections and manipulating the media, these are bots that people design, and they're designed to mimic human behavior in an automated way. So they might plug in to Twitter or into Facebook, and then they might like, share, retweet a whole bunch of different stories very quickly to give something a sense of popularity. Like, ooh, lots of people are really engaging with this content. They might follow people just to, again, sort of amplify that popularity, maybe around a particular person

Starting point is 00:02:14 rather than an idea. Some of the more sophisticated bots might actually interact with real people. So they blend like chatbot technology. And, you know, we all have been to those customer service pages where the thing pops up at the bottom and says, Hi, can I help you today? That's not actually a real person. That's another kind of bot, but it uses a lot of natural language to interact with people. And so that kind of technology can be plugged into some of the bots that we're seeing on social media platforms to actually respond to users in comment threads or post like a little comment about a story that's being shared. Yeah.

Starting point is 00:02:54 And when did it become clear that governmental agencies or non-governmental agencies were using bots to manipulate thoughts around politics? Yeah. So I think this phenomenon really came into media and public attention during the 2016 election. Yeah. But from some of the research that I've done here at the OII and on the Computational Propaganda Project, we know that these techniques have been experimented with by governments for a long

Starting point is 00:03:23 time. We had evidence of them going back to, you know, even the early days of social media platforms and using these kinds of tools to shape discussions mainly at home and in more authoritarian regime contexts as, you know, another tool of social control. Okay. And so now that you've been studying this for two years. So you came on in 2016? Yeah. Okay. And so then pre-election, you started doing this research, and now there's a lot more attention paid to it. How have bot tactics changed? Yeah.

Starting point is 00:04:00 So it's been fascinating to study this from the, you know, sort of inception of attention to this issue. Because even when I started this research agenda, you know, the whole focus was on Russian interference in the U.S. election. But, you know, I just had this inkling that it was a much broader phenomenon, and, you know, there's all kinds of these digital techniques being used that, you know, make use of automation and algorithms and trying to game them to manipulate public opinion. And so in terms of how I've seen this change over time, we're definitely seeing a lot of new entrants starting to experiment with this kind of technology, and, you know, in particular around elections, usually these new entrants might experiment with some of the more crude bots.

Starting point is 00:04:57 So things that might just like share and retweet certain kinds of stories or follow a politician. So they don't really engage with users very much. It's usually pretty easy to tell that these are bot accounts. So it's trying to give the algorithm signal to show that something might be good, a person or a piece of content. Exactly, exactly. But they're not very sophisticated with them yet. So we're seeing a lot of that evolve over time. We're seeing a lot more gaming of the algorithms as well

Starting point is 00:05:28 and more of these kind of sophisticated techniques. So not just liking and sharing, but using specific keywords to get content trending, to try to get things at the top of Google's search algorithm, trying to get things at the top of YouTube to be recommended next, and trying to get some more of this organic reach and what the search engine optimizers have been doing for decades.

Starting point is 00:05:55 We're just seeing these tools now being applied to politics. And is it breaching into full-length content creation at this point? We're definitely seeing content creation, especially in more of the sophisticated operations. I mean, a lot of the stuff that's come out about Russia's actions in the U.S. have shown, you know, a lot of content creation. It seems to just kind of be, you know, throwing stuff at a wall and seeing what sticks. But, you know, they're putting a lot of resources into finding out exactly what the buttons are to push in society.

Starting point is 00:06:29 Okay. And then creating content around those issues. And so that must be, so is that what's happening with WhatsApp or are these individuals that are just creating content and then getting them, posting them to groups? Yeah. So I think it's a little bit of both. On the one side, you do definitely have organized actors who are involved in creating messages, figuring out what sticks and then spreading them. In other cases, it's, you know, just individuals who might have a particular political

Starting point is 00:06:56 ideology. They just want to get messages to spread because, you know, that's what they believe in. Other times, you know, there's a whole economic incentive behind getting content to go viral, which is kind of what, you know, I talked about at the beginning. If you can get people to look at your content on a website, you can generate advertising revenue, the more people that click through. That's kind of the more pure form of using positive. So can you contextualize how WhatsApp is being manipulated? Because this is a very non-US trend. Yeah.

Starting point is 00:07:32 Specifically with the Indian election coming up. And this is another interesting point about disinformation, too, is that it's different everywhere you go. because, you know, the people who want to manipulate public opinion, they're going to go to where the people are and the platforms that the people are using. So in the U.S., we tend to see a lot of these campaigns on Twitter and Facebook because that's the platforms that the majority of people use. In contexts like India, more people are on WhatsApp, and they use that, you know, more than other platforms.

Starting point is 00:08:01 So that's where we're seeing a lot of these campaigns. In terms of how WhatsApp is actually being used, there's still not very much known about it from an academic standpoint, just because of the nature of the platform, it's really hard to study because it's essentially a closed platform. You can't actively monitor a lot of the communications that happen on WhatsApp. There are certainly public groups that you can join and follow, but that doesn't give you like a whole good look of, you know, what's happening on the platform. And sometimes it can be hard to find the groups that you need to join to study them and stuff.

Starting point is 00:08:39 But, you know, overall, people, I mean, we looked at WhatsApp in Brazil, for example. And in that study, we joined a bunch of different groups. We're looking at the kinds of images and the kinds of conversations that were being shared. And there were a lot of memes that were being used to push disinformation to people that were in those groups. And so in that way, it's, you know, kind of similar to Facebook. So it is kind of like the Reddit Pepe sort of stuff we saw in 2016. Exactly.

Starting point is 00:09:11 Yeah. Yeah. A lot of mobilizing images to get people riled up and, you know, getting them to feel certain kinds of emotions. Okay. And so what are other creative forms that you've seen? Because in your recap of what you saw in 2017, you said that there were something like 48 countries you found this being used in in 10 languages or something like that.

Starting point is 00:09:33 Yeah. So, yeah, how else is it being done? Yeah. So we're definitely seeing a lot more memes, a lot more bots on platforms, search engine optimization tactics. We're also seeing more videos on YouTube and more pictures on Instagram. A lot of these platforms that haven't necessarily been focused upon in the media, but they're very powerful platforms because of the way that they deliver content. You know, images and video can have a much more powerful

Starting point is 00:10:04 effect on our psyche and how we, I guess, digest information. And what we see opposed to what we read tends to stick with us longer. So if someone was more likely to see a piece of fake news, they'll more actively remember it than if they read it. So we're seeing a lot more disinformation on these kinds of platforms. And it's not very well studied yet, but certainly needs to be. Gotcha. Yeah.

Starting point is 00:10:33 In terms of how bots actually work, could you break down how they are entering in these platforms at all? I think many people are curious about like, you know, you hear bots, but then you think about, well, when I sign up for Facebook, like, I have to give all this personal information. So how are these systems engineered to infiltrate platforms with such scale? And further, like, do they even need that degree of scale to be effective? Right. So I think when we are talking about the platforms and how bots integrate, there's quite a big difference between even Facebook and Twitter. So starting with Twitter, for example, it's relatively easy to plug into the API, to scrape information. And to create an account on Twitter, you also don't need a real name.

Starting point is 00:11:23 So you can use any kind of fake identity to create an account, which is why Twitter tends to have a lot more of these fake accounts that use some kind of automation. There are also a bunch of tools that allow you to, you know, automate your activity on Twitter, things like HootSuite and whatnot. So you can set timers on tweets and things like that. Yeah. And that was a signal you guys used, right? It was like over 50 posts a day, therefore bot or more likely to be a bot?

Starting point is 00:11:53 Exactly. That was when we first started looking at this phenomenon back in 2015, you know, we just set a very crude measure of, you know, over 50 tweets a day, you're probably an automated account because most people don't, you know, write that many tweets during a day. Maybe like if they're at a conference, yeah, or if nothing else to do, but that's still quite a large number of tweets. So that's Twitter. Whereas Facebook, you still need to have a real name to create an account. Sometimes Facebook actually verifies your identity. I know this because I actually set up a Facebook account once. And, you know, this is a while ago. And I just wanted to log into it recently. And I tried. But they wanted me to send them a piece of like ID. So like a passport or driver's license, things like that. To verify that this account was real. And it's like, oh, I don't have anything with a fake name on it. So whatever. I'll just leave my fake account alone. It can die. But that also means that because it's so hard to create fake accounts on Facebook,

Starting point is 00:12:59 the accounts that are fake might actually be a little bit more powerful because people don't expect there to be as many fake people on Facebook as they do Twitter. And so they might actually have more an impact in the circles and communities that they've managed to infiltrate. Yeah. And so when it comes to the responsibility of a platform to get rid of fake accounts, where do you think it lies? Like should they do anything? Or I think the consensus is they should. But what do you think? I definitely think they should.
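The crude measure described in this exchange (more than 50 tweets a day suggests automation) can be sketched in a few lines. This is only an illustration of the idea as stated in the conversation, not the project's actual tooling: the function name, the input shape of (account, timestamp) pairs, and the choice to flag an account if it crosses the threshold on any single day are all assumptions.

```python
from collections import Counter
from datetime import datetime

# Cutoff taken from the conversation: "over 50 tweets a day".
TWEETS_PER_DAY_THRESHOLD = 50


def flag_high_volume_accounts(tweets, threshold=TWEETS_PER_DAY_THRESHOLD):
    """Return the set of accounts that posted more than `threshold` tweets on any one day.

    `tweets` is an iterable of (account_name, datetime) pairs. This input
    format is a hypothetical one chosen for the sketch.
    """
    # Count tweets per (account, calendar day).
    per_day = Counter((account, ts.date()) for account, ts in tweets)
    # Keep every account that exceeded the threshold on at least one day.
    return {account for (account, _day), count in per_day.items() if count > threshold}
```

A heuristic like this will also catch very active humans (the conference example in the conversation), so it is best read as a rough first filter rather than a classifier.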

Starting point is 00:13:35 I think for a long time there hasn't been an incentive for them to because the more active accounts there are on these platforms, the more they're valued on the market, right? Because all of a sudden there's this huge user base of people, people that these platforms can advertise to and sell that advertising space to. But as soon as you start saying, oh, millions of people are not actually real,

Starting point is 00:14:03 they become devalued, right? So I think for a long time, there wasn't this incentive to actually go through and delete all the fake accounts. Yeah, so basically the public market thinks more users equals more money, therefore good, keep going. But then on the advertiser side,

Starting point is 00:14:24 if, you know, those views are from bots, you're also not getting what you want. Exactly. And I think this is where it has sort of flipped now because advertisers are starting to realize, actually, these aren't real people. So why am I paying this much money to advertise to, you know, a piece of code that's not going to buy my product? Yeah. And so now that the, these Senate hearings are happening, and they're happening all over the world, when you're starting to see governments get involved. I know it happened in Germany around like a certain degree of censorship. What do you think is the right course of action? Yeah. So I think this is a really complicated problem that has so many dimensions to it. It's not just, you know, fake news or

Starting point is 00:15:14 that kind of disinformation. You know, we've talked about the accounts and how those are problematic. We talked about foreign interference and like these really coordinated campaigns against governments. There are so many different issues that are kind of connected in this, you know, media manipulation, social media manipulation bucket. When governments go after the content of what's being shared, I think that's a mistake. I don't think that's getting to the underlying problem that's sort of fueling the fake content or the disinformation to spread or to go viral in the first place. So things like NetzDG, for example, when it was first introduced in Germany in, I think, 2017,

Starting point is 00:15:59 early 2017, someone from the AfD had posted some, you know, horribly racist comment online. And, of course, it got taken down immediately because NetzDG essentially says Facebook or any platform has to remove content that breaks German law within 24 hours, or else they'll face like 50 million euro fine. So because this broke the hate speech law, Facebook removed it immediately. But they also removed all of the content around, that was created around that tweet,

Starting point is 00:16:33 all of the people that were calling this person out for making such a racist comment, all of the criticism on it. And so all of a sudden you start to lose the vibrancy of this, you know, online political sphere. Yeah. So I don't think going after the content is necessarily a good idea.

Starting point is 00:16:55 We're already also seeing authoritarian governments adopt this law into their own, into their own legal systems to silence dissent and to go after journalists who are publishing so-called fake news and whatnot. So I think it has a lot of unintended consequences and a lot more negative consequences and, you know, things like collateral censorship. Yeah. In terms of what could be done, I think, you know, enforcing more transparency around the platforms

Starting point is 00:17:26 and their operations and their algorithms is really important. Right now, all of these, you know, platforms, they're just black boxes. And we don't understand anything about how these algorithms work and how they're tailored to deliver certain kinds of content. Yeah. You know, complete transparency is also not great because if you perfectly understand what goes viral on Google, everyone will then know how to break it, right?

Starting point is 00:17:52 But at least, you know, maybe understanding the intentions of the designer and, like, tracing those kinds of processes and those meetings and having those kinds of principles be more out in the public would be one way of starting to get that kind of insight into understanding what's happening with these algorithms and in these kind of more closed black boxes. But what about in a world where, say, we could get rid of all fake news, how do you avoid the problem of us only wanting to see what we agree with? Yeah. I mean, and this is also part of the problem too, because, you know, fake news is not just a digital problem, but it's one of human nature. And there's a lot of research out there in academia that shows people do select news

Starting point is 00:18:39 and information that adheres to their own beliefs, right? It's our, it's the selection effect. So that's where things like education and stuff come in, you know. And I know education is often used as the go-to, you know, solution to the problem. But it does have a really important role here in teaching us how to be good citizens and reminding us why democracy is important and why it's important to look at different sources and to be able to negotiate that public consensus with one another. You know, the system's not perfect, but I think we've been focusing on how much it's been broken for so long that, you know, the positives have sort of been lost in a lot of the conversations.

Starting point is 00:19:27 Yeah, I think we were talking about this before, but I think the lack of optimism in online communication misses the real optimism that exists in day-to-day life. And, yeah, there's not really been a place for that yet, or at least it doesn't do that well unless it's incredibly cheesy, kind of like life inspirational, life hacks, that's what I think. So in terms of these platforms in the long run, based on your research, do you think this problem is getting, things are getting better,

Starting point is 00:19:57 or do you think that they just, you know, combust with enough, you know, bots involved? I think we're already starting to see combustion happen, especially with Facebook, you know,

Starting point is 00:20:11 a lot of my friends this year are starting to get off these platforms and to deplatform, which might open up the market to new kind of models that, you know, aren't based on advertising and, you know, aren't completely monetized by digital advertising. Maybe we'll start to see different kind of business models prop up and maybe create a little bit more space in the market. Right now, it's just so consolidated, too. Well, that's a separate, I mean, is that the breaking up of Facebook, WhatsApp, Instagram? Is that a topic of conversation here at the OII as well? I mean, it's something that I think we all tangentially are thinking about and what that would look like and how that would actually be done.

Starting point is 00:21:00 You know, I'm not an economist or an expert on any of this. But, you know, I do think that the consolidation of these platforms and the fact that there's no space for competition, like that's kind of locked us into these systems, which has then made the problem so much worse because now all of a sudden it's everyone that is being affected opposed to, you know, just the smaller. Right. So you have to be 100% in or 100% out, which is tricky.

Starting point is 00:21:30 Yeah, exactly. And so what have you, what have you been doing in terms of your personal internet habits to, I mean, you haven't checked out completely, obviously. I saw you have a Twitter account. Like you're on. Yeah. So what do you do? Yeah.

Starting point is 00:21:43 I mean, I keep using the excuse that, well, I study this. So I need to be on it to see, you know, what's happening. I might, you know, pick up on some change that I wouldn't have been able to actually see or understand if I wasn't on the platform. But I think I'm just a little bit more conscious around, you know, the kinds of information that I'm putting out on these platforms. I'm trying to be a little bit more conscious around like apps on my mobile phone and, you know, making sure that I'm restricting access to certain kinds of things, just practicing a lot more digital privacy and better habits around that.

Starting point is 00:22:24 Okay. Anything weird and ultra fringe or just kind of basic good practice? I think I'm pretty, pretty basic when it comes to this stuff. Yeah. I haven't, you know, totally got the tinfoil hats on and, you know, disconnecting everything in my life because I'm worried about these issues to that extent. But I do think that, you know, especially these issues around privacy and data collection and how that relates to disinformation and, you know, even targeting messages based on the data about me. Like, I think that these are

Starting point is 00:22:57 huge, that's another huge bucket of problems. Yeah. And the climate here, do you feel it's the same as it is in the States? Because the UK's had CCTV everywhere for quite a while now. Are people more comfortable with that tracking or less comfortable or the same? I think it's probably, I think they're probably the same. Okay. You know, a lot of my British friends are still outraged when they find out that, you know, private companies have been doing X, Y, and Z with their data. I think, you know, the difference around what private companies are doing and, you know,

Starting point is 00:23:39 what governments do around surveillance, because everyone kind of knows what the government does. There are all kinds of laws that have been passed and debated that's a little bit more transparent in terms of what they're collecting, or at least they give the impression, you know, of transparency around that. There's more of a public discussion where I think people are, there's still a shock factor around the fact that Facebook was, you know, storing private phone calls and private messages and, you know, selling that data to other marketers without, you know, the users knowing, I think the shock factor still exists in that space. Okay. So to go back to the States, just a little bit. I know this is happening everywhere, but, yeah, I can be a little America-centric or U.S.

Starting point is 00:24:23 centric. What are your thoughts on what will come of the Mueller report? Yeah. I don't know, to be honest. I hope I'm happy that the Mueller report at least brought to public attention a lot of the detail around

Starting point is 00:24:45 what the IRA, the Internet Research Agency, was doing in the U.S. And that was a very credible source of information around, you know, the wide range of techniques, not just the digital, but, you know, even the real world activities that were going on. Whether or not it will lead to any outcome, I'm not sure.

Starting point is 00:25:10 But the fact that it has at least made this information public, I think that's already like a win for me. And so, you know, whatever comes out in the future, like, you know, at least we have more information and more knowledge around the investigation. Okay. And then in terms of the midterms, have you guys crunched those numbers yet?

Starting point is 00:25:32 Yeah, so when we looked at the midterms, actually, let me go back, because we studied the 2016 elections and we studied the midterms. And in both these studies, we were looking at what people were sharing as political information and news online on both Twitter and Facebook. In 2016, we found that Americans on average were sharing about a one-to-one ratio of junk news to professionally produced information. Junk news meaning fake or just not high quality? Not high quality. We have a five point definition that we tend to check off and you need to have like four out of the five things. So things like counterfeit, is it mimicking a real legitimate news source? Is it using like the Washington Post font or the BBC colors to give more credibility?

Starting point is 00:26:27 Things like do they adhere to any kind of journalistic standards? Do they publish corrections, the kind of language that they use? Are they using a lot of F-bombs and really, you know, hyperbolic language? So things like that. So it's not just fake news, but it's all of this other kind of low-quality information that isn't necessarily helping our democracy. So, yeah, in 2016, users were sharing junk news at a one-to-one ratio to professionally produced information.
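The "four out of the five things" rule described here amounts to a simple threshold over a checklist. The sketch below is a hedged illustration only: the conversation names just three of the criteria loosely (counterfeit styling, journalistic standards, language style), so the criterion keys, including the two placeholder ones, are hypothetical and not the project's actual definition.

```python
# "You need to have like four out of the five things."
REQUIRED_MATCHES = 4
TOTAL_CRITERIA = 5


def is_junk_news(checks):
    """Return True when at least REQUIRED_MATCHES of the five criteria are met.

    `checks` maps a criterion name to True when the source shows that
    junk-news trait. The names are illustrative placeholders.
    """
    if len(checks) != TOTAL_CRITERIA:
        raise ValueError(f"expected {TOTAL_CRITERIA} criteria, got {len(checks)}")
    return sum(bool(value) for value in checks.values()) >= REQUIRED_MATCHES


example_source = {
    "counterfeit_styling": True,           # mimics a legitimate outlet's look
    "lacks_journalistic_standards": True,  # e.g. publishes no corrections
    "hyperbolic_language": True,           # swearing, exaggerated wording
    "unnamed_criterion_4": True,           # placeholder: not named in the conversation
    "unnamed_criterion_5": False,          # placeholder: not named in the conversation
}
```

With four of the five traits present, `is_junk_news(example_source)` evaluates to `True`; flipping one more entry to `False` would drop it below the threshold.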

Starting point is 00:27:00 In 2018, when we redid this analysis during the midterm, the ratio of junk news actually went up. And so it went to like 1.2 or 1.3 to 1. So that was... With the same amount of users? So we controlled for the number of users. And we also in 2016 did a study on the swing states. And, you know, in actual swing states, people were sharing more junk news compared to uncontested ones.
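The ratio quoted in this part of the conversation is a count of junk news shares divided by a count of professionally produced news shares. A minimal sketch of that arithmetic follows; the label strings and the list-of-labels input are assumptions made for illustration, and the sample numbers are invented, not data from the studies.

```python
def junk_to_professional_ratio(labels):
    """Return (number of junk shares) / (number of professional shares).

    `labels` is an iterable of strings, one per shared link, using the
    hypothetical labels "junk" and "professional"; other labels are ignored.
    """
    labels = list(labels)
    junk = sum(1 for label in labels if label == "junk")
    professional = sum(1 for label in labels if label == "professional")
    if professional == 0:
        raise ValueError("no professional news shares to compare against")
    return junk / professional


# Invented sample: 6 junk shares and 5 professional shares.
sample_labels = ["junk"] * 6 + ["professional"] * 5
sample_ratio = junk_to_professional_ratio(sample_labels)
```

A value of 1.0 corresponds to the one-to-one ratio described for 2016, and a value above 1.0 means more junk than professional news was shared.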

Starting point is 00:27:30 We haven't done that same analysis now for 2018 midterms, but I think it would be interesting to see if it's also geographically distributed in terms of who's sharing what. The assumption would be that these people are more heavily targeted in terms of botnets, right? Exactly. Yeah. Yeah. I mean, it's hard to say exactly for sure who or what is sharing that information, but we can see that, you know, by number it's more concentrated in certain places than in others. Yeah, because I've been curious to see that, like, the net number is just going to decline.

Starting point is 00:28:05 So you could still find instances of fake news being shared, but more people become more skeptical and aren't checking it at all. Did you find that in the midterm study or is that just not part of it? So we didn't go into people's, you know, emotions or how they felt about sharing this kind of information. I think what we would have expected to see, too, was that it's declining, because the platforms have been saying over and over that they've been taking steps to reduce the kind of disinformation and misinformation being spread. And it's not.

Starting point is 00:28:38 And it's not. You know, we've done this study in other countries as well. So, you know, during the U.K. elections, in Germany and France and Sweden, Mexico, all of those countries have much lower levels of junk news to professionally produced news shares than the U.S. So the U.S. is definitely a dramatic case here, but it's still interesting nonetheless. And what about Canada? You said there's something coming up this year? Yeah, well, we'd love to study the Canadian elections in 2019.

Starting point is 00:29:11 You know, I'm Canadian myself, so I'd be really intrigued just to see what's going on. You know, Canada has always prided itself on being a very inclusive country. And, you know, a lot of the junk news that we see in the U.S. uses a lot of anti-immigration rhetoric and things like that. So just out of personal interests, you know, I think I'm worried, I guess, I don't think Canada's necessarily immune from those kinds of conversations. And I'm already starting to see some of the kind of populist narratives appear in my own newsfeed and in my own communities of friends.

Starting point is 00:29:52 So I think it will be a really interesting case study. Okay. Yeah, because I imagine you see it here now with Brexit too, right? Oh, yeah, definitely, definitely. Despite the vote already having happened, it's still common, right? Yeah, oh, it's still a thing. I think it will continue to be part of the political rhetoric in the UK for years to come. And so with the U.S. in 2020, all signs point to this increasing?

Starting point is 00:30:20 I would think so. The fact that we saw an increase already in 2018, you know, I don't think the platforms are going to be able to get their act together in time for 2020. And, you know, the U.S. election is where we see a lot of new innovation in these, right, you know, manipulation techniques because millions and millions of dollars go into these campaign media strategies. So there's a lot of money to play around, to experiment, to innovate. So it's going to be an interesting. Interesting. This is terrifying. So do you think 2020 or 2024 will be the year of deep fakes? I don't know. I think, you know, there's a lot of hype about deep fake right now. I don't know how real it's actually going to be.

Starting point is 00:31:13 And, you know, we're already seeing, you know, a lot of the research agencies like DARPA and things like that work on being able to detect when photos and videos have been manipulated. So I'm a little bit more optimistic that deep fake will not become a thing. Maybe in like low literacy media environments, there might be, you know, more, it might have more of an impact. But I like to remain optimistic. Yeah, I was going to say, like, so closing up, what are your optimistic thoughts for the future? My optimistic thoughts for the future. Geez, I don't know if I have any today.

Starting point is 00:31:55 I've just been ranting about all of the problems and, you know, reminding myself after the Christmas break why I really care about these things. Are there any signs of things improving? You know, I like to think that they are. I think a lot of governments are seriously thinking about this problem. You know, there are a couple of people that are really educating themselves on the issues that are at the intersection of technology and politics and society. You know, I think a lot of the time policymakers make laws without necessarily understanding the technology. And, you know, there's a big gap there. But it does

Starting point is 00:32:35 make me feel more optimistic when I see, you know, like Senator Warren or here in the UK, we have Damian Collins. They're really on top of their game and taking this in and thinking really seriously and deeply about what good regulation could look like. That's not going to just break the technology. So, you know, the fact that there is more energy around government regulation and proper government regulation, that makes me feel a little bit more optimistic. So if someone wanted to study what you're studying or try and help this cause, what would you tell them to do? I would just say be nice to each other on the internet, honestly. Because I think there's just so much anger in society right now.

Starting point is 00:33:21 You know, we're seeing more and more polarization, like especially in the U.S. You know, that gap has been widening and widening for the past 20 years. And I think we just need to remember that we're all humans at the end of the day. We might have different beliefs, but that doesn't make us, you know, evil or wrong or terrible people. We need to just be nice to each other and learn to talk to each other again. That's a great point. Yeah. So checking out isn't

Starting point is 00:33:44 necessarily a net positive for the community. Exactly. Yeah. Cool. All right. Thanks so much, Sam. Thank you. All right.

Starting point is 00:33:52 Thanks for listening. So as always, you can find the transcript and the video at blog.ycombinator.com. And if you have a second, it would be awesome to give us a rating and review wherever you find your podcasts. See you next time.
