CyberWire Daily - Detecting Twitter bots in real time. [Research Saturday]

Episode Date: August 1, 2020

NortonLifeLock Research Group (NRG) released a prototype browser extension called BotSight that leverages machine learning to detect Twitter bots in real time. The tool is intended to help users understand the prevalence of bots and disinformation campaigns within their Twitter feeds, particularly with the increase in disinformation around COVID-19. Joining us on this week's Research Saturday to discuss this tool is Daniel Kats from NortonLifeLock Research Group. You can find the research here: Introducing BotSight. Learn more about your ad choices. Visit megaphone.fm/adchoices.

Transcript
Starting point is 00:00:00 You're listening to the CyberWire Network, powered by N2K. That's where Domo's AI and data products platform comes in. With Domo, you can channel AI and data into innovative uses that deliver measurable impact. Secure AI agents connect, prepare, and automate your data workflows, helping you gain insights, receive alerts, and act with ease through guided apps tailored to your role. Data is hard. Domo is easy. Learn more at ai.domo.com. That's ai.domo.com. Hello, everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
Starting point is 00:01:10 protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us. And now a message from our sponsor Zscaler, the leader in cloud security. Enterprises have spent billions of dollars on firewalls and VPNs, yet breaches continue to rise, with an 18% year-over-year increase in ransomware attacks and a record $75 million payout in 2024. These traditional security tools expand your attack surface with public-facing IPs that are exploited by bad actors more easily than ever with AI tools. It's time to rethink your security. Zscaler Zero Trust plus AI stops attackers by hiding your attack surface, making apps and IPs
Starting point is 00:02:01 invisible, eliminating lateral movement, connecting users only to specific apps, not the entire network, continuously verifying every request based on identity and context, simplifying security management with AI-powered automation, and detecting threats using AI to analyze over 500 billion daily transactions. Hackers can't attack what they can't see. Protect your organization with Zscaler Zero Trust and AI. Learn more at zscaler.com slash security.
Starting point is 00:02:43 I started looking at disinformation campaigns by foreign actors early last year. That's Daniel Kats. He's a senior principal researcher in the Norton LifeLock research group. The research we're discussing today is titled Introducing BotSight, a new tool to detect bots on Twitter in real time. Norton LifeLock, previously Symantec, has a history of looking at state-sponsored campaigns in the security space. So we look at advanced persistent threats, malware that's distributed by state actors. And we had this conversation within the research group that disinformation is very similar to these kinds of threats, but it's operating in a space that doesn't have as much scrutiny from dedicated professionals.
Starting point is 00:03:34 So we started looking at the data, and this was the end result after a lot of work and a lot of back and forth about what the best way to tackle that problem is. Well, let's walk through it together. I guess, can you describe to us, first of all, what is the tool that you all have released? We released a tool called BotSight. The idea of the tool is that it can flag accounts which are behaving in such a way that they're very similar to social bots. So these are accounts that state-sponsored groups use to spread disinformation. You can install BotSight as a browser plugin for all major browsers or as an app for iOS.
Starting point is 00:04:30 You can use this tool to see percentages right in your Twitter feed of the likelihood that a given account is acting as a social bot. Well, take us through what's going on under the hood here. I mean, how did your team come at this problem and analyze the data to be able to come up with these percentages? So there's a big reservoir of data that we analyzed. We actually analyzed four terabytes of past tweets that we got our hands on through various sources. There's actually been a lot of academic work in this area, but it's been focused on older data sets.
Starting point is 00:05:09 So while we took a lot of cues from the academic world in terms of our approach, the data that we use is a little bit newer. So in the background, we have a machine learning model that takes in approximately 20 different features from a given account and calculates what is the probability that this account is a social bot. These features are based on historical examples of social bots in the past, and this percentage is what we call
Starting point is 00:05:42 calibrated. A lot of the time, machine learning models just return numbers between zero and one, but these don't really correspond to percentages. We have calibrated it so that whatever percentage you see in the feed, that is the actual likelihood that that account is a social bot.
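To make that calibration step concrete, here is a minimal sketch of one common way to turn a classifier's raw scores into percentages, using scikit-learn on synthetic stand-in data. The feature set, labels, and model choice here are illustrative assumptions, not the actual BotSight implementation, which has not been published.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV

# Stand-in training data: one row of ~20 numeric account features per
# labelled account, with 1 = known social bot and 0 = known human.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(1000, 20))
y_train = (X_train[:, 0] + rng.normal(size=1000) > 0).astype(int)

base_model = RandomForestClassifier(n_estimators=200, random_state=0)

# Isotonic calibration maps the model's raw score onto an empirical
# probability, so a displayed "87%" means roughly "accounts that scored
# like this one were bots about 87% of the time in held-out data."
calibrated = CalibratedClassifierCV(base_model, method="isotonic", cv=5)
calibrated.fit(X_train, y_train)

new_account = rng.normal(size=(1, 20))
bot_probability = calibrated.predict_proba(new_account)[0, 1]
print(f"Estimated likelihood this account is a social bot: {bot_probability:.0%}")
```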
Starting point is 00:06:18 And how do you check against yourself in an ongoing way? How are you making sure, I guess, maintaining the integrity of these evaluations over time? It's a great question. The short answer is testing. We have a lot of people now who are using this tool, and who had been using it internally at Norton LifeLock for about five months before we released it. And people were constantly coming back with feedback. As you know, different people use Twitter in a variety of different ways. There are so many accounts and a lot of languages. Norton LifeLock is a global company, so we have employees from all over the world. And they were telling us that they were getting unexpected results here and there. And so we were constantly tuning the model for months and months before we got something that we were really, really happy with.
Starting point is 00:06:57 And this kind of highlights the difference between academic research data sets that we initially started using versus a real-life deployment. And the real world is much more messy than the small academic data sets that researchers tend to use. Can you give us some insights on the types of things that you and your team had to do to fine-tune the results? Absolutely. So one of the things that we found was that a lot of celebrities kind of behave like bots. And the reason for this
Starting point is 00:07:34 is they use social media management tools in order to coordinate their posts. They release the same posts, for example, on Instagram and on Facebook and on Twitter and on other platforms. And it looks a lot like the coordinated activity that we see. And so we had to make some adjustments for that kind of behavior. Another example is ad campaigns by corporations, which kind of behave in the same way. Can you give us an idea, what is a typical behavior that differentiates a bot from a real human?
Starting point is 00:08:15 So there are a few different behaviors. One is we find that groups of accounts that act together in concert over a prolonged period of time, they tend to belong to a single bot network. This is very rare for humans because humans are fickle. Some days you tweet a lot, other days you don't tweet at all. And you don't see this long-term collaborative behavior. That's one thing that we've observed. Another feature that we've observed is that social bots are very bad at coming up with their own organic content. They generally try to amplify, so they do a lot of retweeting. And social bots, they're really not normal Twitter users. So on Twitter, you might engage with the content. You might reply to some people. You might like posts. But that kind of, what we might say, passive and active engagement is actually very rare for bots. Bots tend to either retweet, that's the most common behavior,
Starting point is 00:09:26 or they tend to generate kind of generic tweets. They'll tweet a news story, for example.
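The behavioral signals described here can be summarized with very simple ratios over an account's recent timeline. The sketch below is illustrative only; the field names are invented for the example and are not the Twitter API's actual schema or BotSight's real feature set.

```python
from collections import Counter

def behaviour_signals(timeline):
    """Return simple ratios over a list of tweets from one account.

    Each tweet is a plain dict with boolean "is_retweet" and "is_reply"
    flags, a simplified stand-in for real API objects.
    """
    n = len(timeline)
    kinds = Counter(
        "retweet" if t.get("is_retweet")
        else "reply" if t.get("is_reply")
        else "original"
        for t in timeline
    )
    return {
        # Amplification-heavy accounts: almost everything is a retweet.
        "retweet_ratio": kinds["retweet"] / n if n else 0.0,
        # Humans reply, argue, and engage; bots rarely do.
        "reply_ratio": kinds["reply"] / n if n else 0.0,
        # Bots are bad at producing their own organic content.
        "original_ratio": kinds["original"] / n if n else 0.0,
    }

sample = [{"is_retweet": True}, {"is_retweet": True}, {"is_reply": True}, {}]
print(behaviour_signals(sample))
```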
Starting point is 00:10:03 Now, I've installed the plugin here for myself, and I've been playing with it on Twitter. And first of all, I have to say, it is a lot of fun. So there's that element of it. But it's also fascinating to see these numbers scroll by. And I'm curious, what do you anticipate being a typical use case of this? Ideally, how would you like to contribute to how people use the platform? That's a great question. I think that there's been a lot of discussion in the media about bots and fake news. And I think that a lot of people are aware of the problem that these social bots exist on social media. But I think most people don't really have a sense of where these bots live. And this can be a little bit toxic because every time there's some odd opinion that's a little bit different, you might see on Twitter that people will call this person out as being, for example, a bot or a troll or something like that. And so we really wanted to contextualize where are you most likely to find bots?
Starting point is 00:10:46 What are the typical bot behaviors? And we wanted to do it in a way that is very clear to the average user. You know, we can always release a paper and talk about these bot behaviors, but I think this really helps educate a person in a way that is interesting and obvious. Yeah, it's almost like you have an expert sitting over your shoulder while you're scrolling through things. And you say to yourself, hmm, that seems a little bot-like. You can then look at the results from the plugin and say,
Starting point is 00:11:18 hmm, yep, yep, it likely is. Someone else agrees. Or I guess the other direction where you can say, no, you know what, that is probably a real person. Exactly. And the other angle of this is we wanted to create a sense of critical thinking about where are tweets coming from. So one of the key instigating ideas behind this tool was in early March, I think, back when the U.S. Democratic primary was in full swing. I don't know if you remember it because it feels so long ago, but it was a contest between Bernie Sanders and Joe Biden. And right when Joe Biden won the South Carolina primary, there was some coverage by major publications that talked about,
Starting point is 00:12:18 you know, there's a lot of anger online from Bernie people that were tweeting with certain hashtags, like hashtag rig DNC, for example. But when my colleagues and I looked deeper into these trends, we found that specifically for that one, there was actually a lot of non-organic activity. So it didn't really appear that these were legitimate Bernie supporters, but were actually outside actors. And so we felt like BotSight going forward might help journalists to think about the possibility that these trends that they're seeing, they actually might not be organic. What sort of insights have you gained on Twitter itself as you've been going through this process and gathering this data and fine-tuning the tool? Do you have a sense for where we stand when it comes to Twitter and the ubiquity of bots on the platform?
Starting point is 00:13:14 I think that we actually stand in a really good place right now, contrary to some of the coverage that you may see, because Twitter has gone after the bots, at least certain types of bots, very aggressively. And so from our own research, we see a marked decrease in the overall number of bots. We call it the background radiation of bot activity on the platform, and it has dropped from almost 20% in 2016 to around 5% currently. 5% is still a high number, but it's a lot better than where we used to be. However, there are other misinformation problems that our tool doesn't tackle. For example, people retweeting misleading claims is just something that we don't address. So for example, there was a picture that people were tweeting that seemed to show
Starting point is 00:14:19 an explosion in DC over the past little bit, but it turns out that this was just a still from the show Designated Survivor. And this is not something that our tool would catch unless this was sponsored by outside groups, which it doesn't look like it was. It was just organic activity. Right.
Starting point is 00:14:39 So if the bots latch onto it and start amplifying it, that's something that you would detect. But the truth of the post itself is not something that you're really aiming at. Exactly, exactly. And this is both an advantage of our approach and a deliberate design decision, but also something that I think that people have to keep in mind when they go on social media. So on the one hand, just because something is not true doesn't mean it was posted by a bot. And secondly, just because the number of bots on the platform has gone down
Starting point is 00:15:18 doesn't mean that there is less false information. Do you have any advice or things that you learned in this process? I'm thinking of the folks in our audience who may be working with artificial intelligence, or perhaps they're students who are learning about this sort of thing. Are there any insights that you can share from going through this process? Anything that surprised you or was unexpected in using this sort of approach for this sort of information challenge? I have a few different insights. Well, the first one is what I said before, which is that your training set may not necessarily be representative of real-world data. So our initial training sets were common data sets that are used in the research community,
Starting point is 00:16:04 data sets that people have published many academic papers on. But we found that these data sets, if you just use them to train the models, are actually not really representative of current use of Twitter. And whether this is because people are just using Twitter in different ways or because the data sets themselves are small, I'm not sure, but it speaks to the value of doing some kind of validation before you really deploy your models into the real world and being able to adjust your data sets accordingly. We also did a variety of what's called cross-validation in a very specific way. There's a paper called D-Bot,
Starting point is 00:16:49 which talks about the use of cross-validation to make your model more resilient. And the idea there is that you're specifically taking your training set and you're splitting it according to the different types of the thing you're detecting, in our case, bots. And you're training your classifier on part of it: let's say you have five types of bots in your training set. You train on the first four, and then you see if your classifier can recognize the fifth type that it hasn't trained on. And this makes your model more
Starting point is 00:17:20 robust, because if you start mixing the different types of bots together in your training set, then what happens is when you encounter a new type of bot in the wild, you're not as sure that your model is going to detect that kind. And I'd say the final thing that we really learned is that even if you have a false positive rate or a false negative rate of 1% or 2%, that can still be a lot if people are using your classifier and are really relying on it. You know, if you have, for example, 100,000 or 1 million lookups, 1% of that is a big number. And so you have to think about these percentages in terms of the anticipated volume.
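Here is a minimal sketch of that leave-one-bot-type-out style of validation using scikit-learn's LeaveOneGroupOut on synthetic stand-in data. The group labels, feature counts, and model are assumptions for illustration; the D-Bot paper's exact procedure and the real BotSight training code may differ in the details.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneGroupOut

# Stand-in data: 20 features per account, binary bot/human labels, and a
# "group" marking which of five bot families each labelled example came from.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
y = rng.integers(0, 2, size=500)
groups = rng.integers(0, 5, size=500)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# Train on four bot types, evaluate on the fifth, then rotate. The scores
# reflect how well the model generalizes to a bot family it has never seen.
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups):
    model.fit(X[train_idx], y[train_idx])
    probs = model.predict_proba(X[test_idx])[:, 1]
    held_out = groups[test_idx][0]
    print(f"held-out bot type {held_out}: AUC = {roc_auc_score(y[test_idx], probs):.2f}")
```

The volume point is simple arithmetic: at a 1% false-positive rate, a million lookups means on the order of 10,000 accounts wrongly flagged, which is why a rate that sounds small on paper can still matter a great deal once a tool is widely deployed.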
Starting point is 00:18:15 How do you and your team protect against your own personal biases sneaking into the various algorithms that you're using here? How do you make sure and guard against that sort of thing? So when people send us accounts that they think are bots or are not bots, we are very conservative. And so we apply a consensus process to these examples before we add them into our training set. So if we're not all sure that something is a bot or is not a bot, based on examples that someone has sent in, we don't add them. This is one of the reasons why we try to stay away from actually using the content of the tweet. And our classifier really focuses on metadata. It does look at the hashtags, but it looks not at what the hashtags are,
Starting point is 00:19:08 but, for example, is the hashtag popular? How many hashtags are there? How many mentions are there? But in terms of the actual content of the tweet, whether it's political or whether it's medical, we try not to look at that, partially for this reason.
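A small sketch of what such content-blind, metadata-only features could look like for a single tweet. The dict shape and field names are invented for the example; they are not the Twitter API schema or the exact features BotSight uses.

```python
def metadata_features(tweet, trending_hashtags):
    """Features built from counts and flags only, never from the tweet text.

    `tweet` is a plain dict standing in for an API object;
    `trending_hashtags` is a set of currently popular hashtags (lowercased).
    """
    hashtags = [h.lower() for h in tweet.get("hashtags", [])]
    mentions = tweet.get("mentions", [])
    return {
        "num_hashtags": len(hashtags),
        "num_mentions": len(mentions),
        # Whether a hashtag is popular matters; what it actually says does not.
        "uses_trending_hashtag": any(h in trending_hashtags for h in hashtags),
        "is_retweet": bool(tweet.get("is_retweet", False)),
    }

example = {"hashtags": ["Breaking"], "mentions": ["@somebody"], "is_retweet": True}
print(metadata_features(example, trending_hashtags={"breaking"}))
```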
Starting point is 00:19:33 Why was it important for you all to put a tool like this out there to make it widely available for free? For me, this is deeply personal. I grew up in Russia in the early 90s, and my parents actually met handing out pro-democracy leaflets in the late 80s in the Soviet Union. And so my family background has this rich history of both experiencing a wide campaign of disinformation and misinformation and knowing the real value of objective truth. And I really latched onto this issue quite strongly when we were looking at these kinds of misinformation campaigns. And I felt like this was a real positive good that we can do in the world.
Starting point is 00:20:19 You know, it's very rare as technologists that we can just put something out there to really help people in an unambiguously good way. And that really made me excited to do something like that. Disinformation, it doesn't just come from one side or the other. It comes from both sides at once. The intent isn't just to deceive you, but it's also to inflame tensions. It's also to divide us.
Starting point is 00:20:50 It's also to play to our existing biases. So we have to be especially careful when we are on social media to question things that we see that appeal to us. Our thanks to Daniel Kats from Norton LifeLock Research Group for joining us. The research is titled Introducing BotSight, a new tool to detect bots on Twitter in real time. We'll have a link in the show notes.
Starting point is 00:22:08 The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin. Thanks for listening.
