CyberWire Daily - WAV files carry malicious data payloads. [Research Saturday]

Episode Date: December 14, 2019

Researchers at BlackBerry Cylance have been tracking ordinary WAV audio files being used by threat actors to carry hidden malicious data. Eric Milam is VP of threat research and intelligence at BlackBerry Cylance, and he joins us to share their findings. The research can be found here: https://threatvector.cylance.com/en_us/home/malicious-payloads-hiding-beneath-the-wav.html

Transcript
Starting point is 00:00:00 You're listening to the CyberWire Network, powered by N2K. Hello, everyone, and welcome to the CyberWire's Research Saturday. I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
Starting point is 00:01:10 protecting ourselves in a rapidly evolving cyberspace. Thanks for joining us.
Starting point is 00:02:33 Well, I mean, the most interesting thing about this is obviously like steganography is not new. It's been used since the days of the Romans. It's been around for a long time, but typically you see it used with image files. That's Eric Milam. He's VP of Threat Research and Intelligence at BlackBerry Cylance. The research we're discussing today is titled Malicious Payloads Hiding Beneath the WAV. And in this specific case, it was very interesting to see it used with a different file format, in this specific case, a WAV audio file. So that really drew our attention right away and was something fairly interesting. There's not a lot
Starting point is 00:03:20 of research out there. There are some discussions around it. I think some people have covered it, but not to the depth in which we wanted to go to within this research. Let's start off with some basics here. I mean, I think most of us are probably at least somewhat familiar with what a WAV file is. That's an audio file. It goes back a long way. It's a file format that's been around a long time. Yeah, yeah. It definitely predates the MP3s and MP4s that we have these days. It's been around since the beginning of time. Yeah. It's, it's very interesting. I think the reason it was chosen or what we've come to believe is the reason it was chosen is back then when this was created, there wasn't a lot of file integrity checking going on. And so when you do something like Stego, you have to really understand the format so you don't break it.
Starting point is 00:04:05 Some file formats are definitely a lot more forgiving. That's why Stego in images, they normally use PNG, lack of integrity checking, just the ease of leveraging that format. Same thing here with WAV. If they tried to do this, say, within an MP3, there's file integrity checking, there's compression, there's all other kinds of things that have to go into it that could potentially break. So I think they kind of took the path of least resistance in picking the WAV file itself. And I think the file format really helped them make it a little bit easier on them. Yeah, that's really an interesting insight. Well, let's go through here together.
Starting point is 00:04:42 How exactly did this work? It wasn't just the WAV file standing on its own. Yeah, that's correct. Like every WAV file was coupled with a matching loader file. So the way it works is the loader file is paired with this item, with this file. And when it runs, essentially, the loader then will look for that WAV file, grab and extract the items out of it, parse that into memory, into the same address space, and then execute the malicious payload that's within it. And what's going on within the WAV file itself? Are they spreading the useful data throughout the file?
Starting point is 00:05:21 Yeah, there's a certain level of encoding that happens, obfuscation within that. I have to admit, I don't know the technical details around that part. So essentially, yeah, what they're doing is they're leveraging empty space or areas within that file format in order to put these malicious bits in there that can then be extracted. And the interesting thing was, as I was pointing to earlier, they didn't actually break the file format. The WAV files actually played. One played music and one played white noise. So that means they had a really good understanding of where they could put things, the file structure itself, and how they could leverage and get the items back out of it. Yeah. And again, very similar to what we've heard
Starting point is 00:05:58 about folks doing with image files, where if you didn't know any better, you wouldn't know that there was anything wrong with the file at all. Yeah, that's correct. In fact, we did some research around APT32 and OceanLotus not that long ago, where they used PNG files and they used encoding within the RGB to hide a payload and be able to extract a payload. So similar technique, just a different file format. Well, there were three basic categories that you all cover here in the research. Why don't we go through those together one by one? Sure. So the first one employed some steganography to decode and execute a file. What was going on with that one? A PE loader? Yeah. So the PE loader, the way it works, that goes back to what we're talking about with a
Starting point is 00:06:41 coupled file. So there's essentially two different files that were placed on these systems. So the first one is a PE file, which we'll call the loader, which knows how to access the WAV file, the specific WAV file, and extract what needs to be extracted from that in order to either install the backdoor or in another instance, we were able to find some crypto mining software as well. So we had both shellcode and crypto mining. So that loader would be matched with the WAV file. So those two both exist then on the endpoint. And when that loader is run, it's able to grab that WAV file again and extract that shellcode into memory and then provide access back to the command and control server
Starting point is 00:07:20 for additional nefarious activities. It would seem to me like the loader itself could be the thing that kind of gives away the game. A WAV file gets through unnoticed, but hey, what's this loader or what's this mysterious file here doing on my system that turns out to be the loader? Sure. And potentially those are the ways where a lot of things get caught, right? So the difference being that the loader is a portable executable in this case, and the WAV file is, as we talked about in the research, a benign file that you would never expect to have any malicious intent behind it. So when you look at things like the loader, there's different things associated with it,
Starting point is 00:07:57 different characteristics. Obviously, if it's something that's never been seen before, unless you're using something like machine learning, you might not ever detect it. So the way in which we look at files that could potentially be benign, if you think about a loader, a lot of times those are largely written to bypass any level of security. They're very kind of simplistic in which they just go out and download or pull down additional information. In this case, the loader was simply meant to read another file. So there might not really be at first glance or first blush any specific malicious intent or anything that might be found around that. So what we tend to do in those types
Starting point is 00:08:37 of situations within BlackBerry Cylance is we leverage a technical technique using machine learning that is referenced as centroids. It's a technical term around basically just really building a specific model for catching items. So we're able to do something like build a specific model around this loader that even if potentially it's not deemed malicious, we're able to actually still find it, catch it, analyze it. And what that also allows us to do is as these attackers evolve these techniques, change these loaders, if the variance isn't too great, we can continue to catch those as they even adjust and evolve. Now, do the two files typically show up on your system at the same time? Is there any attempt to put a time gap between the installation of the two? Or does that not really matter in this case?
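A note on the "centroids" idea mentioned above: the conversation does not describe how BlackBerry Cylance implements it, so the following is only a generic nearest-centroid sketch under assumptions. The feature vectors are invented numbers standing in for whatever characteristics a real model would extract from a file, and the 1.5 margin on the radius is arbitrary. The point it illustrates is the one made in the interview: a variant that stays close to known samples still falls inside the model.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matches_cluster(sample, center, radius):
    """True if the sample lies within `radius` of the cluster center."""
    return distance(sample, center) <= radius


# Hypothetical feature vectors for three known loader samples.
known_loaders = [[0.9, 0.1, 0.4], [1.0, 0.2, 0.5], [0.8, 0.15, 0.45]]
center = centroid(known_loaders)

# Allow some slack beyond the farthest known sample (arbitrary margin).
radius = 1.5 * max(distance(v, center) for v in known_loaders)

variant = [0.95, 0.18, 0.42]    # a slightly changed loader
unrelated = [0.1, 0.9, 0.9]     # a file with very different features

variant_hit = matches_cluster(variant, center, radius)
unrelated_hit = matches_cluster(unrelated, center, radius)
print(variant_hit, unrelated_hit)
```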
Starting point is 00:09:25 No, it doesn't necessarily matter in this case. The only thing that matters is, again, they're a matched pair, the loader and the WAV file. So they both have to be on the system for this to happen. You can't have one loader with a different WAV file. They're really dependent on each other. So yeah, they tend to be downloaded simultaneously. We didn't experience anything, at least in what we were able to analyze where, say, it was something that was UPX-packed or packed in some way that then ended up on the system and was extracted for the two. The way we analyzed it pointed to the fact that there was probably some type of backdoor connection to the command and control server, and then those files
Starting point is 00:10:01 were then downloaded simultaneously. I see. Now, the second category of loader was using an algorithm to hide some shellcode. What was going on with that one? The attackers were using the Metasploit framework shellcode. And so obviously when you're using shellcode, the goal is to maintain a connection to keep your backdoor going, your command and control responses and execution. And so this is how these attackers were able to stay on these systems for a really long time. I think the main thing too that I want to point out is like when you use something like a backdoor, you're obviously targeting. And when you couple that with something like hiding this within a WAV file, like the main focus is trying to stay as hidden for as long as possible.
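A note on hiding data in audio that still plays: the guest says he does not know the exact encoding used, so the sketch below is an assumption, showing one commonly described approach, least-significant-bit (LSB) encoding. It hides a harmless text string in the lowest bit of each 16-bit sample and reads it back. The function names are invented for this example, and it uses only Python's standard `wave` module.

```python
import io
import wave


def make_wav(num_samples):
    """Build a small 16-bit mono WAV in memory to act as the carrier."""
    samples = b"".join(
        (i * 37 % 2000).to_bytes(2, "little", signed=True)
        for i in range(num_samples)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(8000)
        w.writeframes(samples)
    return buf.getvalue()


def read_frames(wav_bytes):
    """Return (params, mutable sample bytes) for a WAV byte string."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as r:
        return r.getparams(), bytearray(r.readframes(r.getnframes()))


def embed_lsb(wav_bytes, message):
    """Write each bit of `message` into the low bit of successive samples."""
    params, frames = read_frames(wav_bytes)
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(frames) // 2:
        raise ValueError("message too long for this carrier")
    for i, bit in enumerate(bits):
        # Index 2*i is the low byte of little-endian 16-bit sample i.
        frames[2 * i] = (frames[2 * i] & 0xFE) | bit
    out = io.BytesIO()
    with wave.open(out, "wb") as w:
        w.setparams(params)
        w.writeframes(bytes(frames))
    return out.getvalue()


def extract_lsb(wav_bytes, num_bytes):
    """Rebuild `num_bytes` bytes from the low bits of the first samples."""
    _, frames = read_frames(wav_bytes)
    out = bytearray()
    for b in range(num_bytes):
        value = 0
        for k in range(8):
            value = (value << 1) | (frames[2 * (b * 8 + k)] & 1)
        out.append(value)
    return bytes(out)


carrier = make_wav(400)
stego = embed_lsb(carrier, b"hello, harmless text")
recovered = extract_lsb(stego, 20)
print(recovered)
```

Each sample changes by at most one step out of 65,536, which is why a carrier altered this way can still play normally, and the file size and header are unchanged.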
Starting point is 00:10:46 So we definitely believe that this is an adversary that is really looking to stay inside an environment, exfil as much data as they can and continue to move throughout the environment. And then the third one was using an algorithm to hide some PE files. Yeah, there was like a crypto miner that was being used, a Monero crypto miner, which we had some conversations around this. It's easy to say that, okay, since it's a crypto miner, they're probably using this to just go do some mining, make some money. Maybe they've exfilled everything they need from a system, or maybe they've identified a system and they just want to make some easy cash. But the other side of that too is possibly that these attackers want to fire off red herrings in other areas of the organization so that the focus ends up happening
Starting point is 00:11:33 on that. So let's say they use something that's a crypto miner that might be more easily recognized or more easily caught or identified quickly. And let's say they're using that in a completely different part of the organization in which they're actually attacking and exfilling data. It's an attempt to move the attention of the SOC and other individuals to that and to addressing that. So we're kind of torn between those two, but definitely interesting for sure. Yeah. Any patterns that you're seeing in terms of who this seems to be focused at, both in terms of groups and geographically? We looked at that. They're definitely not targeting an individual. If they're going to spend the amount of time they did here to hide and obfuscate, again,
Starting point is 00:12:13 using a WAV file, something that no one would ever think is malicious. They really put a lot of thought into this. They wanted to stay hidden as long as they could, which really means that they're targeting enterprises and organizations. We haven't seen a trend at this point as to specifically, you know, a vertical like healthcare or auto industry, but we're definitely keeping an eye on them to see, you know, where they end up going from here. And in terms of detection and protecting yourself against this, what are your recommendations
Starting point is 00:12:40 there? Since this payload is, you know, it's loaded into memory. So it's really only detectable in that space. You know, it's easy to understand or to analyze a system that's being crypto mined by some of the things that are happening on it most easily. Usually the CPU or GPU is maxed out for a consistent amount of time, but that's dealing with the symptoms of it. So in order to really handle this specific attack, you'd have to have something that is looking in the memory space, understands what's going on in that memory space and is able to make a decision or determination or to take preventative steps from analyzing what's going on in that [...] you have these legacy files that have been around for decades, these legacy file formats. And I think we sort of categorize them in our minds as being completely benign, partly because they're sort of tried and true. Those of us in certain
Starting point is 00:13:38 industries like podcasting, we're slinging around WAV files all day long and not thinking twice about it. I wonder if it's time for a little bit of recalibration when it comes to how we think about these legacy files. I mean, absolutely. Obviously, as time goes on, things get better, right? Things change again, like MP3 has been around also for a very long amount of time. You know, there's also OGG files. There's a whole bunch of different file formats that can be leveraged. And I definitely think that when we look at something that, you know, I hate to say is so old because I'm probably just as old as it, but something that has been around, right? We don't think about security when it comes to these things because they're just, as you mentioned, a normal part of every day and haven't really been tied in the past to anything that would be malicious, right? So I don't want to call it a wake-up call, but it's definitely
Starting point is 00:14:29 something where you look at that and go, wow, yeah, maybe I should put up a little bit more, be a little bit more concerned around some of these, right? And just maybe use a little bit more operational security per se when handling these. Yeah. I mean, just as you said at the outset, just that these file formats internally don't have the type of integrity checking that we've come to expect from modern file formats. And perhaps the very fact that they operate in that way means that either they deserve a closer look or even, you know, maybe it's time to, I don't know, recommend that these formats maybe get retired for something a little more modern. Yeah. I mean, I would agree a hundred
Starting point is 00:15:10 percent. I mean, it's not like we don't have anything better out there. Right. Right. I mean, getting rid of WAV files, I obviously... It's hard to imagine, right? It's hard. Yeah. Yeah. I don't know, like, but yeah. And I don't know how widely they're still used and leveraged. In my small world, it's all about compressed files. Because I remember back in the days of copying DVDs or ripping the DVDs. And it was a WAV file. And it was ridiculously huge for back then, obviously. And then when MP3s came out, you're like, wait, it's a tenth of the size?
Starting point is 00:15:43 OK. Right. And that was back at least in the 90s. So, we're looking at 20 years. So, maybe it is time to just be like, hey, we're not going to leverage that anymore. The funny part of it is that in the interim, file storage capacity has become so crazy. Yeah. You know, it's sort of meaningless.
Starting point is 00:16:04 Sure. That a large WAV file, it's not really a barrier anymore. I agree. In professional audio production, I mean, WAV is, to this day, is one of the industry standards because it's uncompressed, you know, and so ubiquitous. So what an interesting thing to have to think about. Yeah, I mean, this research really does beg the question of the things we thought were safe. Now we have to, you know, we do have
Starting point is 00:16:30 to put kind of an eye to and the things maybe, you know, we always say if you don't build something with security in it from the get go, it's hard to strap security onto it afterwards. Right. And obviously this would fall into that realm. Right. No, I don't think anybody ever considered that. And again, it's still fairly new to see this, but it begs the question as to how long was maybe this going on before we actually identified it right across, you know, across the globe, across the population. That's Eric Milam from BlackBerry Cylance. The research is titled Malicious Payloads Hiding Beneath the WAV. We'll have a link in the show notes.
Starting point is 00:17:49 The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies. Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsea Bond, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin, Nick Veliky, Gina Johnson, Bennett Moe, Chris Russell, John Petrik, Jennifer Eiben, Rick Howard, Peter Kilpe, and I'm Dave Bittner. Thanks for listening.
