Risky Business - Risky Biz Soap Box: Mike Wiacek on lazy mode threat hunting

Starting point is 00:00:00 Hey everyone and welcome to this special Soapbox edition of the show. My name's Patrick Gray. And for those of you who don't know, these Soapbox editions of the podcast are wholly sponsored and that means everyone you hear in one of these Soapbox shows paid to be here. Today's Soapbox is with Mike Wiacek, the Chief Executive and Founder of Stairwell. Mike is an ex-Googler who was also the founder of Chronicle, the company that was spun out of Google and then later reacquired by Google because they're galaxy brain strategic geniuses, I guess. But yeah, after the dust settled on all of that, he spun up Stairwell. And Stairwell is a really interesting platform that basically seeks to create something similar to a network detection and response platform, but for file analysis instead of network traffic. The idea is you get a copy of every unique file

Starting point is 00:00:58 in your environment, you know, historically, and as they arrive, you get all of those files, you stick them in the Stairwell platform, and you can do that via what is essentially a file forwarding agent. And then you set up an inventory also that lists where these files exist in your environment, at what times, so on and so forth. And from there, once you've brought all of this stuff together, you can start doing some analysis. So if you find a dodgy file, you can do all of the usual malware analysis type stuff, but you can also do things like immediately find out where else that file is in your organization or even where else it was. And from there, you can identify other files that are similar, variants of those files. So you can identify the whole family and then

Starting point is 00:01:43 search from those and outwards and onwards and outwards and onwards. And you can unpack all of this very, very quickly. So this is the type of tool that EDR companies use internally on their back ends to do threat hunting. But Stairwell is for you and your organization. You can drive it. And as you'll hear, the idea of a transparent, customizable, and programmable security stack is something that's kind of on trend at the moment.

Starting point is 00:02:08 And I actually think that's quite positive. So here's Mike starting out by laying out the case that doing this sort of file analysis in your organization makes a whole lot of sense. Enjoy. I'm a guy who built systems to play with YARA at scale, studied data in VirusTotal over and over and over again, always constantly evolving. And I realized that from a data perspective, I knew more about the files that exist in a feed like VirusTotal than I did about the files on the computer two cubicles down from me. And that's very strange, right? Like I had a better ability to apply YARA

Starting point is 00:02:47 over billions of files in this data set, but not on the thousands of files on the computer two computers over. And how do I flip that? Because I actually had a really cool sense of information through this window, but I couldn't look out that window. And as we started trying to figure out what Stairwell would do, we decided to say,

Starting point is 00:03:12 let's stop. Let's do the opposite of what we were doing with Chronicle. Instead of collecting logs, I'm pretty certain when I left, they were in the exabytes of logs. And if they were to tell me today they're in the tens of exabytes of logs stored within Chronicle's platform at Google, I would not be surprised. And so when you start thinking about that much data, you start thinking about that volume, what is the actual signal in that cost? I did some math the other day, and this is absolutely crazy. If you were storing logs on four terabyte hard drives, and I'm talking three and a half inch SATA hard drives, if you were storing logs on four terabyte hard drives, to get to one exabyte, and if you stacked

Starting point is 00:03:56 these hard drives on top of each other, like flat, flat, flat, going up, what would be a good representation of how big that tower would be, right? And it's actually, it's 250,000 hard drives. And if you were to stack them up on top of each other, it ends up being 3.157 miles tall. That I was trying to come... That is tall. I mean, that's, you know, it's not as cool as stacking them up and they go to the moon, which is the usual metric when we're stacking things, Mike. But, you know, that's still a lot of hard drives. That's a lot of hard drives. I mean, I guess that's an interesting point, right? [...] files in your environment on every single endpoint is actually less data intensive than

Starting point is 00:04:46 handling logs, which is wild when you think about it. It's so counterintuitive. People are like, oh, but you're collecting all of the files. It's terrible. Isn't that a lot of data? And it's like, it's not small amounts of data by any means. Yeah. But compared to what you're putting in your logs, it's actually incredibly smaller. And if you want proof of that, just go check your latest Splunk invoice, because they're going to tell you exactly how many bytes. Yeah. I like to say, it's like logs are not worthless, but they're worth a lot less than we think they are, is basically one of the insights that we had. If I start collecting

Starting point is 00:05:25 the files and I start storing those files, that opens up so many more opportunities because now what I have is in some sense, ground truth data, right? Like logs give me metadata about what happened around some sort of time. But usually when you start getting down to files or, you know, thinking about like, you know, the actual contents of a Python script that's running or PowerShell script that's actually running, those are the actual commands that are running. That is the ground truth. The logs is showing you, telling you about what happened when that thing ran. I'd rather have the thing that actually ran so I can go over and look at it and be able to like reconstruct it and do analysis over it after the fact. Sure, sure. But I mean, you're not going to find everything with files, right? Because obviously there's various types of attacks where, you know,

Starting point is 00:06:14 you're not really doing anything with files, but yeah, for the vast majority of stuff, a hundred percent, absolutely. But you know, what occurs to me too about Stairwell is that what you've done is create something similar to the types of tools that are actually used by the EDR companies on their end to do analysis across all of the files that they encounter. But instead, you're sort of moving that, the possibility of doing that sort of work into people's enterprises, which makes me think, I mean, there's a whole bunch of startups at the moment who are doing this sort of thing, right? So you've got ones like Sublime Security and Sublime are doing email security, but now you can actually throw your own rules into it, right? So it's not like one of these black box, you know, cloud-based email security firms, you know, you actually then can come on-prem, do threat hunting on your own data, you know, spin up your own rules, things like that. I mean, it's, it seems like, you know, there's another vendor too, that's kind of like a

Starting point is 00:07:13 little bit stealth, so I won't mention them, but it does seem like there's this trend now of, you know, getting more programmable, transparent and configurable security tools into your own premises that allow for greater flexibility, right? So that's a big part of, I'm guessing, what Stairwell's bringing for a lot of your customers. I think that's totally true. I think we're entering a new stage when I think about security capabilities. Before it was like, give me some black box, which I install or I send data to, and then it tells me what's bad. And I just act on that. And a personal soapbox of mine for a long time has been like, too many security professionals are just consumers of alerts from security products. We should be practitioners. You would not go to a lawyer

Starting point is 00:08:05 and this lawyer simply just typed everything you said into ChatGPT, asked for legal advice, and then he just regurgitated back to you what the box told him to say to you. You actually want someone to actually think about this. But in security, we've almost pushed the other direction. It's like, just do what the thing says. Do what my EDR says. Do what my firewall's alerts are telling you to do. And we've, you know, I've actually had a customer say this to me the other day, which I thought was great.

Starting point is 00:08:38 He said, Stairwell is the one platform I don't want to SOAR to death. And I thought that that was a really... I took that as a badge of honor. But he's like, no, I don't want to just plug you in with an API and call it done. He goes, I get a lot of value by going in and looking around. And we can talk about some of the cool things we've been building in a bit. But that's like one of those cases where you show them that there's a workbench where they can understand how this particular thing fits into everything else that's happening in their environment, how things may be moving around. Give them that visibility.

Starting point is 00:09:19 Most of the time, that's it. They have one alert and that's it. But in cases where there's not, or in cases where there's co-occurring things or anything happening on other devices, you may know about this piece of the puzzle, but you don't know about that piece of the puzzle. And being able to construct that together across time where I'm not wholly reliant on signatures, heuristics, an AI model, pick your magic black box of filtering good or bad, watching the here and now and only the here and now. And so since we're basically collecting all of these files and storing them in perpetuity, any new bit of information almost essentially colors the graph of what's bad and good, different shades of gray at any point in time. And so

Starting point is 00:10:06 you're always having new adaptations. And that's one of the things that customers actually often are kind of shocked by is like, well, this was good yesterday. I'm like, yeah, well, it's bad today. Well, this was bad yesterday. Well, now it's good today, where you start understanding the fluidity of what's happening. And so you start looking for trend lines much more importantly, not trying to solve every problem, right? Like, I don't view Stairwell as a replacement for EDR. You should have your EDR. Stairwell provides a different level of detection above what EDR is capable of doing. Well, it gives you the flexibility, right? And I think a lot of what we do now, so much of how we handle things these days is driven by, like, there being a tight labor market for really specialist people in security.

Starting point is 00:10:52 So we kind of outsource a lot of these decisions, you know, in the moment decisions to companies like your CrowdStrikes and your SentinelOnes and whatever, who have teams who are trying to do this at scale for everyone all the time. And it's actually a model that makes sense. But I guess what I'm getting at is, if you're a larger company or a more attacked organization and you just need that little bit more flexibility, yeah, I just think this is an interesting trend towards more of a, you know, programmable and transparent security tech stack. Yeah. I mean, I actually think, in some sense, black boxes are a problem in security. They benefit the attackers. Because here's an honest question, or it's not a question, more of an anecdote. It would be, when I was at Google and we were looking at some AV solutions, this is years and years and years ago, we were looking at different AV vendors that were out

Starting point is 00:11:50 there. And we asked the Project Zero team to say, which of these products do you think actually holds up well to an attacker going after them? And none of them did. We won't name names or say anything like this, and this is over a decade ago, so it's a long time ago, but it was what none of them did. Like, in fact, they came back to us and they simply said, you know, all of them, we found remote code execution vulnerabilities in all of them. And so you're like, wow, by installing the security software, I've made myself less secure. How many companies get down to that level where you would have like a Project Zero engineer tearing apart a security control or system you're thinking about deploying before you do it? And the answer is like none. Like Nobody does that. That's ridiculous. But do you know

Starting point is 00:12:46 what? The groups who are writing prolific ransomware, they do that. They're taking these things apart. Conti is taking this stuff apart. Three-letter agencies around the world, three, four-letter agencies, depending on what country you're in, they're taking it apart. And so the question there is like, you're putting your eggs in the basket of a product where the people you're probably most afraid of, and who pose the most risk to your business, they know how these tools work inside and out and you don't. [...] becomes like, in some sense, a self-fulfilling prophecy of like, I'm hoping this thing protects me, even though I know that they already have figured out how to work around whatever its capabilities are. Going back to the Stuxnet anecdote of like code paths for different security products

Starting point is 00:13:38 on device. That's strange, right? And so like one of the things for us is like, we want to show you this. We want to tell you, like, this is how. This is based on, you know, high entropy. This is based on this. This is based on how common it is. This is based on these YARA signatures, which you can go look at. This is what's happening. You know, definitely lots of interesting value there. The most fundamentally valuable thing that you get out of Stairwell too is that, if you're trying to set up for detection, you're doing the analysis like somewhere other than the endpoint.

Starting point is 00:14:10 Bingo. Right. So this is like, doing the scanning on the actual endpoints, like when it comes to file scanning, honestly, it only makes sense because that's the way we've always done it. Like as you said earlier, like imagine green fields, like should we scan the dangerous things on the target devices? Yeah, probably not. That seems crazy, actually. Do you have to beep that? Yes, I do.

Starting point is 00:14:39 No, that is, yes, that's exactly the point. It's like the big thing for Stairwell, when we think about it in that particular case, is that we're place shifting where the analysis happens. Yes. There is no feedback. In some sense, that's actually one of the strangest things that people will first realize is they're like, well, how do you block something? And I'm like, we don't. Right. That is where your EDR has an incredibly valuable role. That is the active change agent

Starting point is 00:15:10 on a device that you're responsible for. Work with them to block a process, kill off a file, do what you need to do. We are completely passive. We collect data, we store it privately and securely in our platform. You can see everything there is to see about it. It's there for as long as you want it to be. At the end of the day, if something is believed to be malicious, we will tell you and we will show you where and we'll tell you why and do all of this type of stuff. But at the end of the day, you build the integration or we'll build it for you to actually go take that action on the device. But it's not like there's, using an academic term, an oracle on that machine that's saying good, bad,

Starting point is 00:15:52 good, bad. There is no way to see what Stairwell thinks about a file without providing it to Stairwell in the first place. And then it gets stored. And then that opens up all sorts of interesting opportunities. Do you actually see detection, detection, detection, a variant of the file that fails detection, you realize they've worked around it. So now you can actually study the evolution of files the way people do inside of VirusTotal. But you can actually do that on devices in your enterprise. And that actually becomes a really, really interesting thing. And the barrier to entry for tackling some of that does not require you to be a reverse engineer anymore. Like we've been able to like really bring down the level of sophistication required that, you know, I've been teaching my 11 year old son how to discern malware from non-malware, just looking at metadata that we have, like in some of our download environments. And it's like when you can teach an 11 year old boy who has the attention span of a race of Mario Kart to start doing some of this work,

Starting point is 00:16:49 you start realizing that you're actually really bringing down that barrier to entry. And so as you said earlier, decisions made because of human resource constraints can go away. That is where you get to start to have a lot of fun with security again, right? You're basically setting up your platform to be able to do threat hunting fairly trivially, right? And I don't currently know, I mean, I know that the EDR companies do this on behalf of their customers. But again, you know, you're talking about spreading a small number of people across just epic scale.

Starting point is 00:17:32 This is much more about, you know, doing it yourself, using a whole bunch of automated tools based on files that are appearing in your organization. I mean, you can even do something as simple as like, there is this executable file or DLL in my organization that is unique. No one else in the world has ever seen this file. Geez, you know, maybe I should just do some analysis on it and see if there's variants of this file elsewhere in my organization. And, you know, just off you go from there. But, you know, it's much easier when you set up for it with something like Stairwell, like that's clearly true. Clearly true. That's the value of preserving those files up front, building that data lake. You're future proofing your ability to threat hunt at any point in time. You're future proofing your ability to train AI models at any point in time. Like, one of the things I find really interesting

Starting point is 00:18:19 with the way security companies are using ai is they're training it on this relatively, not universally, but relatively the same set of data. They're buying the same data, the same feeds, they're getting it from the same people, and then they're training on it. And so what you end up with is a bunch of different AIs from different vendors that all work the same. And so one of the cool things is, again, for any AI model, the training data defines how good your model is going to be at the end of the day, garbage in, garbage out. What's cool about the way that we're working, since we're building unique data lakes per customer, you actually have the ability at any point in time to take that data and train

Starting point is 00:19:00 whatever you want on it. And so when you start thinking about it like that, you can take what our global view is, and then you could take what your local view is and build a hybrid model that gives unique data for you. If you're walking down the street and you find my Visa card and you find Eric Schmidt's Visa card, one of them could probably go buy a Tesla on the credit card and one of them will not. I'll leave it to you to decide which credit card could buy the Tesla on a Visa card, right? It's not me. And so when you start thinking about it like that, like credit card, they have fraud models which are tailored to the customer,

Starting point is 00:19:33 tailored to the customer's income, buying purchases and so forth. But security seems to be one more, much more, again, not universally, but much more on the one size fits all approach. And that homogeneity, that groupthink does not benefit the defenders. It benefits the threat actors. Well, yeah, this is again, this is like the third time I've come back to this, but again, this is why I'm seeing this push towards much more configurable and programmable security tools, right? It's happening across the board. I mean, I think the trend that was like the last five years was towards security tools all becoming, all getting an API, right?

Starting point is 00:20:17 So that there was at least some level of automation that you could, you know, or SOAR that you could do, right? So you could take something like Tines or whatever automation platform you want and start gluing this stuff together. But now the trend seems to be, you know, to be able to adjust what that API actually does now, right? So everything's becoming much more programmable.

Starting point is 00:20:39 So the first stage was getting all of the APIs working. Now it's like, okay, how do we actually start tuning these APIs or, you know, configuring things in the background so that the APIs do what we want, you know? And I can think of like five companies off the top of my head that fall into this category of being, you know, programmable security stack play. So let's jump from there, right? So it's like, when you think about APIs in security, right, they're useful. [...] a file, delete a file. You have a very simplistic API for S3 at the high level. And that serves it really well. The question, though, is that the right API for people to use for a higher level challenge like security on a daily basis?

Starting point is 00:21:45 Like storing a file, anyone can go store files like in S3 real easy. The question is, what do you, for security purposes, it's not just get the file. There's a lot of analysis that you want on that file. There's a lot of data you want on it. So at what level of like the bare, from the bare bottom of get, store, delete, do you have all the way up to like,

Starting point is 00:22:06 tell me which of the files I've stored is malicious via a threat actor based in the Middle East, right? Like where on that stack do you do that? And then how do you plug into it? Well, I guess the point that you're making is that perhaps, you know, it's less about automations with you and more about like actually having

Starting point is 00:22:25 someone there to manually work through it, you know, in a way that's very efficient. I think it's a bit of both. And I actually think it's like doing some of that work automatically may lead to interesting novel detections, which should go to people after the fact. It's the level of where we look at it. What is the unit of granularity? Are we looking at the behavior of one particular process? Are we looking at the byte sequences of one particular file? How do they all kind of interconnect? This actually led to a new feature that we've been prototyping and building lately.

Starting point is 00:23:00 We've been showing it to customers. I was going to talk about it with you here today. And it kind of came about in a conversation with a, we hired a new VP of marketing recently, and I was walking him through, he asked me, he said, how would a SOC analyst use Stairwell? And they get an alert from their EDR. What do they do? And I said, oh, okay, interesting. I've never actually, I've never storyboarded this out before. So I grabbed a marker and went up to a whiteboard with him. And I said, this is what I would do. I'd probably take the hash of the file from the EDR alert. I'd come into Stairwell and I'd first see, what machines have that exact file? If it's from an EDR,

Starting point is 00:23:45 I know at least one does. So what other machines have that? And then I would say, let me extract out the IOCs from that file. I'll detonate it. I'll pull in host names, IPs, registry keys. I'll save those and we can come back to those later. I'd then probably see, does it match any YARA rules? And you think in our platform, we have tens of thousands of YARA rules floating around. So I'm looking for any high precision rules that match that give me a quick idea about what it is. I'll check that. Again, detonate that in a sandbox, try and get a constrained read on what its dynamic behavior is. Then we have a feature called variant discovery where we can, in two or three seconds, show you from our corpus of almost a billion files what other files are very similar to that one in near real time. So

Starting point is 00:24:32 it's like, do I have any variants of this file on any other computer? And it ended up being like a 10-stage process. And I'm sitting there with him, and we're looking at it, and I'm like, yeah, there's more, but like that about paints out that picture. And then I had this light bulb in the back of my head go off. I was like, well, why don't we just do all of that in one step automatically? Why make you go through 10 steps to get there? And I ran over, grabbed some engineers, and we started prototyping something literally that afternoon. And it's a feature that we're calling Run to Ground.

Starting point is 00:25:10 And honestly, it's giddy to be able to show it. Maybe we'll do another one of the video demos with you to actually show a demo of this. When we have a non-internal ugly UI of the capability. There's like a polished one. So it's like, give me a hash. Here's what run to ground does. And it kind of, it's awesome. It basically, it's like,

Starting point is 00:25:35 take that file, run it through a variant discovery process, find me the variants of it. And then do that recursively like three or four more times. So if you think about family members, it's like, find me the suspect, their siblings,

Starting point is 00:25:45 their first cousin, second cousins, third cousins. So it basically blows that out into a giant set of potential files, some far distant from the original, but you have a very large set. We then go over and we say, find me every computer at my company.

Starting point is 00:26:01 That has anything from this family. Any of them. Yep. For each one of those, that defines a potential, the time that that file was first seen on each such device defines like a reference time and a device. And so then what we do is, since we have a timeline of everything on every machine with the files intact, we say, okay, go back n units of time and go forward n units of time on each device centered around the time when that file showed up, and then filter away any file

Starting point is 00:26:35 that was first seen on that device around that time. That's very common. So toss those away. I have very uncommon file timeline hours to days before that file, hours to days after that file for every such thing. And yeah, we had the data to do this. And we had the infrastructure to pull that off very quickly at scale. And so we threw it together. And we started asking some of our longtime customers to come in and say, run some of your alerts from your EDR through this thing. And it has what I call the Craigslist UI.
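The Run to Ground steps described above (expand the family a few hops, find every device that has a family member, window the timeline around each first sighting, drop the very common files) can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not Stairwell's implementation: `Sighting`, `expand_family`, `run_to_ground`, the `variants_of` callback and the `prevalence` mapping are all hypothetical names, the reference time is simplified to the earliest family sighting per device, and "very common" is modelled as a plain count threshold.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Sighting:
    device: str        # hostname or asset ID (hypothetical field names)
    file_hash: str
    first_seen: float  # seconds since epoch


def expand_family(seed, variants_of, depth=3):
    """Breadth-first expansion: the seed, its variants, their variants, up to `depth` hops."""
    family = {seed}
    frontier = deque([(seed, 0)])
    while frontier:
        file_hash, hops = frontier.popleft()
        if hops == depth:
            continue  # do not expand beyond the requested number of hops
        for variant in variants_of(file_hash):
            if variant not in family:
                family.add(variant)
                frontier.append((variant, hops + 1))
    return family


def run_to_ground(seed, variants_of, sightings, prevalence,
                  window=3600.0, max_prevalence=5, depth=3):
    """Return {device: [Sighting, ...]} of uncommon files seen near a family member."""
    family = expand_family(seed, variants_of, depth)

    # Reference time per device: earliest sighting of any family member there.
    reference = {}
    for s in sightings:
        if s.file_hash in family:
            reference[s.device] = min(s.first_seen,
                                      reference.get(s.device, s.first_seen))

    timelines = {}
    for s in sightings:
        ref = reference.get(s.device)
        if ref is None or abs(s.first_seen - ref) > window:
            continue  # device has no family member, or event is outside the window
        if s.file_hash not in family and prevalence.get(s.file_hash, 0) > max_prevalence:
            continue  # toss away very common files
        timelines.setdefault(s.device, []).append(s)

    for events in timelines.values():
        events.sort(key=lambda s: s.first_seen)
    return timelines
```

In this sketch the caller supplies `variants_of` (whatever similarity search is available) and a `prevalence` count per hash, so the windowing and filtering logic stays independent of how variants are actually discovered.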

Starting point is 00:27:11 It's pure text. But run it through. See what you found. And within like a day, we actually found stuff that was going undetected by EDR on their systems. And one of them was actually an APT infection that we found. And what it was, was their EDR flagged it on one machine. And we found a variant that was on a different machine, like a day before, that EDR did not flag and actually had managed to take hold on. And so by being able to take

Starting point is 00:27:45 new information and apply it across all time, because we have the bytes of the files, we're not trying to match coarse behaviors, you end up with this amazing ability to almost self-inoculate your entire history and forward history as well for anything that you know to be bad. And when you get that, you're like, boom, oh yeah. What happened around the time that file was on there? You actually saw a script run, which hit a URL, which downloads the file, saves it as the one that you found.

Starting point is 00:28:18 And then after that, you found a second stage show up about two seconds after it started executing. And so you boom, boom, boom. You almost have that forensic timeline of what happened on every machine in like, again, five to 10 seconds after you start loading it up. And that's where it's like, you know, I think about it like this, that, you know, if I was at a different company and we had a breach and we found something that we believed to be APT, we'd end up calling Mandiant or someone like that up for breach help. They'd come in, they'd be there for a week, two, three, more,

Starting point is 00:28:52 probably writing a six, seven figure check to get that data that we just gave you in five seconds. Yeah. So, I mean, how common is it that you're finding, like, you know, APT tools that are going undetected? I mean, you just gave us an example there. Where's the threat report? Mike, that's normally what happens is someone finds something like this and there's a threat report that goes out and a big song and dance. We don't own that data, right? I think, you know, at the end of the day, it's like that customer, that's their files, that's their data. We don't have that data available in a public source where we can go talk about it. So at the end of the day, we want to respect people's privacy as well. You wanted to talk about it and I wouldn't let you,

Starting point is 00:29:35 is what I'm getting from this. That's their call, right? If they were willing to do that, I'm all for it. I mean, I have some really cool stories. We've been finding interesting malware that was capable of jumping air gaps and so forth. And it was on USB drives that were being moved around. And that's another one. When we capture files, we're able to capture the GUID attached to the file system of a USB drive as it's moved around. So not only can we tell you what was on a drive that was plugged into a machine, we can tell you every single machine that that particular drive has ever been plugged into as well. And so when you start thinking about this, like having this giant source of not just the collection of the files from all of these things, but the metadata about what computers had seen those files intersected together is literally you're operationalizing.
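The USB point above boils down to an inverted index keyed on the drive's file system GUID. A minimal Python sketch, assuming collection events arrive as `(volume_guid, machine, file_hash)` tuples; that event shape and the function names `index_by_volume` and `exposed_machines` are illustrative assumptions, not anything the platform is confirmed to expose.

```python
from collections import defaultdict


def index_by_volume(events):
    """Build two lookups from (volume_guid, machine, file_hash) events.

    Returns (machines, files): the machines each volume was plugged into,
    and the file hashes ever collected from each volume.
    """
    machines = defaultdict(set)
    files = defaultdict(set)
    for volume_guid, machine, file_hash in events:
        machines[volume_guid].add(machine)
        files[volume_guid].add(file_hash)
    return machines, files


def exposed_machines(file_hash, machines, files):
    """Every machine that ever mounted a drive that carried `file_hash`."""
    exposed = set()
    for volume_guid, hashes in files.items():
        if file_hash in hashes:
            exposed |= machines[volume_guid]
    return exposed
```

With an index like this, a single bad hash found on one drive expands to every machine that drive has touched, including machines where that file itself was never collected.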

Starting point is 00:30:29 Think about like a Mandiant level IR engagement. And you can operationalize that 1,000, 5,000 times a day on every single alert that you actually have to look at. And you get this full historic view, which is just like, to me, it's a security candy store. Yeah, yeah, yeah. That's exactly what I've always wanted. Mike Wiacek, thank you so much for joining us for that conversation. All very interesting stuff. And we'll look forward to doing it again sometime soon.

Starting point is 00:30:56 Cheers. Cheers. That was Mike Wiacek of Stairwell there. And you can find them at stairwell.com. I do hope you enjoyed that podcast. Thanks for listening.
