The a16z Show - Anatomy of the SolarWinds Hack: Who What Where When How

Starting point is 00:00:00 Hi, everyone. Welcome to the a16z Podcast. I'm Sonal, and we're sharing this episode that just dropped on 16 Minutes here today because it's all about the SolarWinds hack, one of the largest, at least publicly known and so far, hacks of all time. It's widely relevant to all, especially given ripple effects, including the Wall Street Journal reporting just yesterday that according to a recent government investigation, 30% of both private sector and government victims linked to the hack had no direct connection to SolarWinds. So it's going to have ripple effects for quite some time. So we did an anatomy of a hack in this long explainer episode, a teardown of the specifics we know so far, what went down, and what we all need to know whether you're a big company, small company, or individual. For quick context, before I introduce our experts, over 18,000 customers downloaded compromised software. This is as reported in December, though obviously it now goes well beyond them. Those customers include several large government agencies, which we covered on 16 Minutes last year. Private sector victims include companies like Cisco, Intel, Microsoft, Nvidia, Deloitte, VMware, Belkin, and others.

Starting point is 00:01:07 The broad consensus, per a statement issued by the Office of the Director of National Intelligence, the FBI, the Department of Homeland Security, and the National Security Agency, is that Russia was most likely the origin of the hacking. And more specifically, that the Cozy Bear group, also known as APT29, overseen by Russia's intelligence service, was responsible. That's just a super quick high level because we're going to actually go deeper and break down the who, what, where, when, how, and the chess game of it all. So now, let me quickly introduce our experts. Our in-house expert is a16z operating partner for security and former CISO, Joel de la Garza. And our special expert guest is Steven Adair, the president of Volexity, an information security firm that does incident response and forensics, including memory forensics, and they've responded to multiple cases of this. Their team actually put out several detailed posts on this and more. But first, Steven, can you summarize what happened? Obviously, we'll continue to dig in on the details throughout the episode,

Starting point is 00:02:02 but also what category of hack this is. Start with the basics. Yeah, sure. So SolarWinds is a company that creates network and system management software that's used really heavily by tens of thousands of organizations around the world. So it's used by large giant commercial companies, Fortune 500. It's used by small organizations, managed service providers, and governments. So it's a piece of software used to manage these really sensitive, important assets. So think about the IT teams and people who want to watch what's going on on key systems, on network devices and things that are really important within a network. They have a product called Orion.

Starting point is 00:02:36 That's their flagship product. And what happened is that SolarWinds was basically breached. How exactly, you know, that's not really been published. We don't know. But attackers were able to compromise SolarWinds and get into what's called, like, the build process of this product. So essentially, for the development of this software that's downloaded and used by all these organizations, they were able to get into SolarWinds' networks and modify that build process.

Starting point is 00:02:59 And what's interesting and notable about this is they didn't go in and modify the source code. Think about it like an assembly line: instead of someone making a change early on before it all gets put together, they actually waited until the very end, the very last step of compiling this package to make this software that goes out. And they monitored it, they watched it, they looked at it, they learned, they tested, and they ended up compiling in a backdoor, which would give them access to the systems running SolarWinds Orion for anyone who installed the update or downloaded it freshly since they did this. So they were able to modify SolarWinds and push out this update to organizations all around the world. Basically, they'd create a shopping list and selectively target who it was that they wanted to go into and basically break into and further their access.

Starting point is 00:03:42 They could look through and see, oh, this company or this government agency, I'm very interested in them. They could actually activate and walk right into their network, and they're already sitting in and going into a very sensitive part of that network. So in short, it's what's called a supply chain compromise, where they were able to get into the build process, insert a backdoor into this legitimate software, and expand their access, and do it very stealthily for many, many months until, you know,

Starting point is 00:04:06 FireEye came forward and figured this all out in December 2020. Right. And just to quickly give even more high-level context, this is playing out against the broader landscape where, for many years now, companies have obviously been using various providers of third-party and cloud software and services. We'll delve into this whole notion of a supply chain hack, what it means, what it means for the future of security. But the thing I want to really pull on from what you said is that this was very unusual because they didn't go for the source code. They kind of waited for the updates. And then they were very targeted as opposed to just sort of spray and pray. So in your

Starting point is 00:04:44 assessment of all the hacks that you've seen out there, Joel, I want to hear your thoughts too here. Is this really a sophisticated hack? Because obviously, in our show, we not only tease apart what's hype, what's real, but I often wonder if that word gets thrown about very casually. Yeah. So in our opinion, this aspect of it is certainly one of the more sophisticated that we've seen. It's not necessarily that there aren't a lot of smart people around the world, good and bad, that could pull off something similar. It's, you know, one, the fact that they did. Two, they did it so strategically. And three, you know, even if they had gone in and modified the source code, people would still be

Starting point is 00:05:16 talking about how sophisticated it was. But they took it up a notch and basically said, yeah, if we modify this code, maybe someone's watching it, or they audit it, or someone's watching the check-in process. Basically, they went to a system where none of that mattered anymore. And they just kind of bypassed all that and went like straight for the jugular. I would argue to say a much more difficult way to go about it, but a lot more likely to meet with success and go undetected. And I think they gambled correctly in this case. I mean, I think with these kinds of operations, and this is ultimately an espionage, you know, nation-state professional-type operation. From my perspective, the duration and the extent to which these things can run undetected is usually the indicator of how sophisticated they are.

Starting point is 00:05:57 And so, like, these long-running, you know, really successful campaigns that avoid detection really belies like a level of sophistication. Because operational security, right, like covering your tracks is actually just about as hard as getting in. And so, you know, the fact that they exercised the ability to cover their tracks for so long, to know where to insert in the process, and to lay low is just indicative of a level of discipline that you don't necessarily see in a lot of attackers. Not just get in, but be able to cover their tracks, which is what both of you guys say. And by the way, we've only talked about the duration of when the hack was revealed by FireEye and that it had been, you know, several months before.

Starting point is 00:06:33 Do you guys have specifics on what the latest date point is in that timeline? Yeah. The first, essentially what they did was an experiment early on, and this has been posted publicly. The code in SolarWinds Orion was modified in late 2019, where basically they made some initial modifications, which actually didn't do anything malicious or put in a backdoor or allow any type of access. Software went out, and they basically were able to prove,

Starting point is 00:06:54 like, hey, I succeeded at doing this. It existed. No one noticed anything. And essentially waited at some point to move on to phase two, which was, okay, I can get in, I can go undetected, I can have it build, it all works, stuff makes it into production. No one notices. And they said, okay, well, I'm satisfied with that. Now it's time to go for

Starting point is 00:07:11 broke and put the actual code in there and open the floodgates. What you just described, Steven, sounds exactly the way a company builds a product. Like, hey, we're going to test it out. We're going to try an experiment, an MVP, a minimum viable product, if you will. Then we'll, based on that, decide how to deploy it and target it and blah, blah, blah. I mean, I hate to say that, but that's exactly what you just described sounded like. Yeah, and honestly, it wouldn't surprise me if they had done some way of trying to basically clone their development environment too and probably tested this, I would guess, probably pretty thoroughly before they even ran the tests within their network.

Starting point is 00:07:44 So they were incredibly savvy in certain ways in terms of how targeted they were and the choices they made. In the Microsoft blog post, one line in particular really struck me. It said that the threat actors were savvy enough to avoid giveaway terminology like backdoor, key logger, etc. Instead, they gave their tampered code an innocuous name, Orion Improvement Business Layer that would fit right into a marketing brochure. This is from an Axios post summarizing it. The attack's crucial door opening exploit was a small chunk of, quote,

Starting point is 00:08:17 poisoned code, which is what Microsoft dubbed it, all of five lines long or roughly 160 characters. And then Ina Fried at Axios goes on to comment, which I had to chuckle at, even though it's sad, that this could well be the most damage per character yet achieved in the short history of cyber warfare. So I am curious if you have any thoughts on some of those honestly quite clever things

Starting point is 00:08:37 that they did to hide undetected and any more specifics you could share there. And then we'll go into the step by step in a moment, too. The fact that they're not naming variables and naming things the way that's commonly used in attacks is mostly a credit to the existing kind of antivirus and anti-malware industry. You've got a lot of tools that are out there that are looking for this stuff. And you would imagine any adversary that's relatively sophisticated is going to run their changes through all those tools to make sure they don't get detected before they deploy it. And so that's just table stakes for this kind of activity. It doesn't really show any kind of real sophistication. Of course, it just depresses me to hear that. And we'll talk about this at the end, which is what

Starting point is 00:09:13 companies and people can do because I'm like, great, the better and better we get, the more and more sophisticated they get. And it just becomes this like never-ending back and forth, back and forth escalation. Espionage 101. Yeah, to be completely honest, that stuff doesn't surprise me, especially when their job is to like blend in as much as possible. But I'll add to one of the things and make sure that we give credit: some of the analysis of things we're talking about today is obviously from a lot of the security community coming together and publishing a lot of details. It's been great. But this is one of the other things that they did: they actually used an existing config file that is part of SolarWinds Orion. That's there legitimately. It was there five years ago. It was there two years

Starting point is 00:09:49 ago. It's there right now. But they actually repurposed that exact config file. They created a specific value and said if this is a three, you shouldn't beacon and you're basically turned off. And they used values in fields within this. So they then leveraged that file that's already being read and used by the program to then also inform it on some of what it should do. So they used native existing files and functionality and things that are very innocuous looking. And then they did a couple other steps beyond that that are pretty stealthy, although they're not necessarily rocket science.

Starting point is 00:10:14 They're very uncommon. One of them is the fact that this backdoor, once it's loaded, wouldn't start its beaconing or calling out for this DNS activity, which I know we haven't explained yet. But basically, the mechanism by which it actually gives that avenue of control back into these systems, you have to meet certain criteria before it would even, you know, beacon. So, for example, if you weren't domain joined, meaning you're less likely to be an actual corporate asset.

Starting point is 00:10:36 You're someone testing it on a computer. You're a workstation at home. You're not even going to pass the sniff test. But what they then do is they actually set a timer. And so it might be actually up to two weeks before it actually starts doing anything. It might be under scrutiny from QA or a build, or someone might be looking at it when they first install it, to make sure it's not malicious. So they actually say, hey, I'm just going to wait two weeks. I'm in this environment.
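To make those two gating conditions concrete, here is a minimal sketch of why a short QA or sandbox run would see nothing suspicious. It is only an illustration of the behavior described above, not the actual malware logic; the function name `would_beacon`, the 14-day default, and the sample dates are all assumptions.

```python
from datetime import datetime, timedelta

def would_beacon(domain_joined: bool,
                 installed_at: datetime,
                 now: datetime,
                 dormancy: timedelta = timedelta(days=14)) -> bool:
    """Model the two checks described in the episode (illustrative only).

    1. The host must be joined to a domain (likely a real corporate asset).
    2. A dormancy period must have elapsed since installation.
    """
    if not domain_joined:
        return False  # test box or home workstation: stay silent
    return now - installed_at >= dormancy

installed = datetime(2020, 6, 1)  # hypothetical install date

# A two-hour sandbox run on a domain-joined host: nothing happens yet.
print(would_beacon(True, installed, installed + timedelta(hours=2)))   # False
# A home workstation, even a month later: still nothing.
print(would_beacon(False, installed, installed + timedelta(days=30)))  # False
# A corporate server after the dormancy window: beaconing would begin.
print(would_beacon(True, installed, installed + timedelta(days=15)))   # True
```

The point of the sketch is that any review shorter than the dormancy window, or performed off-domain, observes a perfectly quiet program.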

Starting point is 00:10:57 This is for the long haul. I'm not in a rush to immediately get access to these systems. So that's an interesting aspect. It's actually fairly uncommon to see malware that is on any timer of significance or driven by a specific event that's not likely to happen very soon. The other thing that was really interesting: the malware basically would activate when a certain response was given to its query. Hey, go connect to this domain name or go connect to this website. And those domains that they used were actually domains that had expired. One of the telltale signs when you're looking at malware and these things is like, oh, it was just registered last week or last month or

Starting point is 00:11:32 earlier today. So this would pass that sniff test all day long. Some of them had existed for five or six years. They might even have like a website. They picked up infrastructure that had a history to it. They actually owned and controlled these domains. They weren't like hacked domains or things like that where they were using compromised infrastructure. So just kind of an interesting note on that front. It's interesting and honestly a little creepy. I got goosebumps while you were talking because it makes me think of every long game, the patience and waiting and stalking

Starting point is 00:12:08 that really skilled predators do. And I don't mean to glorify it by any means, but I'm just sharing that what you just shared in technical terms, it gave me goosebumps quite literally. I don't know how you think about it. When we first saw this in July last year, we had, I think, three domains that we had seen used in an actual attack. And as we looked into them, we said, wow, like we kind of noticed this. We said, yeah, these things have a real history. What the hell's going on here?
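A rough sketch of the "recently registered" heuristic being described, and why aged domains slip past it. This is a simplified assumption about how such a check might look, not any vendor's actual detection; the function name, the 30-day threshold, and the dates are hypothetical, and real tooling would pull registration dates from WHOIS or passive DNS rather than hard-coding them.

```python
from datetime import date

def looks_newly_registered(registered_on: date,
                           observed_on: date,
                           min_age_days: int = 30) -> bool:
    """Flag a domain whose registration is younger than min_age_days."""
    return (observed_on - registered_on).days < min_age_days

seen = date(2020, 7, 15)  # hypothetical date the traffic was observed

# Typical crimeware pattern: domain registered the day before use.
print(looks_newly_registered(date(2020, 7, 14), seen))  # True
# Domain with roughly five years of history: passes the check.
print(looks_newly_registered(date(2015, 7, 1), seen))   # False
```

A heuristic like this is cheap and catches a lot of commodity activity, which is exactly why infrastructure with years of history defeats it.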

Starting point is 00:12:26 And then we found a way to find more of their infrastructure, even if we hadn't seen it used in an attack. And they all had this in common. Like, we had a way in which we could figure out and find some of the infrastructure from some mistakes that they had made. That's why in our post we actually were able to provide a lot of indicators. Like DHS included that in their list and everything. But other than that, each one of the domains we looked into, we just instantly knew at that point.

Starting point is 00:12:47 I mean, we already knew we were dealing with an advanced threat actor, but we're kind of thinking to ourselves, like, these guys have really stepped it up a notch. This was actually the third time we had dealt with them in an incident response engagement. But this was like a little bit different than the other two rounds. There's a number of things that just made it stand out, and that was definitely one of them. This might be the first a16z podcast network show to be optioned for a movie. I'm just going to say it right here on air. Joel, anything to add to that before I switch into the detailed step by step?

Starting point is 00:13:13 I mean, only if Matthew McConaughey plays me. I listen to him on the Calm app every other night or so. Yeah, no, I mean, I think that that's exactly it. Just the level of preparation and just the long game that these guys are playing. You know, the malware stuff is pretty common on the financial crimeware type side, right, people trying to steal money. But those actors typically register domain names within a day, and it's just all very fishy and suspicious. But to see someone build these really advanced, large, complicated infrastructures, years ahead of using it, it just belies a real level

Starting point is 00:13:45 of sophistication you don't really see every day. Okay, so just a recap for listeners of where we are and where we're going. We've covered what happened at a high level, including some of what's hype, what's real, and interesting or undercovered in the media. You did a great job summarizing, Steven. But now let's spiral into that a bit deeper and fill in some blanks that you haven't covered. Both technical details, like you mentioned the beacon, DNS, I want all of it. How folks figured things out. So we can then know what the open questions still are, ripple effects and implications, and then more on supply chain compromises and what we can all do.

Starting point is 00:14:18 But I especially want to know the anatomy of how they got access to the emails. But start from the very beginning of the timeline. Yeah, so the story of the SolarWinds supply chain compromise obviously starts with SolarWinds. And that's probably where some of the question marks are currently, and they might remain that way. They were breached sometime, at least as of late 2019, and then ultimately what came out later, in May of 2020, they pushed out an actual backdoored version of their software, a backdoor meaning a piece of software that shouldn't be there that allows this foreign adversary to have control or remote access into these systems. So we're talking late May that happened. From the cases we've been involved in and things that have been published publicly, we're seeing that a lot of the threat activity started in June and July.

Starting point is 00:15:02 The SolarWinds software would send out this DNS query. So when you want to go to a website, you want to go to a16z.com, you type that in. There's a system called DNS. It says, hey, where is this located? A DNS server says, oh, it's located over here. It's kind of the basis by which you can find things on the internet. So you're not memorizing these numeric IP addresses. So the malware, all it did, once it finally activated,

Starting point is 00:15:28 it waited between 10 and 14 days before it would start creating these DNS queries. It would do these DNS queries from the SolarWinds Orion server. And those DNS queries contained encoded data. And if you decoded that data, it gave you different information. One was information about the network that that machine is joined to. So for example, say, Microsoft, it might show microsoft.com or microsoft.internal or, you know, one of these government agencies, it might say treas.gov, but it would give this indicator

Starting point is 00:15:58 so the attackers could actually see who these victims were. Because remember, they're indiscriminately pushing out this software. They had access to tens of thousands of machines. That is an untenable thing to manage, to go and manually look at everything and try and actually install software and do something of significance.
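The idea of a DNS query carrying an encoded victim domain can be sketched from the analyst's side as follows. This is a simplified assumption, not the real scheme: the actual malware used its own custom encoding, whereas this sketch uses standard Base32, and the hostnames (`corp.example.com`, `c2.example.net`) and function names are invented for illustration.

```python
import base64

def encode_label(internal_domain: str) -> str:
    """Pack a domain name into one DNS-safe label (Base32, padding stripped)."""
    raw = base64.b32encode(internal_domain.encode("ascii")).decode("ascii")
    return raw.rstrip("=").lower()

def decode_label(label: str) -> str:
    """Reverse encode_label: restore the Base32 padding, then decode."""
    padded = label.upper() + "=" * (-len(label) % 8)
    return base64.b32decode(padded).decode("ascii")

def extract_victim_domain(query_name: str) -> str:
    """Recover the encoded domain from the leftmost label of a query name."""
    first_label = query_name.split(".", 1)[0]
    return decode_label(first_label)

# What a beacon query might look like under this (assumed) scheme.
query = encode_label("corp.example.com") + ".c2.example.net"
print(extract_victim_domain(query))  # corp.example.com
```

Because the query only has to reach whoever answers for the parent domain, the operator learns which organization each infected server belongs to without opening any direct connection to it.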

Starting point is 00:16:12 And their goal is to stay under the radar and not get caught. And now they have to decide who it is they want to go after further. So they probably have a shopping list that they started with, and they probably have a new shopping list of things, like they're walking through the grocery store

Starting point is 00:16:23 and didn't even know they wanted that. But now they know they do. And they essentially issued commands that allowed them to initiate this backdoor on who it was that they wanted to attack. And they did this through a specific DNS response called a CNAME value. So it says, hey, where is this host name? It responds back. They would actually send a specific response to prep it so that the malware would be waiting to know that next time something happens, it should take a specific option and open the backdoor.

Starting point is 00:16:47 They would respond with these domains. And these domains would basically be the control points where the attackers would then have hands on keyboard. A human is doing this at this point. Someone says, I'm ready to take a look at this system. Now, hackers behind this are actually involved. And they're saying, now I want to look around and figure out, is this a test machine? Is this a real network I'm interested in? Is it a lab environment?

Starting point is 00:17:07 Is this a staging environment? Things like that. And they can figure out, is this the real deal? Does this have access to what I want? Do I want to proceed? And they did this for we don't know how many organizations. And that's the real scary part in all this, is you have all these people that have come forward. And they're like big companies or there's these government

Starting point is 00:17:23 agencies, and that's just the ones we know about. I don't think anyone has a real notion of the size and scope of where they took a further interest and then actually did something. In our particular case, we got permission to write up and share details of our incident investigation. The attackers were very focused on getting access to email of specific individuals. So their goal was maintain access, move around, you know, get what they need. Having access to specific individuals and what they're writing, who's sending them email, who they're communicating with was a key focus of what they're doing. We're able to see that they did that. The interesting part, in kind of stepping away slightly from SolarWinds and why the intel community and law enforcement says it's likely tied to Russia, APT29 or the Dukes,

Starting point is 00:18:04 we've been tracking as a group we call Dark Halo, just because we've dealt with APT29 on many occasions in the past, but we just have no real way to link the two. But what was interesting to us is the story of this group didn't start with SolarWinds. We worked three separate incidents involving the SolarWinds attackers, what we called Dark Halo. So this is a story that starts well before and has multiple other avenues. We had actually dealt with them back in 2019. We had an organization we were doing work with, and we kicked the group out. They went away.

Starting point is 00:18:33 In our initial response, we had determined they'd been in that organization for four to five years prior. They came back in Q1 2020 through an Exchange Control Panel vulnerability, you know, mail service. They had a vulnerability that attackers would take advantage of, got back in, stole email for certain individuals. They were kicked out and removed again. That's what we did. And then they came back a third time with SolarWinds in July of 2020. Again, we didn't have a good way to prove it.

Starting point is 00:18:59 And we put steps and mitigations in place to deal with it. So to say, hey, how did they get into SolarWinds or where else are they operating? Well, this isn't their only trick. They have a lot of tricks up their sleeve. They've been able to do this and operate for quite some time. Wait, so how did you make that link across those separate incidents that it was the same group? I'll tell you. And something interesting is if we had worked them at three different organizations,

Starting point is 00:19:18 we actually wouldn't have come to the conclusion that this was a single threat group. We wouldn't have linked the three things. Any advanced attacker or anyone in the network, they have certain commands and things that they're going to do. But they changed enough between each of the attacks, the actual techniques, the tools, whether there's a custom malware or a commercial script or a public script like Nishang or a pen testing framework or these different toolings or a web shell. They changed it between each one of the attacks, where it was very non-obvious that it's the same group. But what they did is they went after the email of the same people each time. And why we are 100% certain it's the same group is when they would steal email,

Starting point is 00:19:53 they would only take a certain amount of email. They would specify, I want all the email since the last time I took it. Oh, so it's like incrementally building on the total. Oh, my God, that's so fascinating. Exactly. Keep going. Yes. So in early 2020, they got back in and they said, okay, well, I want all the email for these particular users since a specific date in 2019.

Starting point is 00:20:11 And then when they came back in through the SolarWinds vulnerability, they basically said, hey, I want every email for these people, and I only want it starting from this specific date range, starting in early 2020. So each time they came back, they asked for the email since the last time they did it. So in the one case, obviously, they had an intimate previous knowledge. The other cases we worked, they didn't have as much knowledge. They had to work their way and kind of figure out the lay of the land. So we're dealing with the same group in all three incidents.
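The linking logic described here, where each intrusion asks for mail "since the last time," can be sketched as a small timeline check. This is only an illustration of the reasoning under stated assumptions: the `Intrusion` record, the seven-day tolerance, and every date below are hypothetical and are not taken from the actual investigation.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class Intrusion:
    label: str
    mail_stolen_on: date         # when the email theft was observed
    mail_since: Optional[date]   # start-date filter the attacker requested

def requests_chain(intrusions: List[Intrusion], tolerance_days: int = 7) -> bool:
    """True if every later intrusion's start-date filter lines up with the
    previous theft, suggesting one actor resuming where it left off."""
    ordered = sorted(intrusions, key=lambda i: i.mail_stolen_on)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur.mail_since is None:
            return False
        gap = abs((cur.mail_since - prev.mail_stolen_on).days)
        if gap > tolerance_days:
            return False
    return True

timeline = [
    Intrusion("2019 intrusion", date(2019, 8, 15), None),
    Intrusion("Q1 2020, mail server flaw", date(2020, 2, 10), date(2019, 8, 15)),
    Intrusion("July 2020, via Orion", date(2020, 7, 20), date(2020, 2, 11)),
]
print(requests_chain(timeline))  # True
```

The tooling changed between intrusions, so the signal here is behavioral: only someone who knew exactly when the previous theft ended would ask for mail from that date forward.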

Starting point is 00:20:37 That's an interesting tidbit. I was about to say I still have goosebumps. That's incredible. That was so good, Steven. Pretty impressive analysis and work there. One of the things that really jumps out to me is this is something that is linked together over a four-plus year campaign, trying to maintain persistent access to the communications of high-value individuals. I think the other thing that really jumps out to me is that they have a big data problem.

Starting point is 00:21:03 They've got access to tens of thousands of computers and potentially thousands of organizations. It sounds like the kind of analysis that Steven has done is pretty unique. There aren't a whole lot of people in the world that can do that sort of thing. And so this is probably an incident that we'll be continuing to understand for the coming months, if not maybe years. There's probably going to be a really long tail on that. These people are still out there. They're still operating. What are they doing now?

Starting point is 00:21:27 That's particularly concerning. It's interesting because Martin Casado, you know, our general partner who's also a security expert, he mentioned to me that he thinks it's super interesting how interactive the attackers are during the attack because it's obviously a very sophisticated team of people gathering data. and making chess moves in real time. And it's so fascinating because when we report and talk about and communicate these types of attacks, we kind of make it seem like it's a malware that does all the work, but it's really the people that are at the center of it. And then on the other side of it, you have this whole interesting dance on your end as sort of this forensics expert with

Starting point is 00:22:03 your team going in and trying to figure it out and the puzzles and everything involved. Well, you know, I heard chess is popular now. Queen's Gambit, right? This is exactly like playing a game of chess. The difference is that you don't see the moves immediately. They get revealed over time. And then you're left kind of piecing other things together. That's exactly the analogy.

Starting point is 00:22:23 Yeah, I definitely agree that their goal was to actually not have their moves ever be understood. We noticed it in the versions of their software that were downloaded. There was an update to SolarWinds Orion. I believe it was in August of 2020. And that version wasn't backdoored anymore. Didn't have the malicious code. So we initially speculated, oh, did the bad guys remove it? Did SolarWinds find it? Did it inadvertently get removed?

Starting point is 00:22:44 We didn't know how it went down at the time. So they removed the code. They got in, got all this access and basically said, I'm going to try and remove this now and, like, fly under the radar. So if they had their way, they would have pulled off like the perfect caper, done all this stuff. No one would have known how it happened. And then the Orion product, basically,

Starting point is 00:22:59 it would have nothing malicious in it. So just kind of like an interesting other thing that they did. It is. It's a very vivid contrast to the analogy of chess, especially given the popularity of Queen's Gambit when you see them recording their moves and the spectators watching. It's a real contrast to this idea that you're literally making the move, peeling it back,

Starting point is 00:23:16 making the move, peeling it back. It's really stunning. Okay, so my next question before we talk about some things we can expect to see moving forward, what are some of the open questions still on the table? Like, we know SolarWinds was compromised, but the big open question there is, obviously, we don't know how. Then the second big thing in the Microsoft post that I saw, and Steven Sinofsky pointed this out, which is, you know, they do this outline, but we still don't know how the signed code was signed. So that whole idea of signing the code is a bit of a mystery still. I want to hear from you guys, what are your open questions or what are the open questions the industry is still looking at or that people should or shouldn't look at? Sure. Yeah. So how was SolarWinds

Starting point is 00:23:54 compromised? Obviously, one of the open questions. You could spend as much time and resources as you want, you could have infinite resources, and you may not ever be able to answer that question because that system's gone. It was wiped. All the logs are gone. It was never logged or it happened five years ago. So I would say the scariest part of this is people are finding out about this in December for something that was operationally live in May. They had a long head start into breaking into different organizations,

Starting point is 00:24:20 doing that shopping list. And there are going to be, and there have been from this very group, and as a result of the SolarWinds compromise, more supply chain breaches. Some people are breathing a sigh of relief. I didn't run, you know, SolarWinds Orion software. I'm safe.

Starting point is 00:24:36 That's not necessarily true. We're not trying to sow fear, uncertainty, and doubt that everything is untrusted, where arguably you need to go to a typewriter and some pigeons now. But it's IT companies. It's security companies. It's managed service providers. It's managed security service providers.

Starting point is 00:24:51 There's these different people that were running SolarWinds that then had this level of access to either directly get into networks, get into email, get into authentication systems, to provide software or software updates or software downloads. They 100% certainly had access to numerous networks and systems that would allow them to rinse and repeat SolarWinds, probably on numerous different scales in numerous different ways. It doesn't have to be through a build-time compile. It could be they changed a download. They changed an update process. They took keys or secrets or remote access protocols or passwords that got them into like other networks or other systems. So the

Starting point is 00:25:26 scary part is that the supply chain compromise here is just causing a chain reaction that's probably already impacting other organizations that have no idea. I think that's one of the biggest questions is who else was victimized that we don't know about and what did they do? So what you're basically describing is like this complex adaptive system, like everyone sort of networked and connected, trying to tease apart the scope and ripples of this is going to take ages. And we might never, ever get to the bottom of all of that because of that connectivity. It's interesting because General Paul Nakasone or Nakasone, I'm not quite sure how to pronounce it.

Starting point is 00:26:02 He heads both the NSA, the National Security Agency, and the military's U.S. Cyber Command. One of the things that they talked about is that developing a coherent unified picture, what you just described, Stephen, of the extent of the breaches has been difficult. The challenge is that, quote, he's expected to know how all the dots are connected, but he doesn't know how many dots there are or where they all are, which is kind of a distillation of what you just described. What are the other open questions that are on the table?

Starting point is 00:26:30 For me, the big open question, and with all of these really sophisticated breaches, the first is how many stupid things led up to this? Like how many ridiculously easy to solve problems like applying security patches or using two-factor authentication? Like how many of those kinds of things we know we should always do are responsible for this is always front of mind when we see this. Because I think when you double-click on these a lot of the times, it starts off in a fairly innocuous way, which is like someone guessed an account or someone got access to some account. But as this event shows you, if you give a sophisticated actor a toehold in your organization, they're just going to run through it.

Starting point is 00:27:04 So that's the first one. And then the second one is we think of these breaches because of just the way the media covers them and the fact that they kind of show up sporadically. We think of them as like events in time that have a start and finish. But in reality, these groups are still running and we're still chasing them. You don't know the implications of any of this stuff for a while. You don't know if they were getting into the Department of Energy to read Rick Perry's old emails or if they were getting in there to steal futuristic bomb designs. Maybe there's going to be some new weapon that pops up in 15 years and it's linked to this breach. And we've seen from these breaches, like if you go all the way back to some of the first ones that have been publicly reported, you know, we've often seen that the goal of these is either to spy on individuals and get some kind of intelligence there or to steal the designs for things that people want to go recreate. Right. And don't forget that oftentimes, I think we often forget to talk about when we talk about intelligence, it's often in the form of blackmail, right? Like we're not just talking about stealing IP and obvious secrets because a lot of people dismiss this as, oh, email, I just book events and share, like, photos with the family in my email. I don't think they realize that it's such a vector to all these ways of really exposing who you are. It's your identity in many ways. So that's another way to think about that too. Absolutely. Anything else on the open question side? So a bunch of other secondary breaches are now being reported on. Some of the Microsoft stuff, you saw that there were people

Starting point is 00:28:25 creating reseller accounts or trying to get reseller access to people's Office 365 enterprises. And then there were certificates that were compromised for things like Mimecast and maybe perhaps other services that are out there. And so like this picture starts to emerge that there's lots of fires that just started burning. And it's always really difficult to tell if it's one fire massing together or just a bunch of different people that are acting independently. That's actually something I wanted to really quickly touch on before we go into the rest of this. Because the thing that was confusing to me is, okay, so I read the Microsoft post. You know, like there's some intrusions, there's a partner for Microsoft actually that handles cloud access services, we don't know how connected or not connected it is. Then you have a reseller

Starting point is 00:29:06 gaining access to Microsoft customers' Azure accounts. Then you have this reported Russian state-sponsored effort exploiting a VMware flaw that the NSA warned about last month that takes advantage of a recently announced vulnerability in VMware Workspace ONE Access, Access Connector, Identity Manager, et cetera. And this is according to the NSA that they've had at least one case that they've successfully accessed protected systems by exploiting the flaw. And then you have like, you know, one after another and they issued a patch. I mean, I'm reading all these at the same time and I'm like, is it all the same thing or not? And I think that's what you're saying, Joel, about we don't know if it's all one fire or a bunch of fires.

Starting point is 00:29:44 And do you guys have any thoughts on how to connect those dots, if at all? So as a general statement, I would say what we know about this hacker that we call Dark Halo, the people behind the SolarWinds attack, they're extremely adept in methods that allow them to gain access to email or systems involved with email. So things like trying to get access to an Office 365 or Azure AD environment through a partner organization or by stealing some, you know, SAML tokens or some kind of authentication mechanism or trying to get access through, you know, some other, possibly through a vendor, to get access to that same data or to email data, essentially by any means necessary. I would say all of those are very on par with what we've seen this attacker do and focus on and what others have

Starting point is 00:30:27 seen, a very good chance that they are related. But even if they weren't, it just kind of underscores that there's a lot of people trying to get access to this data. And now you need to focus a lot more on the cloud, on the technologies that are used to secure the cloud or that have access into it. And the things and places where people don't always look because it's new to them or they never looked at it or they didn't know to look at it. So I think this event will actually end up advancing security in many ways because it's causing people to think about and do things that they weren't realizing before. And as you can see, the bar has been set higher to where they can't walk right in the front door anymore, right? They're not easily able to get right into these

Starting point is 00:31:05 organizations by compromising, you know, the core network or the system administrator and the other ways which you could get there. So in some ways, it's a sign that security has improved a lot, but also that there's a massive amount of work to do at the same time. It makes me again think of the chess analogy. And when you have a player that comes to the table that has a set of moves, like patterns that are well beyond what the human mind can even comprehend. And that makes me think a little bit of even like AlphaGo playing Go with the real Go player in Korea and how the system made moves that they considered very alien, that a human being would never have done, but that still follow the rules of the game, the constraints of the game that is, and yet were

Starting point is 00:31:43 completely novel. And you just keep seeing more and more moves kind of grow and become more and more sophisticated on both sides, even as we may improve, like there are going to be alien moves at some point. But to be completely honest, they're undoubtedly highly skilled and disciplined, which if you think about it, okay, if we go back to the chess analogy, you know, are they a master? Are they a grandmaster? In some ways, you're going to say, okay, they're a grandmaster, but most of their opponents are unranked. So they have this, like, kind of lower skill and their strategy is easier. But then they've been able to go to these people where maybe their security and their defenses are much higher ranked. And they're using that skill set, that knowledge and that kind of cat and mouse to still get into those organizations. But to have to do that, that shows that people have leveled up quite a bit, which is a good thing for these companies, the security industry. But at the end of the day, they still manage to either capture that king or get them to knock it down. I guess no one's really thrown in the towel. No one has surrendered that I've seen so far. But I would say they're winning a lot of matches and they're playing a lot of them simultaneously. Right. But they're not, to be clear, to your point, an alien player,

Starting point is 00:32:45 like an AlphaGo. These are still moves that are human, just very skilled. At least from what we've seen, but who knows, like, what we're missing, though, right? So. Right. Okay, so now the big picture questions. We've covered what happened, how it happened, the details. We talked about this, you know, phenomenon of supply chain attacks, chain of chains, what it means. I would love to hear what you think about this when you think about the broader trends at play. Yeah, absolutely.

Starting point is 00:33:10 I think on the podcast several times, I know I sound a bit like a broken record, but we've talked about the biggest challenge being securing the supply chain and how all these businesses that are becoming software businesses are actually becoming reliant on other people's software. And so it's not just a matter of the stuff that you write to run your company. It's also the matter of the stuff that your suppliers are writing. And as everyone knows, security is really difficult and it's hard to secure your own things. And then having to worry about the security of your suppliers is adding an additional layer of complexity. And so over the last couple of years, there's been a lot of investment in trying to understand third party risk management, vendor risk management, how to glue these things together. There are several different approaches, everything from private systems, that will look for vulnerabilities and report on the risk. There are publicly available standards. Different trade groups are trying to develop their own standards for security,

Starting point is 00:34:01 and then certain vendors are trying to come up with their own standards. There is no easy answer, and so what you've got is a lot of different approaches that are being tried and a lot of experimentation that's taking place. This is probably the first breach with such size, scale, and scope. So this is kind of the watershed moment for that third-party risk management, and there's any number of other suppliers that are out there that are in very similar positions, right?

Starting point is 00:34:23 And it could be a company like SolarWinds or it could be an open source repository that a bunch of people are building into their applications. There are any number of different ways. The thing that's really difficult for me, based on where I sit and what I see, is if you play through all the different potential solutions that are out there,

Starting point is 00:34:41 it's really hard to know which one of them would have actually prevented this. So like, if I went to any of SolarWinds' customers and said, hey, what's your vendor risk review report on SolarWinds? You know, before the breach, I'm sure they would have said it was a wonderful company. It was doing everything. They passed our review. They answered our questionnaire.

Starting point is 00:34:57 You know, they've got the people hired. They have a program. And so it really comes down to how do you actually measure these things and how do you measure the risk in that third party and how do you effectively mitigate against it? The third-party risk or the vendor risk management or how someone evaluates this, it can only go so far, right? Like how would you evaluate SolarWinds and the Orion product any differently than you would Microsoft Windows and Defender and how it updates and things like that, right?

Starting point is 00:35:20 So there's limitations to what you can do. I mean, you can audit them or find out their code review process and all that stuff. And they could have passed it all with flying colors. Or does your checklist say, are you looking for advanced adversaries, you know, injecting themselves into your build process at the highest levels of sophistication and espionage? But even if they check yes to that, which they might not, they probably don't have an effective way and mechanism to do that. One of the things that Alex Stamos, people tend to overquote him,

Starting point is 00:35:46 but he did have a good tweet about this, which is, quote, there was no good reason for most enterprise software products to talk to random internet hosts all day. It might be time to move on to an outbound network permission model for Windows servers. So connections only allowed to domains in signed manifest plus intranet as defined in GPO. Is that the right thing to do? Should people be air gapping? Like, what should people be doing? We deal with sophisticated breaches all the time. And this can even apply for like firmware and other stuff. But that is a recommendation that Volexity has been giving organizations for years and years and years. It's often in an incident that we say, hey, your domain controller,

Starting point is 00:36:23 for example, doesn't need to be able to talk to the internet. There's obviously exceptions to the rules and everything, but usually those can be defined, especially with next generation firewalls or modern firewalls. You can define what is actually needed and allow them to do those things and not allow them to do anything that's not explicitly required. And that's a model that is the least privilege, just like the least access type model. That's a little bit harder depending on your organization to enforce for users and workstations where you need to browse the web and do all this stuff. And that's what content filters and certain restrictions are for, unless you're in like a DOD environment or something where it's a lot more locked down,

Starting point is 00:36:55 but that's usually accepted in a lot of commercial organizations. And the servers are where an attacker, if they're going to install malware and do things, usually goes, because that's where the supply chain gets to; that's one of the big areas. Those are the machines that are not at home or requiring a VPN; they're always on. They don't get rebooted frequently. That's where malware gets installed a lot, because it's something that they can count on and it's regular. Being able to prevent that and limit what those can do, that model, if that had been put in place for organizations with SolarWinds, in this specific instance, it would have mitigated that threat.
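
The default-deny outbound model described here can be sketched in a few lines. This is only an illustration under stated assumptions: the allowlisted domains and the function name are hypothetical, and a real deployment would enforce the rule in a firewall or proxy rather than in application code.

```python
def is_outbound_allowed(hostname: str, allowed_domains: set[str]) -> bool:
    """Default-deny check: permit a destination only if it is an
    allowlisted domain or a subdomain of one. Everything else is blocked."""
    host = hostname.lower().rstrip(".")
    for domain in allowed_domains:
        d = domain.lower().rstrip(".")
        # Require an exact match or a match on a label boundary, so that
        # "evilupdates.example-vendor.com.attacker.net" does not slip through.
        if host == d or host.endswith("." + d):
            return True
    return False


# Hypothetical allowlist for a server that only needs vendor updates and time sync.
ALLOWED = {"updates.example-vendor.com", "ntp.example.org"}
```

With this allowlist, `is_outbound_allowed("cdn.updates.example-vendor.com", ALLOWED)` is permitted, while any host not explicitly listed, including an attacker's command and control domain, is denied.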

Starting point is 00:37:26 Now, if I started thinking outside the box and this attacker used DNS, well, what if they had done command and control activity and issued commands and had done that all over DNS? So the SolarWinds server talks to its local DNS server. Local DNS server goes out to the internet. If they had modified this malware and actually did all the command and control over DNS, instead of doing it over this connection. That paradigm and that shift would have been a lot more difficult to mitigate.

Starting point is 00:37:50 But that's the type of issue and security item we need to think about. You could proactively try to address that or just say, hey, that's a lower likelihood and I'll address it if that happens. But by and large, it's a best practice with regards to minimal access, specifically for servers connecting to the Internet and different resources. It's funny talking about this because it's like the history of the security industry is the history of unreasonable requests. I know that a lot of people are jumping up and down talking about.

Starting point is 00:38:14 like don't let production talk directly to the internet. And if you worked at a bank, you know, for the last 20 years, that's been the case, right? Like highly regulated industries and people that have invested heavily in security have always focused on doing these rather idiosyncratic things that don't make a lot of sense, but made a lot of sense to people who've either come from an incident response or a deep security background. You know, back in the 90s, I remember being involved in strenuous debates about why you need to encrypt traffic moving within your data center. Everyone thought it was the most asinine thing because it's a private link. You've got MPLS. No one's going to listen to you. And then Snowden releases documents.

Starting point is 00:38:51 And it became really obvious why you want to encrypt your data within your data center. So this is just another example where people have been giving best practice advice saying, hey, you need to make sure that random servers, random production systems can't just talk arbitrarily to the internet. And the response to that has generally been, well, that's an unreasonable request. That takes a lot of work. I don't know that we necessarily want to do it. And there was never a particularly great reason or piece of evidence to point to, to say, well, this is why. So this is why you want to limit that access. And there's probably a list of other things that are equally unreasonable requests that security people would ask you to do. And eventually

Starting point is 00:39:24 they're going to have their "this is why" moment. But something that Joel mentioned earlier, which I think is really important, is a lot of organizations aren't doing blocking and tackling. They don't have two-factor authentication on the remote access to their network. They're using weak passwords. They're not patching. They don't know where their assets even are. And their build process is not secure. They don't even do code auditing or check in their code. I mean, there's a lot of low-hanging fruit for most organizations. They haven't even been able to kind of get in some of the basics. But I think a big problem that a lot of organizations, whether that's a government, commercial organization, or really anyone, whether a small company or these massive companies with

Starting point is 00:40:00 huge budgets, a problem that they're facing is if you had certain security data, you could immediately and very easily answer, did I have a problem? One, did I run that vulnerable software? Because maybe I was patching. I don't know. Maybe I never ran it and I skipped a version. If you had all your DNS queries logged and the responses, you would say, did I get a CNAME? Did I even call out to that command and control activity? There's certain logs from the endpoints, if SolarWinds was instrumented, this event log data. If you had been capturing that data, you could answer that question. Most companies do capture that data, don't they? It depends. If you went into SMBs and mid-sized businesses, even some large businesses, I would say a lot of them aren't

Starting point is 00:40:38 actually logging or keeping DNS logs. And if they are keeping DNS data, it may not be query and response. And event logs, the vast majority of organizations don't have a centralized and long-running retention policy for event logs. But even if they do, their data retention of how long they were keeping this data did not go back far enough. They actually had data. They had data back 30 days.
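
The kind of question that query-and-response DNS logging lets you answer can be sketched as follows. This assumes a hypothetical CSV log with `timestamp`, `client`, `query`, `rtype`, and `answer` columns and a placeholder indicator domain; real DNS log formats and real indicators will differ.

```python
import csv
import io


def under_domain(name: str, domain: str) -> bool:
    """True if name equals the domain or is a subdomain of it."""
    n, d = name.lower().rstrip("."), domain.lower().rstrip(".")
    return n == d or n.endswith("." + d)


def find_indicator_hits(log_text: str, indicator_domains: list[str]) -> list[dict]:
    """Return log rows whose query OR answer (e.g. a CNAME target)
    falls under any indicator domain."""
    hits = []
    for row in csv.DictReader(io.StringIO(log_text)):
        names = (row["query"], row["answer"])
        if any(under_domain(n, d) for n in names for d in indicator_domains):
            hits.append(row)
    return hits


# Hypothetical sample log; "bad-c2.example" is a placeholder, not a real indicator.
SAMPLE_LOG = """timestamp,client,query,rtype,answer
2020-06-02T10:00:00,10.0.0.5,updates.example-vendor.com,A,203.0.113.10
2020-06-02T10:05:00,10.0.0.5,abc123.bad-c2.example,CNAME,stage2.attacker.example
2020-06-02T10:06:00,10.0.0.9,www.example.org,CNAME,node7.bad-c2.example
"""
```

Checking both columns matters: logging queries without responses would miss the third row, where the indicator only appears in the CNAME answer.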

Starting point is 00:41:00 They had data back 60 days, 90 days. So they're finding out in December about a breach and set of activity that happened and potentially initiated in May. And, oh, I kept all this great data, but I can only go back three months. And three months from December is September. And for a breach that happened in June and July, that's, in some respects, useless. That's a scary place to be in to not know if you were compromised or if you were when it started or what happened or where did they go, how did they pivot?
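
The retention gap described here is simple date arithmetic. A minimal sketch, assuming illustrative dates (a mid-December 2020 discovery and mid-May 2020 start of activity; the exact days are assumptions, not figures from the episode):

```python
from datetime import date, timedelta


def retention_covers(discovered: date, activity_start: date, retention_days: int) -> bool:
    """True if logs kept for retention_days still reach back to activity_start
    on the day the breach is discovered."""
    oldest_log_kept = discovered - timedelta(days=retention_days)
    return oldest_log_kept <= activity_start


def days_of_retention_needed(discovered: date, activity_start: date) -> int:
    """Minimum retention, in days, needed to cover the whole window."""
    return (discovered - activity_start).days


# Illustrative dates only.
DISCOVERED = date(2020, 12, 15)
ACTIVITY_START = date(2020, 5, 15)
```

Under these assumed dates, 30, 60, or 90 days of retention all fall short, which is the situation described: the right data was collected, but not kept long enough.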

Starting point is 00:41:25 It's a missed opportunity and probably a bit scary for these companies that I was collecting all the right data, but I didn't have it for long enough. So I don't actually know. Wow. We're helping a lot of companies right now to see what resources they have. We specialize in memory forensics, acquiring memory from their SolarWinds server, acquiring disk artifacts or full disk images, any log sources. And we have some stuff that we can potentially go in and say, doesn't look like it or definitely,

Starting point is 00:41:49 yes, you were. You know, we see these items that clearly indicate that you got a second-stage breach and you need to expand this out. But we can't give anyone with limited data a confirmed clean bill of health. It's a little bit like going to the doctor and having like maybe a continuous glucose monitor for the last year, but you only have the data for the last three weeks stored. And it's sort of like, okay, here's where it's happening. I'm getting sick, but I only have the three weeks. It's just like a really tough thing to figure out.

Starting point is 00:42:16 I want to break this down by advice for big companies, like large enterprises, advice for small and medium-sized businesses, and advice for consumers. So let's start with the big companies because the best threat actors. They understand the reality of modern enterprise IT. What are pieces of advice or mindsets even that you have to offer

Starting point is 00:42:34 for how chief security officers, CEOs, leaders should be thinking about the implications of this for their business. I mean, I've spent a lot of my career in big companies, and I think the thing to do right now is to think about strategy. Like, the tactics are great, and there's going to be a lot of people chasing a lot of actions over the next days, weeks, months. But I think the strategic view of how an organization wants to think about security, as we start to understand what happened and how it happened, we'll consistently see in some organizations that security either wasn't funded, it wasn't empowered, it didn't have a remit to act. It may have been under assault. People often view security as being a cost center, as something that contributes to the lack of performance in a

Starting point is 00:43:15 business, and that is an attitude that is still quite popular. So I would say that it's really going to be about figuring out strategically where does security sit, what's the right amount to spend on it, how do you effectively empower it, and then how do you partner and build security into your business so that it's something that helps enable it versus something that holds it back. Yeah. Generally, no one really thinks like security is not important. I don't think we ever hear that. Now, actions may speak louder than words sometimes, but I think a lot of people think about, oh, it's an afterthought, I'm going to add it later or, yeah, yeah, we'll do that one day. And I think like our main advice to a lot of these different organizations, whether it's a

Starting point is 00:43:51 startup or a mid-sized company or a company that's growing really rapidly, is not necessarily that they need to come out of the gate and have to have every imaginable security product. They need to be auditing all their source code on day one. They need to have everything locked down and the latest firewalls and this filter and all these EDR products. But it's like, think about that stuff. Are you doing the two-factor? Are you lazy? Like, ah, I don't need to put, you know, two-factor on my Salesforce account where all my most sensitive contacts and information in my organization are, or, yeah, I don't really need to put it on email. It's like, it's easier if everyone can just log straight in, or I'm just going to share this root, you know, Amazon key to get

Starting point is 00:44:26 into AWS, because that's just how our organization's growing and we're not formal. There's things that people can do, best practices, actions that organizations can take. See what you can do now, see what you can do along the way, and put that on your radar, so you're not in a position where you're starting from scratch or trying to investigate a breach or figure out if you even had a breach. We all knew what we should have done, and we knew that two years ago, and we run into that a lot. Don't wait till later. And now advice for consumers, like just day-to-day people, like,

Starting point is 00:44:56 family members, et cetera. What would your advice be for how to think about things like this? We wrote a really excellent blog post last year called 16 things you can do to protect yourself, and I would strongly recommend that people do all of those 16 things. It's all really basic stuff, and it starts with two-factor authentication, patching your systems, and goes all the way down to how you want to think about securing your potential social media accounts, etc. So yeah, we issued some guidance in several sections: prevention and detection, and then remediation, if you have an actual threat or concern. From the prevention side, prevent unnecessary access from your servers, like your SolarWinds

Starting point is 00:45:31 server or other devices, from talking to the internet. That's a prevention mechanism. You know, monitor your assets, kind of see where they're logging in from. If you have that kind of centralized logging or like a SIEM, you know, same thing. Make sure you're capturing, either from event logging or your endpoint security products, that actual commands being run on the system are being logged, because that can be pivotal and be critical to detection. But even if you're not actively monitoring it, you can go back and say,

Starting point is 00:45:53 hey, what commands were run on this server that are not consistent with what our system admin or the typical activity would do? But take a look at your mail server, look at where your email is going, because that's where the attackers, I believe they're way ahead of the game with regards to the things that they can do

Starting point is 00:46:07 in Office 365 and Azure AD, where they are so familiar with the administrative commands and what to do from a sysadmin aspect, they're able to do a bunch of things and hide in ways that people have never even thought about and encountered. And it's not necessarily like they're ghosts or it can't be found.

Starting point is 00:46:22 People just don't know to even look for it. And then just from the general remediation perspective, once a device is backdoored or compromised, it's an untrusted system now. Don't just like roll back to an earlier version or say, I'm just going to upgrade to the new version. We say, hey, blow that whole system away. Start with a fresh, clean install. And if you're putting SolarWinds Orion back on it, download the newest version that's not backdoored and start everything from scratch. Anything used on that server, if your SolarWinds setup for Orion had credentials, change all those passwords and make sure those passwords aren't similar to like old passwords that you've used. The other thing, too, is kind of any sensitive API key integration and things,

Starting point is 00:46:58 like we saw a two-factor bypass to get into email by this threat actor because they had taken a secret key and were able to generate cookies and skip into the email system while not actually being challenged for two-factor. You've got to think about the stuff that someone could steal if they're in your network related to this, but also that advice extends well beyond this threat actor and SolarWinds specifically. That's great. I'll include links to Volexity's blog post as well as the 16 things that you can do to secure yourself in the show notes. Bottom line it for me. What's your takeaway?

Starting point is 00:47:29 It's consistent with what we've been saying for a while now. The hardest problem to solve is third-party risk, and this is probably the most significant third-party breach that we've seen in history. And so I think it's going to take us months to really understand what happened and probably years to fix it. Thank you so much, you guys, for joining this episode of 16 minutes, which is a 3x 16 minutes. Definitely, thanks for having me. Yeah, thank you so much. And Stephen, it seems that we're always catching up when the world is burning down.
