Risky Business - Risky Biz Soap Box: How to measure vulnerability reachability

Starting point is 00:00:00 Hey everyone and welcome to this soapbox edition of the Risky Business Podcast. My name's Patrick Gray. For those of you who don't know, these soapbox editions of the show are wholly sponsored and that means everyone you hear in one of these editions of the show paid to be here. The idea is, it's like a keynote interview, right? So someone from one of the sponsors gets to come along, talk to us about what they're doing, talk to us about how they see their problem space, so on and so forth. And joining me today is someone you would have heard on the show many times before.

Starting point is 00:00:33 It's Feross Aboukhadijeh, who is the founder and CEO of Socket. Now, Socket is a software supply chain security company, which really started out making a product slash platform that would help software developers discover when some of their dependencies or packages that they were relying on were malicious. But Socket has also grown these days. I guess like what Socket does is not just that anymore. So joining me to have a chat about a few things now is Feross Aboukhadijeh, and a big feature of this interview, which I think is going to be really interesting to people not just in software development, actually, is we're going to be talking about reachability analysis,

Starting point is 00:01:15 which is the idea that there might be a bug in a package, but, you know, is it reachable and, you know, how do you go about testing that? So that's going to be a big part of this conversation. But Feross, let's kick it off with a bit of an update. You know, you're not just doing the malicious package tracing stuff anymore, are you? No, that's right. And Pat, it's great to be here. I think we originally launched Socket on a Snake Oilers back in April in 2023.

Starting point is 00:01:41 So we've been doing this for a little while. Yeah. Yeah. And so, yeah, the pitch has definitely changed since the early days. When we initially launched, we were very focused on solving this problem of, you know, how do we help companies safely use open source software? We looked around and we saw all these supply chain attacks happening in open source, and we were frankly, like a bit mystified as like, why is no one taking this problem seriously? All the vulnerability scanner vendors, you know, the SCA tools were just, you know, purely focused on CVEs to the exclusion of, you know, what we felt as were, you know, a bit bigger risk than vulnerability, right?

Starting point is 00:02:19 Like a supply chain attack is something that's going to compromise you, you know, 100% of the time, whereas with a vulnerability, you know, you may have some time to fix that. You might have some, you know, days, weeks, months before, you know, somebody finds that. And so we were just like surprised no one is really tackling this problem and it's going to be a growing problem, we believed. And so that's kind of where we started as a company, building the first kind of ecosystem-wide scanner, going out and really analyzing every open source package that exists, looking for signs of supply chain attacks. And so we built, you know, what I think is a best in class tool in that space. And we've had a lot of success, you know, folks bringing it in, using it, and, you know, we're now finding about a thousand supply chain

Starting point is 00:03:03 attacks per week in the open source communities and ecosystems that we scan. We cover all the top ones. So that's kind of where we started. And then now, you know, we had customers, we're fortunate to actually work with like some of the biggest companies in the world now. We've had success getting, you know, this into more and more hands. And what they told us is like, hey, why aren't you guys also, you know, handling the CVE problem? Like, it seems like it's a related problem. Like, just tell me if this open source dependency is safe. You know, help me understand the full risk profile of it, not just, you know, is it malicious? And so then, you know, we kind of, we listened to our customers. And so we, you know, we were like,

Starting point is 00:03:39 it seems like a reasonable request. So let's, let's kind of take a look at the vulnerability space. And then, you know, that's kind of what led us now to doing, you know, we didn't want to just do it like everyone else. And, you know, vulnerability scanning is a commodity, right? Like, you know, there's open source tools. There's like a million vendors that can, you know, tell you if you have CVEs in your dependencies. So we wanted to take, like, a Socket-like approach and kind of approach it with fresh eyes and really try to kind of do something different. Yeah. So when we first met and first started talking about, you know, your product, I always thought that a big risk for you was going to be that some of these

Starting point is 00:04:11 software composition analysis companies like Snyk and WhiteSource and whatever, that they were going to move into this space, right? And that would be a bit of a challenge for you. What I didn't necessarily expect was for you to actually go and sort of enter their space and start challenging them there. I mean, I'm guessing that's where you're moving, right? You're trying to become like a bit of a challenger to those companies now. That's right. Yeah.

Starting point is 00:04:35 And there's actually been many instances already where we've had customers switching off of those tools for various numbers of, you know, reasons. I mean, is it just the case that like with any sort of software in Infosec? I mean, I see this pattern over and over because I've been doing this for 25 years is new vendor comes along, builds a cool tool. It's amazing. Everybody uses it for like five to 10 years. Then it sort of gets a bit stale.

Starting point is 00:04:56 then the new one comes out and everyone sort of switches to that. Is that kind of what you think is happening here? Or is it like what you want to happen here, I guess? I mean, there's a bit of that. I think people do want to use the hot new thing. I do think, though, that the existing vendors have stumbled in a couple of key ways. I think one is, you know, they are inundating folks with too many alerts. And the people feel that these tools are failing them in a key way.

Starting point is 00:05:22 Like, if everything is an, you know, urgent priority, then nothing is a priority. And so I think if you look at those, some of the folks you mentioned, like they've been really slow to adopt things like reachability analysis to try to cut down on the noise and help you understand, you know, which of the vulnerabilities are actually real. And I think on the supply chain attack side, they're relying on literally the CVE system to tell you if a package is malicious, which is just not what that, like not what the NVD is for. It's only tracking vulnerabilities. And so a lot of folks have realized they have a gap in that area. And they just have. But they are, they are catching up to you on that, right? I mean, not really.

Starting point is 00:05:56 So they do have like a, you know, they do have some research teams that are kind of sporadically looking for things here and there, but it's not systematic. Like, Socket is literally looking at every single open source package from the moment it's published. Like we get, you know, we're crawling all these ecosystems. We get like, you know, updates when a new one is published. Well, I mean, I'm regularly seeing Socket, you know, cross my desk just through, you know, Catalin's news bulletins and whatever. It seems that like, you know, you are absolutely the number one source for that sort of information these days. I think part of that is that we started as a company in the LLM era, and one of the things about doing something at this massive scale is that it helps to have, like, really cheap, you know, I think of them as, I think we talked about this before. I call them like, you know, college undergrads or interns. Yeah, yeah, yeah.

Starting point is 00:06:42 an infinitely scalable set of these interns and you can just tell them to go and look at all the code and they're going to get it wrong a lot. They do all the time. They'll come back and say, oh, this is malicious and then you look at it and it isn't. But they're still pulling, you know, a lot of signal out of, you know, out of the noise. And then we just add humans in the loop. And the combination of AI plus humans is actually really powerful in this, in this space. So you got like a like an agentic sweatshop, basically. Pretty much. Yeah. Working with you. So, you know, some nice cheap labor.

Starting point is 00:07:11 So look, you just mentioned it, and this is obviously a big part of the conversation today, is this idea of reachability analysis, which, look, again, you know, I've been in this discipline for a very long time, and I can understand why this is, like, actually pretty game-changing from a vuln management perspective. But let's just start by talking about, like, defining what is reachability analysis for us? Please define it for the audience. Yeah. So the main problem with vuln scanners today is they give you too much noise, not enough signal. And what that's really caused by is when they identify a CVE in a dependency, what they're telling you is that yes, you have a component somewhere in your application that a CVE, you know, has been issued on. That doesn't mean that your application is actually vulnerable to that vulnerability, you know, that it's exploitable or that, you know, it's truly what we call reachable, meaning there's a way for an attacker from the outside to, you know, hit your application and then, you know, find a path through the code to actually trigger that vulnerable function in the application.

Starting point is 00:08:17 So the component has a CVE issued against it, but your application is actually not vulnerable. So that's the problem. That's the problem with vuln scanners today. They don't do a very good job distinguishing that. Well, I mean, that's right. There could be some library that somebody's included, but you're not using the vulnerable function and there's no way to trigger it. There's no way to get some input to it, right?
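The definition above can be pictured as a graph search: a CVE counts as reachable only if some path of calls leads from an application entry point to the vulnerable function. The following is a minimal sketch under that assumption. The call graph, the function names (`app.handle_request`, `yamllib.unsafe_load` and so on) and the helper `is_reachable` are all hypothetical and are not taken from Socket's product; real tools also have to build the call graph in the first place, which is the hard part discussed later in the interview.

```python
from collections import deque

def is_reachable(call_graph, entry_points, vulnerable_function):
    """Breadth-first search from the entry points over a static call graph.

    call_graph maps a function name to the functions it may call.
    Returns True if any chain of calls reaches vulnerable_function.
    """
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        if fn == vulnerable_function:
            return True
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False

# Hypothetical application: it only ever calls the safe loader.
call_graph = {
    "app.handle_request": ["yamllib.safe_load"],
    "yamllib.safe_load": ["yamllib._parse"],
    "yamllib.unsafe_load": ["yamllib._construct_object"],  # never called by the app
}
entry_points = ["app.handle_request"]

used = is_reachable(call_graph, entry_points, "yamllib._parse")
unused = is_reachable(call_graph, entry_points, "yamllib._construct_object")
print(used, unused)
```

In this toy graph a scanner that only matches package versions would flag a CVE in `yamllib._construct_object`, while the search shows no call path to it from the application.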

Starting point is 00:08:35 So you are going to get so many alerts telling you to fix bugs that just don't affect you, right? But, but so here's the thing. Okay, reachability analysis sounds great if there's a way to figure out like, well, you know, are these bugs reachable in our software packages? But that is a very hard problem. There's a reason nobody solved it. Yeah, no, I 100% agree. It's actually an area of computer science that's like in active, you know, development. And really what it comes down to is there's this fundamental kind of like rule about analyzing source code that's called the halting problem. There's no way, you know, when you're looking at a piece of source code,

Starting point is 00:09:10 to really determine what it's going to do at runtime without actually running the code. And so what you have to do when you're doing static analysis is you have to make some assumptions. You have to use some heuristics. And that's where it's really challenging to pick the right heuristics to get good results from this analysis.

Starting point is 00:09:27 You often find, like, when folks do this, it's hard to actually roll out the reachability analysis from the legacy vendors, the other vendors that have tried this, because the analysis goes down these kind of false paths in the code and spends all the CPU cycles kind of going down this path that doesn't really matter. And then you end up running out of time and you have to kind of give up. So it'll never terminate. The analysis will never terminate. And then they try to solve this by kind of putting in these hacks like, oh,

Starting point is 00:09:58 don't actually scan that folder because it's a degenerate case and the scanner just gets confused and goes and goes you know wild in that direction so they'll kind of ignore things here so it's really like a not a very rigorous process to get these things rolled out. And then you also have to go like, you know, repo by repo within your company and kind of add this extra CI step and do all this tweaking and customizing to get this like analysis to run. And, and it's very expensive. This analysis takes like hundreds of gigs of RAM in some cases just because of just the the effort involved to actually do this analysis. It's a very hard problem. And, you know, there's a reason why like people either don't do it or they say they do it, but when you actually

Starting point is 00:10:35 dig a little bit deeper, you learn there's a bunch of asterisks on that, like they're not scanning the transitive dependencies, right? It doesn't actually work on a monorepo in a real world code base, you know, like it doesn't work very well in dynamic languages because they're making all these assumptions, you know, that they've baked in that work for Java, but not for JavaScript or Python or Ruby or any of the other dynamic languages that are really popular with developers. So there are a lot of caveats with the kind of existing solutions. Yeah, yeah. So what's your approach here? I believe that this involved, there was some very small startup that was doing some promising

Starting point is 00:11:10 work here and you've like sort of acquired them, right? Like you've done some sort of stock swap thing and brought them into the fold. So why don't you tell us about like that process and like what this person or these people were actually doing that made it so compelling for you? Yeah. So like I said, when we started looking at CVEs and we were hearing from our customers, like they wanted us to be the kind of all in one solution for supply chain security, tell us if our dependencies are safe, help us, you know, identify problems.

Starting point is 00:11:35 We started developing reachability analysis ourselves internally, and we used an approach called module-based reachability, which is where you kind of take a look at a high level of like, are you including entire files or entire classes in Java applications? And that's how you can kind of determine whether or not like a particular class is going to be loaded into the application. And that works pretty well for static languages. It gets you about a 30% noise reduction, and it's relatively easy to build. And so that's what we see a lot of, like, the other vendors kind of doing as they're kind of, you know. Yeah, so it says, this may be reachable. Yeah. Exactly.
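The module-based approach described here can be sketched as a check on what gets loaded rather than what gets called. This is a rough illustration only: the import graph, the module names and the helper functions are hypothetical, and the "may be reachable" wording mirrors the exchange above rather than any tool's actual output.

```python
def loaded_modules(import_graph, root):
    """Collect every module transitively imported starting from root."""
    loaded = set()
    stack = [root]
    while stack:
        module = stack.pop()
        if module in loaded:
            continue
        loaded.add(module)
        stack.extend(import_graph.get(module, ()))
    return loaded

def module_level_verdict(import_graph, root, vulnerable_module):
    """Coarse verdict: is the file or class holding the bug loaded at all?"""
    if vulnerable_module in loaded_modules(import_graph, root):
        # Loaded, but we have not checked whether the buggy function is called.
        return "may be reachable"
    return "not loaded"

# Hypothetical import graph for one application.
import_graph = {
    "app.main": ["jsonlib", "imagelib.resize"],
    "imagelib.resize": ["imagelib.codecs"],
    "jsonlib": [],
}

print(module_level_verdict(import_graph, "app.main", "imagelib.codecs"))
print(module_level_verdict(import_graph, "app.main", "imagelib.legacy"))
```

The check is cheap because it never looks inside function bodies, which is also why it can only rule out CVEs in modules that are never loaded.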

Starting point is 00:12:15 And so when we were like, okay, we want to do better than this. Obviously, this is not like, this is good to kind of put our, you know, just to kind of put a line in the sand and say that we can kind of do reachability. But it didn't feel like a truly socket level quality product that we wanted to put our name on. And so we started looking at, okay, how can we do this better? And it turns out, like I said, this is a really difficult problem. And it was going to take our team probably six, maybe 12 months to kind of like get up to speed on the latest research to maybe bring on board a couple of PhDs that had the relevant experience.

Starting point is 00:12:49 And so we looked around and found the Coana team, which if you don't know them, they are just a brilliant, brilliant set of engineers. They're out of Denmark. They're actually from a town called Aarhus in Denmark, and the company was started by these four folks. One is a professor who's been studying, doing research on reachability, specifically in JavaScript, for over 20 years, and three of his PhD students. And then, you know, it's kind of a small team, about eight folks.

Starting point is 00:13:19 And at the time we did this acquisition, by the way, Socket ourselves, we were only 30 people, so I never would have thought, like, a 30-person startup would be acquiring an eight-person startup. Like, that just wasn't on the table for me. You know, is it an acqui-hire? Is it a merger? Is it an acquisition? Well, kind of, yes, I guess, to all of those sort of. It's unusual to see this, right? Like I, you know, it's not something you see very often, which is a small startup merging with another startup or sort of acquiring another startup. Yeah. I mean, the thing in this case was it was such a match made in heaven. I mean, the Coana culture, they're super technical. Almost everyone there is, you know, an engineer. And they had built the best reachability analysis solution that we'd seen. It was better because it was designed from the start for dynamic languages like JavaScript and Python and Ruby. And then it was kind of transferred from there to the static languages. So it works really well on the hard to analyze languages, which is

Starting point is 00:14:15 unique. They also didn't really build anything else. So they just did reachability. And we did basically everything else, but not reachability. And so it was this perfect match. It's like puzzle pieces, right? Worth worth more than the sum of the parts, right? When you put those two things together. So tell me these Danish mega nerds who you managed to discover, right, who had built this amazing reachability thing. And that's not meant as a pejorative, by the way, to these Danish mega nerds who might be listening. It's very impressive what you've done. Why don't you tell us what their approach is to solving this problem? Because as you said, you've outlined a couple of kind of dead ends or things that get you like 30% of the way there. You know, what's very different

Starting point is 00:14:53 about their approach? So the first thing is that they analyze the full dependency tree. So that's something you actually don't see in all, you know, it's not a given, actually, that you're going to get that from some of the other solutions. And that's really important for a few reasons. The first is that, you know, when you're dealing with vulnerabilities in modern applications, you have these deep dependency trees. And so a lot of your vulnerabilities are going to be two, three, five, 10 layers deep in your dependency tree. I think I mentioned this to you before, Pat, but like a Hello World JavaScript application today has 1,000 dependencies. I'm not kidding.

Starting point is 00:15:30 You go to get a React app going. You want to show Hello World? Like 1,000 JavaScript dependencies to show Hello World in your browser. I mean, this is the same, we're living in the era of the, you know, 10 megabyte buttons on websites, right? Yes. It's just, there's too much. There's too much everything. Yeah.

Starting point is 00:15:49 And the vibe coding is not helping because, you know, people are writing more code. So the key thing is you got to analyze the full dependency tree. And one thing that's really important to us just as a company and with both the Coana team and Socket is that we never tell you that a dependency or that a CVE is reachable if it's potentially not reachable. And we never tell you it's not reachable if it's potentially reachable because both of those lead to really bad outcomes. You don't want to not fix something because the tool told you that it's fine and then you end up getting hacked. And you don't want to do the opposite. You don't want to over-report things as reachable

Starting point is 00:16:24 because that's time you're spending on really fake busy work that isn't going to actually meaningfully improve security, and that's the whole reason why this whole reachability analysis thing exists. So you need to analyze the full dependency tree, but that brings kind of, that's sort of the main challenge, is how do you do that efficiently? Because that's a lot of code. You know static analysis tools, often they're slow to run.

Starting point is 00:16:45 Developers don't like running them. You know, they can't run them on every time they hit save in their editor because they're too slow to run. Now imagine static analysis, but not just on the 5% of the application that's your code, but now it's on the 100% of code that's your entire dependency tree. So you're talking like 20x as much code in some cases.

Starting point is 00:17:05 And so this has to be fast. And so that's like a lot of the reason why, you know, the other folks have kind of punted on doing that. And they'll just make these guesses or these assumptions about whether or not a CVE that's deep in the tree is actually reachable or not. So they're just, they were very focused on accuracy.

Starting point is 00:17:19 And that's another reason that we really connected with the team's, you know, perspective. Well, so how do you solve that problem? How do you make it sort of performant, right? And how do you deploy this thing? Because you were saying about some of the other things before, it's like a CI/CD pipeline integration.

Starting point is 00:17:36 It's kind of like, no one wants to do it. So how does it, you know, a very Risky Business sort of question. Like, why don't you tell us how this thing works? Yeah, yeah. For sure, for sure. So there's a few things, though, I need to explain because we actually have a couple of options for reachability, and it's actually what makes the solution really powerful for folks.

Starting point is 00:17:56 So what we've been describing so far is what we call our tier one reachability. And this is where you do a full analysis of the entire application and dependency tree. And the Coana team's managed to make that perform pretty well, mostly by using really smart heuristics. So when you're looking at code, you have to make assumptions about what you're going to analyze and what you're not going to analyze and where you're going to kind of cut off the analysis because you can't run the code. Fundamentally, what we're doing here is we're doing static analysis. So they've just done a really good job of, over the years, because of Anders's research, like really honing this down so that it works on real world code bases

Starting point is 00:18:31 on large monorepos. And it's something that is like literally just due to his like decades of research in this space. But in addition to that, we have another option we call our tier two reachability or pre-computed reachability. And this is something that's unique to Socket that we've developed with the Coana team since they came on board three months ago. And this is a really great option for folks that want to prioritize ease of getting this rolled out above all else. And it's really powerful because it just needs your manifest files to do the analysis. There's no source code access required.

Starting point is 00:19:08 And now this is kind of a crazy idea. Think about that for a minute. We're going to tell you if a CVE is reachable from your application, but we're not going to look at the code of your application, right? So how do you do that? Well, you make an assumption. The assumption you make is that you say, we're going to assume for your direct dependencies that you're using all of the functions that are exported from that direct dependency in your application. Now, that's an assumption. But if you make that one assumption, you can now, we can pre-analyze or pre-compute the reachability graph

Starting point is 00:19:39 for that dependency and the entire dependency chain that goes all the way down before we've even looked at your application. So when you come to socket and you say, these are my dependencies, we can say, great, you have a CVE that's 10 layers deep, and there's no possible way to reach it because we've already looked at that entire dependency graph, and there's no way to use that direct dependency in any way. You can call all the functions. You can call it with all the different arguments. You cannot reach the vulnerability. And so you get almost as good performance. You get 80% reduction instead of 90 plus percent reduction, but you don't need to analyze the application source code. It's incredible. No, no, no, no. I get it because, yeah,

Starting point is 00:20:15 Like it's a tree, right? And you just analyze it all the way down and you're going to, you're going to know like that thing there's buried way too deep. You can't get to that. Yeah, exactly. And to my knowledge, no one's ever done this before. So, so we just, I think we discovered this. We actually just filed a patent on it because I mean, not that I, I'm a huge fan of software

Starting point is 00:20:33 patents. I think they're like mostly bad ideas, but, but this is a novel idea and we want to protect it. The point is it's, it's really like the first that I've seen of this type of type of analysis and it's great because it means that you just connect us to your GitHub or you just hit our API with those manifest files and you're going to get back results, right? We've literally just onboarded a customer that's in the Fortune 50 that was using, I don't want to say their name just out of respect for other vendors, but actually I'll say, you mentioned their name

Starting point is 00:21:05 earlier in the show, and they couldn't get reachability rolled out and they've been a customer for five years. And then they tried Socket with the precomputed reachability. They got results immediately and they were shocked. And so they're switching off now and switching to Socket fully. So it's like, it's insane. Like this is actually going to be a really big deal for teams that love the idea of reachability. They've been hearing about it for years, but they've just never been able to get it rolled out. Yeah. So I imagine, like, just because you've got a bit of flexibility in the way you can roll this thing out, like switching over is going to be pretty easy. But if you're dealing with a company like any Fortune 50, they're going to have a lot of

Starting point is 00:21:40 repos, a lot of code, a lot of applications, so this is going to result in a lot of output. So how are you handling that, right? Like are you, you know, providing outputs that are friendly for, you know, vulnerability management tools? Like, you know, Nucleus is a sort of more, I guess they're sort of a maturing startup now who do vulnerability management stuff, but like there's other companies that do that sort of data sciencey bit of managing vulnerabilities at scale. Like what are you doing to integrate with that stuff? Because I mean, it's great to find this stuff, but if you're just crapping it out into a console that you

Starting point is 00:22:14 have to log in and it's a socket console, like that, that's of limited utility. Yeah. No, all of our biggest customers basically never log into the console. They are, they're API only users. Yeah. And that's something we wanted to support, uh, we have supported from the very beginning just because we know like everyone has like 80 security tools, I think is the last number I heard. And so they, you know, it has to go, go into one system.
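The manifest-only, pre-computed (tier two) reachability described a little earlier can be sketched as two steps: an offline pass that assumes every exported function of a package is called and records everything that becomes reachable, and a cheap lookup that needs only the list of direct dependencies. This is a minimal illustration of that one stated assumption. The package names, function names, CVE labels and the helpers `precompute_reach` and `reachable_cves` are all hypothetical, and nothing here reflects Socket's actual implementation or API.

```python
def precompute_reach(call_graph, exports):
    """Offline step: for each package, assume every export is called and
    collect all functions reachable through its dependency chain."""
    index = {}
    for package, exported in exports.items():
        seen = set(exported)
        stack = list(exported)
        while stack:
            fn = stack.pop()
            for callee in call_graph.get(fn, ()):
                if callee not in seen:
                    seen.add(callee)
                    stack.append(callee)
        index[package] = frozenset(seen)
    return index

def reachable_cves(manifest, reach_index, vulnerable_functions):
    """Online step: only the manifest (direct dependencies) is needed."""
    return {
        cve: any(fn in reach_index.get(dep, ()) for dep in manifest)
        for cve, fn in vulnerable_functions.items()
    }

# Hypothetical ecosystem data, analysed before any customer code is seen.
call_graph = {
    "webfw.render": ["tmpl.compile"],
    "tmpl.compile": ["tmpl.escape"],
    "tmpl.eval_raw": ["tmpl.escape"],  # exists deep in the tree, but nothing exported calls it
}
exports = {"webfw": ["webfw.route", "webfw.render"]}
reach_index = precompute_reach(call_graph, exports)

manifest = ["webfw"]  # all the customer has to provide
vulnerable_functions = {"CVE-A": "tmpl.eval_raw", "CVE-B": "tmpl.escape"}
print(reachable_cves(manifest, reach_index, vulnerable_functions))
```

Because the index assumes every export is used, it can over-report compared with analysing the application itself, which is consistent with the lower noise reduction quoted for this tier.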

Starting point is 00:22:34 We see, um, actually Anthropic, um, one of our customers, um, did a talk at BSides SF recently, uh, and they, uh, talked about how they're using Socket's API to integrate Socket, but the company doesn't really even have to know that Socket's being used. So they built this dependency tool that internally tells you whether or not a package is allowed or not within the company. And behind the scenes, it's just hitting Socket's API and saying, like, what's the score for this package? What's the supply chain security score?

Starting point is 00:23:04 And if it's below 80 out of 100, then they just don't even allow it in the entire company. It's just banned. And that's their approach. But the developers don't need to know that. They just check the internal tool and it tells them whether or not they're allowed to use it. So everything in Socket uses APIs. And so in this case, yeah, we just, you know, you can import that stuff into whatever you want. Nucleus is a, is a great company and a great, like, one of, you know, great option for ingesting this data. Yeah, like mega triage. I guess

Starting point is 00:23:31 is what they do. So, um, look, the question becomes, right? You've got this stuff out there in the wild, you know, analyzing corporate applications and trying to find what's reachable, what's not. I mean, what have you learned through that process? Because I'd imagine that you're going to see something interesting over and over and over and over again, whether it's certain classes of bugs are like more likely, just generally, to be reachable or bugs in certain types of libraries or packages, are more likely to be reachable? Like, what are the insights you can share with us about what you've learned by going through this process with a bunch of enterprise apps?

Starting point is 00:24:08 I mean, the first realization is just the amount of dependencies that all these companies have. It's just, it blows my mind. As somebody, you know, as an open source maintainer, former open source maintainer myself, I scrutinized the crap out of my dependencies before I brought them into my projects, because, you know, anyone who's installing my code is going to pull in my dependencies. And so I felt like it was my responsibility to make sure that those dependencies were safe. Yeah, most people, most people don't think like that, Feross. I know. This was actually, this was the realization that led to starting the company, actually, was when I kept seeing these supply chain attacks, you know, I realized no one's looking at the code

Starting point is 00:24:46 because if you just opened the file and you just looked at what the dependency did, you know, these things were not disguised very well. They still are not disguised very well. Like you can just see in the code, you know, like it steals all your environment variables and sends them off to an IP address. Like it's, so that's when I realized no one's looking at this stuff. Because if you did, if you spent even one second and opened up, you know, the index.js file of the package, just see what is it doing?

Starting point is 00:25:12 You know, you'd see right away, like, it's, it's not, something looks off about the code. By the way, this is why the LLMs are so effective at identifying those types of anomalies is because one of the reasons is that there's something about when you just eyeball the code, sometimes not all the stuff we find is like this, but in terms of volume, the vast majority is this like low effort crap that's being pumped out by folks. And so that stuff is you just look at the file, which the LLM can do, and it just sort of sees it. It's like something's off here and then it bumps it up to the human for a view. But yeah, so that's the first thing. It's just like there's a lot of dependencies. We've seen,

Starting point is 00:25:49 you know, like a lot of companies, brands that you know, right? I mean, they have 100,000 plus dependencies, you know, across the organization. And in some cases, they have like 50 versions of the same package, you know, across all the different applications. And some tools make this worse, like Dependabot, when folks are using that tool, it tries to bump, you to the latest version of packages. It's a really popular tool with developers, but it doesn't take into account what else is already being used in the company. And so you get this kind of diffusion of versions. So you're using just every version of the package, you know, that you could possibly be using. Yeah, like 600 forks of the same thing, basically. That kind of deal?
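The version sprawl described here is easy to measure once the pinned versions from each repository are collected in one place. The sketch below assumes that collection has already happened. The repository names, package names, versions and the helper `version_sprawl` are made up for illustration.

```python
from collections import defaultdict

def version_sprawl(lockfiles):
    """Return packages pinned at more than one version across repositories.

    lockfiles maps a repository name to {package: pinned_version}.
    """
    versions = defaultdict(set)
    for pins in lockfiles.values():
        for package, version in pins.items():
            versions[package].add(version)
    # Versions are sorted as plain strings, which is enough for a report.
    return {pkg: sorted(vs) for pkg, vs in versions.items() if len(vs) > 1}

# Hypothetical pins gathered from three repositories.
lockfiles = {
    "billing-service": {"leftpadder": "1.2.0", "httpkit": "4.0.1"},
    "web-frontend": {"leftpadder": "1.3.1", "httpkit": "4.0.1"},
    "data-pipeline": {"leftpadder": "1.2.0"},
}

print(version_sprawl(lockfiles))
```

A report like this gives a rough sense of how many distinct copies of one package an organization would have to assess when a CVE lands.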

Starting point is 00:26:29 Yep, yep, pretty much. Yep. And so we see a lot of that. I think the other thing that's interesting about just stuff we've noticed is there's quite a lot of what are called phantom dependencies. It's kind of an interesting thing. So folks don't necessarily always declare the dependencies that they're using. They just import them. And if the dependency happens to be somewhere in the dependency tree, like something else installed it. Maybe it was installed by one of their dependencies. So it's not directly dependent upon by them. But they just go and import it, right? And the tools let you do that. And they just say, oh, the dependencies there on the on the file system. I'll just grab it. But you're not declaring it

Starting point is 00:27:13 anywhere. Yeah. So it's not in a manifest, it's just, how it got here, who knows? Who can say? Exactly. There's no version. You know, you don't know what you're... So a lot of tools will take in the manifest and then just miss those dependencies. So that's why they're called phantom dependencies. So that's been interesting, the amount of that that is present, which is another reason why, you know, doing reachability can be really helpful. Yeah, because that thing's not going to get you owned. It's just kind of there, right? I mean, this is sort of like, you know, a dormant malware kind of thing. Like you don't really have to worry about it. EDR doesn't often alert on it, you know, because it's just sitting around. It's on the file

Starting point is 00:27:47 system. It's not really a priority. I mean, it's the same thing for this stuff, right? Like a bug that no one can exploit. When you've got like, I mean, people just don't understand the extent to which vulnerability management teams are backed up, right? Like, they need help to prioritize. So anything that can help them do that. So look, with this reachability analysis, do you have any sort of statistics on when you compare, like, an SCA tool's scan that says, well, there's like, you know, say it finds a thousand CVEs, what percentage of them can you rule out with this reachability analysis? Yeah.
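
The phantom dependencies described a moment ago (packages that are imported but never declared in a manifest) can be sketched in a few lines of Python. This is a simplified illustration, not Socket's detection logic: it assumes the import name matches the package name, which is not always true, and it takes the declared packages and the modules to ignore (for example, the standard library) as plain sets supplied by the caller. The function names are hypothetical.

```python
import ast


def imported_top_level_modules(source):
    """Collect top-level module names from absolute import statements."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped.
            modules.add(node.module.split(".")[0])
    return modules


def phantom_dependencies(source, declared, ignore=frozenset()):
    """Return imported modules that are neither declared nor explicitly ignored."""
    return sorted(imported_top_level_modules(source) - set(declared) - set(ignore))


# Illustrative input: the manifest declares only `requests`, but the code also
# imports `urllib3`, which is present only because `requests` installed it.
SOURCE = "import os\nimport requests\nfrom urllib3.util import Retry\n"
PHANTOMS = phantom_dependencies(SOURCE, declared={"requests"}, ignore={"os"})
```

In this example `PHANTOMS` contains only `urllib3`. A scanner that reads just the manifest would never see it, which is the gap being described.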

Starting point is 00:28:22 So with the tier one reachability analysis that came in from the Coana acquisition, we get 90% on average. Now, this is different in each programming language. So it's kind of an average number. You'll have to sort of see for your application what you get, but averaged across all of the different languages that we support, it's 90%, which is a really big number. And that just shows like all that work that's been done by these teams over,

Starting point is 00:28:48 you know, I mean, it just feels so bad for them and for the developers that are forced to fix this stuff. That they've been patching stuff that just doesn't give them any risk at all. Yeah. Yeah. And I mean, and I get it if you want to stay on the latest package versions due to like, you know, some other reason. Like if you have a good reason, like, oh, I want to get the new features that this package has. Like, that's a great reason to upgrade, right?
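
To make the idea of ruling out unreachable findings concrete, here is a toy Python sketch. It assumes a call graph is already available as a dictionary and that each finding maps to a single vulnerable function. Both are simplifications, and every name and identifier below is hypothetical. Real reachability analysis, including whatever Socket does, involves far more than this breadth-first search.

```python
from collections import deque


def reachable_functions(call_graph, entry_points):
    """Breadth-first search over the call graph, starting at the app's entry points."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for callee in call_graph.get(current, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def triage(findings, call_graph, entry_points):
    """Split findings into (reachable, ruled_out) by their vulnerable function."""
    reachable = reachable_functions(call_graph, entry_points)
    keep = [fid for fid, function in findings.items() if function in reachable]
    ruled_out = [fid for fid, function in findings.items() if function not in reachable]
    return keep, ruled_out


# Invented call graph: the app calls into pkg_a and pkg_b, but nothing calls pkg_c.
CALL_GRAPH = {
    "app.main": ["pkg_a.parse"],
    "pkg_a.parse": ["pkg_b.decode"],
    "pkg_c.render": ["pkg_c.unsafe_eval"],
}
# Placeholder finding IDs mapped to the function that contains the bug.
FINDINGS = {"FINDING-1": "pkg_b.decode", "FINDING-2": "pkg_c.unsafe_eval"}

KEEP, RULED_OUT = triage(FINDINGS, CALL_GRAPH, ["app.main"])
```

Here `FINDING-1` stays on the list because `app.main` can reach `pkg_b.decode`, while `FINDING-2` is ruled out. At the 90% average quoted above, a scan with a thousand findings would leave roughly a hundred to look at.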

Starting point is 00:29:06 Maybe it fixes a bug. Like, okay, good. That's a good reason to upgrade. But if you're just doing it on a legacy application that's in maintenance mode, right? It's so painful to do these upgrades a lot of times because you're actually... You're breaking stuff. You have to re-engineer it, work around it. Look, I've been there.

Starting point is 00:29:22 It's horrid. Yeah. And sometimes it's fully the security team's job. Like, they have no help from the developers. When I say I've been there, by the way, it's like it's people who are doing that stuff for me telling me about this. I am not a developer, but yes, go on, sorry.

Starting point is 00:29:36 But, I mean, at least you have the developers helping you. That's not always the case. Like, sometimes it's like their job is go and patch the application. And then they're like, I don't even know what this application does. It doesn't have any tests. Like, I can patch the dependency, but then the app doesn't start anymore. Now, I'm going to go and just hack at the code until I get it to run again. But I don't know if it's like working correctly anymore.

Starting point is 00:29:56 There's no tests. Like, how am I supposed to do that? It's a really painful thing to ask people to do. So the 90% is like really powerful. Now, I want to just be clear, though, the other reachability, the tier two, which is the pre-computed approach that's really easy to roll out, we see on that one about a 60 to 80% reduction. So it's a little less, but it's way, way easier to roll out. It's nice to have both options. So I guess like, look, the reachability

Starting point is 00:30:18 thing is very important. It's very cool. But I guess you're seeing this, you know, now that you're doing, you're sort of trying to enter that bigger SCA market, which is huge, right? Instead of just being a bit player who does like something very specific, you're trying to enter that market. It seems like the reachability thing is like your hook, right? Like that's your, what do they call it? Your USP, your unique selling proposition seems to be around the reachability stuff. Like, is that a fair assessment? It's actually the second USP, I guess. The first being, you know, the deep supply chain analysis, you know, looking for the malicious packages. In addition to the CVEs. Yeah, yeah, yeah, yeah, that makes sense. Yeah. And I think it's,

Starting point is 00:30:57 in a way, it's kind of more broadly applicable to more teams because what we found is, you know, people care a lot about the malicious package detection, absolutely, especially if they've been affected by an attack or had a close call or something like that, or if they're in a space that just is known for this kind of stuff. Like the cryptocurrency industry is obsessed with Socket. Yeah, I was going to say, I can imagine, I can imagine you got a lot of customers there. Yeah, yeah, pretty much anyone who's anyone there. So, yeah, it's been a pretty, pretty good hook, but it's not part of like SOC2 standards. You know, it's not part of, you know, like, you don't have to be doing this type of

Starting point is 00:31:35 scanning. And so that's where I think the reachability comes in because everybody has to deal with vuln management. Yeah, so it was really funny when you mentioned like at the start of this interview, you were talking about like, well, people aren't really looking for the malicious, you know, malicious packages. They're so focused on CVEs and whatnot. And I'm just sitting there thinking, well, that's because the compliance standards

Starting point is 00:31:53 tell them that they have to, right? Like, at what point do we start seeing this sort of analysis for, you know, scanning for malicious packages? Like, when is that coming to compliance standards, the big ones, like SOC2? I think it's overdue, to be honest, and I was kind of hoping that... Well, me too. I'm sort of... And I figured you might have some insight here.

Starting point is 00:32:12 Yeah, you know, they did think about adding it to the Secure by Design initiatives, but they wanted to start with stuff that was super uncontroversial for the first version of that. And then after, you know, Trump took over, I don't know what the future of Secure by Design is. So I don't know if a V2 is going to come out with the supply chain stuff in there. Yeah, I think the SBOM stuff is promising, folks. I mean, that is actually getting pretty standardized, especially if you want to sell software to the U.S. federal government. So I have some hope that at some point folks are going to look around at all these SBOMs they've been collecting

Starting point is 00:32:48 and go, what the hell can we do with this stuff? And then they're going to realize like, oh, knowing the security status of these things, not just the vulnerabilities, but like, you know, who the hell is behind this, this package, right? Who is the maintainer? Like, has it been backdoored? And then you can sell API calls to the U.S. government, right? Basically. I mean, yeah.

Starting point is 00:33:10 And to everyone else who's been collecting SBOMs as well and, like, you know, just sticking them in their compliance tools and not doing anything useful with them. Yeah. But I mean, that's always been the plan with SBOM, right? It's to collect the information first, get it coming in, and then, you know, naturally you'll be able to find useful things to do with it. And checking for malicious packages is going to be one of those things. And not just that, but even like the broader set of risks, like we see a lot of like a surprising use case that we've

Starting point is 00:33:34 seen that I did not expect was folks looking through their SBOMs and through, you know, the dependencies that we pick up, looking for deprecated packages, not because those are a risk today, but because if there was a CVE added, you know, announced or, you know, discovered in one of those packages, there'd be no path forward, right? They would have no... there's no further maintenance happening on that package. So now their company is stuck either forking it themselves and making the fix or migrating off of that package to a different one, which can be significant work. So they just want to get ahead of that before they're scrambling. And so we've actually seen things like that, which I would never have

Starting point is 00:34:12 guessed that security teams would be interested in deprecated packages, because that to me feels much more like an engineering quality kind of a conversation. No, I get it. I get it. I understand 100% why you would want to know, you know, because I think, you know, if you're a security team, the reason you would want that info is because a lot of these are not going to be a tremendously big deal, but you know, you might shake out one or two where you're like, that's deprecated, that's a problem, you know, that's, is that sort of how it's working? Yeah, they're surprised sometimes at what's deprecated or there's even cases of things that are kind of like unofficially deprecated. They haven't had a version released in five years.
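
The deprecated-package check discussed above could look something like the following Python sketch. It flags packages that are either marked deprecated or have gone a long time without a release. The package records, the field names, and the five-year default are illustrative assumptions, not a description of any real SBOM format or of Socket's product.

```python
from datetime import date, timedelta


def flag_unmaintained(packages, today, max_age_days=5 * 365):
    """Flag packages that are deprecated or have had no release within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    flagged = []
    for name, info in packages.items():
        if info["deprecated"]:
            flagged.append((name, "marked deprecated"))
        elif info["last_release"] < cutoff:
            flagged.append((name, "no release since " + info["last_release"].isoformat()))
    return flagged


# Invented package records; a real pipeline would fill these in from registry metadata.
PACKAGES = {
    "old-parser": {"deprecated": True, "last_release": date(2019, 3, 1)},
    "quiet-lib": {"deprecated": False, "last_release": date(2018, 6, 15)},
    "active-lib": {"deprecated": False, "last_release": date(2025, 1, 10)},
}

FLAGGED = flag_unmaintained(PACKAGES, today=date(2025, 6, 1))
```

With these inputs, `old-parser` is flagged because it is marked deprecated and `quiet-lib` because its last release is older than the cutoff, while `active-lib` passes.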

Starting point is 00:34:48 And so they might not have flagged it as deprecated officially, but it's effectively deprecated. And so things like that we can also... Is the maintainer alive? Who can say? Yeah, and that's actually all too real. You know, people are getting older, and you're joking about it, but there's actually real cases like that. Yeah, where someone is no longer with us, right? Makes it a little bit difficult to maintain code when you are no longer alive. All right, look, we're going to wrap it up there. Feross, thanks for talking through all of that. Really interesting stuff. I think this reachability analysis stuff is, yeah, very interesting, and it's something that

Starting point is 00:35:27 everybody kind of needs, right? So I wish you all the best with it. I hope everybody builds something like this, because for too long we've been patching bugs and stuff that we just haven't had to. Yeah, great to chat to you, my friend, and we'll do it again soon. Cheers. Thank you, Pat. It's an honor to be here.
