CyberWire Daily - Giving everyone a stake in the success of Open Source implementation. [Research Saturday]
Episode Date: June 29, 2019
Synopsys recently published the 2019 edition of their Open Source Security and Risk Analysis (OSSRA) Report, providing an in-depth look at the state of open source security, compliance, and code quality risk in commercial software. Tim Mackey is principal security strategist within the Synopsys Cyber Research Center, and he joins us to share their findings. The research can be found here: https://www.synopsys.com/software-integrity/resources/analyst-reports/2019-open-source-security-risk-analysis.html Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
You're listening to the Cyber Wire Network, powered by N2K.
Like many of you, I was concerned about my data being sold by data brokers. So I decided to try Delete.me.
I have to say, Delete.me is a game changer. Within days of signing up, they started removing my
personal information from hundreds of data brokers. I finally have peace of mind knowing
my data privacy is protected. Delete.me's team does all the work for you with detailed reports
so you know exactly what's been done. Take control of your data and keep your private life private. Go to JoinDeleteMe.com slash N2K and use promo code N2K at checkout.
The only way to get 20% off is to go to JoinDeleteMe.com slash N2K and enter code N2K at checkout.
That's JoinDeleteMe.com slash N2K, code N2K.
Hello, everyone, and welcome to the CyberWire's Research Saturday.
I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
protecting ourselves in a rapidly evolving cyberspace.
Thanks for joining us.
And now, a message from our sponsor, Zscaler, the leader in cloud security.
Enterprises have spent billions of dollars on firewalls and VPNs,
yet breaches continue to rise, with an 18% year-over-year increase
in ransomware attacks and a record $75 million payout in 2024.
These traditional security tools expand your attack surface
with public-facing IPs that are exploited by bad actors
more easily than ever with AI tools. It's time to rethink your
security. Zscaler Zero Trust plus AI stops attackers by hiding your attack surface, making apps and IPs
invisible, eliminating lateral movement, connecting users only to specific apps, not the entire
network, continuously verifying every request based on identity and context, and simplifying security management. Protect your organization with Zscaler Zero Trust and AI. Learn more at zscaler.com slash security.
So this is actually our fourth edition of the report. That's Tim Mackey. He's principal
security strategist within the Synopsys Cyber Research Center. The research we're
discussing today is titled 2019 Open Source Security and Risk Analysis Report. This comes
from an initiative that we had within the Black Duck software community. Synopsys acquired Black
Duck software in December of 2017. And so the research work when it was Black Duck was known as
the COSRI report, for the Center for Open Source Research and Innovation. We brought that forward into Synopsys, and this is the second
incarnation of it under the Synopsys branding. The research itself looks at one aspect of
the Black Duck business, which is all about doing audits of commercial software code bases, typically in either a merger and acquisition
scenario or some new VC funding round, basically as part of a tech due diligence. So we're looking
at actual real code, real applications, real libraries, as opposed to, say, doing a survey.
I see. So there's some really interesting data here in the report. Let's just start off with
sort of an overview. Can you give us the
lay of the land? Where are we when it comes to the prevalence of open source software in code
these days? The easy statement for that is open source development is where it's at. We typically
see that the majority of software components making up a commercial application are open
source in nature. And if you look at how
development teams have evolved, say, over the last five or 10 years, this kind of makes sense.
We have a preponderance of libraries and options and frameworks and runtimes that enable development
teams to create their unique functionality and feature set offering without necessarily being
stuck with, hey,
if I don't have the expertise in-house, I'm really going to struggle. That expertise can
be anywhere in the world. And that's one of the key values that open source development brings to
modern applications. So you have sort of a tried and true component of functionality that has built
a good reputation for itself over the years. And a development team can basically take that off the shelf
and plug that functionality into whatever they're developing.
True. And within the report, we saw that of the code that we analyzed,
96% of it contained at least one open source component.
And that, on average, about 60% of the code was open source in nature.
And that was independent of industry.
Take us through what are some of the most prevalent places where open source is being used?
It really has no meaningful locus.
So, for example, that could be IoT development.
That could be new mobile applications.
That could be cybersecurity.
That could be heavy industry.
They're all using some level of open source componentry in order to build their systems. And if we look at how
an application stack is created, it kind of makes sense. Maybe you're deploying on top of Linux,
or you have containerized applications, you're bringing Docker into the mix. You might have
common runtimes like Java or .NET, all of which are open source. So you're
bringing in open source technologies as part of your overall solution delivery. And that's kind
of a good thing. But what are the most common components that you're seeing in use? The most
common component last year was jQuery. So we were seeing a fair number of applications that had a
web-based front end to
them. And as you would kind of expect, there's an awful lot of JavaScript in a modern web-based
application. So jQuery kind of topped the list. Any other notable components that you see there
come up a lot? It really does vary. So we see things like Font Awesome coming up. That was
actually our third most common component that came up.
But it's clear across the board.
So if someone's gone down the Node path, we're going to see an awful lot of Node.
If they've gone down the Angular path, we're going to see an awful lot of Angular.
And that's true also on the server backend, where Java and .NET, Golang, all of the capabilities
that you would expect out of those languages are represented in an open source form in
these applications.
I see. So let's dig into some of the security issues here. Again, give us an overview. What
are the vulnerabilities that we're talking about? We saw quite a spectrum in terms of the
vulnerabilities and the patch state. And I think the patch state is really a key thing to focus
in on. One of the vulnerabilities that we saw last year, and this is right now top of the leaderboard, we've never seen it quite this striking,
is a vulnerability that was in FreeBSD. And so this particular application was using a very
old version of FreeBSD that had a vulnerability that was disclosed in May of 1990. Or the way
we put it, it is probably older than some of the developers working on modern code.
Right, right.
And so we looked into how this could be.
And one of the things we came away with was that this was an application that just fundamentally met its requirements.
And no one saw any reason to deviate from this until they brought in a company like us to go and assess the software and say, well, what exactly
are the quote-unquote smoking guns that might be present here, and what do we need to do to move
forward? We saw this, and it was, it's working, why do we need to change it?
Yeah, that's really fascinating, because, I mean, I can see sort of an if-it's-not-broken, don't-fix-it sort of
thing if everything's working the way it's designed. And also, I would imagine, you're not getting complaints from the users about functionality problems as well.
Correct. And this actually ends up manifesting itself in a different aspect of patching when
it comes to open source components. And that is, there's no one vendor. There's no, quote,
vendor known as open source where you can just go and get all your patches from. Your patch has to match wherever you obtained your code from.
So the easy example is if I have a patch for OpenSSL, I could have a patch that comes from upstream.
I could have a patch that comes from, say, canonical.
I could have a patch that comes from Red Hat.
If I apply the wrong patch, I could change the behavior of OpenSSL
in ways that I don't expect. And that could be a really, really bad thing. So I have to know not
only that I have to patch something, but where to get the correct patch from.
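To make that patch-provenance point concrete, here is a minimal sketch of recording where each component was obtained from, so that when a patch lands you consult the matching source rather than applying an upstream fix to a vendor-modified build. The component entries and advisory URLs are illustrative assumptions, not anything taken from the report.

```java
import java.util.List;
import java.util.Map;

// Sketch: track the provenance of each third-party component so a patch is
// always taken from the same place the code originally came from.
// Component names, versions, and advisory URLs below are illustrative only.
public class ComponentProvenance {

    // A component as recorded in the inventory: name, version, and origin.
    record Component(String name, String version, String origin) {}

    // Hypothetical mapping from origin to the advisory/patch feed to consult.
    static final Map<String, String> PATCH_SOURCES = Map.of(
            "upstream", "https://www.openssl.org/news/vulnerabilities.html",
            "ubuntu",   "https://ubuntu.com/security/notices",
            "redhat",   "https://access.redhat.com/security/security-updates/");

    public static void main(String[] args) {
        List<Component> inventory = List.of(
                new Component("openssl", "1.1.1", "redhat"),
                new Component("zlib", "1.2.11", "upstream"));

        for (Component c : inventory) {
            String feed = PATCH_SOURCES.getOrDefault(c.origin(), "UNKNOWN - investigate");
            System.out.printf("%s %s from %s -> check patches at: %s%n",
                    c.name(), c.version(), c.origin(), feed);
        }
    }
}
```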
If I have a piece of software that's working fine, and there have been multiple versions over
the years, it's unlikely that I'm going to go back and check the parts that haven't changed,
the parts where there's been no functional change since the last version.
Is it likely, then, that I'm not going to go back and check whether that open source component has had any updates or patches?
That's actually a very common scenario. What we see development teams doing, and when you step back, this makes perfect sense that they would do this,
is here's a component that meets my requirements.
I don't want to run the risk of say the component not
being available from wherever I downloaded it from.
So I'm going to bring it in-house and I'm going to
cache it in some form of binary repository.
This is awesome because it ensures that I'm going to have a very consistent build environment.
That application is going to come out exactly the same way every time.
Over time, there might be security disclosures of one form or another against that component in its specific version.
If I don't have some process to go and keep it up to date, I'm now going to get progressively out of date.
And when the time comes to actually update it, it might be six months, it might be a year, it might be two years later, the delta in functionality can pose a significant tax on the organization when they go and apply that new patch and suddenly there's some behavioral
change or configuration change or so forth. And that's why one of the big things that we saw as alarming is that 85% of the code bases contained a component that was more than four years out
of date from whatever the current version is, or had absolutely no development within
the last two years.
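As a rough illustration of the staleness criterion Mackey cites (more than four years out of date, or no development in the last two years), here is a minimal sketch. The inventory shape, component names, and dates are invented for the example; real tooling would pull release metadata from the relevant package registries.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.List;

// Sketch of the "am I getting stale?" check described above: flag any cached
// component whose pinned release is more than four years old, or whose
// project has published nothing in the last two years. Data is illustrative.
public class StalenessCheck {

    record Dependency(String name, String pinnedVersion,
                      LocalDate pinnedReleaseDate, LocalDate latestUpstreamRelease) {}

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        List<Dependency> deps = List.of(
                new Dependency("example-lib", "2.3.0",
                        LocalDate.of(2014, 5, 1), LocalDate.of(2019, 1, 15)),
                new Dependency("abandoned-lib", "0.9.1",
                        LocalDate.of(2016, 8, 1), LocalDate.of(2016, 8, 1)));

        for (Dependency d : deps) {
            boolean pinnedTooOld = Period.between(d.pinnedReleaseDate(), today).getYears() >= 4;
            boolean projectDormant = Period.between(d.latestUpstreamRelease(), today).getYears() >= 2;
            if (pinnedTooOld || projectDormant) {
                System.out.printf("%s %s looks stale (pinned %s, last upstream release %s)%n",
                        d.name(), d.pinnedVersion(), d.pinnedReleaseDate(), d.latestUpstreamRelease());
            }
        }
    }
}
```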
And so that's the level of awareness that teams really need to have: am I getting stale?
Am I getting out of date?
What's the quote unquote operational risk that's going to be associated with updating
to the new version when I finally get around to it? And I suppose, I mean, there's an assumption that if we're bringing this in-house,
then are we relying on our own team to keep it up to date in a way? Does that make sense?
It does. And so it effectively becomes a question of if you're bringing it in-house,
what is the procedure and process that you're going to run through in order to keep things, quote, secure and current?
And that might mean that, hey, I'm going to maintain an independent fork because that
makes sense for my organization and there's a conscious decision behind it and there's
humans with competencies in order to do that maintenance.
Or I'm going to build a process that has, for example, an engagement with that community
to be aware of when they release new updates, when they release new versions, how they communicate their patch
and security information. That needs to just be part of the overall, how do I responsibly consume
open source software? Let's go through some of the vulnerabilities that you saw. I mean,
there were some that popped up over and over again. What were some of the ones that you kept
seeing there? The one that really popped up over and over was associated with a Jackson data bind. And there
were three vulnerabilities that were part of this puzzle. CVE-2018-7489, CVE-2017-7525,
and CVE-2017-15095. They all fundamentally had the same root scenario. And for the people who
don't know what Databind is all about, it really provides a serialization and deserialization capability to bind data into Java objects so that people working with Java can just use that data as if it was a member variable off of that object.
And so what was at the root of this is that some class types could have a polymorphic or a dynamic binding model associated
with them. And so the first attempt to fix this was, oh, we've stumbled across one of these classes
and it's something we really shouldn't be touching. Let's just go put a simple if statement in there
that says we're going to ensure we don't touch this. The second attempt was, oh, there's a couple
more. So let's go put a case statement in there that says if you're in this set. And the third
attempt was, well, you know what? We really need to have a different approach
because if there's yet more, we're going to end up in some serious trouble. So you effectively had
three separate attempts to patch this. And what we saw was that not everyone had actually moved
on to what the final set of patches were. And there were a number of people who were on the intermediate steps and it may have worked for them, but they now have a latent risk if the developer at some
point in time in the future goes and says, aha, I want to go and do this. Now you expose yourself
to that unrefactored code, if you will. Yeah. I mean, that's an interesting point as well,
because I can imagine an inquiry in-house as someone saying, hey, have we patched this bit of code?
And someone can do a quick check and say, yes, we have patched it,
but it might not be the most recent patch.
Correct.
And if we put our developer hats on and forget about the whole open source angle,
what we're effectively saying is in our own development teams,
we've probably had a situation where we've attempted to fix a bug
and it didn't necessarily work out correctly the first try. So we went and we came up with a
different avenue of attack. That's exactly what happened here, except it's an open source,
freely downloadable version. There's no vendor control where they can go and push that update
out. So the onus very much is on the consumer of this component to go and ensure they're up to date.
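On the consumer side, the "different approach" Mackey describes can look something like the following sketch. This is illustrative, not the library's own patch for the CVEs mentioned above: it assumes jackson-databind 2.10 or later, where an application that genuinely needs polymorphic handling can declare an explicit allow-list of types via a PolymorphicTypeValidator instead of relying on the library's internal blocklist. The package name is a placeholder.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;

// Sketch of a consumer-side defense for Jackson's polymorphic deserialization
// issue, assuming jackson-databind 2.10+. Rather than depending on a blocklist
// of dangerous "gadget" classes, the application declares exactly which
// packages may ever be instantiated during deserialization.
public class SafeObjectMapperFactory {

    public static ObjectMapper create() {
        // Allow only our own model types as polymorphic subtypes; everything
        // else is rejected before a constructor can run. The package prefix
        // below is a placeholder for the application's own model package.
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType("com.example.myapp.model.")
                .build();

        ObjectMapper mapper = new ObjectMapper();
        // Only needed if the application actually requires default typing;
        // if it does not, leaving default typing off is safer still.
        mapper.activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL);
        return mapper;
    }
}
```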
One of the things that you all looked into here were license risks and how different
components have different licenses attached to them.
Take us through what you discovered here.
One of the key things that we look for is anything associated with license conflicts
or challenges to the intellectual property.
It's part and parcel of what we're
trying to do from a tech due diligence perspective. The canonical example is, let's assume that
someone has some GPL3 code, but they're trying to release their project under an Apache license.
That's going to create some challenges for them. And so looking at the license is definitely high
on the list of tech due diligence, and equally as important as looking at what the security state of the situation is.
So what we found was that 68% of the code bases had some form of a license conflict.
61% contained some form of a GPL conflict.
And those are relatively straightforward things to work through. The more alarming scenario was that 32% contained some form of
custom license that would need a legal review in order to interpret it. That someone had taken a
standard license and modified it in some way, added some clause into it, wrote their own version
of a license and said, gee whiz, this is open source as long as you go and do this set of
things. But it's not a standard example that might be endorsed under SPDX or under the OSI model. Even worse, 38% of the components we
saw had no identifiable license associated with them, which leaves open the questions of who owns that code and
what rights and obligations are granted. So as we go through, the core thing
that we want to call out on the license
side of things is make certain that you can actually identify where you got the code and
what the rights are so that you can fulfill any obligations that are associated with that
license.
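A minimal sketch of the kind of license gate this implies on the consuming side: every component must carry an identifiable license, and that license must be on the organization's approved list, with anything custom or unidentified routed to legal review. The approved list, component names, and license strings are invented for illustration and are not the report's methodology.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of a simple license gate over a component inventory. Data illustrative.
public class LicenseGate {

    static final Set<String> APPROVED = Set.of("MIT", "Apache-2.0", "BSD-3-Clause");

    public static void main(String[] args) {
        // component name -> declared SPDX identifier (null = none found)
        Map<String, String> components = new HashMap<>();
        components.put("left-pad-ish", "MIT");
        components.put("some-framework", "GPL-3.0-only");
        components.put("mystery-widget", null);
        components.put("tweaked-lib", "Custom (vendor-modified Apache)");

        components.forEach((name, license) -> {
            if (license == null) {
                System.out.println(name + ": NO IDENTIFIABLE LICENSE - ownership and obligations unclear, escalate");
            } else if (!APPROVED.contains(license)) {
                System.out.println(name + ": " + license + " - not on approved list, send to legal review");
            } else {
                System.out.println(name + ": " + license + " - ok");
            }
        });
    }
}
```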
I would imagine then that could create a real remediation headache if you find
something like that in your code and now the folks at legal have to go digging around to figure
out what our situation here is. Correct. And what a lot of companies try to do in order to avoid
this situation is they will say you can use anything that, say, has an MIT license or an
Apache license, or we want everything to be GPL. If it's not GPL, we don't want it. They'll pick
one of the very standard, understandable, recognizable, tested licenses and say developers can run with those. But there's always going to be some exception
someplace where there's a component that fits exactly what the requirements are, but has a
license that's a little bit off. And so one of the big things that I personally advocate for
is that development teams make friends with their lawyers, take them out to lunch,
hang out with them a little bit. It's like, it seems so goofy, but at some point in time,
that legal team is going to need to be there for you. They should at least know that you're on
quote, the good guy side of the camp, and you're not trying to bend any rules.
You just are legitimately trying to do the right thing for the company. And when you have
that relationship, it's a whole lot easier to go and have a conversation and say, look, I did this.
How do we get ourselves unstuck? Yeah. Building up that relationship ahead of time rather than
when everyone's in a bit of a scrambling mode, I suppose. Exactly. It's like when you're in crisis,
the default is how do we get ourselves out of here as quickly as possible? And if there's no relationship there, it's not going to make matters any better.
But if there is a relationship, at least you know how the person's thinking about certain things in advance.
So let's walk through some of the recommendations here.
What tips do you have for folks who are out there making use of these open source bits of code?
So the first thing I definitely want to call out is that at the beginning,
we said open source is kind of where the world's at.
We saw a 16% increase in the number of open source components
in the code bases we were looking at.
And despite all of the concerns on the license side of things,
we found that the 20 most popular open source licenses covered 98% of the code in place.
So it really is a case of people are doing the right
kinds of things. However, if I want to move to action items, I need to recognize the first rule
is you can't patch what you don't know you have. And so no matter what kind of tooling and process
that you put in place, you have to understand that you have that component in place and where
you got it from. So you have to have some form of inventory discovery tooling in place to solve that,
because eventually there's going to be a patch.
Eventually there's going to be something that needs to be updated.
And if you don't have that process in place, you're kind of stuck.
And as a result, looking for a vendor known as open source isn't going to necessarily help things.
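As a toy illustration of why the inventory has to exist before patching is even possible, here is a sketch that matches a component inventory against newly published advisories. The advisory identifiers and data are invented for the example; real tooling would pull from the NVD or a commercial feed.

```java
import java.util.List;

// Sketch: once an inventory exists, new advisories can be matched against it
// automatically. Without the inventory there is nothing to match.
// The component and advisory data below are invented for illustration.
public class AdvisoryMatcher {

    record InventoryEntry(String component, String version) {}
    record Advisory(String id, String component, String affectedVersion) {}

    public static void main(String[] args) {
        List<InventoryEntry> inventory = List.of(
                new InventoryEntry("example-parser", "1.4.2"),
                new InventoryEntry("example-webfw", "3.0.1"));

        List<Advisory> todaysAdvisories = List.of(
                new Advisory("EXAMPLE-2019-0001", "example-parser", "1.4.2"),
                new Advisory("EXAMPLE-2019-0002", "other-lib", "2.0.0"));

        for (Advisory a : todaysAdvisories) {
            boolean affected = inventory.stream().anyMatch(e ->
                    e.component().equals(a.component())
                            && e.version().equals(a.affectedVersion()));
            if (affected) {
                System.out.println("Affected by " + a.id() + ": patch "
                        + a.component() + " " + a.affectedVersion());
            }
        }
    }
}
```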
How about having an audit done? Is that something that folks should have routinely?
I would say having an audit done periodically for a major event is a really good thing. Having an
audit done, say, when you're going to release the first version of your product or a major
refactor of your product would be a very good thing. But by the same token, there is tooling
available that you can bake into your SDLC so that you can ensure that, say, a non-compliant
license isn't introduced at the outset, and that you have continuous monitoring in place for
new vulnerability disclosures against whatever your application looks like. So maybe you're about
to ship tomorrow, and all of a sudden there's something that's really hairy and audacious
that comes down the pipe today. Maybe you're able to fix that in time, but you have to push the update
out till Monday. Okay, that's fine. Knowing that in advance and having that level of awareness is also key.
And I think, you know, one of the points that's made throughout this research here
is that it's not the open source software itself that's necessarily risky. It's how you implement
it. It's how you go about using it. Exactly. And one of the key things that I advocate for everyone
to do is try and identify what your most critical components are. That might be a framework like Node.js or Angular.
That might be a database like, say, a Mongo or an Elastic.
That might be a delivery paradigm, let's say Kubernetes or Docker.
Find out what your top 10, 15 most critical components are in your environment.
And then engage with those communities.
Find out how they work, where they work.
Do they have meetups?
How are they discussing their future direction and be an active participant, not just on the consumption
side, but on the steering of the future? Because some of these components, they only have a handful
of developers and they have a backlog of activity. So if you have development energy that could go
and solve a problem, they're probably really willing to accept it. And if you're in that community, you're also feeling more engaged with the future direction of everything.
The only other thing that I'd probably call out is that when adopting open source, you probably
want to make certain that you have a robust strategy for its consumption. And that would
cover all of the things that we've been talking about. But it would also make certain that the development teams and the legal teams and various software architects and so forth are actively engaged in that process.
So that it's a standard within the culture of the development team in an organization, as opposed to something that's, say, lawyers saying we must do this or executives saying we must do this. That way, everyone has a stake in the success.
Our thanks to Tim Mackey from Synopsys for joining us.
We were discussing the 2019 Open Source Security and Risk Analysis Report.
We'll have a link in the show notes.
Cyber threats are evolving every second, and staying ahead is more than just a challenge.
It's a necessity. That's why we're thrilled to partner with ThreatLocker,
a cybersecurity solution trusted by businesses worldwide. ThreatLocker is a full suite of solutions designed
to give you total control, stopping unauthorized applications, securing sensitive data, and
ensuring your organization runs smoothly and securely. Visit ThreatLocker.com today to see
how a default deny approach can keep your company safe and compliant.
The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies.
Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond, Tim Nodar,
Joe Carrigan, Carole Theriault, Ben Yelin,
Nick Veliky, Gina Johnson,
Bennett Moe, Chris Russell, John Petrik,
Jennifer Eiben, Rick Howard, Peter
Kilpe, and I'm Dave Bittner.
Thanks for listening.