CyberWire Daily - Giving everyone a stake in the success of Open Source implementation. [Research Saturday]
Episode Date: June 29, 2019
Synopsys recently published the 2019 edition of their Open Source Security and Risk Analysis (OSSRA) Report, providing an in-depth look at the state of open source security, compliance, and code quality risk in commercial software. Tim Mackey is principal security strategist within the Synopsys Cyber Research Center, and he joins us to share their findings. The research can be found here: https://www.synopsys.com/software-integrity/resources/analyst-reports/2019-open-source-security-risk-analysis.html Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
You're listening to the Cyber Wire Network, powered by N2K.
Like many of you, I was concerned about my data being sold by data brokers. So I decided to try Delete.me.
I have to say, Delete.me is a game changer. Within days of signing up, they started removing my
personal information from hundreds of data brokers. I finally have peace of mind knowing
my data privacy is protected. Delete.me's team does all the work for you with detailed reports
so you know exactly what's been done. Take control of your data and keep your private life private. Go to JoinDeleteMe.com slash N2K and use promo code N2K at checkout.
The only way to get 20% off is to go to JoinDeleteMe.com slash N2K and enter code N2K at checkout.
That's JoinDeleteMe.com slash N2K, code N2K.
Hello, everyone, and welcome to the CyberWire's Research Saturday.
I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
protecting ourselves in a rapidly evolving cyberspace.
Thanks for joining us.
And now, a message from our sponsor, Zscaler, the leader in cloud security.
Enterprises have spent billions of dollars on firewalls and VPNs,
yet breaches continue to rise, with an 18% year-over-year increase
in ransomware attacks and a record $75 million payout in 2024.
These traditional security tools expand your attack surface
with public-facing IPs that are exploited by bad actors
more easily than ever with AI tools. It's time to rethink your
security. Zscaler Zero Trust plus AI stops attackers by hiding your attack surface, making apps and IPs
invisible, eliminating lateral movement, connecting users only to specific apps, not the entire
network, continuously verifying every request based on identity and context, and simplifying security management. Protect your organization with Zscaler Zero Trust and AI. Learn more at zscaler.com slash security.
So this is actually our fourth edition of the report. That's Tim Mackey. He's principal
security strategist within the Synopsys Cyber Research Center. The research we're
discussing today is titled 2019 Open Source Security and Risk Analysis Report. This comes
from an initiative that we had within the Black Duck software community. Synopsys acquired Black
Duck software in December of 2017. And so the research work when it was Black Duck was known as
the COSRI report, for the Center for Open Source Research and Innovation. We brought that forward into Synopsys, and this is the second
incarnation of it under the Synopsys branding. The research itself looks at one aspect of
the Black Duck business, which is all about doing audits of commercial software code bases, typically in either a merger and acquisition
scenario or some new VC funding round, basically as part of a tech due diligence. So we're looking
at actual real code, real applications, real libraries, as opposed to, say, doing a survey.
I see. So there's some really interesting data here in the report. Let's just start off with
sort of an overview. Can you give us the
lay of the land? Where are we when it comes to the prevalence of open source software in code
these days? The easy statement for that is open source development is where it's at. We typically
see that the majority of software components making up a commercial application are open
source in nature. And if you look at how
development teams have evolved, say, over the last five or 10 years, this kind of makes sense.
We have a preponderance of libraries and options and frameworks and runtimes that enable development
teams to create their unique functionality and feature set offering without necessarily being
stuck with, hey,
if I don't have the expertise in-house, I'm really going to struggle. That expertise can
be anywhere in the world. And that's one of the key values that open source development brings to
modern applications. So you have sort of a tried and true component of functionality that has built
a good reputation for itself over the years. And a development team can basically take that off the shelf
and plug that functionality into whatever they're developing.
True. And within the report, we saw that of the code that we analyzed,
96% of it contained at least one open source component.
And that, on average, about 60% of the code was open source in nature.
And that was independent of industry.
Take us through what are some of the most prevalent places where open source is being used?
It really has no meaningful locus.
So, for example, that could be IoT development.
That could be new mobile applications.
That could be cybersecurity.
That could be heavy industry.
They're all using some level of open source componentry in order to build their systems. And if we look at how
an application stack is created, it kind of makes sense. Maybe you're deploying on top of Linux,
or you have containerized applications, you're bringing Docker into the mix. You might have
common runtimes like Java or .NET, all of which are open source. So you're
bringing in open source technologies as part of your overall solution delivery. And that's kind
of a good thing. But what are the most common components that you're seeing in use? The most
common component last year was jQuery. So we were seeing a fair number of applications that had a
web-based front end to
them. And as you would kind of expect, there's an awful lot of JavaScript in a modern web-based
application. So jQuery kind of topped the list. Any other notable components that you see there
come up a lot? It really does vary. So we see things like Font Awesome coming up. That was
actually our third most common component that came up.
But it's clear across the board.
So if someone's gone down the Node path, we're going to see an awful lot of Node.
If they've gone down the Angular path, we're going to see an awful lot of Angular.
And that's true also on the server backend, where Java and .NET, Golang, all of the capabilities
that you would expect out of those languages are represented in an open source form in
these applications.
I see. So let's dig into some of the security issues here. Again, give us an overview. What
are the vulnerabilities that we're talking about? We saw quite a spectrum in terms of the
vulnerabilities and the patch state. And I think the patch state is really a key thing to focus
in on. One of the vulnerabilities that we saw last year, and this is right now top of the leaderboard, we've never seen it quite this striking,
is a vulnerability that was in FreeBSD. And so this particular application was using a very
old version of FreeBSD that had a vulnerability that was disclosed in May of 1990. Or the way
we put it, it is probably older than some of the developers working on modern code.
Right, right.
And so we looked into how this could be.
And one of the things we came away with was that this was an application that just fundamentally met its requirements.
And no one saw any reason to deviate from this until they brought in a company like us to go and assess the software and say, well, what exactly
are the quote-unquote smoking guns that might be present here, and what do we need to do to move
forward? We saw this, and it was, it's working, why do we need to change it?
Yeah, that's really fascinating, because, I mean, I can see sort of an if-it's-not-broken, don't-fix-it sort of
thing if everything's working the way it's designed. And also, I would imagine, you're not getting complaints from the users about functionality problems as well.
Correct. And this actually ends up manifesting itself in a different aspect of patching when
it comes to open source components. And that is, there's no one vendor. There's no, quote,
vendor known as open source where you can just go and get all your patches from. Your patch has to match wherever you obtained your code from.
So the easy example is if I have a patch for OpenSSL, I could have a patch that comes from upstream.
I could have a patch that comes from, say, canonical.
I could have a patch that comes from Red Hat.
If I apply the wrong patch, I could change the behavior of OpenSSL
in ways that I don't expect. And that could be a really, really bad thing. So I have to know not
only that I have to patch something, but where to get the correct patch from.
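To make that patch-provenance point concrete, here is a minimal sketch of recording where each component was obtained from, so that when a patch lands you consult the matching source rather than applying an upstream fix to a vendor-modified build. The component entries and advisory URLs are illustrative assumptions, not anything taken from the report.

```java
import java.util.List;
import java.util.Map;

// Sketch: track the provenance of each third-party component so a patch is
// always taken from the same place the code originally came from.
// Component names, versions, and advisory URLs below are illustrative only.
public class ComponentProvenance {

    // A component as recorded in the inventory: name, version, and origin.
    record Component(String name, String version, String origin) {}

    // Hypothetical mapping from origin to the advisory/patch feed to consult.
    static final Map<String, String> PATCH_SOURCES = Map.of(
            "upstream", "https://www.openssl.org/news/vulnerabilities.html",
            "ubuntu",   "https://ubuntu.com/security/notices",
            "redhat",   "https://access.redhat.com/security/security-updates/");

    public static void main(String[] args) {
        List<Component> inventory = List.of(
                new Component("openssl", "1.1.1", "redhat"),
                new Component("zlib", "1.2.11", "upstream"));

        for (Component c : inventory) {
            String feed = PATCH_SOURCES.getOrDefault(c.origin(), "UNKNOWN - investigate");
            System.out.printf("%s %s from %s -> check patches at: %s%n",
                    c.name(), c.version(), c.origin(), feed);
        }
    }
}
```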
If I have a piece of software that's working fine, and there have been multiple versions over
the years, it's unlikely that I'm going to go back and check the parts that haven't changed,
the parts where there's been no functional change since the last version.
Is it likely, then, that I'm not going to go back and check whether that open source component has had any updates or patches?
That's actually a very common scenario. What we see development teams doing, and when you step back, this makes perfect sense that they would do this,
is here's a component that meets my requirements.
I don't want to run the risk of say the component not
being available from wherever I downloaded it from.
So I'm going to bring it in-house and I'm going to
cache it in some form of binary repository.
This is awesome because it ensures that I'm going to have a very consistent build environment.
That application is going to come out exactly the same way every time.
Over time, there might be security disclosures of one form or another against that component in its specific version.
If I don't have some process to go and keep it up to date, I'm now going to get progressively out of date.
And when the time comes to actually update it, it might be six months, it might be a year, it might be two years later, the delta in functionality can pose a significant tax on the organization when they go and apply that new patch and suddenly there's some behavioral
change or configuration change or so forth. And that's why one of the big things that we saw as alarming is that 85% of the code bases contained a component that was more than four years out
of date from whatever the current version is, or had absolutely no development within
the last two years.
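As a rough illustration of the staleness criterion Mackey cites (more than four years out of date, or no development in the last two years), here is a minimal sketch. The inventory shape, component names, and dates are invented for the example; real tooling would pull release metadata from the relevant package registries.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.List;

// Sketch of the "am I getting stale?" check described above: flag any cached
// component whose pinned release is more than four years old, or whose
// project has published nothing in the last two years. Data is illustrative.
public class StalenessCheck {

    record Dependency(String name, String pinnedVersion,
                      LocalDate pinnedReleaseDate, LocalDate latestUpstreamRelease) {}

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        List<Dependency> deps = List.of(
                new Dependency("example-lib", "2.3.0",
                        LocalDate.of(2014, 5, 1), LocalDate.of(2019, 1, 15)),
                new Dependency("abandoned-lib", "0.9.1",
                        LocalDate.of(2016, 8, 1), LocalDate.of(2016, 8, 1)));

        for (Dependency d : deps) {
            boolean pinnedTooOld = Period.between(d.pinnedReleaseDate(), today).getYears() >= 4;
            boolean projectDormant = Period.between(d.latestUpstreamRelease(), today).getYears() >= 2;
            if (pinnedTooOld || projectDormant) {
                System.out.printf("%s %s looks stale (pinned %s, last upstream release %s)%n",
                        d.name(), d.pinnedVersion(), d.pinnedReleaseDate(), d.latestUpstreamRelease());
            }
        }
    }
}
```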
And so that's the level of awareness that teams really need to have: am I getting stale?
Am I getting out of date?
What's the quote unquote operational risk that's going to be associated with updating
to the new version when I finally get around to it? And I suppose, I mean, there's an assumption that if we're bringing this in-house,
then are we relying on our own team to keep it up to date in a way? Does that make sense?
It does. And so it effectively becomes a question of if you're bringing it in-house,
what is the procedure and process that you're going to run through in order to keep things, quote, secure and current?
And that might mean that, hey, I'm going to maintain an independent fork because that
makes sense for my organization and there's a conscious decision behind it and there's
humans with competencies in order to do that maintenance.
Or I'm going to build a process that has, for example, an engagement with that community
to be aware of when they release new updates, when they release new versions, how they communicate their patch
and security information. That needs to just be part of the overall, how do I responsibly consume
open source software? Let's go through some of the vulnerabilities that you saw. I mean,
there were some that popped up over and over again. What were some of the ones that you kept
seeing there? The one that really popped up over and over was associated with a Jackson data bind. And there
were three vulnerabilities that were part of this puzzle. CVE-2018-7489, CVE-2017-7525,
and CVE-2017-15095. They all fundamentally had the same root scenario. And for the people who
don't know what Databind is all about, it really provides a serialization and deserialization capability to bind data into Java objects so that people working with Java can just use that data as if it was a member variable off of that object.
And so what was at the root of this is that some class types could have a polymorphic or a dynamic binding model associated
with them. And so the first attempt to fix this was, oh, we've stumbled across one of these classes
and it's something we really shouldn't be touching. Let's just go put a simple if statement in there
that says we're going to ensure we don't touch this. The second attempt was, oh, there's a couple
more. So let's go put a case statement in there that says if you're in this set. And the third
attempt was, well, you know what? We really need to have a different approach
because if there's yet more, we're going to end up in some serious trouble. So you effectively had
three separate attempts to patch this. And what we saw was that not everyone had actually moved
on to what the final set of patches were. And there were a number of people who were on the intermediate steps and it may have worked for them, but they now have a latent risk if the developer at some
point in time in the future goes and says, aha, I want to go and do this. Now you expose yourself
to that unrefactored code, if you will. Yeah. I mean, that's an interesting point as well,
because I can imagine an inquiry in-house as someone saying, hey, have we patched this bit of code?
And someone can do a quick check and say, yes, we have patched it,
but it might not be the most recent patch.
Correct.
And if we put our developer hats on and forget about the whole open source angle,
what we're effectively saying is in our own development teams,
we've probably had a situation where we've attempted to fix a bug
and it didn't necessarily work out correctly the first try. So we went and we came up with a
different avenue of attack. That's exactly what happened here, except it's an open source,
freely downloadable version. There's no vendor control where they can go and push that update
out. So the onus very much is on the consumer of this component to go and ensure they're up to date.
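On the consumer side, the "different approach" Mackey describes can look something like the following sketch. This is illustrative, not the library's own patch for the CVEs mentioned above: it assumes jackson-databind 2.10 or later, where an application that genuinely needs polymorphic handling can declare an explicit allow-list of types via a PolymorphicTypeValidator instead of relying on the library's internal blocklist. The package name is a placeholder.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.databind.jsontype.PolymorphicTypeValidator;

// Sketch of a consumer-side defense for Jackson's polymorphic deserialization
// issue, assuming jackson-databind 2.10+. Rather than depending on a blocklist
// of dangerous "gadget" classes, the application declares exactly which
// packages may ever be instantiated during deserialization.
public class SafeObjectMapperFactory {

    public static ObjectMapper create() {
        // Allow only our own model types as polymorphic subtypes; everything
        // else is rejected before a constructor can run. The package prefix
        // below is a placeholder for the application's own model package.
        PolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType("com.example.myapp.model.")
                .build();

        ObjectMapper mapper = new ObjectMapper();
        // Only needed if the application actually requires default typing;
        // if it does not, leaving default typing off is safer still.
        mapper.activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL);
        return mapper;
    }
}
```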
One of the things that you all looked into here were license risks and how different
components have different licenses attached to them.
Take us through what you discovered here.
One of the key things that we look for is anything associated with license conflicts
or challenges to the intellectual property.
It's part and parcel of what we're
trying to do from a tech due diligence perspective. The canonical example is, let's assume that
someone has some GPL3 code, but they're trying to release their project under an Apache license.
That's going to create some challenges for them. And so looking at the license is definitely high
on the list of tech due diligence, and equally as important as looking at what the security state of the situation is.
So what we found was that 68% of the code bases had some form of a license conflict.
61% contained some form of a GPL conflict.
And those are relatively straightforward things to work through. The more alarming scenario was that 32% contained some form of
custom license that would need a legal review in order to interpret it. That someone had taken a
standard license and modified it in some way, added some clause into it, wrote their own version
of a license and said, gee whiz, this is open source as long as you go and do this set of
things. But it's not a standard example that might be endorsed under SPDX or under the OSI model. Even worse, 38% of the components we
saw had no identifiable license associated with them, which leaves open the questions of who owns that code and
what rights and obligations are granted. So as we go through, the core thing
that we want to call out on the license
side of things is make certain that you can actually identify where you got the code and
what the rights are so that you can fulfill any obligations that are associated with that
license.
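A minimal sketch of the kind of license gate this implies on the consuming side: every component must carry an identifiable license, and that license must be on the organization's approved list, with anything custom or unidentified routed to legal review. The approved list, component names, and license strings are invented for illustration and are not the report's methodology.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch of a simple license gate over a component inventory. Data illustrative.
public class LicenseGate {

    static final Set<String> APPROVED = Set.of("MIT", "Apache-2.0", "BSD-3-Clause");

    public static void main(String[] args) {
        // component name -> declared SPDX identifier (null = none found)
        Map<String, String> components = new HashMap<>();
        components.put("left-pad-ish", "MIT");
        components.put("some-framework", "GPL-3.0-only");
        components.put("mystery-widget", null);
        components.put("tweaked-lib", "Custom (vendor-modified Apache)");

        components.forEach((name, license) -> {
            if (license == null) {
                System.out.println(name + ": NO IDENTIFIABLE LICENSE - ownership and obligations unclear, escalate");
            } else if (!APPROVED.contains(license)) {
                System.out.println(name + ": " + license + " - not on approved list, send to legal review");
            } else {
                System.out.println(name + ": " + license + " - ok");
            }
        });
    }
}
```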
I would imagine then that could create a real remediation headache if you find
something like that in your code and now the folks at legal have to go digging around to figure
out what our situation here is. Correct. And what a lot of companies try to do in order to avoid
this situation is they will say you can use anything that, say, has an MIT license or an
Apache license, or we want everything to be GPL. If it's not GPL, we don't want it. They'll pick
one of the very standard, understandable, recognizable, tested licenses and say developers can run with those. But there's always going to be some exception
someplace where there's a component that fits exactly what the requirements are, but has a
license that's a little bit off. And so one of the big things that I personally advocate for
is that development teams make friends with their lawyers, take them out to lunch,
hang out with them a little bit. It's like, it seems so goofy, but at some point in time,
that legal team is going to need to be there for you. They should at least know that you're on
quote, the good guy side of the camp, and you're not trying to bend any rules.
You just are legitimately trying to do the right thing for the company. And when you have
that relationship, it's a whole lot easier to go and have a conversation and say, look, I did this.
How do we get ourselves unstuck? Yeah. Building up that relationship ahead of time rather than
when everyone's in a bit of a scrambling mode, I suppose. Exactly. It's like when you're in crisis,
the default is how do we get ourselves out of here as quickly as possible? And if there's no relationship there, it's not going to make matters any better.
But if there is a relationship, at least you know how the person's thinking about certain things in advance.
So let's walk through some of the recommendations here.
What tips do you have for folks who are out there making use of these open source bits of code?
So the first thing I definitely want to call out is that at the beginning,
we said open source is kind of where the world's at.
We saw a 16% increase in the number of open source components
in the code bases we were looking at.
And despite all of the concerns on the license side of things,
we found that the 20 most popular open source licenses covered 98% of the code in place.
So it really is a case of people are doing the right
kinds of things. However, if I want to move to action items, I need to recognize the first rule
is you can't patch what you don't know you have. And so no matter what kind of tooling and process
that you put in place, you have to understand that you have that component in place and where
you got it from. So you have to have some form of inventory discovery tooling in place to solve that,
because eventually there's going to be a patch.
Eventually there's going to be something that needs to be updated.
And if you don't have that process in place, you're kind of stuck.
And as a result, looking for a vendor known as open source isn't going to necessarily help things.
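As a toy illustration of why the inventory has to exist before patching is even possible, here is a sketch that matches a component inventory against newly published advisories. The advisory identifiers and data are invented for the example; real tooling would pull from the NVD or a commercial feed.

```java
import java.util.List;

// Sketch: once an inventory exists, new advisories can be matched against it
// automatically. Without the inventory there is nothing to match.
// The component and advisory data below are invented for illustration.
public class AdvisoryMatcher {

    record InventoryEntry(String component, String version) {}
    record Advisory(String id, String component, String affectedVersion) {}

    public static void main(String[] args) {
        List<InventoryEntry> inventory = List.of(
                new InventoryEntry("example-parser", "1.4.2"),
                new InventoryEntry("example-webfw", "3.0.1"));

        List<Advisory> todaysAdvisories = List.of(
                new Advisory("EXAMPLE-2019-0001", "example-parser", "1.4.2"),
                new Advisory("EXAMPLE-2019-0002", "other-lib", "2.0.0"));

        for (Advisory a : todaysAdvisories) {
            boolean affected = inventory.stream().anyMatch(e ->
                    e.component().equals(a.component())
                            && e.version().equals(a.affectedVersion()));
            if (affected) {
                System.out.println("Affected by " + a.id() + ": patch "
                        + a.component() + " " + a.affectedVersion());
            }
        }
    }
}
```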
How about having an audit done? Is that something that folks should have routinely?
I would say having an audit done periodically for a major event is a really good thing. Having an
audit done, say, when you're going to release the first version of your product or a major
refactor of your product would be a very good thing. But by the same token, there is tooling
available that you can bake into your SDLC so that you can ensure that, say, a non-compliant
license isn't introduced at the outset, and that you have continuous monitoring in place for
new vulnerability disclosures against whatever your application looks like. So maybe you're about
to ship tomorrow, and all of a sudden there's something that's really hairy and audacious
that comes down the pipe today. Maybe you're able to fix that in time, but you have to push the update
out till Monday. Okay, that's fine. Knowing that in advance and having that level of awareness is also key.
And I think, you know, one of the points that's made throughout this research here
is that it's not the open source software itself that's necessarily risky. It's how you implement
it. It's how you go about using it. Exactly. And one of the key things that I advocate for everyone
to do is try and identify what your most critical components are. That might be a framework like Node.js or Angular.
That might be a database like, say, a Mongo or an Elastic.
That might be a delivery paradigm, let's say Kubernetes or Docker.
Find out what your top 10, 15 most critical components are in your environment.
And then engage with those communities.
Find out how they work, where they work.
Do they have meetups?
How are they discussing their future direction and be an active participant, not just on the consumption
side, but on the steering of the future? Because some of these components, they only have a handful
of developers and they have a backlog of activity. So if you have development energy that could go
and solve a problem, they're probably really willing to accept it. And if you're in that community, you're also feeling more engaged with the future direction of everything.
The only other thing that I'd probably call out is that when adopting open source, you probably
want to make certain that you have a robust strategy for its consumption. And that would
cover all of the things that we've been talking about. But it would also make certain that the development teams and the legal teams and various software architects and so forth are actively engaged in that process.
So that it's a standard within the culture of the development team in an organization, as opposed to something that's, say, lawyers saying we must do this or executives saying we must do this. That way, everyone has a stake in the success.
Our thanks to Tim Mackey from Synopsys for joining us.
We were discussing the 2019 Open Source Security and Risk Analysis Report.
We'll have a link in the show notes.
Cyber threats are evolving every second, and staying ahead is more than just a challenge.
It's a necessity. That's why we're thrilled to partner with ThreatLocker,
a cybersecurity solution trusted by businesses worldwide. ThreatLocker is a full suite of solutions designed
to give you total control, stopping unauthorized applications, securing sensitive data, and
ensuring your organization runs smoothly and securely. Visit ThreatLocker.com today to see
how a default deny approach can keep your company safe and compliant.
The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams and technologies.
Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond, Tim Nodar,
Joe Carrigan, Carole Theriault, Ben Yelin,
Nick Veliky, Gina Johnson,
Bennett Moe, Chris Russell, John Petrik,
Jennifer Eiben, Rick Howard, Peter
Kilpe, and I'm Dave Bittner.
Thanks for listening.