CyberWire Daily - Establishing software root of trust unconditionally. [Research Saturday]
Episode Date: April 13, 2019

Researchers at Carnegie Mellon University's CyLab Security and Privacy Institute claim to have made an important breakthrough in establishing root of trust (RoT) to detect malware in computing devices. Virgil Gligor is one of the authors of the research, and he joins us to share their findings.

Link to original research: https://www.ndss-symposium.org/ndss-paper/establishing-software-root-of-trust-unconditionally/
Transcript
You're listening to the CyberWire Network, powered by N2K.

Like many of you, I was concerned about my data being sold by data brokers, so I decided to try Delete.me. I have to say, Delete.me is a game changer. Within days of signing up, they started removing my personal information from hundreds of data brokers. I finally have peace of mind knowing my data privacy is protected. Delete.me's team does all the work for you, with detailed reports so you know exactly what's been done. Take control of your data and keep your private life private. Go to JoinDeleteMe.com slash N2K and use promo code N2K at checkout.
The only way to get 20% off is to go to JoinDeleteMe.com slash N2K and enter code N2K at checkout.
That's JoinDeleteMe.com slash N2K, code N2K.
Hello, everyone, and welcome to the CyberWire's Research Saturday.
I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
protecting ourselves in a rapidly evolving cyberspace.
Thanks for joining us.
And now, a message from our sponsor, Zscaler, the leader in cloud security.
Enterprises have spent billions of dollars on firewalls and VPNs,
yet breaches continue to rise: an 18% year-over-year increase in ransomware attacks, and a record $75 million ransomware payout in 2024.
These traditional security tools expand your attack surface
with public-facing IPs that are exploited by bad actors
more easily than ever with AI tools. It's time to rethink your
security. Zscaler Zero Trust plus AI stops attackers by hiding your attack surface, making apps and IPs
invisible, eliminating lateral movement, connecting users only to specific apps, not the entire
network, continuously verifying every request based on identity and context, and simplifying security management. Protect your organization with Zscaler Zero Trust and AI. Learn more at zscaler.com slash security.
So for the past 15 years or so, and most importantly since about 2008 onwards, we noticed that malware, that's malicious software,
got placed into the firmware of device controllers.
That's Virgil Gligor.
He's a professor at Carnegie Mellon University
and a member of their CyLab Security and Privacy Institute.
The research we're discussing today is titled Establishing Software Root of Trust Unconditionally.

That includes network interface cards inside your
laptop or your desktop, includes DMA devices, that's direct memory access devices. It includes
disk controllers. It includes systems management boards. And this malware, which started being
discovered around 2008, got to be fairly difficult to spot. In fact, that kind of malware could not
be detected by any antivirus or anti-malware program that runs on your machine. And the reason for that is very
simple. The antivirus or anti-malware program that runs on your machine has to communicate with these peripheral device controllers themselves. So in effect, anti-malware programs have to communicate with the malware. And the malware, as was pointed out around 2015 by the head of the Global Research and Analysis Team at Kaspersky, Costin Raiu, can always reply positively, namely, the firmware update was done completely, no problem, and the like.
So it's extremely difficult to detect the presence of malware
on these devices. And in fact, in 2015, this fellow at Kaspersky suggested that we needed
a reliable test to detect such malware in the firmware of the peripheral devices. And such a test
did not exist. So what Kaspersky pointed out was quite clear to us at
Carnegie Mellon for quite some time. In fact, there was research done here by some of my former
colleagues around 2010, 2011, which identified this problem. So essentially, we were keenly aware of the problem. In fact, the US government was also keenly aware of it. So the problem became worse and worse over time, as opposed to better and better. And the reason for that is very simple: malware in the firmware of these peripheral devices became more pervasive. That is, it became a problem of the supply chain, among other attack vectors. So malware can come to you, an end user, on your device shrink-wrapped. So now in the supply chain, there are multiple points where this malware could be placed.
And people identified over time these vectors of placement of malware on these devices in the supply chain.
The supply chain is just one example to show that, in fact, it's very easy for experts, not for mere mortals like us.
But it's very easy for experts to actually introduce this malware once they have
control of a supply chain. So what we're talking about here is establishment of this thing called
root of trust. Take us through, what does that mean? So that means that the person who wants to
carry out the malware detection test and malware replacement test has to attach an external device.
And this external device, which we call a verifier, initializes the firmware of the
peripheral device controllers and the memory of your computer, the volatile memory, the primary
memory, not the disk. So essentially, if the verifier can verifiably initialize the firmware of these devices,
then clearly malware disappears.
The problem is that it's very difficult to figure out that, in fact, the initialization
of the firmware was done correctly.
So root of trust essentially has two phases.
One, initialize all the flashable firmware of your peripheral device controllers and initialize your memory, and then test that your initialization does not contain malware. And unfortunately, if this test cannot be done correctly with very high confidence, then you don't know whether the malware disappeared.
So essentially, the test that we produced is the test that unconditionally tells the verifier
that in fact everything was initialized correctly and there is
no malware on the computer, on the system state. This is, by the way, before your computer boots.
Ah, I see. And that's really the trick here, this notion that it's unconditional.
Correct. And it happens before you boot the operating system and before you install the operating system.
Now, unconditionality here means that the test requires no secrets. It requires no special
trusted hardware modules, like Trusted Platform Modules or Software Guard Extensions from Intel, and other hardware security modules. So no secrets, no trusted hardware modules, and no bounds on the adversary's computing power. So this notion of unconditionality here is
extremely strong. It hasn't been encountered in security before this.
All right, well, let's dig in here. Describe to us, I guess, as much as you can put it in
layman's terms, how are you achieving this?
Essentially, what I'm doing here is initializing the device controllers and the primary memory with a particular computation. And this computation, unlike many others that were
studied in the past, is optimal in space and time, meaning it cannot take less than a particular number of words
in a particular memory or less than a number of time units, and it will take no more than
that particular number of words or time units. So optimality means that your lower bounds equal your upper bounds. So if you can find such a test where optimality is concrete, meaning it's specified in terms of real quantities, number of words, units of time, processor cycles, that is, then you can count on the fact that once the computation executed, there is nothing else in that memory or in the processor registers that could have executed faster. So essentially, it's this notion of concrete optimality that enabled us to do this test. This notion of concrete optimality did not exist in theory, in computational complexity. In fact, all the notions of optimality people had were asymptotic, which could not be used.
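For a sense of the kind of computation involved: the published paper builds on Horner-rule evaluation of randomized polynomials, which touches every memory word exactly once with a single multiply-add and a constant number of registers. The sketch below, in Python for readability, only illustrates that one-pass shape; it is not the proved-optimal construction, and the modulus and challenge values are arbitrary placeholders.

```python
# Illustrative sketch only: Horner-rule evaluation of a polynomial whose
# coefficients are the memory words under test. The real construction uses
# randomized polynomials with concrete space-time optimality proofs in a
# word-RAM model; none of that machinery appears here.

def horner_checksum(memory_words, x, modulus):
    """One pass over memory, one multiply-add per word, constant registers."""
    acc = 0
    for w in memory_words:
        acc = (acc * x + w) % modulus  # Horner step: acc <- acc*x + word
    return acc

# Toy example: checksum a 4-word "memory" under challenge x = 3.
print(horner_checksum([7, 1, 4, 2], x=3, modulus=2**61 - 1))  # prints 212
```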
Walk us through what happens when you boot up a system or prepare to boot up a system
that would be using your method. Okay, so when you boot up a system,
obviously there is a certain amount of code which runs in the system, which you cannot trust.
So you really don't know that your bootloader is trusted
because it may not be.
So essentially you boot your computer,
you have your bootloader.
Then the bootloader responds to verifier commands. It has to respond because otherwise the verifier detects right away that there is a problem. So it responds to the verifier commands. The verifier asks the bootloader to initialize the memory of your system, the
primary memory of your system, and to initialize the memories or to reflash the memories of the device controllers.
So in fact, you notice that the bootloader no longer talks to the disk itself.
It only talks to the disk controller, for example.
So this initialization, which is performed, is performed with these computations that
I just described, which are space-time optimal.
And once the
bootloader completes the initialization, it responds to the verifier and says, look, I'm done.
What do you want me to do next? Then the verifier challenges these computations,
which were already initialized, to run in the particular number of words which were initialized, and in the amount of time which is the limit, the lower bound, for the computation. And if the results of the computation come back correct, and in the specified times, the verifier can conclude that there is no malware in the system, that in fact,
the system state, the memories contain only the values which were initialized.
At that point, once the verifier concludes that, the verifier can actually start a boot
process under which software, trustworthy software, is loaded on the machine and the boot of the operating system of the disk can complete.
So roughly, these are the steps that one goes through in this test.
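To make those steps concrete, here is a minimal verifier-side sketch. Everything in it is a hypothetical stand-in for illustration: the device channel and its send/receive methods, the expected_checksum lookup, and the timing slack are not the paper's interface, and the real measurements are at processor-cycle granularity rather than wall-clock Python timers.

```python
import os
import time

def verify_device(device, expected_checksum, optimal_time_s, slack_s=1e-4):
    # Draw a fresh, unpredictable challenge for every run.
    nonce = int.from_bytes(os.urandom(8), "big")
    device.send(("challenge", nonce))

    t0 = time.perf_counter()
    result = device.receive()          # blocks until the bootloader replies
    elapsed = time.perf_counter() - t0

    # Accept only if BOTH the value and the timing check out: a correct
    # value that arrives late suggests extra computation took place,
    # i.e., possible malware simulating the honest test.
    return result == expected_checksum(nonce) and elapsed <= optimal_time_s + slack_s
```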
And please remember that this test is done before the system runs.
In other words, it cannot be done in the middle of a computation, for example, on your system.
Consequently, it's not done all that
frequently. Now, it is entirely possible that between two such tests, malware is placed on your
machine surreptitiously. Well, the second time you do the test, you are able to unconditionally
get rid of that malware. So when the system prepares to reboot, let's say, that's when the test will
happen and the malware that has been installed in the meantime will be detected. Correct. That's
exactly what happens. Now, the reason why this is called root of trust establishment is because essentially the root of trust in the system really comprises the contents, the chosen contents, of the system state.
And the system state is basically the content of the memories that we are talking about,
and processor registers.
So help me understand, how do you establish your baseline?
How do you establish that when you're doing your initial testing, that the system is clean
to begin with?
Well, so the first question is, how do we establish that the results that came back
from the test were correct? How can the verifier tell that the results are correct?
So that's the first question. And that turns out not to be a major problem in the following sense.
The verifier has a specification of the machine. So whoever built
the verifier and constructed the test has a specification of the system type under test.
And therefore, the verifier can obtain the correct results, which are run separately,
either on a simulator of the machine or on a machine, a copy of the machine that does not contain malware. So either way would work.
So essentially, the verifier has the right results in hand, both in terms of
the computation result and the timing. So that's essentially what's necessary for the test to succeed.
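Continuing the earlier sketch, one hypothetical way to build that expected-answer function is to replay each challenge on the malware-free reference Gligor describes; simulate_machine below is an assumed stand-in for that simulator or clean copy, not an interface from the paper.

```python
def make_expected(simulate_machine, init_image):
    """Build an expected-checksum function from a malware-free reference run."""
    def expected_checksum(nonce):
        # The clean reference run yields the correct result; the same run
        # also yields the optimal time bound used in the timing check.
        checksum, _cycles = simulate_machine(init_image, nonce)
        return checksum
    return expected_checksum
```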
So what happens, I can imagine during the normal life cycle of a
system that changes are made, hardware can be added or taken away or updated, firmware could
be updated and so on. How do you then re-establish that those changes made along the way were for
good and not evil? Yes. So what this test refers to is this extremely difficult problem of detecting malware, including unknown malware. As you point out, malware in these peripheral
device controllers and in firmware can be inserted at all points during the system lifetime because
of updates. And these updates could come from companies like, for example, ASUS in Taiwan,
as you may have noticed two days ago, when they updated their systems. And those updates might very well contain updates to the firmware, which is clearly the case with supply chain updates.
So essentially, the scenario that you posited is absolutely credible and practical.
Essentially, what happens is after such updates,
you have to bring your system down and perform this external test. Now, this is obviously not
trivial at this point, but it's a necessary step to detect that your firmware updates were
done completely and correctly, that in fact, no malware, no unaccounted-for content was placed in your firmware.
And by the way, when I say unaccounted-for content, what I mean is that often the firmware in the device controller is not fully utilized. There are sections of the firmware that may contain code which is not updated, code that, for example, reformats partitions of the disk, let's say. And this is a
huge problem. You have to actually reflash and retest the entire firmware and not leave out
any hidden aspect or any hidden part of the firmware.
And that, of course, brings us to the notion, do you, the tester, do you, the verifier,
have the complete and correct specifications of your peripheral device controllers?
And again, without complete and correct specification, the test could not be done.
The test, by the way, that has to be done upon all supply chain updates or all
updates carried out by the operating system, depends on two things fundamentally.
One is correct device specifications. Secondly, randomness in nature. In other words, we have to
be able to collect random numbers, true random numbers from nature, not pseudorandom numbers, but true random numbers.
Pseudorandom numbers, again, assume that your adversary's computing power is bounded.
We don't assume that.
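As a small aside on that requirement, here is a sketch of how a verifier might draw challenge bits from a hardware entropy source rather than a PRNG. /dev/hwrng is the usual Linux interface to a hardware random number generator; its availability on a given verifier platform is an assumption made here for illustration.

```python
def true_random_nonce(n_bytes: int = 8) -> int:
    # Read directly from the hardware RNG device, bypassing any PRNG.
    with open("/dev/hwrng", "rb") as rng:
        return int.from_bytes(rng.read(n_bytes), "big")
```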
So the verifier has to have correct device specifications for the test, true random numbers,
and of course, it has to have the correct results in hand before the test is started.
With that, the test can be carried
out, at least in principle, unconditionally. I suppose one of the elements here is that you
have to trust your supplier that the specifications they're giving you are accurate. Correct. Could
your system be useful in detecting if those specifications don't align with what is actually being delivered?
My first gut reaction is to say no. My system, my test is not oriented towards that. It does
depend on correct specifications. And let me say at the outset that that is generally
a huge problem. And it's a huge problem not for my test.
It's a huge problem for computing in general, in practice.
It's a huge problem for reliability.
No reliability problem, if we forget about security, can be solved without complete and
correct specification.
No security problem can be solved without complete and correct specification.
And no cryptography problem can be solved without complete and correct specification, just to mention a few areas. In other words, you have to know the specification of your devices.
I see.
By the way, that's not a condition. It's a fact of life.
Right. I see. The work that you're doing here, my understanding is that this is still in the theoretical stage, is that correct?
Correct. This so far has been in the theoretical stage. But, by the way, we've had quite a bit of experience with this type of test. For example, in 2015, we published a paper identifying the fiction that was prevalent in previous tests, which people had tried,
mostly here at Carnegie Mellon, where we thought about this problem for a long time.
So we took a retrospective look at root-of-trust establishment,
and we realized that there was a lot of fiction. So we have a lot of experience with tests in
practice, but not of the kind that I came up with recently and
published recently. I see. So what's next? How does this go from the theoretical and be turned
loose in the real world? The next step would be to implement it on real devices. And there is a
variety of devices this could be used on, by the way. Microcontrollers that control medical devices.
Microcontrollers for embedded real-time systems, such as, for example, weapon systems, if one cares
about defense. So all sorts of microcontrollers, which are relatively simple devices in the sense
that their specifications are extremely well known. Laptops are complicated devices.
The specifications of some of their devices are known only to the device producers and to
government agencies that take the time to discover those specifications. And I don't mean government
organizations only in the United States. I mean, government organizations around the world. So the next step, the first small step, is to handle the problem for device controllers that have
updatable firmware and make sure that you can show that there is no possible malware,
which we can with our test, and then move on to more complex devices, devices with multiple
processors, devices with all sorts
of features, which we can handle, at least theoretically, like caches, pipelining, virtual
memory, SIMD operations. Those are the steps that we anticipate. Nevertheless, the first hurdle was passed, namely showing that this is indeed possible, because it wasn't clear that this test really was possible. Everything else failed before.
Well, congratulations to you and your collaborators.
It seems as though you may be on to something important here.
Thank you very much.
At least from an intellectual point of view, this was an achievement.
But how fast and how soon we can actually materialize this in practice remains to be seen.
Our thanks to Virgil Gligor from Carnegie Mellon's CyLab Security and Privacy Institute for joining us.
The research is titled Establishing Software Root of Trust Unconditionally.
We'll have a link in the show notes.

And now, a message from our sponsor, Black Cloak. A common way for threat actors to bypass your company's defenses is by targeting your executives and their families at home. Black
Cloak's award-winning digital executive protection platform secures their personal devices, home
networks, and connected lives. Because when executives are compromised at home, your company
is at risk. In fact, over one-third of new members discover they've already been breached. Protect your executives and their families 24-7, 365 with Black Cloak.
Learn more at blackcloak.io.
The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams
and technologies.
Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin, Nick Veliky, Gina Johnson, Bennett Moe, Chris Russell, John Petrik, Jennifer Eiben, Rick Howard, Peter Kilpe, and I'm Dave Bittner.
Thanks for listening.