CyberWire Daily - Establishing software root of trust unconditionally. [Research Saturday]
Episode Date: April 13, 2019

Researchers at Carnegie Mellon University's CyLab Security and Privacy Institute claim to have made an important breakthrough in establishing root of trust (RoT) to detect malware in computing devices. Virgil Gligor is one of the authors of the research, and he joins us to share their findings.

Link to original research: https://www.ndss-symposium.org/ndss-paper/establishing-software-root-of-trust-unconditionally/
Transcript
You're listening to the CyberWire Network, powered by N2K.

Like many of you, I was concerned about my data being sold by data brokers, so I decided to try Delete.me. I have to say, Delete.me is a game changer. Within days of signing up, they started removing my personal information from hundreds of data brokers. I finally have peace of mind knowing my data privacy is protected. Delete.me's team does all the work for you, with detailed reports so you know exactly what's been done. Take control of your data and keep your private life private. Go to JoinDeleteMe.com slash N2K and use promo code N2K at checkout.
The only way to get 20% off is to go to JoinDeleteMe.com slash N2K and enter code N2K at checkout.
That's JoinDeleteMe.com slash N2K, code N2K.
Hello, everyone, and welcome to the CyberWire's Research Saturday.
I'm Dave Bittner, and this is our weekly conversation with researchers and analysts tracking down threats and vulnerabilities and solving some of the hard problems of
protecting ourselves in a rapidly evolving cyberspace.
Thanks for joining us.
And now, a message from our sponsor, Zscaler, the leader in cloud security.
Enterprises have spent billions of dollars on firewalls and VPNs,
yet breaches continue to rise: an 18% year-over-year increase in ransomware attacks, and a record $75 million ransomware payout in 2024.
These traditional security tools expand your attack surface
with public-facing IPs that are exploited by bad actors
more easily than ever with AI tools. It's time to rethink your
security. Zscaler Zero Trust plus AI stops attackers by hiding your attack surface, making apps and IPs
invisible, eliminating lateral movement, connecting users only to specific apps, not the entire
network, continuously verifying every request based on identity and context, and simplifying security management. Protect your organization with Zscaler Zero Trust and AI. Learn more at zscaler.com slash security.
So for the past 15 years or so, and most importantly since about 2008 onwards, we noticed that malware, that's malicious software,
got placed into the firmware of device controllers.
That's Virgil Gligor.
He's a professor at Carnegie Mellon University
and a member of their CyLab Security and Privacy Institute.
The research we're discussing today is titled Establishing Software Root of Trust Unconditionally.

That includes network interface cards inside your
laptop or your desktop, includes DMA devices, that's direct memory access devices. It includes
disk controllers. It includes systems management boards. And this malware, which started being
discovered around 2008, got to be fairly difficult to spot. In fact, that kind of malware could not
be detected by any antivirus or anti-malware program that runs on your machine. And the reason for that is very
simple. The antivirus or anti-malware program that runs on your machine has to communicate with these peripheral device controllers themselves. So in effect, anti-malware programs have to communicate with the malware. And the malware, as was pointed out around 2015 by the head of the Global Research and Analysis Team at Kaspersky, Costin Raiu, can always reply positively, namely, the firmware update was done completely, no problem, and the like.
So it's extremely difficult to detect the presence of malware
on these devices. And in fact, in 2015, this fellow at Kaspersky suggested that we needed
a reliable test to detect such malware in the firmware of the peripheral devices. And such a test
did not exist. So what Kaspersky pointed out was quite clear to us at
Carnegie Mellon for quite some time. In fact, there was research done here by some of my former
colleagues around 2010, 2011, which identified this problem. So essentially, we were keenly aware of the problem. In fact, the US government was also keenly aware of it. So the problem became worse and worse over time, as opposed to better and better. And the reason for that is very simple: malware in the firmware of these peripheral devices became more pervasive. That is, it became a problem of the supply chain, among other attack vectors. So malware can come to you, an end user, on your device shrink-wrapped. So now in the supply chain, there are multiple points where this malware could be placed.
And people identified over time these vectors of placement of malware on these devices in the supply chain.
The supply chain is just one example to show that, in fact, it's very easy for experts, not for mere mortals like us.
But it's very easy for experts to actually introduce this malware once they have
control of a supply chain. So what we're talking about here is establishment of this thing called
root of trust. Take us through, what does that mean? So that means that the person who wants to
carry out the malware detection test and malware replacement test has to attach an external device.
And this external device, which we call a verifier, initializes the firmware of the
peripheral device controllers and the memory of your computer, the volatile memory, the primary
memory, not the disk. So essentially, if the verifier can verifiably initialize the firmware of these devices,
then clearly malware disappears.
The problem is that it's very difficult to figure out that, in fact, the initialization
of the firmware was done correctly.
So root of trust essentially has two phases.
One, initialize all the flashable firmware of your peripheral device controllers and initialize your memory, and then test that your initialization does not contain malware. And unfortunately, if this test cannot be done correctly with very high confidence, then you don't know whether the malware disappeared.
So essentially, the test that we produced is the test that unconditionally tells the verifier
that in fact everything was initialized correctly and there is
no malware on the computer, on the system state. This is, by the way, before your computer boots.
Ah, I see. And that's really the trick here, this notion that it's unconditional.
Correct. And it happens before you boot the operating system and before you install the operating system.
Now, unconditionality here means that the test requires no secrets. It requires no special
trusted hardware modules, like Trusted Platform Modules or Software Guard Extensions from Intel, and other hardware security modules. So no secrets, no trusted hardware modules, and no bounds on the adversary's computing power. So this notion of unconditionality here is
extremely strong. It hasn't been encountered in security before this.
All right, well, let's dig in here. Describe to us, I guess, as much as you can put it in
layman's terms, how are you achieving this?
Essentially, what I'm doing here is initializing the device controllers and the primary memory with a particular computation. And this computation, unlike many others that were
studied in the past, is optimal in space and time, meaning it cannot take less than a particular number of words
in a particular memory or less than a number of time units, and it will take no more than
that particular number of words or time units. So optimality means that your lower bounds equal your upper bounds. So if you can find such a test where optimality is concrete, meaning it's specified in terms of real quantities, number of words, units of time, processor cycles, that is, then you can count on the fact that once the computation executed, there is nothing else in that memory or in the processor registers that could have executed faster. So essentially, it's this notion of concrete optimality that enabled us to do this test. This notion of concrete optimality did not exist in theory, in computational complexity. In fact, all the notions of optimality people had were asymptotic, which could not be used.
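For a sense of the kind of computation involved: the published paper builds on Horner-rule evaluation of randomized polynomials, which touches every memory word exactly once with a single multiply-add and a constant number of registers. The sketch below, in Python for readability, only illustrates that one-pass shape; it is not the proved-optimal construction, and the modulus and challenge values are arbitrary placeholders.

```python
# Illustrative sketch only: Horner-rule evaluation of a polynomial whose
# coefficients are the memory words under test. The real construction uses
# randomized polynomials with concrete space-time optimality proofs in a
# word-RAM model; none of that machinery appears here.

def horner_checksum(memory_words, x, modulus):
    """One pass over memory, one multiply-add per word, constant registers."""
    acc = 0
    for w in memory_words:
        acc = (acc * x + w) % modulus  # Horner step: acc <- acc*x + word
    return acc

# Toy example: checksum a 4-word "memory" under challenge x = 3.
print(horner_checksum([7, 1, 4, 2], x=3, modulus=2**61 - 1))  # prints 212
```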
Walk us through what happens when you boot up a system or prepare to boot up a system
that would be using your method. Okay, so when you boot up a system,
obviously there is a certain amount of code which runs in the system, which you cannot trust.
So you really don't know that your bootloader is trusted
because it may not be.
So essentially you boot your computer,
you have your bootloader.
Then the bootloader responds to verifier commands. It has to respond because otherwise the verifier detects right away that there is a problem. So it responds to the verifier commands. The verifier asks the bootloader to initialize the memory of your system, the
primary memory of your system, and to initialize the memories or to reflash the memories of the device controllers.
So in fact, you notice that the bootloader no longer talks to the disk itself.
It only talks to the disk controller, for example.
So this initialization, which is performed, is performed with these computations that
I just described, which are space-time optimal.
And once the
bootloader completes the initialization, it responds to the verifier and says, look, I'm done.
What do you want me to do next? Then the verifier challenges these computations,
which were already initialized, to run in the particular number of words which were initialized, and in the amount of time which is the limit, the lower bound, for the computation. And if the results of the computation come back correct, and in the specified times, the verifier can conclude that there is no malware in the system, that in fact,
the system state, the memories contain only the values which were initialized.
At that point, once the verifier concludes that, the verifier can actually start a boot
process under which software, trustworthy software, is loaded on the machine and the boot of the operating system of the disk can complete.
So roughly, these are the steps that one goes through in this test.
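To make those steps concrete, here is a minimal verifier-side sketch. Everything in it is a hypothetical stand-in for illustration: the device channel and its send/receive methods, the expected_checksum lookup, and the timing slack are not the paper's interface, and the real measurements are at processor-cycle granularity rather than wall-clock Python timers.

```python
import os
import time

def verify_device(device, expected_checksum, optimal_time_s, slack_s=1e-4):
    # Draw a fresh, unpredictable challenge for every run.
    nonce = int.from_bytes(os.urandom(8), "big")
    device.send(("challenge", nonce))

    t0 = time.perf_counter()
    result = device.receive()          # blocks until the bootloader replies
    elapsed = time.perf_counter() - t0

    # Accept only if BOTH the value and the timing check out: a correct
    # value that arrives late suggests extra computation took place,
    # i.e., possible malware simulating the honest test.
    return result == expected_checksum(nonce) and elapsed <= optimal_time_s + slack_s
```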
And please remember that this test is done before the system runs.
In other words, it cannot be done in the middle of a computation, for example, on your system.
Consequently, it's not done all that
frequently. Now, it is entirely possible that between two such tests, malware is placed on your
machine surreptitiously. Well, the second time you do the test, you are able to unconditionally
get rid of that malware. So when the system prepares to reboot, let's say, that's when the test will
happen and the malware that has been installed in the meantime will be detected. Correct. That's
exactly what happens. Now, the reason why this is called root of trust establishment is because essentially the root of trust in the system really comprises the contents, the chosen contents, of the system state.
And the system state is basically the content of the memories that we are talking about,
and processor registers.
So help me understand, how do you establish your baseline?
How do you establish that when you're doing your initial testing, that the system is clean
to begin with?
Well, so the first question is, how do we establish that the results that came back
from the test were correct? How can the verifier tell that the results are correct?
So that's the first question. And that turns out not to be a major problem in the following sense.
The verifier has a specification of the machine. So whoever built
the verifier and constructed the test has a specification of the system type under test.
And therefore, the verifier can obtain the correct results, which are run separately,
either on a simulator of the machine or on a machine, a copy of the machine that does not contain malware. So either way would work.
So essentially, the verifier has the right results in hand, both in terms of
the computation result and the timing. So that's essentially what's necessary for the test to succeed.
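Continuing the earlier sketch, one hypothetical way to build that expected-answer function is to replay each challenge on the malware-free reference Gligor describes; simulate_machine below is an assumed stand-in for that simulator or clean copy, not an interface from the paper.

```python
def make_expected(simulate_machine, init_image):
    """Build an expected-checksum function from a malware-free reference run."""
    def expected_checksum(nonce):
        # The clean reference run yields the correct result; the same run
        # also yields the optimal time bound used in the timing check.
        checksum, _cycles = simulate_machine(init_image, nonce)
        return checksum
    return expected_checksum
```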
So what happens, I can imagine during the normal life cycle of a
system that changes are made, hardware can be added or taken away or updated, firmware could
be updated and so on. How do you then re-establish that those changes made along the way were for
good and not evil? Yes. So what this test refers to is this extremely difficult problem of detecting malware, including unknown malware. As you point out, malware in these peripheral
device controllers and in firmware can be inserted at all points during the system lifetime because
of updates. And these updates could come from companies like, for example, ASUS in Taiwan,
as you may have noticed two days ago, when they updated their systems. And those updates might very well contain updates to the firmware, which is clearly the case with supply chain updates.
So essentially, the scenario that you posited is absolutely credible and practical.
Essentially, what happens is after such updates,
you have to bring your system down and perform this external test. Now, this is obviously not
trivial at this point, but it's a necessary step to detect that your firmware updates were
done completely and correctly, that in fact, no malware, no unaccounted-for content was placed in your firmware.
And by the way, when I say unaccounted-for content, what I mean is that often the firmware in the device controller is not fully utilized. There are sections of the firmware that may contain code which is not updated, code that, for example, reformats partitions of the disk, let's say. And this is a
huge problem. You have to actually reflash and retest the entire firmware and not leave out
any hidden aspect or any hidden part of the firmware.
And that, of course, brings us to the notion, do you, the tester, do you, the verifier,
have the complete and correct specifications of your peripheral device controllers?
And again, without complete and correct specification, the test could not be done.
The test, by the way, that has to be done upon all supply chain updates or all
updates carried out by the operating system, depends on two things fundamentally.
One is correct device specifications. Secondly, randomness in nature. In other words, we have to
be able to collect random numbers, true random numbers from nature, not pseudorandom numbers, but true random numbers.
Pseudorandom numbers, again, assume that your adversary's computing power is bounded.
We don't assume that.
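As a small aside on that requirement, here is a sketch of how a verifier might draw challenge bits from a hardware entropy source rather than a PRNG. /dev/hwrng is the usual Linux interface to a hardware random number generator; its availability on a given verifier platform is an assumption made here for illustration.

```python
def true_random_nonce(n_bytes: int = 8) -> int:
    # Read directly from the hardware RNG device, bypassing any PRNG.
    with open("/dev/hwrng", "rb") as rng:
        return int.from_bytes(rng.read(n_bytes), "big")
```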
So the verifier has to have correct device specifications for the test, true random numbers,
and of course, it has to have the correct results in hand before the test is started.
With that, the test can be carried
out, at least in principle, unconditionally. I suppose one of the elements here is that you
have to trust your supplier that the specifications they're giving you are accurate. Correct. Could
your system be useful in detecting if those specifications don't align with what is actually being delivered?
My first gut reaction is to say no. My system, my test is not oriented towards that. It does
depend on correct specifications. And let me say at the outset that that is generally
a huge problem. And it's a huge problem not for my test.
It's a huge problem for computing in general, in practice.
It's a huge problem for reliability.
No reliability problem, if we forget about security, can be solved without complete and
correct specification.
No security problem can be solved without complete and correct specification.
And no cryptography problem can be solved without complete and correct specification, just to mention a few areas. In other words, you have to know the specification of your devices.
I see.
By the way, that's not a condition. It's a fact of life.
Right. I see. The work that you're doing here, my understanding is that this is still in the theoretical stage, is that correct?
Correct. This so far has been in the theoretical stage. But, by the way, we've had quite a bit of experience with this type of test. For example, in 2015, we published a paper identifying the fiction that was prevalent in previous tests, which people had tried,
mostly here at Carnegie Mellon, where we thought about this problem for a long time.
So we took a retrospective look at root-of-trust establishment,
and we realized that there was a lot of fiction. So we have a lot of experience with tests in
practice, but not of the kind that I came up with recently and
published recently. I see. So what's next? How does this go from the theoretical and be turned
loose in the real world? The next step would be to implement it on real devices. And there is a
variety of devices this could be used on, by the way. Microcontrollers that control medical devices.
Microcontrollers for embedded real-time systems, such as, for example, weapon systems, if one cares
about defense. So all sorts of microcontrollers, which are relatively simple devices in the sense
that their specifications are extremely well known. Laptops are complicated devices.
The specifications of some of their devices are known only to the device producers and to
government agencies that take the time to discover those specifications. And I don't mean government
organizations only in the United States. I mean, government organizations around the world. So the next step, the first small step, is to handle the problem for device controllers that have
updatable firmware and make sure that you can show that there is no possible malware,
which we can with our test, and then move on to more complex devices, devices with multiple
processors, devices with all sorts
of features, which we can handle, at least theoretically, like caches, pipelining, virtual
memory, SIMD operations. Those are the steps that we anticipate. Nevertheless, the first hurdle was passed, namely showing that this is indeed possible, because it wasn't clear that this test really was possible. Everything else failed before.
Well, congratulations to you and your collaborators.
It seems as though you may be on to something important here.
Thank you very much.
At least from an intellectual point of view, this was an achievement.
But how fast and how soon we can actually materialize this in practice remains to be seen.
Our thanks to Virgil Gligor from Carnegie Mellon's CyLab Security and Privacy Institute for joining us.
The research is titled Establishing Software Root of Trust Unconditionally.
We'll have a link in the show notes.

And now, a message from our sponsor, Black Cloak. A common way for threat actors to bypass your company's defenses is by targeting your executives and their families at home. Black
Cloak's award-winning digital executive protection platform secures their personal devices, home
networks, and connected lives. Because when executives are compromised at home, your company
is at risk. In fact, over one-third of new members discover they've already been breached. Protect your executives and their families 24-7, 365 with Black Cloak.
Learn more at blackcloak.io.
The CyberWire Research Saturday is proudly produced in Maryland out of the startup studios of DataTribe, where they're co-building the next generation of cybersecurity teams
and technologies.
Our amazing CyberWire team is Elliott Peltzman, Puru Prakash, Stefan Vaziri, Kelsey Bond, Tim Nodar, Joe Carrigan, Carole Theriault, Ben Yelin, Nick Veliky, Gina Johnson, Bennett Moe, Chris Russell, John Petrik, Jennifer Eiben, Rick Howard, Peter Kilpe, and I'm Dave Bittner.
Thanks for listening.