Embedded - 414: Puff, the Magically Secure Dragon
Episode Date: May 26, 2022

Laura Abbott of Oxide Computer spoke with us about a silicon bug in the ROM of the NXP LPC55, affecting TrustZone. More information about the two issues is in the Oxide blog: Another vulnerability in the LPC55S69 ROM and Exploiting Undocumented Hardware Blocks in the LPC55S69. More about the LPC55S6x and their LPC55Sxx Secure Boot. Ghidra is a software reverse engineering framework, and it is one of the NSA's GitHub repositories. Laura will also be speaking about this at Hardwear.io in early June 2022 in Santa Clara. Twitter handles: @hardwear_io, @oxidecomputer, @openlabbott. The vulnerability was filed with NIST: NVD - CVE-2021-31532
Transcript
Welcome to Embedded.
I am Elecia White alongside Christopher White.
When people tell me about chip bugs in non-alpha silicon,
I usually nod quietly and wonder what bug they have in their code that makes them think they found something so rare.
Not today.
Today we're going to talk about a bug in the silicon that can be
used to hack a system. And I'm happy to talk to Laura Abbott today. Hi, Laura. Welcome.
Hi, thanks for having me.
Could you tell us about yourself as if we met at the Hardwear.io conference next week?
Sure. I'm a firmware engineer at Oxide Computer. I've been there since January 2020. For those who haven't heard of Oxide, Oxide is rethinking the server from the ground up. Servers haven't really changed in a number of years, and Oxide is hoping to build something better. When people think of a server, they may think, you know, oh, a Dell computer next to my desk that, you know, some people get some files from.
But these are big things that go in racks and for data centers and stuff like that, right?
That's right.
But that's actually a good comparison because the servers that you can buy from Dell that go in racks
actually look a lot like your Dell PC.
And it turns out that's a pretty difficult thing to
work with, especially for new hardware. And big companies like Google and Facebook are designing
their own hardware these days that's much nicer to be able to use. But if you're not one of these
big companies, of course, you don't have the chance to be able to buy this because you can't
afford to be able to design your own hardware. So that's sort of what Oxide is going for,
being able to build really nice hardware to be able to deliver
a great experience. We want to do a lightning round where we ask you short questions,
and if we're behaving ourselves, we won't ask how and why and all of that.
Okay. Do you like to complete one project or start a dozen?
I honestly have a tendency to do both,
depending on where I am
and what type of thing I'm looking at.
I definitely have a whole bunch of
sort of electronics projects that are half completed
that I need to actually finish sometime.
But I think other things I tend to do one at a time.
Hubris or humility?
Oh, I'm going to have to go with hubris.
Favorite Cortex-M?
Ooh, that's a tricky
one.
I'm going to have to go with
the Cortex-M33
because I do have a soft spot
for that trust zone.
But no floating point?
Less of a floating point.
Urgency or rigor?
Did you go through and put in an Oxide application?
I'm going to have to go with rigor, I think.
That question is from Oxide's interview questions.
What do you hear most often?
Oxide has a number of questions people ask for on the application.
And I think one of the questions Oxide likes to ask is, you know, talk about two values in tension, how you resolve them.
And people do talk a lot about urgency versus rigor because that's a fairly common thing for people to talk about in engineering in terms of trying to figure out, okay, how much work do I need to do to make this correct versus can we get it done a little bit faster?
Favorite fictional robot?
I love WALL-E.
That was a great movie.
PyOCD or OpenOCD?
Ooh, I'm definitely going to have to go with PyOCD.
OpenOCD has a soft spot in my heart,
but, you know, it's kind of a pain.
And my colleague, Cliff,
if you happen to listen to this,
you know, I respect your choice
to use OpenOCD.
What's PyOCD?
I haven't heard of this.
I haven't either.
PyOCD is a Python library
to be able to do debugging support.
It has support for the CMSIS-DAP standard
to be able to connect to a lot of microcontrollers.
And that's what we end up using at Oxide,
or at least that's one of the tool chains we end up using.
We should check that out.
Yeah, sounds cool.
Do you have a tip everyone should know?
Always read the documentation
and don't be afraid to try things.
Okay, so before we get into the silicon issue you found
that you're going to be presenting at Hardwear.io soon,
tell me about the chip in general.
It was the NXP LPC55?
That's right.
The NXP LPC55 is a Cortex-M33 that we evaluated.
It has a number of nice features that we found when Oxide was evaluating chips for our root of trust.
It has things like a strong identity, to be able to give a cryptographically secure, unique identity.
It has some hardware accelerators, but that's less important to us.
It has secure boot.
And we chose these features in particular because they let us build up
what we're doing for the root of trust.
So the root of trust is that I have to be able to trust everything in the whole chain
in order to trust that what I'm doing
at the end is useful. And this is often used in firmware update. Is that what you're talking about?
That's part of it. So the idea behind the root of trust is, when you say that term,
it can mean a lot of different things to many people. When I say root of trust, we're talking about answering the question of what software is running on the system.
So the idea is that we're going to say, okay, we're going to trust what's running on the root of trust,
and then that can be used to build up other parts of the system.
So the idea is that we know exactly what's running on there,
and we'll be able to compare that against an expected set of hashes.
And then, say, when it gets time to be able to do something like system updates, you'll be able to
not just install the update, but also get another set of expected calculations about what should be
running on the system.
So this is like, if I jailbreak my iPhone, I've lost the root of
trust for Apple, and so their apps won't work on it.
That's a good example there.
And the LPC55 is an M33, which you said has TrustZone. What is that?
So TrustZone is another way to provide isolation for code. For a lot of chips, you'll often have the privileged versus unprivileged mode. TrustZone provides
another axis of
secure versus unsecure. So you could have
secure-privileged,
secure-nonprivileged,
nonsecure-nonprivileged, nonsecure-privileged,
and other things like that.
Every permutation. Yes.
So,
I remember using, on a Cortex-M4,
there was a memory protection unit. Not an MMU, but an MPU that would allow you to mark various regions as privileged versus non-privileged. Is this part of that?
It's a similar concept. So the MPU is still definitely there, and that provides isolation to be able to choose what things are accessible. TrustZone uses a controller called the SAU, the Security Attribution Unit, to specify which
regions are secure or non-secure. So in our system, for example, we end up having both of those
pieces configured. So when you're running in a non-secure world, for example, the regions that are secure are specified.
But you also have the MPU to specify what regions of memory you're allowed to touch.
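The attribution Laura describes can be sketched as a lookup over address ranges. This is a toy model only; the region addresses below are invented, and a real LPC55 configures this through the actual SAU registers (plus NXP's companion controllers), not Python:

```python
# Toy model of TrustZone-M security attribution (SAU-style).
# The regions and addresses below are invented for illustration;
# real hardware enforces this in the SAU, not in software.

SECURE = "secure"
NONSECURE = "non-secure"

# (start, end, attribute): by default everything is secure;
# SAU regions carve out non-secure windows.
sau_regions = [
    (0x0001_0000, 0x0009_FFFF, NONSECURE),  # hypothetical non-secure flash
    (0x2000_4000, 0x2002_FFFF, NONSECURE),  # hypothetical non-secure RAM
]

def attribute(addr: int) -> str:
    """Return the security attribute for an address, defaulting to secure."""
    for start, end, attr in sau_regions:
        if start <= addr <= end:
            return attr
    return SECURE

# Code in the non-secure world may only touch non-secure regions;
# on real hardware, touching a secure address from non-secure code faults.
print(attribute(0x0002_0000))  # falls in the non-secure flash window
print(attribute(0x0000_0100))  # boot ROM area stays secure by default
```

The MPU she mentions is a second, independent check layered on top of this one: the SAU decides secure versus non-secure, while the MPU decides privileged versus unprivileged access within each world.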
Okay.
And if I didn't have the trust zone, what would be different?
How would I treat my—is it only for IoT systems? If you didn't have TrustZone, you could still build
a secure system, but it's helpful to think about TrustZone as just another layer of protection
that provides another way to isolate things. So it's even more, so the goal is that if you're
running in code in a non-secure world, you should only be able to get into TrustZone through very
specific paths. So if for some reason you didn't have TrustZone,
then you would be left with coming up with another way
to be able to fully protect that.
And if you have a properly working system,
that always should be fine.
But the idea with, for example, something like trust zone
is that if you did end up having a bug in your code
that, say, might expose secrets,
then the other layer of security protection
will make it even harder to get those secrets.
Okay.
And this is part of the ARM chip,
not part of the NXP chip.
That's correct.
TrustZone itself is part of a specification
defined by ARM.
And one of the things is that
when you start looking at microcontrollers
and looking at everything,
it turns out that implementing
various things like TrustZone can be optional. So depending on what version of the specification
a chip vendor chose to implement, there may or may not be TrustZone.
But the LPC55 has one.
Correct.
Do you use it? I guess, you know, it's okay. I understand security. You need to have something on there that is like the base of security. And you mentioned an ID. And then I can, in manufacturing, use that ID to assign a particular cryptographic key and put that in it. And then I use that for communications, including firmware update.
What am I missing about the specialness of TrustZone?
I think a good example there is that if for some reason you wanted to really make sure you didn't want to be able to read out the secret for being able to do
your firmware update. So for example, if you had a key
that you really wanted to keep private,
you could put that in TrustZone so that when you are,
say a common design may be to have your bootloader be secure
and then jump into non-secure,
there would be no way to read that out
once you're in non-secure mode.
Another example that we were looking at
with our chips is being able to use TrustZone to provide even
more hardware isolation. So we would only be able to access certain hardware blocks that we were
really not sure about and put them only in TrustZone. So that way the non-secure world
couldn't potentially access them. So the non-secure world wouldn't be able to access the power off button. That's a good example, yeah. Or the rewrite flash code.
Yes, and that's part of what we were looking at when we were evaluating is that, you know,
what exactly can we do to make sure that our product is secure?
And being able to rewrite flash is definitely one of those dangerous scenarios. And there are fuses you can change in manufacturing so that people can't read
out your code over a JTAG link, or a CMSIS-DAP link, this isn't really JTAG. And there are fuses you
can blow so that you can't ever change what's on the board. But those are blunt things that cover
large pieces of the system. TrustZone lets you change some parts but not others, is that right?
Yes. So I think your example of fuses is a good one to sort of compare and contrast because
it's also worth noting that TrustZone is ultimately just another sort of hardware
configuration. And especially for things like fuses, that tends to be a permanent one-time
thing. So TrustZone, you may choose to, once you have your finalized settings, you may choose to
blow your fuses to be able to make sure you can't actually make further changes to your Flash, for example. Okay.
Do you understand TrustZone now, Christopher?
I do understand it better, yes.
Okay.
Laura, you're giving a talk called
Unwanted Features: Finding and Exploiting
a ROM Buffer Overflow on the LPC55S69.
That's correct.
So what's that about?
So this is a bug I stumbled on somewhat accidentally. So I mentioned that I'm a
firmware engineer at Oxide Computer. My job is not actually vulnerability hunting. But during the course of
trying to work with a feature of the chip related to software updates, I stumbled across a buffer
overflow that could be used to break some security boundaries in the chip and really violate some
pretty fundamental assumptions. So stumbled across, is this you had a horrible bug and then noticed it did something?
Uh, no.
The Fifth Amendment doesn't apply to this podcast.
So, honestly, I ended up finding out this bug because I was a little bit lazy, and I didn't want to write a parser for the update format that NXP was going for, or at least I
started to work on it and realized, huh, this format is kind of complicated. There's a lot of
fields in this header. So the update format that NXP uses is called SB2, and it starts out with
an unencrypted header before actually getting to the keys and then commands to actually do things like erase
the flash. And this is transmitted sequentially. So I started thinking and going,
there's a lot of fields in this header. How well does the ROM actually validate all parts of this
header? So I had a ROM dump laying around, and I started looking a little bit closer, and I happened to find one of these fields that wasn't being validated correctly and gave me a buffer overflow.
So I'm clear. What ROM is on the Cortex?
This is a ROM that's specifically designed by NXP.
Cortex itself doesn't actually mandate any of
this. This is a design choice that NXP actually made. And when we were initially first choosing
a chip, this was actually, several of my colleagues pointed out that this might be a disadvantage
just because they have had bad experiences with ROMs. And so far, you know, that wisdom has turned out to be very correct.
Interesting.
But this is flash that is on the chip.
It's flash or is it?
It's probably flash. It may be masked ROM, but it's probably flash.
Okay.
And it's flash that we can't get to and we can't modify because it probably has had its fuse blown when it left manufacturing.
Okay.
And this is, like, I gave that memory map talk and I had that area that was like the unused address spaces in the ocean, where we don't know what all the registers are.
They don't tell us what all the registers are in the manual.
They tell us the registers they want us to use.
Yeah. And so this ROM code is kind of like that.
It's like they tell us how to use it, but they don't tell us what it is.
And what made you – you said you had coworkers who were distrustful of the ROM code.
What made you look around for it? And how did you dump it?
My coworkers who were distrustful,
I think one of my colleagues, Cliff, in particular,
just pointed out that I think especially
for what we're trying to build with the root of trust,
part of what we're doing,
because we really want to know
what exactly is running on it.
And I think, especially as you gave a great example,
there's no guarantee of the manufacturer telling us everything that's in the ROM. And for building a root of trust, this is kind of
terrifying because it's hard to know exactly, is there something in there that may break our
assumptions about being able to do our chain of measurements. So I think we
initially were a little bit worried about this. And it turns out in this case that NXP did not actually add any sort of read protection off the ROM.
So dumping the ROM was a very simple matter of literally just reading it out with a debugger and saving it.
Okay, that was mistake number one.
I don't know. I kind of disagree there.
I actually think that the ROM should be available.
But I mean, really, if I have a complaint,
it's that they should just be giving us the source code to the ROM.
Yes, either it should be transparent, or you should make it totally opaque if that's the
path you're going. Yeah, but totally opaque. I was going to ask this later, but now I'm going
to ask it right now. So I worked with an authentication chip many years ago, and this
stuff was still pretty much in its infancy, and it had a lot of problems. But the question I kept
getting from other technical people in the company when we chose this chip or any chip that did
authentication is, what's preventing someone? And back then, the question was, what's preventing
somebody with a million dollars of equipment from, you know, acid etching down the chip and reading out the secret key from the flash or wherever it's programmed?
And I said pretty much nothing except a million dollars at that point.
Now it's much less than a million dollars.
Is that sort of thing?
So when Alicia talks about, well, they should block off the ROM,
does that actually prevent anything if somebody wants to go look at it visually?
Or are flashes harder to do that with these days?
I think it's still possible to be able to do that to some extent with the ROM.
On the LPC-55, I think we did a little bit of investigation with that.
But I think to your question about being able to read out the secret key,
one of the features that was appealing to us about the LPC55 was the PUF, the physically unclonable function,
such that the secret is only tied to the actual chip itself,
which is a great way to be able to get that strong, unique ID.
And you can use that to do further encoding,
so that even if you did happen to, say, get a copy of some of the flash,
if it's been encoded by the PUF, you can't actually decode it.
Interesting. Okay.
Puff the magically secure dragon?
Yes. I like that.
Okay. So you found a buffer overflow and buffer overflows are the sort of thing that lead to security issues because if you have, say, a buffer overflow in a function, you may be able to overwrite the stack and then you can change the
code to run what you want instead of what it was supposed to run, which is bad. I mean, that's
why we get denial of service attacks and all kinds of things that lead us to say, always check your inputs for malicious actions.
So what does the ROM overflow do?
You gave a great example of what people usually think of when they think of a buffer overflow,
which is that doing a classic stack overflow.
In this case, because of how the ROM was actually parsing the code, it was an overflow
in the global space. So the idea is that you would just continue writing. It's something that's
maybe somewhat close to a heap overflow, I'd call it, but that's incorrect just because of where this was;
it wasn't actually overflowing in the heap. It was overflowing the equivalent of, say,
the BSS section that's normally all zeros
and then you later set up.
So I found an overflow there
and then it turned out right next to
this global variable I was able to
overflow was the
heap address
or heap allocator and I was able to
use that to be able to turn
that into a way to get code execution.
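In miniature, a global-data overflow that lands on adjacent allocator state looks something like this. The offsets and values are made up; this is a conceptual toy, not the actual ROM layout:

```python
# Toy model of a global-data (BSS-style) overflow.
# "Memory" is a flat buffer: a 16-byte parse buffer sits directly
# in front of a 4-byte heap-pointer word, as adjacent globals do in .bss.

PARSE_BUF_OFF = 0    # start of the fixed-size parse buffer
PARSE_BUF_LEN = 16
HEAP_PTR_OFF = 16    # allocator's next-free pointer lives right after it

memory = bytearray(20)
memory[HEAP_PTR_OFF:HEAP_PTR_OFF + 4] = (0x2000_8000).to_bytes(4, "little")

def buggy_copy(payload: bytes):
    """Copies attacker data using the attacker's length, with no bounds check."""
    memory[PARSE_BUF_OFF:PARSE_BUF_OFF + len(payload)] = payload

def heap_ptr() -> int:
    return int.from_bytes(memory[HEAP_PTR_OFF:HEAP_PTR_OFF + 4], "little")

# 16 bytes fill the buffer; 4 more spill over and rewrite the heap pointer,
# steering the next allocation (and whatever gets written into it) anywhere.
attack = b"A" * PARSE_BUF_LEN + (0xDEAD_BEE0).to_bytes(4, "little")
buggy_copy(attack)
print(hex(heap_ptr()))  # no longer 0x20008000
```

The point of the sketch is that nothing crashes: the allocator simply believes the clobbered pointer, which is what turns a write past a global into code execution.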
And I will be talking more
about all the gory details about how I did that in my talk.
Okay. And just to make it clear,
the talk is in early June, second week of June.
That's correct.
In Santa Clara, California. And it's physical.
You're going to a real, actual, in-person conference?
I am.
I will admit I'm a little bit nervous,
but also excited to be able to potentially do a meetup and see people in person.
And have you been to a hardware IO event before?
I have not.
I heard about the conference from my colleague, Rick,
who suggested after we found this bug
that it might be an interesting place to submit a talk.
They tend to do security talks
more than any other kind of embedded talk,
but it's all very embedded.
Yeah, it's all definitely fairly low level.
A lot of things at the hardware layer.
I'm excited to see some of the other talks that are going
and really get to learn about more aspects
about the hardware security.
And usually their talks are put online afterwards,
and I'm hoping yours will be too.
Do you know?
I don't know right now, but again, I'm hoping the talk will be online. If not,
I will definitely be releasing some more of my slides and details about once everything is over.
Cool. Have you written your talk yet? You can tell me. I know how this goes.
I'm in the process of doing that. I am not one of those people who can write the talk
and slides while they're on the plane. I need to be practiced. I definitely have the slides going,
and I'm beginning to practice to make sure I have everything. Especially for a talk like this,
I really want to make sure I'm explaining everything correctly, because I realized I
did a test of the talk, and there are a few more diagrams I need to add to be able to explain things, like what exactly the PUF does.
Yeah, okay.
And how you found it, and the tools you used, and really, how can we use this?
How can a malicious actor use this?
How can we use this to do things? Actually, how a malicious actor could use this to do bad things is probably the most interesting of questions.
What do you have for that?
That is a very interesting question.
And I think it's sort of, part of it goes back to the system configuration of the chip. And I'd say, you know, when I say I found a buffer overflow in a software update that sounds pretty bad,
and, you know, people might initially say, wow, this thing is completely broken.
But it actually, if the chip is properly configured to prevent modifications to certain configuration areas,
it's not completely broken.
I mentioned the chip has secure boot.
You can't change secure boot keys.
You can't change other various configuration settings.
But what is available for you to be able to do
is to perhaps write to unwritten flash pages
that aren't covered by a secure boot image.
If you have another image that's already been signed,
you could do a rollback attack
to be able to boot an older version
that, say, might have a bug.
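A rollback attack works because an old image is still validly signed; the usual defense pairs the signature check with a monotonic version floor. As a sketch (the counter here stands in for fuses or protected flash on real hardware):

```python
# Sketch of anti-rollback: a signed image is not enough, because an OLD
# signed image is still validly signed. A monotonic counter rejects
# downgrades to versions below the recorded floor.

class AntiRollback:
    def __init__(self, stored_min_version: int = 0):
        # Stands in for a fuse or protected-flash counter on real hardware.
        self.stored_min_version = stored_min_version

    def accept_image(self, image_version: int, signature_ok: bool) -> bool:
        if not signature_ok:
            return False
        if image_version < self.stored_min_version:
            return False  # validly signed, but older: a rollback attempt
        # Ratchet the counter so this version becomes the new floor.
        self.stored_min_version = image_version
        return True

rb = AntiRollback(stored_min_version=5)
print(rb.accept_image(6, signature_ok=True))   # newer image: accepted
print(rb.accept_image(4, signature_ok=True))   # old signed image: rejected
```

The overflow matters here because it can bypass the update path where a check like this would run at all.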
One of the most serious issues we found with this is that typically the way the chip is set up,
it has a feature called DICE that's designed to be able to compute an identity.
And part of the way that DICE works, it relies on keeping a particular PUF-encoded secret restricted.
So the idea is that once it calculates the value for DICE
using the existing image, it will restrict access
and make a change to register to prevent you
from being able to access that same PUF-encoded value.
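Conceptually, a DICE-style flow derives an identity from a device secret plus a measurement of the firmware, then locks the secret away before anything else runs. This is a simplified sketch with invented names, not NXP's actual implementation:

```python
import hashlib
import hmac

# Sketch of a DICE-style flow: a device secret (here, standing in for the
# PUF-encoded value) is combined with a measurement of the firmware image
# to produce an identity, then access to the secret is locked.

class Device:
    def __init__(self, puf_secret: bytes):
        self._secret = puf_secret
        self._locked = False

    def derive_identity(self, firmware_image: bytes) -> bytes:
        if self._locked:
            raise PermissionError("secret access is locked")
        measurement = hashlib.sha256(firmware_image).digest()
        return hmac.new(self._secret, measurement, hashlib.sha256).digest()

    def lock(self):
        """Models the register write that blocks further access to the secret."""
        self._locked = True

dev = Device(puf_secret=b"\x13" * 32)
ident = dev.derive_identity(b"firmware v1")
dev.lock()
# After lockout, even code running on the device cannot rederive identities.
try:
    dev.derive_identity(b"firmware v1")
except PermissionError:
    print("secret locked")
```

The issue Laura describes is that the ROM overflow gives an attacker execution before the equivalent of `lock()` has taken effect, so the secret is still readable.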
It turns out that at the point this buffer overflow happens,
you can read that value out, which means it's possible to write some code on
there to be able to read out this PUF-encoded value, which you should really not be able to do, and
be able to, say, clone an identity.
Yeah, clone the identity. That's where being able to read
the identity comes in. And so you could make another system that pretended to be yours and then use that to probably practice other attacks.
Yeah.
And that's especially bad for what we're trying to do with having the root of trust measure other parts of the system.
That's pretty bad if you could have another part of the system pretend to be the measurement and say, oh, here are some measurements. Yeah, they're definitely coming from me. Wink, wink.
Do I need physical access if I was a malicious actor?
That ends up depending on how exactly the chip itself is set up. So I'd say you do,
but it's also important to think about what physical access actually means.
Just because, say, if you have this chip deployed out there and it's getting updates over the network,
then maybe an attacker might be able to do things that way.
But, I mean, it requires stuff to be physically sent over a hardware interface, perhaps.
But there's also a way, depending on the other software you've written on your system, you can also invoke it that way. So it's not one versus the other. It depends on a whole bunch of
other parts of your system. Because it's part of the in-system programming, which can happen over
UART, SPI, I2C, CAN. And so you need to send things over one of those. Of course,
one of those might be attached to a Wi-Fi chip or something like that.
You can in-system program this over CAN?
Yes.
Isn't that awesome?
I guess it makes sense.
Yeah, no, you'd have to.
Yeah, there's a version of this chip that does in fact have CAN support.
So I think for those who aren't familiar with CAN,
this is oftentimes used in automotive.
So the idea is that, yeah, if they're trying to update something in your car,
you know, you could potentially be able to send things that way.
So you've mentioned that as long as things are set up correctly,
this probably doesn't affect most people. How do I avoid letting people clone my device? How do I
avoid having the PUF be... puffed? Sorry, I can't get past PUF. It's a great name.
Yes, I should clarify that. I think when I said that it doesn't affect certain things, I mean, this isn't as completely bad as it could be.
The PUF value, you can definitely always read out as long as you're trying to use this buggy code.
And I think the real concern is that if for some reason you didn't actually fully seal the CMPA programming area,
it's possible to change a whole bunch of things there by rewriting the flash.
But to prevent using this, I think we did a lot of evaluation when we found this issue at Oxide,
and I think we came to the conclusion that the best way to avoid this issue is just to not use
this ROM update code at all. That's the safest path
until you get a fixed chip. Oh, and so you can write your own flash
programming, and have your own flash-based update like everybody else does.
Yeah, that's correct. And I mean, it's definitely considered a rite of passage,
I think, to write a software update
on microcontrollers at some point.
It's a rite of passage that gets repeated
over and over again sometimes.
Serial drivers, bootloaders, and...
They all work differently, so...
Yeah.
So has NXP had a reaction?
Yes, they have.
So I think we were generally pleased with how NXP responded to this.
We sent them the proof of concept and they definitely accepted and, you know, we were able to get a fix out.
And I think, you know, we had previously had interactions with NXP's product response that were less than satisfactory.
And I think this time they definitely took it more seriously.
And they certainly made the announcement.
And I was actually pleased to find, I think it was last week,
I stumbled upon that the security vulnerability
was actually publicly available on their knowledge base.
And this was something I don't think we had actually seen before.
So I think it's really good to see hardware vendors like NXP
making these things public,
because I think all these things should in fact be public so that if there's a chip vulnerability,
you know, you should know and it should be freely available to everybody.
Certainly, if you're advertising that your chips are secure and that you have all these features,
then being open about when they fall down is probably a prerequisite of being trusted.
Yeah, and I mean, it's not that, you know,
you expect chips to be completely bug-free
and to never have errata,
but I mean, it's a matter of making sure
everybody can, like, actually be aware of it.
And are they changing...
Is it in the errata?
Are they changing their ROM for new versions of this chip?
Or are they coughing up that programming code
so that you could fix it and compile it
and put it as part of your program instead of in their ROM?
I think they do plan to issue fixed chips.
And I was thankful I got an engineering sample
to be able to test it and verify that it was fixed.
So I think the hope is we'll be able to get some fixed chips.
But of course, you know, trying to get your hands on any kind of silicon these days is difficult.
And even in the best of times, getting new silicon takes a while.
You filed this with the NIST vulnerability database. I've only seen CVEs for, I want to say, big bugs. I guess I only see it when I hear about something so catastrophically bad that then I go look at the NIST database.
What made you do that and how did you figure out that nobody else had found it before
you? So NIST and the CVE database is an interesting discussion. So CVE assignment is ultimately,
I'd say, left up to both the reporter who is finding a bug and, say, the receiving end of a bug.
Sometimes companies may do the receiving end themselves.
But ultimately, Oxide decided to report the CVE to NIST to be able to have an easy way
to track the vulnerability.
And that's really what I see this as being about, is being able to say, OK, we need to
have a way to identify this and be able to point to specific ways that it goes there.
And you're right that you oftentimes see CVEs as being highlighted for big issues, but anyone technically can request a CVE if they want for any kind of issue.
I think it's important to always read the details about what's on the NIST database to see what's actually there and what the issue
actually means. And then to your question about how did you know if anyone else had actually found
this issue already? That's an interesting question, one I actually thought about a lot. And, you know,
I honestly, I don't think we had a good answer. And I think there wasn't a good way to know until
we actually tried to publish this and see if anybody had come out. I think if someone had
come out and said they had already found this issue, honestly, I would have been
really excited to see what exactly someone else
was doing to be able to find this.
And you'd hope that
NXP would tell you,
thank you, we've already started fixing this.
And here's your $10,000 reward.
Yeah.
Yeah, I mean,
when we reported to NXP, you know,
they'd immediately come back and offer us, you know, say, oh, yeah, you know, we're getting ready to fix this.
But I mean, you know, sometimes these bugs aren't found, but we are looking forward to being able to get, you know, fixed chips and being able to deliver them.
Finding this bug and writing it up in such a way that NXP can take action on it and writing it up for the NIST vulnerability database, this all took time that you didn't have to spend.
That is correct. I think Oxide definitely supports, you know, my work and finding issues like this. But at the same time, I think, you know, we're kind of tired of doing this work of being, you know, finding these bugs.
And I think, you know, we hope that this is the last one that we found.
Kind of hope that this is the last one you found.
But do you think it is?
I think it is.
I mean, at least for now, I think.
But who knows exactly what will end up happening.
This is good for the community, but Oxide's building servers.
So it is nice of them to give you the opportunity to talk more about it kind of on their time.
I imagine some of the preparations on your time.
Yeah, and I'm grateful to Oxide, but it definitely does take some time to be able to do this. And I think internally, we did have some back and forth about what exactly we should do.
There's a lot of people who work in security will have many opinions about how exactly the disclosure process should work.
And I think in some respects, the way we do disclosures is how we would like to be disclosed to. When, you know, someone inevitably finds an issue in one of Oxide's
products, we'd like to believe, you know, that if someone came to us with a proof of concept,
we'd, you know, take it seriously and be able to give an estimate about when things would be fixed.
But it definitely does take a time, a lot of time to be able to do all that. But ultimately,
I think it's good, you know, and not just for the community, but I think it's also that I think it's
the right thing to do. But I mean, there's lots of debate out there about how you do disclosure.
Beyond the right thing to do, I think for a company with a product like Oxide's, which is a server with certain security requirements, demonstrating competence matters: oh, we at Oxide find these kinds of issues and are really good at it, and that should give you more confidence in our ability to make solid hardware.
I mean, that's not nothing.
Yeah.
Absolutely.
Being able to say we found these vulnerabilities, therefore we're not passing them along to you is definitely confidence building for the server.
When is the server coming out?
We're still definitely working on building it, iterating on hardware and trying to do things. There's still some space between here and when we're actually able to deliver it.
But it's definitely coming,
and people were out doing bring-up about two weeks ago. It was very exciting to see another iteration of hardware come out and make a lot of lights blink and fans turn on.
Your website says late 2022 is when you're going to start shipping racks, so I won't ask you beyond that, because I know very well that if you answer, it's probably bad.
The website needs to be updated as well.
Oh, okay.
Well, we'll just leave that as it is then.
You said you were looking forward to some of the other talks.
Do you have anything in mind?
Yeah, I'm excited to see things related to glitching and side channels.
There's a talk about breaking stock security by glitching data transfers.
I'd love to learn more in the future about how to do physical glitching attacks.
I bought a ChipWhisperer, which is designed to do glitch attacks.
I haven't had a chance to actually sit down and play with it yet, but hopefully I'll be able to try that on the LPC55.
And find something else?
I don't want to find something else, but you know, there are certain things where I'm like, I wonder, if I glitch this here, could I actually break it? And it's sort of, you know, tempting me to find something else.
Yeah. So tempting.
It's a different attitude toward development, because my attitude toward development is, I want to get this code done and never see this again.
And I certainly don't want to find out there's something wrong with the chip.
I think your attitude is much better: to be deeply curious about the things you're using.
And I think it's cool that the company supports that, because a lot of times, as you mentioned, other companies might not be so supportive of that.
What are you doing finding a bug in this thing that may or may not apply to us?
Just get this done.
Yeah, I'm really lucky to have a lot of support
from everyone at Oxide.
And I was also joking before to some coworkers that in some respects, the fact that I had to do some reverse engineering to figure this out actually made it more tempting to try and figure out what was going on.
If NXP had actually just put up all the C code for the ROM, I might have been less tempted to dig into it and just do a whole bunch of reading of the code to find out what was there.
Because it would be transparent. And if it was transparent, someone else probably looked at it. And reading code is not as fun.
Reading code is important, but that doesn't mean it's always fun.
It's more fun to pit yourself against the puzzle of what they've done.
Yes, and reverse engineering is definitely a fun puzzle to try to figure out. Ghidra is a great tool for reverse engineering, but it doesn't always do everything perfectly. It's still up to you to figure out exactly what the code is doing and what it's calling.
Ghidra, that's the one where you put in ARM machine code, and then it makes assembly code, and then it makes C code. Is that right?
Yeah, that's correct.
You give it some code, and it will disassemble it into assembly, and then also attempt to turn it back into something C-like.
How well does that work?
I mean, how does it decide what to use for variable names?
It tends to assign them sequentially.
So you're absolutely right that figuring out what the variables do
is one of the first things you do when you're looking at a disassembly.
What you end up with, essentially, is as if you took a C function and took away all the nice names for everything, and everything is just, you know, variable one, variable two, variable three.
It does a lot of complicated algorithms behind the scenes to generate this, and then you're left trying to pick up patterns.
But it's pretty easy to start guessing what things are; for example, oh, this looks like a loop.
And especially with something like reverse engineering a ROM, I spent a lot of time comparing what the ROM was actually accessing to physical hardware blocks in the memory map. So I was able to say, okay, this function is touching the GPIO block; this function is touching the clock configuration block. That gave me a good idea about what things were doing.
How much code was this
in the ROM?
Ah, there's a pretty big chunk of stuff there.
I mean, it includes everything to do ISP.
There's a USB stack.
Oh my God.
It gives you an idea about how much you have to support in ROM.
That's a lot.
Wait a minute.
Ghidra is from the National Security Agency?
Yes.
Okay, I had no idea.
But they have a Git repo,
and it says NSA,
which, you know, that's interesting.
Too many secrets.
Sorry.
Yeah, but that sometimes makes some people nervous.
But I mean, I found it to be a great tool.
I'm not actually an expert in reverse engineering.
This was really one of the first serious projects I've ever taken up.
But I found Ghidra to be a nice tool that's available.
Does it put the C library all together so that you can identify what the C functions are?
Like string copy?
I think it might.
Again, I'm learning a lot about Ghidra every time I use it. It has some things built in to identify common formats, but it did not have a way to automatically detect things like strcpy and memcpy.
So that was actually one of the things I ended up having to spend some time on: staring at some of these functions and realizing, huh, okay, this is actually just memcpy, written out as a bunch of assembly, because it's a well-optimized memcpy.
Yeah. Optimized code is very hard to understand.
Yes.
Wow. Now I kind of want to play with it.
What do structures look like in Ghidra?
There is a structure editor, so you can define your own structures to say what things are.
The idea is that you can edit it field by field and specify the layout of things.
Yeah, so I think as you go along,
you kind of deduce what things are
and then rename them
and sort of make it more readable from the automatically generated stuff.
Ah, right. Kind of like a crossword puzzle where things have to line up.
The crossword puzzle is actually a great example, because sometimes what I ended up doing was saying, okay, this is definitely a structure; I can tell by what it's accessing.
And it's also got some other nested structures, so I'd end up with, you know, structure one has structure two has structure three.
I could guess how big things were, and then I'd give the fields assorted names and try to guess what they were.
And as I looked at the code, I would go back and say, okay, this looks like it's calling a function that's for initialization; this is like a teardown function, and change the names to better match.
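To give a flavor of what that deduction looks like: you see fixed-offset accesses into an opaque buffer, guess a layout, and check the guess against real dumped bytes. A small sketch in Rust, with field names and offsets invented for illustration (not the actual LPC55 ROM structures):

```rust
// Hypothetical layout deduced from fixed-offset accesses seen in a
// disassembly; the names are guesses, refined as more call sites are read.
#[repr(C)]
#[derive(Debug, PartialEq)]
struct BootConfig {
    magic: u32,       // offset 0x0: compared against a constant early on
    image_len: u32,   // offset 0x4: used as a loop bound, so likely a length
    entry_point: u32, // offset 0x8: loaded into a register and branched to
}

impl BootConfig {
    /// Parse the struct out of a raw little-endian byte dump, the way
    /// you'd check a guessed layout against real data.
    fn from_bytes(b: &[u8]) -> Option<BootConfig> {
        if b.len() < 12 {
            return None;
        }
        let word = |i: usize| u32::from_le_bytes([b[i], b[i + 1], b[i + 2], b[i + 3]]);
        Some(BootConfig {
            magic: word(0),
            image_len: word(4),
            entry_point: word(8),
        })
    }
}

fn main() {
    // 12 bytes of "dumped" data: 0xDEADBEEF, 0x100, 0x2000_0000.
    let dump = [
        0xEF, 0xBE, 0xAD, 0xDE,
        0x00, 0x01, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x20,
    ];
    let cfg = BootConfig::from_bytes(&dump).unwrap();
    assert_eq!(cfg.magic, 0xDEAD_BEEF);
    assert_eq!(cfg.image_len, 0x100);
    assert_eq!(cfg.entry_point, 0x2000_0000);
    println!("parsed: {:x?}", cfg);
}
```

Each time a guess survives contact with another call site, you rename the field; that is the crossword-style "lining up" described above.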
I'm just now thinking of terrible interview questions,
like just hand somebody this
and a pile of machine code and say,
okay, tell me what this does.
You have a couple hours.
I don't know.
I swear I feel like I've seen that
as an interview question before about,
you know, tell me what this code does.
I've seen that, but not from...
Yeah, I've been given that question.
I got very angry.
I've been given that question twice.
It was because the question was, what does this code do?
And the code was plus plus, plus plus, plus plus, plus plus.
There was minus minus.
No, no, there was minuses, minuses.
Minus minus, plus plus, I.
And the answer is, it gets somebody fired.
Well, there was stuff afterward, too. Because if it's just on one side, it's fine. But there was stuff.
The only answer is it should get somebody fired.
And if that's how you write your code, please let me know so I can leave the interview right now.
The other question I got asked, what does this do, was Duff's device, which is a loop-unrolling thing.
Oh, my God, that's so hard to identify.
Yeah.
That's an annoying question.
And yeah, I hate that interview question, just because it involves actually knowing the answer beforehand. It's not the kind of thing you can actually solve in an interview. I mean, it's very cool to learn about, but that's not a great interview question.
I agree completely.
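For readers who haven't met it: Duff's device interleaves a do-while loop with a C switch, jumping into the middle of an unrolled copy body to handle the leftover count via fallthrough. Rust has no switch fallthrough, but the underlying unrolling idea can be sketched like this (a toy illustration, not a real memcpy):

```rust
/// Copy `src` into `dst` eight bytes per loop iteration, with a scalar
/// tail for the remainder -- the same unrolling idea that Duff's device
/// expresses via switch fallthrough in C.
fn unrolled_copy(dst: &mut [u8], src: &[u8]) {
    assert_eq!(dst.len(), src.len());
    let chunks = src.len() / 8;
    for i in 0..chunks {
        let base = i * 8;
        // Unrolled body: eight explicit moves per iteration.
        dst[base] = src[base];
        dst[base + 1] = src[base + 1];
        dst[base + 2] = src[base + 2];
        dst[base + 3] = src[base + 3];
        dst[base + 4] = src[base + 4];
        dst[base + 5] = src[base + 5];
        dst[base + 6] = src[base + 6];
        dst[base + 7] = src[base + 7];
    }
    // Tail: the 0..7 leftover bytes that Duff's device handles by
    // jumping into the middle of its unrolled switch.
    for i in chunks * 8..src.len() {
        dst[i] = src[i];
    }
}

fn main() {
    let src: Vec<u8> = (0..19).collect();
    let mut dst = vec![0u8; 19];
    unrolled_copy(&mut dst, &src);
    assert_eq!(dst, src);
    println!("copied {} bytes", dst.len());
}
```

Recognizing this shape in a disassembly, a fat repeated body plus a small tail, is exactly the "huh, this is just memcpy" moment described earlier.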
You do a lot of interviewing for Oxide, don't you?
You know,
I do quite a bit of
interviewing with Oxide. I've gotten a chance to meet a lot of candidates, and I help with application review, too.
What do you look for?
I don't think there's one necessarily right answer about what Oxide is looking for.
I think it is a combination of some level of experience, but also an interest in what we're building.
I think, you know, when I say interest, some people think, oh, you need to be completely passionate. But sometimes when people say passion, they assume that means it must be all-consuming, the only thing you ever do. I think it's more about, can you demonstrate that you're able to get the job done?
And there are a lot of different ways you can show that you have the relevant skills to do that.
I mean, can you talk about what have you built before?
I always like to ask people about past problems they've solved, because I think that shows a lot about the types of things they've worked on and the problems they actually overcame.
And I think Oxide has tried a unique approach with its materials questions, giving candidates a chance to show exactly what they've done, because the materials are an interesting way to show off a different background, for example.
Can you tell us about the materials question?
Sure. So Oxide has everyone submit written materials.
I'd say one thing that Oxide definitely values is being able to write well.
I think Oxide asks for a work sample, which can be left open-ended.
So it's a way for you to be able to talk about what people have done.
If you've done open-source work, that's a good thing to point to there. And a lot of times, what I'm looking for is, how exactly does that relate to what Oxide is doing? What exactly are you showing me about why you'd be a good person to work with?
Oxide also asks for an analysis question.
I honestly love reading the analysis questions just because I love seeing what kind of problems
people have worked through in the past and getting to work through the nitty-gritty details about these weird bugs and seeing what sort of things
people have done. And then there are also some questions related to, you know, your happiest
time, your unhappiest times. And some people may think these questions are a little bit cheesy,
but I think they also are a good way to get people to really reflect on, you know, what exactly they've learned and maybe even, in fact, things they wish they had, you know,
done better or might have done differently today. I find those kinds of questions much,
much better than solve this problem or... Duff's device.
Or, yeah, or, you know, here's a high-pressure situation that you'll never actually encounter: you have 30 minutes to do some code thing that, if you had three or four hours, wouldn't be a problem. I like engaging with the candidates and figuring out, okay, do they have a history of solving things? Do they have a history of delivering? Are they somebody I want to work with?
And I like what you said about engagement instead of passion, because
I can be really engaged with work, but also not very passionate about it.
So I think that's a distinction.
I mean, that's why we're consultants, because...
I don't want to be.
We don't want to be passionate about companies anymore. We want to be passionate about our lives.
And I'm happy to be engaged. But at the end of the day,
I'm not going to be dreaming about your product.
It reminds me of the time I was asked
if I had a passion for iPod
and then I knew I was doomed.
You don't have a passion to, you know,
listen to, you know, 4,000 songs in your pocket?
Well, no, that part was cool.
It was, well, interviewing there.
They wanted to know if I was super into iPod.
I didn't have one.
And music.
Does that count?
So what do you do?
What is your day-to-day work like, aside from finding holes in the LPC55?
Yeah, so I do a lot of firmware work in Rust, so I spend a lot of my time doing that, writing code that goes on the root of trust and sometimes related to the service processor.
I also help with code reviews. I'm lucky to have a lot of fantastic colleagues as well, so I'll talk to them if I have questions about what I'm doing, especially Rust.
I didn't really know Rust before I joined Oxide and I've definitely gotten much better at it.
But I mean, there's certainly a lot to learn there to be able to pick up on everything and be able to do a lot of things correctly there.
I know that the Oxide folks like Rust.
I mean, they named their company after the language.
How do you like Rust?
You can tell me.
Be serious.
I do like Rust.
I promise there's not a Rust crab sitting next to me, pinching my leg, telling me to say this.
Mostly, what I like to say is that Rust is a powerful language.
And I think it makes it a lot easier for me to write C-like code because of what the language offers.
The fact that I don't have to think about array index out-of-bounds errors, or that it will give me an error in a way I can actually parse, is much nicer.
Some time ago, I remember I was working on making some change to the Hubris and Humility stuff, and I ended up hitting a bug that probably would have taken me significantly longer to figure out if it had been in C, simply because it would have been some sort of silent array index out-of-bounds error, as opposed to giving me a nice error message.
And I think it's things like that that are really great to work with.
Has it been difficult? Is the language changing at a rate that's difficult to keep up with? That was one of the issues I had with iOS development with Swift. Every six months it was, oh, here's Swift 5.5, and look at these eight things you can do that are really complicated now but are probably cool, and you should learn about them.
And it got in the way of writing code sometimes, because I felt like I had to keep up with the latest thing. As Rust is also a newer language, has that been an
issue? I think the Rust community has tried to minimize that in terms of splitting things out
and having a well-defined process for a stable tool chain and an unstable tool chain.
So I think if you're working with the stable things,
you should mostly be able to find things are roughly the same
and you'll be able to do things.
Now, I think, especially for what Oxide is doing,
we are definitely close to the leading edge of things.
So I think there are certain features we're keeping an eye on
that we're hoping to see go stable.
But I think that the language
has definitely come a long way
and it should be pretty stable
to be able to do a lot of things.
Going back to the bug you found
and are going to be talking about,
this wasn't the only one, was it?
No.
So actually last year,
I ended up finding a different kind of bug.
That was actually why I originally had the ROM dump around: while taking a look at it, I discovered there was an undocumented hardware block that could be used to patch the ROM and make changes to it.
Something like this definitely does have its use cases, but it couldn't be locked out, so it was possible to reprogram it, which could be used to break isolation between the secure and non-secure worlds for TrustZone.
That sounds kind of important.
Yeah, and I mean, I think this was our first experience with NXP, and that was the one where I think we were less than satisfied.
I think it took a little bit of convincing
to have NXP believe that this was an issue.
And then I think more than anything, Oxide,
we really just wanted NXP to give us the documentation
for what this thing was doing
and make sure everybody knew it was available,
just because there's good reasons
to want to be able to patch your ROM.
I mean, ultimately, your ROM is just code, and you're probably going to have a few bugs in your ROM.
This is understandable, and you need to have a way to fix that up.
But I think what's also important is to make sure
that you can't reprogram that to say
to be able to do other things you weren't expecting.
So let's go back.
The RAM patcher, ROM patcher, sorry. Let's go back.
The ROM patcher would let you modify the ROM, including how you program the TrustZone, the PUF key generator, and the firmware update. And, okay, so why didn't you ditch NXP at that point?
That seems really important.
How did they, what?
Yeah, this is a question we get a lot. And again, we spent a lot of time
trying to figure out exactly what we should do.
And it sort of comes down to a couple of factors.
One was that I mentioned we had some specific requirements
for what we were looking for in a chip.
And it turned out that there weren't a lot of chips out there
that met our requirements.
But we still have the write-up from when we selected this chip back in spring 2020.
And even back then, there were some chips we had to rule out
simply because we couldn't actually get our hands on silicon.
All we could get were data sheets.
Probably still can only get data sheets.
Yeah, and so trying to find that.
And then there's also the factor that we're pretty far into our product. We're getting boards and doing things like that.
So trying to find another chip, putting that in, and then doing even more silicon evaluation takes more time. And we've all spent a lot of time evaluating this chip, so I think we know far too much about it by now. In some respects, we are reasonably confident we know exactly how this thing works, so we've decided to go with it.
I do think this is a great lesson, you know,
for everyone to think about.
And I'm definitely going to be talking about this in my talk about making choices like this.
I don't wish silicon bugs on anyone, but sometimes you end up having to make these hard choices.
ARM itself has a module for ROM patching.
They used to, actually. There used to be the Flash Patch and Breakpoint unit, which was for ARMv7 and earlier, but it was explicitly removed in ARMv8-M, I think because of TrustZone, because they realized you could use it to do bad things.
So, have you used it to do bad things to show the vulnerability?
Yes.
And when we found this issue, we shared it with NXP, and I don't think they were fully convinced.
So I worked with my colleagues to build a full proof of concept.
My colleague Rick, I think, was the one who really helped dig in and figure out how to turn this into something that was pretty impressive.
And I think we joked about doing assembly code golf: finding the smallest number of instructions we could use to do something interesting by reprogramming the ROM.
What Rick and the rest of us eventually came up with took what is essentially the reference code out there and demonstrated that, using the expected APIs, we could have the non-secure world read out stuff from the secure world, which was definitely not supposed to happen.
So how did you fix that?
That one was actually somewhat easier to fix: it is possible to lock out changes and access to the ROM patcher via another security mechanism on the NXP chip, or at least restrict it so that only certain privilege levels are able to make modifications.
Is it something you have to do on boot each time?
Or is it more like a fuse that you say, okay, never again can the ROM patch this?
No, it's not a fuse, unfortunately. You have to do it on each boot.
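Since it isn't a fuse, the lock has to be reasserted by early boot code on every power-up. The real register layout belongs in NXP's documentation, so this is only a mock sketch of the pattern: a write-once lock bit that boot firmware sets before anything less trusted runs. The struct, the bit position, and all the names here are invented for illustration.

```rust
/// Mock of a write-once lock register: once the LOCK bit is set,
/// further writes are ignored until the next reset.
/// (Invented for illustration; not the actual NXP register.)
struct PatchCtrl {
    value: u32,
}

const LOCK_BIT: u32 = 1 << 31;

impl PatchCtrl {
    fn new() -> Self {
        PatchCtrl { value: 0 }
    }

    /// Model the hardware dropping writes once locked.
    fn write(&mut self, v: u32) {
        if self.value & LOCK_BIT != 0 {
            return; // locked: the write is silently discarded
        }
        self.value = v;
    }

    fn is_locked(&self) -> bool {
        self.value & LOCK_BIT != 0
    }
}

/// What early boot code would do: lock the patcher before handing
/// control to anything less trusted.
fn lock_rom_patcher(ctrl: &mut PatchCtrl) {
    let v = ctrl.value;
    ctrl.write(v | LOCK_BIT);
}

fn main() {
    let mut ctrl = PatchCtrl::new();
    lock_rom_patcher(&mut ctrl);
    assert!(ctrl.is_locked());

    // Code that runs after boot can no longer reprogram the patcher.
    ctrl.write(0x0000_1234);
    assert_eq!(ctrl.value, LOCK_BIT);
    println!("patcher locked: {}", ctrl.is_locked());
}
```

The weakness discussed next follows directly from this shape: the lock is state, not a fuse, so anyone who controls boot before the lock is set controls the patcher.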
So if somebody could hijack the boot, they could read out your secrets?
Yeah, if you managed to hijack the boot and disable that check, you were probably already running into some problems, assuming you could leverage something else to do this. I mean, this is a lot of times what security looks like: well, if you could do this, you could do that. It's all a matter of finding that one little inch and being able to turn it into a mile.
Well, this has been really interesting, and I look forward to your talk,
hopefully being available online after the conference.
Laura, is there anything you'd like to leave us with?
Stay curious, everyone, and don't be afraid to break things.
You never know what you might find.
Our guest has been Laura Abbott,
an engineer at Oxide Computer
working on Rust software
for microcontrollers.
She'll be speaking at Hardwear.io
in early June 2022
in Santa Clara, California.
Thanks, Laura.
Thanks.
Thank you to Christopher
for producing and co-hosting.
Thank you to Andrea at Hardwear.io for the introduction, and to Rick Arthur for his Patreon support and his lightning round questions. And of course, thank you for listening. You can always contact us at show at embedded.fm or hit the contact link on embedded.fm. And now a quote to leave you with, from Audrey Hepburn.
Nothing is impossible. The word itself says I'm possible.