Embedded - 34: Really Big Cabbage
Episode Date: January 9, 2014
Elecia describes to Christopher (@stoneymonster) how to design and create a firmware update mechanism. Hilarity ensues.
4k PC emulator
Making Embedded Systems, the book, on O'Reilly (coupon in las...t 2 minutes of the show) or on Amazon.
Transcript
Welcome to Making Embedded Systems, the show for people who love gadgets.
This week we're going to talk about updating device software.
It'll be exciting. Really!
Or double your money back.
I didn't intro a guest, but that doesn't mean this is all me monologuing.
Chris White, my producer, is going to be standing in for you, the listeners.
Hopefully asking the questions you have and preventing me from going
off into the deep end before we get the water wings on. Hi Chris, thank you for joining me.
Hello. So I want to talk about bootloaders, firmware update, over-the-air updates.
Yeah, there are a lot of names. OTA for over the air, and OTAP for over-the-air programming. My favorite is fwop, which I don't think is how you're supposed to pronounce it, but how else are you going to pronounce it if you're going to try to pronounce it?
So updating, uploading, whatever you want to call it, it's the programming of your device with new code after it has gone into the wild.
So not JTAG.
Fraught with danger.
Not about manufacturing and how code gets in at that point.
This is all...
Not burning your EPROM with the UV light and then...
Those were the days.
You never did that.
I really did do that. What do you mean I never did that?
I really didn't. I never did that.
Did you ever have a project where you...
I must have. Yeah, I don't know.
Okay, so you haven't done firmware update?
I have not done firmware update in a small device.
But you have done it as a user, as a consumer.
Yes, and I've done it for big systems.
Yeah, medical devices.
You've had some Linux-based systems,
and you just replace the executable, and poof, you're done?
Well, there's a little more than that.
You have to check digital signatures and make sure everything's okay
and make sure you don't blow away the good stuff
before the new stuff is validated.
But, yeah.
Well, it's all the same, except smaller and more likely that you're going to blow away all the good stuff before it's done.
Okay, so I have to admit that this is probably the podcast I prepared the least for.
So we're winging it.
And if you hate it, please...
Write someone else.
I was going to say, please send email to Chris.
Great.
So one of the things I am going to do is
I have my book here.
If you don't know, I wrote a book.
It's called Making Embedded Systems.
It's by O'Reilly.
No, it's by me.
The publisher is O'Reilly.
Strange that the podcast is the same name.
Wow.
It's like we planned this.
Although I've been thinking about
turning the podcast to Embedded.fm
and just calling it that all the time.
Eh, we'll decide later.
There's somebody else to decide for us.
Oh, yes, right.
I feel like a total idiot
when I refer to people,
when I refer people to the book
because I had a client who said,
I need to make my stuff faster,
and can you come in and talk me through it?
And I said, sure, or you could just read my book.
There's a whole chapter about it.
It tells you exactly what you want to do,
and all I'm going to do is...
Charge a lot more money.
That's true.
I guess
my hourly rates are more than what I get
for the book.
But a lot of this show
is going to come from chapter 7
in the book.
So if you've already read it, I hope you don't get bored.
But you know, it'll be fun.
Chris will have questions, right?
I always have questions.
Just maybe not about what you want.
That's fine.
That's fine.
Do you have any questions to start with?
No.
Okay.
So when we're updating code, the idea from the user's perspective is you push a button
or maybe it happens automatically and the device just updates and suddenly has new features.
Or maybe it fixes the bug that everybody hated.
Okay.
And it's funny.
I went to see the optometrist this year,
and he asked me what I did.
And at one point I was working on motorcycle stuff.
And so now I think there's a note in my file that says,
ask her what she's working on.
It may be interesting.
And I said I was working on Internet of Things widgets.
And he said, oh, I got an internet enabled thermostat.
And I said, the Nest?
And he said, no.
And so we talked about it.
It's the Honeywell one.
Since I'm planning this rant for the Embedded Systems Conference about the Internet of Things and how it really fails consumers, I asked him how setting it up went.
And he said it was okay.
I mean, it had a little, it did its own network and he logged in with his phone and gave it his password, blah, blah, blah.
And that was okay.
But three days later, it lost its mind.
It couldn't connect to the internet anymore.
And he was baffled.
But do you know what happened?
Well, either it was very badly programmed and it just lost its mind, or it tried to do a firmware update.
No, he said it hadn't done it since.
It did a firmware update and either...
And erased its information.
On purpose?
Well, probably they released the device with one set of firmware and said, oh, we're going to fix it, but we won't change the flash area or the PROM or wherever they're keeping it. We won't change the structure of that. And then they updated the firmware and something changed, and the EEPROM or the flash or whatever they were storing the password in no longer passed the checksum routine.
Given that entering all of your data in those little devices to get your Wi-Fi network connected is such a pain in the neck, that's probably something that, as a developer, you'd like to avoid blowing away.
I would think so, but...
I mean, how does that pass testing? I guess they didn't test it.
Well, they had to test that version zero went to version 12, because the devices probably spent some time in the box, and instead QA has been testing that version 11 goes to version 12.
Sure. I mean, I have to say that I have had one device that did that, and I was unhappy when I found out.
Although the version zero for us had only gone to beta users.
And so that was how we justified it.
But to this day, I'm like,
couldn't you have kept it in two formats or something?
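One way to keep stored settings alive across an update is to tag the stored record with a format version and migrate old formats instead of rejecting them. The sketch below is only an illustration of that idea: the record layout, the field names, and the two format versions are assumptions made for the example, not the design of any device mentioned in the show.

```python
import json
import zlib

CURRENT_FORMAT = 2  # hypothetical format version written by the new firmware


def pack_record(fmt, fields):
    """Serialize settings as: CRC32 (4 bytes) + JSON body holding the format version."""
    body = json.dumps({"fmt": fmt, "fields": fields}, sort_keys=True).encode()
    return zlib.crc32(body).to_bytes(4, "big") + body


def load_settings(blob):
    """Return settings in the current format, or None if the record is unusable."""
    if len(blob) < 4:
        return None
    stored_crc, body = int.from_bytes(blob[:4], "big"), blob[4:]
    if zlib.crc32(body) != stored_crc:
        return None  # genuinely corrupt: the caller falls back to setup mode
    record = json.loads(body)
    fields = record["fields"]
    if record["fmt"] == 1:
        # Hypothetical change: format 1 kept a single "wifi" string "ssid:password".
        ssid, _, password = fields["wifi"].partition(":")
        return {"ssid": ssid, "password": password}
    if record["fmt"] == CURRENT_FORMAT:
        return fields
    return None  # unknown format version
```

The point of the design is that a checksum failure and an old format are handled differently: only the first one throws the user's Wi-Fi details away.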
But a lot of firmware update has to do with what you have and what...
See, the reason this is hard for small devices is not...
I mean, you said it was the same for big devices, but it's not,
because you have no room sometimes in the small devices
to store the new thing temporarily,
or you don't have the ability to run the update code
and the normal code at the same time, right?
Exactly.
This is all about which resources are constrained and which ones aren't
and how you can do the best you can with what you got,
which is kind of the story for embedded systems.
So there are a couple of things that we need to figure out
regarding which features you have and which ones you can use more of.
The first thing is how you're going to store the code, the actual final image, the what
you're running from.
And we say that this runs from ROM.
Of course, the acronym means read-only memory.
And since we're about to change it, that's not really read-only memory.
But it may be onboard flash.
Read-often memory.
Read-often memory, right, changing it.
The other thing that you need to know
is where are you storing the code,
the new code?
Not the code you're currently running,
but the new code.
So the temporary,
I have to put this somewhere
because I can't overwrite the thing I'm running.
Not even that, not yet.
We're going to get there.
But the thumb drive, if you're updating via thumb drive,
the cloud server.
So where the original source of the new stuff is.
Where the source of the new stuff is.
And the communication method.
And those two things can be, I mean,
a thumb drive indicates that this is going to be over USB, but cloud indicates over Wi-Fi.
Cloud.
So you have to know what those things are.
Right.
And I don't think I can help you with that.
That's kind of how your systems decide.
But when you're building all this, you need to know what it is.
At some point you boil down to a function that says get new image.
Right.
And however that works, who cares?
Not always. Sometimes some of these chips, like the STM F152 line, and I bet...
Seriously, you had that just sitting on the top of your head?
I used it recently.
Okay, so the Piccolo C2000 line also has one.
You're just going to list product names that have weird numbers to demonstrate your eidetic memory.
No, because my memory for these numbers is... man, I have to see them about a thousand times. You know, we watched Top Gear and they had those Chinese cars.
Right.
With the really, really long, long...
Do you remember any of the Chinese car names?
Of course not.
It was like the Ford F1 4000...
They weren't Fords because they were the Chinese companies.
It wasn't...
BQR...
Yes.
It was like...
We're going to offend people if I continue, so...
Yeah, well, Top Gear.
So anyway, the source of your data.
So the source of your data and communication method
are the important parts that you need to remember.
Okay.
And so that's going to be like a constant as we talk through
is this, where does this stuff come from?
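The "get new image" idea from a moment ago can be sketched as one small interface, so the update logic does not care whether the bytes arrive from a thumb drive or a server. Everything here is hypothetical: the class names, the `firmware.bin` file name, and the dictionary standing in for a drive are assumptions for illustration only.

```python
from abc import ABC, abstractmethod


class ImageSource(ABC):
    """Hypothetical interface: where the new firmware comes from."""

    @abstractmethod
    def get_new_image(self):
        """Return the new image as bytes, or None if nothing is available."""


class ThumbDriveSource(ImageSource):
    """Stands in for a USB drive; 'files' is a dict simulating its contents."""

    def __init__(self, files):
        self.files = files

    def get_new_image(self):
        return self.files.get("firmware.bin")


class ServerSource(ImageSource):
    """Stands in for a cloud server; 'fetch' is any callable returning bytes or None."""

    def __init__(self, fetch):
        self.fetch = fetch

    def get_new_image(self):
        return self.fetch()


def check_for_update(source):
    """The update logic only sees get_new_image(), not the transport behind it."""
    image = source.get_new_image()
    return len(image) if image is not None else 0
```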
Some processors have onboard bootloaders.
Whoa, whoa, whoa.
A what now?
An onboard bootloader.
And what this does is it's a part of the chip.
It may take up some code space,
but probably it's just part of the chip as it's shipped to you.
And you put some IO lines in some configuration.
You pull them up, you pull them down, whatever.
And then it goes into this mode
where when you apply power or reset,
come out of reset, you can just upload the firmware.
And you do it like over a UART or a SPI terminal or whatever.
It's like the processor can update the firmware for you.
Okay, and there's no way to accidentally destroy this piece of code.
You have to work really hard.
Well, no, that's good.
I mean, I did do it once, yeah.
But that usually involves trying to change
things deep inside the processor
that you weren't supposed to touch in the first place.
That was one of my questions,
is why do you need a bootloader at all?
And I know we're not staying on track here very well,
but this is a fundamental piece, right?
So why do you need a bootloader?
Eventually, to start a program on your chip,
chips are, you know, they read from address zero
and they just go or something like that, right?
Oh, but this bootloader isn't between the code.
It doesn't always run.
It kind of does because it checks those little...
Checks the pins that may be pulled up or down
to indicate you should go into bootloading mode.
So it is the first thing that runs.
But it may be the first thing that runs
before the processor really gets into running code
from address space.
Oh, okay.
So it may actually run outside normal bounds.
Although sometimes if you're willing to walk through the reset vector
and all the things that come between when you reset
and when you hit main, sometimes you can find the bootloader call in there.
And it calls, it checks these GPIOs, and then if they're set in the way that says
go into bootloading mode, it goes into bootloading mode.
And it sets up the UART to the data sheet agreed upon standard, and poof, now you can update your firmware.
That only works if your firmware is going to onboard flash, because the chip vendor knows how to deal with onboard flash.
And it may work if your code is going to onboard RAM.
Right.
It's not going to help you at all if you need to get your firmware update from a network.
Right.
Because the bootloader is not going to have anything really except I can check my pins, I can read stuff from some sort of serial interface
and put it into RAM or to internal flash, but that's it.
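As a rough sketch of the pin check described above: at reset, the built-in bootloader samples a couple of strap pins and either enters its serial loading mode or starts the application. The pin pattern below is made up for the example; a real part's data sheet defines which pins and which levels select the bootloader.

```python
BOOT_NORMAL = "run application"
BOOT_LOADER = "serial bootloader"


def select_boot_mode(boot_pin_levels, loader_pattern=(1, 0)):
    """Pick a mode from the levels sampled on the strap pins at reset.

    loader_pattern is a hypothetical example value, not taken from any data sheet.
    """
    if tuple(boot_pin_levels) == tuple(loader_pattern):
        return BOOT_LOADER
    return BOOT_NORMAL
```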
Right, but serial interface is pretty useful
because it might come from an SD card
if you can get a SPI interface to an MMC or a small thing
that basically is a flash.
And the bootloader is smart enough to talk to an SD card usually?
Some, some not.
Okay, but that's a feature of the bootloader.
Well, and you can make your own bootloader.
I mean, if you know you're going to be updating this way,
you can make your own little bootloader that does the same thing.
So if you're crazy enough, you could make a really complicated bootloader
that could handle sophisticated things,
like even talk over a network if you had enough code space for it or something.
Yeah, you could.
But if you're talking over a network,
you're probably talking about an operating system,
and that would mean having two copies of the operating system.
Well, we'll go back to two copies.
But you could definitely have a bootloader that could talk to any flash
that you ship your boards with
or any of a small set of SD cards that you have qualified.
Okay.
And then you put your new code onto the SD card, you plug it in,
and now your communication method is SPI and you just update your code space.
Your bootloader doesn't change,
which is kind of the downside to this process.
Why?
Well, what if... I mean, if you could change your bootloader,
or you don't necessarily want to, right?
Because there's always a danger with doing an update.
Yes.
And you try to protect,
you try to keep all those danger zones
as small and short as possible
and if you're overwriting the bootloader, which is the last line of defense for starting the system, then you could potentially turn it into a brick.
Yes, we're going to be talking about bricks. Bricks are what happen when you have a device that formerly worked fine, and then you were updating it and it lost power, or it lost its mind, or cosmic rays interfered, or whatever.
And suddenly you no longer have a device that does jack.
And often you no longer have a device that can ever do anything again if you do this wrong.
I bet it could do something.
Well, a lot of the new processors will lock out JTAG
because that's a security thing.
And the only way to update the firmware now
is to actually do this firmware update process.
And if you crash in the middle of updating your bootloader...
I always think it would make a nice bookend, maybe.
Oh, I see.
Cat toy, bookend, landfill taker-upper of space, yeah.
Okay, but we're going with the simple method still.
We have the code.
We have a communication method.
We have the bootloader, which we're not going to update ever.
And we have the code.
And that all makes sense.
And now if you're going along and you power on, you check the bootloader, you check to see if new code is available.
How do I do that? What does that mean?
Well, the bootloader has to have a communication method. It has to know how to talk SPI or UART or whatever. That's what the bootloader has.
Sure.
So the bootloader checks to see if new code is available on its communication method.
Maybe it's a USB thumb drive.
It goes out to the thumb drive and says, do you have code?
And if the thumb drive says yes, then it says, is this code new?
Yes.
Does this code checksum?
And if you are writing a bootloader and you don't have a checksum, man, really, you're totally going to make...
Has anybody ever done that?
Well, yeah, because the checksum is one more piece of code, and if you're trying to keep a bootloader to... I did it once in 256 bytes.
That was more impressive before I saw the entire PC emulator in 4,000 bytes the other day.
Oh, yeah. What was that called?
Do you remember?
I'll look it up.
Yeah, that was pretty impressive.
Well, yeah.
I don't know that I believe you.
Anyway.
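The three questions the bootloader asks (is there code, is it new, does it checksum) could be sketched like this. The image layout is an assumption invented for the example: a 2-byte version, the payload, and a trailing CRC32. The sketch also checks the checksum before trusting the version field, which is a small reordering of the questions as they were spoken.

```python
import zlib


def make_image(version, payload):
    """Hypothetical image layout: 2-byte version, payload, then CRC32 of both."""
    body = version.to_bytes(2, "big") + payload
    return body + zlib.crc32(body).to_bytes(4, "big")


def should_install(candidate, running_version):
    """Is there code, does it checksum, and is it newer than what is running?"""
    if not candidate or len(candidate) < 6:
        return False  # nothing on the drive, or too short to be an image
    body, stored_crc = candidate[:-4], int.from_bytes(candidate[-4:], "big")
    if zlib.crc32(body) != stored_crc:
        return False  # damaged or truncated file
    return int.from_bytes(body[:2], "big") > running_version
```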
So the bootloader runs and it looks for a new code update
and then it loads the new code to scratch space, ideally.
That would be RAM or maybe it's off-board RAM,
or maybe it's even internal flash.
But you're assuming that that thumb drive or SD card
may get pulled away, and you want the bootloader
to be able to continue as much as possible.
It goes back to keeping the amount of time
that you're vulnerable to power off causing brain damage
short.
And you can shorten that time to zero by having lots of space available, but sometimes you
may not.
I don't think so.
Really?
I think we're going to get to the end of this and we're going to say no.
There is no way that is foolproof all the time.
Okay.
I mean, except maybe having two full images.
Yeah.
Oh, all right.
Well, then I'll give that to you.
I mean, unless something goes wrong when you're, okay.
We'll get there.
Okay, so your bootloader has detected there's new and valid code available.
You've loaded it to someplace local, into scratch space.
If you don't have enough scratch space to load it all,
you're just loading little tiny bits: erasing a sector,
programming the new code,
and then you just keep doing that until you finish.
And then once it's done,
you can run your new code.
Now, scratch space could be
enough space to hold your whole code,
and then if somebody pulls your SD card, you're safe.
Okay.
Or it may be whatever your sector size is,
or whatever your minimum amount that can be programmed.
Okay.
So it may be that I have less scratch space than the size of my image.
Right, right.
But you can't have less scratch space.
Well, you really shouldn't have less scratch space
than the size of your sector
because you have to program a whole sector at once.
So you have to put it in place.
Okay.
So yeah, you have to,
the goal here is to kind of be fast too.
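Here is a toy model of the small-scratch-space case: only one sector's worth of buffer, so the loader reads a sector from the source, erases the matching flash sector, programs it, and repeats. The 16-byte sector size and the `Flash` class are assumptions chosen to keep the example short; real parts have much larger sectors and their own erase and program rules.

```python
SECTOR_SIZE = 16  # tiny sector so the example is easy to follow
ERASED = 0xFF


class Flash:
    """Toy flash model: erase sets a whole sector to 0xFF, program writes a whole sector."""

    def __init__(self, sectors):
        self.data = bytearray([ERASED] * (sectors * SECTOR_SIZE))

    def erase_sector(self, index):
        start = index * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = bytes([ERASED] * SECTOR_SIZE)

    def program_sector(self, index, buf):
        assert len(buf) == SECTOR_SIZE
        start = index * SECTOR_SIZE
        self.data[start:start + SECTOR_SIZE] = buf


def update_by_sector(flash, source, first_sector=0):
    """Copy 'source' (bytes standing in for the SD card) one sector at a time."""
    sector = first_sector
    for offset in range(0, len(source), SECTOR_SIZE):
        scratch = bytearray(source[offset:offset + SECTOR_SIZE])  # one-sector scratch space
        scratch.extend([ERASED] * (SECTOR_SIZE - len(scratch)))   # pad the final sector
        flash.erase_sector(sector)
        flash.program_sector(sector, scratch)
        sector += 1
    return sector - first_sector  # number of sectors written
```

Note that with this approach the old code is gone as soon as the first sector is erased, which is why the source has to stay plugged in until the last sector is programmed.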
Okay, so if you need to update that bootloader,
you're right.
You're going to have this period
where you could totally lose your mind permanently.
The advantage to having a bootloader that's resident,
that always is there,
is that if you get your power yanked in the middle of your erase and program cycle,
at least you still have your bootloader.
You just have to put in the SD card or the thumb drive or whatever again,
and the bootloader will say, oh, okay, I've got new code.
I'll program it.
But if you need to update your bootloader, then that's where things can go bad.
So why would I need to update my bootloader?
Well, let's say your...
Seems like a huge mistake was made somewhere.
SD cards that you were using are no longer available,
and the new SD cards have different timing.
Or your code now needs to take up one more sector,
and the only sector available was in the bootloader sector.
And so now you need to make a smaller bootloader.
I liked that wince.
That was a great wince.
You couldn't see that.
It all sounds very painful.
It is painful.
But on the other hand,
you really kind of need to plan for this
because if you try to do the bootloader at the very end...
It seems like a lot of these things are a result of not planning.
No, they're the result of squeezing the pennies out.
When you need to make a million of something, then getting from
256k to 128k is worth a lot of money,
actually. It's certainly worth a few engineer hours.
Okay, I can see that.
Okay, so
now we're updating the bootloader.
And it's a
shell game. You
update the
new code to have a new
bootloader loader.
And then you load the new bootloader
and then you erase the bootloader loader
with the new code and then you run.
Yeah, it's all very...
The bootloader loader?
Yeah, the bootloader loader.
I think you're just making this up now.
No, no, no, I'm not.
But it's one of these things where, well, you know,
there's the goat and the cabbage and the wolf.
And you are going home from a fair,
wherever you take a goat, a cabbage, and a wolf.
Why? I don't know.
And then you need to get them across the river.
And it's really similar because you have to take...
Go on, I'm listening.
Okay, so with this puzzle,
which is this stupid ass interview question.
No, it's an important interview question.
This will answer whether or not the candidate
is a qualified person for this job,
any job involving technology.
If you ask this one question,
you will be able to determine instantly.
You need to turn the sarcasm up a little bit
because I know there's sarcasm,
but I'm not sure it's really coming across here.
All right, well.
Okay, so the goat question.
You have a goat, you have lettuce,
and you have wolf. You need to get them to the
other side of the river. And you have one boat and only one thing fits in at a time.
Which makes no sense because they're all different sizes.
I know. I mean, it's not like the cabbage is going to take all that much space.
Big cabbage. It's a really big cabbage.
And the goat will eat the cabbage
and the wolf will eat the goat.
And so how do you get
them all to the other side?
What does the cabbage eat?
The cabbage eats nothing. It's cabbage.
Okay.
You have to get them all over to the other side.
And really the whole problem
is you have to keep the goat away from the other two.
And so you take the goat over.
No, no, no.
I have no idea how to do this.
Well, you're not qualified.
I know.
But the bootloader problem is strangely similar to this
in that you have to do one thing at a time.
In the right order.
And you keep it in the right order.
And you really have to think it through before you start.
Because if you start with the wrong thing, it just goes bad.
Okay, so you cross the other side with the goat.
I'm probably going to say this a lot.
You go back to the original side. You take the wolf or the cabbage, it doesn't matter at this point, and you take that over to your house. And now you take the goat back with you, because the goat's not allowed to be alone with the cabbage. And then you take the wolf over to your house, and now the wolf and cabbage are alone, but that's okay. Then you go back and you get the goat.
So you've taken an extra trip,
but you get all of your creatures home safely.
Congratulations, you can work for Microsoft.
You know, I think the interview processes are different
than when we went to college.
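For anyone who wants to check the river crossing rather than trust the spoken version, a short breadth-first search finds a shortest safe ordering. This is just a sketch of the puzzle as stated (one passenger per trip, goat eats cabbage, wolf eats goat); the state representation is an arbitrary choice made for the example.

```python
from collections import deque

ITEMS = frozenset({"wolf", "goat", "cabbage"})
UNSAFE = [{"wolf", "goat"}, {"goat", "cabbage"}]  # pairs that cannot be left alone


def safe(bank):
    """A bank without the farmer is safe if it holds no forbidden pair."""
    return not any(pair <= bank for pair in UNSAFE)


def solve():
    """Breadth-first search; state = (items on the start bank, farmer on start bank?)."""
    start, goal = (ITEMS, True), (frozenset(), False)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (near, farmer_near), path = queue.popleft()
        if (near, farmer_near) == goal:
            return path
        here = near if farmer_near else ITEMS - near
        for cargo in sorted(here) + [None]:  # None means crossing with an empty boat
            moved = {cargo} if cargo else set()
            new_near = near - moved if farmer_near else near | moved
            left_behind = new_near if farmer_near else ITEMS - new_near
            state = (frozenset(new_near), not farmer_near)
            if safe(left_behind) and state not in seen:
                seen.add(state)
                queue.append((state, path + [cargo or "nothing"]))
    return None
```

The search returns seven crossings, starting and ending with the goat, with the goat also making one trip back in the middle. That middle trip is the "extra trip" in the puzzle, and it is the same kind of deliberate ordering the bootloader update needs.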
But yeah, when you're working with bootloaders,
when you need to update a resident bootloader,
you start out with the bootloader in place and some form of code in place.
And you replace that code with something that can then update the bootloader.
Okay.
And then you...
You run that.
You run that, and now you've updated the bootloader,
so you've got bootloader prime or new bootloader,
whatever you're calling it.
And now you don't have any good code,
so now you have to update the code.
But that last step is the normal code update
that you would have done anyway.
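The shell game just described can be walked through with a tiny simulation that cuts power halfway through each step and asks whether the device could still recover. The model is deliberately crude and rests on one stated assumption: as long as the bootloader region is intact, the device can reload code from its update source; if the bootloader region is half-written, it is a brick.

```python
STEPS = [
    # (description, region rewritten during the step, what it holds afterwards)
    ("old bootloader installs the bootloader loader", "code", "bootloader loader"),
    ("bootloader loader installs the new bootloader", "bootloader", "new bootloader"),
    ("new bootloader installs the new code", "code", "new code"),
]


def recoverable(device):
    """Assumption: the device can always recover if the bootloader region is intact."""
    return device["bootloader"] is not None


def power_loss_report():
    """Interrupt each step halfway and report whether the device could still recover."""
    device = {"bootloader": "old bootloader", "code": "old code"}
    report = []
    for description, region, result in STEPS:
        interrupted = dict(device)
        interrupted[region] = None  # a half-written region is invalid
        report.append((description, recoverable(interrupted)))
        device[region] = result     # the step completes normally
    return report, device
```

Under this model only the middle step is dangerous, which is why that step should be as short and as rare as possible.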
But updating the bootloader step
is not strictly necessary every time
and should be minimized, right?
It really should be minimized, because if you fail to update the bootloader, you have created a brick. On the other hand, sometimes that's necessary.
Creating a brick or updating the bootloader?
Updating the bootloader.
Okay.
No, creating a brick and making a system unrecoverable is bad, and one of the goals of good loader design is to make that period as short as possible.
Okay.
Okay, so that's what happens if you are doing that. So now let's move on to a different type, a totally different style of loading.
Our new code now is going to contain a loader.
And so we're always going to update the bootloader.
Except we're not going to call it the bootloader anyway.
We're just going to call it the loader.
Okay.
We just said that you don't want to do that
unless you really, really have to.
Well, now it's more like a ping pong buffer.
You have two things that are always valid and they can update each other.
And if one of them becomes invalid, the other one is still okay.
And as part of the whole thing, you know which one is valid and which one is okay.
How does that work when you come out of reset?
How does it know which one to go to?
Because secretly there are three parts to this.
If I ask you another question, are there going to be secretly four parts?
No, what you have is some little, little, tiny, tiny...
A little, not quite a bootloader, but almost a bootloader.
A micro-bootloader.
A micro-bootloader.
I knew this was going to happen.
And it can... It's like Zeno's paradox. but almost a bootloader. A micro bootloader. A micro bootloader. I knew this was going to happen.
And it can... It's like Zeno's paradox.
You load the bootloader
and then the bootloader loader
and you never actually load that code
because you always have to do a smaller loader.
Exactly.
No.
No, not at all.
But you have one little tiny part of code
that I swear is resident
and will never, ever, ever change.
Really, this time.
Not like the bootloader.
Jump zero.
Yes, that.
And so this little tiny piece of code,
which I hesitate to even call it a bootloader,
but it's the only name that it seems consistent.
Bootstrapper.
Bootstrap, yeah.
That's actually a better name for it.
So you bootstrap and you check to see if the loader code is valid.
And you check to see if the runtime code is valid.
And if the loader code is valid, you run it because then it can check and see if there's anything waiting for new code.
If it isn't valid, then you go ahead to the regular runtime code and you run that.
The runtime code can update the loader and the loader can update the runtime code. It's a ping pong thing. You're going back and forth. One can always work with the other. And because you're only doing one of these at a time, because you're finishing one before you start the other, finishing the loader update or finishing the code update before you swap and do the other, that means that something is always valid. You can always run the system. You can't make a brick anymore.
So that's the advantage of this one: it's a little bit safer.
Yes.
But presumably there must be a downside because not everybody does this.
Well, now you have two sets of code that can communicate to your communication method,
the loader and regular code. And both of them have to be able to talk whatever protocol you're talking.
And if that's Wi-Fi, that's a pretty big protocol.
If it's SPI, then that's not so big.
If you're talking serial to a Bluetooth chip, that's not so big.
But you're making it so you can update the code more easily
and you're preventing the brick situation,
but you're paying for it in Codespace.
So if you're trying to minimize your Codespace,
this might not be the method for you.
Okay, that makes sense.
Is this totally secure?
I mean, is there still a way to brick this?
The only way really to make a brick at this point
is to fail your checksums, and that's a hardware error.
So you can lose power at any time, and it's okay.
Okay.
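The bootstrapper's decision in the ping-pong scheme is small enough to sketch directly: run the loader if it is valid, otherwise run the runtime code. The validity flags here stand in for whatever checksum test the real bootstrapper would do, and the function names are hypothetical.

```python
def bootstrap(loader_valid, runtime_valid):
    """Resident bootstrapper: prefer the loader, fall back to the runtime code."""
    if loader_valid:
        return "loader"
    if runtime_valid:
        return "runtime"
    return None  # both invalid: should not happen if only one is rewritten at a time


def simulate_power_loss():
    """Cut power during each update and record what the bootstrapper would run next."""
    outcomes = {}
    # While the loader rewrites the runtime code, the runtime checksum fails.
    outcomes["during runtime update"] = bootstrap(loader_valid=True, runtime_valid=False)
    # While the runtime code rewrites the loader, the loader checksum fails.
    outcomes["during loader update"] = bootstrap(loader_valid=False, runtime_valid=True)
    return outcomes
```

In both interrupted cases something runnable remains, and whichever half survives is the one that knows how to rewrite the other.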
And mostly with bricks,
you just assume that you're going to lose power
at the worst possible time,
and when you boot up, can you still run?
You just need to put a big capacitor on everything.
Three minute long capacitor.
Three minutes?
Well, I mean, if you're talking to a server.
What decade is this?
I know.
But if you have, you know, a serial interface or a really, really small communication protocol, even for HTTP over serial, so you
have data, but you don't have a lot of communication overhead, then you might be able to do that.
You might be able to put that in a loader, but you can run into problems.
I'm telling you, power goes out at the worst possible times.
Okay, so let's say you don't have the code space.
But you do have RAM.
Not all of these.
This whole microcontroller thing where you get onboard code space is really spiffy.
I have one processor I'm working with right now
that's all RAM space.
And so its code lives off the processor
and I have to update it from the flash over there.
And then I reboot and it loads into RAM.
So it runs out of RAM.
It runs out of RAM.
Running out of RAM is kind of cool.
It's kind of fast.
It's faster than running out of flash, for sure.
Well, it's the way most people think computers work.
Well, that's for computers.
Those are boring.
I'm using computers in a very general term here.
You may be surprised,
but the things we actually work with are computers.
Yes.
They certainly would have been considered large computers only two or three decades ago.
I've heard that the difference between a microprocessor and a microcontroller is whether or not it's got onboard flash.
I still don't use those words properly, but we're going to go with that being the actual difference.
Well, they have peripherals attached and things that microprocessors don't.
Yeah, I don't know if there's anything that has onboard
flash but doesn't have
peripherals.
Okay, so
but I'm not going to worry too much about
the things that run out of RAM because those
well, you can kind of figure those out. You have to
update whatever your code space
storage area is.
It's just one extra step, right?
Eventually, you have to move the stuff to RAM when you first start up.
Yeah.
But otherwise, the process should be the same.
Right.
So let's say onboard code space is too expensive.
So onboard Flash is too expensive.
But you have some RAM, which you're using
because your algorithm takes a whole bunch.
A lot of signal processing falls into this realm.
And so what you do is you put a little RAM chip
on the outside of your processor.
And you load the new code into the external RAM.
Who does that?
The bootloader?
Your code does that
because you don't have your regular runtime code.
Okay, so during the update,
the code loads the new code.
The existing code loads the new code to the external RAM.
Yes.
Got it.
And then the existing code
adds a little piece,
puts a little code into run space RAM.
So in the processor's RAM.
We need a diagram.
I know, I've got one in front of me.
It's kind of bad.
I think it's making it worse, actually.
Okay, so you have runtime code. You've got runtime pieces. You have runtime flash code.
Yeah.
You have the RAM on the processor, which is run space RAM. That's what we're going to call that.
I'm not sure your terms are distinct enough to not be confusing.
Oh, all right.
So the RAM on the processor that everything executes from?
No, no, no.
Most of the time on this processor that we're talking about here,
this is a made-up processor, although there are many like it.
It runs from Flash.
Oh.
On board Flash.
I thought we were still talking about ones that run from RAM.
Oh, no.
Because those are kind of an easier case.
Sorry.
They can fall under this, but they can...
Got it, got it.
Okay, so...
This is why people listen, right?
It makes them feel better about themselves.
Yes.
When you get into the office today,
you're driving along, listening to this podcast,
thinking, I can totally explain this better than they can.
Find a nice junior engineer and explain it to them.
We're unexplaining it.
Yeah, okay.
So I don't really care about the processors
that run from RAM all the time.
Boring, easy, done, fine.
Those are boring.
We're not going to worry about those.
Okay.
So now this Phantom processor has some onboard flash,
but it doesn't have enough to do that ping pong thing we just discussed.
Okay.
It's got a little bit, and it's got some onboard RAM as well.
Okay.
Now we want to update it, but we don't really have enough resources to do this, because we've only got some RAM. We've only got some code space, and not enough for a duplicate copy of our code.
Yep.
So we're going to take the new code and we're going to put it into an external RAM.
Now you have to have thought about this ahead of time
because you have to have the external RAM.
This is hardware.
You're not just faking this piece.
Okay, so now the code, the regular runtime code,
puts the new code into the external RAM,
scratch space RAM.
And then it takes some little piece, a bootstrap sort of thing,
and it puts it into the run space RAM on the processor.
Run space RAM.
So we have some RAM on the processor we can run code from.
But we don't usually do this because there's not a whole lot of it.
That was the confusing piece.
Okay, so you can run from RAM on this processor,
but you generally don't.
No, because we usually run from flash
because there's enough flash
and there's not a whole lot of RAM.
Got it.
But there's enough RAM to run something small.
Okay.
And should I digress here or should I continue?
I thought we already had digressed.
So a lot of times this happens when you have, as I said, signal processing applications.
Normally, you would be using all of your RAM in the signal processing application to buffer
data, or for whatever you're doing with your actual algorithm.
You'd be using this RAM.
But since you've started this download process,
that RAM now is suddenly available to do more things with.
And so what we're going to do, after we've loaded all of our code into our external scratch space RAM,
we're going to take a little tiny bit of ourselves,
a little tiny program of ourselves
that is the code that can program the onboard flash.
Got it.
And we're going to put that in RAM.
And then we're going to take that RAM and run from it
and read the external RAM into the onboard flash.
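The sequence just described can be sketched on a PC, with plain arrays standing in for the external scratch RAM and the onboard flash. This is a minimal sketch, not code from the show: the sizes, the page granularity, and the names `stage_image` and `bootstrap_program_flash` are assumptions made for illustration, and a real part would use its own flash erase and program routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FLASH_SIZE      1024u   /* assumed onboard flash size in bytes */
#define FLASH_PAGE_SIZE 256u    /* assumed erase/program granularity */
#define SCRATCH_SIZE    1024u   /* assumed external scratch RAM size */

/* Simulated memories so the sketch runs on a PC. */
static uint8_t onboard_flash[FLASH_SIZE];
static uint8_t scratch_ram[SCRATCH_SIZE];

/* Step 1 (regular runtime code): stage the whole new image in scratch RAM.
 * Nothing the device needs for running has been touched yet. */
int stage_image(const uint8_t *image, size_t len)
{
    if (len == 0 || len > SCRATCH_SIZE || len > FLASH_SIZE) {
        return -1;              /* refuse before anything is modified */
    }
    memcpy(scratch_ram, image, len);
    return 0;
}

/* Step 2 (the small bootstrap that would run from on-chip RAM):
 * erase and program onboard flash one page at a time from scratch RAM.
 * From the first erase until the last page is written is the danger zone. */
void bootstrap_program_flash(size_t len)
{
    assert(len <= FLASH_SIZE);  /* stage_image() already checked this */
    for (size_t off = 0; off < len; off += FLASH_PAGE_SIZE) {
        size_t n = len - off;
        if (n > FLASH_PAGE_SIZE) {
            n = FLASH_PAGE_SIZE;
        }
        memset(&onboard_flash[off], 0xFF, FLASH_PAGE_SIZE); /* erase page */
        memcpy(&onboard_flash[off], &scratch_ram[off], n);  /* program page */
    }
}
```

The split matters more than the details: everything that can fail cheaply (size checks, receiving the image) happens in step 1, so step 2 is as short as possible.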
So how does that work?
So presumably you're not doing a reset
between those two things.
Oh God, no.
So you have to...
Because that would be a brick.
So you have to copy that and then jump to it.
Well, it's even worse than copying it and jumping to it.
I mean, you could do that
if you didn't have any function calls,
but you have to link specially for running from RAM
because the addresses are all different.
It's weird.
You have to deal with the linker file.
So this is my question.
You've got a situation that's sort of dynamic that you don't usually do.
So how does that, how do you actually make that work?
Be more specific, please.
So you have to set this up in the linker file presumably ahead of time
of when you build your whole system.
Yeah.
Right?
Yeah.
So do you have to have a special symbol then
that says this is...
Right.
What's the mechanism through which I start executing
the code that's in the RAM,
the special little run space RAM?
Jump to.
That's what I just asked, and you said no.
Oh, I'm sorry.
It was more complicated.
No, no.
You can jump to, or you can make a function pointer to, where you have now copied it to RAM.
Okay, but what's the point of the linker file?
But you can't do it...
You can jump to this address and run from this address.
But this address has to be self-consistent,
has to be compiled to run from RAM.
So the code has to be position independent
or position dependent the right way.
Right.
That you put into the RAM.
Right.
So it's less about the action,
it's more about how you prepare the small piece of code.
Yes.
Okay.
Your little bootstrapper that's now running from RAM
has to be designed to run from RAM.
And realistically, if you're doing this type of heavy algorithm data sort of thing,
you probably have something else running from RAM for speed.
Your FFT algorithm probably is already compiled to run from RAM.
So you've already had to go play in the linker hell that is the linker file.
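As a rough illustration of "prepare the code, then jump to it": with GCC-style toolchains, a function is often tagged with a section attribute, and the linker file assigns that section a RAM run address. The section name `.ramfunc`, the `RAMFUNC` macro, and the function names below are all hypothetical; the real names depend on your linker file and vendor startup code. On a PC build the attribute is dropped so the sketch still compiles and runs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical section name; it must match an output section that the
 * linker file gives a RAM run address. Dropped on non-ARM host builds. */
#if defined(__GNUC__) && defined(__arm__)
#define RAMFUNC __attribute__((section(".ramfunc"), noinline))
#else
#define RAMFUNC
#endif

typedef void (*ram_entry_t)(volatile uint32_t *status);

/* Anything the bootstrap calls must also live in RAM. A call back into
 * flash would run half-erased code once programming has started. */
RAMFUNC static void ram_helper(volatile uint32_t *status)
{
    *status |= 0x2u;
}

RAMFUNC void ram_bootstrap(volatile uint32_t *status)
{
    *status = 0x1u;
    ram_helper(status);
}

/* "Jump to": call through a function pointer that holds the address the
 * code was linked to run from (and has been copied to). */
void jump_to(ram_entry_t entry, volatile uint32_t *status)
{
    assert(entry != 0);
    entry(status);
}
```

The call itself is the easy part. The work is making sure `ram_bootstrap` and everything it calls were linked for the addresses they will actually occupy in RAM.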
Does that make sense now?
Yeah, it makes more sense.
Okay.
Is that the most complicated scheme?
Please say yes.
That's the most complicated scheme I was going to talk about today.
That's funny.
I neglected to figure out what we were starting. I was talking about holes and
time, space,
displacement.
Well, I mean, that is because
external RAM
is cheap, particularly
if it's slow.
And internal code space is expensive.
And that's why you're doing it that way
And basically the game is to find someplace you can stash a piece or a copy of the entire
new thing, and to spend as little time as possible in the danger zone, where you have
a mix of new and old code,
or you've overwritten something that you need to execute
and, if you lose power, you can never get back.
Those are the two big fundamental pieces.
And so you end up with these schemes
depending on the architecture of your system.
Yes.
But it's probably helpful to think about this early
because it would influence the architecture of your system.
Well, it may influence your hardware if you need an external hardware piece.
Yeah. And if you
wait until the last minute to do your firmware update, you will
end up with things like that thermostat did, where, oh, I
had this plan. It was all going to work just fine. Oh, crap, we've
already shipped two, and it doesn't
work the way we really need it to work for the future.
Well, they won't complain much.
You disconvenience your users, and then...
Disconvenience?
Inconvenience. Inconvenient.
I never said disconvenience. But you end up with Amazon reviews that say you suck, which,
you know, is always a great way for you to feel better about yourself.
Not.
So I already said checksums,
and we talked about danger zones,
and when you're doing this plan for your system...
Danger zone.
Sorry.
We're writing to the danger zone.
You're going to need a flowchart.
Just trust me on that.
Make the flowchart,
and then buy yourself a nice set of colored pencils
and highlight in red.
When I was 10, I got this
template,
and I made my parents buy it for me
at some store, and it had little flowchart symbols,
and I had convinced myself it was going to
lead me to do all kinds of really cool projects
because I could now draw the little symbols for all the flowcharts.
I never used it.
You really thought the symbols were the important part,
whether it was a rectangle or a diamond?
It's very important.
Yeah.
All right.
Well, draw your flowcharts.
Mark the areas that are dangerous and try to keep them short.
If you can, convince somebody that it's worth the extra 15 cents
never to have your unit turn into a brick.
Well, there's a cost-benefit to that.
There is, absolutely.
If the odds are that one in every thousand units is going to turn into a brick
and get returned, and that costs you $100, versus $1,000 to use a more expensive chip in your product,
you might err on the side of a couple of unhappy customers.
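To make that trade-off concrete, here is a small worked sketch. It borrows the 15 cents, the one-in-a-thousand odds, and the $100 per return from the conversation, and assumes a production run of 1,000 units; the function names are made up for illustration. Money is kept in cents to avoid floating point.

```c
#include <assert.h>
#include <stdint.h>

/* Expected cost of bricked returns across a production run, assuming
 * one unit in `one_in_n` bricks during an update. All money in cents. */
uint64_t expected_brick_cost_cents(uint64_t units, uint64_t one_in_n,
                                   uint64_t cost_per_return_cents)
{
    assert(one_in_n > 0);
    return (units * cost_per_return_cents) / one_in_n;
}

/* Cost of adding the safer part to every unit in the run. */
uint64_t extra_part_cost_cents(uint64_t units, uint64_t part_cents)
{
    return units * part_cents;
}
```

For 1,000 units, `expected_brick_cost_cents(1000, 1000, 10000)` comes to 10,000 cents ($100), while `extra_part_cost_cents(1000, 15)` comes to 15,000 cents ($150). On those numbers alone the cheaper part wins, but the sketch leaves out anything that is hard to price, such as bad reviews.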
I neglected to write down what time we started.
Do we have time for one more?
We have time for anything you like.
All right.
As long as it's not another bootloader scheme.
No, no.
This is the way that makes the bootloader schemes
all look simple.
It's when you add security to it.
I was going to say you're going to add a hard drive.
So I worked on a system that had security keys in it
and there were consumables as part of the system.
And so when you updated the firmware,
you could update the security keys and thereby invalidate previous consumables.
Yeah.
So if security keys are wandering around loose around the world, that's not good.
If your consumables are worth enough, people
are going to hack your system. The easiest way to hack it is through this
firmware update scheme. So if you have a consumable or
something like this, or some secret key, then you either can't put it in your
bootloader, or you can't put it in your new code.
You can't send it plain text through the world.
Or you need to encrypt your code,
in which case your bootloader...
But even if you encrypt your code,
you still have to decrypt your code,
which means you have to have a key somewhere.
Yeah.
The best way to do this is to have a crypto chip
that you never actually read the key
from and that does the hash calculation locally and then tells you if it was okay or not.
You had one of those in one of your bigger systems, didn't you?
I've never had one.
I have had to do encryption in my loaders, and in one system that was super space constrained,
we pretty much made our code plain text, or hex plain text.
And then we had a secret area that was encrypted, and it took special handling. If you had to update
the encryption keys, it was a much longer load process.
And that was just for authentication.
If you're going to actually encrypt your code, that's a whole other thing. I think you need
hardware support for that. In which case, you're
probably storing the key somewhere that can't be
read out. Well, I meant encrypt
as you're doing this loading process. Right.
Not encrypt
when you're running from it.
It's only during this load and
unload. I mean, if you're doing
that ping pong bootloader where you have
a loader and you have runtime code and you can go to either one and secretly there's a third bootstrap piece, then your loader and your code both could have encryption pieces in them.
I mean.
Sure, sure, sure.
But you don't want to ever have anything in a piece of something that can be, if somebody has physical access to it and a little bit of know-how, you don't want to have, like you were saying.
This is why they disable JTAG.
The family jewels available for reading, you know, for your company, just in plain text.
Yeah.
So be careful.
Security. When you draw your flowchart and you mark all the red bits where you can destroy your unit, you also need to mark the areas where you have exposed your family jewels.
Thank you for that.
That was a great image.
That image brought to you by Christopher White.
I didn't mean it that way.
I know you didn't, but I really enjoyed it.
So, some questions to ask yourself about your design.
How often will the new code be loaded and by whom?
Are they trusted people like technicians or unknowledgeable end users?
The way you organize your system really may depend on who's uploading your code.
And then, low-cost parts can make loading code safer.
So, you know, it's definitely cost-benefit analysis.
High-cost parts.
Not really. That external
adding a little piece.
Another piece that might not cost much.
It certainly costs more than zero.
Yes. Oh, you're definitely adding
cost to the system, but you may not be adding
a lot of cost.
It's a risk.
It's a cost-risk benefit
analysis
thingy, Bob.
Bob.
That's what economists call it.
What other pieces of the system might change?
We didn't really talk about, you know,
if you, the SD cards,
I mentioned that what if the SD card itself changes?
Well, that's kind of important,
but what if you have other things
that can change in your
system? Your communication mechanism gets updated someday. If that is important to your system,
then you have to deal with it at a bootloader level because that's where things may seriously
go wrong. And then always, where can your new code be corrupted and how can you figure it out?
What is the soonest possible area you can figure out your code's toast?
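One common answer to "how soon can you tell" is to check the staged image before anything on the device is erased. The sketch below uses a plain bitwise CRC-32 and a made-up two-field header, `image_header_t`; the header layout and the function names are assumptions, not something described in the episode, and a CRC only catches accidental corruption, not tampering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, the same CRC zlib uses). */
uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (crc >> 1) ^ 0xEDB88320u;
            } else {
                crc >>= 1;
            }
        }
    }
    return ~crc;
}

/* Hypothetical image header: length and expected CRC travel with the code. */
typedef struct {
    uint32_t length;
    uint32_t expected_crc;
} image_header_t;

/* Returns 1 if the staged image is safe to program, 0 if it is toast. */
int image_is_valid(const image_header_t *hdr, const uint8_t *image,
                   size_t staged_len)
{
    assert(hdr != NULL && image != NULL);
    if (hdr->length == 0 || hdr->length != staged_len) {
        return 0;               /* truncated or oversized transfer */
    }
    return crc32_compute(image, hdr->length) == hdr->expected_crc;
}
```

Running the same check again over the onboard flash after programming would catch a bad write as well, at the cost of a little more time in the update.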
Do you have another question?
What is the worst thing that can happen at each stage of your flowchart?
I was going to add, what happens when your code gets bigger than your scheme?
Then you go to a new scheme.
Right, but that can be difficult.
Well, and then the next chapter in the book, I think, is, yes, yes, chapter 8,
optimizing your code for space cycles or ROM.
So RAM, ROM, or cycles.
So, yeah, when it gets too big, you should optimize it.
My point was, you might have this great scheme, and you might have the right hardware for that
scheme. And then somebody says, we need this new feature. And it can fit in your system,
but it no longer maps well to your firmware update scheme.
That's a good point. Because with that off-board RAM version,
all of your onboard flash
is available for your algorithms and for running.
And so it's dangerous,
but you get 100% of that code space
for doing what you need to do on a normal daily basis. But the ping pong
loader method, where you always have essentially two images, three kind of, once you count the
bootstrap, means that you've got a bunch of wasted crap in there. And I mentioned
that I had done a loader in what seemed like an extremely small space.
You can do that, but then you are a little constrained.
Life gets tough when you have to deal with that.
And you have to decide in the beginning.
You can't usually switch midstream,
or you can't switch after
units have left the factory.
Even though that's guaranteed to happen every single time.
Something will
change outside your control every single time.
Well, actually, that wasn't true for LeapFrog.
That was one of the best things
about working for LeapFrog, was there was
no firmware update.
You released the code.
They masked the ROMs.
You prayed.
You prayed.
There were no bugs like that time when nobody noticed
that N said nibbling for nuts,
which finally somebody realized was not a good thing.
And so those ROMs went into the sea
or wherever ROMs go whenever somebody says,
yeah, let's make a new masked ROM.
But you could never update the firmware.
It was part of the system.
It was real ROM.
It was real ROM.
Not this other ROM we're talking about.
Not this fake flash ROM,
which isn't really read-only at all.
So what else?
Did I answer all of your questions about bootloading?
I mean, I have a better understanding of it than I did an hour ago.
That's good.
I hope to never have to do it.
You'll have to someday.
Nah, I'll let somebody else do it.
Ah, yeah.
No, I think...
Nobody likes that sort of thing.
I think it falls under one of those things that
people who have done it use as a benchmark for other people. It's kind of like, sometimes in
interviews, I'm trying to figure out what this person knows and what they don't know.
And so you ask them the go question.
Oh, God, I hate the go question.
No, no.
I ask them if they've ever started a system from scratch.
Have they ever done Hello World?
Have they installed the compilers by themselves?
Have they chosen all of this stuff on their own?
Chosen a processor?
Because there's a certain, I'm going to say innocence lost when you actually have to go and choose all this
stuff for yourself. And bootloaders are kind of like that. It's a coming of age
benchmark to me.
Wow.
Can we cue the sappy music? Do we have sappy music? Maybe we need sappy music.
No.
Where's your, didn't you get a new instrument for Christmas?
Yes. I'm going to play the ukulele on the podcast.
That's going to happen right now.
I doubt that.
I'd continue talking about this probably, you know,
for at least another half an hour, two or three hours.
But I'd want diagrams and a whiteboard.
Chris has a meeting to attend soon.
I guess I have to have somebody ask me about optimizing
graphics.
Hey, that's not in the book.
It's because you don't believe
graphics are a part of an embedded system.
No, no, they are.
Your narrow definition of an embedded system only includes
things that have bootloaders.
No, no.
Yeah, graphics.
No floating point.
Oh, floating point's for wimps.
It's IQ math.
It's so much better.
Never mind the cache.
Cash?
I like cash.
Are you going to give me cash?
Not that kind of cash.
I'm not going to give you cash.
You're not a spider.
I don't get it.
Sorry.
I'm going to need help.
You remember the big spider?
I put the quarter up for reference.
Oh, right.
And he said, does that make them go away?
Yeah, he sent me a picture of a gigantic spider in our kitchen
with a quarter in front of it,
and I asked if he was paying it to go away.
I think we've got off topic.
All right.
So if you have questions or comments, please hit the contact link on embedded.fm or email
show at embedded.fm.
I'd like to thank last week's guest, Alison Chaiken, for suggesting this topic.
I've heard some of you are interested in embedded systems vision projects.
And I have to say, that's kind of cool, but I can't spout any of that off the top of my head.
So be patient while I find a guest.
Or if you are that person and want to be a guest on the show, let me know.
Hit the contact link on embedded.fm.
Let's see, one final thought.
Oh, well, here you go. Get 40% off the print book of Making Embedded Systems and 50% off the ebook
version of this book by entering the discount code AUTHD at o'reilly.com. That is auth as in
author, d as in discount, all capital letters. Or if you're into this sort of thing, it's Alpha Uniform Tango Hotel Delta.
I think Plato said that first, right?
Oh, yeah.