Embedded - 114: Wild While Loops
Episode Date: August 19, 2015
Andrei Chichak rejoins us to discuss error handling.
Andrei's website says how to reach him or email embedded 'at' chichak.ca
Windows 10 "Something Happened" error
Hitchbot
Book Elecia mentioned: Ki...ndness of Strangers by Mike McIntyre
Elecia's book covers logging module in Creating a System Architecture (pp 21-25)
Robots and children
Transcript
Even though 350 million Americans say,
blame Canada, every time they hear me say it,
this is Embedded.fm.
Welcome to Embedded.fm,
the show for people who love building gadgets.
I'm Elecia White, here with Christopher White.
Andrei Chichak is back.
Apparently, we said something about error handling, and Andrei had thoughts to share.
Hey, Andrei.
Hey, how's it going?
It's hot. Damn hot.
Excellent. Really? Is this a show?
Well, as it turns out, Chris and Elle mentioned errors last time.
And I figured, oh, this has something to do with me being on the program last time.
And they want to fix that error by having me back.
In computing, we call that a retry.
Retransmit.
Yeah, I don't think having you on last
time was an error at all.
I mean, I think you were show 99
and yet you made the best of
which was voted on like the next
week. So, you know.
Well, that was me. Yeah, that was you.
Did anybody else vote
for 99 though? Oh, yeah.
Really? Oh, the best ofs got definitely more than one vote.
Most of them got more than four.
You're easy to please.
No, really.
Or house buying and selling wouldn't be so hard.
But we don't have a whole lot of time today, so we should get into the topic, which is error handling.
Because we did talk about it a little bit on the show with Christopher and I, and we sort of mumbled about you have to bubble the errors up to wherever the system is intelligent enough to handle it.
But you had a lot more advice and a lot more detail.
So what is the most important thing that you tell people about handling errors correctly?
Well, one of the things that I've figured out over the years is that when the hardware guys throw the system over the wall and they don't even give you an LED to work with, you can hardly do anything with error handling. So you're going to have to temper your expectations. You really should still do something to tell the puny humans that there is a problem with your sensor, such as going completely offline and giving out a bogus value.
But if you're given an LED, you get to flash that LED.
And then you sort of have to take a look at your users and say,
what am I going to inflict on my users?
The engineer around here says, well, can't you just put out an error code in Morse code?
Because everybody knows Morse code.
He's like, no, of course you can't.
That's just brutal.
No, we're embedded software engineers.
We know ASCII, not Morse code.
Oh, exactly. Well, he's a ham guy, so he figures that Morse code is very, very efficient.
It's like, no, you can have it come on when there's a problem, but then you can't even
tell if the thing is on.
So maybe you have it on when the system is on,
and you blink it when there's a problem.
But that assumes you have users who are looking at your system
and that your users are going to do something about your error.
I mean, if an LED is blinking on my router,
it's in a little...
Well, right now it's...
That's why I have to use the red LED.
Except I'm never going to look at it.
And if I do see it, the best you can hope for is that...
You're just never going to look at it because there's black tape over all the LEDs.
Well, yes. Welcome to my house.
And what do you do with people like me that are colorblind
that can't tell the difference between a green LED and an amber LED?
And that's 1/16 of the male population.
And do you expect your user to reset it or smash it or just let it blink? Because...
Yeah. So what good does it do to have an LED flash when things go bad unless you have something that somebody's looking at?
Yes, exactly. And if it's potted in black epoxy, you might as well delete that LED because nobody can see it anyways.
Well, delete it on the production units.
Yeah.
On the ones on my desk, you can leave the LED, although I'll turn it into a debug port.
Such as on my desk here, I've got a pitch sensor for a hydraulically controlled fan. And it has power going in and
ground and one to three volts coming out. It's real hard to do anything in the way of error
detection with that. But as it turns out, well, if it's zero volts, that means it's probably unplugged.
If you got five volts, that's illegal and that's shorted.
So you don't want to use that.
The normal range of the thing is one to three volts.
So if you put the output, if you get an error and it's fatal, put the output at half a
volt. And then people will start bitching and complaining that the fan isn't working properly.
But at least you can take a look at the sensor and say, half a volt, that's not right. This thing's
busted. That's one of the good parts about error handling, is that you have to be able, the question
is, who are you sending your errors to?
Yes.
And in that case, you're sending them to yourself.
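A minimal sketch of the half-volt idea in C. The 1 to 3 volt range and the 0.5 volt fault level come from the conversation; the function names, the 0..1000 pitch scale, and the classification thresholds are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Levels from the discussion: 1-3 V is normal, 0.5 V means "sensor is busted". */
#define VOUT_MIN_MV    1000
#define VOUT_MAX_MV    3000
#define VOUT_FAULT_MV   500

/* Sensor side (hypothetical): map a pitch reading on an assumed 0..1000 scale
 * to millivolts, or force the fault level when a fatal error was detected. */
int32_t sensor_output_mv(int32_t pitch_permille, bool fatal_error)
{
    if (fatal_error) {
        return VOUT_FAULT_MV;
    }
    if (pitch_permille < 0)    pitch_permille = 0;
    if (pitch_permille > 1000) pitch_permille = 1000;
    return VOUT_MIN_MV + (pitch_permille * (VOUT_MAX_MV - VOUT_MIN_MV)) / 1000;
}

typedef enum {
    SIG_OK,
    SIG_UNPLUGGED,     /* about 0 V */
    SIG_SENSOR_FAULT,  /* about 0.5 V: the sensor reported a fatal error */
    SIG_SHORTED,       /* about 5 V */
    SIG_OUT_OF_RANGE
} signal_status_t;

/* Reader side (hypothetical): the window edges below are assumed values. */
signal_status_t classify_signal_mv(int32_t mv)
{
    if (mv < 250)  return SIG_UNPLUGGED;
    if (mv < 750)  return SIG_SENSOR_FAULT;
    if (mv >= VOUT_MIN_MV && mv <= VOUT_MAX_MV) return SIG_OK;
    if (mv > 4500) return SIG_SHORTED;
    return SIG_OUT_OF_RANGE;
}
```

The point of the design is that every illegal voltage means something different to whoever puts a meter on the wire, even though the sensor has no LED and no serial port.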
In a lot of cases, the goal is not to tell the user what went wrong.
The goal is to tell the user something went wrong, and then to tell the engineer what went wrong.
Or not even to necessarily tell the user. There's certain cases where you
detect an error and you don't want it to bubble up all the way.
Exactly. But then the question is
how to do, I mean, do you reset? I've seen a lot of systems that when something goes wrong,
you just reset.
Well, the first thing you should try is to retry.
Retry.
And if you've got an error that's persistent, then get into what can we do about it?
It's hard in a system, an embedded system,
where you may have many asynchronous things going on
to know how and when and who controls the retry.
I mean, if you think about I2C,
I2C is something that you do sometimes need to retry,
even if the system is working perfectly.
But do you do it at the I2C transaction layer?
Do you do it at the middle layer
where you have a driver for the device? Do you do it
at the high level that's talking to the device driver?
Do you flash an LED
to the user so that they know that
I2C has failed?
Probably not that last one.
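One possible answer to the "which layer retries" question, sketched in C: the device driver retries a bounded number of times and reports whether a retry was needed, so the layer above can log it without bothering the user. The transfer function type, the result enum, and the fake bus are all hypothetical; a real I2C HAL will have its own signatures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical low-level transfer: returns true when the transaction was ACKed. */
typedef bool (*i2c_xfer_fn)(uint8_t addr, uint8_t *buf, size_t len);

typedef enum { DEV_OK, DEV_OK_AFTER_RETRY, DEV_FAILED } dev_result_t;

/* Driver layer: retry a bounded number of times, then give up and bubble it up. */
dev_result_t dev_read_with_retry(i2c_xfer_fn xfer, uint8_t addr,
                                 uint8_t *buf, size_t len, unsigned max_attempts)
{
    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        if (xfer(addr, buf, len)) {
            return (attempt == 1) ? DEV_OK : DEV_OK_AFTER_RETRY;
        }
    }
    return DEV_FAILED;  /* persistent error: the caller decides what to do next */
}

/* Fake bus for demonstration only: fails a configurable number of times. */
unsigned fake_failures_left;

bool fake_xfer(uint8_t addr, uint8_t *buf, size_t len)
{
    (void)addr;  /* unused in the fake */
    if (fake_failures_left > 0) {
        fake_failures_left--;
        return false;
    }
    for (size_t i = 0; i < len; i++) {
        buf[i] = 0xAB;  /* pretend data */
    }
    return true;
}
```

Returning DEV_OK_AFTER_RETRY separately from DEV_OK is a design choice: a bus that works only on the second try is worth a log entry even though nothing failed from the user's point of view.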
Do you wait until the thing got into production to put in
error detection?
Ah, yes. Not me.
Good. When do you start putting in the error detection stuff?
Immediately. I think the error detection stuff probably is more likely to be ported from another system than anything else. And so for me, I want the error detection stuff before the system comes up.
Oh, Elle, you're so like, that's very first world. A lot of the systems I work on, it's you get hired, and you work on this system, and it's their first one. And then the whole company goes into limbo for a while, you go work somewhere else. And you start again and you start again and you never get to version two.
So what I do is when I start a project, you sort of put in the basic looping structure that you figure would be appropriate and some sort of an error handling routine, and you start using it immediately, whether it does anything or not. Like you could just say, error handler, I just rebooted. And the error handler might do absolutely nothing because you don't know what your need is yet, but at least there's something in place. So if there is a problem, you can flesh out your error handler a little bit, put in some printfs to begin with. So when it says it rebooted, out on your serial port comes a rebooted message. Or you find out that you're getting a reboot message every 16 seconds, and because the system boots so quickly, you've never noticed that before. So suddenly you're helping yourself out. Then after a while you can just start throwing all sorts of stuff out there and reduce your errors as you go along.
The first thing I often do on projects is either ask for or, if the project is completely
unstarted, implement a logging system. And that's sort of different from error handling,
because you can log things that are not errors.
Yeah. But if a project doesn't have a log
system of some kind to start with, I get very nervous. Because it means that you get far enough
along and you don't actually understand anything that's happening.
And that goes for error handling and for just normal operations.
Oh, this is the way the users use this device.
Well, it would be nice to know that.
Black boxes are scary.
I really prefer that my system tells me what it's doing.
And there's always the downside to logging,
which is that it takes time and it changes the system. And so
you may end up with something different in production than you do in debugging.
And you just have to be aware and careful. And I do often have levels of debug, of logging,
you know, there's the info and the warning and the error, and the errors always get printed out,
even in production. That sort of thing.
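A small sketch of the info, warning, and error levels just described, in C. The names log_msg, log_set_threshold, and log_level_t are invented for this example, and printf stands in for whatever output the system really has. The threshold is clamped so that errors can never be filtered out, matching "the errors always get printed out."

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

typedef enum { LOG_INFO = 0, LOG_WARNING = 1, LOG_ERROR = 2 } log_level_t;

/* Messages below this level are dropped. Can be changed at run time. */
static log_level_t log_threshold = LOG_INFO;

void log_set_threshold(log_level_t level)
{
    /* Clamp so that LOG_ERROR messages always get through. */
    log_threshold = (level > LOG_ERROR) ? LOG_ERROR : level;
}

/* Returns true if the message was emitted, false if it was filtered out. */
bool log_msg(log_level_t level, const char *fmt, ...)
{
    static const char *const names[] = { "INFO", "WARN", "ERROR" };
    va_list args;

    if (level > LOG_ERROR) {
        level = LOG_ERROR;  /* treat unknown levels as errors */
    }
    if (level < log_threshold) {
        return false;
    }
    printf("[%s] ", names[level]);
    va_start(args, fmt);
    vprintf(fmt, args);
    va_end(args);
    printf("\n");
    return true;
}
```

A production build might call log_set_threshold(LOG_ERROR) at boot and a debug session might set it back to LOG_INFO, with no recompile in between.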
But the different ways that you can output can change over a system's life too.
You start out with a debug print, a serial line,
because that's easy for developers.
And then you end up with a tiny area of Flash
that has a ring buffer of errors.
And when it's serial debug, it's nice strings.
But in the flash, maybe it's just codes.
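A rough sketch of a ring buffer of error codes, in C. This version lives in RAM so it can run anywhere; a real flash-backed version would also have to deal with erase blocks and write granularity, which are left out here. The slot count and all names are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

#define ERR_RING_SLOTS 8u  /* assumed size; real flash areas are sized by sector */

typedef struct {
    uint16_t codes[ERR_RING_SLOTS];
    uint32_t total;  /* number of codes ever recorded */
} err_ring_t;

/* Record a code, overwriting the oldest entry once the ring is full. */
void err_ring_put(err_ring_t *r, uint16_t code)
{
    r->codes[r->total % ERR_RING_SLOTS] = code;
    r->total++;
}

/* Copy the surviving codes, oldest first, into out. Returns how many were copied. */
size_t err_ring_dump(const err_ring_t *r, uint16_t *out, size_t out_len)
{
    uint32_t count = (r->total < ERR_RING_SLOTS) ? r->total : ERR_RING_SLOTS;
    uint32_t first = r->total - count;
    size_t n = 0;

    while (n < count && n < out_len) {
        out[n] = r->codes[(first + n) % ERR_RING_SLOTS];
        n++;
    }
    return n;
}
```

Storing bare codes instead of strings keeps each entry at two bytes, at the cost of needing a lookup table on the bench when the buffer is read back.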
And with a lot of it, you won't even use this stuff until there is a problem.
And if it's one of these, you know, some piece of the, like if the processor goes dead, you can still suck the stuff out of the flash chip.
Yeah.
And sort of do post-mortem.
Back in the bad old days, we used to have a core dump, and you'd go sifting through the entrails of the core dump to see what the registers were doing and what's in memory. And now it's sort of, you know, you can't really do that
except you've got entrails left in flash.
Oh, you can still do that in some places.
There are some Cortex-M3s that I've seen some great core dumps on.
There are some router companies that still subsist on getting core dumps
from their devices from the field and sifting through them.
Yeah, got all the way there, huh?
Yeah. Actually, we ran into this one thing recently where one of our customers, this is kind of tangential, one of their devices spits out a packet of data. It was once a second, and I figured, well, this was going to an SD card, and I decided, well, if I do it once a second, it's just going to hammer on this SD card and it's eventually going to die. So I left the sync, which synchronizes the directory structure with the SD card, and only did it on every eighth write.
After two and a half years, they gave me a call and said there's something wrong with this device out in the field, and it ended up that we had worn out an SD card. So we had to sort of figure out, okay, calculate, calculate.
Yeah, that's about a million writes.
And this was a four gig card and it wasn't full.
But the way that the FAT file system works is every time you update a file, it hammers on the directory structure
at the beginning of the card.
And on a SD card, just like any other flash,
it reads out the first sector,
erases it, and writes it back.
And it had gotten a million writes,
and that portion of the card was now unwriteable, much like what you get in SSDs.
Yeah, well, but SSDs have wear leveling systems, and they're a little bit more sophisticated than the old days.
Yeah, and SD cards, they've got a little ARM processor in them, but they don't do wear leveling.
And you can wear one out really quickly just by doing logging.
Yeah, that is something to be aware of.
You and me got to get together and we'll invent a logging file system for
embedded systems.
I've already done that at two companies.
Oh, great. Code reuse.
Okay.
Back to errors. Let's see.
We talked about logging
systems with flexible levels and outputs.
Do you do
functions for your logging and error
handling, or do you do
pound defines?
I usually use, well, I start with functions, or tasks if you're running an RTOS. You just create yourself an error logger task.
I guess I should ask, do you mean error logging or error detection?
The goal is error handling.
Okay.
That is definitely a combination.
You can't log it unless you detect it, and you can't handle it unless you detect it.
So I guess detection must come first.
Okay. So if you go back to, let's go back to 99, where a part of the MISRA rules, which I'm not going to harp on, but one of their rules is always check your return codes.
That's a good thing because, you know, some system subroutine is trying to tell you something and you really should listen to what it's trying
to tell you. One of the things that you could do if you really determine that your return code
is trivial and it can't possibly go wrong is call the function with a bracket void bracket in front of it. And it will throw away the stuff off of the stack,
but you've left yourself a little thing,
little reminder that says,
I know that I'm throwing away the result.
And I've looked at this and it's okay.
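What the "bracket void bracket" looks like in C, as a small sketch. In standard C the cast mainly documents intent and quiets compiler or static-analysis warnings about an unused result. The storage_write function and the failure counter are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stdio.h>

unsigned failures_seen;  /* hypothetical counter an error logger might keep */

/* Hypothetical driver call: pretend that negative values cannot be stored. */
bool storage_write(int value)
{
    return value >= 0;
}

void save_reading(int value)
{
    /* Reviewed: this is debug output only, so the result is deliberately ignored. */
    (void)printf("saving %d\n", value);

    /* This result matters, so it is checked. */
    if (!storage_write(value)) {
        failures_seen++;
    }
}
```

The cast plus the comment is the "little reminder" for the next reader: the return value was looked at and thrown away on purpose.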
After that, so you've now detected an error.
I've seen that in code, and until you said that,
I had no idea why they had made that function so ugly.
Something like printf will return you the number of bytes
that actually got printed out,
and you should check it against the size of your,
like how many bytes that you're expecting to be printing out.
But when was the last time
you ever checked the return code of printf?
And if you put that in
code and I code reviewed it, I'd be like,
oh, this is dumb.
Especially for debug printf, because it's like,
okay...
You're already wasting enough code space
using printf.
Well, one of the things that I found
in the past is, depending on the
compiler, if you call a function
that returns something, it'll
stick it on the stack.
But the caller isn't
expecting anything to be on the stack,
because you didn't say
foo is equal to
f of x.
I've had some problems. I can't remember
which compiler it was.
But by not pulling stuff
off of the stack
it corrupted stuff
that's a terrible
compiler
we should call it out by name
but maybe we won't
it was a long time ago
I don't remember
I think it might have had something to do with Microchip.
Okay.
So what I do is I either call a function, or if I have an RTOS, I can post a message to the error logger.
That way everything happens asynchronously, and I don't have to wait for
the error logger to handle it.
But that is assuming you detected the error.
Yes, absolutely.
And we probably detected the error
using a return code.
And part of me wants to say you have
to look at return codes. And the other part of me
is, if you look at return codes with
printf, I am just going to say
your code is ugly.
So I don't know where to draw the line.
Some return codes are informational, and some return codes
are, this worked.
So I think if it's an informational
return code, which is like, I think
on the order of what printf is doing,
or some other functions,
then you don't need to be
rigorous about checking them. But if it's a Boolean that says I did it or I didn't do it,
or if it's a pointer that says I allocated this or I got nothing,
then you have to check it,
especially if there are cascaded consequences down the line.
Because if a function...
Like failing to malloc something.
Right.
What are you doing using malloc?
I'm not.
Don't you listen?
But I don't always get to choose.
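A sketch of the "pointer that says I allocated this or I got nothing" case, in C. Instead of malloc it uses a tiny fixed pool, which is an assumption chosen so the out-of-space path is easy to hit; the pool size and every name here are made up.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POOL_CHUNKS 4   /* assumed pool size */
#define CHUNK_BYTES 32

static uint8_t pool[POOL_CHUNKS][CHUNK_BYTES];
static bool in_use[POOL_CHUNKS];

/* Returns a free chunk, or NULL when the pool is out of space. */
void *chunk_alloc(void)
{
    for (size_t i = 0; i < POOL_CHUNKS; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            return pool[i];
        }
    }
    return NULL;
}

void chunk_free(void *p)
{
    for (size_t i = 0; i < POOL_CHUNKS; i++) {
        if (p == pool[i]) {
            in_use[i] = false;
        }
    }
}

/* The caller checks the pointer before using it and turns "got nothing"
 * into a failure result instead of writing through a null pointer. */
bool queue_message(const char *text)
{
    char *chunk = chunk_alloc();
    if (chunk == NULL) {
        return false;
    }
    strncpy(chunk, text, CHUNK_BYTES - 1);
    chunk[CHUNK_BYTES - 1] = '\0';
    return true;  /* in this sketch the chunk stays owned by the queue */
}
```

The unchecked version of queue_message would work fine until the fifth message, which is the kind of cascaded consequence that only shows up in the field.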
Well, there are varying degrees of malloc, too. You could have a chunk allocator, and it could say, I'm out of space. There are uses for dynamic memory allocation of various kinds.
I know you people. I deny your reality and replace it with my own.
Exactly.
Yeah, this is religious war stuff.
Okay, segue. So yeah, and I actually, I read this really cool book once called Making Embedded Systems by Elecia White.
Never heard of it.
And it had a nice little section in there about error logging and stuff like that, as well as putting in levels and turning them on and off and stuff like that.
And I thought, ooh, this is really cool, and I actually built one.
She's looking at the ceiling like she has no recollection of this book.
No, no, but that chapter was all about how to design an API.
The error logging was sort of the example I was giving,
and I was happy with it as an example,
but as far as an error logging, it wasn't okay, but it was all about the API.
And so I'm amused that you got the wrong thing out of that chapter.
It's okay.
But that is okay.
I got an error logger that I can say, no, shut up, it's fine. See, now I can put into my configuration manager
the default level of error logging
when the system gets booted.
And when I push it into production,
I just set the flag to shut up,
and there is no error logging.
But when I need to debug the thing,
I can just crank it up,
and I don't have to recompile.
And that's awesome.
Yeah.
Except you waste all that code space for all those logs.
You do waste all that code space.
Just remember, a lot of the processors these days have a lot of flash,
but not very much RAM.
So you can stick in a whole bunch of code,
but you do have to watch your variables quite carefully.
If only.
And, well, even with not having enough code space,
if you replace your strings with indicators,
then those are a byte long and you still get the information.
Right.
And you still can turn it on and off.
Of course, you probably are then writing a Python script to interpret it,
but that's not impossible.
That's all very solvable and you still get something out.
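A sketch of replacing strings with one-byte indicators, in C. The event IDs and names are invented for illustration. The decoder is shown in C to keep the example in one language; the conversation suggests a script on the host would do this job.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event IDs. Each one costs a single byte in the log. */
typedef enum {
    EVT_BOOT         = 1,
    EVT_I2C_RETRY    = 2,
    EVT_SENSOR_FAULT = 3
} event_id_t;

/* Device side: append one byte to the log buffer. Returns the new length. */
size_t log_event(uint8_t *buf, size_t used, size_t capacity, event_id_t id)
{
    if (used < capacity) {
        buf[used++] = (uint8_t)id;
    }
    return used;
}

/* Decoder side: turn the byte back into text for a human. */
const char *event_name(uint8_t id)
{
    switch (id) {
    case EVT_BOOT:         return "boot";
    case EVT_I2C_RETRY:    return "I2C retry";
    case EVT_SENSOR_FAULT: return "sensor fault";
    default:               return "unknown event";
    }
}
```

The strings now live only in the decoder, so the device keeps the information without paying for the text in code space.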
And you need something for the interns to do anyway.
Exactly.
Can I tell a funny story about the book?
Okay.
You guys are living in the first world.
Interns, interns, I'd love to have some help.
Okay, go ahead.
Well, actually, I recently took a contract where I get to have an intern for a little
while and I'm so enjoying it.
I'd totally forgotten how fun it is to mentor somebody.
You call him minion?
I have called her minion,
but not to her face and hopefully she doesn't listen to this.
Sorry.
I made an assumption about him,
her.
I thanked the person who led me to the contract because I was excited to have
a minion.
And,
and the person I was talking to was, she said, that's why I suggested you because I knew you'd love to have one.
I started to tell a funny story about the book in the house.
And then Christopher reminded me that we are not finished selling or buying.
So we'll tell that story later and move along with error handling.
What was next on our list?
Check your return codes.
We talked about that.
Except when you don't have to.
Except when you don't have to, which how are you going to tell?
And Andrei says void and I say blech.
Well, you can even put it in as a comment.
Like, yes, I know that this can screw up,
but I guarantee you it's not going to.
On every printf?
No.
I'm going to write a new function called printf underscore
without return code.
Leave it alone.
That's really long.
How about if we go on to empathy for the user?
Empathy for the user, empathy for the future programmer.
I feel like we've talked about this.
Yeah. Well, no, this is more along the lines of,
assume for the moment that you've got a larger embedded system
and somebody was nice enough to give you an LCD screen.
And, okay, go.
And now you're going to put,
would you like to abort, retry, or fail on the LCD screen?
Because really, why would you put anything useful?
I mean, we don't like our users, do we?
It seems not.
What is the difference between abort and fail?
I ended up looking that up.
And abort, fail, retry, of course, everybody knows that.
That's from DOS.
Which we should all be aspiring to.
Yes, absolutely.
Everybody should know about this.
That was introduced, let's see, that was introduced in 1981. They dropped support for it in 2001. And they documented what the difference is between abort and fail in 2006.
And, well, assume for the moment that your user is a nurse. What does abort mean to a nurse? And it's like, you really should take a look at who the real users are of a system rather than, I'm going to write this for me because I
don't know anything about the users. Figure out what their lexicon is and what their abilities
are. I mean, with something like abort, retry, and fail, wouldn't it have been much more useful
back in 1981 to say, hey, you don't have a floppy disk in your drive.
Can you please put one in and lock the door rather than zit, zit, zit, abort, retry, fail?
Well, but even the, could you please put in a floppy drive, you still needed a button
that said, no, I didn't mean to do that.
Please stop.
Yes.
I mean, that's the abort.
The retry is, okay, I've put in my floppy drive.
Now try again.
I don't know what fail is, actually.
That's sort of embarrassing.
What is fail, Andrei?
Apparently, one of them will, let's see, fail returns a return code and abort doesn't. So if this is in a batch script,
you can get a return code from the open and then you can deal with it, whereas abort doesn't. So even back in 1981, they had return codes all over the place.
And I suspect those weren't checked either.
No.
Nothing has changed, by the way.
I don't know if you've seen some of the images
from people trying to install Windows 10,
where occasionally a giant dialog box comes up
with the title in all
lowercase, something happened,
and then the contents of the dialog box
says, something happened,
and then a hex code.
So, we haven't really improved much.
No. Something happened.
What would that be, a correct operation?
I think it has okay as the only option.
Oh, God.
I've got a wall oven in my kitchen, and somewhere in there, of course, this is going to be used by people that cook.
And they have an LCD display on it and periodically something screws up and it figures
out that the door is locked. The door gets locked when you're doing a self-clean cycle. But if you're
at, you know, if you're cooking potatoes, the door isn't supposed to be locked. But somewhere along the lines,
they get a comm error between the door and the main CPU, and the door is locked.
So instead of doing a retry to see if it's a persistent thing, they put up a little message on the display saying that there's a comm error and has a little hex code.
And if you look it up on the interwebs, you have a choice.
You can either, you can't fix it by turning off the oven.
You actually have to go down and flip off the breaker and flip it back on again.
Or you can call in a technician.
How is this useful?
Why don't you just fix the problem?
Yeah, that stuff just amazes me,
how it gets out of companies.
And it's just, basically,
it's totally a sign of laziness and lack of care.
Because there's no other explanation.
I mean, you got to the point
that you wrote the error handler
that put the com thing up on the screen.
So you knew that the scenario was possible and meaningful.
So it's not like you just ignored something and didn't handle it,
which I can kind of forgive a little bit better.
But if you knew something was a possibility
and you went ahead and wrote a terrible user interface error handling system
in spite of that, that's unforgivable.
Well, okay, I'm not saying it is the correct thing to do.
I'm in the camp of you should have fixed it.
You should have made it a higher priority
than the little dancing ball that went around.
But I can see as an engineer how you're in a meeting
and you say, I need another week to be able to handle this error gracefully. And your boss says, you don't have another week. We have to ship this. And how many people is it going to affect? And so you're sitting there thinking about the improbabilities of getting to this error. Well, maybe one out of every 5,000 times they use their oven, and the boss hears one out of 5,000,
and he's like, that's never.
Until you start thinking about,
we're selling 100,000 of these units,
so they're going to happen actually pretty often.
Yeah.
Well, I wasn't saying it was unforgivable
on the engineer's part, necessarily.
It's unforgivable as a product decision.
Except QA never found it.
The only person who ever knows it happened was the engineer.
They need a better QA department.
Yeah.
You definitely needed a better user experience
department as well.
Someone willing to fight for,
no, our users are going to have a good experience
even if it's a little late.
I'm willing to trade off the time to market
for greatness.
But something like the verbiage of your error message, that can be sent through a real human
being rather than an engineer while the development is happening.
I'm sure that engineers know somebody that doesn't know how a car works, that they can just sort of say,
can you take a look at this and see if it makes any sense to you
while you're still writing the SPI driver routines?
Yeah.
Or just think, could my mom navigate her way through this thing?
It's something that needs to be done at the beginning of a product, is understanding
your audience and making sure that everything that's presented to your audience
makes sense to them, or potentially makes sense to them. And I think
like Elecia's saying, getting to the end of the project and, oh my god, I don't have
time to do this. Well, that says you didn't plan for it up front. You didn't say
who's buying this and what are they going to be looking
at. Presenting a chef
with abort, retry, fail is
not probably what you want to do.
Yeah, there's a reason that they didn't
go into computing.
Exactly. And that is
something we need to remember about our users,
is there is a good reason they didn't go into
computing. Because it's terrible and it's horrible and
you can never get out.
So you're enjoying your job.
What?
No, he gets well paid.
Well, for the most part we enjoy it.
There are moments of enjoyment separated by madness.
Terror, terror.
I think the company picnic was scarier than anything else you've had to do lately.
That's a personal problem.
Well, that's just dealing with real humans.
Really nice real humans.
Okay.
But I can't believe I got sunburned.
It's just ridiculous.
Wait, there was a show here.
Sorry.
No, it's not you.
Let's see. One of the things you had in your email was error logger calls in all default cases of switch statements.
Oh, yeah. That's just obvious.
Oh, I like that.
Yeah, I can see that.
That seems like very good tactical advice.
Well, why would you get a, like with a bunch of stuff, of course, you have a bunch of cases that you just don't want to deal with.
What you can do is group those into one case and leave the default as an error catcher.
Ah, group them as one case or fall through multiple cases?
Fall through multiple cases, all doing exactly the same thing.
The thing that
you might not have picked up
is if
somebody corrupts your stack,
you can get
a switch
value that's out of range.
Okay.
Even though I didn't
generate it,
a fault in the system did and you can catch something like a stack overflow
because you fall into some weird case
that's absolutely impossible
Yes, and I find myself every time I write a default case
thinking that and not actually putting a logger in there.
But now I think I'm going to.
Because I always feel funny putting the default case in
because, well, why would I ever get here?
Yeah.
Because it's an error.
So let's handle that.
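A sketch of the switch advice in C: the cases you do not care about are grouped by falling through to one shared body, and default is kept purely as an error catcher. The fan modes, duty values, and error_log function are hypothetical.

```c
#include <stdio.h>

typedef enum { FAN_OFF, FAN_LOW, FAN_HIGH, FAN_TEST_A, FAN_TEST_B } fan_mode_t;

unsigned unexpected_cases;  /* stand-in for a real error logger's state */

void error_log(const char *file, int line, int value)
{
    unexpected_cases++;
    printf("unhandled condition %d at %s:%d\n", value, file, line);
}

/* Takes an int rather than fan_mode_t so a corrupted value can be shown. */
int fan_duty_percent(int mode)
{
    switch (mode) {
    case FAN_OFF:
        return 0;
    case FAN_LOW:
        return 40;
    case FAN_HIGH:
        return 100;
    case FAN_TEST_A:  /* deliberately grouped: these fall through together */
    case FAN_TEST_B:
        return 0;
    default:
        /* "Impossible" value: memory corruption or a new mode nobody handled. */
        error_log(__FILE__, __LINE__, mode);
        return 0;  /* fail to a safe output */
    }
}
```

With every legitimate value listed explicitly, anything that reaches default is by definition a fault, which is what makes it useful for catching stack corruption.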
They should really call default something else.
Error?
Well, one of the things about embedded systems
is that crap happens.
You can get flash memory
that loses values and stuff like that.
And you do have to sometimes retry.
I mean, if you're retrying calculations
because you don't trust it
and you want to make sure that
your PID algorithm generates the same result
twice in a row,
that would probably be an indication that your system is busted.
But SPI has wires that leave the chip, and you can get noise,
and you can have diodes that heat up and voltages start to rise
and buttons end up getting pushed that don't actually
get pushed.
And if you push on a push button, it doesn't go on.
It goes on, off, on, off, on, off, on, off, on, off, on, off, on, off, on.
So you have to poll buttons before they actually settle down for like 5, 10 milliseconds.
That's called debouncing, if anybody's wondering how to handle that problem.
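One common way to debounce in software, sketched in C: poll the pin at a fixed rate and only accept a new state after several identical samples in a row. The sample count, the 2 ms polling period it implies, and the names are assumptions, not anything from the show.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: polled every 2 ms, so 5 matching samples is roughly 10 ms of quiet. */
#define DEBOUNCE_SAMPLES 5u

typedef struct {
    bool    stable;    /* last accepted (debounced) state */
    bool    last_raw;  /* most recent raw sample */
    uint8_t count;     /* how many times in a row last_raw has been seen */
} debounce_t;

/* Call once per polling period with the raw pin level.
 * Returns the debounced state. */
bool debounce_update(debounce_t *d, bool raw)
{
    if (raw != d->last_raw) {
        d->last_raw = raw;  /* the input moved: start counting again */
        d->count = 1;
    } else if (d->count < DEBOUNCE_SAMPLES) {
        d->count++;
    }
    if (d->count >= DEBOUNCE_SAMPLES) {
        d->stable = raw;
    }
    return d->stable;
}
```

During the on, off, on, off chatter the counter keeps restarting, so the reported state only changes once the contact has actually settled.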
Get the hardware guys to do it.
Oh, not this argument again.
Not this argument again.
Yeah, I tried that one, and the hardware guy said, do it in software.
Yeah.
I saw on the Atmel ARM processors, they have debounce on all the pins.
And you can turn it off and on.
And I thought, oh, I want to use one of those processors.
Apparently, you want to use it as part of your Darth Vader mask.
I love radio. This default is always an error.
Helping with stack overflow is kind of a new concept to me,
but I could get behind that.
Well, if you've never heard of it before,
then you say, why should I put in a default?
That can't possibly ever happen until somebody mentions stack overflows.
I always put in a default, but it usually looks like default break.
Yeah.
But we should all switch to default log error break.
Unhandled condition in this file.
I'm going to pound define default to oops.
Well, you can do a default assert one,
or sorry, default assert zero, which will always fail.
Right.
Okay, that actually brings up asserts. When do you decide to do asserts versus errors?
One of the things that I've found about asserts is they halt your program.
So with a call to an error logger, you could continue, but most of the asserts that I've, like the default assert macros...
Have we explained what assert does yet?
Probably not.
Okay. Assert is implemented as a macro in C, and the basic action is you give assert,
and the parameter is
some sort of a conditional.
True or false?
Well, something like
assert return code equals success
in this case.
Equal, equal success.
Equals, equals. You're good.
Wow.
I have a compiler going all the time.
Sorry.
You need to get together with some friends.
Okay.
So what the assert macro will do is say, if condition is not true, printf, and then it will typically say printf assert failed, and
then it will give a textual representation of your conditional, and then it will typically
give you the file name and the line number. And if you don't know how the preprocessor gives you file names and
line numbers, expand Assert. You can learn a whole bunch about the C preprocessor and it's
really cool. And then what it typically does is it goes into an infinite loop or halts the
processor or something like that.
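A stripped-down version of what expanding an assert macro tends to reveal, in C. This is not the standard library's definition; MY_ASSERT and assert_failed are made-up names. The #cond operator turns the condition into text, and __FILE__ and __LINE__ are filled in by the preprocessor at the place the macro is used.

```c
#include <stdio.h>

unsigned assert_failures;  /* kept so the sketch can return instead of halting */

/* Called when a condition is false. Real firmware would typically halt,
 * spin in a loop, or reset here; this sketch just reports and counts. */
void assert_failed(const char *expr, const char *file, int line)
{
    printf("assert failed: %s, file %s, line %d\n", expr, file, line);
    assert_failures++;
}

/* #cond gives the textual form of the condition. */
#define MY_ASSERT(cond)                                   \
    do {                                                  \
        if (!(cond)) {                                    \
            assert_failed(#cond, __FILE__, __LINE__);     \
        }                                                 \
    } while (0)
```

Because the reporting is funneled through one function, that function is also the natural place to write to a serial port or an error logger instead of printf.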
So that's usually in a system like if you were running a C program on a computer or BeagleBone or Raspberry Pi or something that is reasonably smart and has a way to output to users.
Yeah, you need a serial port.
You can write your own assert to go over a serial port.
You can write your own assert to write to your error logging system.
I often modify my asserts to have an assembly breakpoint.
Because you can only have a couple of breakpoints for most of these debuggers.
Most of these processors really are what limit you to two or three debug,
two or three breakpoints.
Breakpoints.
But you can actually write an assembly instruction that is breakpoint.
And then in your production code, you let it go ahead and do the error,
which will then reset.
Or instead, when you have your debugger on,
it just sits there and waits for you to actually check your stack.
So here's a little wrinkle. This all sounds wonderful until you turn on optimizations.
And then the compiler decides, well, you've got all these asserts all over the place,
and they basically do the same thing. So we're going to collapse them all to one.
And so every time you have an assert in this file, it actually goes some random place that has nothing to do with where you actually were.
That's usually when you're in the debugger.
It's to log the right stuff, but
that happens to me all the time.
Oh, you can't walk through the asserts. They still
function correctly. No, the backtrace is
in the wrong place, too.
Oh, your assert's a function and not
a pound define.
I think it might be, yes. It leads down to a function.
Yes. So it's doing all kinds of inlining and weirdness.
Yeah, it's unfortunate.
So be aware of stuff like that. And with inserting breakpoints, check your hardware
manual, because on something like a 68000, a breakpoint that's hooked up to
BDM will break
but if you're not hooked
up to BDM
then it's basically a no-op
and your program will just continue
Is BDM Canadian for JTAG?
No, it's
Motorola for JTAG
Sorry
A.
Sorry.
At a higher level, when I first learned about asserts,
I was always told that you put in asserts that halt the system during development.
And did we say that asserts always reset the system?
I'm getting there.
Okay, sorry.
I think you said that it depends on the implementation because you rewrite assert.
Right.
So I learned that you put asserts everywhere
to check any condition that could be weird
during development.
And then when you make production code,
you turn on a flag and it turns them all
either into logs or no ops.
I don't like that.
Well, in Canada,
we call that fail early,
fail often.
Because if you run into an error,
your program's going to fail
and halt and
give you an assert and you have to go fix it
before you can continue.
And for development, that's awesome.
Once you get into production,
what Chris is talking about right now, listeners,
is what you can do is put in a pound define
that redefines assert to be blank.
And it all gets compiled out and it doesn't happen
and it doesn't take up any code space and
it doesn't take up any variables. And it's awesome.
Well, except it changes your timing and it changes
your code map. And if you are mixed up about your error handling policies, you just took out all
your error detection. Yes. And if you have an error with your system somewhere down the road
and you plug in your serial port, you still get nothing.
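For reference, the standard `assert` from `<assert.h>` expands to nothing when `NDEBUG` is defined, which is the "compiled out" behavior being described. One possible alternative, sketched below, keeps the check in production but changes what happens on failure. The `CHECK` macro, the `DEVELOPMENT_BUILD` flag, and `log_check` are assumptions for this example, not anything named in the conversation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h>

/* Number of failed checks logged so far. */
static unsigned logged_failures = 0;

/* Hypothetical logger for failed checks. */
static void log_check(const char *expr, const char *file, int line)
{
    logged_failures++;
    fprintf(stderr, "CHECK failed: %s (%s:%d)\n", expr, file, line);
}

#ifdef DEVELOPMENT_BUILD
/* Development: log, then stop so the problem gets fixed right away. */
#define CHECK(cond) \
    do { if (!(cond)) { log_check(#cond, __FILE__, __LINE__); abort(); } } while (0)
#else
/* Production: still detect and log the problem, but keep running. */
#define CHECK(cond) \
    do { if (!(cond)) { log_check(#cond, __FILE__, __LINE__); } } while (0)
#endif

/* Returns the integer average of n samples, or 0 if the inputs are unusable. */
static int average(const int *samples, size_t n)
{
    CHECK(samples != NULL);
    CHECK(n > 0);
    if (samples == NULL || n == 0) {
        return 0; /* fall back to a safe value after logging */
    }
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += samples[i];
    }
    return (int)(sum / (long)n);
}
```

With this shape the detection stays in the shipped code, so plugging in a serial port later still shows something.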
So, Chris, you started this by saying it made you uncomfortable.
I think Andre and I are both in the same boat.
If you just sprinkle asserts everywhere and then you take them all out later,
you really are not helping yourself.
But I think that was the original intent of asserts a million years ago, was
this is a
this-can't-happen situation, so we definitely
want to
see it during development. But since it's
a this-can't-happen, once it
gets in production, it won't happen.
And I think asserts have expanded to be
used as a general error
handling or detection system.
Yeah. On 168 megahertz ARM processors
with floating point blah, blah, blah.
You're not starved for cycles anymore.
Right.
Okay, so this brings up a point that I think,
well, I think it's pretty important
because error handling and test-driven development
should be talked about together.
Because your goal with test-driven development is to find all of your bugs and then allow you to handle errors gracefully.
But of course, we've talked about that as test-driven development is very useful.
It's also quite difficult to convince people it is the right path.
Yeah.
Are we setting ourselves up to have the same
somewhat painful conversation with error handling?
That it's something everybody should have,
we should all be doing it, we should all be loving it,
but instead nobody's doing it
and we're all just secretly tip-typing away,
hoping nobody notices.
Well, as we all know, because we are geeks,
writing code is a hoot.
Getting an LED to flash is like ice cream and chocolate cake all rolled together. If you're doing this as a hobby, I don't need to put in error logging or error handling because I'm just learning or
I'm smarter than the average bear. And if you're working for a company doing this for a living,
I don't have time for this. I'm under a time crunch. So eating your cabbage is hard to justify when you can have [...] bunch of time and grief because chasing down bugs
is a whole lot harder than looking at the stuff that's coming out your serial port and saying,
what the hell is my system doing there? Or, oh, that's right. I forgot to change the CAN bus handler to do negotiation under J1939,
and it's screwed up rather than having to find it the hard way.
Well, even as a hobbyist, it removes frustration because you know what's happening.
I mean, it's like if you were a painter and you made sure that you could only paint
by looking through a cardboard tube, so you were only, you know, painting a quarter-sized
section of the painting without seeing the whole thing. I have many programs, even Arduino, even
Arduino IDE proper programs that, you know, they should check to see if I hooked up the sensor
because the chances are I came back to this project and I forgot to hook up the sensor again.
These things, yeah, error handling should be built in right away
and we should do it a lot.
Well, one of the things that I also do is,
well, actually the EE imposes it on me
because he's pretty anal about things.
He runs ADC lines all over the board
to all sorts of points in the power supplies and puts in
temperature sensors on the boards and stuff like that.
I love that.
And my first job is to put in some sort of an ADC de-scrambler thing because, of course,
Engineer Boy is going to be wanting to know if the power supplies work
properly and on the board sitting in front of me it turns out that a bunch of the diodes and
resistors were in the right ballpark but not quite correct and some of the voltages were kind of whacked and by reading the ADCs out and looking through a table that said
channel one has to be between 1.8 volts and 2.6 volts and suddenly you're getting an error message
popping out it becomes obvious like you can have numbers scrolling by and you won't notice. But if you have this error message popping out, it sticks out
because it's the outlier rather than the normal case.
And we fixed that board awfully quick because
we knew which portion of the board to look at.
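The table-driven check described here could be sketched roughly as follows. Only the 1.8 V to 2.6 V range for channel one comes from the conversation. The other table entry, the millivolt units, and the names `adc_limit_t` and `check_adc` are assumptions made for the example.

```c
#include <stdio.h>
#include <stddef.h>

/* Allowed range for one ADC channel, in millivolts. */
typedef struct {
    const char *name;
    int min_mv;
    int max_mv;
} adc_limit_t;

/* Hypothetical limits table. Channel 1 uses the 1.8 V to 2.6 V range
 * mentioned above; the channel 0 entry is invented for illustration. */
static const adc_limit_t adc_limits[] = {
    { "channel 0", 3100, 3500 },
    { "channel 1", 1800, 2600 },
};
#define ADC_CHANNELS (sizeof adc_limits / sizeof adc_limits[0])

/* Compares readings against the table. Prints one line per channel that
 * is out of range and returns how many channels were out of range. */
static size_t check_adc(const int readings_mv[], size_t count)
{
    size_t bad = 0;
    for (size_t i = 0; i < count && i < ADC_CHANNELS; i++) {
        if (readings_mv[i] < adc_limits[i].min_mv ||
            readings_mv[i] > adc_limits[i].max_mv) {
            printf("ERROR: %s reads %d mV, expected %d..%d mV\n",
                   adc_limits[i].name, readings_mv[i],
                   adc_limits[i].min_mv, adc_limits[i].max_mv);
            bad++;
        }
    }
    return bad;
}
```

Nothing is printed while every channel is in range, so a single error line stands out as the outlier rather than getting lost in scrolling numbers.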
I like that your electrical engineer is doing test-driven development through your software.
Well, I mean, it sounds like a joke,
but doing error handling and logging
of a kind in electronics is important
too. I've never heard it
really talked about.
This is design for manufacturing, design for production, making sure
that your boards come up.
Breakout boards with tons of test points at everywhere,
and with things you can clip
to reconfigure routing.
I love that.
I've had that on several boards where it's,
oh, cut this trace here, here, here, and here,
and now we move this port over to here.
Well, how many times have you got the RS-232 polarity incorrect and you end up talking into a mouth and listening to an ear?
It's kind of a joke around here
that you've got a 50% chance of getting it wrong and 100, sorry, 50% chance of getting it right and 100% chance of getting it wrong.
So we always put in a set of pads and put in the proper steering resistors so that we can get it right.
But as you also, I don't know if you've run into this, but the line between software and hardware in embedded systems is kind of wiggly.
Yeah.
Oh, yeah.
And throwing things over the wall, I don't understand that.
Yeah.
I mean, this is a group project.
This is not me or you.
This is us getting it shipped or getting it working.
Dude, you can make my life a whole bunch easier if you were to just
take this signal from this pin and put it over there and they'll say oh i've either got a reason
to do it this way or sure not a problem i mean there's going to be a board spin eventually
anyways but on the other hand uh just on wednesday dude asks me, I need to put another trace on the board.
Can I put it over here rather than over there?
Because my board is getting kind of crowded.
And if I put it over in this corner, I've got some board real estate.
And it's like, yeah, sure, that would work just fine.
And now is the time when we say, embedded software engineers,
be sure to take your electrical engineer out to lunch once a month.
And electrical engineers, take your software engineers out to lunch twice a month. It's totally unfair.
We are on the good side. No, I'm kidding. But make sure you ask about their kids and their hobbies and stuff
because having a good relationship is so
much better. I've been in situations where the double E barely talks to me and ones where the
double E is my best friend and the latter is far more preferable to getting things done.
And it's good if they can take a joke too, because sometimes, you know, it gets kind of heated and
people get sort of uppity about their designs, and it's an expression of you.
Yes, although it needn't be like all you are. It's like code reviews: you don't
really like it when people say, well, that looks..., when you actually spent a lot of time making that work.
Well, I worked at one place where there were two rules.
One of them, leave your ego at the door.
Because somebody is going to, you've asked a bunch of people to critique your code.
And they might say some stuff that isn't correct, and you might have to straighten them out.
Or they might spot, you know, you missed a whole bunch of stuff, what were you thinking?
And it's like, okay, let's take it down a notch, but yeah, okay, I'll do better. Another thing was we always had a bag of chocolate macaroons on the table.
And whoever found a bug got a chocolate macaroon.
Yeah, and finding spacing and tab bugs, those aren't macaroon worthy.
No, but that should be in like...
That's like a Tic Tac worthy.
So you had a very large QA department then?
Oh my God, Christopher, that's terrible.
It was only one person.
Sorry, sorry, sorry, sorry.
And always have somebody in your code review
that isn't a part of that,
like isn't a part of the code team, because
they'll spot all sorts of stuff that you don't know.
And it's up to the person who's getting reviewed to bring those people up to speed on what
you're doing.
So you might bring in a double E and you might have to explain to them what a switch statement does.
On the other hand, you get to review their boards,
which is really cool, and you get to learn about electronics.
Yes, and be prepared to ask, well, for my part when I did that,
be prepared to ask a lot of dumb questions.
Well, yes.
It's not dumb, it's ignorant. And there's a difference.
Ignorance can be fixed.
That sounds better. I asked a lot of ignorant questions.
Well, it's all ignorant.
It's just a matter of the level
of ignorance.
If you're at level zero ignorance
and you don't even know what you don't know,
then okay.
Maybe we'll get
Donna from QA to sit in on this one.
These electrons.
I don't understand.
Oh,
back to error handling and checking because I am on a schedule here.
People are showing up at our house remarkably soon.
Okay,
go.
Error handling and checking for hardware errors.
We have to do it.
We have to monitor for flaky batteries and bad writes and flash writes and high current things.
But how do we handle them?
I mean, it's not just, I mean, your flash is broken.
You can't just write to flash saying you're broken.
Flash is great.
I'm still broken. Well, something like Flash, the first time you boot your system after it's produced,
your Flash is either in a random state or it's all erased.
How do you handle that?
Your code should have a default block.
So you have a set of good values that you check your flash.
Does it look correct? Is the checksum okay?
You are using checksums.
If the checksum isn't okay, you've got to clobber that flash and write in a reasonable set.
But in the meantime, since you just clobbered it,
you can use the default values.
So if your flash goes bad,
you can run off of defaults and sort of limp along.
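A minimal sketch of the "check the checksum, otherwise fall back to defaults" idea. The `config_t` layout, the default values, and the inverted additive checksum are all assumptions for illustration; a real product might use a CRC and would also rewrite the flash block after falling back.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical settings block as it would be stored in flash. */
typedef struct {
    uint16_t baud_divisor;
    uint8_t  brightness;
    uint8_t  checksum;   /* covers every byte before this field */
} config_t;

/* Known-good values to run on when the stored block is unusable. */
static const config_t default_config = { 104, 50, 0 };

/* Inverted byte sum, so that neither an all-0x00 nor an all-0xFF
 * (erased) block of this size passes the check by accident. */
static uint8_t config_checksum(const config_t *c)
{
    const uint8_t *p = (const uint8_t *)c;
    uint8_t sum = 0;
    for (size_t i = 0; i < offsetof(config_t, checksum); i++) {
        sum = (uint8_t)(sum + p[i]);
    }
    return (uint8_t)~sum;
}

/* Copies the stored block to *out if its checksum matches and returns
 * true. Otherwise fills *out with defaults and returns false, which is
 * the caller's cue to rewrite flash and report the problem. */
static bool load_config(const config_t *stored, config_t *out)
{
    if (stored->checksum == config_checksum(stored)) {
        *out = *stored;
        return true;
    }
    *out = default_config;
    out->checksum = config_checksum(out);
    return false;
}
```

The return value matters as much as the defaults: it is what lets the rest of the system know it is limping along rather than running on real settings.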
Yeah, but who do you tell?
I mean, how do you handle that error?
Okay, you notice you don't have the correct value.
You go ahead and you limp along on default.
And that's okay for this boot because you're just booting for the first time.
Yeah.
But now you've been in the field and you've been booting for the first time
every boot since the dawn of time.
It's your 12th first boot.
At what point and how do you handle it?
Doesn't that depend on the product?
It depends very much on your user and your product.
One of the things I found is you can get the processor going faster than the flash,
so you have to do a retry on the flash, first of all,
and after three or four attempts, you get the proper values
with the correct checksum.
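The retry described here might look something like the sketch below. The `read_with_retry` helper and the simulated `flaky_read` are hypothetical; on a real part the attempt function would read the flash block and verify its checksum.

```c
#include <stdbool.h>
#include <stddef.h>

/* One read attempt: returns true if the data came back valid. */
typedef bool (*read_fn)(void *ctx);

/* Tries up to max_attempts times. Reports how many attempts were used
 * so the caller can log a part that only works on the third try. */
static bool read_with_retry(read_fn try_read, void *ctx,
                            unsigned max_attempts, unsigned *attempts_used)
{
    for (unsigned i = 1; i <= max_attempts; i++) {
        if (try_read(ctx)) {
            if (attempts_used != NULL) {
                *attempts_used = i;
            }
            return true;
        }
    }
    if (attempts_used != NULL) {
        *attempts_used = max_attempts;
    }
    return false;
}

/* Simulated flaky device: fails until the call count reaches succeed_on. */
typedef struct {
    unsigned calls;
    unsigned succeed_on;
} flaky_t;

static bool flaky_read(void *ctx)
{
    flaky_t *f = (flaky_t *)ctx;
    f->calls++;
    return f->calls >= f->succeed_on;
}
```

Bounding the number of attempts keeps the retry from turning into another loop that never exits.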
And that's just life sometimes. If you have no human interface, you might have to go to some
sort of a default crippled value. And at production time, that might be the indication that the thing
is correct, where the thing booted the first time and it's at this
crippled default value and needs to be configured. If you have an LED, you just sort of scream into
your pillow until somebody hopefully notices you because you're not acting quite correctly
or you could put a message up on your display if you're rich enough to have one.
Or do you occasionally say, no, I'm not going to function?
Yes.
I'm going to break, and now you have to deal with me.
I'm going to break loudly, and you can't just quite, I'm not going to limp along with my defaults any longer.
Yeah. Something like on a piece of medical equipment,
do you fail the whole thing or do you bring it up in a questionable state
and hook it up to your grandmother
as a life support system?
No, you shut the thing down
and don't come back up.
So when you're designing your error handling system,
you do have to decide
not only how you're going to display errors,
but whether you're going to display errors,
but whether you're going to allow any functionality.
Because there are times you don't want.
Well, to riff off of that and get back to the medical device,
I worked on some devices where we had hardware detectors for energy delivery
and all kinds of parameters
that had to be in a certain space.
But things can be twitchy.
So there was always a question with certain errors, is this real?
Do I abort completely now because we're doing something
that's in a medical device that could potentially hurt someone?
How sensitive are you to errors?
Because it's not always a Boolean.
Sometimes it's, well, this voltage is 0.01 volts above nominal.
Is that bad? Maybe.
It happens a lot.
It's very complicated, and you don't want to have,
and we had situations where we were false-positing a lot
and irritating doctors because we were just saying,
oh, hands up, we're stopping.
And all that you end up having to do is go into your ADC conversion routine
and tweak a parameter slightly and it all goes away.
Right, but does it go away too far?
Yes, yes.
And that's when the EE comes in and says,
no, that was real and we are actually out of spec,
and why are we out of spec, and find the component that's failing.
And then you have production problems, and we're back to your software does not run alone.
Yeah.
I mean, something like a LeapFrog, what would happen if you lost a bit at the bottom end of the vocoder,
if it actually had one.
So you end up with a LeapFrog with a lisp.
Is that a problem?
You know, it was funny.
I worked on medical equipment, which you can't fail on.
And then I worked on inertial sensors where I was confronted with the idea
that we made this tilt sensor to go on cherry pickers.
And what they wanted from us was for us to go in the cherry pickers,
depending on this tilt signal, which was slightly terrifying.
I don't really like to have my life depend on my code.
But at that point, you stand behind it.
And we worked on airplanes, so I was very aware of where inertial sensors could fail and all of that.
And then I went to LeapFrog, where they're like, yeah, stuff fails.
You know, kids bash these things all over the place.
Your goal, if an error happens, we don't need to phone home.
We can't phone home.
We don't need to log it.
All we need to do is reset, and that's it.
You don't handle errors
So you do it the Microsoft way.
Well, no, better. Sorry.
It just breaks completely and you buy a new...
But that is, yes, how you handle the error depends very much on what you're working on and the
criticality of it. I mean, at LeapFrog we would handle the little errors, you know, debouncing sort of errors.
But at some point, you're just like,
yeah, my stack looks corrupted.
Just reset, it's fine.
The kid doesn't really care.
Well, one of the things that I've always found is
I must be really bad at C code
because I can never get a while loop working correctly all the time. And, you know,
my system suddenly gets stuck and I start debugging, debugging, debugging. It finally
comes down to this while loop where I'm spinning on a register that just is never going to come
true. And it's always while loops. But if you were to use a watchdog timer, which for our
listening audience, a watchdog timer is a hunk of hardware that's basically, you set a counter to a
really big number, and it starts independently of the processor. It ticks down or up. And as soon as it hits zero, it resets the processor.
And if you get stuck in one of Andre's wild while loops,
after a known period of time, it'll reset the processor, it reboots and continues on.
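A watchdog is hardware and its setup is specific to each part, so it is not shown here. What can be sketched portably is a different, complementary technique: giving the while loop itself a bound, so a register that never changes becomes a reportable error instead of a hang. The `wait_for_flag` name and the poll-count limit are assumptions for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Polls *reg until any bit in mask is set, or until max_polls reads have
 * been made. Returns true if the flag was seen, false on timeout. */
static bool wait_for_flag(const volatile uint32_t *reg, uint32_t mask,
                          uint32_t max_polls)
{
    while (max_polls-- > 0) {
        if ((*reg & mask) != 0) {
            return true;
        }
    }
    return false; /* the caller should log this and decide what to do */
}
```

A software bound like this does not replace the watchdog. It only catches the loops you thought to bound, while the watchdog catches the ones you did not.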
You don't want to do that in a piece of medical equipment.
You want to find this stuff in testing,
but there's no cost to putting in a watchdog sensor
because something might come up in a year
and that'll save your bacon.
Yeah, things do degrade over time even when kids aren't
bashing on them.
Yeah. Then, is a kid gonna notice that, uh, his LeapFrog reset? Or, you know,
he'll just continue playing.
Yeah, rather than having it dead and, Mom, it's dead, or you have to
take out the batteries to do something. That's silly. You should just have a
watchdog. And watchdogs used to be external to processors, and so they were additional cost. But
now most modern microprocessors have watchdogs in them. And so you should set up that software. I
know it's harder to debug, but you can turn it off when you have your JTAG or BDM connected.
And if you're working with a whole bunch of high powered stuff, assume that your
processor is going to fail and get your EE to put something called a one shot on your power supply
that the, you know, now your processor has to go out there and say, you know, just twiggle this bit
once a millisecond. And if your processor stops
twiggling that bit once a millisecond, the one shot, you know, it's a capacitor and a resistor
and it times out and it turns off the power supply. That way, when your processor is dead,
the system goes dead rather than the processor's dead and your system is hot.
Yeah, exactly.
It's sort of like BattleBots where they say your system has to turn off
because if you have that giant metal spinning thing of death,
you kind of have to make sure that the humans can be around.
You have to have kill switches.
Here in Canada, we had a radiation therapy device called the Therac-25.
That sounds like a Doctor Who villain.
Well, as it turns out, it was because they had a radioactive source with some shutters in front of it.
And they basically got rid of the limit switch on the shutter that told it that the shutter was closed.
And they did it all in software.
Yes, they did it all in software, and a whole bunch of people got very bad radiation burns and a bunch of people died
because they didn't have any hardware backup for when their software lost track of where these
shutters were.
So I worked on a product, and this is gonna sound like a theme, and this decision
was made before I got there. The product had an emergency stop, big red button
on the front.
Normally, one would just connect that
to mains power, right? You push the button,
it just cuts the power.
They decided that was too good for them.
It was
a software interrupt line.
The emergency
stop went to the software, which
triggered an interrupt, so yay good there,
because we'd run that code immediately.
At least we're not polling it.
Right, and then it would do sensible shutdown things
and stop the laser whenever milliseconds
after the switch had gotten pushed, if you're lucky.
As long as you weren't in a while loop
that didn't have interrupts enabled.
That was the absolute first thing I redesigned when we did the next process. What are you thinking? Just cut the mains
power. Why are you doing this? I don't trust my software. Why do you?
Oh, okay, I am supporting us back to this: range check your parameters
and log the ones that are out of bounds.
Your logger will start doing regression testing immediately
as you develop your code.
This was a good point.
I liked that.
It went back to test-driven development and error logging
having quite a lot in common.
But that one is pretty easy to implement,
although it can clutter your code.
Wow.
I don't think error handling necessarily clutters your code.
It's worth it.
And C is not so great at helping that.
It's got type checking.
What do you mean?
Your value has to be between 0 and 255.
I mean, you end up with lots of code that looks like
if this, then return, if then do this, then if this then return.
It's hard to write clean multiple error handling functions.
Well, if you look at ST's
code that gets generated by Cube, yes, I was listening
to last week's program. One of the first things that they
do on any of the hardware abstraction layer code is range checking using asserts.
And then after you pass all the asserts, then it does what you want.
And I found a whole bunch of really cool errors that like, oops, yep, you're right.
That is wrong.
And I get to go fix my code.
Instead of, you know, you can point at the system subroutines and say, you're not working the way that I want.
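In the spirit of "check the ranges first, then do what you want", here is a small sketch. It is not ST's code. The `pwm_set_duty` function, the status codes, and the four-channel limit are invented for illustration, and it returns an error code where the generated code described above uses asserts.

```c
#include <stdio.h>

typedef enum {
    STATUS_OK = 0,
    STATUS_BAD_CHANNEL,
    STATUS_BAD_DUTY
} status_t;

#define NUM_CHANNELS 4

/* Stand-in for hardware registers: last duty cycle set per channel. */
static int duty_percent[NUM_CHANNELS];

/* Range checks come first and log what was wrong. The actual work only
 * happens once every parameter has passed. */
static status_t pwm_set_duty(int channel, int percent)
{
    if (channel < 0 || channel >= NUM_CHANNELS) {
        fprintf(stderr, "pwm_set_duty: channel %d out of range\n", channel);
        return STATUS_BAD_CHANNEL;
    }
    if (percent < 0 || percent > 100) {
        fprintf(stderr, "pwm_set_duty: duty %d%% out of range\n", percent);
        return STATUS_BAD_DUTY;
    }
    duty_percent[channel] = percent;
    return STATUS_OK;
}
```

Putting all the checks in one block at the top keeps the clutter in one place, and a rejected call leaves the previous setting untouched.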
Well, I think that about covers it for errors for me.
Do you have anything else that you want to cover, Andre?
Well, you were talking.
Let's switch topics.
Christopher, you have?
Okay.
Hitchbot.
Oh, yeah.
I wanted to talk a little bit about Hitchbot because you are Canadian, so you invariably know more about it than I do.
Can you describe what Hitchbot is slash was?
Okay.
Hitchbot was a social experiment being put on by Ryerson Polytechnic and McMaster University.
So this isn't a science project, it was a social science project
to understand the interactions of non-human robot-y things and humans and
what happens when they get together. So these non-hard scientists got this thing made, which is basically a five-gallon bucket with a cake tin on top. It used pool noodles for arms and legs, and it had big yellow rubber glove hands and rubber boots for feet.
And in the cake tin on top, they had four LED matrix panels that they could use to, you know, they could have a face and sort of give this thing, anthropomorphize it just slightly.
It had a camera and a microphone and an Android tablet that when you started talking to this thing, it basically did some sort of a Siri kind of thing where you could interact with it. And apparently the speech recognition stuff was
done by a company called Cleverscript in Britain. So what they did was they put this thing together
and it sort of looks like a little kid. It's even got a piece of a car seat attached to its butt. And the intent was, this thing was a hitchhiker.
It's about the size of a little kid, 25 pounds.
Designed to look cute.
Oh, it's just too cute.
So they took it to Halifax out in New Brunt.
Halifax, New...
Out on the eastern coast of Canada.
I believe it's in Nova Scotia.
It's in Nova Scotia.
It's a lovely place.
That's a great town.
And the intent was for it to let it be known
that it's trying to hitchhike across Canada.
And over a period of about,
I think it was about a month and a half,
it hitchhiked from Halifax all the way out to a city called Victoria
out on Vancouver Island off of the west coast of Canada.
And this thing would take pictures every 20 minutes
and had a GPS so it could figure out where it was.
And you had humans in the background that would post its progress up on Twitter.
So that happened July, August last year.
And it had a great time and a lot of people interacted with it and took their pictures with it.
And it got onto the news and it
was just a real happy kind of time. In February of this year they sent it over to Germany
and it had a great time drinking beer, and it got up to the Netherlands where it, you
know, had its portrait drawn by artists and stuff like that, and
it made it back and it had a great time there. And this summer its job was to start in Boston, I
believe it is, or out on the Massachusetts coast, and weasel its way down to, I think it was going
to San Francisco or LA to get to Disneyland. And it had a bucket
list of places that it wanted to go since it's made out of a bucket. And so people picked it up
and they took it to a, I think it went to a Red Sox game and had 4th of July stuff, and then it eventually made its way to Philadelphia,
where on the 1st of August,
somebody beat the living shit out of it,
and it got destroyed.
And the last pictures that came out of this thing were a couple of internet video logger guys that did prank videos.
Picked the thing up and they took it around town in the back of the car and they thought it was very cool.
And then slightly after that, they came up with a security video of, they managed to magically get a hold of the security video of somebody beating the living shit out of this thing.
And that was the last they saw of it.
And it stopped transmitting and it was a pile of junk. As it turns out, there was
no video camera, and they
finally confessed that
they took the video, but
they don't know who actually beat
the snot out of it.
I didn't see that part.
So,
this sort of hit the
international news wires,
and like I've read about this from New Zealand to India of, you know, this thing going across Canada for a few months.
Nobody touched it.
Everybody had fun with it.
Goes across Europe.
Nobody touches it.
Everybody has fun with it.
Three weeks in the States, they beat the snot out of it and destroy everybody's fun.
So, yeah.
I had heard of it, but hadn't really penetrated into my brain that it was in the U.S.
I heard it was coming to the U.S.
And hadn't really thought much about it.
I read Mike McIntyre's hitchhiking book recently,
Kindness of Strangers, which was about hitchhiking from San Francisco
to the East Coast, maybe back too.
And I kind of hated the book because it was a self-absorbed guy talking about how wonderful it was that he could travel on the kindness of strangers without giving anything back.
And I don't know, it seemed very irresponsible to me.
And so Hitchbot kind of hit a bad point for me because I think hitchhiking, if you don't have to, is really hard
to support. But Hitchbot itself, that wasn't its purpose. Its purpose was social science,
but it also was hope for kids. I know a lot of kids in Canada watched it. They watched its
progress. They learned about geography by watching Hitchbot.
They learned about all sorts of neat things.
And so I am horrified and sad that I was apathetic about it
because maybe if more people knew what it was when they saw it,
it wouldn't have been so vulnerable in the States.
But it just sucks.
I mean, this sucks.
I'm not convinced.
I'm not convinced that
if it was more
publicized, it would have
been less vulnerable.
Some people...
Some people are jerks.
Some people are jerks.
Well, it's...
Yeah.
I have mixed feelings about it because the story gets a lot of press because it lends into a lot of confirmation bias about America
which, you know, some of it may be true
but maybe not. But America is a very big country
with a lot of big cities and a lot of people
in different situations and, you know, Europe
and Canada, they are different places.
And I'm not saying that America is good for being the way it is,
but it is a different situation.
And I think you could easily find yourself in a place
that you don't want to be.
Well, I really don't want to slag the U.S.,
so I'm going to tell you a story instead.
Sorry. One of the people that used to work in the next office worked for a company called JDS Uniphase, and they made fiber optic stuff.
Yep, lost a lot of money on them.
Yep.
He was working out of Japan, and he ended up having to go be one of those incredible line engineers in
China where they were doing their manufacturing. But it was only going to be for six weeks, so he
packed up a bunch of stuff and went to China, and it ended up that he stayed there for like six months. And of course he didn't pay
for his apartment. And what the landlord did, since he defaulted on his
rent, was the landlord packed up all of his stuff into boxes, tied it up with ribbons, and put it out on the front step of the apartment building.
This guy came back six months later, and all of his stuff was on the front step of the apartment building in the street.
And nobody had touched it because it's not theirs.
Why would they possibly touch it?
And I want to live in a society like that.
The last time I was in Seattle, some guy right like half a block away from Pike Street Market had his bag stolen off of his shoulder and the thief knew enough to run downhill
rather than uphill.
I don't want to live in a society like that.
Yeah.
Well, I don't really want to end this on a depressing note.
And I am sad that the United States
did not get the opportunity
to learn about its own geography and history through Hitchbot.
I think we could have.
Oh, but it's it.
We learned all sorts of things about Philadelphia.
Oops.
Well, we learned that we do not have the kindness that we should.
Maybe it's an individual thing.
Maybe it's not.
I mean, at an individual level, is that you? No. And are you the trolls in the...
It's entirely possible it would have ended up in the wrong hands in Canada too.
There are plenty of bad people in Canada,
and the population difference means that if bad people is a percent of people, then the U.S. has more bad people.
I'm not excusing it,
but talking about America as an entity
is a really weird thing to do.
We drove across the country
and it became very, very, very, very clear
that the United States is not one place.
It is really big.
Canada, of course, is the same width. Bigger, but
very sparsely populated, comparatively,
right? I mean,
a tenth, a fifteenth the size of the
United States by population? Okay,
Canada has the same population
as California.
My province is
I think it's two and a half times
the size of Texas.
And we've got three million people in our province.
But, you know, the population doesn't absolve the people.
I mean, Japan has a population of 127 million,
so they are half the size of the U.S.
If you go for importance, China has, what, 1.2 billion?
Does that make them four times as important as the U.S.?
No, no, certainly not.
So that's a bogus argument.
No, but there are cultural arguments,
and I think the United States has cultural differences all across it
and within cities, and some of those other countries do not.
And they instill different
values that are more pervasive nationwide than maybe the United States has.
I know that you, Andre, are a Commander Hadfield fan. Do you know who Christa McAuliffe is?
I've heard the name.
So this is definitely an age thing.
And I suspect everybody who's about 40 in the United States knows who Christa McAuliffe is.
But her name isn't that widespread.
She was the teacher who went into the astronaut program on Challenger, and then it exploded.
Oh, she was the first non-astronaut
to ride the shuttle. Yeah.
It's very personal for me because my teacher
at that time actually was in the program and
washed out with a broken ankle when there were only about
15 candidates left.
So she got really far and so she was really into it.
But it was one of those situations where I think it hit the U.S. kids really hard
because such a fanfare had been made.
And we expected so many things about having one of our teachers
in space. And a woman and a
non-astronaut. And she was exceedingly personable.
I mean, incredibly charismatic. And loved science.
And there was a big potential here.
And that was the thing that I thought about when I heard about Hitchbot, because there's this destruction of possibility that was more distressing. And I hope that we do better.
I always hope we do better.
I hope that Canada retries.
I hope that the researchers have said that they will rebuild,
although I've read that, and then I've read
they wouldn't, and then I read they would.
So I don't know where they're going to stand.
But I hope they retry, and that we pay attention next time, and next time they start in San Francisco so I can give them a ride.
Well, getting money for social sciences in Canadian universities is tough. And for something where you can hire a research student for a month,
or you can build this buckety thing
that has already failed once,
it's hard.
And at the university level,
it's all about generating papers, frankly,
and they might have enough information
or they might need to continue to get their papers.
I think we should flood the highways with them.
Just have them everywhere.
Well, it's like, why don't people hitchhike anymore?
And it's sort of the same thing.
Bad things happened, or did they?
There was a, I was going to say bad things happen to people. And I was listening to,
I think it was a Freakonomics podcast, or maybe Radiolab, and they were talking about hitchhiking and whether the problems with hitchhiking are actually real.
And a lot of it is boogeyman stuff that never actually existed.
Yeah, I have heard that too.
And yet, I don't think I'd go.
Maybe that's just me.
I think a lot of it is cars got really cheap and it's not necessary anymore.
And you only need one bad thing to happen out of 100,000 or even a million before people are going to report on it.
Right. And it's like all the scares that happened through the 80s and 90s where everybody closes in.
Same thing with kids and playing outside and things like that.
But the robots in the future are going to look back on this very negatively.
So I think we should be more careful.
Well, they were doing
a study in Japan just recently
of the interaction of robots
and children.
And while there's somebody watching,
everything was very nice, but as soon
as
the children weren't being watched...
The robots attacked?
No, the other way around.
The little kids became Power Rangers and they're beating the snot out of the robots.
I thought that was kind of disappointing.
Again, I say, I don't want to close this on a depressing note, but now we really do have to go and clean this house up before the people tromp through to look at our house so that they buy it, so that we can buy that house we put an offer in for,
and then that circle of housing continues.
Okay.
Christopher, do you have any last questions?
Is it, do you want to buy my house?
Because I understand that one.
No, no.
I'm done.
Andre, any last thoughts you'd like to leave us with?
I do.
Sorry, I'll make it short.
You were talking to Clive last time about the trolls in the forums and stuff like that.
And one of my friends gave me this nice little quote.
The bad news is you cannot make people like, love, understand, validate, accept, or be nice to you.
You can't control them either.
The good news is, it doesn't matter.
That's good to remember.
I think it fits in well, too.
My guest has been Andrei Chichak, a systems developer at CBF Systems Inc. If you are wondering about all the
repeat guests, I was booking a lot around the 100th episode, looking back at people we enjoyed
speaking with and all those times we ended a show with, we should talk more in the future.
And so I've been booking guests who were on again. Though with Andre, I was lost for this week,
trying to figure out how to find a guest after the scheduled one bailed and trying to find the time to write a new outline and when we could even record.
It's not even in our normal recording time now.
So, Andre, thank you so much for being an easy guest.
We probably would not have done this episode without you.
So you only call when your friends bail on you?
You're very welcome.
I'd be glad to come on anytime.
And if we stay on the current track,
the next show might be about how to sell your house in the Bay Area.
Oh my God, no.
Or maybe just about all the emails
we've gotten recently. It'll be great. It'll work out. And as always, thank you to Christopher White for
co-hosting and producing. And thank you for listening. If you'd like to say hello,
email show at embedded.fm or hit the contact link on embedded.fm. If you'd like to talk to Andre specifically, he's very good with
his email. He writes lots of interesting emails. He is at embedded at chichak.ca. And I will
probably put that in the show notes this time. That would be good. And thank you again to
Clive from ST. He's awesome. I like it when our guests
sort of know each other.
He's answered some of my questions, and
when I get on there, it's like
Clive is answering all these questions.
I should help out, too.
So don't be a troll.
Excellent.
That does it for this week. Hopefully we will talk
more next week, if not the week after, for sure.
My quote this week is from Mahatma Gandhi.
An error does not become truth by reason of multiplied propagation,
nor does the truth become error because nobody will see it.