Programming Throwdown - Customer Bug Handling
Episode Date: December 4, 2018
Hey all! How do you find and triage bugs on other people's machines when they don't have the source code (or the knowledge to build it)? That's what we explain in today's episode! It's one of the topics that's rarely spoken about but extremely important to get right before shipping any software product. Happy hacking! This is the last episode before our Christmas special! If you are a patron, make sure Patreon has your up-to-date address so we can mail prizes! If you aren't on Patreon, sign up before our Christmas show to be entered in our raffle!
Show notes: https://www.programmingthrowdown.com/2018/12/episode-84-customer-bug-handling.html
Transcript
Programming Throwdown, episode 84: Customer Bug Handling. Take it away, Jason.
Hello. Hey, everybody. How's it going? Episode 84. This is going to be pretty interesting.
I'm actually pretty excited about this. This is something that a lot of people, they don't really
teach that well in school, and you could easily kind of get bitten by this. And I've definitely
gotten bitten by it. And so we get to kind of talk about it. But before we do that, let's talk about
some really funny computer science pop culture.
So how many of you have seen some type of TV show, especially earlier on, like, let's
say the '90s or early 2000s, where they try to talk about computers, and the computer science
is just such an epic fail?
We posted a few of them here.
My favorite is... well, definitely check out the website, programmingthrowdown.com, and you can see them.
But just spoilers here.
The first one, they're trying to track some criminal hacker,
and this woman's like, he's doing it in real time.
And then this other lady goes,
"I'm gonna write a GUI in Visual Basic to track his IP address." Dead serious. And then she just walks away.
And I'm pretty sure at the end of the movie, yeah, that GUI saves the day. But it's pretty epic.
And then there are a couple of other ones. Definitely check them out on the website.
"Who is this 4chan guy?" is another good one. If you have any good ones, send them over to us.
It's no end of entertainment there. People have now even made playlists. There are whole playlists of just hacking video fails.
One other thing to mention: a lot of people wrote in about our last episode, which was the episode on teaching kids to code.
And let me pull it up right now.
There were some actually absolutely phenomenal suggestions out there.
While you pull that up, I'll give my contribution to this, which is I have nothing to say on computer science references,
but I know this is a thing that other people talk about as well.
Like, I've seen YouTube videos from a biologist saying the biology in videos and movies is all wrong. Astrophysics is all wrong.
I think Neil deGrasse Tyson does some about space travel and astrophysics
and ruins movies, you know, by pointing out how bad the plot is.
And one I saw recently was people complaining about how you learn jazz music.
So I think this is a universal thing where movies try to take artistic liberty
because you want to make an interesting movie.
And personally, people are like, oh, they don't know what they're doing.
They're so dumb.
But I don't know.
I think personally it's just that they might know, or they could go figure it out. Like, "I'm gonna make a Visual Basic GUI to hack his IP address." Yeah, sure, that doesn't actually make sense, but to 99.9% of people that sounds like the equivalent of Avada Kedavra. It's like, oh yeah, you're just making Harry Potter spells. It's just an incantation. They don't need it; they're picking real words, but it doesn't have to be meaningful.
Yeah, that totally makes sense. I wonder... because I feel like nowadays they're a bit better about that, so I wonder if maybe they consult with somebody. It's just that when these shows were made, that was so unpopular and passé that they just made stuff up, because nowadays it seems much more legit when I watch more recent shows.
That's right. And I think, famously, in the Die Hard movie, the bad guy is... I'm forgetting if it's either German or Russian, that's what they insinuate, but he's just speaking, I think, gibberish. And in the German or Russian version, whatever it is, the bad guy is from some other country, but he's speaking the same words, because they're just gibberish words. I believe I'm recalling this correctly. But then someone was pointing out that in Game of Thrones they actually consulted... there's a guy who's the Hollywood consultant for foreign languages, or invented languages. He's a linguist.
Oh.
And so in Game of Thrones, this Dothraki language, or whatever, is actually modeled after realistic languages. It's a whole language, it's a whole thing. And when they need a new word or sentence structure that they've not had before, they consult with him, and he helps them come up with a cohesive system, as if it really were a language.
Wow, that's cool. Yeah, I think I recently saw the show Scandal. My wife was watching it, and I was catching a couple of episodes, and they had some hacking sequence, and it kind of made sense. I mean, I don't know much about hacking myself, and it definitely seemed a little out of place, but they had a bash command, and they're doing, I don't know, ls, and they're going to a directory and looking at a file. It was much better than the ones we posted, but also much less funny.
But yeah, as far as teaching kids to code, Anthony wrote in. He had a ton of really amazing resources.
One of them, which I actually have for my kid, but I forgot, is Kano.
I don't know.
Do you have a Kano kit?
No.
Or have you heard of this?
So it's like this.
It looks like about a Raspberry Pi with a case, maybe a little bit bigger than that.
But it has a little joystick built right onto the motherboard, and the joystick kind of pops out of the case, and it's also got some buttons. They have Snake and a couple of simple games, but you actually have to build the computer yourself. You don't have to solder or anything crazy, but you have to plug the battery in, you have to plug the buttons in, and you have to assemble the case and everything. So that was one he mentioned. Also Scratch, which is pretty popular. It's like a little graphical programming language.
And then Betty wrote in. Betty's actually a math professor at... actually, I won't say the university, just in case. I don't really know if people mind us revealing their identity or not.
But anyways, with that in mind, I won't say where she teaches, but she's a math professor, and she recommended something called Bricklayer.org, which looks really cool.
So she actually founded it, and it looks like it has a bunch of different things for using Lego to teach kids math and things like that. So, some really cool resources.
I was actually really impressed that a math professor even listens to the show, so kudos to you, Betty. And also, Anthony, thanks for the heads up. All right,
man is it time for news?
Links?
I think so.
I completely lost.
Do you want to do the first one?
For some reason, I totally lost mine.
Okay, no problem.
All right, so the first one I have: I ran across an article about something which is apparently built into both Excel and Google Sheets, but using Google Sheets, the online spreadsheet application, you're able to run simple SQL-like queries in it. I find it very difficult sometimes when I want to do something, and there almost always is a way to do it, of manipulating or filtering or searching for data. And so this points out that you can do some common SQL operations within a column in the spreadsheet.
Do you have to install something?
No, no, it's built in.
It does not let you do joins, but it does allow for filtering. To me, the GROUP BY clause, you know, MIN of something, GROUP BY, ORDER BY, this is more... intuitive isn't honestly the right word, but it's more at hand to me, because I've done that more recently. And so I'll link this, but this is from Ben Collins. There are some videos linked in there as well. To be fair, I honestly didn't watch the videos; I just looked at the examples and sort of got the gist. And so you can say equals QUERY, the column you want, you know, select this other column, this other column, this other column, where... the whole thing. So that was pretty cool. I had no idea. And then apparently there's also some stuff that works in Excel.
So if you ever find yourself needing to use spreadsheets... I find it's one of those weird things: there are many, many people who spend enormous amounts of time in spreadsheets and really know what they're doing, when it's like, a Python program probably could have done this way easier, but it's just what they're comfortable with. But on the flip side, I find a lot of times...
I think the GUI for Excel is just amazing.
Like the user experience is so nice.
I'm really surprised that more languages
don't have something like Excel
where you can kind of just easily see everything
and things like that.
And I will say that people from a CS background, though, I think sometimes underutilize... I guess you're sort of alluding to the same thing, but people from a CS background tend to underutilize spreadsheets, because sometimes it really is fast just to dump a bunch of data into a CSV and open it up in Excel, Google Sheets, the OpenOffice.
I'm not sure what it's called there.
Numbers?
No, Numbers is the macOS version.
Yeah, there's LibreOffice, which is a fork of OpenOffice.
That's the only one I know.
Yeah, but they have a spreadsheet equivalent,
but I don't know what the name of it is in that suite.
Do you know?
Okay.
Oh, I think it's just called Sheets.
I could look it up.
Okay, it doesn't matter.
And any of those.
Anyways, dumping a CSV, opening them up in there,
and being able to do something really quick,
depending on your data processing skills in Python,
may or may not be faster.
With stuff like Jupyter Notebooks and Pandas,
that story is becoming slightly less awesome
if you know how to use those tools.
They can often be almost as effective.
Okay.
But yeah,
anyways,
if you,
if you do know how to use this or stuck with some data in that format,
check it out.
Yeah.
It sounds awesome.
Yeah.
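For listeners who would rather stay in Python, here is a minimal sketch of the same kind of SQL-like query (MIN of something, GROUP BY, ORDER BY) run over CSV data. It uses only the standard library, loading the rows into an in-memory SQLite table. The CSV contents, the table name `scores`, and the function name `min_score_by_team` are all made up for illustration; this is not what the article itself shows.

```python
import csv
import io
import sqlite3

# Hypothetical CSV data standing in for a spreadsheet export.
CSV_TEXT = """name,team,score
ana,red,7
bo,blue,4
cy,red,3
di,blue,9
"""


def min_score_by_team(csv_text):
    """Return (team, lowest score) rows, ordered by team name."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn = sqlite3.connect(":memory:")
    try:
        # The INTEGER column affinity converts the CSV strings to numbers.
        conn.execute("CREATE TABLE scores (name TEXT, team TEXT, score INTEGER)")
        conn.executemany("INSERT INTO scores VALUES (:name, :team, :score)", rows)
        query = "SELECT team, MIN(score) FROM scores GROUP BY team ORDER BY team"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = min_score_by_team(CSV_TEXT)
print(result)
```

Whether this beats opening the file in a spreadsheet probably depends, as discussed, on how comfortable you already are with each tool.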
Actually, there's RStudio, and that's the only thing. I mean, I guess MATLAB, right? I think MATLAB has a little spreadsheet-type thing, but I guess that's kind of like a sledgehammer, right? There really isn't a good... you know, if you want that Excel style where all of your matrices are just completely visualized like in those sheets, there's really nothing out there. I'm actually shocked. Well, actually, there are some products; I've just never used them. There's Tableau, and there's another one, I can't remember the name of it. But I think people have done this; it's just that none of them have gotten any popularity.
All right, that's super cool. I'll have to try that out. My news is about decentralization
issues or sorry, deserialization issues. So a lot of people might not have tried this,
but in Python, you have pickle. And in Java, you have, I can't remember, I think it's just
called the Java serializer. And you have similar things in Ruby and things like that. And it's pretty amazing.
Like in Python, you can actually, you know, you can pickle anything. So any, almost any object,
if you try to pickle, let's say a file pointer or something like that, you'll get a, you'll get an
error, but you could pickle almost any object without having to write any serialization code.
Like you don't have to, you know, do a, like, like concatenate a bunch of fields into a string
and then figure out how to pull them back apart. It just does the deserialization for you. The
serialization does everything for you. And it writes it to some binary file. So it seems kind
of like magic. It seems really cool. The thing about it is, it's so open-ended that, well, for one, it's not very efficient, so a lot of people don't really use it other than for prototyping and things like that. Also, there are language upgrades, so if you go from Python 2 to Python 3, that could cause some issues. But on top of that, there are some real security vulnerabilities, and I think that again comes back to the fact that you can pickle almost anything. You'll have to read the article to get the details, but I think at a high level what's going on is this: you pickle a file and then you unpickle it
later to get back your object, right?
But let's say someone has access to that file and they can manipulate it or replace it with
another file.
Then you unpickle it and, you know, thinking that it's the original file and then just
start using that class.
But it turns out that person has kind of poisoned it. So, for example, just a really naive example: let's say you could just pickle lambda functions, so you could pickle a whole function with all its operations. And so you have a class, MyClass, with x equals three. You pickle MyClass, and so that file now says, okay, MyClass had an attribute x, and it was an integer, three. Now someone comes along and replaces that file with one that says, oh yeah, MyClass had a variable called x, and it was actually a function that wipes your hard drive. And then later on you unpickle that, and then call myclass.x, and you lose your hard drive, right? I mean, I don't think it's that simple, but at a high level, that's kind of what's going on: someone can actually inject really dangerous code, and when you unpickle, you get hit by that. But it's a cool article.
They actually link to some other articles.
So it's a bit of a rabbit hole, but you can follow the rabbit hole down to some really good presentations where people talk about it in detail.
And they show you, they walk you through examples of how exactly this can happen.
And yeah, I just found this stuff fascinating.
It's an attack vector that I didn't really realize was the thing until just now.
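To make the idea concrete, here is a small, harmless sketch of how unpickling can run code. It relies on the documented `__reduce__` hook, which lets an object tell pickle "to rebuild me, call this function with these arguments". The class names `Poisoned` and `NoGlobalsUnpickler` are made up for this example, and the payload only does arithmetic; it is an illustration of the mechanism, not a reproduction of the attack described in the linked article.

```python
import io
import pickle


class Poisoned:
    """Stand-in for an attacker-crafted object (hypothetical name)."""

    def __reduce__(self):
        # pickle will call eval("6 * 7") when the bytes are loaded.
        # A real attacker would choose something far nastier than arithmetic.
        return (eval, ("6 * 7",))


class NoGlobalsUnpickler(pickle.Unpickler):
    """Refuses to look up any class or function named inside the pickle."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Unpickle plain data only; anything that needs a global is rejected."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


payload = pickle.dumps(Poisoned())

# Plain loads runs eval during unpickling; no Poisoned object comes back.
surprise = pickle.loads(payload)

# Plain containers and numbers need no globals, so they still load.
plain = restricted_loads(pickle.dumps({"x": 3}))

# The poisoned payload needs builtins.eval, so the restricted loader refuses.
try:
    restricted_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

Overriding `find_class` is the approach the Python documentation suggests for restricting what a pickle may reference, but the safer habit is simply not to unpickle data you did not write yourself.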
Yeah, serialization and deserialization are tricky and important to get right,
especially if other people are providing you data.
Yeah, exactly.
Yeah, I mean, anytime someone else is contributing data,
you have to really expect just about anything.
I have no idea why you would use pickle for data that's coming from other people. Actually, maybe it's because you're unpickling crash logs, in which case that might not be the right tool for that job.
So on a related topic, I ran across, and I actually think I got this on the programming subreddit, StackSort, a la XKCD. So there was an XKCD... oh, I should have brought it up so I could link that as well. There was an XKCD comic where they described ineffective sorts, and one of the ineffective sorts... is it here? No. There was one where they basically alluded to... let me see if I can find it... they alluded to searching for a sorting algorithm. Oh, here it is, perfect, linked. I should have just gone to the GitHub page. No, it is there.
So basically, the XKCD was showing crappy ways to do various sorts.
So things like, you know, pick a random number.
And then at some point, run various commands, like rm -rf star, you know, just crazy things.
It was just sort of a joke, right?
Like XKCD stuff.
So this person tried to say, oh, I know of a way to do an ineffective sort: we'll search Stack Overflow for how to sort a list in JavaScript, and we'll run the example code. People put example code, so we'll find top-voted answers where there's code, and then we'll try to run it and see if it sorts the list. And so if you go to the website, which is sort of super sketch, it even says, I'm gonna pull some scripts from Stack Overflow and eval them, so, hey, this is probably a really bad idea. You probably don't want to do this. But if you do, it goes and pulls it
and then tries to see if it'll sort the list or not.
So most of them turn out not to be runnable,
but occasionally you'll run across one that will work.
And so you just keep running them
until you find that it's sorted.
And so of course you could check if it's sorted.
And then, it's a little more clever, though, because the first immediately obvious attack is to just start upvoting an answer that does something very malicious, like uploading all of your browser history or running a... you wouldn't mine Bitcoin, but running a cryptocurrency miner in JavaScript. And so you could do something really crazy malicious.
So what he did was limit it to posts that came out
before he sort of pushed this up
so that people wouldn't know that he was going to do that.
Yeah, so it was at least somewhat thoughtful.
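The try-candidates-until-one-works loop described here can be sketched in a few lines of Python. This is only an illustration of the idea: the real project is JavaScript and fetches snippets from Stack Overflow, whereas the three "snippets" below are local, made-up stand-ins, and nothing is downloaded or eval'd.

```python
from collections import Counter


def is_sorted(items):
    """True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


# Hypothetical stand-ins for code snippets pulled from answers.
def broken_snippet(items):
    return list(reversed(items))  # runs, but does not sort


def crashing_snippet(items):
    raise RuntimeError("not runnable")  # most snippets fail like this


def working_snippet(items):
    return sorted(items)


def stack_sort(items, candidates):
    """Run candidates in order until one returns the input, sorted."""
    for candidate in candidates:
        try:
            result = candidate(list(items))  # pass a copy, keep the input intact
        except Exception:
            continue  # snippet was not runnable; try the next one
        same_elements = isinstance(result, list) and Counter(result) == Counter(items)
        if same_elements and is_sorted(result):
            return result, candidate.__name__
    raise ValueError("no candidate sorted the list")


sorted_list, winner = stack_sort(
    [3, 1, 2], [broken_snippet, crashing_snippet, working_snippet]
)
```

Checking the result (same elements, in order) is cheap, which is what makes the joke work; running untrusted code to get there is the part you probably should not copy.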
So anyway, so I thought that was kind of like a good joke.
Oh, that's what it was.
My brain is not working this episode.
I apologize in advance.
So it was the alt text of the XKCD.
So if you don't, this is something I didn't know about for a long time.
I guess I'm not enough of an enthusiast.
So there's like XKCD Explains website, I think is what it's called,
where they explain some of the jokes, because sometimes it's not obvious, even to me. But then a lot of times there's stuff when you hover over the image, where it would normally give you the explanation of what the image is supposed to be. There's this alt text, and a lot of times it's funny. So the alt text of the ineffective sorts page on XKCD is: StackSort connects to Stack Overflow, searches for "sort a list",
downloads and runs the code snippets until it finds one that sorts the list.
That's what the alt text says.
And this person implemented that.
Sorry, I completely butchered the beginning of that story.
But if you didn't know about this...
It's amazing. You should absolutely try that.
Or don't.
Like, I'm not sure.
Sounds immediately like a terrible, terrible thing to do.
Yeah, it mines a Bitcoin, sends it to Patrick, and then sorts the list.
Takes a long time.
I did notice, while we've mentioned Bitcoin now a couple times,
that the number of times I overhear cryptocurrency being mentioned randomly
while walking by has dropped very strongly correlated with the drop in price.
Yeah, I mean, nobody's really talking about it anymore. It's pretty wild, actually. They started cracking down on the browser-based miners.
I saw that, yeah.
Yeah, so there was an issue where there were some websites that were just mining cryptocurrency. So you would go to the website, and while you're
on the website doing whatever you're doing on there, it's mining cryptocurrency. So one of
the examples is actually BitChute, which we talked about in the past, which was a way to watch videos
kind of like YouTube. They were running a cryptocurrency miner and they actually just
they posted in their blog that they just turned it off because of pressure.
I guess their ISP is going to shut them down or something.
But up until now, they've been just mining Bitcoin on your computer, which is pretty wild.
Yeah.
So I don't know.
So the reason why I said not Bitcoin is because I think Bitcoin, from a CPU perspective, is really bad now, but there are other coins which are still designed purposely to work pretty well on a CPU. I'm not sure of the exact details. This is off topic, I guess, but it falls into a little bit of a murky area, because at some level it's like, if I go to YouTube and I'm watching a video for free, you can send advertisements.
Or, if you disclosed it properly, not the whole time watching a video, but maybe for the first 30 seconds that I'm watching a video, it's computing some hash, doing some work, whatever.
And that's the exchange for me being able to watch the video, to help offset the costs.
At some level, it seems like it could be a reasonable business model. The issue is just that doing it without people's consent seems a little slimy. But I'm not clear why it became such a major issue.
I think if they had popped up a box saying, you know, you have to agree to mine Bitcoin to watch this video, I think it would be a totally different story.
Yeah, I guess. I don't know how I feel. But people installing malware that does it on your computer, obviously that sucks, right? They're doing something completely without your permission. And there were some malicious, I believe, Chrome add-ons and other add-ons to browsers where people were doing it as well, you know, slipping them in.
Well, there was an article yesterday or two days ago where there was a Node.js package that checked to see if you have a wallet on the computer, and if it had a certain number of bitcoins, it would get your private key and send it to a command-and-control server. And this Node package was imported by lots of major projects.
I mean, that's obviously horrible, right? Or putting some sort of obfuscated crypto miner in a package that you manage as an open source thing without telling anybody. I think those things are all pretty bad. But in general, it seems kind of like a decent trade you could make: I'll go to your website in exchange for donating some of my compute power.
Yeah, I mean, it seems fine to me. As you said, as long as the consent is handled correctly, I think it's fine.
All right, my show topic is a retrospective on GraphQL.
So we haven't actually talked a lot about GraphQL.
I think GraphQL would definitely be a great show.
And we should definitely, I'll add that to my list of things as I think about it.
But just to kind of explain briefly GraphQL, It's a replacement for REST. So you typically have a REST API,
and you'll have, for example, you have a bunch of endpoints. So if I go to my website,
slash API slash me, it maybe returns some JSON with my information. If I go to slash API slash user, you know, question mark, you know, Patrick, then it gets his information if I have access to it or whatever.
And so you could you could build endpoints like this.
But things, especially for graph-based information, kind of get out of control.
And it turns out a lot of things can be modeled as sort of a graph.
So imagine you have email.
So you have your inbox folders,
and each of those folders have emails.
Each of those emails have attachments.
You can kind of see there's this sort of tree structure developing, right?
And so you end up writing these really complicated query handlers
for being able to return data.
And typically, you end up needing
to access different slices of data at different times. And rather than write, you know, 10,000
query handlers, you end up writing one that returns more data than you actually need,
just so that it can service, you know, three or four different requests or different types of
requests. And it just gets very messy. And so GraphQL is basically a replacement for REST where you say, GraphQL, you own this part of my website. So it's like slash API slash GraphQL; everything in there is owned by GraphQL. And then it will basically handle a lot of that for you. And so you can execute different queries and ask for certain pieces of information, and you'll only get back that information. It's really nice. I highly recommend it, especially if you're building a website that needs to fetch data in that format. If you're building an email server, that's a great example. It's definitely a good thing to learn if you're doing anything web.
And this company talked about their evolution from a REST API to GraphQL, what that was like, and they also have a list of lessons learned.
So I definitely give it a read.
It's pretty interesting.
They've been using GraphQL for two years.
So they have a lot of different experiences.
They have things that they like,
things they didn't like,
certain packages that they were using,
things like that.
And so, yeah, if you're doing any web stuff,
definitely check it out.
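The "ask for certain pieces and only get those back" idea can be shown with a toy in Python. To be clear, this is not GraphQL and not any GraphQL library; it is a hand-rolled field selector over a nested dictionary, using the email example from above. The `MAILBOX` data, the `select` function, and the query shape are all invented for illustration. The query below is loosely analogous to a GraphQL selection such as `{ name emails { subject attachments { filename } } }`.

```python
# Hypothetical mailbox data; a REST endpoint might return all of it every time.
MAILBOX = {
    "name": "Inbox",
    "unread": 2,
    "emails": [
        {
            "subject": "Hello",
            "sender": "patrick",
            "body": "A long message body...",
            "attachments": [{"filename": "a.png", "size": 1024}],
        },
        {
            "subject": "Lunch?",
            "sender": "jason",
            "body": "Another long message body...",
            "attachments": [],
        },
    ],
}


def select(data, fields):
    """Return only the requested fields from nested dicts and lists.

    `fields` maps a field name to None (take the value as-is) or to
    another mapping describing which sub-fields to keep.
    """
    if isinstance(data, list):
        return [select(item, fields) for item in data]
    picked = {}
    for name, sub_fields in fields.items():
        value = data[name]
        picked[name] = value if sub_fields is None else select(value, sub_fields)
    return picked


# Ask for folder name, each subject, and each attachment's filename only.
query = {"name": None, "emails": {"subject": None, "attachments": {"filename": None}}}
result = select(MAILBOX, query)
```

One handler walking a caller-supplied selection replaces the many purpose-built endpoints described earlier, which is roughly the appeal; a real GraphQL server adds a typed schema, resolvers, and validation on top.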
All right.
I think it's time for Patrick's book of the show.
Yeah, so we're gonna do something a little bit different. There's not too much time between last show and this show, so I have to admit I haven't been doing much reading, and Patrick hasn't been playing enough video games. So Patrick's gonna handle the book of the show, and I'll do the tool of the show.
Yeah, so with the Thanksgiving holiday, I just didn't end up commuting as much, and as I've attested before, I call it reading, but I actually mostly listen to the books. And so I instead was using a different book over the Thanksgiving break, and I decided, hey, this would be good to talk about on Programming Throwdown, because it's very programming related, and that is a cookbook. That was sarcasm. That has nothing to do with programming. This book is The Food Lab by Kenji... I don't actually know how you say his last name. I guess it's López-Alt. I know him, well, not know him, I read about him; he writes for a website called Serious Eats. And although there have been more nerdy food websites around in the past, this one is a good balance of a nerdy, tech approach to doing things without it being a gimmick. He just takes what I would say is an engineer-style approach to how to cook. A lot of engineers talk about Good Eats, if you ever watched it, which was Alton Brown's Food Network TV show, where they kind of play up the science of cooking. It's sort of like that. Like,
oh in traditional French tradition,
traditional French tradition.
Oh man, I'm really out of it.
You're not supposed to wash mushrooms under running water
because they would absorb too much of the water.
And so you're supposed to use this dry brush
and you really scrub at them to get the dirt out
because mushrooms are really dirty.
But then like, is that true?
Like that's easy to test.
Just dump a bunch of mushrooms in a bucket of water for a while, and then put them on a scale and see if they got heavier. And if they didn't get much heavier, then they're not absorbing water. But those things are, how would you say it, not even acceptable to question, because that's how people have done it for hundreds of years, if you're adhering to, like, high cuisine. It's, what do you mean? You do it because you do it, and there's just no questioning of it.
So one of Kenji's big ones is, what about the reverse sear? If you're going to try to crisp the outside of a piece of meat, tradition says you always sear it first and then braise it or whatever. So, I'm going to cook the outside really hot and then put it in a pot in the oven and cook it really low. Is that better or worse than what they call the reverse sear, which is cooking it low and slow and then searing it at the end? And that's easy: just try it both ways and then make an objective call. But people just kind of don't. So they do this, and they actually do measurements.
And so this Food Lab is sort of his approach, where he tries to say, hey, if you want scrambled eggs this way, do it this way; if you want them that way, do it that way. The trick isn't this or that. It's just, oh, add a little bit of water and your eggs are fluffy. Why? Well, because the water turns to steam, and the steam is escaping at about the same temperature that the proteins in the eggs set, and so it makes your eggs fluffy. So if you want fluffy eggs, add a little bit of water to your eggs when you scramble them. This is really sort of common sense. They're not exactly food hacks, but I sort of relate to it. I feel like this is the kind of cookbook I want, rather than this very fancy, highfalutin thing where it tries to impress you with, you know, I don't even know how to pretend to be that, you need one-centimeter-long julienned carrots. It's like, okay, great.
Yeah, no, I'd much rather have a practical book that says, you know, here are the
things you already cook.
Here's how to do them better.
Yeah.
So then there's recipes for stuff I haven't cooked and good ideas.
And I just feel like, oh, if I go do this, I'm more likely to know if it didn't work,
that it was something I did rather than this is a bad recipe.
So anyways, yeah. So that's the Food Lab by Kenji Lopez-Alt.
And we'll have a link in the show notes.
And similarly to the stack sort we were talking about earlier,
I do find myself and my family,
the internet is sort of weird for recipes
because you just go to Google and you say like, you know, French toast.
And then you can find just whatever website appears
at the top, you click it, and if it looks reasonable, you cook that recipe. And then in a week you're like, I liked that...
I mean, that's basically how I cook everything.
Oh, okay. But then if you like it or don't like it, what do you do the next time? Because if you search it again, it's not guaranteed the same site will come up.
Oh, yeah. I mean, if I cook it and I like it, I bookmark it.
Okay.
Okay.
Yeah.
Yeah.
But anyway, so I still like cookbooks.
I have a series.
I might recommend.
I might have a couple shows where I recommend them because I kind of enjoy cooking as a hobby, I guess.
I'm not particularly great at it, and I don't do it sort of all the time.
No, I think it's a great hobby.
I like it as a hobby.
Yeah, it's fun.
It's something you have to do anyways, so you might as well make a hobby out of it.
Sure. Yeah, so that's my book of the show.
Very cool. All right, my tool of the show is actually a software RAID controller, or just using RAID in general. So, quick story. I guess it was about a week ago.
Um, I, you know, turned on my, my, uh, computer that I have sitting behind the TV in the living
room.
And, um, it told me that one of my hard drives failed and sure enough, it just completely
failed.
Like no warning.
Uh, actually my wife told me a few weeks ago that she was hearing this kind of
clicking sound coming from the computer, so there was a little bit of a warning. But, um,
then it went away, and then all of a sudden, yeah, the hard drive just spontaneously died.
Uh, fortunately I had a RAID 1 backup. And so what that means is I actually had two two-terabyte
hard drives that were RAID 1. They were mirrored.
So every time I would write a byte to one,
it would write to the other one.
And it's done automatically.
So basically, the way this happens,
I don't know how this works on Windows or OS X
or anything like that, but on Linux,
all of your hardware
is in the /dev directory.
So anything in /dev is not a file on your hard drive
or, I guess, on your file system.
It's an actual pointer to some type of device
or some type of engine, like /dev/random or something, right?
So your hard drives might look something like /dev/sda,
/dev/sdb for your second one, /dev/sdc, so on and so forth, right? What the RAID controller
does is it makes a new device. I think mine's called /dev/md0. And when I, you know,
mount that device, it, you know, looks and feels just like a hard drive.
So I can, you know, mount it to a folder.
I can open it up.
I can add files to it, et cetera, et cetera.
But everything I do actually gets written to this RAID controller, which then writes it to both of the hard drives.
And so you can, you know, check the file system for errors and stuff like that, just like a normal file system, and under the hood it's doing everything basically twice, right? So,
yeah, I've had RAID 1 set up for years and years, and, uh, yeah, last week one of
the hard drives chose to just completely bite the dust. And I have almost all my family photos on that RAID array. So, you
know, if I didn't have a RAID array, they would have just been gone. I mean, I
probably have some on Google or something, but definitely the original
copies, you know, the original size, would have been just gone. So I highly, highly
recommend doing this. If you don't have a Linux machine and you don't want to go to the trouble of setting this up,
you can buy one of these NAS devices.
So Synology is one, Drobo is another one.
And what they'll do is they basically run the RAID controller
and everything for you.
So you buy this box, it's running Linux.
You can just put as many hard drives as you want in it, and then you can access it from other computers in your house. So it just
stays on all day. And from your desktop, you can go to some network drive and set that up.
In my case, I already have a computer that's running all day, just sitting behind the TV. So I set up my own.
But either way, yeah, I highly recommend setting up a RAID array. There's RAID 1,
which is total mirroring. So you basically end up with half the capacity. If you have more than
two hard drives, you can do some of the other things like there's RAID 5, RAID 6, so on and so forth.
Basically, I think the idea is that, you know, as long as you have N hard drives, you can lose one of them.
Like the RAID 5 is set up so that you can lose one. And as long as you don't lose another one while, you know, the remaining hard drives are, you know, are recovering from the loss.
As long as you don't lose another one in that time, you're OK, which is, you know, it's highly unlikely you're going to lose two in a row.
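To make the "you can lose one drive" idea concrete, here is a minimal Python sketch of the XOR parity trick that RAID 5 is built on. It is a toy under stated assumptions: the three "drives" are just byte strings invented for illustration, and a real RAID 5 array also rotates the parity block across the drives, which this sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_blocks):
    """Parity block for one stripe: the XOR of all the data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Recover the single lost block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical "drives", each holding one 4-byte block of a stripe.
drives = [b"fami", b"ly p", b"hoto"]
parity = make_parity(drives)

# Pretend drive 1 died; rebuild its block from the other two plus parity.
recovered = rebuild_missing([drives[0], drives[2]], parity)
```

Because XOR-ing everything that survived cancels out the surviving data, what is left is exactly the missing block. Lose a second drive before the rebuild finishes and there is no longer enough information, which matches the caveat above.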
So, yeah, definitely set up RAID.
It really saved me.
I went to Amazon.
I bought a new two terabyte hard drive.
When it arrived, I plugged it in where the old one was.
I told my RAID controller, hey, you know, add this new device.
It actually, it told me it was going to take 220 minutes to, you know, replicate.
But even during that time, I could add files to the RAID array.
Like it's smart enough to just work, like be completely available while it's
replicating and all of that. So it's pretty magical, actually, how it works, and,
you know, it totally saved me. So that's my tool of the show. Nice. Also, it seems brave
that... Do you have anything like that? Wait, the 220 minutes, that's only like
three hours. Yeah, well, you know, I was kind of curious. So, yeah, I mean, I just
created a dummy file just to see what happened and, uh, yeah, it was pretty dangerous, but, you know,
I live on the edge, I guess. Okay. Yeah, I have something sort of similar, although I will admit
my, uh, NAS, the network attached storage I have, is RAID inside, but I sort of use backup between
my desktop and it as
the backup strategy, I guess.
So I don't have like...
My individual computers aren't RAIDed.
Yeah, that makes sense. Yeah, that's true.
My desktop isn't RAIDed.
The only thing I have RAIDed is the
family photos.
Okay. And yeah, I also have that
backed up.
And I assumed my NAS was hardware RAID,
but it could be software RAID.
You know, yeah,
I don't think hardware RAID is a thing anymore.
Oh, really?
Because yeah, I was thinking about this.
Like I remember in like early 2000s
when you could get a hardware RAID motherboard.
But nowadays, you know,
the RAID uses such a trivial amount of CPU
because of Moore's
law and all that, I don't think you can even get a hardware RAID. I mean, or if you can, it's like for
some super industrial setting. I will defer to you. Anyways, all right. All right. Um, well, that's my tool
of the show. We, uh, typically do a pitch for Audible if you'd like to support the show. We didn't recommend a book on
audible this week but we have a history on the website of many books which are available on
audible and if you've never tried audible before i think both jason and i are subscribers and so
we both enjoy it and pay happily for our own subscriptions but if you haven't and you would
like to become an Audible member and
get a free book in the process, you can go to audibletrial.com slash programming throwdown
and check out one of the many, many books that they have. And like I said, look in our
show notes for many recommendations we've made throughout time. Short books, long books,
funny books, happy books, sad books. I think they got
something of everything there. It really is
actually kind of overwhelming
and they have sales too. Like while you're
a member, you're able to get other books on a discount
price. So it's a pretty good arrangement. I actually
like it. Yeah, same
here. I'm a big fan.
If you don't want to support us on Audible,
you can support us on Patreon,
or even if you do,
you can support us on both. Um, so Patreon is pretty cool. It's, uh, um, you can give up to a dollar.
Actually, this is the last month before the Christmas show. Um, last Christmas, that was pretty
ambitious. Um, you know, we really set out to give a gift to everybody. We were able to give a gift to
everyone in the U.S. and Canada, but, um, we had
issues shipping international, like shipping overseas. And so I'm going to think really hard
this month about what we can do. Um, I learned a lot, we learned a lot from last year. Um, we have a
ton of fans, which is amazing, but it also means that, you know, logistically, there are things that we're not experts in that we need to get some help with.
We're definitely going to give out the T-shirts and all of that.
We do that every year.
And, you know, that's pretty easy.
They drop ship for us.
But I'm going to, you know, figure it out.
I'm thinking, you know, anything we buy online they'll ship
relatively cheaply.
I don't know if we can give something
to everybody this year, but
I'm going to figure that out.
Stay tuned until next time.
This is the last month
to sign up to be in
the raffle for the t-shirts and
whatever mystery prize
we figure out.
So go ahead.
You can join.
You can get a crack at the mystery prizes and you can leave.
It's fine.
You won't hurt our feelings.
But also, while you're a Patreon subscriber, you get access to the Patreon RSS feed, which
is the super fast download.
Downloads way faster and you can listen to it.
Very reliable and all that.
And you help support the show, which allows us to ultimately
reach more people. Well, I think it's time for the topic of the show. Customer bug handling.
This is something that is actually really hard to get right. So you think about this:
you have all the code on your machine. You might even be cross-compiling, so
you might even be running Linux but building something for Windows or something like that.
I mean, that could be really extreme, right? But you have all of that information. You
might be able to build in debug mode. You end up with this enormous binary, right? I mean, you have
all of your own system libraries, all of that. And so that makes debugging
relatively easy. You can step through the debugger, you can do all sorts of stuff like that. The issue
is what happens when, you know, you send your program to someone else, or you deploy your website,
or you push a mobile app, and then all of a sudden you start getting reports that, hey, it locked up,
it crashed, things like that. Like, what do you do then? And it turns out that question is
not trivial to answer. It's actually kind of surprising that, you know, there hasn't
been just something that has solved this for everybody. It seems like, uh, such a fundamental problem, but I guess
the ecosystem is just so fragmented that it's hard to really do that. Yeah, I mean, I think not just the
fragmentation. I think that we were talking about this a little bit before, when we were preparing for
the show, but I think the scale of things matters a lot too. Um, and so I think in a, you know, video
game, well, hopefully it's a successful video game,
and it runs on many, many people's computers,
where the individual user is, I don't want to say like low sophistication,
but they're not really supposed to be helping you debug your app, right?
Like it just is supposed to work.
And you are almost guaranteed to get crashes just because of the statistics, like
the number of hours your game is running in aggregate across everything. It's like, somebody's
going to drop their, you know, tablet or lose power on their PC while they're playing your game. Um,
that versus, you know, this application I write is for a very important, you know, customer who has
10 people who use it, and it's part of a multi-million dollar business, and they're very
sophisticated, they're very invested in helping me fix my app, and if something goes wrong, I'm
probably coming out to their site and, you know, working with them, um, to get it fixed. You know, the spread there is just incredibly wide.
Yep, yep. So yeah, I mean, it's true. If you have, uh, enterprise customers
and you can actually go on site, then, um, there's a whole lot you can do there, right?
But let's talk about the case in the beginning where, you know, you are, let's say, a video game developer and you're sending your game out to people. They're paying, let's say, ten dollars
for it, um, and you're not really going to be able to fly over to their house. So, um, one thing to do
is turn on debugging symbols. Um, you know, it's one of these things, I mean, you have to pass the, you know, -g flag
in GCC and things like that. And so you might be tempted to build things in release mode without them.
But I mean, honestly, disk space is cheap. You know, network bandwidth is not that cheap,
but, you know, most of the time these debug symbols compress quite
nicely. And one thing, and it's going to be an overall theme for this show, is you don't want
to put yourself in a position where you can't debug the problem. Like, you really don't. You
absolutely don't want to, um, you know, push out a release binary, get a bunch of crash reports
that you can't really do anything with, and then decide,
okay, now I'm going to, you know, take out some of these optimizations, right? I mean, you can
build with debug symbols in opt mode, and it's not going to be that much slower.
I will caveat that a little by saying that, in the start here, we're going to talk about a bunch
of things as sort of a survey of topics.
But we talked about this when we talked about don't roll your own encryption,
that some of these things can have implications that are kind of big.
So turning on debug symbols is very helpful when you're debugging,
but you have to weigh the cost of people understanding more about your code.
So if you're in something where you're distributing code
that you don't want people to know what it does, or it has secret sauce in it,
or a really great filter or whatever, which is not most people, admittedly. But there are certain
cases where you're going to want to be very careful about, I give this to someone, but I
want to limit their ability to understand exactly how it works. Yeah, you know, that's a really good
point.
I hadn't really thought about that.
But I guess what you could do in that case is you could have, you know,
basically you could divide your code up into the secure code and the unsecure code.
And, you know, the secure code could be built without debug symbols
and then as a static library and then just linked in to the unsecure code.
But yeah, I mean, generally, unless,
yeah, if it's a secret sauce, you can,
there's ways around that, but generally you want to.
Just be careful.
Yeah, and I think most of the time,
the errors are typically in the interface,
in the high level part of the code,
which is not, it doesn't generally have
a lot of IP. Yeah, the other part is, you know, you have logs. If you can, you want to leave those
logs running. Ideally, if you're using something like glog or Easylogging++ in the case of C++,
or something like Bunyan if you're using JavaScript, any of these log tools,
they all have log levels, even Log4j
if you're using Java.
And you typically want to allow people
to turn on verbose logging.
So you can have some kind of configuration file.
And when they set verbose to one, you know, they get
all the verbose logs, and then they can, you know, email that to you. You could have a button
where, when there's a crash, you know, it automatically emails the logs, things like that.
But basically you want to have access to that information that you have when you're debugging. That's another way to
think about it is, is if you don't have something when you're debugging, can you still triage the
problem? If the answer is no, then you have to kind of plan for that, right? You have to assume
that whatever crashes you're seeing, you're going to see similar crashes coming from other people.
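As a rough illustration of the "set verbose to one" idea, here is a minimal Python sketch using the standard logging module. The config format (a tiny JSON file with a `verbose` key), the logger name `myapp`, and the function name are all assumptions made up for this example, not something from the show.

```python
import json
import logging


def configure_logging(config_text, log_path=None):
    """Set the log level from a tiny JSON config such as {"verbose": 1}."""
    config = json.loads(config_text) if config_text.strip() else {}
    level = logging.DEBUG if config.get("verbose") else logging.WARNING

    logger = logging.getLogger("myapp")  # "myapp" is a placeholder name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid stacking handlers if called twice
    handler = logging.FileHandler(log_path) if log_path else logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A user who hits a problem edits one line in the config, reproduces the issue, and sends back the file at `log_path`. With verbose off, the debug calls stay in the code but are skipped cheaply.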
So for example, when we were rolling out the first versions of the Eternal Terminal, we
were getting tons of really bizarre crashes because it was running on Windows Subsystem for Linux,
it was running on BSD, there's people.
So we had deployed it at the place where I work, and there was pretty quickly, there was hundreds of people using it.
And we just encountered tons of OS-specific bugs
that were, like, extremely difficult for us to triage
because we didn't have any logs.
We didn't have really any of the stack trace,
any of the debug symbols turned on.
Since then, we've kind of added all this capability.
It's made a big difference.
I think with both rolling logs
and log levels, there's consideration.
On rolling logs, one of the things is
you want to limit how much space you take up,
but you need enough logs
so that if something starts
dumping tons of things
going wrong, you don't miss the thing that started it all and instead only see the symptoms, which
is a problem I run into. Other things you can be clever about there: trying to be careful about
how many of a given kind of error you're seeing. So often you start spitting off the same kind of
error over and over again, and oftentimes it's the first one that's the most interesting. And so making sure you always capture, like, the first of a given kind of error, like sort
of slightly smarter ways. And with log levels, being careful that logging has some cost; spending
time driving down that cost is important. And then making sure that you're not logging at such a high
rate, or in a part of code that is running at a high frequency, that you end
up costing a lot of computation time to do the actual logging. Yep. Yeah, it totally makes sense.
yeah this is something that you know definitely before you ship a binary you want to run it in
verbose mode and just make sure that it isn't just completely blowing up the logs because that can happen very easily. If you put a log inside some really tight for loop,
just one log line can cause, you know,
just megabytes and megabytes of log
to just start building up
almost as fast as you can see it.
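Both points, capping how much disk the logs take and keeping the first of each kind of error, can be sketched in Python with the standard library. This is one possible approach under assumptions: the class name, the logger name, and the limits are invented for illustration.

```python
import logging
from logging.handlers import RotatingFileHandler


class FirstOccurrenceFilter(logging.Filter):
    """Let the first few records with a given message template through,
    then only count the rest instead of writing them."""

    def __init__(self, max_per_message=3):
        super().__init__()
        self.max_per_message = max_per_message
        self.seen = {}

    def filter(self, record):
        # record.msg is the unformatted template, so "failed on %s" with
        # different arguments still counts as the same kind of error.
        key = (record.levelno, record.msg)
        self.seen[key] = self.seen.get(key, 0) + 1
        return self.seen[key] <= self.max_per_message


def make_rolling_logger(path, max_bytes=1_000_000, backups=3):
    """Logger whose file rolls over at max_bytes, keeping `backups` old files."""
    logger = logging.getLogger("myapp.rolling")  # placeholder name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.addFilter(FirstOccurrenceFilter())
    logger.addHandler(handler)
    return logger
```

With this in place, a log call inside a tight loop writes at most three lines, and the earliest occurrence, usually the interesting one, is always among them. The counts in `seen` could be written out at shutdown so the volume is not lost entirely.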
Another thing is, you know,
definitely, you know, handle crashes
and things like that pretty gracefully.
There's a bunch of libraries to help with this,
but basically you want to catch the crash.
And typically what will happen is
you'll try to get the stack trace.
So in the case of Linux,
there's actually a library function called backtrace
that will give you the return addresses
of the whole stack up until
where you are right now. And so you'll want to, you know, run backtrace and then, you know,
dump that to a file on disk or to a log or something like that. In addition to
crashes, I mean, crashes are the most common thing that, you know, you'll want to handle,
but there are also signals. So, for example, if you divide by zero,
um, well, it depends on the programming language. In something like Java or Python you'd get an exception,
but if you divide an integer by zero in C or C++, you're not going
to get an exception. In C there isn't even a concept of that. You're going to get what's called a signal. Um, and you can look up, uh, the different
signals. There's SIGABRT, there's, uh, SIGSEGV. Um, and what you want to do is look up all
these different signals and make sure you're handling them, uh, gracefully. Like, some of them,
like SIGINT, you don't really want to handle, because that's only for when you're debugging
or if someone hits Ctrl-C on your program; maybe just leave it up to them. But, you know,
definitely the big ones like SIGABRT, you know, SIGSEGV, and maybe SIGTERM, when
something external tells your process to terminate. You want to handle those and catch
those. And again, write a log out really quick
or something like that.
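The discussion above is about C and C++, where you would install handlers with sigaction and call backtrace. As a sketch of the same idea in Python, the standard faulthandler and signal modules cover similar ground. The function names and the report format here are assumptions for illustration only.

```python
import faulthandler
import os
import signal
import traceback


def write_crash_report(path, signum, frame):
    """Append the signal name and the current Python stack to a log file."""
    with open(path, "a") as report:
        report.write(f"stack trace: caught {signal.Signals(signum).name}\n")
        traceback.print_stack(frame, file=report)


def install_crash_handlers(path):
    """Log a report on SIGTERM, and let faulthandler cover the hard crashes."""
    # faulthandler dumps the Python traceback on SIGSEGV, SIGFPE, SIGABRT,
    # SIGBUS and SIGILL. It needs a file that stays open while the process runs.
    fault_log = open(path, "a")
    faulthandler.enable(file=fault_log)

    def on_sigterm(signum, frame):
        write_crash_report(path, signum, frame)
        # Restore the default action and re-send the signal, so the process
        # still terminates the way the sender expected.
        signal.signal(signum, signal.SIG_DFL)
        os.kill(os.getpid(), signum)

    signal.signal(signal.SIGTERM, on_sigterm)
    # SIGINT (Ctrl-C) is deliberately left alone.
    return fault_log
```

The handler does as little as possible: write what it knows, then get out of the way. That matters more in C, where only a small set of functions is safe to call inside a signal handler.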
So then on the other end,
now you've dumped all these logs.
You have some way of getting these logs
to some server that you own.
Maybe use GraphQL for that or something else.
And now you have to process all of those logs, right?
And you don't want to be especially if you
have you know let's say 20 or 30 bug reports which you feel like are the same bug you want to be very
efficient at sort of going through those logs and picking out the most important pieces maybe you
even want to write something that will pull the stack trace and dump it to a database you just
have a database full of stack traces.
And so this, I think we've talked about this many times,
but become an expert in grep.
Grep will absolutely save you.
Grep is basically a way for you to just, you know,
search for specific words and files.
If your stack trace
log line starts with "stack trace:",
you could grep for "stack trace:",
and you'll just see all of the stack traces in all of your logs.
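For the case of 20 or 30 reports that might all be the same bug, here is a small Python sketch that does the grep step and then counts identical traces. The `stack trace:` prefix follows the example above; the function names and sample addresses are hypothetical.

```python
from collections import Counter

MARKER = "stack trace:"  # assumed prefix; use whatever your logger writes


def extract_stack_traces(log_text):
    """Return the text after MARKER for every line that starts with it."""
    return [
        line[len(MARKER):].strip()
        for line in log_text.splitlines()
        if line.startswith(MARKER)
    ]


def count_duplicate_traces(log_texts):
    """Count how often each distinct trace appears across many bug reports."""
    counts = Counter()
    for text in log_texts:
        counts.update(extract_stack_traces(text))
    return counts.most_common()
```

Sorting by count puts the crash that hits the most users at the top, which is usually the one to fix first. The same tuples could be inserted into a database table instead of held in memory.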
It's getting more common now to actually write logs as JSON objects.
So like Bunyan does JSON.
A lot of people now are starting to use JSON.
And so you could use jq, which stands for JSON query.
And it's similar to grep in the sense that you can say, you know, just fetch this one object inside of this JSON object if it exists.
Or you could say, you know, give me all of the keys that start with this.
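The two queries just described, fetch one field if it exists and list the keys that start with a prefix, look roughly like this when written by hand in Python over a log with one JSON object per line. This is only a sketch of the kind of query jq is built for; the field names in the usage note are invented.

```python
import json


def parse_json_lines(log_text):
    """Yield each line that parses as a JSON object, skipping anything else."""
    for line in log_text.splitlines():
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON (for example, a stray plain-text line)
        if isinstance(record, dict):
            yield record


def select_field(log_text, field):
    """Values of `field` from every record that has it."""
    return [r[field] for r in parse_json_lines(log_text) if field in r]


def keys_with_prefix(log_text, prefix):
    """Sorted set of top-level keys that start with `prefix`."""
    return sorted(
        {key for r in parse_json_lines(log_text) for key in r if key.startswith(prefix)}
    )
```

For example, `select_field(text, "stack")` would pull out only the stack traces, and `keys_with_prefix(text, "net_")` would show which network-related fields ever appear in the log.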
So definitely become proficient with jq. The other part that's
really important is, you know, when you get the, you know, again, the people who are running your
code, they don't have your code most of the time, right? So they're running your game or your app
or your program, but they don't have the original source code, right? But when there's a crash,
you want to know the line of the source code
so that you can debug, right?
So what you have instead are these pointers,
these addresses,
and the addresses map to lines of code, right?
So every address in the stack trace maps to a line in some source file, right?
And so you can use, on Linux, it's called addr2line.
On OS X, it's called atos.
I don't actually know what it is on Windows.
I'm sure there's something on Windows.
No, there's not.
Okay, yeah.
But these tools, what they will do is, you know, when you run
backtrace on Linux, as I said, you're going to get a list of these pointers, and you can dump
them to a log, let's say in hexadecimal, right? Then on your side, when you get this stack trace,
you're going to need to convert that into the lines of source code. And that's what these tools
do. So these tools take in the binary and they also
i think you run them from the directory that has your source code uh or maybe it's from the
directory that built the binary you have to look up in the manual but basically you'll run it from
a place where they know where your source code is um they know where the binary is that's another
thing is when you ship the binary you're going to want to basically know, get your source
control tagged. So if I ship, let's say version one, I want to tag, you know, my source control
repository and say, this is version one, this is exactly what every source file looks like. And
here's the version one, you know, binary. So so then if you get a crash back you can load
that version one binary you can download that you can get the pointers from that from that stack
trace and using this the the correct you know version of the source code and binary you can
actually recover the lines um in the source where that person was when they crashed, right?
So without this, you're going to be totally lost, right?
I mean, someone's going to say, hey, my program crashed.
And you'll have to kind of say, oh, what were you doing when it crashed?
I mean, it's impossible, right?
You don't want to do that.
So with something like this, let's say, oh, my program crashed.
You say, okay, give me the stack trace.
And you're going to get this list of hexadecimal numbers.
And you can pipe that list into addr2line.
And it will actually tell you the specific lines that the person was on.
And you can follow that all the way up to main.
Like the furthest one will always be the int main
that starts the program.
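Putting the last few steps together (a tagged version, an archived binary, a line of hex addresses), a hedged Python sketch of driving addr2line might look like this. The directory layout, binary name, and function names are hypothetical; the addr2line flags used are the standard GNU binutils ones.

```python
import re
import subprocess

# Hypothetical layout: one archived binary, built with -g, per shipped version.
BINARIES_BY_VERSION = {"1.0": "releases/1.0/mygame", "1.1": "releases/1.1/mygame"}

HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def addresses_from_log(line):
    """Pull every hexadecimal address out of one stack trace log line."""
    return HEX_ADDRESS.findall(line)


def addr2line_command(version, addresses):
    """Build the addr2line invocation for the binary that shipped as `version`."""
    binary = BINARIES_BY_VERSION[version]
    # -e: binary to read debug info from, -f: print function names,
    # -C: demangle C++ names, -p: one readable line per address.
    return ["addr2line", "-e", binary, "-f", "-C", "-p", *addresses]


def symbolize(version, log_line):
    """Run addr2line and return one 'function at file:line' string per address."""
    command = addr2line_command(version, addresses_from_log(log_line))
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

One caveat worth checking for your own build: if the binary is position-independent, the addresses seen at runtime are offset by wherever it was loaded, so the load address has to be subtracted (or logged alongside the trace) before the lookup gives sensible lines.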
So, you know, if you're running,
if this is like a Python program,
then you don't really have to worry about that.
But that also means you're sort of basically
kind of distributing the source code in that case, right?
I think you could use something like PyFreeze,
like one of these compilers
um but i think even in that case it distributes a source code see i don't think there's any way
around it in python um now obviously if you're giving people the source code and the binary they
could even run addr2line on their computer and then just send you the location of the crash. But for most of these languages,
like even Java, you'll have to unwind the stack.
And I don't think you have to use
any special program in Java,
but you'll have to handle that.
So I think we covered,
that's pretty much desktop in a nutshell.
The hardest part there, as I said,
is the crash handlers.
There's a lot of, if you just look up GitHub C++ crash handler,
you'll find a bunch of really great libraries to help you out.
And I think it's definitely something where if you're going to ship,
you also want to practice it, if that makes sense.
So introducing crashes and making sure that your logging is actually working as intended,
because nothing's worse than a very rare crash happening
and the logging isn't configured properly
and you don't have what you need.
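One way to practice, sketched in Python: a deliberate crash function plus a hook that records any uncaught exception, so the whole reporting path can be exercised before a real failure does it for you. The names and the report format are assumptions for illustration.

```python
import sys
import traceback


def write_exception_report(report_path, exc_type, exc, tb):
    """Append an uncaught exception and its stack to the report file."""
    with open(report_path, "a") as report:
        report.write("stack trace: uncaught exception\n")
        traceback.print_exception(exc_type, exc, tb, file=report)


def install_excepthook(report_path):
    """Report uncaught exceptions, then fall through to Python's default hook."""
    def hook(exc_type, exc, tb):
        write_exception_report(report_path, exc_type, exc, tb)
        sys.__excepthook__(exc_type, exc, tb)

    sys.excepthook = hook


def crash_on_purpose():
    """Stand-in for a hidden 'crash' button in a developer menu."""
    raise RuntimeError("deliberate test crash")
```

Calling `install_excepthook(...)` at startup and wiring `crash_on_purpose` to some buried menu item gives a tester a way to confirm that a report actually arrives and names the right function.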
Yeah, actually, I can't remember.
Oh, I think it's Android.
I don't know if it's in the modern Android,
but there was definitely a version of Android,
or maybe I'm mixing things up here,
but there was something I saw where basically you could cause a crash like there was just like a
expert menu and you could go into a developer menu developer settings and one of the developer
settings was crash and and i i just you know out of morbid curiosity i tapped it and yeah it crashed
and uh you definitely want something like that in the first version where you know hopefully you
bury it where way down in the menu but um you know someone who's helping you out can go hit the crash
button and you should be able to get a crash and you should be able to trace exactly the line
uh that crashed and uh and and what you know functions called into that function so on and so
forth so now for the web um you, if you're building a website,
there's two components, your client and your server, right?
The server is pretty straightforward
because on the server, it's you own the code,
you have the binary all in one place.
Even if you're not developing on the server,
you can always push the code to the server.
And so that's not really that um let's say interesting um you could even be running like a totally
interpreted language like python or ruby or something um it's much more interesting when
you get crashes on the browser right now this isn't you know literally crashing the browser
because um that's that's always chrome's fault if you can crash the browser um you because that's always Chrome's fault. If you can crash the browser, that's pretty much on them.
But if your code causes an exception, that's typically the thing you ought to watch out
for.
So for example, you're getting the mean of a list of values, but the list comes back undefined and
you try to read its length
or something like that. (A plain divide by zero in JavaScript just gives you NaN or Infinity rather than throwing.)
So in this case, your browser JavaScript
will throw an exception and you have to handle that.
And so typically on the website,
the way this works is because the JavaScript is,
because the source code of the JavaScript
is sent to the browser,
you don't have to worry about addr2line or anything like that.
One thing you do have to worry about is the demangling.
So if you do, for example,
you run like one of these scripts that compresses your JavaScript code,
it might take out all the new lines.
And then now, big surprise, every crash happens on line
one, right? So typically in JavaScript, you know, when you run one of these compression tools, it'll
create what's called a source map, which basically just says, you know, column 40,000 to
40,030 actually maps to line 3004, you know, and so on and so forth.
And the source map is just this huge file
that maps, you know, chunks of that enormous one-line JavaScript file
to the original, you know, file name and line.
So it's actually similar to addr2line, you know, conceptually.
And so there are tools which will take the JavaScript stack trace of the one line file
and convert it into the appropriate stack trace that's readable.
So you have to deal with that.
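Conceptually, the lookup a source map tool performs can be sketched in a few lines of Python. This is a toy: real source maps store their mappings in a compact encoded format, and the file names, columns, and line numbers below are made up to mirror the "column 40,000 maps to line 3004" example.

```python
from bisect import bisect_right

# Toy stand-in for a source map: sorted (start_column, file, line) entries
# for a one-line minified bundle. All values here are invented.
SEGMENTS = [
    (0, "src/app.js", 1),
    (40000, "src/stats.js", 3004),
    (40031, "src/stats.js", 3005),
]
STARTS = [start for start, _, _ in SEGMENTS]


def original_position(minified_column):
    """Map a column in the minified file back to (source file, line)."""
    index = bisect_right(STARTS, minified_column) - 1
    if index < 0:
        raise ValueError("column is before the first mapped segment")
    _, source_file, line = SEGMENTS[index]
    return source_file, line
```

Each segment covers everything from its start column up to the next segment, so a binary search for the last start at or before the crash column is enough to recover the original location.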
But basically, it's a similar idea where when there's a crash, because it's a website, you don't have to worry about how to get
the crash to the server because you already have like a client server framework in place. You just
use whatever you're using to send data between those two machines. So use AJAX or even WebSockets,
whatever you want. But then on the server, you know, you'll have to do some work.
On mobile, it's very different.
So I've linked to a couple of libraries.
For Android, it's ACRA.
And for iOS, it's KSCrash.
And yeah, basically, you know, these libraries will,
it's more similar to desktop than to web, right? So these will send, um, crash logs to some endpoint, you know, that you have to set up. Um, sometimes they can even send it
through some iOS or Android infrastructure, and then they'll arrive in the, you know,
developer store, the developer platform or front end. Um, you can
collect these crash logs.
But, you know, I think in the case of Android and iOS, it's much more handled by the actual
app stores. So I know I made an Android app a while back. It's been probably six years.
But at that time, they had a set of, you know, crash reporting tools. So I would just get a list of crashes
and they would provide the stack traces.
So they would do a lot of that for you.
So I think definitely take a look at
what the Play Store can do,
what the iOS App Store can do.
But here's a couple of libraries
that streamline that and make it even easier.
I've not developed much on mobile,
but I feel in some ways mobile might be the easiest
because even though there's lots and lots of mobile devices,
it feels like there's far less than the number of configurations
of computer hardware or web browser configurations.
And so I feel like the number of your ability
to potentially be able to replicate the issue might be much higher.
Yeah, that's a really good point.
There is, let me see if I can find it.
There's a really good tool which simulates different browsers.
I won't look it up now in the interest of time.
But if you kind of Google around for, I think it's called like a browser simulator. But basically, literally on the left-hand side,
it has a panel and you can choose like Chrome, Firefox.
I'm sure it can't simulate everything
because it's just trying to run your JavaScript
kind of through this middleware layer.
It's not literally executing different browser code.
But it'll try its best to, at least for CSS and things like that,
catch a lot of those issues. But yeah, you're right. I mean, browser is in a sense the worst,
although things are kind of homogenizing now where I think Chrome, you know, Brave, Firefox,
you know, most things now are kind of pretty cross platform.
Cool.
Cool.
Yeah, definitely.
Send us your worst debugging nightmare where someone has a crash and there's absolutely
no way to replicate.
We always love hearing stories like that.
I've definitely caused more than a few of those myself.
This is why it's what inspired this episode.
And hopefully with these tips, you can kind of avoid doing it.
You could set people up where if they have a crash, they can send you some information.
You'd be able to triage.
One thing, definitely expect everything to crash.
Like, whatever you write, you have to have some way to handle crashes, either, as Patrick said, because of different
client software, um, sometimes there's people who are running, you know, Windows XP, and you'd be amazed,
like, even the simplest programs don't work. Um, people just do things you don't expect. There are
situations you don't expect. There's internet, um, like, some people's internet connections
are not as reliable. And, um, you know, it's very, very hard to plan for all of these things. So you
definitely expect failure and, uh, have the right way of, like, ideally automatically just sending
some information to you. Cool. That's pretty much all I have. All right. Until next time. All right. Next time is the Christmas show.
And actually, for January, we have a special guest.
This might be our...
Hopefully. Hopefully. Hopefully.
Nothing against... Yeah, yeah, that's true.
Actually, I won't spoil it. I won't spoil it.
But we have an amazing guest.
Yeah, hopefully.
And I'm really excited.
Definitely, it's going to be absolutely phenomenal.
We're going to start the year off in an extremely special way.
But next month, we're going to have our Christmas giveaway or holiday giveaway.
And, you know, hopefully give out some really cool T-shirts and some other prizes.
The intro music is Axo by Binärpilot.
Programming Throwdown is distributed under a Creative Commons
attribution share-alike 2.0 license.
You're free to share, copy, distribute, transmit the work,
to remix, adapt the work,
but you must provide attribution to Patrick and I
and share alike in kind.