Programming Throwdown - Customer Bug Handling

Starting point is 00:00:00 Programming Throwdown, episode 84: Customer Bug Handling. Take it away, Jason. Hey, everybody. How's it going? Episode 84. This is going to be pretty interesting. I'm actually pretty excited about this. This is something that they don't really teach that well in school, and you could easily get bitten by it. I've definitely gotten bitten by it. So we get to talk about it. But before we do that, let's talk about some really funny computer science pop culture. How many of you have seen some type of TV show, especially earlier on, let's say the 90s or early 2000s, where they try to talk about computers and computer science and it's just such an epic fail?

Starting point is 00:00:58 We posted a few of them here. Definitely check out the website, ProgrammingThrowdown.com, and you can see them. But just spoilers here, my favorite is the first one. They're trying to track some criminal hacker, and this woman's like, "He's doing it in real time." And then this other lady goes, "I'm going to write a GUI in Visual Basic to track his IP address." Dead serious. And then she

Starting point is 00:01:27 just walks away. And I'm pretty sure at the end of the movie, yeah, that GUI saves the day. It's pretty epic. And then there's a couple of other ones. Definitely check them out on the website. "Who is this 4chan guy?" is another good one. If you have any good ones, send them over to us. It's just no end of entertainment there. People have now even made playlists. There are whole playlists of just hacking video fails. One other thing to mention: a lot of people wrote in about our last episode, which was the episode on teaching kids to code. And let me pull it up right now. There were some absolutely phenomenal suggestions out there.

Starting point is 00:02:16 While you pull that up, I'll give my contribution to this, which is I have nothing to say on computer science references, but I know this is a thing that other people talk about as well. Like I've seen YouTube videos from a biologist saying the biology in movies is all wrong. Astrophysics is all wrong. I think Neil deGrasse Tyson does some about space travel and astrophysics and ruins movies, you know, by pointing out how bad the plot is. And one I saw recently was people complaining about how you learn jazz music. So I think this is a universal thing where movies try to take artistic liberty

Starting point is 00:02:55 because you want to make an interesting movie. And people are like, oh, they don't know what they're doing, they're so dumb. But I don't know. I think personally they might know, or they could go figure it out. Like, "I'm going to make a Visual Basic GUI to hack his IP address." Yeah, sure, that doesn't actually make sense, but to 99.9% of people that sounds like the equivalent of Avada Kedavra. It's like, oh yeah, you're just making Harry Potter spells. It's just an incantation. They don't need it to be right. They're picking real words, but it doesn't have to be meaningful. Yeah, that totally makes sense. I wonder if now, because

Starting point is 00:03:30 I feel like nowadays they're a bit better about that. So I wonder if maybe they consult with somebody. It's just that when these shows were made, that was so unpopular and passé that they just made stuff up, because nowadays it seems much more legit when I watch more recent shows. That's right. And I think famously in the Die Hard movie the bad guy is, oh, I'm forgetting, either German or Russian. That's what they insinuate, but I think he's just speaking gibberish. And in the German or Russian version, whichever it is, the bad guy is from some other country, but he's speaking the same words, because they're just sort of gibberish words. I believe I'm recalling this correctly. But then

Starting point is 00:04:14 someone was pointing out that in Game of Thrones they actually consulted. There's a guy who's the Hollywood consultant for foreign languages or invented languages. He's a linguist. Oh. And so in Game of Thrones, this Dothraki language, or whatever, is actually modeled after realistic languages, and it's a whole language. It's a whole thing. And when they need a new word or sentence structure that they've not had before, they consult with him and he helps them come up with a cohesive system, as if it really were a language. Wow, that's cool. Yeah, I think I recently saw the show Scandal. My wife was watching it and I was catching a couple of episodes, and they had some hacking sequence and it kind of made sense. I mean, it wasn't

Starting point is 00:05:00 I don't know much about hacking myself, but it definitely seemed a little out of place. But, you know, they had a bash command, and they're doing, I don't know, ls, and they're going to a directory and looking at a file. It was much better than the ones we posted, but also much less funny. But yeah, as far as teaching kids to code, Anthony wrote in. He had a ton of really amazing resources. One of them, which I actually have for my kid, but I forgot, is Kano. I don't know. Do you have a Kano kit?

Starting point is 00:05:33 No. Or have you heard of this? So it looks about like a Raspberry Pi with a case, maybe a little bit bigger than that. But it has a little joystick built right onto the motherboard, and the joystick kind of pops out of the case, and it's also got some buttons. They have Snake and a couple of simple games, but you actually have to build the computer yourself. You don't have to solder or anything crazy, but you have to plug

Starting point is 00:06:00 the battery in, you have to actually plug the buttons in, and you have to kind of assemble the case and everything. So that was one he mentioned. Also Scratch, that's pretty popular. It's like a little graphical programming language. And then Betty wrote in. Betty's actually a math professor. I won't say the university, because I don't really know if people mind us revealing their identity or not. So with that in mind, I won't say where she teaches, but she's a math professor and she recommended something called Bricklayer.org, which looks really cool. She actually founded it, and it looks like it has a bunch of different things for using LEGO to teach kids math and things like that. So, some really cool resources. I was actually really impressed that a math professor even listens to the show, so

Starting point is 00:06:56 kudos to you, Betty. And also Anthony, thanks for the heads up. All right, man, is it time for news? Links? I think so. I completely lost mine. Do you want to do the first one? For some reason, I totally lost mine. Okay, no problem.

Starting point is 00:07:17 All right, so the first one I have: I ran across an article about being able to run simple SQL-like queries in a spreadsheet, which apparently is built into both Excel and Google Sheets, but this is using Google Sheets, Google's online version of Excel, the spreadsheet application. I find it very difficult sometimes when I want to do something with manipulating or filtering or searching for data, and there almost always is a way to do it. And so this points out that you can do some common SQL operations within a column in the spreadsheet. Do you have to install something? No, no, it's built in. It does not let you do joins, but it does allow for filtering. To me, the GROUP BY clause, you know, min of something, group by, order by, this

Starting point is 00:08:08 is more, I don't know if intuitive is the right word, but it's more at hand to me because I've done that more recently. And so I'll link this, but this is from Ben Collins. There are some videos linked in there as well. To be fair, I honestly didn't watch the videos, I just looked at the examples and sort of got the gist. And so you can say equals QUERY, the columns you want, you know, select this column, this other column, this other column, where, you know, the whole thing. So that was pretty cool. I had no idea. And then apparently there's also some stuff that works in Excel. So if you ever find yourself needing to use spreadsheets. I find it's one of those weird things: there are many, many people who spend enormous amounts of time in spreadsheets

Starting point is 00:08:55 and really know what they're doing, when a Python program probably could have done this way easier, but it's just what they're comfortable with. But on the flip side, I find a lot of times... I think the GUI for Excel is just amazing. The user experience is so nice. I'm really surprised that more languages don't have something like Excel, where you can just easily see everything and things like that.
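For comparison with the spreadsheet approach, here is a minimal sketch of the "min of something, group by, order by" style of query run from Python over CSV data. It uses the standard library's csv and sqlite3 modules rather than a spreadsheet. The column names (`category`, `price`), the sample rows, and the function name are made up for illustration and do not come from the article.

```python
import csv
import io
import sqlite3

# Hypothetical CSV data; in practice this would be a file dumped from some tool.
CSV_TEXT = """category,price
fruit,3.50
fruit,1.25
tools,9.99
tools,12.00
books,7.00
"""


def min_price_per_category(csv_text):
    """Load CSV rows into an in-memory SQLite table and run a GROUP BY query."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (category TEXT, price REAL)")
        conn.executemany(
            "INSERT INTO items (category, price) VALUES (?, ?)",
            [(row["category"], float(row["price"])) for row in rows],
        )
        query = (
            "SELECT category, MIN(price) FROM items "
            "GROUP BY category ORDER BY category"
        )
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for category, lowest in min_price_per_category(CSV_TEXT):
        print(category, lowest)
```

Whether this is quicker than typing a formula into a sheet depends on how comfortable you are with each tool, which is the trade-off being discussed here.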

Starting point is 00:09:23 And I will say that people from a CS background, though, I guess you're sort of alluding to the same thing, tend to underutilize spreadsheets, because sometimes it really is fast just to dump a bunch of data into a CSV and open it up in Excel, Google Sheets, the OpenOffice one. I'm not sure what it's called there. Numbers?

Starting point is 00:09:46 No, Numbers is the macOS version. Yeah, there's LibreOffice, which is a fork of OpenOffice. That's the only one I know. Yeah, but they have a spreadsheet equivalent, but I don't know what the name of it is in that suite. Do you know? Okay. Oh, I think it's just called Sheets.

Starting point is 00:10:03 I could look it up. Okay, it doesn't matter. Any of those. Anyways, dumping a CSV, opening it up in there, and being able to do something really quick, depending on your data processing skills in Python, may or may not be faster. With stuff like Jupyter Notebooks and pandas,

Starting point is 00:10:20 that story is becoming slightly less awesome if you know how to use those tools. They can often be almost as effective. Okay. But yeah, anyways, if you do know how to use this, or are stuck with some data in that format,

Starting point is 00:10:33 check it out. Yeah. It sounds awesome. Yeah. Actually, there's RStudio, and that's the only thing. I mean, I guess MATLAB,

Starting point is 00:10:42 right? I think MATLAB has a little spreadsheet-type thing, but I guess that's kind of like a sledgehammer, right? There really isn't a good, you know, if you want that Excel style where all of your matrices are just completely visualized like that, in those sheets, there's really nothing out there. I'm actually shocked. Well, actually, there are some products, I've just never used them. There's Tableau, and there's another one, I can't remember the name of it. I think people have done this, it's just none of them have gotten any popularity. All right, that's super cool. I'll have to try that out. My news is about decentralization

Starting point is 00:11:25 issues, or sorry, deserialization issues. So a lot of people might not have tried this, but in Python, you have pickle. And in Java, you have, I can't remember, I think it's just called the Java serializer. And you have similar things in Ruby and things like that. And it's pretty amazing. In Python, you can pickle almost any object. If you try to pickle, let's say, a file pointer or something like that, you'll get an error, but you could pickle almost any object without having to write any serialization code. You don't have to concatenate a bunch of fields into a string and then figure out how to pull them back apart. It just does the deserialization for you. The

Starting point is 00:12:12 serialization does everything for you, and it writes it to some binary file. So it seems kind of like magic. It seems really cool. The thing about it is it's so open-ended that, well, for one, it's not very efficient, so a lot of people don't really use it other than for prototyping and things like that. Also the language upgrades, so if you go from Python 2 to Python 3, that could cause some issues. But on top of that, there are some real security vulnerabilities. And I think that again comes back to the fact that you could pickle almost anything. So there's actually a way where, and you'll have to read the article to get the details, but I think at a high level what's going on is, so you pickle a file and then you unpickle it

Starting point is 00:13:07 later to get back your object, right? But let's say someone has access to that file and they can manipulate it or replace it with another file. Then you unpickle it, thinking that it's the original file, and just start using that class. But it turns out that person has kind of poisoned it. So for example, just a really naive example, let's say you could just pickle lambda functions. So you could pickle a whole function with all the operations. And so you have .x

Starting point is 00:13:38 equals three. So you have class MyClass, with x equals three. You pickle MyClass, and so that file now says, okay, MyClass had an attribute x and it was an integer of three. Now someone comes along and replaces that file with one that says, oh yeah, MyClass had a variable called x and it was actually a function that wipes your hard drive. And then later on you unpickle that and call, you know, MyClass.x, and you lose your hard drive, right? So, I mean, I don't think it's that simple, but at a high level, that's kind of what's going on: someone can actually inject really dangerous code, and when you unpickle, you get hit by that. But it's a cool article.

Starting point is 00:14:26 They actually link to some other articles. So it's a bit of a rabbit hole, but you can follow the rabbit hole down to some really good presentations where people talk about it in detail. And they show you, they walk you through examples of how exactly this can happen. And yeah, I just found this stuff fascinating. It's an attack vector that I didn't really realize was the thing until just now. Yeah, serialization and deserialization are tricky and important to get right, especially if other people are providing you data. Yeah, exactly.
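To make the risk concrete, here is a minimal sketch in Python. It is not the exploit from the article. It only shows the underlying mechanism: a pickled payload can tell `pickle.loads` to call a function of the payload author's choosing. The `Poisoned` class and `round_trip` helper are hypothetical names for this illustration, and the function being called is the harmless built-in `len`. An attacker would pick something destructive instead.

```python
import pickle


class Poisoned:
    """Stand-in for attacker-crafted data (hypothetical, for illustration).

    __reduce__ tells pickle: "to rebuild me, call this function with these
    arguments". Here the function is the harmless built-in len.
    """

    def __reduce__(self):
        return (len, ("attacker-chosen call",))


def round_trip(obj):
    """Serialize to bytes and load back, as if via a file on disk."""
    return pickle.loads(pickle.dumps(obj))


if __name__ == "__main__":
    # Ordinary data comes back unchanged with no custom serialization code.
    print(round_trip({"x": 3}))

    # The poisoned payload does not come back as a Poisoned object at all:
    # unpickling it called len(...) and returned whatever that produced.
    surprise = round_trip(Poisoned())
    print(type(surprise), surprise)
```

The call happens inside `pickle.loads` itself, before the program ever touches the result, which is why unpickling data from a source you do not trust is unsafe no matter what you do with the object afterwards.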

Starting point is 00:14:55 Yeah, I mean, anytime someone else is contributing data, you have to really expect just about anything. I have no idea why you would use pickle for data that's coming from other people. Maybe it's because you're unpickling crash logs, in which case that might not be the right tool for that job. So on a related topic, I ran across, I actually think I got this on the programming subreddit, StackSort, à la XKCD. So there was an XKCD, oh, I should have brought it up so I could link that as well. There was an XKCD comic where they described an ineffective sort, and one of the ineffective sorts, is it here? No, there was one where they

Starting point is 00:15:48 basically alluded to, let me see if I can find it, searching for a sorting algorithm. Oh, here it is. Perfect, linked. I should have just gone to the GitHub page. No, it is there. So basically, the XKCD was showing crappy ways to do various sorts. So things like, you know, pick a random number, and then at some point run various commands like rm -rf *, you know, just crazy things. It was just sort of a joke, right? Like XKCD stuff. So this person tried to say, oh, I know of

Starting point is 00:16:27 a way to do an ineffective sort: we'll search Stack Overflow for how to sort a list in JavaScript and we'll run the example code. People put example code, so we'll find top-voted answers where there's code, and then we'll try to run it and see if it sorts the list. And so if you go to the website, which is sort of super sketch and even says, I'm going to pull some scripts from Stack Overflow and eval them, so like, hey, this is probably a really bad idea, you probably don't want to do this. But if you do, it goes and pulls it and then tries to see if it'll sort the list or not. So most of them turn out not to be runnable, but occasionally you'll run across one that will work.

Starting point is 00:17:17 And so you just keep running them until you find that it's sorted. And of course you could check if it's sorted. It's a little more clever, though, because the first immediately obvious thing is to just start upvoting an answer that does something very malicious, like uploading all of your browser history or, you know, running a, you wouldn't mine Bitcoin, but running a cryptocurrency miner in JavaScript. And so you could do something really crazy malicious. So what he did was limit it to posts that came out

Starting point is 00:17:51 before he sort of pushed this up so that people wouldn't know that he was going to do that. Yeah, so it was at least somewhat thoughtful. So anyway, so I thought that was kind of like a good joke. Oh, that's what it was. My brain is not working this episode. I apologize in advance. So it was the alt text of the XKCD.

Starting point is 00:18:12 So this is something I didn't know about for a long time. I guess I'm not enough of an enthusiast. There's an XKCD Explains website, I think is what it's called, where they explain some of the jokes, because sometimes it's not obvious, even to me. But then a lot of times there's stuff when you hover over the image. It's supposed to give you the explanation of what the image is, this alt text, and a lot of times there's this funny alt text. So the alt text of the ineffective sort page on XKCD is: StackSort connects to Stack Overflow, searches for "sort a list,"

Starting point is 00:18:46 downloads and runs the code snippets until it finds one that sorts the list. That's what the alt text says. And this person implemented that. Sorry, I completely butchered the beginning of that story. But if you didn't know about this... It's amazing. You should absolutely try that. Or don't. Like, I'm not sure.
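The control flow of that joke can be sketched without touching the network or running anyone else's code. This is a minimal sketch in Python, not the JavaScript of the actual site: the "snippets" are three local stand-in functions (all hypothetical), and the loop simply tries each one until a result passes the is-it-sorted check.

```python
from collections import Counter


def is_sorted(items):
    """True if every element is <= the one after it."""
    return all(a <= b for a, b in zip(items, items[1:]))


# Local stand-ins for downloaded snippets (hypothetical; nothing is fetched).
def broken(items):
    raise RuntimeError("snippet did not run")


def reverse_it(items):
    return list(reversed(items))


def builtin_sort(items):
    return sorted(items)


def stack_sort(items, candidates):
    """Try each candidate until one returns a sorted permutation of items."""
    for candidate in candidates:
        try:
            result = candidate(list(items))  # pass a copy so input stays intact
        except Exception:
            continue  # most snippets turn out not to be runnable
        same_elements = (
            isinstance(result, list) and Counter(result) == Counter(items)
        )
        if same_elements and is_sorted(result):
            return candidate.__name__, result
    return None


if __name__ == "__main__":
    print(stack_sort([3, 1, 2], [broken, reverse_it, builtin_sort]))
```

The permutation check matters: without it, a snippet that returns an empty list would count as "sorted".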

Starting point is 00:19:02 Sounds immediately like a terrible, terrible thing to do. Yeah, it mines a Bitcoin, sends it to Patrick, and then sorts the list. Takes a long time. I did notice, while we've mentioned Bitcoin now a couple of times, that the number of times I overhear cryptocurrency being mentioned randomly while walking by has dropped, very strongly correlated with the drop in price. Yeah, I mean, nobody's really talking about it anymore. It's pretty wild, actually. They started cracking down on the browser-based miners. I saw that. Yeah, so there was an issue where there were

Starting point is 00:19:39 some websites that were just mining cryptocurrency. So you would go to the website, and while you're on the website doing whatever you're doing on there, it's mining cryptocurrency. One of the examples is actually BitChute, which we talked about in the past, which was a way to watch videos kind of like YouTube. They were running a cryptocurrency miner, and they posted in their blog that they just turned it off because of pressure. I guess their ISP was going to shut them down or something. But up until now, they've been just mining Bitcoin on your computer, which is pretty wild. Yeah.

Starting point is 00:20:18 So I don't know. The reason why I said not Bitcoin is because I think Bitcoin from a CPU perspective is really bad now, but there are other coins which are still designed purposely to work pretty well on a CPU. I'm not sure of the exact details. This is off topic, I guess, but it falls into a little bit of a murky thing, because at some level, if I go to YouTube and I'm watching a video for free, you can send advertisements. Or, if you disclosed it properly, maybe not the whole time watching a video, but for the first 30 seconds that I'm watching a video, it's computing some hash, doing some work, whatever. And that's the exchange for me being able to watch the video, to help offset the costs.

Starting point is 00:21:05 At some level, it seems like it could be a reasonable business model. The issue is just that doing it without people's consent seems a little slimy, but I'm not clear why it became such a major issue. I think if they had popped up a box saying, you know, you have to agree to mine Bitcoin to watch this video, it would be a totally different story. Yeah, I guess. I don't know how I feel. But people installing malware that does it on your computer obviously sucks, right? They're doing something completely without your permission. And there were some malicious, I believe, Chrome add-ons and other browser add-ons where people were doing it as well, you know, slipping them in. Well, there was an article yesterday or two days ago

Starting point is 00:21:45 where there was a Node.js package that checked to see if you have a wallet on the computer, and if it had a certain number of bitcoins, it would get your private key and send it to a command-and-control server. And this Node package was imported by lots of major projects. I mean, that's obviously horrible, right? Or putting some sort of obfuscated crypto miner in a package that you manage as an open source thing without telling anybody. I think those things are all pretty bad. But yeah, in general, it seems kind of like a decent trade you could make: I'll go to your website in exchange for donating some of my compute power. Yeah, I mean, it seems fine to me. As you said, as long as the consent is handled correctly, I think it's fine. All right, my show topic is a retrospective on GraphQL.

Starting point is 00:22:46 So we haven't actually talked a lot about GraphQL. I think GraphQL would definitely be a great show, and I'll add that to my list of things as I think about it. But just to briefly explain GraphQL: it's a replacement for REST. So you typically have a REST API, and you'll have a bunch of endpoints. So if I go to my website, /api/me, it maybe returns some JSON with my information. If I go to /api/user, you know, question mark, Patrick, then it gets his information if I have access to it or whatever. And so you could build endpoints like this. But especially for graph-based information, it kind of gets out of control.

Starting point is 00:23:39 And it turns out a lot of things can be modeled as sort of a graph. So imagine you have email. You have your inbox folders, and each of those folders has emails. Each of those emails has attachments. You can kind of see there's this sort of tree structure developing, right? And so you end up writing these really complicated query handlers for being able to return data.

Starting point is 00:24:04 And typically, you end up needing to access different slices of data at different times. And rather than write, you know, 10,000 query handlers, you end up writing one that returns more data than you actually need, just so that it can service three or four different types of requests. And it just gets very messy. And so GraphQL is a replacement for REST where you basically say, GraphQL, you own this part of my website. So it's like /api/graphql, and everything in there is owned by GraphQL. And then it will basically handle a lot of that for you. And so you can execute different queries and you can ask for,

Starting point is 00:24:53 you know, certain pieces of information, and you'll only get back that information. It's really nice. I highly recommend it, especially if you're building a website that needs to fetch data in that format. If you're building an email server, that's a great example. It's definitely a good thing to learn if you're doing anything web. And this company talked about their evolution from a REST API to GraphQL, what that was like, and they also have a list of lessons learned. So I'd definitely give it a read.

Starting point is 00:25:28 It's pretty interesting. They've been using GraphQL for two years. So they have a lot of different experiences. They have things that they like, things they didn't like, certain packages that they were using, things like that. And so, yeah, if you're doing any web stuff,
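The "ask for certain pieces of information and only get back that information" idea can be illustrated with a toy in Python. This is not GraphQL itself, neither its query language nor any GraphQL library. It is a sketch of field selection over nested data, using the email example of folders, emails, and attachments. The `select` function, the `MAILBOX` data, and all field names are hypothetical.

```python
def select(data, selection):
    """Return only the fields of `data` named in `selection`.

    `selection` maps a field name to True (take the value as-is) or to a
    nested selection dict. Lists are projected element by element.
    """
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is True else select(value, sub)
    return result


# Hypothetical mailbox: folders contain emails, emails contain attachments.
MAILBOX = {
    "owner": "jason",
    "folders": [
        {
            "name": "inbox",
            "unread": 2,
            "emails": [
                {
                    "subject": "hello",
                    "body": "a long message body",
                    "attachments": [{"filename": "a.png", "size": 1024}],
                }
            ],
        }
    ],
}

if __name__ == "__main__":
    # One generic handler serves many "slices": the caller names the fields.
    wanted = {"folders": {"name": True, "emails": {"subject": True}}}
    print(select(MAILBOX, wanted))
```

The point of the sketch is the shape of the trade: instead of one endpoint per slice, or one endpoint that over-fetches, a single handler takes a description of the fields the client wants.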

Starting point is 00:25:42 definitely check it out. All right. I think it's time for Patrick's book of the show. Yeah, so we're going to do something a little bit different. There's not too much time between last show and this show, so I have to admit I haven't been doing much reading, and Patrick hasn't been playing enough video games. So Patrick's going to handle the book of the show and I'll do the tool of the show. Yeah, so with the Thanksgiving holiday, I just didn't end up commuting as much, and as I've attested before, I call it reading, but I actually mostly listen to the books. And so I instead

Starting point is 00:26:19 was using a different book over the Thanksgiving break, and I decided, hey, this would be good to talk about on Programming Throwdown because it's very programming related, and that is a cookbook. That was sarcasm. That has nothing to do with programming. This book is The Food Lab by Kenji, I don't actually know how you say his last name, I guess it's López-Alt. I know him, well, not know him, I read about him. He writes for a website called Serious Eats. And although there have been more nerdy food websites around in the past, this one is a good balance of a sort of nerdy tech approach to doing things without it being a gimmick. And so he just takes what I would call an engineer-style approach to how to cook, where you'll see things like, in traditional

Starting point is 00:27:12 French, uh, a lot of engineers talk about Good Eats, if you ever watched it, which was Alton Brown's Food Network TV show where they kind of played up the science of cooking. It's sort of like that. Like, oh, in traditional French tradition, oh man, I'm really out of it, you're not supposed to wash mushrooms under running water because they would absorb too much of the water. And so you're supposed to use this dry brush

Starting point is 00:27:37 and really scrub at them to get the dirt out, because mushrooms are really dirty. But then, is that true? That's easy to test. Just dump a bunch of mushrooms in a bucket of water for a while, then put them on a scale and see if they got heavier, and if they didn't get much heavier, then they're not absorbing water. But those things are, how would you say it, not even acceptable to question, because that's how people have done it for hundreds of years, if you're adhering to haute, like, high cuisine. Yeah, like, what do you mean? You do it

Starting point is 00:28:09 because you do it, and there's just no questioning of it. So one of Kenji's big ones is, like, what about the reverse sear? So if you're going to try to crisp the outside of a piece of meat, tradition says you always sear it first and then braise it or whatever. So I'm going to cook the outside really hot, and then put it in a pot in the oven and cook it really low. And is that better or worse than what they call the reverse sear, which is cooking it low and slow and then searing it at the end? And that's easy: just try it both ways and then make an objective call or whatever. But people just, you know, kind of don't. So they do this and they actually

Starting point is 00:28:49 do measurements. And yeah, so this Food Lab is sort of his approach, where he tries to say, hey, if you want scrambled eggs this way, do it this way; if you want them that way, do it that way. The trick isn't this or that. It's just, oh, add a little bit of water and your eggs are fluffy. Why? Well, because the water turns to steam, and the steam is escaping at about the same temperature that the proteins in the eggs set, and so it makes your eggs fluffy. So if you want fluffy eggs, add a little bit of water to your eggs when you scramble them. This is really sort of common sense. They're not really exactly food hacks. I don't know, I sort of relate to it. I feel like this is the kind of cookbook I want, rather than this very fancy, highfalutin thing

Starting point is 00:29:29 where it tries to impress you with, I don't even know how to pretend to be that, you need, you know, one-centimeter-long julienned carrots. It's like, okay, great. Yeah, no, I'd much rather have a practical book that says, you know, here are the things you already cook, here's how to do them better. Yeah. So then there are recipes for stuff I haven't cooked, and good ideas. And I just feel like, oh, if I go do this, I'm more likely to know, if it didn't work,

Starting point is 00:30:00 that it was something I did rather than this is a bad recipe. So anyways, yeah. So that's the Food Lab by Kenji Lopez-Alt. And we'll have a link in the show notes. And similarly to the stack sort we were talking about earlier, I do find myself and my family, the internet is sort of weird for recipes because you just go to Google and you say like, you know, French toast. And then you can find just whatever website appears

Starting point is 00:30:25 at the top, you click it, and if it looks reasonable, you cook that recipe. And then in a week you're like, I liked that. That's basically how I cook everything. Oh, okay. But then if you like it or don't like it, what do you do the next time? Because if you search it again, it's not guaranteed the same site will come up. Oh yeah, I mean, if I cook it and I like it, I bookmark it. Okay. Okay. Yeah.

Starting point is 00:30:46 Yeah. But anyway, so I still like cookbooks. I have a series. I might recommend. I might have a couple shows where I recommend them because I kind of enjoy cooking as a hobby, I guess. I'm not particularly great at it, and I don't do it sort of all the time. No, I think it's a great hobby. I like it as a hobby.

Starting point is 00:31:01 Yeah, it's fun. It's something you have to do anyways, so you might as well make a hobby out of it. Sure. Yeah, so that's my book of the show. Very cool. All right, my tool of the show is actually a software RAID controller, or just using RAID in general. So, quick story: about a week ago, I turned on the computer that I have sitting behind the TV in the living room, and it told me that one of my hard drives failed. And sure enough, it just completely failed.

Starting point is 00:31:38 Like, no warning. Actually, my wife told me a few weeks ago that she was hearing this kind of clicking sound coming from the computer, so there was a little bit of a warning. But then it went away, and then all of a sudden the hard drive just spontaneously died. Fortunately, I had a RAID 1 backup, and what that means is I actually had two two-terabyte hard drives that were RAID 1. They were mirrored. So every time I would write a byte to one, it would write to the other one.

Starting point is 00:32:09 And it's done automatically. So basically, the way this happens, I don't know how this works on Windows or OS X or anything like that, but on Linux, all of your hardware is in the /dev directory. So anything in /dev is not a file on your hard drive or, I guess, on your file system.

Starting point is 00:32:33 It's an actual pointer to some type of device or some type of engine, like /dev/random or something, right? So your hard drives might look something like /dev/sda, /dev/sdb for your second one, /dev/sdc, and so on and so forth, right? What the RAID controller does is it makes a new device. I think mine's called /dev/md0. And when I mount that device, it looks and feels just like a hard drive. So I can mount it to a folder. I can open it up.

Starting point is 00:33:09 I can add files to it, et cetera, et cetera. But everything I do actually gets written to this RAID controller, which then writes it to both of the hard drives. And so you can check the file system for errors and stuff like that, just like a normal file system, and under the hood they're doing everything basically twice, right? So yeah, I've had RAID 1 set up for years and years, and one of the hard drives chose last week to just completely bite the dust. And I have almost all my family photos on that RAID array. So, you know, if I didn't have a RAID array, they would have just been gone. I mean, I probably have some on Google or something, but definitely the original copies, you know, the original size, would have been just gone. So I highly, highly

Starting point is 00:34:00 recommend doing this. If you don't have a Linux machine and you don't want to go to the trouble of setting this up, you can buy one of these NAS devices. Synology is one, Drobo is another one. And what they'll do is they basically run the RAID controller and everything for you. So you buy this box, it's running Linux. You can just put as many hard drives as you want in it, and then you can access it from other computers in your house. So it just stays on all day. And from your desktop, you can go to some network drive and set that up.

Starting point is 00:34:39 In my case, I already have a computer that's running all day, just sitting behind the TV. So I set up my own. But either way, yeah, I highly recommend setting up a RAID array. There's RAID 1, which is total mirroring. So you basically end up with half the capacity. If you have more than two hard drives, you can do some of the other things like there's RAID 5, RAID 6, so on and so forth. Basically, I think the idea is that, you know, as long as you have N hard drives, you can lose one of them. Like the RAID 5 is set up so that you can lose one. And as long as you don't lose another one while, you know, the remaining hard drives are, you know, are recovering from the loss. As long as you don't lose another one in that time, you're OK, which is, you know, it's highly unlikely you're going to lose two in a row. So, yeah, definitely set up RAID.
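
Editor's sketch: the "you can lose one drive" property of something like RAID 5 is usually explained with XOR parity. The Python below is only a toy model of that idea, not how a real RAID controller lays out or rotates data, and the function names and sample blocks are made up for illustration.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(surviving_blocks, parity):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity])

# Three hypothetical "drives", each holding one 4-byte block of a stripe.
drive_a = b"fami"
drive_b = b"ly p"
drive_c = b"hoto"
parity = xor_blocks([drive_a, drive_b, drive_c])

# Pretend drive_b died: XOR-ing everything that is left reproduces its block.
recovered = rebuild_missing([drive_a, drive_c], parity)
```

The same trick stops working once a second block is missing, which matches the point above about not losing another drive while the array is still recovering.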

Starting point is 00:35:38 It really saved me. I went to Amazon. I bought a new two terabyte hard drive. When it arrived, I plugged it in where the old one was. I told my RAID controller, hey, you know, add this new device. It actually, it told me it was going to take 220 minutes to, you know, replicate. But even during that time, I could add files to the RAID array. Like it's smart enough to just work, like be completely available while it's

Starting point is 00:36:06 replicating and all of that. So it's pretty magical, actually, how it works, and it totally saved me. So that's my tool of the show. Nice. Also, it seems brave that... Do you have anything like that? Wait, the 220 minutes, that's only like three hours. Yeah, well, you know, I was kind of curious, so I created a dummy file just to see what happened. And yeah, it was pretty dangerous, but I live on the edge, I guess. Okay, yeah, I have something sort of similar, although I will admit my NAS, the network attached storage I have, is RAID inside, but I sort of use backup between my desktop and it as

Starting point is 00:36:48 the backup strategy, I guess. So I don't have like... My individual computers aren't RAIDed. Yeah, that makes sense. Yeah, that's true. My desktop isn't RAIDed. The only thing I have RAIDed is the family photos. Okay. And yeah, I also have that

Starting point is 00:37:03 backed up. And I assumed my NAS was hardware RAID, but it could be software RAID. You know, yeah, I don't think hardware RAID is a thing anymore. Oh, really? Because yeah, I was thinking about this. Like I remember in like early 2000s

Starting point is 00:37:18 when you could get a hardware RAID motherboard. But nowadays, you know, the RAID uses such a trivial amount of CPU, because of Moore's law and all that, I don't think you can even get a hardware RAID. I mean, or if you can, it's for some super industrial setting. I will defer to you. Anyways, all right. Well, that's my tool of the show. We typically do a pitch for Audible, if you'd like to support the show. We didn't recommend a book on Audible this week, but we have a history on the website of many books which are available on

Starting point is 00:37:50 Audible. If you've never tried Audible before: I think both Jason and I are subscribers, and we both enjoy it and pay happily for our own subscriptions. But if you haven't, and you would like to become an Audible member and get a free book in the process, you can go to audibletrial.com slash programming throwdown and check out one of the many, many books that they have. And like I said, look in our show notes for many recommendations we've made throughout time. Short books, long books, funny books, happy books, sad books. I think they got something of everything there. It really is

Starting point is 00:38:28 actually kind of overwhelming and they have sales too. Like while you're a member, you're able to get other books on a discount price. So it's a pretty good arrangement. I actually like it. Yeah, same here. I'm a big fan. If you don't want to support us on Audible, you can support us on Patreon

Starting point is 00:38:44 or, even if you do, you can support us on both. So Patreon is pretty cool. You can give up to a dollar. Actually, this is the last month before the Christmas show. Last Christmas, that was pretty ambitious. We really set out to give a gift to everybody. We were able to give a gift to everyone in the U.S. and Canada, but we had issues shipping internationally, like shipping overseas. And so I'm going to think really hard this month about what we can do. I learned a lot, we learned a lot, from last year. We have a ton of fans, which is amazing, but it also means that, logistically, there are things that we're not experts in that we need to get some help with.

Starting point is 00:39:30 We're definitely going to give out the T-shirts and all of that. We do that every year. And, you know, that's pretty easy. They drop ship for us. But I'm going to, you know, figure it out. I'm thinking, you know, anything we buy online they'll ship relatively cheaply. I don't know if we can give something

Starting point is 00:39:49 to everybody this year, but I'm going to figure that out. Stay tuned until next time. This is the last month to sign up to be in the raffle for the t-shirts and whatever mystery prize we figure out.

Starting point is 00:40:04 Go ahead. So go ahead. You can join. You can get a crack at the mystery prizes and you can leave. It's fine. You won't hurt our feelings. But also, while you're a Patreon subscriber, you get access to the Patreon RSS feed, which is the super fast download. Downloads way faster and you can listen to it.

Starting point is 00:40:21 Very reliable and all that. And you help support the show, which allows us to ultimately reach more people. Well, I think it's time for the topic of the show: customer bug handling. This is something that is actually really hard to get right. So think about this: you have all the code on your machine. You might even be cross-compiling, so you might be running Linux but building something for Windows or something like that. I mean, that could be really extreme, right? But you have all of that information. You might be able to build in debug mode, and you end up with this enormous binary, right? I mean, you have

Starting point is 00:41:01 all of your own system libraries, all of that. And so that makes debugging relatively easy. You can step through the debugger; you can do all sorts of stuff like that. The issue is what happens when you send your program to someone else, or you deploy your website, or you push a mobile app, and then all of a sudden you start getting reports that, hey, it locked up, it crashed, things like that. Like, what do you do then? And it turns out that question is not trivial to answer. It's actually kind of surprising that there hasn't been something that has just solved this for everybody. It seems like such a fundamental problem, but I guess the ecosystem is just so fragmented that it's hard to really do that. Yeah, I mean, I think not just the

Starting point is 00:41:51 fragmentation. I think, and we were talking about this a little bit when we were preparing for the show, the scale of things matters a lot too. So take a video game where, well, hopefully it's a successful video game, and it runs on many, many people's computers, where the individual user is, I don't want to say low sophistication, but they're not really supposed to be helping you debug your app, right? It's just supposed to work. And you are almost guaranteed to get crashes just because of the statistics, like

Starting point is 00:42:27 the number of hours your game is running in aggregate across everyone is so large that somebody's going to drop their tablet or lose power on their PC while they're playing your game. That versus, you know, this application I write is for a very important customer who has 10 people who use it, and it's part of a multi-million dollar business, and they're very sophisticated. They're very invested in helping me fix my app, and if something goes wrong, I'm probably coming out to their site and working with them to get it fixed. The spread there is just incredibly wide. Yep, yep. So yeah, I mean, it's true: if you have enterprise customers and you can actually go on site, then there's a whole lot you can do there, right?

Starting point is 00:43:20 But let's talk about the case in the beginning, where you are, let's say, a video game developer, and you're sending your game out to people. They're paying, let's say, ten dollars for it, and you're not really going to be able to fly over to their house. So one thing to do is turn on debugging symbols. You know, it's one of these things where you have to pass the -g flag in GCC and things like that. And you might be tempted to build things in release mode. But honestly, disk space is cheap. Network bandwidth is not that cheap, but most of the time these debug symbols compress quite nicely. And one thing, and it's going to be an overall theme for this show, is you don't want to put yourself in a position where you can't debug the problem. Like, you really don't. You

Starting point is 00:44:16 absolutely don't want to push out a release binary, get a bunch of crash reports that you can't really do anything with, and then decide, okay, now I'm going to take out some of these optimizations, right? I mean, you can build with debug symbols in optimized mode, and it's not going to be that much slower. I will caveat that a little by saying that, in the start here, we're going to talk about a bunch of things as sort of a survey of topics. But we talked about this when we talked about don't roll your own encryption, that some of these things can have implications that are kind of big.

Starting point is 00:44:53 So turning on debug symbols is very helpful when you're debugging, but you have to weigh the cost of people understanding more about your code. So if you're in something where you're distributing code that you don't want people to know what it does, or it has secret sauce in it, or a really great filter or whatever, which is not most people, admittedly. But there are certain cases where you're going to want to be very careful about, I give this to someone, but I want to limit their ability to understand exactly how it works. Yeah, you know, that's a really good point.

Starting point is 00:45:25 I hadn't really thought about that. But I guess what you could do in that case is you could have, you know, basically you could divide your code up into the secure code and the unsecure code. And, you know, the secure code could be built without debug symbols and then as a static library and then just linked in to the unsecure code. But yeah, I mean, generally, unless, yeah, if it's a secret sauce, you can, there's ways around that, but generally you want to.

Starting point is 00:45:53 Just be careful. Yeah, and I think most of the time, the errors are typically in the interface, in the high-level part of the code, which doesn't generally have a lot of IP. Yeah, the other part is, you know, you have logs. If you can, you want to leave those logs running. Ideally, if you're using glog or Easylogging++ in the case of C++, or Bunyan if you're using JavaScript, any of these log tools,

Starting point is 00:46:26 they all have log levels, even log4j, if you're using Java. And you typically want to allow people to turn on verbose logging. So you can have some kind of configuration file, and when they set verbose to one, they get all the verbose logs, and then they can email that to you. You could have a button where, when there's a crash, it automatically emails the logs, things like that.

Starting point is 00:46:57 But basically, you want to have access to that information that you have when you're debugging. That's another way to think about it: if you don't have something when you're debugging, can you still triage the problem? If the answer is no, then you have to kind of plan for that, right? You have to assume that whatever crashes you're seeing, you're going to see similar crashes coming from other people. So for example, when we were rolling out the first versions of the Eternal Terminal, we were getting tons of really bizarre crashes because it was running on the Windows Subsystem for Linux, it was running on BSD. So we had deployed it at the place where I work, and pretty quickly there were hundreds of people using it.

Starting point is 00:47:48 And we just encountered tons of OS-specific bugs that were, like, extremely difficult for us to triage because we didn't have any logs. We didn't have really any of the stack trace, any of the debug symbols turned on. Since then, we've kind of added all this capability. It's made a big difference.
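
Editor's sketch: the "set verbose to one in a config file" idea could look roughly like this with Python's standard logging module. Python is swapped in for illustration (the episode talks about C++ and JavaScript loggers), and the config key `verbose` and the logger name `myapp` are assumptions, not anything from the show.

```python
import json
import logging

def configure_logging(config_text):
    """Read a JSON config string and pick a log level from its 'verbose' key."""
    config = json.loads(config_text)
    level = logging.DEBUG if config.get("verbose", 0) else logging.WARNING
    logger = logging.getLogger("myapp")  # "myapp" is a placeholder name
    if not logger.handlers:
        # Send records to stderr; a real app would more likely log to a file.
        logger.addHandler(logging.StreamHandler())
    logger.setLevel(level)
    return logger

# A user who hits a problem flips "verbose" to 1 and reruns the program.
logger = configure_logging('{"verbose": 1}')
logger.debug("connection state: %s", "handshake")  # emitted only when verbose
logger.warning("retrying connection")              # emitted either way
```

The point of keeping the switch in a config file is that the person with the problem can turn on the extra detail without needing a new build.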

Starting point is 00:48:11 I think with both... With both rolling logs and log levels, there are considerations. On rolling logs, one of the things is you want to limit how much space you take up, but you need enough logs so that if something starts dumping tons of things going wrong, you don't miss the thing that started it all and instead only see the symptoms, which

Starting point is 00:48:31 is a problem I run into. Other things you can be clever about there: trying to be careful about how many of a given kind of error you're seeing. So often you start spitting out the same kind of error over and over again, and oftentimes it's the first one that's the most interesting. And so making sure you always capture the first of a given kind of error, sort of slightly smarter ways. And with log levels, being careful that logging has some cost; spending time driving down that cost is important. And then making sure that you're not logging at such a high rate, or in a part of code that is running at a high frequency, that you end up costing a lot of computation time to do the actual logging. Yep. Yeah, it totally makes sense. Yeah, this is something that, you know, definitely before you ship a binary, you want to run it in

Starting point is 00:49:17 verbose mode and just make sure that it isn't just completely blowing up the logs because that can happen very easily. If you put a log inside some really tight for loop, just one log line can cause, you know, just megabytes and megabytes of log to just start building up almost as fast as you can see it. Another thing is, you know, definitely, you know, handle crashes and things like that pretty gracefully.
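
Editor's sketch: the rolling-log and "always capture the first of a given kind of error" points above could be combined roughly like this in Python. The size limits, the logger name, and the policy of dropping repeats outright are all assumptions chosen for illustration; a real setup might count repeats instead of discarding them.

```python
import logging
from logging.handlers import RotatingFileHandler

class FirstOfKindFilter(logging.Filter):
    """Let the first record of each message template through, drop repeats."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def filter(self, record):
        # record.msg is the template, before arguments are substituted in,
        # so "disk full on %s" counts as one kind of error.
        key = (record.levelno, record.msg)
        if key in self.seen:
            return False
        self.seen.add(key)
        return True

def make_logger(path):
    """Log to `path`, rolling over at about 1 MB and keeping 3 old files."""
    handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=3)
    handler.addFilter(FirstOfKindFilter())
    logger = logging.getLogger("myapp.rolling")  # placeholder name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With something like `make_logger("app.log")`, a log call sitting inside a tight loop writes one line instead of megabytes, and the disk usage stays bounded by the rollover settings.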

Starting point is 00:49:43 There's a bunch of libraries to help with this, but basically you want to catch the crash. And typically what will happen is you'll try to get the stack trace. So in the case of Linux, there's actually a library function called backtrace that will give you the return addresses of the whole stack up until

Starting point is 00:50:06 where you are right now. And so you'll want to run backtrace and then dump that to a file on disk, or to a log, or something like that. In addition to crashes, I mean, crashes are the most common thing you'll want to handle, but there are also signals. So, for example, if you divide by zero, well, it depends on the programming language, but in C or C++ an integer divide by zero is not going to give you an exception. In C there isn't even a concept of that. What you typically get on Linux is what's called a signal, SIGFPE in this case. And you can look up the different signals: there's SIGABRT, there's SIGSEGV. And what you want to do is look up all

Starting point is 00:50:54 these different signals and make sure you're handling them gracefully. Some of them, like SIGINT, you don't really want to handle, because that's only for when you're debugging or if someone hits Ctrl-C on your program; maybe just leave it up to them. But definitely the big ones, like SIGABRT, SIGSEGV, and maybe SIGTERM, when something external tells your process to terminate, you want to handle those and catch those. And again, write a log out really quick or something like that. So then on the other end,

Starting point is 00:51:29 now you've dumped all these logs. You have some way of getting these logs to some server that you own. Maybe use GraphQL for that or something else. And now you have to process all of those logs, right? And you don't want to be especially if you have you know let's say 20 or 30 bug reports which you feel like are the same bug you want to be very efficient at sort of going through those logs and picking out the most important pieces maybe you

Starting point is 00:51:57 even want to write something that will pull the stack trace and dump it to a database, so you just have a database full of stack traces. And so this, I think we've talked about this many times, but become an expert in grep. Grep will absolutely save you. Grep is basically a way for you to just search for specific words in files. If your stack trace

Starting point is 00:52:22 log line starts with "stack trace:", you could grep for "stack trace:", and you'll just see all of the stack traces in all of your logs. It's getting more common now to actually write logs as JSON objects. So Bunyan does JSON. A lot of people now are starting to use JSON. And so you could use jq, which is a command-line JSON query tool. And it's similar to grep in the sense that you can say, you know, just fetch this one object inside of this JSON object if it exists.
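
Editor's sketch: for anyone who would rather script this than use grep and jq directly, the same two ideas (keep the lines with a marker, pull one field out of JSON log lines) look roughly like this in Python. The marker text, the field names, and the sample log lines are invented for illustration.

```python
import json

def stack_trace_lines(lines, marker="stack trace:"):
    """grep-style: keep only the lines that contain the marker."""
    return [line for line in lines if marker in line]

def pluck(json_lines, key):
    """jq-style: parse one JSON object per line and fetch `key` if present."""
    values = []
    for line in json_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if isinstance(record, dict) and key in record:
            values.append(record[key])
    return values

# Made-up sample logs, one plain-text and one JSON-per-line.
text_log = [
    "starting up",
    "stack trace: 0x4005d6 0x400612",
    "shutting down",
]
json_log = [
    '{"level": 30, "msg": "starting up"}',
    '{"level": 50, "msg": "crash", "stack": ["0x4005d6", "0x400612"]}',
    "not json at all",
]

traces = stack_trace_lines(text_log)
stacks = pluck(json_log, "stack")
```

The plain-text half is what `grep "stack trace:" app.log` does on the command line; the JSON half is the kind of "fetch this one field if it exists" query that jq is built for.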

Starting point is 00:52:58 Or you could say, you know, give me all of the keys that start with this. And so definitely become proficient with jq. The other part that's really important is, again, the people who are running your code, they don't have your code most of the time, right? So they're running your game or your app or your program, but they don't have the original source code, right? But when there's a crash, you want to know the line of the source code so that you can debug, right? So what you have instead are these pointers,

Starting point is 00:53:32 these function pointers, these function addresses, and the addresses map to lines of code, right? So every address maps to a spot in some function in some source file, right? And so you can use a tool for that. On Linux, it's called addr2line. On OS X, it's called atos. I don't actually know what it is on Windows. I'm sure there's something on Windows. No, there's not.

Starting point is 00:54:01 Okay, yeah. But these tools, what they will do is, you know, when you run backtrace on Linux, as I said, you're going to get a list of these pointers, and you can dump them to a log, let's say in hexadecimal, right? Then on your side, when you get this stack trace, you're going to need to convert that into the lines of source code. And that's what these tools do. So these tools take in the binary, and I think you also run them from the directory that has your source code, or maybe it's from the directory that built the binary. You'd have to look it up in the manual, but basically you'll run it from

Starting point is 00:54:34 a place where they know where your source code is and where the binary is. That's another thing: when you ship the binary, you're going to want to get your source control tagged. So if I ship, let's say, version one, I want to tag my source control repository and say, this is version one, this is exactly what every source file looks like, and here's the version one binary. So then if you get a crash back, you can load that version one binary, you can get the pointers from that stack trace, and using the correct version of the source code and binary, you can actually recover the lines in the source where that person was when they crashed, right?
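
Editor's sketch: on the developer's side, turning a logged stack trace back into source lines mostly means pulling the hex addresses out of the log and handing them to addr2line together with the matching binary. The Python below only builds that command; the log line, the addresses, and the binary name `./mygame-v1` are made up. The `-e`, `-f`, and `-C` options are standard addr2line flags. Depending on how the binary was built (for example, position-independent executables), the logged addresses may need adjusting before addr2line can resolve them.

```python
import re
import shlex

HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")

def extract_addresses(log_text):
    """Pull every hexadecimal address out of a logged stack trace."""
    return HEX_ADDRESS.findall(log_text)

def addr2line_command(binary_path, addresses):
    """Build the addr2line invocation for the binary that was shipped."""
    # -e selects the binary, -f prints function names, -C demangles C++ names
    return ["addr2line", "-e", binary_path, "-f", "-C", *addresses]

crash_log = "stack trace: 0x4005d6 0x400612 0x7f3a2c021b97"  # made-up sample
command = addr2line_command("./mygame-v1", extract_addresses(crash_log))
print(shlex.join(command))
```

The binary passed with `-e` has to be the exact build the user was running, which is why tagging the release in source control and keeping the matching binary around matters so much.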

Starting point is 00:55:26 So without this, you're going to be totally lost, right? I mean, someone's going to say, hey, my program crashed. And you'll have to kind of say, oh, what were you doing when it crashed? I mean, it's impossible, right? You don't want to do that. So with something like this, let's say, oh, my program crashed. You say, okay, give me the stack trace. And you're going to get this list of hexadecimal numbers.

Starting point is 00:55:49 And you can pipe that list into addr2line. And it will actually tell you the specific lines that the person was on. And you can follow that all the way up to main. Like the furthest one will always be the int main that starts the program. So, you know, if you're running, if this is like a Python program, then you don't really have to worry about that.

Starting point is 00:56:12 But that also means you're basically kind of distributing the source code in that case, right? I think you could use something like PyFreeze, like one of these compilers, but I think even in that case it distributes the source code. So I don't think there's any way around it in Python. Now, obviously, if you're giving people the source code and the binary, they could even run addr2line on their computer and then just send you the location of the crash. But for most of these languages, like even Java, you'll have to unwind the stack.

Starting point is 00:56:52 And I don't think you have to use any special program in Java, but you'll have to handle that. So I think we covered, that's pretty much desktop in a nutshell. The hardest part there, as I said, is the crash handlers. There's a lot of, if you just look up GitHub C++ crash handler,

Starting point is 00:57:09 you'll find a bunch of really great libraries to help you out. And I think it's definitely something where if you're going to ship, you also want to practice it, if that makes sense. So introducing crashes and making sure that your logging is actually working as intended, because nothing's worse than a very rare crash happening and the logging isn't configured properly and you don't have what you need. Yeah, actually, I can't remember.
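
Editor's sketch: the "catch the crash, write the stack trace, and practice it" advice, shown in Python rather than C++ purely for brevity. The `write_log` callback, the "stack trace:" prefix, and the function names are assumptions for illustration; a C++ program would use one of the crash handler libraries mentioned above instead.

```python
import signal
import sys
import traceback

def format_crash(exc_type, exc_value, exc_tb):
    """Turn an exception into one greppable block of text."""
    frames = traceback.format_exception(exc_type, exc_value, exc_tb)
    return "stack trace:\n" + "".join(frames)

def install_crash_handlers(write_log):
    """Route uncaught exceptions and SIGTERM through write_log(text)."""

    def on_exception(exc_type, exc_value, exc_tb):
        write_log(format_crash(exc_type, exc_value, exc_tb))
        sys.__excepthook__(exc_type, exc_value, exc_tb)  # keep default output

    def on_sigterm(signum, frame):
        stack = "".join(traceback.format_stack(frame))
        write_log("received signal %d\n%s" % (signum, stack))
        sys.exit(1)

    sys.excepthook = on_exception
    signal.signal(signal.SIGTERM, on_sigterm)  # must run in the main thread

def crash_on_purpose():
    """The hidden 'crash button': fail deliberately to exercise the handlers."""
    raise RuntimeError("deliberate test crash")
```

Calling `install_crash_handlers(...)` at startup and then triggering `crash_on_purpose()` in a test build is one way to confirm the report actually arrives before a real user needs it. For crashes inside native code, Python's standard `faulthandler.enable()` can additionally dump a traceback on signals such as SIGSEGV and SIGABRT.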

Starting point is 00:57:31 Oh, I think it's Android. I don't know if it's in the modern Android, but there was definitely a version of Android, or maybe I'm mixing things up here, but there was something I saw where basically you could cause a crash. There was just like an expert menu: you could go into a developer menu, developer settings, and one of the developer settings was "crash". And I, just out of morbid curiosity, tapped it, and yeah, it crashed. And you definitely want something like that in the first version, where, you know, hopefully you

Starting point is 00:58:06 bury it way down in the menu, but someone who's helping you out can go hit the crash button, and you should be able to get a crash, and you should be able to trace exactly the line that crashed and what functions called into that function, and so on and so forth. So now for the web. If you're building a website, there's two components, your client and your server, right? The server is pretty straightforward because on the server, you own the code; you have the binary all in one place.

Starting point is 00:58:39 Even if you're not developing on the server, you can always push the code to the server. And so that's not really that, let's say, interesting. You could even be running a totally interpreted language like Python or Ruby or something. It's much more interesting when you get crashes on the browser, right? Now, this isn't literally crashing the browser, because that's always Chrome's fault. If you can crash the browser, that's pretty much on them. But if your code causes an exception, that's typically the thing you ought to watch out for.

Starting point is 00:59:18 So for example, you're getting the mean of a list of values, but the list is empty, you divide by zero, you get NaN, and then something downstream blows up on it, or something like that. So in this case, your browser JavaScript will throw an exception and you have to handle that. And so typically on the website, the way this works is, because the source code of the JavaScript

Starting point is 00:59:43 is sent to the browser, you don't have to worry about addr2line or anything like that. One thing you do have to worry about is the demangling. So if, for example, you run one of these scripts that compresses your JavaScript code, it might take out all the newlines. And then now, big surprise, every crash happens on line one, right? So typically, with JavaScript, when you run one of these compression tools, it'll

Starting point is 01:00:12 create what's called a source map, which basically just says, you know, column 40,000 to 40,030 actually maps to line 3004, and so on and so forth. And the source map is just this huge file that maps chunks of that enormous one-line JavaScript file to the original file name and line. So it's actually similar to addr2line, conceptually. And so there are tools which will take the JavaScript stack trace of the one-line file and convert it into the appropriate stack trace that's readable.
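
Editor's sketch: conceptually, a source map lookup is "find the mapped segment that contains this column". The toy table below is invented to mirror the column 40,000 example above; real source map files encode their mappings in a compact base64 VLQ format, so in practice you would use an existing source map tool rather than anything like this.

```python
from bisect import bisect_right

# (start column in the minified file, original file, original line),
# sorted by start column. Entirely made-up data.
TOY_SOURCE_MAP = [
    (0, "util.js", 1),
    (120, "util.js", 2),
    (40000, "app.js", 3004),
    (40031, "app.js", 3005),
]

def original_position(column, source_map=TOY_SOURCE_MAP):
    """Map a column in the one-line minified file to (file, line)."""
    starts = [entry[0] for entry in source_map]
    index = bisect_right(starts, column) - 1  # last segment starting <= column
    if index < 0:
        raise ValueError("column is before the first mapped segment")
    _, filename, line = source_map[index]
    return filename, line
```

So a stack frame reported at line 1, column 40,010 of the minified file would be rewritten to point at `app.js` line 3004, which is the readable version of the trace.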

Starting point is 01:00:56 So you have to deal with that. But basically, it's a similar idea where when there's a crash, because it's a website, you don't have to worry about how to get the crash to the server because you already have like a client server framework in place. You just use whatever you're using to send data between those two machines. So use AJAX or even WebSockets, whatever you want. But then on the server, you know, you'll have to do some work. On mobile, it's very different. So I've linked to a couple of libraries. For Android, it's ACRA.

Starting point is 01:01:35 And for iOS, it's KSCrash. And yeah, basically, these libraries are more similar to desktop than to web, right? So these will send crash logs to some endpoint that you have to set up. Sometimes they can even send it through some iOS or Android infrastructure, and then they'll arrive in the developer platform or front end, and you can collect these crash logs there. But, you know, I think in the case of Android and iOS, it's much more handled by the actual app stores. So I know I made an Android app a while back. It's been probably six years.

Starting point is 01:02:19 But at that time, they had a set of, you know, crash reporting tools. So I would just get a list of crashes and they would provide the stack traces. So they would do a lot of that for you. So I think definitely take a look at what the Play Store can do, what the iOS Store can do. But here's a couple of libraries that streamline that and make it even easier.

Starting point is 01:02:43 I've not developed much on mobile, but I feel in some ways mobile might be the easiest because even though there's lots and lots of mobile devices, it feels like there's far less than the number of configurations of computer hardware or web browser configurations. And so I feel like the number of your ability to potentially be able to replicate the issue might be much higher. Yeah, that's a really good point.

Starting point is 01:03:08 There is, let me see if I can find it. There's a really good tool which simulates different browsers. I won't look it up now in the interest of time. But if you kind of Google around for, I think it's called like a browser simulator. But basically, literally on the left-hand side, it has a panel and you can choose like Chrome, Firefox. I'm sure it can't simulate everything because it's just trying to run your JavaScript kind of through this middleware layer.

Starting point is 01:03:38 It's not literally executing different browser code. But it'll try its best to, at least for CSS and things like that, catch a lot of those issues. But yeah, you're right. I mean, browser is in a sense the worst, although things are kind of homogenizing now where I think Chrome, you know, Brave, Firefox, you know, most things now are kind of pretty cross platform. Cool. Cool. Yeah, definitely.

Starting point is 01:04:07 Send us your worst debugging nightmare where someone has a crash and there's absolutely no way to replicate it. We always love hearing stories like that. I've definitely caused more than a few of those myself. This is what inspired this episode. And hopefully with these tips, you can kind of avoid doing it. You could set people up where if they have a crash, they can send you some information. You'd be able to triage.
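To make "they can send you some information" concrete, here is a minimal Python sketch of a crash hook for a desktop-style program. It uses the standard library's `sys.excepthook` to catch uncaught exceptions and hand a JSON report to a `send` callable. The report fields, the `app_version` default, and the names `build_crash_report` and `install_crash_hook` are assumptions for illustration; in a real app `send` might POST to an endpoint you run, and you would want user consent before sending anything.

```python
import json
import platform
import sys
import traceback


def build_crash_report(exc_type, exc_value, exc_tb, app_version="0.0.0"):
    """Collect what a developer needs to triage a crash they cannot reproduce."""
    return {
        "app_version": app_version,  # hypothetical field, supplied by the app
        "os": platform.platform(),
        "python": platform.python_version(),
        "error": f"{exc_type.__name__}: {exc_value}",
        "stack": "".join(traceback.format_exception(exc_type, exc_value, exc_tb)),
    }


def install_crash_hook(send):
    """Route uncaught exceptions through `send`, then run the default hook."""
    def hook(exc_type, exc_value, exc_tb):
        try:
            send(json.dumps(build_crash_report(exc_type, exc_value, exc_tb)))
        except Exception:
            pass  # reporting must never cause a second crash
        sys.__excepthook__(exc_type, exc_value, exc_tb)

    sys.excepthook = hook
```

Passing `send` in as a parameter keeps the hook independent of any particular transport, so the same code can write to a local file during development and to a server in production.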

Starting point is 01:04:34 One thing, definitely expect everything to crash. Whatever you write, you have to have some way to handle crashes, as Patrick said, because of different client software. Sometimes there's people who are running, you know, Windows XP, and you'd be amazed, like even the simplest programs don't work. People just do things you don't expect. There's situations you don't expect. Some people's internet connections are not as reliable. And, you know, it's very, very hard to plan for all of these things. So you definitely expect failure and have the right way of, like, ideally automatically just sending some information to you. Cool. That's pretty much all I have. All right. Until next time. All right. Next time is the Christmas show.

Starting point is 01:05:26 And actually, for January, we have a special guest. This might be our... Hopefully. Hopefully. Hopefully. Nothing against... Yeah, yeah, that's true. Actually, I won't spoil it. I won't spoil it. But we have an amazing guest. Yeah, hopefully. And I'm really excited.

Starting point is 01:05:43 Definitely, it's going to be absolutely phenomenal. We're going to start the year off in an extremely special way. But next month, we're going to have our Christmas giveaway or holiday giveaway. And, you know, hopefully give out some really cool T-shirts and some other prizes. The intro music is Axo by Binärpilot. Programming Throwdown is distributed under a Creative Commons Attribution-ShareAlike 2.0 license. You're free to share, copy, distribute, transmit the work,

Starting point is 01:06:18 to remix, adapt the work, but you must provide attribution to Patrick and I and share alike in kind.
