Coding Blocks - Algorithms You Should Know

Episode Date: June 25, 2018

It's time we discuss algorithms we all need to know as we continue diving into Rob Conery's The Imposter's Handbook while Michael will read anything, Allen questions Greenland's name, and Joe talks w...ormholes.

Transcript
Starting point is 00:00:00 You're listening to Coding Blocks, episode 84. Subscribe to us and leave us a review on iTunes, Stitcher, and more using your favorite podcast app. And check us out at CodingBlocks.net where you can find show notes, examples, discussion, and a lot more. Send your feedback, questions, and rants to comments at CodingBlocks.net, follow us on Twitter at CodingBlocks, or head to www.CodingBlocks.net and find all our social links there at the top of the page. And with that, I'm Allen Underwood. I'm Joe Zack.
Starting point is 00:00:29 And I'm Michael Outlaw. So this episode, we're going to be talking about some common algorithms that you should probably be familiar with, because they're going to come up in an interview, and because many real-world algorithms that you'll need to come up with might be similar to these, or at least these will give you a foundation as to how those might be built. These are the types of things that are either a review from CS or what you might've missed. So I love writing the sentences that you end up reading, because I feel like I'm
Starting point is 00:01:00 throwing, like, a jungle gym at you. I'm giving you, like, the obstacle course. I'm so amazed that you made it through that at all. There's random words just pasted on the screen, right? Yeah, man. I'm sorry. That sentence was not meant to be read. Oh, really? Not by humans.
Starting point is 00:01:16 No. No. I could tell when he was reading it. That doesn't make sense. Got to make sense. I was twitching a little bit, but now I know why. I thought I had just forgotten some medication or something. No, no, man.
Starting point is 00:01:29 I just kind of flopped that out there. All right. Well, thanks for doing that. A bit of podcast news. First, we're going to talk about those reviews. You know we love them. Big thanks to everybody who left us a review, past, future, present. And first up, iTunes. Thank you,
Starting point is 00:01:45 Akeith47. Thank you with my harebrain. T. Tordo Dove. Marrow IT. Your worst enemy. Fiery Ginger. Mcar86. Josh Rainvan. Michael Knot and Outlaw. And then we got Michael Joe Allen's Serial Killer.
Starting point is 00:02:02 Now, wait a minute. Are you sure it's not Mchar? Dude, I was right there. I was like, no. But it's actually car. It's got to be because you say VAR car. You don't say VAR-Char. I think we've discussed this before. What?
Starting point is 00:02:14 All right. Oh, I've got Stitcher. How'd this happen? All right. So, I've got Liam M., Daniel 65, CBfan76290 two nine Oh four, five, four one nine and raggling. Yep. So thank you always to everyone who has taken the time to leave us a review and everyone that is leaving us a review.
Starting point is 00:02:34 We always greatly appreciate that. And so real quick, I do want to mention kind of a callback going back to episode one, maybe. I know especially Joe, we've talked about it many times, had some feature requests for interfaces that he just wanted to have. Number one that always stood out in my mind was he wanted to have a default constructor for the interface, if I remember right. So it's not quite going there. But I thought, Joe, you might be happy that there is a proposal for C# 8 for default interface methods. And we've talked about this, I forget how many episodes back, where Java actually already has something like this. The C# team, they're actually kind of modeling, they're looking at the Java implementation
Starting point is 00:03:27 for kind of like best practices, or what worked, what didn't work. We've said this before, like how we're always building on the shoulders of the giants before us, right? And so they're kind of looking at the Java community to see how that worked and whatever. And so there's an article there that I got for you. I thought you might enjoy it, where it shows some of this in action. And this will be, you know,
Starting point is 00:03:53 backwards compatible. You'll be able to apply these default interface methods to your interfaces that already exist in the wild. That's crazy. Yeah. You know what? The old me used to love that, but
Starting point is 00:04:07 I'm JS now. Oh. So you're not TypeScript? I'm Syndax. Everything. I'm denaturing stuff. So wait, you're not into the TypeScript thing then? You're going all ECMA 2015 or whatever?
Starting point is 00:04:22 2015? Yes, six, yo. Can you tell by my attitude, my backwards hat? Oh, that's right. There we go. Oh, that failed. That's excellent. Hmm. That was awkward.
Starting point is 00:04:42 All right. So I also want to mention really quickly that we've been releasing a ton of videos. In particular, I want to call out the Community Talk number two where we got five of our favorite people on a call here. And we've got a great presentation on cross-platform development with Rad Studio from Zach and Ted. And we've got a video up there on YouTube. And I want to specifically focus in on the fact that we have a survey in there, a SurveyMonkey link that you can fill out. That's in that description and actually on the blog too. And you can win one of four $100 gift card slash shirt packages. So it's four prizes: a $100 gift card and
Starting point is 00:05:20 a shirt if you go fill out that form. So go check it out. You'll get a free trial too. That's good for nine months. So go do that before July 6th. So do it now while you're thinking about it and make some money and check out cross-platform development with Rad Studio. And I do want to say, we were all surprised by how cool some of the features were. So don't dismiss it out of hand when you just hear Rad Studio.
Starting point is 00:05:44 Like there was some really cool stuff shown in that video. Exactly. Yeah, there was definitely some wow moments, some jaws-hit-the-floor kind of moments, which reminds me too. We always talk about subscribing to the podcast, but if you haven't subscribed to our YouTube channel, you should definitely take this as an opportunity to do so, or use it as an excuse to do so. You can head to www.codingblocks.net for the easy way to get to our YouTube channel. And you can subscribe to us there, and that way you won't miss any of these amazing videos. Hey, you want to guess how many videos we have? Just including episodes? I'll say 68. Oh man, that's real? We have every episode up
Starting point is 00:06:38 there. Oh, every episode? I didn't realize we had... All right. We'll go with 95. So, we have 115. Wow. Which means we have 32 videos that are not episodes. And one of those videos has like, what is it, like 60,000 views? Really? Is that the laptop one? Yep. Very nice.
Starting point is 00:07:00 Sure enough. Awesome. Yeah. And you want to guess how many views we have total if you were to go through and add up all of those? 14. 14. Oh, wait. I might be missing that other 60,000.
Starting point is 00:07:16 Let's go with 250,000 views. Oh, man. We have 95,000 views. Unfortunately, they all dropped off after about minute two when we started talking about meta-type stuff like how our YouTube channel is doing. Oh, yeah. All right. So, moving on. Move along.
Starting point is 00:07:32 Did you mention your QIT diary video as well? Well, it's not really my diary anymore because we had five people on the last one all talking about different changes and different ideas we have with the app. So if you haven't checked out QIT.cloud, I highly recommend it. It lets you find podcasts by topic. So you can search, like, say, GraphQL and find all the podcast episodes on GraphQL. And it's totally focused on programming topics. So if you like programming podcasts, and I think you do, then you should check it out. And you should check out the videos if you're interested in building software, because I also think maybe that's an interest that you have.
Starting point is 00:08:05 And you're going to see some of your favorite people from Slack hop it up on that video, talking about the different things that they're trying and working on in terms of the app, because we're up to nine contributors now. Yeah, it's really cool. And the video does go over sort of like what you do in a business. These are the features we're thinking about adding. Maybe we put it here, maybe we put it there. So it is kind of interesting to see the collaborative approach to it, not just on GitHub, but also in real life, right? People discussing the things and the reasons why they're choosing to do things certain ways. So it's pretty interesting.
Starting point is 00:08:36 One of my favorite topics is what we said we'd work on last week and what we actually worked on. It's kind of funny to see that disconnect. But hey, this is a fun project, so you can work on whatever the heck you want. It's awesome. Super cool. And so I wanted to bring up something that's come up from several different people now. And I think that it's worth revisiting. Our last episode was on search, right?
Starting point is 00:08:57 Like we kind of deep dived into the reasons why and that kind of stuff. And I think along the way, you know, I think I even said that maybe it sounds like we were cheerleaders for it. And I want to back up, because one of the things that has been asked a couple of times now is, well, why wouldn't you just use a data warehouse instead of a search engine? Or, you know, we talked about aggregating and getting aggregates back when you do your searches. Why wouldn't you just use a cube, you know, or some sort of OLAP type processing? And while this is not going to be a comprehensive answer, I'm going to give you some of the reasons, and I'd love for Joe and Mike to jump in as well.
Starting point is 00:09:36 But here's the thing. Search engines aren't perfect for everything. And I'll give you a perfect example right now as to why. When you are sticking something into like a Lucene-based type search index, you have to denormalize all that stuff into one document, right? Anything that you want to be able to search on on that thing. Now, that's not 1000% true because there are ways around it. But generally speaking, if you want your search to be fast and you want it
Starting point is 00:10:06 to be scalable, if that's the thing, like if you're using Elastic, you have to denormalize all the data that you want to be searchable and sortable and able to be aggregated into one document. So there, if you're comparing that to something like a data warehouse, obviously in a data warehouse you can join tables, right? You have these things like star schemas. So if all of a sudden you decide that you need something from table Z over there, you can join to it, right? Through however many joins you need to do to get to it. So that's one reason why a search engine isn't perfect for everything. It's not a relational database. It's not even a relational lookup, right? It is literally, put everything you want into this one thing. Now let's get into why I don't think that data warehousing
Starting point is 00:10:51 necessarily solves that problem. A perfect example for me is one-to-many joins. All right. When you have something in a data warehouse, a perfect example is when you go to Amazon, and we talked about cameras last time, right? You go into the camera category and now you're going to have DSLRs, mirrorless, point and shoot, whatever, right? And then there's going to be different things down the side that have sensor sizes, and there's going to be counts associated with all those. Think about what you've got to do in a SQL query to get those counts back. You're going to be grouping by all these different types, and that is expensive, right? Even if you put it in one table and you don't have this join,
Starting point is 00:11:33 which you wouldn't really do in a data warehouse, because typically you'll have your fact tables that generally have as much of the data as possible, but those things are going to get really big. And even doing the group bys in those is going to be slow trying to get that kind of stuff. And then when you take it a step further and you want to do something like paging and sorting on these columns, anybody who's messed with SQL enough knows that when you do those, and I'll talk specifically to SQL Server right now, you have to index on the things that you want to search by. So now you're going to have a ton of indexes on these things. And indexes are typically done either ascending or descending, and they're going to perform properly based on what you do. So now these searches and these aggregations and all those things kind of fall flat, because even a data warehouse is not necessarily meant for search. And it's not meant for paging, and it's not meant for sorting. It's meant to be this place where you can go get all the data in a certain normalized form. So going back to
Starting point is 00:12:31 something like fuzzy searching on a search engine, you can't necessarily do that with a data warehouse, right? And, and you're sorting on columns and getting your counts and your paging and all that kind of stuff is not as easy. It just doesn't work that way. And then the OLAP thing, bringing that one in, that's a different tool for a different use case. So typically when you're looking at OLAP, you're wanting to be able to slice and dice data, you know, as many possible different ways as you can. And this also goes back to the difference between what that technology provides for you versus something like a search engine or even SQL. That is the ability to look at data. And when we say slicing, if you have a cube and you want to draw a line down through the middle of that thing, that's one way to slice it.
Starting point is 00:13:26 If you want to do it from a different angle, you can do it that way. But that's the relationships of the data. If you think about a cube, typically what it is, is if you equate that to like an Excel workbook, you have an Excel workbook. In that Excel workbook, you have many sheets in that workbook. And then there's tons of cells on those. That cube allows you to drill down into those points in that thing and relate these things all kinds of different ways. Search isn't meant for that. So when you do something like search, when we're talking about aggregations, we're talking about simple counts, right?
Starting point is 00:14:00 Like that left-hand nav on the camera search on Amazon. All those counts of those particular keywords that showed up. So the CMOS sensor size, that kind of thing, right? That's what goes there. When you're talking about cube, you're going way deeper. You're talking about forecasting. You're talking about historical data. You're talking about all kinds of number crunching that is way beyond what search is meant for. So I wanted to bring that up because again, these are all different technologies that are built for different reasons,
Starting point is 00:14:31 right? Search is literally for search. And with that, you get your fuzzy matching, you get your quick aggregates on counts of data. You get your ability to sort and page and do all that kind of stuff.
Starting point is 00:14:44 And I don't know that we talked about that much. You also get scoring, right? So when you have different types of data in your system and you do a search, if you think about this, when you search for something on Google, it's going to come back with what it thinks its top 10 results are based on what you searched on. And it's because it applies a scoring algorithm to the data that comes back, based off what you search for and the types of documents there. Same type of thing happens in a search engine on your side. So if I typed in the word wood on a search engine and I had woodworking as a category, and then I had books as another category, it might come back with a,
Starting point is 00:15:26 hey, a 90% score on the woodworking category and a 60% on the book category, because it's going to find things in both, right? So you get those kinds of features. You're not going to get that with an OLAP cube. You're not going to get that in a data warehouse because it's not what it's meant for. Just like in a cube, you're going to get all these slicing and these dimensions that you can just break down in tons of different ways. They are tools that solve different problems that can all work in conjunction with each other to get you what you want as a full suite of products. But none of them replace the other. It sounds like, to say it another way, is there's a lot of overlap between these different products. But ultimately, you know, cubes are kind of designed for reporting and search engines are designed for searching.
Starting point is 00:16:12 So it's just a matter of, you know, using a specific tool for a specific job. Yeah. I mean, in a nutshell, that's it. And that's the thing, right? People have done crazy things with relational databases because you can. But does that mean you should? Probably not. You can accomplish a lot of things by being creative, but these tools, what you said, they're very good at doing specific things that can really help you out in those areas.
Starting point is 00:16:43 You don't have to use them. They have different use cases. But, you know, do you have any thoughts on any of that? No, I kind of got lost as you were talking because I was thinking about just the data warehouse concept and not having a ton of experience with them. I was like, well, I never really envisioned that you would directly use them in an application like an e-commerce type application like with the Amazon example. I was like, that feels like the wrong use for the data warehouse, but maybe I'm wrong. And there is someone that uses that for that, but then it kind of goes along with what you were saying, too, about like, well, you know, yeah, you can make anything.
Starting point is 00:17:27 You know, there's a lot of overlap to Joe's point. So, you know, you could get away with something whether or not it's the right thing for it or not. I mean, usually data warehousing is where all your data ends up landing after your transactional systems have gotten the data in. And it's typically for reporting purposes, or for people to be able to do analysis or forecasting based off that data, right? So it's like, you know, nowadays they talk about data lakes and stuff like that. But it's usually a place that people pull from to make business decisions off of, not to use in a real live app like what you're talking about. Yeah. I mean, I kind of view the data warehouse as like a hub and spoke model, right? Like you have spokes of smaller databases
Starting point is 00:18:05 that feed into the center, which is the data warehouse. But then there's also other spokes that are pulling from that center that are creating their own little data marts for their own purpose. And that's kind of the way I've always envisioned the data warehouse. I never envisioned that you would try to create an e-commerce site that directly accesses the data warehouse. I would agree. E-commerce or other. And that's the thing.
Starting point is 00:18:29 I mean. Well, I was just following along with our Amazon example. Yeah. No, that's totally accurate. And the thing is, again, you could make it to where it does something like that. But I don't necessarily think you'd be solving the problem the best way. Right? If you have a search problem that you're trying to solve, then a search engine matches that thing. But no, you are going to ETL that data into that search engine in a very denormalized format so that the search engine can do what it does well, right?
Starting point is 00:18:56 One point, though, that I don't know if I heard you make, but about fuzzy searches or stemming, word stemming. Like, those are features that the search engine is going to provide for you that, with any kind of relational database system, you're not going to get. Well, at least not easily. You can. You can. Not easily for free. Well, it's not baked in as a core concept. You can do a lot of stuff with full text search, and a lot of vendors have kind of added some of those features. So there's pretty much anything
Starting point is 00:19:27 you can say about any tool. Someone can come and say, oh, there's a checkbox for that and whatever. It's just a matter of how core that functionality is to the center of that tool. Yeah, stemming might not be a part of it. I'm not aware of that in something like SQL Server, but I know the full text, like find words close to this and all that. You can do a lot of that. Well, gray being like, you know, with an A versus an E. Yeah. And I can't remember.
Starting point is 00:19:50 It seems like I looked some of that up, and there might be a little bit of that that's available. But generally speaking, yes, it's not as full-baked as the other ones, right? And again, it's just picking the right tool for the job. And that's why I wanted to bring it up. There were several questions like, why wouldn't you just do this? Or why wouldn't you just do that? And none of them were attacking questions. I think it was truly like, yo, why wouldn't you? And hopefully we answered some of those. Yeah, that actually lines up really nicely with the topic of tonight, because we're going to be
Starting point is 00:20:22 talking about a lot of different algorithms and a lot of them ultimately do the same thing. Exact same inputs, exact same outputs, but there are strengths and weaknesses in this, even in these very limited topics that would lead you to pick one over the other. So let's use that as a segue to dive right in and let's talk about these algorithms. So we're going to be talking about some of the algorithms that you should probably already know. And if you don't, then this would be a good one for you. And, you know, one question is like, well, why, why do I need to know these algorithms? Right. So kind of, as I hinted at before, if you ever do any interviewing, these are certainly going to come up in some way or form,
Starting point is 00:20:59 like you might accidentally implement one of these and not realize it. And then they're going to start asking you about the complexity of it. You need to understand the complexity. That's another reason why we need to talk about these, so that you can know when or when not to use them. And, you know, aside from that, you probably aren't going to find a need to implement these yourself, because whatever framework or language you spend your day job in, it's probably going to have these or better implementations available for you. But remember that first point that I made. So you should actually start practicing how to implement these and others. Because if you do get asked in an interview, like, hey, how would you sort this? Or how would you implement this particular algorithm?
Starting point is 00:21:46 You're going to want to know. So let's dive in with the first, maybe the easiest one, and that is the bubble sort. And do we want to mention we got these algorithms from the simple algorithms section of The Imposter's Handbook, which is a book that we love? Yep. Yep. And before you dive directly into the bubble sort, I will say, being somebody that interviewed people at one of the
Starting point is 00:22:10 large tech companies, these things come up a lot. Just, you know, how would you do a binary tree search, or how would you do this, or how would you do that? It is something that, if you're going to work for one of the big tech companies out there, the Microsofts, the Googles, the Amazons, these are the types of things that you will get hit on. Yep. So let's go into the bubble sort. First, starting with, well, what is it? So there's going to be several of these algorithms, so I'm going to try to answer some core questions for each. So namely, what is it? Right? So we're going to iterate through a list that needs to be sorted and we're going to compare adjacent items, swapping them if they are in the wrong order. And we're
Starting point is 00:22:55 going to continue this process until there are no more swaps required. So let's consider we have a small list of five items. All right. And those items are in the following order, 1, 3, 5, 2, 4, okay? So we're going to start walking through this list and we're going to compare the 1 and the 3. They're already in the correct order, so we move on. Then we get to the 3 and the 5. They're also in the correct order. Then we get to the 5 and the 2.
Starting point is 00:23:21 This is the first time we found something that's not in the correct order, so we're going to swap those. So now our list is 1, 3, 2, 5, 4, right? Then we'll compare the 5 and the 4. So the 5 is getting compared a second time to an adjacent item, and we discover that we need to swap it. So now our list is 1, 3, 2, 4, 5. Okay. So we've gone completely through the list and we're mostly there. We had to compare one item a couple of times, but we're not done. So we've got to go back through the list again. So we're going to keep walking through the list. 1 and 3. We started at the very beginning, like node zero. Yes. Element zero of the array. Yes. We're starting
Starting point is 00:24:03 from the very beginning. So our 1 and 3 are still good. Then we compare the 3 and the 2. Now we need to swap those. So those are going to be swapped. So now our values are in the order of 1, 2, 3, 4, 5, but our comparisons aren't done. We still have to go through the remainder of the array. So we still have to compare 3 and 4 and then 4 and 5, and then we're done. Then we have to go back through it again, just to verify that there is no additional value that needs to be swapped. Right. So we went through that three times to go from 1, 3, 5, 2, 4 into 1,
Starting point is 00:24:42 2, 3, 4, 5. Right. So that's the bubble sort in a nutshell. You're just bubbling the values into the correct order, but you're doing it at a very slow pace, with one-at-a-time comparisons. So you just keep swapping neighbors until you don't need to swap anymore. Correct. So what are the strengths of the bubble sort? Probably the best one to say would be that, of all the algorithms you might ever hear talked about, this is going to be the easiest one to implement and
Starting point is 00:25:26 certainly the easiest to understand, right? So you can make a case that because this is so easy to understand and implement, it makes a very nice introduction to other algorithms, right? Although there are some that argue that other algorithms like an insertion sort or a quick sort should be used as the introductory algorithm instead of a bubble sort. But, you know, at least academically, I feel like bubble sort comes up first just because it is kind of easy to talk about, and it kind of whets the palate, gives you a taste for it, and then you can move on from there. So, you know, it's not so bad when the list is already sorted, huh? Yeah, if the list is already sorted, you're fine. Yep. Yep. Also, I kind of like that as an
Starting point is 00:26:14 example of what not to do. So a lot of times in, like, computer science class or in interviews, if you find yourself doing something that looks like a bubble sort, meaning you've got, like, a for loop and you're looking at everything twice, and you're like, hmm, this kind of reminds me of the bubble sort algorithm, chances are there's a much faster way to do it. And it's a trick question. Yeah. So, you know, kind of going along with that, what are the weaknesses of this algorithm? So, somewhat, I guess Joe might have hinted at it. Like, this might be one of the worst sorting algorithms that you could pick in regards to complexity. It's horribly inefficient, as you saw just in that five-element array. You know, we had to go through it three times to get it sorted. Right. So there's almost always a better choice than the
Starting point is 00:27:08 bubble sort, especially for large lists. I mean, you can maybe make the case that if your list is rather small, then maybe it's not that horribly inefficient and it's okay to use it, right? Maybe you can make that point, but usually there's something better that you could choose. And then, this I found curious: according to Wikipedia, the worst case and the average time complexity for the bubble sort is O of n squared. So if we remember our Big O complexity, you can head to bigocheatsheet.com. That one is not one of the worst offending hockey sticks, but it's really close to the worst offenders. It was like third worst in terms of time complexity. So, you know, and kind of to Joe's point a moment ago, like you saw that
Starting point is 00:28:16 we were nesting over this same thing multiple times. So, you know, that's almost always going to be a sure indication that you're heading into O of n squared territory. I was trying to think of a real-world example where you might do something like a bubble sort. And I was trying to imagine a teacher, say an elementary school teacher, and they've got a stack of test scores, like tests with grades on them. And they want to get them in order, from worst score to best score. And with bubble sort, what you do is you take that stack and you compare the first paper and the second paper. If one's bigger than the other,
Starting point is 00:28:50 you flop them, and you do that through the whole stack. And then you start back at the top of that stack and do it again, and then do it again, and do it again. So you can just feel kind of intuitively like you're going over the same data over and over and over again. It just feels wrong. Yeah, I did try to find some real-world uses for the bubble sort, because I was curious, like, this thing has to have more value than just academic use, right? And I really didn't come up with anything of interest. There was a case in the Wikipedia article for it where it talks about a use that is very popular and very specific to graphics. But it really felt like, aside from that, you should pretty much avoid using the bubble sort in the real world, you know, aside from being asked about it on a job interview, maybe. But, you know, otherwise you should try to stay away from it in, like, your day job, right? But in the case that Wikipedia was talking about though,
Starting point is 00:29:52 it was basically saying like, hey, if you can assume, which is a strong assumption, but if you can assume that the list is already almost sorted, then the bubble sort performance isn't so bad, right? Like it, it will be okay. And there is this one quote in here that I wanted to call it, see what you guys thought about it, where they, in the Wikipedia article for it, it says in computer graphics, bubble sort is popular for its capability to detect very small error, like swap of just two elements in almost sorted arrays and fix it with just linear complexity. For example, it is used in a polygon filling algorithm where bounding lines are sorted by their X coordinate at a specific scan line, a parallel line to the X axis with incrementing Y their order changes. Two elements are swapped only at intersections of two lines
Starting point is 00:30:46 so like that last part of it was like okay i thought i was kind of following you and then you got into the polygon filling algorithm part blah and it looks like it has to do with the actual like scan line so maybe like wanting to sort things in a way that like visually makes sense with that use case yeah i mean, that's where it was like, I don't know, I kind of felt like their example got a little too much into the weeds for me. But that was about as best as I could find for a real-world use case for the bubble sort.
Starting point is 00:31:18 And you can imagine that the worst case scenario is also when the data is sorted just in the wrong direction. Oh, man. Like 10 with 9, 8, 7, 6. That means you're going to have to march that 9 all the way over to that last spot. Yeah.
Starting point is 00:31:34 That's going to be ridiculous. Yeah. I think that wraps us up for Bubble Sport. Bubble Sport. You guys play that Bubble Sport where you get those things in the packages and you pop, pop, pop? That's right. That's actually awesome.
Starting point is 00:31:48 Bubble Sort is not awesome. But what is actually is pretty awesome. I got lucky here. I got Merge Sort. And Merge Sort is probably the most common type of sort. So if you're just doing something in a library and there's a good chance that it's going to be doing some sort of Merge Sort. I don't have any hard numbers for that, so maybe that's not true anymore. We've got some other algorithms that are pretty cool too, but this is like a good standard. And it's a divide and conquer kind of algorithm, which is pretty interesting. It's a type of problem where you keep
Starting point is 00:32:19 splitting up your input and then you kind of bring it back together. And while that sounds like kind of at a high level, like you're not really doing anything different than a bubble sort where you're just kind of looking at smaller pieces and then putting it back together, it actually ends up being much more efficient because you don't have to keep re-looking at the same amount of data, or the same data over and over again.
Starting point is 00:32:44 And so the way it kind of works is with that, using that teacher example, if you've got a stack of, of graded tests and you split that stack in half, and then you take that left stack and you split it in half again, take the left stack, split in half again, split it again until you're down to two papers. And you basically take those two papers and then you sort them, go back up a level. Now, if you've still got papers that need to split, you do that again. And eventually you're going to get to the point where you've got like two papers in your left hand and two papers on your right hand. Both of those are sorted.
Starting point is 00:33:21 And that's really the key there is that the papers in your left stack and your papers in your right stack are both in order. So what you can do is rather than comparing each element to each element is you, I'm not trying to describe it too well, but basically what you can do is only go through each item once. So you say left one, are you greater than the item on the right? If yes, then the right one goes in first and then you repeat. And the opposite of the, whatever the opposite of what I said was, it changes how you return. So what that means in practice is you split your stuff down and you build it back up. But because you know that the lists are sorted on the way up, you don't have to keep re-comparing the same values over and over again. Like the example where you've got like 1 through 10, but it's sorted 9, 8, 7, 6, 5, all the way down to 1. You're going to compare 9 with 8 more than once.
Starting point is 00:34:24 You're going to compare 9 with 7 more than once, 9 with 6 more than once. You're going to keep doing the same work over and over and over again. But just the simple act of only comparing these sorted smaller lists makes all the difference here. And that is going to bump us into making it O of n log n, which is kind of the standard for sort algorithms. I also wanted to mention that they call it a stable sort, which means that, I don't know if you want to go into this, but it's basically the idea is like that if you sort this list
Starting point is 00:34:58 twice with merge sort, then you're always going to get whatever item came first in a match. So if we've got two students that got a 78 on the test, you're always going to get the one that you ran into first. The one that was left most in your data set is going to be sorted in the first position, which in terms of numbers, it doesn't matter because a 78 is a 78. But for some reason, you're doing something where you do want to see that first value in the array being sorted higher or rather maybe you just want to make some decision,
Starting point is 00:35:32 maybe you want it lower, then this would be a good algorithm to use because you can make a deliberate choice on what happens when you see a match. We didn't get into the space complexity on these. So, um,
Starting point is 00:35:47 yep. Just real quick. I thought it would be worth mentioning again, you know, borrowing from the big cheat sheet and big O cheat sheet.com. Uh, the bubble sort space complexity was O of one, but for merge sort it's O of N.
Starting point is 00:36:02 Yep. And the deal with that is because, uh, in bubble sort, you don't need a new array. You could do everything in place. You just swap the values of the indexes. So that's really easy. You don't need a new data structure there. It's just whatever your input is. Merge sort is different. When we break stuff apart, actually into smaller and smaller arrays, you don't actually need to create separate arrays for that. You can actually just kind of keep track of your starting indexes. But when you start putting things in order,
Starting point is 00:36:30 building that array back up to return, you have to do that. I'm sure somebody's figured out some convoluted way of doing it without, but for the most part, you have to do it in a new data structure because it's very important that you keep those lists in their individual sort order when you're building this thing back up. And so you do end up having to create another data structure that is exactly as large as the one you got in, which sounds terrible, but actually works out really well and to its benefit in some cases where like, for example, this is a great algorithm to use if your data set is too big to fit in memory because you keep dealing with these things in smaller chunks. And then eventually when you kind of build this thing up, you're able to deal with it in small parts
Starting point is 00:37:15 because you're only ever comparing small things together. And so this is the algorithm that divides nicely up even in distributed environments. So kind of to say that in another way, in a low memory setting, if I give you an array of, let's say, a million integers, then with a bubble sort, while it might not be the most efficient way, you don't need any additional memory to do that sort. But with a merge sort, you will need to create a second array of equal size to the first. So you will have two arrays with a million integers in it. One is the input, one's the output. Correct. Yep. But in a merge sort, you don't have to have both those arrays in memory. Like it's easy for you to kind of only keep the chunks that are important. And same with BubbleSort. I'm sure you can imagine
Starting point is 00:38:05 you could throw that thing on disk and then only pull up the two individual numbers that you need. Well, I'm thinking about the case of if you just wrote a function to do the sort, and it had to return the result. That's why I said the input and output. But yeah, I mean, to your point.
Starting point is 00:38:22 Well, think about it this way. You've got a cluster of computers. You've got 10 computers and you've got a data input size of like a million. You could throw 100,000. You could divide that array up into chunks of 100,000, give each one of those 10 nodes 100,000
Starting point is 00:38:37 amount of the work to do. And they can all merge, sort their individual pieces and then pass up those pieces to some sort of master node that's going to merge those individual parts and that's totally valid nobody did any extra work it's still a merge sort but could you imagine doing that with bubble sort like each one has a hundred thousand and so it's like the first server does all its work now the second server does all its work third server blah all the way and then it'd have to do it all when it came back together too
Starting point is 00:39:02 right yeah i mean that doesn't work out very well well for bubble sort and being able to do that stuff at the same time. But with merge sort, it's like designed for it. Yep. Very cool. So the space was O of N and then it looks like the operations to do it are also O of N. It's got to run basically the number of operations as there are elements. No, the time complexity for merge sort was O of N log of N. It's got to run basically the number of operations as there are elements. No, the time complexity for merge sort was O of N log of N. That's time complexity, but the operation, right?
Starting point is 00:39:32 Once we need to sort the items. Oh, O of N was the sorting of the items. Yeah, no. So the complexity is N log N. Yep. And as a solid performer, it's often a kind of default. And so it's a popular choice. And I find it interesting to add,
Starting point is 00:39:49 in worst case scenarios, merge sort does about 39% fewer comparisons than quick sort does in the average case. So a lot of how people compare search engines, search, search engines, sorting algorithms is they basically compare them to each other for different types of use cases.
Starting point is 00:40:05 And there are some certain cases of ones outperforming other ones. So this one just overall performs pretty well. But it does have a couple of weaknesses. Mainly one that I think is the weakest point is that it's annoying to program, which is silly because obviously this is the kind of thing that you program very rarely and use very often. But I sat down at Panera today for lunch and programmed it. And it's pretty annoying because you have to kind of split it into two parts. The first part is easy. You basically have to kind of keep splitting down the arrays until you get
Starting point is 00:40:35 two individual arrays with like one element in it. And then you merge those guys back up. And then you write a function that actually does the merging. And it just seems like the whole thing is like edge cases and if statements. So it's not a very elegant looking, uh, algorithm. If you do it correctly,
Starting point is 00:40:53 you know, you'd like bubble sort. Like, I feel like, you know, it's just like a couple of lines. He's like, you can look at it and get it.
Starting point is 00:40:59 We're sort of, you're looking at it. Like, are you sure this is faster? So this looks, this looks wrong guys. Hey, so going you sure this is faster? This looks wrong, guys. Going back to this, we said log in, all that kind of stuff. Any of you want to
Starting point is 00:41:14 give just a real quick synopsis seeing as how these are numbered things? What log of what means what? Anybody remember off the top of their heads in terms of math? It's a whole lot less. So, we're generally talking about, is it log base 2? Well, that's the thing, right? You have to know what the base is. I think by default, it's usually log base 10, I think. So, log of 10 would be 1 because it's 10 to the first. So, 10 to the first gives you 10, right?
Starting point is 00:41:47 Log of 100 on a base 10 would be two because it's 10 to the second power gives you a hundred, right? And so log of a thousand would be three, et cetera. So now base two, that would obviously be, you know, whatever it took using that base. So, it's important. I'm talking log base two. So, pretty much any time in computer science that we're talking about logarithms,
Starting point is 00:42:12 like in time complexity, we're almost always talking about base two. Okay. So, if we're doing. Which is messed up. Yeah. So, it's worth knowing that because it matters in terms of what you're thinking about, right? So if you do, if your base is two, then we're saying what, two to what power will give you the next number, right? So log of two, base two, the answer is one, right? If we say log of four, then it's two,
Starting point is 00:42:41 because it's two squared, right? So then log of eight would be three and log of 16 would be four, et cetera, right? So it's basically the inverse of a power is kind of what you're getting at, right? So if you go back and put on your math hats, you know, when you had two to the second power or let's say four. Eight, it's 256. Yeah. And if you wanted to know what the inverse of that was, like four to the one half power, that was two squared, right? So, you got to go back to your math thinking, but it's basically the undoing of the powers.
Starting point is 00:43:18 Oh, God. I think I just did it wrong. You can fit 256 worth of binary in eight bytes, but I think it's actually 2 to the 8th is something else. Well, let's do 256. I think that might be right. Let me ask Google. Like log base 2 of 256. Log base 2 of 256 is 8.
Starting point is 00:43:36 That's correct. Okay. Yeah, because it's 2 to the 8th power will give you 256. So just remember that when you're talking about logarithmic time, you're basically doing an inverse power. And that's why they're so much faster and more efficient because as your size of your inputs or your operations increase, the amount of time or complexity of whatever is going down by a factor. And that's why log, you know, when you talk about O log in, that is why that's so much better is because as your size is increasing, your actual operations are going way down or
Starting point is 00:44:14 your space or whatever, whatever you're talking about, your complexity or your O, that is going to go way down. So, yeah. I was at Panera today. I actually did a little example where I programmed bubble sort and then I programmed merge sort and I ran them both. And bubble sort was just a random array of like a So here you go. Every time we changed the numbers for only 1,000 items in the array. And you can tell that it actually did pretty well because that's much less than 1,000 times 1,000. That would have been a million. And my merge support, my merge support, merge sort did 9,488. So, a whole 1% less comparisons and 245 swaps. Wow. That's almost nothing comparatively. Well, this is, you know, going back to your logarithm conversation there, you know, I've always thought about whenever I think about logarithms, it's always, you're talking about
Starting point is 00:45:19 measuring magnitude. Yes. Like you're, you know, the order of magnitude, like what's the difference. And so, you know, yeah, it's, it's like when you talk about, um, uh, earthquakes,
Starting point is 00:45:34 for example, right. And, uh, you know, uh, a four versus a five versus a six, you know,
Starting point is 00:45:41 on the Richter scale for the strength of the earthquake, right? Those are like orders of magnitude different. You know, it's not like it's just one more powerful. Right. It's an order of magnitude more powerful than the other one. So that's kind of the way I think about logarithms. I could be wrong. No, that's probably pretty accurate.
Starting point is 00:46:02 I mean, it's almost like when you talk about sound like decibels, a lot of people don't realize, you know, 30 decibels is like a whisper somewhere in that ballpark. 40 decibels is double the volume of 30 decibels. 50 decibels is double the volume of 40 decibels. So it's not like it's just going up 10 things. It's doubling every 10. So, yeah, it's the same type thing where you get into the powers. And so the reason I brought it up is because it's easy to say, oh, you know, log in or oh, in log in. So I just, I wanted to bring it up because you have to think about these things in mathematical terms, because that's what big O notation is, right? It's all about what is the
Starting point is 00:46:39 mathematical operation or complexity of this thing. And so like when you say n log four right we just said it's base two that would be two but if you said you know if it's o n log n then you have two times two so your complexity is you know oh four or whatever so um i think it's i think it's worth at least knowing when we talk about these things so you're not just drifting off in a no man's land going oh god what what are they talking about you know one thing i noticed is kind of funny about the complexity times if you look at any of the cheat sheets like anytime you're looking at like an algorithm and says you have the input or you have to whatever like you know there's a log n going to be in that number it's expected to write in that blank there. Yep. Yeah. I mean, the big O cheat sheet.com is amazing resource for, you know, anything like if you
Starting point is 00:47:29 already, you know, all this conversation about O of login or O of in login or O of in or whatever, you know, if any of that's, uh, you know, if you're questioning any of that, you get to head to bigocheatsheet.com. And it's a great resource that just very simply, very easily visualizes it for you on a graph. And then they talk about it in terms of common data structures and then sorting algorithms. And there's even like a poster that you can get if you wanted to just hang that on your wall next to your Lamborghini or Ferrari posters. It would be right up there. It would belong. It would fit right in. All right. So with that, then we're going to get on to mine, the quicksort, which is a little bit harder to explain just because there's so many things that kind of happen at once. But this is also another divide and conquer strategy.
Starting point is 00:48:27 So what it does is it basically takes a list and it breaks those lists in a small list, but it uses a pivoting technique. So what Joe just talked about with the merge sort, he just put the list in half and then sorted within those once it got down to two, right? Like half, half, half, half until you get to two and then you sort those and then you put them all back together. This one's kind of interesting because what it does is it sorts in place, but it uses pivot. So it kind of goes through the thing where it breaks it apart into smaller ones.
Starting point is 00:48:58 And then after it's all done, then everything's just sorted in place. And here's the interesting thing. It says the partitioning is done at random, sort of. And the reason I say sort of is because what they do is you take a list and you take the last number in that list. And this is actually called the Lomoto partition scheme to where you take the last number in that list, the last item in the list, and that becomes your pivot point. So I'm going to give an example, and this is going to take a little while to walk through. I mean, I'll go as fast as I can, but there's a lot of swapping going on.
Starting point is 00:49:32 So with this, the quicksort, let's say that we have the numbers in a list, 5, 2, 7, 4, 1, 8, 3, 6. Now, I know that's a bunch of them, but to get the pencil. Yeah, get a pencil or, you know, go to the site when this thing's up and you can take a look at it because I'll have it written out as far as, you know, what they end up being after each one. So the last number here was six. So it says you're going to take that last number and you're going to use it as the partition value. And all that means is that's going to be your pivot. So everything's going to be compared to that number as you go through this list. So what it's going to do is all values less than the partition will stay to the left of that pivot point.
Starting point is 00:50:14 So anything that's less than six is going to stay on the left. You don't have to move it because it's already to the left. It's already in the right spot or the correct spot. Let's use correct, yes, because right's going to get confusing. Turn here? No, turn there? Turn right? Yeah. Let's use correct. Yes, because right is going to get confusing. Turn here. No, turn there. Turn right. Yeah, right.
Starting point is 00:50:27 Turn that. No, sorry. Anyways, so all the values that are greater than six are going to get shifted over to just after the pivot point. But to do so, that means that whatever number was before the six is going to have to swap places with the number that just moved to the right of the six. All right. So if you think about it like this, if we find a number that's greater than six, that thing is going to move over to the right of six, which means it's going to push six to the left in that list. So whatever was over there on the left now has to go back to where that other number was towards the front of the list. So let's go ahead and roll through this thing. So again, it was 5, 2, 7, 4, 1, 8, 3, 6. All right.
Starting point is 00:51:12 So the first number five, it's less than six stays where it is. The second number was a two. It's also less than six. So it stays where it is. Seven was the third number in that list. It's greater than six. So what we said we were going to do is we're going to take that 7 We're going to bump it onto the very end Of this list now Right? Well you don't bump it on you swap it right? Well you're basically going to push the 6 over to the left
Starting point is 00:51:37 Which the number before the 6 Was a 3 So now you're going to move that 3 to where that 7 was And then you're going to move that three to where that seven was. And then you're going to move the seven after the six. So kind of what you're doing is you're shifting all the numbers down before the six or you're not, not all of them. You're shifting the six down one position, which moves that, that number that was there back to where the number is just being pushed out to
Starting point is 00:52:02 the right of six. So now – Your pivot point is moving. Your pivot point is moving. Your pivot point is moving inward. Yes. So we started with 5, 2, 7, and now we're at 5, 2, 3. 5, 2, 3, because the number that was before six at the end of the range was three, and that had to move into the seven position. So now what we had starting was 5, 2, 7, 4, 1, 8, 3, 6. Now what we have is 5, 2, 3, 4, 1, 8, 6, 7, because that 7 is after the 6 now.
Starting point is 00:52:33 So now what we're going to do is we're going to look at the 3, because the 3 just took the 7 spot. So we want to make sure that that thing is in the right position. 3 is less than 6, so it's going to stay there as well. just took the seven spot. So we want to make sure that that thing is in the right position. Three is less than six. So it's going to stay there as well. So now we're going to take a look at four. Four is good. That stays there. One is good. It's less than six. So that stays there. The next number in the list was eight. Eight's more than six. So we're going to do this same little dance move, right? We're going too fast, man. Hold on. Grab my pencil down. So now what we're at right now is 5, 2, 3, 4, 1, 8, 6, 7. So we compared the four and the one, those are both less than six. So those didn't move. But that eight right before the six now,
Starting point is 00:53:23 that is obviously more than six. So we're going to move it to the right of six. Well, conveniently, because it's right next to the pivot point, all we're going to do is swap these. Right. So now what you've got is five, two, three, four, one, six, eight, seven. All right. So that's the end of the first partition sort. So now what we're going to do is you sort of take six out of the equation now. So now really what you've got is you've got the list to the left of where six ended,
Starting point is 00:53:50 and you've got the list to the right of where six ended. So on the left, you have five, two, three, four, one, and then six is your partition point. It is where it's going to be at the very end because of this little dance move that we just did. To the right now, you've got eight and seven. So we're going to go through these lists individually and break these down. So here – My first inclination is to think like, okay, move the pivot over, start over again. But then you realize like, wait a second, we've already got that six.
Starting point is 00:54:16 We've got a divider and we know that we've got a list on the left that's all smaller and a list on the right that's all smaller. So why don't we just partition sort those two since we know we can do it in line. Exactly. And the thing is, is because of how those things moved around, that six actually ended exactly where it needs to be for the final result set. And that's the cool part about the way this thing shifts around. So with 5, 2, 3, 4, 1, you have to pick a new pivot point on that left list, which the last number was one. So that's your pivot point. So this is where it's going to be kind of interesting. So the first number five is obviously greater than one. So you're going to move that to the
Starting point is 00:54:57 very end. And so now you've got, and because you did that, the number that was before one, which was four is going to take the place where the 5 was. And I know this is confusing as heck to listen to, but now you've got 4, 2, 3, 1, 5. You're going to do the same comparison now. Because 4 moved into that first spot, you're now going to compare that with 1. 4 is greater than 1. We're not doing anything different. This is the same rules.
Starting point is 00:55:23 We're just kind of applying them and kind of jolting it down. Yep. You just keep going. And so now because you move that four into that first position, you're going to compare that one again. Four is greater than one. It's going to go to the right of the one. The one's going to shift down and now three's in the first position. Three's greater than one. You're going to move it after the one shift, the one down and now two's in the first position. So this is really cool because this is actually just sorting the list all the way down and you end up with one two three four five now on the right list we had eight and seven this one's really easy the pivot point is seven the first number is eight eight is greater than seven it's just going to swap places with the seven
Starting point is 00:56:01 and now you have seven and eight and then you you put these lists, these lists are now in the correct order. They were sorted in place because they were just shifting and moving things around. So this is all done and it was done pretty well. Now it's kind of confusing and hard to look at. Um, so let's talk, I didn't break these down into weaknesses and strengths like they did, but hopefully I'll be able to summarize this pretty well. So here's what's interesting. If your list is already sorted and you take this approach, it is horrible performance because you're going to take these pivot points and you're going to go through all of them and you get to O N squared time on this thing because it's going to basically shuffle everything to put them back in order. So that's kind of an interesting thing to know about it. Wait, it'll shuffle everything? Yeah, so if you think about it, if you have 1, 2, 3, 4, 5, 6, it's going to make 6 your pivot point.
Starting point is 00:57:02 And so now it's going to compare 1 to 6. In order. It's's going to compare one to six. So now... In order. It's not going to shift everything. It's going to have to keep moving the pivot. It's going to have to keep... So one, two, three. It's going to go through all of them. Why?
Starting point is 00:57:20 Hold on a second. No, that... One, two, three, four, five, six. Am I missing something? Hold on. Oh, well, you... Am I missing something? Hold on. Oh, well, you still got to go through the list. Maybe you're thinking if it was reversed. No, they said in order.
Starting point is 00:57:34 If it's all sorted and it's already sorted, you have to have a pivot point and then... Huh. Why would that be? Now I'm not so sure. Because it seems like it'd be o sub n not n squared i'd have to go back and look at it now i read it and blindly trusted it any thoughts there joe trying to look it up it doesn't make sense to me just like you guys said um to like i would think that the worst case scenario would be
Starting point is 00:58:05 reverse sorted. Because you would immediately... You'd keep swapping that. You'd have to keep moving it. The worst number of times. Well, I have the link. Yeah, I mean, I do see that in the book. Again, this is
Starting point is 00:58:24 the imposter's handbook where he says that if it's pre-sorted, we'd have to split and order the list for every element turning the complexity into O of N squared. Yeah. And thinking about it, like you said, the 6 is your pivot point. Unless you're moving something to the right of 6, I don't see why that would happen. And I actually can't see. Oh, I think it's because you I think it's because you keep moving
Starting point is 00:58:50 the pivot. Right? But you don't move the pivot, right? Because you never swap it. But you still have to compare with every other operation. So it's not about the swaps. It's like your array of eight there. You do eight with seven, eight with six, eight with five, blah, blah, blah. Then seven with six, seven with five. So you literally compare every number against every other, blah, blah. Then 7 with 6, 7 with 5.
Starting point is 00:59:07 So you literally compare every number against every other number. And that's where the N squared comes from. Okay. I'll take that. No, no, no. But wouldn't you still have to go... Am I thinking of this wrong? Maybe I'm thinking of this wrong. Because I was thinking if you had like a five element array.
Starting point is 00:59:21 Let's just talk 1, 2, 3, 4, 5. Right? So 5 is your pivot point. You have to compare five to all of the first four numbers, but that doesn't necessarily, you don't know yet that your other four numbers are in order. So now you got to pivot. You got to use four as the pivot because it's on the left of the five.
Starting point is 00:59:38 So you got to use four as the pivot and find out that everything's in order. Now, you know that four is in the correct place, but you don't know about the first three. So you got to go to the left, everything on the left, because everything on the left isn't sorted yet. So now you have to go to three as your pivot. So you keep moving your pivot point,
Starting point is 00:59:54 and that's where the O of N squared comes from. I don't think you do that moving though. I think it's all about that comparison. I was able to finally find somewhere to verify that the worst case scenario is there's actually three of them. One is when the array is already sorted in the correct order because you have to compare all of them. Another is when it's sorted exactly wrong, and that is because you have to compare every one against each other. But in that case, you actually end up doing that swapping too.
Starting point is 01:00:18 It's just that the swapping isn't such a big deal compared to the actual number comparisons just because there's so many more of them. And the third case is actually if all elements are the same, like if you've got an array of all fives, that's also the worst case scenario because in that case, you're also comparing every number against every other number. Okay, so it's just in the comparisons that you're getting that. Okay, that makes more sense.
Starting point is 01:00:40 Yeah, because I don't think the pivot point would ever move at that point because you don't have anything to swap. It'd have to be. And kind of like intuitively, like swapping is more work. So it's kind of like tempting to think like, okay, that's a problem. But really when you're looking at algorithmic complexity, it's all about how it scales. And so it doesn't really matter that the operation takes a little bit longer to do. It's all about how many times it has to do operations. Okay. Wait a minute. How am I missing this then? It would have to, the pivot would have
Starting point is 01:01:03 to move. In the 1, 2, 3, 4, 5 example, they're already in order. Your 5 is your pivot. You compare 5 to the first four numbers, 1, 2, 3, 4. And each one of those numbers is already less than 5, but you don't know that they're in order. You just know that they're less than 5. Right. So you have this list on the left that is now 1, 2, 3, 4 that you know is less than 5. But you still have to go through and sort that.
Starting point is 01:01:28 So you have to move the pivot. So the pivot point moves to the 4. Oh, I see what you're saying. Yeah, you have to move the pivot. You have to choose the new pivot. You don't have to swap the pivot. That's why it's O of N squared because your pivot point becomes everything from the list backwards. Okay, that makes sense.
Starting point is 01:01:42 Okay, so you're not moving your pivot. You're choosing a new pivot after each run. However you want to refer to that. I'm sorry. When you said move, I was thinking swap. That's what I was thinking. I was thinking swap, too. It's not swapping. I just meant you're picking a new pivot point for every number. That makes perfect sense. And yes,
Starting point is 01:01:57 that is exactly what you'd have to do, because what you were saying, you wouldn't know that they were sorted until you stepped through every single one of them. Okay. So, here's an interesting thing. So the take on it is typically you choose the last number in the series to do this. There's another approach to this that can make it better in most cases where you try and get smart and you try and pick a median, right? Which means that you would have to scan the list
Starting point is 01:02:22 one time to find out what all the values were, and then you'd have to pick the one in the middle, right? So, or an average or whatever, whatever's closest to the average. And that way you sort of divvy up your list into an even on the left and an even on the right when you choose an intelligent pivot point. But that does require an additional step up front without just jumping directly into the algorithm. But
Starting point is 01:02:50 if I remember right, this one, man, I could have sworn I put the big O on this one. So if we knew we had a normal distribution, then this would be a great choice, because we know that, assuming it's randomized nicely, things are mostly going to sort out.
Starting point is 01:03:08 And it also seems like the worst case scenario we said was O of N squared, which is on par with bubble sort, but it's much worse than merge sort. However, I'm guessing that in the average real world cases, it ends up performing much faster than merge sort. Otherwise, we would never even talk about it. Yeah, it says in the best case it's O of N log N. So it's somewhere in between O of N squared and O of N log N. So,
Starting point is 01:03:34 you know, it's probably, it's probably not terrible in most, in most cases. Yeah. But you can kind of see why like merge is the kind of the default there because it's like, well,
Starting point is 01:03:44 we've got one algorithm that is sometimes faster and sometimes much slower. And we got another one that's always pretty good. Right. Yeah, I mean, like time complexity for merge sort, best, average, and worst were all the same O of n log n. Yeah, but to your point, like, you know, quick sort goes from one extreme to the next. But there is the space thing, though, right? So that's where the merge sort is actually a little less efficient because it's going to require double the memory footprint. Yeah, this is where the magnitude comes in.
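The flavor of quicksort being discussed, choosing the last element as the pivot, might be sketched like this in Python (an illustrative sketch using Lomuto-style partitioning, not a tuned implementation):

```python
def quicksort(items, lo=0, hi=None):
    """In-place quicksort using the last element as the pivot."""
    if hi is None:
        hi = len(items) - 1
    if lo >= hi:
        return items
    pivot = items[hi]  # last element as pivot, as discussed
    boundary = lo      # everything left of boundary is <= pivot
    for i in range(lo, hi):
        if items[i] <= pivot:
            items[i], items[boundary] = items[boundary], items[i]
            boundary += 1
    # put the pivot into its final, settled position
    items[boundary], items[hi] = items[hi], items[boundary]
    # recurse on the left and right partitions; the pivot never moves again
    quicksort(items, lo, boundary - 1)
    quicksort(items, boundary + 1, hi)
    return items
```

On an already sorted input like [1, 2, 3, 4, 5], every element lands on one side of the pivot and you have to pick a new pivot for each remaining element, which is exactly where the O of N squared worst case comes from.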
Starting point is 01:04:15 Yeah, so depending on what you're doing here, this is sorting in place, right? This is just shifting out elements in the array or whatever, you know, whatever memory thing that you've got. So your space is constant, but your, your actual operations could be pretty bad. Wait, constant? Or, uh, the fact that it'd be O of one, right? The same, whatever the input is, it's the same. You know, I think that sounds right, but it's curious because I have the Big O cheat sheet here in front of me. That's why I sound so smart.
Starting point is 01:04:47 And they actually show the space complexity. Oh, no, this is the worst case. I'm sorry. Worst case for quicksort, they show the space complexity is O of log n. Huh. Why would it be less? That doesn't make sense. Yeah, I don't know.
Starting point is 01:05:09 I'm with you. I think that it would be similar to the bubble sort where it would just be O of 1. That seems reasonable. The only thing I can think, the only reason I can think that you would have a log, and I'd probably need to look at the code, is my guess is you're probably going to have a variable
Starting point is 01:05:23 to store the swap values, like one or two variables there, because as you're shifting things, you might need to move something into a temporary store to push it out. I don't know. You have to do the same thing for the bubble sort though. You can do math. You can do math.
Starting point is 01:05:38 I mean, you could, a constant variables like that should drop out of the equation. They're not supposed to matter. Right. Yeah. But you got to wonder too, like all this stuff like really depends, like looking at quicksort
Starting point is 01:05:48 even there's like different ways of doing it, different ways of choosing the pivot. There's even a stable quicksort and a non-stable quicksort. And so you can imagine that like if you're, you know, big O cheat sheet, like you might be able to construct a situation with one version of this algorithm that somehow, you know,
Starting point is 01:06:04 ends up looking like this. So it's really hard to know exactly what their qualifications are and exactly which implementation of the algorithm they're using to score. Yeah, man, I'm not real sure why. Yeah. Anyways, you know, it's mostly shifting things around within the arrays. So that's, that's kind of odd, but anywho. Yeah, I'm curious about that too. Yeah, there's another great resource. I'm going to have a link to this one. This might be one of my favorite resources that I've come
Starting point is 01:06:31 across here lately. But I want to say that you can get to it real quick. I'm going to double check it. If you go to sorting-algorithms.com, it'll redirect you to the real one because it's a really long name. But it has like all these great animations for all of the different sorts. And you can see like how, for example, a quick sort works on a random list, a nearly sorted list, a reversed list, or a list with few unique items,
Starting point is 01:07:09 right? And you can see like all these animations comparing to one another, or you can see how one sort works for all those variations of the data. But you can also go and dig into any one of those and see the details of that sort. And so for example, there's quick sort, but then there's another version of it, which is instead of the two way partitioning that Alan described, there's another version of, for a quick sort where it's a three way partitioning,
Starting point is 01:07:35 um, that they get into. But even in the quick sort, um, here they're saying the space complexity, the worst case space complexity would be O of log n.
Starting point is 01:07:49 Same as what Big O had. So, all right. Well, I guess we didn't answer that. So let us know in the comments there. We'd love to see that. And also, if you have any resources for somebody who maybe learned about this stuff like, uh, 15 years ago and is now kind of forgetting, but they don't want to start from zero, let me know. I mean, let us know. It's funny, some of
Starting point is 01:08:18 these things I literally have not seen since college, right? Like, it's been a while. My computer science one course, it felt like all we did all year was just talk about the four main sorting algorithms that we're going over right now. So I've done all of these at least once, 15 years ago. Alright,
Starting point is 01:08:38 so what do we got next? Alright, so I got the luck of the draw with the great algorithm. So selection sort is next. So what is the selection sort? So you remember when I said that the bubble sort might be like one of the worst sorting algorithms that you could ever use? Meet selection sort. Obama even knows not to use bubble sort. All right. So maybe it's not, this might not actually be the worst maybe, but this is how we might actually think about sorting
Starting point is 01:09:15 something. So it's kind of easy to think about this one in our head. So let's take playing cards, for example. If you had to sort playing cards in your hand, like you're playing poker with somebody, selection sort might be how you might sort those cards in your hand. We're like, okay, let me find all of the twos, then all the threes, then all the fours, and try to put them in order. We're going to iterate through the list, find the smallest element, and we're going to swap it with the first element.
Starting point is 01:09:40 We're going to repeat this for each element in the list. Let's consider the same list that we had with the bubble sort. 1, 3, 5, 2, 4. Okay. So we're going to scan the list looking for the smallest element. And we find that it's the one and it's already in the correct position. So we're going to scan the list again, looking for the next smallest value, which could still be a one, but in this particular example, it's not. So we're going to find the two and we're going to swap it with the second item in the list, which was a three. So our list went from one, three, five, two, four to one, two, five, three, four. Okay. So we're going to scan the list again, looking for the next
Starting point is 01:10:26 smallest value and we're going to find the three. So now we're going to swap the three and the five. So our list now goes from what was one, two, five, three, four to one, two, three, five, four. And we're going to scan the list again and we're going to find that the next smallest item is the 4, and we're going to swap it with the 5. So now our list went from 1, 2, 3, 5, 4 to 1, 2, 3, 4, 5. All right? So we iterated over this list of five items four times. So you've basically got, you're keeping a pointer to the next index that you're going to swap out. And then you're keeping track of the last or the biggest number that you've swapped at that point.
Starting point is 01:11:13 Or the, yeah, the biggest of the low numbers that you've swapped. Okay. Yeah. But keep in mind, though, you could be sorting things where the same value could be repeated multiple times, right? So, all right. What about strengths for the selection sort? All right. So doing a selection sort, you can just write a simple loop.
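That simple loop might look something like this in Python (a minimal sketch of the selection sort just described, not production code):

```python
def selection_sort(items):
    """In-place selection sort: repeatedly find the smallest
    remaining element and swap it into the next position."""
    n = len(items)
    for position in range(n - 1):  # the last position takes care of itself
        smallest = position
        # scan the unsorted remainder for the smallest value
        for i in range(position + 1, n):
            if items[i] < items[smallest]:
                smallest = i
        if smallest != position:
            items[position], items[smallest] = items[smallest], items[position]
    return items
```

Running it on the 1, 3, 5, 2, 4 example walks through the same swaps described above: 1, 3, 5, 2, 4 becomes 1, 2, 5, 3, 4, then 1, 2, 3, 5, 4, then 1, 2, 3, 4, 5, with at most one swap per position.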
Starting point is 01:11:42 So that's great because it doesn't require any recursion. And anytime we ever think about recursion, we should immediately have some alarm bells that go off in our head because we should think about the data that we're going to be recursing over because we need to be concerned about stack overflow exceptions, right? Unless it's tail recursion. That's the bomb.com. When we recurse, all of those variables are staying on the stack, right? As we go into the next iteration of that function.
Starting point is 01:12:19 So that's one thing to consider. Now, here is an interesting thing. According to the Wikipedia article for the selection sort, it usually outperforms the bubble sort. But if you were to look at the Big O cheat sheet's, you know, ruling on the time complexity of selection sort versus bubble sort, and the sorting-algorithms.com animations that I mentioned, they would leave you to believe that bubble sort outperforms selection sort. So your mileage might vary, but the Big O cheat sheet ranks selection sort best case at O of N squared. And it's O of N squared for its best case, its average case, and its worst case.
Starting point is 01:13:14 Well, it's consistent. It's got that going for it. It's much like the merge sort in that regard. It also sounds stable. I'm really not sure what the Wikipedia article's meaning was, then, in saying that it would outperform the bubble sort. But, um, this would be a good choice for a small list. And aside from that, you know, like Joe said, merge sort will probably be your default go-to, right, if you're a computer. Well, I'll tell
Starting point is 01:13:54 you, if I'm sorting something, no matter what it is, this is what I kind of do. If I'm sorting socks, or let's be real, sorting Magic cards or Pokemon cards, right? Like color and then cost. Like that's where I would do this kind of sort. I'm not merge sorting or merge conflicting. You don't merge sort your Pokemon. I know that I should. That's why I beat you all the time.
Starting point is 01:14:15 Yeah. So I can't find my strong stuff. All right. So one advantage to this algorithm, kind of like Joe hinted on with it being stable, is that in circumstances where memory is limited, then this isn't a bad choice. Like the bubble sort, its space complexity is O of 1, because you're able to just swap things out in place. In regards to its weaknesses, though, also similar to the bubble sort, it's not efficient. It's not a very efficient one. Like I said, its worst case time complexity is O of N squared. And it's pretty much like it's always time complexity
Starting point is 01:14:58 is O of N squared. And it's pretty much like its time complexity is always O of N squared, best, average, or worst. And we have to scan each element in the list in order to find the smallest value for each position, except the last one, because the last one is going to be assumed to already be in the correct place. So this is not a good choice for large arrays. You know, I wrote that and now I'm trying to remember why I wrote that the last position can be assumed.
Starting point is 01:15:35 Well, it kind of makes sense. Why did I write that? Or does it? Yeah, because once you get to that last position, there should be no more swapping at all. Because it's either going to be equal to or greater than the value before it, I believe. Because you know it's not the lowest value because that was the first step. Right. Right. I'm going to rethink that.
Starting point is 01:16:02 So, when to use the selection sort? Playing Rummy? Playing Rummy, yeah, exactly. So if you're playing any kind of card games, this would be a good choice. Which kind of to that point, I mean, this is good for small arrays. So if you're playing a card game, right, you're only going to have a handful of cards in your hand. So, you know, I mean, that, that makes it an easy, you know, kind of relatable analogy there. So, but then the question is, well, how small is small, right? Like how many,
Starting point is 01:16:36 how many items in your list is still considered small? So some of the things that I read, people would argue that like, hey, if you have 10 to 20 elements, then that's small and selection sort would be okay to use. Others would say 20 to 30. So, yeah, I mean, I guess use your best judgment, maybe. Yeah, if you have the AMD Threadripper 19, you know, whatever. You're probably fine. Speaking of probably fine, the best sorting algorithm you can use nine times out of ten is just like whatever your programming language does when you do dot sort.
Starting point is 01:17:12 That's what most people do, right? Now, here is a very interesting advantage, kind of a when-to-use strength, about the selection sort, though. So, earlier in the conversation, Joe had mentioned like, well, hey, you know, with the merge sort, if you're going to like write this out to flash, or if I'm
Starting point is 01:17:29 going to spread this across some kind of distributed load or whatever, like, you know, depending on what my write medium is, maybe it's not as simple as just, like, hey, give me a function that takes in an input and returns an output. Right. So if maybe what you're trying to sort is being written to flash memory, then as you're doing this sort operation, the selection sort is actually a good choice because it results in fewer write operations, which for flash memory would be important because every time you write to that flash memory, you're degrading its life. So that's interesting. Yeah. I mean, in this case, you'd be trading I.O. for operations, but that might be preferable, because with the bubble sort, you could be literally just constantly swapping all the way up it, right?
Starting point is 01:18:16 Yeah. That's interesting. There are use cases for it. That's funny about bubble sorts. If it's reverse sorted, it will literally march almost every number to almost every position.
Starting point is 01:18:29 Yeah, it's going to write it. You have 10 of them. It's going to write each one of them in its position minus one total, right? All the way down. Whereas this could just be like if you had 10 cards, you'd have 10 swaps and that'd be it.
Starting point is 01:18:46 Yeah, that's pretty cool. Yeah, so moving on to heap sort, I want to take a brief analogy just to mention why I keep bringing up stability. Stability may seem like kind of like a weird outlier that no one really cares about, but actually in practice it has some really cool implications. So when you think about that magic card example where I've got like a big shoebox full of magic cards, right? Let's pretend that I've got them sorted alphabetically. And at some point I realized, you know what? That's kind of annoying for building decks. So I would rather have them sorted by mana cost. So that's going to be some sort of number. So if I use a stable sorting algorithm on those magic cards, which means that ties are going to reliably
Starting point is 01:19:26 end up in the same relative order that they went in, then what that means is my output, when I'm done, that shoebox is going to be sorted by mana cost and then by name. If I use a sort that isn't stable, then there's no guarantee. So I'm going to end up with them sorted by mana cost and then nothing. So for example, if you're doing something like in a SQL database or you're building a relational database because you're crazy,
Starting point is 01:19:53 then you... Those aren't real things in the world, I hear. You want to order by this and then order by that. And then you want to reorder it or do something like that, then you're going to want to consider a stable sort because you're going to be able to maintain that secondary order without doing any extra work. So that was my side note. Sorry. Cool. Heap sort. That's what I'm talking about. Heap sort is another sort algorithm where this time you're going to basically divide your input into a sorted and unsorted region, which kind of sounds a little bit like quick sort,
Starting point is 01:20:33 where we're going to start with this big unsorted region. And then we're going to kind of lop off a small part of that array or input. And then we're going to say, this is the sorted portion. We're going to start with just one item. That's the only item that's in order. As we go along, we're going to keep plucking items out of that heap and we're going to be inserting it in order into that sorted region. Over time, the sorting
Starting point is 01:20:56 section is going to get bigger and your unsorted heap is going to get smaller. This actually looks really cool on those display, that website sorting algorithms, because you can kind of see it, like it kind of all like builds one side at a time and eventually like overcomes the chaos.
Starting point is 01:21:14 It's quite beautiful. But I want to mention specifically when I say heap, I'm talking about a very specific data structure. I'm talking about a certain kind of tree where depending on if it's a max heap or a min heap, the parent is always either greater than or less than its children. And so for a heap sort, there's like a million different ways to do any of these things. But the way I'm mostly going to be talking about is a max heap where we're going to have the children be always
Starting point is 01:21:46 less than their parent. So what I'm going to do is basically take this unsorted array. I'm going to find, I'm going to take the first node. I'm going to grab the second node. If it's bigger, it's the new parent. If it's smaller, it's going to go as a child of the root node. And I'm going to keep doing that until I build it up. And what that means in practice is that when it comes time to getting the biggest node, I always have it. And the rest of my heap, while not sorted, is pretty close. So this is like a cool take on the algorithm, where step one is create this heap, which actually gets us close to sorted. And then we kind of let the other half of the algorithm kind of take over where we do,
Starting point is 01:22:31 And every time after you pluck the root node off of the max heap, you do have to do a little shimmy on the rest of that heap to make sure that the top node is again in the top position. But you only ever have to look at two children. So you don't have to look at that heap and keep resorting over and over and over again to find that max node like you did in selection sort.
Starting point is 01:23:04 You always know that it's always one of these two values. Does that make sense? Keep going. All right. I'm not. I'm trying to picture some of it. Yeah. So the deal with the max heap data structure is that you always know that your root node is the biggest node in that heap.
Starting point is 01:23:29 It's always the biggest. Okay. So if we're looking at a top-down tree node, like a binary tree-looking thing, right? But this is going to be your biggest numbers at the very top. Yep. So if you had... I'm sorry. What is it?
Starting point is 01:23:49 If you had... Basically, what we're saying is like, let's just take three numbers. If you had a three-number list, one, three, two, then what you would do is you would put one at the top and then three on the bottom left
Starting point is 01:24:03 and two on the bottom right. But then you would immediately see, oh, well, three is greater than the one. So those two would have to flip. Okay. So now you would have three on the top and then one on the left and two on the right. Yeah. You're floating your bigger numbers up to the top. Yeah.
Starting point is 01:24:18 It's not a max heap until it's settled. So when you've got that one, two, three situation, that's not a max heap. It's not a max heap until that parent is the topmost point. Okay. Yes. And then we pluck that point off. We take that three off. We cut the head off. Then we have to decide which one's bigger, two or one. And then we're going to pluck that two up there. And now it's got a child of one. So that's a simple example, but you can imagine if you've got a much bigger heap, like say you've got a hundred million nodes in your tree, you always know that the root of the tree is the biggest number and the rest of it is mostly sorted. So it's not great. It could be,
Starting point is 01:24:56 it could have some problems. It may not be that even depending on your use case or your input data, but you know, once you pluck that top node off now you know that either the left node or the right node is going to be the next biggest number so you have to compare those two and whatever one wins you're going to make that the new root node and then your your the rest of your heap is still mostly sorted so it's taking advantage of this case where we know that things that are mostly in sorted order we can take a couple shortcuts on. Okay, so that's the key point is it's not that these things have to be in order going down the tree.
Starting point is 01:25:36 Not like if we said the 1, 2, 3 thing, right? It's not like you have to have 3 and then 1 on the left and 2 on the right. The order of that next level doesn't matter. It's just like you have to have three and then one on the left and two on the right. The order of that next level doesn't matter. It's just that it's smaller than that top one. Yeah, it's that every parent is always bigger than any of its children, bigger or equal to its children. And it's only got two children because we're looking at like a binary tree, essentially. Yep, binary tree, but it's not a binary search tree. So it doesn't mean the left is bigger than the right. You don't want to go to that level of sorting the tree
Starting point is 01:26:07 because then we have, we have another sorting algorithm on our, on our hands. And it's just a, it's like a, we do a cheap first pass at building this heap up, which is faster than a sort, but isn't very accurate. And then we go through and we only have to look at one depth level of that tree whenever we pluck the root off. Yep. Makes sense. So it's just kind of a cool conceptual algorithm, which I thought was really cool when I kind of figured out what was going on there. I started programming this
Starting point is 01:26:33 at Panera, but then I saw how late it got. This merge sort took me forever. You're like, dude, you got to get off our Wi-Fi. Yeah. So yeah, as for strengths, it's got a better worst-case scenario than quicksort. So quicksort, you remember that had the N squared worst-case scenario where you have to look at every node. In this case, it's always O of N log N.
Starting point is 01:26:59 And it can also be done in place. So it's great for memory. So it's O of 1 again. And its worst case performs better than quicksort. And so you would use this in a case where you want consistency. So quicksort is going to be faster for a lot of average kind of normal everyday use cases. It's going to be faster than heapsort and even faster than merge sort in a lot of cases. However, if you really want to protect yourself from that worst case scenario,
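A rough Python sketch of the max-heap flavor of heap sort described above: build the max heap in place, then repeatedly swap the root to the end and sift a node down to restore the heap (illustrative only, one of several ways to implement it):

```python
def heapsort(items):
    """In-place heap sort using a max heap stored in the array itself."""
    n = len(items)

    def sift_down(root, end):
        # push items[root] down until both of its children are smaller
        while True:
            child = 2 * root + 1  # left child
            if child > end:
                return
            # pick the bigger of the two children
            if child + 1 <= end and items[child + 1] > items[child]:
                child += 1
            if items[root] >= items[child]:
                return
            items[root], items[child] = items[child], items[root]
            root = child

    # step 1: build the max heap; after this, the root is the biggest value
    for root in range(n // 2 - 1, -1, -1):
        sift_down(root, n - 1)

    # step 2: pluck the root off, growing the sorted region at the back
    for end in range(n - 1, 0, -1):
        items[0], items[end] = items[end], items[0]
        sift_down(0, end - 1)
    return items
```

The "little shimmy" from the discussion is the sift_down after each pluck: you only ever compare a node against its two children on the way down, rather than rescanning the whole unsorted region.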
Starting point is 01:27:26 then this would be a good substitute for quick sort because you're not going to get dinged, maybe slower on average, but you're never going to get that worst case scenario where it takes billions of seconds to come back. So I have a question though. And I don't know if in any of your research, you found this though,
Starting point is 01:27:43 but if merge sort was like the default go to, but this one has the same best average and worst time complexity, but better space complexity, why wouldn't heap sort be our default go to? That's a good question. Well, I think it might even go back to what he led this off with, where he's like, here's a side note. The stable part. This is not stable. So you're not guaranteed secondary sorts on this kind of stuff. Yep, for sure. Whereas I believe with the merge sort, you do have that.
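The stability point is easy to see with the Magic card analogy from earlier. Python's built-in sort happens to be stable (it's Timsort), so sorting by name and then re-sorting by mana cost keeps the alphabetical order within each cost. The card names here are made up purely for illustration:

```python
# hypothetical cards: (name, mana_cost)
cards = [("Shock", 1), ("Cancel", 3), ("Anticipate", 2),
         ("Duress", 1), ("Bolt", 1)]

# first sort alphabetically, then re-sort by mana cost;
# a stable sort preserves the alphabetical order within each cost
cards.sort(key=lambda card: card[0])
cards.sort(key=lambda card: card[1])

print(cards)
# the cost-1 cards stay alphabetical: Bolt, Duress, Shock
```

With an unstable sort, that second pass would give no guarantee about the order of the three cost-1 cards.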
Starting point is 01:28:20 And sometimes there's variants of these things. Like each one of these guys, you can almost find a stable version of. Even though a lot of the ones that are normally recursive, you can find a non-recursive or an iterative version. But a lot of times, they are super complicated and just weird. Well, and they probably also sacrifice pieces for others, right? Like either storage or something. You're always going to give up something to gain something else, right? Yeah, and I should mention, too, on those big O times,
Starting point is 01:28:48 the average big O, which is funny because big O is worst case. Anyway, big O, the average case for quicksort and for heapsort are the same. However, in practice, quicksort is almost always faster than heapsort.
Starting point is 01:29:04 Interesting. Even though they have the same exact numbers, like when it comes down to like sorting real numbers, like with real data sets, like you should see in practice that quicksort performing better. And it's still got the same space considerations there. Well, that's also because when you're talking about big O notation, and this just goes back and my brain will probably, and I'll probably say this wrong, but you throw out insignificant things. So basically things that aren't multiplied values, you kind of just toss away, right?
Starting point is 01:29:30 So anything that might have been six operations on one, but four on the other, you just get rid of those, right? Like you're not going to do an N plus four. If it's not some sort of factor of the N part of the big O notation, it just gets chopped off. So even if there were 50 extra steps in doing something, it's insignificant in terms of big O notation. So if I remember correctly.
Starting point is 01:29:57 Yeah, I was just looking too. I was seeing that merge sort is really nice for sequential type data storage. So if you've got an array with those numbers right next to each other, then it's really efficient for kind of hopping back and forth. Whereas if you've got like, say, a spinning disk and you're doing something like on a heap sort,
Starting point is 01:30:12 just because of the nature of it, you're looking at basically like something like a linked list. So that stuff could end up in very different spots in memory. And so that's part of the consideration there. So I don't have a firm like authoritative answer on this, but my guess is that merge sort is going to just perform
Starting point is 01:30:29 faster on average than heap sort. Like in real world use cases. Okay. So when would you use this thing? I lost my notes. That's it.
Starting point is 01:30:46 Oh, especially when you want quick sort, but you want to protect yourself from that worst case scenario. So if you want that in-place sort, that's the big selling point, I think, for it over merge sort. If you want to be able to do stuff without adding
Starting point is 01:31:01 that additional layer of overhead and memory, then I think this would be a good choice. Okay, cool. All right. And with that, it's time for us to give our little beg. For all those that have written in and given us a review, there were some excellent ones. We get messages saying that we've helped people change their lives and their careers and all that kind of stuff. And it seriously is awesome to read that kind of stuff, right?
Starting point is 01:31:29 Like we love being able to help people advance their careers or make the switch or whatever. So, you know, if you find that we're helping you out in some way and you want to give back, please do go leave us a review. You can do that at codingblocks.net slash review. And we read them all and we appreciate it. We take the feedback to heart and, you know, thank you. Okay. And so with that, it's time to head into my favorite portion of the show.
Starting point is 01:32:01 Survey says... All right. So last episode, we asked, now that you've had some time to digest the news, how do you feel about Microsoft's acquisition of GitHub? And your choices were, very excited, looking forward to the awesome things Microsoft will add to GitHub,
Starting point is 01:32:21 or I'm concerned, but not enough to do anything about it yet. Or I don't care at all. Should I? Or, Oh my God, this guy's falling. Why? How could we let this happen?
Starting point is 01:32:38 Or I've already packed up my code and moved to GitLab. So Alan, I think Joe went first. You're up first this time. Man, I honestly have no idea on this one. I'm going to say I don't care at all. Should I? And I'm going to go with just because I really don't think people care that much. I'm going to say 40%.
Starting point is 01:33:06 Okay. Joe? Alright, I'm going to say I'm concerned, but not enough to do anything about it yet. Dun, dun, dun. And I'm going to put it at a strong 32%. That's not that strong. You got to man up.
Starting point is 01:33:26 Alright, and survey says Strong 32%. That's not that strong. You got to man up. All right. And survey says you're both wrong. Really? Oh, wow. Ish. So I don't care was the top answer. Oh, God. But you were over optimistic in your percentage there.
Starting point is 01:33:42 Showcase showdown. I lost. Yeah. So Price is Right rules, you lost. It was – that's the beauty of combining two game shows, right? It really makes it confusing. Combine the Price is Right with Family Feud and you're like, what? Yeah.
Starting point is 01:33:59 Something's off. Yeah. So, yeah, by Price is Right rules, you went over. It was 36%. Oh, so close. I was really hoping that Joe was going to $1 it. That'd be hilarious.
Starting point is 01:34:16 What was number two there? With this part, I liked. People were optimistic about it. Very excited, huh? Very excited was the second highest. 30%? 30%. Very nice. So we've got 66%
Starting point is 01:34:31 of the votes on those two. Yep. So that's both in either I don't care or this is awesome. Yeah, and then the other part, you know, Joe had the third highest highest what was that at 20s yeah it was like 27 so okay most of the votes were like right in there which i guess the good
Starting point is 01:34:54 side the good news to that is a lot of people haven't like packed up moved to gitlab or to a bit bucket or you know we there were some comments there were some comments well i guess uh there were some comments though on the show notes for that one about like hey we didn't include like every other uh you know code hosting facility out there you know namely bit bucket came up right so i will say this anybody who's worked in vsts has got to be somewhat impressed with how good a job microsoft has done with source control type stuff and how it integrates things. So if they kind of keep status quo with that on GitHub, I think there is a good reason to be excited about it.
Starting point is 01:35:36 Yeah. And I've talked to a few people that were kind of close with some of the stuff that's going on over there, and they said that things aren't going to be changing radically anytime soon. Cool. Awesome. So a little wink, wink, nudge, nudge there. But I've got an account on every platform.
Starting point is 01:35:55 It's going to go out on the name, if nothing else, right? All right. So with that, we'll get to this episode's survey, which is what is your most productive season? So your choices are, I don't know if you're going to be able to guess these, but I'm going to tell you what they are anyways. Spring, because it's the best way to avoid the pollen. Or summer, let me stay behind my keyboard in the nice air conditioned room.
Starting point is 01:36:28 Or fall, the sky or leaves are falling. Safer to stay inside. Or winter because Olaf scares me. I'm not going out there. Or seasons. We don't have any seasons here. I'm going to be upset if that's
Starting point is 01:36:44 the winning answer. Because I'm going to be upset if that's the winning answer because I'm going to move. Well, I mean, you're thinking that not having seasons could be a good thing, though. I mean, you know, if you could live in a colder region of the world and not have any seasons. That's a good point. I really wish that you did. I wasn't thinking Greenland when that answer came up. Which, by the way, why is that called Greenland? All right.
Starting point is 01:37:07 Anyways, this episode is sponsored by Datadog. Datadog is a software-as-a-service monitoring platform that provides developer and operation teams with a unified view of to collect, visualize, and alert on out-of-the-box and custom metrics to gain full-stack observability with a unified view of all their infrastructure apps and logs at cloud scale. They've got 200-plus turnkey integrations including AWS, PostgreSQL, Kubernetes, Slack, and Java. Check out the full list of integrations at www.datadoghq.com slash product slash integrations. And key features include real-time visibility from built-in and customizable dashboards, algorithmic alerts, things like anomaly detection, outlier detection, and forecasting alerts.
Starting point is 01:38:04 You got end-to-end request tracing to visualize app performance and real-time collaboration. Datadog is offering listeners a free 14-day trial, no credit card required. And as an added bonus for signing up and creating a dashboard, they will send you a Datadog t-shirt. Head to www.datadog.com slash coding blocks to sign up today. All right, so diving back in here, we are now going to do the binary search. So we're out of the sorting algorithms. Now we're going into these searching ones. And on this one, this is like the one that I actually remember from school. There's very little that sticks in my mind from school, but this one was fairly easy for me to digest back in the day. So here's the thing about a binary search is you have a list, and that list has to be sorted first.
Starting point is 01:38:58 Now, this isn't terrible if your list is already sorted a binary tree format because it will be sorted for you. And this is another one of those divide and conquer algorithms. So really all you're doing in this thing is splitting the list in half and keep splitting the list in half and looking for the number or whatever you were searching for, whatever that search term was. So let's take a pretty simple example here. So you've got 1, 3, 5, 7, 9, 11. So where I put that pause in there is where you'd split this. So let's say that you're searching for the number nine. So you're going to take this 1, 3, 5, split 7, 9, 11. Is the nine bigger than the numbers in that left list? Yes. Okay. Then you can toss that left list out.
Starting point is 01:39:45 It's not going to be there. All right. Now you're going to split what's left in the right list and you're going to say seven, nine, 11, split it down the middle. Well, you can't do that because you can't cut nine and half, right? So probably what you're going to do is take the first two of that. So you're going to have seven and nine on the left and you'll have 11 on the right. So the number that you're looking for nine, was it greater than, uh, or, or was it less than 11? Yeah. Okay. Throw away the 11 list, right? So now you have seven and nine. You're going to split that one down the middle so that you have seven on the left, not on the right. You compare those. Okay. Nine's obviously the one on the right and that's how you find it. So it's a divide and conquer. You literally just keep splitting it in half until you get to where it is.
Starting point is 01:40:27 So it's pretty easy. I mean, I don't think I put down any of the complexity on this, but I have to imagine this is probably. Just divide two. There's your hint. Yeah, that's what I was going to say. I mean, that would be what? Log in?
Starting point is 01:40:43 Yep. Yep. So. And I really like this one. The example I always heard in school was like the phone book example. I don't know if you guys have phone books anymore. What's that? I haven't seen one of those in the last 10 years.
Starting point is 01:40:54 But the idea is like if I'm trying to look up, you know, say Carrie Underwood in the phone book here. What I would do is if I was smart about it, sort of, I would flip open to the middle and I'd say, oh, nope, now I'm in the phone book here, or what I would do is if I was smart about it, sort of, I would flip open to the middle and I'd say, oh, nope, now I'm in the outlaws. Let me flip to the right. So you flip a big chunk of the book over and like, okay, well now I'm in the Zach. So I went too far. So you would flip back. And so eventually you're going to end up out of the underwoods in a minimum number of steps. However, like things get crazy. So if you actually know, for example, the phone book is kind of a bad example because you know that certain letters are more prominent than others. So you could actually do a little bit better job by not splitting right for the
Starting point is 01:41:35 middle. So you could actually do better than log n there. But for the most case, that's just a really good way of going. And it's amazing to me how much binary search comes up in other algorithms. Because if you need to find some sort of value to do something, then a good way of doing that is to basically keep a list sorted or build up a sorted list as you go along so you can find things easily. And so I think that's a big part of why this particular algorithm was listed with all these other algorithms that we're talking about. It just comes up all the time. Yeah, it's simple and efficient, which is not something that you necessarily hear a lot, right?
Starting point is 01:42:10 Like usually simplicity does not equal efficiency. And I can give you a real-world use case where binary search comes in because we talked about it last episode. The phone book is not real-world? Oh, with the search. Git bisect, there you go yeah get bisect uses a binary search effectively i mean it's kind of like but going kind of going along your divide and conquer you're doing the searching but it's doing the dividing awesome
Starting point is 01:42:37 so that was it like that was that was probably the easiest thing that that i'd seen in all these. So next we're going to get into graph traversal. So we've talked about some sorting algorithms. We've talked about some searching algorithms. So what about something more complicated like graphs? So like a lot of things in computer science, we'll put a lot of fancy words on something that is rather simple. So graph traversal is just the concept of navigating the graph. So this gets tricky. We can either traverse a graph via recursion or via an iterator. And remember what I said before about recursion, right?
Starting point is 01:43:20 Our little alarm bells are going off because we risk a stack overflow, so we need to choose wisely. So what is this? Let's think about your family tree. It shouldn't be a stick. So your family tree. There are a couple ways that you could search that tree. I mean, or graph. Your family graph. Everybody calls it a family tree.
Starting point is 01:43:48 But one way. You're in the deep south. Yeah. Right. That's your family stick. Oh, God. So one way might be to go as far down one path as possible before you go to the next node.
Starting point is 01:44:08 So consider this, you're thinking of your, you're into researching your ancestry. So you explore the lineage as far back as you can go on your mother's side of the family before you start exploring your father's side. Okay. That is a depth first search. Another type of search that we might do might be to search first on all of the nodes on the same level before going to the next. So again, let's consider your family as example, but this time you might be more inclined to ask closer family members for money or help before you go further down the tree.
Starting point is 01:44:50 So in other words, you might be more likely to ask a sibling for help before you would go to your parents. And you might be more willing to ask your parents before you go to your grandparents, etc. Right. That is a breadth-first search. So that is my introduction to depth-first and breadth-first search. So how about we take a deeper dive into depth-first? Yeah, so looking at depth-first search, so like Outlaw said is where you basically go down the length of the tree.
Starting point is 01:45:24 And pretty much whenever I think about a tree, I actually think about it un-inverted, so you basically go down the length of the tree. Pretty much whenever I think about a tree, I actually think about it inverted, so it looks more like the roots of the tree. I don't know about you guys. Even with the family tree example, I always think of it kind of going down. It's kind of funny. Yep. The single node is at the top, and then everything falls up underneath
Starting point is 01:45:40 it. Yeah. When we talk about depth-first searches, that works for binary trees as well as try trees that have different, uh, amount of branches. And it works for things that aren't even trees like graphs, uh, whether or not they have cycles is up to you.
Starting point is 01:45:53 Um, you know, what can you say about it to me? Like I always, almost always think depth first. So whenever I'm doing a problem, um, and like say code boards or something like this is my go-to.
Starting point is 01:46:05 And, uh, I think that depth first just kind of fits better with my mental model because it's easier for me to do depth first without having any sort of storage associated with it. And that works well for trees and doesn't work at all for graphs. And the deal there is that if you've got a tree and you know that there are no cycles in that, so it's a true tree, then you don't actually have to keep track of your path because you can know that you can always get back to the root by just going to your parent. So you don't need to know your position. Now, in some cases, like if you're trying to track your path or you want to store a path or you want to know what nodes you hit along the way and you're not just searching for a single value, then that's a different situation. You're going to need to keep some sort of array or stack there to keep that travel history. So it's kind of a nice thing if you're doing code wars in an interview.
Starting point is 01:46:52 If you just need the value, depth first works great because you don't need to know anything about where you are. It's recursive, so there's a possibility for either an infinite loop there or a blow in your stack however if you're in a language that supports tail recursion then you then you can pass primitive values and get around that all entirely and then you just have to worry about running forever which by the way it's kind of funny i don't know if you guys looked up like bfs verse dfs but some of the examples they give is like well depth first if you've got an infinite amount of data and your data you're looking for is right here at the top
Starting point is 01:47:33 level but it's to the right you're never going to get there because you're just going to go down down and down forever like well i mean that applies for breaths first too right it does yeah infinite number of children. So, that's what I'm bringing up. Yeah, I agree. Unless you know your data super well and where the stuff is going to live in it. No, no, no. I'm going to give you a hypothetical situation that can never exist to prove why your solution sucks.
Starting point is 01:48:02 Man, I can't stand that stuff. Yeah, I didn't really understand why. Maybe I'm misunderstanding something fundamental, but it was just kind of funny to see that even come up at all because I kind of felt like, well, what if you've got infinite data and you want to find the biggest number? You're never going to find it because you don't know.
Starting point is 01:48:15 It's like, well, you know. If you have infinite data, you'll never find it because it's infinite. There is no end. It's turtles all the way down. That's right. I could talk about stuff that's never going to happen all the time.
Starting point is 01:48:26 Right. Insert joke about love life here or something. Wow. Oh, man. We went there. Another way to kind of get around having that to keep that actual path in memory is to have a pointer back up to the parent but then that means more maintenance which is annoying uh and if you've got a second graph then that's doubly annoying because now you've got potentially more than one parent the idea of a parent doesn't even make sense anymore uh and i want to mention too there's
Starting point is 01:49:00 lots of different kind of trees and so there red, black trees and binary trees and binary search trees. And that's where search works for all those guys. But it gets really hard to kind of talk about these things without going into those sort of details. So I'll keep it kind of high here. Hey, wait. So you mentioned the recursion with the stack. And then you also mentioned the pointers back to the parents. Is there any reason you choose one over the other?
Starting point is 01:49:26 Yeah, I mean, if you have a tree, then, you know, pointers are okay. But whenever you have to insert, you have to maintain that stuff. So, if that's a tree you're bumping around all the time, it's annoying. It's just like a double-linked list. It's just, you know, you have to keep up with that stuff. It's extra work to do. It's extra overhead.
Starting point is 01:49:40 That, for the most part, is unnecessary depending on the kind of tasks that you're doing. Stack thing is nice. Go ahead. Stack is nice if you need to keep your path so if you need to find like the the path to a certain node that's a great way to go but if you don't need to know the path and you don't need to know how you got there then you don't need to store either one of those and you can just keep going down forever and you can know you can always get to any node from any other node without having to keep track of your path at all. So, I like this idea of like you kind of traveling amongst these nodes and never actually knowing where you are in the tree because it doesn't really matter. But I guess the thing I wanted to point out is the pointer, yeah, it's annoying. But you're never going to run into a stack overflow issue with that, right?
Starting point is 01:50:23 Whereas with your recursion, you know, if your tree is infinite levels deep, then it's, I mean, there's going to be some point where you're going to hit a stack overflow. If you're doing the recursive method is the only thing I was getting out there with the pointer.
Starting point is 01:50:39 You don't run into that particular problem. Yep. And I should mention too, one thing that's nice about having a stack is actually it lets you easily know what depth you're at. So a lot of times when people talk about like searching trees,
Starting point is 01:50:51 like they'll even say like max depth 11. They'll just say, you know, after a certain depth, you'll see this a lot of times like game theory, you're like talking about like chess simulations or stuff like that. And say like, don't go all the way down
Starting point is 01:51:03 to all the possibilities of moves. Only look like four or five or ten or whatever some sort of number uh levels into the tree because like it's really common for infinite trees to come up which which is also funny to me not necessarily infinite but just trees so large that it's not worth dealing with and so you might set a max level there if you've got got a stack, then you know you can get to that point. The alternative there is to try to maintain that depth whenever you kind of move nodes around and stuff, which is annoying because you have to maintain that. And when I say annoying,
Starting point is 01:51:33 like I mean computationally annoying, like your processor is grunting about it. So it's simple. And I like it because as far as I know, I don't think there's a way to do a breadth-first search without keeping track of where you've been. I could be wrong about that. I guess we'll find out here in a minute. But I want to point out too that it's worse at finding closer data points.
Starting point is 01:51:56 So, like the example that Ella gave of the family tree. Like if you go down to your sister's side of the family first and you're going to be checking all these people for money, but really we've been better off going to your brother and searching for something that's closer to your root node. So it's just not good at finding things like that where you suspect that your data might be closer to the root node. It's better to finding things that are deeper. So you kind of have to have an idea as to like what you think you're going to be, where, where your result might be. Yep. By the way, these types of questions are very, when I mentioned earlier like interview type questions with big tech companies,
Starting point is 01:52:41 this is the kind of stuff that you'll get hit with, right? Like I want to find all the nodes with this value. How would you do it? You're given a binary tree. How would you go do this? And how would you know that you've been there? And how would you, like, there's a lot of things that you run into with this. And so, if you haven't visited this stuff a lot, like, I highly recommend this book
Starting point is 01:53:00 because he does a great job of simplifying some of these sort of hard to visualize and conceptualize things that we're talking about that you're probably going to get. You're scratching your head on your way to work right now. Like, oh, my God, they just said a lot of numbers and they rearrange those numbers a lot of times. But three, five, two, four. Right.
Starting point is 01:53:19 No way. It should have been three, two, four, five. So. So now you combination of my luggage. I love that. Oh, so I want to four, five. So, so no, he did a combination of my luggage. I love that. Oh, so I want to mention too, we didn't really go into it, but a big deal with trees is kind of keeping them balanced and even graphs
Starting point is 01:53:33 too. Like you can imagine like if a tree of all it ever has is like left children, then that's not a tree. That's a, that's a stick we were talking about. Right. So pretty much everything's going to be bad in that kind of stick case. So balancing trees is a big deal. And a lot of algorithms like for even creating trees are going to deal with that. But we're not talking about those tonight. to talk about the breadth first, which Michael hit on just briefly a little while ago. And this is interesting. What Joe said is you have to track where you're at as you're going down.
Starting point is 01:54:11 You have to know at each level because what the breadth first search is, instead of like what Michael said where you want to search all your grandmother's grandchildren all the way down before you go to your grandpa's children or whatever, instead of this, you're going to search at every single level before you go to your grandpa's children or whatever. Instead of this, you're going to search at every single level before you go down the next level. You're going to search all the nodes on a level before you go to the next level. So if you're on the top, right, like your great, great grandpa, right, he's the very top of this thing. You're going to look at his, it's not there. Okay, now you're going to go to the next one, which is going to be his children, right? So you're going to look through
Starting point is 01:54:45 every one of those children at that level, that second level, before you go down to the next level. So that's why it's breadth first. You're going all the way across the nodes on a particular level before you step down to the next level. So it's just literally flipping which direction you're going, right? So, instead of going down, you're just going sideways on every level. I think the difference there is kind of like once you end up hitting, let's say, your aunts and uncles, so like a generation above you, you kind of go aunt, uncle, aunt, uncle, aunt, uncle. Now, it's time to start hitting their kids with your nieces and nephews. No, sorry, your cousins. Then you need to know like, okay, I've hit this aunt or this uncle already.
Starting point is 01:55:30 And I don't think that there's any real good way of knowing. I could be wrong. Maybe you can just track up. Maybe there's something simple I'm missing there. But I think it's kind of a pain in the butt to know what is next for you. Actually, I don't think it's terrible like if you look at the example that he had it's basically just a queue like it's not that hard basically you pop you add the the the node onto the queue and then you look at the left first and then you look at the right
Starting point is 01:56:01 after you're done with that then or or you do yeah after you're done with that then you're going to pop the the left one onto the node you're going to look it's left and right and the same with the with the right so it's not as complicated as it seems um i think that wikipedia even had an example on this or they showed they showed some pseudocode on it so i've got a link to that as well, but it's not bad. It seems like it would be, but if you just use a simple cue to where you're just adding, popping or pushing things onto the cue and then popping things off, it's not bad going across those. Well, what I think I'm kind of getting at is like you say, okay, let's start with my grandparents,
Starting point is 01:56:39 right? And we check with them. Okay, well, we'll move on. So now we go down to your aunts and uncles. So now in your cue, what you're going to do is you're going to say, aunt, we'll check with them. Uncle, check with them. Aunt, check with them. Uncle, check with them. But we don't pop them off the queue, even though we've dealt with them because now we need to go back and do their children, right? So it's kind of like we need to go, there's this kind of this notion of going back and then popping off the queue because you want to deal with each as it comes along and then go back to deal with their children in that particular order i got you well i was thinking it was kind of like you're flat into the queue tree into the queue so you're taking this tree and you're basically like making it into a stick so So you are popping them off.
Starting point is 01:57:27 Yeah. But you'd have to deal with them as you add them to the queue. So you say like, hey, Uncle Tom, can I borrow 20 bucks? No. Okay. I'm going to add you to this queue because I'm going to come get your kids later. Say, all right, Aunt Marjorie, can I borrow 20 bucks? No.
Starting point is 01:57:41 Okay. Well, you're next. After I talk to Tom's kids, I'm coming back to your kids. So looking at the code, let's see. So the breadth first search. So they have this thing. They add the root to the queue while the queue.counts are greater than zero. Move the next item from the queue. It checks to see if the left node has a value. If it does, then it adds it. If the right node has a value. If it does, then it adds it.
Starting point is 01:58:08 If the right node has a value, then it adds that. And if we load it up just like that, we'll have it completely. Yeah, it's basically making a stick out of the tree before it bothers to do any searching. Right. So how do you get to that second generation? So you queue up aunt, uncle, aunt, uncle. That's all the children of your grandparents. At what point, you know, do you go on to their children? Cause if you just pop them off as you say like, Hey, uncle Tom, 20 bucks, no aunt Marjorie, 20 bucks. As you drop those guys off the list, at what point do you go deal with their kids? Cause you've kind them um i think so looking at the way no so looking at the way this thing happens is it pops
Starting point is 01:58:55 the top one off as it gets to it so so for instance if you're at the root so you know the grandparent or whatever it's at the root it adds it to the, the grandparent or whatever, it's at the root. It adds it to the queue. Okay. Now it's going to start operating on that thing and immediately pops it off the queue. And that is now the current node that you're on. And then it's going to say, okay, now I need his children and I'm going to pop those or push those onto the queue now. So your left and your right get added, right? Then you're going to work on those two items in the queue at that point. So if at the top you were zero, you added it to the queue, you immediately took it off the queue,
Starting point is 01:59:36 and now you say give me their left and right. So that's going to be one and two that you just push onto the queue, right? As you operate on, let's say that two was the last one that was added, so you're going to pop that thing off. When I pop them off, that's when I add their children. Correct. So you're going to pop that. It's actually, yeah, it's actually going to happen after you pop both of them off, I believe, because then you're going to take those notes. You're going to add it back to the queue, right? Okay. That's where I was kind of saying where it's like, okay, we would deal with one and I'll be back for you later to deal with your kids. And the way you do that is you say like, hey, Uncle Tom, $20?
Starting point is 02:00:08 No. Okay, I'm adding your kids to the queue now, but I'll be back after I talk to Aunt Marjorie. Right, right. So I shake you down. I think that's exactly how that works. And why I like Depth First search better is because I can just say like, Uncle Tom, 20 bucks. No problem. I'm going to your kids.
Starting point is 02:00:28 Hey, you got 20 bucks? No. Okay. I'm going back to your parent. So I don't need to kind of keep track. There's no queue involved in some situations unless I'm doing something specific like needing to keep track. I could just say like, Hey, Uncle Tom, go to your kids, which Uncle Tom knows about. And that kid potentially knows about his parent or just the recursive nature of the algorithm knows to go back up. And so I don't have to keep track of any special
Starting point is 02:00:48 data structure there, but you need a data structure for breadth first. There is a flip side to this, right? And this is where just knowing and understanding your data comes into play. So in a real world example, you would probably start keeping track of, hey, where did I find these things, right? If I'm a company that has to crunch tons and tons of information and there's all these searches always going on, then you're probably going to want to start trying to keep stats of, hey, where in these trees was this data found, right? Because it might be while the depth first is easier to do and kind of less complicated to do in a sense, it might also be the least efficient because it might be that everything's always on the second
Starting point is 02:01:32 level, right? Like most of the time you find everything on the second level, but you just traverse 50 levels, you know, going down trying to find data when all you really need to do is go across four different nodes, right? So it's one of those things to where there's not one that's better than the other. It literally just depends on where the results are in the data. And the only way you'll ever know that is analyzing. And it might be that it's completely random and you have no option. So let's put some context to this. If you were saying like, hey, if you're trying to think of an example,
Starting point is 02:02:04 why would the data probably be only one level deep? So let's say I connect with Alan on Facebook, for example. Well, they might want to show me recommendations of other people that I might likely be friends with. Now, there's a higher probability that the people that I might be friends with are going to be one degree away from Alan, more likely than if you went through an entire lineage of Alan's friendship and went like 50 levels deep and like, oh yeah, here's this person that they both know. That might happen, but chances are higher that it's going to be one degree away. Yeah. So exactly what you're saying. If you look
Starting point is 02:02:59 at my direct friends and you look at Joe Zach's direct friends, chances of you finding somebody that you know in common is going to be in that first level, right? As opposed to, oh, well, I'm friends with somebody that's friends with somebody else that's friends with somebody else. That's not going to be as useful. And so that's an excellent example. And LinkedIn does the same type thing, right? Any kind of social media. Yeah. So that's excellent. And again, on that one, it's kind of easy to see that that one's going to be there. But like I said, if you have other data complexity problems at a company or something, you're not going to know which is better until you kind of store your findings of doing the searches, right? Which is something that
Starting point is 02:03:44 companies that deal with a lot of data probably do, and then other companies don't pay attention to it, and they might be just using a ton of processing power improperly. If you're looking at harder algorithms, like good luck doing them without some sort of sorting or some sort of graph searching or tree searching going on. We talked about both of these in relation to binary trees, but you can envision how it could work to where each node might have more than just two siblings or ancestors.
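As a rough sketch of the breadth-first versus depth-first traversals being discussed here — the friend graph and names below are made up for illustration, not from the episode — BFS reaches all of the one-degree friends before anything deeper, while DFS chases one branch down first:

```python
from collections import deque

# Hypothetical friend graph (adjacency list); names are made up.
friends = {
    "alan": ["joe", "michael", "sam"],
    "joe": ["alan", "sam", "pat"],
    "michael": ["alan", "pat"],
    "sam": ["alan", "joe"],
    "pat": ["joe", "michael"],
}

def bfs_order(graph, start):
    """Visit nodes level by level using a queue (breadth first)."""
    seen, order = {start}, []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return order

def dfs_order(graph, start, seen=None):
    """Visit nodes by going deep first (recursive, uses the call stack)."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for nbr in graph[start]:
        if nbr not in seen:
            order.extend(dfs_order(graph, nbr, seen))
    return order

# BFS hits all of Alan's direct friends before anyone two degrees away.
print(bfs_order(friends, "alan"))  # alan, then joe/michael/sam, then pat
print(dfs_order(friends, "alan"))  # alan, joe, sam, pat, michael
```

Note the DFS version is the "easier for the programmer" one — the recursion is the stack — while BFS needs that explicit queue.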
Starting point is 02:04:18 I can't say that word, but you know the word I'm trying to say. So you say it in your head and then it sounds all right. That's very true. But yeah, like instead of having two parents, you know, if it was, you know, a larger family. Yes. So, short: DFS rules, BFS drools, because it's hard for the programmer. It's not that much harder, though. It's an extra 15 minutes of that algorithm
Starting point is 02:04:46 that make all the difference. So, I can't remember what I found on this one. As far as what I put on there, I think it was: why in the world would I put O sub 1 or O sub N squared? Those are kind of at opposite ends of the spectrum. Well, that's okay.
Starting point is 02:05:02 I said for selection sort, you didn't have to search the last one. So, I mean, we both said weird things tonight. We could just move on. I'm just going to go ahead and delete this line because I don't know what I was doing there. It depends a lot on your graph. Like if you've got a sorted graph, then your time to search something can be really fast because you know like left is smaller and right is bigger.
Starting point is 02:05:20 And so, you know, you're looking at – that's like one of those halving problems. You're constantly halving your problem, so you're looking at O of log N. But if you've got unsorted data and you don't care where you put stuff in the graph, then putting it in is O of one. Cause yeah, just throw it over there. I appreciate you making me feel better about that. Yeah, man. I'm on board. It's this Diet Mountain Dew. I've been trying this with podcasting. You know, your complexity is like really complex when you get out of O of N and you get into other letters. Oh, yeah, the V and the E's and all that garbage? Yeah, worst case performance they have is O of absolute V plus absolute E, which equals O of B to the D.
Starting point is 02:06:02 Yeah, man, the V were the vertices, and I don't know what the Bs or the Ds were. And I think that's where I was like, well, it's O sub something. O sub something. That's the takeaway. Yeah, a lot of tree algorithms are like that. They'll be like M times N, where M is the depth of the tree.
Starting point is 02:06:19 I think where you probably got that sentence is there was a space-time complexity analysis sentence here in the Wikipedia article where they said the time complexity can be expressed as O of absolute V plus absolute E, since every vertex and edge will be explored in the worst-case scenario. So absolute V is the number of vertices and absolute E is the number of edges. So note that O of absolute E may vary between O of one and O of absolute V squared.
Starting point is 02:06:53 Oh, so I did put it in there kind of like that. Depending on how sparse the input graph is. It makes no sense. And I wonder if I'm reading that right. Like, is it supposed to be absolute V and absolute E or just some weird symbol of O of E, V plus E? It's Wikipedia.
Starting point is 02:07:14 Because why would the number of vertices be negative or the number of edges be negative? So why would it be absolute? Yeah, I don't know. You could have negative weights on it. You could have negative distances on a graph like wormholes and stuff. Well, okay. Now we're getting ahead of ourselves and we're talking about like things like Dijkstra's or the Bellman-Ford where you have weights when you traverse the graph, right? But we're not talking about that here.
Starting point is 02:07:40 You could have negative distances, though. It doesn't make sense on a map, like a Google map. I don't know if you just... That's it. It's not. It's like wormholes and quarks change everything. Once the Q continuum comes into play. Joe, we are not talking about that.
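For what it's worth, the bars they're puzzling over in that Wikipedia formula are cardinality, not absolute value: absolute V is the count of vertices and absolute E the count of edges, which is why neither can be negative. And the "O of absolute E varies between O of one and O of absolute V squared" part is just edge density. A tiny made-up sanity check:

```python
# |V| and |E| in the Wikipedia formula are counts (cardinalities),
# not absolute values, which is why they can never be negative.
def max_edges(n):
    """Most edges a simple directed graph on n vertices can have."""
    return n * (n - 1)

n = 10
sparse_edges = n - 1         # e.g. a chain: |E| grows like |V|
dense_edges = max_edges(n)   # complete digraph: |E| grows like |V| squared
print(sparse_edges, dense_edges)  # 9 90

# BFS/DFS touch every vertex and edge once, so the work is O(|V| + |E|):
# near O(|V|) on the sparse graph, near O(|V|**2) on the dense one.
```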
Starting point is 02:07:58 Today on Star Wars... Sorry, that's my other podcast. Oh, yeah. I'm sorry. That's awesome. So with that, we've covered all the – man, I thought this was going to go faster than it did. We've covered what we think – or what Rob Conery thinks – are probably some good algorithms and sorting things to know about
Starting point is 02:08:23 and search algorithms that are sort of the building blocks for understanding more complex things, which we may cover here in a future episode. We've got tons of links in the resources we like section. As always, there will be a lot of links in our episode show notes. So yeah, give those a look. Don't forget that discount code too for the digital version of the Imposter's Handbook. It's Happy Imposters, however you spell that. Happy Imposters.
Starting point is 02:08:50 Is there a space in there? No space, but you have to ask Google about how to spell imposter. It's with an E. It always throws Joe, but it's with an E. Happy Imposters. Give you a hint. No space, and that gives you the discount for the book, which is most excellent.
Starting point is 02:09:05 And they've actually got a new version coming out sometime in the near future. Maybe, Joe, if you pronounced it as imposter, then you could remember how to spell it. Posters. Yes. Yeah. The Imposter's Handbook. I'll give that a shot. No promises.
Starting point is 02:09:24 All right. And with that, it's on to my favorite part of the show. It's the tip of the week. And so mine, being that we talked about search engines and all that stuff last episode, touches back on this episode. One of the most important parts of getting things into a search engine of sorts, if you're going to do such a thing, is ETLing: extracting, transforming, and loading that data into your search engine. And if you go the Elasticsearch route, they actually have a tool built for this.
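A minimal, hypothetical Logstash pipeline along the lines being described might look roughly like this; the file path, host, and index name here are all made-up placeholders, not anything from the episode:

```conf
# Hypothetical Logstash pipeline: read lines from a file,
# parse each line as JSON, and load the result into Elasticsearch.
input {
  file {
    path => "/var/log/myapp/*.log"   # made-up path
    start_position => "beginning"
  }
}

filter {
  json {
    source => "message"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]      # made-up host
    index => "myapp-logs"            # made-up index name
  }
}
```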
Starting point is 02:09:57 It's called Logstash. And if you're not aware of it, it's pretty interesting. I mean, a lot of people go to their database first to try and, you know, use, if you're in SQL Server, something like SSIS to try and do this kind of stuff. Well, Logstash is a pretty simple way to do this, and I highly recommend you check it out. So I'll have a link to that here in the show notes. All right. And for my tip of the week, this is a Visual Studio Code tip, or if you prefer me just to call it VS Code or maybe just code, we can do that too. This is a way to, if you want to write markdown in code, Visual Studio Code, you can preview it by pressing Control-Shift-P on a PC or Command-Shift-P on a Mac.
Starting point is 02:10:51 And you can preview your markdown, which is really great. So if you're writing a readme for your next great project, you can preview that markdown. Doesn't that screw with the command palette command? I think control shift P is how you pull up the command palette. Let's find out. Yeah, let's all open up code. See who's faster. Boom, it's open.
Starting point is 02:11:19 Control shift P is, yeah. Yeah, that brings up the command palette. Oh, you know what? Okay. When you do that, though, it'll bring up a dropdown where you can type in preview markdown. Okay, that's what I forgot. Okay, so just a heads up, then, I'm going to add to that tip. I love that.
Starting point is 02:11:40 So command shift P or... I forgot to say that. I actually had that written down. Okay, cool. So that's actually the shortcut to open up the command palette. And so you can do all kinds of stuff in there. Like what you were saying, you type in preview or markdown or whatever, right? I assume.
Starting point is 02:11:56 Yeah, just type in like markdown and then you'll get a preview markdown. Yep. So what I use that for a lot of times, and I'm just going to piggyback on this because I love it is if I have a JSON document, you know, like man formatting JSON always feels like there's never a tool around to do it. What you can do is if you take JSON that you get from,
Starting point is 02:12:19 maybe you were looking in Chrome developer tools and you get a payload back, you can take that, paste it in, do Control-Shift-P, and then change... I think it's document. If you type in document, it'll say change document type or something. And you just choose JSON and then format it directly. So I don't like using my mouse.
Starting point is 02:12:41 And so Command-Shift-P or Control-Shift-P is a way to get to that command palette to do those things quickly. And I didn't know that VS Code plugins were available for tips. I got tips for days now. But we talked about them before. How could you not know that? Yeah, man. But now, this Mountain Dew,
Starting point is 02:13:00 man, I'm waking out of it. I've never been this awake at this point in the show before. It's all this sorting and stuff. So you're tossing your last tip here and we're going with some VS Code stuff? No, no. This is a great tip. This is Bookmark Navigator and this was sent to me
Starting point is 02:13:18 by Neil Spackman who said, hey, check out my plugin and I freaking love it. And what I do on my Mac or I guess Windows too is I hit control B and it pops down like a little search, kind of like that command palette where I can start typing in my bookmarks. So like if I've got some documentation or something in there, like I've got a bunch of bookmarks and they're not exactly sorted very well, but now I can kind of treat it like spotlight on Mac or kind of Cortana where I hit like control B and I start
Starting point is 02:13:42 typing doc or VSTS or whatever and just hit Enter just by typing the minimum amount of letters. So it lets me do a really quick search through my bookmarks. So now I've started getting better about actually adding bookmarks to common sites rather than using Google for the same thing by hitting Control-T
Starting point is 02:14:00 and then typing and then oh god, I didn't know that was in my history. And you know and being distracted. So it's just really nice and it works out really well. And so we'll have a link to that in the show notes. And thank you, Neil. Very cool. That looks very useful.
Starting point is 02:14:16 Yeah, true that. Awesome. All right. So we've been talking about some algorithms that you should probably already know. And if not, hope now we've helped you to know them. These are algorithms that you're probably going to come across at least in an interview, if nowhere else. But I hope you enjoyed it. And with that, subscribe to us on iTunes, Stitcher, and more using your favorite podcast app. Be sure to leave us a review by visiting www.codingblocks.net slash review. Yep.
Starting point is 02:14:51 And while you're up there, go ahead and check out our extremely detailed show notes, our examples, discussions, and more. And send your feedback, questions, and rants to the Slack channel, codingblocks.slack.com. And you can actually sign up for the Slack by going to codingblocks.net slash Slack. So hopefully that didn't confuse you. Make sure to follow us on Twitter at CodingBlocks or head over to codingblocks.net where you can find links to all sorts of stuff like the Slack
Starting point is 02:15:16 so you don't even have to remember anything except for codingblocks.net.