Algorithms + Data Structures = Programs - Episode 116: Max Gap Count in C++23

Starting point is 00:00:00 Sure enough, the reason they're called unholy donuts is because they don't have holes in them, which I didn't put together. That's pretty clever. Welcome to ADSP, the podcast, episode 116, recorded on February 1st, 2023. My name is Connor, and today with my co-host, Bryce, we follow up part two of our two-part conversation that started in episode 115, and we solve a follow-up problem to max gap called max gap count in C++23. Same problem. We'll assume that we already have a sorted list of numbers.

Starting point is 00:00:47 And the question is now not what is the maximum of adjacent values, difference between adjacent values, but what is the count of the maximum difference between adjacent values? And this, this is where things get interesting. So wait, so wait, you want the, you want to know how many times that maximum gap occurs. Yes. And I realized probably I should have done this 20 minutes ago. Also, we might be bleeding in. This might be part two now of this episode.

Starting point is 00:01:16 I probably should have given an example. So for instance, for the first problem, if we had the numbers 1, 3, 5, and 10, the differences are 2, 2, and 5. So the answer to that question is 5. But for the second problem, the answer would be 1 because there is 1, 5. However, if the sequence was 1, 3, 5, 10, 15, 20, our differences are then 2, 2, 5, 5, 5. And so then you want the count of the number of fives, whatever the maximum difference is. In this case, it would be three.

Starting point is 00:01:52 And it's only a slightly different problem, but the implications on how you solve this I think are interesting. I'm not so convinced convinced it's just another reduction it is you just you just you just how many more reductions it's one right is it okay so you you do your max reduction but this time but instead of just reducing a single piece of information, which is what's the maximum value, you reduce your reduction operator returns to pieces of information,

Starting point is 00:02:39 one of which is the maximum value, and then the next of which is how many times you've seen it. And then any time you see a new maximum value, and then the next of which is how many times you've seen it. And then any time you see a new maximum value, you reset the how many times have I seen it to zero. So that doesn't work. I'll ask why. Why doesn't it work? Are you asking me now?

Starting point is 00:03:06 Yeah, I'm asking you now. So what if our example is one, one, what if it's a zero? Actually, how do I do this? What if it's one, three? All right, it's too hard for me to come up with an example on the fly. But basically, what if in resetting your maximum...

Starting point is 00:03:28 Actually, you're right. Of course. Wow. It's all reductions. All the way down. It's all reductions. Wow. See, your problem is

Starting point is 00:03:44 you don't always go for the one pass answer you should always go for the one pass answer well so no the the the example that i was trying to construct was what if you when you reset your maximum you've already passed by that maximum and you lost count but that's flawed thinking in that anytime you, by definition, anytime you meet a new maximum, you haven't seen it yet. Yeah. Isn't math great? Isn't math great? There's probably, you know, I probably have the book here. Yeah, I do. Introduction to Optimization. This is probably some, you know, all of these reduction, all of these like find style reductions

Starting point is 00:04:33 or like reductions where you're searching for some answer, they're probably all some class of optimization problem. And it would probably be interesting for us to research and to think about that more at some point. I'm amused here that like two minutes ago when Connor was saying things, I was sitting here like trying to work out the problem in my head and like saying wow and things in the background. And now Connor's doing the same thing. My mind is literally exploding. Well, I use literally in the non-literal sense. And I apologize to those that are upset by that.

Starting point is 00:05:09 My mind is exploding because two of the problems in my top 10 problems that are going to lead towards this algorithm course are maximum consecutive ones and longest consecutive increasing sequence. I refer to those as MCO and LCIS. And both of those, I don't want to do this because, oh yeah, I was going to bring this up. There's a new APL podcast called The APL Show. Are you running The APL Show? No, no, no. It's made by two of my co-hosts on my other podcasts.

Starting point is 00:05:55 But they share screens constantly and one of them has a chalkboard behind them. And I tried listening to the first couple and I know Adam listens to this podcast who's one of the co-hosts of The New APL Show. And so I haven't told him, given this feedback. But like it's almost impossible to follow like they read papers together and they talk about things that they don't actually spell out for the listener at times we do that all the time too

Starting point is 00:06:15 no I mean but we've done I think probably not we think we've done a good job but probably we fell into the no no I didn't say I didn't say we did a good job I said we do that all the time with the implication being that's not a good job, but probably we fell into the mistake. No, no, no. I didn't say we did a good job. I said we do that all the time with the implication being that's not a good thing. No, that's not what I'm saying is I was about to say that we do do a good job, but probably like in my head, I think we do a good job whenever we screen share. But the thing that I've learned from listening to their podcast is that one, don't have a chalkboard. Never have a chalkboard and start writing on that chalkboard because unless if you're

Starting point is 00:06:48 spelling out what they're drawing or writing, you have no idea. Like so much time, you'll just hear Rich going, and you see that's beautiful. And I'm like, I have no idea what you're seeing. And the other thing is that I think probably when we've been coding, we also fall into the same trap where we think we're spelling it out well enough, but there are times where we're leaving off 50%. Well, I think that our opinion of this podcast and our listeners' opinion of how this podcast is going is very different. But I'm telling you, this should have been a YouTube channel. And we're so photogenic.

Starting point is 00:07:20 People could get to meet my adorable dog, although my adorable dog's not here right now. But yeah, it should have been a YouTube channel. Everybody could get to see my great New York apartment. Oh, and you know what? Connor, look at my screen because the blinds are down right now. And watch this. Are you watching? Yeah.

Starting point is 00:07:45 Yeah, I've seen this before. Is this it? I thought there was going to be like balloons or something. No, I didn't set up ones, but you can see the Chrysler Building. I'd have to actually adjust the camera up a little bit for you to see the Chrysler Building. There we go. There's the Chrysler Building. But now you can't see me me and seeing me is more important

Starting point is 00:08:07 than seeing the christ they're building so all right back back okay my my mind is exploded you you came to some realization and it seemed like it was a realization about the all these i i had to open up my github repo but we won't we should make that a rule we no longer share screens if we're going to talk about code i have to communicate it clearly enough to you over the microphone. But it actually, so LCIS was not the problem I wanted to mention. So it's maximum consecutive ones and then Cadain's algorithm, which is also known as maximum subarray sum. And we're not going to get into the details of that. But basically, those are solvable with a one or two character.

Starting point is 00:08:45 Let's actually look at the difference. It's with a. Anyways, we're not going to do this live. It's like a one or two character difference. It's basically a scan followed by a reduction. Yeah. And. I'm trying to think now, is this the exact same thing?

Starting point is 00:09:18 Which if it is, is going to absolutely like, it's going to make my brain explode. Because this will now be, because yeah, it's, so in this case, if you're if the state you're carrying is the number, the count of the maximum difference. Yeah. And then every once in a while, you just might need to reset that. Although. I don't that's actually just a reduction, right? It's just a reduction. It's just a reduction right it's just a reduction it's just a reduction yeah and then it's a reduction that carries two pieces of state yeah no yes it's a reduction that carries

Starting point is 00:09:53 two pieces of state and then you just toss the first one and keep the count the first piece being what the max is wow well this blows everything up um because the interesting part of that problem what i initially thought was that in an array language you could use like an s combinator to basically find the maximum then do a rank polymorphic equals with the original array. So in our case, when we had, you know, one, three, five, 10, 15, 20, you do your adjacent difference on that gives you two, two, five, five, five, you then calculate the max. So that's going to be your first operation in your S combinator. And then the binary operation. So now you've got your original list and your maximum, and then you put an equal sign between

Starting point is 00:10:50 that and that'll turn it into 00111. And then you do a plus reduction. And it's interesting because in J you do that with a hook, which is the S combinator. In BQN, you do that with the S or the Sigma combinator, which are the names of those are after and before. But in APL, you don't with the S or the Sigma combinator, which are the names of those are after and before. But in APL, you don't have the S combinator. So you have to use the Phi combinator, aka a fork, where you fix one of the operations to be identity. So you get back your original list. Anyways, it's very interesting. But now you've completely broken my whole Eureka moment or not

Starting point is 00:11:23 Eureka moment, but just all the time i've spent thinking about this well and there's there's another there's a there's sort of a next step algorithm that i i think of um which is um you you've your first version of this was find the largest gap your second one was count how many gaps there are um the next you know thing that you might want to do that could be useful is give me the locations of all those gaps. And that one you could solve with a similar thing where your two pieces of state are what's the maximum and then a list or a vector. Um, or when I say list, I, I don't, is this a different problem you're solving?

Starting point is 00:12:09 That's a different problem. Okay. It's, it's like, I want to know, I want to find the locations, um, uh, of, of, of all these gaps. Um, and so my state is, you know, a, um, the list instead of just account. Yeah. Yeah. is a list instead of just a count. Yeah, yeah. And the only downside with that

Starting point is 00:12:27 is that needing to have a vector of the locations means that you need to have some dynamic storage for it. But I think in most cases, it's going to be a small list. But I actually imagine that there is some useful problems that you could solve with that like if you were if you were looking if you if you wanted to like slice up a sequence in some way like like you know we've been using gaps in this like the maximum difference between

Starting point is 00:13:03 two integers here but the search criterion could actually be anything right it could be you know, we've been using gaps in this, like the maximum difference between two integers here, but the search criterion could actually be anything, right? It could be, you know, any, you know, it could be finding the end of a sentence. It could be count how many sentences there are, you know, something like that. Or count how many words there are. You know, things like that um uh or count how many words there are um uh you know things like that actually that's an interesting thing that you could like i i've shown um uh in some of my talks i'd love to use word count like a parallel word count as an example um you could you could do um like something similar to that where you find like you, you know, what's the, um, uh, what's the longest word in this text? And then like, also like how many times does the longest word in this text occur? Um just saying, I think that the interesting case is the one where I say like, hey, find all of the places in this sequence that satisfy this criterion and then give me their locations. So like, hey, like here's a string, find all the ends of the sentences so that I can cut this string up into a vector of sentences. And I think with this type of reduction, you can in a single pass, a single parallelizable pass, you can do, you could do an adjacent transform where you get the diffs, then another transform that does a make pair or make tuple where it initializes the second value in that

Starting point is 00:14:54 tuple, which is going to be the count to one. And now you have a monoid. You can do a fully parallelized because you're basically like initially when i was thinking like how parallelizable is it it's like well it's not really 100 parallelizable because you still need to do a left fold to make sure that like every single time you hit a max it resets but actually that's not true at all yeah that property holds even if you you break your sequence or array up into a bunch of chunks whatever result is yielded is the count of the max diff in that chunk. And when you merge those chunks together, like it's, it's monoidal. Um, it'll, you'll always end up with the right answer, regardless if you're doing a left fold,

Starting point is 00:15:35 a right fold, or some arbitrarily, you know, tree reduction. So, so I, which is, which is so awesome. I hate to admit that I might have learned things in college, because college was not a pleasant experience for me, something that we discussed on this podcast before. I don't have any problem with college. I know that a lot of people learned things in college and had a good time there. I did not. I was very happy to get out there. So I hate to admit that I learned anything in college. However, once again, going back to optimization theory, I am recalling learning properties about local minimum and maximum and the relation to global minimum and maximum. And I think some of those things apply to this class of reductions too. And I think that I'm going to give you a homework, which is that I think your homework is that you need to go and do some reading on optimization theory.

Starting point is 00:16:37 I would mail you – What's going to be more optimal than a commutative associative monoid operation that you can put on a GPU. No, no, but what I mean, by optimization theory, I mean, mathematical optimization. I mean, like finding, you know, the minimum or maximum of some function. Okay, you really need to go and learn about this

Starting point is 00:17:00 because you know nothing about it. I would mail you my introduction to optimization book um uh what am i gonna get i got so many things to learn i gotta learn the jelly programming language have you heard of that is this is it gonna be more the last time i mailed you something if you recall it got seized by canadian customs it's true did you ever get it i did yeah but now i've got like 20 rapid tests sitting in my shelves because uh i don't understand it but some of the grocery stores started handing them out for free at some point when at another point they were like illegal to hand out it's yeah but things

Starting point is 00:17:35 changed it is an important clarification that when i when i'm talking about optimization here i don't mean like optimization the computer science i mean like optimization in the computer science i mean like optimization in the mathematical sense and like solving optimization problems all right i'm not sure what i will you haven't convinced me it pains me but what what am i it pains what am i going to get out of it it pains me that i learned something in college it really does am i gonna have to go read my introduction optimization book yeah but yeah how about say how about your homework assignment i'll go learn the jelly programming language. I got to be on a plane in like 36 hours. I don't have time for this stuff.

Starting point is 00:18:11 That sounds like a great learning opportunity. There's nothing that I love more than going to the airport. Tons of podcast opportunities, standing in line for customs. Everybody's angry. Me, I'm listening to Oxide and Friends. One, I don't stand in lines at airports and two i'm never angry at airports that's true i love airports airports cafeterias what do they call them food courts whoo it's one of my favorite places in the world um buddy buddy we we really have to fly together more we really have to fly together more why that's not going to change me loving uh airport food courts i'm gonna take you to some airport lounges, and then you're going to be unable to fly like a regular person.

Starting point is 00:18:51 I've been in several lounges, and depending on the airport and the lounge— Connor, you've been in lounges, but you haven't been in price-quality lounges. This is probably true have i told you this story about the kathai pacific um uh dan dan mian noodles no the kathai pacific first class lounge in uh london heathrow terminal three um maybe the one saving grace of london heathrow as an airport. They have great food options. It's like one of my favorite lounges. And they have these Dan Dan Mian noodles on the menu. And they're really, really good.

Starting point is 00:19:36 It comes in, when I, I've tried to order Dan Dan Mian noodles in New York to get something similar, but it's different because the ones that they have the um, it's, it's different because the ones that they have the lounge, um, it's, they're like drenched in sauce. It's basically like a, it's like a, um, you know, almost like a, a soup with noodles in it. Um, and the sauce is this like really spicy peanut, um, uh, sauce and it's really, really good. And I get it every time I transit through London. Some people in my family have suggested that I maybe fly to London solely to go to this airport lounge. But anyways, I love them so much and I can't find a place that makes them the same way in New York.

Starting point is 00:20:19 So I finally was like, all right, I'm just going to reach out to the airline on Twitter and be like, hey, can you give me your freaking noodle recipe? And so I did that. And, um, uh, I, I, this was, so I think I first tried calling them, um, and, uh, their customer service was like, yeah, we can't help you. And then I tried, uh, contenting them on twitter and then i tried emailing a few people and no luck but then i eventually i finally i found the recipe online um from the from the chef so now i have the cathay pacific um dan dan mian lounge noodles recipe and i do intend on making them soon. Yeah. All right. Well, I hope the listener enjoyed that.

Starting point is 00:21:11 I mean, I guess I shouldn't give you too hard of a time. I think I did mention on this podcast before that at one point I tried to reach out to Cinnabon to get them to sponsor me for my races. Yeah. Although I'm not a big Cinnabon fan anymore, to be honest. I wonder if I could get a local restaurant or not restaurant. Like there's two donut places that I really like in Toronto. One of them is called Harry and Heels, which is actually, it's interesting what they've done. It's a pizza restaurant.

Starting point is 00:21:43 But because of online delivery services, they've just. It's a pizza restaurant, but because of online like delivery services, they've just created a secondary business that they call Harry and heels that only sells the donuts that they sell at their pizza store. So it's like, Oh, we're at a donut shop. And the first time I went there, I was like, this is a pizza shop. It's not a donut shop, but like they just, they market it as two different things because if it's just a pizza restaurant and people are looking for donuts, they're never going to end up on the sort of pizza store. And there's another one called Unholy Donuts, which I only I went into the store the first time a couple of weeks ago. And I was so confused because I don't like filled donuts.

Starting point is 00:22:19 I don't like filled donuts because it's just too many textures. It's upsetting. And all of their donuts looked like they were filled and i was like oh how do you feel about eclairs oh i absolutely hate them name something where it's like uh pastry or like the gusher is the worst i was gonna bring you offender it was for cpp north but i guess i'm gonna bring you something else no if it's got... You know that Ramona and I met at McClure Bakery. I did not know that. I mean, you met on a dating app.

Starting point is 00:22:52 Yeah, that is true. Your first date was probably... Like, let's be clear. Let's be clear. But yeah, this Unholy Donuts place, I went in and all of them looked filled. And then I asked the lady behind the counter, oh, you guys don't have any unfilled donuts? And she's like, yeah, the sprinkle, the cinnamon, the Homer, and like one other one are not filled. And I was like, huh? And so I ordered a sprinkle one.

Starting point is 00:23:14 Sure enough, the reason they're called unholy donuts is because they don't have holes in them. Oh. Which I didn't put together. That's pretty clever. But you think we can get one of them to sponsor us? I mean, not this podcast. I mean, maybe I could get the new bodega on my block to sponsor us because I'm pretty sure

Starting point is 00:23:35 that we are the sole reason that their smoothie operation stays in business. They've got a problem now that Ramona's uh up in albany uh a good chunk of the time yeah i don't i feel like a local business wouldn't like i was thinking more to get sponsored for for racing like if you're racing in a singlet that says you know yeah yeah run for the unholy donuts or whatever um for this podcast not as much because like what percentage of our listeners are in toronto no no your your thing makes more sense than my thing i'll admit that yeah although it would be interesting to find like it it'd have to be like that's why i

Starting point is 00:24:15 thought you know lululemon it's like global right or at least it's both america lemon call us call us yeah you know we we had a whole episode dedicated basically to this. Or like there's so many companies like American Airlines. You could sponsor us. I stopped spending money on bubbly though. I only have one of those cartridges with the little squirt squirt soda buzzes. What do they call them? Whatever.

Starting point is 00:24:41 Soda streams. And I got the flavors. But honestly, like you can add so much carbonation like you just keep going and uh it's it's so overwhelming that you don't need flavors the flavors are meaningless when you've got basically just like an explosion of carbon dioxide in your mouth anyways uh okay this conversation's gone off the rails yeah it's even for us it's gone off the rails i have determined though in the back of my head that um this max gap count problem is not going to follow the same pattern as uh cadanes and maximum consecutive ones yeah i don't

Starting point is 00:25:19 think so because it's just a reduction not an inclusive scan reduction yeah and it's actually probably won't even be that pretty because actually it might be well i'll i will solve it off off air and then maybe next episode and and see see this this connor i'm i'm dear listener i'm about to be very arrogant this connor is why i am a principal architect. Because you've been thinking about this for days. You hop onto a meeting with me and I'm like, oh yeah, just do it like that. And boom, the entire trajectory of your next week has been decided by me. And now I can go fuck off and go to sleep.

Starting point is 00:25:59 This is going to take all of 10 minutes to program in several different languages. Yes, but it required the two minutes of hard work that I did to lead you to the promised land. It is a certain amount sad that... Well, no, fake hubris aside, you know what the reality is? The reality is, and I think we all know this, that we often do our best work when speaking to

Starting point is 00:26:35 or collaborating with others. Because oftentimes, we lock in on, this is the way that we have to solve this problem. Or my old boss, Hartman, whenever we were debugging a performance problem, he would always be like, you got to watch for the red herrings, you know, you have to watch

Starting point is 00:26:54 for the things that seem like they're going to be the problem or the solution, but they're actually just a trap. And you probably just got off thinking about this one way and you were just stuck in that that course of thinking and just just you just needed somebody who was fresh to the problem to be like oh well you know my gut reaction is to do this um and and probably if my gut reaction had been wrong i probably would have spent you know two or three or four days thinking about how to solve it with the way that initially occurred to me. Um, and it would only be when somebody new looked at it, uh, that I'd spoken with that,

Starting point is 00:27:33 that I would be like, oh yeah, that's actually how to do it. Or maybe I'm really, or maybe really I'm just that much smarter than you. Could be either thing. Could be either thing. Could be. I mean, we'll never know, but, uh, it is, you know, the key insight is just recognizing that you don't need to, every time you hit a new max, you can still keep reducing. Because it, at least when I thought about it initially, I was like, well, at first I have to find the diffs. And then I need to find the maximum.

Starting point is 00:28:01 And then once I have the maximum, then I can do it. But it's not true. You can do those two things at the same time, which is like, at least for me, clearly was a non-obvious insight. This brings us to the end of the episode, starting from episode 115. If you don't have Twitter, there is going to be a link to a GitHub discussion on a per episode basis if you want to leave comments or questions, and we'll be monitoring that. Or you can also do it on Twitter,

Starting point is 00:28:28 which is typically where we hang out. Thanks for listening. We hope you enjoy and have a great day.

Your Ad Here

Algorithms + Data Structures = Programs - Episode 116: Max Gap Count in C++23

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.