Algorithms + Data Structures = Programs - Episode 43: Parallel Scans and Associativity

Starting point is 00:00:00 Wait, I gotta go get my Lady Great tea, okay? I'll be back and we'll talk about that in like 30 seconds, but I need my hydration. Maybe I'm gonna pick up a strawberry Perrier as well, because, you know, I'd offer you one, but you know, you're in New York, so bad choice. There's so much to unpack, dear listener. Welcome to ADSP, the podcast episode 43 recorded on September 11th, 2021. My name is Connor. And on today's episode with my co-host Bryce, we talk about parallel scans and what the associativity requirement on binary operations actually entails, as well as NASA plants, Ramanujan numbers, and more. Oh, see, I bet you expected me to be late, and that's why you weren't ready on time. Um, well, let's talk about this, Brycece you've been late 33 minutes once 28 minutes once you know the group that i used to play magic the gathering with in uh in california there may have

Starting point is 00:01:17 been a few times where i was confused about whether we started at 6 p.m or or 7 p.m. on Friday nights. And so I showed up an hour late. And so they, of course, never let me forget it. And they started using Bryce as a verb to mean late. You know, so somebody would be like, oh, I'm going to be five minutes, or not as a verb, as a whatever part of speech it would be. I'm going to be five minutes Bryce. Yeah.

Starting point is 00:01:56 Wait, so you just indicated that I was missing out for having moved to New York, but I'm closer to where you are, so it would be more practical now for you to share a drink with me than it would have been when I was in California. Yeah, I guess that's true. That's true. Were you just suggesting that I made the wrong decision in not deciding to relocate to Toronto? Maybe. All right, here, listeners.

Starting point is 00:02:24 Usually I would mute this this is the sound of a strawberry perrier being opened and it's gonna be it's gonna be magnificent you ready here we go actually that was pretty bad that's pretty i'm gonna cut that out or maybe i'll go get another maybe i'll go get another strawberry perrier later. My, what, you're going to re-record that? You're going to redo that? No, I mean, these things are only 250 milliliters. I'm going to pound through this in like 30 seconds. My stuff is here.

Starting point is 00:02:55 Organized boxes. Well, all my boxes are here. I do have a bed. Yep. Our listeners are informed of all of your furniture and my furniture for that matter. We're keeping the listener. Hey? Yeah, yeah. There you go. Good one.

Starting point is 00:03:17 I told Connor that we should start referring to our – when we break the fourth wall and talk directly to you, the listener, that we shouldn't talk about you, the listener, in the plural. We shouldn't say, you know, audience, you know, insert some fact about Connors here that I want to tell you directly or something like that. No, instead, I should directly address you, specifically you. Yes, you. Whoever's listening to this right now, because it creates a more personal experience for the podcast. I picked up that a lot of the big podcasts do this. Should we just assume also that everyone we talk about is a listener? So when we mention like, you know, Tony's name, we just say, Hey, Tony, how's it going? Oh, yeah, of course.

Starting point is 00:04:10 Of course. Because either they're listening and they feel special or they're not listening. And then we have a chance to guilt them over it later. Oh, yeah, that's a good point. Yeah, that's how we should grow our listener base, is through emotional manipulation. It is what my people are known for. So we gotta start things off by addressing the first ever ADSP, the podcast, Twitter poll, which was obviously, if you're a listener. I am unfamiliar with this.

Starting point is 00:04:46 It's when Bryce previously had said, quote, any civilized person knows what a credenza is. Now I do know. Now I do know what's happening. I posted the question. So I wasn't sure if I should have posted it. I have heard of or I have not heard of, like just a true false sort of thing, but I opted for like the balderdash. Let's make up things. So I put four options.

Starting point is 00:05:11 What is a credenza? A pastry, B luxury car, C furniture and D clothing. In last place with 4.7% was a luxury car. And I should also give myself some credit that when I said it sounded like a Ferrari Credenza, I think the reason why a bunch of people knew what this was is because they're Italian. And it's an Italian word,

Starting point is 00:05:33 and luxury cars mostly come from Italy and Germany, not to offend our German listeners. Did you capitalize Credenza in the poll? Because if you didn't, then obviously people would be like, well, yeah, it can't be a luxury car because otherwise it would be capitalized. Oh good point it was not capitalized i i'd done messed up all right all right so that was four what was three uh in third place was clothing at 11.2 percent i really should have instead of putting clothing after the fact i realized i

Starting point is 00:06:00 should have put bicycle but that's okay in place, in third place with 21.3% was pastry. I'm not sure if that says more about people thinking it's a pastry or people thinking that if this is something about Bryce, it's a pastry. Which that means that in first place with 62.7% was furniture. So that means, what is that, 37.3%? I'm pretty sure that you switched the place ordering system that you were using in between the second place and the last one. Because you said in third place was pastry and then in first place was oh my bad yeah so fourth fourth was luxury car third was clothing second was pastry and uh first was furniture so yeah a grand total if my math is my math is correct uh 37.3 percent of people are

Starting point is 00:06:59 uncivilized so i think you owe the listener 37.3% of the listener an apology, Bryce. I just want to say, dear listener, I'm really glad for the 67% of you that are sublimes. 63. 63. You're not rounding up to two-thirds there, buddy. You don't get to do that. Also, this doesn't work if we're referring to the listener with a poll where, how many people responded?

Starting point is 00:07:30 It's like 10. Oh, 169. Wow, that's 13 squared. That's pretty good. Why do you know that? Why do you know that? Doesn't everyone? Well, I don't know.

Starting point is 00:07:40 Like math nerds, I just assume have like all of their squares. Why did you, I like how you not only math words, I just assume, have like all of their squares. I like how you not only knew that, but you felt the need to just share it with the world right now. I mean, how often do you come across the number 169, you know? I don't know, but I have a feeling that somewhere you have some spreadsheet where you've written down every time you've come across. No, no. written down every time you've come across no no it's just like like i've never come across like the number 343 just like as a figure um but that's like my one of my favorite numbers that's seven cubed what are some of your other favorite numbers well we already well 42 1729 4104 anything that's

Starting point is 00:08:20 a remodeling number uh yeah i'm so bad when i write uh when i write tests there's a Ramon Jean number? Yeah, I'm so bad when I write tests. There's a surprising number of tests that I write where the value 1742 or 3.14 is used as some value for testing purposes. Like, if you look through a bunch of uh you know tests in a code base that i've worked on and you see those numbers it was probably me that did it yeah that's uh it's uh yeah a lot of times when i'm just messing around 42 and 1729 are the the numbers that i that i throw in or 1729 so uh 1729 is the first Ramonajan number. What? Which is...

Starting point is 00:09:08 I covered this in a CPPCon talk, which means Bryce doesn't watch my talks. Also, Stefan Lawaway, in the same CPPCon of 2019, he snuck a bunch of 1729s in, which I pointed out in the same conference at my talk that was later in the week. So there's a movie called, well, Ramanujan is a famous mathematician. Yeah, I knew that. And there's a movie called, is it The Man Who Knew Everything?

Starting point is 00:09:38 Something like that. Dev Patel plays Raman Ajan. And they're also known as taxicab numbers, but they're numbers that are expressible two different ways as the sum of two cubes. So there's this famous anecdote from history where Ramon Ajan is getting out of a taxicab. And his advisor, I'm not gonna remember his name. There's Littlewood, and then there's one other guy. And he says, oh, that's, he remarks, that's not a really intriguing number. And then Ramon goes, what are you talking about? 1729. It's, it's a number that is expressible two different ways as a sum of two cubes. And it's so it's, it's 10 cubed plus 9 cubed, and I believe, I can't remember,

Starting point is 00:10:27 1 cubed plus 13 cubed maybe, something like that. Why taxicab? Because Ramon Ajan was getting out of a taxicab where the taxicab number was 1729, and that's where they got given his name. That wall behind you needs like a picture. It's something. It's sad right now.

Starting point is 00:10:54 It's sad and lonely. I mean, I will give you that it is lonely in my place. I wouldn't say it's necessarily sad. I've got like eight plants. I've got like a little garden A greenhouse My apartment is basically a greenhouse I have no plants I'm very Some people

Starting point is 00:11:12 Some people move to New York And they're like I miss having a place to go outside I miss having a backyard That's not me I'm very happy in my sterile Apartment that has nothing living in it aside from me yep it's uh i don't know you should get some nasa plants i know it's nasa java java uh

Starting point is 00:11:35 nasa nasa um the nasa approved plants that like uh they they clean oxygen and like super did you freeze or are you just very still you're not just very still i'm just very still um anyway it's enough about plants we also so we have to talk about we have to talk about the associate so this this is going to get a bit technical i apologize to listeners for if this becomes really boring you don't apologize to listeners now do you oh yes we apologize to you the listener we gotta be the only podcast where there's two co-hosts and one pings them hey hey we gotta stop referring to them in the aggregate we gotta singular let's connect with them more one-on-one and then it becomes just meta because the first episode we're recording 30 of our discussion is like no no no connor what did we talk about we talked about that there's only one listener. You hear that? He's listening or she's

Starting point is 00:12:27 listening. They're listening right now. There's only one of them. All right, here we go. So what happened? CEEF, C-E-A-F, it's going to be a big deal, Combinator Enabled Algorithm Fusion. It's going to be the foundation upon which potentially a programming language will be written in the future is this the programming language that when you started working at nvidia i told you you were going to write and you gave me an incredulous look and did that thing where you giggle and laugh that uh as far as i can recall never happened i'm not sure what you're talking about there were witnesses giggling is not something that i believe i'm capable of i i'm i'm gonna spare you dear listener from me repeating the story again of the time that connor giggled at the mention of sean parent but continue okay okay uh that's fair um but anyways i posted i posted so actually

Starting point is 00:13:30 what happened is on the most recent or was it two episodes of array cast ago um at the tail end of it i had mentioned the possibility of like building an array language that uh took advantage of the fact that if you built it up of primitives and you knew all the properties algebraic properties of the primitives like if whether they're associative or commutative you could like fuse your operations down to whatever the bare minimum of what needed to be done and then on top of that if it was like a reduction at the end of the day you would know that the like combination or composition of all those binary operations, you'd know the associativity or commutativity of it. And then let me get his name right. Troll

Starting point is 00:14:11 Henriksen, I believe trolls Henriksen. I apologize if I'm pronouncing incorrectly your name. We went back and forth on Twitter a tiny bit. And so trolls is the I'm not sure if he's the creator, but he is working on the Futhark language, which is like a GPU slash Haskell slash sort of functional programming language. And basically at one point, he asked for some example code. I posted that example code. And then his response was

Starting point is 00:14:42 that operation is not associative. And let me find that. Yeah, show me this operation. And so what I thought was that basically the binary operation had a pair as the left argument of the binary operation. And just a single element for the right argument. So it's definitely not commutative. And based on his definition, it made sense to say that it wasn't associative,

Starting point is 00:15:13 but it was the same pattern that we had when we were solving that maximum consecutive ones. And the way we solved that was by creating a function object in the form of a struct that overloaded the function call operator like three or four different times for dealing with the different cases of do I have sort of that information, like information struct and a single element? Or do I have two like information structs on the left and right? So it dealt with the cases of like, am I doing my first little pass on a chunk of the array and building up the information

Starting point is 00:15:48 and then doing like the down sweep and the up sweep? But like, does that not qualify as associative? So I'm looking at your Godbolt example here. Which operator are we talking about? Is it the one in the K function or the K2 function or the scan reduce lift function? But I wonder whether maybe him and us are working from a different definition of what associative means. If you look at either K2.

Starting point is 00:16:22 Yeah. K2 is the easiest because it's not split into that scan reduce lift thing so basically what what we're looking at just so that we're not just randomly talking about some godbolt example is we're looking at three different solutions to caddane's algorithm and caddane's algorithm is a solution to solving the maximum uh subarray sum problem. So if you're given a list of numbers that are negative and positive, what's the longest contiguous sequence of elements such that it maximizes the sum of that subarray?

Starting point is 00:16:57 So just do a proof. Like, just, you took a, you must have taken a class in college where you did proofs, right? I took discrete mathematics, but yeah. Here, let me send you the tweet that Trolls sent. I think that's the easiest way to think this through, is to just sit down and write out a proof. Maybe we can even do that now.

Starting point is 00:17:29 Essentially, you... But that's the thing. I think his point is that, so if you click that tweet, it'll bring you, I sent it in the chat. It says, how can f of x, so f that takes two parameters where the first one is x and the second one is f of y and z, be the same as if you switch the f of x and y, z, if they have different types. But that's the thing is like – Well, I think he's going along the same logic that I'm going along, that that statement that he has there in his tweet, that's what you want to prove. Just take your operator and plug it in and see what happens. Well, so that's why I'm getting back to the maximum consecutive ones problem. So if you haven't listened to that, that was a couple episodes actually,

Starting point is 00:18:33 but the one where we really dive into it and then it's called the loss reduction episode 25. I was going to say, don't you recall that like in that function object, like we have overloaded the function call operator. And that's where the disagreement between us and him comes up so he's he's asking how can f of x comma f y z be equal to f of f x y comma z if f y comma z has a different type than z yeah um and then he asks maybe we are using different definitions of associative and i think he's right that we are because in in our thinking um uh overloading you know f could be an overloaded function, right? That F could take,

Starting point is 00:19:26 it could take either, you know, a pair as the second argument, or it could take a scalar element. But I think... Isn't that cheating? I feel like that's cheating, is that like, so like, isn't, aren't we bending the definition of associative then? Yes. I mean, I think it's hard to say. Isn't this in the ISO standard? Like, do we define what we mean by associative? I don't think so. I mean, it's really, but the question isn't what we define as associative here. The question is, what do we define as an operation?

Starting point is 00:20:08 Like, what do we define as f? If we say that f is only a single overload, then yeah, this couldn't be associative. But if we say that f is an overload set or if f is a function object um then sure this this could be um uh associative and i think i would even argue from um from like a mathematical grounds i don't see any reason that you couldn't define, uh, uh, of, of, of function this way. Um, you know, there's, there's, there's no reason to restrict yourself to, uh, uh, just to requiring that the second parameter is just a single type. And in fact, I would go a step further. One of the tricks that I used in that maximum consecutive ones in a couple of my little prototypes was

Starting point is 00:21:25 relying on implicit conversions so this reduction that you're doing it's a reduction over what like a pair of int and then what's the data type this is the data type like int for the input sequence yeah basically it's a pair of int and then what's the data type this is the data type like int for for the input sequence yeah basically it's a pair of ints and one of them represents sort of a internal scan that we're doing and then the other int represents a maximum value over that scan which the point is is that like when you have a scam followed by a reduction you don't actually need to do the scan in terms of like writing to a linear amount of memory. You can just fuse those into a single reduction.

Starting point is 00:22:10 So, you know, one way that you could implement this operator is you could have one overload that takes pair and then, you know, pair events and then int. And then you could have one overload that takes pair events and then pair events, and then int. And then you could have one over there that takes pair events and then pair events. But you could also pick some sentinel value. You know, what is the second pair? Like in this pair events, what are the two values that are represented? The first value in the pair is the maximum. So like you see how we're doing a dot first at the very end?

Starting point is 00:22:52 Yeah, yeah. That's our actual result that we're collecting. And then the second value is the running sum that's sort of being floored at certain points in time. So basically, if you were doing if you were if you wanted to take just a single element and construct one of these pairs from it, you could do that. Yeah, you would basically you would just add it to the second one and then you would set it as the first one as well, because that's going to be the maximum at that point in time, unless if it's negative, of course. So then instead of defining your operation as an associative operation over the set of like ints and the set of these pairs and having this problem of dealing with heterogeneous types, you could just think about your operation as being an operation that only operates on these pairs and that you would just convert those the ints to the pair just do a transformer yeah or yeah and like like like you know you could one you could actually do this in your implementation you instead of using stood pair you could implement your own little struct type that has a constructor that takes a single int, and it's not an explicit constructor so that it will actually implicitly convert. You could do that.

Starting point is 00:24:15 But my point is really more that this is a theoretical exercise that I think demonstrates that the operation is indeed associative and it can be associative over a single type if you make that single type this collection of pairs. I do see his point, but I do think that he's wrong. I do think that this is an associative operation. I think that in the worst case, you could explain it as an associative operation over these you know these pairs right but like i from him looking at the code where you see that it's they're not both pairs the binary operation as a pair on the left and an element on the right it's definitely like when he asked the question i was like yeah he seems right but i know that we worked through this in the previous maximum consecutive ones and we did that by overloading the function call operator yeah and then my question was like so either either

Starting point is 00:25:10 we're like we've taken the word associative and sort of redefined it for our own sake or um you're allowed to call that op like if the operation can do different things that is also a valid in terms of it being associative either Either way, I'm not sure which one's right, but Henrik, this is for you, or not, sorry, Henrik, Henriksen, Trolls Henriksen. Yeah, that's, I guess, the answer. Do you have a sense of whether which one it is? Like from, what do you think the mathematicians are saying, that we're like, we're speaking blasphemy right now? I'm not sure. I'll say this. I think overloading is such an intuitive part of C++. I feel like somebody would clip that because overloading, of course, can be a fairly complex aspect of C++ that can be hard to understand.

Starting point is 00:26:02 But when I say it's intuitive, what I mean is it's a tool that we reach for so frequently that we take it for granted and that we don't perhaps even think about all the implications of that. And let me explain it a step further. When we as C++ programmers talk about an operation, either, you know, an operation that's defined as functions, or maybe it's defined as a customization point object, but like a named operation, like std reduce, let's use this as an example. We think of that as a single entity, right? That's a single thing. Maybe we even say like it's a single function.

Starting point is 00:26:58 And I consciously avoided saying the word function when I was just describing that there, because I didn't want to add confusion to it. But in reality, the named thing stood colon colon reduce is actually multiple different overloads. It's multiple different function definitions. And that's how you implement it. And another interesting aspect of this is that in some cases, especially when you're dealing with defaulted parameters,

Starting point is 00:27:40 you have the choice where you could either say, I'm going to implement this function with a single overload that has defaulted parameters, or I'll implement it with multiple overloads without the defaulted parameter. And the distinction between the two isn't really observable. And in fact, we give standard library implementers the freedom to pick which of these, to pick, to choose to do this either with more overloads or less overloads when they can, when they have defaulted parameters. The point I'm getting at is that we think about an overload set as being like one contiguous entity. And that's not, you know And that's not always the case.

Starting point is 00:28:26 Sometimes different overloads have different functionality. We try to avoid that. We generally think that that's not good design. In fact, one of my key library design rules is that all of the overloads of a function should have the same semantics, the same general semantics. And likewise, all of the specializations of a class template should have the same semantics

Starting point is 00:28:59 and the same, the same set of member functions and the same interfaces because we're going to think of those things as being a single entity. But there's other ways in which overloads can be quirky too. Different overloads can return different things, which can sometimes be surprising. The parallel overload of for each returns void. The scalar overloads return the function object. The parallel versions, of course, can't return that function object because that function object, it may have been stateful.

Starting point is 00:29:42 And that's the reason why the scalar version of, or the sequential version of for each returns it is because you might've passed in some function object that was gonna accumulate some data and then you wanna get it back later. Well, we couldn't support that in the parallel version. So we didn't. And we didn't return you the function object

Starting point is 00:30:04 because there was no reason for us to do that because there was nothing useful you could do with it. But that's a little funky to have a operation under one name that has two different return types. And likewise, you can have a variety of different parameters that a single named operation can take of a variety of different types. And I think we perhaps take it for granted that operations work like that in C++.

Starting point is 00:30:42 And maybe this fellow coming from a, it sounds like a more Haskell background and a background from other programming languages, maybe in his mind, a named function or a single operation should be something that has, you know, the same signature. And so maybe that's what what was bothering him um but i i could see i could see even even coming from a like similar background but you're like i i'm not a a trained mathematician so i'm not sure in the world of mathematics what they consider like the definitive definition of associative and um i wouldn't be surprised though, if many mathematicians saw that, you know, the pair on like, not even just type A on one side and type B on another side and said, well, clearly that's not associative. I think if you looked at the math definition,

Starting point is 00:31:37 yeah, I think if you looked at the math definition, it would have to be associative defined on some set of objects. And that set of things shouldn't be heterogeneous. But I'm not really sure, because the definitions of associative in the math world that I'm familiar with work like that, but it's possible that there are more complex definitions of associative uh that do have a notion of you know heterogeneity but i definitely think that at the very least you could define the set of objects that this operation operates on as being those pairs not the ints and the pairs right right you've solved your problem right i uh we should put i feel like we should put a disclaimer on the our website we do have a website by the way we never mention it

Starting point is 00:32:30 but it's uh adsp the podcast.com and maybe we should call it the title of this episode like surprise uh we are not mathematicians i i i have a math degree i mean i studied actuarial science which is like in in the math world but uh but um you know i feel like we're too bryce has just run off to go get something but i'm going to continue to talk but uh i feel like we're we're two developers that are math inclined but you know we're talking about the definition of associativity and opining on it when really we should just have a math person on and we should have a, we should, Oh, ADSP should have like a, like, so if you study pure mathematics, or maybe, maybe Lisa Lippincott would be great for that.

Starting point is 00:33:17 I think you should narrate what I'm doing, by the way. All right. I'll start doing that. So Lisa, you probably don't listen, but if you are listening, would you want to be our math consultant and every couple well a couple we will accrue our math questions and then have you on every once in a while um and also too lisa's got a crazy fascinating story it'd be awesome to have her on just for her to i'm sure she has some crazy stories similar to Sean. So, yeah, narrating.

Starting point is 00:33:47 I don't actually know. Bryce has got a package that's something he's wrapped up. You can hear in the background he's got scissors. There's a 37% chance he's going to stab himself. These are very nice scissors, by the way. Yeah, of course they are. I'm sure they're imported from Italy. They've got a mahogany handle. Imported from Japan. They don't have a mahogany

Starting point is 00:34:06 handle, but they are high carbon steel. They're very good. The fact that even Bryce has to clarify that his scissors don't have a mahogany handle is like the perfect summary. So this actually, this

Starting point is 00:34:22 didn't come in the stuff that just arrived yesterday. This traveled with me, but this is my Louisiana State University diploma. And can you read that? It says, Louisiana State University and Agricultural and Mechanical College, on the nomination of the faculty of the college of science has conferred upon bryce alexander whoa i didn't know you had a second middle name adelstein lelback actually i think i did know that um the degree of bachelor of science with all the honors rights and privileges

Starting point is 00:34:59 of that degree appertaining you got to move it up a little bit. Oh, yeah. Wait, wait. It doesn't say what my major was. Crap. Well, I assure you, I assure you,

Starting point is 00:35:16 I had a major in applied mathematics. Thanks for listening. We hope you enjoyed and have a great day.

Algorithms + Data Structures = Programs - Episode 43: Parallel Scans and Associativity

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.