Algorithms + Data Structures = Programs - Episode 9: C++ vs Clojure partition (Naming - Part 2)

Episode Date: January 22, 2021

In this episode, Bryce and Conor wrap up the std::move_only_function saga and continue their discussion on naming.

Date Recorded: 2021-01-09
Date Released: 2021-01-22

C++ std::partition
Clojure partition
f...ilter across languages
C++20 views::filter
Haskell filter
An example of D’s amazing docs
any_invocable Proposal
C++ Coroutines
std::accumulate
std::partial_sum
Python accumulate
Julia accumulate
Q Language
Q sum & sums
Arthur Whitney
k programming language
Kx Acquired for $100M (note $40M of stock was purchased previously)
std::messages
Sean Parent’s APL tweet
Isaacson’s Steve Jobs

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Transcript
Starting point is 00:00:00 And then they said, we'll pay you $100 million. And he said, okay. And that's a true story. It's a 50 kilobyte program. And in the APL community, it's regarded as the most expensive piece of software in the world because it's $2 million per kilobyte of code. Welcome to ADSP: The Podcast, Episode 9, recorded on January 9th, 2021. My name is Conor, and today with my co-host Bryce, we wrap up the saga of std::move_only_function and continue our discussion on naming. It's not just symmetry within a language, but it's symmetry across languages.
Starting point is 00:00:56 One of the best examples is partition. In most languages, partition is very similar to what we have in C++, where you give it a unary predicate, and then it partitions the structure so that the elements that pass the predicate are at the front and those that fail are at the back. Many languages have something called partition. Some of them vary slightly because instead of getting back a single vector or whatever data structure you had, you might get back a tuple of two lists. So they split it, and they haven't joined it back together. But Clojure, which is one of the Lisps I mentioned, has partition, and it does something completely different. It's the equivalent of an algorithm called sliding or windowed. And I just think
Starting point is 00:01:35 that mistakes like that, where there is an extreme amount of prior art and all you have to do is go and look, are unfortunate when they happen. And then there's the argument to be made: well, what if all the prior art is bad? So some people say that filter is a bad name, because do you filter in or do you filter out? And I agree with that argument. There is some ambiguity there. But I don't think I found a single language that doesn't filter in. So if you have, in Haskell, filter odd, it keeps all the odd elements, and the ones that fail to meet the predicate get excluded.
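To make the two meanings of "partition" concrete, here is a minimal Python sketch. The function names `partition_by_predicate` and `partition_windows` are hypothetical, chosen only for this illustration. The first follows the "tuple of two lists" variant described above. The second imitates the sliding or windowed behavior attributed to Clojure's partition, on the assumption that the windows overlap and advance one element at a time; it is not a port of Clojure's implementation.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def partition_by_predicate(
    pred: Callable[[T], bool], items: Iterable[T]
) -> Tuple[List[T], List[T]]:
    """Split items into (those that pass pred, those that fail), keeping order."""
    passed: List[T] = []
    failed: List[T] = []
    for item in items:
        (passed if pred(item) else failed).append(item)
    return passed, failed


def partition_windows(n: int, items: Iterable[T], step: int = 1) -> List[List[T]]:
    """Return every full window of n items, advancing by step each time."""
    seq = list(items)
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, step)]


numbers = [1, 2, 3, 4, 5, 6]
odds, evens = partition_by_predicate(lambda x: x % 2 == 1, numbers)
print(odds, evens)                    # [1, 3, 5] [2, 4, 6]
print(partition_windows(3, numbers))  # [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
```

Same name, two unrelated results: one call answers "which elements satisfy this condition", the other answers "what are the consecutive groups of size n".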
Starting point is 00:02:14 Almost every single language that I found works this way. And so I would argue that in a new programming language, don't go and reinvent the wheel. Just choose filter. Some people say, you know, keep if is a better name. Potentially, that is a better name. But do you really want to, you know, I'm not sure. What are your thoughts on like filter versus keep if? Well, I mean, I think you make,
Starting point is 00:02:41 when you want to break from existing precedent, you need to have a really strong case. The default should be to stick with the existing precedent. Because even if it's a bad name, if it's a bad name that everybody knows to have this meaning, if all these other programming languages call this thing filter, and if everybody's going to know what that means, then yeah, maybe it's bad. But if you introduce a new name, you're going to have to teach that to people. I mean, I don't know. I do get the argument about whether you are filtering in or filtering out, and there is some ambiguity there. So what's the typical meaning of filter? Typically, it filters in.
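As a small illustration of "filter in", the sketch below uses Python's built-in `filter` together with `itertools.filterfalse`. The helper `is_odd` is defined here only for the example and mirrors the Haskell `filter odd` case mentioned earlier.

```python
from itertools import filterfalse


def is_odd(n: int) -> bool:
    return n % 2 == 1


numbers = [1, 2, 3, 4, 5]

kept = list(filter(is_odd, numbers))          # "filter in": keeps matches
dropped = list(filterfalse(is_odd, numbers))  # "filter out": drops matches

print(kept)     # [1, 3, 5]
print(dropped)  # [2, 4]
```

Python resolves the ambiguity by giving the opposite behavior its own name, `filterfalse`, while plain `filter` keeps the conventional "filter in" meaning.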
Starting point is 00:03:27 So yeah, like I said, in Haskell, or insert any language, because they all call it filter. C++20 ranges has a view called filter. If you call filter and pass it your list and then your predicate, anything that returns true for the predicate is kept, which is why the proposed
Starting point is 00:03:45 name, and so a potential rebuttal to when you say, oh, well, now you have to teach the new thing: I'm sure someone who votes for keep_if would say, well, you don't even need to teach what keep_if means. It's obvious. You keep it if it meets the predicate. My argument against renaming it is the discoverability of it. If I'm going to some new language X and I want to filter something, I'm going to go to new language X and search the docs for filter. And potentially, if the docs are really well written, like the D language's docs, which are fantastic because at the bottom of every function they put "in other languages, this is known as", and then a list, maybe not comprehensive, but I've seen up to six
Starting point is 00:04:29 or seven different names for algorithms. I think that those docs are fantastic. So if you're doing that and you say that in most languages this is known as filter, and it comes up in the docs really quickly, that's not really a big issue. But I've definitely, when going to other programming languages and looking for an algorithm that does partition and it's called something else, found it sometimes tricky to track that stuff down. I would say, if I was going to pick better names, I'd probably be inclined to go with something like filter_in or filter_out instead of going to something like keep_if. Just to not have a plain filter, but have an explicit filter_in and filter_out, or just an explicit filter_in. Just to retain some of the familiarity of the existing name.
Starting point is 00:05:23 You know? Yeah. But it's... So did you mention what the voted name was? I don't even think, at the very end of your std::function_ref and std... Oh, no, I don't think I did. What was the name that got voted in? We settled on the name. Well, it's pending a
Starting point is 00:05:46 final vote, but I'm pretty sure that we're going to move forward with this name, which is move_only_function. So the top two contenders after a few rounds of voting were any_function and move_only_function. And the argument for any_function was that we want to establish this naming pattern for type-erased things. And okay, function isn't the name of a concept, but any_invocable really would have established the pattern for the concepts. And as I mentioned, with the way that C++ works by case law, oftentimes this is how you actually establish a guideline in C++: you get the first thing to follow the pattern, and then subsequent things just build upon it. So that was the argument, I think, in favor of any_function. There were also people who argued that the name shouldn't reflect the,
Starting point is 00:06:49 the move-onlyness of it or the moveability of it. But move_only_function won the day. And I think it's a good enough name. When we first met in December, it was very much not clear whether we were going to get consensus on a name. And we had not only a telecon meeting, but there were also mailing list discussions, which I think must have exceeded 300 emails in length. It probably was more than that. But I think what happened was a lot of people had strong opinions, but at the end of the day, folks realized that they would rather have this feature with a good enough name, or at least a name that's not actively bad,
Starting point is 00:07:51 rather than not have this feature while holding out for what they believe is the ideal name. So I think at the end of the day, not everybody is 100%, or not everybody thinks move-only function is the perfect name, but I think that we all agree that it's a name that is good enough. And more importantly, it's not actively bad. It does not have any active consistency problems. It fits in nicely with the other names. It doesn't answer or resolve this question of how should we name type erased facilities, type erased wrappers for concepts. It sort of punts on that question. It doesn't embed a particular use case into the name as, say, callback or some of the other suggested names may have. So there were no large problems
Starting point is 00:08:58 or negative connotations associated with this name. It's good enough. It's consistent. And so I think that's why it's going to be what we end up going with. Yeah. You know, another naming topic, sort of on that subject of ambiguity and overloaded terms: if you pick a name for something that is overloaded with other meanings, you might invite critique because somebody thinks that your thing is something different. And the best example of this is C++ coroutines, about which I will make a bold claim. I think if the feature that is called C++ coroutines
Starting point is 00:09:50 had been given, the feature as landed that's called C++ coroutines, not the other proposals, the form that we got, if that facility had been given a different name, if the proposal had just called it async/await and not coroutines, if the word coroutines hadn't been a part of that proposal's title,
Starting point is 00:10:12 I think the feature would have shipped in C++17. Not with any technical changes to the semantics or the syntax. The same feature, just with different branding, I think would have shipped three years earlier and with a lot less controversy. And the reason for that is that we spent years in a place where people thought that the coroutines proposal, which was for this async/await, compiler-assisted facility, which lets you do things like generators, et cetera, lets you do suspendable and resumable functions, lazy functions, essentially, was competing with other proposals that also used the word coroutine. People had different meanings in their heads of what a coroutine was. There was a lot of discussion in those days of stackful versus stackless coroutines. So as opposed to an
Starting point is 00:11:15 async/await facility, the other form that you might think of is a library that provides you with a way to create a new stack frame object and then to context switch to it. These facilities are used frequently in creating lightweight threading systems. There's a proposal for this that's going through the committee now, and I think it's now called Fibers or something like that. But at some points in the life of these proposals, it was called coroutines as well. And that ended up holding
Starting point is 00:11:54 up all of these different proposals. A lot of these different proposals were not competing. It was not that you pick one of these two or three coroutine proposals. There were some that were actively competing, but there were a bunch of proposals, and we spent a bunch of years held up on a competition between different things because they happened to pick the same name. And if those things had picked different names that did not have different meanings to different people, I think that all of those features would have advanced a lot quicker. Yeah. So names do matter. Naming does matter. So I think what we're going to do is at the tail end of the story where we revealed it was move_only_function, well, that'll be the end of part one.
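For readers unfamiliar with what a "suspendable and resumable" or lazy function looks like, here is a minimal sketch using a Python generator as an analogy. It is not C++ coroutine code, and the function name `countdown` is made up for this illustration. It only shows the general shape of the compiler-assisted style discussed above: the function body pauses at each `yield` and resumes from that point when the caller asks for the next value.

```python
from typing import Iterator


def countdown(start: int) -> Iterator[int]:
    """Suspend at each yield and resume from the same point on the next request."""
    current = start
    while current > 0:
        yield current      # execution pauses here until the caller asks again
        current -= 1


gen = countdown(3)         # nothing runs yet: the body is lazy
first = next(gen)          # runs up to the first yield
rest = list(gen)           # resumes and drains the remaining values
print(first, rest)         # 3 [2, 1]
```

The other meaning of the word mentioned above, a library that creates a separate stack and context switches to it, is a different facility and is not what this sketch models.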
Starting point is 00:12:43 So right now you're listening to part two, and we'll probably continue this discussion because there were a bunch of things too. I'm not sure, maybe in the midst of the stories that have been shared you have revealed what you think is one of the most surprising namings, but I'll save that for our next recording. And then also I wanted to mention some stuff about naming patterns. Oh, yeah. I got like three things written down on my notepad. Let's do it.
Starting point is 00:13:14 Let's do it. Well, we'll do it in the next recording. No, no, no. Come on. We're warm. We're warm. I need to go in like 12 minutes. Give me those 12 minutes.
Starting point is 00:13:26 Give me 12 minutes. All right. All right. We can keep going. So first off, what is, do you have one off the top of your head? So I'll share what mine is. And if you've watched my talks, you've probably seen this before. So in C++, our reduction is called std::accumulate, which is definitely not the
Starting point is 00:13:46 best name as far as reduction algorithm names go. And then we call our scan partial_sum, at least in C++11. We have parallel versions with different restrictions in C++17 called std::reduce, std::inclusive_scan, and std::exclusive_scan, which are much better names in my opinion. But it's not just Python now. If there are Python developers listening, probably not, but if you're working on the core team: Python needs to pay so much attention to what they name their stuff, because other languages, whether it's in the core language, the standard library, or just packages, so many languages just copy what Python does. And an example of this is Python's accumulate.
Starting point is 00:14:36 So Python also has an algorithm called accumulate. But it is not a reduction. It is a scan. So C++'s reduction is called accumulate, and Python's accumulate is our partial_sum. And another programming language went ahead and stole this or copied it, and that language is Julia. Their generic scan is called accumulate, and when it comes to their scan names, they're awful, in my opinion, for multiple reasons. They might not have copied Python; at least that's the connection I see. But Julia is sort of a flavor of the MATLAB and Octave sort of scientific and numerical languages.
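The difference between a reduction and a scan can be seen in a few lines of Python. This sketch uses `functools.reduce` for the reduction (the role `std::accumulate` plays in C++) and `itertools.accumulate` for the scan (the role `std::partial_sum` plays in C++). The variable names are chosen only for this example.

```python
from functools import reduce
from itertools import accumulate
from operator import add

values = [1, 2, 3, 4]

total = reduce(add, values)                 # reduction: a single result
running_totals = list(accumulate(values))   # scan: every intermediate result

print(total)           # 10
print(running_totals)  # [1, 3, 6, 10]
```

Someone arriving from C++ and reaching for `accumulate` in Python gets a sequence of running totals rather than a single value, which is the surprise described above.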
Starting point is 00:15:26 But they have their generic version of a scan called accumulate. And then they have a bunch of specializations which actually have the binary operation hardcoded. So they have cumprod, which is a multiplication scan. They have cumsum, which is an addition scan or a plus scan. And I think there are a couple of other variants. But that really bothers me because there's no symmetry between the names of the specialized versions of the scans and the generic scan.
Starting point is 00:15:57 I would prefer it if it was just plus accumulate and product accumulate or some version of that. And a really good example of this is in an APL derivative of a derivative called Q, which is a finance language. Or rather, it's an array programming language that's primarily used in finance domains. But I think F1 racing also uses it a bit, and some other physics and crazy stuff. They have explicit versions and specializations of their algorithms. But they're called sum and sums, max and maxes, prod, or prd, and prds. So the specialized reduction is always called max, sum, or prd, which is three characters. And then the scan
Starting point is 00:16:46 version of that just adds an s to the end, and I think that's absolutely beautiful, because the symmetry across the specializations is perfect. You always know the three-character name is a reduction, and then when you add an s to that you get the scan version. I don't know, I'm not a huge fan of super short names. I mean, but you have to understand that three characters or four characters for an APL derivative, that's a generous number of characters for a language name. In fact, Arthur Whitney, who is Ken Iverson's protege, went and wrote K0 through K9. So there's like eight or nine different dialects of K.
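The naming symmetry praised here (a short name for the reduction, the same name plus "s" for the scan) can be sketched in Python. These functions are hypothetical illustrations of the naming pattern, not Q code; `sum_` and `max_` carry a trailing underscore only to avoid shadowing Python's built-ins.

```python
from functools import reduce
from itertools import accumulate
from operator import add, mul
from typing import List, Sequence


def sum_(xs: Sequence[int]) -> int:
    return reduce(add, xs)


def sums(xs: Sequence[int]) -> List[int]:
    return list(accumulate(xs, add))


def prd(xs: Sequence[int]) -> int:
    return reduce(mul, xs)


def prds(xs: Sequence[int]) -> List[int]:
    return list(accumulate(xs, mul))


def max_(xs: Sequence[int]) -> int:
    return reduce(max, xs)


def maxes(xs: Sequence[int]) -> List[int]:
    return list(accumulate(xs, max))


data = [3, 1, 4, 1, 5]
print(sum_(data), sums(data))   # 14 [3, 4, 8, 9, 14]
print(prd(data), prds(data))    # 60 [3, 3, 12, 12, 60]
print(max_(data), maxes(data))  # 5 [3, 3, 4, 4, 5]
```

In each pair the last element of the scan equals the reduction, and the name alone tells you which of the two you are getting.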
Starting point is 00:17:33 K4 is basically Q, but Q is just K4 plus words. So K is just like an ASCII APL, so single character algorithms and algorithm names. And a company called First Derivatives said that they would pay for K4. They wanted to buy it, but they wanted words added, which is like antithetical to what Arthur Whitney believes in. Like he was just like over my dead body. And then they said, we'll pay you $100 million. And he said, okay. And that's a true story.
Starting point is 00:18:08 That's not, that is, it's a 50 kilobyte program. And in the APL community, it's regarded as like the most expensive piece of software in the world because it's $2 million per kilobyte of code.
Starting point is 00:18:22 Anyways, anecdote over. What's your worst naming, or not worst case, but just the most surprising? Like, when I came across the Python accumulate I was like, fantastic, they've got a scan, and then I typed some code and I was like, what is this? That was probably the most surprised I've been. Almost nobody's going to know about this. This is in C++, in the C++ standard library. std::messages. What do you think it does?
Starting point is 00:19:01 std::messages. First of all, have you ever heard of it? No, I have not. I don't even know what header it's in. If you tell me the header, will I have a better idea? You will never guess the header, but let's see how far you get. I give up already. I want to say it's...
Starting point is 00:19:20 I have no idea. std::messages? What do you think it has to do with? Like, what would you imagine that a thing called std::messages has to do with? I don't know. std::messages. I mean... Some helper function in, like, iostream or something like that?
Starting point is 00:19:44 I don't know. Some helper function for, like, iostreams. You're in the right family. If I,
Starting point is 00:19:53 if you told me that there was a thing called std::messages and I didn't know anything more about it, I would probably guess that it has something to do with a communication library or something like that. "The class template std::messages is a standard locale facet that encapsulates retrieval of strings from message catalogs, such as the ones provided by GNU gettext or by POSIX catgets." It is a part of locales. I'm not even entirely sure what those things are, but yeah. That was, that is a name.
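For anyone equally unsure what "retrieval of strings from message catalogs" means, here is a rough Python sketch of the idea. The `MessageCatalog` class is a hypothetical stand-in written for this illustration; it does not reproduce the std::messages interface or any real gettext or catgets API. It only shows the core operation: look up a translated string by id, and fall back to a default when the catalog has no entry.

```python
from typing import Dict


class MessageCatalog:
    """Hypothetical stand-in for a message catalog: maps message ids to text."""

    def __init__(self, messages: Dict[int, str]) -> None:
        self._messages = messages

    def get(self, msg_id: int, default: str) -> str:
        """Return the translated string, or the default if the id is missing."""
        return self._messages.get(msg_id, default)


french = MessageCatalog({1: "Bonjour", 2: "Au revoir"})
print(french.get(1, "Hello"))      # Bonjour
print(french.get(99, "Unknown"))   # Unknown
```

So the "messages" in the name are user-facing strings that a program translates per locale, not messages passed between programs.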
Starting point is 00:20:31 That is a name. And like, it just squatted on that territory. It just said, hey, here's a fairly good name for a thing, like message, and I'm just going to claim that for all time. Yeah, that's a land grab for sure. Yeah. Yeah. Tweet at us or LinkedIn at us if you knew of that. I'm curious how many people actually were aware of std::messages. We should do user feedback and Sean's tweet. Okay, yeah.
Starting point is 00:21:11 So the one piece of, I'm not sure if it was feedback, but it was a tweet we got in the last 24 hours from when we're recording this, is from Peter Larson on Twitter, who said: I listened to the first seven episodes between Christmas and New Year. I can strongly recommend this podcast if
Starting point is 00:21:29 you're interested in programming. You get information and feel in very good company. So thanks, Peter, that's a very kind thing of you to say. And I mean, it's an endorsement, but if you're listening to this endorsement, you're already listening to the podcast, so I'm not sure how we're supposed to get that endorsement out there. And the other thing we wanted to highlight is, well, I don't know, what are we highlighting about this? Just me being a fanboy, man. Yeah, so far this is the highlight of the year. We'll see at the end of 2021 when we do our annual retro. I'll try to remember to cut this in when we're doing the retro in a year.
Starting point is 00:22:12 But yeah, Sean Parent tweeted at myself and Bryce saying, "that was a lot of APL," with a photo of a hand holding a glass of eggnog. So, yes, highlight of the year. I'm not sure if you've got any thoughts that you want to share on that. I'll just, I feel like I should let Sean know: don't worry, Conor no longer lives in the same city as you. I actually, you know what I said to my partner? I said that there's a chance that Bryce paid Sean to do that as like a Christmas gift.
Starting point is 00:22:52 I did not. Sean tells the best programming lore stories of anybody I've ever met by, like, an order of magnitude. Can confirm. Can confirm. really cool companies and just happened to be there like at very interesting times and just has all these great and amazing stories. Yeah. I've, I've had the privilege of having a couple of lunches and dinners with Sean and yeah,
Starting point is 00:23:40 like it's, he'll be telling a story and then you know, it's, and then we walked into the meeting room at Apple and Steve Jobs was there and he was a bit upset. Wait, what? You walked into a meeting with Steve Jobs? Well, it's only a couple of times. And then you realize that in all the books, if you've read about Steve Jobs, I think Isaacson, he wrote like a pretty big one, you know, adjacent to that whole, you know, story that is Steve Jobs, you know, Sean Parent, for those of you that don't know,
Starting point is 00:24:13 he works at Adobe. And Adobe has a very long history with Apple. And yeah, the stories are just, they're epic. Yeah. So maybe we'll have to start, we'll see if we can convince him, we'll have to start a series where once a month we bring Sean on and then we just say, tell us the story of, and we turn off our mics and we just let Sean talk for an hour or so. Yeah, I don't hate that idea. I don't hate that idea at all. I feel like that's a podcast hack, though. I'm not above using hacks. You've got some hat hair there, buddy. Oh, yeah.
Starting point is 00:24:53 Conor, every time we've recorded this podcast and every time I've seen Conor, and we have a separate Zoom call like once a week, he's always wearing the same cap on backwards. Okay, now he's wearing it forwards. That's because I'm cool. That's what the cool kids do. I haven't seen Conor's hair in like a year. And with that, we'll wrap up our second episode on naming. We hope you enjoyed and we'll see you in the next episode.
