Algorithms + Data Structures = Programs - Episode 9: C++ vs Clojure partition (Naming - Part 2)
Episode Date: January 22, 2021
In this episode, Bryce and Conor wrap up the std::move_only_function saga and continue their discussion on naming.
Date Recorded: 2021-01-09
Date Released: 2021-01-22
C++ std::partition
Clojure partitionf...ilter across languages
C++20 views::filter
Haskell filter
An example of D’s amazing docs
any_invocable Proposal
C++ Coroutines
std::accumulate
std::partial_sum
Python accumulate
Julia accumulate
Q Language
Q sum & sums
Arthur Whitney
k programming language
Kx Acquired for $100M (note $40M of stock was purchased previously)
std::messages
Sean Parent’s APL tweet
Isaacson’s Steve Jobs
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
And then they said, we'll pay you $100 million.
And he said, okay.
And that's a true story.
That is, it's a 50-kilobyte program.
And in the APL community, it's regarded as the most expensive piece of software in the world because it's $2 million per kilobyte of code.
Welcome to ADSP The Podcast, Episode 9, recorded on January 9th, 2021.
My name is Conor, and today with my co-host Bryce, we wrap up the saga of std::move_only_function and continue our discussion on naming.
It's not just symmetry within a language, but it's symmetry across languages.
One of the best examples is partition.
In most languages, partition is very similar to what we have in C++, where you give it a unary predicate and it partitions the structure so that the elements that pass the predicate are at the front and those that fail are at the back. Many languages have something called partition. Some of them vary slightly because instead of getting back a single vector, or whatever data structure you had, you might get back a tuple of two lists. So they split it, and they haven't joined it back together. But Clojure has partition, which is one of the Lisps I mentioned, and it does something completely different. It's the equivalent of an algorithm called sliding or windowed. And I just think mistakes like that, where there is an extreme amount of prior art and all you have to do is go and look, it's unfortunate when that happens.
And then there's the argument to be made: well, what if all the prior art is bad? Some people say that filter is a bad name, because do you filter in or do you filter out? And I agree with that argument. There is some ambiguity there. But I don't think I found a single language that doesn't filter in.
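To make the three meanings of "partition" described above concrete, here is a minimal Python sketch. The function names are hypothetical and chosen only for illustration; they are not the APIs of C++, Clojure, or any other language. The first helper keeps element order for readability, which is closer to std::stable_partition than to std::partition.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def predicate_partition(pred: Callable[[T], bool], items: Sequence[T]) -> List[T]:
    """C++-flavoured meaning: one sequence, passing elements first.

    Hypothetical helper; order is preserved within each group.
    """
    return [x for x in items if pred(x)] + [x for x in items if not pred(x)]


def split_partition(pred: Callable[[T], bool], items: Sequence[T]) -> Tuple[List[T], List[T]]:
    """Tuple-of-two-lists meaning: (elements that pass, elements that fail)."""
    return [x for x in items if pred(x)], [x for x in items if not pred(x)]


def windowed_partition(n: int, step: int, items: Sequence[T]) -> List[List[T]]:
    """Sliding/windowed meaning: windows of length n, starting every `step`
    elements; an incomplete trailing window is dropped."""
    return [list(items[i:i + n]) for i in range(0, len(items) - n + 1, step)]


def is_even(x: int) -> bool:
    return x % 2 == 0


front_back = predicate_partition(is_even, [1, 2, 3, 4, 5, 6])
passed, failed = split_partition(is_even, [1, 2, 3, 4, 5, 6])
windows = windowed_partition(2, 1, [1, 2, 3, 4])
```

The same word covers a reordering, a split, and a windowing operation, which is the cross-language asymmetry being criticized.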
So if you have, in Haskell, filter odd, it keeps all the odd elements, and the ones that fail to meet the predicate get excluded.
Almost every single language that I found works this way.
And so I would argue that in a new programming language, don't go and reinvent the wheel.
Just choose filter.
Some people say keep_if is a better name.
Potentially, that is a better name.
But do you really want to... I'm not sure.
What are your thoughts on filter versus keep_if?
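As a small illustration of "filter filters in", Python's built-in filter behaves the same way as the Haskell example: elements for which the predicate is true are kept. The keep_if wrapper below is a hypothetical alias written only to show what the alternative name would look like; it is not a standard function.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def is_odd(n: int) -> bool:
    return n % 2 == 1


numbers = [1, 2, 3, 4, 5, 6]

# Built-in filter keeps the elements that satisfy the predicate.
kept = list(filter(is_odd, numbers))


def keep_if(pred: Callable[[T], bool], items: Iterable[T]) -> List[T]:
    """Hypothetical alias: same behaviour as filter, different name."""
    return list(filter(pred, items))


kept_again = keep_if(is_odd, numbers)
```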
Well, I think when you want to break from existing precedent, you need to have a really strong case.
The default should be to stick with the existing precedent. Because even if it's a bad name, if it's a bad name that everybody knows to have this meaning, if all these other programming languages call this thing filter, and if everybody's going to know what that means, then yeah, maybe it's bad. But if you introduce a new name, you're going to have to teach that to people. I don't know. I do get the argument about whether you're filtering in or filtering out, and there is some ambiguity there. So what's the typical meaning of filter?
It filters in.
So yeah, like I said, in Haskell, or insert any language, because they all call it filter. C++20 ranges has a view called filter. If you call filter, pass it your list, and then pass it your predicate, anything that returns true for the predicate is kept, which is why keep_if is the proposed name. And so a potential rebuttal to when you say, oh, well, now you have to teach the new thing: I'm sure someone who votes for keep_if would say, well, you don't even need to teach what keep_if means. It's obvious. You keep it if it meets the predicate.
My argument against renaming it is the discoverability of it. If I'm going to some new language X and I want to filter something, I'm going to go to new language X and search the docs for filter. And potentially, if the docs are really well written, that's fine. The D language's docs are fantastic because they put at the bottom of every function, you know, "in other languages, this is known as," and then a list, maybe not comprehensive, but I've seen up to six or seven different names for an algorithm. I think those docs are fantastic. So if you're doing that, and you say this is known in most languages as filter, and it comes up in the docs really quickly, that's not really a big issue. But I've definitely, when going to other programming languages and looking for an algorithm that does partition, and it's called something else, found it sometimes tricky to track that stuff down.
I would say, if I was going to pick better names, I'd probably be inclined to go with something like filter_in or filter_out instead of going to something like keep_if. So not a plain filter, but an explicit filter_in and filter_out, or just an explicit filter_in.
Just to retain some of the familiarity of the existing name.
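A rough sketch of what an explicit pair could look like, in Python. The names filter_in and filter_out are hypothetical, taken from the suggestion above; they are built here on the real built-in filter and on itertools.filterfalse, which is the standard library's "filter out" counterpart.

```python
from itertools import filterfalse
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def filter_in(pred: Callable[[T], bool], items: Iterable[T]) -> List[T]:
    """Hypothetical name: keep the elements for which pred is true."""
    return list(filter(pred, items))


def filter_out(pred: Callable[[T], bool], items: Iterable[T]) -> List[T]:
    """Hypothetical name: drop the elements for which pred is true."""
    return list(filterfalse(pred, items))


def is_negative(n: int) -> bool:
    return n < 0


readings = [3, -1, 4, -1, 5]
negatives = filter_in(is_negative, readings)
non_negatives = filter_out(is_negative, readings)
```

Both names still contain the word filter, so a search of the docs for "filter" would find them, which is the familiarity argument being made.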
You know?
Yeah.
But it's...
So did you mention what the voted name was?
I don't even think, at the very end of your std::function_ref and std::...
Oh, no, I don't think I did.
What was the name that got voted in?
We settled on the name. Well, it's pending a final vote, but I'm pretty sure that we're going to move forward with this name, which is move_only_function. The top two contenders after a few rounds of voting were any_function and move_only_function. And the argument for any_function was that we want to establish this naming pattern for type-erased things. And okay, function isn't the name of a concept, but any_invocable really would have established the pattern for the concepts. And as I mentioned, with the way that C++ works by case law, oftentimes this is how you actually establish a guideline in C++: you get the first thing to follow the pattern, and then subsequent things just build upon it. So that was the argument, I think, in favor of any_function. There were also people who argued that the name shouldn't reflect the move-onlyness of it or the movability of it. But move_only_function won the day.
And I think it's a good enough name. When we first met in December, it was very much not clear whether we were going to get consensus on a name. And we had not only a telecon meeting, but there were also mailing list discussions, which I think must have exceeded 300 emails in length. It probably was more than that. But I think what happened was a lot of people had strong opinions, but at the end of the day, folks realized that they would rather have this feature with a good enough name, or at least a name that's not actively bad, than not have this feature while holding out for what they believe is the ideal name.
So I think at the end of the day, not everybody thinks move_only_function is the perfect name, but I think that we all agree that it's a name that is good enough. And more importantly, it's not actively bad. It does not have any active consistency problems. It fits in nicely with the other names. It doesn't answer or resolve this question of how we should name type-erased facilities, type-erased wrappers for concepts. It sort of punts on that question. It doesn't embed a particular use case into the name as, say, callback or some of the other suggested names may have. So there were no large problems or negative connotations associated with this name. It's good enough. It's consistent. And so I think that's why it's going to be what we end up going with.
Yeah.
You know, another naming topic, sort of on that subject of ambiguity and overloaded terms: if you pick a name for something that is overloaded with other meanings, you might invite critique because somebody thinks that your thing is something different.
And the best example of this is C++ coroutines, about which I will make a bold claim. I think if the feature that is called C++ coroutines, the feature as landed, not the other proposals, the form that we got, if that facility had been given a different name, if the proposal had just called it async/await and not coroutines, if the word coroutines hadn't been a part of that proposal's title, I think the feature would have shipped in C++17. Not with any technical changes to the semantics or the syntax. The same feature, just with different branding, I think would have shipped three years earlier and with a lot less controversy.
And the reason for that is that we spent years in a place where people thought that the coroutines proposal, which was for this async/await, compiler-assisted facility, which lets you do things like generators, etc., lets you do suspendable and resumable functions, lazy functions, essentially, was competing with other proposals that also used the word coroutine. People had different meanings in their heads of what a coroutine was. There was a lot of discussion in those days of stackful versus stackless coroutines. So as opposed to an async/await facility, the other form that you might think of is a library that provides you with a way to create a new stack frame object and then to context switch to it. These facilities are used frequently in creating lightweight threading systems. There's a proposal for this that's going through the committee now, and I think it's now called fibers or something like that. But at some points in the life of these proposals, it was called coroutines as well. And that ended up holding up all of these different proposals. A lot of these different proposals were not competing. It was not "you pick one of these two or three coroutine proposals." There were some that were actively competing, but there were a bunch of proposals, and we spent a bunch of years held up on a competition between different things because they happened to pick the same name.
And if those things had picked different names that did not have different meanings to different people, I think that all of those features would have advanced a lot quicker.
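For readers unfamiliar with the "suspendable and resumable function" idea mentioned above, here is a small Python analogy using a generator. This is not C++ coroutine syntax and says nothing about how the C++ feature is implemented; it only shows the general shape of a function that pauses at each yield and resumes where it left off. The function name countdown is made up for the example.

```python
from typing import Iterator


def countdown(start: int) -> Iterator[int]:
    """A lazy function: nothing runs until a value is requested."""
    current = start
    while current > 0:
        yield current  # suspend here and hand a value to the caller
        current -= 1   # execution resumes here on the next request


gen = countdown(3)
first = next(gen)   # runs until the first yield
second = next(gen)  # resumes, decrements, runs until the next yield
rest = list(gen)    # drains whatever is left
```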
Yeah.
So names do matter.
Naming does matter.
So I think what we're going to do is, the tail end of that story, where we revealed it was move_only_function, that'll be the end of part one.
So right now you're listening to part two, and we'll probably continue this discussion because there were a bunch of things too. I'm not sure, maybe in the midst of the stories that have been shared you have revealed what you think is one of the most surprising namings, but I'll save that for our next recording. And then also I wanted to mention some stuff about naming patterns.
Oh, yeah.
I got like three things written down on my notepad.
Let's do it.
Let's do it.
Well, we'll do it in the next recording.
No, no, no.
Come on.
We're warm.
We're warm.
I need to go in like 12 minutes.
Give me those 12 minutes.
Give me 12 minutes.
All right.
All right.
We can keep going.
So first off, what is... do you have one off the top of your head?
So I'll share what mine is.
And if you've watched my talks, you've probably seen this before.
So in C++, our reduction is called std::accumulate, which is definitely not the best name as far as reduction algorithm names go. And then we call our scan std::partial_sum, at least in C++11. We have parallel versions with different restrictions in C++17 called std::reduce, std::inclusive_scan, and std::exclusive_scan, which are much better names in my opinion.
But Python, and if there are Python developers listening, probably not, but if you're working on the core team: Python needs to pay so much attention to what they name their stuff, because other languages, whether it's in the core language, the standard library, or just packages, so many languages just copy what Python does.
And an example of this is Python's accumulate.
So Python also has an algorithm called accumulate.
But it is not a reduction.
It is a scan.
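The difference is easy to see in a few lines of Python. functools.reduce folds a sequence down to a single value (the role std::accumulate plays in C++), while itertools.accumulate yields every intermediate result (the role std::partial_sum plays). The variable names are illustrative only.

```python
import operator
from functools import reduce
from itertools import accumulate

values = [1, 2, 3, 4]

# Reduction: a single result.
total = reduce(operator.add, values)

# Scan: one running result per input element.
running_totals = list(accumulate(values))

# accumulate also accepts a binary operation, like a generic scan.
running_products = list(accumulate(values, operator.mul))
```

The last element of the scan equals the reduction, which is one way to remember how the two relate.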
So C++'s reduction is called accumulate, and Python's accumulate is our partial_sum. And another programming language went ahead and stole this, or copied it, and that language is Julia. Their generic scan is called accumulate, and when it comes to their scan names, they're awful, in my opinion, for multiple reasons.
And they might not have copied Python.
At least that's the connection I see.
But Julia is sort of a flavor of the MATLAB and Octave sort of scientific and numerical languages.
They have their generic version of a scan called accumulate.
And then they have a bunch of specializations which actually have the binary operation hardcoded.
So they have cumprod, which is a multiplication scan.
They have cumsum, which is an addition scan or a plus scan.
And I think there's a couple other variants.
But that really bothers me because there's no symmetry between the names of the specialized versions of the scans and the generic scan.
I would prefer it if it was just plus accumulate and product accumulate or some version of that.
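A sketch of the symmetric naming being wished for, written in Python rather than Julia. The names plus_accumulate and product_accumulate are hypothetical, taken from the suggestion above; they simply hard-code the binary operation on top of the generic itertools.accumulate.

```python
import operator
from itertools import accumulate
from typing import Iterable, List


def plus_accumulate(items: Iterable[int]) -> List[int]:
    """Hypothetical name: generic scan with addition hard-coded."""
    return list(accumulate(items, operator.add))


def product_accumulate(items: Iterable[int]) -> List[int]:
    """Hypothetical name: generic scan with multiplication hard-coded."""
    return list(accumulate(items, operator.mul))


data = [1, 2, 3, 4]
plus_scan = plus_accumulate(data)
product_scan = product_accumulate(data)
```

With this scheme the specialized names share a stem with the generic one, so knowing accumulate is enough to guess the rest.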
And a really good example of this is in an APL derivative of a derivative called Q, which is a finance language.
Or it's an array programming language that's primarily used in finance domains.
But I think F1 racing also uses it a bit, and some other physics and crazy stuff. They have explicit versions and specializations of their algorithms.
But they're called sum and sums, max and maxes, prod or prd and prds.
So the specialized reduction is always called max, sum, or prd, which is three characters.
And then the scan version of that just adds an s to the end, and I think that's absolutely beautiful, because the symmetry across the specializations is perfect. You always know the three-character name is a reduction, and then when you add an s to that, you get the scan version.
I don't know, I'm not a huge fan of super short names.
I mean, but you have to understand that three or four characters for an APL derivative, that's a generous number of characters for a language name. In fact, Arthur Whitney, who is Ken Iverson's protege, he went and wrote K0 through K9.
So there's like eight or nine different dialects of K.
K4 is basically Q, but Q is just K4 plus words.
So K is just like an ASCII APL, so single-character algorithms and algorithm names.
And a company called First Derivatives said that they would pay for K4.
They wanted to buy it, but they wanted words added, which is antithetical to what Arthur Whitney believes in.
He was just like, over my dead body.
And then they said, we'll pay you $100 million.
And he said, okay.
And that's a true story.
That is, it's a 50-kilobyte program.
And in the APL community, it's regarded as the most expensive piece of software in the world because it's $2 million per kilobyte of code.
Anyways, anecdote over.
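Going back to the Q naming scheme described before the anecdote, the "reduction name plus an s gives the scan" pattern can be mimicked in Python. The helpers below are hypothetical stand-ins that borrow the short names for illustration; they are not Q code and make no claim about how Q implements these functions.

```python
import operator
from functools import reduce
from itertools import accumulate
from typing import List, Sequence


def prd(xs: Sequence[int]) -> int:
    """Reduction: product of all elements."""
    return reduce(operator.mul, xs, 1)


def sums(xs: Sequence[int]) -> List[int]:
    """Scan counterpart of the built-in sum."""
    return list(accumulate(xs, operator.add))


def maxs(xs: Sequence[int]) -> List[int]:
    """Scan counterpart of the built-in max (running maximum)."""
    return list(accumulate(xs, max))


def prds(xs: Sequence[int]) -> List[int]:
    """Scan counterpart of prd (running product)."""
    return list(accumulate(xs, operator.mul))


xs = [2, 1, 4, 3]
reductions = (sum(xs), max(xs), prd(xs))
scans = (sums(xs), maxs(xs), prds(xs))
```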
What's your worst naming, or not worst, but just the most surprising? So when I came across the Python accumulate, I was like, fantastic, they've got a scan. And then I typed some code and I was like, what is this? That was probably the most surprised I've been.
Almost nobody's going to know about this.
This is in C++, in the C++ standard library.
std::messages.
What do you think it does?
std::messages.
First of all, have you ever heard of it?
First of all, have you ever heard of it?
No, I have not.
I don't even know what header it's in.
If you tell me the header, will I have a better idea?
You will never guess the header, but let's see how far you get.
I give up already.
I want to say it's...
I have no idea.
std::messages?
What would you imagine that a thing called std::messages has to do with?
I don't know.
std::messages.
I mean...
Some helper function in iostream or something like that?
I don't know.
Some helper function for IO streams.
You're in the right family.
If you told me that there was a thing called std::messages and I didn't know anything more about it, I would probably guess that it has something to do with a communication library or something like that.
The class template std::messages is a standard locale facet that encapsulates retrieval of strings from message catalogs, such as the ones provided by GNU gettext or by POSIX catgets. It is a part of locales.
I'm not even entirely sure what those things are, but yeah.
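For anyone equally unsure what "retrieval of strings from message catalogs" means, here is a toy Python model of the idea. It is loosely shaped after the open/get/close operations of std::messages, but the class ToyMessages, its in-memory catalogs, and the sample strings are all invented for illustration; it is not the C++ facet and does not read real gettext or catgets files.

```python
from typing import Dict, Tuple

Catalog = Dict[Tuple[int, int], str]


class ToyMessages:
    """Hypothetical stand-in: look up translated strings by (set, message) id."""

    def __init__(self, available: Dict[str, Catalog]) -> None:
        self._available = available
        self._open: Dict[int, Catalog] = {}
        self._next_handle = 0

    def open(self, name: str) -> int:
        """Return a handle for the named catalog, or -1 if it is unknown."""
        if name not in self._available:
            return -1
        handle = self._next_handle
        self._next_handle += 1
        self._open[handle] = self._available[name]
        return handle

    def get(self, handle: int, set_id: int, msg_id: int, default: str) -> str:
        """Return the catalog string, falling back to the caller's default."""
        return self._open.get(handle, {}).get((set_id, msg_id), default)

    def close(self, handle: int) -> None:
        self._open.pop(handle, None)


facet = ToyMessages({"greetings_de": {(1, 1): "Hallo", (1, 2): "Tschüss"}})
cat = facet.open("greetings_de")
hello = facet.get(cat, 1, 1, "Hello")
missing = facet.get(cat, 1, 99, "Thanks")  # not in catalog, default is used
facet.close(cat)
```

So the "messages" in the name are user-facing translated strings, not anything to do with communication or message passing.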
That is a name.
That is a name.
And it just squatted on that territory.
It just said, hey, here's a fairly good name for a thing, like messages. And I'm just going to claim that for all time in std.
Yeah, that's a land grab for sure. Tweet at us or LinkedIn at us if you knew of that. I'm curious how many people actually were aware of std::messages.
We should do user feedback and Sean's tweet.
Okay, yeah.
So the one piece of, I'm not sure if it was feedback, but it was a tweet we got in the last 24 hours from when we're recording this, is from Peter Larson on Twitter, who said, "I listened to the first seven episodes between Christmas and New Year. I can strongly recommend this podcast if you're interested in programming. You get information and feel in very good company." So thanks, Peter. That's a very kind thing of you to say.
And I mean, it's an endorsement, but if you're listening to this endorsement, you're already listening to the podcast. So not sure how we're supposed to get that endorsement out there.
And the other thing we wanted to highlight is, well, I don't know, what are we highlighting about this? Just that, so far, this has made my...
Just being a fanboy, man.
Yeah, so far this is the highlight of the year. We'll see at the end of 2021 when we do our annual retro.
I'll try to remember to cut this in when we're doing the retro in a year.
But yeah, Sean Parent tweeted at myself and Bryce saying, "That was a lot of APL," with a photo of a hand holding a glass of eggnog.
So, yes, highlight of the year.
I'm not sure if you've got any thoughts that you want to share on that.
I feel like I should let Sean know: don't worry, Conor no longer lives in the same city as you.
You know what I said to my partner?
I said that there's a chance that Bryce paid Sean to do that as a Christmas gift.
I did not. Sean tells the best programming lore stories of anybody I've ever met by, like, an order of magnitude.
Can confirm.
Can confirm. ... really cool companies and just happened to be there at very interesting times, and just has all these great and amazing stories.
Yeah.
I've had the privilege of having a couple of lunches and dinners with Sean, and yeah, he'll be telling a story, and then, you know, it's, "and then we walked into the meeting room at Apple and Steve Jobs was there and he was a bit upset."
Wait, what? You walked into a meeting with Steve Jobs?
"Well, it's only a couple of times."
And then you realize that in all the books, if you've read about Steve Jobs, I think Isaacson wrote a pretty big one, he's adjacent to that whole story that is Steve Jobs. Sean Parent, for those of you that don't know, works at Adobe. And Adobe has a very long history with Apple.
And yeah, the stories are just epic.
Yeah.
So yeah, maybe we'll have to start, we'll see if we can convince him, we'll have to start a series where once a month we bring Sean on, and then we just say, "tell us the story of," and we turn off our mics and we just let Sean talk for an hour or so.
Yeah, I don't hate that idea. I don't hate that idea at all.
I feel like that's a podcast hack, though.
I'm not above using hacks. You've got some hat hair there, buddy.
Oh, yeah.
Conor, every time we've recorded this podcast, and every time I've seen Conor, and we have a separate Zoom call like once a week, he's always wearing the same cap on backwards.
Okay, now he's wearing it forwards.
That's because I'm cool.
That's what the cool kids do.
I haven't seen Conor's hair in like a year.
And with that, we'll wrap up our second episode on naming.
We hope you enjoyed and we'll see you in the next episode.