Algorithms + Data Structures = Programs - Episode 10: snake_case vs camelCase (Naming - Part 3)
Episode Date: January 29, 2021
In this episode, Bryce and Conor complete the naming trilogy and talk about some of the most important questions in tech - indicated by the title.
Date Recorded: 2021-01-27
Date Released: 2021-01-29
CppC...ast Episode with Guy Davidson
Conor’s tweet as Guy about predicates
std::vector::empty()
std::is_empty()
std::filesystem::is_empty()
cudf::device_span
Ruby each_cons
Ruby each_with_index
Ruby 3.0 Static Typing
Crystal Programming Language
Crystal each_cons
Crystal each_with_index
snake_case, PascalCase, camelCase & kebab-case
Rename concepts to standard_case for C++20, while we still can
Julia Unicode Input
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons - Attribution 3.0 Unported - CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
We're not, we can't have, we can't have a 60 minute episode.
I think that was great. I think that that all should be part of this, this episode.
Naming. Part three.
Welcome to ADSP: The Podcast, episode 10, recorded on January 27th, 2021.
My name is Conor, and today with my co-host Bryce, we're going to be wrapping up part
three of our three-part series on naming.
So let me ask you about this.
So one thing we didn't talk about last time that I made a note
about was ordering in names. So let me give you a few examples of that. So the first is like
noun underbar verb versus verb underbar noun. Like as an example, do you call a function apply underbar policy
or policy underbar apply? And are those two even necessarily the same thing?
And what really gets me here is I sometimes see codebases that do this inconsistently. Like they have one function in
their API that follows the noun underbar verb, like policy underbar apply. And they have some
other function that follows the verb underbar noun pattern. And I don't particularly care which of those two you follow, as long as you're consistent.
I have a slight preference for phrasing it as something like verb underbar noun,
because I like my identifiers and my code to sort of read as like plain English.
So like something like apply policy,
like what does it do?
It applies the policy.
So do you have any thoughts on that?
Well, so this is,
this is great because this is like
completely adjacent to one of the things
I wanted to talk about.
So to first answer your question,
I've never actually thought about this in terms of noun, underscore, verb, verb, underscore, noun.
But in thinking how I would just write code, I think if I were to go look back, it's always verb, underscore, noun.
Especially because, like, I started my career in a mathematical domain, actuarial science. So there
was a lot of like, calculate underscore reserve, or calculate underscore premium, or determine
underscore, you know, insert something. So I think, inevitably, yeah, if I look back, it was
almost always verb underscore noun. But taking a step back, one thing that I thought was great, because I hadn't thought about it at the time, I heard it from Guy Davidson on a CppCast episode, is the idea of naming functions as verbs or nouns based on whether they are modifying or non-modifying functions. So there's a very classic pattern in Java and C++,
and I've even seen it in like Herb Sutter's talks
where he shows some reflection example
of where we're generating methods for you,
where you generate these getters and setters.
So you have, you know, you have get value and set value.
And Guy's point was that you shouldn't ever need
to prefix something with get underscore.
If you're just retrieving something, you're not modifying.
If it's a non-modifying method, just drop the get.
And anytime you have a modifying method, you should be using a verb.
Amen to that.
There's nothing I hate more than seeing a function that starts with get, you know, underbar.
It's just like, it's unnecessary symbols.
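A minimal sketch of the rule as described here, assuming a hypothetical `policy` class (the class and its member names are made up for illustration and do not come from the episode): the non-modifying accessor is a bare noun, and the modifying member is a verb.

```cpp
#include <cassert>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical class, used only to illustrate the naming rule.
class policy {
public:
    explicit policy(std::string name) : name_(std::move(name)) {}

    // Non-modifying: named as a noun, with no get_ prefix.
    const std::string& name() const { return name_; }

    // Modifying: named with a verb.
    void rename(std::string new_name) { name_ = std::move(new_name); }

private:
    std::string name_;
};

// Exercises both members.
inline void demo_policy() {
    policy p{"default"};
    assert(p.name() == "default");
    p.rename("strict");
    assert(p.name() == "strict");
}

}  // namespace sketch
```

At the call site, `p.name()` reads as a question about the object and `p.rename(...)` reads as a command, which is the distinction being drawn.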
Which was so, so that's like when it comes to verbs and nouns,
when you started talking about that, I was like, oh,
is Bryce about to make my point for me?
So I'm, this is not, it's not, I just adopted, you know,
the first time I heard it was Guy.
He probably heard it from someone else, but I really liked that.
Don't judge Guy like that. For all you know, Guy was the brilliant genius that invented this.
It's true. Well, we'll give Guy credit even if he did, uh, we'll see. Yeah, maybe Guy will listen to this. We'll at him on Twitter and he can tell us if he heard it from someone or he made it up, because I think it's a great rule. And I also, I think on that...
Guy, for the record, I am choosing to assume that you made it up
and Conor's choosing to assume
that you just took it from somebody else.
Sure, sure, we'll stick with that.
But on that episode,
he actually made another point
how you should prefix predicates,
which I had never even heard of that term at the time
with is and has. Um,
and so I think I asked him, you know, uh, can you define predicate? And that's just a function
that returns a Boolean. Um, and, uh, that whenever you have that, you should prefix it with is and
has. Or I think what ended up happening is that I said, what about
functions like is even? Because technically is is a verb.
And he said, oh, any function that starts with an is or has is a predicate.
And that's an exception.
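A small sketch of the predicate convention, under the assumption that a predicate is simply a function returning `bool`. The functions and the `account` type are hypothetical examples, not code from the CppCast episode being discussed.

```cpp
#include <cassert>
#include <optional>

namespace sketch {

// Free-function predicates: each returns bool and starts with is_.
constexpr bool is_even(int n) { return n % 2 == 0; }
constexpr bool is_positive(int n) { return n > 0; }

// Hypothetical type with a has_ predicate as a member.
struct account {
    std::optional<double> premium;
    bool has_premium() const { return premium.has_value(); }
};

static_assert(is_even(4));
static_assert(!is_even(7));
static_assert(is_positive(1));

// Exercises the member predicate.
inline void demo_predicates() {
    account without{};
    account with{12.5};
    assert(!without.has_premium());
    assert(with.has_premium());
}

}  // namespace sketch
```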
Anyway, so this gets back to sort of the noun versus verb.
So you can comment on everything I just said.
But I think putting the verb first is like it makes it more clear that the whole thing together is a verb.
When you're saying, you know, calculate reserve.
Really, the key in those two words is calculate.
And if it's actually a free function, so it's not like a non-modifying or modifying, like ideally it's a pure function without side effects.
And in that case, actually, I don't know what I prefer.
Do I prefer calculate reserve or reserve?
Because if you call it reserve, you just pass it the arguments.
I don't know.
So that question about the predicate, or that remark about the predicate function, is actually interesting.
And that's one of the few patterns that's actually in the proposed recommendations for standard library design guidelines.
But as I mentioned a little earlier,
one of the conclusions we came to when considering C++ library design guidelines
in the C++ library design group last week was we want to try to avoid
decorating names whenever it's not necessary. And by decorating names, I mean adding prefixes or
suffixes just because a function or an identifier is some class of thing, sort of similar to the Hungarian naming convention where you embed the type of the thing into the name.
But one of the ones that we'd actually suggested did make sense was, you know, to have the is under bar for predicates. One of the examples that somebody
pointed out was the empty method on a lot of the standard library containers. It's kind of
ambiguous whether calling vector.empty empties the contents of the container or tells you whether
it's empty or not. And the standard library is actually like a little bit inconsistent here.
So as an example of that, C++'s std future has a valid method.
Now that one's pretty unambiguous, you know,
that that tells you whether it's valid or not,
unlike empty, which is a bit more ambiguous.
But the concurrency TS
proposes some additions to std future. And one of those additions is an is ready function.
And I just find it amusing that, you know, I could understand inconsistency between different
classes, but it really tickles me that there's a proposed inconsistency within this same class,
that we would have is ready, which would be one predicate, and valid, which would be another.
And I am inclined to say the world would have been better if std vector had called it is empty.
So maybe that's one of those rare cases where the decoration is really truly useful. So I thought that it was
a widely acknowledged fact that empty was a mistake and that going forward we were always
supposed to use is empty. I can't find it right now, but I know that there's two examples in the
standard that have actually started using is empty. One of them is std file system.
The other one I can't find right now.
You're absolutely right that that is the policy going forward,
and that's what we were discussing last week was codifying it into an actual policy document.
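To make the ambiguity concrete, here is a short sketch that uses only standard members that exist today: `std::vector::empty()` asks a question, `std::vector::clear()` is the member that actually removes the elements, and `std::future::valid()` is another predicate spelled without `is_`. The `is_ready` function mentioned above comes from the Concurrency TS and is left out because it is not part of `std::future`.

```cpp
#include <cassert>
#include <future>
#include <vector>

namespace sketch {

inline void demo_standard_predicates() {
    std::vector<int> v{1, 2, 3};
    assert(!v.empty());  // empty() only reports; it does not modify v
    assert(v.size() == 3);

    v.clear();           // clear() is the modifying member
    assert(v.empty());

    std::future<int> f;  // default-constructed, so it has no shared state
    assert(!f.valid());  // valid() is a predicate without the is_ prefix
}

}  // namespace sketch
```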
What was interesting, though, is that the other day I was code reviewing someone's code,
and we have, I think it's in the cuDF repository.
It's the open source library I work on.
We have a device span and it had an empty method.
And whenever I see that, I comment,
oh, let's prefer is empty.
And I have some links somewhere where I link to,
you know, this is a widely regarded mistake
and that going forward we should use is empty.
And I think it was-
Although if we're going to do that, we should probably give all of the C++ standard library things that currently
have an empty an additional is empty, so that people can
consistently just use is empty. You know, it's a pain if you have to remember, hmm, does this thing have an
empty or an is empty? And then it becomes even more of a pain in generic code.
That's true. So yeah, it would be nice to get those. But where I was going with that is that the person, I think
it was Vukasan, I hope he doesn't mind me shouting out his name, his response was, oh, we're keeping it
consistent with std span. And I was like, wait, what?
Span added empty?
I thought we were not doing that anymore.
Well, span probably did it for the reason I just mentioned,
which is span probably called it empty and not is empty
for consistency with the existing things, yeah.
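The generic-code pain mentioned a moment ago can be sketched like this. The `sketch::is_empty` adapter and the `device_buffer` type are hypothetical, not standard or cuDF facilities; the adapter prefers a member `is_empty()` and falls back to a member `empty()`, so generic code can use one spelling for both kinds of type.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {

// Detects a const-callable member is_empty() (C++17 detection idiom).
template <typename T, typename = void>
struct has_is_empty : std::false_type {};

template <typename T>
struct has_is_empty<T, std::void_t<decltype(std::declval<const T&>().is_empty())>>
    : std::true_type {};

// Hypothetical adapter: one name for generic code, whichever member the type has.
template <typename T>
bool is_empty(const T& c) {
    if constexpr (has_is_empty<T>::value) {
        return c.is_empty();
    } else {
        return c.empty();
    }
}

// Hypothetical type that chose the is_ spelling.
struct device_buffer {
    std::size_t size = 0;
    bool is_empty() const { return size == 0; }
};

inline void demo_is_empty() {
    std::vector<int> v;
    device_buffer b{4};
    assert(sketch::is_empty(v));   // forwards to v.empty()
    assert(!sketch::is_empty(b));  // forwards to b.is_empty()
}

}  // namespace sketch
```

The calls are namespace-qualified on purpose, since the standard library already uses the name `is_empty` for a type trait.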
See, this is the problem with consistency, is that sometimes you have to, like... I always say, and this is sort of like Bryce's law,
I'd rather be consistently wrong than inconsistent.
That's the idea. You know, like, why does span have empty and not is empty? Because it would rather be consistently wrong.
Like, I'm not going to have any flexibility
of thought on my opinions on matters. I'd rather just stick to my guns and be wrong than
eventually be inconsistent and change my mind? That sounds like an awful law.
I am happy with my law as stated.
Are you sure? I'm going to give you a couple episodes to think about whether you want
this to be your law. I will think about that. So I have another story, which I'll give a shout out
to my coworker, John Bichon, who was actually my coworker at NVIDIA and also my coworker at
Lawrence Berkeley National Lab, where we used to work. And we were both at Lawrence Berkeley. He was working on this asynchronous runtime system.
And whenever you have such a system, I worked on one myself called HPX.
You tend to have some set of functions that are asynchronous and that have some async decorator on them.
And in the codebases I had worked with,
the async always came at the front.
So you'd have async underscore reduce
or async underscore transform, et cetera.
But in John's codebase, if I recall correctly,
he would call it reduce underscore async, transform underscore async. And I one time asked him,
hey, why do you put it at the end, not the start? And he told me, well, he thinks of it sort of like, he wanted the most specific thing to be first.
So what's the most specific thing of an asynchronous reduction algorithm?
Well, it's of the family of reduction algorithms.
That's more fundamental than it being a member of the family of things
that are asynchronous. So he sort of made an argument that the ordering of things in the name
said something about sort of the grouping or the categorization of those things.
And I've also heard this claim made for things like code completion, that if you call
it async underscore reduce and async underscore transform, then when you type async underscore,
you're going to get almost everything in your library. Whereas if you do it the other way
around, then if you type reduce, you'll get auto-completion for reduce just the synchronous version and reduce underscore async
and whatever other functions you have that are some form of reduction. And so I thought that
was a very interesting notion, that you have something like namespaces, but even just within the identifier itself.
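A minimal sketch of the suffix ordering, with hypothetical functions in a made-up `sketch` namespace (this is not John's codebase or HPX). Both names start with `reduce`, so an alphabetical listing or a completion on `reduce` shows the family together. The sketch uses `std::launch::deferred` only to stay single-threaded; a real asynchronous runtime would schedule the work elsewhere.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

namespace sketch {

// Synchronous reduction: sums the elements.
inline int reduce(const std::vector<int>& xs) {
    return std::accumulate(xs.begin(), xs.end(), 0);
}

// Asynchronous variant with the decorator as a suffix.
// Deferred launch: the sum is computed when get() is called on the future.
inline std::future<int> reduce_async(std::vector<int> xs) {
    return std::async(std::launch::deferred,
                      [data = std::move(xs)] { return sketch::reduce(data); });
}

inline void demo_reduce() {
    assert(sketch::reduce({1, 2, 3}) == 6);
    std::future<int> f = sketch::reduce_async({1, 2, 3, 4});
    assert(f.get() == 10);
}

}  // namespace sketch
```

With the prefix ordering, the same pair would be `reduce` and `async_reduce`, and typing `async_` would list every asynchronous function in the library instead.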
Yeah, no, I've actually spent a lot of time thinking about that, because there are examples where it's an amazing design for that exact reason, that you
end up with like a list of algorithms that are alphabetically sorted, and then you see, you know,
zip underscore, zip underscore, zip underscore, and you're clearly like, oh, okay, these
are all doing roughly the same thing. However, you can go too far with it in some cases. So like in
Ruby, there is a collection of algorithms. I just looked it
up. So they each start with each underscore: each cons, each entry, each slice, each with index,
and each with object. And I don't know what all these do, but if I were to ask you,
what do you think each underscore cons does? Like, do you think you could guess?
And our listeners can play along if you want.
And the point that I'm about to make here is that, yes, they've
familyized these, you know, they put them in a group, but they've completely, in my opinion,
obfuscated what these do. And like, the same named algorithms in other languages are completely
different.
Interesting.
In fact, I'll give you a hint about each cons.
The cons is a reference to an old language.
Yes. So it's like a reference to Lisp, right? Like to the operation where it takes the tail of a list?
Well, cons is basically your procedure that creates a pair.
So in Lisp, your linked list is just a bunch of conses.
Oh, yeah, yeah, yeah.
But so I believe, I can double check just to make sure that I'm not wrong,
that each cons, you pass it an integer,
and it gives you the sliding window of that length. So like each cons
two will give you adjacent pairs, each cons three will give you a sliding window of length three
where the step is one. And so, yeah, this in most languages is called sliding or windows or
windowed. You know, Rust, I think, just added a couple different versions of this in their 1.48 or
whatever the version that they're on. But yeah, so I completely agree that there are times where
prefixing everything with the same prefix is a really good way to have people explore the
language and discover, oh, look, there's a very similar algorithm I could use. In other cases
like this, I think each with index isn't bad.
I think that's just basically, it's called...
Yeah, that makes more sense.
It's called different things in different languages.
The most common name is enumerate from Python,
where you just bundle each element with an index.
But yeah, in this case, I think they went too far.
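For listeners playing along, here is a rough C++ sketch of the two operations as they are described above: a sliding window of length n with step one (what each cons is said to produce) and pairing each element with its index (each with index, or enumerate). The function names `sliding` and `enumerate` are hypothetical and the functions return eager vectors for simplicity. Returning no windows when n is zero is an assumption of this sketch, not a statement about Ruby's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

namespace sketch {

// All contiguous windows of length n, advancing one element at a time.
inline std::vector<std::vector<int>> sliding(const std::vector<int>& xs, std::size_t n) {
    std::vector<std::vector<int>> windows;
    if (n == 0 || n > xs.size()) {
        return windows;  // assumption: no windows rather than an error
    }
    for (std::size_t i = 0; i + n <= xs.size(); ++i) {
        const auto first = xs.begin() + static_cast<std::ptrdiff_t>(i);
        windows.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
    }
    return windows;
}

// Each element bundled with its zero-based index.
inline std::vector<std::pair<std::size_t, int>> enumerate(const std::vector<int>& xs) {
    std::vector<std::pair<std::size_t, int>> out;
    out.reserve(xs.size());
    for (std::size_t i = 0; i < xs.size(); ++i) {
        out.emplace_back(i, xs[i]);
    }
    return out;
}

inline void demo_sliding() {
    const std::vector<int> xs{1, 2, 3, 4};
    const std::vector<std::vector<int>> pairs{{1, 2}, {2, 3}, {3, 4}};
    assert(sliding(xs, 2) == pairs);         // adjacent pairs
    assert(sliding(xs, 3).size() == 2);      // {1,2,3} and {2,3,4}
    assert(enumerate(xs)[2] == std::make_pair(std::size_t{2}, 3));
}

}  // namespace sketch
```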
And it's really unfortunate because Crystal,
which is actually a language I think I first heard about from you,
it's basically Ruby statically typed,
although I think Ruby actually just got static typing
in like Ruby something point something.
It's, yeah.
Thank you for that very informative comment.
Ruby static typing.
3.0, I'll guess.
The state of Ruby 3 typing.
I'll find out if it's been added and added to the show notes.
I could be mixing up Scala 3 and Ruby 3.
Because Scala 3, otherwise known as Dotty, is like...
See, folks, this is why you shouldn't learn...
This is why you should learn only a few languages,
not a lot.
Because then you don't get them mixed up.
Touche. Fair point.
Anyways, what was my point here?
All right, Crystal is a statically typed, supposed-to-have-the-performance-of-C version of Ruby.
But they basically just like verbatim took all the algorithm names from Ruby.
Whereas Elixir, which is basically like
Ruby plus Erlang, they looked at all the algorithms, took the good ones, and then renamed the bad ones.
So they didn't take any of the each underscore methods. They renamed them to like chunk and
better names, which I think is the better approach: learn from languages' mistakes, copy what they did right, and then, you know, iterate.
Yeah. What do you think about naming styles?
Well, so is this like snake case versus Pascal case versus kebab case?
What is kebab case? I don't know what kebab case is.
Yeah, well, kebab case is adjacent to...
Yeah, so there's...
Okay, let's answer this question.
So we've got snake case, Pascal case, camel case, and kebab case. Those are like the four main ones.
There's a couple other variants, but I think those are the four main ones.
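As a quick illustration of the four styles named here, this sketch converts a snake_case identifier into the other three. The helper names are hypothetical, and the code assumes ASCII identifiers whose words are separated by single underscores.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

namespace sketch {

// Splits "apply_policy" into {"apply", "policy"}.
inline std::vector<std::string> split_words(const std::string& snake) {
    std::vector<std::string> words;
    std::string current;
    for (char c : snake) {
        if (c == '_') {
            if (!current.empty()) words.push_back(current);
            current.clear();
        } else {
            current.push_back(c);
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}

// Upper-cases the first character of a word.
inline std::string capitalized(std::string word) {
    if (!word.empty()) {
        word[0] = static_cast<char>(std::toupper(static_cast<unsigned char>(word[0])));
    }
    return word;
}

// PascalCase: every word capitalized, no separators.
inline std::string to_pascal_case(const std::string& snake) {
    std::string out;
    for (const auto& w : split_words(snake)) out += capitalized(w);
    return out;
}

// camelCase: like PascalCase, but the first word stays lowercase.
inline std::string to_camel_case(const std::string& snake) {
    std::string out;
    bool first = true;
    for (const auto& w : split_words(snake)) {
        out += first ? w : capitalized(w);
        first = false;
    }
    return out;
}

// kebab-case: hyphens in place of underscores.
inline std::string to_kebab_case(const std::string& snake) {
    std::string out = snake;
    for (char& c : out) {
        if (c == '_') c = '-';
    }
    return out;
}

inline void demo_cases() {
    assert(to_pascal_case("apply_policy") == "ApplyPolicy");
    assert(to_camel_case("apply_policy") == "applyPolicy");
    assert(to_kebab_case("apply_policy") == "apply-policy");
}

}  // namespace sketch
```

The kebab-case result is not a valid C++ identifier, since the hyphen would parse as a minus sign.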
I think most people have heard of snake case.
It's the underscores in between words.
Pascal case, the start of each word is capitalized.
Camel case is Pascal case, but the first word is lowercase. And kebab case, for any Lispers out there, they'll know what kebab
case is: it's where you hyphenate the words. I actually like kebab case quite a bit,
although it does not work in a number of languages.
It definitely doesn't work in a number of languages. But I think snake case
is my preference. I really strongly dislike camel case. I just think it's like
super unreadable, and it is jarring to me for the first word to not be capitalized.
I just want to understand, why should the first word not be capitalized but the
rest of them should be? Pascal case I can be okay with, because like
it's consistently capitalized at the first letter of each word. Like, that's okay. But
no, I'm not okay with camel case.
I do not share... I mean, I'm surprised, actually, that you have such not-strong feelings about formatting, you know, you can do
without, but then for that first letter in camel case, that's where you draw the line.
Either capitalize or don't capitalize. Yeah. Is there a style where you don't
use underscores but you don't capitalize anything, where you just mush everything together? Does that
have a name?
All lowercase? I don't know. We'll let our listeners, if that's a thing,
tweet at us and we'll mention it.
What do you say? So in C++20, in the C++ standard for many years, the names for things that were the
moral equivalent of concepts, we didn't have concepts in the language, but we had names
for these sets of requirements.
Those names always used Pascal case, which was nice because it made them distinct from the standard library identifiers,
which all used snake case. And when we added concepts, we were originally going to add
the named concepts in the standard library as Pascal case, following the precedent of the named
requirements that had been in C++ for many years, like copy constructible or random access iterator.
Those names would have been spelled in Pascal case.
But then a fairly late decision was made
to make those concept names be snake case,
which has led, I think, to some unfortunate ambiguities.
So for example, std colon colon iterator is the name of a concrete type from like C++ 03 or C++ 11,
whereas std colon colon random access iterator is the name of a concept. Whereas
with the Pascal casing, you had this, it was very visually clear when something was a concept
versus a type because the concepts were in Pascal case and the concrete types were in snake case. But we decided we
didn't like that. And I personally think it was a mistake. I don't know, what do you think
about it? Do you think it's confusing that the concepts use the same naming pattern or
the same naming style as types? I did say earlier in this discussion that, you know,
I'm not a fan of, you know, these decorator type
patterns where you have a prefix or a suffix or a particular style for things. But I guess in this
case, I felt it would have been a good idea to have concepts be visually distinct from types
in the standard, in the C++ standard library. Yeah, I mean, I feel like that paper and that
decision got made at either the first or the
second committee meeting that I went to. And I didn't spend most of my time in LEWG, but I
did sit in on a couple of the discussions. I did not have anywhere near as strong an opinion
as the people in the room. I think the biggest argument, if I recall,
against mixing them was like,
this was breaking precedent for the first time ever.
And it's-
Although, but you could argue it was like,
it was going to break precedent either way
because it broke the precedent
of the existing naming pattern.
There was a thing that was referenced in the standard
called random access iterator in Pascal case.
It wasn't a programmatic concept
because those didn't exist yet.
But the name of it now is inconsistent
with what we used to refer to this set of requirements.
So I think, and I wasn't in the room
at the time of this discussion
because I was chairing a different room, but if I had been in the room, I would have made the argument that
there was going to be an inconsistency either way. And we should have gone with the one that
was going to be less ambiguous and that was going to follow what the existing name of these things
was.
Well, so although what you say is true, I think the argument was coming from trainers or people whose primary occupation is teaching C++, or who are very exposed to teaching C++.
And you mentioned that's the way it is in the standard.
I don't think the majority of C++ developers are working their way through the standard.
And then you can make the argument, well, it's not just the standard.
It's also in EoP, Elements of Programming.
I don't think it was just in the standard.
If you go and look, cppreference uses it.
I think there's a lot of code that,
there's a lot of pre-C++20 code that referred to some of the core concepts
from the standard using the Pascal case naming that the standard uses.
I'm sure there's a lot of code out there that says this type is copy constructible, Pascal
case. So, you know, yes, you can make all those arguments, but I think the argument, and so this
is the thing is I'm doing a poor job because I'm trying to make someone else's argument for them.
But I think their point was like, this is it's different. And
this is going to be one more thing that I have to teach that like, people are going to are, you know,
whether it's students, or whether it's professionals that are, you know, getting training,
or just learning the language for the first time, they're going to ask, you know, oh,
why is this different? And it's a short answer: it's a concept. But I think from the educator's point
of view like if they're if there's one thing that they don't need to teach or like they don't need
to add to a bullet list of points of like you know insert why is this different that's preferable
But now they do have to teach people that std colon colon iterator isn't the name of a
concept, even though std colon colon random access iterator is the name of a concept.
And a bunch of other little cases like that.
Like, I don't remember what it was for.
There was another one around Boolean that was very odd.
Yeah.
Anyway, so we should get someone on that voted for it, and then you can debate them.
But yeah, we have to go.
But the last thing I want to say before we wrap this episode up is that the same way that Lisps have a very sort of unique kebab style for some of the Lisp dialects,
I actually think that Lisps have a feature that I wish a lot of other programming languages had when it comes to the naming of predicates. Do you know how you
denote a predicate function in a Lisp, or in most?
I don't.
You use a question mark,
which I actually really, really like, because instead of having to say is empty or has feature, which don't share the same prefix,
all of the predicates end in a question mark. So instead of is empty, it's just empty question
mark. And then if you have, what's another one? Is even. So yeah, even and positive are not like
is even or is positive. It's just even question mark, positive question mark.
And I think that's super, super nice
because there is actually some cases where is and has,
you know, the is underscore and has underscore
aren't the right prefixes.
And then you sort of have to come up
with some awkward prefix that doesn't really work great.
But if you can just add a question mark on the end,
it becomes immediately clear,
oh, this is just a predicate
and I can use something simple.
That's interesting. Yeah. I think that it would be interesting to see what a language like C++
would have looked like if we could have had a wider set of characters in identifiers.
Like if we could have had question marks in identifiers or exclamation points in identifiers, what styles would that have led to?
Because I know languages that do allow question marks or exclamation points in identifiers,
you know, those are often used to convey, to denote certain types of functions, etc. Like,
I generally think that languages benefit greatly from having as many characters in the identifier
set as possible without, of course, restricting, you know, what you have available as operators,
etc. Yeah, there's like Smalltalk is a language that you can define binary operations as any
combination of like a set of like 20 different ASCII characters. So you can come up with some
really, really cool things that yeah, you can't you can't do anything. You're limited by the set
of what binary operators are overloadable in C++, which we do have a bunch, but you can't just like
create one out of thin air that doesn't exist before, whereas in other languages you can.
Yeah, I'm always jealous of languages that can invent, you know, operators like that.
I think it's a very powerful feature.
It's one that I often wish I had access to in a language like C++.
Julia, actually, I just learned you basically can define like any Unicode operator that you want,
because it's a scientific language, and so a lot of like domain-specific applications, they want
to have like a logical and, that's, you know, the upside-down V, and a logical or.
With a lot of C++ compilers, you can get away with using Unicode in your function names,
not for operators, but in function names.
And so there are some shenanigans you can get away with there.
If you follow JF Bastien on Twitter, you're probably familiar with this.
Yeah.
I've also, I think it was in a Meeting C++ quiz, I saw it was just smiley face emojis.
It was, you know, what is this output?
Yeah, I remember that too.
That was, we were there together.
That was 2019.
Man, you remember that?
You remember that?
That was a crazy month.
We were on the road for like six weeks.
That was when we went to...
You were on the road longer than I was.
Yeah, that was... I miss those days.
I miss those days.
How long has it been since I've been on a plane?
It's almost, I think, 11 months or 12 months now.
Yeah, it's got to be, yeah, because it would have been Prague last year that we went on a plane.
Oh, man, I miss it.
I miss it. I miss it.
It's all right.
The vaccine.
My partner just got her second vaccine yesterday.
Oh, awesome.
My grandfather and my dad are getting their second dose this week.
My grandmother finally got hers.
So now all of the people over 80 in my immediate family have gotten one, so I'm
happy about that.
Yeah, it's awesome. It's happening slowly, but hopefully 2022 we'll be able
to meet up in person.
That brings our naming trilogy to an end. We hope you have a great day,
and we'll see you in the next episode.