Algorithms + Data Structures = Programs - Episode 12: Formatting && (Parentheses)

Episode Date: February 12, 2021

In this episode, Bryce and Conor talk about parentheses, formatting, ClangFormat and more.
Date Recorded: 2021-01-27
Date Released: 2021-02-12
ES.41: If in doubt about operator precedence, parenthesize
RA...PIDS cuDF ClangFormat config file
Leading commas article (aka Abrahams comma)
Clang-Format
C++Now 2017: Tony Van Eerd “Postmodern C++” (90 min version)
CppCon 2017: Tony Van Eerd “Postmodern C++” (60 min version)
Avast Meetup: Tony van Eerd: Postmodern C++
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Transcript
Starting point is 00:00:00 So yes, in my code, I switch between styles. I let the circumstances decide which is worthwhile. The code is more what you'd call guidelines than actual rules. Welcome to ADSP: The Podcast, episode 12, recorded on January 27th, 2021. My name is Conor, and today with my co-host, Bryce, we talk about parentheses, formatting, and so much more. How do you feel about redundant parentheses? Oh, redundant parentheses? I'm a very big fan of redundant parentheses. Like, if I think there's any chance that the code can be misread without the redundant parentheses, I'll add them. Like, let me put it to you this way: if I have to ask myself,
Starting point is 00:01:01 hmm, how does order of operations work here? That means that I'm going to add redundant parentheses. So if it's something where I'm not immediately 100% certain of what the order of operations is, I'll add parentheses. Like, for example, if I'm doing something like A less than B and B less than C, I'll probably put the A less than B and the B less than C in parentheses, even though I don't need them, just for clarity. In general, I'm a fan of verbosity. I don't like short identifier names when you can make them longer and clearer. Although the one place where I'm not a fan of verbosity is I omit the braces for an if statement or a for statement if I don't need them. And apparently I'm, like, the only person in my team at work that does this.
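A minimal C++ sketch of the two habits described above: redundant parentheses around comparisons, and omitted braces on a single-statement if. The function names are hypothetical and chosen only for illustration; nothing here is code from the episode.

```cpp
// Hypothetical helper: true when a < b < c.
constexpr bool strictly_between(int a, int b, int c)
{
    // The parentheses are redundant, since < binds tighter than &&,
    // but they spare the reader from having to recall that.
    return (a < b) && (b < c);
}

// The same test without the redundant parentheses; it parses identically.
constexpr bool strictly_between_terse(int a, int b, int c)
{
    return a < b && b < c;
}

// Hypothetical helper showing the brace-free single-statement if.
constexpr int clamp_to_zero(int x)
{
    if (x < 0)
        return 0; // single statement, so the braces are omitted
    return x;
}
```

Both versions of the comparison compile to the same thing; the only difference is how much the reader has to remember about precedence.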
Starting point is 00:02:06 No, I do the same thing. And then people get really angry because they're, oh, it could be a hanging, whatever, blah, blah, blah, future bug. I can't explain right now why this came up at work, but there's a context now in which we cannot do that ever again. I'll be able to explain why in a few months. But as soon as I became aware of this context where this was going to be problematic, I was like, well, this might affect people like me who write code this way. And everybody else in the team was like, nope, nope. I only write that code like that for slides. Bryce, you're out of your mind. You should like, well, this would be perfectly fine. Nobody really writes code this way. And I was like, all right, all right. If you guys tell me
Starting point is 00:02:54 that, then fine. I surrender. And Conor, I know that you're now thinking about this and now you're wondering, is the only code that I write code for slides and that's why I don't use them? No, no. Well, so, yeah, I mean, I have no problem putting braces if it's code reviewed and people whine about it. I personally, in my own projects, will omit that stuff if I think it's abundantly clear. What do you think about the single-line if? Like, if the statement behind the if is really short, I'll put it on the same line as... Yeah, yeah. We configured our RAPIDS cuDF... like, I wasn't
Starting point is 00:03:32 in charge of setting the config for clang-format, but I was the one that implemented it, and people were allowed to review it. But we have our setup such that short functions will have, like, the braces and the contents all on one line if it can fit in the column limit, and if statements that are short go on the same line, which then sort of really irritates me, because if we have braces, then you end up with, like, if, predicate, brace, you know, consequent, end brace. And to me, I really don't want to see those braces there. But yeah, it's not the end of the world. You said you'd do the same thing, right? One of the reasons why I dislike adding the braces to the if statements is the coding convention that I was raised with, which was the coding convention of Hartmut Kaiser, the one that we used in HPX.
Starting point is 00:04:20 We would always put the brace on a new line so we wouldn't dangle the brace at the end of the last line. And the main reason why, and this is also sort of a Boost-ism, the main reason why I saw for doing this was when you're writing a function signature, if the function has a lot of arguments and the function signature takes up a few lines, if you put the opening brace for the function on the end of the last line of the function signature, it can be a little bit jarring for some of the function signature to be indented more than the first line of the function body. And, like, it's not apparent where the brace is.
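The brace-placement argument above can be sketched in C++ as follows. The two functions are hypothetical and identical apart from where the opening brace sits; they are only meant to show how a wrapped signature looks under each convention.

```cpp
#include <cstddef>

// Brace at the end of the last signature line: the wrapped parameters are
// indented deeper than the first line of the body, and the brace is easy to miss.
constexpr std::size_t padded_size_attached(std::size_t element_count,
                                           std::size_t element_size,
                                           std::size_t alignment) {
    return ((element_count * element_size + alignment - 1) / alignment) * alignment;
}

// Brace on its own line: the start of the body is marked by the brace alone.
constexpr std::size_t padded_size_own_line(std::size_t element_count,
                                           std::size_t element_size,
                                           std::size_t alignment)
{
    return ((element_count * element_size + alignment - 1) / alignment) * alignment;
}
```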
Starting point is 00:05:07 And I always just felt like it was a little bit cleaner to have the brace be on its own line. And so for the if statements, I would just put it on its own line, just solely out of consistency. Now, I think if you're in a coding convention where you put the brace at the end of the last line, then there's less of an argument for omitting the braces on the
Starting point is 00:05:34 if statement. Because if you put the brace on the end of the last line, then the if statement, you're only adding one extra brace or one extra line to the if statement than if you'd omitted the braces. And especially if you've got like an if else, and if your else statements have the closing brace for the previous if on them, then you're really not adding that much more. And I have used that coding convention a little bit, but it's not like my favorite preferred style, which is still to have the brace be on its own line. The brace stands alone, if you will. Yeah, in cuDF, we have a mix. So like you said, oh, just for the sake of consistency. cuDF,
Starting point is 00:06:26 it breaks on a function, so it always has the brace on its own line, and I actually don't personally prefer that. But because cuDF makes so much use of templates and SFINAE stuff, you end up, like you said, in some odd cases where the function signature is, like, indented and then you have the brace and then it's not really clear, like, where the start of the function body is. But for ifs, we have them at the end of the line. But anyways, we're supposed to be talking about naming. However, I do want to get back to how this all started. I asked, how do you feel about redundant parentheses? You said, I prefer to over-parenthesize. I should highlight, I don't know if we'll throw this all into this episode or another episode, or I'll just cut it.
Starting point is 00:07:07 But if it does end up in an episode, it's worth mentioning that there's a Core Guideline, 41 or ES.41, and it says, if in doubt about operator precedence, parenthesize, which I agree with. Hey, so I'm right. I'm 100% right. Exactly what I said. However, this comment got made, and it's not me being critical, I'm just not sure how I feel about it. Like, where I have the line auto const, you know, I'll omit the names of the variables, but auto const A equals B minus C minus D. And then the comment is to parenthesize the first subtraction, B minus C,
Starting point is 00:07:42 because it is actually semantically a little bit closer to what's happening. Like, you are subtracting the first two and then from that subtracting the third thing. However, I don't think there's any confusion about operator precedence here. It's more from, like, a semantic point of view of what's happening. And I don't... because, like, typically I'm the opposite of you. I like brevity as long as it's still expressive. Like, I prefer short things. I have no problem calling a local variable n in, like, a three-line function if it's explicitly clear what n is doing. But yeah, for this, I don't know. I'll probably end up putting them in, but I don't have a formed opinion on this. Like, if the parentheses
Starting point is 00:08:22 are not expressing, if they're not introducing clarity in terms of operator precedence, they're supposedly introducing, like, more meaning to what's happening, I don't know how I feel about that. So, you know, the other time when I do weird stuff with parentheses is if I'm writing a condition for an if statement and that condition's kind of meaty and it's going to take up multiple lines, I might use parentheses there to make it a little more readable.
Starting point is 00:09:02 One of the things I typically do is, if it's a bunch of, like, sub-conditions that are all ANDed together, I'll try to put one on each line, and then I'll put the logical AND at the start of each of the following lines of the condition. So I'll have, you know, on one line, I'll have if, you know, parens, then another parens, a less than c. And then on the second line, I'll have, aligned all nice, the logical AND, and then I'll have the second condition in parentheses.
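A small C++ sketch of the multi-line condition layout described above: one parenthesized sub-condition per line, with the logical AND leading each continuation line. The function and parameter names are hypothetical.

```cpp
// Hypothetical check that [first, last) is a non-empty range inside [0, size].
constexpr bool is_valid_range(int first, int last, int size)
{
    if (   (0 <= first)
        && (first < last)
        && (last <= size))
        return true;
    return false;
}
```

The extra spaces after the opening parenthesis are there only so that the first sub-condition lines up with the ones that follow the leading `&&`.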
Starting point is 00:09:43 And sometimes when I'm, like, writing more complex conditions like that, it's useful to add additional parentheses just to sort of, like, encapsulate a line a little bit better or make it look a little bit better. Yeah. But I kind of have some weird code formatting things that I do. Like, I'm a fan of the Abrahams comma, which is where, when you've got a comma-separated list of something that's separated onto multiple lines, where each different element is on a different line, you don't put the comma at the end of the last item, you put it at the start of the next item. And this is supposedly, well, this I believe makes it easier to reorder things around in the list. Some people think it's very ugly.
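For readers who have not seen the leading-comma style, here is a short C++ sketch. The enumeration and function are hypothetical examples; only the comma placement matters.

```cpp
// Leading commas in an enumerator list: every line after the first starts
// with a comma, so the commas line up in a single column.
enum class color
{
    red
  , green
  , blue
};

// The same layout applied to a parameter list.
constexpr int volume(
    int width
  , int height
  , int depth
)
{
    return width * height * depth;
}
```

Appending a new enumerator or parameter touches only the new line, and the first line is the only one without a comma.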
Starting point is 00:10:47 And then there's another thing I do where... Wait, pause, pause. I thought the motivation for that, and there's a couple different flavors of it. I didn't know it was called the Abrahams comma, but I thought the motivation was that when you end up introducing an item at the end of the list,
Starting point is 00:11:03 your git diff is only one line as opposed to two, of, like, the extra comma at the top line, or, you know, the most recent item, and then your new item. Whereas if they're combined on the same line, you get a single green addition instead of one red and one green. Yeah, I think that's one of the motivations too. I'll be perfectly honest, I've heard some number of motivations for it over the years. That's not why I do it. I do it because that's sort of the way that I've done it for a number of years, and I've just always found it a little bit cleaner. I mean, the other thing that I like about it is that it's more clear where the comma is
Starting point is 00:11:48 and it's more clear if I'm missing a comma because it's always at the front. So if I do something like I decide, oh, you know, the first argument of this function should actually be the third one or something and I move the first one down to be third, it's very obvious that I need to add a comma to the first one and remove a comma from the last one. And also the comma that I need to remove, it's in the same column position in
Starting point is 00:12:13 both of those lines. Whereas if it's at the end, it might be in a different position. So it's just like a little bit visually clearer that, hey, I need to add a comma to one of these and remove a comma from another one of these. And so that's why I sort of think it makes my life a little bit easier when I'm going to change the order of something in one of these comma-separated lists. And also, I tend to find that I'm more likely to change the order of the subsequent arguments than the first argument and if you put the comma at the end of the line then the special case is the last one
Starting point is 00:12:54 whereas if you put it at the start of the line, the special case is the first one. Damn, I was totally on board with the, there is more symmetry, you get vertical alignment of the commas. Yeah. But that second one that you just mentioned. Anyways, lots of reasons. I cut you off though.
Starting point is 00:13:11 You were saying there was. Whoa, explain what problem you just had with what I just said. There's not a problem. I was just, the first one I was on board with. The second one. The second one, not so much. Is, I mean, that's a real, that's a real granular piece of like, well, technically speaking, if you look at the proportion of times, and it's a true statement, but just like the first one was like, oh, I could definitely see that. The second one, I'm like, you're measuring the distribution of like times.
Starting point is 00:13:37 Like, I modified this scoped enumerator, and there's a histogram on the number of times I've changed each of these lines. And if you compare the last versus the first, it's less, which is probably true. It's probably true. Conor, you have to understand I am inventing justification for what amounts to ritual. This is ritual. It's not rooted in any... like, I will happily use whatever coding convention is the coding convention of a code base I work with. It's just, this is my heritage, Conor. This is the style in which I wrote code back in the day, and I like to invent a justification for it. And
Starting point is 00:14:21 in case... I think I'll probably cut this into a separate episode. So for those that are listening, I completely agree with Bryce: whatever the convention is, I will use that. Probably, Bryce, you agree that I care so much more that there just is a convention than, like, what the convention is, because I just don't want there to be, like, arguments in code reviews about, like, oh, you know, blah, blah, blah, you didn't align this. I just want there to be a formatter that does it for me. That being said, like, I care so much what my personal code looks like. So in the code base, like, one of the code bases I currently work on, Thrust, which is about a 10-year-old open source project. And it's a project that has a bunch of code that's derivative from Boost, a bunch of code that's
Starting point is 00:15:05 from, like, different projects. There's probably code from, like, four or five different licenses in the project, like both versions of BSD, MIT, Boost, just code taken from a whole bunch of different places at a bunch of different times. And it's all never been clang-formatted. So the rule of thumb in Thrust is you follow the style of whatever precinct of the code you happen to be in. And that's not even necessarily the file. It might be, like, this set of functions that I'm working on have one style, and, like, 300 lines down in this header, there's going to be a different style. And this was actually sort of how code formatting worked in the other project I worked on prior to being at NVIDIA, which I spent most of the first five years of my career on, HPX, where in HPX, Hartmut and I and the other main developers had similar code styles,
Starting point is 00:16:07 but not 100% similar. And so there were certain files that I could look at and I'd be like, okay, yeah, this is one that I wrote. I can just tell from the style. And this is one that Hartmut wrote that I could tell from the style. This is one that Hartmut wrote in 2012 because that was the style he was using then. I think they've since clang-formatted that code base. My team keeps talking about clang-formatting Thrust, which is probably a good idea, but I'll be a little bit sad when all of that history gets formatted away, but it'll probably greatly improve readability.
Starting point is 00:16:44 I think that, like, in a... So was it just you and Hartmut primarily working on that code base when you were, you know... I was the first person that Hartmut brought on. I was, like, developer number two on HPX. And then developer number three was Thomas Heller. And then we had a whole bunch of other folks come in over the years, but in the very early stages, it was mostly me and Hartmut, like, the first six months that I was working on it.
Starting point is 00:17:12 Yeah, so I think in that situation, like, going clang-format-free, or, like, formatter-free, actually can work quite well. Because, well, clang-format didn't exist at the time, this was 2011, so take that into consideration as well. But even today, if you only have two people working on a project, I think it can work totally fine, because I actually have a ton of grievances. Like, I think clang-format has done amazingly, but there are a bunch of things. Tabular formatting is probably the biggest criticism. Like, if you have a switch case statement with, you know, a bunch of different length enumerators and every case statement is just, like, a single expression returned.
Starting point is 00:17:53 Like, you want those returns aligned. You want, like, the cases aligned. The cases will be aligned, but basically nothing else will. Similarly, if you have, like, a switch case statement with, like, an expression, semicolon, then break, you want all of the starts of the expressions to be aligned and then the breaks to be aligned, and, like, clang-format absolutely just falls over. Not having a formatter and just having two people that are both familiar with each other's coding styles can actually lead to nicer looking code. The problem is that when you have not two developers but, like, you know, N developers,
Starting point is 00:18:24 where N is not a small number, and N people aren't, like, operating on the same wavelength, then you get pull requests where clearly the person hasn't paid as much attention to how their code looks. And then you spend, like, the first three iterations of the code review going back and forth saying, you know, please make this a little bit more readable. Yeah. So I actually probably like going clang-format-free. If it was just me, I might do that all the time. My problem with clang-format, and this is probably just me being grumpy and resistant to new things, is that there's a lot of times when I will do some formatting that makes sense locally only. Like, you know, maybe I've got, you know,
Starting point is 00:19:06 a series of if and else ifs where I want to align the starting parens of the condition. Or maybe I've got, you know, in one place I declare four variables in a row where they're all sort of similar and I want the equal signs to line up. But maybe there's another place later, like, you know, maybe there's some variables that are declared right after that where I don't want the equal signs lined up, because if I did that, then it would make the line too long and it would break across the line. It's just like when I'm formatting my code, there's a whole bunch of special cases that I'll make. And I always worry that a formatter is just going to muck those up. I've never really used clang-format.
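As a sketch of the locally aligned layouts described above (tabular switch cases and lined-up equal signs), here is a hypothetical C++ example. The `// clang-format off` and `// clang-format on` comments are a clang-format feature for leaving a region untouched; they are not mentioned in the episode and are shown only as one possible way to keep hand alignment when a formatter is in use.

```cpp
// Hypothetical enumeration with enumerators of different lengths.
enum class shape { circle, triangle, rectangle };

constexpr int corner_count(shape s)
{
    // clang-format off
    switch (s) {
        case shape::circle:    return 0;
        case shape::triangle:  return 3;
        case shape::rectangle: return 4;
    }
    // clang-format on
    return -1; // not reached for the enumerators listed above
}

constexpr int seconds_per_day()
{
    // clang-format off
    constexpr int seconds_per_minute = 60;
    constexpr int minutes_per_hour   = 60;
    constexpr int hours_per_day      = 24;
    // clang-format on
    return seconds_per_minute * minutes_per_hour * hours_per_day;
}
```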
Starting point is 00:19:55 So I don't suppose I have much actual justification for that claim. I've just never really found a need for it. Even though I've worked on some projects that are very large, that have a lot of developers, I've never been in the case where I've wanted to argue with somebody over formatting. If I'm code reviewing a contribution to a project I'm working on and it uses a different coding style, that doesn't really bother me. I'm kind of okay with that.
Starting point is 00:20:36 If it's inconsistent with the rest of the file, I might make some remark about it. The only thing I can remember asking for changes for that was like a formatting thing was indentation. Like, hey, you're indenting by four spaces and the rest of this file is indented by two. Can you indent by two instead? But I don't know. It just doesn't really bother me.
Starting point is 00:21:09 This brings to mind a moment from my favorite Tony Van Eerd talk, which is one that he gave at CppCon... And C++Now. 2017, called Postmodern C++. And in that talk, he spoke in rhyme for the entire talk, first of all. So it's an amazing talk. You have to go watch it. The entire talk, he spoke in rhyme. I have to go watch it.
Starting point is 00:21:41 I've seen all three versions of it. The listeners need to go watch it. Yeah. And it seems like it's just like a funny humorous talk, but it's not actually just a funny humorous talk. It has some very deep, like real world lessons about programming. And one of the things he says is in my code, I switch between styles. I let the circumstances determine which is worthwhile. And that sort of gets to the whole core of his talk, which is this idea of postmodern C++.
Starting point is 00:22:13 And one of these postmodern ideas is that the postmodern coding convention is to not have any fixed coding convention. And what Tony means by that is sort of exactly what I just said, that you shouldn't have a rigorous set of formatting guidelines that is applied unconditionally everywhere. You should instead use the coding style that makes sense in the particular part of the code that you're in. And I really connect like that. And whenever I mention this quote, it always brings to mind, to me, the Pirates of the Caribbean meme of, you know,
Starting point is 00:23:00 the code is more what you'd call guidelines than actual rules. And I think this is something that's true for not just formatting guidelines, but also for library design guidelines, and also for naming policies and naming patterns. We had a meeting of Library Evolution, the C++ Library Evolution group, last week where we talked about some proposed library design guidelines and naming guidelines.
Starting point is 00:23:36 And one of the things that came out of that meeting was we sort of said, you know, we don't want to prescribe hard rules. The strongest thing we want to say is if you're naming something that meets this criterion, then you should consider a name like this. It should be one of the candidates. Or if you're doing X, you should consider this. And somebody during that meeting, my friend Ryan Google, said this in reference to adding particular prefixes or suffixes to names of certain things. And he suggested that the policy should be: decorate as a last resort. It's not that you should rigorously follow the rule, but the rule or the suggested pattern may be helpful if you need it to clear up something that's ambiguous, et cetera. And I think this really
Starting point is 00:24:35 captures sort of the core of C++ standard library design, that we try to not rigorously enforce rules. We try to do what makes sense for the particular thing we're working on. All right. So I have like three or four different things to say, but I'm not going to say any of them because we need to switch to our naming. No, no, no, no, no, no. Go, go, talk. We got to flow.
Starting point is 00:25:04 We got to let it just... I will mention, this is all going in. I will mention the topics and then we will have another episode, because I have a lot more to say. So I totally agree. Going back to... well, so first of all, yes, Tony, all of Tony's talks are awesome, but definitely Postmodern C++. He's given a couple different variations. Highly recommend it. We'll put it in the show notes for sure. You mentioned about how locally when you're making certain formatting decisions and, like, aligning of equal signs, that in other contexts... you know, sure enough, you look at three files when you're deciding on your clang-format config. You look at, you know, three or five files or maybe more than that.
Starting point is 00:25:40 But you're not going to do it. You're not going to go through every single file in your code base if it's hundreds and hundreds of files. So sure enough, inevitably, like locally, it looks good. And then you didn't consider this one case, and it ends up looking awful. And the thing that I want to talk about in a future episode is something that like, I've actually never talked about, because I'm pretty sure I'm just going to get a ton of hate for it. But it's, it's something that I sort of inside my head I think about is format-driven development. And it goes right next to const-driven development, which it's like like really, really readable code and the semantic, you know, difference between what you would have ended up writing without thinking about like how
Starting point is 00:26:31 it's going to be formatted isn't too far off. Anyways, we'll talk about that in a future episode. And that's where we'll call it. Thanks for listening and have a great day.
