Algorithms + Data Structures = Programs - Episode 179: CheckGrade, ACCU & CppNorth

Episode Date: April 26, 2024

In this episode, Conor and Bryce chat about the CheckGrade problem, ACCU, CppNorth and have a guest appearance from Bryce’s mom!Link to Episode 179 on WebsiteDiscuss this episode, leave a comment, o...r ask a question (on GitHub)TwitterADSP: The PodcastConor HoekstraBryce Adelstein LelbachShow NotesDate Recorded: 2024-04-17Date Released: 2024-04-26CheckGrade TweetRuff (Python Tool)Software Unscripted Episode 69: Making Parsing I/O Bound with Daniel LemireThe simdjson libraryCppNorthComposition Intuition - Conor Hoekstra - CppNorth 2023Intro Song InfoMiss You by Sarah Jansen https://soundcloud.com/sarahjansenmusicCreative Commons — Attribution 3.0 Unported — CC BY 3.0Free Download / Stream: http://bit.ly/l-miss-youMusic promoted by Audio Library https://youtu.be/iYYxnasvfx8

Transcript
Discussion (0)
Starting point is 00:00:00 Wait, hey mom, you want to say hi? It's good podcast content. So here's the magic number. The magic number is two? Yes. Can you tell me what it's the number of at the ACCU conference? I do not know. Well, I just came back from the bar and I counted two women amongst maybe, oh, I don't know, 50 or 60 computer bros. Welcome to ADSP, the podcast episode 179, recorded on April 17th, 2024. My name is Connor, and today with my co-host Bryce, we chat about the problem that took Twitter by storm, check grade, get a guest appearance from Bryce's mom, and talk about one other thing because it's so topical and you,
Starting point is 00:01:08 you just, you're just going to have to find a way to work it into, um, you're just going to have to find a way to work it in. Sorry. We will see. Maybe, maybe not. Let's, let's talk about check grade. Oh, oh yeah. You want to, you want to summarize for the listener what CheckGrade is? No, no. You're the one that brought the tweet up. So you tell your story how you – because I only stumbled across it because you sent it to me. So you tell the origin story. So CheckGrade has been a collective Twitter code golf moment that has been unfolding this week, where some poor soul tweeted about
Starting point is 00:01:47 this problem. And then everybody on the internet and their dog had an opinion. And at first, there were a lot of serious opinions and responses posted. And then there were a lot of amusing ones posted. But the problem is so we got a function check grade. Check grade takes an integer that is a score or a grade. So it's between
Starting point is 00:02:20 0 and 100. And it's going to return you a string or really just a character, of a letter grade. You know, A being 90 and up to 100, or really 90 and higher. B being 80 up to 90. C being 70 up to 80. D being 60 up to 70, and F being anything below 60. So this poor soul called this the arrow anti-pattern because the first version that he wrote or that he tweeted about had like
Starting point is 00:03:05 a big nested if loop, a bunch of nested if statements. Sorry. So he had like an if score is larger than or equal to 90, then return A, else, and then the else logic had, you know, if score larger than or equal to 80, et cetera, et cetera, et cetera, recursing down to handling all the cases. And his refactoring that he posted as a reply, and this guy was writing code in Java. We're not going to hold that against him. His refactoring, he creates a dictionary of integers to strings and then looks them up in the dictionary.
Starting point is 00:03:48 And that is not the worst possible solution, but it's a pretty bad one. Because why use a dictionary when you could just use a simple array? And some people posted responses that just used to look up into a simple array. And some people posted responses that just used a lookup into a simple array. But as other people pointed out, like anything that's going to involve a memory access is going to potentially be slow and you could just do some clever math instead. So yeah, and it just sort of spiraled out and people, you know, people had opinions about like,
Starting point is 00:04:23 okay, should you do this with a switch statement? you is it okay to use early returns like just switch over the um the grades or just like have some if statements with early returns um and uh yeah i mean like like you know if you look at the if statement with early return it it does look pretty elegant but apparently some people really dislike early returns. I think I saw Jonathan Mueller post something about it. I started seeing a bunch of C++ people posting about it. And my, of course, first and immediate reaction was, well, okay, how do we write this in a way that can be lowered down into SIMD masks and will not actually have any branching. And some people came up with clever solutions that'll do that.
Starting point is 00:05:12 I don't have any of them in front of me, but there's some stuff you can do with some divisions and multiplications and some clever math where it'll just get you to the right answer without needing any actual branching. It may not be the cleanest code, but it is going to be pretty fast code, or at the very least pretty, yeah, I'll stick with fast.
Starting point is 00:05:35 It may not be efficient code because it may be doing, it may be more math-heavy than you'd want, but in a parallel context, that's fine if it avoids thread divergence. Yeah, but I suppose I just wanted to get my two cents in on this. I suppose the solutions I probably liked best were those clever mathy ones or the lookup in a constant array ones. Because I think that a compiler will lower the look up in an array down to something pretty good. Like into like some sort of like jump table or something,
Starting point is 00:06:16 or it'll just lower it down into the clever math. But I haven't looked at that deeply. But I do feel, I do generally feel for this fellow who posted this solution, who posted this problem because it got like 600 retweets and like four million views and a lot of people shit on his on his answer. And, you know, I think, I think somebody, you know, wanting to find a better way of doing things, even if they're, even if their first attempt isn't, isn't that great, like, you know, just getting out there and, and making the effort and asking for help is like, that's more than what a lot of people do so uh yeah like i feel like that should be encouraged not uh uh punished by the internet yep we will link the
Starting point is 00:07:14 tweet my my couple thoughts are one i don't see what's wrong with early return i mean if it was in python right and you were using a rough linter like literally you could have just gone rough fix i actually i should make a youtube video i'll make a youtube video what is rough rough are you ff it's basically a implementation of like every single type checker linter pep standard formatter it's like imagine all of the stuff in c++ clang format clang tidy asan you know fsan all the sanitizers all the linters cpp check include what you use imagine like someone created a tool that just re-implements all of that and they re-implemented it in rust that's what basically this rough project is for python so if you've heard of like
Starting point is 00:08:03 mypy as a type checker or if you've heard of you know the pep8 you know standards and linters and stuff they're implementing all of it so uh the point is i don't actually know which one of the linters uh warns you about early return but one of them does and rough comes with the similar to clang format and clang tidy where you can give it a dash dash fix and it'll just fix it for you and if you were to do this like you know if greater than 90 print or like return this else then indent indent indent it i'm pretty sure rough would just collapse all of that into like five if statements with returns uh because the the lint says like you don't need an else when the only like your previous branch is returning, which I think is completely great.
Starting point is 00:08:49 The sad thing about Python formatting is that it's going to put the return on a separate line, whereas I think it looks a lot nicer if it's just on a single line. But hey, that's just my personal preference. My second thought, though, is all this fancy math that you describe in order to get lowered down into SIM onto a SIMD. Uh, first of all, anybody, whenever anyone ever says SIMD, that's just a fancy way of saying array programming. And second of all, it's not, it's, it's not fancy math. Like my very first, my first thought was like, just do the early returns.
Starting point is 00:09:22 You're just going to drop that on me and expect me to to just take it yeah like i've listened to i've listened to an episode on what is what is richard feldman's podcast called it should be called the rock programming podcast but it's called functional something anyways link in the description to the episode. And he has an episode where he talks to the implementer of SIMD JSON. And every little tool and technique that they describe in that episode is just array programming. Like it's just array programming, building up masks instead of using ifs, et cetera, or finding indices. Yes, but I'll put in the caveat that sometimes you have to write code in a very specific way to get it to vectorize correctly with a particular compiler or on a particular platform. That is not like, yeah, sometimes you use high level, like more abstract ideas to write code that's going to be SIMD efficient. But sometimes it's like, oh, well, you know, like we need to use, you know, this math operation here instead of that
Starting point is 00:10:33 math operation, because on this particular x86 microarchitecture, you know, that's more efficient. Okay, so sure, feel free to add that asterisk. But what I was going to say is that the fancy math is a subtraction, an integer division, and a maximum. Yeah. But I think people don't like the fancy math because it's not. That's not fancy math. That's not fancy math. Three binary operations.
Starting point is 00:10:59 People don't like it because it makes it a lot less clear what the function is doing. It does make it faster. And I would definitely write that like, and if a lot less clear what the function is doing. It does make it faster. And I would definitely write that like, and if you want to know what the function is doing, like put the slow version in the comments. Like put the version that's easy to read, like that's clear and concise. Just put that in the comments, right? And then put the fast thing that like is not as clear like in the code i i think it's a red herring to say that it's harder to read just people people can look at
Starting point is 00:11:36 an if you know a set of if statements and returns and i guess because they're used to if statements and returns but like index calculating an index and indexing into a string inside a one-line function that's called check grade i don't think like anybody would have any no uh buddy i i just agree strongly i i think like if if if if a human like thinks about like how our our head processes it um like like how we would think about it and describe it i'll describe it to other people we would describe it closer to the early return if it like if version than to the indexing into a string this is why i said the other episode that the array community is the best community because they think the right way everyone it's
Starting point is 00:12:21 easier for people to think about it that way because that's just the way that they've been reading code and writing code the whole time but if you're used to seeing this like you're such an array language nationalist i don't think it has anything to do with nationalism but uh you know replace that word with another word and sure just sign me up it's it's the right way it's the it's the way it's the way to think everybody's everybody's hopping on the bandwagon i mean you hopped on the bandwagon it's great folks we'll link the simd simd json episode we'll link the tweet well there's an answer here that i gotta show you that i i think is somebody's serious answer but i i deeply hope it's somebody's joke answer.
Starting point is 00:13:06 Wait, it hasn't shared yet. I mean, maybe it's not so bad. No, it's pretty bad, right? So what this person has is they have a switch statement where they've got, you know, case 100, and then they have case 99, colon case 98, colon case 97, colon case 96, have case 99, case 98, case 97, case 96, case 95, all the way down to case 90. And they have formatted it nicely. And looking at it, it is not as bad as I thought because, you know, there's not actually that many cases that you have to handle.
Starting point is 00:13:40 You only have to have like 40 cases and then you can default to F for the rest. But yeah, there's one person who seemed to... And of course, the person wrote, C has a simple and elegant solution to this. So I'm going to assume they're a C person, which is why they would consider this to be serious code. But yeah, their solution is just a switch statement with uh 41 cases and some nice formatting yeah i don't know why you would do this over the if returns but i'm pretty sure he's just trolling or they're just trolling i don't think they're trolling because i've seen like i've seen people's answers to this who are not trolling and are who are trolling and they make it a lot more ridiculous. This to me is somebody's attempt to make this particular
Starting point is 00:14:34 approach look elegant. Oh, I mean, I don't know. It's not like it does actually look not that bad. Looking at 41 case statements here, like it's not as ugly as I thought it was going to be. But still, like, you know, like, think about it, like, think how bug prone this is, like, all you have to do is make one mistake in one of these, like, you know, like you accidentally you wrote case 95 twice and and you left out case 94 and then you've got this inscrutable bug in your code oh oh people people people people well there you have it folks hopefully accu is good i mean by the time the listener is hearing me say that accu is over but uh i hope you enjoy it yeah so um i am one i'm gonna see you in july because i'm going to c++ north oh maybe i'll keep that in yeah congrats uh which talk i also got accepted as well so we'll both
Starting point is 00:15:41 you know if you're going to my talk think parallel got accepted um but uh uh i do not know what version of think parallel people will be seen at uh at uh c++ north it might be a refinement of think parallel scans it might be a different think parallel part oh you gotta you gotta bring us part two because by that point the accu one will be online and then uh you gotta you gotta keep the content coming you've already talked about it i think that c++ north will likely be um the rolling reduce one and that's fantastic because my talk is also a part two talk not of an accu talk but of the talk that I gave last year, Composition of Intuition.
Starting point is 00:16:28 So we'll both be giving part two talks, which is a good story. You got to tell them to not schedule us against each other because I want to see your talk. That should be fine. I mean, ours are too closely related, right? Like algorithms and algorithms, you know? Yeah. The talk I was really excited for at
Starting point is 00:16:46 this conference is now scheduled at the same time as my talk so that's unfortunate and and and yes kevlin i i am talking about your talk wow well yeah hopefully you get to see it online i know that uh bjorn faller he's given the closing keynote and I always love his talks. So that should be. I'm pretty sure I will not be around for that because I'm going to be on a plane briefly after my talk going to Dingle, Ireland. And if you've never heard of Dingle, Ireland, that's because it is in the middle of nowhere. It is in Kerry County on the far west, southwest part of Ireland. So yeah, I'll be there for a little while. Fun. Why are you going there? Just for fun?
Starting point is 00:17:35 My mom has a friend who has a house there. So we're going there. Nice. Yeah. Look at you, globetrotting. Globetrotting? nice yeah look at you globe trotting globe trotting yes i it's like it's bad man it's bad i have um so the american airlines uh like status qualifying year um starts march 1st and uh it's what april 17th and i have already nearly requalified for the top tier American Airlines status. I got to take a break. I've been on the road too much.
Starting point is 00:18:12 Your carbon footprint has probably got to be one of the worst. Got to be one of the worst. I am one of the worst, yeah. I am. I am. All right. Well, hey, Mom, you want to say hi it's good podcast content so here's the magic number the magic number is two yes can you tell me what it's the number of
Starting point is 00:18:37 at the accu conference i do not know well i just came back from the bar, and I counted two women. Amongst maybe, oh, I don't know, 50 or 60 computer bros. What's with that? I think that that's probably an accurate ratio. Although I will note that two of the keynotes are women. Well, maybe they were both there. I did actually think that the count was going to be three on my way to the elevator.
Starting point is 00:19:13 However, it was just a guy with long hair and a ponytail. Bryce's mom doing damage in her first appearance on the podcast. That's not damage. I think it's amazing damage this is is i think it's amazing that this uh i was like it's damage to whoever had the long hair anyways uh yeah i'll look into it i'll make it happen folks if i have to wake if i have to wake up at you know 3 a.m uh it's not it's not the most difficult thing i've had to do for adsp the podcast uh we will do it what is the most difficult thing you've had to do for ADSP, the podcast.
Starting point is 00:19:46 We will do it. What is the most difficult thing you've had to do for ADSP, the podcast? One time I had norovirus when I was in Mexico on vacation, and I was deathly ill for like 48 hours, and the podcast had to be launched in that window of time. Luckily, it had been edited but yeah i i was i like was barely moving i just crawled up in a fetal position you were in mexico dying of a virus and you still posted the podcast and this is the first i've ever heard of this. Really? I never mentioned it. You never mentioned it, which is bad because like most podcast co-hosts, like, you know, if you were dying of something, you'd be able to contact your co-host and be like, hey,
Starting point is 00:20:35 can you do this this week? But I'm not that type of co-host. No, no, you're not. And it's only like 10 minutes of work, the actual publishing slash posting of it once it's edited. Yeah. So it still wasn't, you know, I hadn't broken my leg and was in the hospital and then had to get someone to bring my laptop so I could edit it while on, you know, morphine or something like that. In case you've broken your leg and are in the hospital. I don't.
Starting point is 00:21:04 I'll tell you what connor if you're if you are under general anesthesia um nothing local but if you're under general anesthesia i will take responsibility for getting the podcast posted uh sure i mean really my my my backup plan is if like everything really terribly, all we need to do is record. And then I just won't edit it. We'll splice the tracks and it'll be like episode whatever, 273. I feel like I should knock on wood there because that's going to be the one where I'm in the hospital. And then we'll just ship it.
Starting point is 00:21:45 There'll be no jingle. There'll be no intro. And it will be explained in the unedited version. And there won't be any show notes or anything. But it'll still get released. As long as it gets published, that's all that matters, truly, at the end of the day, for us to hit our... Consistency.
Starting point is 00:22:02 You know, weeks. What is it? 179 at this point, as of this episode going out so we're we're getting there you know we're getting there wait wait is is is episode 200 gonna potentially be recorded when we're at c++ north no i believe episode 200 which is 30 episodes away, plus or minus, and it's like four in a month. So that's going to be several months. I think it's going to be in September of this year. Oh, okay.
Starting point is 00:22:33 All right. Well, I really should go because I desperately want to get to the gym. All righty. Good chatting. Enjoy the conference. Any last words, Mom? Have a good time at the gym and break your leg during your talk. Thank you. Be sure to check these show notes either in your podcast app or at ADSPthePodcast.com
Starting point is 00:22:53 for links to anything we mentioned in today's episode, as well as a link to a GitHub discussion where you can leave thoughts, comments, and questions. Thanks for listening. We hope you enjoyed and have a great day. Low quality, high quantity. That is the tagline of our podcast. listening. We hope you enjoyed and have a great day.

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.