Algorithms + Data Structures = Programs - Episode 78: C++ Algorithms & Profiling with Ben Deane (Part 4)
Episode Date: May 20, 2022

In this episode, Bryce and Conor finish their conversation with Ben Deane about C++ algorithms!

Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest: For Ben Deane, C++ wasn't even among the first 10 languages that he learned on his programming journey, but it's been the one that has paid the bills for the last 20-odd years. He spent most of that time in the games industry; many of the games he worked on used to be fondly remembered, but now he's accepted that they are probably mostly forgotten. These days he works in the finance industry, writing high-frequency trading platforms in the most modern C++ that compilers can support. In his spare time he watches a lot of YouTube's educational sector, practices the Japanese art of tsundoku, reads about the history of programming, avoids doing DIY, and surprises his wife by waking in the middle of the night yelling, "of course, it's a monad!" before going back to sleep and dreaming of algorithms.

Show Notes
Date Recorded: 2022-04-19
Date Released: 2022-05-20
ADSP Episode 72: C++ Algorithm Family Feud!
ADSP Episode 75: C++ Algorithms with Ben Deane (Part 1)
ADSP Episode 76: C++ Algorithms with Ben Deane (Part 2)
ADSP Episode 77: C++ Algorithms with Ben Deane (Part 3)
quick-bench.com
Tyler Weaver Tweet
Wordle
Gauss's Calendrical Algorithm
C++ std::sort
C++ std::nth_element
C++ std::max_element
C++ std::reduce
C++ std::transform_reduce
C++ std::accumulate
Google Benchmark
Thrust Parallel Algorithms
Beautiful Folds (A tutorial on the universality and expressiveness of fold)
McNugget Numbers

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
This is reminding me, this is Bryce's restaurant algorithm.
This is sort of reminiscent of, um, is it the chicken nuggets algorithm?
What?
Welcome to ADSP, the podcast episode 78, recorded on April 19th, 2022.
My name is Conor, and today with my co-host Bryce, we finish up part four of our four-part conversation with Ben Deane about C++ algorithms.
Don't worry, we will have you back, Ben, because clearly we bring Ben on,
and we actually end up talking about what the name of our podcast is,
Algorithms.
We talk about data structures and algorithms for all of four podcasts,
and there have been ones where we –
Dude, this has reduced the podcast, so I don't know what you're talking about.
That's true.
I mean, let's actually... we'll take 30 more seconds as a diversion. Go to the website: we have a very nice tagging system where you can go and look at all the different topics, and algorithms is one of them. The one that jumped out at me is the life advice topic.
So we have episode 01, 23, 25, 36, 37... is that the largest by category? 51, 54, 55, 56, 72.
Definitely not. C++, I think, is the largest.
So that's about 10 or 15 of our 75, so that's not that bad; that's about 20%. The largest has got to be C++. And how many are in life advice?
Life advice: just one, episode 44, Should You Go to School.
A lot more life advice than that.
I mean, I'll be honest: I create these tags a lot of the time while I'm editing the individual podcast, and then completely forget about them. Forget about... Strategic Decision Making, I think that was part three of the Patricia podcast. Yes. What is k?
It's a language. K was the very... yes, it's a derivative language, of J.
Yeah. If there's anything in the tags that you don't know... There's three episodes. It being lowercase is what threw me.
We are going to have to talk more about Wordle because there's an article in the ACCU magazine about Wordle that I haven't gotten to yet.
But anyways.
I almost lost.
I mean, today's Wordle.
I actually got today's easily.
I got down to my sixth guess.
And I could think of two words, and I knew there were more. And I ended up guessing correctly, and I went and did my Wordle hindsight: there were four different words I could have guessed. I had a 25% chance. So I almost lost my streak; I haven't lost yet. And I was going to retire today. I thought for sure I was going to miss it.
You were going to say something about Wordle, Conor. So once you miss, once you break your streak, you're going to retire? You're going to rage quit?
I'm exhausted, because I play four different Wordles. I play Wordle, Quordle, another one which is quantum Wordle...
You actually play quantum Wordle? Do you play Worldle?
I've played it once.
Do you want to join my quantum compiler team? Because if you can play quantum Wordle, then you can join. Then you're qualified to be on my quantum compiler team.
I mean, quantum Wordle is upsetting because it uses a different word list.
And so a lot of times...
That is not the only reason that quantum Wordle is upsetting.
I'm not that big on Wordle, to be fair,
mostly because when it came out,
I immediately thought about it
and I thought, well, this is just solvable by computer.
So it's not that interesting.
However, I've recently been made aware of,
you know, my Wordle strategy now
is the ultimate sort of play it safe.
Like, I don't want to get into... I just type in...
Thanks to my son: he told me the strategy where you go flame, brick, shunt, podgy. They are four five-letter words that have no letters in common, and once you've typed in those four, you've covered most letters, and then you should be able to get it.
What was... hang on, because I like to write down different people's strategies. So tell me.
Well, this is my son Henry's strategy.
Flame, brick, shunt, and podgy.
Shunt and podgy.
Yeah.
That's pretty good.
That's pretty good coverage.
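As a quick sanity check of that coverage claim, a minimal C++ sketch (not from the episode; the word list is the one Ben quotes):

    #include <bitset>
    #include <iostream>
    #include <string>
    #include <vector>

    int main() {
        // Ben's son's openers: four five-letter words with no shared letters.
        std::vector<std::string> words{"flame", "brick", "shunt", "podgy"};
        std::bitset<26> seen;
        bool disjoint = true;
        for (const auto& w : words)
            for (char c : w) {
                if (seen.test(c - 'a')) disjoint = false;  // letter already used
                seen.set(c - 'a');
            }
        std::cout << std::boolalpha << "disjoint: " << disjoint
                  << ", letters covered: " << seen.count() << "/26\n";
        // Prints: disjoint: true, letters covered: 20/26 (only j q v w x z are missed).
    }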
All right, all right.
Conor, I want to see the... All right, we've held this off.
Potentially, it's been a week for the listener. It feels like it's been a week for me.
Zooming out too much.
We've got the custom code written by yours truly.
We're going to have to make a few modifications.
But I guess actually it's really just you're going to have to rename, yeah, change that to V and then change 10,000 to N.
And boom, you're done.
That's so easy.
Oh, you're going to have to add random as an include.
You want to be parens there, not braces, right?
Because you're making a vector of one element, which is N, I think.
Oh, God.
Burns me again.
Burns me again.
Initializer list.
Everyone's favorite thing about C++.
I hate it so much.
I hate it so much.
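The gotcha in question, as a minimal sketch (hypothetical names, not the exact quick-bench code from the show):

    #include <cstddef>
    #include <vector>

    int main() {
        std::size_t n = 10000;
        std::vector<int> a{static_cast<int>(n)};  // braces: initializer_list, ONE element with value 10000
        std::vector<int> b(n);                    // parens: 10,000 value-initialized elements
        // a.size() == 1, b.size() == 10000
    }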
That's <random>. That's the name of the header.
Yeah. All right, we are running the benchmark. Will it compile first? We'll find out in about five seconds. How good is Bryce's C++?
Oh, my C++ is amazing.
Oh, yeah, that's because you don't have C++20 enabled.
No, we don't. We can't use CTAD if we don't... oh no, we do, actually. We just... what's actually... what's the error here?
Oh, it doesn't know how to CTAD, because we didn't give it an element, right?
Right. We switched from an initializer list to a constructor specifying the number of elements, so there's no actual data to deduce the element type from. Oh, C++. That's why we get paid. Job security through code obscurity. It's my favorite quote from the Scott Meyers books: at one point he says, "job security through code obscurity."
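And the CTAD failure they hit looks roughly like this (a sketch, assuming the benchmark's vector setup):

    #include <vector>

    int main() {
        std::vector v{10000};      // OK: deduces std::vector<int> from the element (one element!)
        // std::vector w(10000);   // error: a size alone gives CTAD nothing to deduce the value type from
        std::vector<int> w(10000); // fix: spell out the element type
        (void)v; (void)w;
    }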
Well, this is probably going to take another 30 seconds. Do you want to just mention that, or do you want to go ahead and mention some more? Are you speaking at C++Now?
Not this year. I decided to take a break and not submit. I was going to mention more favorite algorithms for future episodes, if you like.
Sure.
I'm pretty interested in calendrical algorithms. Like I was saying to you, the algorithm for finding the date of Easter is an algorithm that drove much mathematical progress in the Western world some hundreds of years ago.
It's interesting, the things that cause technology or mathematics to advance. Like, you know, the fact that gaming has driven a lot of technological advances.
And yeah, you mentioned that to me the other day, Ben, about Easter, and I had no idea.
It's a difficult thing, it turns out. And it took up mathematicians' time in, whenever, the 16th or 17th century.
Interesting. Is there a specific name of an algorithm that is used for that, or is it just a general category?
The most understandable algorithm is usually attributed to Gauss.
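For the curious, a compact version of the Gregorian computus in C++ (the Meeus/Jones/Butcher variant, a descendant of the method attributed to Gauss; shown for illustration):

    #include <iostream>
    #include <utility>

    // Easter Sunday (month, day) for a year in the Gregorian calendar.
    // Meeus/Jones/Butcher algorithm; all divisions are integer divisions.
    std::pair<int, int> easter(int year) {
        int a = year % 19;                         // position in the 19-year Metonic cycle
        int b = year / 100, c = year % 100;
        int d = b / 4, e = b % 4;
        int f = (b + 8) / 25, g = (b - f + 1) / 3;
        int h = (19 * a + b - d - g + 15) % 30;    // epact-like correction
        int i = c / 4, k = c % 4;
        int l = (32 + 2 * e + 2 * i - h - k) % 7;  // weekday correction
        int m = (a + 11 * h + 22 * l) / 451;
        return {(h + l - 7 * m + 114) / 31,        // month: 3 = March, 4 = April
                (h + l - 7 * m + 114) % 31 + 1};   // day of month
    }

    int main() {
        auto [month, day] = easter(2022);
        std::cout << month << "/" << day << "\n";  // prints 4/17: April 17, 2022
    }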
Interesting. And the results are back: Bryce, completely vindicated about the data mattering, because it has completely changed the order.
And now what we've said in the past is actually false.
The two max element algorithms, basically, they're not identical because in the lower numbers, max element two definitely is slower.
But then once you get to basically about a thousand elements, they're the same.
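For context, a quick-bench style harness for one of these variants looks roughly like this (quick-bench runs Google Benchmark underneath; the names, sizes, and distribution here are illustrative, not the exact code from the show):

    #include <benchmark/benchmark.h>
    #include <algorithm>
    #include <random>
    #include <vector>

    static void BM_MaxElement(benchmark::State& state) {
        std::vector<int> v(state.range(0));            // parens, not braces!
        std::mt19937 gen(42);                          // one fixed seed: one input sequence
        std::uniform_int_distribution<int> dist(0, 9999);
        std::generate(v.begin(), v.end(), [&] { return dist(gen); });
        for (auto _ : state)
            benchmark::DoNotOptimize(*std::max_element(v.begin(), v.end()));
    }
    BENCHMARK(BM_MaxElement)->Range(8, 1 << 20);  // sweep the input size
    BENCHMARK_MAIN();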
All this demonstrates is that we still have insufficient data.
This is just from one sequence. The way to actually benchmark an order-dependent problem like this: you need to run an ensemble where you vary the seed of the input. So for me to find this data believable, one, I'd want to do multiple trials of each test, but also I would want to do, like, a hundred trials, each with a different input sequence. By different input sequence, I mean changing the seed to the random number generator. And I would also want to test a few special sequences. Some of the special sequences would be, like, the sequence where the input is all the same element.
They're all zero.
And one would be like, you know,
Reverse sorted or something.
Yeah, yeah.
Reverse sorted.
Like, you know, the first two elements happen to be the, you know,
the highest.
The last two elements happen to be the highest.
Yeah.
So you'd want to try some battery of that because I bet you if we changed the seed around a few more times, we would see more variance.
Now, even just this input sequence that we're using is far better than the previous input sequence because the previous input sequence was regular.
It would be regular because of this modulo pattern. So the highest numbers, you know, the 50th or 60th number or so, would always be the one closest to 10,000. And so, yeah, I think to actually get some real answers here, we would have to do a little bit more investigation and a little bit more work. Maybe Tyler would be willing to do that for us.
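A sketch of that ensemble idea, with the special cases included (names are illustrative):

    #include <algorithm>
    #include <cstddef>
    #include <numeric>
    #include <random>
    #include <vector>

    // One pseudo-random input sequence for a given seed.
    std::vector<int> make_input(std::size_t n, unsigned seed) {
        std::mt19937 gen(seed);
        std::uniform_int_distribution<int> dist(0, 9999);
        std::vector<int> v(n);
        std::generate(v.begin(), v.end(), [&] { return dist(gen); });
        return v;
    }

    // Special cases worth adding to the battery.
    std::vector<int> all_equal(std::size_t n)      { return std::vector<int>(n, 0); }
    std::vector<int> reverse_sorted(std::size_t n) {
        std::vector<int> v(n);
        std::iota(v.rbegin(), v.rend(), 0);  // fills as n-1, n-2, ..., 0
        return v;
    }

    int main() {
        std::vector<std::vector<int>> inputs;
        for (unsigned seed = 0; seed < 100; ++seed)  // the ensemble: 100 different seeds
            inputs.push_back(make_input(1000, seed));
        inputs.push_back(all_equal(1000));
        inputs.push_back(reverse_sorted(1000));
        // For each problem size: time each algorithm over every input and
        // report the mean (and the spread, which is just as informative).
    }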
My question that I really want answered is: across these two different... in fact, across the two different sequences and all the different versions of these, although I guess we haven't made any changes to accumulate and transform_reduce, so it's really just the two different sequences of data: transform_reduce has always been slower than accumulate. And I'm confused as to why that is, based on what we talked about earlier, that even in the non-parallel version it's still taking advantage of some stuff. Why is it slower than a fold left?
I do not think we have sufficient data to draw that conclusion.
I think the only data that we have now is data that indicates that we do not
have enough information to make any conclusions.
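For reference, the two calls being compared look roughly like this (a sketch; the actual benchmarked operation may differ):

    #include <functional>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> v(1000, 1);
        // Strict left fold: applies the operation in order, left to right.
        int a = std::accumulate(v.begin(), v.end(), 0);
        // Reduction: the implementation may reassociate and reorder the operations,
        // which is what permits vectorized or parallel execution.
        int b = std::transform_reduce(v.begin(), v.end(), 0, std::plus<>{},
                                      [](int x) { return x; });  // identity transform
        return a - b;  // 0: same answer for +, more freedom for the implementation
    }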
So what have we been talking about for like the last hour?
How to keep Bryce awake.
Yes.
But I mean, this is almost always the case when I'm presented with performance data. The benchmark isn't, I'm not going to say wrong, but you're starting off an exploration, and you may not be measuring the thing that you wanted to measure. The process of arriving at performance truth is almost always an incremental process.
And once you've been doing this for a while,
you start to learn to be skeptical of data.
And the only way that you get less skeptical of the data is you go and you try a new thing.
Any time that you're presented with performance data, assume that it's wrong until you can build a really strong case that it's correct.
Because it's so often that there is not a mistake, but just that you're measuring the wrong thing or you're measuring some unexpected effect or there's some other thing that's influencing your experiment.
And you end up drawing the wrong conclusions, because you just assumed that the numbers you were seeing were actually, you know, the truth. Whereas oftentimes those numbers might be the right numbers for what you were measuring, but you were measuring the wrong thing. So my general advice to anybody doing any sort of
benchmarking work is be incredibly, incredibly skeptical until there is strong, overwhelming
evidence otherwise. So I guess the question is... I mean, Tyler, or someone else, may or may not now go and do this as an exercise.
Is there a library that you've used in the past that does all this?
Or is it just hand-rolled timing stuff?
Oh, I have little frameworks that I've pulled around from place to place.
But the framework that you use is less important than the technique. I mean, you could use any framework and get to the right result. I'm not necessarily that sure about quick-bench, though, because it seems like the charts that are generated... I've only seen two.
I've seen these.
The charts here are not how I would generate charts.
How would you do it? Export to Excel and then just do it there?
Excel or I would use some framework or some other tool.
Like, a large part of my job at NVIDIA is producing performance graphs. And one of the fundamentals of producing good performance graphs: one, higher should always be better. And that means your Y axis should not be wall time; there are very few exceptions to this. Almost always you want your Y axis to be some metric of throughput, because that makes higher be better. And I mean, if there's, you know, 50% uncertainty in that timing, or 100% uncertainty in that timing, that tells me that I'm probably trying to measure something that is too small to really be measured with the timer that I'm using, and so I won't be able to get any useful information out of it. And frequently people end up doing that unintentionally, where they measure something that's too small. So yeah, I mean, I've written frameworks for doing this in the past. You know, if I need to do something quick and dirty, I'll often just throw it in Excel.
You know, Google Bench is like pretty good.
I have some moral objections
to some of the things that it does,
but it does a pretty good job.
There's a bunch of good Python libraries
for doing these sorts of plots.
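In Google Benchmark terms, the higher-is-better advice might look like this (a sketch, assuming items per second is the throughput metric you want):

    #include <benchmark/benchmark.h>
    #include <numeric>
    #include <vector>

    static void BM_Sum(benchmark::State& state) {
        std::vector<int> v(state.range(0), 1);
        for (auto _ : state)
            benchmark::DoNotOptimize(std::accumulate(v.begin(), v.end(), 0));
        // Report items/second so the Y axis is throughput: higher is better.
        state.SetItemsProcessed(state.iterations() * state.range(0));
    }
    BENCHMARK(BM_Sum)->Range(1 << 10, 1 << 20);
    BENCHMARK_MAIN();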
Yeah, but like at the end of the day,
like when you're trying to answer a performance question,
like ideally your data should be simple enough
that you don't need a graph.
Like, you know, yeah, here in this case there's the question of input size, sure. But it's going to vary somewhat; it's not going to be that interesting once you get beyond, like, small. And so you could just send me this data in, like, a plain-text table, and I'd be able to infer the results from it, if the data was presented in the correct way, if you will.
So yeah, I mean, I don't think it matters really what graphing tool you're using.
If you've collected the right data, and you've collected it in the right way, I think you'll get to the right result. But in this particular problem case, there's going to be an overabundance of data. A lot of benchmarking problems, you end up with an overabundance of data, where there's a ton of different things that you can benchmark, and different parameters. And then it becomes unclear what things are actually useful.
And one of the problems here is, in our discussion over the past hour we just identified that there is another axis. And any problem that has two primary axes is going to be very difficult for humans to wrap their heads around. What I mean by that is that the assumption was that the axes here were: your independent variables were the input size and also, you know, what the particular algorithm was. So input size and algorithm, and your dependent variable was time. The thing that we've just demonstrated is that there's another independent variable, which is the input sequence itself. And there's actually a ton of different input sequences. There's a wide range of pseudo-random input sequences; that's fine, because all of the different pseudo-random sequences will boil down to one data point. You'll just say: we're going to try a hundred different pseudo-random sequences, each with a different seed, and then we're just going to average them. That'll be fine. But there's also, like, the special-case sequences.
Those ones are probably data points that are not super interesting.
Those like tell us about corner cases, but we don't want to focus on those.
But that still means that you've now got three dependent variables.
And one, just in terms of visualization, three dependent variables means that you need to
slice or you need some sort of 3D visualization. And it's very hard to tell how you're going to get
useful information out of that. But I mean, I think the way to do it would be to
try to remove one of those dependent variables.
And the way to do that would be to remove the
order or the input sequence order dependent variable.
And you do that by saying, well, okay, we're going to try,
you know, 100 different input sequences for each problem size. And then we will just take the
average time of all of them. Um, and maybe like the, you know, uncertainty of that, um, too,
because that's, you know, a useful thing to have. And then you're back down to just the,
the dependent variables of, uh, which algorithm and, uh, the input size.
But you might get into weird...
Well, one of the nice things, I guess, would be that
if you pick a large enough set of these pseudo-random input sequences
for each problem size, you're unlikely to run into too many weird artifacts.
It probably needs to be like at least 100 or so
to avoid running into a bunch of weird cases.
All right.
Well, we will see what our listener does
with all of that information.
Maybe we're going to get a beautiful, you know,
JavaScript rendered, you know,
drop down three-dimensional viewing of all this stuff.
And well, I feel like this is going to be,
this is our new lost reduction series
where we did like four episodes
where we kept on revisiting.
And now we're going to be doing this
with this problem for like two more.
So, CUB is NVIDIA's parallel algorithm primitives library.
And all of the benchmarks and tests
for all of the non-sort algorithms
take like an order of magnitude less time to run
than for the sort algorithms.
Because not only do you have to, like, think about this in terms of benchmarking,
but this problem is actually far more important
for unit testing your, you know,
sorting or rearranging class of algorithms.
Like, you need to test all of these weird corner cases of sorting, and you need to have a good, comprehensive test of these different input sequences to make sure that you're doing the correct thing, especially in parallel. And when you think about it in parallel: if you're doing this problem in parallel, those corner cases can become a lot more important, because the corner cases may expose, you know, weird race conditions that you'd not thought of.
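A minimal sketch of that kind of corner-case battery for a sort (std::sort stands in for the algorithm under test; the generators mirror the special cases discussed above):

    #include <algorithm>
    #include <cassert>
    #include <random>
    #include <vector>

    // Run the sort under test and check the postcondition.
    void check(std::vector<int> v) {
        std::sort(v.begin(), v.end());
        assert(std::is_sorted(v.begin(), v.end()));
    }

    int main() {
        check({});                          // empty
        check({42});                        // one element
        check(std::vector<int>(1000, 7));   // all the same element
        std::vector<int> rev(1000);
        for (int i = 0; i < 1000; ++i) rev[i] = 1000 - i;  // reverse sorted
        check(rev);
        for (unsigned seed = 0; seed < 100; ++seed) {      // many seeded random inputs
            std::mt19937 gen(seed);
            std::uniform_int_distribution<int> dist(0, 9); // many duplicates on purpose
            std::vector<int> v(1000);
            for (auto& x : v) x = dist(gen);
            check(v);
        }
    }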
Testing is important, to sum up.
To sum up: testing, yes.
Testing is important, is what I was saying.
Not just benchmarking.
Yes.
Well, I'm curious to see.
I mean, admittedly, this part,
this is probably going to be a four-part.
This is great.
This is good because I'm going on vacation.
This is great content, though.
This is just... we could talk about algorithms and associated things for hours, if not days. I mean, if I was younger me...
We haven't even talked about partition_copy.
Yeah. If I was younger me listening to this, this would be great, because I didn't have any of this. At some point I started at zero in terms of my knowledge about algorithms.
And yeah, so did we. Even the observation that, like, std::accumulate can be used to implement so many of the other algorithms: that was not something that was obvious to me for a very long time. I think probably the first time I really picked up on that is when I read the beautiful folds paper that's recommended, from... Graham... Paul Graham, maybe? I might be getting that wrong. I'll link the paper, but it's, yeah, beautiful folds, and it just talks...
Graham Hutton.
Yeah, yeah, Graham, that sounds right. Was it Hutton or Sutton? It's Hutton, if it's who I'm thinking of. What did I say? Graham Paul?
I wonder who that is.
You said Paul Graham.
Or Paul Graham.
Paul Graham, he's a Lisp person.
Graham Hutton is a... That's cool.
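In the spirit of that paper, a small illustration of other algorithms written as folds over std::accumulate (a sketch, not from the episode):

    #include <algorithm>
    #include <cstddef>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<int> v{3, 1, 4, 1, 5, 9, 2, 6};
        // max_element (by value) as a fold
        int mx = std::accumulate(v.begin() + 1, v.end(), v.front(),
                                 [](int a, int b) { return std::max(a, b); });
        // count_if as a fold
        std::size_t evens = std::accumulate(v.begin(), v.end(), std::size_t{0},
            [](std::size_t n, int x) { return n + (x % 2 == 0); });
        // reverse-copy as a fold: fold really is universal
        std::vector<int> rev = std::accumulate(v.begin(), v.end(), std::vector<int>{},
            [](std::vector<int> acc, int x) { acc.insert(acc.begin(), x); return acc; });
        (void)mx; (void)evens; (void)rev;  // 9, 3, {6, 2, 9, 5, 1, 4, 1, 3}
    }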
All right.
Bryce needs to go to sleep.
But first, Ben, some important C++ Now related questions.
Okay.
One: is Ava coming with you?
No, she is not.
Not this year.
Okay. Yeah, yeah, I understand. Could have brought the big guinea pigs.
Is that a COVID thing? You don't want them to get...
So, my girlfriend's coming to C++Now, and we're bringing her dog, which is going to be very complicated.
Okay.
What?
You don't want to know the amount of complicated things that's involved here. But, yeah, I was hoping that other people would be bringing their partners
so that she would not be stuck alone with a group of tech bros.
Standard bros.
But okay, okay, next question: which of the cuisines do you wish to join us for dinner for? The sushi night? Like, one of the Italian nights? I think that... yeah, it's basically, like, you have the choice of European or sushi. So pick between those two.
I got to say, I'm not.
Sushi is one of those things that I periodically try, because I think, hey, you know, I should really develop a taste for this. And I've never developed a taste for it. Having said that, you know, the sushi restaurant... the bottom is closed this year, only the top, I think. And the top is mostly just sushi.
Well, you know, I'm sure it's all good, but of those two then, I would probably go for the European.
Yeah, you probably want the Italian.
Yeah.
So the way that Bryce dinners are going to work this year is that Bryce dinners are going to be a four-person affair.
It'll be me, the girlfriend and two invited guests.
So you will be informed which... I'll let you know which night I have in mind.
I'm on tenterhooks.
Why? What is four?
Is this a COVID thing?
No. Well, my rule of thumb has always been that the optimal group size is, like, you know, six, because if you do five, they may try to force you to a four-top table, and four-top tables are usually not actually that ideal. So usually I would take six people to dinner, because then they'd be forced to give you a bigger table; they can't force you onto a four-top table. And if not six, then probably three is the optimal number, because then you get a four-top table and you have extra space.
So you're choosing the third-optimal number.
Well, four is not really the optimal number.
But if you're taking your partner to a tech event and you want to bring some of the professional folks from the tech event to dinner,
three is not really great because then it's very intimate for the tech person.
It's like, you know, you're a third wheel.
And you can't do more than four because then your partner is totally and hopelessly outnumbered.
And so really the only options are either four where you don't bring other tech people.
It's either four or two.
So I've decided to go with four.
And four also gives you the option
that if somebody else has brought their partner,
then you can bring that person and their partner.
Yeah, exactly.
This is reminding me.
That's the strategy.
This is Bryce's restaurant algorithm.
This is sort of reminiscent of, is it the chicken nuggets algorithm?
What?
I hope you're still recording, Ben.
Now, I've got to remember.
So, as you may be aware, at your favorite fast food restaurant, whatever that may be, you can buy chicken nugget meals in various sizes, right? But they are fixed sizes. And so there's a problem out there of, like, what's the optimal way to order n chicken nuggets given the fixed sizes and the prices. I think that's it; it's something similar to that.
Oh, something like the knapsack problem or something like that?
Yeah, except that might be different. People being people, they've turned it into... I think at one point there's one of these restaurants where you can order a custom number of chicken nuggets, right? I think this is the problem, right? And so let's say I want 17 chicken nuggets, because I've found out that that's my optimal number of chicken nuggets.
How should I fulfill that order given the standard sizes of chicken nuggets that are on offer?
I think that's also related to the problem.
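This is the McNugget numbers problem linked in the show notes: with the classic box sizes of 6, 9, and 20, the largest order you cannot fill exactly is 43, and 17 happens to be unfillable too. A minimal dynamic-programming sketch for the fewest-boxes variant:

    #include <algorithm>
    #include <climits>
    #include <iostream>
    #include <vector>

    int main() {
        const std::vector<int> sizes{6, 9, 20};  // the classic box sizes
        const int n = 17;                        // Ben's hypothetical order
        // boxes[k] = fewest boxes summing to exactly k nuggets (INT_MAX = impossible)
        std::vector<int> boxes(n + 1, INT_MAX);
        boxes[0] = 0;
        for (int k = 1; k <= n; ++k)
            for (int s : sizes)
                if (k >= s && boxes[k - s] != INT_MAX)
                    boxes[k] = std::min(boxes[k], boxes[k - s] + 1);
        if (boxes[n] == INT_MAX)
            std::cout << n << " nuggets cannot be ordered exactly\n";  // true for 17
        else
            std::cout << n << " nuggets in " << boxes[n] << " boxes\n";
    }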
This is a great way to wrap things up.
When I was in Sunnyvale, there was a McDonald's on Matilda Avenue, for those of you local
to the valley in Sunnyvale.
And I don't know how this worked, but they had a 10 pack of nuggets for $4.99.
And I think they had mispriced a 20 pack for $5.
Right.
It was one cent more for an additional 10.
And let me tell you, I thought that was a banger of a deal.
And I may have had quite a few 20-piece McNuggets while down in the valley.
Nice.
There is another, going back to the Bryce dinner seating algorithm: a strong preference for an even number of folks, partners aside. Because if you've got six, then you can sort of have three groups of two people talking with each other, or, like, larger groups talking. I think when you have odd numbers, it becomes more likely that somebody gets left out of a conversation, especially at, like, five and seven. And obviously anything above eight, it's just chaos, just absolute chaos. Like, you're never going to be able to deal with, you know, paying the bill.
There's a window into Bryce as an individual: that he has put this much rigor and thought into the optimal number.
I mean, the other nice thing about four, where one person has their partner there with them, is that it's only a three-way split of the bill, and almost every restaurant's going to let you do a three-way split of a bill. It's only once you get up to, like, a four-way split that they're going to start having problems.
Yeah, I do put a lot of thought in. And, like, there may be a spreadsheet with, you know, who's going to be going to dinner with Bryce on which days at C++Now. Lunches, though... lunches are unplanned. Lunches are unplanned.
All right, I've got to desperately go to sleep.
This has been awesome. This has been great. Ben, we're going to be having you back.
That'd be great.
Yeah, we'll talk about the... we didn't get to the end of the heap, but we will have you back to get to the end of the heap.
I will still be reaching out to Conor with opinions.
Yeah, it's great to talk to you
too about algorithms in general,
whether it be on the podcast
or not. Well, I think the podcast
is a great format because then
everyone else gets to... I mean,
maybe not all our listeners
care as deeply, but...
Conor's just saying that he wants
a recorded record of all of
his interactions with you
All right.
I seriously have to go, though. I, like, do need to sleep, like, very badly.
Thanks for listening. We hope you enjoyed, and have a great day.