Algorithms + Data Structures = Programs - Episode 227: Re: The CUDA C++ Developer’s Toolbox
Episode Date: March 28, 2025
In this episode, Conor and Bryce chat about Bryce's talk The CUDA C++ Developer's Toolbox from NVIDIA GTC 2025.
Link to Episode 227 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Socials
ADSP: The Podcast: Twitter
Conor Hoekstra: Twitter | BlueSky | Mastodon
Bryce Adelstein Lelbach
Show Notes
Date Generated: 2025-03-20
Date Released: 2025-03-28
NVIDIA GTC 2025
NVIDIA GTC Trip Report
⭐ The CUDA C++ Developer's Toolbox - GTC 2025 - Bryce Lelbach
Thrust
RAPIDS.ai
CUTLASS
CUB
nvbench
How to Make Beautiful Code Presentations
Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
Like I think honestly, this is your best work, Bryce.
It's your best work maybe of your career.
And because CUDA, let's be honest, has a lot of work to do
when it comes to its onboarding and education experience.
And I think that this is the start of something beautiful.
Welcome to ADSP: The Podcast, episode 227, recorded on March 20th, 2025.
My name is Conor, and today with my co-host Bryce, I ask him questions about his GTC 2025 talk entitled The CUDA C++ Developer's Toolbox, and more.
One, two, three, four, five screenshots, five screenshots.
So not eight or nine, but definitely the most I've taken of any other talk I've watched is one,
which leads me to, I mean, I'm not gonna screen share
because we don't need to.
You'll know the slides I'm talking about
and maybe we'll include it in the show notes.
By far, my favorite slide,
arguably better than Jensen's equivalent folks.
And Jensen said it was his favorite slide
if you watch the keynote.
And Jensen, if you're listening to this for some reason,
your slide was beautiful too,
but for the discussion that's about to ensue,
you'll understand why I like this one better.
It is the four by six grid of logos of libraries.
And this is why I've been up, I've lost sleep.
I think I've had like four or five hours of sleep.
I couldn't get to sleep for like an hour.
I woke up at another point in the night,
like couldn't get back to sleep for an hour
because all I am doing is thinking about this slide deck.
First of all, where did you get all the logos from?
Oh man, so much work went into that slide. I know, I know. I saw the slide and I've never,
I've lost sleep over this slide. So just so that the listener is not like, what are they talking
about? We've got a four by six grid on a single slide of 24 different libraries. And we'll list off a couple.
Thrust, libcu++, RAPIDS, cuFFT, cuTENSOR, cuDNN,
CUB, CUTLASS, etc.
Some of them have logos.
Some of them have modified versions of their logos to make it look better.
Some of them just have text.
Anyways, and I haven't seen most of these logos. We do have to clarify that they are not logos. They are graphic signifiers.
All right, fantastic. This is exactly what I was hoping to get into.
Some of them are logos. Thrust? Is thrust a logo? That's a logo.
It is a graphic signifier.
Okay, none of these are logos. Even rapids? Rapids doesn't count?
Nope.
They're all graphic signifiers. All right, I'm sure that's for some legal reason that I'm not involved.
But walk me through what went into this slide.
So none of these are official logos.
Why do some of them not have graphics?
Where did you get these graphics from?
Was AI involved?
Yeah, so actually, AI was involved, but the place where AI was involved
was a different place than those graphic signifiers.
The place where I used AI most heavily
was on the slide introducing Thrust and Cub,
where I list the four different kinds of things that
are in Thrust and Cub.
And there's a little graphic signifier for each one.
And those were all generated by O1,
although it required me holding a baseball bat over the model.
It's an interesting question of why I would generate images
with a large language model, right?
Let me put it differently: there are large language models specifically designed
for generating images. There's one I've used in the past, I forget the name of
it. I did not use one of those models and the reason I did not use one of those
models is because I wanted to generate very simple vector graphic images.
In particular, I wanted to generate SVGs.
And so I viewed this as being more of a coding problem
than a image generation problem.
And so I thought it would be better to use something
like O1 or Claude.
I just used O1 out of convenience; that seems to be the day-to-day model that I use for
things.
So I had it generate those SVGs and it took a bit of back and forth.
I wanted them in a specific color scheme, which it was able to do. But then there were things like, I told it to
make a 200 by 200 SVG, and it would make something, but it would put padding around it, right?
What I want in these little graphic signifiers is for the graphic to touch the boundaries
in either the horizontal or the vertical dimension,
because I'll add padding myself.
And it took a little work to convince it that, actually, I know I don't want any padding.
And there was a bunch of iteration back and forth. Like, one of the mistakes it made frequently was it would get the
layering wrong. Some of these graphic signifiers
have dots and then arrows going between them, and
sometimes it would have the arrows be on the layer in front of the dots, so you'd
see the start of the arrow in the middle of the dot, and it doesn't
look good.
And I would be like, no, put them in this order.
And I was surprised that if I told it a complex instruction
like have the dots in front of the arrows, and then the
arrows in front of the other element, and then the other
element in front of this, it kind of got the idea.
And then the graphics on that page that you saw: all the CUDA math libraries have their own
little graphic signifier, those are the ones on the left,
and I love those because they have a very nice uniform style.
And for slideware, I love having these simple vector graphic images.
And I've been trying to think about how to describe the property of what I'm looking for in a slide graphic, and I
think it's like: one,
I want something with no shading, no shadows or anything
like that.
Sometimes they're 3D, sometimes they're 2D;
usually it's like a 2D flat perspective.
I usually use very few colors.
And I want it to be a vector graphic so that I can scale it
and have it look very nice.
And I want it to be simple.
And one of the reasons that I like it to be simple is
because if it's simple, then I can make it as small or as
large as I want.
And I don't have to worry about there being a loss of the detail of it.
Some of the libraries didn't have graphic signifiers,
and I just made one; like that little graphic signifier for nvbench
did not exist a week ago.
And then some of them, like the cuDNN and the TensorRT ones,
I found there was a blog post on cuDNN,
and a blog post on TensorRT.
And those are not official logos for those libraries.
But the blog post had a nice image
that was at the header of the blog post.
It was an official NVIDIA blog post.
And I was like, this is a nicer way than just putting
cuDNN in text up on the slide.
So yeah, and in at least a couple of these libraries, I
just went to the person who maintains the library.
And I'm like, hey, look, your library is going to be on this
slide.
Do you have a graphic signifier that you want me to use?
And I think the next time I give this talk,
there will be more of these that will actually
have graphic signifiers and not just text.
Because at least one person promised me
that they would create the graphic signifier.
And I didn't have a chance to check back in with them.
But they have an idea of what it would look like.
We just haven't produced it yet.
But I know how much you, Conor, love having programming
language logos.
And I was talking with some people internally about this.
So CUDA C++ does not have a programming language logo. And one of the points that I made to people
is that I think the best programming language
logos and the best graphic signifiers in general
are the ones where you don't have to put text below it
to say what it is.
Like the C++ logo, the C logo, the Fortran logo, you don't have
to put the text below it saying that it's C++ or Fortran or C. Now the Python logo
doesn't; like, if you have no familiarity, you're not going to know. But I feel like
Python is ubiquitous enough that maybe it's not a problem. I don't know about the Rust logo;
like, maybe it's fine.
But the ones that I like best for programming languages are the ones where, even if you don't know
the logo, you can tell what the programming language is.
I mean, I've got so many thoughts and follow-up questions, and this is clearly
gonna turn into two parts. So maybe the start of my questions will be part two of this episode.
So you got the GTC recap, which probably was only 25 minutes,
but at this point this is gonna blow way past the 35-minute, like, max one episode.
And I'm about to tweet, low on sleep, folks.
I did mention I lost sleep over this slide, and I still got other questions to get to.
Um, I'm about to tweet, with your permission,
Jensen versus Bryce slide deck. Is that fine? I'm not gonna add Jensen.
That's fine. I'm just gonna pretend that I didn't hear you ask me the question.
All right, but it needs to be done. Mostly, I mean, I don't need to put "Jensen versus Bryce
slide deck."
It's just this slide that I'm talking about versus the one... because, do you know the slide
that I'm referring to in Jensen's talk? Then actually I don't need to tweet it.
But now that I said I will, I will, because I needed you to, and actually it probably will
be good.
So head to Twitter.
I'm about to tweet this right now.
Actually, should I just put Jensen versus Bryce?
No commentary.
People can infer what they want. And once you're at Twitter, and this is why I've lost so much sleep, I'm
about to explain. So once again, for the audio listener, Jensen's slide is, we will say,
much more consistent in that every single one of the, what I'll assume is a cell phone
screen with some artistic art on it, it looks roughly the same.
Unlike Bryce's slide,
there are no missing graphics signifiers.
That being said, they're also way less meaningful
because these are all just artistic art things
that have nothing to do with the actual underlying
technologies.
And on top of that, it does some weird things, or weird,
or maybe it's the future: all of the cu-prefix libraries are followed by uppercase letters, which is not the case in your slide deck,
and which is also not the way that these libraries are named.
But at least on Jensen's slide that is the case.
And...
So, I have slightly mixed feelings about that. You know, I do feel like we ought to be consistent in the style and the spelling.
And the math libraries all consistently do cu and then all uppercase letters now.
And I think part of that is because cuBLAS, which is one of the first ones, cuBLAS and
cuFFT were the first two
CUDA libraries.
Well, BLAS is an acronym and FFT is an acronym.
So I think that started the trend.
And then originally cuSOLVER, and solver is not an acronym, solver is a word, cuSOLVER
was originally cu, capital S, and then lowercase for the rest of solver. And
the same for cuSPARSE. There were periods of time where it was spelled differently.
But I think that for consistency with cuBLAS, cuFFT, and cuDSS,
it became this way.
I mean, that's fair enough. I completely agree that we should be consistent.
I'm just merely pointing out that there's a delta.
And one of the biggest things that I've lost sleep over
is the fact that on your slide, Rapids is there.
And one, first of all, Rapids is purple and...
I'm gonna get in so much trouble.
I'm gonna get in so much trouble.
I mean, it's there and it's been greenified.
I don't necessarily, I'm not too upset by the fact that it's been greenified because you want to make the slide look nice.
If it was purple it would look awful. But on Jensen's version of the slide, he doesn't mention Rapids.
He just puts cuDF and cuML. And so for folks outside of NVIDIA not familiar with RAPIDS,
which is where I started my NVIDIA career, on the RAPIDS team before switching to research: RAPIDS is basically an umbrella
term for a bunch of these libraries, which include cuDF.
The DF in cuDF stands for DataFrame, which is the Pandas equivalent, and the ML in cuML for machine
learning.
And the reason I've lost sleep
is that it is indicative of this kind of,
I don't wanna say marketing problem,
but like... And also, too, there was another slide
from the Python talk where they show this graphic
of all the different Python technologies, of which NVIDIA was not a part. It shows PyCUDA in 2010,
then Numba, CuPy, PyTorch, JAX, and then RAPIDS, Warp, CUDA Python, and Triton,
with RAPIDS, Warp, and CUDA Python being the only NVIDIA-built projects. All the other ones
were outside, which I thought was kind of curious: that for the first decade NVIDIA didn't have any
direct ownership over these projects.
But, and I'm not sure about Warp
because I actually haven't used it,
I believe RAPIDS requires Conda at the moment.
cuPyNumeric requires Conda,
whereas like JAX and PyTorch, you can just pip install them.
And anyway, so there's just this, what do you call it,
like, disconnect: you've got, in the first two verticals, all the cu libraries, and then you've got
RAPIDS, which is kind of like an umbrella for a bunch of cu libraries. So I'm getting to my
question. What are your thoughts and feelings about this?
Yeah, so you know, I actually, so one, I do have to say, I liked Jensen's slide. And I
think that his slide and my slide
are trying to do different things, because they're
trying to speak to different people.
My slide and my talk is targeting purely
a developer audience.
And Jensen's speaking to not just developers,
but also people who are in these particular industries.
He's speaking to analysts, to everybody who cares about GPUs.
And I think that his slide is far more effective for a broader audience.
But I also love that, despite the fact that Jensen's got a much wider audience that he
has to speak to, he's still able
to have a talk that's so developer focused.
I think it's great that he's still got a talk where he's got a slide there where he calls
out and speaks about a lot of the great libraries that we built.
That's I think really special.
I think there's not a lot of tech CEOs who will talk about, you know, developer libraries,
like specific developer libraries that their company's building.
So I actually was kind of excited by Jensen's slide.
I debated whether I should, and I actually think if I had had time, I would have taken
RAPIDS off of my slide and I would have put cuDF up there instead.
It's not solely because of the purple-logo nature of RAPIDS.
It's because RAPIDS isn't a library.
RAPIDS is a collection of libraries, and to some degree it's more of a product.
I wanted this slide to be not product focused,
but library focused.
Like the point of my talk was to,
or the point of this slide in my talk was to point people
to all the tools that are available to them.
And so I think that when I give this talk in the future,
I'll probably, instead of RAPIDS, just have cuDF there.
The problem is there's not enough real estate in the slide
for me to do what I would really want which would be to list all of the various Rapids
libraries, of which there are many. But maybe what I would do is I would take off one or
two of the math libraries and put some of the Rapids libraries up there too.
Well, I'm thinking of this slide, which I know will clearly evolve over time
through the iterations that you give this talk.
But like I think, you know, one of the questions
at the end of the talk was about docs and, you know,
best practices in terms of education,
which is one of the primary focuses of CCCL and NVIDIA now.
But what would be amazing is if like this was,
there was a JavaScript, I don't know if you need D3
for the kind of animations that I'm thinking of,
but like imagine this as a landing page website
where you hover, you take your mouse and hover over these
and it does a little jiggle every time you go over each one
and then for RAPIDS, you would get this little balloon
that expanded into cuDF and cuML.
And it would be a good starting point.
The other thing, too, which is a whole other layer of this,
is: what is the language that you are interacting with?
Technically RAPIDS at the highest level is Python, right?
There are C++, CUDA C++ libraries
that you can target that lie beneath them.
But it seems a bit odd to have RAPIDS sitting right on top of Thrust.
I mean, actually, it doesn't seem odd, because RAPIDS is kind of built on top of Thrust,
but Thrust is a CUDA C++ library, whereas RAPIDS is primarily a Python library
that has C++ lower-level libraries that you can target. Anyway, so depending on the,
what do we call them, graphic signifiers, you have access to different languages
per tool or per library.
So the hardest part about this slide was figuring out the ordering and the grouping of these.
And there are a couple of different axes along which I needed to group them. First, there
were libraries that do similar things. So on the left-hand side, there are eight libraries that are typically known as
the CUDA math libraries.
And then we have the three CUDA C++ core libraries: Thrust, libcu++, and CUB.
Then there's nvbench, which is maintained and developed by the CUDA C++ Core Libraries team,
but it's really, it's a benchmarking framework.
I think of it more of like a developer tool.
And then there's the CUDA runtime, which is what everything here is built on top of.
Then some of these libraries are device side libraries.
Some of these libraries are libraries where it's a thing that you call from your CPU that launches work on the GPU.
Some of them are things that you purely use in device code on the GPU, and some of them are a mix of both. And so, like,
there's a couple on here, CUTLASS, Cooperative Groups, and CUB,
where I needed those three to be next to each other.
But also CUB needed to be near all the other CCCL libraries.
And then there were libraries like NVSHMEM and
NCCL, which are communication libraries that I wanted to have together, and then
cuFile, an I/O library, which I wanted to have near the CUDA runtime
and some of the others, because it's not a compute library. One way
of dividing these libraries is: is it a library that does, you know, math or science or physics,
or is it a library that does, you know, runtime management or comms or I/O or stuff
like that.
And then there's the machine learning libraries, cuDNN and TensorRT.
And I wanted those three libraries to kind of be centered in the image.
I decided that it was best for them to either be at the top or the bottom because I have
four rows.
If I had had three rows, I would have put them in the center row.
But I feel like with four rows, that would have been hard. And I actually, the way I made this slide is I started off with just raw boxes and I
figured out how many of these blank boxes did I want to have and then based on that
I tried out different configurations and then I made the list of what things I was going to include.
You know, going off of that,
let me see if I can find the first version that this slide deck went through.
It's on R4 right now. And the way that I version my slide decks is I bump to another R version of them when
either I give the talk or if I delete content.
Like if I change the content plan substantially, then I will make a new version
of the slides because I want to record, I want to save what my old content plan was.
I'm going to share my screen with you very briefly, but it's okay because the visual
will be easy to explain. This is what the slide originally looked like: it was the four-by-six grid in a slightly different configuration, and it is just boxes,
square boxes, that say the library names, and some of them have different colors
to indicate the grouping, and there's a giant, like, text box on the slide that says, "to do this, but do better."
And I think I did better than the, you know, colored text boxes, one hopes.
All right, I mean, I'll have to think on this, but like, I'm coming to realize the reason I have lost
sleep and we're discussing this at length.
And also I put the cart in front of the horse.
I didn't even explain Bryce's talk, which I'm going to take a 10 second digression.
The title of Bryce's talk was The CUDA C++ Developer's Toolbox.
It is, at a high level, showing some examples of Thrust, CUB, and, what was the other one, libcu++?
Did you show that at all?
libcu++, yeah.
Yes.
And then also it covers the universal vector, which I have a question about, which we'll
get to.
But anyways, high level, that's what it's about.
I was supposed to say that at the beginning, but I got too excited about this slide.
And the reason I've lost sleep: I think this is the best slide you've ever made, but it's
going to be so much better.
Like I think honestly, this is your best work, Bryce. It's your best work maybe of your career.
And because CUDA, let's be honest, has a lot of work to do when it comes to its onboarding and
education experience. And I think that this is the start of something beautiful, and so my
recommendation, I already had this in my mind
before you showed me what I'm looking at right now,
which is I'm staring at six different colors.
We've got yellow, red, blue, purple, pink, and green.
When you were describing that the two left columns
of the six columns on this slide are the CUDA math libraries,
I was thinking, man, what you should have added
to that talk was like having a rounded, not pointy, but a rounded rectangle
that kind of briefly goes over and says,
these are the math libraries.
And then that one disappears.
And then the next one pops up and that one
has a red coloring.
So basically have like six slides that transition and fade
from like one rounded rectangle to another rounded rectangle
that basically, visually, and then you would vocally describe, because
I think to folks at NVIDIA, you know, we see Thrust next to libcu++ next to CUB, and we know that, like, okay,
you know, Thrust is kind of a layer above CUB, and, you know, it interacts with... anyways,
we see the connections. But for folks that are maybe even experienced, or not experienced,
those connections are invisible.
They have no idea that this implicit grouping exists there.
And also, they may not know what these libraries are.
And it's interesting that you say that, because we talked a little bit earlier about
how this is a 35-minute talk,
and this is a talk I plan to give a lot this year and for the most part speaking engagements
are hour long talks.
So the reality is the GTC version of this talk was really just a preview.
There is an hour-long version of this talk, and there was content that had to be cut.
And of the content that had to be cut,
the parts that are probably the most interesting to you are these. One, the reason I put so much effort
into this slide is because something just like what you described will happen with
this slide. I haven't built that part of the deck yet, but I'm going to spend five to ten minutes
going through this overview of the libraries and giving people the, not going to go into details,
but giving people the pointers about what are these libraries,
when should you use them?
Because it's The CUDA C++ Developer's Toolbox,
the name of the talk.
We want to teach people what are the tools in the toolbox.
And the other thing that got sadly
cut from this version of the talk
is I spent some time talking about the by-key algorithms.
And I was very sad to cut that from this talk, because I think if you're looking at Thrust
versus the standard library algorithms, the part of Thrust that's most compelling for
people is that there are all these very useful extended algorithms.
Like you could use C++ standard parallelism, which we have an implementation that supports
GPU acceleration, but if you wanted to do a segmented reduction, that's not in standard
C++ yet.
And so I had a couple different examples of doing some nice segmented reductions that I had to trim.
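For reference, here is a minimal sketch of the kind of segmented reduction being described, using Thrust's reduce_by_key. This example is illustrative only; it is not one of the examples that were cut from the talk, and the specific values are made up.

// Illustrative sketch (not from the talk): a segmented sum with
// thrust::reduce_by_key. Each run of equal keys defines one segment,
// and the values inside that segment are summed.
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <cstdio>
#include <vector>

int main() {
  std::vector<int>   h_keys{0, 0, 0, 1, 1, 2};   // segment ids
  std::vector<float> h_vals{1, 2, 3, 4, 5, 6};   // values to reduce

  thrust::device_vector<int>   keys(h_keys.begin(), h_keys.end());
  thrust::device_vector<float> vals(h_vals.begin(), h_vals.end());
  thrust::device_vector<int>   out_keys(3);
  thrust::device_vector<float> out_sums(3);

  // One sum per segment: keys {0, 1, 2}, sums {6, 9, 6}.
  thrust::reduce_by_key(keys.begin(), keys.end(), vals.begin(),
                        out_keys.begin(), out_sums.begin());

  for (int i = 0; i < 3; ++i)
    std::printf("segment %d -> %g\n", int(out_keys[i]), float(out_sums[i]));
}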
Those will be in the next version of this talk, and also the deep dive,
or not the deep dive, the overview of some of these libraries will be in
the next version of this talk. And I suspect that I will do something like
what you described with the boxes. It'll probably just be, like, drawing a
circle around these.
And I actually like the idea that you just had about rapids, to have it expand out into
all of the various rapids libraries.
I should say this is the first time I've given a talk where I have used transitions and I used the morph words transition that
Conor taught me about.
But then I think that I maybe do this more than you because when I've seen you use the
morph words transition, you usually do it on a pure code slide.
But I often have code and diagrams side by side on a slide, and
I found that when I used morph words, it would fade the diagram to black. And the
reason is because, in a lot of cases, even if I had the same
diagram on one slide versus another slide, PowerPoint would give the group of objects a different name from slide to slide. So PowerPoint would get confused.
But I learned that if you go into this magical thing called the selection pane in PowerPoint,
you can give your group of objects a specific name. And if they have the same name from
slide to slide, then the PowerPoint transitions know
that they're the same thing.
And so that's how I was able to use that trick to build some really cool morph transitions.
And I really love the morph transitions.
For so long I resisted having transitions in my slides, and part of it is that transitions
take time and they can throw off my timing.
Part of it is that if you rely heavily on transitions and animations in your slides,
then the PDF version of your slides, like if somebody's just reading through your slides,
can be a little bit harder to follow.
But I am a convert.
I loved the morph words transition and to explain for people who don't know, that what the morph words transition does is,
if you've got like two slides of text
that are very similar,
but there's some changes between the two,
the morph words transition will,
it will show the words,
the text between the two slides,
like evolving into each other.
Maybe you have a better way of explaining it than me.
Yes, sure.
I will explain it, better than I just did,
although only briefly.
First thing I have to say is we need to wind down
this part two of our conversation
because I don't have a super hard stop,
but I do have a hard stop-ish,
and I definitely wanna get to the "stay tuned to next week," listener.
You know what my hard stop is? My hard stop is that after a certain hour
I will be unable to park anywhere in downtown San Jose.
Well, we'll put our hard cap at 30 minutes from now. So if we want a 25-minute part three,
we've got to wrap this up in five minutes. So
I have, like, four questions.
We're gonna have to rapid fire,
and then you get like 15 seconds per answer
and maybe we'll revisit this topic.
So, my third most popular YouTube video
is how to use, basically, morph,
because it's the number one question I got.
Welcome to the club.
It's been here since 2019.
You're only half a decade late
and it's the most beautiful thing ever.
There's three different versions.
Objects is the default, words is the second setting, and then there's characters. So most people don't know how to use it effectively for code because
they leave it set to the default option of objects and that does nothing for your code.
You have to change it to words. And yes, it basically just automatically finds the closest
number of words. It's magical. And also, that little hack that you sent me: I've watched
basically this morph, not keynote, but presentation, where you can do crazy things
with it, even more than what you showed me.
So I've been a morph pro.
In fact, I don't want to name the person at Microsoft, but a very senior person at Microsoft
one time emailed me and asked, how are you doing that in PowerPoint?
And I was like, brah, you work for Microsoft, like email the PowerPoint team.
But wait, take 15 seconds, 15 seconds for whatever you're going to say, because we've got
rapid-fire questions after this. Do I know what? Do you know anything about the algorithm that they
use to do the matching? No, but I would love to talk to the PowerPoint folks. I imagine it's just
kind of some like closest thing, because it does some weird stuff every once in a while. But if
you work for Microsoft and you know or happen to work on that algorithm, get in touch with us. I
usually block anyone that emails me asking to be a guest, but we're reaching out this time.
So I'll briefly say the thing that I love the most about the morph transition is nothing about the visual itself.
It's about the structure of the code. If I had a progression of code evolutions,
where there were different things on different slides, it used to be that I would add additional new lines and
additional spacing so that when I went from one slide to another slide, elements would
stay in the same place. And that meant that if you looked at the first code slide in one
of these evolutions, the spacing would be all weird and unnatural.
And the thing that I love about Morph is that with Morph, I don't have to do that because
with Morph, if I need to insert a line at the top of the code in the next slide, what
Morph will do is it will gracefully shift all the code down.
I don't like it if it's jarring.
If like one slide it's nine lines of code,
and then the next slide those same nine lines of code
are there, but they've moved down one line
because I've inserted a tenth line of code in front.
I don't like that.
So what I would do is on the nine lines of code slide,
I would put a blank line at the top.
And now I don't need to do that, because now with morph it will just
beautifully shift everything down, and that, for me, is the killer feature.
All right, it's very graceful. And we've used up three of our five minutes left in this episode, folks.
So maybe I'll limit myself to one question, and you have 30 seconds to answer. The number one thing
that I was confused about in your talk:
how come nvc++
and stdpar didn't get mentioned as CUDA C++? I hope this isn't breaking news that
that technology has been backburnered or something. What's the reason for not mentioning it?
Not that it's been backburnered, just that we wanted to give a talk about how to do CUDA programming.
Stdpar is great. It's not in any way something that we don't support.
We certainly want people to be using it, but we wanted to give a talk about if you want to write CUDA code.
Like how should you write idiomatic CUDA C++ code?
And I think part of the problem is that people that consider
themselves CUDA programmers, they
don't think that a talk about C++ standard
parallelism is for them, right?
So we wanted to reach a particular audience.
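For context, here is a minimal sketch of what the stdpar (C++ standard parallelism) style mentioned above looks like; it is not code from the talk, and the dot product is just an illustrative example. As I understand it, NVIDIA's nvc++ compiler can offload this kind of code to the GPU via its -stdpar option, while other conforming compilers run it in parallel on the CPU.

// Hedged sketch (not from the talk): ISO C++ "stdpar" style.
// std::transform_reduce with a parallel execution policy computes a
// dot product; nvc++ -stdpar can offload this, other compilers run it
// in parallel on the CPU.
#include <execution>
#include <numeric>
#include <vector>
#include <cstdio>

int main() {
  std::vector<float> x(1 << 20, 1.0f);
  std::vector<float> y(1 << 20, 2.0f);

  // Parallel dot product: sum of x[i] * y[i].
  float dot = std::transform_reduce(std::execution::par,
                                    x.begin(), x.end(), y.begin(), 0.0f);

  std::printf("dot = %g\n", dot);  // 2097152
}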
And so this talk, or really the whole curriculum, comes from our colleague Georgii Evtushenko; much
of this talk is based on material that Georgii made for the tutorial.
So I can't take too much credit for it.
But Georgii proposed that we shouldn't start by teaching people the low-level things.
The way that you typically teach CUDA is you start off by teaching people about kernels and device code and separate host and device memory and warps and blocks.
This talk doesn't mention warps or blocks once.
In this talk, I only talk about distinct host and device memory in the last part of the
talk, right?
I introduce the idea that there's different CPU and GPU memory as the last thing.
And I don't even say the word kernel anywhere
in here, really. I just talk about launching work on the GPU. And Georgii came up with
this idea that we should teach people to use libraries first, to use abstractions first.
Use libraries first, try that out,
use things like Thrust first, and then, you know, only as a more advanced topic would you go and write your own kernel. You should only do that if you've tried to use the libraries
and the libraries haven't worked for you because honestly you're going to get better performance
if you use the abstractions. And so if you think about how do you just teach like modern C++ versus C. Well the way
that people teach modern C++ is like you don't start with teaching them pointers, right?
You don't start teaching them pointers and C-style arrays.
The way that I've seen people teach modern C++ to beginners these days is you start off
by teaching them the standard library, by teaching them containers.
You only introduce the notion of pointers much later on.
And that's what we're trying to do with CUDA C++ here is we are teaching the content from
the high level to the low level instead of from the low level to the high level.
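To make that concrete, here is a sketch of the libraries-first style being described: a SAXPY written as a single Thrust algorithm call, with no kernels, blocks, or warps in user code. This is an illustrative example, not a slide from the talk.

// Hedged sketch (not from the talk) of the "libraries first" style:
// SAXPY as a single thrust::transform call. Thrust owns the device
// memory and the launch configuration. Compile as a .cu file with nvcc.
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <cstdio>

struct saxpy {
  float a;
  __host__ __device__ float operator()(float x, float y) const {
    return a * x + y;
  }
};

int main() {
  thrust::device_vector<float> x(1 << 20, 1.0f);
  thrust::device_vector<float> y(1 << 20, 3.0f);

  // y = a * x + y, computed on the GPU.
  thrust::transform(x.begin(), x.end(), y.begin(), y.begin(), saxpy{2.0f});

  std::printf("y[0] = %g\n", float(y[0]));  // 5
}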
And I had two instructors during GTC come up to me and they were like, wow, this is
really surprising.
This is a very different way of teaching.
And one of them, he sort of like asked this to me and then just paused as if he expected
me to say something like, oh no, this isn't how you should actually teach it.
And I was just like, yeah, no, like that's what we want you to do.
And he just paused and he just processed.
And he was clearly, he was expecting me to say something
to make him feel reassured that he didn't have
to completely rethink how he was teaching this material.
And then he realized that, no, wait,
you really are telling me to completely rethink
how I've been teaching this material.
And that's exactly what we want.
We want to teach from the high level to the low level.
An amazing answer, if too long.
We have now exceeded our threshold and I have follow up stuff but we're going to table this.
Remember I asked about nvc++.
We're going to revisit nvc++, which, the short version of what Bryce said, is: this was a
CUDA C++ talk, and nvc++ gives you ISO C++ with, you know,
inherent or automatic parallelism.
A topic for another day, and I did have questions about
the universal vector and unified memory,
but we will table those questions for a follow-up discussion.
We are now transitioning to part three because at this point,
if we stop at,
you know, five minutes past the half hour mark it's only going to be a 22 minute episode folks.
So part three of this recording, which I believe puts us on, like, episode 228, potentially. Be sure
to check these show notes, either in your podcast app or at adspthepodcast.com, for links to anything
we mentioned in today's episode, as well as a link to a GitHub discussion where you can leave
thoughts, comments, and questions.
Thanks for listening, we hope you enjoyed, and have a great day!