Tech Over Tea - Making Nouveau & OpenCL Usable | Karol Herbst
Episode Date: November 24, 2023. Today we have Karol Herbst on the show, who you may know from his work on projects like Nouveau, NVK and starting the RustiCL project, bringing much better OpenCL support than before. He's been in this space for a long time and has a lot to say about driver development. ==========Guest Links========== Mastodon: https://chaos.social/@karolherbst Github: https://github.com/karolherbst Gitlab: https://gitlab.freedesktop.org/karolherbst NVK: https://docs.mesa3d.org/drivers/nvk.html Nouveau: https://nouveau.freedesktop.org/ ==========Support The Show========== ► Patreon: https://www.patreon.com/brodierobertson ► Paypal: https://www.paypal.me/BrodieRobertsonVideo ► Amazon USA: https://amzn.to/3d5gykF ► Other Methods: https://cointr.ee/brodierobertson =========Video Platforms========== 🎥 YouTube: https://www.youtube.com/channel/UCBq5p-xOla8xhnrbhu8AIAg =========Audio Release========= 🎵 RSS: https://anchor.fm/s/149fd51c/podcast/rss 🎵 Apple Podcast: https://podcasts.apple.com/us/podcast/tech-over-tea/id1501727953 🎵 Spotify: https://open.spotify.com/show/3IfFpfzlLo7OPsEnl4gbdM 🎵 Google Podcast: https://www.google.com/podcasts?feed=aHR0cHM6Ly9hbmNob3IuZm0vcy8xNDlmZDUxYy9wb2RjYXN0L3Jzcw== 🎵 Anchor: https://anchor.fm/tech-over-tea ==========Social Media========== 🎤 Discord: https://discord.gg/PkMRVn9 🐦 Twitter: https://twitter.com/TechOverTeaShow 📷 Instagram: https://www.instagram.com/techovertea/ 🌐 Mastodon: https://mastodon.social/web/accounts/1093345 ==========Credits========== 🎨 Channel Art: All my art was created by Supercozman https://twitter.com/Supercozman https://www.instagram.com/supercozman_draws/ DISCLOSURE: Wherever possible I use referral links, which means if you click one of the links in this video or description and make a purchase we may receive a small commission or other compensation.
Transcript
Good morning, good day, and good evening.
I'm, as always, your host, Brodie Robertson,
and today, we have someone who you may not know the name of,
but if you've been following, you know, a lot of Linux,
graphics stuff, you've probably seen some of the work
that he's been involved with.
Welcome to the show, Karol Herbst. How are you doing?
Hi. Great. And you?
I'm doing pretty good.
So, I think, correct me if I'm wrong,
but the best way to describe the general work you do
is work on the Linux graphics stack.
Is that a fair way to describe it?
Or do you have another way to describe it?
Yeah, I think that's fair enough.
I'm leaning more to the compute side now,
but I'm still working on Mesa primarily.
Right, right.
I wasn't particularly like meaning specifically like gaming graphics.
I was sort of meaning all of that together.
But I guess compute, yeah, I guess compute you would consider separate from that.
That makes sense.
So the three main things that people probably recognize that you've been involved with, and I'm sure there are other things as well that you can definitely talk about, but the three main things that are definitely noteworthy of late are NVK, RustiCL and Nouveau. People are always going to be talking about those as they are being improved.
Yeah, I think that's fair.
So I guess we can, before we get into any of that stuff,
how about we talk about like how you really got yourself into this sort of
space?
Like how did you get yourself involved in not only doing the driver
stuff you're doing, but sort of getting yourself involved in Linux and FOSS in the first place?
Yeah, I think that's quite a fun story, because I started out as a Java backend developer.
Oh okay, Java backend stuff.
Like the full story with 500 lines of stack traces.
Some Java developers know what I'm talking about.
And at some point, I was just.
I mean, we never had a Windows installation in our home.
We were mostly, like, Apple users,
with all the Mac stuff going on.
And at some point, I just got into like HTML programming
and got some jobs with Java and stuff like this.
And at some point, I was like, yeah, Linux
might be something fun to try out.
I remember that at some point, my father
got like a Linux CD installer
in some Mac magazine with a penguin on it.
And that was kind of my reason to wanting to try it out,
just because of the penguin.
So at this point, he was against that idea.
And I didn't have enough hardware of my own,
so I couldn't.
But yeah, at some point, I was just
having this kind of laptop with an NVIDIA GPU
and installed Linux on it because open source,
and it kind of made sense to me.
I think the first thing I was mostly involved
is because it was a gaming laptop, playing games
on Linux and everything.
And it was like, I think that was like 12 years ago.
It was not in a great state.
So the only way to reasonably game on this laptop
was still with the NVIDIA driver.
Right.
And even the hybrid GPU setup was kind of a mess.
And I had one of those laptops where
you can't flip the main GPU.
So yeah.
I think there was this project called... I don't know if the project itself was called optirun, but it was this command which starts a second X server and copies stuff around.
Bumblebee was the project.
Yes, yes. So it was part of Bumblebee.
And yeah, so that's kind of the reason how I
got into Linux gaming stuff.
And at this point, there was also, I don't know if you heard of it, but there was Desura, and Desurium, which is like an open source client like Steam.
Okay, okay.
I don't know if people are familiar with it, because, you know, the company was kind of financially in a bad state, and they had this open source client.
I guarantee there's going to be, like, one random person like, I remember that thing. Hopefully.
Yeah, so I started to contribute to this project.
Oh yeah, I see the last change is like nine years ago, so it's an old project.
But yeah, I don't know if I'm still the biggest contributor, but the build system was like a bunch of shell scripts. And because I knew CMake at this time, I was like, yeah, let me just port this entire mess to CMake.
And yeah, so I got involved in helping out with this open source gaming client. It's basically like Steam, just open source,
so fixing bugs and all this kind of stuff.
At that time, did you have any existing programming experience, or were you learning to program as you were doing that?
Not really. I mean, I've talked about the Java and the HTML stuff. I was familiar with C, and I also programmed in Objective-C because, you know, Mac system and everything.
Right, right.
But it wasn't really like in a...
I never had like a FOSS project before and I kind of just got by accident into this position
of maintaining this project.
So yeah.
What a fun... You know, that's a fun way to start. That's certainly a fun way to start.
With those Apple systems you were talking about, was that, like, pre or post Unix? Was it, like, before they were Unix systems?
Well, the first computer I kind of had access to was not even a Mac, but, like, an Atari. But I only used it for playing games.
Right.
But the first Apple I had contact with was also, like, pre-OS X. It was 9.2 or something, but it was, you know, very close to the migration, and the hardware just couldn't handle the new operating system.
But yeah, the first actual Mac I had of my own was, I think, 10.3 or 10.4, somewhere around that.
When you were making that transition into Linux, when you found this Linux CD, were you aware of what Unix was at that point, or was it just completely unknown to you?
I wouldn't say... I think the concept of Unix itself was unknown to me, but I was familiar with, like, command line stuff.
Right, right.
So the command line felt familiar between the two of them?
Yes, yes, yes, yes.
Okay, okay.
So you eventually got from there into somehow working on open source Nvidia GPU drivers?
Yes, so the reason I actually started to look into Nouveau was because, you know, there
was like the open source Nvidia driver and I was like, yeah, maybe I try that.
And then it's like, oh yeah, it's pretty slow. But it had fixed the hybrid GPU setup, with suspending and powering down.
My laptop even was one of those who
had an LED showing if the GPU is actually turned off or on.
Oh, OK.
So yes, I also had motivation to make it work so I can power down the GPU and everything.
But yes, so the reason was the performance was kind of slow and the re-clocking support
was in a bad state at this time.
I think it worked with DDR3 GPUs, but it was broken for GDDR5.
So yes, I figured out why and where it was broken, and fixed it, and that was kind of my first big patch to the project.
Do you remember roughly when that was?
2013.
Oh, so it was like just after Nouveau was getting started then.
I am... I think the project is older.
It's like 2011 I want to say. I don't think it's that much older than that.
What does Wikipedia tell me?
Uh... let's see. I want to say it's like 20... Yeah, initial release was 2012.
At least according to this, it might... The numbers might be a bit wrong here,
which is very possible. Yeah, let me find the patch.
Okay, I wasn't expecting you to find the exact patch it was.
But regardless, it was fairly early on in the project,
within the first couple of years.
Yeah, it might be.
I didn't really follow up on the history of Nouveau at this point,
or later, so...
Okay, okay.
But yeah, let's see. I have the patch. Yeah, it was 2015.
Okay, so it's like three years into the project being around then. Oh, I was going to say, I'm fairly new to Linux myself. How aware were people back then of Nouveau? Because nowadays, you know, when people are using video cards, Nouveau is mentioned as a thing that exists, but most people tend to just run the proprietary drivers, because as great as the work that's been done on Nouveau is, a lot of the time that experience is just going to be better right now, and hopefully some of the other stuff that you're involved with is going to help improve that. But how aware were people of Nouveau at that point?
At least how popular did it seem like it was?
Yeah, I think it was more popular than before
NVK showed up.
Like, we had, like,
more contributors on the
OpenGL driver and everything, and
the contributors just
moved away to other projects,
like working on the AMD driver or something else.
So before NVK emerged, the popularity, I think,
was decreasing.
I think around when Kepler was still pretty commonly used, it was actually used by quite a few people, because, you know, with reclocking working it was actually useful for some kind of gaming.
What generation was Kepler?
The 600 series.
Oh, okay. So yeah, that's a while ago now.
Yeah. So my first GPU was also Kepler, and I fixed it for that.
But, yeah, Kepler was kind of the generation with the best reclocking support, and it was decently fast, at least for some games.
And I think the best benchmark I had was, like, at 80% of the NVIDIA driver.
Oh, wow, okay.
Yes, so it wasn't that terrible, but like other games were like at 10 or 20%, so it was...
Right, so it was very hit and miss with what actually worked well.
Yeah, but there was a bigger community of people being interested and seeing where the project is going. And it was just declining, because, you know, also the firmware situation. Bringing up new hardware was difficult. And this happened around Maxwell, which is the 900 series, and all the signed firmware mess.
So what's the deal with the firmware stuff?
Because I occasionally will see articles about, you know,
firmware situation improving with Nova, all of this stuff.
What's actually the problem here, and why is it such a big deal?
Yeah, so the big thing is that for whatever reason, Nvidia came up with their signed firmware stuff
to essentially lock down certain access to the hardware from the CPU side.
Mm-hmm.
Um...
For example, in Maxwell, we aren't able to control the fan speed of GPUs.
Okay.
Um...
So, even if we could re-clock, if your fan is, like, still slow, it's kind of a problem.
Yeah, yeah.
Um... There's only so much wiggle room you can really have there until it starts to overheat.
Yeah, I mean the GPU has overheat protection at some point.
Sure, but you don't want to hit the overheat protection. That's the point I'm getting at.
Well, that's fine to some degree because you know the first level of overheat protection is just
throttling the clock without cutting the voltage.
Right, but if you're reclocking, the point is to raise the clock.
That's the problem.
Yeah, sure.
But if you run at, like, an eighth of the frequency, you're still cutting the power consumption by half.
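(For context, a note of mine, not from the show: dynamic power in a chip scales roughly as

$$P_{\text{dyn}} \approx C \cdot V^2 \cdot f$$

so throttling the frequency $f$ alone cuts power linearly, while dropping the voltage $V$ along with it helps quadratically. That's why throttling without being able to touch the voltage still leaves a lot of power on the table.)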
Right, okay, that's fair.
Yeah, and at some point, if it's really getting hot, then the GPU just shuts off. But yeah, it's not a great place, and you don't want to run the GPU constantly like this.
And then in later generations, they also prevented us from even changing the voltage at all.
And then it's like, yeah, no point in bothering anyway.
So yeah.
And sadly, the firmware we were getting from Nvidia was also custom-made for the Nouveau project.
Okay.
And they never gave us the firmware for power management, only for context switching.
Okay.
Yeah, so, yeah.
So, you had firmware, but it was the most minimal of minimal firmware, basically.
Yes, it was enough to, you know, drive displays and have multiple rendering contexts on the GPU.
Right, right.
Yeah.
So what's all this stuff that I'm hearing about the firmware situation getting better?
Yes.
So the GSP firmware is basically a large chunk of their original driver moved to the GPU side into firmware.
Okay.
And it includes power management and display management and a lot of other stuff.
Basically, a lot of the driver just
moves into the firmware, into the GPU.
And the main advantage we have here
is that we can use the exact same firmware NVIDIA is using in their driver.
Right.
Which helps us because we can basically do the same thing.
And now that they also open sourced their kernel driver,
because if you move everything or a lot into the firmware,
there are not really any secrets anymore.
We can also check out how the driver interacts
with the firmware and what we can do with the firmware and everything.
Because I remember when that kernel module came out, there was a lot of confusion about what it was, what it would actually do.
I'm certainly someone who doesn't have a deep understanding of how NVIDIA's graphics stack actually functions. So what was that kernel module? And what it wasn't is also very important.
Yeah, so the kernel module is more or less a layer for the user space driver to talk with the hardware. The NVIDIA driver does a lot of stuff inside user space, and the kernel driver isn't really doing that much anymore.
They also have something called user space command submission,
which essentially means instead of telling the kernel,
you have commands for the GPU, they all do it inside user space. OK.
And they just need a kernel driver
to wire up all the stuff and make sure there's
no security implications of doing so and stuff like this.
So basically just making GPU resources
available to user space.
OK.
It's a little bit different than the style
we usually have with all the other DRM drivers inside Linux.
So but yeah, there is a huge benefit
for using user space command submission
because it helps with compute because you can submit
more stuff with lower CPU overhead.
But yeah, the basic idea of the driver
was just to have something open source.
And in the past, there was also sometimes the GPL symbol
situations coming up regularly.
What's this? I don't actually know this.
Sometimes, NVIDIA wanted to use interfaces in the Linux kernel, and then at some point,
people were like, oh, yeah, but that's GPL only.
You can't use this.
And things like this happening.
And so, you know, from their perspective it makes sense to have an open source GPL driver, so they just don't have this problem anymore.
Okay, that makes sense.
So from the perspective of Nouveau, what would you want NVIDIA to actually do that would really help
out the project? Because obviously
AMD's got this great
open source driver that
every gamer uses
that no one really worries about,
and it's just fine.
What would you want NVIDIA
to do differently that would make
this work out a lot
smoother? At least hopefully.
Yeah, I think it would certainly help if they would dedicate more developers to the project, obviously. But I think their documentation approach could also be improved.
I don't know if you saw it, but we have this repository on GitHub
with some documentation.
Well, documentation.
It's mostly just header files with names for registers.
But yeah, they publish some stuff there. They document how to program the 3D engine or compute engine.
And for example, in NVK, we only make use of that.
So we don't really have to reverse engineer
certain commands anymore.
So that's certainly helping.
But there are still gaps.
And from time to time, we are still
asking NVIDIA for more documentation.
And sometimes they are responsive,
and sometimes they are not.
And sometimes they forget to handle some requests
for months or years.
Right, okay.
So why do you think there is this discrepancy with NVIDIA?
Like why, obviously Linux is a smaller platform,
so that might just be a simple enough answer,
but why do you think there is this discrepancy
with how they handle their drivers on Linux, as opposed to, you know, what's happening over on the AMD side and over on the Intel side as well?
Yeah, I think it's probably just a perspective of, you know, does it actually make sense to them from a business perspective?
Like would they even get, you know, profit out of it?
I don't think Nvidia is the company to do it like out of goodwill.
Right.
Yeah.
So.
I think that's probably the...
That's definitely understandable. It'd be hard to really get into their heads and really understand exactly why they're doing something, but I think it's probably a fair guess to just say money. It's simple.
So what is actually the difference between how the AMD driver is set up and NVIDIA's? I don't know how much you've done on that side as well.
Yeah, not really.
Okay, okay, that's fair.
I mean, I didn't really get into, you know,
architectural details of all the GPU drivers.
No, that's, okay, that's fair.
Not yet, at least.
So, in the state that Nouveau is currently in,
what does it do well?
Like, obviously, newer cards, it struggles.
I saw that laugh there.
Okay.
What does it not do terribly?
Okay.
I think we try to make sure that at least you can boot to a desktop.
Okay.
That's, I think, the baseline the project currently tries to fulfill. If some update breaks GNOME or KDE or something,
then yeah, that's a regression.
And we should probably fix it as soon as possible.
But if a user comes to us and says, oh yeah,
I have this 10% performance regression,
and it's like, I wish we had time to look into all of this.
But sadly, there are more pressing issues to deal with.
And yeah, everything display related,
like if DisplayPort doesn't work or HDMI on certain GPUs
or regressions in general. I think we try pretty hard to at least not regress the driver, but deal with the issues, and, you know, at least make sure that display is working.
Yeah. But you're really fighting this really uphill battle trying to just get something... it's a massive undertaking to get something like this to actually work in the state that it's currently in, and the fact that it does as much as it does is already a testament to how much work has been done by all of the people involved.
Yeah, I think the biggest problem was just that before we started NVK, I think it would have been a fair estimation to say there were like two full-time developers on the entire project.
Okay, yeah, that makes sense.
So there's just so much we can actually do there. Yeah, and it got a lot better with NVK.
Well, I guess that leads us perfectly into NVK then. We can talk about that.
Yes. So, I guess you were asking about the Vulkan driver for Nouveau.
What I was going to say is, what is NVK? I think you're about to explain it anyway.
Yes, so NVK is the open source Vulkan driver for NVIDIA GPUs.
It's still built upon the same kernel driver. And I don't know if the new compiler is merged yet, but it also shared the compiler with the OpenGL
driver for quite some time.
Yeah, I think the new one isn't shared yet.
So it doesn't matter.
But yeah, it's basically just an open source implementation.
And the main idea was to start the project in a way where we are doing things right, and more sustainable and everything.
Yeah, like a lot of issues with the OpenGL driver,
we didn't want to repeat the same mistakes there.
So yeah, it actively makes use of the documentation NVIDIA is giving us. And it lives inside Mesa and makes
use of the same infrastructure we
have for all the other Vulkan drivers there as well.
You did mention documentation there, but I've seen some of the collaboration write-ups where it specifically mentions, multiple times, reverse engineering, reverse engineering, reverse engineering. How much documentation do you actually have from NVIDIA, and how much of it is just trying to work out how this can fit together?
Yeah, maybe I'll link files like these, which, especially, you see that they don't really explain anything. They're just, you know...
Ah, I see.
But at least it allows us to search for terms and figure out
where to look, basically.
And sometimes it's straightforward.
We just search a term and then, yeah, play around with it
a little bit and figure out how it works.
Sometimes it's a little bit more complex.
And there has been work to write a new tool to be able to reverse engineer their Vulkan driver as well, so we can also compare against whatever NVIDIA is doing, and do reverse engineering against their driver in case we need more information.
So when you talk about reverse engineering a Vulkan driver, what does reverse engineering actually mean in this context? What sort of work would you be doing to work that out?
So most of the time, because we also
have this Vulkan CTS repository with like hundreds,
thousands of tests, there are a lot of small tests
which test very specific cases or parts
of the Vulkan specification.
And what we would do for reverse engineering is to just check what commands the NVIDIA driver sends to the GPU, and then just try to figure out what to do on our end as well.
Okay. I'm sure that's a very long and tedious process.
It kind of depends.
I mean, the tests are pretty small, so it's not that bad.
But it would be, like, more complicated if you, you know, try to figure out certain things they're doing in games or something.
Because of the overload of information you would be getting there.
So it's approaching one little command at a time, and then over time you build those up, and then you get something that passes the test, and hopefully also works properly as well.
Yeah, something like that. In the best case it's just one unknown command, but sometimes it can also be, you know, multiple commands.
And what they also have is a macro engine on the command processor. What they're doing is they can execute a macro, and it would generate more commands for the GPU. And that's a little bit more tricky to figure out.
But yeah, it always depends on the test. Sometimes there are more commands involved and sometimes not.
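(As an illustration of the idea, not the actual Nouveau tooling: a captured command stream boils down to a sequence of method/data pairs pushed to the GPU's command processor, and reverse engineering a test amounts to diffing our stream against the proprietary driver's. All names below are hypothetical.)

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical shape of one captured command: NVIDIA pushbuffers are
 * essentially sequences of method (register-like offset) / data pairs. */
struct cmd {
    uint32_t method;
    uint32_t data;
};

/* Report the first place two captured streams disagree; the interesting
 * case is an unknown method the proprietary driver emits that ours doesn't. */
void diff_streams(const struct cmd *ours, size_t n_ours,
                  const struct cmd *theirs, size_t n_theirs)
{
    size_t n = n_ours < n_theirs ? n_ours : n_theirs;
    for (size_t i = 0; i < n; i++) {
        if (ours[i].method != theirs[i].method ||
            ours[i].data != theirs[i].data) {
            printf("diverge at %zu: ours %04" PRIx32 "=%08" PRIx32
                   ", theirs %04" PRIx32 "=%08" PRIx32 "\n",
                   i, ours[i].method, ours[i].data,
                   theirs[i].method, theirs[i].data);
            return;
        }
    }
    if (n_ours != n_theirs)
        printf("streams agree for %zu commands, but lengths differ\n", n);
}
```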
So when did NVK get started? Because I know it's a fairly new project.
Yeah, I think it's already like two years old.
Oh, wow.
Wow, it has been around for a bit.
Maybe not quite. One and a half years, I think.
Yeah.
It's really impressive how far it's come along.
I remember seeing articles from the start of this year
about how it can actually play a video game now,
which is impressive.
Very impressive, actually.
Yeah.
Yeah, we actually have, like,
on our Nouveau channel,
we have users actually trying out
NVK on DXVK
and Proton and everything.
And, yeah,
some games are already running.
It's very slow, but...
Right.
It's running. I'm seeing this article from GamingOnLinux that was talking about your post from the start of this year, with The Talos Principle running. The screenshot has it running at five FPS, which is more than one. It's a lot more than one.
Yes.
So it's functioning. I wouldn't call that playable, but it's functioning.
Yeah, I think this was without the GSP firmware, so it was with the most basic clocks, which are usually quite slow on modern GPUs.
Yeah, I think some users are already trying it out with GSP. I think it's already got merged upstream, the GSP support as well. So I think it will come in with Linux 6.7.
I remember reading a Phoronix article about that, yes. I think just a couple of days ago it was merged.
6.7 GSP.
Let me see.
Best week Linux 6.7, blah, blah, blah.
Yeah, GSP should be in 6.7.
So it's going to take a while to make its way out to the other distros,
but on things like Arch, that'll be out there fairly soon.
Yeah.
That's really cool.
Let me go over to it.
Yeah, we don't really enable it by default on GPUs we already support.
So the first one to actually use it is Ada, and that's the 40 series.
Oh, really new, okay.
Yeah.
It can be used on Turing and Ampere, which is like the 20 and 30 series, but users would have to opt in.
Is there any reason that it's...
And add a kernel command line flag.
Is there any reason why it's going with 40 first, and not also doing those older ones?
Oh, yes.
For the 40 series, we haven't gotten any firmware from Nvidia so far, so GSP is the only way you can use that GPU anyway.
Okay, right.
That makes sense.
Yeah.
And for the previous generations, we already have firmware. And in order to not regress users, we don't use it there yet.
But we do plan to flip the switch at some point once we are more comfortable with flipping it.
So that article I saw was from the start of this year. I doubt you just have benchmarks sitting in front of you right now, but how much better of a state is it in than, basically, 10 months ago? How much better is it at this stage?
I don't know, but I think Michael said he's planning to do some benchmarks, in the article, I think.
Yeah, in the last section of the article.
Ah.
Of the GSP binary firmware blob article.
Okay, I can't... You might be having a look at a slightly different article. There's a lot of things that go up on this website.
No, I just quoted the part to you.
Oh, here we go. Oh no. Yeah, you are reading a different one than I am. Okay, that's why I can't find it.
Now on to benchmarking this new Nouveau support with Linux 6.7 to see how much it improves the open-source situation for RTX 20 and 30 series hardware, as well as the initial RTX 40 GPU support on Nouveau. For those on GPUs prior to RTX 20,
this firmware isn't relevant,
with the GSP only being introduced with RTX 20 Turing GPUs.
Okay, that makes sense.
So this is like...
Yeah, so...
So it's for fairly recent GPUs.
Yes.
I thought you were going to add something to that.
Yeah, yeah.
I assumed what you were going to ask. But I think, not that I think, but the generations between Kepler and Turing probably won't get any reclocking support, ever.
That makes sense.
So that's very
unfortunate.
But if NVIDIA is not giving us any firmware for that, even when we ask, then it's not going to happen.
I mean there are some people who
try
to figure something out but it's really not sustainable.
Yeah.
It's understandable.
Like, I... Okay. I have similar discussions with people, because I talk a lot about Wayland. And I will have people message me, like, I'm on a Kepler GPU or whatever, and Wayland doesn't work great on my GPU. It's like, yes, but you also have a 10-year-old GPU. Like,
at a certain point,
like, Nvidia's gonna
stop supporting it. Whether
they should stop supporting it at the
point they stop supporting it, you know, that's a
whole other discussion. But there is gonna be a cut-off
point where they have to just
say, this is
just too old, it's not our problem anymore.
And
look, it
is what it is, basically.
It would be nice if they went back further,
but, you know.
I mean, they can always try to use Nouveau.
No, I mean, I think on Kepler it would be fine, because we also support the Kepler GPUs in NVK. Well, kind of. I don't think it's at the same level as Turing, but it kind of works.
It's better than nothing.
But yeah, for all the GPUs in between, there's kind of not much we can actually do about it.
You can also frame it slightly differently, with the 20 series being the early cutoff point. Everything going forward will not be a problem. So in a couple of years, when you buy an old 20 series card, that will support it.
Oh yeah, that's true.
And those GPUs are already like five years old.
Is it that long already?
Yeah.
The first Turing was like September 2018.
What the hell? Turing architecture, 2018. What? Okay!
Oh! Yeah, okay, that was a while ago, wasn't it?
Jesus!
Yeah.
Wow, that feels like it just happened.
Maybe it's because GPUs started getting really expensive starting around then,
so a lot of people just didn't upgrade.
It's been like...
There's less excitement I hear about GPUs nowadays than...
Because I first started getting into PCs back during Kepler,
and people got super excited every time a new generation
came out, and
now it's like, okay,
but it costs as much
as a car, so like,
do I want to buy it?
Probably not. I'll just stick...
There's a reason why the... I think it's the 1650
is the most popular GPU on
Steam Hardware Survey.
Yeah. Yeah.
Yeah, and that's
also supported, by the way.
It's just weird because the 16
series is newer than 20.
Just in case somebody
doesn't make the connection.
It's also Turing. It's just
Turing without ray tracing.
Right, I forgot about
that.
What? I want to know who at Nvidia thought that that naming scheme was a good idea. I really do.
Yeah, it's always been confusing.
Yeah, it's probably not going to be that long until they do another reset of the numbering. The numbers are getting a little bit too high.
We're going to get back to the 100 series at some point.
Give it probably a couple more generations.
They
went from three digits...
They started with four digits and went to
three and now two, so I guess they will
go with one digit.
Yeah, just NVIDIA 1.
Honestly, I wouldn't...
At the end of the day,
it's still better naming than monitor naming.
And monitor naming, I just...
I don't know what goes on there.
Outside of the size of the screen,
it's like just throw random letters at it.
That's fine.
Some will work it out.
Yeah.
So, okay, with NVK, why was something like this needed? What was deficient about the current method that was being done, that something like NVK needed to be around?
I don't think the need was really, you know, because the OpenGL driver wasn't that great; it's just we wanted to have a Vulkan driver. But we are also thinking about the idea of dropping the OpenGL driver in favor of Zink.
Mm-hmm.
Because it might just be faster at this point in time, because more work has been done on Zink to make it actually run fast than was ever put into the OpenGL driver.
And nobody really cares about the OpenGL driver.
So the performance will probably not improve there anyway.
So yeah, and I think one of the big problems of the GL driver
is that nobody really actually figured out
why the performance is bad.
There are some assumptions and there are ideas on why it's bad but you
know, nobody actually put in the time to actually performance-optimize the entire driver.
I've heard a little bit about Zink here and there, but I imagine a lot of people listening probably have no idea what that is, so briefly explain what Zink is.
Zink is an OpenGL driver inside Mesa, and instead of talking to hardware like the Nouveau driver, it's talking to a Vulkan driver.
So
we have this
so-called Gallium
framework inside MISA which is just
a driver abstraction layer.
The OpenGL driver is using this abstraction layer
to implement functionality.
And Zynq is translating this abstraction layer onto Vulkan.
And it's able to provide not just OpenGL,
but also in theory and in practice other APIs on top of Vulkan as well.
Okay.
So theoretically, if the OpenGL driver didn't suck, let's just assume that, is there some sort of overhead of doing it through this method? Or is it not noticeable? Obviously it makes it convenient to write it, but would it be a better idea, if there were more resources, to do the OpenGL driver directly?
Um...
I wouldn't say so.
Um...
At least not anymore.
I mean, Marek from AMD spent a lot of time reducing the CPU overhead of Gallium and the RadeonSI driver.
So I think if you really want to reduce the overhead,
that's entirely possible.
I think some abstractions might be not perfect,
but we are also free to change them as we go.
So it's not like a fixed API we can't change.
So if there are performance problems,
somebody can look into them and fix all the drivers
or something.
It's also not like Gallium is actually a library. It's really just an API that the driver exposes directly to the OpenGL implementation. And the OpenGL implementation mostly is just responsible for tracking the state of the OpenGL context. So if you bind a texture or something, something has to track this stuff. And then at some point, you also have to call into the driver to draw stuff, or allocate memory, and this kind of stuff.
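(A hypothetical sketch of the split he's describing, with made-up names, not the real Gallium API: the GL frontend tracks context state and calls into a table of driver hooks, which a driver like Zink would implement by emitting Vulkan calls.)

```c
#include <stddef.h>

/* Hypothetical sketch, not the real Mesa API: the GL frontend owns
 * state tracking; the driver only provides a table of hooks. */
struct hw_context; /* opaque, owned by the driver (Zink would wrap a VkDevice) */

struct driver_hooks {
    void *(*create_buffer)(struct hw_context *hw, size_t size);
    void  (*draw)(struct hw_context *hw, unsigned first, unsigned count);
};

struct gl_context {
    struct hw_context *hw;
    const struct driver_hooks *hooks; /* zink, radeonsi, ... plug in here */
    unsigned bound_texture;           /* GL state lives in the frontend */
};

/* glDrawArrays-ish entry point: validate tracked state, then call down
 * into whichever driver is plugged in. */
void frontend_draw(struct gl_context *ctx, unsigned first, unsigned count)
{
    /* ...state validation elided... */
    ctx->hooks->draw(ctx->hw, first, count);
}
```

(If I remember right, Mesa also lets you force the Zink path with the MESA_LOADER_DRIVER_OVERRIDE=zink environment variable, which is a handy way to see this layering in action.)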
So, you mentioned... is that tea in there? What do you got in the pot?
Yeah, it's tea.
I just noticed you keep filling it up.
How many of those have you had?
It's not much, it's like 700 millilitres the entire pot.
I just keep seeing you pour a bit more in there.
I only pour like a little.
Okay. I thought you had like three or four cups.
No, it's not that bad. I was sick last week, and my throat is still a little bit sore.
That's fair, that's fair.
It helps with, um, speaking.
Yeah, totally understandable.
So, you mentioned Zink would later support other APIs. What else, you can probably see where we're trying to direct this now, what else would this let you support?
OpenCL, for example. And it's already merged and everything. It works.
It's great.
I actually have run the OpenCL conformance test on RADV, and it's passing.
So it's even feature complete and everything.
It's great.
So...
Oh, go on.
Yeah, there's still like, at some point,
I plan to make an official conformance submission.
But for that, I need a second driver it runs on.
And it has to be an independent one.
And I'm not quite sure what exactly that means,
because we have various Vulkan drivers of various vendors inside Mesa, but I don't know if they are independent enough.
Yeah.
So I'm also looking into making it work on Nvidia and yeah.
That's cool.
So what is OpenCL?
What is the purpose of OpenCL?
Yes, OpenCL. It's actually a compute API that Apple came up with.
Okay.
And this was like...
I have to do math, but I think it's like 14 years ago.
It's like when the specification was released.
And I think Apple looked into how to accelerate their UI in the operating system, and came up with this API, I think.
I mean, I didn't really look at the history,
but I think the trademark was originally Apple
and might still be.
I don't know, but they started it.
And the rough idea was that we have those GPUs,
and it would be nice to use them for everything.
And instead of running your code on the CPU,
you can also run it on the GPU.
And the programming language you use for OpenCL is OpenCL C.
And it's actually a C-derived standard.
So it's C with a bunch of stuff on top.
And yeah, the main goals were just
we want to have higher power efficiency, run stuff on a GPU,
and do crazy stuff.
I don't know precisely what they were using it for.
But yeah, that was kind of a general idea.
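(To make "C with a bunch of stuff on top" concrete, a minimal OpenCL C example, my own, not from the show:)

```c
/* OpenCL C kernel: __kernel, __global and get_global_id() are the
 * "stuff on top"; the body is plain C. Each work item handles one
 * element of the vectors. */
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *out)
{
    size_t i = get_global_id(0); /* index of this work item */
    out[i] = a[i] + b[i];
}
```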
I think it was in the time where the GPGPU term was still a thing. I don't know if you've heard of it.
I don't think so.
I just had a look.
Apple still does hold the trademark to OpenCL.
It doesn't matter at all.
Just had to check.
Yeah. yeah yeah there was like i think the gp gpu gp gpu term was like a fancy word
a few years ago gp gpu that's a yeah that is kind of annoying to say isn't it gp gpu
A general-purpose graphics processing unit is a graphics processing unit that is programmed for purposes beyond graphics processing, such as performing compu... oh, okay, so it just means compute on a GPU, right.
Yes, so
the history for that is that
at some point, even
OpenGL got, like, shaders.
It's kind of at this time when
you didn't have this fixed function
stuff where you say, OK, this triangle goes there
and has this color.
But the industry transitioned to shaders.
And this also had implications for the GPUs
because to run arbitrary code, you also
need something like a CPU on the GPU instead of your fixed
function graphics pipeline.
And Nvidia was also kind of big with this,
where they also came up with CUDA around the time,
I think even a little bit before.
And what they were doing is they were looking at this and said,
you know what?
All the shaders we are doing, it's all the same with compute.
It doesn't really matter what it is.
We just have one thing for the entire thing.
So since forever, how you are programming graphics shaders
or compute shaders is on the hardware, it's basically the same.
So that's kind of like the rough idea
of where all this
can come from.
As you were saying that, I checked. CUDA has been around since 2007. I didn't realize...
Yeah, it's a little bit older, I think.
Yeah, it's been around for a little bit.
Yeah, I think
Tesla was the first GPU to support it.
Jeez, okay.
That's just like 8000 series.
Right, okay, before the numbering reset.
Yeah, uh...
Yeah, GeForce 8800.
Yeah.
Yeah.
That was... that's a while ago.
Yeah.
So everything is quite old.
Yeah.
They came up with CUDA and Apple had their own thing.
And it transitioned at some point to Khronos to make it cross-vendor.
I don't know if that was the reason, but that's kind of what happened.
So what is this RustiCL thing that you've been working on?
Yeah, so I started it mostly as trying to learn Rust. And because, you know, at Red Hat, we have those days of learning, where we can dedicate some days to just learn
OK, OK, that's good.
And I always wanted to go into Rust.
And I was thinking, because I was also
involved with Clover, which is the old OpenCL implementation
inside MISA, written in C++.
And I'm sure there are a lot of people loving C++
and everything, but it never really
was that much by most MISA developers.
So a lot of people didn't really like to work with it.
And I was also not a big fan of OpenCL.
Not OpenCL, I meant C++.
Yeah.
And so because I was involved in Clover and thought,
oh yeah, we can do maybe a new OpenCL implementation,
and we could also figure out what would it actually
mean to support Rust inside MISA, or how would,
if Rust becomes the cool new language everybody
wants to use, what would be a migration path for MISA?
Because at some point, you can say, OK, we will always
stick with C. But that also comes
with the risk of maybe all the new developers
don't want to program in C anymore.
And that's kind of bad for a project
if C is all what you have.
And so I was thinking, I can just spend some time figuring out
how to use Rust to implement APIs inside Mesa,
and how all the integration would work out,
all the cargo stuff, and compiling Rust
code with the current build system,
and just how it would fit in
Most people, when they say they're going to learn a language, they don't go and implement OpenCL. That's not a normal way that most people learn how to write a language.
Might be.
No, it's awesome that you did. Someone's going to be the one who wants to do that. But, so it started as, like... so you didn't start with the intention of taking it anywhere, if I understood that correctly? Like, initially?
Yeah, initially it was more like a prototype. I just wanted to figure out if it would work at all. And at some point, it was just, you know, fun to work on it, so I kept working on it, and that was basically the reason.
At what point did you realize that maybe it's actually a good idea?
Good question. I think when I was creating the merge request for it, probably.
I think you worked that out a bit late.
But yeah, I don't know how long I worked on it before I started to merge it. Maybe a year, maybe half a year.
I think it was a year, because people were kind of aware of it. I remember talking about it back when it was first getting a bit of attention.
Yeah, I think I was talking about this at XDC 2022.
Okay.
The first time, maybe. I think I had a lightning talk in '21 already, just saying, okay, this is what I was doing, what do you think? Yes, I had a lightning talk on the last day of XDC 2021.
So it's not been that long since it's been a serious thing. It's still relatively new.
Yes. Yeah, it's between two and three years, something like that.
Jeez. Okay, so it seems like a lot of the stuff you're involved in is sort of new. Obviously Nouveau's been around for a long time, but both NVK and RustiCL are these fairly new additions to the way that both compute and graphics are done on Linux.
Yes. Maybe I didn't quite get what you're trying to say.
Oh, okay, no, I was just saying... how would I rephrase it? I don't know how to rephrase it, actually. So, anyway.
We'll move on from that.
So, what sort of state was Clover in, like, beforehand?
Because I know there is this open merge request on the Mesa project.
I just love the title.
Just called Delete Clover.
Which is a great title, straight to the point. How usable was Clover, if at all? Was there anyone who was working on it at the time? Is there anyone working on it now?
Yes, so... nobody's working on it. And so, I kind of started... I don't really know the history of why people started it, precisely, just that people started it to have an OpenCL implementation.
And when it was made, it was in the time when LLVM came around. And we had this LLVM compiler for AMD GPUs.
So what Clover was doing is it was very LLVM-centric at the start.
And it was not using any of the other compiler infrastructure
we had inside MISA.
There was even a time when some Nouveau developers
were thinking about moving to LLVM
and to have all the GPU compilers inside LLVM.
And as everybody maybe knows, that didn't happen.
So we just write our own compiler inside Mesa
for every hardware.
And even the RadeonSI driver is probably moving to ACO, which is like the AMD backend compiler for RADV, soonish.
And that was kind of a limitation of Clover, that it only worked for GPUs with an LLVM
backend compiler. And that included, you know... it was basically just AMD.
When I was getting involved in the project, can you still hear me?
Yeah, yeah, I can hear you. Can you not hear me?
Yes, because I'm running Discord in a Firefox tab, and now Firefox is complaining that it's not responding, so I was a little bit, what is going on?
Oh, this is great.
I can't do anything, but, but anyway, as long as you still hear me, I can just ignore it,
I guess.
I guess, hopefully.
As long as the tab doesn't crash.
Yeah.
I have no idea.
Anyway.
Yes, so when I was getting involved...
Oh, there we go.
Now the tab's gone.
I think we lost him.
I wonder if he comes back.
I don't want to cut this, because this is amusing as it is.
Oh, I think I might actually need to...
Alright, we'll see what he says.
It still says he's online.
Uhh...
I don't know...
There he goes!
There he goes! There he goes!
Okay, we'll cut back to when he's back.
Okay, I'm back.
How long was I gone?
Uh, no, two or three minutes.
I-
Oh no.
Not long after you said,
uh,
your tab was
fine, it hadn't crashed yet.
It crashed.
Ah, yeah.
No, yeah, okay.
Anyway, it's working now.
Yeah, the main deal with Clover was just that it required this LLVM backend compiler.
And we have this NIR thing inside Mesa going on,
which is kind of like our own compiler infrastructure.
And all the other drivers are using it for Vulkan specifically.
So yeah, what I was doing was getting Nouveau to support NIR, which was also then used for Vulkan, and making Clover able to use it as well.
Mm-hmm.
I think I would be lying if I say I know exactly what's
the problem with Clover.
Mm-hmm.
I think for me, it was just annoying and frustrating to work with it or develop on it.
So yeah.
And it just...
It was just more fun.
How has the experience been learning Rust?
Oh, it's pretty great, actually. Yeah?
And I know a lot of people have strong opinions on being told what to do. For me, it's great.
And for me, it's great.
I know that developers are making mistakes all the time.
That's totally normal and totally fair.
And when a language can help me with not making those mistakes,
then it's really great.
With C, you usually have always this risk
of having some out-of-bound memory access
or use after free and stuff like this.
And at least with the things I was doing for a year,
it's mostly something I never have to bother with.
So even when I'm running into heap corruption or something like this, then it's just me using the C code incorrectly inside Rust, and having bugs in the wrappers around the C functions.
Yeah, so it prevents a lot of bugs,
and it's really helpful with this kind of stuff.
I was occasionally seeing...
It also has a strong and big standard library, which in C is kind of a pain, because even basic things like linked lists or something, it's like everybody has to write their own implementation. And having the standard library, and everybody kind of agreeing on what's there, really helps with having other developers just, you know, look at the code and say, oh yeah, those things you should probably do differently,
and stuff like this.
I often see people complaining about... it's not just a Rust thing, but I'll see this with typed languages as well, people complaining about the compiler telling them they're doing something wrong. The compiler's telling you you're doing something wrong because you're probably doing something wrong. My favorite example of this is any time you take a JavaScript developer and stick them in TypeScript, where you have a type system that's actually there, they get very confused, because the compiler is telling them they're not allowed to do certain things with types. Like, yes, because what you're trying to do is a bad idea, so stop doing it. And I'm sure Rust is much the same in that way, where it's like, stop doing that. The compiler is complaining because that's a bad idea, so stop it.
Yeah, I would say that Rust goes further,
because it also manages ownership of values.
Yeah.
And that's what's tripping up a lot of programmers.
And yeah, because it tells you that if multiple things own
something and they have mutable access to it,
you might have a different state than you expect this value to be
and stuff like this.
And it's quite heavily enforced.
Yeah.
Some people just... I mean, it's sometimes a little bit annoying, because you have to write code in a way where you don't run into those issues. But I would also say that it generally leads to cleaner code anyway. It's easier from the thinking-about-the-code perspective, because a lot of the errors you can just page out of your brain, and you don't have to think about certain issues anymore. Like, in C it's totally fine to return the address of stack memory, and maybe the compiler complains, maybe it doesn't. Stuff like this.
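(A minimal sketch of the C bug he mentions, my own example: returning the address of stack memory compiles in C, with at most a warning, while the equivalent Rust is rejected by the borrow checker.)

```c
#include <stdio.h>

/* `local` lives in this function's stack frame, which is gone the
 * moment we return, so the caller receives a dangling pointer. */
static int *broken(void)
{
    int local = 42;
    return &local; /* GCC/Clang typically warn, but it still builds */
}

int main(void)
{
    int *p = broken();
    printf("%d\n", *p); /* undefined behavior: reads dead stack memory */
    return 0;
}
```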
So as you were learning Rust, how did you go about doing so while writing this driver? Did you just look at the documentation for Rust and wing it, basically? What was your approach?
Yeah, I think it's basically that. They have this introduction thing on the rust-lang website, I think, which I looked into.
It was just basic stuff on how to write Rust code,
and so nothing really complicated.
But yeah, but then at some point, I just say, OK,
I want to write code.
I'm not really a good documentation reader, so documentation is kind of a thing I avoid. Documentation in the sense of a big block of text on how to do something, that is. I look into the reference, the reference manual, I don't know, but, like, the actual documentation of the standard library.
Right.
That's something I usually look a lot into.
They have examples there on how to use code and how to use the functions and types and
everything.
So how long do you reckon it took you to start feeling comfortable writing Rust code?
Not quite sure.
I don't know if I'm comfortable yet at the level where I can say, okay, I know what I'm
doing.
Sure.
But I'm comfortable enough to just write code and the compiler is not getting
too much into my way.
I think there's still a lot I need to learn and more, you know, write code, more idiomatic
and stuff like this, but I think it's not too bad.
See, most people would have answered, I feel comfortable now. You have a Rust implementation of OpenCL in the Mesa project, and you're like, I don't feel comfortable writing Rust yet.
No, I mean, it's more like, is this code optimal enough, or is there a way to do it better, and stuff like this. Sure, I can write code, and that's totally fine. But I wouldn't feel comfortable enough to express my opinion on how Rust code should look.
Mm-hmm, okay, now that's fair.
If that makes more sense.
So you, you know enough to get what you need to get done, but you're not like a,
you're not a Rust specialist.
Yeah, I think that's a fair summary
So what is the state of RustiCL at this point?
It's a conformant OpenCL implementation on some Intel hardware.
So I filed for official conformance
at the beginning of this year, I think.
Yeah.
So it's passing the OpenCL conformance test suite, which has a lot of tests, and I'm generally testing against that. So that basically works.
And a lot of applications are already running, I think.
Oh, OK.
It's not enabled by default yet, just for stupid reasons. Oh, it was actually last year, end of last year, that I filed, so it's kind of been a year since it's conformant.
Some OpenCL code is really, really heavy. And the way we are compiling code inside Mesa is that we basically inline everything into one huge function doing everything. And this can lead to some benchmarks using, like, 30 gigabytes of memory or more, which is a problem.
So I'm not really comfortable enough to enable any devices by default yet because of this. Because, you know, if you start something and your system goes out of memory because of that, it's kind of bad.
But besides that, a lot of stuff is working.
Recently, we also merged the support for OpenGL sharing. So there's an OpenCL extension to import OpenGL objects into your OpenCL application.
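(A hedged sketch of what GL sharing looks like from the application side, my own example: it assumes a cl_context that was created against the current GL context, and an already-allocated GL texture; the wrapper name is mine, the CL entry points are the standard cl_khr_gl_sharing ones.)

```c
#include <CL/cl.h>
#include <CL/cl_gl.h>
#include <GL/gl.h>

/* Wrap an existing GL_TEXTURE_2D as a cl_mem so an OpenCL kernel can
 * write into it. Before using it in a kernel you must also hand it to
 * clEnqueueAcquireGLObjects(), and release it again afterwards. */
cl_mem wrap_gl_texture(cl_context ctx, GLuint gl_tex)
{
    cl_int err;
    cl_mem image = clCreateFromGLTexture(ctx, CL_MEM_WRITE_ONLY,
                                         GL_TEXTURE_2D, /* target */
                                         0,             /* mip level */
                                         gl_tex, &err);
    return err == CL_SUCCESS ? image : NULL;
}
```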
And we had a student intern working on this, and they've done a lot of great work. I mentored him on this project, and I'm very happy that we finally managed to merge it. The internship ended, like, nine or ten months ago or something, and there were still a lot of details to cover. And the student, Antonio is his name, stuck around and helped out with random bugs and putting it into proper shape.
So with that, you can actually run applications like DaVinci Resolve on Mesa out of the box.
Because I know that was one of the issues with DaVinci. I think you needed to use the proprietary AMD GPU drivers to actually get anywhere with it previously.
Yeah, that might be right.
Sorry, go on.
Yeah, I think there were issues with the way they were doing it. The big problem with this GL sharing implementation is that it requires a private extension on the OpenGL side as well. And the way AMD implemented it against Mesa was kind of weird and probably not working well with Mesa. But I also haven't really tried it.
I know that debugging DaVinci Resolve is a mess, because
it starts like hundreds of threads. And every time there's a bug, then I look at GDB, and it's like, oh yeah, thread number 500. It's like, oh yeah, okay, fine.
So it's heavily multi-threaded,
and a lot of bugs could trigger just by doing stuff.
And I know, for example, if there's a bug, then the video preview wouldn't show, and stuff like this. Yeah, it's a mess.
So I guess it's sort of just as extreme as trying to test out NVK, like actually trying to get NVK to work with a game. Instead of doing the individual tests and going through that, actually trying to reverse engineer from a full game trying to run. I probably worded that really badly.
I think it makes sense.
Okay, good. I hope it makes sense.
Yeah, there's just a lot of stuff going on with it.
Yeah, definitely. It's a complex application.
It's doing a lot of stuff.
Yeah, and I also have other projects running.
It's really hard to find fancy applications using OpenCL,
because it's not graphical.
It's not visual like a game, where you can impress people really easily.
But yeah, DaVinci Resolve
is definitely something people
are happy to see
working on more hardware
and out of the box and everything.
Yeah, I've seen some people talk about using RustiCL with Darktable, but just looking at Darktable as someone who doesn't understand what Darktable is, it's just like, okay, what am I looking at here? What's so special about this?
Yeah, I mean, Darktable is just, like, a Photoshop equivalent, I would say. I don't know if it's... or maybe more like... darkroom is, I think the... or Lightroom or something. One of those professional photo editing tools.
Yeah, Lightroom is the one you're thinking of, yeah.
Yeah, I think Lightroom is the proper equivalent.
Yeah, and it's basically like, if you have your fancy camera and you're importing all the raw pictures there, you want to make them look better, essentially.
And you have hundreds of filters to achieve that.
And as far as I know, all those filters
can also run through OpenCL.
So instead of using the CPU, you can actually
run it on the GPU as well.
And it usually means that the power consumption
is going down.
It's not necessarily faster on Intel, but instead of having your CPU going to 100%, it doesn't do that, and the GPU is maybe a little bit busy.
But yeah, your laptop is heating up less. And on AMD it's even better; it's faster there than using the CPU.
Yeah, because AMD's made a big deal about their whole APU thing, and they've tried to make their GPU core really powerful.
That was a big push they started doing. I think when they started doing Ryzen, that was where they made that shift.
Yeah, it might be.
It's really cool that something like DaVinci actually works. That's just cool.
Yeah.
I would like to share a video, but I don't think it's a good idea, because I used raw video material that's copyrighted and everything, and I would have to ask permission.
Yeah, no. Don't show anything that could be a problem for you.
Yeah. But yeah, it works. I had to fix a few bugs, but I think with Mesa 24 it should be working.
What's the current version now?
23.3. Okay. At some point next year.
Oh, okay. So not that long away then.
Yeah.
That's... yeah, that's just really cool.
Let's see, where can we go from here? I actually don't know where we can go from here. I did ask people in my channel about anything they have to ask you, so let's see what they have. Let's see if there are any questions that actually make sense in here, anything we didn't already cover. Of course, we've got questions about GSP. I'm just going to read the questions; if a question makes absolutely no sense, we'll just move past it.
That's fair.
"Please ask him if there are any plans for S-Y-C-L." Is that supposed to be... oh, the camera's back.
Yeah, I didn't notice I hadn't enabled it.
No, no, it's good. Oh, now the camera is in the wrong order. Okay, that's fine. Oh no, I messed it up.
It's fine.
Please ask if there are any plans for SYCL on Rusticl, and when does he see Nouveau plus Rusticl becoming a viable alternative to CUDA?
Jeez, that's a...
Can you quote? Can you post it?
Yeah, I will just post the comment. I don't know if maybe there was a misspelling in there or something.
Oh, SYCL. I think it's pronounced "sickle".
Okay, I've never actually heard of that.
Yes, so SYCL is a fun story. I can definitely talk about it for quite some minutes.
It will also go into compute ecosystem madness.
That's fine. I don't know much about compute.
Yeah, so the big advantage you have with CUDA is... maybe I'll start with OpenCL.
So OpenCL has the host code, the application code you're writing in C. And then it has the OpenCL C kernel language, which you write in a separate file, and it's a different syntax and everything.
So at runtime, you kind of have to load the file into a memory buffer and then compile your code from there, and stuff like this.
So it's not that great.
CUDA, on the other hand, put the GPU code and the CPU code inside the same file. You compile it with one tool and then you're good to go; you call the GPU functions in a special way, but the compiler is making sure that it all runs on the GPU. So people kind of like this model more.
And people came up with SYCL, which is kind of the same thing, just open. Well, SYCL is just an API, but we have open source implementations. And it's kind of there to be C++ focused, which CUDA, I think, also is. And you basically write the same code for the GPU and the CPU in the same file. They use templates for some of the magic. They compile the... well, now we're into implementation details.
So the problem with SYCL is that it's an API: the application API is defined, but not the runtime.
So you don't have an actual runtime with SYCL. You have it with OpenCL, you have it with OpenGL or Vulkan. So if you compile against OpenCL or OpenGL, you can run it on every driver. That's not the case with SYCL.
There are SYCL implementations, like the Intel one, which is able to layer it on top of OpenCL. And in order to do so, it compiles the GPU code into SPIR-V. SPIR-V is just an abstract intermediate representation for GPU code; it's like assembly, but not quite. And it's also used in Vulkan. That's for everybody who doesn't know what SPIR-V is.
And the problem is that there are bugs inside the Intel implementation which actually violate the OpenCL and SPIR-V specifications.
So the question is always, from a Mesa perspective, do we think it's important enough to work around those bugs or not? Or do we file the bugs and hope they fix them?
I have a merge request, which I hope I can merge soonish, which kind of makes sure the basics work. The biggest problem is that there is this optional API in OpenCL to accept SPIR-V instead of OpenCL C. And I actually validate the SPIR-V. And the validator is just screaming at whatever I'm getting from SYCL and says no.
So I had to add a debug option to say, okay, ignore what the validator is saying. And this can lead to crashes. It might not; some stuff works, as far as I know.
But yeah, having to deal with invalid SPIR-V is always difficult, because you also don't want to make your code way more complex just to accept buggy code for something which isn't really heavily used that much yet.
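For context, the optional ingestion path described here is clCreateProgramWithIL (core in OpenCL 2.1+, also exposed as the cl_khr_il_program extension); a hedged sketch, where the SPIR-V byte buffer is a stand-in for whatever a SYCL toolchain emitted:

```c
/* Hedged sketch of the SPIR-V path: instead of OpenCL C source, the
 * application hands the driver a precompiled SPIR-V module. This is
 * the entry point where a runtime can validate (or be told to skip
 * validating) the incoming IL. The byte buffer is a placeholder. */
#define CL_TARGET_OPENCL_VERSION 300
#include <CL/cl.h>

cl_program program_from_spirv(cl_context ctx, cl_device_id dev,
                              const void *spirv, size_t len)
{
    cl_int err;
    cl_program prog = clCreateProgramWithIL(ctx, spirv, len, &err);
    if (err != CL_SUCCESS)
        return NULL;
    if (clBuildProgram(prog, 1, &dev, "", NULL, NULL) != CL_SUCCESS) {
        clReleaseProgram(prog);
        return NULL;
    }
    return prog;
}
```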
Yeah.
But yeah, it's like I work on this,
and I kind of want to make it work,
but there are challenges and everything.
So yeah, I think that answers the question.
Oh yeah, a viable alternative to CUDA.
Good question.
I don't know, honestly.
I know that NVIDIA has a very good compiler.
And I know that in Mesa, we would have to add a lot more optimizations to match the performance of NVIDIA.
I think NVK is probably a better approach
to get closer, because Faith was also starting a new backend
compiler for Nouveau.
And that will be critical to match NVIDIA's performance
there.
And once the compiler is in proper shape and is able to compile compute code properly, I was also thinking about using it for OpenCL.
Whether OpenCL would be able to match the performance of CUDA, I don't know, honestly. So we have to see.
But I think it's good if we have other alternatives besides CUDA in the compute ecosystem, because everything else is kind of weird.
The biggest advantage you have with CUDA is that you install it on your system, and it doesn't matter what NVIDIA GPU you have, it runs.
Right.
You don't have that with ROCm, which is like an AMD clone of CUDA. They call it HIP, which is kind of the same API, just renamed and a little bit changed.
But the difference is that, for example, in CUDA you have PTX, which is like SPIR-V: it's not targeting a specific GPU. So if you compile CUDA code, you can run it on every GPU.
And with ROCm, that's not the case, because they compile to GPU code directly. So if you have newer GPUs with a new ISA on the shader processors, it doesn't run. So you have to recompile the code.
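To make that portability point concrete, here is a hedged C sketch using the CUDA driver API: PTX text embedded in the application is JIT-compiled by the driver for whatever GPU is actually installed, which is exactly what a compile-to-ISA model can't offer. The PTX string and kernel name are placeholders.

```c
/* Hedged sketch: loading PTX (a GPU-agnostic IR) via the CUDA driver
 * API. The driver JIT-compiles it for the installed GPU, so the same
 * binary keeps running on newer hardware. ROCm, by contrast, ships
 * ISA-specific code objects. The PTX string is a placeholder. */
#include <cuda.h>

static const char *ptx = "/* ...PTX text from a compiler... */";

int load_ptx_kernel(CUfunction *out)
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;

    if (cuInit(0) != CUDA_SUCCESS)
        return -1;
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    /* JIT compilation to the device's native ISA happens here. */
    if (cuModuleLoadData(&mod, ptx) != CUDA_SUCCESS)
        return -1;
    return cuModuleGetFunction(out, mod, "my_kernel") == CUDA_SUCCESS
               ? 0 : -1;
}
```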
This is totally fine if you think in the HPC mindset, where you know what hardware you're targeting and you want to get the most performance out of that hardware as possible.
But from a developer's "I only have a desktop or laptop" kind of situation, it kind of sucks. And they have to improve on this.
And the same is also true with SYCL. They are also more driven by the HPC mindset, but they are improving. They are currently working on adding infrastructure code to LLVM, so it's less of a problem there.
And I think there are also people working on the runtime API, but I don't know; there might be rumors. I never talked to anybody actually knowing anything there.
But yeah, once this problem is solved, at least SYCL might be a good alternative.
But so far, I think, at least for the things I care about, and that's mostly non-HPC stuff, more like desktop stuff, where you have DaVinci Resolve and this kind of thing, then besides CUDA, you only really have OpenCL.
And even if it's not an alternative in terms of performance, it's the only alternative you have to actually get cross-vendor GPU compute support.
Right.
When I was saying a viable alternative, I don't think it necessarily means one-to-one sort of performance, but having something that actually works.
You're probably not going to see giant GPU farms switching from NVIDIA or whatever to OpenCL, but actually having something that's usable, I think, is definitely a good goal to have.
Yes, and I would say it is at this level. I think obviously OpenCL had kind of a sad story a few years ago, where nobody really cared about it.
But I, for example, participate in the OpenCL working group at Khronos. And there's active engagement there on improving the language and everything.
So I think there is more interest in OpenCL than there was a few years ago. It's definitely going up rather than down. So, yeah.
Intel is investing into it, and some other vendors.
This is a giant paragraph, but I'm going to assume the person makes sense. Okay.
"This might be a very harsh question, but I'd like to ask about visions of OpenCL. Currently, CUDA is very dominant in the GPGPU market space, which is NVIDIA specific. And those trying to support AMD GPUs usually fall towards HIP from ROCm instead of OpenCL. Blender dropped the OpenCL backend for Cycles in favor of HIP. TensorFlow only supports CUDA, and the only way to run it with an AMD GPU is an unofficial ROCm fork which runs with HIP. Where would OpenCL stand in these situations, especially when Khronos released Vulkan to kill both OpenGL and OpenCL at the same time?"
I think it makes sense for some projects to drop OpenCL support if they are not willing to spend time on it. I don't actually know what the reason was for Blender, but I wouldn't be surprised if they looked at it and said, okay, nobody actually cares about the code. And that happens, and that's totally fine.
TensorFlow doesn't only support CUDA. I mean, it's not easy to run it with something else, but yeah, there is the ROCm stuff going on, as the question stated. But there are also other projects to make it run on other hardware.
There is the OpenVINO project Intel is working on, which is kind of making a common API for AI/ML use cases. And it claims to support TensorFlow. I haven't tried it out yet, but I'm kind of interested in looking into OpenVINO, because the GPU backend actually uses OpenCL. And that's something I'm planning to look into.
It's very Intel specific; it uses a bunch of Intel OpenCL extensions. I don't know yet how difficult it would be to support, but there are a few projects, at least from Intel, where they are working on making all the AI/ML stuff usable besides CUDA.
They also have their own replacement for OpenCL called Level Zero, which is just more low-level than OpenCL. And they have their own OpenCL-on-top-of-Level-Zero implementation. But most people can ignore this for now.
But yeah, they use OpenCL for quite a lot of stuff.
I wouldn't say that Khronos released Vulkan to kill both OpenGL and OpenCL. The premise of Vulkan was always to be as low level as possible.
And if you're an application developer who just wants to get stuff done, then using Vulkan is usually not quite right. Like, I wouldn't be surprised if most indie game developers, for example, are still targeting OpenGL just because it's easier.
I also haven't asked what the general opinion is. But generally, if you look at Vulkan examples, if you want to draw a triangle, then there's this amount of code and everything. So it's not easy to use. And generally, what people use are engines on top of Vulkan.
OpenCL is also able to be layered on top of Vulkan. I mean, I've talked about it earlier with Zink: you can run OpenCL applications on top of any Vulkan implementation, in theory.
And Vulkan is adding extensions to make it easier to do so.
So I'm making use of a bunch of very OpenCL-specific
Vulkan extensions, because otherwise it
would be like low performance.
You would have to work around certain things inside Vulkan,
which is just killing your performance.
So yeah.
I think OpenCL is pretty straightforward to use. It's really easy to offload stuff onto the GPU with it. And I think it will stay around with OpenGL for quite a while, exactly for this purpose. Actually, OpenCL will probably stay around for longer. But I also wouldn't be surprised if we end up with a new API at some point.
But it also doesn't really matter, because most of the work in OpenCL is really done on the compiler side, and that can also be used for any other new compute API emerging.
I was curious how much code it actually would take to make a triangle in Vulkan, so I had to look it up, and there is a lot of code there. I see what you mean. There's this page that does a big walkthrough of the code you would need, and I just keep scrolling; it's just a blue triangle. It's very explicit. It's very explicit with everything.
It's explicit because it's such a low-level way to interact with the GPU.
Yes, so the big problem with OpenGL was that all the memory allocation was hidden. So you could allocate high-level API objects, but you couldn't say, oh, give me 500 megabytes of GPU memory.
Right.
And Vulkan is more like this, where you say, okay, give me this memory, and I want to do this and this and that with it, and I want to explicitly synchronize all this stuff.
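To make that contrast concrete, here is a hedged C sketch of Vulkan's explicit allocation; it assumes a VkDevice already exists, and the memory type index would normally be picked via vkGetPhysicalDeviceMemoryProperties:

```c
/* Hedged sketch: explicitly allocating GPU memory in Vulkan, versus
 * OpenGL, where allocations hide behind objects like glBufferData.
 * Assumes `device` exists; memory_type_index would normally come
 * from vkGetPhysicalDeviceMemoryProperties. */
#include <vulkan/vulkan.h>

VkDeviceMemory alloc_gpu_memory(VkDevice device, uint32_t memory_type_index)
{
    VkMemoryAllocateInfo info = {
        .sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO,
        .allocationSize = 500ull * 1024 * 1024, /* "give me 500 MB" */
        .memoryTypeIndex = memory_type_index,
    };
    VkDeviceMemory mem = VK_NULL_HANDLE;
    if (vkAllocateMemory(device, &info, NULL, &mem) != VK_SUCCESS)
        return VK_NULL_HANDLE;
    /* Binding this memory to buffers/images, and synchronizing every
     * access to it, is then entirely the application's job. */
    return mem;
}
```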
And there's a lot of low-level stuff you have to do.
And I guess that explains why you can build something like Zink that then does these other APIs on top of Vulkan. Because it's so low level, it gives you the control that allows you to do these more, I guess, specialized APIs?
Yes. I mean, I've also done some benchmarks with Zink. For example, there's this LuxMark ray tracing benchmark a few people might know. And when I was testing on my NVIDIA GPU, it was NVIDIA's OpenCL versus Zink plus NVIDIA's Vulkan.
It ran at like 92% of the performance of the native OpenCL driver, so we have like 8% of performance loss. It isn't really runtime heavy; it's really just executing a lot of GPU code. But it's a great benchmark on how well the GPU compiler works. And I think 8% isn't that much of a deal.
Nope. At the extremes it's probably a massive deal, but for just having it actually do the thing, that's not that much.
Yeah.
Yeah. What else? Okay, let's see what else we have in here. I think we sort of touched on this before, but it's worth mentioning again: is there any hope for Pascal and Maxwell, or are they doomed forever without re-clocking support?
Maxwell is in this weird state that because we
can't control fans, like only fans,
that's not a problem on all GPUs.
You might have a passive GPU, like passively cooled,
or you might have a GPU in a laptop where usually the system
firmware is responsible of driving the GPU fans.
Most laptops just have one cooling system,
but some laptops have two fans for the CPU and the GPU.
And in this case, we could actually re-clock the GPU,
but the firmware situation is a little bit weird,
because we are using the firmware of Nvidia, some stuff,
and making use of our own firmware for, like,
we need it for memory re-clocking,
because that's kind of a pain.
firmware for like we need it for memory we're clocking because that's kind of a pain um somebody would have to make that stuff work again i had a really really dirty hack
to make it work but apparently it doesn't work anymore so
and considering the age of the hardware, it's like not a focus.
Yeah, if somebody really wants to...
Yeah, yeah. There's hope, but also not for every user of that hardware.
Right, right. I wouldn't hold my breath for it. But if someone wants to do it, there's a spot open for you.
Yeah, it would probably also be a pain to maintain, so it's kind of... Probably with an opt-in, it might be fine. But, yeah.
Simple question. I don't know what they're trying to ask. What's with the 600 GPU series?
I don't know. That's probably Kepler. I'm not sure what they're trying to ask. I mean, we have re-clocking; it's not perfect, but it runs. And it's supported with NVK. So...
That was the entire question.
Yeah, that was the entire question.
Here we go, copy this one. As far as I know, Nouveau targets the 535 GSP firmware, but 545 is now certified for the G4 series.
Yeah, painful question.
Less painful for us. But yeah, there's not strictly a need to update to every version.
There have been situations in the past where we needed updated firmware for existing generations of GPUs. Like, for some GPUs we had this problem that the firmware we initially got from NVIDIA didn't work on newer GPUs, so they had to give us an update. It took a while and landed at some point.
But yeah, this can arise. And because there is no stable API, it might break.
I think the API at least is stable in part, because they have an RPC mechanism to call into firmware code and stuff. So I think this part is relatively stable. But the entire API you would use can change at any moment. It even changes within a release branch; the new 535 update also broke the API.
The problem with this is, and I think some articles already covered it, that the firmware is huge. Just the two files we have now are like 60 megabytes. And if you consider that some distributions are putting those files into their initramfs, and you have, I don't know, three kernels, you have the files three times there; it's like 200 megabytes.
And if we would update to every firmware version, we'd also have to keep the old firmware files, because of the kernel's don't-break-userspace kind of rules. So people would run out of space on their boot partition, because some distributions are still using this one-gigabyte boot partition, and it's like, pain.
The big problem is, if you have full disk encryption, you can't really access the system yet. So it kind of makes sense to have the firmware inside your boot partition.
And the usual discussion on this is, yeah, why aren't you just loading the GPU driver later? The problem with that is that sometimes you need the GPU driver to drive all your displays.
Usually on a laptop, it's not a problem.
Usually on a desktop with a single display, it's also not a problem, because the firmware will set up your display just fine.
But for example, imagine you have an office at home,
and then you have your laptop always closed.
It's buried under a lot of stuff,
and you don't want to get it out ever
and have external displays connected.
And sometimes the firmware is not
able to bring up those displays.
So you need a native GPU driver to bring up
all the displays on your system.
This usually happens if you have multiple GPUs in your laptop: the internal one works, but the discrete one doesn't. Or, I think, it also doesn't work through Thunderbolt. I have this problem with my laptop; the display connected over Thunderbolt DisplayPort MST doesn't come up through the firmware. So it needs a GPU driver to actually bring up the display.
So it's a really hard issue to solve.
And if we keep adding more firmware files,
then the boot partition is just out of space.
So yeah, this is kind of the problem
we have with this firmware.
And distributions also need to figure out what to do with it.
So you mentioned the... if you want to update... go on, go on.
Yeah, I was going to bring up the boot partition.
Yeah. I don't know what other distros do, but if you go to the Arch Linux wiki, it recommends
at least 300 megabytes for your boot partition.
Yeah, it's enough for one kernel.
I mean, Arch Linux also only has one kernel installed, so they don't really have that
issue.
Yeah.
Distributions like Fedora, they keep three kernels.
I think Ubuntu is kind of doing the same. Or at least Ubuntu, I think, always keeps the current one and then one or two older ones or something like this. So it kind of depends.
And yeah, it's not so bad yet with 60 megabytes.
But if we would update to every single firmware version,
then it's not really sustainable.
Right, right.
There are ideas on improving the situation.
There are ideas of having a separate initramfs shared across all kernels where you could put all the firmware files, so you wouldn't have to duplicate them.
There's also an idea Dave Airlie is working on, to declare at the kernel module level which sets of files you need, like versioned sets of files. You could say, I support those ten sets, but I only need one of those. And then, in the initramfs generator, you could just pick the newest or something.
This would probably also solve a big chunk of this problem. Well, then you wouldn't need a one-gigabyte boot partition.
Yeah. I mean, btrfs subvolumes would also solve this problem, so...
Yeah, but then you gotta get people to stop using ext4.
Sure. I'm just saying that it's one solution to the problem.
It's certainly a solution, yeah.
Let's see, was there another one that we haven't really addressed in here? "I want to know how the progress with Rusticl profiles is going."
Profiles?
Profiles?
I suspect they mean the OpenCL profiles,
because you have embedded profile and full profile,
which is to differentiate between embedded devices
and four-file GPUs.
And on Intel, the full profile is supported.
On AMD, the embedded profile is supported.
The reason is very stupid.
You need a certain amount of samplers supported,
or images, or whatever.
And the Radeon SI driver limits to 32.
And the full profile requires 128.
The difference between the embedded and the full profile isn 128. That's like the difference between the embedded
and the full profile isn't even that huge.
It's just mostly some precision stuff.
And yeah, it highly depends on the actual MISA driver,
what gets advertised.
But the profiles are supported, and the code
detects on what's valid to advertise.
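For the curious, the profile a device ends up advertising is visible through an ordinary device query; a hedged C sketch:

```c
/* Hedged sketch: querying the advertised OpenCL profile, which is the
 * string "FULL_PROFILE" or "EMBEDDED_PROFILE", alongside one of the
 * image limits that gates which profile can be claimed. */
#define CL_TARGET_OPENCL_VERSION 300
#include <stdio.h>
#include <CL/cl.h>

void print_profile(cl_device_id dev)
{
    char profile[64];
    cl_uint max_read_images;

    clGetDeviceInfo(dev, CL_DEVICE_PROFILE,
                    sizeof(profile), profile, NULL);
    clGetDeviceInfo(dev, CL_DEVICE_MAX_READ_IMAGE_ARGS,
                    sizeof(max_read_images), &max_read_images, NULL);
    printf("%s (max read image args: %u)\n", profile, max_read_images);
}
```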
I think we've addressed pretty much everything, then. Yeah, let me just check: did anyone mention anything over on my Mastodon? I think I might have seen one here. Where are the replies? Why is that? Okay.
I would like to know what their reaction was when the NVIDIA drivers leaked. I know they can't legally look at them, but did they?
What a question. No, I seriously haven't looked at it.
Yeah, no, definitely not. Definitely don't. Also: what is their major hurdle, having to reverse engineer stuff, DRM? What a mess of a question.
I think we talked about what the major hurdles were with driver support earlier. It's documentation, and then having to reverse engineer stuff. I don't think there was anything... is there anything else you would like to add to that?
No, I think that one problem is that how the driver works can change, and then all the tools you have stop working. That's why there's work on a new tool to dump the GPU commands. But yeah, it's something you have to do from time to time. And then, yeah, just figuring this stuff out is the biggest problem.
Well, I think we're pretty much done here, then. Thank you for doing this. Appreciate it. I still don't understand much about the GPU stuff and compute stuff, but I feel like I'm more informed.
Yeah, I mean, in the end it's just code on a GPU.
Yeah, fair enough.
Is there anywhere that you would like to direct people to? Any talks you've done or anything like that?
Oh, I usually always have a talk at XDC, so they can probably look at those. But yeah, I don't have a blog; I'm pretty bad with this. I have an account on Mastodon, so people can follow me there if they want to. But yeah, I'm...
Fair enough.
...posting regularly enough, I think.
Sounds like you're busy enough doing actual important stuff that you don't need to spend your time writing a blog.
I don't think that's the reason. I'm just... I'm not a very...
Yeah, yeah, you're super busy, that must be it. I'm sure if you sat down and wrote a blog, there's definitely a lot of stuff you could talk about; it's just a matter of, do you feel like actually writing it down?
Yep. So... I usually talk about this stuff on my Mastodon account, so...
Yeah, well, that does the job anyway. And then, you know, you just post something on Mastodon, and then you get a bunch of articles written about it, and they can do the work for you.
Yeah, basically. That kind of works pretty reliably.
Well, awesome. Thank you, as I said, thanks for doing this. I guess I'll do my outro and then we'll just sign off. So go check out the main channel.
I do Linux videos there six days a week.
Not a clue what'll be out when this comes out.
This will be out in a couple of weeks at this point.
I've got the gaming channel that is Brody on Games.
Right now, I'm probably still playing through Armored Core 6
and probably still Kingdom Hearts Dream Drop Distance.
That I do twice a week on Thursday and Friday.
Check out the Discord and you'll see when that all goes up.
Or just check when the previous streams went up and there'll be a notification.
If you're listening to the audio version of this,
you can find the video version on YouTube at Tech Over Tea.
And if you're watching the video and you want to hear the audio version,
there is an RSS feed.
It's on pretty much every audio
podcast platform. You'll find it pretty
easily.
So give me a final word. What do you want to say? How do you want to sign off the show?
Try out everything I'm doing, and please report bugs.
Absolutely.
See you guys later.