Two's Complement - Async Whatevers
Episode Date: May 18, 2022
Ben and Matt talk about various styles of asynchronous programming, from Node.js and Ruby's EventMachine to C++ coroutines and the new JVM Project Loom. Schedule yourself a listen, won't you?
Transcript
Are you set?
I'm good.
Right, then we should sing this theme tune.
Do do do do do do do do do do do do do do do do do do do bling bling bling bling.
No?
You're going to have to do that now.
That's going to be the episode intro.
Oh no.
I'm Matt Godbolt.
And I'm Ben Rady.
And this is Two's Complement, a programming podcast.
Hey, Ben.
Hey, Matt.
How are you doing?
Great.
It's another Friday. It's another podcast recording day.
Mm-hmm.
And you came to me with an idea already.
I did. I had a thought, which is... Most unusual.
Async await, and asynchronous programming in general,
and why we do it, and threads, and all the things.
There's a whole...
This topic could go on forever, but...
Like all of our topics, really, they're deliberately vague
because we haven't really thought about them very much.
So that's the thing.
So that's interesting.
You said async await.
Is that how you pronounce that?
Because I would say async await.
And I'm wondering if that's a British English thing.
You await something rather than await.
Or is it because you said async?
Yeah, no, that's just me wanting there to be alliteration.
Is that what that is?
I mean, it's alliterative anyway, right?
Because they both have an A sound at the beginning.
Yes, but I'm trying to emphasize the alliteration in async await.
So we better say what that is.
What is async await?
Async await.
Well, it's a
programming style, I guess
is what I would call it.
It's a technology. It's a programming style.
It's a floor wax. It's a
dessert topping.
What's that from?
I think it's like an SNL bit from
back in my very, very,
very early days.
But yeah, it's...
Well, it's a solution to a problem, actually.
And we should talk about what that problem is.
Which is, if you have a program that has no threads
and has no async magic,
there's just a series of instructions.
And that program does things that are IO-bound,
like, for example, fetching a bit of data from a remote server or performing a database query or even reading a file.
Right.
And if the progress of that program is blocked on that information,
if it can't go forward until you've retrieved the information that you need,
then your program, or usually the CPU, is sitting there
doing not very much while you're waiting for this favicon ICO file
to be downloaded from a server somewhere halfway
across the world, or whatever it is that you're doing.
And so it could be doing a lot more, and you could be getting a lot more out of your computing power, by not simply waiting for it: by running other parts of your program, or a different program, or other things.
Well, taking the other side there, just quickly: obviously, in a modern operating system, the operating system is going to have other things to do. You know, it's got animated GIFs to keep moving in the background, it's got dancing hamsters, the like.
Or other programs will take advantage of your CPU.
But if you, as in the user,
are running a single program that has a single thread of execution
and it is blocked,
then yeah, you're going to have to wait.
And if there's other useful things
that that program could otherwise be doing,
well, they'll happen after the file has come down,
not before.
So obviously the normal solution to this is, like: I'll make some threads, and then I'll have all these complicated mutexes and work queues, or something-something-something. So is that async?
Yeah, no. So I think threads are one solution to this problem, and async is maybe another solution to this problem. But the underlying problem, I think,
is that for a single sort of program,
you want to be able to maximize CPU time,
reduce latency, increase throughput.
All of these things can be achieved
by just making better use of the CPU.
And
one way to do that is by having threads.
Another way to do that is with async.
And I think
that either way you go,
there are some sort of
head-twisting, complicated
problems that you can
run into.
Some might be easier to deal with than others.
Some might be easier to test than others.
But this is the classic programming technique
of solving one problem by creating another problem.
Regular expressions, for example.
Yes, exactly.
If the R value of your problem growth is less than one,
then you've done a good job.
Yes.
But I suppose this hypothetical program, you said it's fetching something, or reading a file, or doing whatever. The assumption here is that there are other things that that program could be doing in order to continue. I mean, let's say it's a word count program, and we've been given a list of files, and it's going to open up the files. And, like, naively, I would write: for each file in list of files, open file F, read every line, count how many lines there are, add them up, and print it out at the end. That would be a not unreasonable way to write that piece of code. But for all you know, there's multiple drive spindles, some of these things are on the network, or whatever. And, you know, you could easily saturate the PCI bus or the network link or whatever if you were doing more than one file access at a time.
But you're not.
You're just reading one after another.
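The naive loop described there might look like this in Python. A blocking sketch, with the file names and contents purely illustrative:

```python
import os
import tempfile

def count_lines(paths):
    # Naive, fully synchronous version: each file is opened and read one
    # after another. While one read blocks, nothing else gets done.
    total = 0
    for path in paths:
        with open(path) as f:
            for _line in f:
                total += 1
    return total

# Demo with two throwaway files (names and contents are made up).
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

print(count_lines(paths))  # 3
```

Each iteration waits for the previous file to finish before the next read is even issued, which is exactly the missed opportunity being described.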
So how do you... the async await is a solution to the problem of: if I know that I have multiple things I could be doing at once, how can I more easily, for some definition of easily, write a program that still looks like a kind of normal program? One that isn't too far removed from that for loop I just described, but without tying myself up in all these kind of operating-system-level thread spawning and synchronization issues that I might otherwise have. And so should we talk a little bit about what a straw-man implementation of this thing might look like?
Yeah, yeah, I think that's good. So I mean, if it was this word counting thing, then, in a way, you could spawn one thread per job,
per file,
and then let those threads run their separate ways
and then sort of have some kind of collect-all-the-threads-together at the end,
and look at the answers and add those all up.
That would be a perfectly sensible way of doing this.
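That thread-per-file version could be sketched like this in Python. Hedged: real code would more likely use a thread pool, and the function names are invented:

```python
import threading

def count_file_lines(path, results, index):
    # Each thread handles one file and drops its answer into its own slot.
    with open(path) as f:
        results[index] = sum(1 for _ in f)

def count_all(paths):
    results = [0] * len(paths)
    threads = [
        threading.Thread(target=count_file_lines, args=(path, results, i))
        for i, path in enumerate(paths)
    ]
    for t in threads:
        t.start()
    # "Collect all the threads together at the end": join, then add up.
    for t in threads:
        t.join()
    return sum(results)
```

Writing to distinct slots of a pre-sized list sidesteps the locking you would otherwise need around a shared counter.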
But in the async await world,
you need a framework
that understands how to do
input/output operations, that can sort of cooperatively multitask. I think that's the key here. Threads mean that we're actually using operating system resources to create multiple execution threads at the operating system level. There are literally, potentially, multiple CPUs that could be involved, and these things could be running at once on multiple CPUs. But that's not the kind of problem we're trying to solve here, because the CPUs are not the problem; it's the waiting for files that's the problem. In the async await world, you are scheduling pieces of work with callbacks. Somewhere deep down in the bowels of the framework, you've said: open a file, but just call me back when the file's opened. Don't block.
Yeah, don't block. Like, open the file and read this stuff, but don't wait until you're done reading it to return from that call.
Right. And that's really the trick behind the back of all of this async await: you suddenly have coroutines that are powering this system of sort of cooperatively multitasking all the different things that you want to do, in a single operating system thread, one after another, as they become ready. Like, the file contents have become available, and now your code can operate on them.
And then let's talk about, like, actually what the syntax of that looks like. Typically (hence the "async await") you tag functions of your program as being async, which is a big hint to the framework or language that you're in, to say: this function can be suspended halfway through, and can sort of return early in some way. And then, when it hits an await within that async function, what we're saying is: do some work, but actually park me here, and when that work is ready, come back to me.
Yeah, yeah. I think the trick there is that that sounds a bit like a thread. And logically, it is a thread of execution: there's, like, a stack, and there's a sequence of instructions that pertain to a single kind of idea, a function that you're doing. But there isn't an operating system resource associated with it. There's literally just a big list somewhere in a framework of: here's all the things I know I need to do when, I don't know, this file read has completed, or this network access is done, or somebody deliberately yields and says, hey, it's someone else's go to run now. And quite often those things are implemented with an event loop,
right? So you'll have basically a queue of events, and then you'll have basically just, like, a while-true loop that is consuming those events one at a time, and then calling back to the callbacks that are related to those events. So you might schedule a file read, and say: okay, when this file read is done, generate an event, put it in the queue. And then when that loop sort of processes through all the events and gets to that event, it says: oh, well, we did this file read. It actually finished about, you know, 200 milliseconds ago, but we were processing other events at that time. Now we're ready to process it. So I'm going to call back all the people that were interested in this and tell them: hey, your data from this file is ready. Now you can do what you want to do. And this has the advantage, like you say, of being able to run in a single operating system thread, so that you don't have to worry about the things that you have to worry about when you're doing multi-threaded programming. You don't have to worry about synchronization.
You don't have to worry about threads clobbering each other's data.
And in that way, it's actually much simpler, because there's just this whole set of problems that goes away.
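A toy version of that queue-and-while-loop, with everything here (class and method names included) invented for the sketch:

```python
from collections import deque

class EventLoop:
    # A queue of events, and a loop that consumes them one at a time,
    # all on a single thread.
    def __init__(self):
        self._queue = deque()

    def post(self, callback, *args):
        # The I/O layer would call this when, say, a file read completes.
        self._queue.append((callback, args))

    def run(self):
        while self._queue:
            callback, args = self._queue.popleft()
            callback(*args)

log = []
loop = EventLoop()
# Pretend a file read finished a while ago and its event was queued:
loop.post(lambda data: log.append("read complete: " + data), "file contents")
loop.post(lambda: log.append("next event"))
loop.run()
print(log)  # ['read complete: file contents', 'next event']
```

Because only one callback runs at a time, no two of them can ever be mid-flight simultaneously, which is where the "no clobbering" property comes from.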
And I will tell you from personal experience, loving tests and wanting to write tests for these kinds of things: it is quite difficult to write tests for multi-threaded code that truly give you confidence that the multi-threaded code actually works 100% of the time. It's quite easy to write tests that convince you that it works, when it only works 99% of the time. And then you find: oh, actually, no, there is this one case that I didn't think about where it doesn't work. So from that standpoint, it's kind of attractive.
Absolutely.
I think that's a really good way of phrasing what's going on.
Like at the nuts and bolts level, there is, as you say, an event loop.
On a Unix-based system, there's probably a select loop
that's got all the file handles of all the things that are going on.
And then as they become ready, as you say, things are woken up.
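In Python, that readiness mechanism is exposed by the `selectors` module (a wrapper over select/epoll). A toy sketch, using a socket pair to stand in for a real file handle, with the registered context string made up:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
read_end, write_end = socket.socketpair()
# Register interest: "when this handle is readable, here's my context".
sel.register(read_end, selectors.EVENT_READ, data="line-counter callback")

write_end.send(b"bytes arrived")   # make the handle ready
ready = sel.select(timeout=1)      # blocks until at least one handle is ready
for key, _mask in ready:
    print(key.data)                # the context we registered
    payload = key.fileobj.recv(1024)

sel.close()
read_end.close()
write_end.close()
```

A real framework loops over `sel.select()` forever and dispatches each ready handle's context to the right callback.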
And described like that, it puts you in mind of, say, how JavaScript and Node.js do all of their work with actual callbacks. You know, you say file.open, filename, comma, and then call this function when the file has opened. And then you end up with this very deeply nested callback, because once the file's open, your function is called, which then wants to read, and then of course that's another asynchronous call. And before you know it, you've got, like, 18 levels of indentation, or you've got 200 disparate tiny little functions, each of which then calls another tiny little disparate function. And that was how JavaScript was for the longest time, before async await specifically. So you can have an event loop, and you can have this callback-type thing, without async, right? But async await is a language-level feature which hides, under some level of syntactic sugar, that callback-based thing, by kind of writing the callbacks for you. When you say await file.read, what's really, quotes, happening is something like: the function that you're in the middle of is cut at that point and turned into two functions, the bit before the await and the bit immediately after the await. And the bit immediately after the await is essentially turned into the callback function for the file.read. And so you don't need to think of it in these terms; you don't need to nest things further and further and further. Your code for an individual, like, I'm-the-line-counting function just says: file equals await file.open filename; something like async for line in file dot async-read-lines; count plus plus; return count. And so you've just written code that looks almost exactly the way that we described for the single-threaded case, with just a couple of magical keywords tagged in there. Behind the scenes, the whole thing is rewritten to be callback-based, or futures-based, or some of the other techniques.
Promise-based.
Exactly, yeah. And it makes, as you say, for a much more testable design, because it is
not subject to the whims of operating system time slices, or multiple CPUs actually executing multiple code paths at the same time, which threads would be. It's only when you say await blah that your logical flow of instructions stops, and someone else could potentially get the use of the CPU. So you kind of have to be aware of that in more advanced techniques. Like, if you share a cache for a class, and you have multiple, like, async awaits of your class going on at the same time, you have to be aware that every time you await something, then potentially the cache could be being used by someone else. But it's so much more controllable, and it is deterministic, which I think is the key for things like tests.
Yeah, yeah. And there's still, you can still
get things like race conditions in both paradigms, right?
Like you can have things that are racing.
They're either racing across multiple threads or sometimes racing to the top of the event loop, right?
Like depending on which events get into the queue first, you might have a code that's executed in different orders. Right, and oftentimes frameworks don't necessarily specify the sequence in which
if there are two things that are ready
at the same tick of the clock,
who goes first? In which case,
again, that can be something that can be exposed by
tests where you're trying to puppeteer time
and the completion of things.
Yes, usually a lot of
both languages and frameworks are
intentionally vague about how that's going to happen,
but yet sometimes accidentally consistent, which sometimes will trick you into thinking that you have code that runs properly
when actually there are cases where it doesn't.
But yeah, I think what's interesting is that there's, and you sort of touched on this a little bit,
is that there are levels to this.
There's, like, the I'm-just-going-to-do-a-non-blocking-call-with-a-callback, right? Like, that's the most basic level of this: there's no async/await syntax sugar, there's not even necessarily an event loop. The sort of select statement is the classic example of this. I'm just going to do a non-blocking IO call, and I'm going to have some facility that allows me to say: when this is complete, call me back over here. But this initial call that I'm making, I want you to return immediately. Right? That's, like, the lowest level of this.
Good old EWOULDBLOCK in Unix.
Yeah, right. And then I think one level
above that are things like futures and promises: things where you're not doing anything special in the language, you're not introducing new keywords. What you're doing is just creating, you know, objects, essentially. Placeholder objects, where you can call a function, the function will return immediately, and you'll get some object back that represents the thing that you've scheduled to happen, and you can interact with that object in interesting ways. Obviously, you can just apply a callback and say, like: hey, when you get my data, let me know. You can also gather a bunch of them together and say: when all of these are complete, let me know. You can gather them together in ways that say: if some fail, then call this function, and if some pass, then call this other function. And you can do interesting things with that as well, but that requires no language changes or language support or anything like that. That's just objects, essentially. And then a level above that is going all the way to sort of full,
sort of like, yes, we're going to introduce new keywords into the language. We're going to bake the sort of asynchronous, non-blocking IO things into the standard
library of the language.
And we're going to make this sort of like a first-class citizen within the language.
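Going back a level: that middle, plain-objects style can be seen in Python's `concurrent.futures`. One caveat: this particular library happens to run the work on a thread pool underneath, but the object interface (a call returns a placeholder immediately; you attach callbacks, or wait on many) is the point. The `slow_read` function and file names are invented:

```python
from concurrent.futures import ThreadPoolExecutor, wait

def slow_read(name):
    # Stand-in for a slow, blocking I/O call.
    return "contents of " + name

collected = []
with ThreadPoolExecutor() as pool:
    # submit returns immediately with a placeholder (a Future)...
    fut = pool.submit(slow_read, "a.txt")
    # ...to which you can attach a callback:
    fut.add_done_callback(lambda f: collected.append(f.result()))
    # ...or gather several and wait until all of them are complete:
    futures = [pool.submit(slow_read, n) for n in ("b.txt", "c.txt")]
    done, _not_done = wait(futures)

results = sorted(f.result() for f in done)
print(results)  # ['contents of b.txt', 'contents of c.txt']
```

No new keywords anywhere: `Future` is just a class with methods.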
Node.js obviously is a great example of that because they just took something that had
already been in web browsers and were like, all right, we're just going to lean into this
real hard and start building other things out of it. Obviously
you and I have written a lot of Python recently
and Python 3.6? Is that what it was?
Yeah, 3-and-a-bit.
3-and-a-bit is when they introduced the sort of async, asyncio, stuff, and introduced those keywords into the language.
But you can also just use, like, a third-party framework. Like, I certainly used EventMachine for Ruby for a good long time, which isn't really async await; that's more of that sort of second level that I was talking about.
Right, with futures, and adding callbacks to the future, or something?
Yeah, yeah. It's really, like, there are futures, but it's like level one-and-a-half, even.
Yeah.
It's sort of like, but the interesting thing about it is it sort of spawns this whole separate non-blocking ecosystem.
And one of the things that you'll see a lot in the Ruby world is that you have like the, you know, Postgres library, and then you have the Event Machine Postgres library.
And they're only kind of related to each other.
And that's because, like...
Same in Python, right?
You know, you've got, like, Postgres and AIO Postgres for the same reason.
And I think that's one of the more...
Insidious is a little bit of a bad word to use.
But, like, one of the things that happens when you start having async libraries is that
you get this separation.
And because it's all pervasive, anything that that async library is that you get this separation and because it's it's all pervasive
anything that that async library calls that itself might need to be async also needs to be async
yeah that's like keyword level down right like you know so you you end up with you know read file and
async read file because they are very different operations one of which returns the contents of
a file the other one returns essentially a the promise or a future of a file.
And so, yeah, you end up with this a bit. It's like in C++: once you start introducing const correctness in one place, suddenly everything needs to be const-correct, because it won't let you not be. And it's the same with async. As soon as you get to a point where you can't do something asynchronously anymore, you're like: well, now we're blocking again.
Yeah, yeah.
We had this running joke for a while there.
It's like the glitter of the programming world.
Once you start using it, it's like it just gets everywhere, right?
Like you can't.
It's got a half-life of, yeah, measured in millennia.
You never get rid of that last grain of glitter or sand from the beach.
Yeah, microplastic.
It's the programming microplastic
So, yeah, you know, it has a lot of advantages. But you do sometimes get into this state where it's like: oh, well, that means this needs to be async, which means this one, which means this one. Well, I'm just going to rewrite the whole thing at this point.
Yeah, everything's async, and you feel like you're just scattering the async keyword everywhere.
Yes.
And that's not without cost. But before we move on to things like cost, and what's actually going on under the hood a little bit more, I was going to make an observation I don't think I made clear when we were talking about the word counting example. Right? You know, we've written these individual async functions that count the number of lines in a file, by async-opening it and then async-for-each reading lines, or however that works. But then you mentioned gather as, like, a primitive operation, and that is the key to actually unlocking the multiple-outstanding-IO-requests-at-once thing. Right? So in this instance, you know, one could write the code at the top level that says: for i in each of the files I want a word count of, await that word count of file i, running total plus equals that. But you would still be back in the world of a single file happening at once; you're just doing it asynchronously. And that event loop hasn't got any other work to do, so it'll just sit there and go: okay, I'm literally in select now, blocking for that file to become ready, and then I call you back. And then you're no better off; you're slightly worse off. But the gather primitive operation you talked about takes a bunch of future objects of some description, be they promises or futures or coroutines or awaitables or whatever, and it says: okay, all of these, we want to kick them all off at once. Again, there's not any threading, so it doesn't actually happen at once. Each one is run until it gets to a point where it yields, but then I come back and do the next one, then the next one, then the next one. And now I have multiple IO requests in flight, and then whichever one of those comes back first, I will service. It goes into the event loop, and now that file is open, and so that coroutine is starting to process the first file, or the second file, and so on. So that is the equivalent of spawning threads at that point. That's how you get the parallelism: by gathering a bunch of async awaitable things.
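Putting the word-count example together with asyncio. A hedged sketch: asyncio's standard library has no async file reads (people often reach for a third-party library such as aiofiles), so the "I/O" here is simulated with `asyncio.sleep`; the shape, a sequential await per file versus one `gather` over all of them, is what matters:

```python
import asyncio

async def count_lines(name, lines, delay):
    # Pretend the read takes `delay` seconds; awaiting the sleep parks
    # this coroutine so the event loop can run the other files' work.
    await asyncio.sleep(delay)
    return len(lines)

async def main():
    jobs = [("a.txt", ["one", "two"], 0.02), ("b.txt", ["three"], 0.01)]
    # Awaiting each job in a plain for loop would serialize them again;
    # gather kicks them all off so their waits overlap on one thread.
    counts = await asyncio.gather(*(count_lines(*job) for job in jobs))
    return sum(counts)

total = asyncio.run(main())
print(total)  # 3
```

With the sequential version the sleeps would add up (0.03s total); with `gather` they overlap, which is the whole point of having multiple requests in flight.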
And I think that's another thing to bear in mind,
is how coroutines interact with this.
Because in languages that don't have coroutines, or something like coroutines, it becomes a difficult thing. I don't know, actually. You mentioned Ruby, and I don't have any experience with Ruby specifically. I know Python had coroutines before it had the async await, and they were able to be, like, library-level solutions to async await, just using coroutines. And I know that the async sort of decorator is a spin on always forcing a function to return a coroutine, as opposed to it being a function call in its own right. So does Ruby have a similar thing? How do you... or, was it, you did mention it's, like, level one-and-a-half, so maybe it didn't have the same?
Yeah, I mean, if it does, I'm unaware of it. When I was using it, EventMachine was entirely, like, a third-party thing.
Right.
It was not a language extension, so it didn't need coroutines.
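The generator-based, library-level style that Python used before the keywords arrived can be sketched as a toy round-robin scheduler. Everything here is invented for illustration; the point is that `yield` alone, with no new syntax, is enough to build cooperative multitasking:

```python
def reader(name, chunks):
    # A generator-based coroutine: each `yield` means "I'm waiting for
    # I/O, someone else can run".
    got = []
    for chunk in chunks:
        yield name + ": waiting"
        got.append(chunk)
    return got

def run_all(coroutines):
    # A round-robin scheduler, written as plain library code.
    results = {}
    pending = list(coroutines.items())
    while pending:
        name, coro = pending.pop(0)
        try:
            next(coro)                    # resume until its next yield
            pending.append((name, coro))  # not finished; requeue it
        except StopIteration as stop:
            results[name] = stop.value    # finished; capture return value
    return results

results = run_all({
    "a.txt": reader("a.txt", ["one", "two"]),
    "b.txt": reader("b.txt", ["three"]),
})
print(results)
```

Real frameworks built on this same trick, resuming a generator only once its I/O was actually ready rather than round-robin.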
And again, another sort of aspect that makes writing this kind of code more tractable is languages that allow you to do arbitrary captures into lambdas and unnamed objects and things. So either you can use those as the callbacks, and not worry about who owns what, or you use coroutines themselves. So, yeah. I suppose what I'm winding up towards here is: why do other, more low-level programming languages, like C and C++, not (really, with a big asterisk footnote) have async-await-type operations, even though all these other programming languages we're talking about are written in C or C++, right? So there's got to be a way of doing it.
Mm-hmm.
Yeah.
And, you know, really, I mean, so the trick there is: coroutines are a huge unlocker. They're the thing that allows you to suspend a function and carry on again a bit later on in that function. I know I described it as being syntactic sugar for, like, cutting a function into two functions, and there is a way you can do that. And I believe the first version of C# that introduced this had that kind of rewriting technique behind the scenes: it was literally rewriting the code as blocks of follow-on continuation functions, one after another. But in modern times, these are coroutines that can yield. That means they literally say: hey, I'm not finished yet, but I'm going to return a value to my caller, and that value to the caller indicates that, yeah, I haven't finished, so go do something else, and then you can call me back, and then it will continue from where I left off. That traditional, classic coroutine sense. But the problem with that is that you need to be able to keep around all of the internal state of a function between calls to that function. You need to be able to suspend and resume.
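In Python, that suspend-and-resume with all the locals preserved is exactly what a generator gives you:

```python
def running_total():
    # `total` and `count` survive every suspension: the interpreter keeps
    # the frame (the function's internal state) alive between calls.
    total = 0
    count = 0
    while True:
        value = yield total   # suspend here; resume when sent a value
        total += value
        count += 1

coro = running_total()
next(coro)            # run up to the first yield
print(coro.send(5))   # 5
print(coro.send(2))   # 7 -- `total` was preserved across the suspension
```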
Right.
Going back to our original operating-system-level threads: that's what the operating system knows how to do. It's its meat and potatoes.
Right, right.
It knows how to suspend a thread and switch in a new thread, you know, restore all the state from that, at an arbitrary point. It's kind of like a hostile takeover, right? At any point in time, the operating system could say: you've had enough CPU now. I'm just coming in, I'm saving your registers, you'll never know that I'm about to start using this CPU for something else, and later on you'll wake up, dazed and confused, and won't even know what happened.
Right, right. And what the coroutine does is a cooperative choice: I need to be able to preserve enough of my state at this point for you to continue doing some other piece of work, but then, when you come back to me, I can continue where I left off. And for interpreted languages, that's usually as "straightforward" (and I'm going to use air quotes that our listeners can't see) as storing the interpreter state for that function, up to some point where you know you don't need to store higher up; maybe that's where the event loop started, or whatever. And then you can resume by just jumping back into the bytecode with that kind of information around. And usually they're garbage-collected as well, so anything that you had references to will still naturally exist, because it still has a reference count, or still has something pointing at it. So you can just carry on. In C and C++, though,
You have the problem of, well, it's arbitrary assembly instructions, and
the compiler needs to know how to hang
onto things like the stack. And there is
a single stack. It is the
stack. It's not a stack.
When you have multiple threads, you have multiple stacks
by default, but in a coroutine, you deliberately
don't necessarily want
that. It becomes more complicated.
So then, to the footnote, the big asterisk I said earlier: coroutines are now here for C++, but it is the beginning of the process. The really interesting way that coroutines have been brought into C++ is that the keywords co_await and co_yield (and I'm sure there's another one; gosh, I can't believe I can't remember it now) are in the language now, but they're sort of batteries-not-included. They're very, very low-level primitives that the language now provides, and the STL, the standard template library, does not yet have much in the way of support scaffolding for you. So if you want to write coroutines, you can roll your sleeves up and write a whole bunch of complicated state management code, and you can do it. There are also some cool libraries that do some of this stuff for you. But the expectation, as I understand it, is that over the next couple of cycles of the C++ committee's every-three-years meetings, the sort of predominant best solution for a whole bunch of things to do with execution and other things will come to the fore, and that will become the way that we think about these coroutines and how they exist in a wider context. And then, once those building blocks are in place, maybe there will be some async routines and async libraries that we can use.
Yeah, yeah, which will be great.
So obviously that was just me
steering it my way there, to talk about my own thing. And again, I've seen people use the co_yield and co_await stuff in sort of, like, toy examples, but I've never used it myself in anger. And although this is not async await at all, because coroutines are separable from async await (you can build async await with them),
coroutines are really, really cool for things like writing emulators, where you want multiple very lightweight processes. Think: every single device wants to have a "hey, another clock tick happened". Right? And you want to be able to write code that looks not like the traditional poll. You know, everyone's written poll, where you have, like, a state machine in every single element that's being polled, and the state machine is like: hey, every time you get a poll, you have to do switch-on-what's-my-current-state. Oh, case waiting-for-this. Oh, okay, well, then I guess I'll do this, type of thing. And it's really convenient to be able to instead just write, like, a video chip that just says: you know, await next cycle, draw two pixels. Await next cycle, draw two pixels. In a for loop that's as wide as the screen. And then you do the, oh, and now do the thing that happens at the edge of the screen, await another cycle. And you just write straight-line code like you were the video chip, and there was no one else in the world, except every time there's a cycle, you say: okay, I'm done for this clock tick. Now someone else gets a go.
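A sketch of that straight-line video chip as a Python generator; the "chip", the screen size, and the scheduler are all made up for the example:

```python
WIDTH = 4  # a comically small screen

def video_chip(screen):
    # Straight-line code, as if you were the video chip and no one else
    # existed; every `yield` is "I'm done for this clock tick".
    while True:
        row = []
        for _ in range(WIDTH // 2):
            yield                     # await next cycle...
            row.extend(["px", "px"])  # ...draw two pixels
        yield                         # the thing at the edge of the screen
        screen.append(row)

def run_cycles(devices, cycles):
    # The clock: on each tick, every device gets one go.
    for _ in range(cycles):
        for device in devices:
            next(device)

screen = []
run_cycles([video_chip(screen)], cycles=6)
print(screen)  # [['px', 'px', 'px', 'px']]
```

Compare that with the state-machine poll version: here the for loop's position *is* the state, and the generator keeps it for you between ticks.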
And you don't use threads for that, because the synchronization cost is staggeringly high, and the work you're doing in between them is tiny; but it's inconvenient to write the normal poll stuff. So anyway, that is a complete aside, and I've not yet written a coroutine-based emulator, but I would like to.
Yeah, yeah.
So let's sort of go back to some of the other languages we were
talking about earlier. I know we have spoken about Java before now. Is there an equivalent for Java?
You know, I'm a little embarrassed to say that I don't know.
There's nothing embarrassing about not knowing things. I should make that clear.
I feel like I should know this, that's why. I mean, there's lots of things I don't know.
Let's be real clear on that.
All right, okay.
But I feel like as much as I've done with the JVM
and with asynchronous programming, I've never...
I guess, doesn't Clojure have like an async?
I think it does.
So there must be a mechanism for it.
Well, to your point,
it's like you can build these things
on top of lower-level languages
and they work just fine.
But I actually don't know if there's one in Java.
The one point that you made me think of, though,
when you were talking about the C++ introduction of this, is that I would imagine that that means that the C++ community and standard library is going to face the same situation that is true in Python, that is true in a few of these other languages that introduced this, where you sort of get this bifurcation of the standard library into
these are the asynchronous non-blocking
calls, and these are the star-bellied sneetches.
And between the two, you kind of don't want to cross.
And so you sort of have to make your choice.
And if you have to go back and sort of retrofit it, it sucks
but in a lot of cases it means you either have to spend a lot of time
figuring out how to unify the duplication between those two worlds
in a way that makes sense
or just duplicate it.
You're just going to do everything twice,
the async way and the sync way.
And that can definitely create confusion
and make it harder to use things.
But if you want to be able to take advantage of this,
that's what you need to do.
Yeah, I don't know about that, actually.
I know that the Boost library ASIO, which is the asynchronous IO library, is one of the things that's been talked about for being standardized as the network library of C++. And my understanding is that there is this kind of unification discussion going on. I don't know much about that myself, but I'm sure we could find some people who wanted to talk about this kind of stuff. I'm very much more a user of C++ than anyone too deep in the details of how it's designed.
Yeah. And I know that this is the kind of thing that people think about a lot, and that's one of the reasons why, arguably, the C++ standard library is mostly... I was going to say impoverished. That's a terrible word. But it's a pretty impoverished standard library when it comes to actual pragmatic things, like opening files, or reading every line of a file, or whatever. Well, there are these very high-level concepts now, and algorithms that you can use. You know, if you want to do some of the more obscure partial sorting of an arbitrary array, that's not even an array, it's just iterator pairs of things, you can do all these clever things. But if you want to find the full stop at the end, or if you want to trim the last character off a string, then you just have to do it using those algorithms. You don't just have a .removeLast or whatever, or .uppercase and .lowercase. All those kinds of niceties don't come, mainly because they aren't necessarily as general-purpose as one might imagine. Like, what does it mean to be uppercase? Well, what locale? Oh, now you've got all these questions, right? A lot of the languages just go, meh, we just want to print something out in shout caps. That's important to me.
Right. In general, I think that is definitely a factor in the slower adoption of these new technologies: they want to do it right, knowing that it kind of has a certain persistence. It stays on for a long, long time. It's worse than glitter for the C++ committee. They have to make decisions that are basically forever decisions. They never really go back on stuff.
Yeah, yeah, yeah.
Another point that you made that I wanted to briefly touch on
was the sort of performance impacts. And, like, I definitely have made the mistake myself, and I've seen other people make the mistake, of sort of just throwing threads at a problem when they have a performance problem. It's like, oh, I'll just break the work up and distribute it across threads,
and then it'll be faster, right?
And, you know, the answer is not always.
If the cost of crossing those thread boundaries,
plus the cost of reassembling the work when you're all done,
is higher than the actual amount that you save
by breaking up the work,
you can actually make it slower
and more complicated at the same time.
Yeah, brilliant.
Which is not great.
And so, you know, one of the advantages of having a sort of single-threaded,
you know, event loop style, whether it's with async keywords or not,
is that you don't have to pay the, not only do you not have the sort of
programmer complexity problems of crossing thread boundaries,
but you also don't have to pay the performance hit of crossing thread boundaries. And if you can sort of structure your work in
such a way that you can take advantage of the non-blocking IO while keeping the CPU busy,
you can actually get very good throughput by doing that and sort of avoid some tricky performance
problems of like,
oh, yeah, why is this queue slow?
Every time I go to read from the queue, it blocks,
and it really slows things down.
I keep the queue full, and even when I keep the queue full,
it still slows down.
What's going on here?
Yeah, I think it's an important distinction you make there. If you are predominantly IO-bound, then you can definitely take advantage of an event-based system. Because, like you said, if you're keeping the CPU busy, then sort of by definition, when the CPU has finished doing whatever it's doing now, there's always something new for it to do, because some other IO event has completed, and someone who was waiting for a read has now got it, or someone who was waiting for a write to drain has got the OK that it's gone.
But it's easy to fall into the situation... and this is what Compiler Explorer suffers from, right? If you hit Compiler Explorer, some of the work that it does is on the event thread. So, like, the web server itself is running like this. Everything in Node.js, which is what Compiler Explorer is written in, is continuation-passing style, mostly, although we're slowly moving to an async-await style of work, actually. Which means every web request is, like, IO as far as we're concerned: IO request, IO response. Every now and then, we've actually got the results of a compilation. So for us, another piece of IO is: we ran the compiler. It's another process, and as far as we're concerned, we're awaiting it. Now, right, we can be serving more web responses. We can be giving the favicon.ico to whoever wants it. And then when the compiler is finished, in its own process somewhere else in a little sandbox, we're going to read the results back in, and we're going to parse them so that we can give something back to the web browser that's renderable. That is totally CPU-bound, and for large programs that have 20, 30, 40,000 lines of assembly output, it can take us a long while. Meanwhile, we are essentially blocking any other web request that's coming in that's unrelated to that, for that one node. Now, we have load balancing, we have multiple physical computers that are doing the work, so that's less of a problem. But our CPU is 100% wedged, and we can't do the other work that is available to us. Now, if we had threads, that would not be true.
And so we've discussed having worker threads for exactly this thing. In which case, actually, just to sort of square this off, we would probably kick off that "parse the assembly output and return me the dictionary that I need to send to the user" in a worker thread, and then we would await the thread coming back. You know: hey, kick off the thread, and then await the results. And now we're back in the land of async-await. Just like running the compiler, the result of parsing the output becomes itself an "async IO", inverted commas, thing, at least as far as we're concerned.
So there are kind of ways of making everything fit together.
I think the other thing I think about with performance, you mentioned crossing thread boundaries, as it were. Obviously, thread boundaries are typically not very expensive in the general sense, because in the same process you can actually access the same memory as the other.
Yes, for good or for evil.
For good or for evil, exactly, yes. So obviously the locking is where it starts to become troublesome. And you mentioned queues, so you did sort of cover that exactly as one would want. But hidden inside that queue is a lock that is going to prevent you from being able to do work once it's either full, or there's no work to do, all those kinds of things. Or, you know, you've got the possibility of deadlock and other
issues like that.
But conversely, if you are in async-await land, and with gay abandon you're putting the async keyword on every single function you can find, even when there isn't any actual awaiting inside of that function, then you're pessimizing your program. Because behind the scenes, what that means is that when you call that function, you don't actually get, like, the nuts and bolts of that function. The function which returned an integer now returns a future of an integer. And you call it, and first of all you get back, like, a "hey, there's nothing to do yet", and then it gets scheduled on the event loop. Or it could, at least: if someone's awaiting it right now, it's going to be scheduled on the event loop, which means there's at least one tick of, like, the event loop clock before the first line of your function is going to start running. Or... actually, it depends on the implementation. Some of these things do actually eagerly execute, and there's a difference between tasks and async things in different languages, or whatever. But I've definitely had some experiences where I've called a function and gone, why did that not do anything? And you realize, oh, I didn't await it. Which meant that all that happened is it called the function
to generate, like, the coroutine context.
And then we're like, okay, here's your thing.
When you want it, it's ready.
Here's this code that I can run for you whenever you like.
But not now.
You didn't call it.
No, you didn't.
You told me you were going to call it.
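In JavaScript terms, that "forgot to await" trap looks something like this. (JavaScript's async functions actually start running eagerly, unlike some coroutine implementations that only build a coroutine object on call, but the symptom is the same: you get a promise back, not the value. `fetchAnswer` is a hypothetical name for illustration.)

```javascript
async function fetchAnswer() {
  // Pretend this awaits some IO; even awaiting an already-resolved
  // promise defers the rest of the function to a later microtask.
  await Promise.resolve();
  return 42;
}

// Forgetting the await: you get a pending Promise, not 42.
const oops = fetchAnswer();
console.log(oops instanceof Promise); // true
console.log(oops === 42);             // false

async function main() {
  // With the await, you get the value, one microtask tick later.
  const answer = await fetchAnswer();
  console.log(answer); // 42
}
main();
```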
It's not as cheap to do that.
You're creating these intermediate objects that have some setup cost, as well as the fact that maybe you're going once around the horn of the event loop before you're actually making progress. Now, that does have some positive side effects if you are CPU-bound and you're just trying, in a really naff way, to yield.
So this is another Compiler Explorer anecdotal thing here, and I hope that none of my developers who contribute code, who are actually good at developing code, are listening.
This is my harebrained ideas.
But one thing I consider doing is literally putting async await sleep zero into that big parsing loop.
You know, every 10,000 loops, just to yield it back.
And of course it's awful,
and no one should do that,
but it's a way of giving someone else
a go at the CPU,
so that at least some progress
can be made on those web requests
that are coming in.
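A sketch of that hack in Node.js, with a hypothetical `sleep` helper built on `setTimeout`. Awaiting `sleep(0)` every N iterations hands control back to the event loop, so other queued callbacks, like incoming requests, can run in the middle of CPU-bound work:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// CPU-bound loop that periodically yields to the event loop.
async function parseAllLines(lines) {
  const results = [];
  for (let i = 0; i < lines.length; i++) {
    results.push(lines[i].toUpperCase()); // stand-in for real parsing
    if (i % 10000 === 9999) {
      await sleep(0); // awful, but lets other work make progress
    }
  }
  return results;
}

// While the parse yields, this timer callback gets a chance to fire.
let otherWorkRan = false;
setTimeout(() => { otherWorkRan = true; }, 0);

parseAllLines(new Array(50000).fill('mov eax, 1')).then((out) => {
  console.log(out.length, otherWorkRan); // 50000 true
});
```

Without the `await sleep(0)`, the timer callback would only run after the entire loop finished; with it, other work interleaves at every yield point.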
Yeah, you asked me earlier
about async, await, and Java, and I don't really know.
But one thing I did see earlier this week that I thought was super interesting is a proposal for Java virtual threads,
which might be an interesting middle ground between these two worlds that we're talking about. And the proposal, as I understand it and remember it, basically revolves around threads where, when you're executing what would normally be blocking IO calls within the context of a virtual thread, the JVM, underneath the covers, will turn those into non-blocking calls for you
and suspend the execution of the virtual thread
to allow other virtual threads to execute,
which means you don't have to do the thing that you normally do with threads
and pool them.
You never want to spawn thousands and thousands of threads at the same time when they're operating system threads, because you're going to starve out the operating system and make the thread scheduler's job significantly harder. But with virtual threads, you can just go ahead and make as many as you want, because they're not mapped to operating system threads. They are essentially async-awaiting automatically, at the point where the JVM says: hey, you called file-open. Well, rather than the file opening, I'm going to rewrite that as an async-await kind of thing, and let another virtual thread take over this thread, capital T, right? Like, this whatever execution context. But obviously you kind of need to opt into that a little bit higher up, somewhere where you spawn these virtual threads, right? And each one of them is essentially a top-level coroutine, or an async-able thing.
That's a neat solution that doesn't require much change to the code. You're still writing straight-line code. As far as you're concerned, you're still blocking. And you are blocking, right?
Yes. It's just that at the point where you would block and rely on the operating system to context-switch you, the JVM says: I can context-switch you much more cheaply, right, and I'll do that here and get someone else going.
Yeah, that's awesome.
Yeah. And one interesting side effect of that that I was thinking of is that it avoids this problem that we were just talking about, where you sort of have to bifurcate the standard library.
Oh, right. Like, you're just saying: no, no, underneath the covers it might be bifurcated, but that will never hit the API. You will never see that. Really interesting. It's all one thing.
Yeah, which is kind of cool, because this just sort of reduces the cognitive load on programmers who need to understand how the standard library works.
No, that's super awesome. Yeah, that's a really interesting observation. I hadn't thought of that. But yeah, that bifurcation is otherwise a big deal breaker for a lot of people. They're like, oh, which one? You know, which way are we going to go?
Oh, no, that's super cool.
To finish on, I just want to make a couple of
observations that amuse me. And that is: all these async-await style things, and again, mainly the coroutines that help them, which may be more useful, are the way that operating systems used to work. I don't know if you... did you ever do any Windows,
Windows 3.X development back in the day?
No.
Windows NT was really the first operating system
that I used as a programmer in anger.
The Windows, the Win32 API,
which is still around and about,
has like an event loop, right?
And I don't know if you've ever looked, or remember... do you remember doing any raw Win32 API stuff, even in NT? Because although it wasn't the way I'm about to describe it, it was the same API.
Yeah.
So you would call a get-next-event, effectively, and you'd have a switch statement on get-next-event. So you'd do, you know, switch on GetMessage, or whatever... I actually can't remember. GetMessage? Does that ring a bell? Now I'm making stuff up.
Like, you're reaching into the very depths of my neurons here, but yes, that does actually ring a bell.
And then you'd pass it all these structures to fill in, and it would return the... you know, what's the next message, right? And it would be, you know, WM_PAINT, and then you'd be like, oh, I'd better draw then.
Or like a mouse click event, right?
Or a mouse click, yeah, any of these things. Yes.
In modern times, that's just reading off of an event queue, right? Your thread goes to sleep, and you can have many threads, obviously, and they can be doing other things, but your sort of UI thread is there, just reading the next message off of that. So as the mouse is moved through your world, or if people are clicking or typing, whatever, you get those events. But way back in the dawn of time, that was how the operating system gave someone else a go. You call GetMessage and it's like: oh, I'm suspending you now. There is no preemptive multitasking here. At this point, we're saving all the registers, and we're going to load someone else's registers and memory map, and then we're going to return to their GetMessage, and now they get a go on the CPU. And certainly the operating system that I was using prior to Windows, which was RISC OS, had that feel too.
You would have to do this system call that would tell you what the next thing is.
And then really it was switching and returning to somebody else, which was a really interesting design because you could do that in user mode.
It wasn't that difficult to see how it was saving all the registers and reloading them all back out again. And so you could write your own, like, cooperatively multitasked sort of sub-threads within your thread, using similar kinds of techniques, which was a great way of learning: well, this is how the operating system must be doing it. And I remember having an epiphany moment doing exactly that, and just having those two routines that would, you know, call a function, which isn't really calling a function, because calling that function actually jumps back into, and returns from, the other function's call-a-function function. And then you can ping-pong backwards and forwards between the two of them. Yeah, just one of those crazy things. So it's kind of come back now, this cooperative multitasking, which is why I described it as that right at the beginning of this whole conversation. It's like... it's cooperative multitasking.
You get to say when someone else gets a go of the CPU.
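JavaScript generators are a compact way to see that ping-pong: each `yield` is the "I'm done for this tick, someone else gets a go" point, and a tiny hand-rolled round-robin scheduler (hypothetical, just for illustration) alternates between two tasks:

```javascript
// Two cooperative "tasks": each yield hands the CPU back to the scheduler.
function* taskA() {
  for (let i = 0; i < 3; i++) yield `A${i}`;
}
function* taskB() {
  for (let i = 0; i < 3; i++) yield `B${i}`;
}

// A tiny round-robin scheduler: resume each task in turn until all finish.
function runCooperatively(...tasks) {
  const log = [];
  let live = tasks.map((t) => t());
  while (live.length > 0) {
    live = live.filter((gen) => {
      const { value, done } = gen.next(); // give this task a go
      if (!done) log.push(value);
      return !done; // drop finished tasks
    });
  }
  return log;
}

console.log(runCooperatively(taskA, taskB).join(' '));
// A0 B0 A1 B1 A2 B2
```

No preemption anywhere: each task runs until it volunteers to stop, exactly the property being described.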
Yeah.
Yeah.
Everything old is new again, my friend.
Everything old is new.
Well, we've covered some things that we are inexpert in only as users.
So I do hope that anyone listening to this who's been shouting at the microphone, at their headphones or their speakers about how...
I'm sure our listeners are very frustrated by now.
So do tweet us at TWOSCP to let us know all the mistakes we've made, as per usual.
And until next time, my friend.
Yep, next time.
You've been listening to Two's Compliment, a programming podcast by Ben Rady and Matt Godbolt.
Find the show transcript and notes at twoscomplement.org. Our theme music is by Inverse Phase. Find out more at inversephase.com.