Two's Complement - Async Whatevers
Episode Date: May 18, 2022
Ben and Matt talk about various styles of asynchronous programming, from Node.js and Ruby's EventMachine to C++ coroutines and the new JVM Project Loom. Schedule yourself a listen, won't you?
Transcript
Are you set?
I'm good.
Right, then we should sing this theme tune.
Do do do do do do do do do do do do do do do do do do do bling bling bling bling.
No?
You're going to have to do that now.
That's going to be the episode intro.
Oh no.
I'm Matt Godbolt.
And I'm Ben Rady.
And this is Two's Complement, a programming podcast.
Hey, Ben.
Hey, Matt.
How are you doing?
Great.
It's another Friday. It's another podcast recording day.
Mm-hmm.
And you came to me with an idea already.
I did. I had a thought, which is... Most unusual.
Async await, and asynchronous programming in general,
and why we do it, and threads, and all the things.
There's a whole...
This topic could go on forever, but...
Like all of our topics, really, they're deliberately vague
because we haven't really thought about them very much.
So that's the thing.
So that's interesting.
You said async await.
Is that how you pronounce that?
Because I would say async await.
And I'm wondering if that's a British English thing.
You await something rather than await.
Or is it because you said async?
Yeah, no, that's just me wanting there to be alliteration.
Is that what that is?
I mean, it's alliterative anyway, right?
Because they both have an A sound at the beginning.
Yes, but I'm trying to emphasize the alliteration in async await.
So we better say what that is.
What is async await?
Async await.
Well, it's a
programming style, I guess
is what I would call it.
It's a technology. It's a programming style.
It's a floor wax. It's a
dessert topping.
What's that from?
I think it's like an SNL bit from
back in my very, very,
very early days.
But yeah, it's...
Well, it's a solution to a problem, actually.
And we should talk about what that problem is.
Which is, if you have a program that has no threads
and has no async magic,
there's just a series of instructions.
And that program does things that are IO-bound,
like, for example, fetching a bit of data from a remote server or performing a database query or even reading a file.
Right.
And if the progress of that program is blocked on that information,
if it can't go forward until you've retrieved the information that you need,
then your program, or usually the CPU, is sitting there
doing not very much while you're waiting for this favicon ICO file
to be downloaded from a server somewhere halfway
across the world, or whatever it is that you're doing.
And so it could be doing a lot more, and you could be getting a lot more out of your computing power, by not simply waiting for it: by running other parts of your program, or a different program, or other things.
Well, taking the other side there, just quickly: obviously, in a modern operating system, the operating system is going to have other things to do. You know, it's got animated GIFs to keep moving in the background, it's got dancing hamsters, the like.
Or other programs will take advantage of your CPU.
But if you, as in the user,
are running a single program that has a single thread of execution
and it is blocked,
then yeah, you're going to have to wait.
And if there's other useful things
that that program could otherwise be doing,
well, they'll happen after the file has come down,
not before.
So obviously the normal solution to this is, like: I'll make some threads, and then I'll have all these complicated mutexes and work queues, or something-something-something. So is that async?
Yeah, no. So I think threads are one solution to this problem, and async is maybe another solution to this problem. But the underlying problem, I think,
is that for a single sort of program,
you want to be able to maximize CPU time,
reduce latency, increase throughput.
All of these things can be achieved
by just making better use of the CPU.
And
one way to do that is by having threads.
Another way to do that is with async.
And I think
that either way you go,
there are some sort of
head-twisting, complicated
problems that you can
run into.
Some might be easier to deal with than others.
Some might be easier to test than others.
But this is the classic programming technique
of solving one problem by creating another problem.
Regular expressions, for example.
Yes, exactly.
If the R value of your problem growth is less than one,
then you've done a good job.
Yes.
But I suppose this hypothetical program, you said it's fetching something, or reading a file, or doing whatever. The assumption here is that there are other things that that program could be doing in order to continue. I mean, let's say it's a word count program, and we've been given a list of files, and it's going to open up the files. And, like, naively, I would write: for each file in list of files, open file F, read every line, count how many lines there are, add them up, and print it out at the end. That would be a not unreasonable way to write that piece of code. But for all you know, there's multiple drive spindles, some of these things are on the network, or whatever. And, you know, you could easily saturate the PCI bus or the network link or whatever if you were doing more than one file access at a time.
But you're not.
You're just reading one after another.
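The naive loop described there might look like this in Python. A blocking sketch, with the file names and contents purely illustrative:

```python
import os
import tempfile

def count_lines(paths):
    # Naive, fully synchronous version: each file is opened and read one
    # after another. While one read blocks, nothing else gets done.
    total = 0
    for path in paths:
        with open(path) as f:
            for _line in f:
                total += 1
    return total

# Demo with two throwaway files (names and contents are made up).
tmpdir = tempfile.mkdtemp()
paths = []
for name, text in [("a.txt", "one\ntwo\n"), ("b.txt", "three\n")]:
    path = os.path.join(tmpdir, name)
    with open(path, "w") as f:
        f.write(text)
    paths.append(path)

print(count_lines(paths))  # 3
```

Each iteration waits for the previous file to finish before the next read is even issued, which is exactly the missed opportunity being described.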
So how do you... the async await is a solution to the problem of: if I know that I have multiple things I could be doing at once, how can I more easily, for some definition of easily, write a program that still looks like a kind of normal program? One that isn't too far removed from that for loop I just described, but without tying myself up in all these kind of operating-system-level thread spawning and synchronization issues that I might otherwise have. And so should we talk a little bit about what a straw-man implementation of this thing might look like?
Yeah, yeah, I think that's good. So I mean, if it was this word counting thing, then, in a way, you could spawn one thread per job,
per file,
and then let those threads run their separate ways
and then sort of have some kind of collect-all-the-threads-together at the end,
and look at the answers and add those all up.
That would be a perfectly sensible way of doing this.
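That thread-per-file version could be sketched like this in Python. Hedged: real code would more likely use a thread pool, and the function names are invented:

```python
import threading

def count_file_lines(path, results, index):
    # Each thread handles one file and drops its answer into its own slot.
    with open(path) as f:
        results[index] = sum(1 for _ in f)

def count_all(paths):
    results = [0] * len(paths)
    threads = [
        threading.Thread(target=count_file_lines, args=(path, results, i))
        for i, path in enumerate(paths)
    ]
    for t in threads:
        t.start()
    # "Collect all the threads together at the end": join, then add up.
    for t in threads:
        t.join()
    return sum(results)
```

Writing to distinct slots of a pre-sized list sidesteps the locking you would otherwise need around a shared counter.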
But in the async await world,
you need a framework
that understands how to do
input/output operations, that can sort of cooperatively multitask. I think that's the key here. Threads mean that we're actually using operating system resources to create multiple execution threads at the operating system level. There are literally, potentially, multiple CPUs that could be involved, and these things could be running at once on multiple CPUs. But that's not the kind of problem we're trying to solve here, because the CPUs are not the problem; it's the waiting for files that's the problem. In the async await world, you are scheduling pieces of work with callbacks. Somewhere deep down in the bowels of the framework, you've said: open a file, but just call me back when the file's opened. Don't block.
Yeah, don't block. Like, open the file and read this stuff, but don't wait until you're done reading it to return from that call.
Right. And that's really the trick behind the back of all of this async await: you suddenly have coroutines that are powering this system of sort of cooperatively multitasking all the different things that you want to do, in a single operating system thread, one after another, as they become ready. Like, the file contents have become available, and now your code can operate on them.
And then let's talk about, like, actually what the syntax of that looks like. Typically (hence the "async await") you tag functions of your program as being async, which is a big hint to the framework or language that you're in, to say: this function can be suspended halfway through, and can sort of return early in some way. And then, when it hits an await within that async function, what we're saying is: do some work, but actually park me here, and when that work is ready, come back to me.
Yeah, yeah. I think the trick there is that that sounds a bit like a thread. And logically, it is a thread of execution: there's, like, a stack, and there's a sequence of instructions that pertain to a single kind of idea, a function that you're doing. But there isn't an operating system resource associated with it. There's literally just a big list somewhere in a framework of: here's all the things I know I need to do when, I don't know, this file read has completed, or this network access is done, or somebody deliberately yields and says, hey, it's someone else's go to run now. And quite often those things are implemented with an event loop,
right? So you'll have basically a queue of events, and then you'll have basically just, like, a while-true loop that is consuming those events one at a time, and then calling back to the callbacks that are related to those events. So you might schedule a file read, and say: okay, when this file read is done, generate an event, put it in the queue. And then when that loop sort of processes through all the events and gets to that event, it says: oh, well, we did this file read. It actually finished about, you know, 200 milliseconds ago, but we were processing other events at that time. Now we're ready to process it. So I'm going to call back all the people that were interested in this and tell them: hey, your data from this file is ready. Now you can do what you want to do. And this has the advantage, like you say, of being able to run in a single operating system thread, so that you don't have to worry about the things that you have to worry about when you're doing multi-threaded programming. You don't have to worry about synchronization.
You don't have to worry about threads clobbering each other's data.
And in that way, it's actually much simpler, because there's just this whole set of problems that goes away.
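A toy version of that queue-and-while-loop, with everything here (class and method names included) invented for the sketch:

```python
from collections import deque

class EventLoop:
    # A queue of events, and a loop that consumes them one at a time,
    # all on a single thread.
    def __init__(self):
        self._queue = deque()

    def post(self, callback, *args):
        # The I/O layer would call this when, say, a file read completes.
        self._queue.append((callback, args))

    def run(self):
        while self._queue:
            callback, args = self._queue.popleft()
            callback(*args)

log = []
loop = EventLoop()
# Pretend a file read finished a while ago and its event was queued:
loop.post(lambda data: log.append("read complete: " + data), "file contents")
loop.post(lambda: log.append("next event"))
loop.run()
print(log)  # ['read complete: file contents', 'next event']
```

Because only one callback runs at a time, no two of them can ever be mid-flight simultaneously, which is where the "no clobbering" property comes from.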
And I will tell you from personal experience, loving tests and wanting to write tests for these kinds of things: it is quite difficult to write tests for multi-threaded code that truly give you confidence that the multi-threaded code actually works 100% of the time. It's quite easy to write tests that convince you that it works, when it only works 99% of the time. And then you find: oh, actually, no, there is this one case that I didn't think about where it doesn't work. So from that standpoint, it's kind of attractive.
Absolutely.
I think that's a really good way of phrasing what's going on.
Like at the nuts and bolts level, there is, as you say, an event loop.
On a Unix-based system, there's probably a select loop
that's got all the file handles of all the things that are going on.
And then as they become ready, as you say, things are woken up.
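In Python, that readiness mechanism is exposed by the `selectors` module (a wrapper over select/epoll). A toy sketch, using a socket pair to stand in for a real file handle, with the registered context string made up:

```python
import selectors
import socket

sel = selectors.DefaultSelector()
read_end, write_end = socket.socketpair()
# Register interest: "when this handle is readable, here's my context".
sel.register(read_end, selectors.EVENT_READ, data="line-counter callback")

write_end.send(b"bytes arrived")   # make the handle ready
ready = sel.select(timeout=1)      # blocks until at least one handle is ready
for key, _mask in ready:
    print(key.data)                # the context we registered
    payload = key.fileobj.recv(1024)

sel.close()
read_end.close()
write_end.close()
```

A real framework loops over `sel.select()` forever and dispatches each ready handle's context to the right callback.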
And described like that, it puts you in mind of, say, how JavaScript and Node.js do all of their work with actual callbacks. You know, you say file.open, filename, comma, and then call this function when the file has opened. And then you end up with this very deeply nested callback, because once the file's open, your function is called, which then wants to read, and then of course that's another asynchronous call. And before you know it, you've got, like, 18 levels of indentation, or you've got 200 disparate tiny little functions, each of which then calls another tiny little disparate function. And that was how JavaScript was for the longest time, before async await specifically. So you can have an event loop, and you can have this callback-type thing, without async, right? But async await is a language-level feature which hides, under some level of syntactic sugar, that callback-based thing, by kind of writing the callbacks for you. When you say await file.read, what's really, quotes, happening is something like: the function that you're in the middle of is cut at that point and turned into two functions, the bit before the await and the bit immediately after the await. And the bit immediately after the await is essentially turned into the callback function for the file.read. And so you don't need to think of it in these terms; you don't need to nest things further and further and further. Your code for an individual, like, I'm-the-line-counting function just says: file equals await file.open filename; something like async for line in file dot async-read-lines; count plus plus; return count. And so you've just written code that looks almost exactly the way that we described for the single-threaded case, with just a couple of magical keywords tagged in there. Behind the scenes, the whole thing is rewritten to be callback-based, or futures-based, or some of the other techniques.
Promise-based.
Exactly, yeah. And it makes, as you say, for a much more testable design, because it is
not subject to the whims of operating system time slices, or multiple CPUs actually executing multiple code paths at the same time, which threads would be. It's only when you say await blah that your logical flow of instructions stops, and someone else could potentially get the use of the CPU. So you kind of have to be aware of that in more advanced techniques. Like, if you share a cache for a class, and you have multiple, like, async awaits of your class going on at the same time, you have to be aware that every time you await something, then potentially the cache could be being used by someone else. But it's so much more controllable, and it is deterministic, which I think is the key for things like tests.
Yeah, yeah. And there's still, you can still
get things like race conditions in both paradigms, right?
Like you can have things that are racing.
They're either racing across multiple threads or sometimes racing to the top of the event loop, right?
Like depending on which events get into the queue first, you might have a code that's executed in different orders. Right, and oftentimes frameworks don't necessarily specify the sequence in which
if there are two things that are ready
at the same tick of the clock,
who goes first? In which case,
again, that can be something that can be exposed by
tests where you're trying to puppeteer time
and the completion of things.
Yes, usually a lot of
both languages and frameworks are
intentionally vague about how that's going to happen,
but yet sometimes accidentally consistent, which sometimes will trick you into thinking that you have code that runs properly
when actually there are cases where it doesn't.
But yeah, I think what's interesting is that there's, and you sort of touched on this a little bit,
is that there are levels to this.
There's, like, the I'm-just-going-to-do-a-non-blocking-call-with-a-callback, right? Like, that's the most basic level of this: there's no async/await syntax sugar, there's not even necessarily an event loop. The sort of select statement is the classic example of this. I'm just going to do a non-blocking IO call, and I'm going to have some facility that allows me to say: when this is complete, call me back over here. But this initial call that I'm making, I want you to return immediately. Right? That's, like, the lowest level of this.
Good old EWOULDBLOCK in Unix.
Yeah, right. And then I think one level
above that are things like futures and promises: things where you're not doing anything special in the language, you're not introducing new keywords. What you're doing is just creating, you know, objects, essentially. Placeholder objects, where you can call a function, the function will return immediately, and you'll get some object back that represents the thing that you've scheduled to happen, and you can interact with that object in interesting ways. Obviously, you can just apply a callback and say, like: hey, when you get my data, let me know. You can also gather a bunch of them together and say: when all of these are complete, let me know. You can gather them together in ways that say: if some fail, then call this function, and if some pass, then call this other function. And you can do interesting things with that as well, but that requires no language changes or language support or anything like that. That's just objects, essentially. And then a level above that is going all the way to sort of full,
sort of like, yes, we're going to introduce new keywords into the language. We're going to bake the sort of asynchronous, non-blocking IO things into the standard
library of the language.
And we're going to make this sort of like a first-class citizen within the language.
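Going back a level: that middle, plain-objects style can be seen in Python's `concurrent.futures`. One caveat: this particular library happens to run the work on a thread pool underneath, but the object interface (a call returns a placeholder immediately; you attach callbacks, or wait on many) is the point. The `slow_read` function and file names are invented:

```python
from concurrent.futures import ThreadPoolExecutor, wait

def slow_read(name):
    # Stand-in for a slow, blocking I/O call.
    return "contents of " + name

collected = []
with ThreadPoolExecutor() as pool:
    # submit returns immediately with a placeholder (a Future)...
    fut = pool.submit(slow_read, "a.txt")
    # ...to which you can attach a callback:
    fut.add_done_callback(lambda f: collected.append(f.result()))
    # ...or gather several and wait until all of them are complete:
    futures = [pool.submit(slow_read, n) for n in ("b.txt", "c.txt")]
    done, _not_done = wait(futures)

results = sorted(f.result() for f in done)
print(results)  # ['contents of b.txt', 'contents of c.txt']
```

No new keywords anywhere: `Future` is just a class with methods.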
Node.js obviously is a great example of that because they just took something that had
already been in web browsers and were like, all right, we're just going to lean into this
real hard and start building other things out of it. Obviously
you and I have written a lot of Python recently
and Python 3.6? Is that what it was?
Yeah, 3-and-a-bit.
3-and-a-bit is when they introduced the sort of async, asyncio, stuff, and introduced those keywords into the language.
But you can also just use, like, a third-party framework. Like, I certainly used EventMachine for Ruby for a good long time, which isn't really async await; that's more of that sort of second level that I was talking about.
Right, with futures, and adding callbacks to the future, or something?
Yeah, yeah. It's really, like, there are futures, but it's like level one-and-a-half, even.
Yeah.
It's sort of like, but the interesting thing about it is it sort of spawns this whole separate non-blocking ecosystem.
And one of the things that you'll see a lot in the Ruby world is that you have like the, you know, Postgres library, and then you have the Event Machine Postgres library.
And they're only kind of related to each other.
And that's because, like...
Same in Python, right?
You know, you've got, like, Postgres and AIO Postgres for the same reason.
And I think that's one of the more...
Insidious is a little bit of a bad word to use.
But, like, one of the things that happens when you start having async libraries is that
you get this separation.
And because it's all pervasive, anything that that async library is that you get this separation and because it's it's all pervasive
anything that that async library calls that itself might need to be async also needs to be async
yeah that's like keyword level down right like you know so you you end up with you know read file and
async read file because they are very different operations one of which returns the contents of
a file the other one returns essentially a the promise or a future of a file.
And so, yeah, you end up with this a bit. It's like in C++: once you start introducing const correctness in one place, suddenly everything needs to be const-correct, because it won't let you not be. And it's the same with async. As soon as you get to a point where you can't do something asynchronously anymore, you're like: well, now we're blocking again.
Yeah, yeah.
We had this running joke for a while there.
It's like the glitter of the programming world.
Once you start using it, it's like it just gets everywhere, right?
Like you can't.
It's got a half-life of, yeah, measured in millennia.
You never get rid of that last grain of glitter or sand from the beach.
Yeah, microplastic.
It's the programming microplastic
So, yeah, you know, it has a lot of advantages. But you do sometimes get into this state where it's like: oh, well, that means this needs to be async, which means this one, which means this one. Well, I'm just going to rewrite the whole thing at this point.
Yeah, everything's async, and you feel like you're just scattering the async keyword everywhere.
Yes.
And that's not without cost. But before we move on to things like cost, and what's actually going on under the hood a little bit more, I was going to make an observation I don't think I made clear when we were talking about the word counting example. Right? You know, we've written these individual async functions that count the number of lines in a file, by async-opening it and then async-for-each reading lines, or however that works. But then you mentioned gather as, like, a primitive operation, and that is the key to actually unlocking the multiple-outstanding-IO-requests-at-once thing. Right? So in this instance, you know, one could write the code at the top level that says: for i in each of the files I want a word count of, await that word count of file i, running total plus equals that. But you would still be back in the world of a single file happening at once; you're just doing it asynchronously. And that event loop hasn't got any other work to do, so it'll just sit there and go: okay, I'm literally in select now, blocking for that file to become ready, and then I call you back. And then you're no better off; you're slightly worse off. But the gather primitive operation you talked about takes a bunch of future objects of some description, be they promises or futures or coroutines or awaitables or whatever, and it says: okay, all of these, we want to kick them all off at once. Again, there's not any threading, so it doesn't actually happen at once. Each one is run until it gets to a point where it yields, but then I come back and do the next one, then the next one, then the next one. And now I have multiple IO requests in flight, and then whichever one of those comes back first, I will service. It goes into the event loop, and now that file is open, and so that coroutine is starting to process the first file, or the second file, and so on. So that is the equivalent of spawning threads at that point. That's how you get the parallelism: by gathering a bunch of async awaitable things.
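Putting the word-count example together with asyncio. A hedged sketch: asyncio's standard library has no async file reads (people often reach for a third-party library such as aiofiles), so the "I/O" here is simulated with `asyncio.sleep`; the shape, a sequential await per file versus one `gather` over all of them, is what matters:

```python
import asyncio

async def count_lines(name, lines, delay):
    # Pretend the read takes `delay` seconds; awaiting the sleep parks
    # this coroutine so the event loop can run the other files' work.
    await asyncio.sleep(delay)
    return len(lines)

async def main():
    jobs = [("a.txt", ["one", "two"], 0.02), ("b.txt", ["three"], 0.01)]
    # Awaiting each job in a plain for loop would serialize them again;
    # gather kicks them all off so their waits overlap on one thread.
    counts = await asyncio.gather(*(count_lines(*job) for job in jobs))
    return sum(counts)

total = asyncio.run(main())
print(total)  # 3
```

With the sequential version the sleeps would add up (0.03s total); with `gather` they overlap, which is the whole point of having multiple requests in flight.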
And I think that's another thing to bear in mind,
is how coroutines interact with this.
Because in languages that don't have coroutines, or something like coroutines, it becomes a difficult thing. I don't know, actually. You mentioned Ruby, and I don't have any experience with Ruby specifically. I know Python had coroutines before it had the async await, and they were able to be, like, library-level solutions to async await, just using coroutines. And I know that the async sort of decorator is a spin on always forcing a function to return a coroutine, as opposed to it being a function call in its own right. So does Ruby have a similar thing? How do you... or, was it, you did mention it's, like, level one-and-a-half, so maybe it didn't have the same?
Yeah, I mean, if it does, I'm unaware of it. When I was using it, EventMachine was entirely, like, a third-party thing.
Right.
It was not a language extension, so it didn't need coroutines.
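The generator-based, library-level style that Python used before the keywords arrived can be sketched as a toy round-robin scheduler. Everything here is invented for illustration; the point is that `yield` alone, with no new syntax, is enough to build cooperative multitasking:

```python
def reader(name, chunks):
    # A generator-based coroutine: each `yield` means "I'm waiting for
    # I/O, someone else can run".
    got = []
    for chunk in chunks:
        yield name + ": waiting"
        got.append(chunk)
    return got

def run_all(coroutines):
    # A round-robin scheduler, written as plain library code.
    results = {}
    pending = list(coroutines.items())
    while pending:
        name, coro = pending.pop(0)
        try:
            next(coro)                    # resume until its next yield
            pending.append((name, coro))  # not finished; requeue it
        except StopIteration as stop:
            results[name] = stop.value    # finished; capture return value
    return results

results = run_all({
    "a.txt": reader("a.txt", ["one", "two"]),
    "b.txt": reader("b.txt", ["three"]),
})
print(results)
```

Real frameworks built on this same trick, resuming a generator only once its I/O was actually ready rather than round-robin.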
And again, another sort of aspect that makes writing this kind of code more tractable is languages that allow you to do arbitrary captures into lambdas and unnamed objects and things. So either you can use those as the callbacks, and not worry about who owns what, or you use coroutines themselves. So, yeah. I suppose what I'm winding up towards here is: why do other, more low-level programming languages, like C and C++, not (really, with a big asterisk footnote) have async-await-type operations, even though all these other programming languages we're talking about are written in C or C++, right? So there's got to be a way of doing it.
Mm-hmm.
Yeah.
And, you know, really, I mean, so the trick there is: coroutines are a huge unlocker. They're the thing that allows you to suspend a function and carry on again a bit later on in that function. I know I described it as being syntactic sugar for, like, cutting a function into two functions, and there is a way you can do that. And I believe the first version of C# that introduced this had that kind of rewriting technique behind the scenes: it was literally rewriting the code as blocks of follow-on continuation functions, one after another. But in modern times, these are coroutines that can yield. That means they literally say: hey, I'm not finished yet, but I'm going to return a value to my caller, and that value to the caller indicates that, yeah, I haven't finished, so go do something else, and then you can call me back, and then it will continue from where I left off. That traditional, classic coroutine sense. But the problem with that is that you need to be able to keep around all of the internal state of a function between calls to that function. You need to be able to suspend and resume.
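In Python, that suspend-and-resume with all the locals preserved is exactly what a generator gives you:

```python
def running_total():
    # `total` and `count` survive every suspension: the interpreter keeps
    # the frame (the function's internal state) alive between calls.
    total = 0
    count = 0
    while True:
        value = yield total   # suspend here; resume when sent a value
        total += value
        count += 1

coro = running_total()
next(coro)            # run up to the first yield
print(coro.send(5))   # 5
print(coro.send(2))   # 7 -- `total` was preserved across the suspension
```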
Right.
Going back to our original operating-system-level threads: that's what the operating system knows how to do. It's its meat and potatoes.
Right, right.
It knows how to suspend a thread and switch in a new thread, you know, restore all the state from that, at an arbitrary point. It's kind of like a hostile takeover, right? At any point in time, the operating system could say: you've had enough CPU now. I'm just coming in, I'm saving your registers, you'll never know that I'm about to start using this CPU for something else, and later on you'll wake up, dazed and confused, and won't even know what happened.
Right, right. And what the coroutine does is a cooperative choice: I need to be able to preserve enough of my state at this point for you to continue doing some other piece of work, but then, when you come back to me, I can continue where I left off. And for interpreted languages, that's usually as "straightforward" (and I'm going to use air quotes that our listeners can't see) as storing the interpreter state for that function, up to some point where you know you don't need to store higher up; maybe that's where the event loop started, or whatever. And then you can resume by just jumping back into the bytecode with that kind of information around. And usually they're garbage-collected as well, so anything that you had references to will still naturally exist, because it still has a reference count, or still has something pointing at it. So you can just carry on. In C and C++, though,
You have the problem of, well, it's arbitrary assembly instructions, and
the compiler needs to know how to hang
onto things like the stack. And there is
a single stack. It is the
stack. It's not a stack.
When you have multiple threads, you have multiple stacks
by default, but in a coroutine, you deliberately
don't necessarily want
that. It becomes more complicated.
So then, to the footnote, the big asterisk I said earlier: coroutines are now here for C++, but it is the beginning of the process. The really interesting way that coroutines have been brought into C++ is that the keywords co_await and co_yield (and I'm sure there's another one; gosh, I can't believe I can't remember it now) are in the language now, but they're sort of batteries-not-included. They're very, very low-level primitives that the language now provides, and the STL, the standard template library, does not yet have much in the way of support scaffolding for you. So if you want to write coroutines, you can roll your sleeves up and write a whole bunch of complicated state management code, and you can do it. There are also some cool libraries that do some of this stuff for you. But the expectation, as I understand it, is that over the next couple of cycles of the C++ committee's every-three-years meetings, the sort of predominant best solution for a whole bunch of things to do with execution and other things will come to the fore, and that will become the way that we think about these coroutines and how they exist in a wider context. And then, once those building blocks are in place, maybe there will be some async routines and async libraries that we can use.
Yeah, yeah, which will be great.
So obviously that was just me
steering it my way there, to talk about my own thing. And again, I've seen people use the co_yield and co_await stuff in sort of, like, toy examples, but I've never used it myself in anger. And although this is not async await at all, because coroutines are separable from async await (you can build async await with them),
coroutines are really, really cool for things like writing emulators, where you want multiple very lightweight processes. Think: every single device wants to have a "hey, another clock tick happened". Right? And you want to be able to write code that looks not like the traditional poll. You know, everyone's written poll, where you have, like, a state machine in every single element that's being polled, and the state machine is like: hey, every time you get a poll, you have to do switch-on-what's-my-current-state. Oh, case waiting-for-this. Oh, okay, well, then I guess I'll do this, type of thing. And it's really convenient to be able to instead just write, like, a video chip that just says: you know, await next cycle, draw two pixels. Await next cycle, draw two pixels. In a for loop that's as wide as the screen. And then you do the, oh, and now do the thing that happens at the edge of the screen, await another cycle. And you just write straight-line code like you were the video chip, and there was no one else in the world, except every time there's a cycle, you say: okay, I'm done for this clock tick. Now someone else gets a go.
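A sketch of that straight-line video chip as a Python generator; the "chip", the screen size, and the scheduler are all made up for the example:

```python
WIDTH = 4  # a comically small screen

def video_chip(screen):
    # Straight-line code, as if you were the video chip and no one else
    # existed; every `yield` is "I'm done for this clock tick".
    while True:
        row = []
        for _ in range(WIDTH // 2):
            yield                     # await next cycle...
            row.extend(["px", "px"])  # ...draw two pixels
        yield                         # the thing at the edge of the screen
        screen.append(row)

def run_cycles(devices, cycles):
    # The clock: on each tick, every device gets one go.
    for _ in range(cycles):
        for device in devices:
            next(device)

screen = []
run_cycles([video_chip(screen)], cycles=6)
print(screen)  # [['px', 'px', 'px', 'px']]
```

Compare that with the state-machine poll version: here the for loop's position *is* the state, and the generator keeps it for you between ticks.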
And you don't use threads for that, because the synchronization cost is staggeringly high, and the work you're doing in between them is tiny; but it's inconvenient to write the normal poll stuff. So anyway, that is a complete aside, and I've not yet written a coroutine-based emulator, but I would like to.
Yeah, yeah.
So let's sort of go back to some of the other languages we were
talking about earlier. I know we have spoken about Java before now. Is there an equivalent for Java?
You know, I'm a little embarrassed to say that I don't know.
There's nothing embarrassing about not knowing things. I should make that clear.
I feel like I should know this, that's why. I mean, there's lots of things I don't know.
Let's be real clear on that.
All right, okay.
But I feel like as much as I've done with the JVM
and with asynchronous programming, I've never...
I guess, doesn't Clojure have like an async?
I think it does.
So there must be a mechanism for it.
Well, to your point,
it's like you can build these things
on top of lower-level languages
and they work just fine.
But I actually don't know if there's one in Java.
The one point that you made me think of, though,
when you were talking about the C++ introduction of this, is that I would imagine that that means that the C++ community and standard library is going to face the same situation that is true in Python, that is true in a few of these other languages that introduced this, where you sort of get this bifurcation of the standard library into
these are the asynchronous non-blocking
calls, and these are the star-bellied sneetches.
And between the two, you kind of don't want to cross.
And so you sort of have to make your choice.
And if you have to go back and sort of retrofit it, it sucks
but in a lot of cases it means you either have to spend a lot of time
figuring out how to unify the duplication between those two worlds
in a way that makes sense
or just duplicate it.
You're just going to do everything twice,
the async way and the sync way.
And that can definitely create confusion
and make it harder to use things.
But if you want to be able to take advantage of this,
that's what you need to do.
Yeah, I don't know about that, actually.
I know that the Boost library ASIO, which is the asynchronous IO library, is one of the things that's been talked about for being standardized as the network library of C++. And my understanding is that there is this kind of unification discussion going on. I don't know much about that myself, but I'm sure we could find some people who wanted to talk about this kind of stuff. I'm very much more a user of C++ than anyone too deep in the details of how it's designed.
Yeah. And I know that this is the kind of thing that people think about a lot, and that's one of the reasons why, arguably, the C++ standard library is mostly... I was going to say impoverished. That's a terrible word. But it's a pretty impoverished standard library when it comes to actual pragmatic things, like opening files, or reading every line of a file, or whatever. Well, there are these very high-level concepts now, and algorithms that you can use. You know, if you want to do some of the more obscure partial sorting of an arbitrary array, that's not even an array, it's just iterator pairs of things, you can do all these clever things. But if you want to find the full stop at the end, or if you want to trim the last character off a string, then you just have to do it using those algorithms. You don't just have a .removeLast or whatever, or .uppercase and .lowercase. All those kinds of niceties don't come, mainly because they aren't necessarily as general-purpose as one might imagine. Like, what does it mean to be uppercase? Well, what locale? Oh, now you've got all these questions, right? A lot of the languages just go, meh, we just want to print something out in shout caps. That's important to me.
Right. In general, I think that is definitely a factor in the slower adoption of these new technologies: they want to do it right, knowing that it kind of has a certain persistence. It stays on for a long, long time. It's worse than glitter for the C++ committee. They have to make decisions that are basically forever decisions. They never really go back on stuff.
Yeah, yeah, yeah.
Another point that you made that I wanted to briefly touch on
was the sort of performance impacts. And, like, I definitely have made the mistake myself, and I've seen other people make the mistake, of sort of just throwing threads at a problem when they have a performance problem. It's like, oh, I'll just break the work up and distribute it across threads,
and then it'll be faster, right?
And, you know, the answer is not always.
If the cost of crossing those thread boundaries,
plus the cost of reassembling the work when you're all done,
is higher than the actual amount that you save
by breaking up the work,
you can actually make it slower
and more complicated at the same time.
Yeah, brilliant.
Which is not great.
And so, you know, one of the advantages of having a sort of single-threaded,
you know, event loop style, whether it's with async keywords or not,
is that you don't have to pay the, not only do you not have the sort of
programmer complexity problems of crossing thread boundaries,
but you also don't have to pay the performance hit of crossing thread boundaries. And if you can sort of structure your work in
such a way that you can take advantage of the non-blocking IO while keeping the CPU busy,
you can actually get very good throughput by doing that and sort of avoid some tricky performance
problems of like,
oh, yeah, why is this queue slow?
Every time I go to read from the queue, it blocks,
and it really slows things down.
I keep the queue full, and even when I keep the queue full,
it still slows down.
What's going on here?
Yeah, I think it's an important distinction you make there. If you are predominantly IO-bound, then you can definitely take advantage of an event-based system. Because, like you said, if you're keeping the CPU busy, then sort of by definition, when the CPU has finished doing whatever it's doing now, there's always something new for it to do, because some other IO event has completed, and someone who was waiting for a read has now got it, or someone who was waiting for a write to drain has got the OK that it's gone.
But it's easy to fall into the situation... and this is what Compiler Explorer suffers from, right? If you hit Compiler Explorer, some of the work that it does is on the event thread. So, like, the web server itself is running like this. Everything in Node.js, which is what Compiler Explorer is written in, is continuation-passing style, mostly, although we're slowly moving to an async-await style of work, actually. Which means every web request is, like, IO as far as we're concerned: IO request, IO response. Every now and then, we've actually got the results of a compilation. So for us, another piece of IO is: we ran the compiler. It's another process, and as far as we're concerned, we're awaiting it. Now, right, we can be serving more web responses. We can be giving the favicon.ico to whoever wants it. And then when the compiler is finished, in its own process somewhere else in a little sandbox, we're going to read the results back in, and we're going to parse them so that we can give something back to the web browser that's renderable. That is totally CPU-bound, and for large programs that have 20, 30, 40,000 lines of assembly output, it can take us a long while. Meanwhile, we are essentially blocking any other web request that's coming in that's unrelated to that, for that one node. Now, we have load balancing, we have multiple physical computers that are doing the work, so that's less of a problem. But our CPU is 100% wedged, and we can't do the other work that is available to us. Now, if we had threads, that would not be true.
And so we've discussed having worker threads for exactly this thing. In which case, actually, just to sort of square this off, we would probably kick off that "parse the assembly output and return me the dictionary that I need to send to the user" in a worker thread, and then we would await the thread coming back. You know: hey, kick off the thread, and then await the results. And now we're back in the land of async-await. Just like running the compiler, the result of parsing the output becomes itself an "async IO", inverted commas, thing, at least as far as we're concerned.
So there are kind of ways of making everything fit together.
I think the other thing I think about with performance, you mentioned crossing thread boundaries, as it were. Obviously, thread boundaries are typically not very expensive in the general sense, because in the same process you can actually access the same memory as the other.
Yes, for good or for evil.
For good or for evil, exactly, yes. So obviously the locking is where it starts to become troublesome. And you mentioned queues, so you did sort of cover that exactly as one would want. But hidden inside that queue is a lock that is going to prevent you from being able to do work once it's either full, or there's no work to do, all those kinds of things. Or, you know, you've got the possibility of deadlock and other
issues like that.
But conversely, if you are in async-await land, and with gay abandon you're putting the async keyword on every single function you can find, even when there isn't any actual awaiting inside of that function, then you're pessimizing your program. Because behind the scenes, what that means is that when you call that function, you don't actually get, like, the nuts and bolts of that function. The function which returned an integer now returns a future of an integer. And you call it, and first of all you get back, like, a "hey, there's nothing to do yet", and then it gets scheduled on the event loop. Or it could, at least: if someone's awaiting it right now, it's going to be scheduled on the event loop, which means there's at least one tick of, like, the event loop clock before the first line of your function is going to start running. Or... actually, it depends on the implementation. Some of these things do actually eagerly execute, and there's a difference between tasks and async things in different languages, or whatever. But I've definitely had some experiences where I've called a function and gone, why did that not do anything? And you realize, oh, I didn't await it. Which meant that all that happened is it called the function
to generate, like, the coroutine context.
And then we're like, okay, here's your thing.
When you want it, it's ready.
Here's this code that I can run for you whenever you like.
But not now.
You didn't call it.
No, you didn't.
You told me you were going to call it.
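In JavaScript terms, that "forgot to await" trap looks something like this. (JavaScript's async functions actually start running eagerly, unlike some coroutine implementations that only build a coroutine object on call, but the symptom is the same: you get a promise back, not the value. `fetchAnswer` is a hypothetical name for illustration.)

```javascript
async function fetchAnswer() {
  // Pretend this awaits some IO; even awaiting an already-resolved
  // promise defers the rest of the function to a later microtask.
  await Promise.resolve();
  return 42;
}

// Forgetting the await: you get a pending Promise, not 42.
const oops = fetchAnswer();
console.log(oops instanceof Promise); // true
console.log(oops === 42);             // false

async function main() {
  // With the await, you get the value, one microtask tick later.
  const answer = await fetchAnswer();
  console.log(answer); // 42
}
main();
```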
It's not as cheap to do that.
You're creating these intermediate objects that have some setup cost, as well as the fact that maybe you're going once around the horn of the event loop before you're actually making progress. Now, that does have some positive side effects if you are CPU-bound and you're just trying, in a really naff way, to yield.
So this is another Compiler Explorer anecdotal thing here, and I hope that none of my developers who contribute code, who are actually good at developing code, are listening.
This is my harebrained ideas.
But one thing I consider doing is literally putting async await sleep zero into that big parsing loop.
You know, every 10,000 loops, just to yield it back.
And of course it's awful,
and no one should do that,
but it's a way of giving someone else
a go at the CPU,
so that at least some progress
can be made on those web requests
that are coming in.
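A sketch of that hack in Node.js, with a hypothetical `sleep` helper built on `setTimeout`. Awaiting `sleep(0)` every N iterations hands control back to the event loop, so other queued callbacks, like incoming requests, can run in the middle of CPU-bound work:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// CPU-bound loop that periodically yields to the event loop.
async function parseAllLines(lines) {
  const results = [];
  for (let i = 0; i < lines.length; i++) {
    results.push(lines[i].toUpperCase()); // stand-in for real parsing
    if (i % 10000 === 9999) {
      await sleep(0); // awful, but lets other work make progress
    }
  }
  return results;
}

// While the parse yields, this timer callback gets a chance to fire.
let otherWorkRan = false;
setTimeout(() => { otherWorkRan = true; }, 0);

parseAllLines(new Array(50000).fill('mov eax, 1')).then((out) => {
  console.log(out.length, otherWorkRan); // 50000 true
});
```

Without the `await sleep(0)`, the timer callback would only run after the entire loop finished; with it, other work interleaves at every yield point.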
Yeah, you asked me earlier
about async, await, and Java, and I don't really know.
But one thing I did see earlier this week that I thought was super interesting is a proposal for Java virtual threads,
which might be an interesting middle ground between these two worlds that we're talking about. And the proposal, as I understand it and remember it, basically revolves around threads where, when you're executing what would normally be blocking IO calls within the context of a virtual thread, the JVM, underneath the covers, will turn those into non-blocking calls for you
and suspend the execution of the virtual thread
to allow other virtual threads to execute,
which means you don't have to do the thing that you normally do with threads
and pool them.
You never want to spawn thousands and thousands of threads at the same time when they're operating system threads, because you're going to starve out the operating system and make the thread scheduler's job significantly harder. But with virtual threads, you can just go ahead and make as many as you want, because they're not mapped to operating system threads. They are essentially async-awaiting automatically, at the point where the JVM says: hey, you called file-open. Well, rather than the file opening, I'm going to rewrite that as an async-await kind of thing, and let another virtual thread take over this thread, capital T, right? Like, this whatever execution context. But obviously you kind of need to opt into that a little bit higher up, somewhere where you spawn these virtual threads, right? And each one of them is essentially a top-level coroutine, or an async-able thing.
That's a neat solution that doesn't require much change to the code. You're still writing straight-line code. As far as you're concerned, you're still blocking. And you are blocking, right?
Yes. It's just that at the point where you would block and rely on the operating system to context-switch you, the JVM says: I can context-switch you much more cheaply, right, and I'll do that here and get someone else going.
Yeah, that's awesome.
Yeah. And one interesting side effect of that that I was thinking of is that it avoids this problem that we were just talking about, where you sort of have to bifurcate the standard library.
Oh, right. Like, you're just saying: no, no, underneath the covers it might be bifurcated, but that will never hit the API. You will never see that. Really interesting. It's all one thing.
Yeah, which is kind of cool, because this just sort of reduces the cognitive load on programmers who need to understand how the standard library works.
No, that's super awesome. Yeah, that's a really interesting observation. I hadn't thought of that. But yeah, that bifurcation is otherwise a big deal breaker for a lot of people. They're like, oh, which one? You know, which way are we going to go?
Oh, no, that's super cool.
To finish on, I just want to make a couple of
observations that amuse me. And that is: all these async-await style things, and again, mainly the coroutines that help them, which may be more useful, are the way that operating systems used to work. I don't know if you... did you ever do any Windows,
Windows 3.X development back in the day?
No.
Windows NT was really the first operating system
that I used as a programmer in anger.
The Windows, the Win32 API,
which is still around and about,
has like an event loop, right?
And I don't know if you've ever looked, or remember... do you remember doing any raw Win32 API stuff, even in NT? Because although it wasn't the way I'm about to describe it, it was the same API.
Yeah.
So you would call a get-next-event, effectively, and you'd have a switch statement on get-next-event. So you'd do, you know, switch on GetMessage, or whatever... I actually can't remember. GetMessage? Does that ring a bell? Now I'm making stuff up.
Like, you're reaching into the very depths of my neurons here, but yes, that does actually ring a bell.
And then you'd pass it all these structures to fill in, and it would return the... you know, what's the next message, right? And it would be, you know, WM_PAINT, and then you'd be like, oh, I'd better draw then.
Or like a mouse click event, right?
Or a mouse click, yeah, any of these things. Yes.
In modern times, that's just reading off of an event queue, right? Your thread goes to sleep, and you can have many threads, obviously, and they can be doing other things, but your sort of UI thread is there, just reading the next message off of that. So as the mouse is moved through your world, or if people are clicking or typing, whatever, you get those events. But way back in the dawn of time, that was how the operating system gave someone else a go. You call GetMessage and it's like: oh, I'm suspending you now. There is no preemptive multitasking here. At this point, we're saving all the registers, and we're going to load someone else's registers and memory map, and then we're going to return to their GetMessage, and now they get a go on the CPU. And certainly the operating system that I was using prior to Windows, which was RISC OS, had that feel too.
You would have to do this system call that would tell you what the next thing is.
And then really it was switching and returning to somebody else, which was a really interesting design because you could do that in user mode.
It wasn't that difficult to see how it was saving all the registers and reloading them all back out again. And so you could write your own, like, cooperatively multitasked sort of sub-threads within your thread, using similar kinds of techniques, which was a great way of learning: well, this is how the operating system must be doing it. And I remember having an epiphany moment doing exactly that, and just having those two routines that would, you know, call a function, which isn't really calling a function, because calling that function actually jumps back into, and returns from, the other function's call-a-function function. And then you can ping-pong backwards and forwards between the two of them. Yeah, just one of those crazy things. So it's kind of come back now, this cooperative multitasking, which is why I described it as that right at the beginning of this whole conversation. It's like... it's cooperative multitasking.
You get to say when someone else gets a go of the CPU.
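JavaScript generators are a compact way to see that ping-pong: each `yield` is the "I'm done for this tick, someone else gets a go" point, and a tiny hand-rolled round-robin scheduler (hypothetical, just for illustration) alternates between two tasks:

```javascript
// Two cooperative "tasks": each yield hands the CPU back to the scheduler.
function* taskA() {
  for (let i = 0; i < 3; i++) yield `A${i}`;
}
function* taskB() {
  for (let i = 0; i < 3; i++) yield `B${i}`;
}

// A tiny round-robin scheduler: resume each task in turn until all finish.
function runCooperatively(...tasks) {
  const log = [];
  let live = tasks.map((t) => t());
  while (live.length > 0) {
    live = live.filter((gen) => {
      const { value, done } = gen.next(); // give this task a go
      if (!done) log.push(value);
      return !done; // drop finished tasks
    });
  }
  return log;
}

console.log(runCooperatively(taskA, taskB).join(' '));
// A0 B0 A1 B1 A2 B2
```

No preemption anywhere: each task runs until it volunteers to stop, exactly the property being described.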
Yeah.
Yeah.
Everything old is new again, my friend.
Everything old is new.
Well, we've covered some things that we are inexpert in only as users.
So I do hope that anyone listening to this who's been shouting at the microphone, at their headphones or their speakers about how...
I'm sure our listeners are very frustrated by now.
So do tweet us at TWOSCP to let us know all the mistakes we've made, as per usual.
And until next time, my friend.
Yep, next time.
You've been listening to Two's Compliment, a programming podcast by Ben Rady and Matt Godbolt.
Find the show transcript and notes at twoscomplement.org. Our theme music is by Inverse Phase. Find out more at inversephase.com.