CppCast - C++ Concurrency
Episode Date: September 29, 2015

Rob and Jason are joined by Anthony Williams to discuss some of the concurrency features of C++.

Anthony Williams is a UK-based developer and consultant with many years of experience in C++. He has been an active member of the BSI C++ Standards Panel since 2001, and is author or coauthor of many of the C++ Standards Committee papers that led up to the inclusion of the thread library in the new C++ Standard, known as C++11 or C++0x. He was the lead maintainer of Boost.Thread from 2006 to 2011, and is the developer of the just::thread implementation of the C++11 thread library from Just Software Solutions Ltd. Anthony lives in the far west of Cornwall, England.

News: C++ Core Guidelines, GSL Lite

Guest: Anthony Williams (@a_williams), Anthony Williams on StackOverflow

Links: C++ Concurrency in Action: Practical Multithreading; Just Software Solutions; just::thread; C++ Standard Thread Library
Transcript
This episode of CppCast is sponsored by JetBrains, maker of excellent C++ developer tools including
CLion, ReSharper for C++, and AppCode.
Start your free evaluation today at jetbrains.com/cppcast-cpp.
Episode 28 of CppCast with guest Anthony Williams recorded September 29th, 2015.
In this episode, we talk about the CPP Core Guidelines announcement.
Then we'll interview Anthony Williams, author of C++ Concurrency in Action. Anthony will tell us about all the concurrency features that were added in C++11 and 14,
and give us a peek at what might be coming in 17. Welcome to episode 28 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing great, Rob. How are you doing?
I'm doing great.
I know you have a bit of a cold still recovering from CppCon, right? I am. It was a great week,
but I did get a cold. It was a long week and so much to do. And we got a great reception from
our listeners at the conference. It was amazing how many people recognized me and were friends of the podcast. It
was a great experience. That's awesome. I hear you had no trouble giving out the shirts we made.
No, I'm sure I could have given away quite a bit more. I made a point of focusing on previous
guests of the podcast, but I know I missed a couple people. Sorry about that, if anyone's
listening. But, yeah.
Cool. Any other highlights you wanted to share before we go into the rest of the show?
Obviously, there's a lot to talk about, and I feel like we should try to have maybe some of the keynoters on to talk about the GSL and everything, which we'll touch on in the news items.
But anything else you wanted to mention?
Well, I guess, I mean, they're going to put all the videos up eventually.
Right now there's only four up, and I will say besides the obvious keynote ones,
Chandler Carruth's, which was, I guess, technically a plenary session, not a keynote. It's good fun on optimizing C++, and I do recommend watching it.
Yeah, so far I've watched Bjarne and Herb's talks.
I want to watch Sean Parent's and Chandler Carruth's later this week.
Yeah, they're all good.
But Chandler's has a couple of big surprises in it from my perspective.
Great.
Okay, well, at the top of every episode, I like to read a piece of feedback.
This tweet came in a couple days before CppCon.
As was mentioned during some of the keynotes, they published this new C++ Core Guidelines document on GitHub, and it got noticed prior to the conference. Thousands of people found the repo and were favoriting it, and I think they may even have started branching off it before the conference even started, before anyone had explained what it was.
Right, Jason?
Yes.
It got mentioned by both Bjarne and Herb Sutter.
And for a while there, Herb made a point of pointing out that the repo was one of the top trending ones on GitHub, which was cool.
Yeah. So anyway, this tweet was from Andrew Karpoff, and he writes:
hey, take a look at the CppCoreGuidelines. Definitely worth mentioning on your show.
And it absolutely is. And I think there's going to be a lot for us to dig into over the next few episodes with all this new content. We'd love to hear your thoughts for the show. You can
always email us at feedback@cppcast.com. Follow us on Twitter at twitter.com/cppcast, or like us on Facebook at facebook.com/cppcast, and you can always review us on iTunes as well.
So joining us today is Anthony Williams. Anthony is a UK-based developer and consultant with many
years of experience in C++. He's been an active member of the BSI C++ Standards Panel since 2001, and is the developer of the just::thread implementation of the C++11 thread library from Just Software Solutions Limited.
Anthony lives in the far west of Cornwall, England.
Anthony, welcome to the show.
Hi.
It's great to have you here.
Yeah, it's good to be here.
I am curious what the BSI standards panel is.
That doesn't ring a bell to me.
Well, each national body has their own representative. So in the US it's INCITS, where the national experts come together to have a discussion, or at least those people that can be bothered to go to the meetings.
In the UK, BSI is the standards panel. And so when the international standards meeting is held,
there will be various representatives from BSI there.
And the UK has a vote, as does the US.
And various others: France will have a vote, and Germany will have a vote.
I forget what the acronyms for their standards bodies are.
Okay.
Okay, so these are different regionalized panels for the ISO C++ group, basically?
Yeah, that's right, yes.
Okay, very interesting. I didn't know about that either.
So a couple of news items.
Obviously, the big one to talk about, as we kind of already mentioned with the feedback,
was the C++ core guidelines that got announced during Bjarne's keynote at CppCon.
So, Jason, have you had time to really dig into this at all?
I'm guessing not since you were at the conference.
Yeah, and there's so much information there.
I read one comment from someone that pointed out that it was so dense,
and that was my takeaway, too, when I first looked at it.
But if you watch the talks, the keynotes from the conference, what you get is this feel that the guidelines aren't necessarily meant 100% for humans. A large portion of them are meant for static analyzers, to teach our analysis tools how to catch memory problems and other core problems that have been issues in C++.
Right. And it sounds like Microsoft and the Visual Studio team are working on the first implementation of an analyzer built off of these core guidelines, and that's going to be released whenever VS 2015 Update 1 comes out. Is that right?
Something like that. They said look for it next month, and I don't know if that's necessarily coming with Update 1 or not. I was unclear on that personally.
Okay, I thought during Herb's talk they mentioned the Update 1 time frame.
Oh, maybe. Yeah, I'm not sure.
Okay. Anthony, have you had a chance to look into these guidelines at all?
Yeah, well, I saw the announcements on Twitter and thought, oh, this looks interesting.
Certainly, I mean, some of them are straightforward things that everybody who's followed C++ at all is likely to know, like no naked new: use your smart pointers and stuff.
Right. But there's quite a lot of stuff that's new in there, like tagging owners.
For passing raw pointers around,
you tag it with an owner type template,
so you can say
this is a raw pointer that owns its resource,
as opposed to this one over here, which isn't.
Or tagging raw pointers with not_null,
so it can be checked with a static analyzer,
and you don't need to then put an active test in your code,
because you can just rely on it being not null.
Right.
And they were very big on pointing out repeatedly that raw pointers are not evil by themselves.
Just don't pass around pointers where you don't know who owns them.
Right, yes.
Yes. So there was a lot about the pointers.
I noted that there are a lot of entries in the index of things that they intend to cover.
So I was having a look down, thinking, oh yeah, well, they've got a section on concurrency guidelines, and what's going to be in there?
Well, actually, there's not a lot there at the moment.
It's all waiting to be fleshed out.
It says at the top it's a work in progress.
Yeah, and I guess that's a big thing to keep in mind with all this: they're asking for contributions to add new rules,
to build out the library.
So if you have any interest in working on this,
they're definitely looking for help from the community.
And speaking of community help, this next item, not really an article but another GitHub library, is someone went ahead and made a GSL (Guideline Support Library) implementation for C++98 to C++11.
I guess the one Microsoft is working on maybe only targets C++14 and up,
but if you're using an older code base
and you haven't upgraded past C++11 yet,
then this library might be useful for you.
Not sure if there's really much else to mention,
but it's kind of the same type of stuff that's going to be going on in the GSL,
but implemented using just standard C++98 to 11 features.
Well, they are encouraging many people to make their own implementations of the GSL. There was a comment from Herb, I guess, about how you don't really know the problems in a standard specification until multiple people try to implement the standard.
So that's what they're hoping for.
Yeah, so Microsoft is taking the first stab at it, but I'm guessing the Clang and GCC communities won't be far behind, hopefully.
Okay, so Anthony, we have a lot to talk about today, I think.
To start off, maybe we should just talk a little bit about why concurrency is becoming increasingly important in C++ applications.
Yeah, sure. I mean, one of the reasons that people choose to use C++ in general is because they want the raw performance that you can get whilst having the ability to use the high-level abstractions of object-oriented code and things like that.
So you're looking at places where people are after performance.
Now, increasingly over the last decade or so, the processors that we're using are giving us that performance in the form of more cores rather than higher clock speeds. The clock speeds have pretty much topped out. So these days it's not uncommon for a desktop machine to have 8 cores, some of them have 16, and even mobile phones often have quad cores or 8 cores. So if you're after raw performance, then you're looking at multi-threading and concurrency.
And so you need to be able to write your code to take advantage of that.
Right, and just looking back a bit: in the 80s and 90s, I guess maybe this wasn't as important a concern, because we were just using single-core processors and they just kept getting faster and faster each year, right?
Yeah, that's right. I mean, obviously the high-performance computing community have been doing this sort of stuff for years, because they're working on systems where, if there weren't multi-core CPUs, then they had multiple CPUs attached to each other on the same board. And these days, if you look at the supercomputers, they've got, I think, thousands of CPUs attached together.
So they've been doing this for years, and it's just that in the mainstream it's becoming far more important.
Right. So what are some of the key improvements for concurrency that were added with C++11 and 14?
Okay, well, C++11 was the first C++ standard that actually acknowledged the existence of multiple threads and concurrency.
Prior to C++11, if you tried using threads or concurrency in any form within your program,
then you were relying on your platform-specific extensions.
So yes, on a POSIX platform, you could use the POSIX C library.
On Windows, you could use Windows threads.
But that was entirely proprietary,
and you just had to hope that they'd managed to integrate it properly with C++.
With the C++11 standard coming out, that became part of the standard, and the key part of that was actually the memory model, which determined: if you do this in one thread and that in another thread, then is it synchronized?
If you write to a variable over here, can you see it on a different thread over there?
Specifying that was a very important part of the C++11 standard.
And that was a lot of work from a lot of serious experts, trying to make sure that it not only worked but also was implementable and efficient across a wide variety of processor platforms. So yeah, that's the key thing that we got from C++11.
And then there are all the high-level wrappers: we've got the standard thread class, the standard mutex class, std::async and std::future for high-level things, standard condition variable. That was the base-level implementation of threading facilities in C++.
When you move forward to C++14, we actually didn't add a lot. The key addition there was shared mutex,
which people might notice is a read-write mutex.
So you can have one thread that's doing the writing, and then nobody can read it.
But if people are reading,
you can have multiple threads reading simultaneously.
So sometimes that can provide efficiency
when you've got something that's very rarely updated.
I'm a little confused about shared mutex specifically.
It seems like it was added in 14, but only as shared_timed_mutex or something like that.
And then in C++17, they're adding a more generic version.
Am I just confused?
No, no, you're right.
In the 14 standard, we've got shared_timed_mutex,
and then coming up for 17, we've also got shared_mutex.
This is by analogy with the standard mutexes.
We've got std::mutex and std::timed_mutex,
where you can have timeouts on your lock.
So: I'm going to try and lock this,
but if I don't get it within 30 milliseconds,
then I'll give up and do something else. That's what the timed aspect is.
So with shared mutex, originally it was like, well, we're going to go for the full-blown thing with all the timeouts. And then Microsoft in particular said, hang on, on our platform we can implement it more efficiently without the timeouts, if you don't need them.
So for C++14 we said, well, hang on a minute, we don't have time to add that now, but we'll just change the name to make it symmetric with all the others. So we've got std::timed_mutex and recursive_timed_mutex, and now we've got shared_timed_mutex.
And for 17, we've actually added in
just the straightforward untimed
version, and the name
hasn't already gone. So it can just fit in
with the name you might expect: shared_mutex.
And then if you don't need the timeouts, you can just use that.
Okay.
So going back to
different threading implementations,
if you're already working on an application
that maybe started off in C++98
and you're using something like Boost Thread
or your own library built on top of POSIX threads
and Windows threads,
would there be any gains from switching over
to the standard thread implementation?
I think I'd start off by saying: if you're using Boost threads and you keep your version of Boost up to date, which I know some people do and some people don't, then the advantage from switching to standard thread is probably fairly minimal, unless your implementation is providing additional features or better optimizations than Boost does.
Obviously, if you're thinking about optimizations, you want to profile it.
But Boost is portable across so many platforms.
If you're already using Boost, there's not necessarily a lot to be gained from switching to the std version. But if you're using the raw threading facilities provided by your platform, then I would definitely say that it is worth switching to the standard library version, because that provides the portability which obviously the platform specifics don't.
But then again, if it works, there's no need to change, to go through and say, right, we're just throwing out all our POSIX threading stuff. Obviously, if you're going to make a change to that part of your application, then it might be worth thinking about making the change to the standard library usage as well. But if you're not changing that, then that can be a lot of work and there's not necessarily any gain.
On some levels, the facilities that you've got, the higher-level ones, might be easier to use. So standard condition variable is easier to use than a POSIX condition variable,
because you can pass the predicate directly,
and you don't have to worry about spurious wakeups.
And things like promises and futures just aren't available outside of either standard library implementations
or things like Boost, which provided their own.
All right, since you just mentioned promises and futures,
would you like to give our listeners an overview of those?
Yeah, sure.
At the basic level, it's a means of providing communication between two threads,
a safe way of passing data between two threads.
So you've got the promise half, which one thread looks after
and can set the value on when it is ready. And then the future half is an object that you can just pass around, and you can then use that to wait for the data to come from the promise.
So you can use that as a basic synchronization. If you had a promise of void and a future of void, then that's just a straightforward tick-the-box, yes, I'm ready.
Or alternatively you could pass through some data of any amount: it could be a vector you're passing, it could be a string, it could be any sort of object that you're creating on one thread and then
passing through to another thread somewhere else in the system. And you don't need to
have any direct access to a shared variable, because all the synchronization is done internally
in the future.
Okay.
So we're talking a little bit about sharing data between threads right now. Are
there any other new tools for sharing data that were added with C++11?
The basic tools that we've got: so we've got futures and promises.
As well as promises to give you futures, we've got std::async,
which spawns a task, which would generally be running on its own thread. And then when the function returns,
the value is stored in the future. So it's like the most simple possible way
of implementing thread-pool-type stuff, because you call std::async, it spawns your task on its new thread,
and then you can wait for the result to come back.
So there's that and the promises.
That's the highest level of synchronization we've got.
And then also we've got call_once, which is for one-time initialization,
to ensure that some function is run only once to initialize data.
I'm going to say the dreaded word of singleton, but it is a safe way:
you could use call_once to initialize your singletons. But you might
use it in other contexts too, because if you've got multiple objects
of each type, but each instance
has some
lazy-initialization-type
aspect that you only want to happen
once, but
your object might be accessed
many times,
you can use call_once
to ensure that each object is
initialized only once.
Does something like that have a high amount of overhead?
Is there an expensive check every time call_once is interrogated, to see if it's already been called?
There has to be a check, obviously.
But the cost varies quite considerably between implementations. A very generic implementation will use a mutex under the hood, which therefore imposes a large cost. Whereas a highly optimized implementation can do it with straightforward atomics and very carefully optimized operations, so that if the function's already been called,
then it's a very, very quick check.
And the only time that there's any expense
is when you've got multiple threads competing
to try to see who's going to be first.
For over a decade, ReSharper has been helping .NET developers
be more productive and write better code in Microsoft's Visual Studio.
Starting this year, it expands the list of languages it supports by adding a dedicated edition for C and C++ developers. To put it simply, ReSharper
C++ makes Visual Studio a much better IDE for C++ developers. It provides
on-the-fly code analysis, quick fixes, powerful search and navigation, smart code completion,
automated refactorings, a wide variety of code generation options, and a host of other
features to help increase your everyday productivity.
The list of things that ReSharper C++ can generate for you includes definitions, missing and overriding members, equality and relational operators, and hash and swap functions. Add dozens of customizable templates, and you'll deal with boilerplate code faster than ever.
Code refactoring for C++ helps change your code safely, while context actions let you
switch between alternative syntax constructs and serve as shortcuts to code generation
actions.
With ReSharper C++, you can instantly jump to any file, type, or type member in a solution.
You can search for usages of any code and get a clear view of all found usages with grouping and preview options.
Visit jetbrains.com/cppcast-resharper-cpp to learn more and download your free 30-day evaluation.
And if you're a student or an open source project,
get your license for free, courtesy of JetBrains. Okay. Well, now you just went and mentioned
Atomic. I think that's the first time that came up. So let's talk about that for a minute.
Yeah, sure. So atomics are the lowest level of synchronization that's provided in the C++ library.
And you can create an atomic.
It's a template, so std::atomic of some type.
So std::atomic of int or char or whatever.
And you can have std::atomic of any POD struct.
PODs, PODs specifically.
Yes, so you...
Oh, sorry, go ahead.
The requirements are that
it must be trivially
destructible
and trivially
assignable,
and
comparable with memcmp.
So the point is that
when you do any operations on it,
the library is going to use the raw bits
and not call any user code
when it's doing the assignments and the comparisons.
Because mostly that's going to be a single processor instruction,
you're hoping.
Okay.
Obviously, if you try and stick in a struct that's got an array of a million ints in it, then yes, you can do that. Atomic will let you, but it's not going to be a single instruction; that's probably going to be a spin lock that it uses internally to guarantee that it's atomic. Whereas a struct that contains just a single int will probably just use the atomic operation on that int
that's provided by the processor.
Okay, so let's dig in a little bit more.
What do we actually get from using standard atomics?
Okay.
Yeah, well, if you use std::atomic with all the default parameters,
so std::atomic of int, and then just assign to it and read from it as if it was a normal int as far as possible, then that guarantees that your operations from multiple threads occur in some order that's unknowable at compile time, but some definite order. So if you assign from one thread and then read from another, then either the assignment will happen before the read or vice versa, but there won't ever be a clash.
And this is particularly good if you use the read-modify-write instructions, like fetch_add, or even just the ++ operator, which is provided for integer types. Then you know that if you call ++ from two different threads, you will add two.
Whereas if you try and do that on a raw int, you might actually end up with a random number, because of some form of internal tearing of the writes to memory, or you might end up either adding nothing or only adding one.
So it guarantees that any writes are interleaved in some order.
That's the key benefit of using std::atomic.
Okay.
At the last session of CppCon,
there was some mention, I believe,
of atomic shared pointer becoming part of the standard.
And I'm just confused as to what atomic shared pointer gives us.
Yes.
I know it's a separate type.
When it was being initially discussed, people wanted it to be a specialization of std::atomic:
std::atomic<std::shared_ptr<whatever>>.
Okay.
But people said,
hang on a minute,
shared_ptr is not a POD.
So that seems a bit unusual.
So we've got the whole type, atomic_shared_ptr. If it's a shared variable that's shared between multiple threads, then you can do reads and writes to your atomic_shared_ptr instance from all these different threads without any external synchronization. Whereas if that was a plain shared_ptr, then you'd have to have external synchronization, like a mutex.
So do you mean accesses to the data that the shared pointer points to?
Or do you mean modifying the shared pointer itself?
Or do you mean modifying the shared pointer itself?
Modifying the shared pointer itself.
Okay.
The most common sort of thing for that is where the shared pointer is the root pointer into some form of data structure, like a list or a tree.
And then you're adding nodes to that data structure, and so you're having to change what the root is. Because either
it's a linked list and you're adding something at the beginning, or it's a tree and you've rebalanced
it, so the root node is now different. So if you're changing that owning node
of your data structure,
then you're going to have to write to that shared pointer.
And if you're then trying to read from that root from another thread,
then you've got a data race.
And so you need to either put an external mutex around it
or use an atomic shared pointer.
Okay. And so this isn't about the rest of the pointed-to data structure. Obviously, once you've
dereferenced your pointer, you've got to make sure that the data that you're then accessing is
also safe. But this is just about that pointer itself, the shared pointer instance.
Interesting.
Okay, so that is one new feature that might be coming with C++17.
Are there any other things on the horizon with C++17 in concurrency?
Yeah, well, the big thing that's coming is, at the moment,
things like atomic shared pointer aren't directly in the C++17 draft.
What we're working on is colloquially known as the Concurrency TS.
There's a longer name, it's the Draft Technical Specification for Concurrency in C++ or something like that.
But yeah, we call it the Concurrency TS.
And so that's got things like atomic shared pointer. There are latches, which allow you to have threads that are waiting for a predefined number of events to happen, and then, oh yeah, everybody's ready, so now the whole rest of the system can go.
There are barriers, which are a bit similar, but (a) they're reusable, and (b) it's only the defined threads within the group that are waiting. So it's like: okay, first wait for the threads to initialize; they block on the barrier; when everybody's ready to go,
then we move on to process the data, and then do the next chunk.
That's something, actually, that people have been asking about for quite a while. It's been in Boost threads for a long time, but actually getting it through the standardization process is tricky. There are people like the guys from NVIDIA
who want to make sure that the base level of implementation
was doable in hardware,
because they've got nice hardware instructions
on some of their platforms for doing barriers.
And they wanted to make sure that the simplest implementation
would work with their hardware instructions.
So then there's also the flex barrier,
which has more complicated stuff.
So when everything's ready,
it does a callback on exactly one thread.
So: we're going to process the data
now that everybody's ready,
now that everybody's organized,
before we release the whole herd of threads
on to do the main bit of the processing.
And then the thing that's, from my point of view, actually quite a big and important aspect of the Concurrency TS is continuations.
This is an extension to the future library.
So you've got some future which you're waiting on, waiting for the data to come, and then you say, well, when this data is ready, then I want to do something else, and I'm going to spawn a new task to process that result. And so there's a new member function on your future, .then, which you pass a function or callable object,
and that then is invoked when the future becomes ready,
in whichever thread is providing the data.
Correct me if I'm wrong,
but has Microsoft already provided an implementation of that
with their concurrency library?
I think it's called PPL.
Yes.
I can't remember the details of PPL, but certainly there have been similar sorts of facilities provided from Microsoft. They've had various names and ways of doing it for quite a while. I think await was one thing that they had, similar sort of stuff.
And there's been lots of work from lots of places on continuations.
But yeah, with the Concurrency TS, it's being added to std::future and std::shared_future.
And then similarly there's when_any, which is like: well, you've got a bunch of futures; when one of them is ready, then I want to do something. And so you can gather a bunch of futures together into a single future with when_any, and then as soon as one of them becomes ready, your when_any result is ready, and then you can put a continuation on that. So when any of my futures is ready, then launch this task.
It enables quite high-level sorts of code. It's the sort of thing that people would use callbacks for, and it simplifies the whole process.
I think we talked to Hartmut Kaiser a while ago about concurrency, with the library he works on, and he seemed to think that futures in the current state of C++11,
and I apologize if I'm getting this wrong,
but I think he didn't seem to think they were very usable.
Do you think all the new features coming from the Concurrency TS
are really going to boost up that library?
Yes, I think so.
I mean, yeah, futures from plain C++11 can be hard to use.
Certainly the continuations are a very important aspect that I think is going to make them a lot more usable for people.
Right. And along with when_any we've got when_all as well, so you can wait until all of a bunch of futures are ready.
It simplifies waiting for events
and gathering things together from multiple threads
to do continued processing.
And it allows a sort of higher level,
so you can chain things on:
well, first I'm going to do this,
then I'm going to do that,
then I'm going to do the other.
And you just write that as a chained series of then calls.
And then you just need to wait for the final result, whereas doing that otherwise,
you're having to manually spin up threads or manually submit tasks to a thread pool,
and it can be a lot more complicated from the user's perspective.
You just brought up another topic I don't think we mentioned yet, thread pooling.
Is that something we could go into a little bit? Yeah, sure.
Yeah, I mean, thread pools actually are something that's been a bit of a controversial topic among the committee, trying to work out exactly how best to sort them out.
First off, I'll say that whereas the Concurrency TS is still in draft, alongside that there's the Parallelism TS, and that's essentially a parallel version of the STL. So you've got std::sort, but make it parallel; std::for_each, but make it parallel. And that will deal with lots of the cases people might want a thread pool for, because with basic STL-style algorithms you can just leave it up to the compiler and runtime to parallelize across whatever threads there are, and the runtime will completely manage the threads for that. So for lots of things that people might have wanted a thread pool for, that will just deal with it. And that is already out for voting. I can't remember whether it has now been officially approved as an ISO TS or whether it's still out for voting, but it was this year that it went out for voting.
Okay, that's the parallelism TS you're talking about?
That's the parallelism TS, yes.
Okay.
And then, aside from that, which actually has been relatively straightforward to do, the other side of thread pools is for the different tasks that people want to add. So if you're trying to do a pipeline, you might have various tasks: okay, the first stage I'm going to submit to my thread pool, and then post the second stage for another bit, and various things like that. That level of thread pooling has been quite controversial. The framework people talk about is executors, and that's been in and out of the Concurrency TS: it was originally voted in, then there were alternative proposals, and it's been ripped out again whilst people try to sort out the best way to approach it, because people have different needs. I mean, there are people coming from the asynchronous I/O community who want really, really low-overhead executors that they can pass around as part of their callbacks, to make sure you can have hundreds of little I/O handlers waiting for all the asynchronous I/O, so that as things come in, we get the first bit of our packet, we process that, and then we submit a new task to our executor to deal with the rest of it when the next bit comes in. And you want to make sure that if it's already ready, it runs straight away, without having to stick it on a queue, synchronize with the queue, and spool up a thread to run it: quite heavyweight things like that. So that's one side: people with really, really small, lightweight tasks who want to be able to throw loads and loads of tasks at the executor and have it just handle them with really low overhead. And then there are other people who are thinking: well, actually, we want a few big tasks at a time. On an eight-core machine we might have fewer than 20 tasks in progress at any time, big tasks that are going to occupy a processor core for a substantial chunk of processing. Those people have a slightly different way they want to look at things, and the interface they're after is slightly different. So trying to come up with a combined interface that works for everybody has been quite a challenging task.
So we've covered atomics, async, futures, promises, mutexes, thread pools. Where would you recommend that someone who's just getting started with asynchronous programming start? Should they start at the low level? Or, well, I guess you said promises are the highest-level thing that exists today?
Yeah, yes. Async, futures, promises: that's the highest level that's part of the current C++ standard library. I would say that if you're just starting, that's probably where you want to be aiming your stuff. Obviously, if you're porting an existing code base, then I would say the first thing you want to do is go through and look at the places where you've got accesses to shared variables from multiple threads. Some people are using platform specifics to do the atomic accesses, and I would say replace those platform specifics with std::atomic, just to make sure that's right. And at the higher level, use std::mutex, or at least check all your usages. But if you're developing new code, then yes, I would say you want to start with std::async, std::promise, and std::future.
To an extent, using raw threads... it's a bit of a shame that that's as far as we got in our standardization at C++11, because it's too low-level for most people. You really don't want to be writing your code with manual thread management if you can at all avoid it. But unfortunately, that's all we've provided people with at the moment.
So do you consider future to still be manual thread management, or no?
I think future in itself doesn't provide much in the way of thread management. std::async is all we've got at the moment on that front, which allows you to spawn a task. You can either say: spawn it definitely on its own thread. Or you can say: when I want the data, then I'll run the task, so that's deferred. Or you can say: let the implementation decide whether I've got enough spare processing power to run it on its own thread, and otherwise run it when the data is wanted. So that's very, very basic task management, and with continuations then obviously we get more on that, because you're scheduling more tasks as each bit of data comes in and is ready. But really, that's just the beginnings of proper thread and task management, which unfortunately is where you're then looking for executors. We're very keen to get them, but getting them in so that everybody's happy is a hard task. So at the moment: yes, we want async, we want promise and future. Use those in your new code.
But you want to look at things like Intel Threading Building Blocks, because they do provide some of the higher-level thread-pooling-type stuff. Some of the Microsoft concurrency runtime libraries similarly provide the higher-level building blocks. It's just unfortunate that we don't have them in the standard library yet.
Okay.
So we haven't even mentioned yet that you're the author of the book,
C++ Concurrency in Action.
I did read through it a bit before getting you on for the interview.
It's quite a tome.
I think it was like 528 pages in total.
I saw you had chapters on lock-based and lock-free concurrent data structures,
and I was wondering if you could go a little bit into your recommendations
on which to use and in what situation.
Yeah.
If you're writing your own, it's dead easy: do not write your own lock-free data structures, because getting them right is too hard. It might look easy, you might think it's straightforward, but there are so many gotchas, and really it's just not worth it. However, if you can find one that somebody has written for you, then use theirs. If you're using Intel Threading Building Blocks, I think they do have a couple of lock-free data structures; use those. If you're using my just::thread Pro library, there are a couple of lock-free data structures there. But if you're writing your own data structure, do not make it lock-free, because it is too hard for most people, and it's probably not worth the development time in getting it right. Obviously, if it really, really turns out to be a core bottleneck, then sit down with your local concurrency experts and work it out, and go through it and spend a dedicated chunk of time thinking through all the possible permutations of ordering.
Because, of course, the key thing about lock-free data structures, and the reason you use them, is that you've got multiple threads in your data structure at the same time, so if you change any of the shared state, that has repercussions for all the threads currently there. If you've got a lock-based data structure, then you take your lock and you can then modify your data structure in any way you like, including breaking all the invariants, provided that the invariants are re-established by the time you release your lock. Because the lock is mutual exclusion: it guarantees that only one thread is modifying your data structure at a time. And so once you've taken your lock, then until you hit that release, it's just you, just that one thread, and you can do what you like.
If you've got lock-free code, then every thread can see every bit of your data structure, so you have to make sure that your invariants are maintained the whole time, and making sure that happens can be really, really hard. In between every single instruction (and there can be multiple instructions in a given line of code), you've got to think: can my thread be interrupted here by another thread? If it is, what's that thread going to do? What's it going to see in the data structure? Can it change something that will mess me up? Is the data structure going to be coherent enough for the other thread when they look at it? In anything other than a trivial data structure, that's really tricky. I mean, there are some examples in my book, but the examples in my book are the most basic data structures, and even that gets quite complicated. So actually, one of the most difficult things with lock-free is making sure that you can ever release any of your memory, because you don't know who's still looking at your data. And going back to atomic shared pointer from earlier: if you use atomic shared pointer in your lock-free data structures, and your implementation provides a lock-free version of it, that can actually simplify your data structures quite considerably, because you don't have to worry about doing all the manual memory management; you can rely on the shared pointer doing that for you. Now, a lock-free atomic shared pointer is not always available. The new version of just::thread that will be coming out very shortly (I think my build server is currently in the process of building the new release) will have a version of atomic shared pointer which is lock-free. I think that's the first one that's publicly available, but I imagine there will be ones from Microsoft and one in the GCC standard library coming shortly.
Okay.
Is there anything else we should go over before we let you go, Anthony?
Yeah, I'm just having a quick check down my notes here.
Sure.
Yeah, I just want to go back to atomics. I just wanted to say: lots of the problems that people have with atomics come when they use the non-default memory orderings. You can use relaxed atomics, which have a lower overhead in terms of the CPU instructions, but they make synchronization a whole order of magnitude harder. And that's one of the aspects of lock-free data structures: you end up dealing with atomics with non-standard, non-default memory orderings. And I'd really, really recommend people stay away from that.
Okay.
Where can people find you online if they want to read more of your stuff, Anthony,
or find maybe this library that you work on?
Yeah, okay.
Well, I've got a blog at justsoftwaresolutions.co.uk/blog.
Okay.
The thread library, just::thread and just::thread Pro, is at stdthread.com, or .co.uk.
And I'm @a_williams on Twitter.
Okay.
I imagine that
covers most bases for most people.
Yeah.
Well, thank you so much for your time, Anthony.
Thank you very much.
It's been good to be here.
Thanks for joining us.
Thanks so much for listening as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic.
I'd love to hear that also.
You can email all your thoughts to feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter and like CppCast on Facebook.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.