CppCast - C++ Concurrency
Episode Date: September 29, 2015

Rob and Jason are joined by Anthony Williams to discuss some of the concurrency features of C++.

Anthony Williams is a UK-based developer and consultant with many years of experience in C++. He has been an active member of the BSI C++ Standards Panel since 2001, and is author or coauthor of many of the C++ Standards Committee papers that led up to the inclusion of the thread library in the new C++ Standard, known as C++11 or C++0x. He was the lead maintainer of Boost.Thread from 2006 to 2011, and is the developer of the just::thread implementation of the C++11 thread library from Just Software Solutions Ltd. Anthony lives in the far west of Cornwall, England.

News: C++ Core Guidelines, GSL Lite

Guest: Anthony Williams (@a_williams), Anthony Williams on StackOverflow

Links: C++ Concurrency in Action: Practical Multithreading; Just Software Solutions; just::thread; C++ Standard Thread Library
Transcript
This episode of CppCast is sponsored by JetBrains, maker of excellent C++ developer tools including
CLion, ReSharper for C++, and AppCode.
Start your free evaluation today at jetbrains.com/cppcast-cpp.
Episode 28 of CppCast with guest Anthony Williams recorded September 29th, 2015.
In this episode, we talk about the CPP Core Guidelines announcement.
Then we'll interview Anthony Williams, author of C++ Concurrency in Action. Anthony will tell us about all the concurrency features that were added in C++11 and 14,
and give us a peek at what might be coming in 17. Welcome to episode 28 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing great, Rob. How are you doing?
I'm doing great.
I know you have a bit of a cold still recovering from CppCon, right? I am. It was a great week,
but I did get a cold. It was a long week and so much to do. And we got a great reception from
our listeners at the conference. It was amazing how many people recognized me and were friends of the podcast. It
was a great experience. That's awesome. I hear you had no trouble giving out the shirts we made.
No, I'm sure I could have given away quite a bit more. I made a point of focusing on previous
guests of the podcast, but I know I missed a couple people. Sorry about that, if anyone's
listening. But, yeah.
Cool. Any other highlights you wanted to share before we go into the rest of the show?
Obviously, there's a lot to talk about, and I feel like we should try to have maybe some of the keynoters on to talk about the GSL and everything, which we'll touch on in the news items.
But anything else you wanted to mention?
Well, I guess, I mean, they're going to put all the videos up eventually.
Right now there's only four up, and I will say besides the obvious keynote ones,
Chandler Carruth's, which was, I guess, technically a plenary session, not a keynote. It's good fun on optimizing C++, and I do recommend watching it.
Yeah, so far I've watched Bjarne and Herb's talks.
I want to watch Sean Parent's and Chandler Carruth's later this week.
Yeah, they're all good.
But Chandler's has a couple of big surprises in it from my perspective.
Great.
Okay, well, at the top of every episode, I like to read a piece of feedback.
This tweet came in a couple days before CppCon.
As was mentioned during some of the keynotes, they published this new C++ Core Guidelines document on GitHub, and it got noticed prior to the conference. Thousands of people found the repo and were favoriting it, and I think they may even have started branching off it before the conference even started, before anyone had explained what it was.
Right, Jason?
Yes.
It got mentioned by both Bjarne and Herb Sutter.
And for a while there, Herb made a point of pointing out that the repo was one of the top trending ones on GitHub, which was cool.
Yeah. So anyway, this tweet was from Andrew Karpoff, and he writes:
hey, take a look at the CppCoreGuidelines. Definitely worth mentioning on your show.
And it absolutely is. And I think there's going to be a lot for us to dig into over the next few episodes with all this new content. We'd love to hear your thoughts for the show. You can
always email us at feedback@cppcast.com. Follow us on Twitter at twitter.com/cppcast, or like us on Facebook at facebook.com/cppcast, and you can always review us on iTunes as well.
So joining us today is Anthony Williams. Anthony is a UK-based developer and consultant with many
years of experience in C++. He's been an active member of the BSI C++ Standards Panel since 2001, and is the developer of the just::thread implementation of the C++11 thread library from Just Software Solutions Limited.
Anthony lives in the far west of Cornwall, England.
Anthony, welcome to the show.
Hi.
It's great to have you here.
Yeah, it's good to be here.
I am curious what the BSI standards panel is.
That doesn't ring a bell to me.
Well, each national body has their own representative. So in the US it's INCITS, where the national experts come together to have a discussion, or at least those people that can be bothered to go to the meetings.
In the UK, BSI is the standards panel. And so when the international standards meeting is held,
there will be various representatives from BSI there.
And the UK has a vote, as does the US.
And various others: France will have a vote, and Germany will have a vote.
I forget what the acronyms for their standards bodies are.
Okay.
Okay, so these are different regionalized panels for the ISO C++ group, basically?
Yeah, that's right, yes.
Okay, very interesting. I didn't know about that either.
So a couple of news items.
Obviously, the big one to talk about, as we kind of already mentioned with the feedback,
was the C++ core guidelines that got announced during Bjarne's keynote at CppCon.
So, Jason, have you had time to really dig into this at all?
I'm guessing not since you were at the conference.
Yeah, and there's so much information there.
I read one comment from someone that pointed out that it was so dense,
and that was my takeaway, too, when I first looked at it.
But if you watch the talks, the keynotes from the conference, what you get is this feel that the guidelines aren't necessarily meant 100% for humans. A large portion of them are meant for static analyzers, to teach our analysis tools how to catch memory problems and other core problems that have been issues in C++.
Right. And it sounds like Microsoft and the Visual Studio team are working on the first implementation of an analyzer built off of these core guidelines, and that's going to be released whenever VS 2015 Update 1 comes out. Is that right?
Something like that. They said look for it next month, and I don't know if that's necessarily coming with Update 1 or not. I was unclear on that personally.
Okay, I thought during Herb's talk they mentioned the Update 1 time frame.
Oh, maybe. Yeah, I'm not sure.
Okay. Anthony, have you had a chance to look into these guidelines at all?
Yeah, well, I saw the announcements on Twitter and thought, oh, this looks interesting.
Certainly, I mean, some of them are straightforward things that everybody who's followed C++ at all is likely to know, like no naked new: use your smart pointers and stuff.
Right. But there's quite a lot of stuff that's new in there, like tagging owners.
For passing raw pointers around,
you tag it with an owner type template,
so you can say
this is a raw pointer that owns its resource,
as opposed to this one over here, which isn't.
Or tagging raw pointers with not_null,
so it can be checked with a static analyzer,
and you don't need to then put an active test in your code,
because you can just rely on it being not null.
Right.
And they were very big on pointing out repeatedly that raw pointers are not evil by themselves.
Just don't pass around pointers where you don't know who owns them.
Right, yes.
Yes. So there was a lot about the pointers.
I noted that there are a lot of entries in the index of things that they intend to cover.
So I was having a look down, thinking, oh yeah, well, they've got a section on concurrency guidelines, and what's going to be in there?
Well, actually, there's not a lot there at the moment.
It's all waiting to be fleshed out.
It says at the top it's a work in progress.
Yeah, and I guess that's a big thing to keep in mind with all this: they're asking for contributions to add new rules,
to build out the library.
So if you have any interest in working on this,
they're definitely looking for help from the community.
And speaking of community help, this next item, not really an article but another GitHub library, is someone went ahead and made a GSL (Guideline Support Library) implementation for C++98 to C++11.
I guess the one Microsoft is working on maybe only targets C++14 and up,
but if you're using an older code base
and you haven't upgraded past C++11 yet,
then this library might be useful for you.
Not sure if there's really much else to mention,
but it's kind of the same type of stuff that's going to be going on in the GSL,
but implemented using just standard C++98 to 11 features.
Well, they are encouraging many people to make their own implementations of the GSL. There was a comment from Herb, I guess, about how you don't really know the problems in a standard specification until multiple people try to implement the standard.
So that's what they're hoping for.
Yeah, so Microsoft is taking the first stab at it, but I'm guessing the Clang and GCC communities won't be far behind, hopefully.
Okay, so Anthony, we have a lot to talk about today, I think.
To start off, maybe we should just talk a little bit about why concurrency is becoming increasingly important in C++ applications.
Yeah, sure. I mean, one of the reasons that people choose to use C++ in general is because they want the raw performance that you can get whilst having the ability to use the high-level abstractions of object-oriented code and things like that.
So you're looking at places where people are after performance.
Now, increasingly over the last decade or so, the processors that we're using are giving us that performance in the form of more cores rather than higher clock speeds. The clock speeds have pretty much topped out. So these days it's not uncommon for a desktop machine to have 8 cores, some of them have 16, and even mobile phones often have quad cores or 8 cores. So if you're after raw performance, then you're looking at multi-threading and concurrency.
And so you need to be able to write your code to take advantage of that.
Right, and just looking back a bit: in the 80s and 90s, I guess maybe this wasn't as important a concern, because we were just using single-core processors and they just kept getting faster and faster each year, right?
Yeah, that's right. I mean, obviously the high-performance computing community have been doing this sort of stuff for years, because they're working on systems where, if there weren't multi-core CPUs, then they had multiple CPUs attached to each other on the same board. And these days, if you look at the supercomputers, they've got, I think, thousands of CPUs attached together.
So they've been doing this for years, and it's just that in the mainstream it's becoming far more important.
Right. So what are some of the key improvements for concurrency that were added with C++11 and 14?
Okay, well, C++11 was the first C++ standard that actually acknowledged the existence of multiple threads and concurrency.
Prior to C++11, if you tried using threads or concurrency in any form within your program,
then you were relying on your platform-specific extensions.
So yes, on a POSIX platform, you could use the POSIX C library.
On Windows, you could use Windows threads.
But that was entirely proprietary,
and you just had to hope that they'd managed to integrate it properly with C++.
With the C++11 standard coming out, that became part of the standard, and the key part of that was actually the memory model, which determined: if you do this in one thread and that in another thread, then is it synchronized?
If you write to a variable over here, can you see it on a different thread over there?
Specifying that was a very important part of the C++11 standard.
And that was a lot of work from a lot of serious experts, trying to make sure that it not only worked but also was implementable and efficient across a wide variety of processor platforms. So yeah, that's the key thing that we got from C++11.
And then there are all the high-level wrappers: we've got the standard thread class, the standard mutex class, std::async and std::future for high-level things, standard condition variable. That was the base-level implementation of threading facilities in C++.
When you move forward to C++14, we actually didn't add a lot. The key addition there was shared mutex,
which people might notice is a read-write mutex.
So you can have one thread that's doing the writing, and then nobody can read it.
But if people are reading,
you can have multiple threads reading simultaneously.
So sometimes that can provide efficiency
when you've got something that's very rarely updated.
I'm a little confused about shared mutex specifically.
It seems like it was added in 14, but only as shared_timed_mutex or something like that.
And then in C++17, they're adding a more generic version.
Am I just confused?
No, no, you're right.
In the 14 standard, we've got shared_timed_mutex,
and then coming up for 17, we've also got shared_mutex.
This is by analogy with the standard mutexes.
We've got std::mutex and std::timed_mutex,
where you can have timeouts on your lock.
So: I'm going to try and lock this,
but if I don't get it within 30 milliseconds,
then I'll give up and do something else. That's what the timed aspect is.
So with shared mutex, originally it was like, well, we're going to go for the full-blown thing with all the timeouts. And then Microsoft in particular said, hang on, on our platform we can implement it more efficiently without the timeouts, if you don't need them.
So for C++14 we said, well, hang on a minute, we don't have time to add that now, but we'll just change the name to make it symmetric with all the others. So we've got std::timed_mutex and recursive_timed_mutex, and now we've got shared_timed_mutex.
And for 17, we've actually added in
just the straightforward untimed
version, and the name
hasn't already gone. So it can just fit in
with the name you might expect: shared_mutex.
And then if you don't need the timeouts, you can just use that.
Okay.
So going back to
different threading implementations,
if you're already working on an application
that maybe started off in C++98
and you're using something like Boost Thread
or your own library built on top of POSIX threads
and Windows threads,
would there be any gains from switching over
to the standard thread implementation?
I think I'd start off by saying: if you're using Boost threads and you keep your version of Boost up to date, which I know some people do and some people don't, then the advantage from switching to standard thread is probably fairly minimal, unless your implementation is providing additional features or better optimizations than Boost does.
Obviously, if you're thinking about optimizations, you want to profile it.
But Boost is portable across so many platforms.
If you're already using Boost, there's not necessarily a lot to be gained from switching to the std version. But if you're using the raw threading facilities provided by your platform, then I would definitely say that it is worth switching to the standard library version, because that provides the portability which obviously the platform specifics don't.
But then again, if it works, there's no need to change, to go through and say, right, we're just throwing out all our POSIX threading stuff. Obviously, if you're going to make a change to that part of your application, then it might be worth thinking about making the change to the standard library usage as well. But if you're not changing that, then that can be a lot of work and there's not necessarily any gain.
On some levels, the facilities that you've got, the higher-level ones, might be easier to use. So standard condition variable is easier to use than a POSIX condition variable,
because you can pass the predicate directly,
and you don't have to worry about spurious wakeups.
And things like promises and futures just aren't available outside of either standard library implementations
or things like Boost, which provided their own.
All right, since you just mentioned promises and futures,
would you like to give our listeners an overview of those?
Yeah, sure.
At the basic level, it's a means of providing communication between two threads,
a safe way of passing data between two threads.
So you've got the promise half, which one thread looks after
and can set the value on when it is ready. And then the future half is an object that you can just pass around, and you can then use that to wait for the data to come from the promise.
So you can use that as a basic synchronization. If you had a promise of void and a future of void, then that's just a straightforward tick-the-box, yes, I'm ready.
Or alternatively you could pass through some data of any amount: it could be a vector you're passing, it could be a string, it could be any sort of object that you're creating on one thread and then
passing through to another thread somewhere else in the system. And you don't need to
have any direct access to a shared variable, because all the synchronization is done internally
in the future.
Okay.
So we're talking a little bit about sharing data between threads right now. Are
there any other new tools for sharing data that were added with C++11?
The basic tools that we've got: so we've got futures and promises.
As well as promises to give you futures, we've got std::async,
which spawns a task, which would generally be running on its own thread. And then when the function returns,
the value is stored in the future. So it's like the most simple possible way
of implementing thread-pool-type stuff, because you call std::async, it spawns your task on its new thread,
and then you can wait for the result to come back.
So there's that and the promises.
That's the highest level of synchronization we've got.
And then also we've got call_once, which is for one-time initialization,
to ensure that some function is run only once to initialize data.
I'm going to say the dreaded word of singleton, but it is a safe way:
you could use call_once to initialize your singletons. But you might
use it in other contexts too, because if you've got multiple objects
of each type, but each instance
has some
lazy-initialization-type
aspect that you only want to happen
once, but
your object might be accessed
many times,
you can use call_once
to ensure that each object is
initialized only once.
Does something like that have a high amount of overhead?
Is there an expensive check every time call_once is interrogated, to see if it's already been called?
There has to be a check, obviously.
But the cost varies quite considerably between implementations. A very generic implementation will use a mutex under the hood, which therefore imposes a large cost. Whereas a highly optimized implementation can do it with straightforward atomics and very carefully optimized operations, so that if the function's already been called,
then it's a very, very quick check.
And the only time that there's any expense
is when you've got multiple threads competing
to try to see who's going to be first.
For over a decade, ReSharper has been helping .NET developers
be more productive and write better code in Microsoft's Visual Studio.
Starting this year, it expands the list of languages it supports by adding a dedicated edition for C and C++ developers. To put it simply, ReSharper
C++ makes Visual Studio a much better IDE for C++ developers. It provides
on-the-fly code analysis, quick fixes, powerful search and navigation, smart code completion,
automated refactorings, a wide variety of code generation options, and a host of other
features to help increase your everyday productivity.
The list of things that ReSharper C++ can generate for you includes definitions, missing and overriding members, equality and relational operators, and hash and swap functions. Add dozens of customizable templates, and you'll deal with boilerplate code faster than ever.
Code refactoring for C++ helps change your code safely, while context actions let you
switch between alternative syntax constructs and serve as shortcuts to code generation
actions.
With ReSharper C++, you can instantly jump to any file, type, or type member in a solution.
You can search for usages of any code and get a clear view of all found usages with grouping and preview options.
Visit jetbrains.com/cppcast-resharper-cpp to learn more and download your free 30-day evaluation.
And if you're a student or an open source project,
get your license for free, courtesy of JetBrains. Okay. Well, now you just went and mentioned
Atomic. I think that's the first time that came up. So let's talk about that for a minute.
Yeah, sure. So atomics are the lowest level of synchronization that's provided in the C++ library.
And you can create an atomic.
It's a template, so std::atomic of some type.
So std::atomic of int or char or whatever.
And you can have std::atomic of any POD struct.
PODs, PODs specifically.
Yes, so you...
Oh, sorry, go ahead.
The requirements are that
it must be trivially
destructible
and trivially
assignable,
and
comparable with memcmp.
So the point is that
when you do any operations on it,
the library is going to use the raw bits
and not call any user code
when it's doing the assignments and the comparisons.
Because mostly that's going to be a single processor instruction,
you're hoping.
Okay.
Obviously, if you try and stick in a struct that's got an array of a million ints in it, then yes, you can do that. Atomic will let you, but it's not going to be a single instruction; that's probably going to be a spin lock that it uses internally to guarantee that it's atomic. Whereas a struct that contains just a single int will probably just use the atomic operation on that int
that's provided by the processor.
Okay, so let's dig in a little bit more.
What do we actually get from using standard atomics?
Okay.
Yeah, well, if you use std::atomic with all the default parameters,
so std::atomic of int, and then just assign to it and read from it as if it was a normal int as far as possible, then that guarantees that your operations from multiple threads occur in some order that's unknowable at compile time, but some definite order. So if you assign from one thread and then read from another, then either the assignment will happen before the read or vice versa, but there won't ever be a clash.
And this is particularly good if you use the read-modify-write instructions, like fetch_add, or even just the ++ operator, which is provided for integer types. Then you know that if you call ++ from two different threads, you will add two.
Whereas if you try and do that on a raw int, you might actually end up with a random number, because of some form of internal tearing of the writes to memory, or you might end up either adding nothing or only adding one.
So it guarantees that any writes are interleaved in some order.
That's the key benefit of using std::atomic.
Okay.
At the last session of CppCon,
there was some mention, I believe,
of atomic shared pointer becoming part of the standard.
And I'm just confused as to what atomic shared pointer gives us.
Yes.
I know it's a separate type.
When it was being initially discussed, people wanted it to be a specialization of std::atomic:
std::atomic<std::shared_ptr<whatever>>.
Okay.
But people said,
hang on a minute,
shared_ptr is not a POD.
So that seems a bit unusual.
So we've got the whole type, atomic_shared_ptr. If it's a shared variable that's shared between multiple threads, then you can do reads and writes to your atomic_shared_ptr instance from all these different threads without any external synchronization. Whereas if that was a plain shared_ptr, then you'd have to have external synchronization, like a mutex.
So do you mean accesses to the data that the shared pointer points to?
Or do you mean modifying the shared pointer itself?
Or do you mean modifying the shared pointer itself?
Modifying the shared pointer itself.
Okay.
The most common sort of thing for that is where the shared pointer is the root pointer into some form of data structure, like a list or a tree.
And then you're adding nodes to that data structure, and so you're having to change what the root is. Because either
it's a linked list and you're adding something at the beginning, or it's a tree and you've rebalanced
it, so the root node is now different. So if you're changing that owning node
of your data structure,
then you're going to have to write to that shared pointer.
And if you're then trying to read from that root from another thread,
then you've got a data race.
And so you need to either put an external mutex around it
or use an atomic shared pointer.
Okay. And so this isn't about the rest of the pointed-to data structure. Obviously, once you've
dereferenced your pointer, you've got to make sure that the data that you're then accessing is
also safe. But this is just about that pointer itself, the shared pointer instance.
Interesting.
Okay, so that is one new feature that might be coming with C++17.
Are there any other things on the horizon with C++17 in concurrency?
Yeah, well, the big thing that's coming is, at the moment,
things like atomic shared pointer aren't directly in the C++17 draft.
What we're working on is colloquially known as the Concurrency TS.
There's a longer name, it's the Draft Technical Specification for Concurrency in C++ or something like that.
But yeah, we call it the Concurrency TS.
And so that's got things like atomic shared pointer. There are latches, which allow you to have threads that are waiting for a predefined number of events to happen, and then, oh yeah, everybody's ready, so now the whole rest of the system can go.
There are barriers, which are a bit similar, but (a) they're reusable, and (b) it's only the defined threads within the group that are waiting. So it's like: okay, first wait for the threads to initialize; they block on the barrier; when everybody's ready to go,
then we move on to process the data, and then do the next chunk.
That's something, actually, that people have been asking about for quite a while. It's been in Boost threads for a long time, but actually getting it through the standardization process is tricky. There are people like the guys from NVIDIA
who want to make sure that the base level of implementation
was doable in hardware,
because they've got nice hardware instructions
on some of their platforms for doing barriers.
And they wanted to make sure that the simplest implementation
would work with their hardware instructions.
So then there's also the flex barrier,
which has more complicated stuff.
So when everything's ready,
it does a callback on exactly one thread.
So: we're going to process the data
now that everybody's ready,
now that everybody's organized,
before we release the whole herd of threads
on to do the main bit of the processing.
And then the thing that's, from my point of view, actually quite a big and important aspect of the Concurrency TS is continuations.
This is an extension to the future library.
So you've got some future which you're waiting on, waiting for the data to come, and then you say, well, when this data is ready, then I want to do something else, and I'm going to spawn a new task to process that result. And so there's a new member function on your future, .then, which you pass a function or callable object,
and that then is invoked when the future becomes ready,
in whichever thread is providing the data.
Correct me if I'm wrong,
but has Microsoft already provided an implementation of that
with their concurrency library?
I think it's called PPL.
Yes.
I can't remember the details of PPL, but certainly there have been similar sorts of facilities provided from Microsoft. They've had various names and ways of doing it for quite a while. I think await was one thing that they had, similar sort of stuff.
And there's been lots of work from lots of places on continuations.
But yeah, with the Concurrency TS, it's being added to std::future and std::shared_future.
And then similarly there's when_any, which is like: well, you've got a bunch of futures; when one of them is ready, then I want to do something. And so you can gather a bunch of futures together into a single future with when_any, and then as soon as one of them becomes ready, your when_any result is ready, and then you can put a continuation on that. So when any of my futures is ready, then launch this task.
It enables quite high-level sorts of code. It's the sort of thing that people would use callbacks for, and it simplifies the whole process.
I think we talked to Hartmut Kaiser a while ago about concurrency, with the library he works on, and he seemed to think that futures in the current state of C++11,
and I apologize if I'm getting this wrong,
but I think he didn't seem to think they were very usable.
Do you think all the new features coming from the Concurrency TS
are really going to boost up that library?
Yes, I think so.
I mean, yeah, futures from plain C++11 can be hard to use.
Certainly the continuations are a very important aspect that I think is going to make them a lot more usable for people.
Right. And along with when_any we've got when_all as well, so you can wait until all of a bunch of futures are ready.
It simplifies waiting for events
and gathering things together from multiple threads
to do continued processing.
And it allows a sort of higher level,
so you can chain things on:
well, first I'm going to do this,
then I'm going to do that,
then I'm going to do the other.
And you just write that as a chained series of then calls.
And then you just need to wait for the final result, whereas doing that otherwise,
you're having to manually spin up threads or manually submit tasks to a thread pool,
and it can be a lot more complicated from the user's perspective.
You just brought up another topic I don't think we mentioned yet, thread pooling.
Is that something we could go into a little bit? Yeah, sure.
Yeah, I mean, thread pools actually are something that's been a bit of a controversial topic among the committee, trying to work out exactly how best to sort them out.
First off, I'll say that whereas the Concurrency TS is still in draft, alongside that there's the Parallelism TS, and that's essentially a parallel version of the STL. So you've got std::sort, but make it parallel; std::for_each, but make it parallel. And that will deal with lots of the cases people might want a thread pool for, because with basic STL-style algorithms you can just leave it up to the compiler and runtime to parallelize across whatever threads there are, and the runtime will completely manage the threads for that. So for lots of things that people might have wanted a thread pool for, that will just deal with it. And that is already out for voting. I can't remember whether it has now been officially approved as an ISO TS or whether it's still out for voting, but it was this year that it went out for voting.
Okay, that's the parallelism TS you're talking about?
That's the parallelism TS, yes.
Okay.
And then, aside from that, which actually has been relatively straightforward to do, the other side of thread pools is for the different tasks that people want to add. So if you're trying to do a pipeline, you might have various tasks: okay, the first stage I'm going to submit to my thread pool, and then post the second stage for another bit, and various things like that. That level of thread pooling has been quite controversial. The framework people talk about is executors, and that's been in and out of the Concurrency TS: it was originally voted in, then there were alternative proposals, and it's been ripped out again whilst people try to sort out the best way to approach it, because people have different needs. I mean, there are people coming from the asynchronous I/O community who want really, really low-overhead executors that they can pass around as part of their callbacks, to make sure you can have hundreds of little I/O handlers waiting for all the asynchronous I/O, so that as things come in, we get the first bit of our packet, we process that, and then we submit a new task to our executor to deal with the rest of it when the next bit comes in. And you want to make sure that if it's already ready, it runs straight away, without having to stick it on a queue, synchronize with the queue, and spool up a thread to run it: quite heavyweight things like that. So that's one side: people with really, really small, lightweight tasks who want to be able to throw loads and loads of tasks at the executor and have it just handle them with really low overhead. And then there are other people who are thinking: well, actually, we want a few big tasks at a time. On an eight-core machine we might have fewer than 20 tasks in progress at any time, big tasks that are going to occupy a processor core for a substantial chunk of processing. Those people have a slightly different way they want to look at things, and the interface they're after is slightly different. So trying to come up with a combined interface that works for everybody has been quite a challenging task.
So we've covered atomics, async, futures, promises, mutexes, thread pools. Where would you recommend that someone who's just getting started with asynchronous programming start? Should they start at the low level? Or, well, I guess you said promises are the highest-level thing that exists today?
Yeah, yes. Async, futures, promises: that's the highest level that's part of the current C++ standard library. I would say that if you're just starting, that's probably where you want to be aiming your stuff. Obviously, if you're porting an existing code base, then I would say the first thing you want to do is go through and look at the places where you've got accesses to shared variables from multiple threads. Some people are using platform specifics to do the atomic accesses, and I would say replace those platform specifics with std::atomic, just to make sure that's right. And at the higher level, use std::mutex, or at least check all your usages. But if you're developing new code, then yes, I would say you want to start with std::async, std::promise, and std::future.
To an extent, using raw threads... it's a bit of a shame that that's as far as we got in our standardization at C++11, because it's too low-level for most people. You really don't want to be writing your code with manual thread management if you can at all avoid it. But unfortunately, that's all we've provided people with at the moment.
So do you consider future to still be manual thread management, or no?
I think future in itself doesn't provide much in the way of thread management. std::async is all we've got at the moment on that front, which allows you to spawn a task. You can either say: spawn it definitely on its own thread. Or you can say: when I want the data, then I'll run the task, so that's deferred. Or you can say: let the implementation decide whether I've got enough spare processing power to run it on its own thread, and otherwise run it when the data is wanted. So that's very, very basic task management, and with continuations then obviously we get more on that, because you're scheduling more tasks as each bit of data comes in and is ready. But really, that's just the beginnings of proper thread and task management, which unfortunately is where you're then looking for executors. We're very keen to get them, but getting them in so that everybody's happy is a hard task. So at the moment: yes, we want async, we want promise and future. Use those in your new code.
But you want to look at things like Intel Threading Building Blocks, because they do provide some of the higher-level thread-pooling-type stuff. Some of the Microsoft concurrency runtime libraries similarly provide the higher-level building blocks. It's just unfortunate that we don't have them in the standard library yet.
Okay.
So we haven't even mentioned yet that you're the author of the book,
C++ Concurrency in Action.
I did read through it a bit before getting you on for the interview.
It's quite a tome.
I think it was like 528 pages in total.
I saw you had chapters on lock-based and lock-free concurrent data structures,
and I was wondering if you could go a little bit into your recommendations
on which to use and in what situation.
Yeah.
If you're writing your own, it's dead easy: do not write your own lock-free data structures, because getting them right is too hard. It might look easy, you might think it's straightforward, but there are so many gotchas, and really it's just not worth it. However, if you can find one that somebody has written for you, then use theirs. If you're using Intel Threading Building Blocks, I think they do have a couple of lock-free data structures; use those. If you're using my just::thread Pro library, there are a couple of lock-free data structures there. But if you're writing your own data structure, do not make it lock-free, because it is too hard for most people, and it's probably not worth the development time in getting it right. Obviously, if it really, really turns out to be a core bottleneck, then sit down with your local concurrency experts and work it out, and go through it and spend a dedicated chunk of time thinking through all the possible permutations of ordering.
Because, of course, the key thing about lock-free data structures, and the reason you use them, is that you've got multiple threads in your data structure at the same time, so if you change any of the shared state, that has repercussions for all the threads currently there. If you've got a lock-based data structure, then you take your lock and you can then modify your data structure in any way you like, including breaking all the invariants, provided that the invariants are re-established by the time you release your lock. Because the lock is mutual exclusion: it guarantees that only one thread is modifying your data structure at a time. And so once you've taken your lock, then until you hit that release, it's just you, just that one thread, and you can do what you like.
If you've got lock-free code, then every thread can see every bit of your data structure, so you have to make sure that your invariants are maintained the whole time, and making sure that happens can be really, really hard. In between every single instruction (and there can be multiple instructions in a given line of code), you've got to think: can my thread be interrupted here by another thread? If it is, what's that thread going to do? What's it going to see in the data structure? Can it change something that will mess me up? Is the data structure going to be coherent enough for the other thread when they look at it? In anything other than a trivial data structure, that's really tricky. I mean, there are some examples in my book, but the examples in my book are the most basic data structures, and even that gets quite complicated. So actually, one of the most difficult things with lock-free is making sure that you can ever release any of your memory, because you don't know who's still looking at your data. And going back to atomic shared pointer from earlier: if you use atomic shared pointer in your lock-free data structures, and your implementation provides a lock-free version of it, that can actually simplify your data structures quite considerably, because you don't have to worry about doing all the manual memory management; you can rely on the shared pointer doing that for you. Now, a lock-free atomic shared pointer is not always available. The new version of just::thread that will be coming out very shortly (I think my build server is currently in the process of building the new release) will have a version of atomic shared pointer which is lock-free. I think that's the first one that's publicly available, but I imagine there will be ones from Microsoft and one in the GCC standard library coming shortly.
Okay.
Is there anything else we should go over before we let you go, Anthony?
Yeah, I'm just having a quick check down my notes here.
Sure.
Yeah, I just want to go back to atomics. I just wanted to say: lots of the problems that people have with atomics come when they use the non-default memory orderings. You can use relaxed atomics, which have a lower overhead in terms of the CPU instructions, but they make synchronization a whole order of magnitude harder. And that's one of the aspects of lock-free data structures: you end up dealing with atomics with non-standard, non-default memory orderings. And I'd really, really recommend people stay away from that.
Okay.
Where can people find you online if they want to read more of your stuff, Anthony,
or find maybe this library that you work on?
Yeah, okay.
Well, I've got a blog at justsoftwaresolutions.co.uk/blog.
Okay.
The thread library, just::thread and just::thread Pro, is at stdthread.com, or .co.uk.
And I'm @a_williams on Twitter.
Okay.
I imagine that
covers most bases for most people.
Yeah.
Well, thank you so much for your time, Anthony.
Thank you very much.
It's been good to be here.
Thanks for joining us.
Thanks so much for listening as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic.
I'd love to hear that also.
You can email all your thoughts to feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter and like CppCast on Facebook.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.