CppCast - Sequence-Oriented Programming
Episode Date: July 7, 2023

Tristan Brindle joins Timur and Phil. Tristan talks to us about a safer alternative to iterators and his library, Flux, that implements it.

News
- 60 terrible tips for a C++ developer
- Big, combined committee trip report from Varna
- CLion 2023.2 EAP4: AI Assistant
- "Making C++ Memory-Safe Without Borrow Checking, Reference Counting, or Tracing Garbage Collection"

Links
- Episode 78 of CppCast, mentioning Cling
- Episode of ADSP recorded at C++ on Sea
- Episode 152 of CppCast, with Tristan and the C++ London Uni team
- Flux on GitHub
Transcript
Episode 364 of CppCast with guest Tristan Brindle, recorded 3rd of July 2023.
This episode is sponsored by JetBrains, smart IDEs to help with C++.
And Sonar, the home of clean code. In this episode, we talk about what happened at C++ on Sea,
CLion's new AI assistant,
and different techniques for memory safety.
Then, we are joined by Tristan Brindle.
Tristan talks to us about sequence-oriented programming.
And Flux, a C++20 library that supports it.
Welcome to episode 364 of CppCast,
the first podcast for C++ developers by C++ developers.
I'm your host, Timur Doumler, joined by my co-host, Phil Nash.
Phil, how are you doing today?
I'm all right, Timur. How are you doing?
I'm not too bad. I'm just still recovering from C++ on Sea.
It was amazing, but I was still quite tired.
I didn't get very much sleep last week.
But yeah, it was an amazing conference.
So I'm still kind of in recovery mode, taking it slow.
How are you? You're probably even more tired, because it's your conference, isn't it? I was there as well.
Yes, I did have a slightly different perspective on it. Very tired, but in a good way.
I think it went very well overall. A couple of technical issues right at the start, but I mean,
that's quite common with these things. It went pretty smoothly after that. And we had about 50 more people than last year, so it's just back to growth again. So, exhausted
but happy, I think. All right, amazing. Actually, one thing that happened at C++ on Sea is that another
podcast, ADSP, the one run by Connor and Bryce, they had a special episode, episode 136, and they recorded
that at the speaker's dinner
at the conference, like in this big room with all these people. And they actually
did this fun thing where they interviewed all the other people in the room who were
running podcasts. So Phil and I are on it. I had, I think, quite a few beers at the time when they
interviewed me, so I'm not sure if the things I was saying there
made a lot of sense, but yeah, I think it was quite an entertaining episode. So shout out to
ADSP. Thanks for doing that. I think it was a lot of fun. Yeah. And that episode's already released.
Connor was editing it during one of the other talks, I think. So you can go and listen to it
now. We'll put the link in the show notes. All right. So at the top of every episode,
I'd like to read a piece of feedback. And this time we have a comment from Jamie on Reddit.
Jamie says, I'm new to the whole podcasting thing, and I was glad to find CppCast.
It's a great way to get exposure to all the projects and tools available in a C++ ecosystem.
I've been going through the episodes from the beginning, and the ChaiScript episode made me
think of the CERN ROOT framework's interpreter, Cling, a kind of REPL for C++ that allows you to prototype interactively, a bit like Python or Node.
In my opinion, it's kind of surprising it doesn't get more exposure. The original application was
for data analysis, but I think it has a lot of potential in more general settings. Given that
it's based on the LLVM Clang stack, now as a more general purpose tool called Clang REPL,
it would be really cool to have an episode about this. Well, thank you, Jamie. That's a really good piece of feedback.
I went digging actually before we recorded this episode and I found that Cling was actually
mentioned briefly in the news section of episode 78, which was all the way back on 9th of November
2016. That was the one with Odin Holmes. And since then, from what I can tell, it really
hasn't been mentioned on the show. So I think it's time to look into that direction. What do you
think, Phil? It sounds like a kind of exciting thing to talk about on the show. Yeah, I'm glad
you dug into that because I did seem to remember it being mentioned on the show, but I couldn't
remember exactly when. So there we have it, episode 78. Definitely seems like something that,
you know, it's time may have come again. So we'll have a look into that.
Thanks for the suggestion.
I think it would be tying in really nicely
into our little mini series on tooling
that we're kind of doing, right?
Absolutely.
So this episode is not part of that,
but we're kind of interleaving them
with episodes about other topics.
So it's not just about tooling,
but we have this loose plan
to have kind of more episodes about tooling.
So I think that would tie really nicely into that.
Yep.
All right.
So we'd like to hear your thoughts about the show.
And you can always reach out to us on Twitter or Mastodon or email us at feedback at cppcast.com.
Joining us today is Tristan Brindle.
Tristan is a C++ consultant and trainer based in London. With over 15 years' C++ experience,
Tristan started his career working in high-performance computing
in the oil industry in Australia
before returning home to his native UK in 2017.
He is an active member of the ISO C++ Standards Committee
and the BSI C++ Panel.
He is a regular speaker at C++ conferences around the world
and is a former director of C++ London Uni,
a non-profit initiative offering free introductory programming classes in London and online.
Tristan, welcome to the show.
Thank you very much. It's great to be here. Thank you for inviting me.
So, Tristan, you mentioned C++ London Uni in your bio.
Now, I know a thing or two about that because I was there when you set it up.
But for those that didn't listen to the episode that you were on before, when you talked about that,
you've got to tell us briefly what it was.
Yeah, so this was a spin-off from C++ London, which is the
meetup in London run by a certain Phil Nash. And, I mean, like most meetups, they're kind of, you know, aimed at professional developers.
And one of the people who was coming along was a C++ learner, a guy by the name of Tom Breza.
And he was keen on getting some more sort of introductory-level material in there, and
Phil suggested that perhaps this could be a sort of spin-off event.
And I kind of volunteered to help out with that and do some teaching.
And so we spun this off into C++ London Uni, where we did exactly that.
Offered introductory classes, weekly classes, starting from scratch, teaching C++.
And yeah, it was a lot of fun.
Unfortunately, as with many things, COVID sort of put paid to it.
We couldn't have our weekly meetups and it all sort of went away, which is a bit sad,
but it was lots of fun while it lasted.
All right, Tristan, so we'll get more into your work in just a few minutes, but we have
a couple of news articles to talk about. So feel free to comment on any of these, okay? Sure.
So the first one we have is: Andrey Karpov from PVS-Studio wrote a mini-book called
60 Terrible Tips for a C++ Developer. It's available online for free. It's really cool.
It's kind of fun and serious at the same time. He has these 60 terrible tips, which are written very sarcastically.
For example, one terrible tip is: use nested macros everywhere, because it's a good way
to shorten code, you will free up hard drive space, and your teammates will have lots of fun
when debugging. So there are like 60 of those, but then below each one there's actually an
explanation, and then an actually good tip, what you should be doing instead and why the terrible tip is bad.
And so I thought that was kind of fun. It's a very humorous resource, but also very
educational at the same time. So yeah, I thought that was fun and just wanted to recommend checking it out.
Yeah, I didn't read them all. I dipped into a few of them, and yeah, I agree. Past that initial sort of humorous device of presenting the terrible tip, which didn't really get in the way too much,
it is followed up with some quite sensible advice that's well written. And I picked a few
that I often see oversimplified or not really fully understood, like the ones in
there about floating-point comparisons, and there's one on globals, and they had pretty good coverage. So yeah,
thumbs up, both entertaining and useful.
Same as Phil, you know, I haven't read all 60 of them.
I sort of scanned a few, and it does look, like you say, both entertaining and really informative as
well. So yeah, I'm definitely going to study it in more detail. So the next thing I wanted to
mention is just a follow-up from last time.
We were talking about a couple of trip reports from the Varna meeting.
That's now a few weeks ago.
And there was this big trip report put together by a bunch of people, which usually appears
on Reddit.
And it wasn't available yet at the time when we recorded the last episode, but it is available
now.
And that's probably the most comprehensive report about what actually happened at the VANA meeting
that's out there.
So I just wanted to say that that's available now.
We're going to put a link in the show notes.
All right.
So the next thing is from our friends at CLion.
CLion always has these early access program
kind of preview versions that you can download for free.
And the latest 2023.2 EAP4 introduces an exciting new feature,
which is the AI assistant.
And that's really kind of a new thing, but it's been, I guess,
lately it's been more and more of a thing that people use AI tools
to help them code.
And now we have actually one built into an IDE.
So that's really cool.
There's a blog post on how this works, kind of connects to OpenAI and also some other
JetBrains large language models.
And in the future, it will connect to other large language models as well.
You get this kind of AI chat window where you can ask AI questions and ask it to, I
don't know, generate some
code for you.
And then you can like insert it at the caret into your editor or copy-paste it into your
code.
Or you can just select some code in your IDE.
And there's like a new little menu that says, you know, explain code, suggest refactoring,
find potential problems or a custom prompt, that kind of stuff.
So there's like a new AI Actions submenu where you
can ask the AI to do stuff with your code. It can generate a commit message for you, it can explain
CMake errors to you in addition to C++ errors, and other things. And yeah, it's pretty comprehensive
and pretty cool to play around with. So it's going to be in the next release, but there is a
preview available now.
Yeah, and I've not actually tried it myself yet, but it's definitely a cool feature. Very much in fashion,
of course. And in fact, there's a tie-in with the closing keynote from C++ on Sea last week. You'll
have to watch the video to see what I mean about that. I did have one question,
though is there any way to disable it from even appearing as an option?
The reason I ask is because I know a lot of organizations
have policies against using any of these AI generative tools for now.
And just wondering if there might be any sort of issues with that.
Have you heard about anything?
Yeah, so you don't actually get it by default if you just open CLion.
So you have to actually log into your JetBrains account
and then enable the AI feature there in your account in order to use it.
And for example, it's not available in all regions yet and stuff like that.
So I think, like, I'm not sure.
But from what I've seen, you can just kind of not enable that option in your account and then you're just not going to have access to it.
So by default, it's off as far as I understand.
But if that's somehow not accurate, I'm going to follow up on that next time. But that's kind of my current understanding: you have to actually actively switch it on in your
account, and not everybody gets access to it, depending on where you are. Or I think there's
a limited number of people that get access to it, and then that number is going to increase
over time. I think that's kind of what everybody's doing. Yeah, yeah. I'd be interested to see where it goes.
Yeah, definitely.
I use CLion, but I haven't tried out this new feature yet.
But certainly some of the things like auto-generating commit messages
and stuff, that sounds great.
It sounds like it would save you a lot of work.
So, yeah, I'm keen to try it out.
Yeah, it just came out literally a few days ago.
So it's like brand new.
All right.
So the last news item is something that ties into something else that happened at C++
on Sea, which is we had lots of talks and discussions about safety in C++.
I actually myself had a talk that was called C++ and Safety.
We had obviously Sean Parent's opening keynote that was all about safety.
And a lot of the discussion was around,
well, we can't really make C++ memory safe language because you would have to do something
like Rust's borrow checker
and that kind of doesn't really work
with the way C++ does references
and move semantics and iterators
and all of that stuff.
And we're going to get into some of that stuff later with Tristan
from kind of a different perspective.
But it was this kind of a thing where like,
well, C++ is this one model.
And then Rust, for example, or Val have like a very different model.
And they're kind of not really reconcilable with each other.
And so there was this really fascinating blog post,
which is called making C++ memory safe without borrow checking,
reference counting, or tracing garbage collection, which is obviously the other way you can make
language memory safe, which we can't really do in C++ either because of the performance
implication. And so that's a really, really long blog post, which I haven't fully read and digested
yet because there's just so much information in there. And it's quite information dense as well.
But yeah, the author of that blog post actually looked at quite a number of
programming languages, Vale and Val, which are actually two different programming languages,
which I didn't know before. There's Val, and there's Vale, and there are also other programming languages
that I've never heard about before, like Austral and Gel and Inko. Has any of you heard
about any of those? I have not.
I've heard of Vale, and I think the author of the blog post is also behind the Vale language. There's also a
language called Vala, which is really confusing.
So yeah, we had Dimi Racordon, who's one
of the developers of the Val language, here on the show a few months ago, but I've not heard
about any of the other languages. So what this person did is they went into all these languages and looked at the different
ways that memory safety is achieved in those languages. So Val, which we actually had on the show,
there's this thing called mutable value semantics. But apparently there's a whole plethora of
other methods, and so this blog post kind of dives really deep and explores these other methods to, you know,
come up with a memory-safe language
and how to apply them to C++ through different means,
like restricting the feature set, static analysis,
kind of, you know, or like changing things here and there.
And it discusses, so the Val mutable value semantics,
which actually it calls simplified borrowing.
Yeah.
It's kind of a subset of the Rust Borrow Checker,
but then it discusses three other techniques
that I was completely unfamiliar with.
One is called borrowless affine style.
The other one is called constrained references.
And the last one is called generational references
and random generational references.
That's apparently from the Vale language.
And yeah, it's a huge blog post.
And apparently, there are quite a number of ways to make a programming language memory safe. And
it looks like quite a few of those techniques might even be at least partially applicable to
C++. So I'm really excited about this kind of research and where this goes and kind of the solutions that people will come up with in that space. Yeah. So as you say, it's a very long blog post,
very technical and in-depth, and I haven't had time to kind of study it in detail, but
just kind of scan reading it, it looks, as you say, really, really interesting. I was quite
interested to see one of the techniques they mentioned
is to replace the use of raw pointers with indexes into sequences.
Which sounds like a very sensible technique.
Yeah. Someone should do that.
Yeah. So why don't we talk about that?
That's a nice transition to the main part of our episode today,
which is to talk to Tristan about sequence-oriented
programming and his library Flux.
So Tristan, why don't you tell us
what sequence-oriented programming is all about
and what your Flux library is about
and how that kind of fits into that whole space, right?
Yeah, yeah, absolutely.
Absolutely.
So sequence-oriented programming, or collection-oriented programming, which I think is a slightly more common term, although neither one is particularly common.
But the idea is it's a programming style that emphasizes thinking about kind of high-level operations on sequences.
So, you know, you have your sequence of values and then you want to filter these values and then you want to transform them in some way, and then maybe you want to chunk them into equally sized chunks,
and then you want to do a scan on those.
Thinking about these high-level operations
rather than immediately diving into just,
I'm just going to write a for loop that does all this.
We're thinking in terms of high-level operations and sequences
or collections, if you prefer.
So some people sort of refer to this as like functional programming
or functional style programming.
Functional programming means a lot of things to a lot of people.
I've sort of moved on to using the term sequence-orientated programming
or collection-orientated programming.
And Flux is a library that's a C++20 library that helps you to do this
is designed around enabling this coding style.
So it has sort of similar goals to C++20 ranges
or D ranges or Rust iterators or Python itertools, all these sorts of things.
Python iter tools, all these sorts of things.
Flux is designed to make this easy to do in C++,
and the goal is improved safety as well.
So the idea of Flux is that it enables this sequence-oriented programming style,
but in a way that improves the safety of your code as well.
So you can do these things with iterators in C++, with C++20 ranges. But there are some problems with ranges, in that the low-level operations
are kind of unsafe.
And at this point, we have to sort of take a diversion and say,
well, what do we mean by safety?
Timur had a nice talk all about this at C++ on Sea that I'm sure will be
online in due course.
There are lots of definitions of safety, so we kind of have to decide what do we actually
mean. So there's a really nice definition that I like and that I use and Timur also
mentioned in his talk, which is that safety, or one nice definition of it is the absence of undefined behavior.
And the problem is a lot of the code that we write in C++ is unsafe in the sense that undefined behavior can occur.
We have to be very careful to avoid UB.
And in particular, the iterator design that we have,
which is kind of a generalization of array pointers, right?
Iterators have pointer-like semantics.
And this means that iterators can be very prone to undefined behavior.
So if you think about, if you have a past-the-end iterator,
you know, you initialize your iterator from standard end.
And then if you try and dereference that iterator,
if you try and increment that iterator, that is undefined behavior.
We can have iterator invalidation.
If you're holding on to an iterator and the container goes out of scope,
then you can't use that iterator anymore.
Pretty much anything you try to do with that iterator is undefined behavior.
And we have other situations in which,
really non-obvious situations in which iterators can become invalidated.
If you're holding on to an iterator into a vector and then you push back into that vector,
there's a chance that that iterator has become invalidated and it can no longer be used.
So iterators are great in the sense that they're very powerful, they're low overhead,
they're based around the raw pointers, which are very low level, very powerful thing.
But they have these problems with UB. And so the idea behind Flux is that we replace these
low-level iterator operations with a different, but only very slightly different, iteration scheme, where our low-level base operations are designed
to be safe. That is, they don't allow undefined behavior. And so if our low-level, our basis
operations are safe, then the things that we build on top of that should be safe as well.
Can I ask you a question?
In C++, if you have a traditional for loop
with an
iterator or pointer or something
like that, you have four operations.
You can initialize
the iterator to begin, you can
increment it, you can
compare whether you've reached the end, and
then you can dereference it, right? And so at pretty much any of those stages, I think apart from
the comparison with end and getting the begin one, you can get undefined behavior, right?
But then we also have the range-based for loop, which is kind of safe, and we have ranges
and views, which are, I think, kind of safe.
So how is what you're doing different
from those approaches?
Okay, so the range-based for loop, yes.
If you're just beginning at the beginning of your container
and iterating all the way through,
then you're going to be okay, right?
Because kind of the range-based for loop,
it takes away, you know, you're not dealing directly with iterators; the range-based
for loop is abstracting that for you. But at some point, you know, for most algorithms, or many algorithms
at least, you're not just beginning at the beginning and going all the way through to the end. At some
point, you know, we have to actually deal with raw iterators ourselves. We have to hop forward X number of places.
We have to perhaps be holding on to an iterator,
and then we don't know if somebody else might modify that container in the meantime.
So very straightforward code with range-based for loops can be safe.
You also mentioned views.
With views, you can have dangling views and things,
which is a slight problem with the ranges library,
something you kind of have to be quite aware of
when you're programming with ranges,
is you have to still worry about the lifetime of things.
But at some point, we have to, or in many cases, we have to delve down to the
level of manipulating iterators directly. And then we have to be very, very careful to avoid
undefined behavior. So if we can replace these potentially unsafe operations with safe operations,
then as I say, we can build on a firm foundation on top of that.
That's the idea behind Flux.
So I'm familiar a little bit with the way Rust does it,
where you kind of only have two operations, right?
You initialize an iterator, and then the only thing you can do
is then to get the next item, right?
You can't actually manually increment the iterator and dereference it or compare it to end. You just say, give me the next item, right? You can't actually manually increment the iterator and dereference it or
compare it to end. You just say, give me
the next item, and then it returns you an optional.
And so either
there is a next item, and then you get it,
or there isn't, and then you're going to get an empty optional.
And that way, you can't trigger
UB. Is it something like that, or
is it a different idea?
Okay, so
the Rust iterator scheme is exactly as you say. So if
you think about your fundamental operations on iterators, you've got dereference operation,
increment operation, and your end check, which for iterators we do by comparing to a sentinel in
C++20. So the Rust iterator scheme actually fuses these three operations together
into a single function called next.
So you initialize your iterator and then you call next.
As you say, it returns an optional until such time as that optional is empty,
which signals the end of iteration.
So I actually wrote an earlier library, it's called Flow,
that implements the Rust iterator scheme in C++.
And it's very interesting, and it has the advantage that it's a lot simpler, right?
It's simpler for producers.
It's simpler for consumers.
There's really only one fundamental function you've got to worry about.
But it has the downside that you lose some expressivity
in that there are algorithms you either can't express
with the Rust iterator scheme
because you can't do random access, for example,
and there are some algorithms you can't implement as efficiently.
And the other thing that you lose,
not just with the Rust iterator scheme,
but also some other iterator schemes like in D,
you lose the ability to refer to a particular position in a sequence to have like a coordinate for a particular position in a sequence.
So it becomes quite difficult to do something like a find if.
Like an index, you mean?
Like an index or like an iterator,
which returns you a particular position in the sequence,
or an index, or in Flux we call it a cursor,
that returns you a particular position in the sequence.
So Flux actually grew out of this earlier library that I wrote.
I thought, well, how can we... Because I was implementing this Rust-style scheme,
and, being someone who was very familiar
with iterators and ranges,
I was kind of coming up against these limitations.
I was like, well, how can we extend this scheme
so that we can do all of the things that we can do with iterators?
And it turns out that the way that you tend to implement these iterators in a
Rust-style scheme is that your Rust iterator, if you like,
will tend to have an internal cursor, and, you know,
every time you call next, you bump the internal cursor,
or you check it and you return an optional.
And my idea was that, well, if we could expose these cursors to the user
in a safe way, then we can do all of the things you can do with iterators. And then I
was playing with this for a while, and I kind of realized that, well, if you expose these cursors,
then you don't actually need this next function at all. You can do it all with cursors. So
that's kind of how Flux came about. And it turns out, I mean, I didn't know this when I started, but it turns out
this has quite a lot of similarities to the way it's done in the Swift language. So
Swift has a Collection protocol that does things in quite a similar way to
how it's done in Flux. That's really cool. So this cursor, is it basically like,
let's take vector, right?
Because vector, that's always the go-to example
when you talk about containers.
So a cursor into a vector,
is it basically like an index?
Like if I'm holding onto like an integer,
which tells me, you know,
this is the fifth element of the vector.
Is it something like that?
Is that like the right mental model?
That is precisely the right mental model. Absolutely.
So in Flux, we talk about cursors.
So a cursor you can think of as a generalization of the idea of an index into a sequence,
in the same way that an iterator is like a generalization of a pointer into a sequence.
So a cursor or an index,
you can't do anything with it by itself.
So iterators are kind of smart in a sense
in that they know how to dereference themselves.
They know how to increment themselves.
Whereas with flux cursors,
you cannot do anything with a cursor by itself.
The cursor just represents some sequence position.
And you have to go back to the sequence.
So for a vector, that would
just be a number?
Just be literally an integer index, yes.
Ah, okay, okay, yeah. Interesting.
So my understanding
of what you were saying earlier, the safety
arises because whereas with the iterator
model, which as you say is a generalization
of a pointer,
you're basically combining the idea of the location
within the container and the container itself,
because you're just pointing directly at the element.
Whereas with your model, you separate those out,
so you have to have the actual container
that you are iterating,
and its index or cursor into it,
so you don't get those lifetime issues.
Those lifetimes have to overlap.
Precisely that.
Is that, therefore, a performance trade-off?
Are we losing something by having these two things separated
and having to check them as well?
So not in my experience.
I've spent quite a lot of time looking at all the generated code
in Compiler Explorer, doing benchmarking,
and there's no performance disadvantage to doing this
that I've come across.
Yeah, it's very interesting.
And the other thing to mention,
I mean, we were talking about this blog post
about how we get all this memory safety.
And one of the things that both Rust and Val,
they do, and Swift as well,
they have this thing they call the law of exclusivity.
That's how they ensure memory safety.
So I don't know who coined this term.
The law of exclusivity that says at any point in your program,
you have to have either one mutable reference
or you can have N, what we call const references in C++,
but you can't have both of those things at the same time, right? Either you can have N readers or you can have one writer, but you
can't have both at the same time. And actually, if you think about it, because of the pointer model
that iterators use an iterator constitutes what you would call a borrow in Rust. So if we're holding on to, even if it's a const iterator,
that constitutes like a const borrow.
And then if we do anything that mutates that container,
that's going to be a mutable reference at the same time
as we're holding on to a const reference.
And that's where we get problematic things with iterator invalidation.
So actually, if we were to wave a magic wand
and put a borrow checker into C++ tomorrow,
we'd have a problem with the overwhelming majority
of iterator-based code.
However, if our cursors, it's just like an index,
just think of it like a numeric index,
that doesn't constitute a borrow.
We don't do the borrow until such time
as we want to actually read from the sequence
or ask the sequence,
please can you increment my cursor?
And so if we're holding on to a cursor
and we modify the sequence,
we don't automatically invalidate the cursor.
So you said there's no performance impact of doing this,
but I think I want to drill down into this a little bit.
So it seems obvious to me that if you operate with,
for example, on a vector,
if you operate with indices instead of iterators,
you kind of get the same expressivity.
You end up kind of doing the same operations.
But if you then actually index into the vector,
you again have this choice,
like, are you going to do a bounds check?
Are you not going to do a bounds check?
But you could theoretically say,
well, I'm not going to do a bounds check.
But it sounds like you're actually enforcing a bounds check there.
And that is a branch, right? So you're kind of relying on that branch being optimized out. But if that kind of check is not going to be optimized out, especially on, like, I don't know,
a microcontroller or a GPU, a platform where we don't have a branch predictor,
surely you would measure a performance impact due to the additional check, right?
It's the same as using vector.at instead of vector's operator square brackets, right?
It's precisely the same as using vector.at instead of the unchecked square brackets operator.
So yeah, in many cases, either your branch predictor, if you've got a branch predictor, is going to very quickly learn that everything's okay,
and so you're not going to get any overhead there.
Or in an awful lot of cases, the compiler can actually just optimize out the bounds check in the first place.
So the overwhelming majority of new programming languages,
in fact, I can't think of a programming language
that doesn't have built-in bounds checking
on whatever its native array type is.
So optimizers are really, really good at recognizing these patterns
of being able to remove bounds checks in a lot of cases.
Now, you're quite right that there is some code.
Sometimes you write code, and after you've profiled,
after you've looked at it, you can see that this bounds check
kind of remains in there, and you say, well, this is really annoying.
But actually, we provide an unchecked read function as well.
So like vector has vector.at and has the square brackets,
we kind of flip the default in Flux. So by default, you're going to get a bounds-checked read, but there's the option, if you need it, of doing an unchecked read instead, for these cases
where otherwise either the compiler can't optimize it out, or you are seeing some sort of problem with this,
you can go for the unchecked read.
This is kind of like reaching for unsafe code in Rust, let's say.
So it's not something you should do by default,
but if you really need it, it's there for you.
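The flipped default can be sketched like this (hypothetical names; Flux's real signatures may differ): the obvious spelling is bounds-checked, and the unchecked read is the one you must ask for explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sketch of the "flipped default" described above (hypothetical names,
// not Flux's actual API): the default read is bounds-checked, like
// vector::at, and the unchecked variant is the explicit escape hatch.
template <typename T>
const T& checked_read(const std::vector<T>& seq, std::size_t idx) {
    if (idx >= seq.size()) throw std::out_of_range("checked_read");
    return seq[idx];
}

template <typename T>
const T& unchecked_read(const std::vector<T>& seq, std::size_t idx) {
    // Caller promises idx is in range, as with operator[]; reach for
    // this only after profiling shows the check really costs something.
    assert(idx < seq.size());  // debug-build check only
    return seq[idx];
}
```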
Yeah, I really like that approach.
I mean, this is what I keep saying when I talk about safety: that we really need to flip the
default. The default should always be the safe thing that cannot possibly have undefined behavior.
And C++ is not great at that at the moment. We are slowly improving, but
it's a long way there. But because C++ is a language all about performance, and we have lots of people
in industries who need to squeeze every nanosecond out of their code, like, I don't know, low-latency
trading, for example, we had an episode about that at some point, you kind of need this
unchecked escape hatch for lots of these things. So good to know that that's there, if you really measure and find out that you need it.
Always measure first. Yeah. Always measure first. Yes. Yes.
The other trade-off that might be involved here though is we're moving to a
fundamentally different iteration model. Does that mean we have to throw out all of our iterator and
range-based algorithms and write everything again from scratch? Fortunately not, no.
So one of the goals of Flux is to provide good integration
with existing ranges code.
So there are two sides to that.
The first of all is if I've got, you know,
maybe I've got custom containers that are going to provide iterators,
provide C++20 iterators.
How can I use those containers with Flux?
So first of all, any C++20 contiguous range is automatically usable with Flux
just directly, just without doing anything.
So if you've got a contiguous container,
it means we can get the raw data pointer
and we can do bounds-checked indexing
by offsetting from the data pointer.
And then we also have a wrapper in Flux
as a function called from range.
So you can take any range and wrap it
in the Flux sequence API.
We lose some safety guarantees there because, you know,
in that case we are building on top of the range's operations,
the iterator operations, which might be unsafe.
So you can do that.
It's not ideal because we lose some of the safety guarantees, but it provides ranges compatibility.
So that's going in one direction.
That's taking existing ranges, and then we can use them with the flux algorithms.
We can use them with the flux sequence adapters.
Going in the other direction, if I've got a flux sequence, and let's say I've built
up a chain of adapters, and now I've got my flux sequence but there's some custom
algorithm based on iterators, how can I use that with my Flux sequence? Well, it turns out
every Flux sequence is automatically a C++20 range as well. So you can call
.begin() and .end() on a Flux sequence, and it will give you C++20-compatible iterators.
These are kind of – they take a pointer to the sequence
that you're operating on, so they're not safe,
but they're no less safe than any other iterator implementation.
So you can take your Flux code and call the C++20 algorithms, the C++ ranges
algorithms, on it.
Nice. Sounds like there's almost no downside. We might have to dig into that a bit
more. But before we do that, this is a good point to pause and thank our sponsor for this episode.
This episode is sponsored by Sonar, the home of clean code.
And we talk about safety and security in our code.
Well, SonarLint is a free plugin for your IDE
that helps you find and fix bugs and security issues
from the moment you start writing code.
You can also add SonarQube or SonarCloud
to extend your CI/CD pipeline
and enable your whole team to deliver clean code
consistently and efficiently on every check-in or pull request.
SonarCloud is completely free for open source projects
and integrates with all of the main cloud DevOps platforms.
All right.
So Tristan, I just had a very quick question.
So does that mean that, for example,
because you have these cursors, you basically get random access into your container, right?
That's kind of the problem.
If you're trying to rewrite something on a contiguous sequence, like an algorithm that operates on a contiguous sequence in Rust, you just don't have random access at all, right?
You have this weird other thing called slices, which kind of works differently and you kind of have to rewrite everything.
So with your scheme, you do get random access.
It's just by default checked, basically.
Did I understand that correctly?
That's absolutely correct.
So Rust iterators don't give you random access.
So the equivalent operations, things like sort in Rust operate on what they call slices.
It's equivalent to a span in C++20.
But in Flux, we have a random access sequence concept.
So you can do something like, for example, you can zip together two vectors.
And then, with an appropriate comparator, you could perform a sort on the zipped sequence of two vectors,
which is sort of the ultimate test of how these things work.
And it works well in Flux.
And you can do this with C++20 ranges as well,
but you can't do it with most other iteration schemes.
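Flux's zipped sort is its own API; as a plain-C++ illustration of the effect (two parallel vectors reordered in lockstep, which is what sorting a zipped sequence achieves), one can sort an index permutation and apply it. This is not Flux code, just a sketch of the idea.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Plain-C++ illustration of what sorting a zipped sequence achieves:
// two parallel vectors reordered in lockstep by the key vector.
void sort_zipped(std::vector<int>& keys, std::vector<char>& vals) {
    std::vector<std::size_t> idx(keys.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // Sort indices by the keys; stable so ties keep their order.
    std::stable_sort(idx.begin(), idx.end(),
                     [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    std::vector<int> k2;
    std::vector<char> v2;
    k2.reserve(keys.size());
    v2.reserve(vals.size());
    for (std::size_t i : idx) {
        k2.push_back(keys[i]);
        v2.push_back(vals[i]);
    }
    keys = std::move(k2);
    vals = std::move(v2);
}
```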
Right. So that kind of leads to my next question. So you said you can zip vectors, and you can sort the range
that you get, so you can chain things, right? So this is something that ranges does
as well. This is kind of the big thing that ranges adds on top of, or in addition to, or instead of, what the STL does, right?
Where like you can chain these views and these operations on them.
And Ranges does it with this pipe operator.
And I looked into the documentation of your library and you do it differently.
You have these kind of member functions where you have something dot do this dot do this and you kind
of chain them which is kind of a very different design which i suppose there's a reason why rain
just doesn't do that so is there like an advantage to doing it this way is there a reason why you
designed it that way i'm just curious okay so uh yeah so flux provides a whole ton of sequence
adapters so these are things like whole ton of sequence adapters.
So these are things exactly like the range adapters that we have in C++20,
things people call views, although technically correct, range adapters.
We also have sequence adapters in Flux, lots and lots of them.
And yeah, absolutely right.
So the way we do it in Flux is that you start off, you have
to choose how you want to iterate over your sequence. Do you want to take a copy of your
sequence and then iterate over that? Do you want to move your sequence? Or do you want to
iterate over a reference to your sequence? So your first line of any pipeline is you choose
how you're iterating over it. And after that, you can chain together these operations using the dot syntax. So you would say, like, flux::ref of my vector, let's say, to iterate by
reference, and then you might say .filter with your predicate, and then .map to do some sort
of transformation, and then you might call .sum to do the equivalent of accumulate.
So why member function chaining rather than pipes?
So to be honest, there's no technical reason for it
in the sense that there is anything that you can do with members
that you can't do with pipes.
I could have used pipes in Flux.
The reason mostly is because I think it makes the code much more readable.
I think the pipe, std::views::filter
or transform, whatever it is, is a lot of visual noise,
and it's kind of hard to just read what the code is doing.
And the other thing is that actually the dot syntax, you get nice auto
completion in your IDE as well. So you just press the dot key and you get a big list of
all of the operations that you can do on your sequence. And you actually get nice doc comments
as well. So there's no technical reason for it. Mostly, I just kind of like the visual
style better. And I know there have been proposals for a pipeline operator and things like that. And
I'll be looking closely at those things. I'm not kind of wedded to the dot syntax. I'm going to be
looking at the proposals to see whether they're appropriate for Flux. But at the moment, it uses the dot syntax, I think, just because it reads better. And as I say, you get the auto-completion
as well, which is a nice bonus.
Yeah, I think that the trade-off is it's a bit more ergonomic,
especially if you've got an IDE and you get the auto-completion, as you said. But on the other hand,
it's harder to extend. You have to actually change the underlying classes, usually,
because we don't have extension methods yet.
Well, okay, so the way around that is let's say you've got some custom adapter
that you can write in Flux or you've got some custom algorithm.
We have a function.
Actually, there are two spellings of it.
You can either spell it .apply or even just ._ (dot-underscore).
Okay.
And so if you do this, let's say ._ with brackets,
and then you put the function name or the adapter name
followed by the optional arguments,
and the Flux machinery will then serve this as part of your pipeline.
So that bit is covered as well.
We can use custom adapters and custom algorithms as well.
Nice.
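The member-function chaining and the apply escape hatch described above can be sketched in plain C++ like this (a simplified, eager design for illustration only; real Flux pipelines are lazy, and these are not Flux's actual signatures):

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Simplified, eager sketch of member-function chaining in the style
// described above. Real Flux pipelines are lazy; this is illustration.
template <typename T>
class seq {
    std::vector<T> data_;
public:
    explicit seq(std::vector<T> v) : data_(std::move(v)) {}

    template <typename Pred>
    seq filter(Pred p) const {
        std::vector<T> out;
        for (const T& x : data_)
            if (p(x)) out.push_back(x);
        return seq(std::move(out));
    }

    template <typename Fn>
    seq map(Fn f) const {
        std::vector<T> out;
        for (const T& x : data_) out.push_back(f(x));
        return seq(std::move(out));
    }

    T sum() const { return std::accumulate(data_.begin(), data_.end(), T{}); }

    // The "apply" escape hatch: thread a custom adapter or algorithm
    // into the middle of a pipeline without modifying the class.
    template <typename Fn, typename... Args>
    auto apply(Fn fn, Args&&... args) const {
        return fn(*this, std::forward<Args>(args)...);
    }
};
```

A pipeline then reads top to bottom, and an IDE can complete each step after the dot: `seq<int>(v).filter(is_even).map(times_ten).sum()`.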
Great. So we mentioned indexes and random access, or contiguous
containers. That makes sense, because if you've got an index, you need to be able to
index into something. So what about containers that don't have that: node-based containers like maps and sets and things?
Yeah, so node-based containers are tricky.
So if we take the simplest example of node-based containers,
which would be a linked list,
you want to store some sort of iteration state.
It doesn't matter kind of what your iteration scheme is.
If you're doing external iteration, you want to hold some state.
And then the natural kind of state
that you want to hold for a linked list
would be like a pointer to the node.
So what we need to ensure
if we want to do safe iteration
is we need to make sure
that nobody has messed with the node
while we're holding onto it, right?
And then for a linked list,
we've got our node
then has the next pointer
and we want to make sure
that no one's messed with our next pointer.
How do we do this safely
and with minimal overhead
is a very tricky problem.
So there are sort of three approaches
that I know of.
Perhaps this blog post
we were talking about earlier,
perhaps that has some ideas as well.
But the three
approaches that I know of. Firstly, you can use shared pointers or weak pointers. You know,
let's say that your iterator or cursor holds a weak pointer to the node, and then you can try
to lock that weak pointer, and if the node no longer exists, then you can error out at that point. So you can use shared pointers to do this. Of course, shared pointers come with some
overhead. Whether that's problematic kind of depends on your use case.
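That first approach might look roughly like this (a hypothetical sketch, not how Flux or any production list actually implements it):

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical sketch of the weak-pointer approach described above.
// The cursor holds a weak_ptr to the list node; if the node has been
// removed, the lock fails and we error out instead of touching freed
// memory.
struct node {
    int value;
    std::shared_ptr<node> next;
};

struct weak_cursor {
    std::weak_ptr<node> target;

    int read() const {
        if (auto n = target.lock()) return n->value;
        throw std::runtime_error("node was removed while the cursor was held");
    }
};
```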
The second approach is to change how you allocate your nodes. Because the trouble is,
in general, we can't just take a pointer to some arbitrary memory and say, is there an
object here?
We can't even ask whether an object exists.
So one thing you can do, rather than just keep allocating your nodes with operator new or equivalent,
is to pre-allocate some sort of arena from which you take your nodes.
And then if you're in control of this arena, then you can ask,
is there an object at this address, or at this offset into the arena?
Because you're in control of that whole thing.
Again, there are trade-offs with this.
You almost sort of end up with a slightly different design,
but it's, again, something that can be done.
And the third approach was mentioned in this blog post we were talking about earlier.
The third approach is quite interesting
because what you can do is you can attach a generation marker
to your cursors and internally to your, let's say, linked list.
And then every time you modify the list by adding nodes,
adding elements, removing elements, whatever you do,
it increments the generation count.
And then when you perform an iteration operation,
because in Flux we always go through the sequence,
we can always compare the sequence's internal generation
to the generation ID that's stored in the cursor.
And if they don't match, we can say,
hey, no, you've messed with the list
after generating this cursor,
and so we're not going to allow you to do this.
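A minimal sketch of the generation-counter idea (hypothetical, not Flux's implementation, and shown on a vector-backed container for brevity; the same idea applies to a linked list):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Sketch of the generation-counter approach described above. Every
// mutation bumps the container's generation; a cursor remembers the
// generation it was created under, and a stale cursor is rejected.
template <typename T>
class gen_checked {
    std::vector<T> data_;
    std::uint64_t gen_ = 0;
public:
    struct cursor {
        std::size_t index;
        std::uint64_t gen;
    };

    void push_back(T value) {
        data_.push_back(std::move(value));
        ++gen_;  // any structural change invalidates outstanding cursors
    }

    cursor first() const { return {0, gen_}; }

    const T& read_at(cursor c) const {
        if (c.gen != gen_)
            throw std::logic_error("stale cursor: container modified since it was made");
        if (c.index >= data_.size())
            throw std::out_of_range("cursor out of bounds");
        return data_[c.index];
    }
};
```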
What's really fascinating is you end up
with quite a different style of API,
but you can actually theoretically do this
at compile time.
You can generate a new type every time
you modify your list, store it in a new type, and then you can
actually do this at compile time. You can reject these things. That's a bit more niche, but it's
something that's theoretically possible with the model.
That is so cool. I mean, I've just been
listening to this, and that actually makes me think that this sounds very, very similar to
how you solve another problem which is lock-free data structures,
where you don't necessarily want to solve the problem
of they have a dangling pointer,
but you want to solve the problem of,
you know, two threads are iterating
through the same linked list at the same time.
And one might remove a node
and the other might be reading it.
And how do you like get rid of the race conditions there
without putting a big mutex around the whole thing, right? And you end up with basically three approaches, and
I've done talks about this stuff: it's reference counting, RCU, or hazard pointers. And
the way they work is, I mean, it's not quite the same, and definitely not like the compile-time
stuff that you mentioned at the end, but they're kind of similar to what you were
describing.
Like, it's either reference counting or some kind of garbage collector thing
where you kind of know
what the lifetime of every object is
or like similar stuff like that.
So it feels like it's a similar kind of problem,
like getting rid of data races there
or getting rid of just dangling pointers
and kind of memory safety.
That's really, really cool.
I need to think about that a bit more.
Yeah, it's sort of fundamentally the
same kind of problem. Well, it all comes down to memory safety, right?
Different approaches to that.
Yeah, it's like two sides of the same coin, right? Either you have
one thread, but then you do things in a weird order and you might mess things up,
or you just have stuff happening at the same time, and then how do you get rid of race
conditions? But yes, as you say, it's like two sides of the same coin, almost. It's
really cool. I've never thought about it that way. Thank you very much.
Yeah, multi-threading
just takes all of these problems we have in the single-threaded case and just makes them 10 times more difficult.
But it feels like if you solve the memory safety problem,
if you solve it comprehensively,
like, for example, Rust does with the borrow checker,
you kind of solve the whole concurrency problem as well.
Because then it just becomes a question of the exclusivity principle,
not just between different scopes or whatever, but also between different threads.
But it's kind of the same thing.
Yeah, absolutely. Because if you can guarantee that all
of your threads are only reading, then of course you're not going to have a problem.
And so it comes back to this
law of exclusivity thing we were talking about. But yeah, how do you ensure that there's
only one writer at a time? If you don't have that, then the problem you have to solve is,
how do I make sure that the node is not going to get deleted from under me? And then we
get into all of these approaches.
Yeah, exactly. And talking about exclusivity, it's a C++20 library, isn't it?
Is it only C++20 it can work with, or did you just choose to start with that?
So it's a C++20 library, primarily because it makes heavy use of concepts. It reuses quite a lot of the concepts from what was the Ranges TS and became
C++20. So yeah, it's a C++20 library.
The idea, basically, is that at the moment we're still at a reasonably early stage, you
know. I'm just starting to talk about this and make people aware of it. I've been beavering away on this quietly in my spare time for
the last year or so, and now I feel it's at a stage where I want to try and get people
interested in it. Maybe I can get some early adopters, and hopefully
get some contributors. So this is something that I want people to be able to use two,
three years from now when, you know,
the early adopters would have been able to hammer out all of the bugs and
things like that.
So although there are a lot of places who are saying we're not using C++
20 yet, these things change very rapidly.
And so, you know, by the time Flux is ready for production use,
hopefully a lot more people will be on C++20
and we can look forward.
We can use all these nice concepts and things
that are now in the language.
And if you do want to use it now,
it's actually on Compiler Explorer, isn't it?
It's not.
So it's not in the libraries dropdown of Compiler Explorer yet.
It was funny, I was actually
talking to Matt about this at
C++ on Sea. And so
he told me how to go about that. I need to,
apparently I need to submit a PR. But what you
can do is, because we have a generated
single header, and Compiler
Explorer has this amazing feature where if you
type hash include and then put a URL
in your quotes, it will go and
download from that URL.
And so you can use it today in Compiler Explorer, and I've got a whole bunch of examples
on the GitHub of doing that. But it's not yet in the libraries dropdown.
Right. It will be soon. Might even be by the time this episode airs.
So are you going to have a Conan package and
all of the other things you need these days to ship a library?
I would very much like to have it on Conan and vcpkg
and all of those.
I have to confess it's not something I know how to do.
But if there's anybody out there
who would like to see Flux on these things,
then please do.
I'm very open to having submissions
to put these on there.
So just in general,
what compilers do you support?
Do you support all the major compilers
and what's the license like?
How mature is this whole thing?
Can people just throw it into their repository
and start using it?
Okay, so it's the Boost license, so you're going to be able to use it, I imagine, everywhere. I doubt your corporate lawyers are
going to have a problem with it. So we test with GCC 11.3 and onwards, MSVC 2022, and Clang 16.
That's because only the most recent version of Clang
has the concepts support that we need.
So unfortunately, no Apple Clang yet,
but that will be updated in due course, I'm sure.
So we test with those three compilers.
I haven't tried it with the Intel
compiler. Somebody was talking to me at the conference; they were trying to get this
working with the NVIDIA C++ compiler. It hit some sort of internal compiler error, but
hopefully that will get sorted out in due course because they were going to submit a bug report
about that. So in terms of maturity, as I said, we're still at quite an early stage.
You know, at the moment it's 12,000 lines of code, primarily written by me. One of the
contributors has done some fantastic examples and things, but primarily it's, you
know, 12,000 lines of code written by one person. It's at a stage, I would say, where I'm looking for early
adopters, people who want to try it out. If you've got personal projects or whatever, and you
want to see how this works, get a head start. If you want to contribute,
that's even better, because hopefully, within a reasonable time frame, we can get this
to a stage where people are happy to just drop it into their production code. But I'd be a bit
nervous about people doing that right now. But hopefully in due course. So with those caveats
in mind, then. Well, obviously, we will drop the link to the repo in the show notes. So if any listeners do want to give it a try,
ideally for non-production use,
just to kick the tires, try it out,
feed back any issues and other things.
Or maybe contribute.
Are you open to other contributors as well?
Oh, absolutely.
More the merrier.
So anyone that wants to be a part
of the wave of sequence-oriented
programming, we will
show you where you can hook up.
So that's really
fascinating. So before we
wrap up, are you involved
in any other interesting C++
projects?
No.
Not really, to be honest. This is the main one that I'm focusing my spare time on.
So you're fully focused on Flux now?
Fully focused on Flux, yeah.
Got your full attention. So that's good to know.
Flux is where my focus lives, yeah.
But you're involved in C++ standardization, on the BSI panel especially.
Yeah, and the BSI. And I also attend the meetings of the ranges study group as well. So I'm
still kind of involved in that, because I just really like this style of code.
I think Flux is great,
but if you don't want to use Flux,
if you want to use standard ranges,
that's still great.
I don't want to sit here and say ranges are terrible
because I think ranges are great as well.
I think Flux is even better,
but I still want to make ranges as good as they can be.
So I'm still involved in that in the standardization side.
That kind of begs the next question.
Would you like to see sequence-oriented programming
in the standard at some point?
Because I know you've actually done quite a lot of talks
on ranges that were really good,
but now you have this library.
Would you like that to be in the standard as well at some point?
I mean, ranges do enable the sequence-oriented
programming style, right? If you're using these range adapters and these high-level
adapters, high-level algorithms, that is, by my definition, sequence-oriented
programming. You can do it with ranges. Would I like to see something like Flux in the standard?
Right now, I would say there's so much overlap with ranges that you wouldn't want to have both of them in the standard library. But if, one day, there were a new standard library based on some sort of safer subset of the language or something like that, then I think Flux,
or something similar to Flux, would be a very good place to start with that. So
maybe not today, but possibly one day in the future.
We do prefer to standardise existing practice, so we'd need some of that practice first.
Well, quite exactly that, yeah.
Yeah.
So is there anything else currently happening
in the world of C++
that you do find particularly interesting or relevant?
I mean, as you might have gathered,
I'm really interested in all this talk about memory safety
and how we can evolve the language in a safe way.
And the success successor languages, in particular, I'm really interested in Val.
I think that's a really fascinating approach, what they call their mutable value semantics.
So I'm looking at that closely.
Carbon as well.
I mean, I haven't looked closely at Carbon since the announcement last year,
but I'll definitely be very interested to see where these things go in the future.
And I think in the future is the appropriate theme there.
So Tristan, do you want to tell our listeners how they can reach you
if they have more questions about this, or if they want to chat to you about Flux and sequence-oriented programming?
Or is there anything else you want to tell us before we wrap up?
Okay, so I'm on Twitter, so at Tristan Brindle on Twitter, if Twitter is still around by the time this episode airs.
Yeah, it's kind of falling apart now.
Yeah, exactly. By the time this episode airs... My GitHub username is tcbrindle.
So feel free to reach out to me.
By either of those methods, you'll be able
to find my email address, if you prefer to get in contact
by email. So please do, if you've got any questions about the library, or any questions about
this sort of thing in general. I'm very happy to hear from people.
thank you and all those details will be in the show notes.
So Tristan, thank you very much for joining us today and telling us all about Flux and sequence-oriented programming.
It's been my pleasure. Thank you.
And thank you, everybody, for listening.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a guest or topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com. We'd also appreciate it if you can
follow CppCast on Twitter or Mastodon. You can also follow me and Phil individually on Twitter
or Mastodon. All those links, as well as the show notes, can be found on the podcast website at cppcast.com.
The theme music for this episode was provided by podcastthemes.com.