Algorithms + Data Structures = Programs - Episode 126: Flux (and Flow) with Tristan Brindle
Episode Date: April 21, 2023

In this episode, Conor and Bryce chat with Tristan Brindle about his new library Flux and his predecessor library Flow.

Link to Episode 126 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest
Tristan Brindle is a freelance programmer and trainer based in London, mostly focussing on C++. He is a member of the UK national body (BSI) and ISO WG21. Occasionally he can be found at C++ conferences. He is also a director of C++ London Uni, a not-for-profit organisation offering free beginner programming classes in London and online. He has a few fun projects on GitHub that you can find out about here.

Show Notes
Date Recorded: 2023-04-05
Date Released: 2023-04-21
ADSP Episode 125: NanoRange with Tristan Brindle
Keynote: Iterators and Ranges: Comparing C++ to D, Rust, and Others - Barry Revzin - CPPP 2021
Rust Iterators
Flow
Flux
Swift Sequences
Episode 124: Vectorizing std::views::filter
C++ std::find_if
C++17 std::reduce
C++ std::accumulate
CppCon 2016: Ben Deane "std::accumulate: Exploring an Algorithmic Empire"

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8
Transcript
I have this idea for a talk that I might flesh out at some point.
It's like, what is the most powerful algorithm in the STL?
And just to give away the punchline, my conjecture is that it's standard find if.
Really?
So I think, yeah.
I mean, he's not wrong because a find if is essentially just a reduction.
And the correct answer is a reduction.
I was like, there's no
way Bryce's answer isn't std::reduce. Welcome to ADSP: The Podcast, episode 126, recorded on April
5th, 2023. My name is Conor. Today with my co-host,
Bryce, we interview Tristan Brindle. This is part two of a three-part interview. And today,
we talk about his libraries Flux and Flow. So, okay. So at this point, it's worth mentioning,
Barry Revzin has this amazing talk. You've probably seen it. I think he gave it as a keynote at CPPP
where he compares the iteration schemes
of various languages.
Rust and D and C++, yeah.
Yeah, Rust, D, C++.
He also mentions like C Sharp and Java and Python and stuff.
And it's an amazing talk.
So if anybody's interested in this stuff,
like really do go and
watch this. I can't, I can't recommend it highly enough. But I, I, so I was kind of interested in
like how, before Barry's talk, how, how do other languages do this? And I started writing a new
library that was implementing Rust style iterators in C++. So for maybe people
who know C++ but don't know Rust, the Rust iteration scheme is really quite different to
how it's done in C++. So what they do is they have this object called an iterator, and this iterator
has a function called next. And what next does is it returns you an optional with the next value in the sequence
right? So you call next the first time, it returns you the first object, an optional containing the
first object in the sequence. Call next again, it returns you the second object in the sequence, and
so on, until eventually the sequence is exhausted, and at that point it returns you an empty optional, which tells you that the
iteration is complete. So it's an alternative way of doing iteration. There are pros and cons to
it, the pro being it's very much simpler than the C++ model, right? The downside is
that it's more limited; it's not as powerful as what you can do with C++,
where you can go all the way to random access.
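As a rough sketch, the Rust-style scheme he's describing, next() returning an optional, might look something like this in C++ (the names here are purely illustrative, not Flow's actual API):

```cpp
#include <optional>
#include <vector>

// A Rust-style "iterator" over a vector: repeatedly calling next()
// yields each element wrapped in an optional, then an empty optional
// once the sequence is exhausted. (Illustrative only.)
template <typename T>
class VecIter {
    const std::vector<T>* vec_;
    std::size_t idx_ = 0;
public:
    explicit VecIter(const std::vector<T>& v) : vec_(&v) {}

    std::optional<T> next() {
        if (idx_ >= vec_->size()) {
            return std::nullopt;        // sequence exhausted
        }
        return (*vec_)[idx_++];         // return the element and advance
    }
};

int main() {
    std::vector<int> v{1, 2, 3};
    VecIter<int> it(v);
    while (auto elem = it.next()) {
        // use *elem
    }
}
```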
So I wrote this library, it's called-
That's quite a big limitation,
especially once you start thinking about
wanting to have implementations of your algorithms
that consume iterables that are optimized for parallelism
and for vectorization.
If you don't have a way to do random access,
that's kind of a problem.
Yeah, well, this was...
So I'm absolutely by no means a Rust expert, and I'm sure the Rust people will
say, no, no, no, you do it this way instead, right? So I don't want to step on anybody's toes.
There must be something that we're missing here, is my guess. Yeah, yeah. Um, as I say, not
my sort of field of expertise, how all this parallelism stuff is done in Rust.
But for me, so I wrote this library.
It's actually on GitHub.
It's called Flow.
It kind of does this Rust iteration scheme in C++,
just kind of for my own, like, just for fun, really,
just to kind of see how this was done. But I found this
limitation of not being able to go all the way to random access, you know, this bothered me.
So I was like, okay, is there a way we can extend the Rust iteration scheme,
or Rust-like iteration scheme? It's actually used in a couple of other languages as well.
Can we extend this to do something like random access? And it turns out that the way you actually implement these
iterators in the Rust style, or the way I implemented these, is you have like your iterator object,
and it holds a cursor to the underlying sequence, a vector, let's say.
So your cursor could be just an integer index.
And every time someone calls next,
you bump the index and you return the next element.
So I was like, okay, well, what if we provided an API
to expose these cursors to the user, so that you could then move the cursor by an arbitrary
amount, as you need to for random access?
And then I did some experiments with this and I was like, you know, this works.
But actually, if you do this thing, if you expose these cursors, you know, to the user and allow
them to manipulate cursors, you really don't need this dot-next function at all. You can do it all
with cursors. So that was my starting point for this new library that I've been working on,
called Flux. So I was like, okay, scrap everything I've done. I'm going to
start again from the beginning, and we're going to have this API based on cursors. So what is a
cursor? So this is getting into Flux now. This is the new thing, right? What is a cursor? A cursor
is no more and no less than a position in a sequence, right?
So the simplest kind of cursor
is just an integer index into an array, right?
Cursors are a generalization of array indices.
It's so funny that you say that
because I've been using the same term
for the same thing in this multidimensional indices library that I've been playing around
with. I must have heard you say it before or something, because I recall hearing somebody
use this terminology. I just didn't remember who. Well, I don't think I've talked about this.
Maybe we were having a chat at a conference or something, but I can't remember talking about it
before. But, um, yeah, so a cursor is like a generalization of an index, in the same way that
an STL iterator is a generalization of an array pointer. But really the key thing is that
you can't independently manipulate cursors. So if you think about STL iterators, STL iterators are smart,
right? STL iterators know how to increment themselves. They know how to dereference themselves.
Cursors don't. So a cursor just represents a position in a sequence and nothing else. And
if you want to increment the cursor, you have to say to the sequence, please will you increment this for
me? If you want to access the element, if you want to like dereferencing for iterators, you have to
say to the sequence, please will you give me the element at this position? And this doesn't seem
like a very big change compared to iterators, but that's the
core of Flux, right? That's the difference between Flux and Ranges. And it turns out this has pretty far-reaching consequences.
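As a rough sketch, the cursor model he's describing might look something like this: the cursor is just a position, and all the operations live on the sequence (the names here are illustrative and may not match Flux's exact spelling):

```cpp
#include <cstddef>
#include <vector>

// Sketch of the cursor idea: the cursor is just a position (here, an index);
// all the "smarts" -- advancing, dereferencing, knowing the bounds -- live on
// the sequence. Names are illustrative, not necessarily Flux's actual API.
template <typename T>
struct VectorSequence {
    std::vector<T> data;

    using cursor = std::size_t;

    cursor first() const { return 0; }
    bool is_last(cursor cur) const { return cur == data.size(); }
    void inc(cursor& cur) const { ++cur; }                      // the sequence advances the cursor
    const T& read_at(cursor cur) const { return data[cur]; }    // the sequence does the access
};

int main() {
    VectorSequence<int> seq{{10, 20, 30}};
    for (auto cur = seq.first(); !seq.is_last(cur); seq.inc(cur)) {
        int x = seq.read_at(cur);   // "please give me the element at this position"
        (void)x;
    }
}
```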
Number one is that it pretty dramatically improves safety, which is kind of a big deal at the moment, right?
Because there are so many ways that you can accidentally stumble
into undefined behavior when you're using iterators.
Dereferencing a past the end iterator is undefined behavior. Incrementing past the
end iterator is undefined behavior. For a bidirectional sequence, decrementing a begin iterator is
undefined behavior, right? If you're holding onto an iterator
and you change the underlying vector,
you can have a dangling iterator.
Well, you can have dangling iterators
because you're holding onto this iterator
and the vector dies, but you still got the iterator.
You know, there are many ways in which
you can stumble into undefined behavior. This is much
more difficult if a cursor just represents a position. It's much more difficult to
accidentally stumble into undefined behavior land, right? Because, to give you an example, if
the cursor... let's just think of it as an index. If you're holding onto an integer index and your vector goes out of scope,
like there's no danger there
because you can't do anything with the index
without the original vector to like try and access it.
Right.
Right.
So immediately, dangling iterators
returned from functions, let's say,
are no longer a problem, like, by design.
We can also do, because we've always got the sequence object available,
we can do cheap bounds checking.
So this is something that is quite expensive to do with iterators
because it means iterators need to have a reference to the parent array
or to the parent range. Whereas with Flux, we do bounds checking universally.
And unless you specifically request unchecked access,
we will check and make sure that your cursor is in bounds
before we do the dereference.
And otherwise, well, it depends on the configuration
what happens: either we crash or we throw an exception.
But so we've got bounds checked access.
So this is like, I think, I mean,
for me, I think that's really important.
It's something that languages like Swift and Rust
and many languages do everywhere.
So we've got bounds checking, we've got no dangling.
There are various other things that turn out to be better.
Like in C++, we have to worry about iterators
versus const iterators, which is, you know, just
kind of a burden on the programmer. Particularly if you're implementing ranges,
you have to worry about plain iterators and const iterators, and they're not quite the same thing,
but you have to be able to convert from one to the other, and it's just a pain.
Whereas with cursors, because a cursor is,
you know, you can think of it as just being like an index,
you can have the same index,
whether you have a const vector or a non-const vector,
it doesn't matter, right?
We make the decision about const or non-const access
based on the sequence that you pass in.
There are a few other benefits as well when you get to writing adapters.
So when you are actually implementing range adapters, things like map, things like transform,
the way that the iterators for these things work is that you have to wrap the underlying iterator.
You see what I mean. So
like, in general, anyway, if you think about the filter, the iterator for a filter view,
it needs to contain the iterator for the underlying range, but also have a pointer
back to the filter view object so that it can access the predicate.
And so when you build up one of these ranges pipelines, your iterators, the size of your
iterators grows. Like for each added stage of your pipeline, in general, your iterator object
gets bigger and bigger. And in C++ algorithms, we assume that iterators are really cheap to copy.
You know, it's not a big deal, but it, you know,
it's not ideal.
Whereas in the Flux model with cursors,
there is less need to wrap these cursor objects
and have them get bigger and bigger.
So that I think is good. And also, what's also
quite nice about this is we can provide compatibility with ranges code because if we need to form
an iterator, we can form a C++20 compatible iterator by having effectively like a pointer
to a sequence and a cursor as a pair and then provide all of the, you know, whatever it is,
like 20 odd functions you've got to write for an iterator. So we can be compatible, like in some
sense this is even more fundamental than iterators, because we can provide
compatibility with existing iterators.
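The ranges-compatibility idea, bundling a pointer to the sequence together with a cursor to present a C++20-style iterator, might look roughly like this (a simplified sketch, not Flux's actual adaptor; a conforming iterator needs many more members than shown):

```cpp
// Rough sketch of how a (sequence pointer, cursor) pair can present an
// iterator-like interface over a cursor-based sequence such as the
// VectorSequence sketch above. Real C++20 conformance needs value_type,
// difference_type, sentinel support, and so on; this is just the shape.
template <typename Seq>
class cursor_iterator {
    Seq* seq_ = nullptr;
    typename Seq::cursor cur_{};
public:
    cursor_iterator() = default;
    cursor_iterator(Seq& seq, typename Seq::cursor cur) : seq_(&seq), cur_(cur) {}

    decltype(auto) operator*() const { return seq_->read_at(cur_); }
    cursor_iterator& operator++() { seq_->inc(cur_); return *this; }
    cursor_iterator operator++(int) { auto tmp = *this; ++*this; return tmp; }

    friend bool operator==(const cursor_iterator& a, const cursor_iterator& b) {
        return a.cur_ == b.cur_;
    }
};
```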
So I'm very excited about this. I think it's pretty good.
I've never talked about this before anywhere. I've, like, teased it on Twitter a couple of times, with, like,
hey, look at this, you can do
this with my code, doesn't it look awesome? But I've never actually talked about this
anywhere. So, um, yeah, announcing it to the world. Well, like, the first part of this episode will be, like,
you know, NanoRange, implementing NanoRange and how you can too, and then the
second part will be Flux, and then, in parentheses, and Flow,
but mostly about Flux.
Yeah, the other thing to mention,
and I didn't know this one.
I didn't know this when I started out,
but we were talking about Swift before.
It turns out that the Swift collection API
is pretty much the same as what I'm doing in Flux. So I didn't know this when I
started, right? Because I don't really know much about Swift at all. I only found this out later.
So I mentioned before, Swift has like these two different protocols, but the collection protocol
is this same abstraction of the idea of an index. They use different terminology,
they use different function names, but the fundamental design is the same, which actually gives me quite
a lot of confidence that I'm going in the right direction, if you know what I mean.
Because Swift is a safe language in all the same ways that Rust is a safe language, and if it works
for them, well, you know, there's a good chance it'll work for us.
And what I've heard too, uh, Dave Abrahams is, you know, he's a pretty,
pretty novice programmer that doesn't have too many years of experience.
Right.
Uh, yeah.
We'll just see.
Um, wow. That's, uh, that's quite interesting. Wow.
So obviously we'll definitely link to this. And, I mean, a couple... I mean, we never got back to the tweet of why you ended up here, but the second tweet.
Oh, um, yeah. Well, actually, did I link...?
Before I move on, I just got to say, I think you're on the right track with this model.
And I think that it opens up a lot of things, especially related to loop optimization that are challenging with the iterator model. And I think, especially like in numeric,
in computational science spaces, this cursor model is potentially a much better fit.
Yeah, yeah. So that's actually one thing I didn't mention is that we have
an extra customization point that is not in Ranges, but in Flux we have this
customization point for doing internal iteration.
Oh, that's interesting. That's similar to
what I've been messing around with. Like, yeah, this episode's not out, but the one
unreleased episode is basically... I mean, all the listeners have heard it now, they're listening
to this one, because it was like two episodes ago now, but it's Bryce trying to convince me that, like, filter is broken for the purposes of
parallelization, and then coming up with, like, a new filter, and then I'm like, my point is, Bryce, that's
not a filter, that's a transform. But anyways, did we talk in that episode about that, my spaces idea,
which is... which solved... it tackles this problem of internal iteration.
You did end up screen sharing,
because I was asking for a more motivating example. Yeah, because your example, I said, was
unmotivating, and then we got into an argument. You were like, that's your problem, not my problem, and
I was like, it seems like a you problem. And then you showed me a motivating example, and then I said,
oh, I see. Tell me how your model for internal iteration works. Okay. Okay, so we have a function that...
I called it for each while,
which I don't necessarily think that's a great name.
But the idea is that a sequence has this fundamental
customization point that's called for each while,
and you pass it a predicate.
And you say: iterate over
the sequence in the best way that you know how, iterate over every element
until the predicate returns false, and at that point it returns us the cursor
for the element for which the predicate returned false. And it turns out that
for things like filter and map and things like that,
and really an enormous number of these adapters,
you can do internal iteration in a way that is more efficient than just
stepping through and doing increment, dereference, increment, dereference,
you know, et cetera, et cetera.
And this makes a real difference when it comes to,
although the algorithmic complexity is the same,
it makes a real difference in terms of the generated code.
Like the compiler can see through this.
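A hedged sketch of the internal-iteration idea as described: the sequence drives the loop itself, calling the callable on each element and handing back the cursor where it stopped (simplified, not Flux's actual signature):

```cpp
#include <cstddef>
#include <vector>

// Sketch of an internal-iteration customization point in the spirit of what's
// described: the sequence iterates itself, calling pred on each element, and
// returns the cursor of the first element for which pred returned false
// (or the end cursor if it never did). Illustrative names only.
template <typename T>
struct VectorSequence {
    std::vector<T> data;
    using cursor = std::size_t;

    template <typename Pred>
    cursor for_each_while(Pred pred) const {
        std::size_t i = 0;
        for (; i < data.size(); ++i) {
            if (!pred(data[i])) break;   // stop where the predicate says so
        }
        return i;                        // cursor where iteration stopped
    }
};
```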
This was literally our last episode.
Well, we're two episodes on now, for the listener.
So for the benefit of the listener,
I've heard part one of the discussion on views filter.
And then you left me hanging, right?
You were like, we're just going to go and share the screen,
and that's the end of the episode.
So I have not heard the conclusion of that, but yeah.
So in particular, one of the,
I don't want to call it a problem,
but one of the things that you have to bear in mind
with C++ ranges, and this is true of Flux as well, and Barry talks about this in this fantastic talk, is when you have transform and then filter, then you have this issue of double dereference.
Because when you're filtering, you have to see whether the element matches the filter.
You do that in the increment operation.
And so that requires dereferencing the underlying iterator.
And then when the user comes to dereference
your filter iterator, you have to like, again,
dereference the underlying iterator.
So you get this double dereference happening.
It doesn't occur in the Rust model, for example,
but there are other equivalent problems with the Rust model.
But it turns out when you do internal iteration,
you don't need to do this double dereference, right?
So right away, this is a tremendous benefit.
And there are a huge number of algorithms
that can be written in terms of internal iteration.
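For instance, a filter adapter built on that kind of customization point can fold its own predicate and the caller's callable into one pass over each underlying element, which is how the double dereference goes away. A rough sketch, reusing the hypothetical names from the previous sketch (not Flux's actual implementation):

```cpp
// Sketch: a filter adapter doing internal iteration. Each underlying element
// is read exactly once -- the filter predicate and the user's callable both
// see that single read, so there is no double dereference.
template <typename Base, typename FilterPred>
struct filter_adapter {
    Base base;
    FilterPred keep;

    template <typename Pred>
    auto for_each_while(Pred pred) const {
        return base.for_each_while([&](const auto& elem) {
            if (!keep(elem)) return true;   // filtered out: keep iterating
            return pred(elem);              // forward to the user's callable
        });
    }
};
```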
I have this idea for a talk that I might flesh out at some point.
It's like, what is the most powerful algorithm in the STL?
And just to give away the punchline, my conjecture is that it's standard find if.
Really?
So I think, yeah.
I mean, he's not wrong, because a find if is essentially
just a reduction, and the correct answer is a reduction. I was like, there's no way
Bryce's answer isn't std::reduce or std::accumulate. Well, okay, so, um, Ben Deane has this
great talk, "std::accumulate: Exploring an Algorithmic Empire".
Yes, yes.
And he talks about all the things you can write in terms of standard accumulate,
which is a funny way of spelling fold.
But it sounds like you can write accumulate
in terms of find if.
But find if is just a funny way of spelling fold as well.
Well, yes.
I guess they're equivalent.
It depends if you're happy to have a stateful thing that you pass to it.
Wait, wait, wait.
You're saying you can implement accumulate in terms of find if or find if in terms of accumulate?
I think you can do either.
Well, what was first...
You have to have some way of terminating,
you have to have some way of terminating the iteration
early. That's the key, right? So that's what find if
does. Yeah. Well,
okay, so we could have a whole discussion
here now about...
you're coming back again, Tristan, and we're
going to have a find if versus...
the advantages and disadvantages
of having a find if that is required to
terminate early. So, but what was it that you initially said, though? What did you say? That you
can implement accumulate in terms of find if? Because that one seems easier than the other
way around, if the other way around is even possible.
So that is what I do
in Flux, because I'm not worried about parallelism,
which, um, you know, this is all just...
Are you suggesting that it's harder to implement find if
in terms of accumulate? Yes. I think you just said that backwards. No. Why? And why is the... what is
the number one reason that it's trickier? And you should know this, because I know you've tried this,
and you literally DMed me on Twitter, how do you do this thing to get around this? I don't know... One returns an iterator and one returns a value, and
trying to implement algorithms that... I don't care about, I'm not interested in the...
yeah, I'm not interested in the particulars of the interface. My point is that
we're talking about these algorithms in the abstract.
I think when we were spelling out std::find_if and std::accumulate we were
being pretty specific, right? But, like, we'll table this, because this definitely could be, like, a
whole 60-minute discussion. But so what you initially said was that implementing accumulate
in terms of find if is basically what you do in Flux, and it is your
opinion that, because find if... then therefore, because, in the hierarchy of what is built
on top of what, because accumulate is a child of find if, find if is the most, basically,
important or fundamental algorithm that everything else is built on top of.
That is my conjecture, until Bryce convinces me
otherwise, yes. So just about every algorithm... just about... a lot of algorithms that operate on a single sequence,
that boil down to some sort of reduce operation, or things like all or any or none, and all of
those you can write in terms of find if.
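As a rough illustration of that claim, here is how all-of and any-of style algorithms reduce to a find-if over a plain or negated predicate (a standard-library-flavoured sketch, not Flux code):

```cpp
#include <algorithm>
#include <vector>

// all_of / any_of expressed via std::find_if: the whole job is "scan until
// the predicate tells us to stop", which is exactly what find_if does.
template <typename It, typename Pred>
bool my_all_of(It first, It last, Pred p) {
    // fails as soon as we find an element that does NOT satisfy p
    return std::find_if(first, last, [&](const auto& x) { return !p(x); }) == last;
}

template <typename It, typename Pred>
bool my_any_of(It first, It last, Pred p) {
    return std::find_if(first, last, p) != last;
}

int main() {
    std::vector<int> v{2, 4, 6};
    bool all_even = my_all_of(v.begin(), v.end(), [](int x) { return x % 2 == 0; });
    (void)all_even;
}
```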
So if you can do a find if more efficiently,
for example, by using internal iteration,
then that has tremendous benefits in terms of-
Well, I mean, in serial,
you always will want to have a specialized version
of a find if, because you want to do early termination.
But in parallel, the way that you implement a find if is just to do a reduction,
at least on something like a GPU.
Yeah, for sure.
I mean, it's not my field of expertise, but, yeah.
One day, perhaps, I'll look at a parallel version of Flux, but many years from now, probably.
I need to get the serial version working first.
So what's interesting is that you probably... I'm not sure if you've even seen... I think you have seen the talk,
but at the end of my Algorithm Intuition talk, I present this table, and I present find if and
accumulate as, like, two of the most important. But the distinguishing difference is that
accumulate is a reduction that carries state with it, and a find if, yeah, is a reduction that does not
carry state with it. And so what do you do for that, then?
Do you have like a-
So, okay, so you're absolutely right.
And in this standard model,
you're not actually allowed to pass the,
I can't remember the exact term.
Anyway, you're not allowed to pass
a stateful thing to find if.
So we have to slightly, for influx,
we slightly weaken this requirement to allow you to pass.
He's doing the hand wavy thing too.
He's doing it in Flux, though. We were talking about the standard library.
I understand, but that was a different conversation than the conversation we were having there.
Yeah, so basically all of the
function objects that you ever pass to... almost all of the function objects you pass to standard library algorithms have this requirement that, however many times you call them,
they have to give you the same answer.
If we slightly weaken that,
or not slightly,
if we weaken that,
then you're allowed to pass a stateful thing into this.
That's why I call it for each while
and not find if,
because it's more akin to for each
than the stricter requirements that are on find if.
Oh, so this weakened find if is what is for each while.
Yeah, exactly.
And if someone can come up with a better name than for each while, then let me know.
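To make the accumulate-in-terms-of-find-if point concrete: once the callable is allowed to carry state, a fold falls out of the for-each-while sketch from earlier. A hedged illustration, not Flux's actual fold:

```cpp
#include <utility>

// Sketch: a fold (accumulate) written on top of the for_each_while sketch
// shown earlier. The lambda carries mutable state (the accumulator) and
// always returns true, so iteration never stops early. This relies on the
// weakened requirement discussed above: a stateful callable is allowed.
template <typename Seq, typename T, typename Op>
T my_fold(const Seq& seq, T init, Op op) {
    T acc = std::move(init);
    seq.for_each_while([&](const auto& elem) {
        acc = op(std::move(acc), elem);
        return true;   // never terminate early
    });
    return acc;
}
```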
Naming is not something I'm good at. You know what's great at naming?
ChatGPT. Uh, no, don't get me started on large language models, because I've
drunk the Kool-Aid in the past week. Yeah, maybe a whole separate episode. Yeah, well, we'll
table that as well. But, uh, yeah, for each while... this is my multi-dimensional iteration thing, which is that you have this notion of a space,
which is essentially just a collection of ranges. And then you have these space-aware algorithms
that recursively unpack them. And one of the key insights that I had was that each inner loop
needs to have the value of the induction variable of the outer loop. So, you know, if you have a
nested loop, where, like, the first for loop is
a for loop over some thing, and it gives you element k, and the innermost for loop gives you an element j.
When you call this space thing, when you extract the range from the space thing in the inner loop, you have to give it the value of whatever element you're at in the outer loop.
And that lets you incrementally build up your position as you go. I did a very,
very poor job of explaining that, but you were able to look at the code.
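For what it's worth, one possible reading of the "space" idea described here, where the inner range is produced from the outer induction variable, might look like this (all names hypothetical; this is just one interpretation of the description, not an actual library):

```cpp
#include <cstddef>
#include <vector>

// Very rough interpretation of the "space" idea described above: the inner
// range is a function of the outer induction variable, so each inner loop is
// constructed from the position reached in the outer loop, and the
// multidimensional position (k, j) is built up incrementally.
int main() {
    std::size_t outer_extent = 4;

    // The rule for making the inner range from the outer index
    // (here, a triangular iteration pattern, purely as an example).
    auto inner_range_for = [](std::size_t k) {
        std::vector<std::size_t> inner;
        for (std::size_t j = 0; j <= k; ++j) inner.push_back(j);
        return inner;
    };

    for (std::size_t k = 0; k < outer_extent; ++k) {
        for (std::size_t j : inner_range_for(k)) {
            // (k, j) is the incrementally built-up position
            (void)k; (void)j;
        }
    }
}
```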
I'm getting like eerie, like just remembering university lectures about tensors and stuff.
And yeah, horrible flashbacks to things I didn't really understand at the time.
Be sure to check the show notes either in your podcast app or at adspthepodcast.com
for links to any of the things that we talked about in today's episode,
as well as a link to a GitHub discussion where you can leave comments,
questions or thoughts that you have on this episode.
Thanks for listening.
We hope you enjoyed and have a great day.