Algorithms + Data Structures = Programs - Episode 126: Flux (and Flow) with Tristan Brindle

Episode Date: April 21, 2023

In this episode, Conor and Bryce chat with Tristan Brindle about his new library Flux and its predecessor library Flow.

Link to Episode 126 on Website
Discuss this episode, leave a comment, or ask a question (on GitHub)
Twitter: ADSP: The Podcast, Conor Hoekstra, Bryce Adelstein Lelbach

About the Guest
Tristan Brindle is a freelance programmer and trainer based in London, mostly focussing on C++. He is a member of the UK national body (BSI) and ISO WG21, and can occasionally be found at C++ conferences. He is also a director of C++ London Uni, a not-for-profit organisation offering free beginner programming classes in London and online. He has a few fun projects on GitHub that you can find out about here.

Show Notes
Date Recorded: 2023-04-05
Date Released: 2023-04-21
ADSP Episode 125: NanoRange with Tristan Brindle
Keynote: Iterators and Ranges: Comparing C++ to D, Rust, and Others - Barry Revzin - CPPP 2021
Rust Iterators
Flow
Flux
Swift Sequences
Episode 124: Vectorizing std::views::filter
C++ std::find_if
C++17 std::reduce
C++ std::accumulate
CppCon 2016: Ben Deane "std::accumulate: Exploring an Algorithmic Empire"

Intro Song Info
Miss You by Sarah Jansen https://soundcloud.com/sarahjansenmusic
Creative Commons — Attribution 3.0 Unported — CC BY 3.0
Free Download / Stream: http://bit.ly/l-miss-you
Music promoted by Audio Library https://youtu.be/iYYxnasvfx8

Transcript
Starting point is 00:00:00 I have this idea for a talk that I might flesh out at some point. It's like, what is the most powerful algorithm in the STL? And just to give away the punchline, my conjecture is that it's std::find_if. Really? So I think, yeah. I mean, he's not wrong, because a find_if is essentially just a reduction. And the correct answer is a reduction. I was like, there's no
Starting point is 00:00:25 way Bryce's answer isn't std::reduce. Welcome to ADSP: The Podcast, episode 126, recorded on April 5th, 2023. My name is Conor. Today with my co-host, Bryce, we interview Tristan Brindle. This is part two of a three-part interview, and today we talk about his libraries Flux and Flow. So, okay. So at this point, it's worth mentioning, Barry Revzin has this amazing talk. You've probably seen it. I think he gave it as a keynote at CPPP, where he compares the iteration schemes of various languages. Rust and C++, yeah.
Starting point is 00:01:15 Yeah, Rusty, C++. He also mentions like C Sharp and Java and Python and stuff. And it's an amazing talk. So if anybody's interested in this stuff, like really do go and watch this. I can't, I can't recommend it highly enough. But I, I, so I was kind of interested in like how, before Barry's talk, how, how do other languages do this? And I started writing a new library that was implementing Rust style iterators in C++. So for maybe people
Starting point is 00:01:48 who know C++ but don't know Rust, the Rust iteration scheme is really quite different to how it's done in C++. So what they do is they have this object called an iterator, and this iterator has a function called next. And what next does is it returns you an optional with the next value in the sequence right so you call next the first time it returns you the first object an optional containing the first object in the sequence call next again it returns you the second object in the sequence and so on until eventually the sequence is exhausted and that point it returns you an empty optional which tells you that the iteration is complete so it's an alternative way of doing iteration um there are pros and cons to it the pro being it's very very much simpler than the c plus plus model right um the downside is
Starting point is 00:02:40 that uh it's more limited it's not as powerful as what you can do with C++ where you can go all the way to random access. So I wrote this library, it's called- That's quite a big limitation, especially once you start thinking about wanting to have implementations of your algorithms that consume iterables that are optimized for parallelism and for vectorization,
Starting point is 00:03:10 if you don't have a way to do random access, that's kind of a problem. Yeah, well, this was, so I'm absolutely by no means a Rust expert and i'm sure the rust people will say no no no you do it this way instead right so i don't i don't want to step on anybody's toes there must be some something that we're missing here is my guess yeah yeah um as i say not my not my not my my sort of field of expertise for how all this parallelism stuff is done in Rust. But for me, so I wrote this library.
Starting point is 00:03:50 It's actually on GitHub. It's called Flow. It kind of does this Rust iteration scheme in C++, just kind of for my own, like, just for fun, really, just to kind of see how this was done. But I found this limitation of not being able to go all the way to random access, you know, this bothered me. So I was like, okay, is there a way we can extend the Rust iteration scheme, or Rust-like iteration scheme? It's actually used in a couple of other languages as well.
Starting point is 00:04:22 We can extend this to do something like Rust, something like random access. And it turns out that the way you actually implement these iterators in the Rust, or the way I implemented these, is you have like your iterator object, and it holds a cursor to the underlying sequence vector, let's say. So your cursor could be just an integer index. And every time someone calls next, you bump the index and you return the next element. So I was like, okay, well, what if we provided an API to expose these cursors to the user so that you could then move the cursor in arbitrary
Starting point is 00:05:08 amount as you need to for random access. And then I did some experiments with this and I was like, you know, this works. But actually, if you do this thing, if you expose these cursors um you know to the user and allow them to manipulate cursors you really don't need this dot next function at all you can do it all with cursors um so that was my starting point for this new library that um i've been working on called flux so i was like okay scrap everything I've done. I'm going to start again from the beginning and we're going to have this API based on cursors. So what is a cursor? So this is getting into Flux now. This is the new thing, right? What is a cursor? A cursor
Starting point is 00:05:59 is no more and no less than a position in a sequence, right? So the most general kind of cursor is just an integer index into an array, right? Cursors are a generalization of array indices. It's so funny that you say that because I've been using the same term for the same thing in this multidimensional indices library that I've been playing around with. I must have heard you say it before or something, because I recall hearing somebody
Starting point is 00:06:32 use this terminology. I just didn't remember who. Well, I don't think I've talked about this. Maybe we were having a chat at a conference or or something but i can't remember talking about it before but um yeah so a curse is like a like a generalization uh of an index in the same way that an stl iterator is a generalization of an array pointer but really the key thing is that you can't independently manipulate curses so if you think about STL iterators, STL iterators are smart, right? STL iterators know how to increment themselves. They know how to dereference themselves. Cursors don't. So a cursor just represents a position in a sequence and nothing else. And if you want to increment the cursor, you have to say to the sequence, please will you increment this for
Starting point is 00:07:25 me? If you want to access the element, if you want to like dereferencing for iterators, you have to say to the sequence, please will you give me the element at this position? And this doesn't seem like a very big change compared to iterators. But it turns out this has pretty far, and that's the core of Flux, right? That's the difference between Flux and Ranges. But it turns out this has pretty far reaching consequences. Number one is that it pretty dramatically improves safety, which is kind of a big deal at the moment, right? Because there are so many ways that you can accidentally stumble into undefined behavior when you're using iterators. Dereferencing a past the end iterator is undefined behavior. Incrementing past the
Starting point is 00:08:14 end iterator is undefined behavior. Bidirectional sequence decrementing a begin iterator is undefined behavior, right? If you're holding onto an iterator and you change the, if it's vector and you change the underlying vector, you're gonna have, you can have a dangling iterator. Well, you can have dangling iterators because you're holding onto this iterator and the vector dies, but you still got the iterator. You know, there are many ways in which
Starting point is 00:08:42 you can stumble into undefined behavior this is much more difficult if a cursor just represents a position this you it's much more difficult to accidentally stumble into undefined behavior land right because to give you an example if the cursor let's just think of it as a as index. If you're holding onto an integer index and your vector goes out of scope, like there's no danger there because you can't do anything with the index without the original vector to like try and access it. Right.
Starting point is 00:09:15 Right. So just immediately dangling your traces like return from functions, let's say, are no longer a problem like by design. We can also do, because we've always got the sequence object available, we can do cheap bounds checking. So this is something that is quite expensive to do with iterators because it means iterators need to have a reference to the parent array
Starting point is 00:09:42 or to the parent range. Whereas with influx, we do bounds checking universally. And unless you specifically request unchecked access, we will check and make sure that your cursor is in bounds before we do the dereference. And otherwise, well, depends on the configuration, what happens either we crash or we throw an exception. But so we've got bounds checked access. So this is like, I think, I mean,
Starting point is 00:10:16 for me, I think that's really important. It's something that languages like Swift and Rust and many languages do everywhere. So we've got bounce checking, we've got no dangling. There are various other things that turn out to be better. Like in C++, we have to worry about iterators versus const iterators, which is, you know, just kind of a burden on the programmer to, particularly if you're implementing ranges,
Starting point is 00:10:53 you have to worry about bad iterators and const iterators, and they're not quite the same thing, but you have to be able to convert from one to the other, and it's just a pain. Whereas with cursors, because a cursor is, you know, you can think of it as just being like an index, you can have the same index, whether you have a const vector or a non-const vector, it doesn't matter, right? We make the decision about const or non-const access
Starting point is 00:11:17 based on the sequence that you pass in. There are a few other benefits as well when you get to writing adapters. So when you are actually implementing range adapters, things like map, things like transform, the way that the iterators for these things work is that you have to wrap the underlying iterator. You see what I mean. So like, in general, anyway, if you think about the filter, the iterator for a filter view, it needs to contain the iterator for the underlying range, but also have a pointer back to the filter view object so that it can access the predicate.
Starting point is 00:12:08 And so when you build up one of these ranges pipelines, your iterators, the size of your iterators grows. Like for each added stage of your pipeline, in general, your iterator object gets bigger and bigger. And since C++ algorithms, we assume that iterator is really cheap to copy. You know, it's not a big deal, but it, you know, it's not ideal. Whereas in the Flux model with cursors, there is less need to wrap these cursor objects and have them get bigger and bigger.
Starting point is 00:12:42 So that I think is good. And also, what's also quite nice about this is we can provide compatibility with ranges code because if we need to form an iterator, we can form a C++20 compatible iterator by having effectively like a pointer to a sequence and a cursor as a pair and then provide all of the, you know, whatever it is, like 20 odd functions you've got to write for an iterator. So we can be compatible, like in some sense this is even more fundamental than iter iterators because we can we can provide compatibility with existing with existing iterations, so I'm very excited about this. I think it's I think it's pretty good
Starting point is 00:13:37 This is I've never talked about this before anywhere. I've like tease it on Twitter a couple of times with like Hey, look at this you can do this with my code doesn't it look awesome but i've never actually talked and talked about this uh anywhere so um yeah announcing to the world well like first part of this episode will be like uh you know nano nano ranges implementing nano ranges and how you can too and then the the second part will be uh flux and then in parentheses and Flow, but mostly above Flux. Yeah, the other thing to mention,
Starting point is 00:14:11 and I didn't know this one. I didn't know this when I started out, but we were talking about Swift before. It turns out that the Swift collection API is pretty much the same as what I'm doing in Flux. So I didn't know this when I started, right? Because I don't really know much about Swift at all. I only found this out later. So I mentioned before, Swift has like these two different protocols, but the collection protocol is this same abstraction of the idea of an index. They use different terminology,
Starting point is 00:14:44 they use different function names, but the kind of the fundamental design is the same abstraction of the idea of an index they use different terminology they use different function names but the kind of the fundamental design is the same which actually gives me quite a lot of confidence that i'm i'm going in the right direction if you know what i mean like because swift is a safe language in all the same ways that russ does safe language and if it works for them well you know there's a good chance it'll work for us and what i've heard too uh dave abrahams is you know he's a pretty pretty novice work for us. And what I've heard too, uh, Dave Abrahams is, you know, he's a pretty, pretty novice programmer that doesn't have too many years of experience. Right.
Starting point is 00:15:09 Uh, yeah. We'll just see. Um, wow. That's, uh, that's quite interesting.
Starting point is 00:15:16 Wow. So obviously we'll definitely link to this and, and I mean, uh, a couple, I mean, we never got back to the tweet of why you ended up here, but the second tweet.
Starting point is 00:15:23 Oh, um, yeah. Well, actually, did I link before I move on? I just got to say, I think, I think, we never got back to the tweet of why you ended up here, but the second tweet. Oh, yeah. Well, actually, did I link? Before I move on, I just got to say, I think I think you're you're on the right track with this with this model. And I think that it opens up a lot of things, especially related to loop optimization that are challenging with the iterator model. And I think, especially like in numeric, in computational science spaces, this cursor model is potentially a much better fit.
Starting point is 00:15:54 Yeah, yeah. So that's actually one thing I didn't mention is that we have an extra customization point that is not in ranges uh but we but uh in flux we have this customization point for doing internal iteration oh that's interesting that's that's similar to what i've been messing around with in uh like yeah this episode's not out but the the one unreleased episode is basically bright i mean all the listeners have heard it now they're listening to this one because it was like two episodes ago now but it's bryce trying to convince me that like filter is broken for the purposes of parallelization and then coming up with like a new filter and then i'm like my point is bryce that's not a filter that's a transform but anyways did we talk in that episode about that my my spaces idea
Starting point is 00:16:38 which is which is solved it tackles this problem of internal iteration you did end up screen sharing because i was asking for a more motivating example yeah because your example i said was unmotivating and then we got into an argument you were like that's your problem not my problem and i was like it seems like a you problem and then you showed me a motivating example and then i said oh i can tell me how your model for internal iteration works okay Okay, so we have a function that it's called, I called it for each while, which I don't necessarily think that's a great name. But the idea is that a sequence has this fundamental
Starting point is 00:17:19 customization point that's called for each while, and you pass it a predicate. And you say uh iterate all of iterate over the sequence in the way best way that you know how iterate over every element until the predicate returns false and at that point it returns us the cursor uh for the the element that for which the predicate returned false and it turns out that for things like filter uh and map and things like that, and really an enormous number of these adapters,
Starting point is 00:17:49 you can do internal iteration in a way that is more efficient than just stepping through and doing increment, dereference, increment, dereference, you know, et cetera, et cetera. And this makes a real difference when it comes to, although the algorithmic complexity is the same, it makes a real difference in terms of the generated code. Like the compiler can see through this. This was literally our last episode.
Starting point is 00:18:13 We're two episodes now for the listener. So for the benefit of the listener, I've heard part one of the discussion on views filter. And then you left me hanging, right? You were like, we're just going to go and share the screen. And that's the end of the discussion on views filter. And then you left me hanging, right? You were like, we're just going to go and share the screen and that's the end of the episode. So I have not heard the conclusion of that, but yeah. So in particular, one of the,
Starting point is 00:18:38 I don't want to call it a problem, but one of the things that you have to bear in mind with C++ ranges, and this is true of Flux as well, and Barry talks about this in this fantastic talk, is when you have transform and then filter, then you have this issue of double dereference. Because when you're filtering, you have to see whether the element matches the filter. You do that in the increment operation. And so that requires dereferencing the underlying iterator. And then when the user comes to dereference your filter iterator, you have to like, again,
Starting point is 00:19:13 dereference the underlying iterator. So you get this double dereference happening. It doesn't occur in the Rust model, for example, but there are other equivalent problems with the Rust model. But it turns out when you do internal iteration, you don't need to do this double D reference, right? So right away, this is a tremendous benefit. And there are a huge number of algorithms
Starting point is 00:19:42 that can be written in terms of internal iteration. I have this idea for a talk that I might flesh out at some point. It's like, what is the most powerful algorithm in the STL? And just to give away the punchlines, my conjecture is that it's standard findif. Really? So I think, yeah. I mean, he's not wrong because a find if is essentially just a reduction and the correct the correct answer is a reduction i was like there's no way
Starting point is 00:20:13 bryce's answer isn't stood reduce or or stood accumulator well okay so so um ben dean has this has a great talk where he did accumulate's going to accumulate an algorithmic empire. Yes, yes. And he talks about how all the things you can write in terms of standard accumulates, which is a funny way of spelling fold. But it sounds like you can write accumulate in terms of find if. But find if is just a funny way of spelling fold as well.
Starting point is 00:20:48 Well, yes. I guess they're equivalent. It depends if you're happy to have a stateful thing that you pass to. Wait, wait, wait. You're saying you can implement accumulate in terms of find if or find if in terms of accumulate? I think you can do either. Well, what was first you have to have some way of terminating
Starting point is 00:21:07 you have to have some way of terminating the iteration early that's the key right so that's what find if does yeah well okay so we could have a whole discussion here now about you're coming back again Tristan and we're going to have a find if versus the advantages and disadvantages
Starting point is 00:21:22 of having a find if that is required to terminate early so but what what was it that you initially said though what do you said that you can implement accumulate in terms of find if because that one seems easier than the other way around if the other way around is even possible that's so that is what uh i do in in flux because i'm not worried about parallelism which um you know this is all just uh are you suggesting that it's harder to implement find if in terms of accumulate yes i think you just said that backwards no why and why is the what what is the number one reason that it's trickier and you should know this because i know you've tried this
Starting point is 00:22:01 and you literally dm me on twitter how do you do this thing to get around this i don't know one returns an iterator and one returns a value and trying to implement uh algorithms that i don't care about i don't i'm not interested in the yeah i'm not interested in the particular in the particulars of the interface my my point is that we're we're talking about these algorithms in the abstract i think when we were spelling out std colon colon find if instead colon colon accumulate we were being pretty specific right but like we will we'll table this because this definitely could be like a whole 60 minute discussion but so what you initially said was that implementing accumulate in terms of find if is basically what you do in flux and it is your
Starting point is 00:22:45 opinion that because find if then therefore because the in the hierarchy of what is built on top of what because accumulate is a child of find if find if is the most basically uh important or fundamental algorithm that everything else is built on top of that is my conjecture until bryce convinces me otherwise yes so just about every algorithm just about a lot of algorithms that operate on a single that the boil down to some sort of reduce operation or things like uh all or any and none and all of those um you can you can write in terms of find if. So if you can do a find if more efficiently, for example, by using internal iteration,
Starting point is 00:23:32 then that has tremendous benefits in terms of- Well, I mean, in serial, you always will want to have a specialized version of a find if, because you want to do a specialized version of a Findif because you want to do it in early termination. But in parallel, the way that you implement a reduction is to just do it with a – in parallel, the way that you implement a Findif is just to do a reduction, at least on something like a GPU.
Starting point is 00:23:59 Yeah, for sure. I mean, it's not my field of expertise, but, yeah. One day, perhaps, I'll look at a parallel version of Flux, but many years from now, probably. I need to get the serial version working first. So what's interesting is that you probably, I'm not sure if you've even seen, I think you have seen the talk, but at the end of my intuition, algorithm intuition i present this table and i present find if and accumulate as like two of the most important but the the distinguishing difference is that accumulate is a reduction that carries state with it and a find if yeah is a reduction that does not
Starting point is 00:24:42 carry state with it and so that what do you do for that then? Do you have like a- So, okay, so you're absolutely right. And in this standard model, you're not actually allowed to pass the, I can't remember the exact term. Anyway, you're not allowed to pass a stateful thing to find if.
Starting point is 00:25:00 So we have to slightly, for influx, we slightly weaken this requirement to allow you to pass. He's doing the hand wavy thing too. He's doing it in flux though. We were talking about the standard library. I understand, but that was a different conversation than the conversation we were having there. Yeah, so basically all of the functional objects that you ever pass to, almost all of the functional objects you pass to standard library algorithms have this requirement can't give you, however many times you call them, they have to give you the same answer.
Starting point is 00:25:47 If we slightly weaken that, or not slightly, if we weaken that, then you're allowed to pass a stateful thing into this. That's why I call it for each while and not find if, because it's more akin to for each than the stricter requirements that are on find if.
Starting point is 00:26:06 Oh, so this weakened find if is what is for each while. Yeah, exactly. And if someone can come up with a better name than for each while, then let me know. Naming is not something I'm good good at you know what's great at naming chat gpt uh no don't don't don't get me started on uh on large language bottles because i've had a i've drunk the kool-aid in the past week yeah maybe a whole separate episode yeah well we'll table that as well but uh yeah, for each while... this is my multi-dimensional iteration thing, which is that you have this notion of a space, which is essentially just a collection of ranges. And then you have these space-aware algorithms
Starting point is 00:27:17 that recursively unpack them. And one of the key insights that I had was that each inner loop needs to have the value of the induction variable at the outer loop. So, you know, if you have a nested loop of, you know, k, where you iterate over k, where you, like the first for loop is a for loop over some thing, and it gives you, element K and the innermost for loop is an element J. When you call this space thing, when you extract the range from the space thing in the inner loop, you have to give it the value of whatever element you're at in the outer loop. And that lets you incrementally build up your position as you go. I did a very, very poor job of explaining that, but they're able to look at code. I'm getting like eerie, like just remembering university lectures about tensors and stuff.
Starting point is 00:28:25 And yeah, horrible flashbacks to things I didn't really understand at the time. Be sure to check the show notes either in your podcast app or at ADSP, the podcast.com for links to any of the things that we talked about in today's episode, as well as a link to a GitHub discussion where you can leave comments, questions or thoughts that you have on this episode. Thanks for listening. We hope you enjoyed and have a great day.
