CppCast - Sequence-Oriented Programming
Episode Date: July 7, 2023

Tristan Brindle joins Timur and Phil. Tristan talks to us about a safer alternative to iterators and his library, Flux, that implements it.

News
- 60 terrible tips for a C++ developer
- Big, combined committee trip report from Varna
- CLion 2023.2 EAP4: AI Assistant
- "Making C++ Memory-Safe Without Borrow Checking, Reference Counting, or Tracing Garbage Collection"

Links
- Episode 78 of CppCast, mentioning Cling
- Episode of ADSP recorded at C++ on Sea
- Episode 152 of CppCast, with Tristan and the C++ London Uni team
- Flux on GitHub
Transcript
Episode 364 of CppCast with guest Tristan Brindle, recorded 3rd of July 2023.
This episode is sponsored by JetBrains, smart IDEs to help with C++.
And Sonar, the home of clean code. In this episode, we talk about what happened at C++ on Sea,
CLion's new AI assistant,
and different techniques for memory safety.
Then, we are joined by Tristan Brindle.
Tristan talks to us about sequence-oriented programming.
And Flux, a C++20 library that supports it.
Welcome to episode 364 of CppCast,
the first podcast for C++ developers by C++ developers.
I'm your host, Timur Doumler, joined by my co-host, Phil Nash.
Phil, how are you doing today?
I'm all right, Timur. How are you doing?
I'm not too bad. I'm just still recovering from C++ on Sea.
It was amazing, but I was still quite tired.
I didn't get very much sleep last week.
But yeah, it was an amazing conference.
So I'm still kind of in recovery mode, taking it slow.
How are you? You're probably even more tired, because it's your conference, isn't it? I was there as well.
Yes, I did have a slightly different perspective on it. Very tired, but in a good way.
I think it went very well overall. A couple of technical issues right at the start, but I mean,
that's quite common with these things. It went pretty smoothly after that. And we had about 50 more people than last year, so it's just back to growth again. So, exhausted
but happy, I think. All right, amazing. Actually, one thing that happened at C++ on Sea is that another
podcast, ADSP, the one run by Connor and Bryce, they had a special episode, episode 136, and they recorded
that at the speaker's dinner
at the conference, like in this big room with all these people. And they actually
did this fun thing where they interviewed all the other people in the room who were
running podcasts. So Phil and I are on it. I had, I think, quite a few beers at the time when they
interviewed me, so I'm not sure if the things I was saying there
made a lot of sense, but yeah, I think it was quite an entertaining episode. So shout out to
ADSP. Thanks for doing that. I think it was a lot of fun. Yeah. And that episode's already released.
Connor was editing it during one of the other talks, I think. So you can go and listen to it
now. We'll put the link in the show notes. All right. So at the top of every episode,
I'd like to read a piece of feedback. And this time we have a comment from Jamie on Reddit.
Jamie says, I'm new to the whole podcasting thing, and I was glad to find CppCast.
It's a great way to get exposure to all the projects and tools available in a C++ ecosystem.
I've been going through the episodes from the beginning, and the ChaiScript episode made me
think of the CERN ROOT framework's interpreter, Cling, a kind of REPL for C++ that allows you to prototype interactively, a bit like Python or Node.
In my opinion, it's kind of surprising it doesn't get more exposure. The original application was
for data analysis, but I think it has a lot of potential in more general settings. Given that
it's based on the LLVM Clang stack, now as a more general purpose tool called Clang REPL,
it would be really cool to have an episode about this. Well, thank you, Jamie. That's a really good piece of feedback.
I went digging actually before we recorded this episode and I found that Cling was actually
mentioned briefly in the news section of episode 78, which was all the way back on 9th of November
2016. That was the one with Odin Holmes. And since then, from what I can tell, it really
hasn't been mentioned on the show. So I think it's time to look into that direction. What do you
think, Phil? It sounds like a kind of exciting thing to talk about on the show. Yeah, I'm glad
you dug into that because I did seem to remember it being mentioned on the show, but I couldn't
remember exactly when. So there we have it, episode 78. Definitely seems like something that,
you know, it's time may have come again. So we'll have a look into that.
Thanks for the suggestion.
I think it would be tying in really nicely
into our little mini series on tooling
that we're kind of doing, right?
Absolutely.
So this episode is not part of that,
but we're kind of interleaving them
with episodes about other topics.
So it's not just about tooling,
but we have this loose plan
to have kind of more episodes about tooling.
So I think that would tie really nicely into that.
Yep.
All right.
So we'd like to hear your thoughts about the show.
And you can always reach out to us on Twitter or Mastodon or email us at feedback at cppcast.com.
Joining us today is Tristan Brindle.
Tristan is a C++ consultant and trainer based in London. With over 15 years' C++ experience,
Tristan started his career working in high-performance computing
in the oil industry in Australia
before returning home to his native UK in 2017.
He is an active member of the ISO C++ Standards Committee
and the BSI C++ Panel.
He is a regular speaker at C++ conferences around the world
and is a former director of C++ London Uni,
a non-profit initiative offering free introductory programming classes in London and online.
Tristan, welcome to the show.
Thank you very much. It's great to be here. Thank you for inviting me.
So, Tristan, you mentioned C++ London Uni in your bio.
Now, I know a thing or two about that because I was there when you set it up.
But for those that didn't listen to the episode that you were on before, when you talked about that,
you've got to tell us briefly what it was.
Yeah, so this was a spin-off from C++ London, which is the
meetup in London run by a certain Phil Nash. And, I mean, like most meetups, they're kind of, you know, aimed at professional developers.
And one of the people who was coming along was a C++ learner, a guy by the name of Tom Breza.
And he was keen on getting some more sort of introductory-level material in there, and
Phil suggested that perhaps this could be a sort of spin-off event.
And I kind of volunteered to help out with that and do some teaching.
And so we spun this off into C++ London Uni, where we did exactly that.
Offered introductory classes, weekly classes, starting from scratch, teaching C++.
And yeah, it was a lot of fun.
Unfortunately, as with many things, COVID sort of put paid to it.
We couldn't have our weekly meetups and it all sort of went away, which is a bit sad,
but it was lots of fun while it lasted.
All right, Tristan, so we'll get more into your work in just a few minutes, but we have
a couple of news articles to talk about. So feel free to comment on any of these, okay? Sure.
So the first one we have is: Andrey Karpov from PVS-Studio wrote a mini-book called
60 Terrible Tips for a C++ Developer. It's available online for free. It's really cool.
It's kind of fun and serious at the same time. He has these 60 terrible tips, which are written very sarcastically.
For example, one terrible tip is: use nested macros everywhere, because it's a good way
to shorten code, you will free up hard drive space, and your teammates will have lots of fun
when debugging. So there are like 60 of those, but then below each one there's actually an
explanation, and then an actually good tip, what you should be doing instead and why the terrible tip is bad.
And so I thought that was kind of fun. It's a very humorous resource, but also very
educational at the same time. So yeah, I thought that was fun and just wanted to recommend checking it out.
Yeah, I didn't read them all. I dipped into a few of them, and yeah, I agree. Past that initial sort of humorous device of presenting the terrible tip, which didn't really get in the way too much,
it is followed up with some quite sensible advice that's well written. And I picked a few
that I often see oversimplified or not really fully understood, like the ones in
there about floating-point comparisons, and there's one on globals, and they had pretty good coverage. So yeah,
thumbs up, both entertaining and useful.
Same as Phil, you know, I haven't read all 60 of them.
I sort of scanned a few, and it does look, like you say, both entertaining and really informative as
well. So yeah, I'm definitely going to study it in more detail. So the next thing I wanted to
mention is just a follow-up from last time.
We were talking about a couple of trip reports from the Varna meeting.
That's now a few weeks ago.
And there was this big trip report put together by a bunch of people, which usually appears
on Reddit.
And it wasn't available yet at the time when we recorded the last episode, but it is available
now.
And that's probably the most comprehensive report about what actually happened at the VANA meeting
that's out there.
So I just wanted to say that that's available now.
We're going to put a link in the show notes.
All right.
So the next thing is from our friends at CLion.
CLion always has these early access program
kind of preview versions that you can download for free.
And the latest 2023.2 EAP4 introduces an exciting new feature,
which is the AI assistant.
And that's really kind of a new thing, but it's been, I guess,
lately it's been more and more of a thing that people use AI tools
to help them code.
And now we have actually one built into an IDE.
So that's really cool.
There's a blog post on how this works, kind of connects to OpenAI and also some other
JetBrains large language models.
And in the future, it will connect to other large language models as well.
You get this kind of AI chat window where you can ask AI questions and ask it to, I
don't know, generate some
code for you.
And then you can like insert it at the caret into your editor or copy-paste it into your
code.
Or you can just select some code in your IDE.
And there's like a new little menu that says, you know, explain code, suggest refactoring,
find potential problems or a custom prompt, that kind of stuff.
So there's like a new AI Actions submenu where you
can ask the AI to do stuff with your code. It can generate a commit message for you, it can explain
CMake errors to you in addition to C++ errors, and other things. And yeah, it's pretty comprehensive
and pretty cool to play around with. So it's going to be in the next release, but there is a
preview available now.
Yeah, and I've not actually tried it myself yet, but it's definitely a cool feature. Very much in fashion,
of course. And in fact, there's a tie-in with the closing keynote from C++ on Sea last week. You'll
have to watch the video to see what I mean about that. I did have one question,
though is there any way to disable it from even appearing as an option?
The reason I ask is because I know a lot of organizations
have policies against using any of these AI generative tools for now.
And just wondering if there might be any sort of issues with that.
Have you heard about anything?
Yeah, so you don't actually get it by default if you just open CLion.
So you have to actually log into your JetBrains account
and then enable the AI feature there in your account in order to use it.
And for example, it's not available in all regions yet and stuff like that.
So I think, like, I'm not sure.
But from what I've seen, you can just kind of not enable that option in your account and then you're just not going to have access to it.
So by default, it's off as far as I understand.
But if that's somehow not accurate, I'm going to follow up on that next time. But that's kind of my current understanding: you have to actually actively switch it on in your
account, and not everybody gets access to it, depending on where you are. Or I think there's
a limited number of people that get access to it, and then that number is going to increase
over time. I think that's kind of what everybody's doing. Yeah, yeah. I'd be interested to see where it goes.
Yeah, definitely.
I use CLion, but I haven't tried out this new feature yet.
But certainly some of the things like auto-generating commit messages
and stuff, that sounds great.
It sounds like it would save you a lot of work.
So, yeah, I'm keen to try it out.
Yeah, it just came out literally a few days ago.
So it's like brand new.
All right.
So the last news item is something that ties into something else that happened at C++
on Sea, which is we had lots of talks and discussions about safety in C++.
I actually myself had a talk that was called C++ and Safety.
We had obviously Sean Parent's opening keynote that was all about safety.
And a lot of the discussion was around,
well, we can't really make C++ memory safe language because you would have to do something
like Rust's borrow checker
and that kind of doesn't really work
with the way C++ does references
and move semantics and iterators
and all of that stuff.
And we're going to get into some of that stuff later with Tristan
from kind of a different perspective.
But it was this kind of a thing where like,
well, C++ is this one model.
And then Rust, for example, or Val have like a very different model.
And they're kind of not really reconcilable with each other.
And so there was this really fascinating blog post,
which is called making C++ memory safe without borrow checking,
reference counting, or tracing garbage collection, which is obviously the other way you can make
language memory safe, which we can't really do in C++ either because of the performance
implication. And so that's a really, really long blog post, which I haven't fully read and digested
yet because there's just so much information in there. And it's quite information dense as well.
But yeah, the author of that blog post actually looked at quite a number of
programming languages, Vale and Val, which are actually two different programming languages,
which I didn't know before. There's Val, and there's Vale, and there are also other programming languages
that I've never heard about before, like Austral and Gel and Inko. Has any of you heard
about any of those? I have not.
I've heard of Vale, and I think the author of the blog post is also behind the Vale language. There's also a
language called Vala, which is really confusing.
So yeah, we had Dimi Racordon, who's one
of the developers of the Val language, here on the show a few months ago, but I've not heard
about any of the other languages. So what this person did is they went into all these languages and looked at the different
ways that memory safety is achieved in those languages. So Val, which we actually had on the show,
there's this thing called mutable value semantics. But apparently there's a whole plethora of
other methods, and so this blog post kind of dives really deep and explores these other methods to, you know,
come up with a memory-safe language
and how to apply them to C++ through different means,
like restricting the feature set, static analysis,
kind of, you know, or like changing things here and there.
And it discusses, so the Val mutable value semantics,
which actually it calls simplified borrowing.
Yeah.
It's kind of a subset of the Rust Borrow Checker,
but then it discusses three other techniques
that I was completely unfamiliar with.
One is called borrowless affine style.
The other one is called constrained references.
And the last one is called generational references
and random generational references.
That's apparently from the Vale language.
And yeah, it's a huge blog post.
And apparently, there are quite a number of ways to make a programming language memory safe. And
it looks like quite a few of those techniques might even be at least partially applicable to
C++. So I'm really excited about this kind of research and where this goes and kind of the solutions that people will come up with in that space. Yeah. So as you say, it's a very long blog post,
very technical and in-depth, and I haven't had time to kind of study it in detail, but
just kind of scan reading it, it looks, as you say, really, really interesting. I was quite
interested to see one of the techniques they mentioned
is to replace the use of raw pointers with indexes into sequences.
Which sounds like a very sensible technique.
Yeah. Someone should do that.
Yeah. So why don't we talk about that?
That's a nice transition to the main part of our episode today,
which is to talk to Tristan about sequence-oriented
programming and his library Flux.
So Tristan, why don't you tell us
what sequence-oriented programming is all about
and what your Flux library is about
and how that kind of fits into that whole space, right?
Yeah, yeah, absolutely.
Absolutely.
So sequence-oriented programming, or collection-oriented programming, which I think is a slightly more common term, although neither one is particularly common.
But the idea is it's a programming style that emphasizes thinking about kind of high-level operations on sequences.
So, you know, you have your sequence of values and then you want to filter these values and then you want to transform them in some way, and then maybe you want to chunk them into equally sized chunks,
and then you want to do a scan on those.
Thinking about these high-level operations
rather than immediately diving into just,
I'm just going to write a for loop that does all this.
We're thinking in terms of high-level operations and sequences
or collections, if you prefer.
So some people sort of refer to this as like functional programming
or functional style programming.
Functional programming means a lot of things to a lot of people.
I've sort of moved on to using the term sequence-orientated programming
or collection-orientated programming.
And Flux is a library that's a C++20 library that helps you to do this
is designed around enabling this coding style.
So it has sort of similar goals to C++20 ranges
or D ranges or Rust iterators or Python itertools, all these sorts of things.
Python iter tools, all these sorts of things.
Flux is designed to make this easy to do in C++,
and the goal is improved safety as well.
So the idea of Flux is that it enables this sequence-oriented programming style,
but in a way that improves the safety of your code as well.
So you can do these things with iterators in C++, with C++20 ranges. But there are some problems with ranges, in that the low-level operations
are kind of unsafe.
And at this point, we have to sort of take a diversion and say,
well, what do we mean by safety?
Timur had a nice talk all about this at C++ on Sea that I'm sure will be
online in due course.
There are lots of definitions of safety, so we kind of have to decide what do we actually
mean. So there's a really nice definition that I like and that I use and Timur also
mentioned in his talk, which is that safety, or one nice definition of it is the absence of undefined behavior.
And the problem is a lot of the code that we write in C++ is unsafe in the sense that undefined behavior can occur.
We have to be very careful to avoid UB.
And in particular, the iterator design that we have,
which is kind of a generalization of array pointers, right?
Iterators have pointer-like semantics.
And this means that iterators can be very prone to undefined behavior.
So if you think about, if you have a past-the-end iterator,
you know, you initialize your iterator from standard end.
And then if you try and dereference that iterator,
if you try and increment that iterator, that is undefined behavior.
We can have iterator invalidation.
If you're holding on to an iterator and the container goes out of scope,
then you can't use that iterator anymore.
Pretty much anything you try to do with that iterator is undefined behavior.
And we have other situations in which,
really non-obvious situations in which iterators can become invalidated.
If you're holding on to an iterator into a vector and then you push back into that vector,
there's a chance that that iterator has become invalidated and it can no longer be used.
So iterators are great in the sense that they're very powerful, they're low overhead,
they're based around the raw pointers, which are very low level, very powerful thing.
But they have these problems with UB. And so the idea behind Flux is that we replace these
low-level iterator operations with a different, but only very slightly different, iteration scheme, where our low-level base operations are designed
to be safe. That is, they don't allow undefined behavior. And so if our low-level, our basis
operations are safe, then the things that we build on top of that should be safe as well.
Can I ask you a question?
In C++, if you have a traditional for loop
with an
iterator or pointer or something
like that, you have four operations.
You can initialize
the iterator to begin, you can
increment it, you can
compare whether you've reached the end, and
then you can dereference it, right? And so at pretty much any of those stages, I think apart from
the comparison with end and getting the begin one, you can get undefined behavior, right?
But then we also have the range-based for loop, which is kind of safe, and we have ranges
and views, which are, I think, kind of safe.
So how is what you're doing different
from those approaches?
Okay, so the range-based for loop, yes.
If you're just beginning at the beginning of your container
and iterating all the way through,
then you're going to be okay, right?
Because kind of the range-based for loop,
it takes away, you know, you're not dealing directly with iterators; the range-based
for loop is abstracting that for you. But at some point, you know, for most algorithms, or many algorithms
at least, you're not just beginning at the beginning and going all the way through to the end. At some
point, you know, we have to actually deal with raw iterators ourselves. We have to hop forward X number of places.
We have to perhaps be holding on to an iterator,
and then we don't know if somebody else might modify that container in the meantime.
So very straightforward code with range-based for loops can be safe.
You also mentioned views.
With views, you can have dangling views and things,
which is a slight problem with the ranges library,
something you kind of have to be quite aware of
when you're programming with ranges,
is you have to still worry about the lifetime of things.
But at some point, we have to, or in many cases, we have to delve down to the
level of manipulating iterators directly. And then we have to be very, very careful to avoid
undefined behavior. So if we can replace these potentially unsafe operations with safe operations,
then as I say, we can build on a firm foundation on top of that.
That's the idea behind Flux.
So I'm familiar a little bit with the way Rust does it,
where you kind of only have two operations, right?
You initialize an iterator, and then the only thing you can do
is then to get the next item, right?
You can't actually manually increment the iterator and dereference it or compare it to end. You just say, give me the next item, right? You can't actually manually increment the iterator and dereference it or
compare it to end. You just say, give me
the next item, and then it returns you an optional.
And so either
there is a next item, and then you get it,
or there isn't, and then you're going to get an empty optional.
And that way, you can't trigger
UB. Is it something like that, or
is it a different idea?
Okay, so
the Rust iterator scheme is exactly as you say. So if
you think about your fundamental operations on iterators, you've got dereference operation,
increment operation, and your end check, which for iterators we do by comparing to a sentinel in
C++20. So the Rust iterator scheme actually fuses these three operations together
into a single function called next.
So you initialize your iterator and then you call next.
As you say, it returns an optional until such time as that optional is empty,
which signals the end of iteration.
So I actually wrote an earlier library, it's called Flow,
that implements the Rust iterator scheme in C++.
And it's very interesting, and it has the advantage that it's a lot simpler, right?
It's simpler for producers.
It's simpler for consumers.
There's really only one fundamental function you've got to worry about.
But it has the downside that you lose some expressivity
in that there are algorithms you either can't express
with the Rust iterator scheme
because you can't do random access, for example,
and there are some algorithms you can't implement as efficiently.
And the other thing that you lose,
not just with the Rust iterator scheme,
but also some other iterator schemes like in D,
you lose the ability to refer to a particular position in a sequence to have like a coordinate for a particular position in a sequence.
So it becomes quite difficult to do something like a find if.
Like an index, you mean?
Like an index or like an iterator,
which returns you a particular position in the sequence,
or an index, or in Flux we call it a cursor,
that returns you a particular position in the sequence.
So Flux actually grew out of this earlier library that I wrote.
I thought, well, how can we... Because I was implementing this Rust-style scheme,
and, being someone who was very familiar
with iterators and ranges,
I was kind of coming up against these limitations.
I was like, well, how can we extend this scheme
so that we can do all of the things that we can do with iterators?
And it turns out that the way that you tend to implement these iterators in a
Rust-style scheme is that your Rust iterator, if you like,
will tend to have an internal cursor, and, you know,
every time you call next, you bump the internal cursor,
or you check it and you return an optional.
And my idea was that, well, if we could expose these cursors to the user
in a safe way, then we can do all of the things you can do with iterators. And then I
was playing with this for a while, and I kind of realized that, well, if you expose these cursors,
then you don't actually need this next function at all. You can do it all with cursors. So
that's kind of how Flux came about. And it turns out, I mean, I didn't know this when I started, but it turns out
this has quite a lot of similarities to the way it's done in the Swift language. So
Swift has a Collection protocol that does things in quite a similar way to
how it's done in Flux. That's really cool. So this cursor, is it basically like,
let's take vector, right?
Because vector, that's always the go-to example
when you talk about containers.
So a cursor into a vector,
is it basically like an index?
Like if I'm holding onto like an integer,
which tells me, you know,
this is the fifth element of the vector.
Is it something like that?
Is that like the right mental model?
That is precisely the right mental model. Absolutely.
So in Flux, we talk about cursors.
So a cursor you can think of as a generalization of the idea of an index into a sequence,
in the same way that an iterator is like a generalization of a pointer into a sequence.
So a cursor or an index,
you can't do anything with it by itself.
So iterators are kind of smart in a sense
in that they know how to dereference themselves.
They know how to increment themselves.
Whereas with flux cursors,
you cannot do anything with a cursor by itself.
The cursor just represents some sequence position.
And you have to go back to the sequence.
So for a vector, that would
just be a number?
Just be literally an integer index, yes.
Ah, okay, okay, yeah. Interesting.
So my understanding
of what you were saying earlier, the safety
arises because whereas with the iterator
model, which as you say is a generalization
of a pointer,
you're basically combining the idea of the location
within the container and the container itself,
because you're just pointing directly at the element.
Whereas with your model, you separate those out,
so you have to have the actual container
that you are iterating,
and its index or cursor into it,
so you don't get those lifetime issues.
Those lifetimes have to overlap.
Precisely that.
Is that, therefore, a performance trade-off?
Are we losing something by having these two things separated
and having to check them as well?
So not in my experience.
I've spent quite a lot of time looking at all the generated code
in Compiler Explorer, doing benchmarking,
and there's no performance disadvantage to doing this
that I've come across.
Yeah, it's very interesting.
And the other thing to mention,
I mean, we were talking about this blog post
about how we get all this memory safety.
And one of the things that both Rust and Val,
they do, and Swift as well,
they have this thing they call the law of exclusivity.
That's how they ensure memory safety.
So I don't know who coined this term.
The law of exclusivity that says at any point in your program,
you have to have either one mutable reference
or you can have N, what we call const references in C++,
but you can't have both of those things at the same time, right? Either you can have N readers or you can have one writer, but you
can't have both at the same time. And actually, if you think about it, because of the pointer model
that iterators use an iterator constitutes what you would call a borrow in Rust. So if we're holding on to, even if it's a const iterator,
that constitutes like a const borrow.
And then if we do anything that mutates that container,
that's going to be a mutable reference at the same time
as we're holding on to a const reference.
And that's where we get problematic things with iterator invalidation.
So actually, if we were to wave a magic wand
and put a borrow checker into C++ tomorrow,
we'd have a problem with the overwhelming majority
of iterator-based code.
However, if our cursors, it's just like an index,
just think of it like a numeric index,
that doesn't constitute a borrow.
We don't do the borrow until such time
as we want to actually read from the sequence
or ask the sequence,
please can you increment my cursor?
And so if we're holding on to a cursor
and we modify the sequence,
we don't automatically invalidate the cursor.
So you said there's no performance impact of doing this,
but I think I want to drill down into this a little bit.
So it seems obvious to me that if you operate with,
for example, on a vector,
if you operate with indices instead of iterators,
you kind of get the same expressivity.
You end up kind of doing the same operations.
But if you then actually index into the vector,
you again have this choice,
like, are you going to do a bounds check?
Are you not going to do a bounds check?
But you could theoretically say,
well, I'm not going to do a bounds check.
But it sounds like you're actually enforcing a bounds check there.
And that is a branch, right? So you're kind of relying on that branch being optimized out. But if that kind of check is not going to be optimized out, especially on, like, I don't know,
a microcontroller or a GPU, a platform where we don't have a branch predictor,
surely you would measure a performance impact due to the additional check, right?
It's the same as using vector.at instead of vector's operator square brackets, right?
It's precisely the same as using vector.at instead of the unchecked square brackets operator.
So yeah, in many cases, either your branch predictor, if you've got a branch predictor, is going to very quickly learn that everything's okay,
and so you're not going to get any overhead there.
Or in an awful lot of cases, the compiler can actually just optimize out the bounds check in the first place.
So the overwhelming majority of new programming languages,
in fact, I can't think of a programming language
that doesn't have built-in bounds checking
on whatever its native array type is.
So optimizers are really, really good at recognizing these patterns
of being able to remove bounds checks in a lot of cases.
Now, you're quite right that there is some code.
Sometimes you write code, and after you've profiled,
after you've looked at it, you can see that this bounds check
kind of remains in there, and you say, well, this is really annoying.
But actually, we provide an unchecked read function as well.
So like vector has vector.at and has the square brackets,
we kind of flip the default in Flux. So by default, you're going to get a bounds-checked read, but there's the option, if you need it, of doing an unchecked read instead, for these cases
where otherwise either the compiler can't optimize it out, or you are seeing some sort of problem with this,
you can go for the unchecked read.
This is kind of like reaching for unsafe code in Rust, let's say.
So it's not something you should do by default,
but if you really need it, it's there for you.
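The flipped default can be sketched like this (hypothetical names; Flux's real signatures may differ): the obvious spelling is bounds-checked, and the unchecked read is the one you must ask for explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sketch of the "flipped default" described above (hypothetical names,
// not Flux's actual API): the default read is bounds-checked, like
// vector::at, and the unchecked variant is the explicit escape hatch.
template <typename T>
const T& checked_read(const std::vector<T>& seq, std::size_t idx) {
    if (idx >= seq.size()) throw std::out_of_range("checked_read");
    return seq[idx];
}

template <typename T>
const T& unchecked_read(const std::vector<T>& seq, std::size_t idx) {
    // Caller promises idx is in range, as with operator[]; reach for
    // this only after profiling shows the check really costs something.
    assert(idx < seq.size());  // debug-build check only
    return seq[idx];
}
```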
Yeah, I really like that approach.
I mean, this is what I keep saying when I talk about safety: that we really need to flip the
default. The default should always be the safe thing that cannot possibly have undefined behavior.
And C++ is not great at that at the moment. We are slowly improving, but
it's a long way there. But because C++ is a language all about performance, and we have lots of people
in industries who need to squeeze every nanosecond out of their code, like, I don't know, low-latency
trading, for example, we had an episode about that at some point, you kind of need this
unchecked escape hatch for lots of these things. So good to know that that's there, if you really measure and find out that you need it.
Always measure first. Yeah. Always measure first. Yes. Yes.
The other trade-off that might be involved here though is we're moving to a
fundamentally different iteration model. Does that mean we have to throw out all of our iterator and
range-based algorithms and write everything again from scratch? Fortunately not, no.
So one of the goals of Flux is to provide good integration
with existing ranges code.
So there are two sides to that.
The first of all is if I've got, you know,
maybe I've got custom containers that are going to provide iterators,
provide C++20 iterators.
How can I use those containers with Flux?
So first of all, any C++20 contiguous range is automatically usable with Flux
just directly, just without doing anything.
So if you've got a contiguous container,
it means we can get the raw data pointer
and we can do bounds-checked indexing
by offsetting from the data pointer.
And then we also have a wrapper in Flux
as a function called from range.
So you can take any range and wrap it
in the Flux sequence API.
We lose some safety guarantees there because, you know,
in that case we are building on top of the range's operations,
the iterator operations, which might be unsafe.
So you can do that.
It's not ideal because we lose some of the safety guarantees, but it provides ranges compatibility.
So that's going in one direction.
That's taking existing ranges, and then we can use them with the flux algorithms.
We can use them with the flux sequence adapters.
Going in the other direction, if I've got a flux sequence, and let's say I've built
up a chain of adapters, and now I've got my flux sequence but there's some custom
algorithm based on iterators, how can I use that with my Flux sequence? Well, it turns out
every Flux sequence is automatically a C++20 range as well. So you can call
.begin() and .end() on a Flux sequence, and it will give you C++20-compatible iterators.
These are kind of – they take a pointer to the sequence
that you're operating on, so they're not safe,
but they're no less safe than any other iterator implementation.
So you can take your Flux code and call the C++20 algorithms, the C++ ranges
algorithms, on it.
Nice. Sounds like there's almost no downside. We might have to dig into that a bit
more. But before we do that, this is a good point to pause and thank our sponsor for this episode.
This episode is sponsored by Sonar, the home of clean code.
And we talk about safety and security in our code.
Well, SonarLint is a free plugin for your IDE
that helps you find and fix bugs and security issues
from the moment you start writing code.
You can also add SonarQube or SonarCloud
to extend your CI/CD pipeline
and enable your whole team to deliver clean code
consistently and efficiently on every check-in or pull request.
SonarCloud is completely free for open source projects
and integrates with all of the main cloud DevOps platforms.
All right.
So Tristan, I just had a very quick question.
So does that mean that, for example,
because you have these cursors, you basically get random access into your container, right?
That's kind of the problem.
If you're trying to rewrite something on a contiguous sequence, like an algorithm that operates on a contiguous sequence in Rust, you just don't have random access at all, right?
You have this weird other thing called slices, which kind of works differently and you kind of have to rewrite everything.
So with your scheme, you do get random access.
It's just by default checked, basically.
Did I understand that correctly?
That's absolutely correct.
So Rust iterators don't give you random access.
So the equivalent operations, things like sort in Rust operate on what they call slices.
It's equivalent to a span in C++20.
But in Flux, we have a random access sequence concept.
So you can do something like, for example, you can zip together two vectors.
And then, with an appropriate comparator, you could perform a sort on the zipped sequence of two vectors,
which is sort of the ultimate test of how these things work.
And it works well in Flux.
And you can do this with C++20 ranges as well,
but you can't do it with most other iteration schemes.
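Flux's zipped sort is its own API; as a plain-C++ illustration of the effect (two parallel vectors reordered in lockstep, which is what sorting a zipped sequence achieves), one can sort an index permutation and apply it. This is not Flux code, just a sketch of the idea.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Plain-C++ illustration of what sorting a zipped sequence achieves:
// two parallel vectors reordered in lockstep by the key vector.
void sort_zipped(std::vector<int>& keys, std::vector<char>& vals) {
    std::vector<std::size_t> idx(keys.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    // Sort indices by the keys; stable so ties keep their order.
    std::stable_sort(idx.begin(), idx.end(),
                     [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
    std::vector<int> k2;
    std::vector<char> v2;
    k2.reserve(keys.size());
    v2.reserve(vals.size());
    for (std::size_t i : idx) {
        k2.push_back(keys[i]);
        v2.push_back(vals[i]);
    }
    keys = std::move(k2);
    vals = std::move(v2);
}
```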
Right. So that kind of leads to my next question. So you said you can zip vectors, and you can sort the range
that you get, so you can chain things, right? So this is something that ranges does
as well. This is kind of the big thing that ranges adds on top of, or in addition to, or instead of, what the STL does, right?
Where like you can chain these views and these operations on them.
And Ranges does it with this pipe operator.
And I looked into the documentation of your library and you do it differently.
You have these kind of member functions where you have something dot do this dot do this and you kind
of chain them which is kind of a very different design which i suppose there's a reason why rain
just doesn't do that so is there like an advantage to doing it this way is there a reason why you
designed it that way i'm just curious okay so uh yeah so flux provides a whole ton of sequence
adapters so these are things like whole ton of sequence adapters.
So these are things exactly like the range adapters that we have in C++20,
things people call views, although technically correct, range adapters.
We also have sequence adapters in Flux, lots and lots of them.
And yeah, absolutely right.
So the way we do it in Flux is that you start off, you have
to choose how you want to iterate over your sequence. Do you want to take a copy of your
sequence and then iterate over that? Do you want to move your sequence? Or do you want to
iterate over a reference to your sequence? So your first line of any pipeline is you choose
how you're iterating over it. And after that, you can chain together these operations using the dot syntax. So you would say, like, flux::ref of my vector, let's say, to iterate by
reference, and then you might say .filter with your predicate, and then .map to do some sort
of transformation, and then you might call .sum to do the equivalent of accumulate.
So why member function chaining rather than pipes?
So to be honest, there's no technical reason for it
in the sense that there is anything that you can do with members
that you can't do with pipes.
I could have used pipes in Flux.
The reason mostly is because I think it makes the code much more readable.
I think the pipe, std::views::filter
or transform, whatever it is, is a lot of visual noise,
and it's kind of hard to just read what the code is doing.
And the other thing is that actually the dot syntax, you get nice auto
completion in your IDE as well. So you just press the dot key and you get a big list of
all of the operations that you can do on your sequence. And you actually get nice doc comments
as well. So there's no technical reason for it. Mostly, I just kind of like the visual
style better. And I know there have been proposals for a pipeline operator and things like that. And
I'll be looking closely at those things. I'm not kind of wedded to the dot syntax. I'm going to be
looking at the proposals to see whether they're appropriate for Flux. But at the moment, it uses the dot syntax, I think, just because it reads better. And as I say, you get the auto-completion
as well, which is a nice bonus.
Yeah, I think that the trade-off is it's a bit more ergonomic,
especially if you've got an IDE and you get the auto-completion, as you said. But on the other hand,
it's harder to extend. You have to actually change the underlying classes, usually,
because we don't have extension methods yet.
Well, okay, so the way around that is let's say you've got some custom adapter
that you can write in Flux or you've got some custom algorithm.
We have a function.
Actually, there are two spellings of it.
You can either spell it .apply or even just ._ (dot-underscore).
Okay.
And so if you do this, let's say ._ with brackets,
and then you put the function name or the adapter name
followed by the optional arguments,
and the Flux machinery will then serve this as part of your pipeline.
So that bit is covered as well.
We can use custom adapters and custom algorithms as well.
Nice.
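The member-function chaining and the apply escape hatch described above can be sketched in plain C++ like this (a simplified, eager design for illustration only; real Flux pipelines are lazy, and these are not Flux's actual signatures):

```cpp
#include <numeric>
#include <utility>
#include <vector>

// Simplified, eager sketch of member-function chaining in the style
// described above. Real Flux pipelines are lazy; this is illustration.
template <typename T>
class seq {
    std::vector<T> data_;
public:
    explicit seq(std::vector<T> v) : data_(std::move(v)) {}

    template <typename Pred>
    seq filter(Pred p) const {
        std::vector<T> out;
        for (const T& x : data_)
            if (p(x)) out.push_back(x);
        return seq(std::move(out));
    }

    template <typename Fn>
    seq map(Fn f) const {
        std::vector<T> out;
        for (const T& x : data_) out.push_back(f(x));
        return seq(std::move(out));
    }

    T sum() const { return std::accumulate(data_.begin(), data_.end(), T{}); }

    // The "apply" escape hatch: thread a custom adapter or algorithm
    // into the middle of a pipeline without modifying the class.
    template <typename Fn, typename... Args>
    auto apply(Fn fn, Args&&... args) const {
        return fn(*this, std::forward<Args>(args)...);
    }
};
```

A pipeline then reads top to bottom, and an IDE can complete each step after the dot: `seq<int>(v).filter(is_even).map(times_ten).sum()`.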
Great. So we mentioned indexes and random access, or contiguous
containers. That makes sense, because if you've got an index, you need to be able to
index into something. So what about containers that don't have that: node-based containers like maps and sets and things?
Yeah, so node-based containers are tricky.
So if we take the simplest example of node-based containers,
which would be a linked list,
you want to store some sort of iteration state.
It doesn't matter kind of what your iteration scheme is.
If you're doing external iteration, you want to hold some state.
And then the natural kind of state
that you want to hold for a linked list
would be like a pointer to the node.
So what we need to ensure
if we want to do safe iteration
is we need to make sure
that nobody has messed with the node
while we're holding onto it, right?
And then for a linked list,
we've got our node
then has the next pointer
and we want to make sure
that no one's messed with our next pointer.
How do we do this safely
and with minimal overhead
is a very tricky problem.
So there are sort of three approaches
that I know of.
Perhaps this blog post
we were talking about earlier,
perhaps that has some ideas as well.
But the three
approaches that I know of. Firstly, you can use shared pointers or weak pointers. You know,
let's say that your iterator or cursor holds a weak pointer to the node, and then you can try
to lock that weak pointer, and if the node no longer exists, then you can error out at that point. So you can use shared pointers to do this. Of course, shared pointers come with some
overhead. Whether that's problematic kind of depends on your use case.
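That first approach might look roughly like this (a hypothetical sketch, not how Flux or any production list actually implements it):

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical sketch of the weak-pointer approach described above.
// The cursor holds a weak_ptr to the list node; if the node has been
// removed, the lock fails and we error out instead of touching freed
// memory.
struct node {
    int value;
    std::shared_ptr<node> next;
};

struct weak_cursor {
    std::weak_ptr<node> target;

    int read() const {
        if (auto n = target.lock()) return n->value;
        throw std::runtime_error("node was removed while the cursor was held");
    }
};
```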
The second approach is to change how you allocate your nodes. Because the trouble is,
in general, we can't just take a pointer to some arbitrary memory and say, is there an
object here?
We can't even ask whether an object exists.
So one thing you can do, rather than just keep allocating your nodes with operator new or equivalent,
is to pre-allocate some sort of arena from which you take your nodes.
And then if you're in control of this arena, then you can ask,
is there an object at this address, or at this offset into the arena?
Because you're in control of that whole thing.
Again, there are trade-offs with this.
You almost sort of end up with a slightly different design,
but it's, again, something that can be done.
And the third approach was mentioned in this blog post we were talking about earlier.
The third approach is quite interesting
because what you can do is you can attach a generation marker
to your cursors and internally to your, let's say, linked list.
And then every time you modify the list by adding nodes,
adding elements, removing elements, whatever you do,
it increments the generation count.
And then when you perform an iteration operation,
because in Flux we always go through the sequence,
we can always compare the sequence's internal generation
to the generation ID that's stored in the cursor.
And if they don't match, we can say,
hey, no, you've messed with the list
after generating this cursor,
and so we're not going to allow you to do this.
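A minimal sketch of the generation-counter idea (hypothetical, not Flux's implementation, and shown on a vector-backed container for brevity; the same idea applies to a linked list):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Sketch of the generation-counter approach described above. Every
// mutation bumps the container's generation; a cursor remembers the
// generation it was created under, and a stale cursor is rejected.
template <typename T>
class gen_checked {
    std::vector<T> data_;
    std::uint64_t gen_ = 0;
public:
    struct cursor {
        std::size_t index;
        std::uint64_t gen;
    };

    void push_back(T value) {
        data_.push_back(std::move(value));
        ++gen_;  // any structural change invalidates outstanding cursors
    }

    cursor first() const { return {0, gen_}; }

    const T& read_at(cursor c) const {
        if (c.gen != gen_)
            throw std::logic_error("stale cursor: container modified since it was made");
        if (c.index >= data_.size())
            throw std::out_of_range("cursor out of bounds");
        return data_[c.index];
    }
};
```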
What's really fascinating is you end up
with quite a different style of API,
but you can actually theoretically do this
at compile time.
You can generate a new type every time
you modify your list, store it in a new type, and then you can
actually do this at compile time. You can reject these things. That's a bit more niche, but it's
something that's theoretically possible with the model.
That is so cool. I mean, I've just been
listening to this, and that actually makes me think that this sounds very, very similar to
how you solve another problem which is lock-free data structures,
where you don't necessarily want to solve the problem
of they have a dangling pointer,
but you want to solve the problem of,
you know, two threads are iterating
through the same linked list at the same time.
And one might remove a node
and the other might be reading it.
And how do you like get rid of the race conditions there
without putting a big mutex around the whole thing, right? And you end up with basically three approaches, and
I've done talks about this stuff: it's reference counting, RCU, or hazard pointers. And
the way they work is, I mean, it's not quite the same, and definitely not like the compile-time
stuff that you mentioned at the end, but they're kind of similar to what you were
describing.
Like, it's either reference counting or some kind of garbage collector thing
where you kind of know
what the lifetime of every object is
or like similar stuff like that.
So it feels like it's a similar kind of problem,
like getting rid of data races there
or getting rid of just dangling pointers
and kind of memory safety.
That's really, really cool.
I need to think about that a bit more.
Yeah, it's sort of fundamentally the
same kind of problem. Well, it all comes down to memory safety, right?
Different approaches to that.
Yeah, it's like two sides of the same coin, right? Either you have
one thread, but then you do things in a weird order and you might mess things up,
or you just have stuff happening at the same time, and then how do you get rid of race
conditions? But yes, as you say, it's like two sides of the same coin, almost. It's
really cool. I've never thought about it that way. Thank you very much.
Yeah, multi-threading
just takes all of these problems we have in the single-threaded case and just makes them 10 times more difficult.
But it feels like if you solve the memory safety problem,
if you solve it comprehensively,
like, for example, Rust does with the borrow checker,
you kind of solve the whole concurrency problem as well.
Because then it just becomes a question of the exclusivity principle,
not just between different scopes or whatever, but also between different threads.
But it's kind of the same thing.
Yeah, absolutely. Because if you can guarantee that all
of your threads are only reading, then of course you're not going to have a problem.
And so it comes back to this
law of exclusivity thing we were talking about. But yeah, how do you ensure that there's
only one writer at a time? If you don't have that, then the problem you have to solve is,
how do I make sure that the node is not going to get deleted from under me? And then we
get into all of these approaches.
Yeah, exactly. And talking about exclusivity, it's a C++20 library, isn't it?
Is it only C++20 it can work with, or did you just choose to start with that?
So it's a C++20 library, primarily because it makes heavy use of concepts. It reuses quite a lot of the concepts from what was the Ranges TS and became
C++20. So yeah, it's a C++20 library.
The idea, basically, is that at the moment we're still at a reasonably early stage, you
know. I'm just starting to talk about this and make people aware of it. I've been beavering away on this quietly in my spare time for
the last year or so, and now I feel it's at a stage where I want to try and get people
interested in it. Maybe I can get some early adopters, and hopefully
get some contributors. So this is something that I want people to be able to use two,
three years from now when, you know,
the early adopters would have been able to hammer out all of the bugs and
things like that.
So although there are a lot of places who are saying we're not using C++
20 yet, these things change very rapidly.
And so, you know, by the time Flux is ready for production use,
hopefully a lot more people will be on C++20
and we can look forward.
We can use all these nice concepts and things
that are now in the language.
And if you do want to use it now,
it's actually on Compiler Explorer, isn't it?
It's not.
So it's not in the libraries dropdown of Compiler Explorer yet.
It was funny, I was actually
talking to Matt about this at
C++ on Sea. And so
he told me how to go about that. I need to,
apparently I need to submit a PR. But what you
can do is, because we have a generated
single header, and Compiler
Explorer has this amazing feature where if you
type hash include and then put a URL
in your quotes, it will go and
download from that URL.
And so you can use it today in Compiler Explorer, and I've got a whole bunch of examples
on the GitHub of doing that. But it's not yet in the libraries dropdown.
Right. It will be soon. Might even be by the time this episode airs.
So are you going to have a Conan package and
all of the other things you need these days to ship a library?
I would very much like to have it on Conan and vcpkg
and all of those.
I have to confess it's not something I know how to do.
But if there's anybody out there
who would like to see Flux on these things,
then please do.
I'm very open to having submissions
to put these on there.
So just in general,
what compilers do you support?
Do you support all the major compilers
and what's the license like?
How mature is this whole thing?
Can people just throw it into their repository
and start using it?
Okay, so it's the Boost license, so you're going to be able to use it, I imagine, everywhere. I doubt your corporate lawyers are
going to have a problem with it. So we test with GCC 11.3 and onwards, MSVC 2022, and Clang 16.
That's because only the most recent version of Clang
has the concepts support that we need.
So unfortunately, no Apple Clang yet,
but that will be updated in due course, I'm sure.
So we test with those three compilers.
I haven't tried it with the Intel
compiler. Somebody was talking to me at the conference; they were trying to get this
working with the NVIDIA C++ compiler. It hit some sort of internal compiler error, but
hopefully that will get sorted out in due course because they were going to submit a bug report
about that. So in terms of maturity, as I said, we're still at quite an early stage.
You know, at the moment it's 12,000 lines of code, primarily written by me. One of the
contributors has done some fantastic examples and things, but primarily it's, you
know, 12,000 lines of code written by one person. It's at a stage, I would say, where I'm looking for early
adopters, people who want to try it out. If you've got personal projects or whatever, and you
want to see how this works, get a head start. If you want to contribute,
that's even better, because hopefully, within a reasonable time frame, we can get this
to a stage where people are happy to just drop it into their production code. But I'd be a bit
nervous about people doing that right now. But hopefully in due course. So with those caveats
in mind, then. Well, obviously, we will drop the link to the repo in the show notes. So if any listeners do want to give it a try,
ideally for non-production use,
just to kick the tires, try it out,
feed back any issues and other things.
Or maybe contribute.
Are you open to other contributors as well?
Oh, absolutely.
More the merrier.
So anyone that wants to be a part
of the wave of sequence-oriented
programming, we will
show you where you can hook up.
So that's really
fascinating. So before we
wrap up, are you involved
in any other interesting C++
projects?
No.
Not really, to be honest. This is the main one that I'm focusing my spare time on.
So you're fully focused on Flux now?
Fully focused on Flux, yeah.
Got your full attention. So that's good to know.
Flux is where my focus lives, yeah.
But you're involved in C++ standardization, on the BSI panel especially.
Yeah, and the BSI. And I also attend the meetings of the ranges study group as well. So I'm
still kind of involved in that, because I just really like this style of code.
I think Flux is great,
but if you don't want to use Flux,
if you want to use standard ranges,
that's still great.
I don't want to sit here and say ranges are terrible
because I think ranges are great as well.
I think Flux is even better,
but I still want to make ranges as good as they can be.
So I'm still involved in that in the standardization side.
That kind of begs the next question.
Would you like to see sequence-oriented programming
in the standard at some point?
Because I know you've actually done quite a lot of talks
on ranges that were really good,
but now you have this library.
Would you like that to be in the standard as well at some point?
I mean, ranges do enable the sequence-oriented
programming style, right? If you're using these range adapters and these high-level
adapters, high-level algorithms, that is, by my definition, sequence-oriented
programming. You can do it with ranges. Would I like to see something like Flux in the standard?
Right now, I would say there's so much overlap with ranges that you wouldn't want to have both of them in the standard library. But if, one day, there were a new standard library based on some sort of safer subset of the language or something like that, then I think Flux,
or something similar to Flux, would be a very good place to start with that. So
maybe not today, but possibly one day in the future.
We do prefer to standardise existing practice, so we'd need some of that practice first.
Well, quite exactly that, yeah.
Yeah.
So is there anything else currently happening
in the world of C++
that you do find particularly interesting or relevant?
I mean, as you might have gathered,
I'm really interested in all this talk about memory safety
and how we can evolve the language in a safe way.
And the success successor languages, in particular, I'm really interested in Val.
I think that's a really fascinating approach, what they call their mutable value semantics.
So I'm looking at that closely.
Carbon as well.
I mean, I haven't looked closely at Carbon since the announcement last year,
but I'll definitely be very interested to see where these things go in the future.
And I think in the future is the appropriate theme there.
So Tristan, do you want to tell our listeners how they can reach you
if they have more questions about this, or if they want to chat to you about Flux and sequence-oriented programming?
Or is there anything else you want to tell us before we wrap up?
Okay, so I'm on Twitter, so at Tristan Brindle on Twitter, if Twitter is still around by the time this episode airs.
Yeah, it's kind of falling apart now.
Yeah, exactly. By the time this episode airs... My GitHub username is tcbrindle.
So feel free to reach out to me.
By either of those methods, you'll be able
to find my email address, if you prefer to get in contact
by email. So please do, if you've got any questions about the library, or any questions about
this sort of thing in general. I'm very happy to hear from people.
thank you and all those details will be in the show notes.
So Tristan, thank you very much for joining us today and telling us all about Flux and sequence-oriented programming.
It's been my pleasure. Thank you.
And thank you, everybody, for listening.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in.
Or if you have a suggestion for a guest or topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com. We'd also appreciate it if you can
follow CppCast on Twitter or Mastodon. You can also follow me and Phil individually on Twitter
or Mastodon. All those links, as well as the show notes, can be found on the podcast website at cppcast.com.
The theme music for this episode was provided by podcastthemes.com.