CppCast - CMake and VTK

Starting point is 00:00:00 Episode 205 of CppCast with guest Robert Maynard, recorded July 3rd, 2019. Sponsor of this episode of CppCast is the PVS Studio team. The team promotes regular usage of static code analysis and the PVS Studio Static Analysis Tool. In this episode, we discuss some possible changes to coroutine syntax. Then we talk to Robert Maynard from Kitware. Robert talks to us about his contributions to CMake and VTK. Welcome to episode 205 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? I'm doing all right, Rob. How are you doing?

Starting point is 00:01:19 I'm doing fine. It's the 4th of July tomorrow, so happy 4th to all our American listeners. Yes, and for those of you who can't stand fireworks, I hope you make it through the weekend. Yeah. Does your dog not like the fireworks, Jason? No, she does not like the fireworks. So I will probably be sleeping in the basement or something like that with her on tomorrow night. So she doesn't keep the whole house awake. Part of the problem is, I mean, and I know I've mentioned this on the show before, fireworks are highly illegal in our city.

Starting point is 00:01:59 That doesn't stop anyone from setting them off for the whole week before the 4th, on the 4th, and the whole week after the 4th. Yep. But, and an interesting and exciting side note, she's also terrified of thunderstorms for the same reason. And there was a loud thunderstorm when we were not home the other day. And when we got back in, yes. Uh, and when we got back, she had tried to break into one, two, three, four doors and had torn, um, a considerable amount of drywall and trim from, uh, the, another door. Um, it was an interesting mess to clean up. Yes. Yeah. Well, I hope she, uh, doesn't cause a big mess, uh, over this week with all the

Starting point is 00:02:42 4th of July fireworks. Yeah. If we're home, she's just whatever. Okay. She, she's annoyed, but yeah. Yeah. Okay. Well,

Starting point is 00:02:50 at the top of every episode, I'd like to read a piece of feedback. I got a pretty long email. I'm not going to read the whole thing, but I'm going to summarize some of it.

Starting point is 00:02:57 This, I think, came from Anon, or that's what his email address said, at least. He said, hi Rob, hi Jason,

Starting point is 00:03:03 thank you for the amazing podcast. It's been something like an entry point to the C++ community for me. And then he goes on to talk about how he's enjoyed joining the C++ community. But then he says he heard mention of CppCheck a couple times on the show, especially from you, Jason,

Starting point is 00:03:20 with your heavy focus on advanced development tooling. And while going through the site, he stumbled upon the funding campaign started by the main developer. And we have had him on the show. I think it was back in episode 79. I looked it up right before we came on. Daniel Marjamäki.

Starting point is 00:03:39 It was a long time ago. It was a long time ago. But yeah, he started an Indiegogo page where he's trying to raise some funds to support Cppcheck development. And I will put the link to that in the show notes. It looks like it has like two or three weeks left on the funding campaign. So I'd encourage our listeners to go check that out, especially if you use Cppcheck, because it looks like they could use a little support. It seems like an interesting concept. This is an Indiegogo campaign for a feature? Yeah, it's specifically to detect more uninitialized variable usage. So I'm not sure if he's done other

Starting point is 00:04:20 Indiegogo campaigns to support other feature work or why he decided to do it this way. I mean, maybe it's just a big feature and he feels like he really needs to kind of spend, you know, more than just his after-work personal time working on it. Because I think this is just a side project of his. Yeah. Yeah. And I mean, well, and it's the goal. I mean, so our listeners, I'm sure, are curious at this moment. The goal is $537. Right. It is not like he's asking for $10 million to implement this feature or something. Not asking for a lot. And currently he's only at 19% of that goal.

Starting point is 00:04:56 So, like I said, he's got two and a half weeks left to meet that goal. And he definitely needs more backers, more support if he's going to meet that. Oh. Yeah. Okay. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter,

Starting point is 00:05:15 or email us at feedback@cppcast.com, and don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Robert Maynard. Robert is a principal engineer at Kitware and spends most of his time as a primary developer of VTK-m. VTK-m is an HPC toolkit of scientific visualization algorithms for highly concurrent processor and accelerator architectures. It uses a fine-grained concurrency model for data analysis and visualization algorithms allowing for seamless

Starting point is 00:05:45 execution on GPUs or many-core CPUs. When not working on VTK-m, Robert is either writing CMake code, teaching CMake, or working to improve CMake. Robert, welcome to the show. Hi, thanks for having me. Robert, we'll talk obviously about what you're working on a lot more in a few minutes here, but I think of you as like the CMake guy. Like you are our window to the CMake world, but it seems like you would do a lot more HPC kind of related things than CMake. Yes. CMake time is a fraction of my time in theory. And not by like nine tenths, you mean? No, not by nine tenths. I think I started off at about 10% of my time on CMake. That's slowly increased, and it's really dependent on my projects

Starting point is 00:06:29 and how much work they need done in CMake to get to their end goals. So a lot of my work in CMake is driven by the projects I work on. Okay. And regarding your HPC-related work, what's your interest there? What's your focus? What kind of...? So we do scientific visualization. This is after the simulation is done running, or in situ with the simulation, so

Starting point is 00:06:51 alongside the simulation, and you want to see the results. You want to see, like, a pressure field along a wing or, you know, the stress on a part, and you want to look at that and then do some statistical analysis or visual inspection. So you don't... I'm sorry, go ahead. So I was going to say, we do all the visualization aspects of it: looking at the pressure field, doing particle advection, streamlines, so allowing a scientist to have a better understanding of their data. So like 3D analytics, basically. So is there any particular type of visualization that you like more? It sounds like you're kind of agnostic to these things.

Starting point is 00:07:30 We're very agnostic. We want to give scientists the tools that they need to see whatever they want to see in their data sets. So if they're interested in like particle advection, we need to write particle advection algorithms that can run on any HPC environment. So be it CPUs or GPUs or some crazy other thing. Do you have a personal favorite though? I'm not going to say any favorites. I like cool hardware. So give me something like with tons of cores or like a high concurrency,

Starting point is 00:07:59 some weird things. That's what I like. You don't care what it's doing. You just want to see the magic happen. Right. All right. Okay. Well, Robert, we've got a couple of news articles to discuss. Feel free to comment on these, then we'll start talking more about CMake and VTK and other stuff. Okay? Yep. Okay, so this first one is an article from Anthony Williams, and it's The Power of Hidden Friends in C++, and it's posted on the ACCU blog.

Starting point is 00:08:25 And I don't think all are familiar with this. You know, when I think of friends, I just think of declaring a friend class as, you know, something like a factory that can go in and create that class for you, but no one else can touch it. And he, you know, quickly goes over that common use of friendship,

Starting point is 00:08:41 but then he goes over hidden friendship, which is something I was not familiar with, Jason. It is a syntax I'm familiar with. And what Rob is referring to here is like declaring a free function operator equal as a friend inside the actual class that you are working on. And I guess the point is for more ADL friendliness, argument dependent lookup. I was interested that they say that there's also an impact on the compiler performance, that it would actually potentially give you faster compile times

Starting point is 00:09:11 because there's less lookups to be done. And that was really interesting, given if you have a lot of operator overloads, how much of an influence that'll actually have. So after reading this, I'm like, I want to go try out some of these things, see if I can get some performance improvements. Some compile time performance. Yes, yes.
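The hidden-friend idiom discussed here can be sketched in a few lines. This is a minimal illustration, not code from the article: the `geometry` namespace, the `Point` class, and the `same_point` helper are hypothetical names chosen for the example. The operators are free functions defined inside the class body, so they can be found only through argument-dependent lookup.

```cpp
namespace geometry {

// Hypothetical example type; the name and members are illustrative only.
class Point {
public:
    constexpr Point(int x, int y) : x_(x), y_(y) {}

    // Hidden friend: a free function defined inside the class body.
    // Ordinary lookup does not see it; it is found only through
    // argument-dependent lookup when a Point is one of the arguments.
    friend constexpr bool operator==(const Point& lhs, const Point& rhs) {
        return lhs.x_ == rhs.x_ && lhs.y_ == rhs.y_;
    }

    friend constexpr bool operator!=(const Point& lhs, const Point& rhs) {
        return !(lhs == rhs);
    }

private:
    int x_;
    int y_;
};

}  // namespace geometry

// Called from outside the namespace with no using-declaration:
// ADL finds the hidden operator== because the arguments are geometry::Point.
constexpr bool same_point(const geometry::Point& a, const geometry::Point& b) {
    return a == b;
}
```

Because the operators are not visible to ordinary lookup, they are considered as candidates only when a `Point` is involved, which is the reduced overload set that the compile-time remark above refers to. Whether that gives a measurable speedup in a given code base is something to measure rather than assume.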

Starting point is 00:09:30 Is there anything else you should highlight with this article? I like somebody explaining ADL lookups in a way that's understandable. That was wonderful. Because I always have to go look up the spec every time there's an ADL issue. Right. Yeah, I fortunately have very rarely actually had ADL cause a problem for me. I don't tend to write code that takes advantage of it so much. Okay, this next article is a post from Arthur O'Dwyer. And it's in support of P1485, which is a proposal for better keywords for coroutines.

Starting point is 00:10:12 I saw this proposal, I think, a few weeks ago, and I meant to bring it up on the show, so I'm glad that Arthur wrote this blog post. But the proposal is to basically not use the co_yield and co_return keywords that are currently in the coroutine proposal that's been accepted for C++20 and to instead just use, you know, yield and return. But in order to avoid the issues that kind of forced us into having co_yield and co_return, add a new identifier of async, which you will have to have in your function definition to mark that function as being asynchronous and using coroutines. Makes sense to me. The part of this that immediately I'm like, oh yeah, is because with co_yield, co_await and those things, the compiler doesn't know that it's a coroutine until it's parsing the function and then finds one of those keywords,

Starting point is 00:11:06 which the part of the argument here is this is completely undiscoverable to the human reading this code, right? Like how far do you have to read before you know that it's a coroutine? So for that part, having an extra keyword at the end of the function and say, this is a coroutine, totally on board with that. Yeah. I like it. I also agree on the forward declaration that it should be there too, if we go this way. Having the forward declaration without the async keyword doesn't help the readability when you're the consumer of it. Well, this could be fixed with some clang-tidy checks. I would like it more in the language.

Starting point is 00:11:40 Yeah. And then there's the argument made further down that if you're intentionally taking a parameter by copy because you want to avoid lifetime issues, and you don't see the async at the end of the function declaration, you'd be like, well, why is this not being taken by const reference here? Because they're clearly just reading it, right? So I'm strongly in favor of that. But I do think that this is completely unprecedented to say adding the async keyword at the end of the function declaration means that yield is now a keyword inside the function. That part makes me a little nervous, although I'm not a compiler writer, so I'm not the one who would have to deal with that. Yeah, I just want it to be easier for me, the user. Right. So I'm in the same boat. And what about the last bit on what this means for C++20? Yeah, go ahead, do give your opinion, Robert. Well, so we had Arthur O'Dwyer on, and he was already kind of saying that he was a big proponent of not going with these three-year releases,

Starting point is 00:12:45 that he thinks we should just spend more time. So I think he's just using this as a reason to go back to his opinion that we should have more time for each release. I think three years is fine. We just have to be comfortable dropping stuff. If we merge something in like this and then we realize there's a mistake, we just have to be comfortable dropping it before the release. So if we're saying that coroutines aren't ready, we just have to accept in three years we'll get them and

Starting point is 00:13:12 wait. CMake had this experience with 3.0. We kept on pushing off the 3.0 release and that just kept on getting pushed off, just like C++ 11. And now we're a big advocate of you set your release date, you hit whatever that release date is, and if you have to drop or revert features, you do that. You know, the release waits for no one. So I guess the question is, is there any chance of this paper being considered for C++

Starting point is 00:13:38 20? I mean, coroutines have been accepted, but this paper is new. You're allowed to modify features that have been accepted. Okay, so there's a chance. There's a non-zero chance, yes. Okay. Although, Arthur's, well, if he's listening to this episode, his last paragraph, and for the sake of the rest of the listeners, is, well, let's see what he says here. Coroutines is undergoing major revisions.

Starting point is 00:14:06 Concepts are undergoing major revisions. Spaceship Operator has major revisions. Modules in Concepts Per New are unimplemented, so no one knows what they look like, and there's a lot of questions remaining with contracts. He basically pokes a hole in, like, two sentences. Well, no, that's one really long sentence. That's a lot of semicolons.

Starting point is 00:14:27 He pokes a hole in every major feature of C++ 20. I don't know if I, I mean, I can say, I don't know if I agree because I don't know the details of all of those major features, of course. Right. But yeah, so if we have, if the committee feels like there's things here that we need to address,

Starting point is 00:14:46 I'm more comfortable saying, let's push it to 23. And there's other things in 20. Or, you know, we only get concepts and modules in 20. I'm happy with that. Right. That would still be a massive release. It still would be, definitely.

Starting point is 00:15:00 I would love to see contracts in, but if there's legit unanswered questions there, then yeah, you know, make sure they're ready. Otherwise, yeah, concepts and all the constexpr, not even the new constexpr features, but all the constexpr library updates, we need those. And those are easy. And it would be good to, if we push it off, what does that mean?

Starting point is 00:15:22 Does that mean that executors and networking can't come in or can we wait and put those in too? So that's why I'm a proponent of like a date because that, that waits for no one. That's, that is your, that's, that's the firm fixed point you have in the sand. So, but if we push it off, then we get into C++11 again, right? Where, what do we wait for? What do we keep this open for? Right. Okay. Well, the next committee meeting's

Starting point is 00:15:47 soon, so I guess we'll see how they feel about it very soon. It's quite soon, right? When is that? Next week? I think it's next week, yeah. It's either next week or the week after. I think it's in two weeks. It's in Cologne. That's what I know. Okay.

Starting point is 00:16:04 And then the last thing we have is from the CppCon website, and that's that poster submissions are open for CppCon 2019. And yeah, lots of details here, if you are not familiar with poster submissions, on what you would need to do in order to submit a poster. The deadline is pretty soon. It's July 29th, so you basically have this month to decide if you want to submit a poster. That's like a super simple way to get involved in the conference if you're nervous about speaking, whatever. You just have to go and man your booth, I guess, whatever, stand there and wait for people to ask you questions for the most part and

Starting point is 00:16:46 give a short presentation when the judging committee is coming around. And I had the opportunity to do this, to judge the posters. Yeah, the very first year. And it's growing each year. And it's fun. And it gives people stuff to talk about during the, like, check-in and whatever. And I'm going to put this other article in the show notes. I was actually listening to NPR a couple days ago and heard them talk about poster sessions, and, you know, this is not just a C++ conference thing. The general scientific research community is big on poster sessions. And this one person, some type of researcher, proposed a new way of doing posters where you really just highlight what your finding is, have a QR code

Starting point is 00:17:34 for people who want to get more information, and then just like a quick summary. So a much more simplified poster. And I just thought it was worth putting in here in case people who are working on poster submissions want to consider that. Some of those scientific ones, you can get some really small font as they try to explain everything, and you're not going to be able to read all that, or you're going to be standing in front of the poster for half an hour. That is a good point. And just as someone who's, you know, casually looking at the posters, if something just has like four sentences on it, you're totally going to read those four sentences. If it has like 10 paragraphs, you might be like, where's the information? Don't forget a cool image and probably like some bar chart. Then you're set. You want a cool image

Starting point is 00:18:17 so people come by and look at it. So posters need to be clickbait. Yep. Yep. Got it. Sort of tongue-in-cheek. Sort of, but not really. Kind of. Okay. So, Robert, do you want to tell us more about what your role is

Starting point is 00:18:36 on the CMake team? Yes. So, I started on the CMake team doing mainly the release process. So, I would do the forward-facing blog posts, upload the binaries, announce it on the mailing list, answer any mailing list questions. And from there, I started doing the CMake training that Kitware offered. And then I started to contribute to CMake. I started with the Visual Studio and Intel compiler feature detection, you know, for like C++11, 14, 17. Then, because VTK-m,

Starting point is 00:19:09 my project was moving into supporting CUDA, I started working on CUDA support and started becoming the native CUDA expert in CMake, writing a lot of that code. So compiling CUDA code inside of CMake, or with CMake, I did a large share of that. And from there, it's just sort of kept on growing. I just keep on modifying CMake every time I run into something or somebody tells me something that they want to see fixed. So that's how I got into it. During the intro, you complained about the, or mentioned the 3.0 that just kept getting pushed back and pushed back and pushed back. How involved were you at that moment? Like, like how personal is that for you? I just started on that. So we were just transitioning over. So I had done the 2.8.12

Starting point is 00:19:55 and the 2.8.11 releases, I think. And we had been wanting to do 3.0 for a while even then. And there were version numbering issues. We needed to have that fixed, and we needed to do some internal cleanups and have a major number, just because we were going to break some backwards compatibility. And then it started being like, well, maybe we should fix this too. Let's add that to 3.0. Right? Like, if we're going to do one backwards compatibility break between two and three, how many places should we break it? And so that just kept on creeping. Right, let's just break this too, let's break this too. And so that kept on going. And then at a point we were doing 3.0 and I came in, and I was like, no, we need this too. And so I didn't help the situation out, because I was like, we should

Starting point is 00:20:40 fix this issue that we have and put this into 3.0. And yeah, so I helped push off 3.0. And then what process have you moved to? So we aim for a release every four months. So we generally release three major releases. So that's like 3.11, 3.12, 3.13. And those are scheduled every four months. That has a bit of a flex given how long a previous release is. So if a previous release took a long time, we might not get three

Starting point is 00:21:13 in a year. But in the last three or four years, we've done three a year consistently. And then we have a date. We put like a milestone. We have a date. We generally try to stick to that date, give or take a little bit, and that becomes the feature freeze. So at that point, it's feature frozen, we know which features are going into that release, we tag it, and we only accept bug fixes or regressions on that from that point. So like a primary development can continue on with the next release, all the new features. So, you know, 3.15 has been in RC since the start of July. 3.16 has already had a month of development on it as we are iterating on the RC fixing regressions. And

Starting point is 00:21:53 if we find out that a new feature that we added in 3.15 is just not fixable in time for when we want to get it out, like if we know it's going to take like a month of extra work, we can't hold up the release like that. We'll just revert the whole change. And that happened in 3.13. We reverted some properties with the scoping on when you can use target_link_libraries, and we had to pull that all out and redesign it from scratch to catch all the issues. So the RC1 had it, and then RC2, we had to pull it because we realized it was going to take about

Starting point is 00:22:25 three weeks of development to fix, and we couldn't hold up a release for that. You said a couple things that sounded like strong words, like "revert the change." Now, that's implying to me that you went in and you're like, nope, get done, like, let's revert, you know, these five SHAs, let's, you know, pull this out, not just that you ifdef'd it out or something. No, we reverted it. So we would revert it on just that release branch. So it's in master, it's just been reverted from the release. And we did that so it's clear.

Starting point is 00:23:01 If you looked at the git history, we reverted this due to this issue. Okay. Because in that case, we couldn't use most of the code. So ifdef-ing it out is not going to fly, and we don't want any way for somebody else to, like, compile it back in, of course, in some ways. Right.

Starting point is 00:23:18 So, so that like, I'm thinking you've got your release branch you're working on and you've got the next development branch you're working on and you're making your changes on the release branch and fixing regressions and stuff, which clearly then need to be forward ported. Or do you work on a third branch with the bug fixes and then merge it in

Starting point is 00:23:39 both directions? Like how do you manage all this? We put, we, um, so changes for the release happen on the release branch, and then they get merged back into the master branch. So changes for that release are done off of the release branch,

Starting point is 00:23:56 and then we do some Git magic to bring them into the master that I don't remember. We have a robot that does it now for us. So we just like do merge it in the GitLab issue or the merge request and it does all the logic for us. Interesting. I'm curious if it does a bunch of cherry picking or? No, it would not do cherry picking. I know that for sure. It would probably do some form of like other way of doing it. We really like a single history that you can track down and bisect. Yeah.

Starting point is 00:24:29 That's what I was thinking about. Yeah. The part that had me like questioning all of this was the, the bit where you said, yes, we re we revert a major change if it's problematic. And I'm like, yeah,

Starting point is 00:24:40 then do you want to merge that revert into the next bit or not? That's a good question. I would have to go look at our history to see what we did. Okay. No, I'm curious. Go do it. No, I'm just kidding. Yeah, we have, like, a lot of those tools.

Starting point is 00:24:56 Like, we have reformatting tools that we run. So on CMake, we run clang-tidy and, like, Clang formatting tools. And when you open up a merge request, it'll tell you, like, hey, you don't follow our style guidelines. And then you can tell automated tools to reformat all of your commits. So it's not just we want the tip to be correct. We want every single commit to be correct. Because we try to strive towards every commit being bisectable and also following all the style guidelines. And that's too much for people to do. So we have robots and scripts that do it that you can issue.

Starting point is 00:25:32 Bisect is an amazing tool that I don't think a lot of people know about. So since you've mentioned it like three times, explain it. So bisect allows you to mark a commit. I'm going to do the simple version. You can mark a commit that is good and a commit that is bad. And then it uses a search, like a binary search, where it jumps between them and asks you if this commit is good or bad. And you can flag it as good or bad. And then it'll finally narrow it down to the one or more commits where it went from good to bad. On a really nice tree, what you can actually do

Starting point is 00:26:08 is git bisect run and give it a bash script to run. And that will then do that and return an error code of like, a return code of zero means that's good and a different error code means bad. And then there's a special 125 for

Starting point is 00:26:23 ignore this commit, like we couldn't build it or something. What we can do is we do git bisect run with a script that recompiles CMake and runs whatever test we want to bisect on that exposes a bug, and it'll go through the history and then find the exact commit

Starting point is 00:26:39 in our history that introduced that bug. And we can basically automate that. I imagine on a large project like CMake with many developers, that can be very helpful. Yes, yes. Very cool. I just had to run it

Starting point is 00:26:53 because I introduced a bug in 3.15. So it's very personal and real at the moment. It's very personal, yes. And did you run it with the full script mode? Yes, I did git bisect run, and then it had to rebuild CMake and then run a test, and it did that. It came back to the commit that I did that introduced the regression, and then I had to go off and fix it. Could you plausibly estimate how many hours, days, weeks, whatever of effort that being able to run that as a script saved you?
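The search that bisecting performs can be sketched as a plain binary search over a linear history. This is a simplified model and not git's implementation: `first_bad_commit`, `TestFn`, and `sample_test` are hypothetical names, commits are modeled as indices, and the special "skip" exit code 125 mentioned above is deliberately left out. The only convention kept is that exit code 0 means good and a non-zero code means bad.

```cpp
#include <cstddef>

// A test script is modeled as a function from commit index to exit code:
// 0 means "good", any other value is treated as "bad" in this sketch.
// (The real tool also treats 125 as "skip this commit"; that case is not modeled.)
using TestFn = int (*)(std::size_t commit_index);

// Returns the index of the first bad commit in a linear history, given one
// commit known to be good and a later one known to be bad (good < bad).
// Each step tests the midpoint and halves the remaining range.
constexpr std::size_t first_bad_commit(std::size_t good, std::size_t bad,
                                       TestFn run_test) {
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (run_test(mid) == 0) {
            good = mid;  // bug not present yet: search the newer half
        } else {
            bad = mid;   // bug already present: search the older half
        }
    }
    return bad;
}

// Hypothetical stand-in for "rebuild the project and run the failing test":
// here the regression is assumed to have landed in commit 6.
constexpr int sample_test(std::size_t commit_index) {
    return commit_index < 6 ? 0 : 1;
}
```

With a known-good commit 0 and a known-bad commit 9, `first_bad_commit(0, 9, sample_test)` needs three test runs instead of checking every commit in between, which is where the time saving on a long history comes from.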

Starting point is 00:27:22 No, I can't. It's just too much. But at least many hours, I can't. It's just too much. But at least many hours, maybe many days. Yes, many hours, at least. Very quick. I sort of knew that I introduced it. So in this case, it was pretty easy to fix. But in general, in past, it's been invaluable. And generally, we find all these things

Starting point is 00:27:43 because we do merge... every merge, we test. So like every merge branch, we test it on a subset of machines, and then we stage it, basically get it ready to go into production. And we do it on an even larger set of machines. So generally, we find stuff there. It's just only when we have coverage holes that these leak in, and then we have to do the bisecting. Though the bisecting does help on the stage, because you'll merge in like 10 or 11 commits for a day, and then one of those produces an error, and if you can't immediately see it, it helps CMake, because there are a lot of developers. Yeah, I'm familiar with git bisect. I have used it to find

Starting point is 00:28:21 bugs, and I feel like you're using it like... if I'm at beginner level, you're at advanced level. Yeah, I've never done it. Yeah. So you said you do four-month cycles. How far along are you in the current cycle? And can you tell us about some of the features you're working on? So 3.15 is going to have another release candidate this week. Hopefully it's the last one.

Starting point is 00:28:45 So that'll be out probably in a week or two after that. And so 3.16 has had about a month of development on it. So 3.15 actually, I think, has a lot of major changes. I've started to do some stuff online just to tell people about some of them. I think the big one for a lot of people is the improvements on the Windows generators. We stopped adding the /W3 flag by default. It's guarded by a policy, but going forward, you will not have CMake enable warnings on all your projects without you opting into it. And we now allow you to specify the /MT or /MD flags for the runtime library you want to use.

Starting point is 00:29:26 So that's actually now a target property you use rather than having to string replace the flags. And on top of that, 3.15 is going to be the first signed binaries that we have for Windows. So we'll now be a proper executable on Windows.

Starting point is 00:29:42 And you won't have any warnings. Will I be able to use the Windows Store then to install it? No, we don't have it as a Windows Universal package. Right? No, but they just opened the store up to Win32 applications. So maybe in the future. They are working on that, yeah. I thought CMake, you could install either the 64 or 32-bit versions.

Starting point is 00:30:03 Am I thinking of a different project? No, you can, but they're not built with the universal Windows Store API, I think, that you need. I think I would have to go look at that. Maybe it'll come to the store. You know, I've only actually honestly used the Windows Store to install like four programs total, and only when I've had to. Like now Skype is required to be installed that way, and installing OneNote, which we use for show notes here, for our listeners, it wants to

Starting point is 00:30:33 be installed that way. And then I went to install another program and I was getting like a license error, so I had to go to their support forums and just download the executable directly. Even though it was a free project, I couldn't get it to install from the Windows Store. Like, so far, you know, a 75% success rate's not great. Yeah. I thought there was some stuff on the Windows Store on what kind of applications they allow. So I think with us being a Win32-based application, we're not allowed in currently. I have no idea. Yeah. But it's signed now, so no more, you know, do you trust CMake? So we got that all fixed up. But like 3.15 has a massive amount of changes.

Starting point is 00:31:13 So did you have anything yourself that you saw that you wanted more information on? Well, I'm just kind of curious about the things that you just mentioned, the MT not being a string replace anymore and the warning flags. And you said that's an opt-in one. So that's one of those things that you set at the top of the project and you say... oh, I forget how that is. So to preserve backwards compatibility,

Starting point is 00:31:35 CMake has a concept of a policy. So when you say CMake minimum required version, we convert CMake to that version of CMake. So it's in backwards compatibility mode and it acts just like that version would. So if you say your minimum required is 3.3, we'll behave like the CMake 3.3 binary did. We won't behave like 3.15. So if you say your minimum required is 3.15,

Starting point is 00:32:03 you'll get this new behavior where you would not have /W3 on the line. You would then not have /MT or /MD. You would have to specify which one of those you wanted through a target. That's complicated, right? Because now you've said you needed to update your version. So the minimum required also has a syntax, which is the version that's your minimum required, dot, dot, dot, three dots, and then another version number. And that is the version to set your policies to. So you can say 3.3 is the minimum required, but turn all policies on all the way up to like 3.15.
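Editor's sketch of the version-range syntax just described. This is a minimal illustration, not something read out on the show: the project name, the generated `main.cpp`, and the target name `demo` are made up, and the `MSVC_RUNTIME_LIBRARY` property and `/W4` flag are taken from the CMake and MSVC documentation rather than from the conversation.

```cmake
# Language level: 3.3 is enough to run this file.
# Policy level: behave like 3.15 when the running CMake is at least that new.
# According to the CMake docs, versions before 3.12 ignore the "...3.15" part.
cmake_minimum_required(VERSION 3.3...3.15)
project(PolicyRangeDemo LANGUAGES CXX)

# Generate a trivial source file so the sketch is self-contained.
file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/main.cpp" "int main() { return 0; }\n")
add_executable(demo "${CMAKE_CURRENT_BINARY_DIR}/main.cpp")

# With the 3.15 policies active, the MSVC runtime (/MT vs /MD) is chosen
# through a target property instead of being edited in the flag strings.
# Other compilers ignore this property.
set_property(TARGET demo PROPERTY
  MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>")

# The warning level is likewise requested per target.
target_compile_options(demo PRIVATE $<$<CXX_COMPILER_ID:MSVC>:/W4>)
```

On a CMake older than 3.15 the same file still configures; it simply keeps the older flag behavior.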

Starting point is 00:32:39 So give me the policy behavior like I'm running 3.15. But if that doesn't work, language-wise, I'm 3.3. Interesting. And that was introduced fairly recently, 3.13 or 3.12, if my memory serves me. And that was tricky because that has to be backwards compatible. So older versions that don't know that syntax can still handle it. Backwards compatibility is a big deal. Yeah, well, and a tool like

Starting point is 00:33:08 CMake can't afford to break too much. We try not to. So that's why we have this entire policy design. And you can selectively enable policies if you don't want to opt into certain ones, and there's a whole set policy API there to do that.

Starting point is 00:33:24 So when am I going to get CMake modernize, like the Clang modernize features that just goes through and cleans up all of my 10-year-old CMake that's so, so out of date? Oh, that's a good question. I don't know. I would love that tool too.

Starting point is 00:33:40 Somebody should write it. Somebody should write it, yes. All right. Yeah. Yeah. The other big one that people are really interested in now is CMake. We had that dash dash build command. So you could build CMake or build a project not knowing the underlying generator.

Starting point is 00:33:59 And now we added dash dash install. Oh. Yeah. So yeah, so you can now basically, like for your CI, you can script it entirely and not know any of the underlying generators. You can specify like your source and your build directory and then do a dash dash build and do a dash dash install and never, never have to worry. Yeah. CMake dash dash build for those who are not aware of it is a huge time saver when you're working on cross-platform build scripts. Yes.
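Editor's sketch of a generator-agnostic CI script along the lines described here, written as a CMake script so it runs the same way on every platform. The file name `ci.cmake`, the directory layout, the `run_step` helper, and the parallelism level of 4 are all assumptions for illustration; it also assumes a `CMakeLists.txt` sits next to the script.

```cmake
# Run with: cmake -P ci.cmake
cmake_minimum_required(VERSION 3.15)

# Hypothetical layout: sources next to this script, build and install
# trees underneath it.
set(source_dir  "${CMAKE_CURRENT_LIST_DIR}")
set(build_dir   "${CMAKE_CURRENT_LIST_DIR}/build")
set(install_dir "${CMAKE_CURRENT_LIST_DIR}/install")

# Small helper (not part of CMake): run one command and stop on failure.
function(run_step)
  execute_process(COMMAND ${ARGN} RESULT_VARIABLE status)
  if(NOT status EQUAL 0)
    message(FATAL_ERROR "Step failed (${status}): ${ARGN}")
  endif()
endfunction()

# Configure, build, and install without naming the underlying generator.
run_step("${CMAKE_COMMAND}" -S "${source_dir}" -B "${build_dir}")
run_step("${CMAKE_COMMAND}" --build "${build_dir}" --parallel 4)
run_step("${CMAKE_COMMAND}" --install "${build_dir}" --prefix "${install_dir}")
```

With a multi-configuration generator such as Visual Studio, the build and install steps would normally also take a `--config` argument.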

Starting point is 00:34:27 Yeah, so it will invoke the underlying build system, and we have support for passing parallelism levels, like a dash-J, and also a verbosity, which works everywhere but Xcode now. Yeah, well, eh, Xcode, no big deal. It's always verbose. Okay. So that's 3.15. That's the upcoming release. You said the one that you started, you branched in beginning of July. Oh, June.

Starting point is 00:34:54 Beginning of June, yeah. Are there cool features on 3.16 that we should be looking forward to? I'm working on a set of features right now to control some more of the toolchain searching. So when in a cross-compilation mode, specifying where a find package, find library, find path,

Starting point is 00:35:16 find executable, where those searches, we're going to offer more fine-grained control over that at a global scope. So you can turn off searching CMake variables or turn off searching system libraries globally for cross-compilation. So this is going to help out some cross-compilation targets

Starting point is 00:35:35 where they have a splayed layout. That sounds pretty cool. It's been a long time since I've had to worry about cross-compiling. Now if I want an ARM build, I just go to my Raspberry Pi or something and actually do it natively instead of worrying about the cross-tools. So we call it cross-compilation, but a lot of times it's for package managers, too. So package managers need to make sure that they're isolated from any of the system installations. And so this gives them the control to even better insulate them from the system package so they don't accidentally find a system version of something. That I am well

Starting point is 00:36:11 aware of. That is a huge problem, potentially. You're like, I don't, the package works fine on my system, I don't know why it's not working on yours. And then, you know, three hours later of ldds, you finally figure out which library is wrong. Yeah. Yep, yep. So that's one of the cross-compilation targets, is just building against a different toolchain or installation set of libraries. That sounds cool. And so, like, into that, for 3.15 we introduced the ability that, uh, the generate, sorry, the export package command, which used to install packages into the user's registry, doesn't work by default anymore. You have to explicitly opt in as the user instead of explicitly opting out. And this is to get CMake's defaults to be more of an opt-in behavior where the user has explicit control, rather than having the user explicitly opt out, because the user doesn't know a package, probably, or a program is

Starting point is 00:37:10 going to install itself into a package registry. So then when it happens, it's happened silently, and then it breaks some build down the line, right? Where you expected to find your PNG but it found your package registry version, and you're like, well, how did it find this? So now you have to opt in, and that allows the user better control. Okay. I want to interrupt the discussion for just a moment to bring you a word from our sponsors. PVS Studio is a tool for detecting bugs and security weaknesses in the source code of programs written in C, C++, C Sharp, and Java.

Starting point is 00:37:41 It works under 64-bit systems in Windows, Linux, and macOS environments, and can analyze source code intended for 32-bit, 64-bit, and embedded ARM platforms. The PVS Studio team writes a large number of articles on the analysis of well-known open-source projects. These articles can be a great source of inspiration to improve your programming practices. By studying the error patterns, you can improve your company's coding standards, as well as adopt good programming practices which protect from typical errors. For example, you can stop being greedy on parentheses when writing complex expressions involving ternary operators. Subscribe to Facebook, Telegram, or Twitter to be informed about all publications by the PVS Studio team.

Starting point is 00:38:19 Links are given in the podcast show notes for this episode. So Jason obviously knows a lot more about CMake than I do. I spend most of my time in Visual Studio, but just last week we were talking about the JetBrains Developer Ecosystem Survey, and they pointed out how CMake is the most popular build system currently. They had it at 42%.

Starting point is 00:38:40 So it is the most popular build system, but I know a lot of people do still like to poke at it and hate on it. So I wanted to know how you felt about the love-hate relationship with CMake and the C++ community. I think it's expected. If you're fairly popular, you're going to have a lot of users. And CMake tries to support a lot of different user setups or how they want to do their build. And that kind of complexity comes with the requirement to have a lot of configuration options. And I think that becomes one of our primary complaints is everyone's like, I have a simple project and CMake isn't simple to do with that. And while we're working on that to make it easier, it's not there yet. So it makes

Starting point is 00:39:27 sense to me. And then you also have conflicting goals of what people consider to be simple, or what is a small project. And CMake really was always designed for large projects was one of our goals, right? We initially started CMake to build ITK back in the late 90s, and that was already a large C++ project that had to hit multiple different operating systems. It does sound like a large project if you built your own build tool to build the project. Yes. Although I guess what, Qt and Boost have both done that as well. Well, it wasn't our project. It was National Institute of Health, the NIH, funded the work because it was their project.

Starting point is 00:40:10 So just to be clear, the NIH funded CMake's initial development. Oh, and that's something I think a lot of people aren't aware of. I mean, I've had some involvement with national labs and such. And the U.S. government is effectively not allowed to own a copyright. So if they fund the development of software, a considerable portion of it's open source. Assuming it's not classified or something like that.

Starting point is 00:40:33 And those still could be open source. They're just not publicly available. Right. So if you have clearance, it's open source. Yes. I have not been involved in any of those projects. That is what I'm told. That is what you're told.

Starting point is 00:40:58 No, we get a lot of funding from both private corporations and from public entities like the National Labs to improve CMake. So everyone uses it, and so everyone wants to see new features be added or fixes for their domains. What license is CMake? I don't know if I've ever looked. BSD. Three-clause. Are there any areas where you personally feel like CMake has a lot of room for improvement? Yes. Specifically, I still

Starting point is 00:41:22 think CMake needs the reverse of the documentation that we have currently. We have reference API documentation. So if you want to know what a command does with certain options, we document that fairly well. If you want to know why you should use that command, we don't document that. Right. So that is an area that we are going to work on. But until that lands, it's a large issue.

Starting point is 00:41:47 I also think that the CMake language itself could be tightened up. We could improve the language. We've done it in the past. We continue to do it. So I'm willing to put some elbow grease in and improve the CMake language. So I think those are like the two major ones. Oh, I remember. The other one is exporting.

Starting point is 00:42:08 So when you have a CMake project and you want to export it for consumers, there's a lot of boilerplate that you currently have to write. I want to see that reduced. That should be really simple so that therefore all projects can do it and they don't have to reproduce all this boilerplate. And that could be really helpful for things like consuming that project then with Conan in the future.

Starting point is 00:42:31 Yes. Yep. So I'm currently, there's a lot of stuff for the exporting that can be cleaned up, and wrappers could be provided to do the general case. And that's what we did for the install command with 3.14. So the install command, you used to have to specify, you know, destination for your binaries, for your dynamic libraries, for your static libraries, and so on. And now you can actually just drop all that, and we'll use the defaults provided by the GNUInstallDirs package. So now you can just do like

Starting point is 00:43:00 install targets, and we will put the executables under a bin directory, and we'll put the libraries under lib. How does that work on Windows? Same feel? I believe it's the same. I think we put the DLLs under bin. Which makes sense on Windows, yeah. Yep. But that is one of those pushes towards a simplification where the defaults are better for users. I was thinking about the fact that you just said that you have made the language better in the past. And I have been a CMake user long enough to remember when if you had an if statement and

Starting point is 00:43:34 inside the parentheses, you had to put the clause, then on the end if you had to duplicate the same clause again. Yes. And that was so, so annoying. And the first time I saw CMake without that in the end clause, I was like, oh my goodness. Or when we supported decapitalized commands. I don't remember that one changing. So it used to be everything had to be uppercase? Okay. I knew everything used to be uppercase, and then I knew at some point style guides were like, stop making it all uppercase. And I was like, was there a reason why we did it uppercase in the past? Like, I totally wasn't aware. It was a language restriction. We only supported uppercase as an identifier, so we knew when something was a command.
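Editor's sketch pulling together the simplifications mentioned in this stretch: lowercase commands, an `endif()` that no longer repeats its condition, and the 3.14 `install(TARGETS ...)` form without explicit destinations. The project name, the generated source file, and the target name `demo` are made up for illustration.

```cmake
cmake_minimum_required(VERSION 3.14)
project(ModernSyntaxDemo LANGUAGES CXX)

# Generate a trivial source file so the sketch is self-contained.
file(WRITE "${CMAKE_CURRENT_BINARY_DIR}/main.cpp" "int main() { return 0; }\n")
add_executable(demo "${CMAKE_CURRENT_BINARY_DIR}/main.cpp")

# Older CMake code would have been written as:
#   IF(WIN32)
#     ...
#   ENDIF(WIN32)
# Today the commands can be lowercase and the closing clause stays empty.
if(WIN32)
  message(STATUS "Configuring for Windows")
endif()

# Since 3.14 the destinations can be omitted: executables default to
# "bin" and libraries to "lib" under the install prefix.
install(TARGETS demo)
```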

Starting point is 00:44:19 But that was removed. Like, now we offer the ability, so you know, to dereference a variable in CMake you use the dollar sign and the curly parentheses. And then we have a mode that says you can actually dereference environment variables by using dollar sign, ENV, curly brace. Right.

Starting point is 00:44:37 But we also now offer one that allows you to find out what a cache variable is. And so you can do dollar sign, CACHE, parentheses, and now you can see what the difference is between a local variable that shadows a cache variable. And so that syntax allows us some room there to potentially simplify how we set properties on targets. We could add new identifiers inside of that component where dollar sign something parentheses allows us some language room. Okay. So become a CMake developer and work in that space because I'm far too busy. I'm a little bit busy with travel at the moment to start another project.
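Editor's sketch of the three lookup forms just described. In written CMake they use curly braces: `${VAR}`, `$ENV{VAR}`, and `$CACHE{VAR}` (the last one is documented as available from CMake 3.13). The variable `DEMO_MODE` and its values are hypothetical.

```cmake
cmake_minimum_required(VERSION 3.13)
project(VariableLookupDemo NONE)

# A cache entry, and then a normal variable of the same name that shadows it.
set(DEMO_MODE "from-cache" CACHE STRING "Hypothetical cache entry")
set(DEMO_MODE "from-local")

# ${...} finds the normal variable first; $CACHE{...} reads the cache
# entry directly; $ENV{...} reads the process environment.
message(STATUS "plain lookup: ${DEMO_MODE}")
message(STATUS "cache lookup: $CACHE{DEMO_MODE}")
message(STATUS "environment : $ENV{PATH}")
```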

Starting point is 00:45:21 But that's, yeah, that's all. That's pretty cool. What about like warnings to tell you if a local variable is shadowing a global one? So we don't have that. In actuality, I recommend one. So there are very good reasons why you want local variables to shadow global variables. So the classic example is extending the C++ flags you have, right? So the old style was, you know, you add them to CXX flags, and then we said in modern CMake you shouldn't do that. You should add it to a target. Right.

Starting point is 00:45:54 And that's correct. You should add it to a target. But the variable version of that is very unique. So if you set CMake CXX flags to the contents of CXXFlags plus your flags, you're not actually modifying the cache variable. What you're doing is you're constructing a local variable that takes precedence over the cache variable. Okay.
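Editor's sketch of the flag example described here. The function name `show_scoped_flags` and the extra flag `-DDEMO_EXTRA` are hypothetical; the point is only that appending to `CMAKE_CXX_FLAGS` with a plain `set()` creates a shadowing variable and leaves the cache entry alone.

```cmake
cmake_minimum_required(VERSION 3.13)
project(FlagScopeDemo LANGUAGES CXX)

function(show_scoped_flags)
  # This creates a variable local to the function that takes precedence
  # over the cache entry of the same name. The cache is not modified.
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DDEMO_EXTRA")
  message(STATUS "inside function: ${CMAKE_CXX_FLAGS}")
  message(STATUS "cache entry    : $CACHE{CMAKE_CXX_FLAGS}")
endfunction()

show_scoped_flags()

# Back outside the function the extra flag is gone again.
message(STATUS "outside        : ${CMAKE_CXX_FLAGS}")
```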

Starting point is 00:46:14 That will be dereferenced first. That variable doesn't modify the cache state. It's just local. So it'll go out once you leave that directory or that function. So it has scope and it has a lifetime. Which is very useful. If you do like cache force on it, it'll actually

Starting point is 00:46:31 write it to the cache. Now that's problematic because each time you run CMake you're going to additively add those flags onto it. So whatever flag you add will just keep on getting added on multiple times. The local variables actually don't modify cache variables, but construct a local variable that overrides the cache

Starting point is 00:46:50 or takes precedence over the cache variable. All right. And you don't have to know, like your code, downstream code never really has to know about it, which is excellent. Okay. Is there anything else you want to talk about with CMake, or should we talk about VTK?

Starting point is 00:47:05 Let's go talk about VTK. Okay. I can talk all day about either one. So do you want to tell us a little bit more about what VTK is and what your involvement is with it? Yeah, so VTK, and then there's VTKM. So VTK is about 20 years old. It's the visualization toolkit. It was one of the first projects.

Starting point is 00:47:24 It actually started before Kitware existed. The original founders started writing a book, the Visualization Toolkit book, and they decided to build a library for that to showcase the book. That then turned into a company. And so that's like the starting history of Kitware. And that is for visualization of geometric things. So if you have a triangular mesh and you want to see it, or if you have a hexahedral data set or a volumetric data set from an MRI, VTK can visualize that and perform algorithms on it.

Starting point is 00:47:57 And these are data analysis algorithms. So clip it, cut it, see how particles would move through it given a field. Like if there was like a pressure field, how would that move the particles? And so that's VTK. And that was all written for CPU architectures. It was extended to support MPI so it could run across multiple processors in an HPC environment. But what it wasn't designed for was for emerging architectures. So what happens when

Starting point is 00:48:25 you have GPUs in the mix, and you want to do more than just rendering on a GPU, you know, we support volume rendering and ray tracing and rasterization on a GPU, but we don't support computation in standard VTK on a GPU. And so that's where VTKM comes in, we do computation and analysis and rendering using GPUs or many core architectures like the Intel Xeon Phi that's leaving us or other new designs with super high core counts on a chip or on a die. And so for that approach, we have a fine-grained parallelism approach. So we offer a collection of data parallel primitives very similar to what you would find in like std parallelism, but with some extensions. So we have from like Thrust, we have a lot of the parallel lower bounds, parallel inclusive scans, a lot more by key variants than you would get in std parallel. And then we offer domain specific building blocks on top of that. So if you want to iterate a hexahedral mesh on all the cells and have information about the points of each hex, we offer a high-class worklet design or a kernel design where you can describe that.

Starting point is 00:49:38 And we will automatically find all those values and give it to you for just a single cell. So you can write your algorithm on a per cell basis and then we handle all of the parallelism for that. And you mark which fields are input and output or read-write and we handle all that too. I understood virtually none of that.

Starting point is 00:49:57 You have a geometric domain, right? Big box. Sub, yeah, subdivide that into... I understand big boxes. Big box, voxels, right? Okay, voxels. Volumetric pixels, right? Yeah, volumetric pixels. All right. So that's a really simple use case, and you want to iterate the cells, right? The volume of each one. Okay. Which means that you need the eight points of each voxel. And you might want three or four different properties associated with every point of that voxel.

Starting point is 00:50:31 And you probably don't want to write every single time you have an algorithm that iterates over all the cells to rewrite all that infrastructure and how to grab the points and probably have some lookup tables it needs to query other arrays. So VTKM handles all that. And what you write is a worklet or a kernel with the parentheses operator, and you specify what you want. Okay, so a function object like a lambda or something? Yeah.

Starting point is 00:50:58 We don't support lambdas because we need some more decorators that tell us, is this field that's coming in as the first parameter, is it cell-based or point-based? And so that tells us where to look it up and the access patterns of it. And is it read or write? If it's a write field and you don't have any storage, we know what size to allocate.

Starting point is 00:51:18 Or if it's read, we know we can use a restrict load. That kind of sounds like entity component system kind of stuff that game programmers do. You have a bunch of elements, you iterate over them really quickly, ship it off as fast as you can,

Starting point is 00:51:33 move on to the next thing. Yeah. VTK has been like an element entity component system since its foundation, where all of the arrays are independent of the data set. And so you can work on individual arrays or you can work across them on a slice that corresponds to a single flyweight cell.

Starting point is 00:51:52 Okay. I understood considerably more of that. There we go. Thank you. Who are the actual users of VTK? Is it the actual scientists with a bunch of data they want to visualize or are there engineers in between the scientists? VTK is the

Starting point is 00:52:09 fundamental toolkit, and on top of that there's visualization toolkits or GUIs that have been built. ParaView, which Kitware develops, is one of them. There's another one, VisIt, that also uses it. And there's other tools that I don't remember that are also built on VTK. So a lot of

Starting point is 00:52:26 users use it without knowing they're using VTK, but I would say most of our users use it through ParaView. So that's the GUI, and it allows you to build rendering pipelines to see what you're interested in. So is that like a drag-and-drop visual programming kind of environment

Starting point is 00:52:42 or something? No, it's more of like a pipeline mode, which so you construct a flow diagram of which filters and what comes, like the sinks and sources, which comes in, what goes out of each one. And then you have like a rendering window or a 2D charting window or a 3D charting window. It's not as much drag and drop as classic UI style.

Starting point is 00:53:04 That sounds pretty cool, though. Kind of sounds like some of the real-time audio processing stuff that I've seen. Yes. It does the same kind of pipeline mode. And when you get into pipelines, there's push-based and pull-based. So does your primary source on one end, does it push signal through? Or do your sinks, do they pull information each time that they get a request? Okay. So I think audio is push-based because you always have a signal

Starting point is 00:53:32 coming in. Right, right. And ours are generally pull-based, uh, because once we have a new render, we want to get information for that. Right. Generally the data is static, and then we're just pulling data through it. Right. That makes sense. Okay. Yeah, I've been doing VTKM for four-plus years now, so that's been what I've been primarily developing and doing a lot of the core infrastructure work and then writing higher-level algorithms on top of it. And how long have you been at Kitware now total?

Starting point is 00:54:01 Nine and a half years. So it sounds like Kitware's got a lot of open source projects going and you've touched many of them off and on over the years. I have touched a lot of them in our division. So we have five divisions. Oh, wow. Yeah, we have about 100 and some odd programmers across the company. And I work in the scientific visualization department.

Starting point is 00:54:23 So I work on ParaView, VTK, other applications built on top of that. We have a ComputeCMB, TomViz. We have a ton of applications that we work on or we collaborate with other people on. And I've worked on most of those over my career. And then we have a software process team, which is mainly CMake focused. And then we have a data analytics team that does a lot of JavaScript, web browser based data analytics. And JavaScript, we don't care about JavaScript. Yeah. And then, yeah, and we have a medical team that focuses on, uh, medical computation, MRI, uh, segmentation, all those things. And then we have a computer vision team too. So are those teams also working on open source projects

Starting point is 00:55:07 that I've just never heard of before? Yes. Okay. So the medical team works on Slicer and ITK, which are two very large open source medical packages. Okay. The data analytics team has contributed to a couple of the, I don't remember the names,

Starting point is 00:55:22 but they contributed to some of the major JavaScript libraries for layout stuff. And then our Vision team has its own collection of open source computer vision frameworks. Wow. Yep. We're open source from the bottom up. That's unusual.

Starting point is 00:55:40 It's a pretty good company to work for if you're interested in working on open source projects. Is Kitware currently hiring engineers? Yes, we are hiring C++ engineers in my department and also the vision department. So there's, I think, five or seven open positions currently for C++ developers with some varying level of domain information. So I think we're looking for people with either an HPC background or a GUI background or a generalized GUI background. And where are you located? We have offices. I'm located in Albany, New York. We have offices in Albany, DC,

Starting point is 00:56:17 North Carolina, and New Mexico. Albany, DC, North Carolina. Okay. That covers a certain swath of the country. And then we have a sister company in Lyon, France. Oh, okay. So if anyone is in France looking for a C++ development job, check out the website.

Starting point is 00:56:39 Yeah, they have a position available. Cool. Okay. Well, it's been great having you on the show today, Robert. It was great being here. Thank you very much. Thanks for joining us. That was great. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook

Starting point is 00:57:05 and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
