CppCast - JIT Compilation and Exascale Computing

Episode Date: October 21, 2021

Rob and Jason are joined by Hal Finkel from the US Department of Energy. They first talk to Hal about the LLVM 13 release and why the release notes were lacking. Then they talk to Hal about his C++ JIT proposal, the Clang prototype, and how it could be used. They also talk about Hal's work at DOE, exascale computing, and more.

News

- LLVM 13 released
- Some lesser-known powers of std::optional
- Barbarian, an open and distributed Conan package index!

Links

- ClangJIT
- P1609R1: C++ Should Support Just-in-Time Compilation
- Hal Finkel, "Faster Compile Times and Better Performance: Bringing Just-in-Time Compilation to C++"
- US Department of Energy: Advanced Scientific Computing Research

Sponsors

- Use code JetBrainsForCppCast during checkout at JetBrains.com for a 25% discount

Transcript
Starting point is 00:00:00 Episode 322 of CppCast with guest Hal Finkel recorded October 13th, 2021. JetBrains is offering a 25% discount for purchasing or renewing a yearly individual license on the C++ tool of your choice, CLion, ReSharper C++, or AppCode. Use the coupon code JetBrainsForCppCast during checkout at www.jetbrains.com. In this episode, we discuss an LLVM update and a new Conan service. Then we talk to Hal Finkel from the U.S. Department of Energy. Hal talks to us about his C++ JIT proposal and the work done at DOE. Welcome to episode 322 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? I'm okay, Rob. How are you doing? Doing okay. I don't think I have much news to share.
Starting point is 00:01:44 How about you? Anything you want to talk about before we get started? I'll just remind people again that my on-site CppCon class is moving forward, and it is official. I don't remember if I said this the last time, but I've double-checked: I am the only on-site class happening at CppCon, the only one before and after the conference, or the only one entirely before or after. Mine's a pre-conference class, Saturday and Sunday before the conference. The field trip is still happening. Obviously, I can't be involved in that if I'm going to be teaching on Sunday, and it's shrunk down now to 14 people going on the field trip. So they're taking a 15-passenger van to do the field trip instead of a chartered bus. But it's all moving forward still. I'm still sad I missed going on the field trip two years ago and not going to that restaurant.
Starting point is 00:02:37 What was it called again? Casa Benita. Yeah, I mean, it closed over COVID, which isn't surprising at all, because people went for the experience, not for the food. Restaurants that were still delivering food remotely managed to survive, but no one was going to order that food remotely. Did you see the news that the South Park creators are buying the restaurant, though? Yeah, they're buying it. And they even did like a little press conference with like, the governor or something because it was such big news that it wasn't going to die.
Starting point is 00:03:08 Ultimately. Um, it's ridiculous that we care so much about this really poor quality restaurant. Next time I'm out in Colorado, hopefully it'll be opened and uh under new management i'll have to check it out yeah they said it's that they're going to make improving the quality of the food a priority okay well uh at the top of every episode i'd like to read a piece of feedback uh we got a couple uh you know good piece of feedback from the last episode this one was saying this episode is really interesting
Starting point is 00:03:48 focused on an application that demands C++ rather than C++ itself I would have liked to learn more about the PubSub framework they have built on 0MQ and Protobuf I guess this is two episodes ago because we're time delayed right now
Starting point is 00:04:04 but this is referring to the um episode we did with the uh two guys from xing technologies on autonomous uas yeah that was yeah we could have dug into so many different topics there and i guess if anyone wishes that we had talked about x instead of y you can mostly blame me because i kept stirring the conversation to things that interested me well i was just thinking, I think we've probably mentioned 0MQ a couple times with this other episode and other past ones, but we probably could do an interview with someone just about 0MQ. Yeah, or maybe even just more generically about this
Starting point is 00:04:40 message queuing kinds of things. I think 0MQ I might have first learned about that or something like it. Oh my goodness. And like 2006 ish, something like that. I mean, this is problems that people have been working hard to solve for a long time.
Starting point is 00:04:57 I know it's not something I've used personally, but I quickly looked at it and I saw, you know, the library itself is in C plus plus, but they have like a whole bunch of different binding, you know, around it. So you can use it in Python and plenty of other languages. It's one of those things where I get the impression that the average user probably is not a C++ developer. Probably not. Probably not. Okay. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or emails at feedback
Starting point is 00:05:24 at cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Hal Finkel. Hal is a program manager for computer science research in the U.S. Department of Energy Office of Sciences Advanced Scientific Computer Research Program. Prior to joining ASCR, Hal was the lead for compiler technology and programming languages at Argonne National Lab's Leadership Computing Facility. As part of DOE's Exascale Computing Project, Hal had a leadership role in several multi-institution activities, including several
Starting point is 00:05:52 focused on the LLVM compiler infrastructure project and a project focused on the COCOS and Rasha C++ libraries. Hal has been involved with ISO C++ standardization for nearly a decade, and he currently serves as vice chair of the U.S. C++ Standards Committee. He graduated from Yale in 2011 with a PhD in theoretical physics, focusing on numerical simulation of early universe cosmology. Hal, welcome to the show. Thank you very much, Rob. Oh, my goodness.
Starting point is 00:06:16 Nice to be here. There's just so much. I'm sorry. In your bio, I'm staring at it, thinking, because, you know, I like to, like, pick something and then ask the guest about it but uh what was the universe like in the early moments well we don't know oh okay all right no more seriously though um can you just would you for me my sake and for the sake of our listeners real quick describe what exascale computing is uh sure so exascale computing for the sake of our listeners, real quick, describe what exascale computing is.
Starting point is 00:06:51 Sure. So exascale computing is the kind of leading edge generation of supercomputing. You know, there's a petascale and then exascale comes after that. It's the next sort of thousand fold increase. And it's what we get when we put together sort of state of the art technology for GPUs, in this case, as the computational accelerator, and you have an awful lot of them that you put together in a huge data center, you connect them up with a state of the art, low latency network, and you use them to solve strongly coupled problems for, you know, simulating physical systems for doing large scale data analysis for doing all sorts of science. And we really have a really large investment, right, but there's also a really large set of opportunities to advance all sorts of things, you know, everything from like climate research to
Starting point is 00:07:51 optimization of advanced manufacturing processes. There's been an awful lot of work recently having to do with COVID-19, as you might imagine, everything from like drug discovery, you know, protein binding things. I'm not a biologist, so I can't really tell you the details exactly. But, you know, a lot of molecular dynamics, a lot of even genomics and other things, they all go on on these large machines to try and tackle these really important problems. That's really the important thing to realize about Exascale. And the Department of Energy is going to field a couple of these really large Exascale machines over the next couple of years. One at our National Laboratory.
Starting point is 00:08:40 The first one is going to be at Oak Ridge National Laboratory. And we have a lot of people who are, you know, already signing up to use them. And the Exascale Computing Project was a project that was designed to prepare all the software, the system software, the compilers, the numerical libraries, and a number of applications to actually use these things. Because otherwise, you know, the machine gets delivered and, you know, you don't want people to just stare at it. Right. You know, you have to have all the software set up to run. So we have a huge investment in making all of that happen. Interesting. So just quick question. X scale is obviously like the bleeding edge. But what comes after X scale? Do we know? I don't know. Zeta scale.
Starting point is 00:09:19 Zeta scale. Yeah. Yeah, well, you know, it's interesting because, you know, for a very long time, we've had increases in, you know, what you could get in terms of flops, you know, floating point operations every second. Because of, you know, Moore's law and Denard scaling and all these things, you know, processors got faster, you could add more parallelism, you could do so while decreasing the, you know, the node size on the manufacturing technology. And you could put in more cash. You could do all these things and get things to be faster. So one of the focuses of our research program actually is to figure out where you actually go next with, you know, with new computational technology, because, you know, it's not it's not really clear what's going to happen over the next couple of decades. But that having been said, I think there are a lot of opportunities that are really interesting. But the important point is that the scale, the petascale, exascale, normally these things have been taken to refer to some kind of base metric
Starting point is 00:10:26 like flops you know and uh and i think what we're realizing going forward uh is that you know we really have to focus on other metrics right things that are going to correspond more directly to scientific productivity uh of course uh and um are going to incorporate a lot of interesting new technologies and a lot of new algorithms. You know, the machines have been bought that way for a long time, like the metrics we actually use to evaluate the hardware, they're all application based. You know, we say, okay, we have these applications, we need them to get so much faster and get so much larger and so on. But still, it's easy to talk about in terms of how many flops the thing does. There's this top 500 list,
Starting point is 00:11:09 and the top 500 list has all these supercomputers, and they're all kind of rated by flops. But we'll see going forward how that turns out. But advancement in computing and technology is certainly going to go forward, even if the raw number of floating-point operations per second becomes harder to increase by, you know, thousandfolds going forward. Did I hear you right that the first X or the next X to scale system or whatever going in is going to Oak Ridge? Is that what you said?
Starting point is 00:11:38 That's correct. It's Oak Ridge National Laboratory in Tennessee. Why does it seem that Oak Ridge National Lab seems to always have the fastest DOE computer? Like, what's that about? Well, they have a lot of experience fielding those kinds of machines. And, you know, Tennessee has long been known for, you know, good infrastructure for power. Maybe that has something to do with it. They have a lot of space down there they can uh i don't i don't know if you've ever had the chance to uh go down to that to that area but no i haven't been to a bridge yet especially in the summer it's really pretty down there you know but uh but yeah they have a lot of space and um it's uh and and they have a good infrastructure uh for for doing that. The, I think if you look over the past several generations of things, Argonne has sometimes had machines that people thought were more usable,
Starting point is 00:12:34 even if they weren't the quote unquote fastest, you know, like these many core CPU machines, the, you know, the Intel Xeon 5s, the IBM BlueJeans, and so on. Oak Ridge went more aggressively towards accelerators. Their machines were faster, especially in the previous generation or so. But all in all, it's good to have different options, I think, because they serve different applications. But right now, everything is, of course, going to accelerators. I mean, GPUs are the things that we're getting now on
Starting point is 00:13:09 the exascale machines. And, you know, beyond that, I don't know exactly what's coming. But, you know, of course, now there are like a zillion companies trying to do machine learning accelerators, as you're probably aware. And they have a lot of different, really interesting architectures to them. I think that's going to bleed into architectures for more general computing. There's a lot of interesting things happening. Okay.
Starting point is 00:13:35 Well, Hal, we got a couple news articles to discuss. Feel free to comment on any of these and we'll start talking more about the work you're doing at DOE and the JIT proposal and other things. Okay? Sure. Sounds great. Okay. So this first article we have is the LVM-13 release notes.
Starting point is 00:13:54 But, you know, I saw this in the news and then I read more into it and saw that really, like, it seems like these release notes are not fully written out. It does seem that the release notes are not fully written out. There's lots of TBDs on here, yes. You would imagine a big V13 release would have plenty to talk about. But yeah, there's parts here where they just say, they're like placeholders, obviously. But it was an official announcement nine days ago on the LLVM mailing list. Yeah.
Starting point is 00:14:21 So we're not just crazy here. No, you're not crazy. You're not just crazy here no no you're not crazy you're not crazy and i will uh tell you that one of the nice things that lvm did some time ago was that they moved to time-based releases so you know every every end month whatever it is uh you get a new a new release and i think it's great because you know provides regular releases for the community. You're not waiting on sort of other random features before you get the thing you want, and so on. It's easier to plan around. One thing that is always difficult, however, is to make sure that the contributors and the code owners in various areas actually update
Starting point is 00:14:59 the release notes before the release. Sometimes this just doesn't happen. And this is somewhat unfortunate, but a lot of these people are volunteers and this is a somewhat unfortunate side effect. One of the things that the LLVM Foundation has been targeting and trying to encourage for some time is thinking about how to get better documentation, tutorials and the like. There's been a really good uptake in getting things like tutorials that were recorded at the developers meetings.
Starting point is 00:15:43 We put them on YouTube and you can do this kind kind of thing so we can get that kind of documentation. The written documentation and maybe the release notes are just an example of this. It still needs some work, but hopefully we'll get there. There is one thing I do want to call attention to on the cling tidy release notes. New extra checks. And this stood out to me because I'd actually been thinking, you know what? I want to try writing my own static analysis checks. And I've never seen anyone do this before. And what I wanted to do is write a check that would encourage you to use
Starting point is 00:16:16 stronger typing when an interface might be confusing to use. And it turns out that they have done that here. Where did it go? Bug prone, easily swappable parameters check. Find function definitions where parameters of convertible types follow each other directly, making call sites prone to calling the function with swapped or badly ordered arguments. That sounds like a potentially extremely helpful static analysis. Yeah, I would agree. I lost count of how many times I made a mistake like that. That sounds like a potentially extremely helpful static analysis. I would agree. I lost count of how many times I made a mistake like that. Right.
Starting point is 00:16:55 Now what we really need is a strongly typed using typedef and C++ so that these kinds of problems are easier to address too. But that's a different problem. Okay. Next thing we have is a post on the old new thing blog from Raymond Chen. This one is some lesser known powers of stood optional. And, you know, he basically well documents stood optional and how you should go about using it. And some interesting things that I wasn't aware of is that, you know, he points out if you have a stood optional int you can compare it you know less than or greater than zero and uh even if it's empty it will appear to be uh less than zero because empty must be less than zero i guess empty is less than everything i
Starting point is 00:17:38 think empty is less than everything yeah i, that makes me uncomfortable. Yeah. Because how many users intention? Well, and even Raymond actually suggested not to do it particular feature, but it's one of these like silent conversion things that happens in the standard library. That's potentially breaks the semantics of your code in a way that you didn't realize. And there's absolutely no way to catch for it if you're concerned about that
Starting point is 00:18:05 code. So yeah, I've never been a fan of optional anyhow. So I'm going to find things to complain about with it. I don't know. Do you have any thoughts on stood optional? How? Well,
Starting point is 00:18:16 you know, I think it's an interesting example of one of these hard design choices, right? Because it's like, well, what's, what's the alternative, right?
Starting point is 00:18:24 You can have it give, you could give it a tri alternative, right? You can have it, you could give it a tri-state return, you can make it throw an exception, and all those are messy for various reasons. You could provide a free function optional comparator so that you could just pass that when you wanted it to your container. Pop it into it.
Starting point is 00:18:40 You could, and I think this is what makes all this difficult and tricky. It reminds me of the stories behind how Stood Variant got to be the way that it is. And should that be empty or not? There are no easy answers, unfortunately. Have we ever discussed that? That could be an entire episode on its own, Rob, is the empty by exception design choice of variant, right? Anyhow.
Starting point is 00:19:15 Yeah. Yeah. Okay. And then the last thing we have is this new service called Barbarian, which is an open and distributed Conan package index. I love it just for the name. Obviously, Conan is a nice reference to Conan the Barbarian, so it makes sense.
Starting point is 00:19:34 But yeah, I was a little curious about this because obviously Conan has its own way of searching for packages. Do you understand why this is necessary and kind of what it does differently than Conan's own package manager service, Jason? My understanding is that this would be an easier way for individuals to get their packages usable in a source only way from Conan, similar to how VC package, at least until recently, had one giant mono repo That was just a bunch of recipes that you could pull from. And then the other one,
Starting point is 00:20:09 I can't remember its name right now that only only works with CMake. And it basically is like a external project at external project simplified for CMake. So that's my understanding is that it's just like an easy way for people to slip in an easy package to use that source only and you don't want to worry about binary distribution and stuff. But it's only been projects only started like just a few days ago. And I could not find any kind of package index at all. Like if there's even any existing ones, I feel a bit confused by that.
Starting point is 00:20:44 Right? Yeah, I don't know. There's no like list of the packages they have available, I guess. even any existing ones. And I feel a bit confused by that. Right. Yeah. I don't know how I, there's no like list of the packages they have available, I guess. Yeah. It says to use the Conan search command line tool to see which packages are in there, but I did not get that far.
Starting point is 00:20:56 And by looking at it, this was sent to me by the author of the project. So I'm like, you know what? That looks interesting enough to put out. Okay, well, Hal, I wanted to start off by asking you about your JIT proposal. I think we talked about it a while back with Ben Dean, who we had on the show, but I was wondering if you could maybe tell us, give us
Starting point is 00:21:20 an overview of the JIT proposal in your own words. Oh, sure. So this is something that I was working on when I was working at Argonne National Laboratory. And we have faced this sort of standard trade-off in both wanting to have highly specialized code for particular numerical kernels and particular data structures, particular indices, these kinds of things, and yet also not pay the compile time expense of compiling all of these specialized versions every time you compile a library application, source file, what have you. And we have some applications within the Department of Energy that use a lot of these kinds of
Starting point is 00:22:13 specialized templates that also are having severe issues with compile time to the point where it's impacting developer productivity. I mean, this is, I think, not a situation which is unique to DOE. And so we started thinking about how we might change this in a way that would be easy to use and easy to integrate with things. And I think, you know, the jury is still out, right, in that regard. But it was something that uh we really thought was important to explore and you know the question was really a you know could you integrate some kind of just-in-time compilation technology into c++ in a way that would be
Starting point is 00:22:57 natural in a way that would allow you to abstract the jit technology such that you could integrate it nicely with other C++ libraries and frameworks and that would be sufficiently resilient, that would be sufficiently controllable, that it would kind of fit in C++ and allow you to create applications. I think that one of the obvious concerns with JIT is that it's a complicated technology, right? And pairing it with C++ implies an awful lot of stuff going on in the background. We thought really hard about how you could reuse existing technology from LLVM,
Starting point is 00:23:50 in the case of our prototype, to kind of take advantage of a good C++ implementation and sort of tie everything together in a way that would work. And that would allow us to then explore, okay, you know, what are the overheads? What are the challenges in doing this? And of course, you know, I mean, we could have done that sort of all by ourselves in some sense, but it's really valuable in my experience to get, you know, kind of wider community feedback when doing this kind of exploration.
Starting point is 00:24:23 And there are a lot of people who are interested in this kind of exploration. And there are a lot of people who are interested in this kind of technology. You know, we were using this primarily for numerical kernels, for doing these kinds of physical simulations and data analysis. But there's, you know, jits that are used for generating like specialized data structures for databases and other use cases that I wouldn't have really thought about unless we had brought them to the committee and received the feedback from there. We had committee members who told us about potential use cases for things like real time audio processing. You know, not something we would have thought about beforehand, but, I mean, it does fit really well. There are, you know, a number of design issues around, you know,
Starting point is 00:25:13 not only practical matters like, you know, what kind of guarantees you have around, like, does the JIT work? What happens if it doesn't work? You know, can you control the resource usage and the memory usage? But also integration issues. Is there a way to really abstract away the just-in-time compilation in a way that makes sense? One of the things we struggled with, and I think it's still a challenge for this proposal, is can you make an interface where you say, okay, if I put in some kind of lambda function, I can make the lambda, the inside,
Starting point is 00:25:54 the body of the lambda jitted based on some property of the thing to which you are passing the function object. And, you know, we had some ideas about how to kind of make that work. It seems like a really natural request in terms of how you would integrate this with other things. This is something that really came out of a lot of the committee discussions. It's not something we have a clear answer for yet. But I think that, you know, going forward, as we look at, you know, not only the sort of needs of applications, but also the wide variety of hardware that's coming for running high-end computations, there are a lot of questions about how you can use this kind of technology to adapt programs to the kinds of hardware that they're, that they're running on. And, uh, you know, of course, you know, in some senses, uh, for some kinds of technologies like web technologies, for example, right. I mean, jitting is super common, right. There's, uh,
Starting point is 00:26:55 there's an answer sitting right there. Uh, but for other kinds of applications, I think it's, it's less clear in the prototype that we, uh, constructed and we did have some preliminary support for CUDA and for running it's less clear in the prototype that we constructed. We did have some preliminary support for CUDA and for running on accelerators. We did a bunch of experimentation with that. But I think there's still a lot more to learn
Starting point is 00:27:16 about how to integrate this kind of technology. I think in the long-term future, people are going to have to think about adaptation for different kinds of hardware. And, you know, a lot of this is going to really depend on how diverse the ecosystem ends up being in terms of sort of computational interfaces from the C++ programmer's point of view.
Starting point is 00:27:42 And, you know, I don't think we know yet exactly what the future will bring there, but it seems plausible that this kind of thing will still have a, have a place in the future as, as we experiment more and as we learn more about what the computers are going to look like. What is the current state of your prototype? Ah, that's a, that's a good question.
Starting point is 00:28:02 So, so there, there is, there's a prototype. It's on my GitHub page. Still, there are some forks out there as well. There is a community of people who are interested in it. I don't think it's been updated to work with the very latest LLVM Clang versions. And there are some branches in the Git repository corresponding to some different approaches to managing the abstraction around the JIT to dealing with things like resource exhaustion and failure of the just-in-time compilation. The way the proposal worked is it allowed you to put in values for template arguments at compile time. And normally, of course, in the language, these need to be compile time constants. And so this would force the backend to kind of do a bunch of things.
Starting point is 00:29:01 But I mean, you know, sometimes template instantiation doesn't succeed. Sometimes this is because someone makes a mistake. Sometimes it's because there's a static assert that says like, nope, like this shouldn't work. Well, then the question is, well, if that happens at runtime, like what do you do, right? You know, you hit a static assert at runtime.
Starting point is 00:29:22 You should be able to handle that somehow. And, you know, so we have some different approaches that we've tried out for, you know, how that might work. And, you know, of course, also things like managing resources. When you just-in-time compile something, you've consumed resources. And, you know, normally in C++, if you allocate memory or some kind of resource, there's some way of freeing it, right, that's under programmer control. So you don't use resources in an unbounded way. And, and again, you know, how you best do this, I think is, is a really interesting question in that context.
Starting point is 00:30:00 But what is the current status of the proposal? Right now, you said you've got to get, you got a bunch of good feedback, you said, but is it still moving forward in the committee? ability to work on this is now quite limited. So there are other people. And if others are interested in sort of pushing that forward, then that'd be great. I'm certainly supportive of that. And right now, I think, you know, there are some of these issues that people really need to put thought into. You know, there are a lot of people on the committee who are quite interested in exploring this direction, but there are these really important questions that need exploration and there certainly needs to be a lot more experience gained deploying this kind of technology in practice, integrating it into real applications,
Starting point is 00:31:00 seeing what kinds of resiliency issues come up, what kinds of integration issues come up, what kinds of integration issues come up, how can you really express the kinds of things you want to express using this sort of mechanism. And one of the things that I've really enjoyed about being on the C++ Standards Committee is that you have a lot of great peers who are really kind of curious and engaging people and
Starting point is 00:31:28 really like thinking about things and providing feedback. And this is one of the reasons why it's great to engage with that group. But I think for this particular proposal, there's a lot more thinking that has to happen. I think the committee has been really helpful with that. But the proposal itself does need a champion, I think, if it's going to move forward in that regard. Okay. Can you tell us any more about, like, did you really put this prototype into use when you were at Argonne National Lab? Did it get much, like, you know, real world usage, like running on these big systems or anything like that so uh we did uh use the prototype to run on some large uh systems we uh ran on uh some supercomputers at
Starting point is 00:32:17 uh lawrence livermore national laboratory uh some of my collaborators at the time took an application that they had been working on and they ported it to use this JIT system. The application actually was an interesting choice. They've had applications at that laboratory that they've created other just-in-time compilation systems for, but they weren't really well integrated. They were things where, for instance, you had modules that would generate source code, invoke compilers, load dynamically linked libraries, and so on. And there are a number of applications I know about to take these kinds of approaches. So we did experiment with using using that. And
Starting point is 00:33:15 I know from talking to people after, say, the presentation I gave at CPPCon some years ago, that there are other people who have experimented with using the implementation in more production-like settings. I had a good conversation with some folks in the finance industry who were experimenting with it. I don't have a lot of detail on what they were doing. But, you know, I think, you know, from my perspective, you really need an implementation that does a good job at handling this sort of resource allocation and, you know, failure handling issues to be truly useful in a production setting. And I think that's where a lot of the open questions still remain. Okay. I want to interrupt the discussion for just a moment to bring you a word from our sponsor. CLion is a smart cross-platform IDE for C and C++ by JetBrains.
Starting point is 00:34:12 It understands all the tricky parts of modern C++ and integrates with essential tools from the C++ ecosystem, like CMake, Clang tools, unit testing frameworks, sanitizers, profilers, Doxygen, and many others. CLion runs its code analysis to detect unused and unreachable code, dangling pointers, missing typecasts, no matching function overloads, and many other issues. They are detected instantly as you type and can be fixed with a touch of a button while the IDE correctly handles the changes throughout the project. No matter what you're involved in, embedded development, CUDA, or Qt, you'll find specialized support for it. You can run and debug your apps locally, remotely, or on a microcontroller, as well as benefit from the collaborative development service.
Starting point is 00:34:52 Download the trial version and learn more at jb.gg slash cppcast dash clion. Use the coupon code JetBrainsForCppCast during checkout for a 25% discount off the price of a yearly individual license. So you mentioned that this paper would need a good champion because you have a lot of other responsibilities. Do you want to tell us a little bit more about the work you do at DOE and your involvement in the Standards Committee? Oh, sure. So right now I have moved to the Department of Energy. So now I'm an actual federal employee, and I manage large parts of the computer science research program within our office, the Advanced Scientific Computing Research office. And we provide funding for researchers, both at national laboratories and in academia. We have a robust program for investing in small businesses as well, the SBIR program. And we sponsor research and development work across programming models, runtime libraries, hardware/software co-design. We sponsor work that relates to sort of large-scale
Starting point is 00:36:12 runtime systems, to work on supporting accelerators, and work on application-level technologies that are really important for advancing scientific computing, with a lot of related effects for the computing ecosystem more generally. So I think one of the things that is really useful from my perspective is that, with involvement in C++, the committee members that come from our national laboratories, and we have a large number of our national laboratories represented, Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge, Sandia, all of these labs now have representatives on the committee. They can leverage their experience working on a number of these projects that the Department of Energy funds. Most recently, a lot of them are funded through the Exascale Computing Project.
Starting point is 00:37:17 But, you know, these are things like LLVM and Clang. We have large investments in LLVM and Clang. We also have investments in C++ abstraction libraries like Kokkos and RAJA. We have investments in other areas, HPX, SYCL, which is this newer single-source programming model to handle accelerator programming. We have representatives on that committee as well. And the representatives from the labs can leverage these experiences to inform the standard. I should also mention that Walter Brown, who was on the C++ committee for many, many years, came from Fermilab. The Fermi National Accelerator Laboratory
Starting point is 00:38:06 is another DOE laboratory. And we have contributed over the years to a large number of aspects of C++ development. If I take a quick look back at the kinds of things that we've worked on over the last five years or so, parallel algorithms are sort of an obvious example. But also the DOE labs have been involved in the work on executors, which hopefully will come to fruition someday. There's been work on things like atomics: atomic_ref, the atomic view work, floating-point atomics, these kinds of things. And the multi-dimensional array proposals: a lot of people from the labs worked on span and mdspan,
Starting point is 00:39:06 that set of proposals. Also for parallelism, folks from the DOE labs have been involved in proposals for things like RCU, the read-copy-update mechanism, hazard pointers, affinity, futures. The numeric algorithms and linear algebra work that's going on is driven in large part by folks from the DOE
Starting point is 00:39:39 national laboratories. And the thing I really like as well is that the researchers from the DOE labs, the staff at the DOE labs, they don't just go after the shiny objects, right? There's a lot of work that just goes into things like fixing up the standard library. I mean, DOE lab folks have worked on the file system interface, basic container interfaces. There's a lot of work that goes on at the library working group level, at the core working group level, right? Davis Herring is a regular member of Core now, right? He hails from a DOE national lab. And he's put a lot of work into things like fixing up the description of how declarations work and making sure that the linkage description for modules made sense. We've really had this kind of impact, both on larger things that we've pushed for and smaller things that we've pushed for. And we do so collaboratively. I think, for example, I worked with Richard Smith on inline variables.
Starting point is 00:40:54 That was something that made it easier for people to write header-only libraries for C++, which I thought was a really nice enhancement that we were able to make. And I suppose also, back in 2015, I started working on assumptions. There are other people who have picked that up and championed proposals for assumptions. They got kind of lumped in with the contract discussion for a while.
Starting point is 00:41:51 And hopefully we'll get contracts at some point as well. That will be a nice thing. But I should point out that the assumptions part of that ended up being one of the most controversial pieces of
Starting point is 00:42:06 the contracts design. So you're why the contracts...? Yeah, yeah. I suppose in some sense I have some responsibility for that. But it's interesting, because one of the largest difficulties in pinning down the design space around contracts has been to what extent the facility is useful for things like static analysis and optimization versus dynamic checking. And this is something that I think the committee and the community in some senses have yet to resolve. But that whole set of things is actually all really important. And then the question is, do they all fall under the same facility and set of rules? I mean, you can obviously make an argument that they're related. When I first presented a proposal on assumptions, someone said, yeah, we're going to get that in contracts. And I thought about it. I was like, oh, okay. Yeah, that kind of makes sense. Like,
Starting point is 00:43:14 they kind of go together. And then, as people tried to work through the details, it turned out to be quite difficult to come up with a facility that really addressed all of these different use cases. And so maybe in the end they'll end up being separate. I don't know. But I think it's been a really important set of discussions. But, you know, in general, I have to say this is one of the things that is important.
Starting point is 00:43:45 The DOE has really been able to take the experience that we've had with our research efforts, and in some sense our development efforts, and really work on helping transition them to practice through the standardization effort. Like the assumptions thing, as an example: that was in part motivated by work that I did to add the assumptions facility into Clang and LLVM. So we did that work, and it has now been an important production facility in that compiler infrastructure for many years. And that then turned around and motivated the work that we did on the committee. I mean, the same is true in a lot of other areas. And I think that kind of contribution has been really important from our perspective, and hopefully other people
Starting point is 00:44:42 appreciate those contributions as well. Longtime listeners know that I've done a lot of work with the DOE over the last decade, and we've coincidentally had a bunch of DOE people on here. So I just like, if our listeners are curious, it's not at all intentional. We find people doing interesting work. They tend to work at the DOE. I mean, not always, but often, right? And a part of that is, I guess, true that employees at the DOE or at one of the national labs are free to talk about the work that they're doing, which is not always true in some sectors. Sometimes we have to have podcasts that are pre-approved and whatever. But I'm just kind of curious, like, is there an explicit culture at the DOE that says you should be out
Starting point is 00:45:27 there contributing back to the community and sharing this work? Or is it just implied because a lot of the work that you do tends to be open source and public research anyhow? Or where does this come from? Why do you DOE people keep ending up here on our podcast? That's a great question. I think the answer is sort of yes. It's all of the above. The Department of Energy funds a lot of work. Of course, this is taxpayer-funded work. And we feel very strongly that the taxpayer should really get the maximum benefit out of the work they're funding, to the maximal extent possible.
Starting point is 00:46:13 Now, that means that, you know, we engage in research and development. Of course, some of that work is funded as open science work. The Office of Science, my office, funds a lot of that work. There is work that is funded by other parts of the Department of Energy that are more mission focused. But regardless, we feel like we should give back to the country in the best way that we can. And we also feel that it's really important to have strong international partnerships and collaborations. We try to the extent possible to make sure that our work has impact on industry, that it, you know, sort of is transferred out of
Starting point is 00:47:00 the laboratory. It reaches the private sector, it reaches a place where it can really impact the economy as a whole. And I think the other aspect to this is that, obviously, some of the mission work that the DOE does involves things that we can't discuss openly. But, like all good C++ programmers, we try to design appropriate layers of abstraction, right? So this is a place where we can separate out the work that we can be open about. And we try and be very open about that. It's a great way for us to communicate with the wider community, with our vendors, with everyone else. And I think part of this as well, you mentioned culture, and that is an important piece of the puzzle, right? The question is, why would someone who could go and work at a DOE national laboratory, but then by extension could go and work at a lot of other places, and frankly, could probably make more money working at a lot of other places,
Starting point is 00:48:19 at least if you include stock options or other forms of compensation, why would they come and work at a DOE national lab? Why would they want to do that? And a lot of it is the mission. And a lot of it is the culture and the environment. It's the ability to go and do interesting things, to contribute to the nation in important ways. And the ability to be involved in a wide research community or, you know, communities of practice. And, you know, the standardization is a part of that.
Starting point is 00:48:59 And so I think that it's sort of all of the above, right? I mean, we have an explicit set of policies around open source and around trying to encourage technology transfer and so on. But we also try and maintain this kind of culture that is attractive to the people who need to come work at the labs. I think if we didn't do that, then nothing else would really be possible in this context. You mentioned standardization. Can we talk more about your role as the vice chair of the U.S. C++ standards committee? What exactly do you do in that role? Ah, that's a really interesting question. Well, my role is really a support role as the vice chair. When we had in-person meetings, I was always the one who had the attendance sheet.
Starting point is 00:50:00 You know, it's a really important role. But I also manage the mailings. That's what we call the groups of proposals that we send out. Yeah, we used to do this before and after meetings. And our new procedural experiment is we've now for some time been doing monthly mailings. I think in the COVID environment, where we don't have in-person meetings, and for other reasons as well, it's been really important
Starting point is 00:50:33 to have that more regular cadence of updates. There was some debate about what the right frequency would be. So we decided we'd try monthly, and we haven't decided to change it yet. We worked with the C++ Foundation to put together the website that we use for managing the mailings now. So people can go in, and if you've been to a committee meeting or you're on the committee,
Starting point is 00:51:04 you can get access to the system to go in and get paper numbers and so on for proposals. So we've worked on automating that, and there will be further work on automating that process. But there's always some manual aspect to it. I mean, I'm guilty of missing things all the time, but I do try and check over things and make sure that people have uploaded the right files and so on. Sometimes people mess that stuff up, understandably. And so my role as vice chair is really in that kind of support role. But I've been happy to do it and I am happy to do it.
Starting point is 00:51:45 I think it's one of the ways I've been able to give back to the committee. Okay. Are there any other papers you're currently championing in the committee yourself? At this point, I'm not actively championing proposals, but there are other things that I worked on recently which I think were successful in generating discussion.
Starting point is 00:52:19 There was a proposal on sort of type-system-level integration of versioning that I wrote back in, I suppose, a couple of years ago now. We don't have enough time left to bring up ABI. Yeah, you don't have enough time to talk about that. But the whole stability-versus-velocity discussion, I don't know if you've touched on this on the podcast, but to what extent do we need ABI stability, and what does it mean to manage and change that?
Starting point is 00:52:52 This is another really important thing that the committee needs to deal with. And I expect that that's going to happen over the next several years. But I've tried to contribute to that discussion in a way that's been helpful. And I think in general, there are a number of things that I've done that hopefully fit into that category. Definitely, please keep pushing that ball forward.
Starting point is 00:53:17 Definitely. We need to do something on ABI. We need some kind of official word, yes. Well, Hal, it was great having you on the show today. Thanks so much for coming on. Is there anything else you want to let our listeners know about before we let you go? No, I will say thank you very much for inviting me. It's been a pleasure. And if you're interested in news or updates from my office, from the Office of Science, Department of Energy,
Starting point is 00:53:46 you can visit our webpage. There's a link at the bottom to sign up for email updates, and I encourage anyone who's interested to do that. Awesome. We'll put that link in the show notes then. Thanks, Hal. Great. Thanks for coming on. Thank you again. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the
podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter.
Starting point is 00:54:18 You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
