CppCast - Salvus

Episode Date: August 3, 2016

Rob and Jason are joined by Michael Afanasiev to discuss his work on the Salvus library used for performing full-waveform inversions.

Michael Afanasiev is currently working on his PhD in Geophysics. He became interested in programming and high performance computing during his BSc in Computational Physics, playing around with simulations of star formation. After a brief attempt to lead a roguish and exciting lifestyle as a field Geophysicist, he was brought back to the keyboard during a MSc, where he began working on full waveform inversion (FWI). In 2013 he moved to Switzerland to continue working on FWI as a PhD student at ETH Zurich, where he’s currently wrapping things into a thesis. He spends most of his time writing scientific software, wandering through the Alps, and atoning for the times he repeated the mantra “Fortran is the best language for scientific computing.”

News
CppMem: An overview
Why is .h more widely used then .hpp
July update for Visual Studio Code C++ extension

Michael Afanasiev
Michael Afanasiev's Blog
Michael Afanasiev on GitHub

Links
Salvus
Combining Static and Dynamic Polymorphism with C++ Mixin classes
Salvus: retaining runtime polymorphism with templates
Salvus: dynamically inserting functionality into a mixin class hierarchy

Sponsor
Incredibuild

Transcript
Starting point is 00:00:00 This episode of CppCast is sponsored by Incredibuild. You won't believe how fast your build can run until you download your free developer version at incredibuild.com slash cppoffer or just click the link on our website. CppCast is also sponsored by CppCon, the annual week-long face-to-face gathering for the entire C++ community. Episode 66 of CppCast with guest Michael Afanasiev recorded August 3rd, 2016. In this episode, we discuss naming your header files. Then we talk to Michael Afanasiev.
Starting point is 00:01:23 Michael talks to us about his work on the Salvus library. Welcome to episode 66 of CppCast, the only podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? All right, Rob, how about you? I'm good, no real news. Looking forward to Seattle starting to get close, only two months away now. Less than two months. Less than two months, yes.
Starting point is 00:01:42 No new updates from CppCon this week, though. They still have a couple more things to announce, I guess. Yeah, none of the plenary sessions have been announced yet. Except for Bjarne. Yeah, Bjarne and Dan Sachs, as the keynotes have been announced. But there still can be other plenaries. So hopefully we'll hear about them soon. Anyway, at the top of our episode, I'd like to read a piece of feedback. This week, we got a tweet sent to us from Keshav.
Starting point is 00:02:08 IEEE runs these things about what the most popular programming languages are. And C overtook Java. So it's not C++. C++ was fourth on the list. But it's pretty cool still to see Java get dethroned from the number one spot. Well, that's interesting. If you look at the numbers, really, they're all very close in their rankings. Yeah, and the article also points out Arduino is on the list, but there isn't really an Arduino language.
Starting point is 00:02:40 Arduino uses C and C++, so They're on the list at number 12, so that doesn't really make sense. They should just be adding to the C and C++ numbers. Strange. Yeah. We'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter, or email us at feedback
Starting point is 00:02:59 at cppcast.com. Don't forget to leave us reviews on iTunes as well. Joining us today is michael afanasiev he's currently working on his phd in geophysics he became interested in programming and high performance computing during his bachelor's of science and computational physics playing around with simulations of star formations after a brief attempt to lead a roguish and exciting lifestyle as a field geophysicist, he was brought back to the keyboard during a master's where he began working on full waveform inversion.
Starting point is 00:03:30 In 2013, he moved to Switzerland to continue working on full waveform inversion as a PhD student at Zurich, where he's currently wrapping things into a thesis. He spends most of his time writing scientific software, wandering through the Alps, and atoning for the times he repeated the mantra, Fortran is the best language for scientific computing. Michael, welcome to the show. Hi, thanks for having me. There's a lot of stuff in your bio here that we could have fun with. I'm curious about your simulations of star formations.
Starting point is 00:03:59 Okay. How far did that go? I mean, technically there are actually simulations of galaxy formation. So we'd kind of start with some sort of initial distribution of density in the universe and then try and form galaxies out of these things. And my kind of bachelor's thesis was looking more closely at these things and trying to figure out which initial settings of parameters would lead to certain densities of stars being formed in some certain areas of density,
Starting point is 00:04:27 and try and relate that to what we actually see in the universe and tune these parameters accordingly. At that stage, I wasn't actually writing much high-performance code myself. It was more analyzing results. But that kind of first got me into the whole scheme of HPC and the whole field. That sounds like and what exactly did uh change your mind with regards to fortran well i mean that kind of came when i started writing my own software specifically i guess the stuff we'll talk about today um and i realized that fortran is is great and it's it's fast and it's, it's really amazing if, if you kind of want to solve, um, uh, equations in a very structured sense. Um,
Starting point is 00:05:10 but when we started doing things that kind of were more in line with the actual, with the real world, where you kind of have multiple different physics and multiple different problems living in the same kind of model space, all of a sudden the limitations of Fortran kind of reared their ugly heads. You basically were left with thousands of lines and many branches of if statements
Starting point is 00:05:32 to kind of get the thing done. And it's that specific kind of problem that led me to abandon it and move to C++. But you can kind of see where Fortran, where this kind of mantra comes from because when i first moved to c plus plus worrying about things like memory allocation and deletion
Starting point is 00:05:53 they were an issue and i didn't fully understand what was going on and if you just want to get science done in the fast way these things would hold held me back and would help hold other people back but when you get over kind of that like a small little hiccup you realize the power in c++ and never really look back cool okay uh so we have a couple articles to talk about uh feel free to jump in on any of these michael and then we'll start talking to you about full waveform inversion a little more detail. Sure. So this first one is from... Oh, whose blog is this? Ronnie O'Grimm's blog about CPP MEM.
Starting point is 00:06:30 He's giving an overview of this tool. Jason, have we talked about this one before? It seems very familiar. It looks very similar to the C++ code tracing tool that we talked about, but I believe we have not talked about this one specifically. Yeah, so in this article, he goes in an overview of CPP-MEM. He goes through a pretty basic example of how you can use the tool,
Starting point is 00:06:53 and I think it's going to be just part one of at least two parts he's going to do about the tool. He mentions at the end here that he plans another post where he's going to analyze a simple program with the help of CppMem to diagnose issues. Right now the tool looks like it's fairly beginning early stages in my opinion, but it's a pretty awesome way of trying to look at memory races and your memory model and that kind of thing, thread issues. Is this both an online tool or you can
Starting point is 00:07:25 install it on your PC? Is that right? I believe it's meant for online use. Okay. There's a note here in the first paragraph that says which you can also install on your PC. Oh, I see. But the webpage works.
Starting point is 00:07:41 Okay. The next thing we have is a Reddit discussion which I just found interesting about why dot H is more widely used than dot HPP or other naming conventions for C++ header files. And I know in pretty much all the code I've ever looked at, it's always dot H. How about you, Jason? I always use HPP. And in fact, one of my clients, I started with them early on in their project and they were using.h. And I said, trust me, switch to HVP. You will appreciate it in the future.
Starting point is 00:08:12 And they did and they agreed with me. They were glad that they made the change. What's their main reason for their appreciation? They had, I mean, it's a mix of C and C plus plus and being able to keep that straight and then automatically generated, um, for consistency, they had automatically generated files, which were CXH, CXX and HXX to coincide with their CPP and HPP, um, handwritten files made things more cohesive. Right. I think there is a post in that reddit post saying
Starting point is 00:08:46 that um i mean most people use h because that's what a lot of ide's kind of default to and that's what happened to me i'm using c line a lot and uh you know i create a new class and it just creates a dot h file for me and that's kind of why we have h's in in our code but and i know visual studio does the same defaults to dot h. And I thought it was interesting here. I mean, I think HPP kind of should be the default, no matter what the IDEs do. But someone referenced the core guidelines, and they use.h in all of their examples.
Starting point is 00:09:16 Yes, I disagree with them. I know one thing that bugs me when you're looking at a GitHub repository, if you use.h and you're looking at the different languages that are in your repository, it identifies all the.h's as being C instead of C++. I did not realize that. So you might not have any C code, actual C code.
Starting point is 00:09:39 It'll tell you, like, oh, half your code is C++, half is C because of all those.h files. That's true. Once we're using this catch testing library in our project, and it's catch.h, and for some reason it had just listed it as Objective-C. And it's quite a large portion of the code, so all of a sudden our code was like 30% Objective-C when nobody had written any Objective-C. I don't really know why that happened.
Starting point is 00:10:03 We eventually just removed it from the statistics, but I was kind of curious. You know, that makes me wonder if that's why Objective-C has shown up on some of my open source profile stuff, because I really haven't done any Objective-C. Yeah, look into that. It could be some H file or something in there that's doing it. And I actually have catch.h and ChaiScript.
Starting point is 00:10:22 That could be exactly it then well kai our uh catch is supposed to run well on objective c2 isn't that what he originally wrote the tool for uh i'm trying to think back to our conversation with him and i know uh definitely runs for objective c and c yes yeah it's been a while but maybe and maybe that's why he's using.h, I don't know. I actually did a Twitter poll about this exact topic, h versus HPP, and the conclusion that I saw in my Twitter responses was, we use h out of habit. Yep. That seemed to be what it came down to.
Starting point is 00:10:56 Yeah, I think most people who are using h, it's probably not a conscious decision for them. There is some talk of these, like, TPP for template files. Have you guys ever seen this in practice? No. No, I've never seen it. It's got to be complicated, I guess. Yeah, the closest I've seen is that HXX, like I mentioned,
Starting point is 00:11:14 for SWIG-generated header files. Okay, well, the last article here is from the Visual C++ blog. We've talked about their Visual Studio Code C++ extension a couple times, and they released another update for that. It now supports debugging using LLDB on Mac OS X. I know
Starting point is 00:11:36 that was one of the things we asked about when we had Ankit on the show a while back. Are there any other big features in here? Having a process picker for attaching the debugger to a running process on Linux well there's a little tidbit in the comments
Starting point is 00:11:51 that they are actively working on debugging support on Windows oh okay it's still so odd that it's going to be the last platform they support with this tool but most of the people who would want to use that are already using Visual Studio so I guess it does make sense.
Starting point is 00:12:06 Well, and there's also... Oh, I'm sorry, go ahead. I'm developing OSX mainly, and I've been using CLion, but I guess the question is, on Windows, why would you use Visual Studio code over Visual Studio? I'm just not really familiar with the difference. If you want to just
Starting point is 00:12:21 be very lightweight and just use the command line build tools instead of having to open up visual studio for everything that would be the one reason okay or if you're using min gw or clang for windows and just have no need for visual studio yeah but visual studios debugger is the most powerful part of it so it does seem a little weird to me to be on windows and really close to using visual studio but not taking advantage of their debugger right okay so michael let's start talking about uh full waveform inversion can you describe what this is for those of us who aren't in the scientific community um yeah sure i guess the. I guess the most general way to describe it is kind of, it's sort of akin to,
Starting point is 00:13:09 say, x-ray tomography. But instead of imaging the body, in most cases, we're imaging traditionally the earth and trying to create some sort of 3D image or model of the subsurface in a given region. I think maybe the coolest example is one from oil exploration. So this is kind of the classic way that oil has been found over the past, I guess, 40 to 50 years. And basically what they do, I'm not sure if you guys are familiar with how this works in general. Only vaguely. It's pretty neat. So what they've
Starting point is 00:13:46 traditionally done is, say, taking the Gulf of Mexico as an example. I mean, we all know that there's a lot of oil there after this kind of tragic disaster a couple years ago. But what they do in the first place is they get the geologists to, you know, look at some sort of geological map of the region and pinpoint where it might be likely to find some sort of oil, given some sort of geologic signatures. And once they kind of have a target region of interest, they hire these giant boats that drag behind them, you know, about kilometers worth of these things called floating streamers. And these are massive structures, but maybe five kilometers wide and 10 kilometers long that carry with them among other things these giant air guns so these boats will kind of putter on through the ocean and every now and then these air guns will just fire basically this is exactly what it sounds like they'll inject this huge bubble of air into the water and create a bubble
Starting point is 00:14:41 and that bubble will collapse and now this is kind of similar to say throwing a pebble into a pond and you do this and then you kind of set up these waves that kind of propagate in a circle away from where that pebble hit the water. So this happens in the ocean as well but instead of just traveling along the surface they also go down through depth. And eventually maybe about two kilometers deep they'll hit the ocean bottom. And at this point, they'll convert from a wave through the ocean to an actual wave through the subsurface itself. And once they're in the subsurface, they'll propagate through, and they'll hit different types of geologic bodies. They'll hit maybe a body of salt, some other rock over here
Starting point is 00:15:19 that is abnormally fast or abnormally slow with regards to wave propagation. And eventually eventually through all this scattering and movement underground, they'll eventually scatter back to the actual sea bottom. And when they get back to the sea bottom, they'll convert back to an ocean wave and travel back up through the water column and hit the boat again. And now in the boat are, along with the air guns, are little receivers that basically measure the pressure in the water over a given time. So the idea here is to basically look at these receiver time series. They basically look like, you know, you've seen these old school earthquake recordings
Starting point is 00:15:55 with like the rotating thing in the pen. And traditionally what people have done has been, I mean, we know what time this air gun goes off, and we know how long it's taken since the air gun came to get your first kind of wiggle on one of those earthquakes looking like recordings. And you can imagine that if you do this for enough times, for enough different locations in the ocean, you can basically build up a map of subsurface velocity with position. And this ends up being some sort of massive system of linear equations. And you can infer the velocity, the three-dimensional velocity model of the subsurface, just by looking at these travel times. And so this has been the method that's been basically the workhorse of the oil exploration industry for
Starting point is 00:16:42 the past 50 odd years now about 30 years ago some guys kind of figured out that you know it seems a bit of a shame because you have this long recording or earthquake recording of all these different these different pressure wiggles and we're kind of throwing all that away and we're just looking at one data point and that data point is when that wave first arrives at the receiver. And they thought, wouldn't it be great if we can basically extract the rest of the information from this seismogram as well and relate all these later wiggles
Starting point is 00:17:12 to perhaps more complex Earth structure? Maybe we can get a higher resolution image. So the theory for this came from, for those interested, came from kind of optimal control applications. But 30 years ago there was no computer powerful enough to actually get this done and the reason is that the method requires you to not just model the first arrival time of these waves but actually the full waveform itself
Starting point is 00:17:36 so in that kind of pebble example you're really modeling that wave moving along the surface of the ocean kind of every little ripple and every little distortion to that wave moving along the surface of the ocean kind of every little ripple and every little um distortion to that wave train but recently about 10 years ago all of a sudden clusters are big enough computers are fast enough that people started actually applying this technique to reasonably sized earth models so nowadays when they go out and do a survey in the gulf of mexico for an instance they're not just looking at that first arrival time of the seismic wave, but we're trying to relate every single wiggle along that kind of seismogram to different perturbations and different structures in the subsurface itself. This is the big difference.
Starting point is 00:18:18 This is the technology that full waveform inversion brought to the table. It's generating some high-resolution, three-dimensional model of the subsurface using all the information, or as much information as possible, that exists in these recordings, in these seismic recordings. If this kind of gives you an overview of the idea of things.
Starting point is 00:18:42 So you started by saying it's a giant structure that's like a kilometer wide or whatever that they're dragging along the the ocean are they actually emitting multiple of these pulses at once or well so um classically what would happen is that one source would fire at a time um and that response would be measured at the receivers and you'd kind of have to wait until the uh that response and the reverberations and the ocean stopped before you can move on and fire another source okay but actually lately there's been some work done on what's called source encoding and what this is is basically finding out some optimal way to fire all the sources actually at the same time with a certain tuning parameters so that they
Starting point is 00:19:27 can be easily separated at the receivers and it might look as if each source had been fired individually and the requisite time had been spent waiting. That's not really my field but there's certainly research going in this direction and that's just to make the actual survey itself more time efficient the the thing that my brain is having a hard time processing is if there's a bunch of different things going off at the same time or in close relationship to each other how in the world do you keep separate the results coming back from something that just happened to propagate through a very slow material versus something that was fired later um yep so there are techniques to
Starting point is 00:20:02 kind of do this um it gets it gets a bit technical but there are certain ways to fire the actual sources in a certain order that can optimally separate the responses of each of those sources at the receiver okay um but but you are right that um this is definitely a problem um the the the topic is called kind of source encoding and that's kind of what it sounds like trying to encode all the information from all these sources into one giant pulse um but um yeah there's this is this is an issue um of research for sure but needless to say you're talking about a massive amount of data that you guys are working with yeah absolutely absolutely insane really um our groups's working mainly with actual earthquake data
Starting point is 00:20:48 on the global scale. So while we still have a lot of data, it doesn't really approach the type of data that these guys from exploration actually go for. These are tens of terabytes per survey. So you... Wow. I'm sorry, but you just said earthquake data.
Starting point is 00:21:04 So you could use historical earthquake data to do like 3d mapping of the earth or something right so um well this is kind of where the exploration is where full wave from inversion was kind of originated and began its use our group's actually using it um instead of driving these boats around, we're actually waiting for earthquakes to happen and looking at the seismograms that you're probably more familiar with that kind of just lie on the surface and record the wiggles as the earthquake goes by. So you can use the same techniques by mapping, you know, each one of these wiggles to some sort of three-dimensional perturbation inside the Earth using an earthquake source and a normal seismogram than you can in the oil exploration industry
Starting point is 00:21:49 using these air guns and the receivers on the back of the boat. It's the same concept. We're still modeling waves, and we're still trying to map where there's some sort of scattering heterogeneity. Just the scales are completely different, but the technique's the same. Very interesting. And I guess most recently,
Starting point is 00:22:11 and perhaps a bit more interesting looking forward into the future, is that people have realized that the same techniques can now be used, for example, in cancer imaging. So one of the big application areas that our group's involved with is in breast cancer imaging. So again, here the technique's very similar. You have an emitter that emits some sort of vibration into the breast, and this vibration travels through the tissue, and then gets recorded at some receiver. And you again measure all these different wiggles,
Starting point is 00:22:35 and you try and map back the distortions in those wiggles to actual heterogeneities inside the breast to try and accurately pinpoint perhaps cancerous tissue or some sort of other abnormality. But again, it's the same technique, and we can kind of use the same codes to do these things as you would to look for oil, which is quite cool. So, I mean, you're working on a C++ library to do this, but before we get to that, I just have one more quick question. What's the actual the actual result of this like how do you visualize
Starting point is 00:23:05 this data what are you trying to get um so so generally in terms of visualization i mean you get some sort of three-dimensional representation of let's say velocity or density as it varies in the in the subsurface um and you you look at this thing on kind of just a giant mesh of, say, triangles, or, you know, like you look at any sort of three-dimensional graphic. And at each vertex of that triangle, you have an estimate for the density or an estimate for the velocity. And then you can kind of sweep
Starting point is 00:23:35 through this 3D model and look for things that are interesting. Okay. And that's the end result. Alright, cool. I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors incredibuild dramatically reduces compilation and development times for both small and big companies like ea microsoft game studios and nvidia incredibuild's unique process virtualization technology will transform your computer network
Starting point is 00:24:01 into a virtual supercomputer and let each workstation use hundreds of idle cores across the network. Use IncrediBuild to accelerate more than just C++ compilations. Speed up your unit tests, run more development cycles, and scale your development to the cloud to unleash unreal speeds. Join more than 100,000 users to save hundreds of monthly developer hours using existing hardware. Download your free developer version at incredibuild.com slash cppoffer or just click on the link in our link section. So could you tell us a little bit about the actual library you're working on to perform this analysis? Yeah, sure.
Starting point is 00:24:41 So I guess back in January or so, me and a group of guys in my working group had kind of realized that while there are codes that exist that kind of do this full waveform inversion for any one of these problem areas, like there's ones that do it for the globe, there's ones that do it for oil exploration, and there's ones that do it for breast cancer, there's not really one that does it for all of them. And when we kind of realized that the actual application area is so similar and that we should be able to use the code for all of them, we kind of realized that, well, there's a niche here
Starting point is 00:25:13 and there's something that we can write and something that we can contribute back to the community. So this is what we started writing. And basically the express goal was that at the beginning we wanted to sacrifice perhaps some of the speed you'd get for designing a code that worked specifically for one application in order to kind of get back some flexibility so that we could apply this to different problem
Starting point is 00:25:37 domains and try out different techniques and kind of rapidly iterate on scientific ideas. Because the codes that exist now, again, are very kind of tailored and fast for one specific thing, but don't necessarily have the flexibility needed for kind of quick idea development in the scientific community. So that's where we saw that we could make a bit of a difference. So how did you end up then choosing C++?
Starting point is 00:26:03 I guess you kind of got into that a little bit with your bio. Right. Originally I got into C++ for a slightly different application. But for this type of thing, we were basing everything on a finite element type method. And I'm not sure how familiar, I guess guess anyone is with finite elements in general but the idea is that you have these kind of small little spatial elements maybe they'd be their triangles or or hexahedra and you solve equations on these little elements and then you basically generate
Starting point is 00:26:37 a set of global equations from all the small equations you solve on these elements. And what this really allows you to do is, there's a really nice formulation here in terms of object-oriented programming. I mean, each element is an object. It has certain qualities, like it has coordinates in space, it has maybe wave speed velocity on it, and it has a whole bunch of different things that belong only to that element itself. And really, C++ was the only language that we could think of that would combine the speed we needed,
Starting point is 00:27:08 because these problems are quite large, with the type of abstract object-oriented approach that we wanted to go to to kind of exploit the flexibility of the language. Okay. So that's why we chose C++, really. And the question often is, perhaps why didn't you choose something... If you're really going for flexibility,
Starting point is 00:27:29 why not choose something even more flexible? For example, Python. And we kind of did a little back-of-the-envelope calculation, which is, I think, quite funny. So there's one function that gets called in this code. And then so, for example, on a typical run, for a big run, we might have 100 million of these finite elements in our model. And because of our method on each element,
Starting point is 00:28:00 there's about 100 points where actual solutions are calculated. And now, if you can remember that kind of boat driving along, there's about 1,000 sources or so per simulation. And each simulation, each source requires about 10,000 time steps to actually model the wave forward in time. And then you need about, on top of that,
Starting point is 00:28:20 100 iterations of the entire method to produce your final image. So that ends up being 10 quintillion function calls um for this one function call um so basically we need to ensure that when we're doing this thing the functions like this are called basically as fast as they can be um and it's it's for these reasons that you know we we really had to go to some language that was i guess closer to the metal than something newer and perhaps more abstract.
Starting point is 00:28:48 10 quintillion is a lot of function calls. One times 10 to the 19, apparently. I had to look that up. I wouldn't have known what that was, no. Yeah, it actually is a number. I thought people just made up that name, but that's true. What version of C++ are you using with this library? Are you using some of the more modern techniques?
Starting point is 00:29:11 Yeah, so I guess, I mean, I kind of had the luck to kind of get into C++ after C++11 came out. So this is basically how I learned it. That sounds nice. Yeah, yeah. I'm kind of using these things like even range-based for loops that some other people are like, wait a minute, you can really do that. And I'm like, this is naturally spread out through our code.
Starting point is 00:29:35 So there's a couple of particular things about modern C++ that we used. In particular, this kind of tricks with unique pointer. So, for example, when we're trying to figure out where to, in our computational model, where to put these sources, we kind of have to search through all the elements and find the one where, say, the streamer from that survey boat lives in. Okay.
Starting point is 00:30:02 And then we add a source to that element, and that's a forcing term and it goes into the code and then it does its thing but if you do this with unique pointers basically each source is assigned a unique pointer and you kind of go through these elements one by one and trying to find out where the source should live um as soon as it finds itself it basically inserts itself there and then it doesn't exist anymore in kind of the array of sources that can be added. And this is quite nice, because often you get into really kind of annoying edge cases where a source might live on the exact edge of the element or at a vertex. So in theory, it could lie in any one of perhaps four or eight or 16 elements.
Starting point is 00:30:41 But since we're doing this, everything kind of just has one instance. As soon as it gets moved, we don't have to worry about perhaps duplicating the source in some other element. or 16 elements. But since we're doing this, everything kind of just has one instance. As soon as it gets moved, we don't have to worry about perhaps duplicating the source in some other element. And we found this just a very clean way of doing it rather than kind of keeping track of, okay, this element has a source, and then you have to go back
Starting point is 00:30:56 and kind of remove the ones that were already there. It was a pretty cool trick. And that was definitely something that cleaned up the code a lot. Interesting. I am curious, so what kind of hardware are you taking advantage of to make this as fast as possible? You mentioned high-performance computing. Are you doing anything on GPUs yet?
Starting point is 00:31:17 Sure. So this was kind of a happy accident of our design, which maybe I'll get into a bit. But traditionally, in full waveform inversion, previous codes who have done this type of stuff have proven to be a very good use case for GPUs. And the reason is because of the specific formulation of the finite element method we're using. The details are a bit irrelevant, but what it results in is that you have
Starting point is 00:31:51 a very high arithmetic intensity because you have a lot of multiplications that are being done on locations that are quite close in memory. This is just perfect for the GPU. So we have not finished or really put into serious testing our GPU implementation yet, although this is in the plans for the future. But this is something that competing codes in the community make heavy use of,
Starting point is 00:32:20 and we're certainly looking at doing. The happy accident that I was talking about harkens back to our entire design philosophy with these C++ template mixins. Instead of just creating a classical class hierarchy where perhaps the acoustic wave equation might inherit from some sort of general finite element, and some other equations might inherit from that, and you have virtual functions computing the requisite terms.
Starting point is 00:32:54 We kind of base everything around these C++ template mixins. So everything, all this class hierarchy is resolved at compile time, and you more or less have a fully formed class object by the time you run the code. And what's nice about here, and what really surprised us, is that CUDA now is more or less C++11 compliant. And if you tell this to people, they're very surprised. But what this means is that you can basically put these fully formed classes, these element classes,
Starting point is 00:33:23 directly onto the GPU without really modifying them. And this is something that you could not do with classical C++ inheritance and virtual functions. You can't put abstract classes on the GPU, but you can put these template inheritance classes on the GPU. So that wasn't part of our design when we started, but that's kind of been something that's, oh, wait a minute, we can do this, and we've run some simple tests with the hierarchy. We haven't really made it fast yet,
Starting point is 00:33:51 but it does actually at least go there onto the card. So do you want to go into a little bit more detail for how these template mixins are organized? Sure. So I guess the whole design of the code like I was saying before, we're kind of basing this whole thing around some sort of object-oriented
Starting point is 00:34:14 approach with elements as kind of our base classes and then we inherit from elements because they have all this common functionality. But we kind of ran into trouble here because once we started wanting to actually change the functionality of the actual derived classes,
Starting point is 00:34:32 we kind of ended up in a situation where all of a sudden you'd have to fulfill all these virtual functions kind of all the way up the class hierarchy. And this kind of defeated the flexibility a little bit because if I wanted to add new physics, I'd also have to take into account, I don't know, the element type that we were working on, and so on and so forth. So we kind of went back to the drawing board a bit, and then eventually stumbled upon template mixins. So what we do here is instead of inheriting from sort of concrete class, we inherit from just some generic template parameter. So you might,
Starting point is 00:35:06 you can picture, you know, class element, colon public T, where T is a template parameter. And what's nice about this is that each individual class you write is all of a sudden not tied to any sort of particular hierarchy. So then at, we basically have some master function that at runtime determines which one of these template classes to build up and return. So for example, on the basic level, we have some sort of a two-dimensional finite element. And we want that element to inherit functionality that will let it solve the wave equation in a fluid. And then also we want that element that solves the wave equation in a fluid. And then also we want that element that solves the wave equation in the fluid to be
Starting point is 00:35:49 able to be modified later on by some other physical equations. And what the template mix and pattern allows us to do is write all these classes independently of each other and then whenever we want to incorporate all this functionality together to basically build up this relatively complex kind of template hierarchy to finally build up our final class. And this has really been the key to unlocking the flexibility of our method. Has that template complexity impacted your compile times adversely or sure um and in fact um so i have my work computer which is i guess a laptop um that was what about three or
Starting point is 00:36:36 four years old and then i had my home computer which is about a year old um and the big difference is that the my work computer has a spinning disk and my new one has a solid-state drive. And I came home one day to work on the code at home and installed it on my computer, and the compile time was like twice as fast on the solid-state drive. So I don't know if anybody else has seen this, but there seems to be something in the actual using something with a spinning disk that just really inflates your compile time with templates.
Starting point is 00:37:06 It's not so bad anymore, but certainly on the work computer, I was starting to get worried that, oh man, we're going to reach a level where the compile time is going to be as long as it would take to just write this code from scratch again. So yeah, there's definitely a worry there, but it seems that it was kind of more specific
Starting point is 00:37:25 to just these older machines. Perhaps it's not a general problem. I have personally moved to SSD on all of my development machines, so yes. Yeah, it's been the single most, I think, important thing to speed up everything in the past five years or so. It's the main advantage to being self-employed
Starting point is 00:37:44 is I do not have to ask for permission to upgrade my hardware. Right, I guess. You have to find a budget to pay for it some other way. Yes, yes, it still has to come out of the budget, yes. So are there any other C++ libraries that you're using while working on this? C++, yes. So have you heard of this template expression library called Eigen?
Starting point is 00:38:13 I heard about it earlier. I think it's, I'm not quite sure, but I imagine it's used a lot in the graphics community. Basically what it is, it's a header-only template library for matrix, vector, matrix, matrix multiplications. And we do a lot of this in our code. And what it does is it will properly vectorize, I mean use the vector registers in your computer optimally for any certain operation.
Starting point is 00:38:42 So you can just write, say you have a matrix, you call it A and a vector B, you can write A star B, just like you'd write A times B in any other kind of scalar multiplication. But if you actually look at the assembly code it generates, it's quite insane. I mean, it'll go through and it'll detect whether or not this matrix and vector product is a size that can fit into the vectorized register. And if it can't completely fit into the vectorized register, it'll take the part that can and chop off the end, vectorize that part, and take the end part
Starting point is 00:39:15 and kind of multiply it in the classic scalar way. The amount of complexity in here is really insane. And this is one of those template libraries that if you didn't really know, if you weren't a pro-template C++ developer, you'd probably look at it and throw your hands up and run screaming out the door. But for a guy like me who's using it,
Starting point is 00:39:34 it's proved to be very cool. It allows you to write very clean and very fast code. So I'd definitely recommend people check out Eigen if they're doing this type of stuff. So that kind of helps you take a full advantage of the computer you're currently running on. Are you doing distributed analysis of this stuff also?
Starting point is 00:39:54 Yes. So, I mean, these problems are big enough that if you don't, if they're not parallel, then you can basically go home. To do anything of a realistic size, it needs to be distributed. So the first time we went through this, the first time I went through it, and I think everyone in science has done this at one point,
Starting point is 00:40:14 you wrote the MPI from scratch. You kind of sat down and you said, okay, this grid point talks to this grid point and you spend like a month just banging your head on the table trying to get it to work. But it is necessary, unfortunately. Recently, we've kind of realized that you don't need to start from scratch every time. And in particular, there's this library called Petsy,
Starting point is 00:40:36 which is very widely used in kind of the finite element or computational engineering community, which has things like distributed matrix vector products built into it. So for example, I can have a vector that lives across, you know, 1000 processors, and I can just write, you know, A times B, and it will figure out which elements of that vector it needs to exchange with its neighboring processors, and do that multiplication and call the system blast for, you know know all the uh all the local stuff it needs to do so that's also been incredibly valuable to go from a serial code to
Starting point is 00:41:11 to a parallel code and to be honest the number of mpi calls that we've written ourselves probably number around 10 um and this is for a code that is again again, would not work if it wasn't parallel. So that's also been incredibly valuable. Although I think this is written in like the first version of C ever. I joked once that, you know, on Independence Day, when Jeff Goldblum and Will Smith kind of upload this code onto the alien mothership and like blows everything out, that was probably the the version like it runs everywhere it's patsy um but it's not c++ but it's it's it's it's very well written c okay so i don't think we've actually mentioned the name of the library yet it's called salvus right oh yeah that's that's right um is it available to use in a state where if someone
Starting point is 00:42:07 wanted to do these types of calculations they could make use of solvus and it would be stable um yeah so i'll have to say no at the moment just because you know this thing is it's since what we're trying to do it's really important that when it when the release does happen um that it comes out and it doesn't just crash immediately you know sure yells at you um so our uh our working goal for beta release is october 1st at the moment um with a kind of i guess we call alpha release in at the beginning of december um and that for good reason lines up with uh with a conference around that time so we have a very strict deadline to get everything done by then. But at the moment, I think it's very cool to browse
Starting point is 00:42:51 and really look at how these template mix-ins work. But I'd hesitate to say download it and use it for whatever you want because that comes with a host of problems. Maybe Rob can put in the show notes a link to some specific examples you think the listeners might want to check out in the code? Yeah, sure. It looks like you have this repository of data. Is that small enough data sets that you could run Solvus against it
Starting point is 00:43:19 on your own machine? Yes, it should. I guess the problem here is that there's quite a bit of dependencies i mean there's no and we will never actually be able to distribute just a binary and that's just kind of the the way that these codes really work um actually sorry i take that back um for small examples we we do plan at one point to actually release a binary that you can just run and install on your computer. But for the general design of the code, I mean, this thing is kind of meant to be compiled on clusters.
Starting point is 00:43:54 And for this reason, it's always machine dependent, and you always need to install specific versions of specific libraries to get it to work. And that's the main issue. So even installing it on your local machine now, you'll need to download and install this library, which is relatively simple. But for example, if you have Windows,
Starting point is 00:44:16 good luck, really, getting this all working. So I'd hesitate to point to a specific example. Well, I think that's all the questions I have for you today. Jason, you have anything else you want to ask? I believe that's all of my questions. Okay, Michael. Well, it's been great having you on the show. We'll put links to Solvus in the show notes. And if there's anything else you want us to share, just let us know. Okay. Yeah. Thanks a lot. And thanks for having me and look forward to uh continuing my uh my walks to work for the next however many years while you guys are still online so yeah thanks
Starting point is 00:44:51 awesome thank you see ya thanks so much for listening as we chat about c++ i'd love to hear what you think of the podcast please let me know if we're discussing the stuff you're interested in or if you have a suggestion for a topic, I'd love to hear that also. You can email all your thoughts to feedback at cppcast.com. I'd also appreciate if you can follow CppCast on Twitter, and like CppCast on Facebook. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Starting point is 00:45:22 Theme music for this episode is provided by podcastthemes.com.
