CppCast - Salvus
Episode Date: August 3, 2016

Rob and Jason are joined by Michael Afanasiev to discuss his work on the Salvus library used for performing full-waveform inversions.

Michael Afanasiev is currently working on his PhD in Geophysics. He became interested in programming and high performance computing during his BSc in Computational Physics, playing around with simulations of star formation. After a brief attempt to lead a roguish and exciting lifestyle as a field Geophysicist, he was brought back to the keyboard during an MSc, where he began working on full waveform inversion (FWI). In 2013 he moved to Switzerland to continue working on FWI as a PhD student at ETH Zurich, where he's currently wrapping things into a thesis. He spends most of his time writing scientific software, wandering through the Alps, and atoning for the times he repeated the mantra "Fortran is the best language for scientific computing."

News
- CppMem: An overview
- Why is .h more widely used than .hpp
- July update for Visual Studio Code C++ extension

Michael Afanasiev
- Michael Afanasiev's Blog
- Michael Afanasiev on GitHub

Links
- Salvus
- Combining Static and Dynamic Polymorphism with C++ Mixin classes
- Salvus: retaining runtime polymorphism with templates
- Salvus: dynamically inserting functionality into a mixin class hierarchy

Sponsor
- Incredibuild
Transcript
This episode of CppCast is sponsored by Incredibuild.
You won't believe how fast your build can run until you download your free developer version
at incredibuild.com slash cppoffer or just click the link on our website.
CppCast is also sponsored by CppCon, the annual week-long face-to-face gathering
for the entire C++ community.
Episode 66 of CppCast with guest Michael Afanasiev, recorded August 3rd, 2016.
In this episode, we discuss naming your header files.
Then we talk to Michael Afanasiev.
Michael talks to us about his work on the Salvus library. Welcome to episode 66 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
All right, Rob, how about you?
I'm good, no real news.
Looking forward to Seattle starting to get close, only two months away now.
Less than two months.
Less than two months, yes.
No new updates from CppCon this week, though. They still have a couple more things to announce, I guess.
Yeah, none of the plenary sessions have been announced yet.
Except for Bjarne.
Yeah, Bjarne and Dan Saks, as the keynotes, have been announced.
But there still can be other plenaries.
So hopefully we'll hear about them soon.
Anyway, at the top of our episode, I'd like to read a piece of feedback.
This week, we got a tweet sent to us from Keshav.
IEEE runs these things about what the most popular programming languages are.
And C overtook Java.
So it's not C++.
C++ was fourth on the list.
But it's pretty cool still to see Java get dethroned from the number one spot.
Well, that's interesting.
If you look at the numbers, really, they're all very close in their rankings.
Yeah, and the article also points out Arduino is on the list, but there isn't really an Arduino language.
Arduino uses C and C++, so they're on the list at number 12, which doesn't really make sense. They should just be adding to the C and C++ numbers.
Strange.
Yeah.
We'd love to hear your thoughts about the show as well.
You can always reach out to us on Facebook,
Twitter, or email us at feedback
at cppcast.com.
Don't forget to leave us reviews on iTunes
as well. Joining us today is Michael Afanasiev. He's currently working on his PhD in Geophysics. He became interested in programming and high performance computing during his Bachelor of Science in Computational Physics, playing around with simulations of star formation. After a brief attempt to lead a roguish and exciting lifestyle as a field geophysicist, he was brought back to the keyboard during a master's, where he began working on full waveform inversion. In 2013, he moved to Switzerland to continue working on full waveform inversion as a PhD student at Zurich, where he's currently wrapping things into a thesis.
He spends most of his time writing scientific software, wandering through the Alps, and
atoning for the times he repeated the mantra, Fortran is the best language for scientific computing.
Michael, welcome to the show.
Hi, thanks for having me.
There's a lot of stuff in your bio here that we could have fun with.
I'm curious about your simulations of star formations.
Okay.
How far did that go?
I mean, technically they're actually simulations of galaxy formation.
So we'd kind of start with some sort of initial distribution of density in the universe
and then try and form galaxies out of these things.
And my kind of bachelor's thesis was looking more closely at these things
and trying to figure out which initial settings of parameters would lead to
certain densities of stars being formed in some certain areas of density,
and try and relate that to what we actually see in the universe and tune these parameters accordingly.
At that stage, I wasn't actually writing much high-performance code myself.
It was more analyzing results.
But that kind of first got me into the whole scheme of HPC and the whole field.
That sounds like... And what exactly changed your mind with regards to Fortran?
Well, I mean, that kind of came when I started writing my own software, specifically, I guess, the stuff we'll talk about today. And I realized that Fortran is great, and it's fast, and it's really amazing if you kind of want to solve equations in a very structured sense. But when we started doing things that kind of were more in line with the actual,
with the real world,
where you kind of have multiple different physics and multiple different
problems living in the same kind of model space,
all of a sudden the limitations of Fortran kind of reared their ugly heads.
You basically
were left with thousands of lines
and many branches of if statements
to kind of get the thing done.
And it's that specific
kind of problem that led me to
abandon it and move to C++.
But
you can kind of see where
Fortran, where this kind of mantra comes from because
when I first moved to C++, worrying about things like memory allocation and deletion, they were an issue, and I didn't fully understand what was going on. And if you just want to get science done in a fast way, these things would hold me back and would hold other people back. But when you get over that kind of small little hiccup, you realize the power in C++ and never really look back.
Cool. Okay, so we have a couple articles to talk about. Feel free to jump in on any of these, Michael, and then we'll start talking to you about full waveform inversion in a little more detail.
Sure.
So this first one is from...
Oh, whose blog is this?
Rainer Grimm's blog about CppMem.
He's giving an overview of this tool.
Jason, have we talked about this one before?
It seems very familiar.
It looks very similar to the C++ code tracing tool
that we talked about,
but I believe we have not talked about this one specifically.
Yeah, so in this article, he gives an overview of CppMem.
He goes through a pretty basic example of how you can use the tool,
and I think it's going to be just part one of at least two parts he's going to do about the tool.
He mentions at the end here that he plans another post where he's going to analyze
a simple program with the help of CppMem to diagnose issues.
Right now the tool looks like it's in fairly early stages, in my opinion, but it's a pretty awesome way of trying to look at memory races and your memory model and that kind of thing, thread issues.
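For reference, here's the flavor of tiny litmus test a tool like this is built to analyze, written as plain C++11 (an illustrative snippet, not one taken from the article): two threads touching the same atomics with relaxed ordering, where the interesting question is which outcomes the memory model permits.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

std::atomic<int> x{0};
std::atomic<int> y{0};

int main() {
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);
        r2 = x.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    // With relaxed ordering, r1 == 0 && r2 == 0 is an allowed outcome;
    // a tool like CppMem enumerates every execution the model permits.
    std::printf("r1 = %d, r2 = %d\n", r1, r2);
}
```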
Is this an online tool, or can you install it on your PC? Is that right?
I believe it's meant
for online use.
Okay. There's a note here
in the first paragraph that says
which you can also install on your PC.
Oh, I see.
But the webpage works.
Okay.
The next thing we have is a Reddit discussion
which I just found
interesting about why dot H is more widely used than dot HPP or other naming conventions for C++
header files. And I know in pretty much all the code I've ever looked at, it's always dot H.
How about you, Jason?
I always use HPP. And in fact, one of my clients, I started with them early on in their project and they were using .h. And I said, trust me, switch to HPP.
You will appreciate it in the future.
And they did and they agreed with me.
They were glad that they made the change.
What's their main reason for their appreciation?
They had, I mean, it's a mix of C and C++, and being able to keep that straight. And then, for consistency, they had automatically generated files, which were CXX and HXX, to coincide with their handwritten CPP and HPP files. It made things more cohesive.
Right.
I think there is a post in that Reddit thread saying that, I mean, most people use .h because that's what a lot of IDEs kind of default to. And that's what happened to me. I'm using CLion a lot, and, you know, I create a new class and it just creates a .h file for me, and that's kind of why we have .h's in our code. And I know Visual Studio does the same, defaults to .h. And I thought it was interesting here.
I mean, I think HPP kind of should be the default,
no matter what the IDEs do.
But someone referenced the core guidelines,
and they use .h in all of their examples.
Yes, I disagree with them.
I know one thing that bugs me
when you're looking at a GitHub repository,
if you use .h and you're looking at the different languages
that are in your repository,
it identifies all the .h's as being C instead of C++.
I did not realize that.
So you might not have any C code, actual C code.
It'll tell you, like, oh, half your code is C++, half is C
because of all those .h files.
That's true.
Once, we were using this Catch testing library in our project, and it's catch.h, and for
some reason it had just listed it as Objective-C.
And it's quite a large portion of the code, so all of a sudden our code was like 30% Objective-C
when nobody had written any Objective-C.
I don't really know why that happened.
We eventually just removed it from the statistics,
but I was kind of curious.
You know, that makes me wonder if that's why Objective-C
has shown up on some of my open source profile stuff,
because I really haven't done any Objective-C.
Yeah, look into that.
It could be some H file or something in there that's doing it.
And I actually have catch.h and ChaiScript.
That could be exactly it then.
Well, Catch is supposed to run well on Objective-C too, isn't that what he originally wrote the tool for? I'm trying to think back to our conversation with him, and I know it definitely runs for Objective-C and C. Yes.
Yeah, it's been a while. But maybe, and maybe that's why he's using .h, I don't know. I actually did a Twitter poll about this exact topic, .h versus .hpp,
and the conclusion that I saw in my Twitter responses was,
we use h out of habit.
Yep.
That seemed to be what it came down to.
Yeah, I think most people who are using h,
it's probably not a conscious decision for them.
There is some talk of these, like, TPP for template files.
Have you guys ever seen this in practice?
No.
No, I've never seen it.
It's got to be complicated, I guess.
Yeah, the closest I've seen is that HXX, like I mentioned,
for SWIG-generated header files.
Okay, well, the last article here is from the Visual C++ blog.
We've talked about their Visual Studio Code
C++ extension a couple times,
and they released another
update for that.
It now supports debugging using
LLDB on Mac OS X. I know
that was one of the things we asked about when we
had Ankit on the show a while back.
Are there any other big features
in here? Having a process
picker for
attaching the debugger to a running process
on Linux.
Well, there's a little tidbit in the comments that they are actively working on debugging support on Windows.
Oh, okay.
it's still so odd that
it's going to be the last platform they support with this tool
but most of the people who would
want to use that are already using Visual Studio
so I guess it does make sense.
Well, and there's also...
Oh, I'm sorry, go ahead.
I'm developing on OS X mainly,
and I've been using CLion, but
I guess the question is, on Windows, why would you
use Visual Studio Code over
Visual Studio? I'm just not really familiar with
the difference. If you want to just
be very lightweight and
just use the command line build tools
instead of having to open up Visual Studio for everything, that would be the one reason.
Okay.
Or if you're using MinGW or Clang for Windows and just have no need for Visual Studio.
Yeah, but Visual Studio's debugger is the most powerful part of it, so it does seem a little weird to me to be on Windows and really close to using Visual Studio but not taking advantage of their debugger.
Right. Okay, so Michael, let's start talking about full waveform inversion. Can you describe what this is for those of us who aren't in the scientific community?
Yeah, sure. I guess the most general way to describe it is, it's sort of akin to,
say, x-ray tomography. But instead of imaging the body, in most cases, we're imaging
traditionally the earth and trying to create some sort of 3D image or model of the subsurface in a given region. I think maybe the coolest example is one from oil exploration.
So this is kind of the classic way that oil has been found over the past, I guess, 40
to 50 years.
And basically what they do, I'm not sure if you guys are familiar with how this works
in general.
Only vaguely.
It's pretty neat. So what they've
traditionally done is, say, taking the Gulf of Mexico as an example. I mean, we all know that
there's a lot of oil there after this kind of tragic disaster a couple years ago. But what they
do in the first place is they get the geologists to, you know, look at some sort of geological map of the region and pinpoint where it might be likely to find some sort of oil, given some sort of geologic signatures. And once they
kind of have a target region of interest, they hire these giant boats that drag behind them,
you know, about kilometers worth of these things called floating streamers. And these are massive
structures, maybe five kilometers wide and 10 kilometers long, that carry with them, among other things, these giant air guns. So these boats will kind of putter on through the ocean, and every now and then these air guns will just fire. Basically, this is exactly what it sounds like: they'll inject this huge bubble of air into the water and create a bubble, and that bubble will collapse. Now this is kind of similar to, say, throwing a pebble into a pond. You do this, and then you kind
of propagate in a circle away from where that pebble hit the water. So this happens in the
ocean as well but instead of just traveling along the surface they also go down through depth.
And eventually maybe about two kilometers deep they'll hit the ocean bottom.
And at this point, they'll convert from a wave through the ocean to an actual wave through the subsurface itself.
And once they're in the subsurface, they'll propagate through,
and they'll hit different types of geologic bodies.
They'll hit maybe a body of salt, some other rock over here
that is abnormally fast or abnormally slow with regards to wave propagation.
And eventually, through all
this scattering and movement underground, they'll eventually scatter back to the actual sea bottom.
And when they get back to the sea bottom, they'll convert back to an ocean wave and travel back up
through the water column and hit the boat again. And now in the boat, along with the air guns, are little receivers that basically measure the pressure in the water
over a given time. So the idea here is to basically look at these receiver time series.
They basically look like, you know, you've seen these old school earthquake recordings
with, like, the rotating thing and the pen. And traditionally what people have done has been,
I mean, we know what time this air gun goes off,
and we know how long it's taken since the air gun went off to get your first kind of wiggle on one of those earthquake-looking recordings.
And you can imagine that if you do this for enough times, for enough different locations in the ocean,
you can basically build up a map of subsurface velocity with position. And this ends
up being some sort of massive system of linear equations. And you can infer the velocity, the
three-dimensional velocity model of the subsurface, just by looking at these travel times. And so this
has been the method that's been basically the workhorse of the oil exploration industry for
the past 50-odd years.
Now, about 30 years ago, some guys kind of figured out that, you know, it seems a bit of a shame, because you have this long recording, or earthquake recording, of all these different pressure wiggles, and we're kind of throwing all that away. We're just looking at one data point, and that data point is when that wave first arrives at the receiver. And they thought, wouldn't it be great
if we can basically extract the rest of the information
from this seismogram as well
and relate all these later wiggles
to perhaps more complex Earth structure?
Maybe we can get a higher resolution image.
So the theory for this came from,
for those interested,
came from kind of optimal control applications.
But 30 years ago there was no computer powerful enough to actually get this done, and the reason is that the method requires you to not just model the first arrival time of these waves but actually the full waveform itself. So in that kind of pebble example, you're really modeling that wave moving along the surface of the ocean, kind of every little ripple and every little distortion to that wave train.
But recently, about 10 years ago, all of a sudden clusters are big enough, computers are fast enough, that people started actually applying this technique to reasonably sized Earth models. So nowadays when they go out and do a survey in the Gulf of Mexico, for instance, they're not just looking at that first arrival time of the seismic wave, but we're trying to relate every single wiggle along that kind of seismogram
to different perturbations and different structures in the subsurface itself.
This is the big difference.
This is the technology that full waveform inversion brought to the table.
It's generating some high-resolution,
three-dimensional model of the subsurface
using all the information,
or as much information as possible,
that exists in these recordings,
in these seismic recordings.
If this kind of gives you an overview of the idea of things.
So you started by saying it's a giant structure that's like a kilometer wide or whatever that they're dragging along the ocean. Are they actually emitting multiple of these pulses at once?
Well, so classically what would happen is that one source would fire at a time, and that response would be measured at the receivers, and you'd kind of have to wait until that response and the reverberations in the ocean stopped before you could move on and fire another source.
Okay.
But actually, lately there's been some work done on what's called source encoding, and what this is, is basically finding out some optimal way to fire all the sources actually at the same time, with certain tuning parameters, so that they can be easily separated at the receivers, and it might look as if each source had been fired individually and the requisite time had been spent waiting. That's not really my field, but there's certainly research going in this direction, and that's just to make the actual survey itself more time efficient.
The thing that my brain is having a hard time processing is, if there's a bunch of different things going off at the same time, or in close relationship to each other, how in the world do you keep separate the results coming back from something that just happened to propagate through a very slow material versus something that was fired later?
Yep, so there are techniques to kind of do this. It gets a bit technical, but there are certain ways to fire the actual sources in a certain order that can optimally separate the responses of each of those sources at the receiver.
Okay.
But you are right that this is definitely a problem. The topic is called source encoding, and that's kind of what it sounds like: trying to encode all the information from all these sources into one giant pulse. But yeah, this is an issue of research for sure.
But needless to say, you're talking about a massive amount of data that you guys are working with.
Yeah, absolutely. Absolutely insane, really. Our group's working mainly with actual earthquake data
on the global scale.
So while we still have a lot of data,
it doesn't really approach the type of data
that these guys from exploration actually go for.
These are tens of terabytes per survey.
So you...
Wow.
I'm sorry, but you just said earthquake data.
So you could use historical
earthquake data to do, like, 3D mapping of the Earth or something?
Right, so, well, exploration is where full waveform inversion kind of originated and began its use. Our group's actually using it, instead of driving these boats around, we're actually waiting for earthquakes to happen and looking at the seismograms that you're
probably more familiar with that kind of just lie on the surface and record the wiggles as
the earthquake goes by. So you can use the same techniques by mapping, you know, each one of these
wiggles to some sort of three-dimensional perturbation inside the Earth using an earthquake source and a normal seismogram
just as you can in the oil exploration industry
using these air guns and the receivers on the back of the boat.
It's the same concept.
We're still modeling waves, and we're still trying to map
where there's some sort of scattering heterogeneity.
Just the scales are completely different,
but the technique's the same.
Very interesting.
And I guess most recently,
and perhaps a bit more interesting looking forward into the future, is that people have realized that the same techniques can now be used, for example,
in cancer imaging. So one of the big application areas that
our group's involved with is in breast cancer imaging.
So again, here the technique's very similar.
You have an emitter that emits some sort of vibration into the breast,
and this vibration travels through the tissue,
and then gets recorded at some receiver.
And you again measure all these different wiggles,
and you try and map back the distortions in those wiggles
to actual heterogeneities inside the breast
to try and accurately pinpoint perhaps cancerous tissue
or some sort of other
abnormality. But again, it's the same technique, and we can kind of use the same codes to do these
things as you would to look for oil, which is quite cool.
So, I mean, you're working on a C++ library to do this, but before we get to that, I just have
one more quick question. What's the actual result of this? Like, how do you visualize this data? What are you trying to get?
So generally, in terms of visualization, I mean, you get some sort of three-dimensional representation of, let's say, velocity or density as it varies in the subsurface. And you look at this thing on kind of just a giant mesh of, say,
triangles, or, you know, like you look at any
sort of three-dimensional graphic. And at each
vertex of that triangle, you have an estimate
for the density or an estimate for the velocity.
And then you can kind of sweep
through this 3D model and look for things that are interesting.
Okay.
And that's the end result.
Alright, cool.
I wanted to interrupt this discussion for just a
moment to bring you a word from our sponsors. IncrediBuild dramatically reduces compilation and development times for both small and big companies like EA, Microsoft Game Studios, and NVIDIA. IncrediBuild's unique process virtualization technology will transform your computer network into a virtual supercomputer and let each workstation use hundreds of idle cores across the network.
Use IncrediBuild to accelerate more than just C++ compilations.
Speed up your unit tests, run more development cycles, and scale your development to the cloud to unleash unreal speeds.
Join more than 100,000 users to save hundreds of monthly developer hours using existing hardware. Download your free developer version at incredibuild.com slash cppoffer
or just click on the link in our link section.
So could you tell us a little bit about the actual library you're working on
to perform this analysis?
Yeah, sure.
So I guess back in January or so, me and a group of guys in my working group had kind
of realized that while there are codes that exist that kind of do this full waveform inversion for
any one of these problem areas, like there's ones that do it for the globe, there's ones that do it
for oil exploration, and there's ones that do it for breast cancer, there's not really one that
does it for all of them. And when we kind of realized
that the actual application area is so similar
and that we should be able to use the code for all of them,
we kind of realized that, well, there's a niche here
and there's something that we can write
and something that we can contribute back to the community.
So this is what we started writing.
And basically the express goal was that
at the beginning we wanted to sacrifice
perhaps some of the speed you'd get for designing a
code that worked specifically for one application in order to kind of get
back some flexibility so that we could apply this to different problem
domains and try out different techniques and kind of rapidly iterate on
scientific ideas.
Because the codes that exist now, again,
are very kind of tailored and fast for one specific thing,
but don't necessarily have the flexibility needed
for kind of quick idea development in the scientific community.
So that's where we saw that we could make a bit of a difference.
So how did you end up then choosing C++?
I guess you kind of got into that a little bit with your bio.
Right.
Originally I got into C++ for a slightly different application.
But for this type of thing,
we were basing everything on a finite element type method.
And I'm not sure how familiar anyone is with finite elements in general, but the idea is that you have these kind of small little spatial elements, maybe they'd be triangles or hexahedra, and you solve equations on these little elements, and then you basically generate
a set of global equations from all the small equations you solve on these elements. And what this really allows you to do is,
there's a really nice formulation here in terms of object-oriented programming.
I mean, each element is an object.
It has certain qualities, like it has coordinates in space,
it has maybe wave speed velocity on it,
and it has a whole bunch of different things that belong only to that element itself.
And really, C++ was the only language that we could think of
that would combine the speed we needed,
because these problems are quite large,
with the type of abstract object-oriented approach
that we wanted to go to to kind of exploit the flexibility of the language.
Okay.
So that's why we chose C++, really.
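As a very loose sketch of that "each element is an object" idea, with made-up fields rather than Salvus's actual data layout:

```cpp
#include <array>
#include <vector>

// Toy finite element: everything this element needs lives on the element itself.
struct Element {
    std::array<std::array<double, 2>, 4> vertex_coordinates; // geometry of this quad
    std::vector<double> wave_speed;   // material property sampled at its interior points
    std::vector<double> solution;     // field values local to this element

    // Work that touches only this element's own data.
    void compute_local_contribution() { /* ... */ }
};
```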
And the question often is,
perhaps why didn't you choose something...
If you're really going for flexibility,
why not choose something even more flexible?
For example, Python.
And we kind of did a little back-of-the-envelope calculation,
which is, I think, quite funny.
So there's one function that gets called in this code.
And then so, for example, on a typical run,
for a big run, we might have 100 million of these finite elements in our model.
And because of our method on each element,
there's about 100 points where actual solutions are calculated.
And now,
if you can remember that kind of boat driving along, there's about 1,000 sources or so
per simulation.
And each simulation, each source
requires about 10,000 time steps to
actually model the wave forward in time.
And then you need about, on top of that,
100 iterations of the entire method
to produce your final image.
So that ends up being 10 quintillion function calls, for this one function. So basically, we need to ensure that when we're doing this thing, functions like this are called basically as fast as they can be. And it's for these reasons that, you know, we really had to go to some language that was, I guess, closer to the metal than something newer
and perhaps more abstract.
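Multiplying out the figures just quoted shows where that number comes from (a back-of-the-envelope check, nothing more):

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    constexpr std::uint64_t elements   = 100'000'000; // ~1e8 finite elements in a big model
    constexpr std::uint64_t points     = 100;         // ~1e2 solution points per element
    constexpr std::uint64_t sources    = 1'000;       // ~1e3 sources per simulation
    constexpr std::uint64_t time_steps = 10'000;      // ~1e4 time steps per source
    constexpr std::uint64_t iterations = 100;         // ~1e2 iterations of the whole method

    constexpr std::uint64_t calls = elements * points * sources * time_steps * iterations;
    std::printf("%llu calls\n", static_cast<unsigned long long>(calls)); // 1e19: ten quintillion
}
```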
10 quintillion is a lot of function calls.
One times 10 to the 19, apparently.
I had to look that up.
I wouldn't have known what that was, no.
Yeah, it actually is a number.
I thought people just made up that name, but that's true.
What version of C++ are you using with this library?
Are you using some of the more modern techniques?
Yeah, so I guess, I mean, I kind of had the luck
to kind of get into C++ after C++11 came out.
So this is basically how I learned it.
That sounds nice.
Yeah, yeah.
I'm kind of using these things like even range-based for loops
that some other people are like, wait a minute, you can really do that.
And I'm like, this is naturally spread out through our code.
So there's a couple of particular things about modern C++
that we used.
In particular, this kind of trick with unique pointers.
So, for example, when we're trying to figure out where to,
in our computational model, where to put these sources,
we kind of have to search through all the elements
and find the one where, say, the streamer from that survey boat lives in.
Okay.
And then we add a source to that element,
and that's a forcing term and it goes
into the code and then it does its thing. But if you do this with unique pointers, basically each source is assigned a unique pointer, and you kind of go through these elements one by one, trying to find out where the source should live. As soon as it finds itself, it basically inserts itself there, and then it doesn't exist anymore in kind of the array of sources that can be added.
And this is quite nice, because often you get into really kind of annoying edge cases where a source might live on the exact edge of the element or at a vertex.
So in theory, it could lie in any one of perhaps four or eight or 16 elements.
But since we're doing this, everything kind of just has one instance. As soon as it gets moved, we don't have to worry about perhaps duplicating the source in some other element.
And we found this just a very clean way of doing it
rather than kind of keeping track of,
okay, this element has a source,
and then you have to go back
and kind of remove the ones that were already there.
It was a pretty cool trick.
And that was definitely something
that cleaned up the code a lot.
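A minimal sketch of that unique_ptr trick, with hypothetical Source and Element types standing in for the real Salvus classes:

```cpp
#include <algorithm>
#include <memory>
#include <vector>

struct Source { double x = 0, y = 0; };                  // hypothetical source term

struct Element {
    bool contains(const Source&) const { return true; }  // placeholder geometric test
    std::vector<std::unique_ptr<Source>> sources;         // sources living in this element
};

// Move each pending source into the first element that contains it. Because each
// source is owned by exactly one unique_ptr, it can only ever end up in one element,
// even if it sits on a shared edge or vertex of several candidate elements.
void attach_sources(std::vector<Element>& elements,
                    std::vector<std::unique_ptr<Source>>& pending) {
    for (auto& src : pending) {
        for (auto& elem : elements) {
            if (src && elem.contains(*src)) {
                elem.sources.push_back(std::move(src)); // src becomes null here
                break;
            }
        }
    }
    // Drop the moved-from entries so they can't be inserted anywhere else.
    pending.erase(std::remove_if(pending.begin(), pending.end(),
                                 [](const std::unique_ptr<Source>& p) { return !p; }),
                  pending.end());
}

int main() {
    std::vector<Element> mesh(4);
    std::vector<std::unique_ptr<Source>> sources;
    sources.push_back(std::make_unique<Source>());
    attach_sources(mesh, sources);   // the source now lives in exactly one element
}
```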
Interesting.
I am curious, so what kind of hardware are you taking advantage of to make this as fast as possible?
You mentioned high-performance computing.
Are you doing anything on GPUs yet?
Sure.
So this was kind of a happy accident of our design, which maybe I'll get into a bit.
But traditionally, in full waveform inversion, previous codes who have done this type of
stuff have proven to be a very good use case for GPUs.
And the reason is because of the specific formulation
of the finite element method we're using.
The details are a bit irrelevant,
but what it results in is that you have
a very high arithmetic intensity
because you have a lot of multiplications
that are being done on locations
that are quite close in memory.
This is just perfect for the GPU.
So we have not finished or really put into serious testing our GPU implementation yet,
although this is in the plans for the future.
But this is something that competing codes in the community make heavy use of,
and we're certainly looking at doing.
The happy accident that I was talking about harkens back to our entire design philosophy
with these C++ template mixins.
Instead of just creating a classical class hierarchy
where perhaps the acoustic wave equation
might inherit from some sort of general finite element,
and some other equations might inherit from that,
and you have virtual functions computing the requisite terms.
We kind of base everything around these C++ template mixins.
So everything, all this class hierarchy is resolved at compile time,
and you more or less have a fully formed class object by the time you run the code.
And what's nice about here, and what really surprised us,
is that CUDA now is more or less C++11 compliant.
And if you tell this to people, they're very surprised.
But what this means is that you can basically put
these fully formed classes, these element classes,
directly onto the GPU without really
modifying them. And this is something that you could not do with classical C++ inheritance and
virtual functions. You can't put abstract classes on the GPU, but you can put these template
inheritance classes on the GPU. So that wasn't part of our design when we started,
but that's kind of been something that's,
oh, wait a minute, we can do this,
and we've run some simple tests with the hierarchy.
We haven't really made it fast yet,
but it does actually at least go there onto the card.
So do you want to go into a little bit more detail
for how these template mixins are organized?
Sure.
So I guess
the whole design of the code
like I was saying before, we're kind of
basing this whole thing around some sort of object-oriented
approach with
elements as kind of
our base classes and then we inherit from elements
because they have all this common functionality.
But we kind of
ran into trouble here
because once we started wanting to actually change
the functionality of the actual derived classes,
we kind of ended up in a situation where all of a sudden
you'd have to fulfill all these virtual functions
kind of all the way up the class hierarchy.
And this kind of defeated the flexibility a little bit
because if I wanted to add new physics, I'd also have to take into account, I don't know, the element type that we
were working on, and so on and so forth. So we kind of went back to the drawing board a bit,
and then eventually stumbled upon template mixins. So what we do here is instead of inheriting from
some sort of concrete class, we inherit from just some generic template parameter. So you can picture, you know, class Element : public T, where T is a template parameter.
And what's nice about this is that each individual class you write is all of a sudden not tied to any
sort of particular hierarchy. So then we basically have some master function that at runtime determines which one of these template classes to build up and return.
So for example, on the basic level, we have some sort of a two-dimensional finite element.
And we want that element to inherit functionality that will let it solve
the wave equation in a fluid.
And then also we want that element that solves the wave equation in the fluid to be
able to be modified later on by some other physical equations.
And what the template mix and pattern allows us to do is
write all these classes independently of each other and then whenever we want to incorporate all this functionality together
to basically build up this relatively complex kind of template hierarchy to finally build
up our final class.
And this has really been the key to unlocking the flexibility of our method.
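A rough illustration of the pattern being described, with hypothetical class names rather than the actual Salvus hierarchy: each capability inherits from its template parameter, and the concrete element type is assembled by stacking layers at compile time.

```cpp
#include <type_traits>

// Base layer: a plain 2D element with only geometry.
struct Quad2D {
    double coordinates[4][2] = {};
    void assemble() { /* base contribution to the local equations */ }
};

// Each piece of physics is written against a generic base, not a fixed hierarchy.
template <typename Base>
struct AcousticWave : Base {
    double velocity = 0.0;
    void assemble() { Base::assemble(); /* add fluid wave-equation terms */ }
};

template <typename Base>
struct Attenuation : Base {
    double q_factor = 0.0;
    void assemble() { Base::assemble(); /* modify the terms added below it */ }
};

// The full element is composed at compile time; there are no virtual calls.
using AttenuatingAcousticQuad = Attenuation<AcousticWave<Quad2D>>;

// No vtable also means the composed object stays trivially copyable, which is
// part of what makes copying such objects onto a GPU straightforward.
static_assert(std::is_trivially_copyable<AttenuatingAcousticQuad>::value, "");

int main() {
    AttenuatingAcousticQuad element;
    element.assemble();   // runs the Quad2D, AcousticWave, then Attenuation pieces
}
```

The "master function" mentioned above would then just be a small factory that, based on runtime options, instantiates one of these compile-time compositions.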
Has that template complexity impacted your compile times adversely?
Sure. And in fact, so I have my work computer, which is, I guess, a laptop that was, what, about three or four years old, and then I had my home computer, which is about a year old. And the big difference is that my work computer has a spinning disk and my new one has a solid-state drive.
is that the my work computer has a spinning disk and my new one has a solid-state drive.
And I came home one day to work on the code at home and installed it on my computer,
and the compile time was like twice as fast on the solid-state drive.
So I don't know if anybody else has seen this,
but there seems to be something about using a machine with a spinning disk that just really inflates your compile time with templates.
It's not so bad anymore,
but certainly on the work computer,
I was starting to get worried that,
oh man, we're going to reach a level where the compile time
is going to be as long as it would take
to just write this code from scratch again.
So yeah, there's definitely a worry there,
but it seems that it was kind of more specific
to just these older machines.
Perhaps it's not a general problem.
I have personally moved to SSD
on all of my development machines, so yes.
Yeah, it's been the single most, I think,
important thing to speed up everything
in the past five years or so.
The main advantage to being self-employed
is I do not have to ask for permission to upgrade my hardware.
Right, I guess.
You have to find a budget to pay for it some other way.
Yes, yes, it still has to come out of the budget, yes.
So are there any other C++ libraries
that you're using while working on this?
C++, yes.
So have you heard of this expression template library called Eigen?
I heard about it earlier.
I think it's, I'm not quite sure,
but I imagine it's used a lot in the graphics community.
Basically what it is, it's a header-only template library
for matrix-vector and matrix-matrix multiplications.
And we do a lot of this in our code.
And what it does is it will properly vectorize,
I mean use the vector registers in your computer optimally for any certain operation.
So you can just write, say you have a matrix, you call it A and a vector B,
you can write A star B, just like you'd write A times B in any other kind of scalar multiplication.
But if you actually look at the assembly code it generates, it's quite insane.
I mean, it'll go through and it'll detect whether or not this matrix and vector product is a size
that can fit into the vectorized register.
And if it can't completely fit into the vectorized register,
it'll take the part that can and chop off the end,
vectorize that part, and take the end part
and kind of multiply it in the classic scalar way.
The amount of complexity in here is really insane.
And this is one of those template libraries
that if you didn't really know,
if you weren't a pro-template C++ developer,
you'd probably look at it and throw your hands up
and run screaming out the door.
But for a guy like me who's using it,
it's proved to be very cool.
It allows you to write very clean and very fast code.
So I'd definitely recommend people check out Eigen
if they're doing this type of stuff.
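For a flavor of what that looks like in practice, a tiny Eigen example (the sizes here are made up for illustration):

```cpp
#include <Eigen/Dense>
#include <iostream>

int main() {
    // Fixed-size types let Eigen unroll and vectorize the whole product.
    Eigen::Matrix4d A = Eigen::Matrix4d::Random();
    Eigen::Vector4d b = Eigen::Vector4d::Random();

    // Reads like the math; expression templates pick the SIMD kernels underneath.
    Eigen::Vector4d y = A * b;

    std::cout << y << "\n";
}
```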
So that kind of helps you take full
advantage of the computer you're currently
running on. Are you doing
distributed analysis of this stuff also?
Yes.
So, I mean, these problems are big enough
that if you don't, if they're not parallel,
then you can basically go home.
To do anything of a
realistic size, it needs to be distributed.
So the first time we went through this, the first time I went through it,
and I think everyone in science has done this at one point,
you wrote the MPI from scratch.
You kind of sat down and you said, okay, this grid point talks to this grid point
and you spend like a month just banging your head on the table
trying to get it to work.
But it is necessary, unfortunately.
Recently, we've kind of realized that you don't need
to start from scratch every time.
And in particular, there's this library called PETSc,
which is very widely used in kind of the finite element
or computational engineering community,
which has things like distributed matrix vector products
built into it. So for example, I can have a vector that lives across, you know, 1000 processors,
and I can just write, you know, A times B, and it will figure out which elements of that vector it
needs to exchange with its neighboring processors, and do that multiplication and call the system
BLAS for, you know, all the local stuff it needs to do. So that's also been incredibly valuable to go from a serial code to a parallel code. And to be honest, the number of MPI calls that we've written ourselves probably number around 10, and this is for a code that, again, would not work if it wasn't parallel. So that's also been incredibly valuable.
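A rough sketch of the kind of distributed matrix-vector product being described, using PETSc's C API from C++ (error checking and the actual matrix entries are omitted; the global size is made up):

```cpp
#include <petscmat.h>

int main(int argc, char **argv) {
    PetscInitialize(&argc, &argv, nullptr, nullptr);

    Mat A;
    Vec x, y;
    PetscInt n = 1000000; // global size; PETSc decides how to split it across ranks

    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetFromOptions(A);
    MatSetUp(A);
    // ... fill in entries with MatSetValues, then assemble ...
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    MatCreateVecs(A, &x, &y);
    VecSet(x, 1.0);

    // One call: PETSc exchanges the needed vector entries between neighboring
    // ranks and calls the local BLAS for the on-process work.
    MatMult(A, x, y);

    VecDestroy(&x);
    VecDestroy(&y);
    MatDestroy(&A);
    PetscFinalize();
}
```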
Although I think this is written in like the first version of C ever.
I joked once that, you know, on Independence Day,
when Jeff Goldblum and Will Smith kind of upload this code onto the alien mothership
and like blows everything out, that was probably the version. It runs everywhere, it's PETSc. But it's not C++, it's very well written C.
Okay, so I don't think we've actually mentioned the name of the library yet. It's called Salvus, right?
Oh yeah, that's right.
Is it available to use in a state where, if someone wanted to do these types of calculations, they could make use of Salvus and it would be stable?
Yeah, so I'll have to say no at the moment, just because, you know, with what we're trying to do, it's really important that when the release does happen, it comes out and it doesn't just crash immediately and, you know, yell at you.
Sure.
So our working goal for a beta release is October 1st at the moment, with a kind of, I guess we call it, alpha release at the beginning of December. And that, for good reason, lines up with a conference around that time, so we have a very strict deadline to get everything done by then.
But at the moment, I think it's very cool to browse
and really look at how these template mix-ins work.
But I'd hesitate to say download it and use it for whatever you want
because that comes with a host of problems.
Maybe Rob can put in the show notes a link to some specific examples
you think the listeners might want to check out in the code?
Yeah, sure.
It looks like you have this repository of data.
Are those small enough data sets that you could run Salvus against them
on your own machine?
Yes, it should.
I guess the problem here is that there's quite a bit of dependencies. I mean, we will never actually be able to distribute just a binary, and that's just kind of the way that these codes really work. Actually, sorry, I take that back. For small examples, we do plan at one point to actually release a binary that you can just run and install on your computer.
But for the general design of the code,
I mean, this thing is kind of meant to be compiled on clusters.
And for this reason, it's always machine dependent,
and you always need to install specific versions of specific libraries
to get it to work.
And that's the main issue.
So even installing it on your local machine now,
you'll need to download and install this library,
which is relatively simple.
But for example, if you have Windows,
good luck, really, getting this all working.
So I'd hesitate to point to a specific example.
Well, I think that's all the questions
I have for you today. Jason, you have anything else you want to ask? I believe that's all of
my questions. Okay, Michael. Well, it's been great having you on the show. We'll put links
to Salvus in the show notes. And if there's anything else you want us to share, just let us know.
Okay. Yeah, thanks a lot. And thanks for having me, and I look forward to continuing my walks to work for the next however many years while you guys are still online. So yeah, thanks.
Awesome. Thank you.
See ya.
Thanks so much for listening as we chat about C++. I'd love to hear what you think of the podcast. Please let me know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, I'd love to hear that also.
You can email all your thoughts to feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter,
and like CppCast on Facebook.
And of course, you can find all that info and the show notes
on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.