CppCast - Blaze
Episode Date: November 2, 2016

Rob and Jason are joined by Klaus Iglberger to discuss the Blaze high performance math library.

Klaus Iglberger finished his PhD in computer science in 2010. Back then, he contributed to several massively parallel simulation frameworks and was an active researcher in the high performance computing community. From 2011 to 2012, he was the managing director of the central institute for scientific computing in Erlangen. Currently he is on the payroll at CD-adapco in Nuremberg, Germany, as a senior software engineer. He is the co-organizer of the Munich C++ user group (MUC++) and he is the initiator and lead designer of the Blaze C++ math library.

News
Recommendations to speed C++ builds in Visual Studio
void foo(T& out) - How to fix output parameters
Routing paths in IncludeOS

Guest
Klaus Iglberger

Links
Blaze
Munich C++ User Group
CppCon 2016: Klaus Iglberger "The Blaze High Performance Math Library"

Sponsor
JetBrains
Transcript
This episode of CppCast is sponsored by JetBrains, maker of excellent C++ developer tools,
including CLion, ReSharper for C++, and AppCode.
Start your free evaluation today at jetbrains.com slash cppcast dash cpp.
And by Meeting C++, the leading European C++ event for everyone in the programming community.
Meeting C++ offers five tracks with seven sessions and two
great keynotes. This year, the conference is on the 18th and 19th of November in Berlin.
Episode 77 of CppCast with guest Klaus Iglberger, recorded November 2nd, 2016. In this episode, we discussed output parameters and speeding up your builds.
Then we talked to Klaus Iglberger.
Klaus tells us about the Blaze High Performance Math Library for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Good, Rob. How about you?
Doing pretty good. How did Halloween go for you?
I think we had something like 225 kids maybe
Oh wow. We must have only had a handful, because we just left the bowl out for a while, and
it was still pretty full when we got home.
Okay, so my neighborhood has a Facebook page
where people post, like, you know, when there's kids doing vandalism, that kind of thing, right?
People talk about security or whatever.
Someone said they left their bowl out
because they needed to go to bed or whatever with candy in it.
And now they're wondering,
has anyone seen my large mixing bowl?
Because the kids took the bowl with the candy.
Why would you do that?
I don't know.
Yeah.
I live on a cul-de-sac, and my driveway is a little steep.
So I kind of think kids just towards the end of the night were intimidated by the steep driveway and just didn't want to go through the extra effort to reach my bowl.
So that's why it was so full still.
That's why I guess they didn't deserve that extra piece of candy. I guess not.
Anyway,
at the top of our episode, I'd like to read a piece of
feedback. This week,
Joey writes in saying,
I really enjoy the podcast. Keep up the good work.
I just had to make one comment
though on your offhand remark about
a possible celebration for your 100th
podcast. And he says,
why did you pick such an arbitrary number?
You should have had a celebration back on your 64th episode,
or maybe for your 120th, which is still a bit far away.
Nice round numbers for programmers.
Just to be clear, we have not actually planned any kind of spectacular celebration.
No, but we could.
We could.
Plenty of ways away.
I do like his idea with, you know,
the byte-oriented celebration numbers.
Yeah, something like the 256th episode, maybe, if we should plan
for that one. That'd be quite a ways away. Well, actually, if we're zero-counted, the 255th episode, maybe.
Yeah, well, are we zero-counted? I think the first episode is the first episode. No, yeah, you should
be, you should be. Wasn't thinking ahead on that one. I didn't think about that with C++ Weekly either.
Anyway, we'd love to hear your thoughts about the show. You can always reach out to us on
Facebook, Twitter, or email us at feedback@cppcast.com. And don't forget to leave us reviews
on iTunes.
Joining us today is Klaus Iglberger.
Klaus finished his PhD in computer science in 2010.
Back then, he contributed to several massively parallel simulation frameworks
and was an active researcher in the high-performance computing community.
From 2011 to 2012, he was the managing director of the Central Institute for Scientific Computing in Erlangen.
Currently, he's on the payroll at CD-adapco in Nuremberg, Germany, as a senior software engineer.
He's the co-organizer of the Munich C++ User Group,
and he's the initiator and lead designer of the Blaze C++ Math Library.
Klaus, welcome to the show.
Hi, Rob. Hi, Jason.
Hey, thanks for joining us.
So how long has the Munich C++ user group been going?
How long has it been going?
For about three years.
Oh, very good.
And is that... I think it's one of the most active communities in Germany.
So we have now about 830 active members.
Okay, active.
Well, about 80 to 90 show up every month.
Wow.
Meet once a month.
That's good.
That is impressive.
Is that the same C++ user group
that Jens Weller would attend?
Jens is doing Berlin and, I think, Düsseldorf.
And Munich is probably too far for him.
Gotcha.
Yeah.
And I co-organize them,
doing this together with Daniel Pfeiffer.
Very nice.
Well, we have a couple news articles
to go through, Klaus,
and then we'll start talking about Blaze.
So feel free to jump in on any of these news articles.
The first one is from the Visual C++ blog.
And this one is recommendations to speed up your C++ builds in Visual Studio.
And they have a number of good recommendations, many that our listeners might already be familiar with.
The first one is going to go into detail about guidelines
for using precompiled headers.
Yeah.
Go ahead, Jason.
Precompiled headers are great,
but it is hard to use them consistently in cross-platform code.
Right, because Clang and GCC don't have as good support for it, right? Very different
support, too.
And trying to kind of fit all that
into CMake and get it all to do what you want
to do is a bit of a pain.
Right. In addition
to pre-compiled headers, they suggest
using /MP,
which will parallelize
your compilation,
/incremental for incremental linking,
and /debug:fastlink to generate partial PDB files.
That's interesting.
I don't think I'm too familiar with that one.
No, I've never seen that one.
Yeah.
And the last guideline is to possibly make use of third-party build accelerators,
one of which is IncrediBuild, which we've talked about before on the show, and which used to sponsor
the show. And they have another one here named Electric Cloud, which I don't think I've heard of,
but I'm guessing they do a similar distribution of your build process as IncrediBuild does.
It seems that way, yeah. Anything you want to mention on this one, Klaus?
I think
/MP is something that everybody can use.
This would be my favorite.
Because today
everybody has more than two cores,
probably four, perhaps even eight.
That's an easy game.
Yeah, absolutely.
Okay, next one is
from foonathan's blog, and it's about how to fix output parameters.
And I thought this was a pretty interesting blog post going through.
He doesn't want to either defend or advocate against using output parameters, but basically says if you're going to use them, he created a new type to help manage the use of output parameters.
What did you think about this one, Jason?
I liked that.
I'm like a quarter of the way into it,
and I'm like, ooh, I see where he's going with this.
And I totally like where he went with it.
It seems like a good idea to me to take advantage of the type system as much as we can.
Yeah.
I like it too. So it's a very nice approach.
Also very clean, very obvious what's happening.
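(For reference, the core idea is roughly this kind of wrapper. This is a minimal sketch with hypothetical names, not the actual API from the post or from the type_safe library:)

    #include <utility>

    // Sketch of an output-parameter wrapper (hypothetical names).
    // The wrapper makes the intent explicit at the call site and
    // only permits writing, never reading.
    template <typename T>
    class output_parameter {
    public:
        explicit output_parameter(T& ref) : ref_(ref) {}

        // Assignment is the only allowed operation.
        template <typename U>
        output_parameter& operator=(U&& value) {
            ref_ = std::forward<U>(value);
            return *this;
        }

    private:
        T& ref_;
    };

    // Helper so a call site reads as: parse(out(result), text);
    template <typename T>
    output_parameter<T> out(T& ref) { return output_parameter<T>(ref); }

At the call site, out(result) documents that the argument is written to, and inside the function the type prevents accidentally reading the not-yet-initialized value.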
Yeah, definitely. And he points out just at the end, I'm not sure if you
got down this far, Jason, that he's making use of
a bit more sophisticated
implementation of this in his type_safe library.
And I don't know if that was one of the libraries we discussed with him when we
had Jonathan on the show.
I do not believe we discussed it with him.
I did see that on the,
on the article.
Yeah.
You know,
it looks like the library is only a few months old actually.
Okay.
So this might be a new one.
He started after we had him on the podcast.
September 29th was the first commit.
Oh, okay.
Yeah, so we had him over the summer, I think, right?
Yeah, certainly before CppCon.
Yeah.
Well, we'll put this link to this library in the show notes
in addition to the link to the blog post
if anyone wants to check out the
type_safe library.
Yeah. Okay, and the last one is Routing Paths in IncludeOS: From JavaScript to
C++. Jason, do you want to introduce this one? I know you're a little bit more familiar with IncludeOS at this point.
Yeah, so IncludeOS is, you know, well, we've had them on the show and talked about it, right?
Yeah.
But you can write a C++ application that basically compiles to something that's bootable, and then boot that up in VirtualBox or QEMU or whatever.
And you can just make an OS from a C++ file.
Really easy.
It's awesome.
And one of their example uses is as a standalone web server, Mana, I guess they're calling it.
And what they have here is an example for how to do normal URL routing inside of your IncludeOS-written
C++ web server. And it looks, I mean, it looks like what you would expect to be able to do
with something like Apache. Right.
And, oh man, I forget what that plugin is called in Apache.
But whatever, the standard kind of URL routing stuff that Apache can do.
Okay.
Although it's apparently a JavaScript library, so I guess you can do it in JavaScript also.
Oh, I guess that explains the title, going from JavaScript to C++. Yeah, they ported path-to-regexp,
a popular JavaScript library, into C++ regex.
Very cool.
For IncludeOS.
Okay, well, let's start talking to Klaus about Blaze.
Where should we get started?
Just want to give us an overview on the library?
Yeah, why not?
So Blaze is, first of all, a high-performance C++ linear algebra library.
So the idea is, you have some vectors, some matrices, you put them together in whatever operation you want,
and hopefully you get the maximum performance out of it.
So, I know there's a handful of libraries. We'll probably discuss this later on.
But the primary intent is to have a library that really gets the maximum out of a CPU.
So why, I mean, you just said that there's a handful of other libraries.
Why did you start working on Blaze? What drove you to work on that?
So now we have to go back in time, approximately 2010.
So by that time, we already had multi-core CPUs. And I was looking for a math library and found that
there were a lot of them, so not
as many as today, but quite
a few. But none of them were parallel.
So none of them provided
any kind of
real parallelism, two cores, four cores,
everything was single core. There were a few
exceptions, of course.
A couple of libraries had a few operations that were parallelized,
but none of them were really parallel.
And I also did a couple of performance comparisons
and found that, well, I wasn't completely satisfied.
And so I took on a new hobby project.
This is what it was in the beginning.
I started this in about 2010,
and in 2012 I published the first version.
Admittedly, this version was not parallel, but in the 1.x releases, all the features
that were necessary to do some parallelization were added, and 2.0 was really parallel. Not just
a little parallel; we tried to be completely parallel, everything that made sense, in a way
that was also, well, perhaps new. So we initially had OpenMP and Boost threads and C++11 threads,
so a couple of kinds of parallelization, whatever somebody wanted to use.
So what was your personal interest in a linear algebra library in the first place? Was this hobby or research-related?
Initially, it was probably more hobby than research-related.
So as Rob said from my bio, I was in HPC for quite some time.
High performance was always important.
But the linear algebra stuff was something I was personally interested in.
So I just tried out initially what you can do, and it grew and grew.
I found it was fun, and at some point I just decided,
okay, now it's probably good enough to show to others.
I don't think I've looked at any linear algebra since my college days.
What are the sorts of use cases for a C++ linear algebra
library?
There's quite a lot.
Whenever you have
vectors and matrices, you want to do something
with them. So vector additions
to matrix multiplications,
inverting a matrix, transposing matrices,
whatever. Then such a
library is useful because
of course, you can
write some of these operations very easily.
Vector addition is one
for loop, reading from
two vectors, adding the values,
writing it to a third vector.
But you would have to think about
parallelization, about proper alignment
of data, about vectorization
and all this stuff.
Using such a library,
you can just use whatever is in there
very easily, very conveniently. You can basically write your operations
as you would write them in a math book. So A is equal to
B times C or whatever. That's convenient.
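(To make that concrete, here is roughly what the two styles look like. A small sketch; the Blaze names DynamicVector and DynamicMatrix follow the library's documentation, but treat the details as illustrative:)

    #include <blaze/Blaze.h>
    #include <cstddef>
    #include <vector>

    int main() {
        const std::size_t n = 1000;

        // Hand-written vector addition: one for loop, reading from two
        // vectors and writing to a third. Alignment, vectorization and
        // parallelization would all be your problem.
        std::vector<double> x(n, 1.0), y(n, 2.0), z(n);
        for (std::size_t i = 0; i < n; ++i)
            z[i] = x[i] + y[i];

        // With the library, operations read like a math book, and the
        // optimization happens inside.
        blaze::DynamicVector<double> a(n, 1.0), b(n, 2.0);
        blaze::DynamicVector<double> c = a + b;

        blaze::DynamicMatrix<double> B(50UL, 50UL, 1.0), C(50UL, 50UL, 2.0);
        blaze::DynamicMatrix<double> A = B * C;   // A is equal to B times C
    }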
So I guess, well, I guess I'm going to jump ahead a little bit here
for what, but you just said, so it's a template expression library, right?
Correct. It's header only, based on expression templates like most of these libraries.
You essentially only include one header file, and then can start doing some programming.
So let's start in the, you've been working on this since before C++ 2011.
Correct.
How much pain was it to specify, say, the return types of your expressions without being
able to do auto return types?
So all this pain is not entirely gone. So we have now made the transition to C++14,
and you still see a lot of the extra work we had to do without the auto keyword.
So there's a lot of traits classes that contain the return types,
and you can always ask these trait classes,
okay, what would be the result of this particular expression?
So decltype was also not available, of course.
Today you would just use decltype:
give me the return type of this expression.
We did this manually.
It's quite a number of classes that just encapsulate these return types.
I've been...
No, no, go ahead, please.
So it's not particularly nice, I know.
But, okay, it didn't work in any other way back then.
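(The general shape of that pre-C++11 workaround, sketched with hypothetical names; Blaze's real trait classes differ in their details:)

    // Stand-in vector type for the sketch.
    template <typename T> class DynamicVector { /* ... */ };

    // A trait class hand-encodes the result type of an expression,
    // because decltype and auto were not available in C++98.
    template <typename T1, typename T2>
    struct AddTrait;  // primary template, specialized per combination

    // Vector + vector yields a vector of the same element type.
    template <typename T>
    struct AddTrait< DynamicVector<T>, DynamicVector<T> > {
        typedef DynamicVector<T> Type;
    };

    // Client code asks the trait instead of writing decltype(a + b):
    //   typename AddTrait<VT1, VT2>::Type result = a + b;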
I've just been pondering recently
how things like auto return types
would appear to enable new types of expressions
and APIs that simply were very nearly impossible before.
And I'm wondering if you've experienced
maybe being able to do new things
now that you are using C++11 and 14.
Yeah, I completely agree.
So we were stuck a little without C++11,
but since we have now made the transition,
we, for instance, could very easily add fused multiply-add.
So I'll explain it in a simple way.
So fused multiply-add is an operation where one addition and one multiplication are done together.
One intrinsic operation.
This is, of course, if you have additions and multiplications at the same time,
in the optimal case, a factor of two in performance. However, you do not want to
sprinkle FMA calls all over your code.
You would rather,
like in other expression template code,
write some addition and some multiplication,
and the compiler or the system
should realize,
oh, I can combine these two,
make an FMA operation out of it.
And now with the auto return type, what we did is just to implement FMA exactly like this.
So now we have expression templates within expression templates,
kind of a multi-layered expression template approach.
The other one is still the old one, the traditional one.
The inner one, however, is building on the auto return type.
So you just return some expression objects.
The auto-keyword helps here.
No extra work.
And then the system assembles this to find,
oh, now I can combine an addition and a multiplication and call FMA instead.
Cool.
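(A rough sketch of the mechanism: operators return small expression objects, auto deduces their types, and a later overload can recognize an add-of-a-multiply and lower it to one fused operation. All names here are hypothetical; this is not Blaze's internal design:)

    #include <cmath>

    struct Scalar { double v; };
    inline double eval(Scalar s) { return s.v; }

    // Expression objects, stored by value for simplicity.
    template <typename L, typename R> struct MulExpr { L lhs; R rhs; };
    template <typename L, typename R> struct AddExpr { L lhs; R rhs; };

    template <typename L, typename R>
    auto operator*(L l, R r) { return MulExpr<L, R>{l, r}; }

    template <typename L, typename R>
    auto operator+(L l, R r) { return AddExpr<L, R>{l, r}; }

    // Generic fallback evaluation.
    template <typename L, typename R>
    double eval(const MulExpr<L, R>& e) { return eval(e.lhs) * eval(e.rhs); }
    template <typename L, typename R>
    double eval(const AddExpr<L, R>& e) { return eval(e.lhs) + eval(e.rhs); }

    // The interesting overload: (a * b) + c becomes one fused
    // multiply-add instead of two separate operations.
    template <typename A, typename B, typename C>
    double eval(const AddExpr<MulExpr<A, B>, C>& e) {
        return std::fma(eval(e.lhs.lhs), eval(e.lhs.rhs), eval(e.rhs));
    }

    // Usage: auto expr = Scalar{2} * Scalar{3} + Scalar{4};
    //        double r  = eval(expr);  // dispatches to the fused overload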
So you recently announced Blaze 3.0.
What are some of the recent changes in the new version?
So in the one-point and two-point versions,
we used C++ 98 primarily.
One of the reasons was, okay,
back then C++ 11 was not available everywhere,
especially on these HPC machines.
We wanted everybody to be able to use the library.
However, yeah, as just explained, C++11 and C++14 just provide so many more possibilities that we decided now it's time.
Now there should be a new compiler available.
And so the major change is truly the transition from C++98 to C++14. We jumped over C++11 because, well,
we would have been stuck with C++11
if we had announced that Blaze 3.0 uses C++11 only.
But now we have 14 and can use it for quite some time.
So I consider this some kind of breaking change.
People want to have the guarantee
that they can use a compiler for a long time.
So 3.0 will be, or 3.x will be C++14.
This is a guarantee.
Okay.
So speaking of compilers, what compilers does Blaze work with?
It should hopefully work with all C++14 compilers.
We do a lot of testing.
We explicitly test GCC, Clang,
the Intel compiler, and Visual Studio.
Okay.
Of course, in all the versions
that provide C++14.
And this is actually a good thing to do
because all compilers are different.
You always find
these subtle differences.
I just recently realized that
in the Intel compiler,
the variable templates are not properly
implemented. And so, unfortunately
now I cannot use this feature, just
because I want to provide
C++14
compatibility with all compilers.
So you have to really test all of them.
I'm curious from a library
maintainer perspective, if you
are supporting the compilers that say they support C++14,
or are you supporting the compilers that support C++1y?
You know what I mean?
Like the ones pre-standard 14?
So we tested a couple of 1y compilers,
but for instance, GCC 4.8 gave us trouble.
Okay.
So we basically support every real C++14 compiler.
Okay.
I'd like to interrupt the discussion for just a moment
to bring you a word from our sponsors.
CLion is a cross-platform IDE for C and C++ from JetBrains.
It relies on the well-known CMake build system
and offers lots of goodies and smartness
that can make your life a lot easier. CLion natively supports C and C++, including C++11
standard, libc++, and boost. You can instantly navigate to a symbol's declaration or usages too.
And whenever you use CLion's code refactorings, you can be sure your changes are applied safely
throughout the whole code base. Perform unit testing with ease as CLion integrates with Google Test,
one of the most popular C++ testing frameworks,
and install one of the dozens of plugins like Vim Emulation Mode or Go Language Support.
Download the trial version and learn more at jb.gg slash cppcast dash clion.
How does Blaze compare to some of the other math libraries like Eigen?
Okay, that's a good question.
So, of course, the idea is the same.
Provide some operations in such a way that users can very easily use them,
and you should get the maximum performance out of it.
Since you mentioned Eigen, Eigen is definitely much ahead in terms of features.
Blaze is younger, we do not have that many developers, and so we are lacking a lot of features that Eigen already provides or has provided for years.
However, I think Blaze is even more focused on performance. So really the idea is every operation, whatever you use,
even if it's something that's very, very rarely used,
should run as efficiently as possible.
And additionally, I think this is a unique feature,
is the parallelization.
Everything is running in parallel.
So, okay, I should perhaps also
explain the "everything".
It's not really everything.
For instance, what is not parallel
is sparse vector addition.
Sparse vector addition just does not
parallelize well; there is no
performance benefit.
So I have to say: everything that benefits
from parallelization is running in parallel.
I think this is something that only a few libraries, if at all, offer.
Eigen, for instance, is focusing the parallelization only on the dense matrix multiplication.
And a couple of other operations.
It's very focused.
So how do you make the determination as to whether or not something is worth parallelizing?
Okay, that's also a good question.
Of course, for very small
vectors, matrices, parallelization doesn't
work well, so you would
reduce performance instead of increase
it. So there
is always a threshold involved.
If a certain threshold
(and the threshold involves the size of
the vectors or matrices)
is exceeded,
then the parallelization kicks in.
There's a configuration file for that.
You can just manually adapt these configuration thresholds for your system
to really get the maximum out of it.
So far, it's set to, hopefully, reasonable values.
So, of course, it's not parallelizing for tiny and small vectors, matrices, and
it's, of course, parallelizing for large ones. And somewhere in the middle range, it has
the switching. So, in some cases, it depends on the data types that you use. In some cases,
everything can be done at compile time. You know the sizes at compile time, so the decision
whether to parallelize or not can be done at compile time.
In some other cases, especially if you have dynamic sizes, of course, there's a small runtime check.
Is it above the threshold? If yes, okay, then call the parallel kernels, else the serial kernels.
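(In code, the runtime side of that decision looks something like this. A sketch: the threshold constant and kernel names are made up, not Blaze's actual configuration:)

    #include <cstddef>
    #include <vector>

    // Hypothetical threshold; in Blaze this lives in a configuration
    // file so it can be tuned per system.
    constexpr std::size_t parallelThreshold = 10000UL;

    void addSerial  (std::vector<double>&, const std::vector<double>&,
                     const std::vector<double>&);   // single-core kernel
    void addParallel(std::vector<double>&, const std::vector<double>&,
                     const std::vector<double>&);   // multi-threaded kernel

    // Small runtime check: only pay for threads above the threshold.
    void add(std::vector<double>& r, const std::vector<double>& a,
             const std::vector<double>& b) {
        if (a.size() >= parallelThreshold)
            addParallel(r, a, b);
        else
            addSerial(r, a, b);
    }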
So do you find that that threshold for when parallelization makes sense is determined by the compiler being used at all?
It's probably mostly affected by the system you're running on, by the architecture.
How much cache do I have? How many CPUs do I have available? How fast is the CPU?
Those are probably the main factors.
From a compiler point of view, I have to admit,
I did not find a lot of difference.
So small differences definitely,
but not differences that would drive me to change these thresholds or to make them compiler-dependent.
So you're talking about parallelization,
both like you do vectorization of the data,
like at the CPU level and at the thread level, right?
Correct. So the vectorization means that I use SSE, AVX, also AVX-512,
the upcoming AVX standard for the new Xeon CPUs.
I think the name is Skylake for the new Xeons.
This is basically done just
by the compiler. During compilation
either the compiler figures out
what architecture you're building for
or you explicitly state, I want to use SSE.
I want to use AVX.
And then internally, the corresponding
vectorization is used. So the
code is generated for this kind
of vectorization.
It's just...
Do you have to do anything?
You're not using compiler intrinsics,
you seem to be saying.
I use intrinsics,
but they're wrapped accordingly.
Oh, okay.
Yeah, so there's a small sub-library.
Depending on which kind of vectorization you choose,
the corresponding functionality is enabled.
But the interface is all the same.
The difference is just how big those packs are.
A pack of doubles: is it 2, 4, or 8? It depends on the vectorization.
Okay.
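(Conceptually, the wrapping looks something like this; a hypothetical sketch, not Blaze's real SIMD sub-library. The interface stays identical, only the pack width changes with the instruction set chosen at compile time:)

    #include <immintrin.h>

    #if defined(__AVX__)
    struct PackD {                                  // 4 doubles per pack
        __m256d v;
        static constexpr int size = 4;
        static PackD load(const double* p) { return { _mm256_loadu_pd(p) }; }
        void store(double* p) const { _mm256_storeu_pd(p, v); }
        friend PackD operator+(PackD a, PackD b) {
            return { _mm256_add_pd(a.v, b.v) };
        }
    };
    #else
    struct PackD {                                  // 2 doubles per pack
        __m128d v;
        static constexpr int size = 2;
        static PackD load(const double* p) { return { _mm_loadu_pd(p) }; }
        void store(double* p) const { _mm_storeu_pd(p, v); }
        friend PackD operator+(PackD a, PackD b) {
            return { _mm_add_pd(a.v, b.v) };
        }
    };
    #endif

Kernels are then written once against PackD and get SSE or AVX code depending on what the compiler is told to target.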
And then the parallelization is
mainly
data parallelism, which means
you try to improve the performance
of a single operation as much as possible.
And there's a wrapper for that too.
So if I find that something can be parallelized,
then I distribute the data to some threads.
Then I give the threads a task.
Okay, you compute this part, you compute that part.
And then within each thread,
the single-core kernel is running, essentially.
So I don't know if it's clear.
So at one point, it says distribute the work,
define smaller work units,
and then the smaller work units are then handled by a single core.
Okay.
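(A stripped-down sketch of that split, using std::thread directly; this illustrates the idea, not Blaze's actual scheduling code:)

    #include <algorithm>
    #include <cstddef>
    #include <thread>
    #include <vector>

    // Split [0, n) into one chunk per hardware thread; each thread
    // then runs the plain single-core kernel on its own sub-range.
    void parallelAdd(double* z, const double* x, const double* y,
                     std::size_t n) {
        const std::size_t nthreads =
            std::max<std::size_t>(1, std::thread::hardware_concurrency());
        const std::size_t chunk = (n + nthreads - 1) / nthreads;

        std::vector<std::thread> workers;
        for (std::size_t t = 0; t < nthreads; ++t) {
            const std::size_t begin = t * chunk;
            const std::size_t end   = std::min(n, begin + chunk);
            workers.emplace_back([=] {
                for (std::size_t i = begin; i < end; ++i)  // serial kernel
                    z[i] = x[i] + y[i];
            });
        }
        for (auto& w : workers) w.join();
    }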
So are you following along with the C++17 parallel standard algorithms work that's being done?
A little, yes.
Do you think that that's something that you would be able to make use of in your library?
Probably to some extent.
I'm curious how they handled the...
So this is what I don't know, the detail,
how they handled the size,
because a small number of elements should not be vectorized,
sorry, parallelized, whereas large numbers should be parallelized.
This is something I'm very curious to see how this works.
So essentially, when does the parallelization kick in there, in that case.
I believe it's, I don't know if it's an optional parameter
or not, but I believe there is a parameter to the
parallel versions of the algorithms to say
whether or not you want it to be parallel.
So maybe it's user-defined.
Okay. And it's easier
than in my case. In my case, it's...
Yeah. It's not user-defined,
it's system-defined. Right.
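(For reference, what landed in C++17 passes the policy as an explicit first argument; this is the standard std::execution API in a small sketch:)

    #include <algorithm>
    #include <execution>
    #include <vector>

    int main() {
        std::vector<double> v(1000000, 1.0);

        // The caller picks std::execution::seq, par, or par_unseq;
        // the standard itself imposes no size threshold.
        std::for_each(std::execution::par, v.begin(), v.end(),
                      [](double& x) { x *= 2.0; });
    }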
So are you currently writing all this
parallelization code yourself? You're not making
use of some other library like
HPX? I'm using OpenMP.
This is basically something you have
to do yourself, but
it's mainly pragmas. Pragmas around
for loops in my case. I make use
of C++11 threads and also boost threads
because they're so similar.
And HPX is on the
to-do list.
HPX is a little different from a
parallelization point of view, from a
paradigm point of view,
and I definitely want to try it.
I didn't have time to
include it yet, unfortunately.
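(For reference, the traditional OpenMP style he mentions really is just a pragma around the loop; a minimal sketch, compiled with -fopenmp on GCC/Clang or /openmp on MSVC:)

    #include <cstddef>

    void add(double* z, const double* x, const double* y, std::size_t n) {
        // OpenMP splits the iterations across the available threads.
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n); ++i)
            z[i] = x[i] + y[i];
    }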
HPX is probably very close to what you'll get
from any future
parallelization in the standard library, I would think.
Right, Jason? From the algorithm
standpoint, but they also have their
lightweight threads,
fibers, or whatever
they are called.
Because I know in HPX, they said you can run
thousands of threads, and it only
actually uses so many hardware threads
depending on what it needs to do.
So you gave a...
Oh, go ahead.
I would be very, very interested in seeing some comparison
to the traditional approach.
So OpenMP is what I would say is traditional.
HBX is a little different.
And I think they should compare pretty well.
You gave a talk at CppCon, right?
Correct, yes.
How did that go?
So my impression was it was fine, it was working well.
Unfortunately, there were only about 20 people there. This was a little disappointing, given that there are,
of course, 950 to 1,000 participants. However, the talk was at nine o'clock in the morning, and it was
about linear algebra, and so I should not have expected a lot of people to come.
Still, I think it was an interesting talk.
I hope that people watching it on YouTube will agree.
I essentially showed what you can do with Blaze,
showed a couple of performance comparisons,
a couple of graphs with the most common operations.
So, of course, dense arithmetic,
but also a couple of sparse benchmarks.
Okay.
How does the library compare in ease of use compared to something like Eigen or other math libraries?
Is it a similar API?
Very similar.
Okay.
So basically, the algebra operations are always using
the infix notation of operators,
so plus, minus, times
that's very very
similar. It's a little different in
those operations that cannot be directly
expressed in terms of operators
like matrix inversion
and creating
views on some vectors
or matrices
If I now say it's easier to use,
it's of course a very subjective view.
I hope it's easier.
Of course,
that is
in the eye of the beholder.
This started as a side
project, open source project for you.
Correct.
If I understand right, you've had, let's
say, a very rare
success for your open source project,
because it got noticed by your employer. Is that correct?
That is correct. So I live in Erlangen, as you said, and very close to Erlangen, in Nuremberg, is
CD-adapco. So that's the one office they have in Germany. And I was lucky that they actually
realized that the person who's writing this library lives close by.
And so they invited me, I
talked about it, and they gave me
a job. This was truly
a very, very positive,
very lucky
thing for me. Not very
many open source developers get that
opportunity, I don't think.
I have no idea, but
as I said, I was truly lucky.
So I'm guessing you get to
use your library at work?
Yes. So it's now part of
the major product from
CD-adapco called STAR-CCM+.
It's not used within the entire
product, but in a large portion already.
And there's the discussion
to use it in other places too. So it's working pretty well.
Yeah, so it's definitely a success then, I would say.
It is a success, of course. Yeah.
Are you aware of other products or companies that are making use of it currently?
Not yet, unfortunately. So I know a couple that use it, probably just institutes that I know better,
that know me, and so they use it.
But I'm not aware of larger companies
except for CD-adapco that use it.
This would be actually something I would be very interested in,
and I would be willing to even put a couple of
advertisement logos on the webpage.
But yeah, if people feel it's a good library and they use it, just send me an email.
There's a possibility you'll get some emails after this airs.
Hopefully.
Do you have a lot of other contributors aside from yourself working on the library?
So from an
implementation point of view, I'm the one
who is doing most of the work.
I have help in the form of
a guy who is dealing with
the Windows port.
Visual Studio is always a little more tricky
apparently, and he helps me with this.
And I have a consultant
from the Erlangen computing center
who is helping with performance
stuff. So if I run into performance trouble,
I go there, we discuss
what's happening in memory,
we take a look at the assembly code together,
etc. And usually we come up with
something that runs better.
And there's a couple of smaller
contributions from people
who wanted to add just one thing.
So that's it.
So you said the Windows port's always just a little bit tricky.
Do you find yourself having to make compromises
for what features you can use in C++14
because of other compilers?
Sometimes, yes.
Sometimes I have to revert a couple of things
because one compiler does not accept it.
And probably the biggest one was that I found that in the Intel compiler,
versions 15, 16, even 17, variable templates do not work as expected.
So I was implementing variable templates for basically all type traits,
all the additional type traits that I have.
But unfortunately, I committed everything,
and then after that, tested it with Intel Compiler
and found it doesn't work.
This was a little disappointing.
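(The feature in question is small but convenient. This is the standard C++14 pattern, with a hypothetical trait name:)

    #include <type_traits>

    // A classic C++98-style type trait...
    template <typename T>
    struct IsNumeric : std::is_arithmetic<T> {};

    // ...and the C++14 variable-template shorthand that did not work
    // reliably in the Intel compilers he tested.
    template <typename T>
    constexpr bool IsNumeric_v = IsNumeric<T>::value;

    static_assert(IsNumeric_v<double>, "double is numeric");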
So yes, I have to compromise to be able to provide something
that works with all the compilers.
But I think still it's worth the trouble, of course.
If you only provide something for one particular compiler,
people might not be particularly happy.
And also, something
like the Intel compiler is very, very
common in the HPC community.
So I cannot say, oh, I don't
support Intel anymore. This doesn't work.
That's a good
point. You mentioned earlier that
one reason you started with C++98
is that supercomputing
clusters tend to have older compilers on them.
And I've seen tweets about this recently also from, I think Bryce mentioned it or something.
Someone did on Twitter recently.
And so are they updating now?
So now you, you know, what's the story like?
Of course, I don't know all the big machines,
but the ones that I know, mainly the ones within Germany,
they have by now a compiler that is able to compile with C++14.
So they try to update, and they try to provide more compilers.
This might not be true for all machines, of course,
but the ones that use Intel architecture,
they usually also have at least one compiler that can do C++14.
So in many cases, it's GCC6.
Oh, so very recent then.
Pretty recent.
Less frequently a Clang compiler,
but often also Intel, of course.
Intel 15 is
kind of deployed everywhere.
Probably by now
16, perhaps even 17.
There's a couple of machines with IBM
architecture, and IBM has a specific
compiler for this architecture.
This might be something that now
has problems.
But, okay.
So does IBM...
IBM actually maintains their own C++ compiler?
I don't know to what extent,
but they have a specific compiler
for their
architecture. I don't know if
they write a C++ compiler on their own.
If it's completely
IBM,
perhaps it's just a wrapper for something else.
Right.
Okay. But for those people who cannot use 3.0, I mean C++14, they can still use 2.6.
I tried to add a lot to 2.6 before making the transition
to C++14. It should be a good point to start
until, at some point, C++14 is also available on these machines.
Do you find yourself still having to
go back and do bug fixes on your 2.6
branch?
I'm perhaps proud to say there are very, very
few bugs in Blaze.
We do a lot of testing. We have an enormous
testing effort.
About one month before the release
of a version, we start to do
testing.
This is a really large
effort: all the compilers,
with about 6,000
tests that each run a couple of hundred
operations.
Of course, a template library
is difficult in this regard,
because you can combine everything with everything,
but I think
we have found a good
compromise.
So we get very few bug reports,
which hopefully is not because so few people use it.
It's hopefully because it's just working properly.
Right.
Do you have a sense of how many people are using it
compared to other math libraries?
Unfortunately, I only know the number of downloads.
So it's usually in the range of 400 to 600 each version.
This is, it's reasonable.
Okay.
All right.
You have any other questions, Jason?
No, I don't think so.
Okay.
Klaus, where can people find more information about you
and more information about Blaze online?
So people can always write me an email to klaus.iglberger at gmail.com.
That's the usual way to comment directly.
And if you want to find Blaze online, just Google for Blaze C++ Math Library,
and you will definitely find it as the first hit.
Okay. Well, thank you so much for your time today, Klaus.
Yeah, thank you for the invitation. I really appreciate this. Thank you.
Sure. Thanks for joining us.
Thanks so much for listening
in as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff you're
interested in. Or, if you have a suggestion
for a topic, I'd love to hear about that too.
You can email all your thoughts to
feedback at cppcast.com. I'd also appreciate if you like CppCast on Facebook and follow CppCast
on Twitter. You can also follow me at robwirving and Jason at lefticus on Twitter.
And of course, you can find all that info and the show notes on the podcast website
at cppcast.com. Theme music for this episode is provided by podcastthemes.com.