CppCast - Type Erasure, SIMD-Within-a-Register and more
Episode Date: October 20, 2024
Eduardo Madrid joins Phil and Timur. Eduardo talks to us about the Zoo libraries, including his advanced type-erasure library, as well as the SWAR library, which simulates ad-hoc SIMD within a register. We also discuss how he has taken inspiration and cues from the worlds of biology and physics to arrive at new thinking around software development, design and architecture.
News
Qt 6.8 is released
"Named Loops" proposal adopted into C - will C++ follow?
C++ Online Call for Speakers is open
Links
The Zoo libraries
"C++ Software Design" (book) - Klaus Iglberger
Klaus Iglberger's talks on Type Erasure: "A Design Analysis", "The Implementation Details"
(Some of) Ed's talks:
"Using Integers as Arrays of Bitfields a.k.a. SWAR Techniques" - CppCon 2019
"Rehashing Hash Tables And Associative Containers" - C++ Now 2022
"Empowerment with the C++ Generic Programming Paradigm" - C++ Online 2024
Transcript
Episode 392 of CppCast with guest Eduardo Madrid, recorded 14th of October 2024.
In this episode, we talk about Qt 6.8 and about named loops.
Then we are joined by Eduardo Madrid.
Eduardo talks to us about the Zoo libraries and the electromagnetic wave metaphor.
Welcome to episode 392 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Timur Doumler, joined by my co-host, Phil Nash.
Phil, how are you doing today?
I'm all right, Timur.
How are you doing?
I'm good.
Very busy.
What else?
I'm going to Zurich this weekend to speak at the Zurich C++ Meetup.
So going to see our friend Sergei Platonov, who is organizing it.
And Guy Davidson is also going to be speaking there.
So it's going to be a little bit of a reunion with friends,
but also hopefully a great C++ meetup.
Yeah, and then I guess there's going to be more travel in November.
Yeah, it's busy.
I don't know.
Same with you, Phil, right?
What's up?
Where are you right now, actually?
I can see that you're in some hotel room.
What's going on?
I am, yes.
I'm in Italy at the moment. I'm doing some training here this week, and I was in the Netherlands last
week at C++ Under the Sea, as we talked about before.
Yeah, it was very good. An excellent first-time conference. Very hard to believe it was our first year, actually.
Although, being only one day, we could have really done
two or three days of this. So I think they will be extending it in the future.
That's great. Well, hopefully I can make it next time. So what are you up to then after that?
So I'm in Italy this week, but after that, I've got no traveling for, I think,
five weeks. So that's going to be quite unprecedented.
Right. Well, maybe it's a chance for you to relax.
I hope so, yeah.
Right. So at the top of every episode, we'd like to read
a piece of feedback, and this time we received a comment, very short, and it says: "Please provide
a text transcript." So I'm not sure whether we actually already do that, because when I listen
back to our episodes on Spotify, I can actually see a transcript, but I have never tried to
download it or anything. Phil, is that a thing? Is that something people can do? Do you want to
talk about that a bit?
So what you're seeing, I think, is a Spotify feature. There's nothing
that we do for that. Well, we can produce a transcript. In fact, the software that we're
using right now to record on automatically produces a transcript as we record, so we do have access to that. And
of course, there are various tools now to do that, which could even provide a summary of the
whole podcast. It's getting quite interesting. And I have had plans
to do that for quite some time now. The trouble is, particularly because this is quite technical content, the hit rate on it is quite low.
There's a lot of mistakes, typos, things you didn't quite hear, right?
Yeah, I noticed that whenever we say words like constexpr or noexcept or whatever, any kind of C++ terminology. SFINAE is also one of those.
Like any tool just gets it hilariously wrong and the results are quite entertaining.
Sometimes not safe for work.
I've seen that before.
And then there is the problem of second learners.
Yeah.
So I think the ideal solution is to make it available in a format where we can actually crowdsource the corrections.
So I'm thinking of some sort of like a GitHub repo where each show has at least one text file,
ideally maybe split up into many, and then people can submit pull requests to make corrections,
and that should spread the load a bit.
Obviously, that's going to take a little bit of work to put together, so we haven't done that yet,
but I do want to do that.
So excellent suggestion.
And I think that will then help
to make the content more searchable.
You want to find something specific,
you can search right into the text.
So as well as just being able to read it.
All right.
So we'd like to hear your thoughts about the show
and you can always reach out to us
on X, Mastodon, LinkedIn,
or email us at feedback@cppcast.com.
Joining us today is Eduardo Madrid.
Eduardo is the main author of the Zoo Libraries, including the Type Erasure Framework and the
SWAR Library that put together unique innovations he has been sharing with the community over
the years through conference presentations.
He has been designing and implementing software architecture for application areas such as
automated trading and high-frequency trading, Snapchat, and high-tech finance such as Bloomberg.
Eduardo comes from South America and is a Bitcoiner.
Most recently, he has developed an interest in the emergence of complex behavior and assembly theory and is now applying these fields to the architecting of software libraries.
Eduardo, welcome to the show.
Hi, I'm very happy to be here.
Now, Eduardo, I've known you a few years now, so as well as your bio, I know a few other things
you've been working on. A lot of it seems to be inspired by evolutionary theory. Is there any part
of evolution that hasn't led to software architecture?
Oh, thank you for the question. I've done a class on artificial intelligence,
but this is the classical artificial intelligence that had things like, for example, genetic algorithms,
planning, constraint satisfaction problems.
And back in the day, we talked about expert systems
and all that stuff that is looking, sounding obsolete nowadays
because the techniques of artificial intelligence based on neural networks became so dominant.
Unfortunately, of all things, that's the part that interested me the least
when I was doing artificial intelligence back in the day.
But I think that there are lessons from evolutionary biology,
abstract biology, applicable to software engineering, yes.
Yeah, absolutely.
And I found what you've been talking about there very interesting.
We're going to dig into some of that a little bit later.
Yeah, so we're definitely going to dig into that a bit later.
But before we get to that, we have a couple of news articles to talk about.
So feel free to comment on any of these, okay?
So the first one is not really a news item.
It's just a little addition to what we mentioned last time when I said that the CppCon keynotes have been posted online,
but one was missing, the one by Amanda Rousseau about safety and security called
Embracing an Adversarial Mindset
for C++ Security.
Actually, that had been posted. I just
seemed to have overlooked the link, but we did
put the link into the show notes.
So it has been up there since
two weeks ago. I just want to
re-emphasize that, yes, all
CppCon keynotes actually are
online. I got that wrong last time.
So apologies for that.
The next piece of news is that Qt, a major cross-platform application framework written
in C++, has a new release, Qt 6.8, which has just been released.
It's an LTS version, long-term support.
So it's like a major release, I guess.
And it has a lot of new stuff
in it. So if you do Qt apps on mobile, it now supports iOS 18 and Android 14. They now support
Windows on ARM, which is interesting. It seems like this platform is gaining more acceptance,
which actually completes support for ARM across all the platforms that Qt supports.
You can also now create applications
for Apple Vision Pro and Meta Quest 3 XR headsets.
You can now write your own virtual reality headset software
with Qt. There's also Raspberry Pi 5, NVIDIA AGX Orin,
and a bunch of new embedded platforms.
So they're really extending what kind of platforms they're targeting
with their framework.
There's a bunch of modules
that were under technology preview
that have now been completed
and are now basically fully functional
and available and part of the main feature set,
including Qt Graphs,
which lets you do really cool
like 2D and 3D graphics stuff,
Qt HTTP Server and Qt gRPC.
So they are fully supported from this release on.
There's also new options to reduce the binary size of your Qt app,
which kind of echoes one thing that I felt.
You had an episode about that while I was on parental leave, didn't you?
How that's really important.
So Qt offers new features for that now.
Yeah, we had Sandor Dargo on
and that seems to be coming up a lot.
Seems to be the time for everyone
to concentrate on binary sizes.
So that was quite timely, I think.
So obviously every time there's a major release
from a big framework like Qt,
there's a very, very long list of features,
which I'm not going to go through,
but it seems like there's a lot of stuff in there.
So if you're doing cross-platform application development
with C++,
it seems like this is something you could check out.
Okay, so I have one more news item, which is actually not from C++ but from the C committee. The good old C language is also evolving. They also
occasionally put out new standards. And they have now approved, for the next version of C, which
I don't think we know yet when it will come out,
but whenever it does,
they have approved a new feature for it called named loops,
which is something that has been in discussion for a while.
Apparently that proposal was actually somewhat controversial as well,
but they have now got it in.
So the way it works is that you can write a label.
So any like identifier followed by a colon,
immediately before a switch or a for loop.
And then you can write things like break label or continue label to break or continue
that specific loop or switch,
which is very handy if you have nested such loops.
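To make the description above concrete, here is a minimal sketch in C++. Since named loops are not part of C++ today, the code uses the traditional `goto` workaround, and the labeled-break form appears only in a comment as an approximation of the syntax described in the episode. The function name `find_cell` and the 3x3 grid are assumptions made for illustration.

```cpp
#include <utility>

// Returns the (row, column) of the first cell equal to `target`,
// or {-1, -1} when no cell matches. `find_cell` is a hypothetical helper.
std::pair<int, int> find_cell(const int (&grid)[3][3], int target) {
    std::pair<int, int> found{-1, -1};
    // With named loops as described for C, the same logic would look roughly like:
    //   outer:
    //   for (...) { for (...) { if (match) { ...; break outer; } } }
    for (int row = 0; row < 3; ++row) {
        for (int col = 0; col < 3; ++col) {
            if (grid[row][col] == target) {
                found = {row, col};
                goto done;  // today's workaround: leave both loops at once
            }
        }
    }
done:
    return found;
}
```

Note that in the `goto` version the label sits after the loops, while in the named-loop form the label sits before the loop it names. That difference is the "detached label" concern raised later in the discussion.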
So yeah, I think there is a bit of an expectation
that C++ should adopt that as well,
so the two languages don't diverge.
So let's see if and when and how that's going to
happen. Probably not anymore
for C++26, but
I'm curious whether it's going to be
as controversial in C++ as it apparently
was in C, and
what you think about that.
To me, it baffles
me that it only took 50
years for C to acquire the capability of escaping nested loops.
That's something that is in one way or another present in just any other language, except C.
It is embarrassing that it is not in C and C++.
Yeah, I'm actually pretty sure Fortran had it
like literally 50 years ago.
Yeah.
No, I think everyone's been wanting this
for all of those 50 years,
that the trouble is getting people to agree on the syntax.
And I think that's really why it was controversial.
Not that we have the feature at all,
but just how you actually spell it.
There just seems to be no syntax where
everyone just says,
yeah, that's probably how we should do it.
Well, but that syntax is basically the same syntax that Perl uses.
You know, it's like many languages use that syntax.
I don't understand.
What's the controversial aspect of it?
I have to admit that I also don't know.
I just read on Reddit that it was apparently controversial.
I didn't dig deeper.
Maybe I should ask
one of the C people. Maybe they can tell me the story.
One thing that came out in that thread
was the fact that putting
the label before the
loop or the switch statement
made it sort of feel a bit detached from it,
as if you're doing a goto to
just before the loop. I think that was
the objection, that you're sort of using
a goto label, but
it's not actually going to the label.
Oh yeah, I can see how that's weird, but I guess
that's the only thing that syntactically
works, right? Yeah.
But at this point, if it's in C,
the pressure is really there to
just adopt the same syntax in C++
because the compilers are going to do it anyway.
Alright, so I believe, Phil, you have one more news item for us, don't you?
Yep.
So just like last time, I sneaked in one more,
which is another call for speakers,
this time for C++ Online.
That's going to be running again the end of February.
Didn't actually put the dates in,
but I think it's the last week in February we've got.
The call for speakers is now open.
So if you want to speak at an online conference,
so you can be from anywhere in the world,
but we're going to be running it at close to UK-time-zone-friendly times.
So if you can make that, then come and speak at C++ online.
All right.
So with that, we can move on to our main guest
and it's my great pleasure to again introduce Eduardo to the show.
Actually, we have known each other for a while as well.
I'm pretty sure we bumped into each other at my very first CppCon, which was almost a decade ago.
And I've been continuously talking about stuff since then.
So sorry, Ed, that it took so long to finally get you on the show.
But I'm very, very happy that you're here today.
I am very, very happy myself. And about my relation to CppCast as a listener, I must say that I
got to know Phil thanks to the interview at CppCast. I didn't know anything about Catch2, and I immediately adopted Catch2 after that
interview at CppCast. And wow, there are so many things that I have learned at CppCast that have
been very influential for me. On one occasion, I drove from where I still live in Los Angeles to CppCon when it was in
Bellevue. And the drive was a fantastic drive. It's very beautiful. And my entertainment was
to listen to CppCast the whole way through. So it was about 20 hours of CppCast episodes.
Very influential for me.
All right, well,
that's great to hear. Thank you very much for letting us know that. That means a lot. And obviously,
I guess a lot of that was when Rob and Jason were still running the show.
Yeah, naturally. But
you are the natural successors. I think that the spirit is evolving and keeping strong.
And I am very enthusiastic about things like that initiative that you guys are considering, to make crowdsourced transcripts that are searchable.
Wow, that's amazing, right?
We've got to do it again.
Yeah, you better get on that, Phil.
All right.
So, Ed, thank you so much for your kind words.
That really means a lot.
But I also want to talk about your work,
which is also really interesting,
which is why you're here today.
So I guess the most visible part of your work,
apart from the great conference talks
that you gave over the last years,
is your Zoo libraries, which are on GitHub.
Do you want to just briefly
give us a little overview of what they're
about and how they came about and who's
working on it?
Then we can maybe dig into the details a little bit
afterwards.
My pleasure.
The reason why I started this open source
project is that I couldn't fully apply the ideas that I had been gathering from experience, from my own entertainment and especially at work, of course. With trading, I have also had responsibilities over trading systems.
And just to give people an idea of the type of responsibility that that means,
for example, when I was much younger in my career,
there were four people at the team of IT executions,
and our team was doing all of the trading systems,
like 99% of the flows happened through our work.
And on a daily basis, around $250 million moved per trading day back in 2011, for example.
And if you divide that by four, then that'll be my daily trading responsibility,
right? So it was very important to be attentive to every single detail, no bugs, no performance deficiencies, reliability. It was very challenging and the responsibility was quite high. So I had these responsibilities and that is a privilege because I learned things that I couldn't have learned any other way.
Like I cannot set up an emulator or a simulator of a financial exchange at home, right?
To see what matters and what doesn't. And there are lessons from that experience that
I couldn't apply because of the constraints
of the normal work.
The only outlet for
exposing those ideas to
reality was to create
something outside of work.
And
I really like the social
aspect of open source, of manifesting to the public,
all right, please see these ideas that I am expressing in source code. I would like your
comments to be able to tell that to the community. So that's why I began to do an open source repository. And the first thing that I tackled in earnest was type erasure.
Okay.
So let's talk about that a little bit.
So you have a type erasure library in there.
So I guess, are we talking about the kind of type erasure that, for example, std::function does or std::any?
Or is that something else?
And what does your library do differently?
Why is it better than other approaches?
Yes, indeed.
We are talking about that kind of type erasure. A good starting point is to read the chapter on type erasure in the book by Klaus Iglberger, C++ Software Design.
It's an entire chapter.
It's something very sophisticated.
And the reason why it is sophisticated is because the interplay is a dance of many different design patterns cooperating.
So there is the external polymorphism design pattern, prototype, bridge, and some others.
But you can get the full description from Klaus.
And let me anticipate one thing that if we can talk about that,
that would be great, that in my opinion,
type erasure exhibits emergent behavior,
meaning that it has a synergy among the design patterns that it composes
that have properties that the individual
design patterns don't have.
Like, for example, the external runtime polymorphism design pattern has an inherent performance
drawback.
However, external polymorphism, as it happens within type erasure, does not necessarily
have that inherent performance drawback.
And the latter part, I am just going to say it.
It is a strong claim, but, well, the evidence is plentiful in that regard.
In terms of performance and modeling capabilities,
the framework that I wrote is superior to all others.
And it is independently superior in performance
and independently superior in modeling power.
So that's interesting.
I have to say, I haven't really used type erasure
in an advanced way.
Nobody who doesn't use my framework has ever done it.
So I'm just curious. I want to dig a little
bit deeper. So my kind of naive understanding of it, the way I've used it a bunch of times, is to
implement, I don't know, something like a value type that takes different types. Like when you're
implementing a JSON parser or whatever, you can have a variant, but you can also have
basically something like an any, right? It takes whatever. So usually what I've done
is you have kind of a pointer underneath, and you do it via constructors that take
different types, and then they create a kind of polymorphic type, and then you have
internally a vtable that dispatches to the right polymorphic type, but the interface basically
hides all of that stuff. Yeah, there seems to be much more to it than that.
So I'm just curious what I'm missing here.
Ooh, this is a long list.
Let's start with one really important thing, okay?
I believe that type erasure is the only reasonable way
to get value semantics with runtime polymorphism
both at the same time. That's something that you would find in Klaus's book, right? And again, you
can go drink from the source. That's an excellent book. The PDF is available. All of the things
that type erasure gives you in a conventional way, you can get from Klaus's book, okay?
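The conventional form of type erasure discussed in this exchange (a pointer underneath, templated constructors, an internal virtual table hidden behind a value-like interface) can be sketched as follows. This is a minimal textbook-style illustration, not the Zoo framework and not code from the book. The names `Shape`, `Circle`, `Square` and `area` are assumptions chosen for the example.

```cpp
#include <memory>
#include <utility>

// Hypothetical concrete types. Neither derives from any base class.
struct Circle { double radius; };
struct Square { double side; };

// Behavior is supplied by free functions (external polymorphism).
double area(const Circle& c) { return 3.0 * c.radius * c.radius; }  // crude pi, for brevity
double area(const Square& s) { return s.side * s.side; }

class Shape {
    struct Concept {  // internal interface, invisible to users
        virtual ~Concept() = default;
        virtual double area_() const = 0;
        virtual std::unique_ptr<Concept> clone() const = 0;  // prototype pattern
    };
    template <typename T>
    struct Model final : Concept {
        explicit Model(T v) : value(std::move(v)) {}
        double area_() const override { return area(value); }
        std::unique_ptr<Concept> clone() const override {
            return std::make_unique<Model>(*this);
        }
        T value;
    };
    std::unique_ptr<Concept> self_;  // bridge to the hidden implementation

public:
    template <typename T>
    Shape(T value) : self_(std::make_unique<Model<T>>(std::move(value))) {}

    // Deep copies give the wrapper value semantics.
    Shape(const Shape& other) : self_(other.self_->clone()) {}
    Shape& operator=(const Shape& other) {
        self_ = other.self_->clone();
        return *this;
    }
    Shape(Shape&&) noexcept = default;
    Shape& operator=(Shape&&) noexcept = default;

    friend double area(const Shape& s) { return s.self_->area_(); }
};
```

A `Shape` can be copied, assigned and stored in containers like an `int`, while the call to `area` is still dispatched at run time. In this simple sketch every copy allocates, and a moved-from `Shape` must not be used.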
He's done a talk on it as well.
Oh, yes, he has done several at this point in time.
I track his presentations,
and he has talked about type erasure several times.
So another important thing is the modeling power.
I am trying to introduce some
language to describe the features
of type erasure.
An important
advancement over
conventional runtime polymorphism using
inheritance and virtual overrides
or basically
object-oriented interfaces or an
improvement over object orientation
on type erasure with regards to runtime polymorphism is that it supports what I
call not only monophyletic subtyping relations, but also paraphyletic and
polyphyletic.
Let me give you an example of what I call a paraphyletic subtyping relation,
or maybe even polyphyletic.
Let's call it polyphyletic for the purposes of what I'm going to explain.
Let's say every time that you want to do something like serialization,
serialization and deserialization for an application or a framework,
you're going to be choosing among the different types
that participate in your application,
what are the ones that are going to be able to be persisted
and deserialized or recovered from storage, right?
So the process of you selecting the types
that you're going to be able to serialize
is the subtyping relation that you're introducing.
Subtyping as in the Liskov substitution principle.
And what I want people to see from this example
is that they are basically an ad hoc collection of types
that you're going to serialize.
Because that set is related only to the very specific needs of your application.
So classical object orientation
only gives you the option of
wrapping everything so that they
adhere to the object-oriented interface of serialization.
That the property becomes
structural, right? But that gives you
a lot of problems.
It is a world of hurt.
And let us escape from C++.
Let us talk about Java, for example.
This is a very asinine problem in Java
because of the dichotomy of polyphyletic subtyping relations
that cannot be supported directly
by the object orientation
as in Java.
You have these ridiculous classes
like the Integer objects
that internally have
a 32-bit integer,
but they are boxed
and have a lock
and have a lot of things
just to be able to participate in object-oriented interfaces.
That's one example of how pathetic those wrappers are.
And the reason why I mentioned these of polyphyletic,
paraphyletic, and monophyletic is because it's something
that I have never heard anybody else mentioning
with regards to type erasure.
And my framework directly supports all of these things.
So what exactly do you mean by polyphyletic and monophyletic?
Yeah, these three words come from biology.
So let's say, for example, mammals, okay?
Mammals, you can represent it with a base class
or an object-oriented interface.
And the properties that make mammals mammals
are common to all of us, right?
So in that regard, that is a monophyletic taxon, a monophyletic category,
a monophyletic hierarchy. Nothing surprising there, but you might have other categories,
like for example, marine mammals, okay? So marine mammals are polyphyletic with regards to being mammals
because you exclude things that are mammals but not marine mammals.
Even though all marine mammals have common ancestors,
mammals have species that are not marine in nature, okay?
Or the example of, oh, yeah.
So that's an example of polyphyletic.
And there are also examples of paraphyletic relations.
One example would be primates.
Like Homo sapiens is a primate that swims,
and that's quite specific to Homo sapiens.
Like, primates in general don't swim.
So that's an example of a paraphyletic subtyping relation.
So what I'm getting at is that the reality that you want to model in your software is far more complicated than what purely monophyletic subtyping relations would allow you to model.
So that's a very big limitation.
When you run into this problem while doing software engineering and the only resource that you have is interfaces in the object-oriented meaning, then you are in a world of hurt.
Rather than thinking of
hierarchies of interfaces,
we're thinking of clusters of
capabilities that could
be cross-cutting. Yes, yes, yes.
Very much so. Thank you
for that comment, because
that gets us closer to what
my framework does.
Maybe I should say what is
a subtyping relation before
it is too late.
The SOLID principles,
the L stands for
the Liskov substitution principle.
And a very good
example of substitution is
the concept of a telephone.
From a telephone, you would
expect it to be able to place and receive calls
at the very least.
But the realization of a telephone
can happen with a rotary telephone,
something that people who were born in this century
don't even know how to use.
And smartphones,
that even though they are orders of magnitude more complicated,
everybody and their dog know how to use a smartphone nowadays, right?
My dog does.
My dog definitely also does, yes.
When you want to use a telephone for its basic capabilities of placing or receiving a call,
then any of them would do.
Any of them are substitutable.
That's the essence of the Liskov substitution principle.
Of course, the realization of substitutability
happens in programming languages,
typically via subclassing, object-oriented interfaces.
It's a structural property of the types.
Type erasure frees you from requiring that structural relation.
So it is free form.
It is unconstrained, and therefore it has much more powerful modeling capabilities.
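The serialization example from earlier in this answer can be sketched in C++ to show what "free from a structural relation" means in practice: an ad hoc set of unrelated types (`int`, `std::string`, and an application type) becomes substitutable without any of them inheriting from a common interface. This is only an illustrative sketch, not the Zoo framework's API. `Serializable`, `serialize`, `serialize_all` and `Point` are hypothetical names.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Serialization rules for an ad hoc collection of types.
// None of them was written with a "Serializable" base class in mind.
struct Point { int x, y; };  // hypothetical application type
inline std::string serialize(int v) { return "i:" + std::to_string(v); }
inline std::string serialize(const std::string& v) { return "s:" + v; }
inline std::string serialize(const Point& p) {
    return "p:" + std::to_string(p.x) + "," + std::to_string(p.y);
}

class Serializable {
    std::shared_ptr<const void> object_;     // the erased, immutable value
    std::string (*serialize_)(const void*);  // the behavior remembered for it

public:
    // Any type with a matching serialize() overload qualifies.
    template <typename T>
    Serializable(T value)
        : object_(std::make_shared<T>(std::move(value))),
          serialize_([](const void* p) {
              return serialize(*static_cast<const T*>(p));
          }) {}

    std::string to_text() const { return serialize_(object_.get()); }
};

inline std::string serialize_all(const std::vector<Serializable>& items) {
    std::string out;
    for (const auto& item : items) {
        out += item.to_text();
        out += ';';
    }
    return out;
}
```

Membership in the set is decided by the application, simply by providing a `serialize` overload, rather than by the type's position in a class hierarchy. Copies of a `Serializable` share the stored value, which is safe here only because the value is never modified.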
Right.
So this is a fascinating
discussion and I'd love to spend the rest of our
time talking about type erasure
and subtyping relationships.
But I do
actually want to ask you about another library
that you worked on before
we run out of time.
So let me just switch topic if that's okay with both of you.
So you also have a SWAR library in there, S-W-A-R,
which you've talked about, which seems to be very interesting,
very fascinating, and also very fast.
So do you want to just briefly talk about what SWAR actually is
and why anybody would need that
and then what your library gives the user in that area?
Sure.
So SWAR stands for SIMD within a register.
So single instruction, multiple data,
within a normal integer register.
So you're going to pretend, you're going to simulate, emulate,
that you have lanes within a normal integer.
And for those lanes, let's say the base type is a 64-bit integer.
You can have, for example, 8-bit lanes.
So you would have eight 8-bit lanes.
You can have sixteen 4-bit lanes.
You can have nine 7-bit lanes.
So our library supports all of those configurations. And then you program
using this library in the way that you would normally use SIMD, regardless of whether you
have SIMD in the architecture that you're using.
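As a small illustration of what "lanes within a normal integer" means, the sketch below adds eight 8-bit lanes packed into one 64-bit integer with a handful of ordinary integer instructions. It uses a well-known bit trick and is not taken from the Zoo SWAR library. The names `add_lanes8` and `lane8` are assumptions for this example.

```cpp
#include <cstdint>

constexpr std::uint64_t kTopBits = 0x8080808080808080ull;  // top bit of every 8-bit lane
constexpr std::uint64_t kLowBits = 0x7F7F7F7F7F7F7F7Full;  // low 7 bits of every lane

// Adds eight 8-bit lanes at once. Each lane wraps modulo 256 on its own,
// so a carry never leaks into the neighboring lane.
constexpr std::uint64_t add_lanes8(std::uint64_t a, std::uint64_t b) {
    // Low 7 bits per lane: at most 0x7F + 0x7F = 0xFE, which still fits in the lane.
    const std::uint64_t low_sum = (a & kLowBits) + (b & kLowBits);
    // The top bit of each lane is a ^ b ^ carry-in; low_sum already holds the carry-in.
    return low_sum ^ ((a ^ b) & kTopBits);
}

// Extracts lane `index` (0 is the least significant byte).
constexpr std::uint8_t lane8(std::uint64_t v, unsigned index) {
    return static_cast<std::uint8_t>(v >> (8 * index));
}
```

For example, adding `0x01FF` and `0x0101` lane-wise gives `0x0200`: the low lane wraps from `0xFF` to `0x00` without disturbing its neighbor, whereas plain integer addition would produce `0x0300`.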
And it turns out that the emulation of SIMD capabilities through judicious mathematics and bit-twiddling techniques
gives you a performance that is really good,
especially when what you want to do is not proper SIMD work,
but let's call it tactical, like ad hoc SIMD applications.
And there are two examples that I can call right now.
One of them is, of course, the motivating reason for me having done this library,
which is to work with sets as bit fields.
And this is what I communicated to the community back in 2019 on C++ for the win
or how C++ beats all other languages at card games.
So that's what I described,
a much more primitive version of the SWAR library.
SWAR, of course, is not a term that I invented,
but it comes from other people.
And the thing that we revisited, the people on the zoo libraries, Scott Bruce and I, decided that the SWAR techniques were excellent for implementing the Robin Hood hash tables that we had in mind.
And it proved their mettle in that regard. So today, Michael O'Rourke and I have advanced so much
some techniques in the SWAR library
that we're beginning to apply to CUDA
and to proper SIMD registers and SIMD instruction sets.
And you can see a tiny bit of that,
like the incipient work that we're doing
on our latest presentations,
like Jamie, Poms, and I,
that we
are using intrinsics to do
SIMD in a way that is a
generalization of our SWAR libraries
within the SWAR library.
So, sorry, the SWAR library
is driving the
intrinsics. So we're
escaping the
part of
being constrained to normal
integer registers because it turns out that
the techniques are
applicable. Like, if you want an example,
to my knowledge, there
is no architecture that would
allow you to have
lane sizes that are not
a power of two
number of bits.
We can, without much problem and without losing too much performance,
have lane sizes that are nine bits or seven bits.
And why would that be useful?
I gave you the two examples, right?
And in the case of Robin Hood, for example, it is useful to reconfigure the table's metadata, which contains information regarding the summary of hoisted hashes and something that we call the probe sequence length. So you might have, for example,
that you have a Goldilocks problem
that a lane size of 8 bits is too small,
but a lane size of 16 bits is inefficient, too large.
So 9 bits is, for example, a really useful in practice
lane size for Robin Hood hash tables. That's one application or one feature that,
as far as I know, is not present in other libraries and frameworks.
Yeah, I think we get a bit too hung up on always using powers of two.
That's not necessarily the best approach.
Well, the issue about powers of two is that they compose phenomenally well.
Yeah.
So that's not a...
Good default.
And we do have a performance drawback,
like indexing.
Any type of indexing, basically, when you are not using powers of two,
requires to do multiplications.
And it is especially painful when you want to do the equivalent of masking the lower bits,
which would require you to do a mathematical operation of an arithmetic of modulo.
And so the modulo is a multiplication and a subtraction.
So it's expensive.
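The indexing cost mentioned here can be seen in a small sketch that compares the position arithmetic for 8-bit lanes with that for 7-bit lanes (nine per 64-bit word). All function names are hypothetical and are not the Zoo library's interface.

```cpp
#include <cstdint>

// Power-of-two lane width (8 bits, eight lanes per word): shifts and masks suffice.
constexpr unsigned bit_offset_8(unsigned lane) { return lane << 3; }          // lane * 8
constexpr unsigned word_of_8(unsigned element) { return element >> 3; }       // element / 8
constexpr unsigned lane_in_word_8(unsigned element) { return element & 7u; }  // element % 8

// 7-bit lanes, nine per 64-bit word (63 bits used): real arithmetic is needed.
constexpr unsigned bit_offset_7(unsigned lane) { return lane * 7; }
constexpr unsigned word_of_7(unsigned element) { return element / 9; }
constexpr unsigned lane_in_word_7(unsigned element) { return element % 9; }

// Reads the 7-bit lane at index `lane` (0..8).
constexpr std::uint64_t get_lane7(std::uint64_t word, unsigned lane) {
    return (word >> bit_offset_7(lane)) & 0x7Fu;
}

// Returns a copy of `word` with the 7-bit lane at index `lane` replaced by `value`.
constexpr std::uint64_t set_lane7(std::uint64_t word, unsigned lane, std::uint64_t value) {
    const unsigned shift = bit_offset_7(lane);
    return (word & ~(std::uint64_t{0x7F} << shift)) | ((value & 0x7Fu) << shift);
}
```

Compilers typically lower division and remainder by a constant such as 9 into a multiplication plus shifts and a subtraction, which matches the cost described above. That is still more work than the single shift or mask available for powers of two.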
You mentioned performance a moment ago,
and you said the performance was very good.
Can you expand on that a little bit?
Are we talking about slightly better or?
There is a performance cost to use a normal CMD
because you have to format things in the way
that your architecture is going to be able to process
in a CMD using the CMD architecture, okay?
Yeah. And sometimes, like in the case of abx2
abx2 that gives you 256 bits per register or or abx 512 that gives you 512 bits
per register that'll be excessive right and And you might want to do something more tactical.
Number one, you don't want to switch
to using the execution units of the CMD architecture
because then you don't have simple comparisons,
for example, in the way of...
When you do that, typically, especially in the way of when you do that,
typically, especially in the case of the ARM architecture, ARM Neon,
when you have your operations and then you want to transform the results into booleans,
those booleans will have to be converted into a mask first in the SIMD architecture registers,
and then you should transfer that mask to the SIMD architecture registers and then you transfer that mask
to
the general PUPOS registers and
then you decode the mask
in the general PUPOS registers.
Well, the SWAR techniques, even though they have an inherent cost to them,
depending on the specific application,
it might happen that it's actually cheaper and more performant
to keep everything within general-purpose registers.
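To make "keeping everything within general-purpose registers" concrete, here is a minimal SWAR sketch that compares eight byte lanes packed in one 64-bit integer against a value and produces a per-lane boolean mask. This is a well-known bit trick written for illustration; the names `broadcast` and `equal_lanes` are hypothetical and this is not the API of the zoo SWAR library.

```cpp
#include <cstdint>

constexpr std::uint64_t kLow7 = 0x7F7F7F7F7F7F7F7Full;  // low 7 bits of every byte lane
constexpr std::uint64_t kOnes = 0x0101010101010101ull;  // value 1 in every byte lane

// Copy one byte into all eight lanes of a 64-bit word.
constexpr std::uint64_t broadcast(std::uint8_t b) { return kOnes * b; }

// Returns 0x80 in every byte lane of `lanes` that equals `b`, 0x00 elsewhere.
constexpr std::uint64_t equal_lanes(std::uint64_t lanes, std::uint8_t b) {
    std::uint64_t x = lanes ^ broadcast(b);  // matching lanes become 0x00
    // Adding 0x7F to the low 7 bits sets the lane's top bit when those bits
    // are nonzero; the sum is at most 0xFE, so nothing carries into a neighbor.
    std::uint64_t t = (x & kLow7) + kLow7;
    // The top bit stays clear only where the whole lane of x was zero.
    return ~(t | x | kLow7);
}

static_assert(equal_lanes(0x1122334455667788ull, 0x55) == 0x0000000080000000ull,
              "only the lane holding 0x55 is flagged");
static_assert(equal_lanes(0x1122334455667788ull, 0x99) == 0,
              "no lane matches 0x99");
```

The result is already an ordinary integer, so testing it against zero or counting its set bits needs no transfer between register files.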
That's very interesting.
I guess it depends on the problem,
but I haven't really seen that being used in any code that I was working on. So I'm thinking now:
would there have been any cases where some algorithm would have benefited from that
approach? It's quite interesting. I need to think about this a bit more.
Well, let me give you two recommendations. You can watch my 2019 presentation on the application of SWAR techniques to build basically the top-performing Texas Hold'em poker engine.
That's what I described there, when the SWAR library was very, very primitive, and it was still very, very helpful. And the other is to re-watch,
because I know that you were in the audience
and you even made a nice comment,
thank you very much, Timur,
my C++ Now presentation
where I described our hash tables
for the first time,
and also the presentation at CppCon
by Scott Bruce and me
on the same topic of hash tables.
Yeah, so I remember the hash table one.
So hash tables are not something you use quite as often in audio, for example, as you would in finance.
So I think I didn't quite yet understand how the SWAR techniques are applicable to other things as well.
I need to rewatch that talk.
So thank you very much for that.
Yeah, I would recommend you to do it
because the mechanisms of the particular flavor
of hash tables that I have been working on
together with my collaborators at the repository,
Robin Hood hash tables,
the mechanisms of Robin Hood hash tables
are truly interesting in my opinion.
And the interplay between those characteristics and those techniques and guarantees such as...
Sorry, I don't want to go into unnecessarily deep detail, but that's interesting.
And the SWAR techniques were invaluable to make them perform.
Right.
So we're going to include links to these talks in the show notes.
I'm sure other people will find that also very interesting.
So I want to switch to another topic before we run out of time,
because I want to have at least a little bit of time to talk about that.
So I did not re-watch the talk about hash tables,
but I did watch your much more recent talk,
actually just a few months ago at C++ Online,
where you mentioned something like the electromagnetic wave metaphor,
the electromagnetic wave progress metaphor.
And as somebody who has a background in physics,
I found that to be very, very interesting. And just like with the evolution stuff we talked about
earlier, I was like, hmm, I don't immediately see how that applies to software
engineering. So I'm just very curious. I mean, I know a little bit about this, obviously, already
because I watched your talk. But do you want to talk a little bit about that?
Because I think that's quite an interesting thought that I haven't really heard before.
So I think that might be interesting to talk about as well.
Oh, man, if this metaphor is novel to you, like the meaning behind the metaphor is novel to you, that makes my day, Timur.
Yes, it is.
Thank you for the metaphor.
I really enjoyed that.
All right.
So here's a really capital observation,
which is when you are doing
a non-trivial software engineering effort,
you end up learning a lot
about what you were doing
once you delivered some passable, usable code.
And at that point in time in the project, that new understanding,
that superior understanding that you have achieved
gets orphaned of support, of sponsorship,
to capitalize on it and introduce it back into the code base.
So what I mean is that in the process of doing something practical,
you may gain insights that, for lack of a better word,
we're going to use the word theoretical to describe them,
like not immediately applicable,
like you don't know how to apply
them immediately, but you know that you have a superior understanding of it.
And who knows what are the benefits that you're going to get down the road.
By the way, in my libraries, the examples of achieving a superior understanding are so abundant that it's exhilarating.
But if you supply that sponsorship, if you have faith in your superior understanding, then rather quickly you can get the hang of how to turn those insights, those theoretical insights into practical
benefits.
So this reminded me, since I also studied physics, of electromagnetic waves,
because with electromagnetic waves, all of the work, all of the energy that you can do useful
things with, like driving sensors, etc., comes from the electric field.
Magnetic fields, magnetic forces, in general, don't produce any work.
So the magnetic field needs to be transformed into an electric field to be directly applicable and useful.
And it turns out that electromagnetic radiation is precisely that.
The electric field changing, improving, let's say,
makes a change, induces a change in the magnetic field,
and the change in the magnetic field induces a change in the electric field.
And that is the electromagnetic wave, in essence.
And so how does that apply to the process of software engineering?
Is it that you can then feed back your insights into the practical?
Yeah, the important part is to have faith in your new insights,
in your new understanding, to be put back into the code base as soon as possible,
even if you don't have a direct need that you had identified.
And the exhilarating part is that this works
because it is the essence of why I have accomplished so much
in the zoo libraries with so little effort that I have put in.
But quite interestingly, there is the experience of several people
that have contributed in earnest to the repository
that got indoctrinated into this way of doing software.
And they themselves have been getting insights
and doing things that are better than they have ever done.
That is very interesting.
From the perspective of software engineers, it has proven that the induction happens.
From these libraries, people do learn to do this.
And it is really, really great when you are doing something and you have an insight that
you know is uniquely yours, that nobody else had that understanding ever before. Like,
there's one example that happened recently, and I'm just going to use that example because it
is the most recent one. I was reading From Mathematics to Generic Programming for the nth time.
It's a book that I have read several times,
and every time I do, I discover much more insightful things.
The book by Alexander Stepanov and Daniel Rose.
And it was about the Egyptian multiplication algorithm.
And then I noticed something.
Oh, wow.
Everybody implements this algorithm
by consuming the rightmost bit in the count
to do this.
And I think that that is a tiny bit less efficient
than consuming the most significant bit.
So the insight
was directly applicable
to one question that I was solving,
which was SWAR multiplication.
You know that multiplication
is a truly important operation.
And in
a normal integer that is split into lanes, it is tricky to implement
multiplication in a SWAR way, in a lane-wise, SIMD
way, so that you can multiply elements of the
same lane in two SWAR registers.
And I implemented that multiplication in the regressive formulation
as opposed to the progressive.
That's how I call it.
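The two directions can be sketched on plain integers as follows. This is a minimal illustration, not the SWAR lane-wise implementation from the zoo libraries, and it assumes that "progressive" means consuming the least significant bit of the count first and "regressive" means consuming the most significant bit first. The function names are hypothetical.

```cpp
#include <cstdint>

// Least-significant-bit first: the usual textbook Egyptian multiplication.
// Two running values are updated: the result and the doubled addend.
constexpr std::uint64_t multiply_lsb_first(std::uint64_t a, std::uint64_t n) {
    std::uint64_t result = 0;
    while (n != 0) {
        if (n & 1) result += a;  // consume the rightmost bit of the count
        a += a;                  // double the addend for the next bit
        n >>= 1;
    }
    return result;
}

// Most-significant-bit first: double the accumulator, then add the original
// addend when the current bit of the count is set. The addend never changes.
constexpr std::uint64_t multiply_msb_first(std::uint64_t a, std::uint64_t n) {
    std::uint64_t result = 0;
    for (int bit = 63; bit >= 0; --bit) {
        result += result;                 // double what has been accumulated
        if ((n >> bit) & 1) result += a;  // consume the current top bit
    }
    return result;
}

static_assert(multiply_lsb_first(41, 59) == 2419, "41 * 59");
static_assert(multiply_msb_first(41, 59) == 2419, "41 * 59");
```

In the most-significant-bit-first form only the accumulator is doubled and the addend stays constant, which is one structural difference between the two; whether that makes it cheaper in a given SWAR setting is the speaker's claim, not something this sketch measures.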
And this was so interesting that the collaborator, Jamie Pond,
took an interest in that and made it the showcase of his own CppCon presentation this year.
And that talk was very well received
because it was dense in insights that were novel.
Like, we need to give names to things
that we haven't ever seen before,
but that are interesting and useful. And one of them is the progressive direction versus the
regressive direction. But Jamie Pond explains it in his presentation at CppCon, so I don't have
to repeat that. And I don't know, right now, like at this very minute that I'm talking to you,
normally multiplication gets implemented with the Booth algorithm. Like that's how microprocessors
or arithmetic logic units do multiplication by using, well, not the Booth algorithm, but based on the Booth algorithm, the Wallace and the Dadda algorithms.
And it's very clear that for the purposes of SWAR,
the motivation for lane-wise multiplication that was my objective
that led me to apply the Egyptian multiplication algorithm
wasn't suitable.
So I am having a new understanding: this is the first time that using a monad,
the monad of SWAR, drove very strongly a preference towards the Egyptian
multiplication algorithm as opposed to these other alternatives for
multiplication algorithms. And what we did with
multiplication is not only multiplication, but,
like Jamie explained in his presentation,
this template is applicable to any associative
operation that is counted.
So it's general to any monoid. In particular, for example,
application to integer fields
that are very important in cryptography,
like arithmetic modulo a large prime.
So the very same algorithm is so general that it's scary.
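The generality mentioned here can be sketched with a small template in the style of the `power` algorithm from the Stepanov and Rose book: the same loop applies any associative operation a counted number of times. This is an illustrative sketch, not the template from the zoo libraries; `power`, `Add`, and `MultiplyMod` are names assumed for this example, and `MultiplyMod` assumes a modulus below 2^32 so that the product fits in 64 bits (real cryptographic moduli are far larger and need wider arithmetic).

```cpp
#include <cstdint>

// Apply the associative operation `op` to `x` a total of `n` times,
// starting from the monoid's identity element.
template <typename T, typename Op>
constexpr T power(T x, std::uint64_t n, T identity, Op op) {
    T result = identity;
    while (n != 0) {
        if (n & 1) result = op(result, x);
        x = op(x, x);  // "doubling" in the sense of the operation
        n >>= 1;
    }
    return result;
}

// Addition: counted addition is ordinary multiplication.
struct Add {
    constexpr std::uint64_t operator()(std::uint64_t a, std::uint64_t b) const {
        return a + b;
    }
};

// Multiplication modulo a prime: counted multiplication is modular exponentiation.
struct MultiplyMod {
    std::uint64_t modulus;  // assumed < 2^32 so a * b cannot overflow
    constexpr std::uint64_t operator()(std::uint64_t a, std::uint64_t b) const {
        return (a * b) % modulus;
    }
};

static_assert(power<std::uint64_t>(41, 59, 0, Add{}) == 2419, "41 added 59 times");
static_assert(power<std::uint64_t>(3, 200, 1, MultiplyMod{13}) == 9, "3^200 mod 13");
static_assert(power<std::uint64_t>(5, 12, 1, MultiplyMod{13}) == 1, "Fermat: 5^12 mod 13");
```

Only the operation and its identity change between the two uses; the loop that consumes the bits of the count is the same.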
And it emerged in a very organic way
from carefully analyzing each need
that our motivating...
Well, yes, so this is an example of work
that emerges in the library,
in the zoo libraries,
and that engenders its own CppCon presentation
by someone who's not even the person who had the initial ideas.
I don't know if this is interesting to you,
but I think it is fascinating because this is the other thing.
Once you get the hang of how to use the zoo libraries
and what I call the epistemological doctrine of software engineering,
the practice makes you much faster on this iteration.
So we have a very high frequency in the electromagnetic wave
in terms of transforming theory into practice
and transforming practice into theory quite quickly on a per day basis,
basically.
So if you are skeptical about this, I can just invite you to become a collaborator to
the Zoo Libraries.
And it takes more or less about three months to get the hang of it.
And you'll see for yourself.
All right.
I would like to extend that invitation to anybody who's listening to our podcast today.
Yep, the zoo is open.
Yeah, that's the shameless plug that I had to include as an interviewee.
All right.
So we talked about evolution.
We talked about electromagnetic waves.
Another thing that you mentioned in your work is assembly theory.
So what's that?
Yeah, well, this hasn't been reflected in my work as of now,
but the theory of emergent complexity and assembly theory
already has examples in the world of software engineering,
at least from my point of view.
So let us use something simpler than assembly theory,
which is the emergence of complexity, right?
In general, what you want to...
So complexity is simpler.
Yeah, so using complexity to simplify software engineering,
it sounds like a contradiction or a pun,
but here's what I'm going toward.
You make a library, okay?
I have always been very interested in trying to understand
what makes a library easy to use.
And, of course, you have to remove coupling.
You have to maximize cohesion.
You even have to take into account something that people don't mention
very often, which is orthogonality,
but I believe it is essential to good libraries.
But other than doing the things that we already know that work
and that they are useful,
what is the ghost of a good library that makes it really composable,
really useful. And so I think that the summit of the quality of a software library
is when it is applied to unanticipated cases.
When you're doing a library,
you have some very clear applications in mind.
And if it turns out that that library is useful for something
that was completely unanticipated,
then that library captures something essential that is subtle,
difficult to appreciate, but it is definitely there
that makes it a good library, right?
Yeah.
And that reminded me of things that naturally compose, and that's where
the emergence of complexity, or the phenomenology of the emergence
of complexity, comes into play. Because what I mean by this is that when complexity emerges, that gives you an insight into how things,
the multiplicity of agents, combine with themselves, interact with themselves.
Like, for example, the simplicity of an ant,
as opposed to the complexity of an ant colony that comes about
because of the interplay
between the different ants that are simple individuals, simple organisms,
even easy to describe.
So that's the idea, that you have a library that is like ants
that would randomly combine to lead to interesting stuff,
let alone when there is someone who deliberately is using the library
to accomplish something.
So I believe that if you have something that is so easy to combine,
that it can happen randomly to obtain interesting behavior,
then if you do it deliberately, you're going to get really great behavior.
And by the way,
this is one of the very counterintuitive insights
that I have communicated to the community
in my presentation at C++ Now,
the one that you attended, Timur.
And also, I think that you were there, Phil.
I was, yeah.
To my knowledge,
I am the only one who has ever recommended people
to specify as little as
possible. And not only that, I went one step further, which is to make a template to specify
less.
So you're basically saying that code that does something useful should evolve to do more
stuff rather than being constrained by a specification
that says what it does?
Yeah, that's part of it.
The observation that I am making is that
one essential property of a good library
is that it is good for unanticipated cases, right?
To be good for unanticipated cases, the library itself needs
to be very agnostic about the purpose for which it was developed.
So you are developing a library.
You have a clear objective in mind.
But if you make the library that specifies the bare minimum,
it is mostly ambiguous in the way that it supports the motivating reason for having written the library.
You're maximizing the chances that it is going to be applicable
in completely unanticipated ways
because that library makes a deliberate effort
of not imposing itself on the applications that are going to use them.
And I have seen this experimentally when I was working at Snap.
That's very interesting.
Yeah, that's a very interesting approach.
Thank you.
Next time I write a library, I will try that approach.
I'm genuinely curious now.
Yes.
So all of those things happen in the zoo libraries as a matter of course.
And the results are there, okay, to be seen and appreciated or studied.
I would really love to dig into this a lot more, but we are running over time.
And, in fact, I'd love to dig into all of the things we've been talking about.
And for most of them we can, because you've done talks on them. So we're definitely going to put all of
those links in the show notes, so anybody whose interest is caught by any of those topics
can go and find out more. But we do have one more question.
For my part, I should say that
one thing that is embarrassing is that I have never documented the type erasure
library, despite
how widely it's used.
It is part of production of Snapchat
and other projects. And Snapchat
on its own is
quite important and relevant.
But the thing is that
I basically had to write a book
on type erasure because
this library has been
recommended by a lot of people.
It's mentioned in the book by Klaus Iglberger,
for example,
even though I didn't earn his credit, because he
consulted me,
but I didn't tell him anything particularly insightful, because I made the
terrible mistake of forgetting the part about monophyletic,
paraphyletic, and polyphyletic
subtyping relations.
I forgot to tell him about this.
He writes about this in the book, a whole chapter.
He dedicates to it about 10% of the whole book.
Pedro Picos, for example, at C++ Now this year,
explained just aspects, a few aspects,
like 60%, 70% of the performance aspects of the type erasure library
in a whole C++ Now presentation.
And there are so many more things
to communicate to the people.
Right.
So we could carry on talking about this all day.
In fact, we have done in the past.
But right now, we do need to wrap up,
and we do have one more question that we traditionally ask,
which is, is there anything else in the world of,
let's try and keep it to C++, because we could go much further.
Anything else in the world of C++ right now
that you do find interesting outside of the zoo libraries?
I consider it very interesting how C++ is being misunderstood.
And therefore C++ apparently is dying,
according to what I can see.
And there are attempts to substitute it with Rust and Carbon
and Cpp2 and so on,
which in my considered opinion are just misunderstandings of C++
that eventually are going to be revealed to be misunderstandings.
But who knows what's going to happen
with C++ at that point.
Right.
So the future of C++ itself
is what interests you?
Yeah, well,
I don't like C++.
It is just that
it is less annoying
than other programming languages,
especially with regards to performance.
It's the worst language apart from all the others.
That's a bold statement.
The summary of my attitude about it is that
you better really understand C++
because if you misunderstand it,
then that's going to put you in a world of hurt.
Yeah.
That is true.
That is definitely true.
Just one example.
Type erasure is not expressible
in any other programming language.
I mean, at conferences,
I have shown the material evidence
that the type erasure framework that I implemented
is of superior performance
to the intrinsic feature of the language, virtual overrides.
The performance is better, and it is objectively better.
It is very easy to see that it's better.
So it's a library that beats an intrinsic mechanism in the language.
That's C++.
In other programming languages, including, for example, Rust,
it is inexpressible.
That is quite interesting.
Okay, so as Phil said, I wish we had more time to talk about this.
This is really fascinating.
We're going to put all the links to these topics in the show notes.
And before we let you go, is there anything else that you want to tell us?
Is there anywhere people can get in touch with you
if they want to learn more about these concepts?
Oh, yes.
Please contact me through my direct messages at Twitter.
I use the handle thecppzoo,
like the text "the cpp zoo", as in zoological.
And through that handle, you can reach me.
That is the same account name at GitHub where the zoo libraries reside.
And, of course, I am going to be documenting the type erasure library,
and that's going to be documenting the newest understanding that I have about it all.
And this is going to be something that hasn't been done yet,
so it's going to significantly improve the usability
because people would not have to contact me to use it.
They could work independently.
All right.
So that's really interesting.
Looking forward to looking at that.
And with that said, we are at the end.
So thank you very much, Ed, for being our guest today.
I really enjoyed it.
I hope you did as well.
It was my pleasure.
I'm looking forward to more output from you on all of these topics.
Absolutely. Thanks, thanks to you guys.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff
you're interested in, or if you have a suggestion for a guest or topic, we'd love to hear about that
too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate it if you can follow CppCast on Twitter or Mastodon. You
can also follow me and Phil individually on Twitter or Mastodon. All those links,
as well as the show notes, can be found on the podcast website at cppcast.com.
The theme music for this episode was provided by podcastthemes.com.