CppCast - TDD, BDD, Low Latency and CppCon
Episode Date: November 29, 2018

Rob and Jason are joined by Lenny Maiorani from Quantlab to discuss high performance computing, pair programming, volunteering for CppCon and the site of next year's CppCon.

Lenny has been using C++ off and on since 1995. Since graduating from SUNY Plattsburgh with a degree in Computer Science, he has been working at startups focused on high-throughput applications. About 2 years ago he joined Quantlab and discovered a different type of high-performance computing in low latency systems. Lenny lives in Denver, Colorado with his wife Lexey and their dog. He can be found hiking in the Colorado mountains while thinking about container access patterns and wondering if std::map can be renamed to std::ordered_map.

News
Better template support and error detection in C++ Modules with MSVC 2017 15.9
What's new in CLion 2018.3
Std::string is not a Container for Raw Data
Counterpoint

Lenny Maiorani
@lennymaiorani

Links
Quantlab
CppCon 2014: Lightning Talks - Lenny Maiorani "Test-Drive Performance"

Sponsors
Backtrace
JetBrains

Hosts
@robwirving
@lefticus
Transcript
This episode of CppCast is sponsored by Backtrace. Start your free trial at backtrace.io slash cppcast. And by JetBrains,
maker of intelligent development tools
to simplify your challenging tasks
and automate the routine ones.
JetBrains is offering a 25% discount
for an individual license
on the C++ tool of your choice.
CLion, ReSharper C++, or AppCode.
Use the coupon code JETBRAINSFORCPPCAST
during checkout at jetbrains.com.
In this episode, we discuss IDE updates and storing raw data in strings.
Then we talk to Lenny Maiorani from QuantLab.
Lenny talks to us about high-performance computing and much more.
Welcome to episode 177 of CppCast, the first podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing today?
Doing all right.
How are you doing, Rob?
I'm doing fine.
We're going to just talk about Star Wars this episode, right?
Yeah, well, yes.
I mean, we could recap for our listeners and then let our guests weigh in in just a moment.
It's just an argument of whether or not
the last two movies are worth watching
Basically, which one is the worst. I mean, I have strong and mixed feelings about The Last Jedi. I think
Solo was better than Rogue One, but that's kind of my summary. Yeah. And I like Rogue One better
than Solo because it introduced this whole fuel problem, and Star Wars never had a fuel problem
before this. We've got
36 years or whatever of Star Wars
with no fuel problems and now there's fuel
problems. Although I bet if you
read the books they're probably talking about fuel all the damn
time. That's probably
true but I think the books are actually science fiction
not science fantasy.
Or whatever.
Anyway, at the top of the episode,
I'd like to read a piece of feedback.
This week, we got a tweet from Patrice Roy
saying, listened to the CppCast
trip report with Ashley Hedberg episode.
Thanks a lot.
I missed San Diego,
and I miss going to WG21 meetings a lot these days.
I haven't been able to attend the last two,
but should be in Kona, fingers crossed.
Interesting through and through.
And yeah, that was a great episode we had with Ashley.
We still have a lot more to go over
from the San Diego meeting,
and we should have another guest on to talk about that soon.
I was just debating
if I should try to make it to Kona.
Oh yeah?
The problem would be
C++ on Sea. So I would have to go to C++ on Sea
and then come home for like three days and then fly to Hawaii. Okay. So Kona is the week after
C++ on Sea? Or no, no more than two weeks after it. I mean, it would be tighter than I
want to be for flying halfway around the world, basically. Right. Now, if you went, would you go with Jen
and actually make a vacation out of it,
or would you try to go to the meetings every day?
If I go to Hawaii, my wife would come with me.
The question is, I don't know.
Since I'm not an official committee member,
I would be up to my discretion
how many meetings I attended, I guess,
and I haven't decided what I would do.
That's part of the problem, really,
because I know it's intense and it's a long week.
Well, you have to let me know what you decide to do, okay?
Yeah.
Well, we'd love to hear your thoughts about the show as well.
You can always reach out to us on Facebook, Twitter,
or email us at feedback at cppcast.com,
and don't forget to leave us a review on iTunes.
Joining us today is Lenny Maiorani.
Lenny has been using C++ off and on since 1995.
Since graduating from SUNY Plattsburgh with a degree in computer science,
he's been working at startups focused on high-throughput applications.
About two years ago, he joined QuantLab
and discovered a different type of high-performance computing
in low-latency systems.
Lenny lives in Denver, Colorado, with his wife, Lexi, and their dog.
He can be found hiking in the Colorado mountains while thinking about container access patterns and wondering
if std::map can be renamed to std::ordered_map. Lenny, welcome to the show.
Hi. Hi, Rob. Hi, Jason.
Have you actually considered submitting a proposal to change the name of std::map to
std::ordered_map?
I've heard that that proposal process is quite complex and takes
a long time and a lot of effort, so I'm just hoping somebody else will do it at this point.
I'd imagine that'd be a hard sell just because it would break all existing std::map code. Yeah.
Right, but maybe we could deprecate it and eventually get rid of it. Make a type alias for the name, deprecate the old one.
I think the standards process would allow for that today, whether or not it would get accepted.
Yeah.
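For listeners picturing Jason's suggestion, here is a minimal sketch of the alias-then-deprecate idea. The stdx namespace and the exact spelling are purely hypothetical; no such proposal exists.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <utility>

namespace stdx {
// The better name is just an alias for the existing type...
template <typename Key, typename T,
          typename Compare = std::less<Key>,
          typename Alloc = std::allocator<std::pair<const Key, T>>>
using ordered_map = std::map<Key, T, Compare, Alloc>;

// ...and the old name could then be kept around, but marked deprecated,
// so existing code keeps compiling while nudging people to migrate:
template <typename Key, typename T,
          typename Compare = std::less<Key>,
          typename Alloc = std::allocator<std::pair<const Key, T>>>
using map [[deprecated("use ordered_map")]] = ordered_map<Key, T, Compare, Alloc>;
}  // namespace stdx
```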
Although, just as an aside, std::auto_ptr, broken from the beginning.
Everyone agrees it's been broken from the beginning.
I have my copy of Effective C++ that says don't use it, right?
It was removed in C++17.
Today, GCC 8 in C++20 mode still has auto_ptr
because they don't want to break existing code.
But it is marked deprecated, right?
It's marked deprecated.
That's not conformant with the standard.
In C++17 mode, it's supposed to be removed.
And if removing auto_ptr breaks existing code,
that code's already broken.
Anyhow.
Okay, well, Lenny, we've got a couple of news articles to discuss.
Feel free to comment on any of these
And we'll start talking more about your work at QuantLab and other stuff, okay?
Okay
So this first one I have is from the MSVC blog
Better template support and error detection in C++ modules
With the new 2017 15.9 update
And they haven't been talking much about C++ module support in MSVC for a while,
but I guess they've been kind of under the covers
making a bunch of changes and improvements,
and they're just announcing and laying out all the improvements
that they've had, that they've made recently.
And, yeah, better constexpr support, two-phase name lookup, it all seems pretty
important for modules to work properly. Yeah. It also looks like they're just trying to get
aligned with the current wording of the standard as well.
Right, because that has changed a bit over this past year in the committee, right?
Yeah. Have you worked with modules at all, Lenny?
No, I haven't touched it at all.
Yeah, me neither.
Yeah, it seems like one of those things
where it might be just better off
to wait until it's official
and all the different platforms start implementing it.
Yeah.
Okay, was there anything else you wanted to talk about
or point out here, Jason?
No, just that it's good to see Visual Studio constantly working on getting all these things better.
Yeah, one of the major improvements they're calling out here is improved error diagnostics. One example here, with a very, very simple piece of code, went from a 10-line error that doesn't really tell you what the problem is to a
nice, easily understandable two-line error message.
Right.
Just kind of, I guess, as an aside again, my ChaiScript has exactly
one compilation error left for C++17 support now.
So I should probably, now that it's down to exactly one thing
and I don't think they're working on it,
I should probably submit a bug report on that.
Oh, yeah, definitely.
Okay, with other compiler or IDE news,
we also have a post from CLion.
They just released their 2018.3 release.
And this has a bunch of new features.
The main one is their initial remote development support.
And this will basically allow you to,
if you're using CLion on Windows, Mac, or Linux,
to be able to remotely build and debug on a different machine.
I think currently it's just a remote Linux host.
It's a pretty cool feature.
Yeah, this is pretty interesting for QuantLab too.
A lot of the QuantLab developers use CLion. I'm more like Jason.
I'm not an IDE user, but
I think that a lot of people will like this because the way that most of them
are doing it right now is remote
X display.
And, you know, it's laggy and just not a great experience.
Oh, like running CLion on the remote host
and exporting the display back over SSH or whatever.
Yeah, exactly.
Oh, yeah, I've never tried a remote X
with something as complicated as an IDE before.
I think Netscape back in 1997 was probably the last time I tried something that complex over X.
Yeah, so the people using it, like a lot of the folks in Houston, in our Houston office use it,
and our data center is also in Houston.
So that's not too bad for them. But for our remote
offices, like here in Denver, the lag is basically prohibitive. Wow. Do you know if
anyone's tried this yet? It looks like it just came out two days ago. I don't know if anyone has.
Rob, you've done a lot of kind of embedded device stuff. We've talked about that a little bit on the show.
What about remote?
But this isn't the same kind of thing.
Would this be useful?
What I'm trying to ask is,
could you imagine if this would be useful if you were still doing handheld devices
or something like this?
I don't even know.
Since it's remote Unix host only,
I don't know if that makes sense or not.
Yeah, I never worked with any remote Linux devices. I was always working
with a remote
device that you would plug in via USB
like a Windows CE
device, something like that.
And that usually had built-in
support through Visual Studio.
I wonder how debugging is going to work
with this feature.
If you can do remote debugging also.
I mean, it seems to support both remote build and remote debug.
Yeah.
So it seems like it should help your team.
Yeah.
I guess, and besides that, there's a couple other things on here.
Like there's ClangD-based navigation and search,
so they're pairing up a little bit with Clang like everyone is.
Right.
And better C++17-specific feature support.
That's cool.
Yeah, and they've hooked into CPU profilers
like perf or DTrace on Linux and Mac.
What were you going to say, Lenny?
Yeah, that's what I was going to say, too.
That looks pretty exciting.
And they have Catch support for their unit testing.
I think we gave Phil a hard time about that when he first started there
because they didn't yet support Catch, and I don't know when that was added.
Yeah, I don't think this is the first release it's been added,
but yeah, I'm glad they got that in there with Phil working for them.
I have to say, honestly, everywhere I teach, I always recommend Catch just because it is so crazy simple to use.
I know it's not the fastest to compile.
I know it has a couple of issues for like multi-threaded testing and stuff
that bothers some people.
But if you need a testing framework and you don't have one yet,
you know,
#include catch.
Yeah.
Yeah.
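For anyone who hasn't tried it, this is roughly all it takes to get started with the single-header Catch2 v2: one define, one include, and you have a test runner. The factorial function is just the stock tutorial-style example, not anything from the episode.

```cpp
#define CATCH_CONFIG_MAIN  // ask Catch to generate main() for us
#include <catch2/catch.hpp>

unsigned factorial(unsigned n) {
    return n <= 1 ? 1 : n * factorial(n - 1);
}

TEST_CASE("factorials are computed", "[factorial]") {
    REQUIRE(factorial(1) == 1);
    REQUIRE(factorial(5) == 120);
}
```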
Okay.
And then the last post I have is from Arne Mertz's blog, and it's std::string
is not a Container for Raw Data.
Yeah. When I saw this post, I just immediately thought back to one of the lightning talks from CppCon, which was, you know, kind of done in jest, but it
was a talk about std::basic_string for, you know,
non-string things, like you could have a std::basic_string of bools instead of a vector of bool.
And it was a really good talk. You know, he wasn't really telling people that they should go
and do this, but it was an interesting talk to see what you could do with it. And I'm not
really sure like why Arne felt the need to write this post
and tell people not to do it.
Was he replying to that lightning talk?
I'm not sure.
Well, I think if he was replying to the lightning talk,
he would have specifically called out the fact that you could use basic string,
which he didn't.
Yeah, he's talking about string itself.
And I've used string in the past, personally, for interacting with C APIs,
because it can just be a simple little container.
But Vector can do almost exactly the same things.
I don't know.
Yeah, it's a really good way to make your code unreadable.
std::string for data.
Yeah, it'd confuse everyone.
But if you used, like, using DataContainer = std::basic_string<std::byte>,
and then just, like, used DataContainer or whatever throughout your code,
like give it whatever name makes sense, right?
I'm mostly on board with that.
Although Arne calls out a couple of things here that, like, yeah,
depending on what operation you're using on the basic string class,
it might be looking for like null, which is a fair point,
and I had not considered that.
But at the same time, you could have this issue with any arbitrary UTF-8 data
and a std string too.
So I don't know.
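As a rough sketch of the alias idea: give the buffer a meaningful name so the rest of the code never says "string" when it means "raw bytes". DataContainer is the hypothetical name from the conversation; std::vector stands in here because std::basic_string over a non-character type relies on a char_traits specialization the standard doesn't guarantee, which is part of what makes the string version contentious.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical alias: the meaningful name is the point; the container
// underneath is an implementation detail the rest of the code ignores.
using DataContainer = std::vector<std::byte>;

void append(DataContainer& buf, std::byte b) {
    buf.push_back(b);  // contiguous storage and cheap growth, as with string
}
```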
Well, it does look like one of the commenters is saying that Google's protocol buffers uses this,
uses strings for other types of data, raw data.
So maybe that was kind of the genesis of this post.
Yeah.
Have to think about this one some more.
And we'll have a link to both the video and the article in the notes.
Yeah, I do suggest watching the lightning talk. And if I might say, that lightning talk was one of my previous students,
and it was a topic that came up in class. He brought it up, and I said, you have to propose
a talk on this. Hopefully at some point it will become a full session talk, because I think
there's actually a lot to talk about by the time you talk about small string optimization
and the optimizations that basic_string can do around trivial types,
that vector just can't do in the same way.
Right.
And I think we did talk to him
during our lightning talk interview at CppCon,
but as I think we mentioned before,
we unfortunately lost some of those interviews and he didn't come around
for the second read. Yeah, he didn't find out about the redo until
after it was too late. Right. Okay. Yeah, Arne says something
at the end about, you know, if you really want to do this
maybe you should use a stronger type and have
a wrapper structure that you can give a meaningful name
I'm all on board with stronger typing
so yeah, I'm not going to argue with that
Okay, so Lenny, do you want to start off
by just telling us a little bit more about the work you do at QuantLab?
Sure
So I came to QuantLab
after spending a number of years doing high performance systems that were all focused on high throughput.
And so when I joined, I kind of had to change my view of the world.
You know, one layer's throughput is another layer's latency.
So I kind of had to rethink how I go about this problem.
I work on the trading system.
So QuantLab is a high-frequency trading company.
And we work on everything from market data parsers to market modeling and simulation to order entry systems.
So basically the full system end-to-end.
I guess. No, I'm sorry. Go ahead.
I also do a lot of, I focus a lot on the build process and tooling.
But generally we're writing C++ every day.
From the standpoint of maybe our listeners who don't know the difference,
could you describe what you mean between throughput versus latency?
Yeah. So when I refer to throughput, I mean how much data we can push through the system.
And with latency, I'm referring to how fast any given piece of data can flow through the system.
So with a high throughput system, you might not necessarily care if each individual item can pass through quickly.
So if you think about a message that comes in, you might not be worried how long it takes the message to get through,
but you want to get as many through as possible within a given time period.
With low latency, you want each individual message to be processed as quickly as possible.
And maybe you don't care as much about how many you can process,
so you might trade different things to achieve this.
What kind of systems did you work on in the past
that were high throughput?
So I worked on...
So before I joined QuantLab,
I was working on load balancers,
so HTTP load balancers.
And in that case,
at web latencies, you're more concerned with human perception of the latency.
So adding another millisecond might not matter because the latencies are going to be in the 50 to 100 milliseconds across the web. So you might not really, certainly microseconds
in that case aren't going to matter too much. You don't want to add them all over the place.
Right. I also worked on data capture systems. So like network capture devices that
can record huge amounts of traffic without dropping any packets.
And data storage systems that need to...
So there were high-performance computing systems that were responsible for connecting large numbers of devices
to some sort of back-end storage.
And those needed to be able to get all the data needed to all the devices as quickly as possible,
but we weren't necessarily concerned about microseconds of latency.
Right. If I could exaggerate the numbers, maybe. So for, like, a network sniffing, logging, whatever system,
you don't care, if you will, if it takes a minute from the time that the byte
hits the network to the time that you actually write it to disk. You just care that you're able
to handle so many terabytes per second or whatever, and that they all eventually get on the disk. Right. Yep. Okay. But
with a high frequency trading we're very much concerned with when the market data
is disseminated by the exchange and how quickly we can act upon it because there are going to be
other market participants who also want to act upon possibly the same data.
So whoever can do it first is able to make the trade that they want. The data rates in the systems tend to be fairly low compared to something like a network capture system
that's trying to capture all of the data on a company's network.
That might be gigabits or even terabits per second,
whereas a market exchange is going to be perhaps maybe a few hundred megabits per second.
Oh, wow. Yeah, that's considerably smaller scale.
Orders of magnitude, if you will.
So what's your development process like?
I mean, what types of problems specifically are you working on each day?
So we're generally focused on writing low latency market data parsers and getting the outputs of our market model.
When the market model decides to trade, get that order to the exchange as quickly as possible also.
So we're generally focused on those kinds of problems, but it's very data-driven.
So we have lots of different projects to pick from, and we try to decide which projects based on what's going to provide us the best, provide us the biggest return.
Okay. We're a very agile shop. So we do agile scrum with a lot of elements of extreme programming,
like pair programming. And we try to generate high level business behaviors that are desired to meet that
data driven decision. So we try to make our data driven decision on what project and then figure
out what the business behaviors are from that. And then we try to make them happen.
Okay, you just mentioned pair programming and extreme programming.
It's actually, I don't think I have that book anymore.
I had one of the earlier books on extreme programming at one point.
I was working for a company and the owner of the company heard about extreme programming.
So he gave us all books on it.
But I've only personally done a little bit of pair programming.
How well does that work for you all?
I was new to it coming here, and I'm a convert.
I'm totally sold on it.
I think it's been really good to give us focus on projects so we can get the whole team focused.
We often even do mob programming, which is kind of like a pair but more than two people. So we'll have mobs of maybe three or
four people all working together on a project.
And we found that it works really well for us.
We're able to all
be focused, and when one person needs to
go to another meeting or something, we can,
we just carry on. We don't, we don't necessarily stop because one person in the group has to
do something else. Do you feel like this is, like, fundamentally faster,
or slower with better outcomes, or how does that work? I think it's faster over time,
because as long as everyone's engaged, everyone has to learn the topic. So you don't
necessarily have people in the team, at least people from that mob, who don't know a part of the system. Everyone
knows it, and everyone can contribute there quickly. So it keeps the entire team leveled up.
But within a short time frame, it does reduce the parallelism of the team.
So since all the members of the team are working on the same thing,
we're not working on maybe four different things. Okay. It does seem like, if I could just imagine
this in a regular daily work environment, if you are stuck on a problem, you probably get unstuck faster, I would think.
Yeah.
Yeah, there tends not to be as many times that you are stuck at all.
Because somebody always has an idea on how to progress.
Right.
Although I'm reminded of that classic XKCD where everyone's fooling around by sword fighting because they're compiling, right?
Now, what happens to the mob during the compilation phases?
We really focus heavily on keeping our build time short.
Okay.
So most of our builds are around one minute.
And we do... it's all a single translation unit with headers.
So every project... well, not every project.
QuantLab has been around now for about 20 years, and there's a lot of sort of legacy codebases
that have different styles in them.
Okay.
But the projects that I've been working on most recently
are all single translation unit.
Now, if you said this randomly to someone,
oh, yeah, our whole project is a single translation unit
with a bunch of includes,
the average person would assume this would slow down compilation time.
Right. The projects tend to be pretty small in size. We're trying to
react to the market quickly, which means that we want our binaries to be small because we want everything to fit in cache.
And that also means that we're probably going to eliminate as much code as possible to keep
that really tight.
And then when we do find explosions in build time, we quickly address them. So
if somebody adds something that is pathological to GCC or something like that, we'll quickly try
to address that. Do you have a continuous integration system or anything that will
alert you if build times have increased? It doesn't automatically alert us.
We just keep an eye on it.
But also, since the build times are short,
the development cycle tends to get painful
when those build times increase.
Right, so everyone just notices it anyhow.
And then somebody's complaining about it
in the daily scrum or something like that.
How do you benchmark, diagnose, fix build time issues?
How do you figure out what made the change?
We try to keep all of our merge requests really small.
So when a change goes in,
we can tell which one changed it
because we can just scroll back in time
in the list of CI builds
and see, oh, the build went from a minute and a half
to seven minutes on this commit right here.
So there must be something in this commit.
And then from there, it's just kind of trial and error, you know, trying to comment out different
lines, different function calls to see which thing actually caused it. Okay. Yeah, to my last students,
I like to point out that git bisect is a very handy tool for diagnosing some issue in your code.
And my last class kind of jokingly complained that I didn't mention git blame as a useful tool as well.
I like to have a git alias for blame to praise, because it just feels better to type it.
That's funny.
So I believe you also do, like, behavior-driven development or TDD kinds of
things there. Do you want to talk about that?
Yeah. For a number of years now,
the whole QuantLab development team has been really strict TDD. So what I mean by that is
everything is always done in what we call a red-green refactor cycle. So first you need a
failing test. That's the red part of the cycle. Then you write the code that makes it pass, then you're green,
and then you refactor to a point where you're happy with what you have,
and then add a new test that is red.
And one of the people I work with likes to say that if you don't have a red test,
you don't have a license to write new code.
So QuantLab has been doing this for quite a while.
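For reference, a minimal sketch of one turn of that red-green-refactor cycle, written with Catch; round_to_tick is a made-up example, not Quantlab code.

```cpp
#define CATCH_CONFIG_MAIN
#include <catch2/catch.hpp>
#include <cmath>

// GREEN: just enough implementation to make the failing test below pass.
double round_to_tick(double price, double tick) {
    return std::round(price / tick) * tick;
}

// RED came first: this test was written, and watched fail, before the
// function above existed.
TEST_CASE("prices are rounded to the nearest tick") {
    REQUIRE(round_to_tick(10.013, 0.01) == Approx(10.01));
}

// REFACTOR: tidy the implementation while the test stays green, then write
// the next failing test before adding any new behavior.
```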
And more recently, we've been getting into behavior-driven development,
where we like to define the behaviors at sort of the business-logic level
and then go and make that happen. And maybe we do some TDD along the way,
writing unit tests for something that we're adding.
But we're trying to achieve a general behavior,
and we like to do this to avoid over-constraining the behavior.
If you do really strict TDD with a lot of unit tests, you might end up in a situation where you have a very rigid system.
Everything is set in concrete.
And to change anything, now you have to change a lot of tests. So if you can just
specify at the business level what you want, it gives you the guardrails to be able to freely
refactor and change the internals as long as it still meets the behavior that's desired.
Okay. Interesting. Yeah.
So your team used to be very distributed.
How did you deal with that through development?
Yeah, it still is really distributed.
It's getting less so.
We've kind of been concentrating in Denver,
at least the team that I'm on. So the team that I'm on is split at the moment between Denver, Boston, and Houston.
And it used to be Denver, Boston, London, and Houston.
So we have reigned in the time zones a little bit.
But overall, the whole company is pretty well distributed.
The headquarters is in Houston, but we have offices in New York, New Jersey, and Amsterdam also.
Okay.
So dealing with this can be trying, I guess.
We have a sort of informal relationship manager in Houston that helps us with just general communication. When face-to-face communication is lacking, like it is when we're not all in Denver,
things sometimes are misunderstood or just not fully communicated.
So we do travel to alleviate that pressure, but it's not always perfect.
So those are sort of the human problems with being remote, but also we have a lot of technical challenges.
Like we were talking about earlier with CLion, using CLion from a remote office is difficult. We usually use tmux with shared terminals. So we all will log into a shared tmux session, and we use that for
pairing. So we can be distributed. The guys in Boston can just be on the same tmux with us.
And then for audio, we usually use Zoom.
So we have an audio link with them, and then we also have the shared tmux session.
So that's interesting.
You've got audio, you said, with Zoom, but you're not taking advantage of the screen sharing capabilities.
Yeah, sometimes we do.
When there's a need to share a screen for, you know, maybe web browsing or something like that, we will.
But most of the time we're spending time on the shell and in Vim or Emacs.
And you've been able to successfully work this out with mob programming and shared screen and just voices in the room.
Yeah. And it even works so well for us that sometimes we have people that are in the Denver office who are both sitting at their own desks,
both logged into the same tmux session,
and with headphones on, talking to each other over Zoom.
That's how well it works.
It's kind of weird sometimes.
I just realized, since you've mentioned Denver several times,
I guess this is the point for the disclosure here that we didn't mention before,
that yes, Lenny does come to my meetup, and we do know each other in person as well.
So I probably maybe should have mentioned that earlier.
And Lenny has demonstrated this mob programming technique for our meetup.
We've done two times now, maybe three times?
Two or three, yeah.
I think maybe two times I was there
and one time I was gone traveling.
And so Lenny is one of my backups
to make sure that the meetup keeps happening, basically.
And that's been a good experience.
It's what, I don't know,
you could farm yourself out to other meetups to demonstrate this mob
programming or something.
Yeah. I found that at the meetup it really slowed things down, because
the mob was really big. Yeah. And since people hadn't worked together before,
there's, like, really widely varying opinions. Quite right. And as people work together more,
the style and opinions kind of join, they kind of meet somewhere. But then you get the, you
know, old-school C guy who wants to, you know, implement the loop in assembly language, and you get the newer-school
person who's like, no, we should just use, you know, all the algorithms. And then we get to discuss and
compare what the advantages are and what the performance actually looks like. Right. Yeah.
Compiler Explorer is a savior for that. I just had a student say that Compiler Explorer is the godsend.
I was like, I see what you did there.
I don't think you did it on purpose, but it was still hilarious.
I wanted to interrupt this discussion for just a moment
to bring you a word from our sponsors.
Backtrace is a debugging platform that improves software quality,
reliability, and support by bringing deep introspection and automation throughout the software error lifecycle.
Spend less time debugging and reduce your mean time to resolution by using the first and only platform to combine symbolic debugging, error aggregation, and state analysis.
At the time of error, Backtrace jumps into action, capturing detailed dumps of application and environmental state. Backtrace then performs automated analysis on process memory and executable code to classify
errors and highlight important signals such as heap corruption, malware, and much more.
This data is aggregated and archived in a centralized object store, providing your team
a single system to investigate errors across your environments.
Join industry leaders like Fastly, Message Systems, and AppNexus that use Backtrace to modernize their debugging infrastructure.
It's free to try, minutes to set up, fully featured with no commitment necessary.
Check them out at backtrace.io.
Going back to high performance and low latency code, do you use any particular tools for optimizing? Yeah. We sort of have, I don't know,
three or four phases of performance optimization. So I guess the first one is just constexpr
all the things. You know, we try to move as much as we can to compile time,
and we do that through heavy refactoring.
So we're always looking to move more computation to compile time.
And then we have different layers
that give us different results.
So we like Google Benchmark for doing micro benchmarks.
But they're sort of fun, right?
The turnaround time is really quick.
It's sort of like a game.
You can make a change.
You get immediate feedback.
But the problem is that we often get misleading results from that. You might make a certain
optimization that's really good in the benchmark but doesn't actually work. I think the canonical
example of that is something that just completely optimizes away in the benchmark.
So it looks great in the benchmark, but... Zero milliseconds, yes. Yeah, but in real life it's... Zero milliseconds, yes.
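That optimize-away pitfall is exactly what Google Benchmark's escape hatches are for; a minimal sketch, where compute_something is a hypothetical function under test:

```cpp
#include <benchmark/benchmark.h>

// Hypothetical function under test.
int compute_something(int x) { return x * x + 1; }

static void BM_Compute(benchmark::State& state) {
    for (auto _ : state) {
        int result = compute_something(42);
        // Without this, the compiler can see the result is unused and delete
        // the whole call -- the "looks great, zero milliseconds" benchmark.
        benchmark::DoNotOptimize(result);
    }
}
BENCHMARK(BM_Compute);
BENCHMARK_MAIN();
```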
Valgrind is really useful to give us a graph of the bottlenecks and the performance profile of the system, where the time is being spent.
And you're talking specifically Cachegrind or whatever, right?
Yeah, Cachegrind and Callgrind.
Callgrind, okay.
Yep. So we can really see where time's being spent in a visual way, which is useful.
But it's also pretty slow to run. So depending on what you need, it may be slow.
And then, we also like...
So one of the interesting things about using that is that you can turn it on and off programmatically inside of your program.
So there are, like, Valgrind macros that you can put around certain things, if you want to know the number of cycles.
Let's say you have a hot loop that's polling a network socket. You don't really care about all the cycles that are being spent
in that. You only care about the cycles once you
actually get data off of the socket. So you turn it on once
we have the data. Interesting. So like
C macros that you compile in. Right, yep.
Okay, I've never used those.
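Those are the client-request macros from valgrind/callgrind.h. A sketch of the socket-polling example Lenny describes, run under valgrind --tool=callgrind --instr-atstart=no; poll_socket and process are hypothetical stand-ins:

```cpp
#include <valgrind/callgrind.h>
#include <optional>

struct Packet {};                     // hypothetical message type
std::optional<Packet> poll_socket();  // hypothetical non-blocking poll
void process(const Packet&);          // hypothetical hot path

void event_loop(const bool& running) {
    while (running) {
        auto packet = poll_socket();  // spin cycles we don't want counted
        if (!packet) continue;
        CALLGRIND_START_INSTRUMENTATION;  // start counting once we have data
        process(*packet);
        CALLGRIND_STOP_INSTRUMENTATION;   // stop after the interesting work
    }
}
```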
Yeah, and then Linux Perf gives another view of the same data,
and then we can play around with increasing the number of instructions per clock cycle and making changes to affect that or memory usage, page faults.
Then all of those tests, so we have a lot of tests that are written like that,
and they all run in our CI system and publish their results into an Elasticsearch database that has Kibana on it. So we can use Kibana to draw pretty
graphs and
be able to look through
those results.
And so we can see where
we added something that
either caused a
benefit or a problem.
We can go back in time and look
at how different things affected our performance.
Can I ask, if this is information you can share, I assume it would be,
how long does it take from the time that you commit a change
to the time that you have that pretty graph back to you in your CI?
About five to ten minutes.
Okay, that's pretty good. Yeah. Yeah, so we have a big Kubernetes cluster, several thousand cores, I think.
That's big.
That has, yeah.
I only have like 20 cores in my whole house.
It's big for us, but I guess it's not big for some people.
Comcast, I've heard, has a nationwide Kubernetes
cluster. Okay. Yeah. Well, I mean, it's Comcast. Some of it probably runs on my phone gateway and
I don't know it. So in our Kubernetes cluster, we can farm out a lot of jobs simultaneously.
The actual build time of our CI system, even for some of our small jobs, is fairly large. It might be two hours of CPU time, but it executes in just a few minutes.
Because so many parallel tests are running. Right. Okay.
And then, so that gives us good application-level performance metrics, and then from there
we need to tie into the larger system, and for that we have a high-fidelity simulator. And we can, with enough data, with enough runs through the simulator,
statistically determine to within plus or minus 3.5 nanoseconds the performance characteristics
of the system. Wow. But it takes hours to run. So we generally avoid it except in the final stages
before some deployment.
So it's not something you just run nightly or something like that?
No.
It takes hours.
And we only have a few of those systems that are capable of that level of performance monitoring.
So we'd probably have quite a bit of contention for that
if everyone was trying to run their own jobs
on whatever code they wrote.
So we've talked a bit about your CI system, tools you use.
What version of the C++ standard are you using at QuantLab?
It's all C++17.
Oh, wow.
Even some things from 20 are starting to creep in.
But, so, we're using GCC 8.2 and Clang 7, and both of those support designated initializers
by turning on the C99 extensions. Right. So we've been making use of designated initializers,
and we have a concepts emulation library that we use.
So we're kind of doing concepts,
but not really concepts exactly as they are in C++20.
But we have quite a few problems with the designated initializers.
No, I'm sorry. Go ahead.
It's not fully supported yet in Clang, for instance.
Right. Yeah, I've done two C++ Weekly episodes about designated initializers now,
and each time I get comments about how I am not accurately representing what the standard allows. And I just finally realized that basically
the compilers that currently support it
are just supporting it through C99 emulation.
They're not supporting the C++20, actually,
because it only allows equals initialization,
not braced initialization of the elements.
Right, and that's where...
And Clang doesn't even support the braced initialization very well.
GCC will do it.
Okay.
All the cases that I've tried, it's worked.
Oh, I haven't gotten it to work.
I don't know what I was doing wrong.
But Clang definitely did not.
Okay.
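For reference, the shape of what's being discussed, with Order as a made-up aggregate. C++20 designated initializers accept both forms below; the equals form is what the C99-style extensions in GCC and Clang have long supported, and the braced form is where support was spottier at the time of this episode.

```cpp
struct Order {
    int    quantity;
    double price;
    bool   immediate;
};

// Equals form; members must appear in declaration order:
Order a{.quantity = 100, .price = 9.99, .immediate = true};

// Braced form for the elements; legal C++20, weaker compiler support then:
Order b{.quantity{100}, .price{9.99}, .immediate{false}};
```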
But we really like the concepts emulation stuff, because so much of our code is
compile-time stuff. We have a lot of templates, and so using concepts there
really can help the understandability and readability.
So this is Eric Niebler's concepts emulation layer?
We have kind of a home-rolled concepts library. It's very similar to how Eric's works.
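A tiny sketch of the enable_if/detection-idiom style that such C++17 emulation layers are built from; all names here are illustrative, not Quantlab's actual library.

```cpp
#include <type_traits>
#include <utility>

// Detection idiom: does ++t compile for a T&?
template <typename T, typename = void>
struct is_incrementable : std::false_type {};

template <typename T>
struct is_incrementable<T, std::void_t<decltype(++std::declval<T&>())>>
    : std::true_type {};

template <typename T>
inline constexpr bool Incrementable = is_incrementable<T>::value;

// The "concept" then constrains a template via enable_if, so a bad T fails
// at the declaration with a readable message instead of deep in the body:
template <typename T, std::enable_if_t<Incrementable<T>, int> = 0>
T next_value(T t) { return ++t; }
```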
So are you also then trying to take advantage of ranges and that kind of thing?
We have one project that I know of internally that's using ranges.
We haven't really started using that yet
very widely. We're using the range-v3 library, and it's kind of a toy project at the moment,
but it's an interesting one. I think that once ranges is supported in the standard library,
I think we'll start using it right away.
Do you ever have any concerns as long as you're using the latest and greatest compilers and
trying to push the latest and greatest features that you're actually getting bad code generated
or anything like that? Yep.
Yep. I think I'm going to give a lightning talk at our meetup about that next week.
Oh, that sounds great.
So we actually found some problems with the code being generated for std find.
What?
Yeah.
I mean, that's been around forever.
Yeah, so GCC 8.2, at least.
I'm not sure which other versions.
GCC 7 was not affected by this, but GCC 8.2 is.
And so we found some optimization problems in GCC with std::find when using a random access iterator.
Last time I looked, and it's been a while, GCC's standard library implementation for things like
find had partial loop unrolling baked into the algorithm, and now I immediately wonder if
that would be coming into it. That's coming into it, yep. Okay.
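For context, the shape of the call in question is nothing exotic; the GCC 8.2 specifics are being saved for the lightning talk.

```cpp
#include <algorithm>
#include <vector>

// std::find over a random-access range: the case where libstdc++'s
// hand-unrolled find loop comes into play.
bool contains(const std::vector<int>& values, int needle) {
    return std::find(values.begin(), values.end(), needle) != values.end();
}
```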
I don't want to give away my lightning talk. Yeah, that's all right. We're doing lightning talks at our meetup next week.
Yes.
Hopefully we have a good turnout for that.
Well, as you guys were talking about, you guys have met before at the meetup,
and I got to meet you at CppCon this year, where you were a volunteer.
Do you want to tell us a little bit about the volunteer program
and your experience with it?
Yeah.
I got started just kind of on a kick.
I thought, hey, I'll volunteer and see what happens.
And it's been really good.
It's been a really interesting way to meet people at the conference
because you meet a lot of people that you would probably not have interacted with otherwise.
Just, you know, someone passes you in the hall and asks you a question.
And then just because you're wearing the teal shirt and then you start talking to them.
And 20 minutes later, you go out to lunch with them.
And, you know, it's so it's that's been a really fun way to do it.
And also, you really only have to work a single five-hour shift per day.
During your shift, you often attend the talks that you wanted to go to.
You're kind of just attending the conference with a couple extra responsibilities.
Your admission is free, obviously, if you're a volunteer.
And there are scholarships for some students.
So it's a limited set of scholarships that John's able to give out.
But students apply and can come for free.
And then with the scholarship, they
also get their hotel and flights paid for. Oh, wow.
So, I mean, I don't think you said this, but how many years have you been volunteering? It's not like you just did this once
and it was awesome. I've done it twice. Twice, okay. Yeah. So are you planning to volunteer again in 2019?
I am, and this time it's in Colorado.
It's in Colorado.
So that's going to be, yet again, a different experience.
I think having the really large single event center
is going to change the dynamics of the conference a bit.
Right.
I'm really excited to see how that goes.
I've been to C++Now in Aspen several times, and I think that it might have elements of
that where you're kind of isolated because being at that conference in Aspen, it's
smaller.
Right.
But it's also isolated where everyone is staying at the same hotel.
And there's kind of one bar that people hang out at.
And the town is basically shut down in May because it's Colorado's mud season.
So everyone just stays together. There aren't, like, big factions that leave to either go home or to go out to
dinner somewhere else. Everyone's kind of together, and I think that might happen at this
event center too. So I'm really interested to see how that goes. Yeah, the Gaylord of the Rockies.
And the aspects that you just brought up, I don't believe we've discussed on this show
yet, but yes, everyone will be
effectively in one hotel,
or even if you're in one of the outlying hotels,
they're not
more than a mile or
something away.
There's some that are being built right there,
right next to it and everything.
We expect that everyone
will be hanging out together all,
who knows, 1400 of us, however many people sign up this year.
Should be interesting.
Yeah. So aside from that, is there any other way that you would like to encourage
the listeners today to start thinking about volunteering, signing up, submitting
proposals, whatever?
That stuff hasn't been opened up yet, but it probably will be shortly in the next couple of months, I believe.
And the conference is in September. Yeah.
Yeah. You know, if you haven't been to Colorado before, this is a great opportunity to come here.
But you might want to check out... you might want to stay a few extra days, because there is a lot to do here.
September is, I think, the best weather in Colorado of the year.
It's usually warm and sunny and just a beautiful time to be here.
There's national parks nearby.
There's tons of breweries.
There's microbreweries, and then there's Coors, right?
Like, everything.
And Budweiser.
Everyone forgets that Budweiser actually has a large brewery here, too.
If you're into that kind of thing.
Like, there's Old West historical mining towns.
There's all kinds of things to do.
I think we have four national parks now, if I have that number right.
Black Canyon, Sand Dunes, Rocky Mountain National Park, and Mesa Verde.
And Garden of the Gods.
Is that a national park?
I thought it was.
I don't know.
I don't know.
And I believe you were looking at the schedule. Now, not that we're saying the only reason to come to CppCon is for beer, but I believe,
if I remember right, you told me that the Great American Beer Festival is the week after CppCon.
Yep. Oh, wow. And that's a big deal. If you do plan to do that, listeners, make sure you buy your tickets as soon as they go on sale for the Great American Beer Festival.
Then buy your plane tickets if you're going to plan to stay here for two weeks because of the conference.
Yeah, and another interesting thing for people coming from Europe, I think there might be more flights from Europe to Denver than
there are to Seattle. It might be an easier trip. Yeah. So United is our main carrier for
international flights coming in, but they're partners with Lufthansa, which will get you,
I think, a few separate direct flights from Europe. And I think Air France might have a
couple of direct flights. And Icelandair has a couple of direct flights as well.
Oh, well, direct from Iceland.
Okay.
Well, it's been great having you on the show today, Lenny.
Is there anything else you wanted to plug or talk about before we let you go?
No, I think we covered a lot of it.
Okay.
Can people find you online anywhere?
I'm on Twitter. Not that often, but I am.
My Twitter handle is just my name, @lennymaiorani.
It's a lot of vowels, but you should be able to find it.
We'll put it in the show notes.
Okay, well, it's been great talking to you today, Lenny.
Thanks, you too.
Thanks for coming on.
Thanks so much for listening in as we chat about C++.
We'd love to hear what you think of the podcast.
Please let us know if we're discussing the stuff you're interested in,
or if you have a suggestion for a topic, we'd love to hear about that too.
You can email all your thoughts to feedback at cppcast.com.
We'd also appreciate if you can like CppCast on Facebook and follow CppCast on
Twitter. You can also follow me, @robwirving, and Jason, @lefticus, on Twitter. We'd also like
to thank all our patrons who help support the show through Patreon. If you'd like to support
us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info
and the show notes on the podcast website at cppcast.com.