CppCast - TDD, BDD, Low Latency and CppCon

Episode Date: November 29, 2018

Rob and Jason are joined by Lenny Maiorani from Quantlab to discuss high performance computing, pair programming, volunteering for CppCon and the site of next year's CppCon.

Lenny has been using C++ off and on since 1995. Since graduating from SUNY Plattsburgh with a degree in Computer Science, he has been working at startups focused on high-throughput applications. About 2 years ago he joined Quantlab and discovered a different type of high-performance computing in low latency systems. Lenny lives in Denver, Colorado with his wife Lexey and their dog. He can be found hiking in the Colorado mountains while thinking about container access patterns and wondering if std::map can be renamed to std::ordered_map.

News:
- Better template support and error detection in C++ Modules with MSVC 2017 15.9
- What's new in CLion 2018.3
- std::string is not a Container for Raw Data
- Counterpoint

Lenny Maiorani: @lennymaiorani

Links:
- Quantlab
- CppCon 2014: Lightning Talks - Lenny Maiorani "Test-Drive Performance"

Sponsors:
- Backtrace
- JetBrains

Hosts: @robwirving @lefticus

Transcript
Starting point is 00:00:00 Thank you. at backtrace.io slash cppcast. And by JetBrains, maker of intelligent development tools to simplify your challenging tasks and automate the routine ones. JetBrains is offering a 25% discount for an individual license on the C++ tool of your choice. CLion, ReSharper C++, or AppCode.
Starting point is 00:00:39 Use the coupon code JETBRAINS for CppCast during checkout at jetbrains.com. In this episode, we discuss IDE updates and storing raw data in strings. Then we talk to Lenny Maiorani from QuantLab. Lenny talks to us about high-performance computing and much more. Welcome to episode 177 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? Doing all right.
Starting point is 00:01:49 How are you doing, Rob? I'm doing fine. We're going to just talk about Star Wars this episode, right? Yeah, well, yes. I mean, we could recap for our listeners and then let our guests weigh in in just a moment. It's just an argument of whether or not the last two movies are worth watching, basically which one is the worst. I mean, I have strong and mixed feelings about The Last Jedi. I think
Starting point is 00:02:13 Solo was better than Rogue One, but, uh, that's kind of my summary. Yeah. Yeah, and I like Rogue One better than Solo because it introduced this whole fuel problem, and Star Wars never had a fuel problem before this. We've got 36 years or whatever of Star Wars with no fuel problems, and now there's fuel problems. Although I bet if you read the books, they're probably talking about fuel all the damn time. That's probably
Starting point is 00:02:38 true, but I think the books are actually science fiction, not science fantasy. Or whatever. Anyway, at the top of the episode, I'd like to read a piece of feedback. This week, we got a tweet from Patrice Roy saying, listened to the CppCast LEWG Trip Report with Ashley Hedberg episode.
Starting point is 00:02:56 Thanks a lot. I missed San Diego, and I miss going to WG21 meetings a lot these days. I haven't been able to attend the last two, but should be in Kona, fingers crossed. Interesting through and through. And yeah, that was a great episode we had with Ashley. We still have a lot more to go over
Starting point is 00:03:12 from the San Diego meeting, and we should have another guest on to talk about that soon. I was just debating if I should try to make it to Kona. Oh yeah? The problem would be C++ on Sea. So I would have to go to C++ on Sea and then come home for like three days and then fly to Hawaii. Okay. So Kona is the week after
Starting point is 00:03:33 C++ on Sea? Or no, no more than two weeks after it. I mean, it would be, it would be tighter than I want to be for flying halfway around the world, basically. Right. Now, if you went, would you go with Jen and actually make a vacation out of it, or would you try to go to the meetings every day? If I go to Hawaii, my wife would come with me. The question is, I don't know. Since I'm not an official committee member, it would be up to my discretion
Starting point is 00:04:00 how many meetings I attended, I guess, and I haven't decided what I would do. That's part of the problem, really, because I know it's intense and it's a long week. Well, you have to let me know what you decide to do, okay? Yeah. Well, we'd love to hear your thoughts about the show as well. You can always reach out to us on Facebook, Twitter,
Starting point is 00:04:17 or email us at feedback@cppcast.com, and don't forget to leave us a review on iTunes. Joining us today is Lenny Maiorani. Lenny has been using C++ off and on since 1995. Since graduating from SUNY Plattsburgh with a degree in computer science, he's been working at startups focused on high-throughput applications. About two years ago, he joined QuantLab and discovered a different type of high-performance computing
Starting point is 00:04:39 in low-latency systems. Lenny lives in Denver, Colorado, with his wife, Lexey, and their dog. He can be found hiking in the Colorado mountains while thinking about container access patterns and wondering if std::map can be renamed to std::ordered_map. Lenny, welcome to the show. Hi. Hi, Rob. Hi, Jason. Have you actually considered submitting a proposal to change the name of std::map to std::ordered_map? I've heard that that proposal process, uh, is quite complex and takes
Starting point is 00:05:07 a long time and a lot of effort so i'm just hoping somebody else will do it at this point i'd imagine that'd be a hard sell just because it would break all existing std map code yeah right but maybe we could deprecate it and eventually get it up. Make a type alias for the name, deprecate the old one. I think the standards process would allow for that today, whether or not it would get accepted. Yeah. Although, just as an aside, std autopointer, broken from the beginning. Everyone agrees it's been broken from the beginning. I have my copy of effective C++ that says don't use it, right?
Starting point is 00:05:52 It was removed in C++17. Today, GCC 8 and C++20 mode still has AutoPointer because they don't want to break existing code. But it is marked deprecated, right? It's marked deprecated. That's not conformant with the standard. In C++17 mode, it's supposed to be removed. And if removing autopointer breaks existing code,
Starting point is 00:06:16 Okay, well, Lenny, we've got a couple of news articles to discuss. Feel free to comment on any of these, and we'll start talking more about your work at QuantLab and other stuff, okay? Okay. So this first one I have is from the MSVC blog: better template support and error detection in C++ Modules
Starting point is 00:06:40 With the new 2017 15.9 update And they haven't been talking much about SQL++ module support in MSVC for a while, but I guess they've been kind of under the covers making a bunch of changes and improvements, and they're just announcing and laying out all the improvements that they've had, that they've made recently. And, yeah, better constexpr support, two-phase name lookup, it all seems pretty important for modules to work properly. Yeah. It also looks like they're just trying to get
Starting point is 00:07:13 aligned with the current wording of the standard as well. Right, because that has changed a bit over this past year in the committee, right? Yeah. Have you worked with modules at all, Lenny? No, I haven't touched it at all. Yeah, me neither. Yeah, it seems like one of those things where it might be just better off to wait until it's official
Starting point is 00:07:36 and all the different platforms start implementing it. Yeah. Okay, was there anything else you wanted to talk about or point out here, Jason? No, just that it's good to see Visual Studio constantly working on getting all these things better. Yeah, one of the major improvements they're calling out here is improved error diagnostics. One example here, with a very, very simple piece of code, went from a 10-line error that doesn't really tell you what the problem is to a nice, easily understandable two-line error message. Right.
Starting point is 00:08:15 Just kind of, I guess, as an aside again, my ChaiScript has exactly one compilation error left for C++17 support now. So I should probably, now that it's down to exactly one thing and I don't think they're working on it, I should probably submit a bug report on that. Oh, yeah, definitely. Okay, with other compiler or IDE news, we also have a post from CLion.
Starting point is 00:08:38 They just put out their 2018.3 release. And this has a bunch of new features. The main one is their initial remote development support. And this will basically allow you, if you're using CLion on Windows, Mac, or Linux, to remotely build and debug on a different machine. I think currently it's just a remote Linux host. It's a pretty cool feature.
Starting point is 00:09:03 Yeah, this is pretty interesting for QuantLab too. A lot of the QuantLab developers use CLion. I'm more like Jason. I'm not an IDE user, but I think that a lot of people will like this because the way that most of them are doing it right now is remote X display. And, you know, it's laggy and just not a great experience. Oh, like running CLion on the remote host
Starting point is 00:09:35 and exporting the display back over SSH or whatever. Yeah, exactly. Oh, yeah, I've never tried a remote X with something as complicated as an IDE before. I think Netscape back in 1997 was probably the last time I tried something that complex over X. Yeah, so the people using it, like a lot of the folks in Houston, in our Houston office use it, and our data center is also in Houston. So that's not too bad for them. But for our remote
Starting point is 00:10:05 offices, like here in Denver, that the lag is basically prohibitive. Wow. Did you know if anyone's tried this yet? It looks like it just came out two days ago. I don't know if anyone has. Rob, you've done a lot of kind of embedded device stuff. We've talked about that a little bit on the show. What about remote? But this isn't the same kind of thing. Would this be useful? What I'm trying to ask is, could you imagine if this would be useful if you were still doing handheld devices
Starting point is 00:10:35 or something like this? I don't even know. Since it's remote Unix host only, I don't know if that makes sense or not. Yeah, I never worked with any remote Linux devices. I was always working with a remote device that you would plug in via USB like a Windows CE
Starting point is 00:10:51 device, something like that. And that usually had built-in support through Visual Studio. I wonder how debugging is going to work with this feature. If you can do remote debugging also. I mean, it seems to support both remote build and remote debug. Yeah.
Starting point is 00:11:08 So it seems like it should help your team. Yeah. I guess, and besides that, there's a couple other things on here. Like, there's clangd-based navigation and search, so they're pairing up a little bit with Clang like everyone is. Right. And better C++17-specific feature support. That's cool.
Starting point is 00:11:29 Yeah, so they hooked into CPU profilers like perf or DTrace on Linux and Mac. What were you going to say, Lenny? Yeah, that's what I was going to say, too. That looks pretty exciting. And they have Catch support for their unit testing. I think we gave Phil a hard time about that when he first started there because they didn't yet support Catch, and I don't know when that was added.
Starting point is 00:11:54 Yeah, I don't think this is the first release it's been added, but yeah, I'm glad they got that in there with Phil working for them. I have to say, honestly, everywhere I teach, I always recommend Catch just because it is so crazy simple to use. I know it's not the fastest to compile. I know it has a couple of issues for, like, multi-threaded testing and stuff that bothers some people. But if you need a testing framework and you don't have one yet, you know, #include catch.
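For anyone who has not seen it, a minimal sketch of that setup, assuming the single-header Catch2 v2 layout:

```cpp
// test_main.cpp: the entire test harness, as a sketch.
#define CATCH_CONFIG_MAIN  // ask Catch to generate main() for us
#include <catch2/catch.hpp>

#include <vector>

TEST_CASE("push_back grows the vector") {
    std::vector<int> v;
    v.push_back(42);
    REQUIRE(v.size() == 1);
    REQUIRE(v.front() == 42);
}
```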
Starting point is 00:12:19 Yeah. Yeah. Okay. And then the last post I have is, uh, from Arne Mertz's blog, and it's std::string is not a Container for Raw Data. Yeah, when I saw this post, I just immediately thought back to one of the lightning talks from CppCon, which was, you know, kind of done in jest, but it was a talk about std::basic_string for, you know,
Starting point is 00:12:46 non string things like you could have a stood basic string of bulls instead of a vector of bull. And it was a really good talk. Um, you know, he wasn't really telling people that they should go and do this, but it was a, an interesting talk to see what you could do with it. Um, and I'm not really sure like why Arne felt the need to write this post and tell people not to do it. Was he replying to that lightning talk? I'm not sure. Well, I think if he was replying to the lightning talk,
Starting point is 00:13:16 he would have specifically called out the fact that you could use basic_string, which he didn't. Yeah, he's talking about string itself. And I've used string in the past, personally, for interacting with C APIs, because it can just be a simple little container. But vector can do almost exactly the same things. I don't know. Yeah, it's a really good way to make your code unreadable.
Starting point is 00:13:38 std::string for data. Yeah, it'd confuse everyone. But if you used, like, using data_container = std::basic_string<std::byte>, and then just used data_container or whatever throughout your code, like give it whatever name makes sense, right? I'm mostly on board with that. Although Arne calls out a couple of things here that, like, yeah, depending on what operation you're using on the basic_string class,
Starting point is 00:14:04 it might be looking for like null, which is a fair point, and I had not considered that. But at the same time, you could have this issue with any arbitrary UTF-8 data and a std string too. So I don't know. Well, it does look like one of the commenters is saying that Google's protocol buffers uses this, uses strings for other types of data, raw data. So maybe that was kind of the genesis of this post.
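A hedged sketch of the alias idea floated above. One wrinkle worth noting: std::char_traits is only guaranteed for the character types, so std::basic_string<std::byte> is not guaranteed to compile at all, which is one more argument for building the alias on vector instead; data_container and fill are made-up names:

```cpp
#include <cstddef>
#include <vector>

// Give the raw-byte buffer a name that says what it is. A vector avoids the
// char_traits problem that basic_string of a non-character type runs into.
using data_container = std::vector<std::byte>;

// Illustrative use: raw bytes, with no string semantics implied.
void fill(data_container& buf) {
    buf.push_back(std::byte{0xFF});
}
```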
Starting point is 00:14:36 Yeah. Have to think about this one some more. And we'll have a link to both the video and the article in the notes. Yeah, I do suggest watching the lightning talk. And if I might say, that lightning talk was one of my previous students, and it was a topic that came up in class. He brought it up, and I said, you have to propose a talk on this. Hopefully at some point it will become a full session talk, because I think there's actually a lot to talk about by the time you talk about small string optimization and the optimizations that basic_string can do around trivial types.
Starting point is 00:15:12 That vector just can't do in the same way. Right. And I think we did talk to him during our lightning talk interview at CppCon, but as I think we mentioned before, we unfortunately lost some of those interviews and he didn't come around for the second read. Yeah, he didn't find out about the redo until after it was too late. Right. Okay. Yeah, Arne says something
Starting point is 00:15:36 at the end about, you know, if you really want to do this, maybe you should use a stronger type and have a wrapper structure that you can give a meaningful name. I'm all on board with stronger typing, so yeah, I'm not going to argue with that. Okay, so Lenny, do you want to start off by just telling us a little bit more about the work you do at QuantLab? Sure.
Starting point is 00:16:01 So I came to QuantLab after spending a number of years doing high performance systems that were all focused on high throughput. And so when I joined, I kind of had to change my view of the world. You know, one layer's throughput is another layer's latency. So I kind of had to rethink how I go about this problem. I work on the trading system. So QuantLab is a high-frequency trading company. And we work on everything from market data parsers to market modeling and simulation to order entry systems.
Starting point is 00:16:46 So basically the full system end-to-end. I guess. No, I'm sorry. Go ahead. I also do a lot of, I focus a lot on the build process and tooling. But generally we're writing C++ every day. From the standpoint of maybe our listeners who don't know the difference, could you describe what you mean between throughput versus latency? Yeah. So when I refer to throughput, I mean how much data we can push through the system. And with latency, I'm referring to how fast any given piece of data can flow through the system.
Starting point is 00:17:26 So with a high throughput system, you might not necessarily care if each individual item can pass through quickly. So if you think about a message that comes in, you might not be worried how long it takes the message to get through, but you want to get as many through as possible within a given time period. With low latency, you want each individual message to be processed as quickly as possible. And maybe you don't care as much about how many you can process, so you might trade different things to achieve this. What kind of systems did you work on in the past that were high throughput?
Starting point is 00:18:11 So I worked on... So before I joined QuantLab, I was working on load balancers, so HTTP load balancers. And in that case, at web latencies, you're more concerned with human perception of the latency. So adding another millisecond might not matter because the latencies are going to be in the 50 to 100 milliseconds across the web. So you might not really, certainly microseconds in that case aren't going to matter too much. You don't want to add them all over the place.
Starting point is 00:18:53 Right. I also worked on data capture systems. So like network capture devices that can record huge amounts of traffic without dropping any packets. And data storage systems that need to... So there were high-performance computing systems that were responsible for connecting large numbers of devices to some sort of back-end storage. And those needed to be able to get all the data needed to all the devices as quickly as possible, but we weren't necessarily concerned about microseconds of latency. Right. If I could exaggerate the numbers, maybe. So for like a network uh sniffing logging whatever system
Starting point is 00:19:47 you don't care, if you will, if it takes a minute from the time that the byte hits the network to the time that you actually write it to disk. You just care that you're able to handle so many terabytes per second or whatever, and that they all eventually get on the disk. Right. Yep. Okay. But with high-frequency trading, we're very much concerned with when the market data is disseminated by the exchange and how quickly we can act upon it, because there are going to be other market participants who also want to act upon possibly the same data. So whoever can do it first is able to make the trade that they want. The data rates in the systems tend to be fairly low compared to something like a network capture system that's trying to capture all of the data on a company's network.
Starting point is 00:20:54 That might be gigabits or even terabits per second, whereas a market exchange is going to be perhaps maybe a few hundred megabits per second. Oh, wow. Yeah, that's considerably smaller scale. Orders of magnitude, if you will. So what's your development process like? I mean, what types of problems specifically are you working on each day? So we're generally focused on writing low latency market data parsers and getting the outputs of our market model. When the market model decides to trade, get that order to the exchange as quickly as possible also.
Starting point is 00:21:53 So we're generally focused on those kinds of problems, but it's very data-driven. So we have lots of different projects to pick from, and we try to decide which projects based on what's going to provide us the best, provide us the biggest return. Okay. We're a very agile shop. So we do agile scrum with a lot of elements of extreme programming, like pair programming. And we try to generate high level business behaviors that are desired to meet that data driven decision. So we try to make our data driven decision on what project and then figure out what the business behaviors are from that. And then we try to make them happen. Okay, you just mentioned pair programming and extreme programming. It's actually, I don't think I have that book anymore.
Starting point is 00:22:48 I had one of the earlier books on extreme programming at one point. I was working for a company and the owner of the company heard about extreme programming. So he gave us all books on it. But I've only personally done a little bit of pair programming. How well does that work for you all? I was new to it coming here, and I'm a convert. I'm totally sold on it. I think it's been really good to give us focus on projects so we can get the whole team focused.
Starting point is 00:23:19 We often even do mob programming, which is kind of like a pair but more than two people. So we'll have mobs of maybe three or four people all working together on a project. And we found that it works really well for us. We're able to all be focused, and when one person needs to go to another meeting or something, we can, we just carry on. We don't, we don't necessarily stop because one person in the group has to do something else. Do you feel like this is like fundamentally like, uh, faster with,
Starting point is 00:24:01 or slower with better outcomes or like, how that work i think it's faster over time because everyone it as long as everyone's engaged everyone has to learn the the topic so you don't necessarily have people in the team at least people from that mob who don't know a part of the system. Everyone knows that everyone can contribute there quickly. So it keeps the entire team leveled up. But within a short time frame, it does reduce the parallelism of the team. So since all the members of the team are working on the same thing, we're not working on maybe four different things. Okay. It does seem like, if I could just imagine this in a regular daily work environment, if you are stuck on a problem, you probably get unstuck faster, I would think.
Starting point is 00:25:09 Yeah. Yeah, there tends not to be as many times that you are stuck at all, because somebody always has an idea on how to progress. Right. Although I'm reminded of that classic XKCD where everyone's fooling around sword fighting because they're compiling, right? Now, what happens to the mob during the compilation phases? We really focus heavily on keeping our build time short. Okay.
Starting point is 00:25:40 So most of our builds are around one minute. And we do it all as a single translation unit with headers. So every project... well, not every project. QuantLab has been around now for about 20 years, and there's a lot of sort of legacy codebases that have different styles in them. Okay. But the projects that I've been working on most recently are all single translation unit.
Starting point is 00:26:18 Now, if you said this randomly to someone, oh, yeah, our whole project is a single translation unit with a bunch of includes, the average person would assume this would slow down compilation time. Right. The projects tend to be pretty small in size. We're trying to react to the market quickly, which means that we want our binaries to be small because we want everything to fit in cache. And that also means that we're probably going to eliminate as much code as possible to keep that really tight.
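A sketch of what such a single-translation-unit layout might look like; the headers and the entry point here are hypothetical stand-ins, not QuantLab's actual code:

```cpp
// app.cpp: the entire program is this one translation unit.
#include "market_data/parser.hpp"   // hypothetical component headers
#include "model/strategy.hpp"
#include "order_entry/session.hpp"

int main() {
    // Because everything is compiled together, the compiler sees the whole
    // program at once and can inline across component boundaries.
    run_trading_loop();  // hypothetical entry point defined in the headers
}
```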
Starting point is 00:27:01 And then when we do find explosions in build time, we quickly address them. So if somebody adds something that is pathological to GCC or something like that, we'll quickly try to address that. Do you have a continuous integration system or anything that will alert you if build times have increased? It doesn't automatically alert us. We just keep an eye on it. But also, since the build times are short, the development cycle tends to get painful when those build times increase.
Starting point is 00:27:38 Right, so everyone just notices it anyhow. And then somebody's complaining about it in the daily scrum or something like that. How do you benchmark, diagnose, fix build time issues? How do you figure out what made the change? We try to keep all of our merge requests really small. So when a change goes in, we can tell which one changed it
Starting point is 00:28:08 because we can just scroll back in time in the list of CI builds and see, oh, the build went from a minute and a half to seven minutes on this commit right here. So there must be something in this commit. And then from there, it's just kind of trial and error, you know, trying to comment out different lines, different function calls to see which thing actually caused it. Okay. Yeah, my last students, I like to point out that git bisect is a very handy tool for diagnosing some issue in your code.
Starting point is 00:28:46 And my last class kind of jokingly complained that I didn't mention git blame as a useful tool as well. I like to have a git alias from blame to praise, because it just feels better to type it. That's funny. So I believe you also do, like, behavior-driven development or TDD kinds of things there. Do you want to talk about that? Yeah. Um, for a number of years now, the whole QuantLab development team has been really strict TDD. So what I mean by that is everything is always done in what we call a red-green-refactor cycle. So first you need a
Starting point is 00:29:37 failing test. That's the red part of the cycle. Then you write the code that makes it pass, then you're green, and then you refactor to a point where you're happy with what you have, and then add a new test that is red. And one of the people I work with likes to say that if you don't have a red test, you don't have a license to write new code. So QuantLab has been doing this for quite a while. And more recently, we've been getting into behavior-driven development, where we like to define the behaviors at sort of the business logic level and then go and make that happen.
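A minimal sketch of one turn of that red-green cycle, reusing Catch from earlier; parse_price is a made-up function for illustration:

```cpp
#include <catch2/catch.hpp>  // main() supplied elsewhere via CATCH_CONFIG_MAIN

// Red: the test is written first, against a stub that cannot pass yet.
int parse_price(const char*) { return 0; }  // stub, so the test below fails

TEST_CASE("prices parse as integer ticks") {
    REQUIRE(parse_price("101.25") == 10125);  // red until implemented
}

// Green: implement parse_price until this passes.
// Refactor: clean up with the test still green, then write the next red
// test before adding any new code.
```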
Starting point is 00:30:24 And maybe we do some TDD along the way, writing unit tests for something that we're adding. But we're trying to achieve a general behavior, and we like to do this to avoid over-constraining the behavior. If you do really strict TDD with a lot of unit tests, you might end up in a situation where you have a very rigid system. Everything is set in concrete. And to change anything, now you have to change a lot of tests. So if you can just
Starting point is 00:31:07 specify at the business level what you want, it gives you the guardrails to be able to freely refactor and change the internals as long as it still meets the behavior that's desired. Okay. Interesting. Yeah. So your team used to be very distributed. How did you deal with that through development? Yeah, it still is really distributed. It's getting less so. We've kind of been concentrating in Denver,
Starting point is 00:31:46 at least the team that I'm on. So the team that I'm on is split at the moment between Denver, Boston, and Houston. And it used to be Denver, Boston, London, and Houston. So we have reined in the time zones a little bit. But overall, the whole company is pretty well distributed. The headquarters is in Houston, but we have offices in New York, New Jersey, and Amsterdam also. Okay. So dealing with this can be trying, I guess. We have a sort of informal relationship manager in Houston that helps us, but when just general face-to-face communication is lacking, like it is when we're not all in Denver,
Starting point is 00:32:56 things sometimes are misunderstood or just not fully communicated. So we do travel to alleviate that pressure, but it's not always perfect. So those are sort of the human problems with being remote, but also we have a lot of technical challenges. Like we were talking about earlier with CLion, using CLion from a remote office is difficult. We usually use tmux with shared terminals. So we all will log into a shared tmux session, and we use that for pairing. So we can be distributed. The guys in Boston can just be on the same tmux with us. And then for audio, we usually use Zoom. So we have an audio link with them, and then we also have the shared tmux session. So that's interesting.
Starting point is 00:34:00 You've got audio, you said, with Zoom, but you're not taking advantage of the screen sharing capabilities? Yeah, sometimes we do. When there's a need to share a screen for, you know, maybe web browsing or something like that, we will. But most of the time we're spending time on the shell and in Vim or Emacs. And you've been able to successfully work this out with mob programming and a shared screen and just voices in the room? Yeah. And it even works so well for us that sometimes we have people that are in the Denver office who are both sitting at their own desks, both logged into the same tmux session, and with headphones on, talking to each other over Zoom.
Starting point is 00:34:51 That's how well it works. It's kind of weird sometimes. I just realized, since you've mentioned Denver several times, I guess this is the point for the disclosure here that we didn't mention before, that yes, Lenny does come to my meetup, and we do know each other in person as well. So I probably maybe should have mentioned that earlier. And Lenny has demonstrated this mob programming technique for our meetup. We've done two times now, maybe three times?
Starting point is 00:35:25 Two or three, yeah. I think maybe two times I was there and one time I was gone traveling. And so Lenny is one of my backups to make sure that the meetup keeps happening, basically. And that's been a good experience. It's what, I don't know, you could farm yourself out to other meetups to demonstrate this mob
Starting point is 00:35:46 programming or something yeah it um i found that the meetup it really slowed things down because the mob was really big yeah and the um and since people hadn't worked together before um there's like really widely varying opinions quite right and and as people work together more the style and opinions kind of join they kind of meet somewhere but then you you get the uh you know old school c guy who wants to you know implement the loop and assembly language and you get the newer school person it's like no we should just use you know implement the loop and assembly language and you get the newer school person it's like no we should just use you know all the algorithms and then we get to discuss and compare what the advantages are and what the performance actually looks like right yeah yeah
Starting point is 00:36:38 Compiler Explorer is, uh, a savior for that. I just had a student say that Compiler Explorer is a godsend. I was like, I see what you did there. I don't think you did it on purpose, but it was still hilarious. I wanted to interrupt this discussion for just a moment to bring you a word from our sponsors. Backtrace is a debugging platform that improves software quality, reliability, and support by bringing deep introspection and automation throughout the software error lifecycle. Spend less time debugging and reduce your mean time to resolution by using the first and only platform to combine symbolic debugging, error aggregation, and state analysis.
Starting point is 00:37:18 At the time of error, Backtrace jumps into action, capturing detailed dumps of application and environmental state. Backtrace then performs automated analysis on process memory and executable code to classify errors and highlight important signals such as heap corruption, malware, and much more. This data is aggregated and archived in a centralized object store, providing your team a single system to investigate errors across your environments. Join industry leaders like Fastly, Message Systems, and AppNexus that use Backtrace to modernize their debugging infrastructure. It's free to try, minutes to set up, fully featured with no commitment necessary. Check them out at backtrace.io. Going back to high performance and low latency code, do you use any particular tools for optimizing? Yeah. We sort of have, I don't know,
Starting point is 00:38:11 three or four phases of performance optimization. So I guess the first one is just constexpr all the things. You know, we try to move as much as we can to compile time, and we do that through heavy refactoring. So we're always looking to move more computation to compile time. And then we have different layers that give us different results. So we like Google Benchmark for doing micro benchmarks. But they're sort of fun, right?
Starting point is 00:38:52 The turnaround time is really quick. It's sort of like a game. You can make a change. You get immediate feedback. But the problem is that we often get misleading results from that. You might make a certain optimization that's really good in the benchmark but doesn't actually work. I think the canonical example of that is something that just completely optimizes away in the benchmark, so it looks great in the benchmark, but... Zero milliseconds, yes. Yeah, but in real life it's not. Valgrind is really useful to give us a graph of the bottlenecks and the performance profile of the system, where the time is being spent.
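About that optimized-away trap mentioned just above: the usual guard in Google Benchmark is DoNotOptimize, roughly as in this sketch:

```cpp
#include <benchmark/benchmark.h>

#include <numeric>
#include <vector>

static void BM_Sum(benchmark::State& state) {
    std::vector<int> v(1024, 1);
    for (auto _ : state) {
        int sum = std::accumulate(v.begin(), v.end(), 0);
        benchmark::DoNotOptimize(sum);  // keeps the result observable, so the
                                        // loop can't be deleted by the optimizer
    }
}
BENCHMARK(BM_Sum);
BENCHMARK_MAIN();
```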
Starting point is 00:39:52 And you're talking specifically Cachegrind or whatever, right? Yeah, Cachegrind and Callgrind. Callgrind, okay. Yep. So we can really see where time's being spent in a visual way, which is useful. But it's also pretty slow to run. So depending on what you need, it may be too slow. And then, one of the interesting things about using it is that you can turn it on and off programmatically inside of your program. So there are, like, Valgrind macros that you can put around certain things, if you want to know the number of cycles.
Starting point is 00:40:41 Let's say you have a hot loop that's polling a network socket. You don't really care about all the cycles that are being spent in that. You only care about the cycles once you actually get data off of the socket. So you turn it on once we have the data. Interesting. So, like, C macros that you compile in. Right, yep. Okay, I've never used those. Yeah, and then Linux perf gives another view of the same data, and then we can play around with increasing the number of instructions per clock cycle and making changes to affect that, or memory usage, page faults.
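A sketch of that pattern using the Callgrind client-request macros; poll_socket and process are hypothetical stand-ins for the real code:

```cpp
#include <optional>
#include <valgrind/callgrind.h>

std::optional<int> poll_socket();  // hypothetical: a message, if one is ready
void process(int msg);             // hypothetical: the work worth measuring

void event_loop() {
    for (;;) {
        auto msg = poll_socket();  // spinning here is noise we don't count
        if (!msg) continue;
        CALLGRIND_TOGGLE_COLLECT;  // start collecting once we have data
        process(*msg);
        CALLGRIND_TOGGLE_COLLECT;  // stop again before going back to polling
    }
}
// Run under: valgrind --tool=callgrind --collect-atstart=no ./app
```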
Starting point is 00:41:26 Then all of those tests, so we have a lot of tests that are written like that, and they all run in our CI system and publish their results into an Elasticsearch database that has Kibana on it. So we can use Kibana to draw pretty graphs and be able to look through those results. And so we can see where we added something that either caused a
Starting point is 00:41:58 benefit or a problem. We can go back in time and look at how different things affected our performance. Can I ask, if this is information you can share, I assume it would be, how long does it take from the time that you commit a change to the time that you have that pretty graph back to you in your CI? About five to ten minutes. Okay, that's pretty good. Yeah. Yeah, so we have a big Kubernetes cluster, several thousand cores, I think.
Starting point is 00:42:31 That's big. That has, yeah. I only have like 20 cores in my whole house. It's big for us, but I guess it's not big for some people. Comcast, I've heard, has a nationwide Kubernetes cluster. Okay. Yeah. Well, I mean, it's Comcast. Some of it probably runs on my home gateway and I don't know it. So in our Kubernetes cluster, we can farm out a lot of jobs simultaneously. The actual build time of our CI system, even for some of our small jobs, is probably fairly large. It might be two hours of CPU time, but it executes in just a few minutes.
Starting point is 00:43:21 Because so many parallel tests are running. Right. Okay. Um, and then, so that gives us good application-level performance metrics, and then from there we need to tie into the larger system, and for that we have a high-fidelity simulator. And we can, with enough data, with enough runs through the simulator, we can statistically determine, to within plus or minus 3.5 nanoseconds, the performance characteristics of the system. Wow. But it takes hours to run. So we generally avoid it except in the final stages before some deployment. So it's not something you just run nightly or something like that? No.
Starting point is 00:44:11 It takes hours. And we only have a few of those systems that are capable of that level of performance monitoring. So we'd probably have quite a bit of contention for that if everyone was trying to run their own jobs on whatever code they wrote. So we've talked a bit about your CI system, tools you use. What version of the C++ standard are you using at QuantLab? It's all C++17.
Starting point is 00:44:43 Oh, wow. Even some things from 20 are starting to creep in. But, uh, so we're using GCC 8.2 and Clang 7, and both of those support designated initializers by turning on the C99 extensions. Right. So we've been making use of designated initializers, and we have a concepts emulation library that we use. So we're kind of doing concepts, but not really concepts exactly as they are in C++20. But we have quite a few problems with the designated initializers.
Starting point is 00:45:24 No, I'm sorry. Go ahead. It's not fully supported yet in Clang, for instance. Right. Yeah, I've done two C++ Weekly episodes about designated initializers now, and each time I get comments about how I am not accurately representing what the standard allows. And I just finally realized that basically the compilers that currently support it are just supporting it through C99 emulation. They're not supporting the C++20, actually, because it only allows equals initialization,
Starting point is 00:45:57 not braced initialization of the elements. Right, and that's where... And Clang doesn't even support very well the braced initialization. GCC will do it. Okay. All the cases that I've tried, it's worked. Oh, I haven't gotten it to work. I don't know what I was doing wrong.
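For reference, a sketch of the two spellings being compared; Order is a made-up aggregate:

```cpp
struct Order {
    int quantity;
    double price;
};

// The equals form, which both compilers accept with the C99-style extension:
Order a{.quantity = 100, .price = 101.25};

// The braced member form whose support was described as uneven at the time:
// Order b{.quantity{100}, .price{101.25}};
```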
Starting point is 00:46:18 But Clang definitely did not. Okay. But we really like the concepts um emulation stuff because so much of our code is is compile time stuff we have a lot of templates and so using concepts there really can help the understandability and readability so this is is Eric Niebler's concepts emulation layer? We have kind of a home-rolled concept. It's very similar to how Eric's works. So are you also then trying to take advantage of ranges and that kind of thing? We have one
Starting point is 00:46:59 project that I know of internally that's using ranges. We haven't really started using that yet very widely. We're using the ranges v3 library and it's kind of a toy project at the moment, but it's an interesting one. I think that once ranges is supported in the standard library, I think we'll start using it right away. Do you ever have any concerns as long as you're using the latest and greatest compilers and trying to push the latest and greatest features that you're actually getting bad code generated or anything like that? Yep.
Starting point is 00:47:38 Yep. I think I'm going to give a lightning talk at our meetup about that next week. Oh, that sounds great. So we actually found some problems with the code being generated for std::find. What? Yeah. I mean, that's been around forever. Yeah, so GCC 8.2, at least. I'm not sure which other versions.
Starting point is 00:48:06 GCC 7 was not affected by this, but GCC 8.2 is. And so we found some optimization problems in GCC with StdFind when using a random access iterator. Last time I looked, and it's been a while gcc standard library implementation for things like find had partial loop unrolling baked into the algorithm and now i wonder immediately like if it would that would be coming into it that's that's coming into it yep okay i don't want to give away my lightning talk. Yeah, that's all right. We're doing lightning talks at our meetup next week. Yes. Hopefully we have a good turnout for that.
Starting point is 00:48:52 Well, as you guys are talking about, you guys have met before at the meetup, and I got to meet you at CppCon this year where you were a volunteer. Do you want to tell us a little bit about the volunteer program and your experience with it? Yeah. I got started just kind of on a kick. I thought, hey, I'll volunteer and see what happens. And it's been really good.
Starting point is 00:49:15 It's been a really interesting way to meet people at the conference, because you meet a lot of people that you would probably not have interacted with otherwise. Just, you know, someone passes you in the hall and asks you a question, just because you're wearing the teal shirt, and then you start talking to them. And 20 minutes later, you go out to lunch with them. And, you know, that's been a really fun way to do it. And also, you really only have to work a single five-hour shift per day. During your shift, you often attend the talks that you wanted to go to.
Starting point is 00:49:53 You're kind of just attending the conference with a couple extra responsibilities. Your admission is free, obviously, if you're a volunteer. And for some students, there are scholarships. So it's a limited set of scholarships that John's able to give out. But students apply and can come for free, and with the scholarship they also get their hotel and flights paid for. Oh, wow. So, I mean, you didn't really, I don't think
Starting point is 00:50:34 you said this, but how many years have you been volunteering? It's not like you just did this once and it was awesome. I've done it twice. Twice. Okay. Yeah. So are you planning to volunteer again in 2019? I am, and this time it's in Colorado. It's in Colorado. So that's going to be, yet again, a different experience. I think having the really large single event center is going to change the dynamics of the conference a bit. Right.
Starting point is 00:51:04 I'm really excited to see how that goes. I've been to C++Now in Aspen several times, and I think that it might have elements of that, where you're kind of isolated, because being at that conference in Aspen, it's smaller. Right. But it's also isolated where everyone is staying at the same hotel. And there's kind of one bar that people hang out at. And the town is basically shut down in May because it's Colorado's mud season.
Starting point is 00:51:36 So everyone just stays together. There aren't, like, big factions that leave to either go home or to go out to dinner somewhere else. Everyone's kind of together, and I think that might happen at this event center too. So I'm really interested to see how that goes. Yeah, the Gaylord of the Rockies. And the aspects that you just brought up,
Starting point is 00:52:14 more than a mile or something away. There's some that are being built right there, right next to it and everything. We expect that everyone will be hanging out together all, who knows, 1400 of us, however many people sign up this year. Should be interesting.
Starting point is 00:52:31 Yeah. So aside from that, is there any other way that you would like to encourage the listeners today to start already thinking about volunteering, signing up, submitting proposals, whatever? That stuff hasn't been opened up yet, but it probably will be shortly, in the next couple of months, I believe. And the conference is in September. Yeah. Yeah. You know, if you haven't been to Colorado before, this is a great opportunity to come here. But you might want to stay a few extra days, because there is a lot to do here. September is, I think, the best weather in Colorado of the year.
Starting point is 00:53:14 It's usually warm and sunny and just a beautiful time to be here. There's national parks nearby. There's tons of breweries. There's microbreweries, and then there's Coors, right? Like, everything. And Budweiser. Everyone forgets that Budweiser actually has a large brewery here, too. If you're into that kind of thing.
Starting point is 00:53:40 Like, there's Old West historical mining towns. There's all kinds of things to do. I think we have four national parks now, if I have that number right. Black Canyon, Sand Dunes, Rocky Mountain National Park, and Mesa Verde. And Garden of the Gods. Is that a national park? I thought it was. I don't know.
Starting point is 00:54:04 I don't know. And I believe you were looking at the schedule now not that we're saying the only reason to come to cpp con is for beer but i believe if if you if i remember right you told me that the great american beer festival is the week after cpp con yep oh wow and that is a that's a big deal if you do plan to do that, listeners, make sure you buy your tickets as soon as they go on sale for the Great American Beer Festival. Then buy your plane tickets if you're going to plan to stay here for two weeks because of the conference. Yeah, and another interesting thing for people coming from Europe, I think there might be more flights from Europe to Denver than there are to Seattle. It might be an easier trip. Yeah. So United is our main carrier for international flights coming in, but they're partners with Lufthansa, which will get you,
Starting point is 00:54:55 I think, a few separate direct flights from Europe. And I think Air France might have a couple of direct flights. And Iceland Air has a couple of direct flights as well. Oh, well, direct from Iceland. Okay. Well, it's been great having you on the show today, Lenny. Is there anything else you wanted to plug or talk about before we let you go? No, I think we covered a lot of it. Okay.
Starting point is 00:55:22 Can people find you online anywhere? I'm on Twitter. Not that often, but I am. My Twitter handle is just my name, lennymaiorani. It's a lot of vowels, but you should be able to find it. We'll put it in the show notes. Okay, well, it's been great talking to you today, Lenny. Thanks, you too. Thanks for coming on.
Starting point is 00:55:46 Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons who help support the show through Patreon. If you'd like to support
Starting point is 00:56:16 us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
