CppCast - Compiler Explorer

Episode Date: January 28, 2016

Rob and Jason are joined by Matt Godbolt to discuss the online Compiler Explorer project. Matt is a developer at trading firm DRW. Before that he's worked at Google, run a C++ tools company, and spent over a decade in the games industry making PC and console games. He is fascinated by performance and created GCC Explorer to help understand how C++ code ends up looking to the processor. When not performance tuning C++ code he enjoys writing emulators for 8-bit computers in JavaScript.

News
Microsoft releases CNTK, its open source deep learning toolkit
C++ Language Support for Pattern Matching and Variants
VS2015 Update 2's STL is C++17 Feature Complete
C++Now 2016 Submission Deadline

Matt Godbolt
@mattgodbolt
Matt Godbolt's blog

Links
Compiler Explorer
x86 Internals for Fun & Profit

Transcript
Starting point is 00:00:00 This episode of CppCast is sponsored by Undo Software. Debugging C++ is hard, which is why Undo Software's technology has proven to reduce debugging time by up to two-thirds. Memory corruptions, resource leaks, race conditions, and logic errors can now be fixed quickly and easily. So visit undo-software.com to find out how its next-generation debugging technology can help you find and fix your bugs in minutes, not weeks.
Starting point is 00:00:26 Episode 43 of CppCast with guest Matt Godbolt recorded January 27, 2016. In this episode, we talk about a new open source project from Microsoft. Then we talk to Matt Godbolt from DRW. Matt tells us about his Interactive Compiler Explorer and what you can learn from it. Welcome to episode 43 of CppCast, the only podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today? All right, Rob, how about you? Doing pretty good.
Starting point is 00:01:38 It sounds like we're both feeling better, so that's good. Yeah. Yeah. I'm getting ready to move again. Right. In just two weeks. But I think we're going to be able to get by without much interruption to the show, which is good. Very good. Just across the town move.
Starting point is 00:01:53 You've moved more times in the last year than I've moved in the last 14. Well, this should be the last one, at least for a very long time. Hopefully I get to stay there for at least 14 years. Okay. Well, at the top of every episode, I like to read a piece of feedback. This week, we actually got a lot of comments on the CppCast website after last episode. I guess because we kind of put out a call for guest suggestions. Yes.
Starting point is 00:02:19 And this one comes from Vipple. And he says, may I nominate Jens Weller, C++ developer well-known for community building at Meeting C++, for appearing as an episode guest. We actually already reached out to Jens, and we think we should be getting him on sometime in March. And he actually just emailed me the other day to suggest a couple guests of his own. And some of the other feedback responses we got, there was some suggestion to talk to some guys from Google
Starting point is 00:02:49 who are working on a language extension called Halide. Did you see this, Jason? I did not. Yeah, something programming DSL called Halide, a DSL embedded in C++, allows programmers to specify an algorithm with a functional programming paradigm,
Starting point is 00:03:09 but also allows them to specify a schedule for execution yielding native C++-like performance. It's very easy to get your algorithm running on a GPU. Sounds interesting. So I'll have to look into these Google developers and see if I can get in touch with them. So how do you spell that? Halide. Yeah. It's H-A-L-I-D-E.
Starting point is 00:03:30 Okay. Interesting. Yeah. Anyway, we do appreciate all the comments. Like I said, we got several after this last episode and keep the suggestions coming in. You can always email us at feedback at cppcast.com,
Starting point is 00:03:44 or you can reach out to us on Facebook or Twitter. And again, we always like to get those reviews on iTunes as well. Joining us today is Matt Godbolt. Matt is a developer at trading firm DRW. Before that, he's worked at Google, run a C++ tools company, and spent over a decade in the games industry making PC and console games. He is fascinated by performance and created GCC Explorer to help understand how C++ code ends up looking
Starting point is 00:04:11 to the processor. When not performance tuning C++ code, he enjoys writing emulators for 8-bit computers in JavaScript. Matt, welcome to the show. Thank you for having me. Nice to be here. So have you worked on any games that any of us would recognize? This is always the question I dread when I say that I've worked for so long in the games industry. So it was, I think, 1997 I started working in the games industry. So the games I worked on were like PlayStation 1 games. I think the only one that was even remotely successful was a game called Croc: Legend of the Gobbos, which certainly kept the company I worked
Starting point is 00:04:48 for afloat for a few years. But I worked on one of the SWAT games for the PC, and I worked on some... My favorite game to work on was a Dreamcast game, but seeing as nobody bought the Dreamcast, no one has bought the game either.
Starting point is 00:05:04 Well, some of my friends in college were big Dreamcast fans. It was like the Betamax of its day. It was technologically superior. It was great to develop for. The guys at Sega who were behind it were always quick to answer questions.
Starting point is 00:05:20 But it just didn't win. It was like dead on arrival. The PlayStation 1 was too far ahead. Yeah. Okay, well, we got a couple news articles to talk about, and then we'll start talking about GCC Explorer. So the first one is, Microsoft just released CNTK,
Starting point is 00:05:38 its open-source deep learning toolkit. And C++ isn't actually directly mentioned at all in the article, but they link to their GitHub repo, and you can tell that it is a C++ repository. And they're comparing it to a couple other deep learning toolkits that have been out already, and the performance of Microsoft's solution is just off the scale compared to these other ones. Jason, you have anything to add to this one? Not really, because I feel like I just don't really understand how I would use a deep learning toolkit.
Starting point is 00:06:17 Like, I don't know what does a normal person do with this? I have no idea. It sounds like it's used for speech recognition and image processing and it's all pretty advanced stuff that I'm sure is going to become more and more kind of commonplace in the industry
Starting point is 00:06:36 as we move forward. As people can tell, deep learning seems to be a way of us sort of accelerating the heat death of the universe because as far as I can tell, data centers are being filled and filled with machines, racks and racks of machines, which are now being filled with racks and racks of GPUs to support this kind of thing. It looks like it's a solution looking for a problem.
Starting point is 00:06:59 I guess some firms are using it for, like Google and Co. are using it for great image recognition stuff. But I've yet to see anyone in a normal company suggest it as a solution to something. Well, they kind of ominously say throughout this article that it was developed to meet Microsoft's internal needs. Which makes me just, there's too many images. Although, let's be fair, Microsoft are like a big turnaround. Recently with their open source stuff, you know, they, we'd written them off like four or five years ago, and they're starting to make a comeback. So I applaud the effort of them opening up things and, you know, hope that they continue to do so. Yes, I totally agree. But between your image of the heat death of the universe and the mental image I already have
Starting point is 00:07:47 of Skynet being developed, I figure these have to lead to the end of the planet somehow. That's interesting. I just wanted to look at their GitHub repo again, but apparently GitHub is down for maintenance right now. GitHub has been down for the last half hour, actually. Twitter has been exploding with everyone raging about it.
Starting point is 00:08:06 So, yeah. What else have we got to do on a... Well, talk about signs of the apocalypse. Well, when GitHub's not down, the CNTK is fully open source if anyone has any interest in seeing how a deep learning tool is implemented in C++.
Starting point is 00:08:27 Yeah. Yeah. Okay. Next article is from Dave Sankel, one of our early guests on the show. And he's writing an article about his proposal to add language support for pattern matching and variants. And it's not for C++17. It'll be for C++20, maybe, I guess.
Starting point is 00:08:49 It'll be the next revision. But it was a really good article explaining his proposal in depth, showing lots of examples of what language-based pattern matching and variants will look like. Yeah, and it looks like it requires some language extensions for kind of operators that don't currently exist, unless I misread that. Yeah, I think there were three new operators for dealing with variants. Yeah, Alternative, Discriminator, is there another one? Yeah, I don't know.
Starting point is 00:09:21 There's a lot of meat to that article for anyone who goes to read it. It is a very meaty article, but definitely well worth the read. And a fair bit of discussion afterward, too, in the comments. Definitely. This is something that I would like to have. But the proposal as it stands does seem like an awful lot of new keywords and a lot of new semantics and things like that. And I know, you know, Bjarne is particularly averse to adding any kind of extra keywords to the language in places where it's not totally unambiguous. So all the kind of names that they're picking as well are exactly the kind of names that people who are writing variant libraries now are using as function names, so making them operator names seems like a
Starting point is 00:10:00 problem. But, you know, I've tinkered in Rust, and one of the things I did enjoy most about Rust was the ability to do this deconstruction on types and enumerated values and things like that. So I think it would be a welcome language addition, but it needs a bit of work, I think. Yeah, it's a very good point with the operator names. There's no
Starting point is 00:10:20 way you could have an operator named extract, because everyone's codebase is going to have some function name with that name. Yeah. Right. It depends if they're going to play the tricks they did with things like override and final where it's only a keyword in a particular place, but it's not clear to me that you could do that with an operator.
Starting point is 00:10:39 Right. Okay, and then this last article also coming from Microsoft. This one from STL about the VS 2015 Update 2, which is not out yet, I believe. I think we're still on Update 1. But when Update 2 comes out, the STL is going to be C++17 feature complete, which is interesting since C++17 isn't really out yet. So Microsoft used their deep learning toolkit
Starting point is 00:11:08 to invent time travel. Well, they're basically looking at all the working papers in C++17 and seeing what they think is going to be in C++17, and the STL is going to support all of those features in addition to all the 11 and 14
Starting point is 00:11:24 features. Yeah, I would be... Go on. They're going to be scared by the Clang guys who take great pride in being in every meeting at the CPP conferences and things and usually have finished the features that have been agreed by the end of the night.
Starting point is 00:11:39 They do these hackathons through it. It's a marvel to watch. That sounds pretty amazing. It does. What were you going to say, Jason? I was just going to say I'd be surprised if any of those proposals change very much. What they've implemented will probably end up being what's in 17. Yeah, at this point, I think 17 is getting pretty locked down.
Starting point is 00:12:02 I don't think we're going to see much change at this point. Okay, so Matt, let's start talking to you. Can you kind of give us an overview about what this Interactive Compiler Explorer is and where people can find it? Sure. So you'll forgive me if I call it GCC Explorer because that's what I used to call it forever. And then as I added more and more compilers, the name got more and more inappropriate. So I'm trying to rebrand it as Compiler Explorer now. But GCC Explorer is a website that you can get to at gcc.godbolt.org that basically gives you a drop-down of compilers and allows you to type in your code on the left-hand side
Starting point is 00:12:37 and the left-hand pane, and it interactively compiles it and shows you the disassembled view of your code on the right-hand side. So the reason I wrote this is I found myself, especially with the new C++0x, as it was then, features, wondering what the actual cost of the code that I was writing was. And obviously, the best thing to do is to run your code, profile it, and whatever. But sometimes you just want that sort of feedback loop of like, well, how does the compiler do this? Which things disappear?
Starting point is 00:13:05 Which things are actually elided by return value optimization? What does move construction do? And certainly for tiny little snippets of code where you're thinking, should I pass a std::string by reference? Should I pass it by value? You can type in a little bit of code into the explorer and then just see what the assembly looks like on the other side. And so it started out life as a kind of bash script that I wrote where it was just compiling and then dumping it on the screen continuously, like every two seconds. And then I realized that, hey, this is the age of the web and I should really
Starting point is 00:13:41 learn how to write a website. And before you knew it, it was up and running. And then, you know, I added loads more compiler support to it, so we've now got basically all the versions of GCC ever. We've got ARM versions in there. We've got Clang. I even managed to talk Intel into giving me an evaluation of, sorry, an educational
Starting point is 00:13:59 license for at least one of the versions of ICC. So one of the more interesting things to do is to put the same piece of code in, crank the optimization up to top level, and then drop down between compilers and go, well, how would this compiler do? Does this piece of code vectorize okay on this compiler, or do I need to give extra hints and stuff? So it's been a fascinating tool.
Starting point is 00:14:23 And for me personally, and the feedback that I get from people now is that they use it now almost like a, does my code compile? So I have a pal who works on the Unreal Engine under Windows, and he needs to be able to support all the various different compilers. And for
Starting point is 00:14:40 various little idioms, he'll chuck them into the website and then just basically cycle through and make sure that A, they compile, and B, the code looks about what he would expect it to look like. So you have PowerPC and AVR and x86 and ARM on here. How do you accomplish having so many different targets? Are you cross-compiling? Blood, sweat. Yeah, so the whole thing's
Starting point is 00:15:05 open source. You can go and have a look at it on GitHub when it's up, and there's actually the magic and trickery I use to build all the compilers
Starting point is 00:15:11 and build all of the Docker images that it runs on. So behind the scenes it runs on an Amazon instance, and the Amazon instance then embeds
Starting point is 00:15:19 several Docker images. Each one has a different version of an operating system for which I could find a version of the compiler for that particular architecture or of that particular age to install into.
Starting point is 00:15:33 And the reason for that is, there's a thing called Crosstool, which is a fantastic set of scripts that allows you to do cross-compilation for all these different things, but it's extremely flaky and I didn't want to invest too much time. So I'll be honest, most of the things that are on that list
Starting point is 00:15:46 are things where I can find the installation packages for, say, Ubuntu 10, and that's the one I installed on a Docker image of Ubuntu 10, and then there you go, there's all the compilers. So it's the bane of my life, though, because the number one feature is can you support CompilerX? So I get an email, and then I'll have to hunt down a way of getting it and then I'll have to
Starting point is 00:16:07 like find a way of crowbarring it into my ever-expanding Docker containers and going from there. Well, see, now I almost... I have actually built them, sorry, go on. Yeah, I almost put in a support request myself to support Vim bindings in the code editor, but I figured that was outside of
Starting point is 00:16:23 the scope of the project. No, actually, that shouldn't be too bad. The code editor is this marvelous, marvelous little thing called CodeMirror, which I recommend to anyone who is thinking of doing anything on the web. I mean, I barely touched it. The only thing I've done is I've given it like an assembly mode that understands
Starting point is 00:16:38 how to syntax color some of the more obscure things that appear on the right-hand side. But it's a fully featured editor. It supports, if you let it, it supports code expansion, auto-completion. It supports parsing ability. It's brilliant. So there are Vim bindings for it. I just haven't put them in there because no one's that crazy, right?
Starting point is 00:17:00 I usually paste my code into it. I actually find the project kind of addicting myself. I just use it to see, like, well, what does this code do? Like you were saying, but I edit live in it and watch it spit back the results. Right.
Starting point is 00:17:18 So you're the one who's hammering my servers, right? Yeah. That's okay. That's what they're there for. But it is amazing, actually. I mean, a passion of mine, as you can probably tell from the other things I get up to, is more like the low level of things. And it does surprise me how much, how isolated we have become from the processor that's inside
Starting point is 00:17:39 the machine that's running our code, even in C++. And I love C++. I love the things that are going on in C++11 and 17, and it's becoming a higher and higher level programming language. But I absolutely love the fact that you can still drill all the way back down to effectively just C, and if you really don't mind, you can open up an inline assembly block and just keep on going down, right down to the bottom. And I don't think there's a single other programming language. Maybe Rust is getting there, but there's no other programming language that lets you do this.
Starting point is 00:18:08 But even with that in mind, I think there's a surprising number of very good programmers who don't really know how the rubber hits the road. And this is another thing to show them. It's like, it's not that scary. I vividly remember having a guy come in back in the games days, fresh out of university,
Starting point is 00:18:28 and we had a crash in code, and we couldn't work out what was going on. So I dropped to the disassembly view, and he looked at me like, what are you doing? This is witchcraft. I'm like, this is what your program does. If you don't understand this, then you don't know what the computer's doing. And he looked at me: I don't know assembly. I'm like, at the time I didn't know x86 assembly either, but you can marry up what the instructions are with what you're trying to do, and you can probably work it out. And, you know, we did. And I think that everyone needs to have that experience. Everyone should know that it's accessible,
Starting point is 00:18:54 it is something you can do, and it's important to understand what the computer's doing so that you know how to not do silly things. And also, how smart the compiler is at guessing what you actually meant. Let's talk a little bit more about that. I mean, what types of things do you think C++ developers who don't really go down into assembly very often can learn from using this kind of tool? Right. I think some of the things I think most people have sort of internalized at some level, you know, the passing of things by const ref as opposed to by value, although again
Starting point is 00:19:26 with the caveats that there's, I guess that's the C++ watchword, right? There's always a catch, right? Sometimes when that's better to do and sometimes when that's not. Unnecessary copies of objects taken sometimes come out of that. What else is the type of things?
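As a rough illustration of the by-value versus const-reference point, here is a minimal sketch. The type and function names are hypothetical and do not come from the episode; the copy counter only exists to make the extra copy visible without reading assembly.

```cpp
// Hypothetical illustration; none of these names come from the episode.
struct Tracked {
    int* copies;  // points at a counter owned by the caller

    explicit Tracked(int* counter) : copies(counter) {}

    // Each copy construction bumps the shared counter.
    Tracked(const Tracked& other) : copies(other.copies) { ++*copies; }
};

// By value: passing an lvalue copy-constructs the parameter.
int copies_seen_by_value(Tracked t) { return *t.copies; }

// By const reference: the parameter binds to the caller's object, no copy.
int copies_seen_by_const_ref(const Tracked& t) { return *t.copies; }
```

Pasting something like this into the tool and comparing the two functions' assembly is one way to see whether a copy constructor call actually survives at a given optimization level.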
Starting point is 00:19:43 Pointer aliasing is always a surprising one. And I mean that in terms of, like, if you sit someone down with a little loop that's going for i equals naught to a million, you know, add up some things, and then you look at the assembly output and go, well, why on earth is it, like, reloading the loop counter every single
Starting point is 00:19:58 time? Well, the thing it is comparing the loop counter against, every single time. And you're like, well, yeah, that does seem weird. You know, it should put it in a register, it should unroll it or whatever. And then a compiler author will sit down and say to you, unless the compiler can prove that the innermost part of the loop is not accessing, through some circuitous way, the variable that the end of the iteration is stored in, then it has to assume it's changed every time. You're like, oh, I hadn't thought of that.
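The aliasing situation described here can be sketched roughly as follows. The function names are hypothetical, and the exact assembly depends on the compiler and flags, so treat this as something to try in the tool rather than a guaranteed result.

```cpp
// Hypothetical names; a sketch of the aliasing problem, not code from the episode.

// `total` and `count` are both pointers to int, so as far as the compiler
// knows a store through `total` might modify `*count`. It generally has to
// re-read `*count` on every iteration.
void sum_reload(int* total, const int* values, const int* count) {
    for (int i = 0; i < *count; ++i) {
        *total += values[i];
    }
}

// Copying the bound and the accumulator into locals tells the compiler that
// nothing inside the loop can change them, so both can live in registers.
// Note this is only equivalent when the pointers really do not alias.
void sum_local(int* total, const int* values, const int* count) {
    const int n = *count;
    int acc = *total;
    for (int i = 0; i < n; ++i) {
        acc += values[i];
    }
    *total = acc;
}
```

Comparing the two loops at the same optimization level should show whether the bound is reloaded from memory inside the first one.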
Starting point is 00:20:29 Now, that doesn't come out from just straight looking at the code. Someone has to tell you about that. But there are some surprising things like that that come out, I find. So do you have any favorites other than that for surprising things that maybe other users have pointed out or that you've learned through the tool? I guess, I mean, there's a couple of examples that you can do on the dropdown.
Starting point is 00:20:50 And one of them is just a simplistic sum over arrays. And in order to torture the compiler into generating the code that one might instinctively think it could generate, it's amazing how many funny little underscore-underscore restricts and how many funny little intrinsics you need to do just to sort of cough politely and say, no, no, no, this is what I meant. You can trust me that this is not aliasing.
Starting point is 00:21:14 You can trust me that the pointers that I'm going to give you are aligned on the natural boundary for the thing that I'm passing in. And then it will generate a lovely piece of vectorized code. But you really do have to go out of your way to do that. And I think unless you're actively disassembling your code or looking at the app or single-stepping through the debugger, you just don't know what's going on. Right. Interesting.
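A minimal sketch of the kind of hints being described, assuming GCC or Clang: `__restrict` and `__builtin_assume_aligned` are compiler extensions rather than standard C++, and the function name and the 32-byte alignment are assumptions chosen for illustration. Both hints are promises; breaking them is undefined behaviour.

```cpp
#include <cstddef>

// Hypothetical example, not the tool's built-in sample.
// __restrict promises that dst and src never overlap.
void add_into(int* __restrict dst, const int* __restrict src, std::size_t n) {
    // __builtin_assume_aligned promises the pointers are 32-byte aligned
    // (an assumption the caller must uphold).
    int* d = static_cast<int*>(__builtin_assume_aligned(dst, 32));
    const int* s = static_cast<const int*>(__builtin_assume_aligned(src, 32));

    for (std::size_t i = 0; i < n; ++i) {
        d[i] += s[i];
    }
}
```

A caller would need suitably aligned storage, for example arrays declared with `alignas(32)`. Removing the hints one at a time and watching the assembly change is a reasonable way to see what each one buys.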
Starting point is 00:21:38 So what kind of response have you gotten from users using the tool? It's been overwhelmingly positive. It's surprising how many hits I get and how many people are interested in what their code is doing, which again is the thing that I really enjoy hearing about. Most of the requests are for things that are now in the bug tracker,
Starting point is 00:22:01 things like a side-by-side sort of diff view. I mean, this is a way that I use it myself as well, like, should I pass, you know, -O3 or -O2 for the compiler options? How much difference does it make? It would be nice to be able to write the code and then have a drop-down and choose, like, two versions, maybe even two versions of code as well, like, I would like to do either this way or this way, or maybe, you know, a -D to the compiler to switch between two inner blocks inside your loop, and then be able to look at a diff view of the generated assembly and see if you can see what the difference is between the before and after. And forever everyone was asking about link time code generation, which was a big hole in the way that it was
Starting point is 00:22:44 working before. So the way that it used to work was that it would compile the code and use the flags to GCC or the various compilers to just output the text of the assembly, and then it would send the whole lot down to the client, and then the JavaScript would do all the filtering and whatnot. And then my Amazon bills were getting more and more and more as I was sending more and more terabytes of data of effectively white space. I mean, you guys know how big these things can be. So I finally bit the bullet, and now all that's done on the server. And now also, optionally, with the binary view for the compilers that I can support,
Starting point is 00:23:16 it actually compiles it to the object file and then disassembles it back out again and then applies some filtering and then gives you back the end results, which gives you the opcodes as well, so you can see how many opcodes each instruction encodes to. And more importantly, it allows you to turn on things like link time code generation, which is very interesting.
Starting point is 00:23:38 Again, the compilers are so smart these days. It's amazing what they can do. And when you lift the lid to them and say, this is the whole program, go nuts, the stuff they do, like the new GCCs and their amazing devirtualization that they do, speculative or otherwise, it's just great. It's nice to watch. But I guess now the most feedback I'm getting at the moment is, I clicked the button for link time code generation, and now all I get is an empty main function. I'm like, well, yeah, the compiler was smart enough to realize that your code had no side effects whatsoever, and so it just threw the whole lot away. So there's a sort of black art to generating the code and trying to find a way of tricking the compiler into not throwing it away.
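One commonly used way to keep a computation from being thrown away is to make its result observable. The sketch below uses hypothetical names and a `volatile` variable as the sink; it is one option among several, not something the episode prescribes.

```cpp
// Hypothetical names; a sketch of one trick, assuming nothing about the tool itself.
volatile int sink;  // stores to a volatile count as observable behaviour

int sum_to(int n) {
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

// The result is never used, so an optimizer is free to drop the whole call.
void discarded() { sum_to(1000); }

// The input arrives at run time and the result is stored to a volatile,
// so the compiler still has to produce it (though it may simplify the loop).
int kept(int n) {
    sink = sum_to(n);
    return sink;
}
```

In a whole-program build, feeding `kept` a value the compiler cannot know in advance (for instance `argc` from `main`) helps prevent the answer from being computed entirely at compile time.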
Starting point is 00:24:31 So you've mentioned that you're running all these builds on an Amazon server, and it seems like the system responds very fast. So how many servers do you have running, and is this costing you a fortune to keep it up, or what? Thankfully, not. No, at the moment, I have one server up and running, although it can scale up to four on demand. That is the very occasional times it makes anywhere near the beginning of Hacker News article. Then it scales up. But at the moment, it's just one. It used to be like the cheapest tiered Amazon server you could get.
Starting point is 00:25:05 But that was a bit choppy. So I've gone to a slightly higher level and it's not costing very much at all. But there's a lot of caching that I try and do. You know, obviously a lot of people land on that page and they load up the examples. Or while you're typing, you know, you go back and forth on things between two different variations of the same code.
Starting point is 00:25:23 And so it's useful to cache that on my side as well. So I'm surprised at how quick it is too. I'm even more surprised that, touch wood, nobody has hacked it yet because running a full compiler on demand with arbitrary pieces of code that people are typing in gives a kind of a large window of vulnerability to those who have bad intentions.
Starting point is 00:25:48 But, I mean, I have tried my best on that front, but I'm sure if a security expert is listening, they can blow a hole in it, and go for it, I would say. It's a throwaway instance, so there shouldn't be anything on there if it does go down. It does annoy people. But it doesn't actually execute the compiled code, correct? Correct. But you'd be surprised how many command line flags there are to GCC for, like, loading dynamic modules. So
Starting point is 00:26:12 in fairness, we've had a few people attempt to break through, and the normal way is to do something like pound include /etc/passwd, and then of course the error message prints out the first line of /etc/passwd. And you're like, oh, look, I can hack you now. I'm like, well, not really. And then, of course, so I put in the obvious, like, let's just check in the server that there's nothing silly in the includes. And then, of course, C++ being, or rather, I guess this is C,
Starting point is 00:26:40 being as awkward as it is, there are a million ways of getting the preprocessor to read a file. You know, you can pound define foo as quote /etc/passwd. And then you can do pound include foo without quotes. And it will expand the macro and off you go. And there's a lot of ways of obfuscating it. So eventually I bit the bullet and now I run all the compilers with an LD_PRELOAD, which for the more Unix folk among you is a way of effectively preloading a shared library before an executable runs. That allows me to hook all of the
Starting point is 00:27:13 operating system calls for like file opening and reading and writing and creation, whatever like that. And then I have a little thing that just goes, I know the kind of files that GCC should be opening. Anything else, deny. It's very, very simplistic, though, so I'm sure anyone who's listening can crack their way through it, and please contact me if you have a better idea. That sounds like a pretty creative idea to me, but I'm not an expert in these things. I wanted to interrupt this discussion for just a moment
Starting point is 00:27:38 to bring you a word from our sponsors. You have an extensive test suite, right? You're using TDD, continuous integration, and other best practices. But do all your tests pass all the time? Getting to the bottom of intermittent and obscure test failures is crucial if you want to get the full value from these practices. And Undo Software's live recorder technology allows you to easily fix the bugs that don't otherwise get fixed. Capture a recording of failing tests in your test suites and debug them offline so that you can collaborate with your development teams and customers.
Starting point is 00:28:09 Get your software out of development and into production much more quickly and be confident that it is of higher quality. Visit undo-software.com to see how they can help you find out exactly what your software really did as opposed to what you expected it to do. And fix your bugs in minutes, not weeks. So you have other versions, right? There's a Rust compiler explorer also, is that correct? That is correct. Yeah. So there's rust.godbolt.org. There's
Starting point is 00:28:37 go.godbolt.org and there is d.godbolt.org, which, um, yeah, which I think you can probably guess which of those. In terms of the number of hits, Rust has the most, because I think it's the most popular. Even though Rust has its own kind of like interactive Rust compiler, which can generate assembly
Starting point is 00:28:57 and looks probably nicer than mine. So I'm surprised the number of hits I get out of that. I think that's a testament to the growing popularity. Go is kind of next, and then poor D is still lagging a little bit, even though I'm a big fan of, well, certainly Andrei and what he's up to. But I feel maybe the ship has sailed there. We'll see. We'll see. I think it's healthy to have many, many different languages to choose from.
Starting point is 00:29:24 And, yeah, with Go, Rust, D, and C++ all vying for position, I mean, my thoughts are that things like Go and Rust and D are what have accelerated C++'s development of recent times with, you know, 0x, 11, 17. We've seen that there's somebody coming up behind, and they've got their sights set on systems programming languages too, and performance and things like that, and so the complacency has dropped. So I'm very much in favor of competition, healthy competition. Well, you've mentioned all the big ones there, Go, Rust, D. Have you learned anything from these languages
Starting point is 00:30:01 that you think should be brought into C++, or any ways that you think they'll outpace C++ in the near future? I mean, not from my direct experiences in GCC Explorer. It still seems to be... sorry, Compiler Explorer, I should say now. It does seem to me that GCC is winning, certainly in the open source compiler stakes, the code generation game. I have yet to see anything come out of either D or Go or Rust's compiler that sort of competes with it, even compared to Clang. I mean, I love the Clang folks, they're doing amazing work, and again they've kicked GCC along by basically shaming it with the error messages that they generate. But their code quality is still not quite,
Starting point is 00:30:49 quite there. I mean, this is based on my samples of things that I'm interested in, which are, you know, very numerically intensive, very intrinsically based anyway. So maybe I'm completely skewed,
Starting point is 00:31:00 but yeah, code generation is not as good in the other languages, I would say, just yet. But in terms of the features that the languages have themselves, we've already spoken earlier about pattern matching, and I'm becoming more and more of a fan of variants in C++
Starting point is 00:31:16 in general. It's something that we are starting to do more and more in my day job to get the job done. But having pattern matching would be just so much nicer than the sort of boost::static_visitor kind of approaches that we're having to do at the moment. It's just not
Starting point is 00:31:41 very ergonomic to use, and I think that puts a lot of people off. So I'm looking forward to that coming. And obviously, the type safety aspect of Rust is in equal amounts frustrating and amazing. Frustrating because if you get it wrong, the compiler gives you a 300 line error message that you have to decode. It is right though, and it would have been a problem at runtime, but I'm used to finding the problems at runtime rather than at compile time. And I'm pretty good at finding them now, but I'm less good at reasoning about why the borrow checker is not letting me do something I think I should be able to. So it's like debugging template metaprogramming compile errors. I guess so, although no, nothing's as bad as that. I guess that's another sort of strength of C++ in general
Starting point is 00:32:26 is the fact that you can pretty much write code in any style you like. You know, I can sit there and I can write my little intrinsic-laden maths kernels, and some other guy can write an almost functional style language approach using template metaprogramming, and then we fight over the compiler, as you say, or rather the compile times, which is obviously something I think, again,
Starting point is 00:32:48 the other languages have over C++ is that their compile times are usually better. I guess Rust is actively working on that and it's still very early, but that's probably the only thing that's comparable to a big C++ template fest. The D claims are that it creates code almost as efficient as C++,
Starting point is 00:33:06 but almost as fast as using a scripting engine for compilation. Right. That last thing is definitely true. I mean, it is very fast. As is Go, though. I mean, Go is super fast in terms of compilation time, and that was another stated goal. And I think if people are like me, you've grown up writing C and C++, we've probably learned, or rather we've sleepwalked into the case where
Starting point is 00:33:32 our compilations are getting longer. I mean, I started out writing assembly, and that was like instantaneous. Then C, and, you know, nowadays if you pull down an open source project that's written in C and you type make, it's like, oh, it's built. What's going on there? How do we get into this stage where we accept like a 25-minute build as being like just one of those things that you have to endure when you're writing in C++? So it's interesting to see other languages taking a step back and going, we should fix that first. Developer productivity trumps almost everything. So you mentioned your day job a little bit there. Is there anything you can talk about? I mean, we've had a desire to get someone on to talk about high frequency trading and what that industry
Starting point is 00:34:12 is like as a C++ developer. Right, right, right. Obviously there's a limited amount that I can talk about. The industry as a whole is secretive, which is an interesting thing. So prior to this, obviously, I worked at Google, and, you know, they're also secretive, but in a different way. The trading industry is an interesting one. I mean, obviously, a small selection of what we do as an industry is very time-critical, very latency-sensitive. And people are prepared to go to extraordinary lengths to reduce the time it takes to do things. And that often means writing close down to the metal,
Starting point is 00:34:53 you know, assembly, C, C++, obviously are good choices there. But there's still an enormous amount of stuff that we do where Java and Python are still very good choices. And then as far as I can tell, the really fast people are starting to look beyond conventional hardware and conventional C and C++, and they're going to extremes, designing their own hardware
Starting point is 00:35:19 and things like that, which is fun and interesting sounding. But my take on that is that you always still need something to talk to that hardware, and, you know, that means that you need to understand, again, at the basic level what hardware in general does. And I feel that C and C++ are the better languages to be able to learn those techniques and understand it. I mean, I've spent time, you know, trying to explain to dyed-in-the-wool Java programmers that we don't parse a lot of the data that's coming in. There is no parse step. I just cast it to be the right structure and read the data out of it. It's like, memory is a slab of bytes, and I can choose how I interpret those bytes however I like, and it
Starting point is 00:35:59 doesn't cost me anything. There's no conversion, there's no anything. And they look at me like, what do you mean? And it's like, no, but that's how the computer works. You have to understand how the computer works. And I think, you know, certainly from our company's perspective, we're always looking to hire smart people, so I get my little micro plug in there. But it's becoming more difficult to find people who have, I think Martin Thompson coined this term, mechanical sympathy, like understanding that in order to get the most out of something, you do have to understand how it works at a sort of deep level, even if you're going to be writing in Java or Python or whatever. And C++ gives you a good preparation for getting there. There's
Starting point is 00:36:43 a presentation that I did for the GOTO conferences on how the computer really works, which is like lifting the lid even of what the instruction stream does under the hood of the x86. And I was able to find an example of writing something in Python that caused the branch predictor to go wrong.
Starting point is 00:36:59 And I could demonstrably show that even in Python, you can write code that causes bad performance issues. If you don't understand that like 12 layers of complexity beneath where you're writing, there is actually something you might need to at least know exists. I'm not saying every Python programmer needs to crack out like some textbook on how branch predictors work, but you know, you have to be aware that it's there. So are those talks available online?
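The earlier "memory is a slab of bytes" point, reading a structure straight out of incoming data with no parse step, can be sketched as follows. The `PriceUpdate` layout and the `read_price_update` name are hypothetical, and the sketch assumes the sender uses the same byte order and field layout as the receiving machine. It also swaps the direct pointer cast described in the conversation for `std::memcpy`, which sidesteps alignment and aliasing problems and which optimizing compilers typically turn into plain loads.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical fixed-layout message, assumed to match the wire format exactly.
struct PriceUpdate {
    std::uint32_t instrument_id;
    std::uint32_t quantity;
    std::int64_t price_ticks;
};

static_assert(std::is_trivially_copyable<PriceUpdate>::value,
              "byte-wise copying requires a trivially copyable type");
static_assert(sizeof(PriceUpdate) == 16,
              "layout assumption: no padding between or after the fields");

// Reinterprets sizeof(PriceUpdate) raw bytes as a PriceUpdate.
// No field-by-field parsing or conversion takes place.
PriceUpdate read_price_update(const unsigned char* bytes) {
    PriceUpdate msg;
    std::memcpy(&msg, bytes, sizeof msg);
    return msg;
}
```

A caller would pass a pointer into the receive buffer at the offset where a message starts, having already checked that at least `sizeof(PriceUpdate)` bytes are available there.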
Starting point is 00:37:24 Yes. I think if you look on YouTube, there's GOTO Conferences, and just type my name in. I think you'll probably find them there. Okay. One thing we didn't mention in the news is that the C++Now 2016 submission deadline is coming up two days from now, which I guess is the day that we're going to air this episode, probably. But anyhow, have you spoken at any C++ conferences or have you considered it? I've considered it, yeah. If I had enough of an axe to grind. My passion is in a particular area, as you can probably tell. And although I've been variously described
Starting point is 00:38:06 as someone who knows C++ pretty well, as I've been doing it so long, I've never really had a burning ambition to proselytize for the language, but I guess I would consider it. Well, they're fun conferences, just thought I'd throw it out. So you've had
Starting point is 00:38:21 experience working in other industries, too. Embedded systems, AAA video games, which you mentioned during your introduction. Well, everyone describes their games as AAA. The public ultimately gets to decide, though, I think. Is there any other interesting experience from some of that past work that you'd like to share relating to C++? Oh, that's a big question. It is a big question, isn't it? I mean, obviously, pretty much every game ever has been written in C and C++.
Starting point is 00:38:52 It's, again, for all the reasons we've talked about, it's pretty much the same as the trading industry. There is a certain set of things you want to get done really, really, really fast or with very low latency, be it react faster to input from the exchange to send more trades at the right time, or be it getting more explosions on the screen or whatever. There is something about doing things efficiently and fast.
Starting point is 00:39:19 That said, I think there are extra tricks in the games industry where, certainly in the console world, you have the luxury of having the same hardware in front of you as the people at home are going to have in front of them, which means that you can go to town on specific optimizations. And to an extent I have the same luxury now. You know, we're in this situation at work where, if we need to deploy something, we buy a particular computer that I then target, and I put it in a data center and it runs my code, right? So I can assume that particular instructions are
Starting point is 00:39:50 available, or memory is going to be laid out in a particular way, or whatever. What scares me is doing anything like that for modern PC games, where there is such a heterogeneous, I can never say that word, environment of computers out there,
Starting point is 00:40:06 and each of them have different RAM, different speeds of RAM, CPU features, GPU features, that doesn't leave much space for creative solutions to problems, which is something I've always enjoyed. It's like finding tricky ways of doing stuff. This is not specifically a C++ thing, so forgive me for a second, but writing a game for both Xbox and PlayStation 2 at the same time was fun because they had wildly different feature sets
Starting point is 00:40:35 and trying to make the rendering system of one work commendably well on the other one was an exercise in like, well, if we tell the graphics card that it's actually a 16-bit texture even though it's an 8-bit texture, and then we draw tiny little triangles over here that just pick out the red color from the texture, and then we can use that as an index over here, and, you know, stuff like that. Stuff where someone has to sit down and explain it on paper
Starting point is 00:40:59 and draw it out, and you're like, oh yeah, you can do that. Oh, you said you couldn't do, you know, bump mapping on the PlayStation. It's like, well, it turns out you can, if you have enough smart people working on something and a constrained problem that isn't changing, which I guess is computing. Right, right. Well, so you've got a few other toy projects on the side, right? Some emulators you've written, is that correct? That is right, yeah. So my other passion is reliving the glory days, as far as I'm concerned, of 8-bit microprocessors in sort of like the late 80s, early 90s. And so I took particular pride in taking something which I had grown up with as a child, which was both the Sega Master System, which was the 8-bit console before the Genesis came out, and the BBC Micro, which was
Starting point is 00:41:53 an 8-bit processor that was only available in Britain where I grew up, and seeing how fast I could write an emulator in JavaScript. And it's embarrassing to say that even a badly written emulator, as mine are, in JavaScript on a modern computer will run at full speed quite easily. And it's a lot of fun. If no one's ever written an emulator before and is wondering about it, it's one of the most rewarding things you can do because you hack on something for a few hours,
Starting point is 00:42:24 you get the first instruction emulated, and you kind of run the ROM, and you're like, oh, yep, I can see that the registers are changing. That's exciting. And you keep going, you keep going. And a couple of days in, suddenly you're running a whole new game or something that you never wrote, right?
Starting point is 00:42:41 I never wrote Altered Beast for the Master System, but I could play it suddenly, even though I had nothing to do with it. It's just really rewarding. And then suddenly all these other games start working, and before you know it, you've got a library of games to play from, and the warm glow of having made it possible yourself.
Starting point is 00:42:59 That sounds cool. Is that done in pure JavaScript, or using Emscripten? It's actually done in pure JavaScript. I did look at Emscripten, and I have also looked at hand-writing asm.js, which I'm told is insane. And it is.
Starting point is 00:43:16 It's almost impossible to get right. The browsers will very quickly drop back to going, oh, no, no, no, I don't know what this is, if not everything is in place for asm.js. That's been a wonderful journey, actually. So joking aside, to actually get it to not take a whole CPU on a fast computer to run
Starting point is 00:43:37 does take a bit of doing. And I tripped over a whole bunch of performance issues. And the guys, particularly at Mozilla, the Firefox team, were amazing at responding to bugs. I just tweeted somebody randomly saying, oh, this seems to be slow. And before I knew it, I'm exchanging emails with Alon Zakai and getting stuff fixed. And he's filing bugs on my behalf, and I'm sending him snippets of code that show up pathological behavior in the browser, and they're fixing it two or three days later, and it's out and released. And you're like, wow, this is how open source should be, and this is awesome.
Starting point is 00:44:14 That is awesome. Wow. Okay, well, Matt, I'm not sure if I have any more questions for you. Is there anything else you wanted to bring up before we let you go? No, I think I'm good. I very much enjoyed being here. Thank you for having me on. Where can people find you online? So my Twitter handle is my name, @mattgodbolt. And my blog is at Xania, X-A-N-I-A dot org.
Starting point is 00:44:41 And just Googling my name is unique enough usually to find most of the stuff that I'm up to. Well, thank you so much for your time. Thank you. Thanks for joining us. Thanks so much for listening as we chat about C++. I'd love to hear what you think of the podcast. Please let me know if we're discussing the stuff you're interested in or if you have a suggestion for a topic.
Starting point is 00:45:00 I'd love to hear that also. You can email all your thoughts to feedback at cppcast.com. I'd also appreciate if you can follow CppCast on Twitter and like CppCast on Facebook. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode is provided by podcastthemes.com.
