CppCast - Rust
Episode Date: July 23, 2015

Rob and Jason are joined by Steve Klabnik to discuss the history of the Rust language and some of its key features.

Steve Klabnik is a Ruby and Rails contributor, member of the Rust core team, and a hypermedia enthusiast. He's the author of "Rust for Rubyists," "Rails 4 in Action," and "Designing Hypermedia APIs." When Steve isn't coding, he enjoys playing the Netrunner card game.

News
Get rid of those boolean function parameters
Concepts TS voted out (in)

Steve Klabnik
@steveklabnik
Steve Klabnik's Home Page
Steve Klabnik's GitHub

Links
The Rust Programming Language
Transcript
This episode of CppCast is sponsored by JetBrains, maker of excellent C++ developer tools including CLion, ReSharper for C++, and AppCode.
Start your free evaluation today at jetbrains.com slash cppcast dash cpp.
And by CppCon, the annual week-long face-to-face gathering for the entire C++ community.
The 2015 program is now online.
Get your ticket today.
Episode 20 of CppCast with guest Steve Klabnik,
recorded July 23, 2015. In this episode, we'll discuss why you shouldn't be using Boolean function parameters.
Then we'll interview Steve Klabnik from the Rust core team.
Steve will educate us on the history of the Rust language and talk about some of its key features. Welcome to episode 20 of CppCast, the only podcast for C++ developers by C++ developers.
I'm your host, Rob Irving, joined by my co-host, Jason Turner.
Jason, how are you doing tonight?
Doing all right, Rob. How are you doing?
Doing pretty good. Really busy time for me right now.
I'm in the middle of moving.
Actually, when I'm done recording this, I won't be needing this computer anymore. I'm going to pack it all up to
send out to North Carolina. So, exciting and busy. Wow. Yeah. And I'm sorry to all listeners for
missing an episode last week. I was traveling for the new job, and I meant to pack my microphone with me and forgot it.
So I'm sorry to everyone about that.
But we already have an episode planned for next week,
so we should be back on track.
So at the top of every episode, I'd like to read a piece of feedback.
This week I have a comment from Facebook.
Jason Cleary wrote in.
He said, I discovered this podcast a few weeks ago
and have been knocking out old episodes in my one-hour commute.
I know you're moving, but I'm looking forward to the next one.
Keep up the great work.
So thank you, Jason.
As I said, we should be back on track.
Sorry again for missing that episode.
And we'd love to hear your thoughts about the show,
so you can always email us at feedback at cppcast.com.
Follow us on Twitter at twitter.com slash cppcast.
And we always like to see your reviews on iTunes,
so you can definitely review us there.
We'll really appreciate that.
It'll help us get more listeners.
So joining us tonight is Steve Klabnik.
Steve is a Ruby and Rails contributor.
He's on the Rust core team, a hypermedia enthusiast,
and author of Rust for Rubyists, Rails 4 in Action, and Designing Hypermedia APIs.
When Steve isn't coding, he enjoys playing the Netrunner card game.
How are you doing tonight, Steve?
I'm doing great. How are you guys?
I'm doing good.
Awesome. You know, I gotta say, I had not
played Netrunner, so I just looked it up
and Wikipedia lists
skills necessary to play
the game as card playing
and arithmetic.
Yeah, so it's the game that Richard
Garfield made after Magic: The Gathering,
and then when it died, along with all card
games in the late 90s,
eventually Fantasy Flight Games picked up the IP, and so they've re-released it. It's much more
like a board game now. So instead of, like, Magic, where you buy random packs of cards, you just get
the full expansion every time you buy stuff, and so it tends to be much more skill-based. The
theme is, like, evil corporations and hackers trying to steal their secrets. So there's lots of computer-y stuff,
you use programs to break into servers,
and it's just great.
It's got a great,
very William Gibson, Neuromancer kind of theme.
Sounds like fun.
Yeah, it's a good time.
So before we get into talking about Rust tonight,
we want to just talk about a couple news items.
The first one is this blog post,
which is just some good general programming tips.
It's about getting rid of Boolean function parameters.
And the blog post is basically saying how this common problem that a lot of programmers will run into
where they find some function that maybe almost does what they want to do,
so they wind up sticking a new Boolean parameter in it
and saying maybe the default is true
and I'm going to say false in this one scenario.
And it works great for this one situation,
but then you forget what the parameter is for
the next time you look at this function.
Jason, do you have any thoughts on this one?
Yeah, I kind of liked this.
It's just an idea that I hadn't really seen before.
And I think it was particularly appropriate for us C++ developers, since we don't
have named parameters. And a lot of the rest of the world can get away with doing this kind of
thing with named parameters. But just interesting idea to pass an enumeration value in instead of
a Boolean. So it's more obvious what exactly you're asking the function to do. Right. And then
the other option would be actually having separate functions
but using the same underlying
implementation. So instead of
having just one function with the Boolean parameter,
you'd wind up having three functions, but
the third function that's doing
all the work, you would never actually be calling
directly. Right. It'd be another way
of doing it cleanly.
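The two options Rob describes can be sketched in Rust. (The names here — `Wrapping`, `render`, and so on — are made up for illustration; they aren't from the blog post itself.)

```rust
// Option 1: an enum instead of a boolean flag.
// With a bool, `render(text, true)` tells the reader nothing;
// with an enum, the call site documents itself.
#[derive(Debug, PartialEq)]
enum Wrapping {
    Wrap,
    NoWrap,
}

fn render(text: &str, wrapping: Wrapping) -> String {
    match wrapping {
        Wrapping::Wrap => format!("[{}]", text),
        Wrapping::NoWrap => text.to_string(),
    }
}

// Option 2: separate named entry points sharing one underlying
// implementation, so callers never pass the flag directly.
fn render_wrapped(text: &str) -> String {
    render(text, Wrapping::Wrap)
}

fn render_plain(text: &str) -> String {
    render(text, Wrapping::NoWrap)
}

fn main() {
    // The enum makes the intent obvious at the call site:
    println!("{}", render("hello", Wrapping::Wrap));
    println!("{}", render_plain("hello"));
}
```

Adding a new mode later means adding an enum variant rather than changing the function's signature, which is the point Steve makes below.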
Steve, what did you think about this article?
Does this still apply to Rust
developers? Yeah, I actually first came
across this idea in a Ruby context,
and I think this is generally applicable across
languages. I really like the enum
variant a lot, just because
the example shows a call like
calcFormula(Type::IsGain), which has so much more
semantic information than true or false,
that I think that even just that alone
makes it much more useful.
Also, if you need to add something later,
you're just changing the enum instead of changing the
signature of the function, which is a really nice
way of doing that kind of thing.
So yeah, I definitely
think this is good advice overall, regardless of language.
Yeah, absolutely.
So the next
article, this actually just came
from a tweet.
Eric Niebler tweeted out that the Concepts TS was voted out.
And when I first saw this tweet, I thought, oh, no, you know, Concepts isn't going to make it into C++17.
That's not good.
Everyone was really looking forward to that.
But by voted out, apparently the committee actually means voted in.
It's just weird committee syntax.
And I wasn't the only one who thought that.
I looked at these comments on Reddit where multiple people all thought voted out was a negative.
So that's great.
So concepts should make it into C++17. I think there was some detail saying that they already got an implementation of
concepts working with the GCC compiler.
Yes, you can.
I've seen some people on Twitter talking about that, that you can play with it right now,
I believe.
Very cool.
So yeah, hopefully it should make it into Clang and MSVC for C++17.
Good stuff.
So Steve, let's get into talking about Rust. So,
you know, this is a C++ podcast, but we've talked a bit about Rust and D, and I think some of our listeners are interested in what is Rust all about? And, you know, what should we know about
it as C++ developers? So where do you think we should get started with that?
Yeah, absolutely.
So I guess the first way of sort of approaching this question is, like, how Rust and C++ compare, obviously, right?
And I think that the most interesting thing about Rust, and this has only been true, well, let's back up with, actually, I guess, history.
So Rust started off eight years ago, actually.
It's a really old project.
It was a personal project of this guy named Graydon,
who originally wrote the ClearCase version control system,
among other things.
And Graydon's a longtime C++ dev,
and he wanted to make a language that was focused on safety.
So Rust is kind of
interesting because over this eight-year history it's had the exact same goals at a high level,
but the way it's accomplished those goals has changed drastically. So Rust has kind of been
four different languages over eight years, from a language feature perspective. But that's
sort of because the way that we accomplish
those goals changed. And every single time they changed, it was less and less overhead,
closer and closer to the metal, quote unquote. You know, I always find that phrase to be kind
of funny. And, you know, getting rid of more and more things. So to the point where nowadays with
Rust, it doesn't even have any more of a runtime than C or C++ does.
And so it's like actually usable as a replacement for C and C++ in those contexts.
I know that C++ people are sick of languages coming along and saying like,
we can do everything C++ can, right?
Because they're always missing out on some sort of thing that makes C++ special.
And I think that Rust is one of the first languages that can actually truly say
that it is a viable option in places where C++ or C is your only choice today.
I don't want to fall into the C equals C++ trap either, right?
So I'm just saying that because it's a podcast.
I'm aware they're very different languages.
So Mozilla adopted it about halfway through that history, about four years in. And the reason that Mozilla is interested is because Firefox is currently
about 4 million lines of C++.
And when we did a little bit of an audit recently against the security bugs filed with
Firefox, about half of them come down to memory unsafety bugs that end up leading to some
kind of security vulnerability eventually.
So the idea of a memory-safe language with
little overhead is very, very
appealing because web browsers need to be
incredibly fast, but they also need
to be very secure and
resistant against attacks, since
obviously web browser is one of the primary ways
that hackers will get into your system.
And that does not mean that Rust is inherently super secure,
but it just eliminates a lot of the common
memory-unsafety things
that end up turning into, you know, remote code exploitation
and all that kind of stuff.
So is the project actually being used in Firefox and Mozilla right now?
So one of the things that was done to make sure that Rust
was an actual useful language is we, first of all,
implemented Rust in Rust relatively early on in Rust's lifecycle,
so, you know, that makes it useful.
But also we're developing
an alternate browser rendering engine
called Servo,
so it's sort of like Gecko.
Not so much Firefox itself as a whole,
but just the rendering part, Gecko.
And that project is called Servo.
So that's been developed alongside of Rust
and it's been very useful
in informing the language
on what things are actually useful
in a real-world developer context.
So we would write new features in Rust,
try them out in Servo,
turns out they were useful, so we kept them,
or it turns out they weren't useful,
so we got rid of them,
and that kind of feedback cycle
really informed the design of the language.
As far as Firefox goes,
some Rust code has landed in the tree very recently,
but I believe that still none of it
is actually user-facing yet.
There's actually a significant amount of work that needs to be done
with things like the build system to be able to get, you know,
the build system able to build and integrate Rust code
into the rest of Firefox.
And so that's sort of where the work is being done right now
is in those kinds of things.
And slowly we will replace chunks of Firefox with Rust code.
The first thing that's landed is a media type header parser, basically.
And there's another patch in the queue to change all the URL parsing code with a Rust URL parser.
So those are sort of the first pieces they're going to land.
And they're in the queue.
There's just some work that needs to be done before that actually lands for real.
So I imagine it's a really big job
to implement a new rendering engine like Servo, right?
Yeah. So is that from scratch, or is it a port of Gecko to Rust?
So its design is from scratch in the sense that it's not, like, based
off of... So one of the big goals, actually, is to prove out how parallel rendering can really help
improve the speed of a rendering engine. So if we copied the design of Gecko, that would sort of not take that into account.
So it's from scratch in the sense that algorithmically it is on its own.
However, there is a number of libraries that are essentially wrappers to C and C++ components.
So, you know, for example, it uses like Skia, which is not written in pure Rust, right?
So there are large chunks of it that are being sort of reused, but the core stuff is, you know, totally from scratch.
And components are slowly being rewritten in Rust, you know, as that becomes more and more, I don't want to say feasible, but like worth spending the time on, right?
Like you don't want to spend all of your time reinventing wheels.
You want to actually get a project done.
So that's sort of how it's gone.
As it makes sense to replace components, components get replaced.
But there's still a lot of C and C++ code as well.
So maybe I'm jumping ahead here, but I was curious.
What I was trying to get at is,
have you worked with trying to port C++ directly to Rust?
Does that make sense?
Are there any useful tools for it or anything like that?
Yeah, so actually, just earlier today,
I was messing around with some tools doing binding generation for headers.
So we have really good tooling for C.
Maybe I shouldn't say really good,
but it works well for C.
For C++, not so much.
The problem with writing a C++ to Rust compiler
is that the strictness of the Rust compiler really informs your design. And so it's very, very hard to do a simple machine translation to Rust. You would basically have an unsafe block around your entire program, essentially.
And so you wouldn't be getting very much mileage out of Rust safety features if you were to try to do a direct translation.
And there's also a surprising number of semantics that are a little difficult to port over.
And we can get into that when some of the questions we have queued up get into some of those differences.
But also, it's very hard to
translate the exact semantics of things.
In a language like C++, the exact semantics
really matter, right?
If you're doing a scripting language,
you could do it relatively easily,
but for something where the details matter,
it's very, very important to get the semantics
exactly the same.
So let's dig into some of the features of Rust a little bit.
You talked about how memory safety is so important to Mozilla.
How does Rust achieve memory safety in a way that C++ doesn't?
Yeah, so the primary way that we achieve memory safety
is through compile-time analysis, static analysis.
Again, the speed being so important,
we don't want runtime checks for safety
as much as possible. There are some instances in which that's unavoidable, but there are also some
ways that you can then get around it, which gets a little into the weeds. But the point is we try
to offload as much as possible to compile time checks so that you pay zero runtime overhead,
because speed is important. So that's like the primary mechanism is
essentially doing analysis
on whether or not a pointer
is pointing to valid memory or not, knowing
when memory goes out of scope and gets deallocated
is very important. And yeah,
doing these kind of compile-time checks. It's often
referred to as the borrow checker. I'm doing
air quotes. It's the
primary analysis and also the most interesting
one that is sort of different than modern C++'s stuff.
Okay, okay. The next item I have here is threads without data races. That sounds interesting.
So what's interesting about memory safety is that,
and this is concurrency in general,
right? So there's sort of that saying that
shared mutable state is the root of all
evil. So you can,
the problem is having both of those things at once,
shared and mutable. So if you have
immutable memory, then you can share
it as much as you want, there's no problem.
If you don't have aliasing, then you
can mutate memory all you want and there's no problem. And so that general property of memory safety,
even in a single-threaded context, turns out to be basically the exact same thing that you need
to not have data races in a multi-threaded context. So it actually transitions very nicely
the sort of rules we have for all Rust code apply to threading equally as well. And what's
really neat about that is it means that threading is actually implemented as a library.
So you can write your own alternate threading library if you wish
and get the same degree of safety by using Rust mechanisms.
So one great example of this is there's a library called Mio, M-I-O,
and it does green threading, or not exactly green threading,
but it does, like, event loop, non-blocking I/O stuff.
And it has the exact same guarantees around memory safety as sort of the standard library
thread stuff does.
It uses the same static analysis properties because everything is in a library as opposed
to being built into the language.
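The "shared XOR mutable" rule Steve describes can be sketched with the standard library (this is generic std Rust, not Servo or Mio code): state shared across threads has to go through types like Arc and Mutex, so a data race won't even compile.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn parallel_sum() -> i32 {
    // Arc gives shared ownership across threads; Mutex gives
    // synchronized mutation. Handing a plain `&mut i32` to several
    // threads at once would be rejected at compile time.
    let total = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                // Each thread adds 10; the lock prevents a data race.
                *total.lock().unwrap() += 10;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    let result = *total.lock().unwrap();
    result
}

fn main() {
    println!("sum = {}", parallel_sum());
}
```

Nothing here is special-cased by the compiler: Arc, Mutex, and thread::spawn are ordinary library code, which is Steve's point about threading being implemented as a library.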
So can we dig into that a little bit?
I'm sorry, I don't know if I interrupted what Rob was getting ready to ask here, but...
I was curious about green threading.
Oh, okay.
Yeah, so Rust used to have green threads
built into the language itself.
As I sort of mentioned before,
we used to think that the language
would need to know a lot about details
about things like threading in order to ensure safety.
And so a long time ago, Rust had just green threads,
and the language knew a lot about their semantics
to ensure safety.
As the type system got better, and as we got better with more static analysis, we realized
that wouldn't actually be necessary. And what kind of systems language doesn't have access to
systems threads, right? Like native threading. So we added 1:1 threading as well as sort of
the M:N threading with the runtime. And like any good programmer, we see a similar
interface, so we want to add on abstractions so you can switch between them. So we said, hey, it shouldn't matter what kind of threading you use.
Just set up the runtime to either use M:N or 1:1, and then the interface stays
the same and you can change that up.
The problem is that the overhead involved in doing so means that the M:N threading
was not really significantly faster or better than the 1:1 threading anyway.
And so by getting rid of M:N threads, we could also get rid of the vast majority of
the runtime. And so Rust today
in the standard library only offers
1:1 threading instead of M:N.
And that enabled us
to sort of get rid of all that overhead and stuff.
But now you could
implement, if you wanted to,
you could implement your own green threading library
and then you would basically be building that kind of
runtime overhead yourself.
And it's possible that Servo...
Servo was actually one of the bigger users of green threads.
They got kind of mad at us when we removed them.
Not, like, really mad, but they were kind of like,
oh, come on, now we have to redo all this work.
And so they've switched to 1:1 threads, and it still works just fine,
but they may end up re-implementing a green threading library on their own,
or someone else who comes along may end up implementing one eventually.
All right, so correct me if I'm wrong here: green threading is basically software-implemented cooperative
multitasking. Is that correct?
It's sort of a broad... See, why it's called N:M
threading is you put, like, N user-space threads — they're often called
tasks, or, like, there's a different name for them —
so you map these user-space threads onto an operating system thread.
So N:M means you could have three user-scheduled threads
running on one or two operating system threads,
and the primary advantage is
things like you can do very small stack size
so you can spin up, like Erlang for example
is a language that uses green threading very heavily.
And so you can easily spin up
tens of thousands or hundreds of thousands of green
threads in an Erlang VM, and it'll use
very little memory. Whereas if you were
to spin up 100,000 operating system
threads with their stack size, you know, you
would be using a lot of memory, and it'd be very slow to copy
around. Yeah, Linux is kind of notorious
for that, for allocating a few megs
per thread, I believe. Yeah, yeah.
Yeah, and so it tends to
still be relatively fast, and you can control the stack
size and all that sort of stuff. One of the
reasons, though, is like, you know,
again, the details matter. So, for example,
if you have M:N threading as
your primary threading mechanism, then if you want
to call into C, then you need to
shuffle your stack around to give, you know, the
size of stack that C would expect.
It can lead to less efficient bindings
to C libraries if M:N is your
primary abstraction.
There's lots of details that matter.
Just like any technology,
green threads are awesome in certain use cases
and not awesome in other use cases.
It depends on what exactly you're trying
to achieve.
Okay.
The next point I had here was how Rust can prevent nearly all seg faults.
Yeah.
So I should get into, I briefly will mention, so Rust does have this thing called unsafe.
And unsafe is an annotation you can add around a block.
So you write unsafe, open curly, and then close curly.
And inside that unsafe block, it technically is
only three things that you can do differently. But those three things basically boil down to
whatever you want to do. So you can sort of get around Rust safety checks, because that's
necessary with things like cinterop and also some other things. And so when we talk about Rust,
we're generally speaking about safe Rust, i.e. the language outside of unsafe blocks.
So in a Rust program that does not use any unsafe blocks, you should never get a segfault ever.
If you do get a segfault, it's because somebody did something bad in an unsafe block that they weren't supposed to be doing, basically.
And the advantage there, a lot of people will say something like, well, if Rust claims to be safe but there's an escape hatch,
then how can you ever claim that it's truly safe?
And sort of the answer is that the vast, vast, vast majority of code
does not need to use unsafe itself.
So, for example, Cargo, our package manager, is written entirely in Rust,
and it uses no unsafe code whatsoever.
Unsafe code is usually used in libraries
if you're trying to build up certain kinds of bindings
or if you're writing certain kind of data structures.
And those you need to sort of get around the rules a little bit.
But yeah, in safe code, Rust should never ever segfault, which is really, really nice, frankly.
So these unsafe blocks, they basically just ignore the static analysis you were referring to before?
So I'll give you an example.
So an ampersand T in Rust is called a reference.
And Rust references have this static analysis
that makes sure they're always pointing to valid memory.
Inside an unsafe block,
Rust will still do the static analysis on that ampersand T type.
However, there's also another pointer type,
*mut T or *const T,
that we call raw pointers. And those are basically the exact same as a C pointer.
And so you're only allowed to dereference one of those pointers inside of an unsafe block.
So you can do other stuff with them outside of unsafe and all the checks happen. It's not like
the checks get turned off on the checked constructs; it's just that those things allow you to access the unchecked
constructs. Does that make sense? It's sort of hard to talk about without, like, actually having code on
the page, you know. But that's sort of what it is: it allows you to dereference these
sort of raw pointers that don't do the checking normally. It doesn't really turn off the checking
for the types that are already checked.
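Since, as Steve says, this is hard to talk about without code on the page, here's a minimal sketch of the distinction: a checked `&T` reference works everywhere, while dereferencing a raw pointer is only permitted inside an unsafe block. (The `read_raw` helper is a made-up example, not code from the episode.)

```rust
fn read_raw(x: &i32) -> i32 {
    // Creating a raw pointer is allowed in safe code...
    let p: *const i32 = x;
    // ...but dereferencing it requires an unsafe block, because the
    // compiler no longer guarantees it points at valid memory.
    unsafe { *p }
}

fn main() {
    let x: i32 = 42;

    // A checked reference: the borrow checker proves it points to
    // valid memory, inside or outside unsafe blocks.
    let r: &i32 = &x;
    println!("checked reference: {}", *r);

    println!("raw pointer read: {}", read_raw(&x));
}
```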
I feel like,
are you familiar with the obfuscated C code
competition? Oh yeah, the IOCCC
is amazing. My favorite entry
is the one how they had to add that as a
one byte minimum because GCC would
compile a zero byte file into like a
technically valid executable or whatever.
Yeah, yeah, totally.
So I just feel like the Rust community,
if I'm understanding everything you've said so far,
needs to have a competition to see if people can write code
that is valid, safe code,
that breaks the static analysis and causes a segfault.
Yeah, yeah.
To help vet out bugs in the static analysis.
Well, yeah, there's that.
We've actually had one or two people
try to formalize subsets of Rust,
and that is something we want to do someday.
The problem is that formalization is very, very hard,
and when you're trying to improve the language,
you change the semantics of that language, obviously, right?
So you don't want to halt changing the language
just to prove that it's absolutely safe.
But it is true that, well,
we've yet to really find, like, hardcore bugs in the analysis itself.
However, unsafe code can, you know, humans are fallible.
And so sometimes you'll make mistakes.
There was actually a big discussion.
One of our library APIs was demonstrated to be unsafe shortly before the 1.0 release.
And so we had to, you know, address that, you know,
because occasionally this kind of thing will happen.
The way in which it was unsafe was absolutely
ridiculous.
So the short version is basically
like an RAII guard
that is put into
an Rc-counted
ref cycle where you have
two Rcs pointing at each other so that
their counts are always equal to one.
You could leak an RAII
guard in one of those, and that would cause
unsafety in certain contexts or whatever.
So you really had to be trying really hard
to find the unsafety, but it was theoretically
possible, so we had to fix
that API.
But every once in a while it can happen.
Since you just brought that up, so is the static analysis
able to even say,
sorry, these two objects point at each other and reference each other,
so they'll never go out of scope or they'll never be destroyed or something like that?
So usually the way that that works, if you need multiple ownership,
the core of borrow checking is this concept called ownership, which is something you have to deal with.
We stole the terminology from C++, so you should be very familiar, I would imagine, with that general
notion, right? So an owner is a reference that's, when that reference goes out of scope, the resource
will be deallocated. So if you need multiple owners, so the classic example that people talk
about is a doubly linked list, which is a terrible data structure, but it's everybody's second or
third one they implement, and so they try to build it. A
doubly linked list needs multiple pointers to a thing, right? So there's no one single owner.
And so you can either drop down to that unsafe and just handle it yourself the same way you
would in C. Or you can use reference counting to, you know, say, okay, there are two references to
this thing, and then, you know, we decrement it whenever a reference goes away and that kind of thing.
So in that sense, it can handle those through...
RC is basically a library type
that's implemented using unsafe under the hood,
but the external interface is entirely safe,
and so the analysis knows this is going to bump up
or decrease the count.
And that's sort of how that interacts with that kind of thing.
Does that make sense?
I think so, yes.
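A small sketch of the Rc approach Steve describes: Rc is an ordinary library type with a safe interface, and the count rises and falls as owners appear and disappear. (The `rc_counts` helper is made up for illustration.)

```rust
use std::rc::Rc;

// Returns the strong count before, during, and after a second
// owner of the same allocation exists.
fn rc_counts() -> (usize, usize, usize) {
    let first = Rc::new(String::from("shared node"));
    let before = Rc::strong_count(&first);

    let second = Rc::clone(&first); // bumps the count; no deep copy
    let during = Rc::strong_count(&first);

    drop(second); // the second owner goes away: count decremented
    let after = Rc::strong_count(&first);

    (before, during, after)
    // When `first` goes out of scope here the count hits zero
    // and the String is deallocated.
}

fn main() {
    let (before, during, after) = rc_counts();
    println!("counts: {} -> {} -> {}", before, during, after);
}
```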
One thing I'm curious about is C++ has plenty of static analysis tools.
You can do slash analyze in Visual Studio.
Clang has a great built-in analysis tool.
Usually running those takes a pretty long time compared to a normal compile.
Is that, you know, the case in Rust as well? Is it much longer compile times because of
the static analysis? So compile times are a really interesting question because of the,
like, multifaceted aspects of it, right? So what I will say is that for some programs, we are
already, like, faster than C++. The Rust compiler is often perceived as
being a little bit slow, but I think that's mostly because Go is so quick that people are used to
this near-instantaneous compilation now. And one of the big focuses of the early couple
releases is dropping compile time, so Rust 1.1 features something like a 20 or 30 percent
decrease in compile times.
It's based on our bootstrapping process.
Since Rust is written in Rust, we compile it with itself.
So we drop that time 20%, and Rust 1.2 will feature
another 20% or 30% drop in those times.
One big thing that Rust is missing that can lead
to longer compile times is actually incremental compilation.
We're in the process of developing that kind of thing.
And so that's actually
probably where Rust tends to be the most slow at the moment. And the other thing, actually,
the static analysis does not take nearly as long time as the LLVM's optimization passes.
We currently generate really naive LLVM IR in many places. And so we rely on LLVM to beat it
into submission and make it actually good and quick. And so that takes a long time sometimes.
Usually about half the compilation time is just in LLVM optimization passes,
not in the analysis that we do.
I'd say that it isn't super speedy, but it's mostly due to things that are not the
borrow checker most of the time.
In my experience, most of what slows C++ down is
heavy template usage.
Does Rust support
generic programming like that?
We have hygienic
AST-based macros,
and you can
also, on unstable
nightly Rust, you can write compiler plugins
that will do even more compile-time stuff.
We are missing some aspects of the template metaprogramming stuff you can write compiler plugins that will do even more compile time stuff. We are missing
some aspects of the sort of
template meta-programming stuff you can do in C++.
The biggest problem that we have is
you can't parameterize over integers,
and that's like
a big thing, that sort of weakness in generic
programming in Rust, but generally
speaking, yeah, we do use macros.
They tend to not be
too terribly slow, though.
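A tiny example of the stable, hygienic macro_rules! macros Steve mentions (compiler plugins require unstable nightly Rust and aren't shown): the macro expands at compile time into one function per invocation, a very restricted cousin of a C++ template.

```rust
// A declarative, AST-based macro: each invocation generates a
// doubling function for the given name and type at compile time.
macro_rules! make_doubler {
    ($name:ident, $ty:ty) => {
        fn $name(x: $ty) -> $ty {
            // `x + x` works for both integer and float types.
            x + x
        }
    };
}

make_doubler!(double_i32, i32);
make_doubler!(double_f64, f64);

fn main() {
    println!("{}", double_i32(21));
    println!("{}", double_f64(1.5));
}
```

Note the limitation Steve calls out: macro_rules! can stamp out code like this, but (as of this episode) Rust generics can't be parameterized over integers the way C++ non-type template parameters can.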
Okay, so I gotta take a step back. You just said you can write compiler plugins? It seems like
that would have to interfere with your safety analysis.
Well, I mean, they generate code. So basically what happens is you can load up the compiler as a library, and then ask it for the AST, and then produce a new AST and hand it back to the compiler to actually be compiled.
So that's sort of what I mean by compiler plugins.
And it will still check that AST.
That happens before all of these analysis passes run.
So that doesn't get around them exactly.
It's before those analyses passes occur.
So one thing you could do, for example,
is write your own lints in Rust.
And so you can analyze the AST,
see a pattern that you think is bad,
and then emit a warning or whatever
and do that kind of thing.
So it's more like hooking into the compiler's existing stuff
than it is getting around those checks.
That phase happens later.
Interesting. C and C++ have a long history going back to the early days of programming itself.
Still, it's hard to find a good development tool for these languages. Luckily, our good friends
at JetBrains, after spending over a decade making all sorts of tools for a great many technologies,
now provide C and C++ developers with three dedicated tools: CLion, ReSharper C++, and AppCode, with support for C++ and Boost. C++ templates and macros are resolved correctly and supported throughout
each tool. Find your way through the code quickly with hierarchical views and instant navigation to
a symbols declaration. Boost your productivity by generating the missing members with override
implement actions. Rely on code refactorings and be sure that your changes are applied safely
throughout the whole code base. Write better, safer, and more efficient code with on-the-fly code analysis. Or if you develop for iOS and OS X, use AppCode. Visit jetbrains.com slash cppcast dash cpp to learn more and download your free evaluation.
Or get a private license at a 25% discount using the coupon code cppcast jetbrains cpptool.
And if you're a student or an open source project, use all of them for free, courtesy of JetBrains.
Okay, the next thing I want to talk about was efficient C bindings, which I think you've
already mentioned a little bit.
Yeah, so basically the deal is, once we got rid of that runtime and went with one-to-one
threading, there's no reason we can't have C bindings as efficient as
you can get in any language, right?
We know how to use the C calling convention, and everything else is set up just fine, so
we're basically on par in terms of
binding with C as you would be if you were
binding to a C library in C++. It's the same
thing.
Okay.
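To make that concrete, calling into C from Rust looks roughly like this; the sketch declares `abs` from the C standard library, which is already linked into any Rust program:

```rust
use std::os::raw::c_int;

// Declare a foreign function with the C ABI. The call compiles down to
// a plain C-convention call with no marshalling layer in between.
extern "C" {
    fn abs(input: c_int) -> c_int;
}

fn main() {
    // Crossing the FFI boundary is `unsafe`: the compiler cannot verify
    // that the foreign code upholds Rust's invariants.
    let x = unsafe { abs(-42) };
    println!("{}", x); // prints 42
}
```

This is the whole story for simple cases: no wrapper generation step is required just to call a C function.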
Are there any other key
features of Rust you wanted to bring up before
we move on? One thing that I think is
really big and important, especially
for a C++-focused audience,
is Cargo, which I made a reference to a little bit earlier. So you can write Makefiles if you
want to, but almost virtually every Rust project does not use Makefiles. We use Cargo instead.
So Cargo is very similar to, if you've used them, Bundler in Ruby, or npm in Node, or kind of like pip and virtualenv
in Python. Basically the way that it works is it's a combination dependency manager plus
build tool. So you say, like, I depend on these packages, and then Cargo knows how to fetch,
build, and compile those packages and link them in for you automatically, so you don't need to worry
about setting up git submodules or
adding linker flags or dealing with all that kind of stuff. And so it makes it very, very easy
to share Rust libraries and use them in other projects. So we have a website, crates.io, that's
our package host. Packages are called crates; we like this
very working-class idea, you know: here's your package of code, we're shipping them with Cargo, right?
We have 2,500 or 2,600 packages currently available, and they've been downloaded 4.4 million times so far.
Wow.
So sharing code is very, very easy, and people do it a lot because it's very low overhead to do so.
So I think that's one of my favorite things as opposed to working in...
I do C more than C++, frankly, in my spare time,
but comparing depending on a library in C
versus Rust, it's like night and day.
It's very, very trivial to share your code.
Yeah, C++ is starting to catch up
in the package manager space.
There's now biicode,
and I think that's the only one I'm aware of.
Right,
Jason?
Yeah.
Does NuGet allow for some stuff like that too?
Yeah, but just for Visual Studio, though.
I've seen a couple fly by,
but I'm always like,
I'm going to wait and see if they get popular before I pay attention to
them.
So, you know, there are a number of those C and C++ package managers,
but it's important that it's a social thing, right?
So the network is valuable,
so it's hard to bootstrap that kind of thing, for sure.
So you mentioned earlier that you don't support incremental builds yet,
so that affects compile time.
Yes.
So does that mean if you change one of your source files in your project,
it has to rebuild all the source files? So we have this whole thing with crates and sharing these libraries of code, right?
Yeah.
A crate is a unit of compilation.
So if you break your project up into four crates,
it will only rebuild the one crate that you change, not the other three.
So that's sort of the mechanism.
We tend to just keep those things smaller than you would
instead of having big monolithic projects.
And crates are like a tree of modules, basically,
is the way that it works.
So yeah, you'll end up rebuilding the entire crate
that that changes in,
but projects can be composed of multiple ones.
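To sketch the crate/module relationship being described: a crate is the unit of compilation, and inside it modules form a tree rooted at the crate root; the module names here are made up for illustration:

```rust
// main.rs: the crate root. The whole crate recompiles together when any
// of its files change, but dependency crates and sibling crates do not.
mod math {
    pub mod ops {
        pub fn double(x: i32) -> i32 {
            x * 2
        }
    }
}

fn main() {
    // Paths walk the module tree from the crate root.
    println!("{}", math::ops::double(21)); // prints 42
}
```

So splitting a large project into several crates, as Steve suggests, is how you limit how much gets rebuilt on a change.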
All right, I was trying to reconcile that
because you're talking about cargo
and it pulling in dependencies.
I'm like, wait, does that mean I have to build
all the dependencies all the time, or what?
Yeah, your dependencies will not get rebuilt, only your code, and only the crate it's in.
Okay, that's a good point. So does
it support the full complement? Like, can you make shared libraries, dynamically loaded modules,
that kind of thing?
You can build static libraries. You can build dynamic libraries if you want.
We tend to statically link by default
because similar to C++,
we don't have a defined ABI at all, right?
So if I shipped you a rough library,
you would need the exact same build of the compiler
to guarantee that it will work.
It's not like we change that thing all the time,
but we don't guarantee it, so you can't assume.
And so we tend to...
We statically link everything but glibc.
And we added musl support recently, so you can actually statically compile musl in if you want.
But for the most part, you tend to build up these smaller libraries
and then assemble them into one final binary and ship it if you're doing that thing.
What is musl?
I'm sorry. musl, M-U-S-L,
is an alternate libc that's
written specifically with the aim of being
statically linkable, since glibc is not
generally statically linkable.
It's just a libc implementation.
I did not know that existed.
It's pretty neat.
It's missing one or two things,
but for the most part, it's good and it works.
And yeah, it lets you have fully static binaries, which is nice.
That's interesting.
So I was looking at the Rust blog preparing for this show,
and it looks like 1.1 was just released about a month ago.
Yeah.
So that's great.
Now that you're past 1.0,
is everything starting to stabilize a bit more?
Do you think there's going to be more adoption of Rust?
Yeah, so the release cycle and also the stability guarantees are interesting.
So what's funny is Rust has a history of being very, very unstable.
And the ironic thing is that's because we care so much about stability.
We wanted to make sure that everything was set up correctly before we guaranteed stuff was actually stable.
So we knew 1.0 means we can't break backwards compatibility,
so we hurried up and we broke it as much as possible
to get it into the right form,
so that 1.0 we could say, okay, we are now stable.
And so we've adopted this six-week train release model.
So what happens is 1.0 came out,
and then six weeks later, 1.1 landed.
In about a week and a half, two weeks, 1.2 is going to come out. And because those are point
releases, they are backwards compatible with the previous Rust releases. So you can
upgrade your code from 1.0 to 1.1, and it will just compile and work. I will say that there is a small asterisk.
We do reserve the right to change the type inference algorithms
every once in a while, and if we do,
you may need to add a minor annotation
to sort of fix an ambiguity that pops up.
But what we do is we actually have a tool that can run...
So again, I said we have those couple thousand crates up on crates.io.
So we have a tool that can run and compile every open source crate that's possible and check to see if we've broken any of them.
So every other week or so, we sort of run the head version of the compiler against every open source bit of code.
And we can know, hey, did we accidentally break anything or not?
And then address those kinds of issues.
So one recent example of that is we just changed a lint.
And it turns out that some projects had essentially turned all lints from warnings into errors.
And so we thought because that lint was not on by default that we wouldn't break anyone's code.
But it turns out we broke like 20 or 30 packages. So we've now reverted that change so that we won't break those packages when an actual release comes out.
So we're taking backwards compatibility very, very seriously.
And, you know, we're making those guarantees
very strongly and taking steps to make sure that we don't break your code
when Rust updates.
So you said that 1.1 is compatible with 1.2, or 1.2 is backward compatible.
So is it fair to say you're at least somewhat following semantic versioning scheme?
2.0 would be a breaking change?
Absolutely.
And that's sort of the...
SemVer is what we cite in that kind of thing, for sure.
We hope that 2.0 will not be any sort of massive breaking change. I personally would
like to see the kind of thing where the last release in the 1.x cycle, if you have no
deprecations, you can upgrade to 2.0 transparently. So we'll sort of, like, deprecate things and then
eventually remove those deprecations. But if you're running with no deprecations, then you'll
just upgrade. But we'll see. 2.0, you know, we just landed 1.0 and we're working on stuff,
so we're not really totally talking. We've talked a little bit about what a 2.0 might look like in
the future, but, you know, that's far off and we don't want to think about it, you know, right this
moment. We're focused on, you know, improving what we have rather than talking about changing
everything again, right? Users don't like when your code breaks, right? As a programmer, I hate
it when my upstream breaks me, so I want to be a good upstream maintainer
and not break everyone's code either.
Right.
And what about adoption?
Obviously, Mozilla is using it and starting to use it more for Firefox,
as you said earlier.
Do you think there's other big projects that are starting to look into Rust?
Yeah, so actually one of the things we've been doing
is we've started scheduling calls with production Rust users to sort of get their feedback and see what their pain points are and how we can improve it.
So I actually spent about eight hours last week on conference calls with people that are using Rust in production, and largely they are super psyched.
You know, there's obviously some things that could be better, like any project, and so we're working on fixing those kind of pain points for production users.
But we definitely have seen a significant uptake since 1.0
because there's a lot of people who are saying,
I'm interested in this language, but I'm going to wait until I don't have to relearn it every single week,
which is an incredibly, totally reasonable opinion to have, right?
It was my job to pay attention to Rust,
and it was hard for me to keep up even as a
full-time thing. So, you know, now that 1.0 is out, people are starting to use it.
I saw yesterday that probably the best-known company that is starting to use Rust actively
is Dropbox, and they've said they're going to start open-sourcing some code in about a month
or six weeks from now. And so that's kind of nice; they've been working on some stuff in private. So, yeah. And Chef, if you know Chef, the deployment and infrastructure-as-a-service company, their new API client is written in Rust. And the other way that new things get adopted is, you know, new tiny companies build stuff.
So we have a number of smaller companies that are using Rust as well.
But there are people, you know, saying some random startup, you probably haven't heard of them.
So, you know, it doesn't carry as much weight as those bigger names, as much as we do care about their cases as well, obviously.
So you mentioned Dropbox. At last year's C++ conference, CppCon, I know Dropbox had, I think, two presentations
where they talked about mobile application development using C++
and they open-sourced Djinni.
So I'm guessing they used Rust for more back-end stuff.
Is that kind of common from what you're seeing in Rust usage?
Yeah, so you can use Rust on mobile.
We actually have Android in our CI,
so every commit gets tested against Android,
and we have a community member who tests iOS as well,
so both of those things work just fine.
But Dropbox, my understanding,
and I basically gleaned this from, I'll talk about this in terms of, the public comments
they've made on this,
is basically that an internal block file storage system
is what they're using Rust for.
So it's very much a thing that's running on their
backend servers, not something that's being shipped to people.
Yeah.
So you just mentioned...
I'm sorry, go ahead.
I was just going to say, yeah, so I do think that
the backend tends to be where things are shipped.
There is one of the first
production users is Skylight,
which is the product from tilde.io.
Tilde is sort of known as the company that started EmberJS.
And their product is a Rails performance monitoring application.
And so the Ruby gem you install into your Rails app
to do the performance monitoring is actually implemented in Rust.
So they're an example of someone that's actually shipping Rust code to end users.
But, yeah.
Okay, you just brought up
two questions for me then.
You mentioned
iOS and Android. I assume you have
Mac, Linux, and Windows
all working? Yep.
Are there any GUI toolkits?
Like if I wanted to make a cross-platform GUI application
for several platforms?
So there's really only one that is natively Rust,
and it's sort of an intermediate mode graphics library,
so it's more useful for things like games
that are inventing their own UI.
You can bind to things like GTK
or to the native Windows or macOS interface if you want to.
Qt is a little harder to bind to,
but GTK works really well in my understanding.
There hasn't been a whole lot of new ones developed yet,
but there's been a lot of bindings to existing
cross-platform GUI things.
Qt is all C++,
whereas you'd be able to use the native C
interface for binding to GTK.
Right.
Qt also heavily uses some
features that we don't have a direct analog for,
so we don't have, like, variadic functions.
And so that's, like, a thing where it's hard to do the mapping.
But, yeah, that's my understanding, at least.
Well, so then you mentioned this Ruby gem that uses Rust for its actual implementation.
So what's that story like?
Does the Rust have to talk to the C API for Ruby
to implement the bindings? Or do you have some other tool that makes binding to scripting
languages easier, like SWIG in C++? So at the moment, it's pretty much like Rust exposes some
C functions, the Ruby FFIs into it, and Ruby thinks that it's a C library instead of a Rust library,
and it just works.
I personally actually spent most of today working on, well, not actually working on,
doing the research that will let me write the code to try to write some libraries to make that easier.
I'm going to work on Node and Ruby first, and I would love Python.
There's some Python stuff, too.
I just know it a little worse, or not as well, I guess.
I have libraries in my head that will make it easier.
They don't exist yet,
but they will be coming.
But for now it's pretty much like, yeah,
you just write extern "C", and then the dynamic language assumes it's just a C
library like any other.
So if you write extern "C",
you could use nm and dump the C symbols from your library;
it just looks like a C library to you.
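Going the other direction, the extern "C" export Steve describes looks roughly like this; the function name is just an illustration:

```rust
// `#[no_mangle]` keeps the symbol name unmangled, and `extern "C"`
// gives it the C calling convention, so a dynamic language's FFI
// (or `nm` on the built library) sees an ordinary C symbol.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // Callable from Rust too; a real shared library for Ruby/Python/Node
    // would be built as a dylib/cdylib and would have no main at all.
    println!("{}", add(2, 3)); // prints 5
}
```

From Ruby's side, a gem can then FFI into this exactly as it would into any C shared library.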
Yep. Okay. So I mentioned that we were going to
have you on in our last episode, I think, and we actually had a listener
write in with some questions.
So this is from Isabella Muerte, and she wrote in
I find it odd that it's required even for types that use a reference to an object.
For instance, it should be obvious that a database transaction cannot outlive a database connection,
yet Rust requires this bizarre syntax for generic lifetimes.
I was hoping you might be able to shed light on why I must be explicit with my lifetimes
like I must with forwarding references or generating xvalues (std::move)
in C++.
Yeah, did that come across? Yeah? Okay. So there's a couple of things about this
question. I guess I would say that the first thing, sort of the core of this question to
me, is: why must I write these annotations? And the reason is that we have been conservative
with the amount of inference that we've been writing
because we want to sort of feel where the pain points are
and then write inference rules that make sense.
So at first, early Rust,
you actually had to write all of the annotations all of the time.
And that was sort of an undue burden on function signatures.
So we added a couple of very simple...
They're more like elision than they are inference.
It's like a pattern.
If it sees the pattern, it matches,
and it just assumes that the annotations are there.
On structs, which is sort of what this is referencing to
about a database transaction with a database connection,
you need to annotate those references explicitly still
because we decided we want to be conservative with the inference
and see how painful
it was in real code before we
added more complicated inference rules
basically.
So in the future, we can possibly
add additional inference to make
that a little easier. We're just sort of taking
the conservative slow steps right now.
And
oftentimes, too, outside of struct definitions:
You need to write those annotations when it's not actually possible to do the inference
itself.
So while it's true that a transaction should not outlive the connection, Rust
does not necessarily know that that is true.
It's sort of hard without the exact code to tell you why in those cases, but, you know, we try to not force you to write those things unless you actually have
to. And there are some instances where you do need to inform the compiler of your intention,
and that's when you need to write those kinds of annotations.
So we didn't talk about this earlier. What do these annotations look like in the code?
Yeah. So just like you do a
less-than, greater-than with
a capital T if you're doing a generic
over a type parameter, right? It's like the same
syntax as it is in C++.
You write a function that's generic over some T,
you do less than T,
greater than, etc.
You also do this with lifetimes.
Lifetimes are a particular
kind of generic parameter that's used on
references. So as I mentioned earlier,
references are kind of like pointers, except for they do
the static analysis checking.
So you actually annotate
and you say, this reference has a generic
lifetime parameter. Usually it's
just written like tick A, tick B,
tick C. We tend to, just like
you tend to use T for a type,
use one letter instead of typing out the whole word type
or writing a more complicated name for the generic,
we tend to use single letters starting with A
to indicate generic lifetimes.
And what that does is that lets Rust analyze
how long that reference is expected to be alive
and to ensure that what that reference is pointing to
remains valid for the entire time that
that reference is valid. So currently that's based on scopes. So, you know, a reference goes into
scope, and then it checks that it's valid the entire time until it goes out of scope. There's
some patterns where that's a little awkward. So one of the improvements we're hoping on making
is to actually move that analysis to the control flow graph instead of being scope-based, and that would
let you drop a reference before it actually goes out of scope,
if that makes any sense.
Like, if you only...
Even though a reference might be valid for an entire block of code,
if you only use it in the first half of that block,
you could determine that you don't use it in the second half,
and therefore it would be okay
to use, if that makes any sense.
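A minimal sketch of the annotations being discussed; the `Connection`/`Transaction` names echo the listener's example and are hypothetical:

```rust
struct Connection;

// A struct holding a reference must name the lifetime explicitly:
// a Transaction cannot outlive the Connection it borrows from.
struct Transaction<'a> {
    conn: &'a Connection,
}

// On functions, elision often fills the annotation in; written out,
// the returned reference is tied to the inputs' lifetime 'a.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let conn = Connection;
    let tx = Transaction { conn: &conn }; // fine: conn outlives tx
    let _ = &tx;
    println!("{}", longest("borrow", "checker")); // prints "checker"
}
```

The tick syntax (`'a`, `'b`) is exactly the "tick A, tick B" naming convention mentioned above: lifetimes are just another kind of generic parameter, declared in the same angle brackets as a type parameter `T`.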
Okay.
I have a follow-up, also from Isabella.
The other question is
if these lifetimes must be explicitly
dependent, how will Rust approach
the issue of custom memory allocations
in arenas, object pools, and more?
Sure, a lifetime of a vec of T relying on an arena will be dependent on the arena's lifetime
and require a large amount of writing the explicit lifetimes and user-defined traits.
I feel that this problem is equivalent to lacking lambdas in C++
where we had to write a damn function object any time
we wanted to use a function in the algorithm header,
and this leads to code bloat and sad times,
which I agree with.
Yeah, I mean, nobody likes writing annotations, right?
So I definitely feel that pain.
We do actually already have arenas,
and I can give you a link, if you want to send that along,
with how arenas actually work today.
And what's funny is arenas actually tend to have less lifetime problems
because the arena would keep it alive longer.
Lifetimes tend to get most complicated when you're talking about
stack-allocated data that goes into and out of scope very quickly.
But if you have an arena that's allocated up front,
then it can live for a very long time.
So it tends to be easier, in my experience,
it tends to be easier for the compiler to infer those kinds of things
because they tend to be broader.
So I don't think that arenas will cause any particular problems
as opposed to any other kind of reference, basically.
But you still may feel that it's too painful, and we may, you know, make it less so in the future, possibly. It just depends.
So, kind of side-related to that:
game developers in particular like to do things like custom allocators and control how memory is
created and accessed. Does Rust allow you to do that?
So, games is really interesting. There's some subset of games programmers who are absolutely
enamored with Rust and they love the safety aspects. And there's another group of games
programmers that are like, I don't want any safety at all. Just get out of my way. And I
think the greatest example of that is Jonathan Blow.
So he's actually implementing
his own programming language called Jai.
And he explicitly looked at Rust
and sort of was like,
this gets in my way more often than not.
And we in the Rust world
have sort of been following
what he's doing with some interest.
He makes different trade-offs
than we would make, obviously.
That's why he's making a different language.
But I will say that there are some game programmers who don't appreciate
the safety aspects. There are other ones who are
into it, and there's actually a very, very large
number of game libraries
available for Rust. The big project's
name is called Piston. It's sort of an umbrella
project with people implementing
a whole ton of game-related libraries,
and the game developer community is actually
one of the more active sub-communities
of the Rust world.
So I definitely think that there are some game programmers who are skeptical,
and there are some that are super into it.
It just really depends on where you fall.
And this sort of is true of any language that does heavy static analysis.
You can either view it as the compiler is getting in my way,
or you can view it as the compiler is telling me that this might be a problem and that caught it now instead of endless hours in the debugger,
right? So I tend to view compilers yelling at me as a positive thing because it almost always is
catching a bug that I would have had to track down at runtime and determine the actual problem for
then. And so I tend to actually personally be happy when a compiler yells at me, but I can also
know that gets frustrating if it yells at you all the time, right?
So it just depends on your temperament, I think.
Right.
We talked about Jonathan Blow and his proposals a couple weeks ago, I think.
I didn't realize he was actually going forward with that language.
Oh, yeah. He hasn't made it open source at all yet.
You can't download a compiler or anything,
but he's been doing two to three hour-long YouTube videos every couple weeks showing off the latest builds that he's been doing.
And it's really interesting.
There's a lot.
You sort of like, I mean, watching hours and hours of YouTube video is not necessarily the best use of your time, obviously, right?
But there are people who write up summaries, and I have watched a lot of them the whole way through because it's just I love programming languages in general.
Personally, it's it's one of the reasons why I work on this one is because I just love languages of all kinds.
And so I'm interested to see, you know, what he's doing with his.
I did want to say one more thing about fighting the compiler.
It's also very true that when people come to Rust initially, they fight with a compiler a lot.
And then somewhere around, like, the one- or two-month mark,
the rules click, and then they stop fighting with it all the time. So I actually very rarely end up arguing with the compiler these days, because I've sort of internalized those rules, and I
understand them now. But it is true that like beginners tend to fight with a compiler a lot
more often. And there are a lot of people who come into the IRC channel, and they say like,
I could do this in C++, and it's totally safe. And we're like, well, actually, there's this thing
that makes that not safe, and that's why the compiler's
yelling at you in this instance. And they're like, oh, yeah,
I didn't even think of that.
And it's not true for universally everything,
and sometimes you do know
that that unsafe thing you're doing is never
going to happen, and so you do it anyway.
But
it's often true
that until you internalize the rules
you fight a lot and then once you do you stop fighting.
Okay, well thank you so much for your time Steve.
Is there anything else you wanted to go over before we let you go?
I think that that's mostly it.
I guess the last thing that I want to briefly mention
is move semantics is another area that's very different in Rust than
it is in C++. And it's not that it's so much different, it's more directly integrated. So
instead of having to say this is a move, we actually move by default. And we also don't
have move constructors. So a move is always a memcpy in Rust,
which means that, for example, if we reallocate a vector,
we don't need to run a move constructor on every element of the vector.
We can just memcopy all the data because that's how the semantics are defined.
So I think that also C++ programmers coming to Rust
will find small areas like that where there's a concept
that's roughly analogous to C++.
We have different semantics that you'll need to get used to a little bit.
You can't just say, oh, I know how move semantics works in C++, so I'm good.
And so there's some things like that that we've taken a feature,
integrated it heavily, and the rules are a little different.
So, yeah.
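A tiny sketch of move-by-default for C++ readers: assigning a non-Copy value moves it, and the move is just a bitwise copy of the handle, with no move constructor involved:

```rust
fn main() {
    let v = vec![1, 2, 3];
    // Moves the Vec: conceptually a memcpy of its
    // (pointer, length, capacity) header. No user code runs.
    let w = v;
    // println!("{:?}", v); // would not compile: v was moved out of
    println!("{:?}", w); // prints [1, 2, 3]
}
```

The commented-out line is the key contrast with C++: using a moved-from value is a compile error in Rust rather than a valid-but-unspecified state.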
Okay.
So where can people find you online, Steve?
So the place that I'm most active online is Twitter.
I use my real name online for everything. So if you go on to any random service and you search for Steve Klabnik,
you will probably find me, but, uh, yeah, I'm a very active and heavy Twitter user.
And then also, you know, steve at steveklabnik.com. I reply to emails,
although sometimes it gets a little backed up. But, you know, I try to reply to every email
that I get. So those are the two places.
And just one more question: where
would people go if they want to, you know to get the Rust compiler and try it out?
Everything is at rust-lang.org.
So we have installers there and the documentation, which is my primary job.
There's actually a full book if you want to learn the language.
It's about 250 pages, 260 pages right now.
So there's a lot of resources to learn.
And also the IRC channel
is incredibly friendly and
active. We have about 1,000 people
usually idling in the channel
so you can get very quick help
and people love newbie questions.
No question is too simple or too easy.
We would love to have you if you have any
problems at all with Rust stuff, just jump on IRC
and that's why we're there.
Okay, thanks so much for your time.
Thanks for having me.
Thanks so much
for listening as we chat about C++.
I'd love to hear what you think of the podcast.
Please let me know if we're discussing the stuff
you're interested in or if you have a suggestion
for a topic. I'd love to hear that also.
You can email all your thoughts to
feedback at cppcast.com.
I'd also appreciate if you can follow CppCast on Twitter and like CppCast on Facebook.
And of course, you can find all that info and the show notes on the podcast website at cppcast.com.
Theme music for this episode is provided by podcastthemes.com.