CoRecursive: Coding Stories - Tech Talk: Rust And Bitter C++ Developers With Jim Blandy

Episode Date: May 16, 2018

Tech Talks are in-depth technical discussions. Rust, the programming language, seems to be really trendy these days. Trendy to me means it shows up a lot on Hacker News. Rust is a really interesting language though, and I think the growing popularity is deserved. Today I talk with Jim Blandy, one of the authors of Programming Rust. We talk about what problems Rust is trying to solve and the unique language features and type system of Rust, which includes algebraic data types, type classes (traits), and generics. There is even a proposal somewhere for adding HKT. We also touch on why it is so hard to write secure code. Jim works on Firefox and his insights into the difficulty of writing secure code are super interesting.

Show notes:
- Rust
- Programming Rust (book)
- MESI protocol
- Constraint-based Verification of Parameterized Cache Coherence Protocols (Formal Methods in System Design)
- Rust Validation
- 3D game demo (not sure where this is, post in comments if you find it)
- Integer overflow

Transcript
Starting point is 00:00:00 Welcome to CoRecursive, where we bring you discussions with thought leaders in the world of software development. I am Adam, your host. So it is sort of asking them to give things up. But the thing is, what you get in return is exactly this guaranteed memory safety, right? And this guaranteed freedom from data races. And it's just this huge win. Writing secure, performant, multi-threaded code is very difficult. Today, I talked to Jim Blandy about Rust, a programming language that is trying to make this a lot easier.
Starting point is 00:00:56 We also talked about why it's so hard to write secure code. Jim works on Firefox, and his insights into the difficulty of writing secure code are super interesting. I also asked Jim about Red Bean Software, a software company that refuses to sell software at all. Jim Blandy is the co-author of Programming Rust, among many other things. Jim, welcome to the show. Hi, thanks for having me. Yeah, it's great to have you. I have your book and I really hoped I would have made it a long way through it before I talked to you, but I can see the bookmark. It's about a quarter of the way through. Yeah, it ended up a lot longer than we wanted it to be.
Starting point is 00:01:42 Oh, I don't think it's your fault that I haven't made it through. What we had in mind originally, we wanted to do something that was more like, you know, K&R, this slim volume that just covers exactly what you need. But basically Rust is just, it's a big language that, you know, is trying to address the things that people need. And it ended up just being that we took a little while to cover that. Yeah, it's not a small language, I wouldn't say. Yeah. So what is Rust for? What is it targeting?
Starting point is 00:02:16 Well, I mean, the funny answer is that it's targeting all those bitter C++ programmers who are sick and tired of writing security holes. Basically, Jason and I, my co-author Jason and I, we both have worked on Mozilla's SpiderMonkey JavaScript engine. And that is a big pile of C++ code. It's got a compiler front end. It's got a bytecode interpreter. It's got two JITs.
Starting point is 00:02:44 It's got a garbage collector, which is compacting and incremental and generational. And working on SpiderMonkey is kind of hair-raising, right? Because Firefox, even as it's not the lead browser, we still have hundreds of millions of users. And that means that when we make a mistake in the code, we can expose just a huge number of people to potential exploits. And of course, when you're working in this field, and this is true, I think, of all the browsers, you know, you get the CVEs, things get published, you find out about exploits. And so it really is very humbling, right?
Starting point is 00:03:34 And so whether you are writing the code, which is going to go out in front of all these millions of people, or whether you are reviewing the code, right? You know, somebody has finished their patch, and they flag you for review. If you're the reviewer, you're sort of like the last line of defense, right? This is the last chance for a bug to get caught
Starting point is 00:03:54 before it becomes, you know, becomes an exploit. And, you know, and you sort of get used to it. You sort of accept that this is the way, that this is the way things are done. Then you start working in Rust. I just was curious about Rust because the original creator of the language, Graydon Hoare, is a personal friend of mine. We worked together at Red Hat. Modern Rust has gone far beyond what Graydon started with, so I wouldn't say that it's his language anymore.
Starting point is 00:04:26 But I was curious. I've been following its development since the beginning, so I was curious about it. And once I really started getting into it, I realized that this is systems programming, where I, as the programmer, am in control of exactly how much memory I use. I have full control over how things are laid out in memory. The basic operations of the language correspond closely to the basic operations of the processor. So as a systems programmer, I have all the control that I need, But I don't have to worry about memory errors and memory safety. And just like I say, you know, when you're working on a security-critical C++ code base, like Jason and I have been, you sort of get used to it, right?
Starting point is 00:05:22 And you sort of internalize that, like, this is just, you know, the standard that you're being held to is actually perfection, right? Because that's what it is, right? The smallest mistake. And these people, when you read about the exploits that people show off at Black Hat, it's just amazing. Just the ingenuity and work and just blood, sweat, and tears that people put into breaking things is really impressive. You know, you've internalized that. And then suddenly you work in Rust, and that weight is lifted off your shoulders. And it is like getting out of a bad relationship, right? You know, you just sort of got used to being, like, just treated badly. And then suddenly somebody is reasonable to you. And you're like, holy cow, I am never going to do that ever again. Right? And then the next thing is that when you get to work on concurrent code in Rust, right, actually trying to take a problem and distribute it across multiple cores.
Starting point is 00:06:30 Rust is constructed so that once your program compiles, it is free of data races by construction, assuming that you're not using unsafe code. And in C++, everybody thinks that their multithreaded code is fine. Everybody understands what a mutex is and how it works. The primitives are not difficult to understand at all.
Starting point is 00:07:04 But then you end up getting surprised by what's actually going on in your code when you have to work on it. We had one of the engineers here at Mozilla. Firefox is a heavily multi-threaded program. I think when you start up, there's like 40 or 50 threads that get going. And the garbage collector does stuff off-thread. The JavaScript compiler will push compilation work off to a separate thread. We do I.O., like, for example, when a tab is trying to write something to local storage, that I.O. is often pushed off to a worker thread, and it's sort of handled asynchronously on the main thread.
Starting point is 00:07:50 So it's a very heavily multithreaded program. And one of the engineers here used ThreadSanitizer, the thread sanitizer tool, to look for data races, to actually look at our code and observe, you know, how well we were doing in keeping data properly synchronized. And what he found was that in every case where Firefox uses threads, we had data races. Oh, wow. Not most, every single case. So, yeah, that's kind of astounding.
Starting point is 00:08:36 So let's back up. So what's a data race? Oh, okay. So a data race is when you have one thread write to a memory location, and then another thread reads it, but there is no synchronization operation that occurs between those two. That is, nobody releases a mutex that the other thread then acquires.
Starting point is 00:08:56 Or the write isn't atomic, or there isn't a sort of message sent. There are any number of primitives the language provides that ensure memory synchronization. And the reason this is an issue, it's an issue for two reasons. One is that whenever you have any kind of non-trivial data structure, the way they're always implemented
Starting point is 00:09:21 is you have a method, right? Or a function, just any operation on that data structure. And the method will temporarily relax the invariants that that data structure is built on, do the work, and then put the invariants back into place. For example, if you're just trying to push an element on the end of a vector, right? Usually it will write the new element to the end of the vector, and then it will increment the vector's length, right? Well, at that midpoint between those two operations, the vector's got this extra element that it's supposed to own, but the length doesn't reflect that, right? So there's this momentary relaxation of the invariant of the type, that the length actually is accurate.
Starting point is 00:10:06 Or even more so, if you are appending an element to a vector and the vector has to reallocate its buffer. So first, it's going to allocate a larger buffer in memory. Next, it's going to copy over the existing elements to that new buffer, right? At which point, there are actually two copies of every element, right? Which is kind of strange, right? Which one is the owning copy? Which one is live? And then it frees the old buffer
Starting point is 00:10:35 and then it sets the vector's pointer to point to the new buffer and, you know, like that, right? So that's a more complicated operation where in the midst of this operation, the vector is actually in this wildly incoherent state, right? But by the time the method returns, the vector is guaranteed to be back in shape and ready to use again, right?
Starting point is 00:10:58 And so when you have data races, getting back to data races, the problem with unsynchronized access is that it means that you can have one thread observing the state of your vector, or really of any non-trivial type, while it is in the midst of being modified by some other thread. And so whatever invariants the vector's methods are counting on holding in order to function correctly may not hold. And so that's sort of the language level view of things. But then modern processors, of course, add even further complication to the mix where each processor will have its own cache.
Starting point is 00:11:47 And although they do have cache coherency protocols trying to keep everything synchronized, it turns out that even Intel processors, which try to make fairly strong promises about memory coherence, it's still visible that each processor will queue up writes to main memory. That is, if you are one core on a multi-core machine and you're doing a whole bunch of writes to memory, those writes actually get queued up and reads on that same core will actually, you know, if you try to read a location that you just wrote to, it'll say, oh, wait, I see in my store queue that I just wrote to this. And so I'm going to give you the value that I just wrote, even though the other cores will
Starting point is 00:12:29 not have seen those writes yet. And so the other thing that synchronization ensures is that at the processor level, the memory that you are about to read is guaranteed to see the values that were written by the processor that wrote to them, assuming that both have executed the proper synchronization. So a data race is a write to a location by one processor, by one thread, and then a read from that location from another thread without proper synchronization. And it can cause invariants to be violated, and you can encounter memory coherence errors.
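(To make that concrete, here is a minimal sketch, not from the episode, of the synchronization Jim is describing, in safe Rust. The counter and thread count are invented for illustration; the point is that without the Mutex, handing the same mutable data to two threads would not even compile.)

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Shared counter behind a lock. Handing a plain `&mut` to both threads
    // would be rejected at compile time, so the unsynchronized version of
    // this program can't even be written in safe Rust.
    let count = Arc::new(Mutex::new(0));

    let handles: Vec<_> = (0..2)
        .map(|_| {
            let count = Arc::clone(&count);
            thread::spawn(move || {
                for _ in 0..1_000 {
                    *count.lock().unwrap() += 1; // the lock is the synchronization point
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    assert_eq!(*count.lock().unwrap(), 2_000);
}
```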
Starting point is 00:13:05 The hardware thing you mentioned is interesting. Maybe it's a bit of divergence. So how does that work? So if there's writes queued up to a certain sector or something, and you are reading from it, does it block until those writes go through? Is that what you're saying? Okay, so this is something that... So the processors change over time,
Starting point is 00:13:26 and the different processors have different details about exactly how they do this. And so I'm not sure that I'm going to be able to accurately describe current Intel processors, but this is as I remember it. What you've got at the basic level, you've got the caches that are communicating with each other about what they have cached locally. Like, for example, if nobody has read a particular block of memory, then that's fine, right? But when one core brings a particular block of memory into its cache, it'll actually mark that and say, okay, I've got this,
Starting point is 00:14:12 but I haven't written to it yet. And it's okay for other cores to read that memory. And so maybe all the cores, maybe it's a big block of read-only memory, maybe it's, I don't know, maybe it's strings, static strings or something like that. And so all the cores can bring copies of that memory into their caches and then use it. However, before a core is able to write to a block of memory, it says, I need exclusive access to that. And it actually broadcasts out on this local bus for the purpose of this kind of
Starting point is 00:14:46 communication and says, okay, all you other cores, I'm about to write to this block of memory, please evict it from your caches and mark it as exclusively mine. And so all the other cores, they kick out that block from their caches. They say, we don't know what's in this block of memory anymore. Only that guy knows what's in it. So then that processor that's obtained exclusive access to that block of memory can do what it pleases, right? And then in order for the other cores to actually even read from that memory now, they have to go and get a copy of it back, or, you know, force the core that was writing to it to flush what it had back to main memory, and so then it goes back into the shared state. And so they call it the MESI protocol. It's M-E-S-I, which is, like, geez, I
Starting point is 00:15:39 can't remember exactly what each letter stands for, but the four letters are the names of the four states that a particular block can be in. E is exclusive access, which is when you're writing something. S is for shared access, when it's actually just everybody has the same copies, and everybody's just reading from it. And I think I is invalid, where, like, somebody else is writing to it, and so your copy of it is bogus. So that's just keeping the caches coherent. But then the other thing is that writes are a lot slower than reads. And so each core has a queue of the writes to memory that it has made
Starting point is 00:16:18 that it is waiting to write out to main memory. And so if you do a whole bunch of stores, your store queue will get filled up with a list of these things going out. And if the core which has done the writes tries to read, then certainly it knows what the current values of things are.
Starting point is 00:16:40 But the other cores can't see it yet, can't see those writes yet. And so the way that you can get incoherence, the way that you can end up with different cores having a different idea of what order things happened in, is when one core gets the result out of its own store queue, and the other core gets the result out of main memory. And so you can end up with different cores seeing writes to memory seem to happen in a different order. And the history of this is actually really interesting. For a long time, Intel would have sections of their processor manuals where they tried to explain how this worked.
Starting point is 00:17:29 And they would make these very friendly promises like, oh, yes, everything's coherent. Don't worry. You just do the writes you want to, and everybody sees them. And then there was this group. I could look up the reference later if you're curious. But there was this group in either Cambridge, I think, or Oxford. Anyway, a very theoretically
Starting point is 00:17:49 inclined group who basically said, we're going to actually make a formal model of the memory, we're going to actually formalize the memory model that Intel has described. They made up
Starting point is 00:18:06 a little logic that says which executions are acceptable and which executions are permitted by this and which executions are not permitted by this. Now, again, the specification doesn't say exactly what happens. It just says what the rules are. So it says,
Starting point is 00:18:22 this could never happen. This might happen. So it identifies a set of acceptable executions, not a specific, it doesn't tell you exactly which one the processor is going to do, right? It just specifies a set of acceptable executions or a predicate that you could run on execution to say this was real or this is not acceptable, right? So anyway, so what this research group did is they said, well, let's take them at their word, and we're going to write tests. We're going to use this formal, we're going to use this specification that we've written, that we made up, right, because all we've got is English to work with. And we're going to generate a ton of tests that we will run on the actual processor to see if the processors actually behave the way they
Starting point is 00:19:06 are claimed to behave in the manual. And I mean, you can tell, obviously, the answer is no, right? That Intel themselves in their own documentation did not correctly describe the behavior of their own processors. And so this group, and the great thing about it, what was really powerful, was that their techniques allowed them to just, you know, generate lots of tests and then find ones that failed. And then they were able to reduce them. So when they published, they had very short examples. If you run this sequence of instructions on one core and this sequence of instructions on another core, you will observe these results, which are forbidden by the spec. Right. So it's really nice. It was really just like, here's your bug, you know.
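(For illustration, here is the shape of one of those two-core litmus tests, sketched with Rust's relaxed atomics rather than bare machine instructions; the variable names are invented. Both threads reading 0 is exactly the store-queue effect described next.)

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static X: AtomicUsize = AtomicUsize::new(0);
static Y: AtomicUsize = AtomicUsize::new(0);

fn main() {
    // Classic "store buffering" shape: each thread stores to one location,
    // then loads the other. With relaxed ordering (or, on x86, with plain
    // stores still sitting in a store queue), BOTH loads may observe 0.
    let t1 = thread::spawn(|| {
        X.store(1, Ordering::Relaxed);
        Y.load(Ordering::Relaxed)
    });
    let t2 = thread::spawn(|| {
        Y.store(1, Ordering::Relaxed);
        X.load(Ordering::Relaxed)
    });
    let (r1, r2) = (t1.join().unwrap(), t2.join().unwrap());
    // (r1, r2) == (0, 0) is a permitted outcome here; making all four
    // operations SeqCst would forbid it.
    println!("r1 = {r1}, r2 = {r2}");
}
```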
Starting point is 00:20:05 And basically what they found was that, in general, yes, the MESI protocols do work as advertised. But the thing that you really have to add to the picture to make it accurate is the store queues, the write queues. Because if you have a write that hasn't gone out to main memory yet, then you're going to have this effect.
Starting point is 00:20:19 If you've just done a write, you will see that write before other cores will see it. So anyway, this is the kind of thing, right, just to bring this back to Rust. This is the kind of thing where, you know, it sort of raises, I think, a programmer's sort of macho hackles. You say, well, you know, that seems pretty tough for most people, but I can handle it.
Starting point is 00:20:44 Everybody says that. I catch myself thinking that. It's not true. You're not up to the task of being perfect. So to be able to go ahead and push your algorithm out across multiple cores, push your code out to run in multiple threads, and just know that you may have bugs, but they're not going to be these bugs that depend on exactly the order in which memory writes happened
Starting point is 00:21:22 and the exact instructions that the compiler selected and things like that. It is just a huge win. So data races, data races are out. Yeah, data races are out. So how? Well, so the key idea of Rust, which is something, and this is, I think, really the thing that most programmers get hung up on when they learn it, is that Rust takes control of aliasing. And by aliasing, I mean the ability to reach the same piece of memory under two different sort of names under two different expressions right
Starting point is 00:22:05 The example that I give in the book is, I actually give a C++ example, right? And I say, okay, you've got, this is C++ mind you, this is not Rust, so you've got int x, right? And it's beautiful, right? It's an ordinary int, it's just int x, right? And then you take a const pointer to it. So I say const int star p, right? I've got a const int pointer p and I say, you know, equals ampersand x. So I've got a pointer to a constant int pointing at a non-const x. Now, the way C++ works, you cannot assign to star p, right? If you try to, you know, assign a value to star p or, you know, use the increment operator on it or something
Starting point is 00:22:54 like that, then that's a type error. You're forbidden from using p to modify the referent to the pointer. But you can assign to x, no problem, right? And so you can go ahead and change the value of x any time you want. And so it's not the case that just because p is a pointer to a constant int, that the integer it points to is constant, right? How perverse is that, right?
Starting point is 00:23:21 I mean, like, what does const mean if it can change? Okay, the thing is, I want to make clear that there are uses for this kind of thing, right? It is pretty useful to say, well, through this pointer, I'm not going to change this value, right? So I'm not saying it's useless, but it is kind of, you know, not what you expect. But if you think about what it would take to fix that, right, to say, well, if this is really a pointer to a constant thing, that would mean that for as long as that pointer p exists, all other modification of the thing that it points to has to be forbidden, right? You have to basically, as long as P is pointing to X,
Starting point is 00:24:06 you have to make sure that X can't be modified, right? And so that's what I mean by aliasing, that star P, that is, dereferencing the pointer P, and X are both things that you can write in your program that refer to the same location, right? And this kind of aliasing can arise under pretty much any circumstance, right? Anytime you have, you know, two paths through the heap
Starting point is 00:24:33 that all arrive at the same object, right? Anytime you have a shared object, right, in the graph of objects, that's two ways to get to the same location. And there will generally be two different expressions that you could write to refer to the same object, right? So JavaScript lets you do this, Java lets you do this, basically every language lets you create aliases. And what Rust does is it actually restricts your ability to use pointers such that it can tell when something is aliased,
Starting point is 00:25:08 and it can say, okay, for this period, for this portion of the program, these objects are reachable by, basically, there's two kinds of references: there's shared references, and there's mutable references. So it'll say these objects are reachable by shared references, and thus they must not be changeable, right? And so you know, not just that you can't change those values through those shared references, but you know that nobody else can change them either. So it's really powerful. In Rust, if you have a shared reference to an int,
Starting point is 00:25:49 you know that int will never change. If you have a shared reference to a string, you know that string will never change. If you have a shared reference to a hash table, you know that no entry in that hash table will ever change while you have that shared reference, as long as you have that shared reference. And so once that reference goes out of scope,
Starting point is 00:26:06 then changes could happen. Exactly. Exactly. Exactly. And then the other kind of reference is a mutable reference, right? Where what it says is you have the right to modify this, but nobody else does, right? Nobody else has the right to even see it. And so a mutable reference is like, it's basically, it's a very exclusive kind of pointer. So when you have a mutable reference to a hash table, nobody else can touch that hash table while you have it. And that's statically guaranteed. It's part of the type rules. It's guaranteed by the type rules around mutable references. And so you can imagine that any type system which can guarantee this thing about like, oh, there's nothing else, there's no other live way in the system
Starting point is 00:26:47 to even refer to the referent of this mutable pointer. That's a pretty powerful type system. And, you know, working through the implications of that, I think is where most people stumble learning Rust. That there is this strict segregation between shared references where you have shared immutable access and where you have mutable references where it is exclusive access. So there's this strict segregation between sharing and mutation. And the way that Rust accomplishes that is, I think, really novel and something people aren't used to.
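(A small sketch, not from the episode, of that segregation between sharing and mutation; the commented-out lines are the ones the borrow checker rejects.)

```rust
fn main() {
    let mut names = vec![String::from("ada")];

    let first = &names[0]; // shared reference: read-only access
    // names.push(String::from("grace")); // ERROR: can't mutate `names` while a
    //                                    // shared borrow of it is still live --
    //                                    // a push might reallocate the buffer
    //                                    // and leave `first` dangling.
    println!("{first}");   // the shared borrow ends after this last use

    let first = &mut names[0]; // mutable reference: exclusive access
    first.push_str(" lovelace");
    // println!("{}", names[0]);  // ERROR: no other access to `names` is allowed
    //                            // while the mutable borrow is live.
    println!("{first}");
}
```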
Starting point is 00:27:29 And honestly, when you tell – I was having lunch with a very accomplished programmer who is an old friend. We hadn't talked in years. And we were talking about Rust, and he says, yeah, but I can't create cycles. I mean, I'm a programmer. I know exactly what I want to do with those cycles. I want to have data structures that are arbitrary graphs, and I need those data structures. And Rust won't let me make them, and so I'm not interested. And so I think he's wrong, but I think he's making a poor choice.
Starting point is 00:28:05 But he is correct in his assessment that basically Rust really is asking you to give up something that is just such a fundamental tool that most programmers have just internalized, and they've learned to think in those terms. So it is sort of asking them to give things up. But the thing is, what you get in return is exactly this guaranteed memory safety, right? And this guaranteed freedom from data races. I think I mentioned the programmer machismo. I want a gender-neutral term for that. But, like, basically the programmer's pride, right? That little bit of confidence that you've got, right? You want to flip that from people saying, oh, I can handle, you know, data races. I can handle, you know, unsynchronized memory access.
Starting point is 00:29:12 No problem, right? You want to flip them from thinking that to thinking, oh, I can write my code in this restricted type system, right? You want to make them say, I can handle it, I can get things done, even though Rust is restrictive, right? I can overcome these things. I can take this limited, you know, buttoned-down system and make it work. Maybe people just shouldn't be so invested in their own pride. I don't know. I'm not optimistic about that ever happening.
Starting point is 00:29:47 But one thing is, it sounds like what you're talking about, right, is it's like changing the relationship you have with the compiler. I mean, I think some people view a compiler as like a teacher with like a ruler that hits you on your hands, like, don't do that. But there's an alternative way where maybe it's more like an assistant. Yeah. Yeah. Yeah, yeah, yeah. And what's going on a lot with Rust is that your debugging time is getting moved from runtime, right, to compile time. That is, the time that you would spend chasing down pointer problems in C++,
Starting point is 00:30:25 you instead spend in negotiation with the compiler about your borrows and about your lifetimes. And the thing about it is, the difference is that tests only cover a selected set of executions. My tests cause the program to do this. It runs it through its spaces in this specific way, whereas types cover every possible execution.
Starting point is 00:30:52 Definitely. Right? And so that's the property that really makes it wonderful for concurrency, which is that with concurrency, you have to just give up on your tests really exercising all possible, you know, executions. Because, you know, the rate at which different cores run the code and the way we have the threads get scheduled and what else happened to be competing for your cache at the time, none of that stuff is really something that you can get a grip on. And so having a type system that is all possible executions are okay is exactly what
Starting point is 00:31:29 the doctor ordered for that. So are we at risk of there just being a problem with the type system? Yeah, sure. I mean, if the type system isn't sound, then you lose, right? Or we lose. In fact, one nice thing is that the people who are sort of the lead designers of the type system right now, as I understand it, are Aaron Turon and Niko Matsakis. And in particular, Niko is the one who had this insight about, hey, we have the possibility of really separating sharing and mutation and keeping those two things completely segregated. And that's what I think is really the defining characteristic of Rust, or rather the defining novelty of Rust. And so when they talk about type systems, they're playing
Starting point is 00:32:29 with PLT Redex, which is a system from the PLT group that made Racket and all that stuff, for playing with formal systems and looking at derivations in formal systems. But they're not proving things about it. There is then a project called RustBelt. I mean, there's also a conference called Rust Belt, but RustBelt is a project at a German university where they're actually trying to formalize Rust. It's a research program where they say, okay, we are a group of people and we're going to work on finding formal models of the Rust type system and Rust semantics. And in particular, there's a guy, Ralf Jung, who is really taking this on.
Starting point is 00:33:29 And he is working on machine-verified proofs of the soundness of Rust's type system. Now, it turns out that there are aspects of Rust that make this very interesting and challenging and turn it into something that just has never been done before. In particular, all of Rust is built on a foundation of unsafe code. And unsafe code is code that uses primitive operations whose safety the compiler cannot check.
Starting point is 00:34:09 These operations, they still can be used safely, right? They just have additional rules in order to be used safely that you as the programmer can know. So what do you mean to say that it's built on a foundation of unsafe code? Well, the Rust vector type, for example, the vector type itself is safe to use, right? If you are writing Rust code and you use the vector type, it's a fundamental type in the standard library.
Starting point is 00:34:36 It's like, you know, it's the list of... It would be the analog of Haskell's list or something like that. You can't get away with it. You can't not use it. And so, basically, if you are using vector, then you are at no risk, right? Any mistakes that you make using vectors will be caught by the type checker and the borrow checker, right? At compile time. So Rust, so vector is safe to use. Vec is safe to use. But the implementation of vec uses operations whose correctness the compiler itself cannot be sure of. In particular, when you want to push a value onto the end of a vector,
Starting point is 00:35:25 what that's doing is that's taking this section of memory, which, like, again, you've got to imagine the vector has a big buffer, right? And it's got some spare space at the end of the buffer. And you're going to push a new value. Say you're going to push a string onto the end of that vector, right? So vector strings. And, well, okay, that's a... You're transferring ownership of the vector from...
Starting point is 00:35:42 You're transferring ownership of the string from, you know, whoever's calling the push method to the vector itself. And so there's a bit of uninitialized memory at the end of the vector's buffer, or towards the end of the vector's buffer, which is now having a string moved into it. And in order for that to be correct, in order to make sure that you don't end up with two strings thinking they both own the same memory, and in order to make sure that you don't leak memory, it has to be the case, it has to be true, that the memory that you're moving the string into is uninitialized, right? And whether or not the location that something gets
Starting point is 00:36:28 pushed onto is uninitialized or not depends on the vector being, you know, being correct, right? That is, the vector knows the address of its buffer, it knows its length, and it knows its capacity, right? The actual in-memory size of the buffer. And so the vector has to have, A, checked that there is spare capacity, that the length is less than the capacity, right? And that length has to have been accurately maintained through all the other operations on the vector. If there is a bug in the vector code and the length ends up being wrong, then this push operation, which transfers ownership, can end up, say, overwriting some existing element of the vector, right? And then that could be a memory flaw, a memory problem.
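(Here is a much-simplified toy, not the real standard-library code, of the kind of unsafe push Jim is describing; everything hinges on len being maintained correctly, which is exactly what a human has to audit.)

```rust
use std::mem::MaybeUninit;
use std::ptr;

// A toy fixed-capacity vector of Strings. The invariant every method relies
// on: exactly the first `len` slots of `buf` are initialized.
struct TinyVec {
    buf: [MaybeUninit<String>; 4],
    len: usize,
}

impl TinyVec {
    fn new() -> TinyVec {
        TinyVec { buf: std::array::from_fn(|_| MaybeUninit::uninit()), len: 0 }
    }

    fn push(&mut self, value: String) {
        assert!(self.len < self.buf.len(), "full (a real Vec would reallocate)");
        // SAFETY: slot `len` is in bounds and uninitialized -- but only if
        // `len` has been maintained correctly everywhere else. If it were
        // ever wrong, this write would clobber a live String or leak one.
        unsafe { ptr::write(self.buf[self.len].as_mut_ptr(), value) };
        self.len += 1; // between the write above and this increment, the
                       // invariant is momentarily out of date
    }

    fn get(&self, i: usize) -> Option<&String> {
        if i < self.len {
            // SAFETY: same invariant -- slots below `len` are initialized.
            Some(unsafe { self.buf[i].assume_init_ref() })
        } else {
            None
        }
    }
}

fn main() {
    // Users of the safe push/get interface can't cause memory errors; only
    // the audited unsafe blocks above could. (Dropping the contained Strings
    // is omitted here, so this toy leaks -- memory-safe, just wasteful.)
    let mut v = TinyVec::new();
    v.push(String::from("hello"));
    v.push(String::from("world"));
    assert_eq!(v.get(1).map(|s| s.as_str()), Some("world"));
    assert_eq!(v.get(2), None);
}
```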
Starting point is 00:37:35 But the nice thing is that Vec is a pretty simple type. It's built on some modules which have a very simple job to do, right? And so that is a small piece of code that we can audit to ensure that the vector is using its memory correctly. And once we have verified by hand, by inspection, that the vector is using its memory correctly, then we can trust the types of the vector's methods to ensure that the users will always use it correctly. Right? So the users have no concern. It's only we who implement the vector who are responsible for this extra level of vigilance and making sure that we're getting the memory right. So the type system can be and is being formally verified, but the libraries need to be hand-audited? What's vector written in? Is it written in Rust? Vector is written in Rust, right? And that's the key, right? Is that unsafe code in Rust is sort of this escape hatch
Starting point is 00:38:29 that lets you do things that you know as the programmer, that you as the programmer know are correct, but that the type system can't recognize as correct, right? So, for example, implementing vector is one of them, right? So vector itself is written in Rust. It uses selected unsafe code. This is exactly what the Rust Belt project is tackling. In order to really make meaningful statements
Starting point is 00:38:56 about Rust, you're going to have to actually be able to handle unsafe code, because the primitive operations of Rust, like the synchronization operations, the stuff that implements mutexes, the stuff that implements inter-thread communication channels, or the basic memory management things that obtain memory, free memory for a vector's buffer,
Starting point is 00:39:19 or that free a vector's buffer when the vector is disposed of, or the I.O. operations that say, look, we're going to read the contents of a file or data from a socket into this memory without wiping out random other stuff around it. All of those things are sort of code that no type system can really... Well, yeah, I think you can say that.
Starting point is 00:39:51 They're primitive operations, and so no type system can really say what they do. But you can use unsafe code and make sure that you use them correctly. And then, assuming that your unsafe code is right, you can build well-typed things on top of those that are safe to use. And so this two-level structure of having unsafe code at the bottom and then having typed code on the top is what allows people to have some confidence in the system. And so the RustBelt people actually want to understand the semantics of unsafe code and actually spell out what the requirements are in order to use these features safely. And then they want to verify that the Rust standard library
Starting point is 00:40:31 does indeed use them correctly. So they're really going for the whole enchilada. They want to really put all of Rust on a firm theoretical foundation. And it's really exciting. And the trade-off, like as a user of the language, it seems to make sense to me. So you're saying like, rather than, you know, needing to audit my code
Starting point is 00:40:54 to make sure these issues don't exist, I can trust that the system has been formally verified, except for these unsafe primitives, which have been audited themselves. Yeah. Yeah. Well, yeah. Basically, if you don't use unsafe code,
Starting point is 00:41:13 then the compiler prevents all undefined behavior, prevents all data races, prevents all memory errors. If you don't use unsafe code, you are safe. If you do use unsafe code, you are, or unsafe features, you are responsible for making sure that you meet the additional requirements that they impose above and beyond the type system. And so, yeah, I mean, and so either you can figure out how to fit your problem into the safe type system. And the nice thing about Rust is that the safe type system is actually really good and quite usable. And most programs do not need to resort to unsafe code.
Starting point is 00:41:51 So you can either work in that world, which is what I almost always try to do, or if there's a primitive that you really know is correct, but that the type system can't handle, then you can drop down to unsafe code, and you can implement that. And one of the strategies that we emphasize in the unsafe chapter of the book, it's the very last chapter after we've presented everything else, is to try to design interfaces such that once the types check, you know that all of your unsafe code is A-OK, right?
Starting point is 00:42:32 And then that means that you've exported a safe interface to your users. And so if you have an unsafe trick that you want to use, you isolate that unsafe trick in one module, right, that has a safe interface to the outside world. And then you can go ahead and use that technique and not worry about its safety anymore. You use the module, and then the module's own types ensure that it's okay. The unsafe code doesn't escape, right? Exactly, exactly.
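(A tiny sketch of that shape: one module owns an unsafe trick and exposes only a safe function. The standard library already provides split_at_mut, so this is purely to show the pattern.)

```rust
mod split {
    /// Split a mutable slice into two non-overlapping mutable halves.
    /// The borrow checker can't prove on its own that the halves don't
    /// overlap, so the implementation needs unsafe -- but that unsafety is
    /// confined to this module, behind a safe signature.
    pub fn halves<T>(s: &mut [T]) -> (&mut [T], &mut [T]) {
        let mid = s.len() / 2;
        let len = s.len();
        let ptr = s.as_mut_ptr();
        // SAFETY: [0, mid) and [mid, len) are disjoint and within the slice.
        unsafe {
            (
                std::slice::from_raw_parts_mut(ptr, mid),
                std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
            )
        }
    }
}

fn main() {
    let mut data = [1, 2, 3, 4, 5];
    let (a, b) = split::halves(&mut data);
    a[0] = 10;
    b[0] = 30;
    assert_eq!(data, [10, 2, 30, 4, 5]);
}
```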
Starting point is 00:43:04 It sounds similar to the, to the idea, like, you know, people be writing some Haskell function that claims to do no side effects, but for performance reasons, maybe it's actually doing some sort of, um, you know, generating a random number or maybe that's a bad example, but, uh, it's totally hidden from the user, right? It acts, It acts pure from the outside, whatever may happen. Yeah, that's a good example. That's a good example because the question comes, the question arises, is it really pure from the outside? If they did it right, if they really actually kept all of the statefulness local and isolated so that you can't tell from the outside, then everything's fine.
Starting point is 00:43:43 The people and the rest of the, whoever's using that from the outside can use it and not worry about it. And they get the performance and they don't have to worry about, you know, the details. But then inside, the people who wrote that code are, they have extra responsibilities, right? And the normal Haskell guarantees of statelessness don't, don't apply to them because they've, they've broken the rules or they've stepped outside the rules and they are, they're now responsible. You mentioned, um, the type system of Rust and, uh, actually it has a lot of, a lot of features that, that I guess you wouldn't expect from something that's,
Starting point is 00:44:20 well, maybe I just didn't expect it. It has a lot of functional-feeling features. Oh yeah, you know, I'm really glad that you brought that up. Because, I mean, I've talked about safety and I think I've talked about performance, right? But the really nice thing about Rust is that it is not by any means a hair shirt, right? It is actually really comfortable to use. It has a very powerful generic type system. The trait system is a lot like type classes in Haskell. If you've used type classes in Haskell, I mean, everybody uses type classes in Haskell,
Starting point is 00:44:56 whether they know it or not, right? And yeah, so Rust has traits. The whole standard library is designed around, you know, those generics and those traits, and it puts them to full use. And it's actually a super comfortable system to use. I did a tutorial at OSCON in Austin last May where we went through, to the extent that you can in three hours,
Starting point is 00:45:25 writing a networked video game. And so that involved 3D graphics, it involved a little bit of networking, and it involved some game logic, right? And when I was working on it, obviously I had to have the game ready for the talk. And, you know, I put it off. And so I had to do the last stages of development in a rush. And it was fantastic. It was like I had wings or something, because once I'd gotten something
Starting point is 00:46:07 written down, right, once I'd really gotten the types right, it was done. It was done. If I had been working in C++, I would have had to randomly take three hours out of the schedule to track something down and debug it. And because it was Rust, I just got to keep going forward, forward, forward. And so it was just like really great progress. And Rust has all these sort of batteries-included kind of things. Like there's a crate out there called Serde, which is for serializing and deserializing. Serializer, deserializer.
Starting point is 00:46:43 And it is a very nice collection of formats, like there's JSON, there's a binary format, there's XML, there's, you know, a bunch of other stuff, right? And then a set of Rust types that can be serialized: string, hash table, vector, you know, what have you. And Serde is very carefully constructed so that if you have any type which it knows how to serialize or deserialize, then you can use that with any format that it knows how to read or write.
Starting point is 00:47:17 So you just pick something off of this list and then pick something off of that list, and you're immediately ready to go for sending something across the network. And in fact, naturally, if you define your own types, you can specify how they should be serialized or deserialized, right? You know, you define your own custom struct and say, well, here's how. Okay, but the thing is, that's real boilerplate, right? So there is actually this magic thing that you can say, that you can slap on the top of your own type definition.
Starting point is 00:47:49 You can say, derive, serialize, and deserialize. And what that does, I mean, I guess Haskell has something like this too, that automatically generates the methods. It looks at the type and automatically generates the methods to serialize and deserialize that type. And so it is super easy to get your stuff ready to communicate across the network. And so for, you know, for communicating people's moves and communicating the state of the board, that was, it was just, just a blast because there was all of this sort of boilerplate stuff that I didn't have to worry about.
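(Roughly what that looks like in practice, as a hedged sketch: it assumes the serde crate with its derive feature plus serde_json in Cargo.toml, and the Move struct and its fields are invented for illustration.)

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
struct Move {
    player: String,
    x: i32,
    y: i32,
}

fn main() -> Result<(), serde_json::Error> {
    let m = Move { player: "jim".to_string(), x: 3, y: 7 };

    // Any type that derives Serialize works with any format Serde supports;
    // JSON here, but a binary format would be the same one-liner.
    let wire = serde_json::to_string(&m)?;
    println!("{wire}"); // {"player":"jim","x":3,"y":7}

    let back: Move = serde_json::from_str(&wire)?;
    println!("{back:?}");
    Ok(())
}
```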
Starting point is 00:48:28 And those are just the kind of power tools that are wonderful. Just for a callback, I think that's like generic derivation. And I did have Miles on the show earlier. He wrote something similar for Scala. And yeah, Haskell has it. I think it was originally called Scrap Your Boilerplate. But yeah, very cool feature, right? A lot of boilerplate can be removed by things like that. Yeah, Scrap Your Boilerplate is done within the Haskell type system,
Starting point is 00:48:53 if I remember that paper right. And Serde is doing a little bit of a procedural macro kind of thing, it exactly looks at your type definition and decides what to do with it. And I wonder, maybe that stuff could be done in a scrap-your-boilerplate style.
Starting point is 00:49:14 I don't understand scrap your boilerplate well enough to say. But yeah, it's that style of thing. And those are just wonderful power tools. One thing, so it sounds, I think you're making a good argument.
Starting point is 00:49:30 This is what I've heard. However, once you learn it, there's a superpower. Yeah. So is this superpower applicable to non, you know, C++ devs? Is this a useful skill for somebody who's throwing up web services? I think so. So I work in...
Starting point is 00:49:52 Okay, so you had Edwin on talking about Idris, and Edwin made a comment that I want to push back on. He said, I don't think that types really help people eliminate bugs that much because unit tests are still useful. And so I work, right now I work in the developer tools part of Mozilla. And we have a JavaScript front end.
Starting point is 00:50:24 The user interface for the developer tools in Firefox, they're written themselves in JavaScript. And it's a React Redux app, basically, that talks over a JSON-based protocol to a server that runs in the debuggee and looks at the web page for you. And we are, I'm proud to say that my colleagues are enthusiastic about the potential for types, and they really see the value of static typing. And we are more and more bringing, we're using flow types in JavaScript.
Starting point is 00:51:10 We're bringing flow types into our code base. But it's not done, right? We haven't pushed them all the way through. There's plenty of untyped code still. Because JavaScript flow types let you type one file but leave the rest of the file, leave the rest of the system untyped. And so you can gradually introduce types
Starting point is 00:51:31 to more and more of your code as you go. So we're in that process. And of the bugs that I end up tracking down, I think, and I don't want to put a number to it because i mean i haven't been like keeping statistics but it feels like uh at least half of them uh would have been caught immediately by static typing right i've heard i've heard people say this when uh when typescript like moving to typescript which is similar right that? They're like often. Yeah, yeah, yeah. Same idea. Often they found a, yeah, like a, you know, not a super obscure bug,
Starting point is 00:52:10 but like a little bit like a corners where things would go wrong, that the type system was like, what are you doing here? Well, the thing is, the thing is, I think people like the people who work in Haskell or, you know, certainly somebody who works on Idris, I don't think they really know what the JavaScript world is like, right? It's just insane what people do. Okay. In JavaScript, if you typo the name of a property on an object, right? It's just a typo, right? You capitalize it wrong or something. That's not an error.
Starting point is 00:52:49 JavaScript just gives you undefined as the value. And then undefined just shows up at some random point in your program. And so until you have complete unit test coverage of every single line, you don't even know whether you've got typos. That's crazy. That is just not what humans are good at, and it's exactly what computers are good at. And so to put that on the human programmer's shoulders doesn't make any sense. So now, to be fair to Edwin, he does have t-shirts that say, it compiles, ship it. Oh no, I thought that was a really good podcast.
Starting point is 00:53:31 We're all fans of, or we're all really curious about Idris. But I think that we don't want to undersell the benefits of static typing. Back to your question, for people who aren't doing systems programming, why would they be interested in Rust? Rust is just a really good productive language to
Starting point is 00:53:49 work in. It will make you worry about a bunch of things that maybe you thought you shouldn't have to think about. But in retrospect, I kind of feel like I'm happy to have those things brought to my attention. Like, for example, at the very beginning, I talked about how data structures, the method of a data structure will sort of bring it out of a coherent state and then put it back into a coherent state. You want to make sure that you don't observe it in the midst of that process. Well, you can get those kinds of bugs even in single-threaded code, right? You can have one function which is working on something
Starting point is 00:54:31 and modifying something, then it calls something else, calls something else, goes through a callback, and you have several call frames, right? And then suddenly you have something that tries to use the very data structure that you were operating on at the beginning, but you weren't aware of it. And so basically nobody knows that they're trying to read from the data structure that you were operating on at the beginning, but you weren't aware of it. And so basically nobody knows that they're trying to read from the data structure that
Starting point is 00:54:49 they're in the middle of modifying. And that's something that's called iterator invalidation. In C++, it's undefined behavior. In Java, you get a ConcurrentModificationException. And I just mentioned this to a Java programmer, and he's like, oh, yeah, CMEs. That's, you know, they had a name for them, right? They knew. And that's also, that's totally a single-threaded programming error. And that's also prevented by Rust's type system.
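(For reference, a minimal sketch of the iterator-invalidation case in Rust, not from the episode; the commented line is the one the compiler refuses.)

```rust
fn main() {
    let mut scores = vec![1, 2, 3];

    for s in &scores {
        // scores.push(*s * 10); // ERROR: cannot borrow `scores` as mutable
        //                       // while the loop holds a shared borrow of it.
        //                       // In C++ the equivalent compiles and is
        //                       // undefined behavior; in Java it surfaces at
        //                       // runtime as a ConcurrentModificationException.
        println!("{s}");
    }

    scores.push(40); // fine once the borrow from the loop has ended
    println!("{scores:?}");
}
```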
Starting point is 00:55:23 So I feel like Rust's types actually have a lot of value, even for single-threaded code which is not performance-sensitive. But it's just really nice to have a... It's really got your back in terms of letting you think or making sure that your program works the way you think it does. And so, yeah, I mean, I think it has a lot of applicability as a general-purpose programming language. And the one thing we didn't talk about, but I think that you touched on briefly at the beginning, was to do with security.
Starting point is 00:55:50 So we talked about data races, but you also mentioned security. Right. Yeah. So most security, well, sorry, there are lots of different kinds of security holes. And according to the collected data, there are a few people who collect statistics on the published security vulnerabilities and sort of what category they fall into. You know, is it SQL injection? Is it cross-site scripting? Is it, you know, they sort of categorize them. And the category that I'm interested in for this particular case is the memory corruption, memory errors. And those have been a consistent, you know, 10, 15% of all the security vulnerabilities
Starting point is 00:56:33 being published altogether, right? And so there's still a very big issue. And most of the time, almost all the time, what's happening there is you've got a buffer overrun. After you have seen enough of these attacks, you start to feel like pretty much any bug could be exploited with enough ingenuity. It turns out that I can't find this post anymore, but the Chrome security team, Google's Chrome browser security team,
Starting point is 00:57:25 had a blog post just about security errors caused by integer overflows. Integer overflows sounds so innocent, right? But it turns out that if that integer is the size of something, you are 90% of the way to perdition, right? Because basically you can make it think that something is much bigger than it actually is in memory, and then you've got access to all kinds of stuff you shouldn't have access to, and you're really out of the – you've broken out of jail.
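(A small sketch of the overflow-becomes-a-size problem and one way Rust surfaces it; the function and numbers are made up for illustration.)

```rust
// In C, `count * elem_size` can silently wrap around to a small number, and
// the resulting undersized buffer is then overrun. Rust panics on overflow in
// debug builds, and checked arithmetic makes the failure explicit in release
// builds too -- and even with a wrong size, safe indexing is still
// bounds-checked rather than undefined behavior.
fn alloc_size(count: usize, elem_size: usize) -> Option<usize> {
    count.checked_mul(elem_size)
}

fn main() {
    assert_eq!(alloc_size(10, 8), Some(80));
    assert_eq!(alloc_size(usize::MAX, 8), None); // would have quietly wrapped in C
}
```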
Starting point is 00:57:59 And so, yeah, so having a type system which prevents memory errors and which basically makes sure statically that your program doesn't behave in an undefined way really does close off a very significant opportunity for security holes. And one of the chapter quotes we open up the book with was a tweet by Andy Wingo, who is a great JavaScript hacker and a great free software hacker. And his tweet was simply, basically, there was a bug in a TrueType parser, right? A font parser. And that was one of the bugs that was used to break into the machines that were controlling the Iranian
Starting point is 00:58:55 nuclear purification facilities. Oh, I didn't know that. What was that called? Stuxnet, right? Yeah, Stuxnet. Yeah, that basically was built around a flaw in TrueType, right? So TrueType, it's a font parser. TrueType is security-sensitive code. So basically, all code is security-sensitive. You can no longer say, oh, you know, it's just graphics code. It's not true. If
Starting point is 00:59:25 you're writing C++ code and it's got controlled memory and it's doing pointer arithmetic, you've got to be on your toes. And the standard is perfection. And so Rust, same as a data race, it takes a certain class of these
Starting point is 00:59:41 vulnerabilities off the table? Yeah. In Rust, without unsafe code, if your program type-checks, then we are saying it will not exhibit undefined behavior. And undefined behavior is, like, often the... Yeah, is often the root of the security hole. Awesome.
Starting point is 01:00:02 So we're reaching the end of our time here. One thing when I was Googling you that I found is your Red Bean software site. Oh, sure. I actually ended up forwarding this to a couple of my friends. It says on it, to all intents and purposes, it appears you have a consulting company that does not sell software. Is that correct? Well, first of all, that's really, really old. My friend Carl Fogel and I, we ran a little company called Cyclic Software, and we were selling CVS support.
Starting point is 01:00:36 We were the first group to distribute network transparent CVS. We didn't write it, but somebody else wrote it, and they said they didn't want responsibility for it. And so we were the ones who were distributing it. And so I'm kind of proud of that because it was Network Transparent CVS that was really the first version control system that open source used to collaborate on a large scale. And then it got replaced by Subversion and Git and Mercurial. But network transparency at CVS was really how things got started.
Starting point is 01:01:11 So we had Cyclic Software, and then we decided we didn't want to run it anymore. We couldn't run it anymore, and so we sold it to a friend, and we realized we had to change our email addresses. We had to tell everybody, don't email us at, you know, JimB at Cyclic anymore. And that's kind of a bummer. We realized that we were going to be changing our email addresses every time we changed jobs. So we resolved to create a company whose sole purpose was never to have any monetary value, right? So we would never have to sell it. And so we could keep a stable email address for the rest of our lives. I mean, it's a vanity domain and lots and lots of people have vanity domains.
Starting point is 01:01:46 So, but, but our joke is that it's a company whose purpose is never to have any value. Yeah. I found on the front page, it says, let me read this: By buying our products,
Starting point is 01:01:54 you will receive nothing of value, but on the other hand, we will not claim that you have received anything of value. In this we differ from other software companies who insist, in the face of abundant evidence to the contrary, that they've sold you a usable and beneficial item. Yeah, that's Carl. Well, it's been a lot of fun, Jim. Thank you so much for your time. Yeah, thanks for having me. It was fun. And I enjoyed your book. I'll get through it eventually. I think I'm on chapter four.
Starting point is 01:02:22 I'm going to keep working. Yeah, yeah, stick with it through the traits and generics chapter. Because once you've gotten to traits and generics, then that's really where you've got everything you need to know to really read the documentation and understand what you're looking at. Awesome. Thank you. And we're really sorry it's chapter 11. We tried to make it as fast as possible. No, it's all good.
Starting point is 01:02:42 All right. Take care. Take care. Take care.
