CppCast - Unix and C History

Episode Date: February 4, 2022

Brian Kernighan joins Rob and Jason. They first talk about the pros and cons of virtual teaching and training during COVID times, and the news that BOLT has been merged into LLVM. Then they talk to Brian about the history of UNIX and C development at Bell Labs.

News
More than a year of virtual classes experience - The good parts
BOLT merged into LLVM
C++ Cheat Sheets

Links
Brian Kernighan
Unix: A History and a Memoir (Amazon)
The C Programming Language (Amazon)

Sponsors
Indicate the #cppcast hashtag and request your PVS-Studio one-month trial license here: https://pvs-studio.com/try_free
C++ tools evolution: static code analyzers: https://pvs-studio.com/0873

Transcript
Starting point is 00:00:00 Episode 335 of CppCast with guest Brian Kernighan recorded January 31st, 2022. Sponsor of this episode of CppCast is the PVS Studio team. The team promotes regular usage of static code analysis and the PVS Studio static analysis tool. In this episode, we talk about virtual teaching and Bolt. Then we talk to Brian Kernighan. Brian tells us about the history of C and Unix at Bell Labs. Welcome to episode 335 of CppCast, the first podcast for C++ developers by C++ developers. I'm your host, Rob Irving, joined by my co-host, Jason Turner. Jason, how are you doing today?
Starting point is 00:01:12 I'm all right, Rob. How are you doing? Doing all right. Yeah, still cold, still cold down here. I'm sure it's even worse in Denver. I saw your tweet about the temperature differential. What is it, an 80 degree temperature differential across like 25 miles or something like that? Yeah. It's crazy. It's Colorado. It's funny because in that tweet, if you go and read, because that was a National Weather Service tweet, all these people are responding to the National Weather Service saying, you need to check your thermometers, clearly this is wrong. Where those of us who have been living in Colorado for more than a few years are like, no, it's normal. Yeah, this happens. Like, this isn't even that surprising. I mean, particularly, you know, you could just drive 80 miles and get warmer if you like to. Yeah. Yeah. I mean, well, part of the problem with that
Starting point is 00:01:51 particular one was there's no direct route between those two locations because the mountains got in the way, which is why it was possible for there to be such a large temperature differential. That makes sense. But either way, I mean, I was chatting with one of my friends and he was 10 degrees warmer in Denver. I mean, I say that I'm in Denver. Technically, I'm just outside Denver, like 10 degrees. It's a lot warmer. All right.
Starting point is 00:02:12 Well, at the top of every episode, I'd like to read a piece of feedback. We got this tweet from Daryl referencing last week's episode with Kobe, where we were talking about using Docker. He wrote, yep, eliminating system crud when doing the dev is so nice. The missing implicit dependencies show up much quicker. And yeah, that's definitely something I don't think we talked about. But yeah, if you are running from a Docker container that only has the specific, you know, components and tools that you put into your container script, your Dockerfile, then if you're missing anything, you're going to have to go and add it to the container. You're going to know exactly what you need.
Starting point is 00:02:49 It's a good point. It kind of makes me feel old because I've just been so used to using full VMs for that for so many years. I guess I should try using a container at some point. Yeah. I should mention that when we had Kobe on last week, we actually had a whole bunch of other topics we meant to talk to him about, and we just spent so much time talking about Docker
Starting point is 00:03:08 that we didn't really get to it all. So we're probably going to have him on again real soon to talk about some other things. Yeah, it's the first time we've ever done that. Well, we'd love to hear your thoughts about the show. You can always reach out to us on Facebook, Twitter, or email us at feedback at cppcast.com. And don't forget to leave us a review on iTunes or subscribe on YouTube. Joining us today is Brian Kernighan. Brian received his
Starting point is 00:03:30 PhD from Princeton in 1969 and was in the Computing Science Research Center at Bell Labs until 2000. He is now a professor in the Computer Science Department at Princeton, where he writes short programs and longer books. The latter are better than the former and certainly need less maintenance. Brian, welcome to the show. Hi, guys. It's a pleasure to be here. There was so much happening in Bell Labs at that time. That had to have been a very exciting time to have been there. Yeah, it actually was kind of remarkable. I mean, some of this is the rosy glow of,
Starting point is 00:03:57 you know, it being a long time ago, but I certainly remember it as being the best times of my life. You're surrounded by a whole bunch of good people doing really interesting things in a field that hadn't been mined out or gotten too commercial, or, you know, just all kinds of benefits like that. So a golden age, at least as filtered through said rosy glasses. Well, I mean, you know, to be fair, I have jobs from 20 years ago that I don't think of as the best time of my life. So, uh, it's not just the rosy glasses of time, hopefully. All right. Well, Brian, we've got a couple news articles to discuss.
Starting point is 00:04:32 Feel free to comment on any of these, and then we'll start talking more about your work at Bell Labs and Unix and C and all that stuff. All right. So this first one we have is a blog post from Andreas Fertig. And this is More Than a Year of Virtual Classes Experience: The Good Parts. He goes into the advantages. Like, it makes everything a lot more accessible. You know, you can easily join a Zoom call and not have to, you know, pay to travel to some location to attend a class or anything like that. What do you think of this, Jason? Because I know you've kind of held off on doing any virtual training. I have, yes. And my style of teaching classes is different enough from Andreas's that I don't know if the advantages that he sees would directly have applied to my classes.
Starting point is 00:05:34 Because if I don't get interaction throughout the class, then I don't have that much lecture material. I need people interacting with me. Or the advantage is that Andreas says, well, if someone is ill or needs to go do whatever, then they can just turn off their camera and chill and still listen. But I'm like, I need people participating. Otherwise, my class is going to be half as long as they signed up for, basically. Brian, what are your thoughts on this as a computer science professor? Have you been teaching virtually this whole time? Are you getting back into like a hybrid or in-person style of classroom these days?
Starting point is 00:06:08 It's been, well, close to two years of this. And for the first year or thereabouts, it was entirely virtual. And that was, as Jason suggested, really most unsatisfactory. I think anybody who teaches does a much better job when there's an audience where you can see whether you're getting across, you can see the places where you have completely not gotten across. People ask questions spontaneously that send things in often new and
Starting point is 00:06:36 interesting directions. And so all of that stuff doesn't work nearly as well when you're on Zoom or some equivalent. And so I found the level of engagement a lot lower, I think, with kids when I was doing it via Zoom. The other thing is that the class gets too big. You can't see them all, period, even. And, you know, although the rules are you're supposed to leave your video on, in practice, there's times when you can't and times when you don't want to. And so, no, I think the downsides certainly outweigh the upsides a lot, aside from the scheduling issues. I mean, the fact that I can talk with you folks without leaving the comfort of my office right here in beautiful suburban frozen New Jersey, that's a benefit. You know,
Starting point is 00:07:22 I've given talks in Australia and Argentina in the last couple of weeks, and that kind of stuff would never happen in a previous life. So that's the good bit. Scheduling is often easier, and it was certainly easier when kids were locked down. The class I taught a year ago last fall, I guess, had kids as far separated as England and Hawaii. And so there was a noticeable time zone gap. And that was clearly a nuisance for all concerned, but something that was part of the equation. So we're back to, at this point, in fact, we're more or less full in-person classes, but everybody is, kids are tested twice a week and tested once a week, and everybody wears masks and has to have been boosted, no exceptions. So that means that it is, you know, comparatively safe in a real sense, and it's a lot better.
Starting point is 00:08:19 You know, I like to pace when I'm in class back and forth and back and forth. Probably drives the kids nuts, but it's better. And so I can do that kind of thing. Any issues with teaching in a mask? I was wondering that too. Yeah, my brother-in-law's wife is a professor and she had to get a mask with a window on it because she had someone who was hard of hearing in their class. So just to be able to read lips, the mask was getting in the way. Well, that's a new one on me. No, I haven't had that. The thing that I found when I first did it
Starting point is 00:08:52 was that you need to inhale as well as exhale. I had trouble with that part of it. And so I was always gasping for breath. I'm curious, what classes are you teaching this semester or last semester? So last semester was a class that I've done almost every year for over 20 years at this point on computing and communications and things like that, but for a very non-technical audience, history major kind of people or English majors and things like that. So I did that last fall. I did the year before totally virtual with 20 kids. I did it in person, 30 kids last fall. That was absolutely fine. It was nice to be in the same room as the kids. In the spring, I've been doing independent work seminars. Princeton has this requirement that kids do independent research as undergraduates in various amounts.
Starting point is 00:09:47 So I'm running a couple of seminars with sort of nine or ten kids each on digital humanities, which is kind of an off-the-wall topic for your hardcore computer science stuff. None of this Docker bit, but at the same time, it's a lot of fun and a chance for kids to explore an area that they might not otherwise have a chance to explore as part of an academic experience. So I've been doing that in the spring for the last three or four years. Okay. The next thing we have is a post about Bolt being merged into LLVM to optimize binaries for faster performance. I don't recall, Jason, if we've talked about Bolt before.
Starting point is 00:10:26 I feel like it may have come up once, but I wasn't sure. I was not sure either. But it sounds pretty impressive. It's something that was developed by Facebook, and it's been used internally there for a while. It is used to optimize the code layout for binaries generated from GCC and Clang. And it's now going to be part of LLVM.
Starting point is 00:10:45 And they have some nice graphs here where they're showing the performance improvement that you get from Bolt alone versus performance improvement from PGO. And you can combine them to get even more combined performance improvement. So it looks like pretty great stuff. If I'm reading this right, you take your binary, run it through Bolt, and it does an optimization of the layout, and magically GCC is 21% faster. That's great. I was hoping you guys would tell me.
Starting point is 00:11:17 I assume it has something to do with cache management? Is that what the deal is? Part of it looked like cache layout, yeah. I mean, for cache advantages, layout for cache advantages. Yeah, okay. The closest corollary I can think of in my own work is when I've used profile-guided optimization incorrectly. So code that was in the hot path in the real world wasn't in the hot path during PGO. So then the compiler and linker take the code that wasn't in the hot path and move it to the far end of the binary, and then
Starting point is 00:11:53 anytime you need to call that code, it has to do, you know, a jump to code that's cold, not in the cache, load it into the cache, and then execute it. And so by running profile-guided optimization incorrectly, I was actually able to get like a 30% performance hit to my application. So if it's a similar kind of situation, but it's re-laying things out to make sure the code that you're always going to execute next, as much as possible, is hot and ready to go, then maybe that's where it gets its big advantage.
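A toy sketch of the layout effect described here, not a description of how Bolt or PGO actually work: functions are placed back to back in a binary, and we count how many fixed-size blocks (think pages or, at a smaller scale, cache lines) the hot functions end up touching. The function names, sizes, and the 4096-byte block size are all made up for illustration.

```python
# Toy model only: names, sizes, and BLOCK_SIZE are illustrative assumptions.
BLOCK_SIZE = 4096

def blocks_touched(layout, hot):
    """Return the set of block indices that contain any byte of a hot function.

    layout: list of (name, size_in_bytes) in the order they sit in the binary.
    hot: set of function names that run on the hot path.
    """
    touched = set()
    offset = 0
    for name, size in layout:
        if name in hot:
            first = offset // BLOCK_SIZE
            last = (offset + size - 1) // BLOCK_SIZE
            touched.update(range(first, last + 1))
        offset += size
    return touched

hot = {"parse", "eval", "emit"}

# Hot functions interleaved with large, rarely used ones.
scattered = [
    ("parse", 1000), ("usage_text", 6000),
    ("eval", 1000), ("error_report", 6000),
    ("emit", 1000), ("debug_dump", 6000),
]

# Same functions, with the hot ones grouped at the front.
packed = [
    ("parse", 1000), ("eval", 1000), ("emit", 1000),
    ("usage_text", 6000), ("error_report", 6000), ("debug_dump", 6000),
]

print(sorted(blocks_touched(scattered, hot)))  # [0, 1, 3]
print(sorted(blocks_touched(packed, hot)))     # [0]
```

In this made-up case the hot path spans three blocks when interleaved with cold code and one block when packed together. A misleading profile has the opposite effect: code that is hot in production gets treated as cold and moved away from the code that calls it.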
Starting point is 00:12:23 Yeah, this is really, now that I slightly understand it better, it reminds me of something so far back in my past. I did my PhD thesis on a problem called graph partitioning, and the original impetus for that was to improve the paging behavior of programs by figuring out which chunks of code, let's call them subroutines or functions, went on particular pages so that you minimize the number of transitions from one page to another, because that would cause a page swap. That sounds like it could be almost directly related. Maybe you were referenced in their paper. I don't think people go that far back. I know enough much younger computer science professors that it surprises me sometimes.
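The paging problem described here can be sketched as a tiny graph partitioning exercise: functions are nodes, call counts are edge weights, and the cost of an assignment is the total weight of calls that cross from one page to another. The call graph below is invented, and the brute-force search is only a stand-in that works for a handful of functions; it is not the method from the thesis, and real instances need heuristics.

```python
from itertools import combinations

def cross_page_calls(calls, page_of):
    """Sum the call counts on edges whose two functions sit on different pages."""
    return sum(n for (a, b), n in calls.items() if page_of[a] != page_of[b])

def best_split(functions, calls, per_page):
    """Brute force: try every choice of `per_page` functions for page 0.

    Feasible only for tiny inputs; shown purely to make the objective concrete.
    """
    best = None
    for group in combinations(functions, per_page):
        page_of = {f: (0 if f in group else 1) for f in functions}
        cost = cross_page_calls(calls, page_of)
        if best is None or cost < best[0]:
            best = (cost, page_of)
    return best

# Hypothetical call graph: (caller, callee) -> number of calls.
functions = ["lex", "parse", "eval", "emit"]
calls = {
    ("lex", "parse"): 100,
    ("parse", "eval"): 5,
    ("eval", "emit"): 80,
    ("lex", "emit"): 1,
}

# A careless assignment that separates the functions that call each other most.
naive = {"lex": 0, "eval": 0, "parse": 1, "emit": 1}
print(cross_page_calls(calls, naive))  # 186

cost, page_of = best_split(functions, calls, per_page=2)
print(cost, page_of)  # 6 {'lex': 0, 'parse': 0, 'eval': 1, 'emit': 1}
```

Keeping lex with parse and eval with emit leaves only the two lightly used edges crossing pages, which is the kind of grouping the partitioning is after.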
Starting point is 00:13:08 They're like, well, everyone already did this research back in 1965, and we're just now rediscovering it, you know? All right. And this last thing we have is a post on Hacking C++. And this is cheat sheets. And there's a whole bunch of different ones. We have cheat sheets for algorithms, sampling distributions, a whole bunch of stuff in this page. It looks like it's really good for reference.
Starting point is 00:13:32 I've had this page open for like two months and kept meaning to put it in the show notes. I thought it would be a good one to add in today. It's just the author put so much work into showing all these graphs for how the algorithms execute, how data is laid out with all the different types in the standard library and everything. It's an incredible amount of work.
Starting point is 00:13:48 Yeah, definitely. Brian, I did want to ask, what is your kind of current language experience like? I mean, obviously you had a lot of involvement in C. Are you keeping up to date with C or C++? Not in a strong sense. In C, I basically tuned out of the improvements after roughly the 88 version or something like that. A lot of things got added, none of which were of any direct use to me, and so I kind of zoned out. C++, I follow pretty casually. I hate to admit it in this particular august forum. That's fine. I know, yeah. But on the other hand, Bjarne is a very good friend, and he and I talk quite regularly, and I get sort of bits and pieces about the standardization process. I have read manuscripts of, I think, all of the C++ books over, at this point, 40 years.
Starting point is 00:14:46 So in that sense, I kind of keep up loosely, but I haven't written a significant amount of C++ for six or seven years at this point. So I know what it is. I recognize it when I see it. But I think if I look, most of the coding I do at this point is pretty small stuff. A lot of it in service of explaining something in a class. And for that, it tends to be either Awk for one-liners or Python for things that might be a couple hundred lines. And that's about it. Occasional forays into other stuff, but not much. I didn't even think about that until just now. You and Bjarne must have overlapped at AT&T for a while, huh?
Starting point is 00:15:17 Yeah, I guess the way I would phrase it is that I was at Bell Labs. I arrived there permanently in 69. And Bjarne arrived there in 79 or 80-ish, approximately, and through the luck of the draw, I was his manager, although you don't manage people, for, jeez, I don't know, close to 15 years, and then he went off to AT&T Labs in roughly 95-ish or something like that. But we stayed in close contact all along. Oh, that just shows how terrible my history of these things is. I didn't realize AT&T Labs was a separate lab from Bell Labs. It was one of the fission products when AT&T split
Starting point is 00:15:55 in 95 or 96 into Lucent, which was sort of the bulk of the old Bell Labs and AT&T Labs, which was the part of AT&T that went, yeah, Bell Labs, the Western Electric, the manufacturing stuff, and so on, went one direction. And sort of two-thirds of Bell Labs went with that, the part that I stayed in. Another third of Bell Labs went with AT&T, which at that point was basically long distance kind of communications. Right. In various ways, all of the fission products fell on hard times. Well, Brian, maybe we should start off by talking about this memoir that you wrote, which is about the development of Unix. What kind of inspired you to write this 50 years after the original development of Unix started? Well, you know, somewhere probably around 40 years after that, I started trying to get people who actually had the right combination of knowledge and talent and potential interest to write a
Starting point is 00:16:55 history of Unix and related things. You know, because there were several friends who had done basically serious historical studies, and there were other people who had been present at the beginning in a position of authority and influence. And I couldn't convince anyone to do it. So finally, I said, okay, let me do it myself. And my wife and I were going to spend two and a half months in England over the summer. And I thought, well, it'd be a project. And I very quickly came to the conclusion that I was not capable of writing a sort of academic history, you know, the kind that has footnotes and carefully checked sources, all this kind of stuff that just wasn't going to work. And so it mutated into something which was this combination of history and just my personal
Starting point is 00:17:38 recollections of things for which I didn't need footnotes. I could kind of make it up as I went along. And in that sense, it worked pretty well. And I wrote the thing fairly quickly over that period of time in England and then came back and finished it off. I think the triggering event was that the folks at Bell Labs in Murray Hill wanted to have a sort of 50th anniversary gathering in October of 2019. And that was the forcing function. I wanted to have the thing done and ready at that point. And so I made it within a day or so. There are places where I'm sure a little more seasoning would not have hurt a bit. But that's sort of the genesis and sort of the evolution of it.
Starting point is 00:18:24 So did the 2019 gathering, that did actually happen, I assume? That did happen, yeah, the timing luck. And it was quite a nice party. I don't know how many people there were, certainly 300 or 400, and including friends from Unix-y days that I hadn't seen for many years at that point. So, yeah, it was actually very, very nice. And there's video of it, or videotaped talks and so on, and some of those are online and others seem to have disappeared, and I honestly don't know where they
Starting point is 00:18:51 went. In, you know, the bit bucket somewhere, unfortunately. But it was definitely a fun gathering, no question about that. I definitely intend to read the book. I went to buy it on Friday, but for some reason my Kindle said it wasn't available on my Kindle at the moment. So I'll have to figure that out. One of the ways I was able to get it done quickly was to publish it through Amazon's print-on-demand service. Right. It's now Kindle Print or something like that. And that has lots and lots and lots of advantages, among others.
Starting point is 00:19:21 If you upload the PDF, they look at it, and 24 hours later, it's available. The downside is that it appears, just judging by my personal mail, quality control wanders up and down, depending on all kinds of unknown variables. The copies I've gotten for my own use have all been perfectly good, but others report erratic printing, poor binding, and it may be partly just dependent on where it's printed, because Amazon does not use a single printer, as far as I can tell. And then the other thing is, I should have made a proper Kindle version; what's there is a print replica, which means it's basically just a PDF. And I don't know why that wouldn't print on or be visible on Kindles. I did it on my wife's Kindle, which is not a new one, and it was okay. And it certainly looks good on things like iPads. So I have the smallest Kindle, which doesn't do a great job with things like PDFs.
Starting point is 00:20:31 So perhaps that's why they said I didn't want to buy it there. Yeah. Yeah, that's probably the real story. PVS Studio is a static code analysis solution that helps enhance code quality, security, and safety. The analyzer detects bugs and potential vulnerabilities in C, C++, C Sharp, and Java code on Windows, Linux, and macOS. CppCast listeners can use the CppCast hashtag to get the analyzer's one-month trial version. To request the trial, use the link in the podcast description. C++ projects are getting increasingly complex. Too complex for people to analyze them thoroughly during code reviews. That's where code analyzers come in. They notice everything the human eye misses, thus making code reviews more productive and enhancing
Starting point is 00:21:13 code quality. Want to know more about the problem? Take a look at the recent article from the PVS Studio team, C++ Tools Evolution, Static Code Analyzers. The link is in the podcast description. Could you maybe tell us a little bit more about your experience working at Bell Labs at the time that Unix was first being developed? Okay, yeah, obligatory rosy glow. Yeah, sure. You know, it really was great fun. So I was doing PhD stuff here at Princeton in the late, you know, second half of the 60s. And I spent a summer at MIT in 66. And that was a wonderful environment because that was when CTSS, the Compatible Time-Sharing System, was just being upgraded or scaled up to Multics. And so there I was hanging
Starting point is 00:21:58 out with a bunch of people who are really, really first-rate programmers doing absolutely fabulous work. And I was living in Cambridge, which was a great place to live. And all of this was really good, and I enjoyed it. And then there was this cohort of people who I had never met who were at Bell Labs, because they were part of the Multics operation, but they were all in New Jersey. And so the next summer, in 67, I got this job at Bell Labs working with the same people in the same group. That, I think, was part of why it was so much fun, is I'm working with people who are doing interesting things. Most of them are, well, I shouldn't say young people. Some of the folks like Ken Thompson, Dennis Ritchie, and so on were
Starting point is 00:22:35 essentially the same age as I am. So it was fun for two summers. I actually had two internships, and I had so much fun that I never interviewed anywhere. They offered me a job. I took it, and I went there in the beginning of 69. And it was in this relatively small group of people, Computing Science Research Center, maybe 25 people who were interested in things in and around computing. Some of it was algorithmic stuff. Some of it was languages, tools, and this underlying interest in operating systems. And so there's all kinds of interesting things in the air, but it was basically a bunch of
Starting point is 00:23:10 sharp people playing with computers and having a good time because it was early in the evolution of computing. Things like Moore's law were making computers sort of more accessible to groups. For $50,000, which is a lot of money in one sense, but not in another, you could get a decent machine like the PDP-11, which came out, I guess, in roughly 1970, or late '70, early '71, something like that. So part of it was just lots of interesting people and lots of interesting things to work on in an environment where they didn't tell you what to do. Do something.
Starting point is 00:23:48 And so it tended to encourage people to wander around and talk to others and find interesting things to work on. And of course, this was the research arm of AT&T, which at the time was providing telephone service for most of the United States. And so it was a very big company, a million people, a million employees. And so there were a lot of interesting problems and things to work on. It didn't matter what you were interested in.
Starting point is 00:24:13 Somebody in the telephone system probably cared about it. And so that meant that management didn't tell people what to do, but rather tried to dangle interesting things in and around. But they also were perfectly happy to have people work in the academic side of things, you know, publish papers in journals, conferences, things like that. It was all deemed to be good. And so in that sense, it was a lot of fun. You didn't have to worry about funding.
Starting point is 00:24:40 You didn't have to worry too much about resources. It was a place where I discovered something I hadn't actually seen much of as a graduate student. People came in and worked at night. There was sort of, I wouldn't call it a night shift, but there were a group of people who kind of were comfortable working in the evenings or even far into the night as a group. Or, you know, as numbers of individuals, but not because of conference deadlines or anything. It's just that they were turned on by what they were doing. And I discovered, gee, it's kind of fun. You could go there and you could work at night too on these intriguing things. So it was good in that sense. The other thing is, I think there were so many good people doing so many interesting things that
Starting point is 00:25:20 it was kind of like, I don't know, playing tennis, which I do really badly. If you play with people who are better than you are, it improves your game. And I think I benefited from that a lot. So imposter syndrome in some sense. But the other thing is, of course, there would be something where perhaps you could be better than all those other folks who were better than you on most things. No, it was a lot of fun. There's no question about it. I hear these stories about like AT&T, Bell Labs, and 3M and stuff from the 60s, 70s, and 80s. And I feel like the environment of like pure research for research sake, like sending people off to go and investigate things, I feel like it just doesn't exist in the same way as it used to. I think that's to a considerable degree true. I mean,
Starting point is 00:26:06 in theory, that's what universities do. But, you know, universities, you have other pressures, among other things. Most people have to worry about funding. They have to get money for, in particular, supporting graduate students who then do the research that the faculty member would like to do, but is tied up in fundraising. And there are mixed benefit chores at this point in my life. You know, I very much enjoy the teaching aspect. I don't resent that time whatsoever. But, you know, I remember a friend of mine coming back from,
Starting point is 00:26:41 a Bell Labs friend coming back from a sabbatical stint at a university saying, those people do research as a hobby. Okay, whereas it was a full-time job at Bell Labs. Mixed blessing, for sure. But today, yeah, I think it's harder to find places where people can work on whatever they want without any real immediate concern for whether that is going to pay off in some sense. It worked well at Bell Labs in part because of sheer scale, for a lot of people, because there was this problem-rich environment, lots of things to work on, and absolutely stable
Starting point is 00:27:12 funding, because Bell Labs was funded by, in effect, a tax on telephone services. So if you made a long distance call, a little tiny slice of that would go off and that would be part of Bell Labs' budget. And so you could predict what the revenue would be. The quid pro quo was that Bell Labs, as an aggregate, had to do things that in some way improved telephone service for the country. And it worked pretty well for close to 100 years, certainly 75 years. And then various things changed. So just as a hypothetical, if you had to justify the research that you were doing back then and say, this is the expected payout that we expect to get from this, do you think Unix would have ever been developed? I'd say the odds were against that, I think, because it was, you know, these guys fooling around. Bell Labs had been involved in the Multics project for, what, five or six years, I guess. And it was crystal clear that Multics was not going to arrive on time to provide the service that it had promised to provide.
Starting point is 00:28:15 And so Bell Labs pulled out of it. And I think Bell Labs management at the money-paying ranks was, you know, once bitten and thus shy of future stuff. And why are you working on it? But management, on average, I think, took a very enlightened view of work that wasn't obvious. I remember once being in a room with Ken Thompson and Joe Condon, who had done this work on Belle, the chess-playing machine, the first really successful chess-playing computer. And Bill Baker, who was at that time president of Bell Labs, I guess, came in with some august visitor just to show off what kinds of things go on at the labs in research. And said visitor asked, well, why are these guys working
Starting point is 00:28:58 on computer chess? It doesn't have anything to do with communications. And Bill Baker gave a, you know, five or 10 minute discussion of why this was important. Look, they're doing computer-aided design, they're building tools, they're better understanding how to make specialized equipment, etc. So he did the defense so that somebody like Ken or Joe didn't have to say anything. I think that was really meant; it wasn't a sort of PR mechanism. I think if you wanted to work on the same thing for five or ten years, somebody might say, would you think about redirecting? But certainly it was measured in multiple years, not multiple hours. What do you think, Rob? Would you go move to a job if you could just spend multiple years researching your passion work?
Starting point is 00:29:44 Certainly sounds like fun. Yeah. I think you'd find, however, if you weren't in an environment where other people paid attention to it and were intrigued by what you did and gave you feedback about it, it wouldn't be as satisfying. I think that was one of the things that worked well at the labs. You know, you would build something. In my case, I would write a program that did something and other people would use it. And that was very positive. They say, gee, that was neat. I was able to do this with it.
Starting point is 00:30:12 And you think, wow, that's wonderful. And then, of course, they tell you about the features that were missing or the bugs they found or whatever. And that wasn't so good, but it was a useful piece of feedback. I kind of feel like there might be a corollary here to modern GitHub open source development, where so many times someone will make, from their perspective, as a throwaway project. They're like, oh, here's a thing.
Starting point is 00:30:31 I'm just going to toss it up on GitHub. They stop paying attention for six months. The next thing they know, there's 100 people who are all interested in this toy throwaway project that they were last working on. It happens, anyhow. I think that is, in some cases, absolutely a fine mechanism. And certainly I stumble across things in GitHub from time to time and think, gee, that's kind of neat. So I apologize for not knowing the timelines of C and Unix development, but were they developed at around the same time at Bell Labs? Well, roughly. I can recommend a
Starting point is 00:31:00 good book, by the way, in case you want to get the timeline. Sorry. Yeah, so I think the Unix, kind of call it the proto version, the prototype version of Unix dates from roughly the middle of 1969. And this is when Ken Thompson did his famous, you know, found a little-used PDP-7 and in three weeks put together what is really arguably the first Unix system. So he did that in, call it 1969 or something like that, on a PDP-7, which was already an obsolete machine and quite small. It had 8K 18-bit words. So half was the operating system, the other half where you swapped user programs in and out.
Starting point is 00:31:41 But that was an assembly language program, PDP-7 assembler, but he wrote his own assembler, of course. And then the first Unix thing that you and I would say, oh, yeah, that's a Unix system, was kind of the, call it the middle of 1971 or maybe a little, somewhere around there anyway, on a PDP-11, which had just about appeared at that point. And that was a richer machine.
Starting point is 00:32:04 It had somewhat more memory. I think our first one had 24K bytes, so a little bigger. That's 50,000 bucks. But it had types. It had characters, single bytes, and 16-bit integers, which were also pointers. The address space was 16 bits. So the first version of Unix for that was written in PDP-11 assembler. Of course, Ken wrote the assembler for that too, along with Dennis, I guess, in some sense. And the third edition of Unix, which came along in, I think, February of 73, was still in assembly language at that point. But C, Dennis had started working on C in 1972, and it's basically taking typeless languages, BCPL and B, and converting them into something that had types.
Starting point is 00:32:57 Not many types, basically, just char and int and address types. And so he was still converting and working on the C programming language at that point. But somewhere in 1973, the system was rewritten in C. The thing that was needed was basically struct, because it's hard to do operating system related things if you can't get at the individual bits and pieces of stuff. So the fourth edition, which showed up late in 73, was an assembly language, or sorry, a C program. I don't have the numbers for that one, but by the time the sixth edition came along, which was 75, that was about 9,000 lines of C and maybe 500 or 1,000 lines of assembly language to get at specific things in hardware that didn't fit the C
Starting point is 00:33:38 model very well. That's the version that John Lions documented in his commentary. Yeah, so most of the stuff that made the modern Unix was pretty much completely in place by, call it the 5th or 6th, certainly by the 6th edition. Of course, there were lots of things that weren't there because they didn't meaningfully exist. Multiple processors or networking in its various guises. But you could go back to 6th edition Unix today, and you probably wouldn't like the shell. Because it was programmable, but it wasn't very good. And the Bourne shell came along somewhat after that, and that made it actually possible to program properly in the shell.
Starting point is 00:34:22 But pipes showed up in the 3rd edition. They were first done in assembly language. Interesting. You brought up the Lions' commentary, because that was, I went to university in the late 90s, and there was, I recall it being almost like a legendary book, because at the time, no Amazon; as far as I knew, it was out of print at that time. I believe you can now go buy a new copy. And I heard stories from friends that are like, well, no, what you want is the photocopy of the photocopy of the photocopy that has everyone else's handwritten notes in the
Starting point is 00:34:51 margins about Lions' commentary. Yeah, right. Sort of Talmudic or something. Hang on one second, guys. I wonder. Oh. I have a copy. I just happen to have with me. Oh, wow. Our audio listeners, he has a copy of the Unix book.
Starting point is 00:35:16 And it's a numbered copy. Wow. Number 135. So that's one of the original manuals. That is the commentary. That is the code and the corresponding great orange thing. Oh, the Lions' commentary. So that's a priceless document, if you like.
Starting point is 00:35:33 But I have the other one, and I'm sorry again that this is the one that you're referring to, Jason. Yes, that's the one I'm referring to, yeah, the Lions' commentary print version, yeah. I don't know whether that's still in print or not. I bought a copy when it became available, as you say, probably in the 90s sometime. Some friends in Australia actually gave me a copy of one which is bound differently, but it's the same content quite recently.
Starting point is 00:35:59 So it's a marvelous, absolutely marvelous thing to read. Those manuals that you just held up a moment ago, do they have your handwritten notes in the margins? Maybe that should be preserved. I think this one has one thing. You can see little yellow stickies there. And if I open it up to that page, it has somewhere in it, and I can't find it now. Oh, it's the other side.
Starting point is 00:36:22 It has that famous line that says, you are not expected to understand this. I mean, back to the Unix and C coming out, is it fair to say from what you just explained that the point was to write less assembly? That's why C was used more? Yeah, absolutely. One of the many contributions from Multics, I mean, it's easy to badmouth Multics, but they had some incredibly good ideas along with some incredibly good people. But one of the ideas that they had was, let's write the operating system and all of the tools and so on in high-level languages.
Starting point is 00:36:59 And so they started out with PL/I, which turned out to be an inappropriate choice, shall we say, because PL/I was way too big, way too complicated. Compiler technology was not up to it, I think. And so they converted a lot of it to BCPL, which was a very much stripped-down language. It was done by Martin Richards at Cambridge and imported to MIT. BCPL was a typeless language, but very straightforward and clean and simple. And so it showed that you could actually write quite effective systems code for all systems applications in a high-level language. And so the path from BCPL, which of course people at Bell Labs used in their Multics work, then led Ken to build a very simplified version of the already simple
Starting point is 00:37:46 BCPL, which he called B. But it was typeless again. That was the thing that ran on the PDP-7. And then Dennis retrofitting types into that made it C. So that's the evolution. But the idea of writing in high-level languages, I think, was very much a Multics idea; whether it came from somewhere else before, I don't know. But once you've done that, then you get all kinds of advantages, and you can see several. One advantage, if things are written in high-level languages, is that it broadens the base of people who can be programmers, because they don't have to know the assembly language. Assembly language programming is picky. It's hard work to get it right. And so you've broadened the base of people who can write code,
Starting point is 00:38:29 and that's very important. You've already seen that with things like Fortran a decade earlier, a decade and a half earlier. And the other thing, maybe not so expected byproduct, was portability. First, portability of tools that you might build, and then ultimately portability of the operating system. And so once the operating system was written in C, people started to think about portability.
Starting point is 00:38:52 The first port was done in early 77, I think, by Richard Miller at the University of Wollongong in Australia to an Interdata 7/32. And then six months later, Steve Johnson and Dennis did a port to the Interdata 8/32. The distinction between those two, I don't even know what it is at this point. Different approaches: I think Miller just transliterated the code, whereas Steve created a portable C compiler, that is, a compiler that would run anywhere and that had a code generator that you could tune to different machines. And once you've got portability of all of your code, then that has all kinds of benefits, including freedom from depending on particular manufacturers.
Starting point is 00:39:36 And so now you can see that. I mean, think of Apple going from the PowerPC to the x86 series to their own M1 design at this point. And that all depends on the fact that they're using high-level languages for everything. When you were all working on Unix and it was being moved to the PDP-11, what operating system were you using before you actually were using Unix? Unix was self-supporting extremely early, not on the PDP-7, where I think probably the assembler was running on a big computer at the Murray Hill Computer Center, probably a GE 635,
Starting point is 00:40:17 which was the general purpose machine that was used there. But very quickly, things became self-maintaining. And I think by the time the first edition of the Unix system was there, you were able to run on yourself, as it were, that there was no dependence on other computers. When did your own involvement in C begin? Well, my involvement is basically as a popularizer. I wrote a tutorial for B, just as a sort of how do you get off the ground and write code in B. And I don't even remember why I did it, I mean, but that's one way to learn it. And then when C came along, I basically upgraded that B tutorial into C, and that was
Starting point is 00:41:06 more useful. And at some point, it looked like there would be interest outside our little group in something more. There were probably at that point 100 Unix systems. I don't even remember. And so I twisted Dennis Ritchie's arm into writing a book with me on C. I don't think he originally wanted to do it, but being a nice guy, he did. And that was probably the smartest thing I ever did in my life. Because I really wanted his expertise and I wanted his reference manual. So he and I wrote the book together. I wrote the first draft of a lot of it, but the reference manual is entirely his. I don't think I ever touched a word of that. And he did the system call chapter because he knew that stuff better than I did. And then we
Starting point is 00:41:48 homogenized the earlier stuff. So that worked out rather well. I'm not always that lucky. You were the chief evangelist for C. Not in any intentional sense, but it sort of worked out that way, at least in some sense. So how did you all, you said at that moment, you estimate there were maybe 100 systems running Unix. Like, how did this sharing of knowledge and software proliferate around the world in labs and stuff at that time? My memory, and this is not something that I was deeply involved with, but there was, first, Unix spread inside Bell Labs
Starting point is 00:42:26 because it provided an environment for lots of the support systems that went into running, you know, a nationwide telephone service, you know, keeping track of where is equipment and who's paying for all kinds of reports of one sort and another. And it turned out Unix was just better for that kind of thing. And there was a version of Unix internally called the Programmer's Workbench, which built tools that made it possible for people with big databases and big record-keeping requirements and big logging and real-time control and all kinds of other things to use Unix just basically to interact with other stuff, whether big mainframe computers or weird peripherals. So there's a lot of spread inside. And at some point, it got enough press, and maybe it was because of the CACM article in whatever it was, 73 or something like that, that universities started to ask whether they could get copies of it. I think Ken Thompson took a copy with him when he went to Berkeley on sabbatical, roughly that time. And so universities started to get it, and AT&T, I think, didn't know
Starting point is 00:43:32 what they were doing. And furthermore, they couldn't sell software because they were a regulated public monopoly. They couldn't sell software because somebody could accuse them of cross-subsidizing the telephone business and thus giving, who knows, some unfair advantage to something. So they couldn't sell it, but they could license it. And so what they did was to license Unix, the source code, as well as the running system, to universities for what amounted to a nuisance fee, a hundred bucks to make a magnetic tape or something like that. And so that proliferated and people with the licenses to do this interpreted the license at least as saying they could talk to other people who had the licenses. And so they formed user groups where people would get together
Starting point is 00:44:19 and they would discuss what they were doing and they would swap their code, kind of like an early version of open source GitHub-y kind of thing. It was nine-track magnetic tape. And so they would, you know, have these meetings, which became essentially the Unix users group, kinds of meetings that proliferated around the world, national groups in lots of countries, led to conferences like the USENIX conferences that still go on to this day, I guess, or maybe there's Linux conferences at this point. That was the spread mechanism. It was mostly through universities. There was a commercial license that was also available for, 20,000 sticks in my mind. And so if you were a company,
Starting point is 00:44:59 you could buy a Unix license and become part of this group of people who could share their expertise in this weird system. That took us through probably the 70s very nicely. One of the centers of activity was Berkeley. Ken had been an undergraduate and a graduate student at Berkeley, so he had lots of friends there. He taught there. And at some point, basically when the VAX came along, which was the 32-bit version of the PDP-11, when the VAX came along in the late 70s or something like that, I think the center of gravity shifted to Berkeley, in particular with the networking kinds of things, because all of the internet code that was written for Unix was done there originally by Bill Joy and colleagues. It's a mental picture that I just never considered before.
Starting point is 00:45:46 Someone literally carrying a nine-inch magnetic tape. I just got the latest version of Unix. Let's install it. Yes, right. And it's nine track, not nine inches. Oh, nine track. I'm sorry. Yeah.
Starting point is 00:45:59 There's a bit of a generation gap here or something. Yes. Yes. Well, I'm older than Rob, but the oldest media that I've used is 5 1⁄4-inch floppy. I've used tape. I've used audio cassette, compact tape as well. I even have an 8-inch floppy around here somewhere. Do you still find yourself using Unix as your primary operating system, or something else?
Starting point is 00:46:24 Yeah, I think so. My personal computers are almost entirely Macs, a combination of portability and just ease of use. And so arguably what I'm running there underneath is a flavor of Unix based on some BSD thing, much mutated, I guess. So I think of that as Unix and I'm sitting here, got terminal windows open and so on. The university, well, certainly the computer science department runs Linux. I've forgotten which flavor, doesn't matter. And so I often use that, basically SSH into a Linux system. And I kind of don't care which it is because most of the things I use kind of work the same on all of them. And if they don't, I think of it as a failure of implementation or something like that.
Starting point is 00:47:11 So, yeah, I'm basically just using Unix as I always did. Cool. That's amazing. Going through your list of publications that you've worked on over the years, in addition to the C book and the recent Unix memoir, you also worked on a Go programming language book. I was kind of wondering what your involvement was in there. That was done with, and I think the correct preposition there is actually by, Alan Donovan. You know, I had spent summers at Google off and on for quite a while, and by some accident of geography and knowing people, I wound up sitting in the group of people who were working on Go at Google in New York. You know, Peter Weinberger, who's an old friend from Bell Labs days, was there.
Starting point is 00:47:54 And next desk was Alan Donovan, who I had not met before that. And so Alan and I talked. I was his intern, in effect, for one summer. And I learned some Go at that point. And everybody I talked to, including Alan, including Rob Pike, who was one of the creators of Go and so on, said that they didn't like the existing repertoire of Go books that were out in the world. And Alan and I decided that maybe it would be worth trying to write a book about Go. And so we did. But the book is probably 95% his work. He's just an exceptional programmer and, in addition, an exceptionally good writer.
Starting point is 00:48:32 And so it worked out extremely well, again. And I didn't contribute nearly as much as he did. But that's the genesis. And so I don't write very much code, period. Not as much as I would like. It tends to be small things, and I haven't written any significant amount of Go in some years at this point. It's one of those things I would love to get back to because it's a nice language, but I'd have to refurbish my knowledge. And your experience as a professor, I understand, right, you
Starting point is 00:48:58 became a full professor in 2000. For about the last 22 years now, you've seen students go from, you know, personal computers that not everyone owned to everyone literally carrying a computer in their pocket. Like, have you seen, like, do incoming freshmen know more or less today about computers than they did 20 years ago? Or have you seen any kind of shift or change in the average student coming in? Yeah, it's an interesting question. I think what I would say is it's not that they know more or less, but they know different in some sense. The kind of things that I would have taken for granted that, well, first, two populations, and if we pick the kids who are going to wind up in computer science, I think today they
Starting point is 00:49:43 are operating at a higher level, if you like. They tend not to know actually how machines work. And they're not as good at, let's say, what I think of as absolutely core Unix tools. I know all of those weird abbreviated names. And I know the totally irregular, weird set of arguments for lots of commands, and they don't. And so I can do things at the command line that they can't. But the flip side is that they're a lot more comfortable at other things, higher level languages. A lot of them are comfortable in Python.
Starting point is 00:50:15 They learn Java in high school, maybe, or pick it up quickly here. So they know different kinds of things. Just a random example. I was sitting at home in one of the perpetual Zoom sessions with one of my kids last week, and my connection was really lousy. And I said, oh, the ping times, I can see the ping times are going up. And he said, what's that? So that in a way encapsulates it. And, you know, very, very bright kid doing really interesting work, but at a high level, and he didn't know what ping was.
Starting point is 00:50:46 So, but there's a different population as well, which is kids are not admitted to computer science at Princeton. Some schools, they are, you know, you apply computer science or you apply whatever, but here you're just, you're admitted. And so that means that kids come in and they discover computing by virtue of taking a course early on. And then they say, gee, that's a lot of fun. But they didn't do any computing in high school. And so ultimately, that first course kind of brings everybody up to a pretty much common level, even though some of them have never done any computing before. And so that means there's a population of kids whose length of experience and perhaps breadth of experience is just different than it might have been 20 years ago.
Starting point is 00:51:29 But that isn't to say they won't be very successful if they decide to go on in computing. And the flip side, of course, is that pick your discipline. I don't care what it is. It has a computing component at this point. And so any field at all that you might want to work in, you're going to need to know something about computing if you want to be kind of up to date with what goes on in it. And that's one of the reasons why I'm doing this digital humanities course. I'm talking to historians and English professors and art people and so on, and they use computing in their work. So a lot of the kids who take the intro computing courses here are not going to be computer science types at all. Right. Thank you. Yeah. Well, Brian, it was so great having you on the show today. Thank you so much for going into all this history
Starting point is 00:52:16 with us. Yeah. Thank you so much for coming on. It's great fun. Thanks so much for listening in as we chat about C++. We'd love to hear what you think of the podcast. Please let us know if we're discussing the stuff you're interested in, or if you have a suggestion for a topic, we'd love to hear about that too. You can email all your thoughts to feedback at cppcast.com. We'd also appreciate if you can like CppCast on Facebook and follow CppCast on Twitter. You can also follow me at Rob W. Irving and Jason at Lefticus on Twitter. We'd also like to thank all our patrons
Starting point is 00:52:48 who help support the show through Patreon. If you'd like to support us on Patreon, you can do so at patreon.com slash cppcast. And of course, you can find all that info and the show notes on the podcast website at cppcast.com. Theme music for this episode was provided by podcastthemes.com.
