Advent of Computing - Episode 5 - Unix for the People, Part 2

Episode Date: June 2, 2019

Now, as the name suggests this is the second part of a series on the history of UNIX. Part 1 mainly covers the background leading up to UNIX. If you haven't listened to it yet, I strongly suggest you go do that now. A lot of what was covered in part 1 provides needed context for our discussion today. Just as a quick recap, last time I told you about CTSS and Multics, two of the earliest time-sharing operating systems. Today, we are going to be picking up where we left off: Bell Labs just left Project MAC and decided to start their own time-sharing project. What they didn't realize was that this new project, called UNIX, would soon outshine all of its predecessors. But when this all started, in 1969 on a spare mainframe at Bell Labs, there was no hint at its amazing future.

Transcript
Starting point is 00:00:00 You may be surprised to hear it, but did you know that there is one piece of software that both Apple and Microsoft have been involved with off and on over the years? Now this is a strange and complicated story, but I want to at least touch upon it to make a certain point. So in 1980, Microsoft announced it would be releasing a new product called Xenix. This was an operating system for the new line of 16-bit home computers that had recently started to hit the market. However, after five or six years, Microsoft sold off Xenix to focus on other, more profitable projects. Now, much later, in 1995, Apple released another very similar product called A/UX.
Starting point is 00:00:48 This was, again, a new operating system. But unlike Xenix, A/UX was a graphical system designed to run on a series of new Mac computers. But A/UX also never really hit it big in the marketplace, never even recouping costs, and it was cancelled about a year or so after its initial release. So how are these two obscure, failed operating systems, released 15 years apart by two totally different companies, related? And what does any of this have to do with this episode? Well, I'm glad you asked. It turns out that both A/UX and Xenix have a shared ancestor. That's right, they're both based on the earlier Unix.
Starting point is 00:01:34 In more recent times, Apple again courted Unix, this time to much more success. OS X, now called macOS, and iOS, Apple's flagship offerings, are now both based on Unix. Microsoft has also dipped back into its Unix history, adding a Linux subsystem to Windows 10, and even going so far as to contribute improvements back to the larger Linux project. If both Apple and Microsoft have kept coming back to UNIX again and again for decades, then it stands to reason that there's something going on with this UNIX stuff. So why is UNIX such a force nearly 40 years after its release? And what is UNIX anyway? Welcome back to Advent of Computing.
Starting point is 00:02:28 This is episode 5, Unix for the People, part 2. I'm your host, Sean Haas. Now, as the name suggests, this is the second part of a series on the history of Unix. Part 1 mainly covers the background leading up to the start of the Unix project. If you haven't listened to it, I strongly suggest you go and do that now. I can wait. Now, a lot of what was covered in Part 1 provides needed context for our discussion today. Just as a quick recap, last time I told you about CTSS and Multics,
Starting point is 00:03:00 two of the earliest examples of timesharing operating systems. Today we are going to be picking up where we left off. Bell Labs just left Project MAC and decided to start their own timesharing project. What they didn't realize was that this new project, soon to be named Unix, would outshine all of its predecessors. But when this all started in 1969 on a spare mainframe in the back rooms of Bell Labs, there was no hint at this amazing future. So when we last heard from Bell, they had just pulled out of the Multics project.
Starting point is 00:03:35 But ironically, them leaving Multics is actually the start of our larger story. While Bell was still involved in the Multics project, one of their programmers, Ken Thompson, had been developing a video game called Space Travel on the system. When Bell Labs left the Multics team, Ken sadly lost access to the MIT mainframes that he was using. Since he still wanted to work on Space Travel, he decided to port it to run on one of Bell's mainframes. But there were some issues with this arrangement. Mainly, the GE 635 at Bell Labs that Ken had access to was still running as a batch-based system instead of a timesharing system.
Starting point is 00:04:20 So each move in the game had to be submitted as a job. Now if you've ever played a video game before, then you know that having to submit a move and then wait and wait and wait while it runs and you get your response back will quickly ruin the experience. The display on the GE mainframes was also pretty slow and choppy. Obviously, Ken wasn't very happy with this downgrade. At this point, you may wonder, Sean, what does this have to do with Unix? Well, believe it or not, this is actually where Unix first starts. As Thompson was looking for a way to improve Space Travel, he found a seldom used computer on the Bell campus. The system he found, and would rewrite Space Travel for, was a DEC PDP-7.
Starting point is 00:05:08 Now, for the time, the PDP-7 was a pretty run-of-the-mill kind of minicomputer. Released in 1964 as a cost-reduced version of some of DEC's earlier machines, the PDP-7 didn't have any of the fancy features of the GE systems that Thompson was used to using for the Multics project. However, the PDP-7 at Bell did have a better display than the other GE mainframes that Thompson could get access to. Also, since it wasn't as in-demand, Thompson wouldn't have to deal with batch processing. He could just use the system on his own with few, if any, interruptions. So, Ken and another researcher, Dennis Ritchie, started to rewrite Space Travel for a second time, this time in PDP-7 assembly language.
Starting point is 00:06:06 As the game grew and grew, the two programmers implemented increasingly complicated features and became increasingly familiar with the PDP-7 itself. But it was slow work. This was mainly because the duo was still doing their actual programming on one of Bell's larger GE machines. So to update Space Travel, Ken and Dennis would have to head over to another mainframe on another part of the building, edit the code, compile it, output the finished program to punched paper tape, and then walk the tape back over to the PDP-7 to load and run. Obviously, this wasn't an ideal situation. Also, this illustrates two important points nicely. First, there's some weird obsession with paper in computing in the 60s. I mean, for real, you have teletypes which output text onto paper,
Starting point is 00:06:50 you have punch cards which are just glorified cardstock, and you have paper tape. There's something sketchy going on there. And number two, programmers have always been lazy. So keep this in mind as we travel through the rest of our story. So the team, now familiar with the new machine, wanted to see what they could really do to make programming easier for themselves and find out how they could stretch their muscles on this newer hardware.
Starting point is 00:07:21 Over the course of 1969, they implemented a new file system on the PDP, then a set of tools for managing files, a text and program editor, an assembler, and eventually a whole command line interface to tie everything together. Pretty quickly, it started looking like they had a new operating system on their hands. And soon, Ken, Dennis, and others inside Bell would start calling the software Unix, as a kind of joke on the earlier name, Multics. That brings us up to the actual start of the project. In the next few years, Unix would receive official backing from Bell Labs and become much more of a real project, instead of just some idle tinkering on a spare system.
Starting point is 00:08:10 But I still haven't explained what Unix is exactly. So, at its core, Unix is an operating system. Well, what does that mean? I touched on what an operating system was at least in passing in episode four, but I think it bears some deeper explaining here so that when I explain what Unix does and what it is, it might have some more impact. So on its most basic level, an operating system is a set of software that can manage and allocate the physical resources of a computer. In reality, a computer is really just a mess of chips and circuits.
Starting point is 00:08:55 The operating system, regardless of whether that's Windows, Mac OS, or Unix, deals with hiding all the hardware away through a process called abstraction. Let me give you an example. Let's say you want to edit a text file. Now, for an everyday computer user, that really just boils down to opening the file, editing it, and then saving it. Simple. However, in reality, there's a lot of steps to that. And at each step, you have to know a lot about the underlying system. So first off, where's the file stored? It has to be on some kind of data storage device like a hard drive or a flash drive or even a network drive. Each of those devices has its own way of storing data. Also, how is the data store connected to the computer?
Starting point is 00:09:39 Different connection types, whether that's an internal hard drive, a USB socket, or a network port, each have their own protocols that you have to use to read from them. Once you start reading the file, where in memory does it need to be stored? Now, this will depend on what the text editor expects and where on the computer there's some free memory. But what encoding do you need to use so the editor can understand it? Once the file is open and you start editing, now you have to worry about how keyboard input is being handled. Are you using a USB keyboard or a PS/2 keyboard? Or maybe a wireless one?
Starting point is 00:10:17 Each of those inputs sends keystrokes in a different way. So you should start to see a pattern here. There's a lot going on at the hardware level that should not be touched by human hands. An operating system, often abbreviated as just OS, fixes this problem in a few key ways. First, the OS knows what kind of hardware a computer has and how to work with it. On top of that, it's also able to present some kind of interface for programs to use the hardware. Now, that part's called abstraction. Basically, the OS abstracts out the complicated hardware procedures so you're able to say
Starting point is 00:10:56 something like, open this file, and the OS takes care of all the details. Beyond simple abstraction, an OS also provides some kind of environment for programs to run in. Basically, just think of this as a set of tools that a programmer can use to make new software that can run on that operating system. A program like Chrome, for instance, isn't part of Windows as an operating system, but it runs inside the Windows environment. There are a lot of different approaches to how an operating system abstracts hardware and builds an environment. So, back to the topic at hand, what makes Unix special, and how does it tackle these
Starting point is 00:11:36 problems? Explaining that quickly becomes kind of nebulous. The large issue with this is that there's no one thing that makes Unix Unix. In fact, there's nothing even that really makes it revolutionary. In a lot of ways, Unix, even for the time, was technically inferior to other systems out there. Instead, what makes Unix special is a lot of small pieces that work together to make a larger whole. The thing is, many of the features that combine to form Unix aren't even software, but instead just a certain approach to solving problems and looking at a computer.
Starting point is 00:12:17 This whole collection of software and common practices is often called the Unix philosophy. Thompson and Ritchie describe the Unix philosophy as a combination of a few factors. To quote, First, since we are programmers, we naturally designed the system to make it easy to write, test, and run programs. Second, there have always been fairly severe size constraints on the system and its software. Given the partially antagonistic desire for reasonably efficient and expressive power, the size constraint has encouraged not only economy, but a certain elegance of design. Third, nearly from the start, the system was able to, and did,
Starting point is 00:12:59 maintain itself. So, what does any of this actually mean? Well, I want to try to explain this by going over how Unix as an operating system functions in practice and relating that back to these core ideas of what makes Unix, Unix. The first part I want to discuss is the file system. For those of you not in the know, a file system is just the way in which data is stored on a computer. This is one of those features on modern computers that so often fades into the background. We all know that a computer stores and works with files, but don't ever really think about how it does any of that. Now, it may not come as a surprise, but file systems existed long before Unix.
Starting point is 00:13:46 However, Unix made the file system more central to the computing experience than many earlier operating systems ever did. The file system that Thompson designed was hierarchical. Now that just means that it could have nested directories. Like a modern file system, essentially. The Unix file system also has built-in access control. Each file belongs to a user or group of users and can have attributes set to control who can view, edit, or delete it. Now, so far, all of these aspects of the Unix file system I've mentioned come almost directly from its predecessor, Multics. Really, that shouldn't
Starting point is 00:14:26 come as a surprise, since the Unix team had just come off working on Multics that same year. However, there's a big feature I've left out that makes Unix come out on top. Device files. Essentially, a device file is just a special file that lets you access some physical device connected to your computer. Right off the bat, that doesn't sound all that impressive, but in practice, this changes a lot of how a computer is really used. Since you can treat any device as just a text file and vice versa, this means that a programmer instantly has more flexibility. Instead of writing code to output a message on a terminal, a printer, and any other device, you really just have to be able to output to files, and Unix just takes care of the rest.
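To make those two ideas concrete, here's a quick sketch at a shell prompt on a modern Unix-like system. The directory and file names are invented for the demo, and /dev/null and /dev/urandom are standard device nodes on today's systems, not anything from the original PDP-7 Unix:

```shell
# Hierarchical file system: directories nested inside directories.
mkdir -p /tmp/unixdemo/projects/space_travel
echo "port the game" > /tmp/unixdemo/projects/space_travel/todo.txt

# Per-file access control: owner can read and write, everyone else read only.
chmod 644 /tmp/unixdemo/projects/space_travel/todo.txt
ls -l /tmp/unixdemo/projects/space_travel/todo.txt   # long listing shows the permission bits

# Device files: writing to a device looks exactly like writing to a file.
# /dev/null is a device that simply discards whatever is sent to it.
cat /tmp/unixdemo/projects/space_travel/todo.txt > /dev/null

# And reading from a device looks like reading a file: grab 8 bytes from
# the kernel's random-number device and display them as hex.
head -c 8 /dev/urandom | od -An -tx1
```

The payoff is in the last two commands: neither cat nor head knows or cares that /dev/null and /dev/urandom are devices rather than files sitting on a disk.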
Starting point is 00:15:13 So that's a broad brush for the file system. Already, we can see that it provides a good programming environment and enables elegant and flexible solutions just with the device files alone. The next big piece of the Unix experience is how it handles its programs. The idea being that each program in Unix is often described as doing one thing and doing it well. This idea extends from simple programs that come bundled with the system to third-party software that you find online. Maybe I can explain this better with an example.
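As a quick sketch of that "one thing well" idea in practice, here's what a few standard tools look like at a shell prompt. The sample file and its contents are made up for the demo:

```shell
# A tiny sample file: one word per line.
printf 'apple\nbanana\ncherry\nbanana\n' > /tmp/fruit.txt

# Each tool does exactly one small job...
cat /tmp/fruit.txt          # print a file's contents
wc -l < /tmp/fruit.txt      # count lines
sort /tmp/fruit.txt         # sort lines alphabetically

# ...and chaining them does something bigger: count how many times
# each word appears, most frequent first.
cat /tmp/fruit.txt | sort | uniq -c | sort -rn
```

None of the four programs in that last command knows about the others; the shell just wires each one's output into the next one's input.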
Starting point is 00:15:53 One of the most commonly used programs in Unix is a program called cat. Now, this has existed for all of Unix's history, and all it does is print out the contents of a file. On its own, cat is basically useless. The same is true with a lot of Unix programs. You have programs that give you the number of lines in a file, the first or last line of a file, programs that move files and delete files,
Starting point is 00:16:17 but none of them are really a Swiss Army knife. To make sense of why having a bunch of these conditionally useful programs matters, we need to talk about one more of Unix's core features. Those are pipes. Now, a pipe is just a way to turn one program's output into another program's input. This is another one of those Unix features that sounds unimpressive. But when you add pipes to the idea of small, niche programs, you get a much bigger picture. This lets you chain together smaller programs to do bigger things. You can kind of think of it like executing a combo in a fighting game. And since each program you're using is just a step in a large process, you have a lot of flexibility in what you can end up doing. The idea of pipes, small programs, and everything on the system being treated as
Starting point is 00:17:06 files is really the core of Unix philosophy and can be seen all throughout Unix-type systems. But that's just at the surface. If you look a little deeper, we get to the third point that Ritchie used to describe Unix. Like Multics before it, Unix would come to be written in a high-level language. The first few versions of Unix were written in PDP-7 assembly, as I mentioned before, but subsequent iterations began being written in a new language called C. The history of C is complicated enough to deserve its own episode, so here I'm going to just try to hit on the broad strokes where it relates to UNIX. To start off, in 1971, UNIX had already become a hit inside Bell Labs. This was the year that
Starting point is 00:17:53 UNIX would leave the PDP-7 behind and move over to a new, more powerful PDP-11. Over the next two years, as more PDP-11 machines started to come into Bell Labs, Unix also moved in, becoming the system of choice inside the laboratory. But Unix was still composed almost entirely of assembly language. That brings us up to 1973, one of the most pivotal years for Unix. Thompson and Ritchie had been trying to find a replacement for assembly. Mainly just because as Unix grew, the codebase started to become harder and harder to deal with. I'm sure if you've ever programmed a large project, then you can relate to that.
Starting point is 00:18:33 Now, a high-level language would make Unix easier to maintain and improve. But in 1973, their options really looked bleak. They had some experience with PL/I from their time with Multics, but that wasn't really the right tool for the job. Fortran was also considered, but that was pretty quickly dropped. The main issue was that for creating an operating system you need a language that's very general purpose and still has some amount of control over the underlying computer. And as of the early 70s, there really wasn't a perfect fit out there. So over the course of 72 and 73, Dennis Ritchie set to work.
Starting point is 00:19:13 Taking features from some contemporary programming languages and adding in some new ideas, C was born. What made C instantly stand out from other options was that it was just plain and simple a good general purpose language. C is kind of like a jack of all trades. It's good at a lot of things, but it's not the best at any one task. And that's really what you want in an operating system. You want a language that's intensely flexible. The other aspect of C that makes it shine is the fact that it remains close to the computer. C exposes a lot more of the underlying structure of what a computer is doing, especially when
Starting point is 00:19:52 it comes to memory. This means that you get the flexibility of a higher-level language with the control, at least somewhat, of a low-level language. On top of all that, a lot of the features of C are designed to compile down to only a few CPU instructions, making it fast and easy to translate code into something a computer can understand, but also giving the programmer more of an idea of what the final code is going to do. Over this time, the Unix team started to port the system to be totally in C, and by 1973,
Starting point is 00:20:24 C was the lingua franca of Unix. This was the point where Unix started to really become a big deal. I've been kind of burying the lede on C for a bit here, so let me explain one more thing. It's true, C is a much better language to program in than assembly, at least in terms of ease of use, but there's a much bigger upshot to the language. Now with assembly language, each line of code is specific to the machine it's running on. That means that a program written in assembly is locked down to the machine it was originally written for. That's not the case with high-level languages.
Starting point is 00:20:58 The compiler, the software that translates a high-level language to machine code that a computer can understand, determines what system the finished program can run on. So instead of rewriting all of Unix next time it needed to be moved to a newer machine, Thompson and Ritchie only had to write a new C compiler for that new computer. This idea of being able to take the same code and run it on different types of systems is called portability, and before Unix, it was practically unheard of for an operating system. Now, instead of having to rewrite the system into a totally new codebase, you could just
Starting point is 00:21:32 rewrite the compiler, and it turns out that rewriting the compiler quickly became a lot easier than rewriting all of Unix. Within the year, Unix was not only running on the new PDP-11, but it was also ported to a handful of other types of systems inside of Bell. Why were C and portability such a big deal? Well, partly because Unix was the first to do it. And Unix really stuck the landing. In 1975, Bell released Unix V6, and for the first time, the system started to appear in the wild.
Starting point is 00:22:07 But this wasn't a software launch that any of us would recognize. You see, you couldn't just buy a disk with the newest fancy version of UNIX on it. Instead, you'd have to buy a license to use the source code. With that, you get all the code needed to build your own UNIX system. Just compile, install, and run. Since the C source code was now out, the next big leap should make some sense. Now, I've already established that programmers are intensely lazy, but they're also some of the ficklest people you'll ever meet. So pretty quickly, those with a license to use Unix
Starting point is 00:22:44 started to tinker with it and turn out their own versions of the operating system. These derivatives fell into two large categories, either straight ports to new hardware or totally new systems. The ports were just programmers optimizing and recompiling Unix for newer computers. The new systems, however, varied widely in terms of how much was changed. Some of these new Unix-compatible systems would only have a few tweaks here and there, while others were totally redesigned to work for niche systems or applications. This is where the aforementioned operating systems like Microsoft's Xenix and Apple's A/UX appear. Xenix kept pretty close to the original Unix codebase,
Starting point is 00:23:27 just porting the system to smaller home computers, while A/UX was more in the total-overhaul category, ending up being a classic Mac-like environment, but built using underlying Unix code for compatibility's sake. We also get more successful versions of Unix from this explosion, like Berkeley's BSD or IBM's AIX. The names aren't really that important; what matters is that once the code was out there, Unix was ported and modified to run on nearly every computer imaginable.
Starting point is 00:23:59 But that's not even the best part. You see, Unix always presents the same environment regardless of the computer it's running on. Now this has a few implications. Firstly, if you know Unix on one computer, you know Unix on any system it can run on. Secondly, and much more importantly, any program written for one type of Unix always runs on every other type of Unix. This one-two punch meant that for the first time, software could truly be written once and run everywhere. I cannot stress how important this one fact is. Prior to the spread of Unix, it would take a total overhaul of any program to move it from computer to computer. The compatibility and portability that the Unix
Starting point is 00:24:44 environment brought to the table set the stage to change how computers were used forever. That's the story of the creation of Unix. The 80s and 90s saw an explosion for Unix, both in terms of its user base and its diversity in general. This era is often called the Unix Wars due to the sheer number of competing derivative systems. To properly explain this time period would be a very difficult task, to say the least, so I want to instead focus on the resolution and where Unix is today. There really isn't a single year I can point to as the start of the, quote, modern age of Unix.
Starting point is 00:25:24 Rather, the shift occurred sometime in the mid-1990s. This is when the Unix community at large finally agreed on something that had been missing for decades, a full specification. You see, systems like Multics had been written for a spec. That means that the specification for how they should function was designed before the operating system was written. Unix worked the opposite way. The OS grew kind of naturally and diversified as it spread. Only after some 20 or so years did a concrete and agreed-upon spec for what made Unix Unix come to be. The late 80s and early 90s saw a few competing standards, but the one that won
Starting point is 00:26:07 out is today known as the Portable Operating System Interface, or POSIX. Basically, POSIX laid out the technical details of how to implement a Unix-compatible system. This meant that now anyone, regardless of whether they had a license to use AT&T's original code, could create their own Unix-like system and maintain compatibility. Really, POSIX paved the way for a new, more modern type of Unix system, mainly because now, instead of having to adopt older Unix code, totally new OSes could just be designed to an existing specification and still be Unix-like. I think more than anything, this is what really cements Unix as a legend, at least in the
Starting point is 00:26:53 computing world. While a lot of systems these days are based off the Unix specification, very few, if any, are based on code from earlier Unix releases. So, where do we see Unix today? Well, the biggest heir to the legacy is Linux. Started in 1991 by Linus Torvalds as a hobby, today Linux is one of the most used operating systems on the planet. Instead of resembling the older forks of the Unix codebase, Linux is a totally new OS, and it adheres
Starting point is 00:27:25 to the same specifications established by earlier Unix-like systems. This means that Linux functions much in the same way as earlier Unix systems, but has the advantage of more modern features. Since its introduction, Linux has become the go-to system for servers, making up somewhere north of 90% of the internet's infrastructure. But it's not just restricted to larger computers. Under the hood, Android, the most widely used OS on smartphones and tablets, is also a type of Linux. If you're listening to this on a mobile device, then there's about a 70% chance that you are using
Starting point is 00:28:01 some kind of Linux right now and don't even know it. Alright, I think it's time to wrap this episode up. To do that, I want to come back to the quote that I started this series with, Dennis Ritchie's description of Unix as, quote, a system around which a fellowship could form. I think we see that fellowship in a few key places in the history of Unix. On the most superficial level, we can understand it as the time-sharing aspect that Unix takes from its Multics roots. That would be a fellowship of users gathered on a single shared computer. But as Unix evolved, we started to see more than that. We started to see a cultural level of that fellowship. As the OS spread, that fellowship became a culture of programmers and users that all knew the same system across multiple different computers, ranging from the largest mainframes to the smallest PCs.
Starting point is 00:29:00 In the modern era, this fellowship continues to shift and evolve. Today, it goes beyond Unix itself. Instead, we can understand it as a level of communication and connectivity that Unix-like systems enable. Our current internet technology and communication infrastructure now forms the backbone of that Unix fellowship. And in that way, anyone connected to our new computerized world is part of that fellowship. Thank you for listening to Advent of Computing. I'll be back in two weeks' time with a new episode on a topic a little more on the lighter side compared to Unix. In the meantime, if you like the show, please take a second to share it with your friends.
Starting point is 00:29:42 As always, you can rate and review on iTunes. If you have any comments or suggestions for a future show, go ahead and shoot me a tweet. I'm at Advent of Comp on Twitter. And as always, have a great rest of your day.
