Advent of Computing - Episode 53 - C Level, Part I
Episode Date: April 4, 2021
C is easily one of the most influential programming languages in the world, and it's also one of the most popular languages in the world. Even after close to 50 years it remains in widespread and sustained use. In this series we are going to look at how C was developed, how it spread, and why it remains so relevant. To do that we need to start with background, and look at what exactly influenced C. This episode we are diving into some more ALGOL, CPL, BCPL, and eventually B. Like the show? Then why not head over and support me on Patreon. Perks include early access to future episodes, and bonus content: https://www.patreon.com/adventofcomputing
Transcript
It's impossible to say that there's any one best programming language.
That's just a plain fact.
Some are better suited for specific tasks.
Some have traits that are universally loved, while others have quirks that are objectively bad.
It even gets sticky trying to figure out which language is the most used.
Nowadays, there's definitely a lot of code publicly available.
We can scrape that from the internet to start getting an idea of trends, but that only gives
us a sliver of the much larger code pie. You can't really go around asking companies for line counts
of proprietary software. That is a quick way to get kicked out of someone's lobby. That all being said, we do have a
definite ranking for popularity of different programming languages. As of March 2021, the most
popular language in the world is, drumroll please, C. That's at least according to the TIOBE index.
Their ranking reflects which languages are discussed the most online,
hence popularity not necessarily usage. Since the index started, C and Java have been the
perennial top two contenders. They trade off the number one spot back and forth for years.
But here's where things get wild, at least in my mind. C isn't a newcomer. Its first iteration
appeared in 1972. That's nearly 50 years ago. Discounting contenders like assembly language,
C is by far the oldest language anywhere near the top of the TIOBE index.
Deepening the puzzle, most of the other top languages cite C as a major influence.
Java was originally developed as an alternative to C. Another one of the top 10 languages,
C++, was developed as an extension of C. Even seemingly unrelated languages like Python have
drawn influences from C. Adding to the connection,
most major versions of Python's interpreter are written in C. Everywhere you look, it seems that
there's some connection back to this venerable language. So when I see something like this,
one big question springs to mind. What exactly influenced C itself?
Welcome back to Advent of Computing.
I'm your host, Sean Haas, and this is episode 53, C-Level, part 1.
This time, we're getting back into what's one of my favorite topics, programming languages.
As long-time listeners know, or really anyone can guess, I'm a huge fan of programming both in my personal and professional life. It's a practice that I find very fulfilling and
fascinating. And, like anything related to the digital realm, programming has its own set of sagas.
Today, we're starting a deep dive into C, one of the most popular and influential programming languages out there.
And along the way, we're going to go through what I think is a very fascinating saga of the digital past.
Now, I'd almost say that C is the Latin of computer programming, but that's not 100% accurate as a comparison.
Anyone who's programmed long enough has been exposed to C. It's kind of a rite of passage.
You see it used as both example code and living projects.
Most modern languages borrow a lot of syntax from C, which means that even programmers that haven't used C directly find the language recognizable. Sure, it's been given
facelifts over time, derivatives and dialects have hit the scene, but C remains a very popular
and widely used language nearly 50 years after its creation.
Unix is one big reason behind C's popularity.
In the early 70s, Unix, an operating system developed at Bell Labs,
made its way into classrooms and onto computers around the world.
By that point, the operating system was written almost entirely in C.
In fact, the language was initially developed for that very purpose. So as Unix rose to prevalence, so too did C. A major contributing factor here was that
early versions of Unix were distributed as uncompiled source code, so anyone using Unix
was exposed to its particular language. This same code was also used as a teaching aid at
many universities, so C became a pretty big deal pretty quickly. But herein lies something that's
always kind of bugged me. In the current day, C is really seen as a general-purpose programming
language. One of its huge keys to success is that it's portable. Just
about any computer you can find has a C compiler available for it. So just about any computer you
can think of runs some kind of software written in C. And believe me, a whole lot of software
is written in C. But here's the caveat. C was designed to fit a very specific niche. It
was developed as a language for system programming. And you can see that laid bare in the language
itself. C isn't as low-level as, say, assembly language. You aren't dealing directly with
the computer on its terms, but you can get close if you want to.
C can directly address memory.
In a lot of cases, you have to be comfortable manually moving around bits and bytes to get the most out of C.
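Just to make that concrete, here's a tiny sketch in modern C. This is my own example, not code from the episode or its sources, but it shows what manually shuffling bytes around with a pointer looks like:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        unsigned int value = 0x12345678;
        unsigned char buf[sizeof value];    /* a raw chunk of memory, byte by byte */

        memcpy(buf, &value, sizeof value);  /* copy the integer's bytes in directly */
        buf[0] ^= 0xFF;                     /* flip every bit of one byte by hand */

        unsigned char *p = buf;             /* a pointer aimed straight at that memory */
        printf("first byte is now 0x%02X\n", *p);
        return 0;
    }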
In Java or Python or even ALGOL, you just can't do that.
You flat out can't.
Those languages deal with the computer on a more
abstracted level. The programmer is kept separate from what's going on down lower. That kind of
abstraction is seen as a good thing. It's often seen as an essential part of a quote-unquote
good programming language. But in that respect, C breaks from convention,
and it breaks pretty hard. Just looking at it alone, it seems that C's in a class separate
from other popular and more modern languages. But looks can be deceiving. There's something
more interesting going on just below the surface. Now, as you can probably tell, this is the start of a series.
I tried to do this all in one episode, but I just can't.
There's too much to the story, and once my research got going,
I realized that there was a lot to C's predecessors that I didn't want to rush through.
The overall goal of this series is to try and understand
what made C how it is, and how C became so ubiquitous. To do that, we have to go back
to basics and look at some older programming languages. So this episode, we're going to
re-litigate some of the issues with ALGOL, look into what portability really means and why portability
actually matters. Along the way, we'll build up the theory that we need so that next episode,
we can see how C put all those ideas into practice. Now, here's the thing. I'm not just
name-dropping ALGOL. We actually need to get back into that language. I had some deeper coverage of
ALGOL back in my episode on JOVIAL. That's episode 38 if you want to go check it out.
I'm not going to cover its entire history here. Instead, we're just going to focus on what's
relevant today. Algol, initially called the International Algebraic Language, was an attempt to create
a universal programming language.
The effort started all the way back in 1958 and was spearheaded by an international committee
of programmers, computer scientists, and industry experts.
By that time, there were already a handful of programming languages on the scene.
The goal was to just replace all of those
with a single, perfect programming language. ALGOL was planned to be the standard for
international collaboration, built as a tool for expressing programs on paper, and a practical
language for all programmers to use and love. A key factor here, and one that we'll be dealing with
for a large part of this episode, was machine independence. ALGOL was designed so that,
at least in theory, it could be implemented on any computer. The language spec made no assumptions
about hardware, which instructions a processor could run, or how memory was laid out.
The initial 1958 spec didn't even include keywords for handling input or output,
since that would, by its very nature, be machine-dependent. In other words,
ALGOL employed a very high level of abstraction. The reasoning for this choice is pretty clear.
ALGOL was slated to be this universal programming language for the future, so it had to be able to
run on any computer, past, present, or future. In theory, anyone with enough skill could implement
an ALGOL compiler for any computer they wanted. On paper, ALGOL's a wonderful language. It
incorporated features well ahead of its time. It even introduced new syntax that would show up in
languages for decades to come. However, ALGOL itself was never able to unseat its competition.
We could get into a pretty long conversation on why ALGOL failed to
thrive, but there are two key points that I think are most relevant. It was hard to implement,
and it was just too abstracted. The first point here is a little subtle, but I have some direct
evidence. ALGOL is a fairly complicated language. Its code is a lot more structured than its
contemporaries. Functions can call other functions. Functions can have functions defined within them
for some reason. Code is grouped into blocks. Data could be passed by value, but it could also be
passed by reference. Compared to, say, Fortran, there's just a lot more
moving parts going on inside ALGOL. By the early 60s, we start to see some implementations of the
language, but a lot of them are off-spec. One great example here is ALGOL 30. It's a compiler that was developed at Dartmouth for the
LGP-30 computer. But this wasn't a one-to-one implementation. Thomas Kurtz,
future co-creator of BASIC, described the problem this way, quote,
Since the limited size of the LGP-30 precluded a full implementation of ALGOL 60,
certain of its features, arrays called by value, own arrays, strings, and variable array bounds, were omitted, end quote. In other words, ALGOL's a fine language.
The spec was fantastic, but it didn't take real-world constraints into account.
So yeah, it could be the language of the future,
but the 60s, especially the early 60s, just wasn't that future. ALGOL was great. It could work on
any machine in theory, but when rubber met the road, it wasn't all that practical.
Point two: ALGOL abstracted away a little too much.
It's all well and good to keep software folk insulated from the complexities of hardware land,
but only up to a point.
This is especially true when we're dealing with this earlier period in computing.
Sometimes you just need to be able to shift around a few bytes in memory. If you have specialized hardware, you have to have a way to interface with it, or
it might as well just be a really expensive paperweight. ALGOL is a fantastically designed
programming language, but it doesn't provide any way for you to access a computer's actual
hardware. It's nice to live in an idealized world of pure theory. It's fun, it's safe,
it's comfortable, but that doesn't make a computer's lights blink off and on.
That all being said, the ALGOL reports gained a lot of popularity, especially in academia.
The language spec was really well written.
It was full of good ideas waiting to be put into use.
In 1960, a newer spec was released, the fittingly named ALGOL 60.
This helped to make the early 60s a hotbed of development for ALGOL-like and ALGOL-inspired languages.
CPL was one of these new languages,
the Combined Programming Language. It was initially developed in 1963. Work started at the University of Cambridge, but researchers from the University of London soon joined the team. The overall project
was summarized as, quote, the object in developing CPL was to produce a language which could be used for all types of problems,
numerical and non-numerical,
and would allow programmers to exploit all the facilities of a large and powerful computer
without having to escape into machine code, end quote.
At its core, CPL was an extension to ALGOL. It's coming from
the same kind of academic tradition. The new language is even laid out in academic papers
just like ALGOL was. The big difference is that CPL was designed with actual hardware in mind.
The paper I just quoted from, called The Main Features of
CPL, was published in 1963, and it makes this hardware mindset pretty clear. One easy way to
see this is how the CPL paper talks about variables. Right up front, it mentions that
variable types will have limitations dependent on the host hardware and your compiler implementation.
That makes good sense.
At the time, there was a lot of variance between computers.
Importantly, different machines would sometimes represent numbers in different ways.
Memory will have a different layout between, say, an IBM 704 and a PDP-1.
The CPL paper is pretty explicit about this, quoting again,
quote, type integer is a machine-dependent facility, end quote. Any variable is going
to be stored in memory, so it will be impacted by how memory works on the underlying machine.
That's something that ALGOL never talks about,
and something that you just can't get away from in the real world.
Diverging further from ALGOL, CPL was designed with a host of input and output features.
Data streams were handled either as raw I/O or as reads and writes to files. And, of course, that meant more hardware dependence.
Accessing a file is dependent on how files are defined on your computer. Reading keyboard input
is dependent on how your computer is wired up to a keyboard. The CPL spec leaves it at, hey,
we will have input-output routines in the language, and then the rest
is an exercise for the reader.
In practice, that means that anyone implementing a compiler for CPL would have to sort out
how to deal with I/O, but the framework, which is really the important part on the theory side
of things, is right there in the paper.
But here's the thing. CPL, at least
a full CPL, only ever existed as a series of academic papers. There were some pared-down
compilers built, but the spec was never fully implemented. The reason for this is simple.
It's hard as nails to write a compiler.
It's probably one of the more complicated programs that you can really write. CPL solved one of ALGOL's glaring issues, but not the other. It was still a very big and very complicated language. CPL's design took hardware into account, but only so far as adding features for dealing with hardware.
They were still looking at programming as a highly abstracted process.
That all being said, CPL would have a huge impact on the history of programming.
Well, it happened in a roundabout way, but I think that
makes it more interesting. Soon after initial papers were published, a team at Cambridge set
to work on a new compiler for CPL. This is the ill-fated compiler effort that I mentioned.
Martin Richards was a PhD student on that team, and some of CPL's faults sparked his imagination.
Richards had enrolled in Cambridge in 1959, originally studying mathematics but quickly getting wrapped up in computing.
As the early 60s wore on, CPL became a big feature of Richards' early career.
But the big problem with any theoretical work on computers is that
once electrons hit silicon, things get messy. At the time, Cambridge had two main computers,
EDSAC 2 and a single Ferranti Atlas. Initial development of the CPL compiler started on EDSAC 2,
since, at the time, the Atlas computer was still being set up
and used for more pressing research. The plan was that once the Atlas computer had some spare cycles,
the CPL team would get to upgrade. But that introduced a fun little snag in the project.
EDSAC 2 and Atlas were pretty different computers. For the time, that was
normal. There was a lot more diversity in computer design than there is today. And being the early
1960s, there was a very limited range of programming languages to choose from, especially
when you're building something like a compiler. You basically had to use assembly language.
So the CPL team was trapped in this weird position.
They needed to write a new compiler.
They had some time on EDSAC 2 and would eventually have some time on a new Atlas computer.
Normally, that would mean writing part of a compiler,
scrapping that,
and then starting over once access to a new computer was secured. That's not a good position to be left in. It's a good way to waste a lot of effort. Christopher Strachey, one of the designers
of CPL, came up with a workable solution. That was a macro compiler thing that he called the
General Purpose Macro Generator, or just GPM for short. Macros are their own interesting topic that
fits somewhere between assemblers and compilers. The basic idea is that you build up a library of macros. These are little chunks of code that each have an assigned name.
Each macro is a template, so you can plug in different parameters to fill different parts of that code.
In your actual source code file, you just include a call to the macro you want to use.
Then, as your code is processed, all calls to macros are replaced with the actual
macro template. It essentially gives you a shortcut. If you have a chunk of code that you
just keep typing out, well, you can make it into a macro and save some time. GPM wasn't the first
macro system. Macros had been part of programming since at least the 1950s.
They usually showed up in the context of macro assemblers.
That is, assemblers with built-in support for macros.
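To give a feel for the template idea, here's a small example using the C preprocessor. To be clear, this is my own illustration, and GPM's actual notation looked nothing like this, but the plug-in-your-parameters concept is the same:

    #include <stdio.h>

    /* A macro is a named template. Calls to it are replaced with the body,
       parameters filled in, before the compiler ever sees the code. */
    #define SQUARE(x) ((x) * (x))
    #define GREET(name) printf("hello, %s\n", name)

    int main(void) {
        printf("%d\n", SQUARE(7));   /* expands to ((7) * (7)) */
        GREET("world");              /* expands to the printf call in the template */
        return 0;
    }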
But there's a twist to how GPM worked, or at least how it was used.
Usually, macros are used to save time when programming.
It's a convenience thing. But Strachey designed GPM as this portable quasi-programming language slash stack machine. Now, this gets really weird
and pretty technical, so buckle up. Generally speaking, GPM could be set up to work with any kind of macros.
You just need to define your set of macros and then call them as needed in your source code.
The trick here was how you defined your macros.
You could just have a library of useful functions, or you could do something a lot more complicated.
Strachey used GPM to build up what's called an abstract machine.
That's a simplified computer that only existed inside GPM's macro library.
This machine supported a slate of simple stack-based operations that mimicked how CPL functioned.
Basically, Strachey just went,
well, the perfect computer for CPL doesn't exist,
so let me figure out what that would look like.
And the details for everything else, physical hardware included, could come later.
Now, here's what really matters,
and here's why this approach is so smart.
Instead of writing the entire CPL compiler in assembly language, the team made very aggressive use of this new set of macros. Each operation in
CPL was implemented as a set of macros, so the actual source code for the compiler had very
little assembly language in it. In other words, it was really close to
machine-independent. The compiler's code wasn't working with EDSAC 2 registers or handling Atlas's
specific type of hardware access. It was all written for this abstract machine that Strachey
designed. The final puzzle piece, and what made GPM actually work in this
context, was the macro library. That was written in assembly language, so that part was machine
dependent. Remember that every call out to this abstract machine has to be replaced with a chunk
of macro code from GPM. However, it wasn't as much code as the compiler. It was just a set of
simple definitions that mapped instructions from Strachey's abstract machine to real-world
assembly language. We're talking an order of magnitude less code than the compiler itself.
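Here's a rough sketch of that split, again using the C preprocessor as a stand-in rather than GPM's real notation, with made-up operation names like PUSH and ADD. The point is that only the macro definitions know about the real machine; everything written in terms of them does not:

    #include <stdio.h>

    /* A toy "abstract machine" with two operations. Only these macro
       definitions are machine-dependent; the code written in terms of
       them never mentions real registers or real memory.
       (Illustrative C preprocessor only, not GPM's actual syntax.) */
    static int stack[16];
    static int sp = 0;

    #define PUSH(x) (stack[sp++] = (x))
    #define ADD()   (sp--, stack[sp - 1] += stack[sp])

    int main(void) {
        PUSH(2);                    /* the "portable" part of the program */
        PUSH(3);
        ADD();
        printf("%d\n", stack[0]);   /* prints 5 */
        return 0;
    }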
To adapt the CPL compiler from EDSAC 2 to Atlas, a process known as
porting, all the team had to do was write up a new set of macros. That's a pretty simple process
compared to rewriting an entire compiler, and it's a process that Richards would have taken part in.
But even with GPM and better hardware, and all the academic papers in the world,
CPL never made it into the spotlight.
The compiler became something of an ongoing project.
Eventually, a minimal compiler was up and running,
but it took years of work and it only implemented part of CPL's spec. But Richards wouldn't be around to see this personally. He was moving on to newer pursuits. In 1966,
Richards completed his PhD and left Cambridge, taking up a job as a researcher at MIT.
Richards' thesis had been on CPL, specifically the struggles with implementing
the new language. So, now outside of Cambridge, Richards set to work on an ambitious new project,
or rather, a new scaled-back language. He called it BCPL, which, depending on what you read, either stands for Basic CPL or, more evocatively, Bootstrapping CPL.
What's that? You haven't heard of BCPL?
Well, that's not surprising.
Most people haven't, or if they have, they've only heard it in passing.
BCPL would become the direct inspiration for C. A lot of the ideas that
Richards put into BCPL will find their way into the later language. The big one, the huge idea
that would give us C and arguably make Unix so successful, is bootstrapping. This is also
sometimes called self-hosting, and it's what BCPL was all about.
Richards' core plan was to take the work being done on CPL's compiler to its logical conclusion.
I mean, in retrospect, it seems logical, but it may not have been that cut and dried at the moment.
GPM, the macro system that was used to make CPL portable, was already starting to look and act in a similar manner to CPL. The syntax
was different and it had a reduced functionality, but GPM certainly had a touch of CPL in it.
The next step beyond that was to make a compiler written entirely in the language it was
compiling. That is, a self-hosting compiler. This probably sounds a little brain-twisting,
and it kinda is. The trick here is that you essentially write two compilers for your new
language. First, you take the normal route.
You write up a compiler in some other language.
In this case, Richards would have been working with assembly language directly.
Then, once that's up and running, you write a second compiler,
this time in your brand new programming language.
You use the first compiler, sometimes called the bootstrap compiler,
to build your second compiler. By the end of the process, Richards would have a BCPL compiler
written entirely in BCPL itself. The immediate question here should be,
why? Why on earth would anyone want to do this? Well, it all comes down to a core tenet of
programming. It's better to be lazy, and sometimes being lazy takes you down really weird routes.
Believe it or not, creating a self-hosting compiler can end up saving a lot of time, at least if we're looking
at the extreme long term.
It's a little counterintuitive at first, but self-hosting makes a compiler a lot more
portable.
To best explain how that all works, I think we should actually dive into BCPL and Richards'
implementation of this new language.
In general, BCPL is a subset of CPL with some very strategic changes. In the manual, BCPL: The Language and its Compiler, Richards explains that BCPL, quote, adopted much of the syntactic richness of CPL, end quote, while striving for the same overall goals. From that, let's make a quick roadmap for what Richards wanted to accomplish as of 1966.
We've hit most of the main points already, but I think a summary should help keep us focused.
Richards wanted to make a language that was like CPL, and by extension, like ALGOL.
Those earlier languages had fantastic syntax and structure,
so they made for a good blueprint to follow.
After the whole CPL compiler fiasco,
he wanted a language that was easy to create a compiler for,
and this new language had to be capable enough
to use for writing compilers itself.
It turns out that all of these goals are deeply
interdependent, so let's look at how Richards' implementation of BCPL kind of ticks the check
boxes next to these features. As always, my go-to for a quick and dirty analysis is the language's
typing system. If it isn't abundantly clear, the reason I always go for variables as quick as possible
is that it tells us how the language deals with memory.
And that ends up being really, really important to basically all programming languages.
For BCPL, this is a really easy analysis.
There is only one variable type. That's it. No integers,
no floating points, no strings, no Booleans, at least not explicitly. The only type that
Richards provides is the so-called bit pattern. Definitely a reduced form of CPL to say the least, but there's some really cool
method to this apparent madness. In BCPL, a bit pattern is just some generic chunk of data
somewhere in the computer's memory. That genericness is what makes things so interesting.
Computer memory, at least in almost all cases there are always some caveats so we have to be careful,
is laid out as a linear series of little cells of data.
Each of these cells has its own address, starting at zero and going up to however much memory you have installed.
The size of these cells is dependent on the computer. In modern x86 systems, and really most recent computers, each cell is 8 bits wide,
aka each cell holds a single byte of data. But back in the 60s, there was a lot more variance
from computer to computer.
To be a little more precise, we'd say that modern computers use an 8-bit wide byte.
But historically, that hasn't been a constant.
Some computers used 6 bits to a byte, some had 12-bit wide bytes, and some even addressed by words,
which in modern parlance usually means two or more bytes,
but at the time could have variable or fixed lengths.
It becomes an issue.
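As a quick aside, C itself still hedges on this point. The language only guarantees a minimum byte width, and you can check what your machine actually uses. This is my own illustration, not anything from the episode:

    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        /* CHAR_BIT is how many bits one addressable cell holds on this machine.
           It's 8 nearly everywhere today, but the standard only promises "at least 8". */
        printf("bits per byte here: %d\n", CHAR_BIT);
        printf("bytes per int here: %lu\n", (unsigned long)sizeof(int));
        return 0;
    }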
Now, that may sound like a small detail,
but it turns out that the size of these cells of memory makes all the difference in the world.
It determines how a programmer deals with data,
what kind of math is easy to do,
and how characters can be encoded. This memory bit width determines a whole host of factors
for a programmer. So how can you make a programming language that works on all these various systems,
no matter how wide their bytes are? The solution that
Richards arrived at was to just define some primitive variable. In this case, it's called a
bit pattern, a single cell of memory. It's just whatever the computer uses, a byte by any size.
By keeping to one generic data type, BCPL just kind of drops the whole conversation. Oh, what size should you define integers at? According to BCPL, that doesn't
matter. You don't get integers. You don't get anything. You just get a chunk of data. And that simplification, that almost elegant approach,
saves a lot of issues. The rest of the language is built around this core idea.
The values you work with in BCPL are just generic chunks of data. They don't have any explicit meaning,
so it's up to the programmer to decide how to use that data.
It's a very assembly language style approach that I kind of love. If you want to work with an
integer, you just define a variable and put a number in it. Then going forward, you have to
remember to treat that variable or set of variables like an integer. If you need to store a character, just do the same.
Define a variable, load it up with a character code, then call it a day.
BCPL also has support for arrays.
So if you need a list of integers, you can just define an array and treat it as such.
To the computer, it's just data.
So do with it as you will. BCPL gives you that flexibility.
On the surface, that sounds really limiting, but there's another feature that makes things
viable. BCPL uses these cool little things called pointers. That is, you can define a variable as some address in memory.
It's a little confusing at first, but the idea is that way you can create a variable
that points to another variable or to some chunk of RAM.
Taken together, pointers and bit patterns make a really robust way to manipulate data.
And really, that just means they're a good way to deal with memory.
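Since BCPL code itself would be unfamiliar to most ears, here's a loose approximation in C instead. The word typedef is my own stand-in for BCPL's bit pattern, not anything from the BCPL spec:

    #include <stdio.h>

    typedef unsigned long word;      /* stand-in for BCPL's single "bit pattern" type */

    int main(void) {
        word cell = 65;              /* is this the number 65 or the letter 'A'? up to you */
        printf("%lu\n", cell);       /* treat it as an integer... */
        printf("%c\n", (char)cell);  /* ...or as a character, your call */

        word list[3] = {10, 20, 30};
        word *p = list;              /* a pointer: a variable that holds an address */
        printf("%lu\n", *(p + 2));   /* follow the pointer, offset by two cells */
        return 0;
    }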
What makes this combo all the more interesting is how low level it actually is. Like I mentioned,
it's a very assembly language thing to do. When you work with assembly language,
if you need a variable, you just grab a chunk of memory, throw some data in there, and then kind of have to remember what was in that memory space.
In contemporary languages, you didn't really deal with memory in that way.
CPL and ALGOL are examples, but so are FORTRAN and LISP.
All those languages went to extreme pains to keep programmers away from raw memory access.
In BCPL, you have access to any location in memory within reason.
In fact, to do something as simple as manipulating text data,
you have to deal with raw memory.
It's a low-level approach to data, but it's wrapped up in a higher level language. Like I mentioned,
BCPL's major design goals were all pretty intertwined. This simplification and low-level
variable system served a dual purpose. It's just an effective way to handle data. But it also made
creating a compiler for BCPL a whole lot easier. I know we're talking a
lot about memory, and that's with good reason. It was a big source of variance in the 60s and
a big roadblock to portability. But there's one last key to this whole memory mess, and it takes
us back a little bit to Richards' experience working on the ill-fated CPL compiler.
GPM used kind of an ad hoc abstract machine to create platform-independent code.
BCPL took that a step further.
The design that Richards came up with used a much more formally specified abstract machine. The most popular BCPL manual, simply
called BCPL: The Language and its Compiler, calls this the object machine, or
idealized machine. Now, object machine is a bit of a confusing term here, because objects mean
something else in regards to programming. But I think idealized
machine is a little bit better. Like with GPM, the basic idea is to define some computer that
your code can run on. Then you write your compiler to create an executable for this idealized machine.
That machine doesn't exist. It's only a spec. The final step, and the only step where hardware matters, is converting from an idealized machine program to an actual
executable that runs on real hardware. That sounds complicated. Conceptually, it is complicated.
complicated. Conceptually, it is complicated. But at its core, Richards was taking the easiest route possible. A later essay that he wrote, quote, the BCPL abstract machine was almost
the same abstract machine of macro calls used in the implementation of CPL. But since its statements
were to be generated by the front end of the compiler and read by a code generator,
there was no need for it to consist of textual macro calls.
End quote.
To translate a little, your code started at the front end of the BCPL compiler.
From there, it's read, parsed, broken up, and turned into instructions for Richards' abstract machine.
That machine borrowed a lot from his work at Cambridge. That abstract code is then put into
the code generator. That's the hardware-dependent part that turns abstract code into a real executable.
That intermediate code is never touched by human hands. It's written by a computer and read by another computer program.
So there's no need to get fancy.
Richards implemented that layer as just a stream of numbers.
No text, no structure.
It's easy to write a program that understands numbers.
It takes far fewer operations, so why not make
things as easy as possible? Another huge factor that simplified BCPL was its intended scope.
Richards wasn't making a 100% general-purpose language. BCPL wasn't designed to take on every possible task. Instead, he focused on the niche of system programming.
That is, BCPL was intended to be used for compilers, operating systems, and other low-level tools.
As it turns out, and this is kind of funny to me, those kinds of programs use very little mathematics. At the most, you end up doing
integer math. The largest numbers you have to deal with are memory addresses, and those are all
simple whole numbers. So that let Richards get away with having no native floating point support,
and a very pared-down mathematics system. BCPL's spec is rounded out with its own
character set and tokens. Now, once again, this is going to sound pretty nitpicky, but I think it's
a really important part of the discussion. Basically, this syntax and character token
discussion is how BCPL looks. Which characters mean what in the language, which
characters execute instructions, and which are just for nice formatting. It's central stuff to
any programming language. And just like with other design choices, Richards went with the easiest and
most elegant solution at least wherever possible. Now, I say wherever possible because we're still
dealing with computers. Rule one of computers, they're little machines that tend to ruin your
plans. And in the 1960s, we're still dealing with a diverse cast of digital systems. This is one
place where the portability goals of BCPL led to issues. At MIT, Richards was
working with IBM computers. Back at Cambridge, he had been using totally different systems.
In this era, there weren't widely adhered-to standards for character codes. So we get this
weird, and I personally think delightfully weird, situation where Richards found a way to conquer portability.
He was able to abstract away memory and processor differences with ease.
But he hit a roadblock when it came to which characters his new language could actually use.
The best example of this is the curly bracket.
It's a really common character in more
modern programming languages. It's the little open and closed curved bracket with the sharp
point in their middles. The terminals and computers back at Cambridge supported that character.
MIT's IBM systems, they just didn't. This ended up being a bit of a problem because Richards wanted to use the curly bracket to
enclose blocks of code.
This idea was part of ALGOL and CPL, basically just a way to group code for execution.
Things like loops or if statements all executed a single block of code that followed them. In ALGOL, blocks were
defined using the words begin and end. So you'd have to say something like, if x is greater than
y, then begin your block of code and then follow that up with an end. It works fine, but it's a little verbose, so Richards wanted to just replace the begin and end with an open and closed curly bracket.
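To picture the difference, here's a small illustration of my own, with the ALGOL-style version paraphrased in a comment and the curly bracket version written as ordinary C:

    /* Roughly how the ALGOL and CPL family would group a block, paraphrased:
         if x > y then begin  x := x - y;  count := count + 1  end          */

    int count = 0;

    void step(int x, int y) {
        /* The same grouping with curly brackets, the way BCPL wanted and C does: */
        if (x > y) {
            x = x - y;            /* only changes the local copy; this is just for shape */
            count = count + 1;
        }
    }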
Speculating here, but I think there's another good reason for this change.
This is mainly me talking from experience programming assembly language and not explicitly backed up in the sources, so
do with it as you will. Processing textual data or string manipulation is actually not that trivial.
It can take a lot of code and processing power to make sense of human-readable text on a computer,
especially when you're working in a low-level language.
A string like begin is really easy for us squishy human folk to write, read, and understand. But for a program, it adds an extra level of complexity. A computer sees begin as five characters,
maybe six if you have a terminating zero. When you get down to the
processor level, you can only actually compare one character at a time. It's just how processors
and register machines work. So to look at some input data and find the word begin, you need to
run five successive comparisons. Adding stuff to manipulate memory and move around
as you're doing comparisons bumps that up to some more instructions, but let's just say
five instructions is the bare minimum amount of complexity. Let's say you're also trying to look
for the word end. That means each incoming word takes up to eight compare instructions to process.
As you add in more keywords, that balloons the amount of processing you need to do.
Now, to the rescue, up in the sky, what could it be?
It's the curly bracket.
You can now enclose a block of code with just two characters.
That means that your program only has to do two comparisons to look for those brackets.
Over time, this kind of instruction pinching really does add up.
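To put some hypothetical code behind that speculation, here are two toy scanner checks in C, entirely my own illustration, one hunting for the keyword and one for the bracket:

    #include <string.h>

    /* Is the next token the start of a block? Matching the keyword means
       up to five character comparisons; matching a single bracket is one
       comparison and you're done. */
    int starts_block_keyword(const char *src) {
        return strncmp(src, "begin", 5) == 0;
    }

    int starts_block_brace(const char *src) {
        return src[0] == '{';
    }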
But the problem was that IBM mainframes, at least at the time, didn't do curly brackets.
So Richards had to compromise.
He ended up using $( and $), a dollar sign followed by an open or close parenthesis, for begin and end of block.
Some later versions of the compiler ended up supporting curly brackets, but that was
once IBM got with the program. By 1967, the very first version of BCPL was completed,
and that's including specification and compiler. Richards had taken a much more pragmatic approach
to language design, and it really paid off. Instead of being some low-level computer language
that few could understand,
or some high-level abstracted language that only worked in academic papers,
BCPL met computers in the middle.
It had features of high-level programming languages,
but it also had access to the low-level bits of a computer.
And it could be implemented with relative ease. Smart choices, both in the language's overall design and its finer features,
made BCPL easy to compile.
And by adapting GPM's abstract machine model,
Richards was able to make porting BCPL much easier compared to contemporary languages.
As the language spread, it became a very popular choice for programmers. It made its
way through MIT's CTSS community, off into parts of Multics, and eventually into the hands of some
young and excited programmers all the way off at Bell Labs. This brings us to where BCPL really made its mark on history, and this brings us deep into
the early days of Unix.
For those not in the know, aka those who actually go outside sometimes, Unix is an operating
system.
In fact, it's probably one of the most important operating systems out there.
It was initially developed in 1969 at Bell Labs.
During the early 70s, it spread like wildfire, and in the modern day, Unix and its derivatives
run on just about any computer you can think of. Windows is really the last great bastion against
the Unix-like tides, but even that's starting to change. But before it blew up onto
the scene, Unix was a side project hammered out on spare hardware deep inside AT&T's Bell Labs.
This project was started in 1969 by Ken Thompson, a programmer at Bell who had recently come off working on Multics. Now, very broadly speaking,
Multics was the spiritual predecessor to Unix. It was a large-scale partnership between a handful
of companies, Bell Labs among them, to create a secure timesharing operating system. The idea
was that Multics would be the backbone of computer infrastructure that would be used as a widespread utility service.
Bell and AT&T in general were really eager to get in on the action.
At the time, there wasn't a one-size-fits-all timesharing system out there, and Bell wanted one for internal use, and maybe for some more ambitious projects too.
But Multics was perhaps over-ambitious.
Timelines would slip further and further behind,
and by 1969, Bell pulled out of the partnership entirely.
But that left Bell with a bit of a problem.
They still didn't have the complicated timesharing system they were after.
And on a smaller scale, programmers like Ken Thompson and Dennis Ritchie just
plain liked working on Multics. It was a nice environment to work in. The entire system was
written in a high-level language. Sure, it had some problems, but it was a fun playground for
programmers. Back at Bell, Thompson was slowly
deciding that he wanted to create a little slice of Multics at home. As lore goes, Thompson found
a barely used PDP-7 computer tucked away near his office, so he set to work writing his own
operating system on the machine. Pretty soon in the process, another programmer named
Dennis Ritchie joined up with Thompson. It was in that lab that a very early version of Unix would
start to take shape. Just as a quick aside, at this point it wasn't called Unix. I don't know
if it even had a name. The label would come later on in the 70s, but I'm just going to keep calling it Unix because
I don't want to have to go over and over again saying the system that would one day be known
as Unix. Anyway, from the start, there were two huge hurdles to face. First was the platform that
the duo were working on. The PDP-7 was chosen for one big reason. No one else
was really using it. It wasn't very powerful, and it only had 8K words of memory. That was a huge
limit to what Unix could be. Second, and intertwined with the first issue, Thompson and Ritchie were
working directly in assembly language.
This was partly due to a lack of tools on the PDP-7, and partly because they were working on such a low-level project.
Now, I'll be the first to admit that assembly language does have some problems.
It's definitely one of my favorite ways to bash bits, but it's not always the best tool for the job.
For one, it's not portable.
Thompson and Ritchie knew that if Unix continued, it needed to move off the PDP-7.
As the system grew, so did its requirements, and 8K of RAM isn't enough space to really grow.
The other issue was on the human side of things, maintainability.
Discounting writing raw machine code in binary, assembly language is one of the hardest
languages to program in. It's terse, you have to remember cryptic instruction names, and you need
to know exactly what your codebase is doing. For large projects, especially ones with multiple programmers,
that last part becomes a real issue.
You have to keep meticulous documentation about which parts of memory are used by different code,
how registers should be treated,
and everyone has to write code that follows that convention or things break. From the outset,
Thompson and Ritchie knew they wanted to move away from assembly and start working in a higher
level language. And from the outset, they had a choice in mind. Quoting from a talk that Thompson
gave, quote, after Unix was up or simultaneous with Unix coming out, BCPL was just emerging, and that was a clear
winner for both of us. Both of us were really taken by the language and did a lot of work with
it." BCPL had everything that the Unix crew were looking for. It was built for portability. It
could handle low-level tasks while still being encased in a high-level
language. And it was easy to implement a compiler for. But that's assuming you had access to a
capable computer. The PDP-7 just wasn't capable enough for a BCPL compiler. Turns out you couldn't fit an entire compiler for the language in 8k of code.
There were experiments with other languages, but none really stuck. None fit the exact niche that
the crew wanted. Notably, Douglas McIlroy, another early addition to the unofficial team,
ported a language called TMG to the PDP-7. Now, TMG is particularly notable
here because it was a language designed for writing compilers. So not the most general-purpose thing
in the world, but a type of glue that would help bootstrap something bigger. TMG was a step forward,
but the crew still wanted a capable programming environment.
One of the things that they liked about Multics was that there were ready tools for programming and debugging.
So servicing that became somewhat of a project within the side project.
This would be the next spark towards progress.
As the lore goes, Thompson first set out to write a Fortran compiler for the new system. Ritchie described the series of events like this, quote, the intent to handle Fortran lasted about a week. What he produced instead was a definition of and a compiler for the new language B, end quote. So after some initial work, a reference compiler
for this new thing called B was created. Then that compiler was eventually rewritten in B itself.
By late 1969 or early 1970, B was fully self-hosting.
So, what was this new language? Simply put, it was a stripped-down BCPL, small enough that an entire compiler could be written in less than 8 kilobytes of code. If there was anything like a punchline in this episode, then here it is.
B is a simplified BCPL. BCPL is a simplified CPL. CPL is a simplified and slightly upgraded
ALGOL. In other words, we're looking at this oddly direct chain of pragmatic compromises,
slowly going from theory to practice.
B is where that chain made it fully outside of academia and into the world of industry.
So what did the new language look like? How did Thompson's proclivities impact B?
The go-to litmus test here is easy. B uses the same variable typing system as BCPL. You get one type, it's basically a bit pattern by another name. The B reference manual, written by Thompson
a few years into development, calls it a word. It's not exactly the same as early BCPL implementations,
but the difference is negligible. A word in this context
is just the native size of data that a processor works off of. It's not always a single byte,
sometimes it's two bytes, sometimes it's more, sometimes it's less, but it's the same idea as
a bit pattern. It's a generic chunk of data. The character set and syntax are the most noticeable changes here.
According to Ritchie, Thompson was always a fan of, quote, Spartan syntax. Why use a lot of words
when a few well-placed characters will do the trick? And luckily, that view aligned with B's
constraints, namely the memory limitations of the fantastic PDP-7 platform.
In general, Thompson pared down BCPL's character set, adjusting and tightening the language.
The awkward begin and end tokens, represented by $( and $) in BCPL,
were replaced with fully-fledged curly brackets.
DEC machines supported the character, so it was a very logical upgrade.
Another big change was how assignments were written.
Once again, this is a pretty nitpicky detail, but I find this really interesting to think about.
In ALGOL and its derivative languages, and that includes BCPL, you assign a value to a variable by typing variable colon equals value.
That's a holdover from ALGOL's formalized roots. The colon equal is used in mathematics to signify
a definition, so it made sense to put it into a new
formalized programming language. Comparisons in these ALGOL-like languages use a single equals.
That is, you could just type x equals y to see if the two variables were equal. Whenever I look at
these older languages, that's something that always trips
me up since it's not modern convention. There was a flip that happened. It didn't start with B,
but B put together a lot of little tweaks in one package that looks resoundingly modern.
Thompson replaced colon equals with just equals for assignment, and the single equal that was used for comparisons was replaced
with the double equal equal. In the current day, languages tend to follow this new convention.
So to the modern eye, just this change in how equals is treated plus curly brackets
makes B look very familiar. Once again, Ritchie mentions the change of assignment briefly in a few essays,
specifically in a 1993 article called The Development of the C Language. Ritchie states
that a lot of these small changes were a, quote, matter of taste, which, yeah, I get it. Syntax can
definitely be a matter of taste, but if I can go off the rails here for a second,
the assignment operator in particular is really interesting to me.
The simple fact is, over the course of a program, you're going to have a lot more assignments
than comparisons.
I was just looking at some random code I had downloaded, and it's a 10 to 1 ratio.
10 times more assignments than I
have comparisons. By far, it's just more common to move around data than it is to compare data.
So flopping things around, making assignment take only one character while comparisons take two,
that definitely saves some space in source code and it saves some time when typing.
It's not the biggest deal in the world, but I think that's a really neat tweak.
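Here's the flip side by side, my own illustration again, with the older convention paraphrased in a comment and the B-style convention written out as C:

    /* The older ALGOL, CPL, and BCPL style, paraphrased:
         x := y;           assignment uses colon-equals
         if x = y then ... comparison uses a single equals       */

    int x = 0;
    int y = 5;

    void demo(void) {
        x = y;            /* B's convention: one character for the common operation */
        if (x == y) {     /* and two characters for the rarer comparison */
            x = x + 1;
        }
    }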
The other interesting piece that I just have to throw in is that B introduces the plus plus and
minus minus operators. Those increment or decrement a value by one. They're really common in use today, and they started off
in B. Implementation-wise, B diverges slightly from the track laid down by BCPL. The B compiler
created threaded code. This is a pretty interesting technique that's somewhat close to how GPM or BCPL
compilers worked, but a little bit different. Normally, a compiled program
is just a series of normal machine code instructions. You can't tell the difference
between an assembled and a compiled program unless you know what you're looking for. Well,
for B, things are a little bit different. B worked off a rough abstract machine, but nothing quite as formalized
as BCPL's idealized machine. But in B, calls to this abstract machine weren't replaced with chunks
of code for some physical target machine. Instead, each instruction was just replaced with a function call. This means that the final machine code for a B program ends up looking distinct.
Just about each instruction calls out to some chunk of code held in a library of functions.
In practice, the portability of B was accomplished by rewriting that library of calls for a new
platform.
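Here's a small sketch in C of the general idea, not B's actual PDP-7 implementation, and with operation names I made up. The compiled program boils down to a list of calls into a machine-dependent library of little functions:

    #include <stdio.h>

    /* A tiny machine-dependent "runtime library": one function per
       abstract operation. Porting means rewriting only these. */
    static int stack[16];
    static int sp = 0;

    static void op_push2(void) { stack[sp++] = 2; }
    static void op_push3(void) { stack[sp++] = 3; }
    static void op_add(void)   { sp--; stack[sp - 1] += stack[sp]; }
    static void op_print(void) { printf("%d\n", stack[sp - 1]); }

    int main(void) {
        /* The "compiled program" is just a sequence of calls into that
           library, and each call is an instruction of its own. */
        void (*program[])(void) = { op_push2, op_push3, op_add, op_print };
        for (unsigned i = 0; i < sizeof program / sizeof program[0]; i++)
            program[i]();
        return 0;
    }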
The ease of portability with threaded code is
similar to BCPL's implementation. Not much has to be rewritten for a new platform. In fact,
threaded code may make creating the compiler a little easier in the first place. You basically
skip the whole instruction-replacement step, so one fewer step is really nice. However,
there is a downside. Threaded code tends to be slower than other approaches. Each line of your
final program has a call, and a call is its own instruction, so that eats up cycles. B was
functional, but a little bit hamstrung by this implementation.
In the coming months, other issues would come to light.
B would see some limited use inside Bell Labs, but it never beat out assembly.
The Bell crew still wanted to rewrite Unix in a high-level language,
but B just wasn't quite the right tool for the job.
So as they transitioned to a new computer and the team started to grow,
Dennis Ritchie set to work on a replacement for B.
That second language would one day be known as C.
Alright, that does it for the start of our exploration of the C programming language.
This has been a bit of a heavy episode and actually pretty light on C itself, but I think the context here is really, really important.
C, and really any programming language, doesn't exist in a vacuum.
Each has its own lineage.
For C, that family tree just happens to take us back all the way to time immemorial.
When you get down to it, the road to C can be characterized by a process of slowly transitioning
from theory into practice.
ALGOL starts in the lofty heights of pure theory, a universal language designed to
take on all challenges. CPL adapts those ideas into a more useful system, trying to make an
abstract idea practical. Once programmers try to implement CPL, more and more practical issues
start to pop up. One of those is portability, how easy it is
to move software from computer to computer. Eventually, Martin Richards remixes CPL into
BCPL, where he tackles the issue of portability a lot more formally while also retaining low-level
control of the underlying computer. He does this all in a very academic
environment with very formalized goals. The ideas, tricks, and technology that Richards put into BCPL
find their way off campuses and eventually into the hands of programmers at Bell Labs.
From there, theory is fully reduced to practice. That is, the team working on Unix
starts to figure out how things like portability can work in the messy world.
That's where I'm leaving things for now. Next week, we're going to be looking at the early
development of C, how B transformed into the world's most popular programming language. And most importantly, we're going to see how C took on the world.
Thanks for listening to Advent of Computing.
I'll be back in two weeks' time with the next part of my series on C.
And hey, if you like the show, there are now a few ways you can support it.
If you know someone else who would like the show,
then why not take a minute to share it with them? You can also rate and review on Apple Podcasts. And if you want to be a super fan, you can support the show directly through Advent of
Computing merch or signing up as a patron on Patreon. Patrons get early access to episodes,
polls for the direction of the show, and bonus content. You can find links to everything
on my website, adventofcomputing.com. If you have any comments or suggestions for a future episode,
then go ahead and shoot me a tweet. I'm at Advent of Comp on Twitter. And as always,
have a great rest of your day.