Advent of Computing - Episode 53 - C Level, Part I
Episode Date: April 4, 2021
C is easily one of the most influential programming languages in the world, and it's also one of the most popular languages in the world. Even after close to 50 years it remains in widespread and sustained use. In this series we are going to look at how C was developed, how it spread, and why it remains so relevant. To do that we need to start with background, and look at what exactly influenced C. This episode we are diving into some more ALGOL, CPL, BCPL, and eventually B. Like the show? Then why not head over and support me on Patreon. Perks include early access to future episodes, and bonus content: https://www.patreon.com/adventofcomputing
Transcript
It's impossible to say that there's any one best programming language.
That's just a plain fact.
Some are better suited for specific tasks.
Some have traits that are universally loved, while others have quirks that are objectively bad.
It even gets sticky trying to figure out which language is the most used.
Nowadays, there's definitely a lot of code publicly available.
We can scrape that from the internet to start getting an idea of trends, but that only gives
us a sliver of the much larger code pie. You can't really go around asking companies for line counts
of proprietary software. That is a quick way to get kicked out of someone's lobby. That all being said, we do have a
definite ranking for popularity of different programming languages. As of March 2021, the most
popular language in the world is, drumroll please, C. That's at least according to the TIOBE index.
Their ranking reflects which languages are discussed the most online,
hence popularity not necessarily usage. Since the index started, C and Java have been the
perennial top two contenders. They trade off the number one spot back and forth for years.
But here's where things get wild, at least in my mind. C isn't a newcomer. Its first iteration
appeared in 1972. That's nearly 50 years ago. Discounting contenders like assembly language,
C is by far the oldest language anywhere near the top of the TIOBE index.
Deepening the puzzle, most of the other top languages cite C as a major influence.
Java was originally developed as an alternative to C. Another one of the top 10 languages,
C++, was developed as an extension of C. Even seemingly unrelated languages like Python have
drawn influences from C. Adding to the connection,
most major versions of Python's interpreter are written in C. Everywhere you look, it seems that
there's some connection back to this venerable language. So when I see something like this,
one big question springs to mind. What exactly influenced C itself?
Welcome back to Advent of Computing.
I'm your host, Sean Haas, and this is episode 53, C-Level, part 1.
This time, we're getting back into what's one of my favorite topics, programming languages.
As long-time listeners know, or really anyone can guess, I'm a huge fan of programming both in my personal and professional life. It's a practice that I find very fulfilling and
fascinating. And, like anything related to the digital realm, programming has its own set of sagas.
Today, we're starting a deep dive into C, one of the most popular and influential programming languages out there.
And along the way, we're going to go through what I think is a very fascinating saga of the digital past.
Now, I'd almost say that C is the Latin of computer programming, but that's not 100% accurate as a comparison.
Anyone who's programmed long enough has been exposed to C. It's kind of a rite of passage.
You see it used as both example code and living projects.
Most modern languages borrow a lot of syntax from C, which means that even programmers that haven't used C directly find the language recognizable. Sure, it's been given
facelifts over time, derivatives and dialects have hit the scene, but C remains a very popular
and widely used language nearly 50 years after its creation.
Unix is one big reason behind C's popularity.
In the early 70s, Unix, an operating system developed at Bell Labs,
made its way into classrooms and onto computers around the world.
By that point, the operating system was written almost entirely in C.
In fact, the language was initially developed for that very purpose. So as Unix rose to prevalence, so too did C. A major contributing factor here was that
early versions of Unix were distributed as uncompiled source code, so anyone using Unix
was exposed to its particular language. This same code was also used as a teaching aid at
many universities, so C became a pretty big deal pretty quickly. But herein lies something that's
always kind of bugged me. In the current day, C is really seen as a general-purpose programming
language. One of its huge keys to success is that it's portable. Just
about any computer you can find has a C compiler available for it. So just about any computer you
can think of runs some kind of software written in C. And believe me, a whole lot of software
is written in C. But here's the caveat. C was designed to fit a very specific niche. It
was developed as a language for system programming. And you can see that laid bare in the language
itself. C isn't as low-level as, say, assembly language. You aren't dealing directly with
the computer on its terms, but you can get close if you want to.
C can directly address memory.
In a lot of cases, you have to be comfortable manually moving around bits and bytes to get the most out of C.
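Just to make that concrete, here's a tiny sketch in modern C. This is my own example, not code from the episode or its sources, but it shows what manually shuffling bytes around with a pointer looks like:

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        unsigned int value = 0x12345678;
        unsigned char buf[sizeof value];    /* a raw chunk of memory, byte by byte */

        memcpy(buf, &value, sizeof value);  /* copy the integer's bytes in directly */
        buf[0] ^= 0xFF;                     /* flip every bit of one byte by hand */

        unsigned char *p = buf;             /* a pointer aimed straight at that memory */
        printf("first byte is now 0x%02X\n", *p);
        return 0;
    }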
In Java or Python or even ALGOL, you just can't do that.
You flat out can't.
Those languages deal with the computer on a more
abstracted level. The programmer is kept separate from what's going on down lower. That kind of
abstraction is seen as a good thing. It's often seen as an essential part of a quote-unquote
good programming language. But in that respect, C breaks from convention,
and it breaks pretty hard. Just looking at it alone, it seems that C's in a class separate
from other popular and more modern languages. But looks can be deceiving. There's something
more interesting going on just below the surface. Now, as you can probably tell, this is the start of a series.
I tried to do this all in one episode, but I just can't.
There's too much to the story, and once my research got going,
I realized that there was a lot to C's predecessors that I didn't want to rush through.
The overall goal of this series is to try and understand
what made C how it is, and how C became so ubiquitous. To do that, we have to go back
to basics and look at some older programming languages. So this episode, we're going to
re-litigate some of the issues with ALGOL, look into what portability really means and why portability
actually matters. Along the way, we'll build up the theory that we need so that next episode,
we can see how C put all those ideas into practice. Now, here's the thing. I'm not just
name-dropping ALGOL. We actually need to get back into that language. I had some deeper coverage of
ALGOL back in my episode on JOVIAL. That's episode 38 if you want to go check it out.
I'm not going to cover its entire history here. Instead, we're just going to focus on what's
relevant today. Algol, initially called the International Algebraic Language, was an attempt to create
a universal programming language.
The effort started all the way back in 1958 and was spearheaded by an international committee
of programmers, computer scientists, and industry experts.
By that time, there were already a handful of programming languages on the scene.
The goal was to just replace all of those
with a single, perfect programming language. ALGOL was planned to be the standard for
international collaboration, built as a tool for expressing programs on paper, and a practical
language for all programmers to use and love. A key factor here, and one that we'll be dealing with
for a large part of this episode, was machine independence. ALGOL was designed so that,
at least in theory, it could be implemented on any computer. The language spec made no assumptions
about hardware, which instructions a processor could run, or how memory was laid out.
The initial 1958 spec didn't even include keywords for handling input or output,
since that would, by its very nature, be machine-dependent. In other words,
ALGOL employed a very high level of abstraction. The reasoning for this choice is pretty clear.
ALGOL was slated to be this universal programming language for the future, so it had to be able to
run on any computer, past, present, or future. In theory, anyone with enough skill could implement
an ALGOL compiler for any computer they wanted. On paper, ALGOL's a wonderful language. It
incorporated features well ahead of its time. It even introduced new syntax that would show up in
languages for decades to come. However, ALGOL itself was never able to unseat its competition.
We could get into a pretty long conversation on why ALGOL failed to
thrive, but there are two key points that I think are most relevant. It was hard to implement,
and it was just too abstracted. The first point here is a little subtle, but I have some direct
evidence. ALGOL is a fairly complicated language. Its code is a lot more structured than its
contemporaries. Functions can call other functions. Functions can have functions defined within them
for some reason. Code is grouped into blocks. Data could be passed by value, but it could also be
passed by reference. Compared to, say, Fortran, there's just a lot more
moving parts going on inside ALGOL. By the early 60s, we start to see some implementations of the
language, but a lot of them are off-spec. One great example here is ALGOL 30. It's a compiler that was developed at Dartmouth for the
LGP-30 computer. But this wasn't a one-to-one implementation. Thomas Kurtz,
future co-creator of BASIC, described the problem this way, quote,
Since the limited size of the LGP-30 precluded a full implementation of ALGOL 60,
certain of its features, arrays called by value, own arrays, strings, and variable array bounds, were omitted, end quote. In other words, ALGOL's a fine language.
The spec was fantastic, but it didn't take real-world constraints into account.
So yeah, it could be the language of the future,
but the 60s, especially the early 60s, just wasn't that future. ALGOL was great. It could work on
any machine in theory, but when rubber met the road, it wasn't all that practical.
Point two: ALGOL abstracted away a little too much.
It's all well and good to keep software folk insulated from the complexities of hardware land,
but only up to a point.
This is especially true when we're dealing with this earlier period in computing.
Sometimes you just need to be able to shift around a few bytes in memory. If you have specialized hardware, you have to have a way to interface with it, or
it might as well just be a really expensive paperweight. ALGOL is a fantastically designed
programming language, but it doesn't provide any way for you to access a computer's actual
hardware. It's nice to live in an idealized world of pure theory. It's fun, it's safe,
it's comfortable, but that doesn't make a computer's lights blink off and on.
That all being said, the ALGOL reports gained a lot of popularity, especially in academia.
The language spec was really well written.
It was full of good ideas waiting to be put into use.
In 1960, a newer spec was released, the fittingly named ALGOL 60.
This helped to make the early 60s a hotbed of development for ALGOL-like and ALGOL-inspired languages.
CPL was one of these new languages,
the Combined Programming Language. It was initially developed in 1963. Work started at the University of Cambridge, but researchers from the University of London soon joined the team. The overall project
was summarized as, quote, the object in developing CPL was to produce a language which could be used for all types of problems,
numerical and non-numerical,
and would allow programmers to exploit all the facilities of a large and powerful computer
without having to escape into machine code, end quote.
At its core, CPL was an extension to ALGOL. It's coming from
the same kind of academic tradition. The new language is even laid out in academic papers
just like ALGOL was. The big difference is that CPL was designed with actual hardware in mind.
The paper I just quoted from, called The Main Features of
CPL, was published in 1963, and it makes this hardware mindset pretty clear. One easy way to
see this is how the CPL paper talks about variables. Right up front, it mentions that
variable types will have limitations dependent on the host hardware and your compiler implementation.
That makes good sense.
At the time, there was a lot of variance between computers.
Importantly, different machines would sometimes represent numbers in different ways.
Memory will have a different layout between, say, an IBM 704 and a PDP-1.
The CPL paper is pretty explicit about this, quoting again,
quote, type integer is a machine-dependent facility, end quote. Any variable is going
to be stored in memory, so it will be impacted by how memory works on the underlying machine.
That's something that ALGOL never talks about,
and something that you just can't get away from in the real world.
Diverging further from ALGOL, CPL was designed with a host of input and output features.
Data streams were handled either as raw I/O or as reads and writes to files. And, of course, that meant more hardware dependence.
Accessing a file is dependent on how files are defined on your computer. Reading keyboard input
is dependent on how your computer is wired up to a keyboard. The CPL spec leaves it at, hey,
we will have input-output routines in the language, and then the rest
is an exercise for the reader.
In practice, that means that anyone implementing a compiler for CPL would have to sort out
how to deal with I/O, but the framework, which is really the important part on the theory side
of things, is right there in the paper.
But here's the thing. CPL, at least
a full CPL, only ever existed as a series of academic papers. There were some pared-down
compilers built, but the spec was never fully implemented. The reason for this is simple.
It's hard as nails to write a compiler.
It's probably one of the more complicated programs that you can really write. CPL solved one of ALGOL's glaring issues, but not the other. It was still a very big and very complicated language. CPL's design took hardware into account, but only so far as adding features for dealing with hardware.
They were still looking at programming as a highly abstracted process.
That all being said, CPL would have a huge impact on the history of programming.
Well, it happened in a roundabout way, but I think that
makes it more interesting. Soon after initial papers were published, a team at Cambridge set
to work on a new compiler for CPL. This is the ill-fated compiler effort that I mentioned.
Martin Richards was a PhD student on that team, and some of CPL's faults sparked his imagination.
Richards had enrolled in Cambridge in 1959, originally studying mathematics but quickly getting wrapped up in computing.
As the early 60s wore on, CPL became a big feature of Richards' early career.
But the big problem with any theoretical work on computers is that
once electrons hit silicon, things get messy. At the time, Cambridge had two main computers,
EDSAC 2 and a single Ferranti Atlas. Initial development of the CPL compiler started on EDSAC 2,
since, at the time, the Atlas computer was still being set up
and used for more pressing research. The plan was that once the Atlas computer had some spare cycles,
the CPL team would get to upgrade. But that introduced a fun little snag in the project.
EDSAC 2 and Atlas were pretty different computers. For the time, that was
normal. There was a lot more diversity in computer design than there is today. And being the early
1960s, there was a very limited range of programming languages to choose from, especially
when you're building something like a compiler. You basically had to use assembly language.
So the CPL team was trapped in this weird position.
They needed to write a new compiler.
They had some time on EDSAC 2 and would eventually have some time on a new Atlas computer.
Normally, that would mean writing part of a compiler,
scrapping that,
and then starting over once access to a new computer was secured. That's not a good position to be left in. It's a good way to waste a lot of effort. Christopher Strachey, one of the designers
of CPL, came up with a workable solution. That was a macro compiler thing that he called the
General Purpose Macro Generator, or just GPM for short. Macros are their own interesting topic that
fits somewhere between assemblers and compilers. The basic idea is that you build up a library of macros. These are little chunks of code that each have an assigned name.
Each macro is a template, so you can plug in different parameters to fill different parts of that code.
In your actual source code file, you just include a call to the macro you want to use.
Then, as your code is processed, all calls to macros are replaced with the actual
macro template. It essentially gives you a shortcut. If you have a chunk of code that you
just keep typing out, well, you can make it into a macro and save some time. GPM wasn't the first
macro system. Macros had been part of programming since at least the 1950s.
They usually showed up in the context of macro assemblers.
That is, assemblers with built-in support for macros.
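To give a feel for the template idea, here's a small example using the C preprocessor. To be clear, this is my own illustration, and GPM's actual notation looked nothing like this, but the plug-in-your-parameters concept is the same:

    #include <stdio.h>

    /* A macro is a named template. Calls to it are replaced with the body,
       parameters filled in, before the compiler ever sees the code. */
    #define SQUARE(x) ((x) * (x))
    #define GREET(name) printf("hello, %s\n", name)

    int main(void) {
        printf("%d\n", SQUARE(7));   /* expands to ((7) * (7)) */
        GREET("world");              /* expands to the printf call in the template */
        return 0;
    }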
But there's a twist to how GPM worked, or at least how it was used.
Usually, macros are used to save time when programming.
It's a convenience thing. But Strachey designed GPM as this portable quasi-programming language slash stack machine. Now, this gets really weird
and pretty technical, so buckle up. Generally speaking, GPM could be set up to work with any kind of macros.
You just need to define your set of macros and then call them as needed in your source code.
The trick here was how you defined your macros.
You could just have a library of useful functions, or you could do something a lot more complicated.
Strachey used GPM to build up what's called an abstract machine.
That's a simplified computer that only existed inside GPM's macro library.
This machine supported a slate of simple stack-based operations that mimicked how CPL functioned.
Basically, Strachey just went,
well, the perfect computer for CPL doesn't exist,
so let me figure out what that would look like.
And the details for everything else, physical hardware included, could come later.
Now, here's what really matters,
and here's why this approach is so smart.
Instead of writing the entire CPL compiler in assembly language, the team made very aggressive use of this new set of macros. Each operation in
CPL was implemented as a set of macros, so the actual source code for the compiler had very
little assembly language in it. In other words, it was really close to
machine-independent. The compiler's code wasn't working with EDSAC 2 registers or handling Atlas's
specific type of hardware access. It was all written for this abstract machine that Strachey
designed. The final puzzle piece, and what made GPM actually work in this
context, was the macro library. That was written in assembly language, so that part was machine
dependent. Remember that every call out to this abstract machine has to be replaced with a chunk
of macro code from GPM. However, it wasn't as much code as the compiler. It was just a set of
simple definitions that mapped instructions from Strachey's abstract machine to real-world
assembly language. We're talking an order of magnitude less code than the compiler itself.
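Here's a rough sketch of that split, again using the C preprocessor as a stand-in rather than GPM's real notation, with made-up operation names like PUSH and ADD. The point is that only the macro definitions know about the real machine; everything written in terms of them does not:

    #include <stdio.h>

    /* A toy "abstract machine" with two operations. Only these macro
       definitions are machine-dependent; the code written in terms of
       them never mentions real registers or real memory.
       (Illustrative C preprocessor only, not GPM's actual syntax.) */
    static int stack[16];
    static int sp = 0;

    #define PUSH(x) (stack[sp++] = (x))
    #define ADD()   (sp--, stack[sp - 1] += stack[sp])

    int main(void) {
        PUSH(2);                    /* the "portable" part of the program */
        PUSH(3);
        ADD();
        printf("%d\n", stack[0]);   /* prints 5 */
        return 0;
    }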
To adapt the CPL compiler from EDSAC 2 to Atlas, a process known as
porting, all the team had to do was write up a new set of macros. That's a pretty simple process
compared to rewriting an entire compiler, and it's a process that Richards would have taken part in.
But even with GPM and better hardware, and all the academic papers in the world,
CPL never made it into the spotlight.
The compiler became something of an ongoing project.
Eventually, a minimal compiler was up and running,
but it took years of work and it only implemented part of CPL's spec. But Richards wouldn't be around to see this personally. He was moving on to newer pursuits. In 1966,
Richards completed his PhD and left Cambridge, taking up a job as a researcher at MIT.
Richards' thesis had been on CPL, specifically the struggles with implementing
the new language. So, now outside of Cambridge, Richards set to work on an ambitious new project,
or rather, a new scaled-back language. He called it BCPL, which, depending on what you read, either stands for Basic CPL or, more evocatively, Bootstrapping CPL.
What's that? You haven't heard of BCPL?
Well, that's not surprising.
Most people haven't, or if they have, they've only heard it in passing.
BCPL would become the direct inspiration for C. A lot of the ideas that
Richards put into BCPL will find their way into the later language. The big one, the huge idea
that would give us C and arguably make Unix so successful, is bootstrapping. This is also
sometimes called self-hosting, and it's what BCPL was all about.
Richards' core plan was to take the work being done on CPL's compiler to its logical conclusion.
I mean, in retrospect, it seems logical, but it may not have been that cut and dried at the moment.
GPM, the macro system that was used to make CPL portable, was already starting to look and act in a similar manner to CPL. The syntax
was different and it had a reduced functionality, but GPM certainly had a touch of CPL in it.
The next step beyond that was to make a compiler written entirely in the language it was
compiling. That is, a self-hosting compiler. This probably sounds a little brain-twisting,
and it kinda is. The trick here is that you essentially write two compilers for your new
language. First, you take the normal route.
You write up a compiler in some other language.
In this case, Richards would have been working with assembly language directly.
Then, once that's up and running, you write a second compiler,
this time in your brand new programming language.
You use the first compiler, sometimes called the bootstrap compiler,
to build your second compiler. By the end of the process, Richards would have a BCPL compiler
written entirely in BCPL itself. The immediate question here should be,
why? Why on earth would anyone want to do this? Well, it all comes down to a core tenet of
programming. It's better to be lazy, and sometimes being lazy takes you down really weird routes.
Believe it or not, creating a self-hosting compiler can end up saving a lot of time, at least if we're looking
at the extreme long term.
It's a little counterintuitive at first, but self-hosting makes a compiler a lot more
portable.
To best explain how that all works, I think we should actually dive into BCPL and Richards'
implementation of this new language.
In general, BCPL is a subset of CPL with some very strategic changes. In the manual, BCPL: The Language and its Compiler, Richards explains that BCPL, quote, adopted much of the syntactic richness of CPL, end quote, while striving for the same overall goals. From that, let's make a quick roadmap for what Richards wanted to accomplish as of 1966.
We've hit most of the main points already, but I think a summary should help keep us focused.
Richards wanted to make a language that was like CPL, and by extension, like ALGOL.
Those earlier languages had fantastic syntax and structure,
so they made for a good blueprint to follow.
After the whole CPL compiler fiasco,
he wanted a language that was easy to create a compiler for,
and this new language had to be capable enough
to use for writing compilers itself.
It turns out that all of these goals are deeply
interdependent, so let's look at how Richards' implementation of BCPL kind of ticks the check
boxes next to these features. As always, my go-to for a quick and dirty analysis is the language's
typing system. If it isn't abundantly clear, the reason I always go for variables as quick as possible
is that it tells us how the language deals with memory.
And that ends up being really, really important to basically all programming languages.
For BCPL, this is a really easy analysis.
There is only one variable type. That's it. No integers,
no floating points, no strings, no Booleans, at least not explicitly. The only type that
Richards provides is the so-called bit pattern. Definitely a reduced form of CPL to say the least, but there's some really cool
method to this apparent madness. In BCPL, a bit pattern is just some generic chunk of data
somewhere in the computer's memory. That genericness is what makes things so interesting.
Computer memory, at least in almost all cases there are always some caveats so we have to be careful,
is laid out as a linear series of little cells of data.
Each of these cells has its own address, starting at zero and going up to however much memory you have installed.
The size of these cells is dependent on the computer. In modern x86 systems, and really most recent computers, each cell is 8 bits wide,
aka each cell holds a single byte of data. But back in the 60s, there was a lot more variance
from computer to computer.
To be a little more precise, we'd say that modern computers use an 8-bit wide byte.
But historically, that hasn't been a constant.
Some computers used 6 bits to a byte, some had 12-bit wide bytes, and some even addressed by words,
which in modern parlance usually means two or more bytes,
but at the time could have variable or fixed lengths.
It becomes an issue.
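As a quick aside, C itself still hedges on this point. The language only guarantees a minimum byte width, and you can check what your machine actually uses. This is my own illustration, not anything from the episode:

    #include <stdio.h>
    #include <limits.h>

    int main(void) {
        /* CHAR_BIT is how many bits one addressable cell holds on this machine.
           It's 8 nearly everywhere today, but the standard only promises "at least 8". */
        printf("bits per byte here: %d\n", CHAR_BIT);
        printf("bytes per int here: %lu\n", (unsigned long)sizeof(int));
        return 0;
    }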
Now, that may sound like a small detail,
but it turns out that the size of these cells of memory makes all the difference in the world.
It determines how a programmer deals with data,
what kind of math is easy to do,
and how characters can be encoded. This memory bit width determines a whole host of factors
for a programmer. So how can you make a programming language that works on all these various systems,
no matter how wide their bytes are? The solution that
Richards arrived at was to just define some primitive variable. In this case, it's called a
bit pattern, a single cell of memory. It's just whatever the computer uses, a byte by any size.
By keeping to one generic data type, BCPL just kind of drops the whole conversation. Oh, what size should you define integers at? According to BCPL, that doesn't
matter. You don't get integers. You don't get anything. You just get a chunk of data. And that simplification, that almost elegant approach,
saves a lot of issues. The rest of the language is built around this core idea.
The values you work with in BCPL are just generic chunks of data. They don't have any explicit meaning,
so it's up to the programmer to decide how to use that data.
It's a very assembly language style approach that I kind of love. If you want to work with an
integer, you just define a variable and put a number in it. Then going forward, you have to
remember to treat that variable or set of variables like an integer. If you need to store a character, just do the same.
Define a variable, load it up with a character code, then call it a day.
BCPL also has support for arrays.
So if you need a list of integers, you can just define an array and treat it as such.
To the computer, it's just data.
So do with it as you will. BCPL gives you that flexibility.
On the surface, that sounds really limiting, but there's another feature that makes things
viable. BCPL uses these cool little things called pointers. That is, you can define a variable as some address in memory.
It's a little confusing at first, but the idea is that way you can create a variable
that points to another variable or to some chunk of RAM.
Taken together, pointers and bit patterns make a really robust way to manipulate data.
And really, that just means they're a good way to deal with memory.
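Since BCPL code itself would be unfamiliar to most ears, here's a loose approximation in C instead. The word typedef is my own stand-in for BCPL's bit pattern, not anything from the BCPL spec:

    #include <stdio.h>

    typedef unsigned long word;      /* stand-in for BCPL's single "bit pattern" type */

    int main(void) {
        word cell = 65;              /* is this the number 65 or the letter 'A'? up to you */
        printf("%lu\n", cell);       /* treat it as an integer... */
        printf("%c\n", (char)cell);  /* ...or as a character, your call */

        word list[3] = {10, 20, 30};
        word *p = list;              /* a pointer: a variable that holds an address */
        printf("%lu\n", *(p + 2));   /* follow the pointer, offset by two cells */
        return 0;
    }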
What makes this combo all the more interesting is how low level it actually is. Like I mentioned,
it's a very assembly language thing to do. When you work with assembly language,
if you need a variable, you just grab a chunk of memory, throw some data in there, and then kind of have to remember what was in that memory space.
In contemporary languages, you didn't really deal with memory in that way.
CPL and ALGOL are examples, but so are FORTRAN and LISP.
All those languages went to extreme pains to keep programmers away from raw memory access.
In BCPL, you have access to any location in memory within reason.
In fact, to do something as simple as manipulating text data,
you have to deal with raw memory.
It's a low-level approach to data, but it's wrapped up in a higher level language. Like I mentioned,
BCPL's major design goals were all pretty intertwined. This simplification and low-level
variable system served a dual purpose. It's just an effective way to handle data. But it also made
creating a compiler for BCPL a whole lot easier. I know we're talking a
lot about memory, and that's with good reason. It was a big source of variance in the 60s and
a big roadblock to portability. But there's one last key to this whole memory mess, and it takes
us back a little bit to Richards' experience working on the ill-fated CPL compiler.
GPM used kind of an ad hoc abstract machine to create platform-independent code.
BCPL took that a step further.
The design that Richards came up with used a much more formally specified abstract machine. The most popular BCPL manual, simply
called BCPL: The Language and its Compiler, calls this the object machine, or
idealized machine. Now, object machine is a bit of a confusing term here, because objects mean
something else in regards to programming. But I think idealized
machine is a little bit better. Like with GPM, the basic idea is to define some computer that
your code can run on. Then you write your compiler to create an executable for this idealized machine.
That machine doesn't exist. It's only a spec. The final step, and the only step where hardware matters, is converting from an idealized machine program to an actual
executable that runs on real hardware. That sounds complicated. Conceptually, it is complicated.
complicated. Conceptually, it is complicated. But at its core, Richards was taking the easiest route possible. A later essay that he wrote, quote, the BCPL abstract machine was almost
the same abstract machine of macro calls used in the implementation of CPL. But since its statements
were to be generated by the front end of the compiler and read by a code generator,
there was no need for it to consist of textual macro calls.
End quote.
To translate a little, your code started at the front end of the BCPL compiler.
From there, it's read, parsed, broken up, and turned into instructions for Richards' abstract machine.
That machine borrowed a lot from his work at Cambridge. That abstract code is then put into
the code generator. That's the hardware-dependent part that turns abstract code into a real executable.
That intermediate code is never touched by human hands. It's written by a computer and read by another computer program.
So there's no need to get fancy.
Richards implemented that layer as just a stream of numbers.
No text, no structure.
It's easy to write a program that understands numbers.
It takes far fewer operations, so why not make
things as easy as possible? Another huge factor that simplified BCPL was its intended scope.
Richards wasn't making a 100% general-purpose language. BCPL wasn't designed to take on every possible task. Instead, he focused on the niche of system programming.
That is, BCPL was intended to be used for compilers, operating systems, and other low-level tools.
As it turns out, and this is kind of funny to me, those kinds of programs use very little mathematics. At the most, you end up doing
integer math. The largest numbers you have to deal with are memory addresses, and those are all
simple whole numbers. So that let Richards get away with having no native floating point support,
and a very pared-down mathematics system. BCPL's spec is rounded out with its own
character set and tokens. Now, once again, this is going to sound pretty nitpicky, but I think it's
a really important part of the discussion. Basically, this syntax and character token
discussion is how BCPL looks. Which characters mean what in the language, which
characters execute instructions, and which are just for nice formatting. It's central stuff to
any programming language. And just like with other design choices, Richards went with the easiest and
most elegant solution at least wherever possible. Now, I say wherever possible because we're still
dealing with computers. Rule one of computers, they're little machines that tend to ruin your
plans. And in the 1960s, we're still dealing with a diverse cast of digital systems. This is one
place where the portability goals of BCPL led to issues. At MIT, Richards was
working with IBM computers. Back at Cambridge, he had been using totally different systems.
In this era, there weren't widely adhered-to standards for character codes. So we get this
weird, and I personally think delightfully weird, situation where Richards found a way to conquer portability.
He was able to abstract away memory and processor differences with ease.
But he hit a roadblock when it came to which characters his new language could actually use.
The best example of this is the curly bracket.
It's a really common character in more
modern programming languages. It's the little open and closed curved bracket with the sharp
point in their middles. The terminals and computers back at Cambridge supported that character.
MIT's IBM systems, they just didn't. This ended up being a bit of a problem because Richards wanted to use the curly bracket to
enclose blocks of code.
This idea was part of ALGOL and CPL, basically just a way to group code for execution.
Things like loops or if statements all executed a single block of code that followed them. In ALGOL, blocks were
defined using the words begin and end. So you'd have to say something like, if x is greater than
y, then begin your block of code and then follow that up with an end. It works fine, but it's a little verbose, so Richards wanted to just replace the begin and end with an open and closed curly bracket.
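To picture the difference, here's a small illustration of my own, with the ALGOL-style version paraphrased in a comment and the curly bracket version written as ordinary C:

    /* Roughly how the ALGOL and CPL family would group a block, paraphrased:
         if x > y then begin  x := x - y;  count := count + 1  end          */

    int count = 0;

    void step(int x, int y) {
        /* The same grouping with curly brackets, the way BCPL wanted and C does: */
        if (x > y) {
            x = x - y;            /* only changes the local copy; this is just for shape */
            count = count + 1;
        }
    }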
Speculating here, but I think there's another good reason for this change.
This is mainly me talking from experience programming assembly language and not explicitly backed up in the sources, so
do with it as you will. Processing textual data or string manipulation is actually not that trivial.
It can take a lot of code and processing power to make sense of human-readable text on a computer,
especially when you're working in a low-level language.
A string like begin is really easy for us squishy human folk to write, read, and understand. But for a program, it adds an extra level of complexity. A computer sees begin as five characters,
maybe six if you have a terminating zero. When you get down to the
processor level, you can only actually compare one character at a time. It's just how processors
and register machines work. So to look at some input data and find the word begin, you need to
run five successive comparisons. Adding stuff to manipulate memory and move around
as you're doing comparisons bumps that up to some more instructions, but let's just say
five instructions is the bare minimum amount of complexity. Let's say you're also trying to look
for the word end. That means each incoming word takes up to eight compare instructions to process.
As you add in more keywords, that balloons the amount of processing you need to do.
Now, to the rescue, up in the sky, what could it be?
It's the curly bracket.
You can now enclose a block of code with just two characters.
That means that your program only has to do two comparisons to look for those brackets.
Over time, this kind of instruction pinching really does add up.
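To put some hypothetical code behind that speculation, here are two toy scanner checks in C, entirely my own illustration, one hunting for the keyword and one for the bracket:

    #include <string.h>

    /* Is the next token the start of a block? Matching the keyword means
       up to five character comparisons; matching a single bracket is one
       comparison and you're done. */
    int starts_block_keyword(const char *src) {
        return strncmp(src, "begin", 5) == 0;
    }

    int starts_block_brace(const char *src) {
        return src[0] == '{';
    }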
But the problem was that IBM mainframes, at least at the time, didn't do curly brackets.
So Richards had to compromise.
He ended up using $( and $), a dollar sign followed by an open or close parenthesis, for begin and end of block.
Some later versions of the compiler ended up supporting curly brackets, but that was
once IBM got with the program. By 1967, the very first version of BCPL was completed,
and that's including specification and compiler. Richards had taken a much more pragmatic approach
to language design, and it really paid off. Instead of being some low-level computer language
that few could understand,
or some high-level abstracted language that only worked in academic papers,
BCPL met computers in the middle.
It had features of high-level programming languages,
but it also had access to the low-level bits of a computer.
And it could be implemented with relative ease. Smart choices, both in the language's overall design and its finer features,
made BCPL easy to compile.
And by adapting GPM's abstract machine model,
Richards was able to make porting BCPL much easier compared to contemporary languages.
As the language spread, it became a very popular choice for programmers. It made its
way through MIT's CTSS community, off into parts of Multics, and eventually into the hands of some
young and excited programmers all the way off at Bell Labs. This brings us to where BCPL really made its mark on history, and this brings us deep into
the early days of Unix.
For those not in the know, aka those who actually go outside sometimes, Unix is an operating
system.
In fact, it's probably one of the most important operating systems out there.
It was initially developed in 1969 at Bell Labs.
During the early 70s, it spread like wildfire, and in the modern day, Unix and its derivatives
run on just about any computer you can think of. Windows is really the last great bastion against
the Unix-like tides, but even that's starting to change. But before it blew up onto
the scene, Unix was a side project hammered out on spare hardware deep inside AT&T's Bell Labs.
This project was started in 1969 by Ken Thompson, a programmer at Bell who had recently come off working on Multics. Now, very broadly speaking,
Multics was the spiritual predecessor to Unix. It was a large-scale partnership between a handful
of companies, Bell Labs among them, to create a secure timesharing operating system. The idea
was that Multics would be the backbone of computer infrastructure that would be used as a widespread utility service.
Bell and AT&T in general were really eager to get in on the action.
At the time, there wasn't a one-size-fits-all timesharing system out there, and Bell wanted one for internal use, and maybe for some more ambitious projects too.
But Multics was perhaps over-ambitious.
Timelines would slip further and further behind,
and by 1969, Bell pulled out of the partnership entirely.
But that left Bell with a bit of a problem.
They still didn't have the complicated timesharing system they were after.
And on a smaller scale, programmers like Ken Thompson and Dennis Ritchie just
plain liked working on Multics. It was a nice environment to work in. The entire system was
written in a high-level language. Sure, it had some problems, but it was a fun playground for
programmers. Back at Bell, Thompson was slowly
deciding that he wanted to create a little slice of Multics at home. As lore goes, Thompson found
a barely used PDP-7 computer tucked away near his office, so he set to work writing his own
operating system on the machine. Pretty soon in the process, another programmer named
Dennis Ritchie joined up with Thompson. It was in that lab that a very early version of Unix would
start to take shape. Just as a quick aside, at this point it wasn't called Unix. I don't know
if it even had a name. The label would come later on in the 70s, but I'm just going to keep calling it Unix because
I don't want to have to go over and over again saying the system that would one day be known
as Unix. Anyway, from the start, there were two huge hurdles to face. First was the platform that
the duo were working on. The PDP-7 was chosen for one big reason. No one else
was really using it. It wasn't very powerful, and it only had 8K words of memory. That was a huge
limit to what Unix could be. Second, and intertwined with the first issue, Thompson and Ritchie were
working directly in assembly language.
This was partly due to a lack of tools on the PDP-7, and partly because they were working on such a low-level project.
Now, I'll be the first to admit that assembly language does have some problems.
It's definitely one of my favorite ways to bash bits, but it's not always the best tool for the job.
For one, it's not portable.
Thompson and Ritchie knew that if Unix continued, it needed to move off the PDP-7.
As the system grew, so did its requirements, and 8K of RAM isn't enough space to really grow.
The other issue was on the human side of things, maintainability.
Discounting writing raw machine code in binary, assembly language is one of the hardest
languages to program in. It's terse, you have to remember cryptic instruction names, and you need
to know exactly what your codebase is doing. For large projects, especially ones with multiple programmers,
that last part becomes a real issue.
You have to keep meticulous documentation about which parts of memory are used by different code,
how registers should be treated,
and everyone has to write code that follows that convention or things break. From the outset,
Thompson and Ritchie knew they wanted to move away from assembly and start working in a higher
level language. And from the outset, they had a choice in mind. Quoting from a talk that Thompson
gave, quote, after Unix was up or simultaneous with Unix coming out, BCPL was just emerging, and that was a clear
winner for both of us. Both of us were really taken by the language and did a lot of work with
it." BCPL had everything that the Unix crew were looking for. It was built for portability. It
could handle low-level tasks while still being encased in a high-level
language. And it was easy to implement a compiler for. But that's assuming you had access to a
capable computer. The PDP-7 just wasn't capable enough for a BCPL compiler. Turns out you couldn't fit an entire compiler for the language in 8k of code.
There were experiments with other languages, but none really stuck. None fit the exact niche that
the crew wanted. Notably, Douglas McIlroy, another early addition to the unofficial team,
ported a language called TMG to the PDP-7. Now, TMG is particularly notable
here because it was a language designed for writing compilers. So not the most general-purpose thing
in the world, but a type of glue that would help bootstrap something bigger. TMG was a step forward,
but the crew still wanted a capable programming environment.
One of the things that they liked about Multics was that there were ready tools for programming and debugging.
So servicing that became somewhat of a project within the side project.
This would be the next spark towards progress.
As the lore goes, Thompson first set out to write a Fortran compiler for the new system. Ritchie described the series of events like this, quote, the intent to handle Fortran lasted about a week. What he produced instead was a definition of and a compiler for the new language B, end quote. So after some initial work, a reference compiler
for this new thing called B was created. Then that compiler was eventually rewritten in B itself.
By late 1969 or early 1970, B was fully self-hosting.
So, what was this new language? Simply put, it was a stripped-down BCPL, small enough that an entire compiler could be written in less than 8 kilobytes of code. If there was anything like a punchline in this episode, then here it is.
B is a simplified BCPL. BCPL is a simplified CPL. CPL is a simplified and slightly upgraded
ALGOL. In other words, we're looking at this oddly direct chain of pragmatic compromises,
slowly going from theory to practice.
B is where that chain made it fully outside of academia and into the world of industry.
So what did the new language look like? How did Thompson's proclivities impact B?
The go-to litmus test here is easy. B uses the same variable typing system as BCPL. You get one type, it's basically a bit pattern by another name. The B reference manual, written by Thompson
a few years into development, calls it a word. It's not exactly the same as early BCPL implementations,
but the difference is negligible. A word in this context
is just the native size of data that a processor works off of. It's not always a single byte,
sometimes it's two bytes, sometimes it's more, sometimes it's less, but it's the same idea as
a bit pattern. It's a generic chunk of data. The character set and syntax are the most noticeable changes here.
According to Ritchie, Thompson was always a fan of, quote, Spartan syntax. Why use a lot of words
when a few well-placed characters will do the trick? And luckily, that view aligned with B's
constraints, namely the memory limitations of the fantastic PDP-7 platform.
In general, Thompson pared down BCPL's character set, adjusting and tightening the language.
The awkward begin and end tokens, represented by $( and $) in BCPL,
were replaced with fully-fledged curly brackets.
DEC machines supported the character, so it was a very logical upgrade.
Another big change was how assignments were written.
Once again, this is a pretty nitpicky detail, but I find this really interesting to think about.
In ALGOL and its derivative languages, and that includes BCPL, you assign a value to a variable by typing variable colon equals value.
That's a holdover from ALGOL's formalized roots. The colon equal is used in mathematics to signify
a definition, so it made sense to put it into a new
formalized programming language. Comparisons in these ALGOL-like languages use a single equals.
That is, you could just type x equals y to see if the two variables were equal. Whenever I look at
these older languages, that's something that always trips
me up since it's not modern convention. There was a flip that happened. It didn't start with B,
but B put together a lot of little tweaks in one package that looks resoundingly modern.
Thompson replaced colon equals with just equals for assignment, and the single equal that was used for comparisons was replaced
with the double equal equal. In the current day, languages tend to follow this new convention.
So to the modern eye, just this change in how equals is treated plus curly brackets
makes B look very familiar. Once again, Ritchie mentions the change of assignment briefly in a few essays,
specifically in a 1993 article called The Development of the C Language. Ritchie states
that a lot of these small changes were a, quote, matter of taste, which, yeah, I get it. Syntax can
definitely be a matter of taste, but if I can go off the rails here for a second,
the assignment operator in particular is really interesting to me.
The simple fact is, over the course of a program, you're going to have a lot more assignments
than comparisons.
I was just looking at some random code I had downloaded, and it's a 10 to 1 ratio.
10 times more assignments than I
have comparisons. By far, it's just more common to move around data than it is to compare data.
So flopping things around, making assignment take only one character while comparisons take two,
that definitely saves some space in source code and it saves some time when typing.
It's not the biggest deal in the world, but I think that's a really neat tweak.
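Here's the flip side by side, my own illustration again, with the older convention paraphrased in a comment and the B-style convention written out as C:

    /* The older ALGOL, CPL, and BCPL style, paraphrased:
         x := y;           assignment uses colon-equals
         if x = y then ... comparison uses a single equals       */

    int x = 0;
    int y = 5;

    void demo(void) {
        x = y;            /* B's convention: one character for the common operation */
        if (x == y) {     /* and two characters for the rarer comparison */
            x = x + 1;
        }
    }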
The other interesting piece that I just have to throw in is that B introduces the plus plus and
minus minus operators. Those increment or decrement a value by one. They're really common in use today, and they started off
in B. Implementation-wise, B diverges slightly from the track laid down by BCPL. The B compiler
created threaded code. This is a pretty interesting technique that's somewhat close to how GPM or BCPL
compilers worked, but a little bit different. Normally, a compiled program
is just a series of normal machine code instructions. You can't tell the difference
between an assembled and a compiled program unless you know what you're looking for. Well,
for B, things are a little bit different. B worked off a rough abstract machine, but nothing quite as formalized
as BCPL's idealized machine. But in B, calls to this abstract machine weren't replaced with chunks
of code for some physical target machine. Instead, each instruction was just replaced with a function call. This means that the final machine code for a B program ends up looking distinct.
Just about each instruction calls out to some chunk of code held in a library of functions.
In practice, the portability of B was accomplished by rewriting that library of calls for a new
platform.
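Here's a small sketch in C of the general idea, not B's actual PDP-7 implementation, and with operation names I made up. The compiled program boils down to a list of calls into a machine-dependent library of little functions:

    #include <stdio.h>

    /* A tiny machine-dependent "runtime library": one function per
       abstract operation. Porting means rewriting only these. */
    static int stack[16];
    static int sp = 0;

    static void op_push2(void) { stack[sp++] = 2; }
    static void op_push3(void) { stack[sp++] = 3; }
    static void op_add(void)   { sp--; stack[sp - 1] += stack[sp]; }
    static void op_print(void) { printf("%d\n", stack[sp - 1]); }

    int main(void) {
        /* The "compiled program" is just a sequence of calls into that
           library, and each call is an instruction of its own. */
        void (*program[])(void) = { op_push2, op_push3, op_add, op_print };
        for (unsigned i = 0; i < sizeof program / sizeof program[0]; i++)
            program[i]();
        return 0;
    }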
The ease of portability with threaded code is
similar to BCPL's implementation. Not much has to be rewritten for a new platform. In fact,
threaded code may make creating the compiler a little easier in the first place. You basically
skip the whole instruction-replacement step, so one fewer step is really nice. However,
there is a downside. Threaded code tends to be slower than other approaches. Each line of your
final program has a call, and a call is its own instruction, so that eats up cycles. B was
functional, but a little bit hamstrung by this implementation.
In the coming months, other issues would come to light.
B would see some limited use inside Bell Labs, but it never beat out assembly.
The Bell crew still wanted to rewrite Unix in a high-level language,
but B just wasn't quite the right tool for the job.
So as they transitioned to a new computer and the team started to grow,
Dennis Ritchie set to work on a replacement for B.
That second language would one day be known as C.
Alright, that does it for the start of our exploration of the C programming language.
This has been a bit of a heavy episode and actually pretty light on C itself, but I think the context here is really, really important.
C, and really any programming language, doesn't exist in a vacuum.
Each has its own lineage.
For C, that family tree just happens to take us back all the way to time immemorial.
When you get down to it, the road to C can be characterized by a process of slowly transitioning
from theory into practice.
ALGOL starts in the lofty heights of pure theory, a universal language designed to
take on all challenges. CPL adapts those ideas into a more useful system, trying to make an
abstract idea practical. Once programmers try to implement CPL, more and more practical issues
start to pop up. One of those is portability, how easy it is
to move software from computer to computer. Eventually, Martin Richards remixes CPL into
BCPL, where he tackles the issue of portability a lot more formally while also retaining low-level
control of the underlying computer. He does this all in a very academic
environment with very formalized goals. The ideas, tricks, and technology that Richards put into BCPL
find their way off campuses and eventually into the hands of programmers at Bell Labs.
From there, theory is fully reduced to practice. That is, the team working on Unix
starts to figure out how things like portability can work in the messy world.
That's where I'm leaving things for now. Next week, we're going to be looking at the early
development of C, how B transformed into the world's most popular programming language. And most importantly, we're going to see how C took on the world.
Thanks for listening to Advent of Computing.
I'll be back in two weeks' time with the next part of my series on C.
And hey, if you like the show, there are now a few ways you can support it.
If you know someone else who would like the show,
then why not take a minute to share it with them? You can also rate and review on Apple Podcasts. And if you want to be a super fan, you can support the show directly through Advent of
Computing merch or signing up as a patron on Patreon. Patrons get early access to episodes,
polls for the direction of the show, and bonus content. You can find links to everything
on my website, adventofcomputing.com. If you have any comments or suggestions for a future episode,
then go ahead and shoot me a tweet. I'm at Advent of Comp on Twitter. And as always,
have a great rest of your day.