Advent of Computing - Episode 129 - ALGOL, Part I

Episode Date: April 7, 2024

ALGOL is one of those topics that's haunted the show for a while. It comes up any time we talk about programming languages, and with good reason. Many of the features and ideas found in modern languages have their roots in ALGOL. Despite that influence, ALGOL itself remains somewhat obscure. It never reached the highs of a C or LISP. In this series we are going to look at ALGOL from 1958 all the way up to 1968, keeping a careful eye on how the language evolved, how its problems were addressed, and how new problems were introduced.

Selected Sources:
https://www.softwarepreservation.org/projects/ALGOL/paper/Backus-Syntax_and_Semantics_of_Proposed_IAL.pdf - Backus, 1958 IAL report
https://algol60.org/reports/algol60_rr.pdf - ALGOL 1960 Report
https://dl.acm.org/doi/pdf/10.5555/1060960.1060966 - Cleaning Up Algol

Transcript
Starting point is 00:00:00 What's the most important part of a programming language? What does a good language just have to have? This is one of those spots where I think it becomes clear that programming is more of an art than an actual science. There isn't a single good answer here, right? It will depend on all kinds of things. What the language is used for, who's using it, where it's used, maybe even when it was developed. For me, this answer has changed throughout my life and career.
Starting point is 00:00:35 If you asked me this question when I started programming, I would have probably listed out some very specific kinds of simple language features. A language needs to give programmers full control over memory. It needs useful data types, but nothing too heavy-handed. I would have probably also mumbled something about parentheses. At that point, I had been working a lot with assembly language and Lisp, so I had a very particular view of things. If you had asked me that same question when I was in college, I would have given a totally different answer. A good language should have some kind of array broadcasting mechanism for matrix mathematics. It needs a native way to handle multithreading. It should have some IPC mechanism of some sort. And, of course, any language that's worth its salt needs flexible data types. At that point in my life, I was writing a lot of physics simulation software, so
Starting point is 00:01:26 my views skewed in another very specific way. As I've grown older, my views have yet again changed. If you asked me today, I would say that a good language has to have a mature ecosystem built up around it. There are well-developed libraries and packages. There are people and places you can turn to for help and resources. The language is old enough that most of the kinks have been worked out, most of the stumbling points have been found, and thus, a lot of the chaos and struggle has died down. But above all, I think a good language has to have good documentation. You know the old adage, right? RTFM, read the, uh, you know, manual.
Starting point is 00:02:11 You understand, of course, I must keep my PG rating for the show. If you don't have a good manual to read, then the language will suffer for it. Why program in a language that actively makes itself hard to understand? The same goes for libraries. I've actually passed up on using certain libraries just because their documentation hasn't been good. Sometimes even before I get too deep into a library's features, I first look at their documentation and see how easy it is to read and easy, you know, to understand, to actually use. So, in this framework, in my now enlightened mindset, the most documented language should be the best, right?
Starting point is 00:02:54 Well, maybe don't take my enlightened advice too easily. Welcome back to Advent of Computing. I'm your host, Sean Haas, and this is episode 129, ALGOL, part one. I'm finally back to my usual schedule and my usual fare. So we're doing a classic. We're going to be talking about a programming language. And actually, we're kind of circling back around to pick up a thread from a few older episodes. When I covered the C programming language, I discussed the origins of an earlier language called ALGOL, or I-A-L. It also came up when I ran an episode on Jovial. In those episodes, ALGOL is really just the background character. But that does a disservice to the language itself. The simple fact is that ALGOL
Starting point is 00:03:52 is one of the foundational languages in programming. Most other languages can trace their origins back to Algol somehow. If you've programmed in any capacity, it's likely you've encountered some of Algol's DNA without even realizing it. That said, I don't want to just keep covering beginnings of events. That's a valid strategy, it's where a lot of the cool stuff happens, but I think it's a limiting one. I think Algol gives us a unique opportunity to examine how a language grows over time. ALGOL gives us a unique opportunity to examine how a language grows over time. That's because, at first, ALGOL had a lot of issues. Over time, newer versions of the language were developed,
Starting point is 00:04:35 issues were addressed, newer features and newer ideas were added, and in some cases, new problems would arise. The guiding source for this series is going to be from my battered copy of The History of Programming Languages 2, or HOPL 2 if you like. Specifically, the ALGOL 68 session. That name, perhaps, gives you a hint at where things are going. ALGOL is a highly standardized language. It's described by a very precise specification, and that spec has been updated somewhat regularly. So we're actually looking at different versions of that spec. Each version is named after the year it was published, roughly, starting with ALGOL 60 and moving up from there. With that in mind, things might get a little confusing. I'm going to try
Starting point is 00:05:26 and resist the urge just to call it all algol. I'll try to keep the year designation or the version designation wherever possible. But, and here's the thing, this is going to take a lot of space to cover. So we're doing a two-part series here. I think that's really only fitting for multiple versions of this wild language. This episode will be mainly covering ALGOL 60 and everything around that spec. That will be setting us up for next episode, where we talk about ALGOL 68, its direct predecessors, and really an entire drama around standardization. So let's get started. I'm going to begin with a recap of where ALGOL came from,
Starting point is 00:06:12 and then a discussion of the issues with the first spec. From there, we're going on hard and fast into the timeline. We're going to look at how ALGOL overcame initial issues, how later versions affected other languages, and how other languages would affect later versions of ALGOL. Let's start building up what I think is going to be a very interesting mess. The original name for ALGOL was IAL, the International Algebraic Language. This grew out of a larger drive to create a standardized or universal language that seems to have started in the
Starting point is 00:06:53 middle of the 1950s. By 1950, GAMM, a computer science organization in Germany whose full name I will not butcher, had taken notice of a wild abundance of unrelated and unique programming languages. At roughly the same time over in the States, the ACM was coming to grips with this same problem. Too many languages and a total lack of standardization. This problem of incompatibility, call it a problem of proliferation if you want to sound really fancy, is a well-documented pattern. We ran into this same type of issue during my System 360 series at the start of the year. In the late 50s and early 60s, there was an explosion of unique and incompatible computers. IBM, RCA, English Electric, and maybe more companies started initiatives to make families
Starting point is 00:07:46 of compatible machines. IBM was the first to really stick the landing, leading to the System 360 and its clones. The goal was to rally their product line around a family of compatible machines, and thus save money, save time, and reduce complexity. and thus save money, save time, and reduce complexity. When it comes to programming languages, well, the problem here is actually really similar. It's almost the same. Multiple incompatible languages lead to this situation where software programmers and projects are siloed.
Starting point is 00:08:21 They're isolated. A Fortran programmer can't seamlessly transition into a Lisp project. Or, in some cases, a Lisp programmer on one type of machine can't really use Lisp on another machine. Really, this represents two sides of the same issue. A lack of standard language means that human resources get messy. It's hard for that Fortran programmer to make heads or tails of Lisp. It's just as hard for a Lisp programmer to try and communicate on a very technical level with our poor, poor Fortran lover. The other side is implementation. For a language to be useful, it needs to be implemented. Someone needs to sit down and write a compiler or an interpreter or
Starting point is 00:09:05 something that runs code in the target language. Those implementations tend to have some drift, at least they did back in the day. Lisp on an IBM machine wasn't necessarily the same as Lisp on a DEC machine. So even two Lisp programmers may be speaking a slightly different language. They may not be able to share code with each other, even. The solution that GAM and ACM arrived at, in isolation of one another, was to create a new language. Existing specifications had issues, so why not make a new spec, a better one? All jokes aside, this was a very reasonable approach. Each organization wanted to standardize around some language, and existing languages didn't meet all their expectations.
Starting point is 00:09:54 Each org worked somewhat alone until 1958, when GAM and ACM met in Zurich, Switzerland, to join forces and create a single international standard for an algebraic language, hence IAL. It's pretty quickly renamed to ALGOL, since IAL just doesn't have a good, well, ring to it. The goal of ALGOL is simple. Provide a universal language that all programmers can use. It will run on every computer, and it will be understood by all. That's a very simple premise, but it actually introduced a lot of technical problems to solve. One is the fact that ALGOL is actually two languages. It has to be. It's a publication and a hardware language. The
Starting point is 00:10:46 publication language is meant to, well, be published in print. The idea was to establish a standardized form for describing algorithms, something like the formalized math notation used in mathematical proofs. In this way, all programmers the world over would be on the same page, so to speak. This would allow for algorithms to be very formally defined and well understood. In that sense, ALGOL would function as a type of digitally-oriented Esperanto. The hardware language is what we would traditionally call a programming language. It's a language that's meant to be fed into a computer, compiled, and executed. There's not much to explain there.
Starting point is 00:11:33 This is the actual code that a programmer in real life would work with. Now, on first inspection, that may sound strange, but maybe it's okay. I think you can see where the framers of IAL are going with this. It's no different than a human tongue. We have spoken representations and written representations of the same language. There are some differences between representations, but in general, if you can read and write English, you could probably speak it too.
Starting point is 00:12:09 But you know how this goes. There's something weird here. Let's stick with the human language example. There are things you can't really say with spoken language, but you can easily write out, and vice versa. Here, I'm thinking of specific symbols, or sounds, or inflections. There's this drift between the two forms of the language, the two different representations.
Starting point is 00:12:33 So, too, is there a drift between these two versions of IAL. Allow me to read directly from the 1958 Preliminary Report on the Language. Quote, Hardware representations will generally use a smaller set of symbols 1958 Preliminary Report on the Language. of the language will employ many of the notational conventions of mathematics, e.g. exponents, subscripts, Greek letters, and will be used in human communications of IAL programs. End quote. This, yet again, sounds reasonable on many levels. Well, maybe that reasonableness isn't entirely clear. Let me explain. Back in 1958, the state of data encoding was bad. That's really the only way to put it.
Starting point is 00:13:31 For a computer to understand text, that text needs to be encoded as a binary number. That's usually called character encoding, since you're encoding these things character by character. You get these nice tables that show how numbers and characters match up. In the more modern day, we have things like ASCII, UTF-8, or Unicode that can encode, well, almost anything. Unicode is especially good at this. It's very extendable. But back in these earlier eras, there were very few standards around character encoding. It varied from company to company and from computer to computer. To make matters worse, we need to consider character size, as in how many bits are used to represent a character.
Starting point is 00:14:19 ASCII uses 8, one byte. UTF-8 is variable. It can use up to 4 bytes to encode a character. In those cases, you get lots and lots of possible unique characters. You can easily have the full alphabet, letters, symbols, anything you want. But a lot of older machines used 6 bits per character. used 6 bits per character. That's only 64 possible characters, which means you have a hard limit. On the other hand, typewriters could do all kinds of weird stuff. Let's think about underlines as a fun example here. Let's say you want to print out underlined text. Your lame character encoding does not have space for 26 underlined characters. The computer just can't understand the difference between an underlined A and a normal A. But if you have an underscore and a backspace character on your typewriter,
Starting point is 00:15:18 you can fudge things. You can print out a letter, backspace, then print an underscore. things. You can print out a letter, backspace, then print an underscore. That's something called overtyping. So the printed text can look underlined, even if it's not stored as underlined text. That, plus expanded character sets and fancier typewriters, meant that the publication form of IAL would look significantly different than the hardware form of the language. But it gets weirder. You can actually bridge the gap a little bit more here. Allow me to introduce one of ALGOL's idiosyncrasies. It's called stropping. This is a mechanism by which hardware ALGOL can be formatted. Kinda.
Starting point is 00:16:08 It's really weird to wrap your head around at first. In printed ALGOL, certain parts of the language show up in different text faces. Keywords, the reserved words that ALGOL actually knows, are one example. Depending on the version, keywords are often printed in boldface. But that's a typewriter thing. It doesn't really work on an old computer. Remember, 64 characters at most. To mediate this, we can use stropping, special formatting characters that you put into your code.
Starting point is 00:16:45 What's interesting about stropping is it was highly dependent on the hardware implementation. So some hardware versions of IAL used periods for stropping in front of words, others used apostrophes around words. It gets messy, it gets weird, and really it leads to another drift between multiple different hardware versions of IAL. Now, there is a further complication that I have to add in. The initial 1958 report actually has a third language involved, called the reference language. This was intended as the main source of truth, the actual bedrock definition of IAL. The print and hardware versions of IAL, then, would be different aspects of this core reference language. One of the 1958 papers even states that the reference language would be used to quote-unquote transliterate between the different forms of IAL. Once again, this might feel a little complex, perhaps overkill, but remember, the whole point of IAL is to make a standardized language. It's to make a universal language that
Starting point is 00:17:55 everyone can use in print and in code. Now, I have to burst some bubbles here and further complicate things. Whenever you read about algol derivatives, you will inevitably run into words like portable or compatible or maybe interoperable. There's this idea that an algol program could be compiled on an IBM machine or a GE machine, or even a slick bowl gamma. That's the whole point of a universal language, right? Well, that's just not really the case. I'm pointing this out because I've fallen for this classic trap myself. C is the classical portable language. It can run on anything because the language itself contains no hardware details. It's also a direct descendant of ALGOL. The older language also makes no hardware assumptions.
Starting point is 00:18:54 That was part of its core design. So, bingo bango bongo, ALGOL must be portable. Eh, but not exactly. There's a sort of implied notion that ALGOL or IAL could run on anything, but that's never super explicit in older papers. The 58 prelim report describes not one hardware language, but multiple. It actually uses the term hardware representations, plural. It actually uses the term hardware representations, plural. Here's a quick excerpt point from its summary on the hardware representations. Quote,
Starting point is 00:19:35 Each one of these is a condensation of the reference language enforced by the limited number of characters on standard input equipment. Each one of these must be accompanied by a special set of rules for transliterating from publication language." In other words, there was expected variance between implementations of IAL. The language was, in theory, portable, but that had to be handled explicitly. If you wanted to run one of your colleague's IAL programs, you had to first transliterate it to work with your hardware version. If you saw a neat program in an ACM paper, you would have to transliterate it to execute it. You'd have to go through maybe two or three different manuals to know, okay, here's the reference language that I'm using, here's the publication language,
Starting point is 00:20:28 and here's the rules for my compiler for how to go between the two into a hardware version. Given the state of character encoding, you couldn't actually automate this. You can't copy-paste text when your character encoding is different. Copy-paste didn't even exist yet. paste text when your character encoding is different. Copy paste didn't even exist yet. So you're put in this situation where any portability is totally manual. This really speaks to the era. The framers of IAL knew that there would be technical issues with a truly universal language. Character encoding, a very dumb little detail, made actual portability basically impossible. An IBM machine may have a Sigma character, but RCAs may not.
Starting point is 00:21:13 So there had to be this layer of transliteration. Plus, you just gotta love that word, right? But the language itself is portable. Once again, kinda. I mean, ignore the separate hardware rep things for a minute. IAL, as a rule, makes no assumptions about hardware. In fact, it doesn't say anything about hardware. Everything is abstracted.
Starting point is 00:21:39 You can't directly do anything to the underlying machine. You don't manipulate memory, you have to ask IAL to make a variable for you. And you have to ask for a very specific type of variable at that. The spec says nothing concerning punch card readers, teletypes, or printers. It doesn't talk about disk drives, memory size, or character encoding. That means that, disk drives, memory size, or character encoding. That means that, despite the character issues, IAL is actually portable. Things may look a little different, you might have to manually make some changes, but all core functions of the language will function the same. That's the power of the spec. At least, that's the theory. This is where things start to break down in, I think, a frustrating way.
Starting point is 00:22:28 The first spec says nothing about hardware. That's a good thing for portability, but a bad thing for practicality. IAL doesn't include any guidance on handling inputs or outputs. The classic story here is that the first version of ALGOL, ALGOL60, didn't even include a print function. This was done because print is hardware-dependent, so it can't be part of a portable spec. And this is true, but only to a point. John Backus's 1958 report on IAL actually includes examples of input and output functions, but not as part of the reference language.
Starting point is 00:23:13 Rather, as examples of procedures. Allow me to explain because I think this is pretty neat. This is something that I haven't seen discussed in other coverage of IAL or ALGOL. So there's this feature of the language called procedures. It's a chunk of code that can be called and reused later in your program. Procedures can be either defined in IAL itself or as external machine code. This is a trick that many compiled languages can do. It's especially useful when your language spec has no guidance about hardware controls.
Starting point is 00:23:53 You make it so a programmer can work up a little program that, say, prints some text from memory to a teletype. That little bit of machine code can be connected up to your fancier, high-level program. The process is called linking. This gives you the best of both worlds. When you need fine-grained hardware access, you can slap together something from assembly language, but most of your work can be done in a very standardized and much more user-friendly language. When discussing these external procedures,
Starting point is 00:24:26 Backus basically just says, yeah, you could implement a print and also an input function using this feature, and here's what it might look like. But that's all. He doesn't recommend doing so. It's not a required part of a hardware implementation. He just points out that it's permissible. That, though, kind of sucks. It's really weak guidance at best. Before we move on, there's one more thing we need to tackle about the state of IAL in 1958. That's the actual implication of the word universal. Now, I've heard it said that ALGOL was intended as a language for any and all applications, that universal didn't just mean creating a common tongue, but rather a single tool for all tasks. One of the questions I had when going back to these early sources was, is that true? Did the original team intend for ALGOL to be
Starting point is 00:25:27 viewed this way, as the language, the one true language? Well, that's actually a very complicated question. Here, I'm going to pull directly from the same Bacchus paper. Quote, The proposed language is intended to provide convenient and concise means for expressing That seems nice and vague, doesn't it? Backus here is saying that IAL is meant to work for all numerical computation. A shallow read here would be to assume he means IAL is a math language. And that may be the case. Bacchus is also known for spearheading the development of Fortran, a language that only works for mathematical applications.
Starting point is 00:26:21 It has had some uses outside that domain, but those exceptions tend to prove the rule. Backus developed Fortran, the formula translator for numerical computation. So maybe it makes sense that IAL would fit this same type of mold, a language for one specific purpose. Allow me to ruin that nice, simple understanding. Let's think about the state of programming languages leading up to 1958. We have Fortran, of course, which is a language meant for math. We have Hopper's family of languages, A0, Arithmetic, and a few more, which are math languages. Shortcode is another early entrant, and another language meant for numerical computation. The odd languages out here are
Starting point is 00:27:11 things like Hopper's Flomatic, which is a data processing language, basically long-form mathematics, and maybe something like IPL, which is a very early list processing language. something like IPL, which is a very early list processing language. The big show in town, the main application for machines in this period, was numerical computation. So saying a language is meant for all numerical computation may be implying that it was meant for just about all programming needs, at least in this era. Now, I want to be clear, the papers from 58 are a little cagey here. I'm not sure if this is the implication they're going for, or if they just meant that IAL is a math language for math nerds. Maybe they were thinking of the larger future field and putting IAL in the one existing category.
Starting point is 00:28:06 This is something that we need to keep in mind moving forward, that the ecology of the era impacts what IAL and ALGOL actually mean. ALGOL, in its 1958 form, feels old. It's kind of hard for me to easily describe what I mean by that. In large part, it has features and a certain look that make it feel like very early programming languages, which, to be fair, makes a lot of sense. Algo would inspire and help form many early languages, so of course it should feel like that early cohort. It just has a certain way of doing things that lines up with those earlier languages. One of the reasons for Algol's early influence comes down to how its development process worked. It was, as the original name suggests, a very international affair. In the final 1960
Starting point is 00:29:08 report on Algol, written by Peter Naur, we get a wonderful summary of this collaborative effort. Allow me to summarize the summary for you. In 1958, the first IAL report hits print and gets circulated. Over that year, programmers start to weigh in on the proposal. In 1959, a second conference is held, this time in Copenhagen. Naur takes some space to point out one part of this conference in particular, the hardware group. This was a group of programmers that sat down and started looking at how ALGOL could be implemented. programmers that sat down and started looking at how ALGOL could be implemented. Suggestions from this conference and from this hardware group were then folded into a new round of proposals. That same year, another conference was held in Paris with the same aims. In the States, the ACM collected feedback from programmers directly. Over 58 and 59, a huge community was forming, all to feed in ideas
Starting point is 00:30:07 and hammer out this universal language. At this point, the programming community in general was pretty small. I'd wager that most programmers, if not all, were exposed to ALGOL in this period. Between user groups, academic journals, conferences, newsletters, it must have been hard to not notice ALGOL's formation. This helped the influence of ALGOL spread far and wide. It also meant that many points of view from many different programmers could be incorporated into the final design. Those are both very good outcomes. But there are also downsides to this approach. ALGOL may have been one of the first languages to be designed by committee. This is often viewed by some as an absolute evil, but that doesn't have to be the
Starting point is 00:30:59 case. Rather, it can be dangerous. It can be hard to navigate. By using committees, it is possible to pull in a wild amount of experience and expertise, which can help the design. That's a huge pro. As for commons, well, I think it's best explained by Fred Brooks in The Mythical Man Month. That text is about managing programming projects, but I think it applies just as well to language design. Brooks argues that it's crucial for design decisions to come down to a single person, that one leader should have absolute power. This is necessary in order to create a cohesive product. If multiple people are all jockeying and vying over parts of
Starting point is 00:31:46 the design, then the system will fall apart. There will be visible rifts between different parts of the design, or parts of the system will be working against each other. The leader, singular, is able to produce a coherent design because they know all parts of the design, and they can thus choose what's best. At least, that's the theory according to Fred Brooks. The danger of a committee approach all comes down to incoherence of design and incoherence of vision. You have all these ideas coming in, some that are even contradictory, and you have to find a way to fit them together, or to weed out the good from the bad. For a single leader, that's easy. They have the final say, so they pick and choose what they think will work best. There's no argument involved, there's not even time spent for a long
Starting point is 00:32:35 drawn-out debate and a vote. But for a committee, argument is a core feature. That can, in some cases, lead to a catch-all approach. The broader committee may compromise on ideas instead of making tough and unpopular choices. It can also lead to a certain paralysis that causes the design process to go on way too long. The flip side here is that if a committee is able to navigate these hazardous waters well enough, then they can incorporate more good ideas than any one person could ever dream up in a lifetime. But these are very hazardous waters. According to Alan Herbert, one of the many involved in the ALGOL project, I.O. was a victim of this committee approach.
Starting point is 00:33:21 Quoting from an article in the Register, The defining committee couldn't agree on how to do input-output. They decided that they would be left to a library, and that library would be user-dependent. Instead of making a tough choice or doing the actual work, the committee balked. They backed down. To be fair, a standard around IO would have blown up the scope of ALGOL 60, but the continued omission was noticed. There's also just the general administrative maze here that we need to consider. The ALGOL session in HOPL 2 gives a more complete rundown than I could ever muster, but in short, ALGOL 60 was first drafted by one committee,
Starting point is 00:34:06 then passed off to another organization which spun up a special working group, which ratified and passed out the ALGOL 60 spec. There is layer upon layer of committees involved here, some with overlap in membership. So while ALGOL may have been the most scrutinized language to date, it was also the most highly bureaucratic. So, what all did ALGOL 60 do to improve the 1958 version?
Starting point is 00:34:32 In large part, this newer version was aimed at further standardizing and codifying the language. The 58 version was pretty well defined, but things could go further. The biggest battle here was against ambiguity. An algo program only means one thing. It can only ever have a single interpretation. In the initial IAL report, Bacchus introduces a formalized syntax for describing the language. This is what we would now call a metaprogramming language or a
Starting point is 00:35:06 metalanguage. It's a way to very specifically describe how a language is constructed, to explain its syntax. For ALGOL 60, Naur expanded and refined that metalanguage, which allows for a full description of ALGOL syntax. This description has the same type of rigor that you'd find in a mathematical proof. It's clear and it's unambiguous. This meta-language is eventually named the Bacchus-Naur form, or BNF, and it's still actually used today for describing syntaxes. The 1960 spec is also a lot more clear on how ALGOL should act, or its semantics. Each language feature is explained in terms of standardized syntax in BNF, semantics, any caveats, any restrictions, and any possible side effects.
Starting point is 00:35:59 The spec is, well, it's very specific. It's also written in a way that it can be easily referenced. Each section is numbered and each subsection is also numbered, so you have citations to an exact chunk of the spec. These factors all tightened up that reference language. So let's talk a little more in detail about that spec. One of the big features that ALGOL really hammered out is scope. This is a huge realm of research, so prepare for an understatement. A core feature of ALGOL is the block. This is a mechanism used to group code. In ALGOL, it's done using the words begin and end, but later languages like C use the
Starting point is 00:36:46 more ubiquitous curly braces for the same feature. It's really just an encoding thing more than anything. Blocks can contain other blocks, forming a nested structure. Each block has its own local variable scope, meaning that if you declare a variable inside a block, that variable is only accessible within that block or blocks nested inside it. This means that something like a procedure, itself defined as a block, can keep its variables isolated from the rest of your program. It's one of those very simple fundamentals that's crucial to modern programming. It starts with IAL, and ALVOL 60 refines and better defines its
Starting point is 00:37:32 behavior. Other improvements are more pragmatic. Variables are a good example of this pragmatism. In the IAL spec, variables can be defined anywhere. There's no reason this shouldn't be possible. Indeed, most modern languages let you do this. But practically speaking, there are some issues with that. It mainly comes down to things like memory allocation and compiler complexity. ALGOL doesn't have a mechanism for explicit allocation. You can't ask the language to give you a chunk of memory. Instead, it allocates memory when you declare a new variable. It's more efficient to do that all at once, or at least to keep that part of your process
Starting point is 00:38:17 separate from active code. The solution in ALGOL 60 is to put all declarations at the beginning of their block. I personally think this is coming from Fortran programmers. In that language, variables have to be declared at the beginning of your program or function. Another pragmatic change was the removal of so-called inline functions. This clears up something that struck me as strange and also kind of antiquated about IAL. In early specs, you could define a function like you'd define a variable. As in, you just say I have a function named x, it takes an argument, y, and it returns a value. That would be done as a single line, a function name set to an equation.
Starting point is 00:39:08 That's a neat shortcut, but it's actually kind of messy. IAL already had syntax for defining procedures. In fact, that syntax is very sophisticated. Adding these extra functions is weird. Now you have two ways to do the same thing, but one way is more limited than the other. ALGOL 60 just gets rid of the difference. A function is just a procedure. Often, colloquially, this meant a procedure that returns a value, but that's not actually in the spec. You just have procedures and call them whatever you want, but to ALGOL, they're all procedures. This type of streamlining makes ALGOL60 seem more mature and more modern. It's being reduced to just what's actually needed by dropping out extra
Starting point is 00:39:52 or redundant features. I think this is a really good hallmark of a mature language. It's something that we actually see happening in Lisp in the same time period. The earliest versions of Lisp had two types of expressions, the M expression and the S expression. These expressions had different syntaxes, which made for a very clunky language. The idea was that M expressions would find data and S expressions code. Basically, you have two languages in a trench coat. That's not the best situation to be in. As Lisp was refined, M expressions were dropped by folding their functionality into the S expression. This led to a simplified syntax. In other words, Lisp became more simple and more streamlined as it matured. Here we see ALGOL following that same path
Starting point is 00:40:46 concurrently. We also get little things in 1960, like new types. IAL was purely numeric. You had integers, reals, and booleans, which were actually one-bit numbers. ALGOL 60 adds strings. ALGOL60 adds strings. Kind of. This is where things get a little strange with the language. According to the BNF source code, the source of all ALGOL truth, variables can only be typed as integers, reals, and booleans. But strings are defined in the spec. There's a whole section on strings. That seems kind of weird, right? Well, here's the thing. The spec needs to have a definition
Starting point is 00:41:35 of strings in order to, well, define other structures. Things like variable names are defined in the BNF as strings. You can also technically have string literals in ALGOL. That means you could write some text inside of quotes, but you can't assign a variable to hold a string value. In that sense, strings are kind of this weird second-class data type. You may be able to pass them around. You can use them. You can use them to describe the language, but you can't actually store them in memory. Strings are also defined in an ambiguous way. To paraphrase from the BNF itself, a string is any sequence of symbols not containing open or close quote or empty. So here's the ambiguity. The term symbols is not defined in the meta language, but basic symbol is. That class is defined as any upper or lowercase letter, number,
Starting point is 00:42:46 That class is defined as any upper or lowercase letter, number, boolean, value, or delimiter. So, the actual legal values of a string are a little bit up in the air. Call that an exercise for the reader, if you will. I do want to point out something about the inclusion of strings, though. Actually, two somethings. Adding strings to the spec means the spec can talk more precisely about the language. Ambiguity about what a string actually is aside, the document can now say things like, a program is composed of strings of text. It can also talk about textual data, which is very useful. This also points towards, I think, a drive for
Starting point is 00:43:27 more general purpose language. String processing exists outside the realm of purely numerical computing. ALGOL60 isn't there yet, but it's taking a step in that direction. Now, once again, Now, once again, we need to go back and remember what's missing. That is, input and output. ALGOL60 is a refinement of the earlier 1958 spec, but it's still missing any guidance on input and output. This is, yet again, a huge and very costly omission. The first implementations of ALGOL60 came in, well, 1960. That was a reference implementation, written by some of the folk who developed the spec itself.
Starting point is 00:44:15 But as ALGOL60 spread and was implemented elsewhere, the specification shifted. In theory, the specification could be used to make a 100% compatible compiler. But there were certain features that were left open to interpretation. Input and output were some of those features. Different implementations handled printing data in different ways. Some used a print statement. Some used display. Some called it write. Some had simple procedures that accepted a string or a number and printed it out, unadorned. Other implementations could format text and format data in any number of ways. It wasn't just that different compilers used different words,
Starting point is 00:45:00 but their input and output functions worked in totally different ways. words, but their input and output functions worked in totally different ways. In practice, this meant that different versions of ALGOL were not compatible. If you were careful with how you wrote your code, if your I.O. was kept isolated and made very generic, then your code may be portable. It could be. But that put the burden on the programmer, not on the design of the language. Do you trust everyone in the world to follow the same safe practices? To restrict I.O. to isolated routines that all spit out the same format of data? No, of course not. No one in their right mind would believe that programmers would just do that on their own without guidance. The ALGOL 60 spec gave no guidance on those measures.
Starting point is 00:45:57 There were also more fundamental issues with the language, outside of, you know, the IOWOs. These are outlined in part in this paper written by Donald Newth called The Remaining Trouble Spots in ALGOL 60. This focuses primarily on ambiguities in the specification. I actually think it's a pretty riveting read, for the most part. There's some spots that drag. One particularly interesting issue that Newth points out are so called incomplete functions. So, quick background here. One of the groundbreaking things about ALGOL is that it was a structured language. Code is grouped into blocks. You have reusable units of code that can be very well formatted. Procedures, as I mentioned before, are made out of blocks. Despite this focus on structure and correctness,
Starting point is 00:46:48 ALGOL60 has an interesting and scary feature, GOTO. Blocks of code can be labeled, and you can jump to that label using a GOTO statement. This is viewed by many programmers as, well, we have a word for it. It programmers as a bad thing. The reason is that gotos can get weird. Newt's specific problem is that a procedure can have a goto that exits that procedure. The ALGOL60 spec doesn't say anything about preventing this behavior. So when you call a procedure, you actually have no guarantee that the procedure will return. It could jump off to somewhere else. And while that does offer some
Starting point is 00:47:34 convenience, it has a cascading effect on code complexity. It's, well, it might be considered harmful. It puts you in this mindset where if you think go-tos might be around the corner, you have to check that your functions are returning, and that you're not just being jumped to. It gets weird really quick. While 60 was an improvement over the previous reports, I think it's clear it still had issues. That said, it was a pretty slick little language. ALGOL60 is simple, flexible, and relatively clean. Weird ambiguity issues aside, a lot of asides this episode.
Starting point is 00:48:16 I honestly think that a lot of the flaws only shine through due to the amount of scrutiny on the language. But we must ask, where was there to go from here? The Algol 68 session in Hopple 2 reads something like a soap opera. It gives this wonderful blow-by-blow of all the conferences that lead up to the final Algol 68 specification. the conferences that lead up to the final ALGOL 68 specification. Along the way are twists, turns, betrayals, resignation, and depression. There are actually passages where C.H. Lindsay, the author, talks about how depressed attendees were getting. Usually, it seems like they had bad Thursdays during these week-long conferences. This was the point where I realized that I do
Starting point is 00:49:06 actually need to split this up into multiple episodes. ALGOL 68 is a beast. Part of this is simply due to changes in the programming world. IAL had been devised when a scant handful of languages existed. By 1968, languages like Lisp, BASIC, PL1, Simula, Snowball, and COBOL were all in common use. We had transitioned from a world where most programmers used assembly language and maybe some tools to a world where object orientation existed. Programming had changed. Computers had changed. The idea of a universal language had a whole new connotation. Algol 60 was finalized right on the cusp of this linguistic explosion. It should come as no surprise that as soon as the ink was dry on the report,
Starting point is 00:50:02 there were plans for a successor. There were, of course, the usual revisions and extensions. You can even find reports on proposed I.O. standards for ALGOL60. The language remained a hot topic, but over the years, it became apparent that the 60 spec just wasn't good enough. We've already gone over some of the issues with Algo 60, but what were people trying to address in this period? What were people trying to change? Here I'm pulling from two main sources. The first, from 1964, is Duncan and Van Ving Harden's Cleaning Up Algo 60. The second, from 68, is Peter Naur's Successes and Failures of the Algo Effort. I can recommend the second paper for anyone who wants a fun and spicy read. It's
Starting point is 00:50:55 personally kind of wild. I love it. Anyway, these papers bookend the Algo 60 era, so I think that will give us a good understanding of what needed to change and how people were trying to address that. The cleanup paper is particularly interesting because it really takes advantage of the structure of the ALGOL 60 spec. This is something that, I think, is one of ALGOL's greatest strengths. And it's a unique strength, to be sure. algol's greatest strengths. And it's a unique strength, to be sure. Since the specification is so formal and so structured, it's possible to describe really fine-grained changes to the spec. The cleanup paper references exact locations in the revised 1960 report. It, for instance, will say which numbered section needs to be replaced and with what. It's kind of like reading a software patch, but for a language specification.
Starting point is 00:51:49 It's some really neat stuff that you only get when you have this level of formalism. The recommendation here is pretty smart. The big thrust is to add two new data types. The first being the string. We already saw that ALGO60 almost had strings. The cleanup paper adds strings as an actual data type, as in, you can declare a string variable. The paper includes all the adjustments needed for this new type, and includes a set of functions for string manipulation. This includes, and get this, input-output functions. That's right, actual algol that can print something. The spec here is,
Starting point is 00:52:38 actually, pretty simple. It describes procedures for reading in a symbol and for printing out a symbol. It's also left ambiguous where those symbols come from or where they go. I think that is the perfect approach, the one that should have been adopted way back in 1958. You get some function names, how they behave, and then it's up to the implementers to figure out if out symbol prints to a teletype or a paper tape, or if in symbol can read from a punch card. That is just so reasonable. Now, I did say there were two new data types. The second is, by far, I think the cooler of the two. By far, I think the cooler of the two.
Starting point is 00:53:28 This paper recommends making label a data type. To quote, This is one of those things where, if you know, you know. ALGOL 60 is a very restrictive language. It hides away memory, so you can't do something like jump directly to a memory location or muck up data in RAM. If you want to ruin some data, you have to do that using variables. And if you want to jump around and have issues, you have to do that using variables. And if you want to jump around and have issues, you have to do that using goto and label. In ALGOL 60, labels existed in the same way as
Starting point is 00:54:12 strings. They're part of the spec, but not part of active code, if that makes sense. Labels are defined and you can use them, but you can't store them as a variable because there's no label type. Cleaning up ALGOL recommends that labels be elevated to a full data type. That may sound silly at first, but this is actually a wild feature. This means you could have a variable take on the value of a label, and then you could jump to that label. You could make these dynamic go-tos that, if used wisely, would radically simplify code. The switch thing here is particularly cool. It reminds me of how I like to write code. A switch is this construct that is kind of like a special-purpose if statement. Basically, you set up clauses that
Starting point is 00:55:06 execute for a whole set of given values. That's usually used if you have something like a numeric value and want to do something different for each possible value. You could use a pile of ifs, but a switch gives you a shortcut there. The trick here is that ALGOL60 lets you make an array out of any data type. So all you have to do once you have the label data type is make an array of labels. Then if you have some condition that runs when your input number is 3, well, that just means you have to go to the third element of your array. That makes switches redundant so you can remove them. This is a very common approach in something like C or assembly language, where you can write code that basically says,
Starting point is 00:55:58 execute this location in memory. When I've used this in my own code, I'll call it something like a jump table or a lookup table. It makes for a really fast, efficient, and kind of elegant way to handle conditionals. But since ALGOL60 prevents you from doing that, you need a little bit of formalism. Elevating labels to an actual data type fixes that limitation. I see this as a refinement. It's a small change to the spec that would make ALGOL60 much more flexible and much more powerful. It also removes a language feature, the switch, which may make ALGOL60 easier to understand. In general, the fewer features, the better.
Starting point is 00:56:47 Adding just two data types would make ALGOL more in line with its competition and make for a simplification of the language. Plus, you know, the whole input-output thing would make ALGOL 60 actually usable and prevent the spread of off-spec implementations. actually usable and prevent the spread of off-spec implementations. The cleanup paper also introduces something, well, concerning that we need to keep an eye on. It says the evil words ALGOL-X and ALGOL-Y. In 1964, it was clear that ALGOL 60 needed to change. But it was unclear how it needed to change. There were two camps on this. Two possible proposals. X and Y, with some overlap and arguments and crossover. It's, well, it gets complicated. ALGOL X was the more conservative
Starting point is 00:57:42 route. A plan to extend and refine ALgal 60 into a more general-purpose language. At this point, Algal was still very much numeric. I think it's plain to see that from the data types alone. It has numbers and lists of numbers. While that may have been reasonable for 1960, it wasn't reasonable for 1964. may have been reasonable for 1960, it wasn't reasonable for 1964. That may sound like a short time, but progress was fast in this era. In those four years, new languages and, beyond that, more sophisticated computers were hitting the scene. A purely numeric language, while useful, was losing its shine. In general, we can think of ALGOL-X as keeping the bones of ALGOL-60, but updating it for the coming years. ALGOL-Y, on the other hand, was a more radical proposal. It would also shift over time, but here I'm just going off the very earliest definitions for this
Starting point is 00:58:39 part of the project. Y would be a new language, and it would be homo-iconic? How's that for a $5 word, huh? We've talked about this on the show before, and it's actually one of my favorite topics to pull out and confuse folk with. A homo-iconic language can represent and manipulate its own code. As in, it can describe and talk about itself using its own syntax. Homo here meaning, well, speaking the same language, and iconic just meaning representation. This is a radical departure from a simple numeric language, but I think it follows the spirit of algol in a way that, maybe if not direct, at least rhymes. Homo Iconic languages tend to have a sort of elegance, both in practice and definition. Lisp is the usual example here.
Starting point is 00:59:35 That language, which was largely developed after the ALGOL 60 spec, is able to describe itself. You can define how Lisp works in terms of Lisp expressions. That means that a rigorous specification for Lisp can be written in Lisp. That can feel a little mind-bending at first, but in practice, it works just like how BNF is used in the ALGOL60 spec. You start with simple primitives like the expressions for composing items and lists, and then build up from there. BNF does the same thing. It starts by defining things like letters and numbers. A huge benefit to homoiconism is it allows you to describe both syntax and semantics using the language. You can use Lisp to describe what Lisp looks like and what that code actually does. This, I think, is perfectly in line with the specification-focused world of ALGOL. It would turn the language itself into the specification, a living monument in and of itself.
Starting point is 01:00:48 You wouldn't need a separate meta-language like BNF. You'd just have one unified language. Homo Iconism also allows for a wild level of flexibility. You can write a Lisp program that produces a program and executes it. You can also write a Lisp program that can take code as input, make changes, and then spit out a modified program. That's not a hack or a trick. That's just how the language is designed. That said, it lets you pull some wild tricks. However, there is a downside here. Homo Iconic languages tend to be weird.
Starting point is 01:01:31 They don't have to, but they tend to be. Lisp, Forth, Track, you know, just kind of strange languages are Homo Iconic. Languages like that are hard to understand, especially for the novice. And while homoiconicity is powerful, it can be a difficult tool to wield wisely. This is actually, indirectly, one of Peter Naur's frustrations with the direction of Algol. Attempts to improve the language and the specification led to confusion. To quote, Attempts to improve the language and the specification led to confusion. To quote,
Starting point is 01:02:11 Insofar as the ALGOL60 report has made it clear that a carefully worded, defining description is of great value for the work with programming languages, so far the description of ALGOL60 may be counted as a success of our effort. But this must not blind us to the need for descriptions oriented to human readers. In view of that fact, the ALGOL 60 report has gained the reputation of being very hard on the uninitiated reader. We must conclude that on this score our efforts have been a failure. For the language to be universal, it needs to be universally understood. A precise spec was part of that, but so was a certain level of simplicity and a certain level of human orientation.
Starting point is 01:02:57 During the 1960s, the work around Algol would turn into a grand drama. In 1968, Peter Naur himself would resign from the working group. Many had resigned before him and many would resign after. The transition from Algol 60 to the eventual 68 is a wild ride to say the least. That's going to come next time. But let's just stop at this moment in time. Naur cites a particular success during this period. The Algol effort was the first attempt to really put programming languages under the microscope. It brought a type of vigor and power to language development that, up to that point, had been reserved for the realms of mathematics and the hard sciences. The language itself had problems,
Starting point is 01:03:45 that's impossible to deny. Its creation, however, would change the face of programming. That puts IAL and ALGOL60 in a very interesting position. They never strike it big, never become hugely popular language, they never become universal. Their influence, however, can still be felt today. Thanks for listening to Advent of Computing. I'll be back in two weeks' time with a new piece of computing's past. If you like the show, there are a few ways you can support it. If you know someone else who'd be interested in the history of computing, then please take a minute to share the podcast with them. You can also rate and review the show on Apple Podcasts. If you want to be a super fan, you can support the show directly through Advent of Computing merch or signing up as a patron on Patreon. Patrons get early access
Starting point is 01:04:34 to episodes, polls for the direction of the show, and bonus content. You can find links to everything on my website, adventofcomputing.com. If you have any comments or suggestions for a future episode, then please either shoot me a tweet or send me an email. I'm at adventofcomp on Twitter. And as always, have a great rest of your day.
