Advent of Computing - Episode 78 - INTERCAL and Esoterica
Episode Date: March 20, 2022

Programming doesn't have to be a very serious discipline. In fact, sometimes it's better if it's a little silly. Today we are talking about INTERCAL, the first esoteric programming language. Is it a... joke? Is it a form of hacker folk art? Is it even a good language? To answer those questions we need to assess what makes a programming language "good" in the first place.

Program INTERCAL online today! (https://www.tutorialspoint.com/compile_intercal_online.php)

Selected Sources:

https://archive.org/details/intercal-ref/mode/1up?view=theater - 1973 INTERCAL Manual

https://esoteric.codes/blog/don-woods - Interview with Don Woods

https://sci-hub.se/10.1145/800197.806048 - 1965 TRAC paper
Transcript
Work with computers long enough and you start to notice how absurd everything is.
Personally, I think too much exposure to digital machines does something maybe bad to a person.
On a very fundamental level, computers aren't conducive to human use.
We don't think in terms of logic operations and instruction pipelines. Those are very artificial
constructs that we designed to help simulate math using silicon. We've developed some good tools for
managing these weird machines, but bridging that gap from the natural to the artificial
will always be a rough process. This roughly hewn bridge has many manifestations,
you could call them splinters. You can see it in the annoying and seemingly endless quirks
of computers. Maybe your printer just plain won't listen to you today. Maybe your hard drive is
fragmented, whatever that practically means. Or maybe you've been working on a program all day,
and through seemingly no fault of your own, it just won't run. This affliction hits programmers
particularly hard. Programming languages, for all their spit and polish, can really be a pain to use.
What's the difference between pass-by-reference and pass-by-value?
Are lists and arrays always interchangeable? What if they're not? What happens then? Why aren't all
errors fatal? And why do I need to end each line with a semicolon again? It just all seems so
arbitrary. It can become frustrating very easily.
So what's an intrepid binary basher to do?
I've heard it said that laughter is the best medicine, and I think that's true in this case.
When you grow tired of real programming, when C just isn't cutting it,
or PHP has gotten a little long in the tooth,
why not turn to something
a little more fun? Why not give an esoteric programming language a try?
Welcome back to Advent of Computing. I'm your host, Sean Haas, and this is episode 78, Intercal and Esoterica.
Today is a very serious episode, so I'll abide no grins and definitely no chuckles. We're going
to be talking about a type of programming language that breaks the mold, innovates new features,
and fascinates us serious practitioners. Today we're looking at esoteric programming
languages. Now, it can be a little tricky to sum up esoteric languages. It's kind of a
I'll-know-it-when-I-see-it kind of thing. I've often seen them referred to as joke programming
languages, and while I can understand where that analysis is coming from, I think it
minimizes these languages. I personally prefer to view esoteric languages as a form of art.
In general, an esoteric language is one made without really practical goals. This can,
in some cases, be done as a joke. One example would be the modern LOLCODE, a very vexing language
that pulls its syntax from memes. Or whitespace, a language that uses only spaces, tabs, and new
lines. But there's more than just a sensible chuckle here. The often canonical example of this art is a language known as Brain Expletive, shortened to BF for mixed company. I'd say the full name, but I'm trying to keep the show from being marked as explicit. BF is a Turing-complete language, meaning it can be used to program anything programmable. That's a fun, recursive definition. The schtick is that it's about as close to a Turing machine as it gets.
You don't have addressable memory.
You have an infinite tape with a head.
You can increment or decrement the value of the tape at the head's location.
You can move the head, you can do a simple loop, and you can do input and output.
But that's it.
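To make that concrete, here's a minimal sketch of a BF interpreter in Python. This isn't any particular real implementation, just the handful of commands as I described them. The tape is bounded here for practicality, and cells wrap at 256, which is a common but not universal convention.

def run_bf(code, user_input=""):
    tape = [0] * 30000          # the "infinite" tape, bounded for practicality
    head = 0                    # the read/write head
    pc = 0                      # position in the program text
    inp = iter(user_input)
    out = []
    jumps, stack = {}, []       # pre-match brackets so loops can jump both ways
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(code):
        c = code[pc]
        if c == "+":
            tape[head] = (tape[head] + 1) % 256   # increment the cell at the head
        elif c == "-":
            tape[head] = (tape[head] - 1) % 256   # decrement it
        elif c == ">":
            head += 1                             # move the head right
        elif c == "<":
            head -= 1                             # move the head left
        elif c == ".":
            out.append(chr(tape[head]))           # output one character
        elif c == ",":
            tape[head] = ord(next(inp, "\0"))     # read one character of input
        elif c == "[" and tape[head] == 0:
            pc = jumps[pc]                        # the one and only loop construct
        elif c == "]" and tape[head] != 0:
            pc = jumps[pc]
        pc += 1
    return "".join(out)

# 8 * 8 + 1 = 65, which is ASCII for "A"
print(run_bf("++++++++[>++++++++<-]>+."))

Even that tiny sample program should give you a feel for why BF code is so hard to read: every instruction is one character, and none of them tell you what the program is for.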
BF is minimalistic, to say the least, but it can be used to program
anything you could write in C or Java or Fortran. BF just happens to take a little more code to get
things done. That and all the instructions are a single character, so the code ends up being almost indecipherable. So why would anyone want
anything to do with BF? Why would you want anything to do with any of these esoteric languages?
Partly, it's just fun to jump through the looking glass on occasion. And partly,
I think it makes for a good challenge. For me, I'm a huge fan of BF.
Last year, I actually wrote an interpreter for the language that runs under MS-DOS as,
you know, a challenge.
It was a fun exercise.
Working with these weird languages is a really neat way to stretch your imagination.
Esoteric languages also offer a neat way to deconstruct and recontextualize programming. I know,
more pretentious language, but I have to use it somewhere. Programming in BF, you soon learn what
really matters to a computer, or at least to a basic Turing machine. I'd argue that's not only
useful, but there's something pleasing about it. Esoteric languages can make you think about your everyday
life much in the same way artwork can. The final factor here that pushes esoteric languages out
of the realm of mere jokes is the time involved in their creation. I'm all for putting a lot of
effort into a good prank, but there has to be a limit, right? Esoteric languages are designed,
implemented, and programmed in the same way as their more serious cousins. That's really a lot
of work for a joke, no matter how you look at it. So in this episode, we're going to be digging into
why esoteric languages are fun and interesting. I think it's only right that we take
a bit of a meandering approach here, so prepare for a bit of a fun and funny episode. The other
complicating factor is most esoteric languages really come to prominence in the 90s, but instead
we're going to be focusing on the early era. We're going to start off with one language called Intercal. How were programming
languages, a very serious and academic endeavor, able to turn into jokes and art? What made that
change possible? And I think this is a fun thing to really turn around in your head. How were
esoteric languages similar to their better-known and much more serious counterparts? We're going
down a weird branch of the programming language evolutionary tree, so I think we need to reassess
the roots that we're coming from. To start with, at the most basic level, what's the point of a
programming language? To paint with a very broad brush, I think there are four main reasons for high-level
programming languages to exist, that is, any language that's not assembly or machine code.
To begin with, high-level languages make programming easier and, by extension, faster.
That was at the root of really early programming developments. Rear Admiral Grace Hopper's work on compilers came out of a need to reduce programming overhead,
or, to paraphrase from her, because she was lazy.
Fancy computers aren't that useful if it takes months to write a single program.
At their core, programming languages are really productivity tools.
That might make them sound boring, maybe like they're Excel, but that's really the root of it.
A programming language, if it's composed properly, will make you more productive.
Second, high-level languages are easier to learn.
That counts for new inductees to the dark arts, as well as seasoned pros
picking up a new language. This was one of the core factors behind the development of
Fortran. John Backus wanted a way for engineers and mathematicians to use computers. And hey,
a big fancy computer isn't really of much use if you're the only one who can use it. Third, high-level languages are easier to debug.
And when I say easier, I'm comparing them to machine code and assembly language. That's the
baseline comparison we have to keep in mind. I know debugging C can be a huge pain, but
trust me here, it's leaps and bounds better than trying to deal with low-level bugs.
Debugging as part of the programming cycle is just as important as, well, you know, programming.
Once again, cribbing from Hopper, a program in a high-level language will have fewer lines of code.
Each line will compile out to multiple, usually many, processor instructions.
That means there are fewer spots for human error.
A fancy machine is useless if you don't have a working program.
And finally, four, high-level languages make it easier to port a program to another computer.
This doesn't come up until later into the development of languages.
When there's just 10 computers in the world, it doesn't really matter that your code only works for your nearest
machine. Once we get to 100 systems, or maybe 1,000, then you start to think about sharing your
work. Assembly language is explicitly tied to the architecture you're working on. You're directly
telling the computer what to do.
That means programs in assembly aren't well suited to jump the gap from one type of machine to another.
High-level languages, if designed properly, can easily jump that gap.
The prime example here, despite its actual lack of adoption, is ALGOL.
The language was built explicitly to run on any machine out
there. I guess the snappy tagline for this point is that a big fancy computer can be made more
useful if it has friends. So there we have my, admittedly partly stolen, views on why programming
languages need to exist. Call it Sean's rationalization theorem. One of the keys to
the success of high-level languages is that these four goals have a compound effect. They form
something greater than the sum of their parts. This is especially true once portability enters
the mix. With good programming languages comes better software. Advances in programming have made more advances possible
in an unending cycle that continues to this day.
If code is easier to write, then people will write more code.
If more people are writing code, then, well, we get more code.
If code is easier to debug, then we get more working code.
The working part is surprisingly important.
And if code is portable, then, once again, more people are able to contribute.
Projects can get bigger, collaboration becomes more widespread,
and we all live happily ever after.
It's a nice little vision.
Of course, the path to that rainbow isn't an easy one.
So how do language designers push us towards this brighter future?
This is the kind of thing that really varies from language to language, but there are a
few common routes.
Clean and easy to understand syntax is the most common denominator.
By syntax here, I just mean how the language works, which characters
and words and structures it uses to, you know, lay out its code. The whole point of a language, if
we even simplify further, is to protect programmers from the deeper machinations of a computer.
You can do that by replacing a series of archaic instructions with, say, a nice and simple plus sign.
Clean syntax. Easy.
Now, there are two major syntax goals that most languages strive towards.
English-like syntax and traditional mathematical notation.
Both of these goals run into problems, but even Fortran looks closer to English than the bits and
bytes that a computer actually reads. A language like BASIC takes this further by trying to limit
things down to using only short, pronounceable English words. BASIC is actually a neat example
here because its designers tried to limit the use of non-alphanumeric characters. The result is a language that you can
read aloud. At least 90% of the language can be turned into spoken word, and that really makes it
nice for classrooms. Then we have the factors of consistency and accuracy. Programming can be seen
as more of an art than a science, sure, but that's only really possible
when the language makes some self-consistent sense.
You should never be surprised by the language itself.
That's, to me at least, one of the most important factors in deciding if I want to use a language
or not.
Maybe you get shocked by some really sweet code, but the linguistic component should be as mundane as possible.
Now, that's not to say every language is fully self-consistent.
I can think of a short list of languages I've used that violate this rule, but they shall remain unnamed for now.
The point here is that if a language is boring, that's not a bad thing.
Accuracy here partly overlaps with consistency, hence why I'm grouping them.
You want your language to be powerful enough that a programmer can accurately describe
their problem in that language.
That can mean including good support for mathematics, handling order of operations consistently,
or even just having
nice clean syntax for jumps. Stuff needs to work the same way every time, and the programmer needs
to be able to arrive at a reasonable chunk of code every time. Another tool, or maybe a meta tool,
is specialization. Hopper was of the view that we shouldn't try to work towards one
unified programming language, but instead that we should create languages specialized to specific
jobs. Call it a Hopperian specialization, if you will. By limiting your design goals,
you can focus your work. That's good in any project, and especially useful for programming languages. Fortran,
once again, is the shining example of this mindset. The language was tailor-made for math
and engineering, so much so that it's still in use in those niches today. Now, that's a super
fast crash course in why we use programming languages and what makes them quote-unquote good.
At least, I think this gives us a good measuring stick for ruthlessly judging and picking apart programming languages.
This has all been to set up the introduction to a joke.
Now, I know, the setup-to-punchline ratio here is totally whack, but I want this to have the proper impact.
Quote,
The Intercal programming language was designed the morning of May 26, 1972,
by Donald R. Woods and James M. Lyon at Princeton University. Exactly when in the morning will become apparent in the course of this manual.
It was inspired by one
ambition, to have a compiler language which has nothing at all in common with other major
programming languages. End quote. Get your jimmies rustled, computer nerds! That's from the introduction
to the 1973 Intercal Manual. Like I said, the entire setup here is to give us the context
needed to talk about this language. Take everything I've said so far and flip it.
Intercal isn't easy to program in. It's not easy to learn, and it's not easy to debug. It's
theoretically portable, but why would you want to port it? It doesn't have clear syntax. Each line
is an unreadable and unreasonable mess. It's not even consistent or accurate in execution.
You can write a program that doesn't act the same on two subsequent runs. It even has the native
ability to break itself so badly it's unusable. What on earth is this anti-language?
Well, according to some, Intercal is the first esoteric programming language. At least,
it's been retrospectively labeled that. You could call it a joke and be done with it, but
there's more going on. Intercal is a response to the state of programming in the early 70s.
I've seen some describe it as hacker folk art, and I don't think that's too far off from the truth.
I actually really like that descriptor here.
As the manual explains, Intercal was drafted on a coffee-fueled morning in 1972.
But the seeds to the language were planted a little earlier.
Its creators, Don Woods and James Lyon, were undergrad students at Princeton.
Woods' name should sound a little familiar.
After graduating from Princeton, Woods moved out to Stanford's AI lab,
where he became an early contributor to the jargon file.
In 1977, he partnered up with Will Crowther to improve and expand Colossal Cave Adventure. You could call him a bit of an icon. Lyon is more low-profile. When I initially ran into his name, I assumed it must be the Lions of Lions' Commentary on UNIX, but that's just not the case. Just different spellings, Lyon versus Lions, and different first names too, actually. Aside from work on Intercal, Lyon is a bit of an unknown quantity. Luckily,
Woods has been forthcoming with details about the language. According to a 2020 interview
with Esoteric.codes, Intercal began exactly as one might have expected: a joke. Quote,
In truth, it all began with the renamed punctuation. I no longer recall what specifically
led the conversation down that path, but we were trying to come up with names that all sounded
alike/alliterative, so that reading a line of code became a sort of tongue twister. I think the SP terms were first: spot, spark,
splat for period, apostrophe, and asterisk. And then we added other silliness such as the
embrace/bracelet pair. At some point, we decided there should be a programming language
that used this character set. End quote. Honestly, most programming languages are already tongue twisters. These
languages are all meant to be shoved into a computer for processing, at least that's always
been the primary goal. What Woods is getting at here, and what will be crucial going forward,
is the idea that programming languages are more than these isolated tools for talking to a machine.
What do you do if you want to collaborate with another programmer? How do you talk about software?
I think this is an interesting point, especially during the era where in-person collaboration was the best you could get. You don't want to just turn in your chair, look someone dead in the eyes, and start gibbering off some kind of incantation. In Intercal's manual, Woods and Lyon list off the languages that they're
modeling against, so to speak. So this gives us a good point of reference. It's a relatively big
list, there's a dozen entries, but it starts with Fortran, so I'm pulling from that. And hey, I know Fortran, so it's easy. Let's just look at
the syntax for an array. To access the second element of an array in Fortran, you would write
array, open parenthesis, two, close parenthesis. It's kind of clunky to say out loud, right?
It's better to just say, yeah, and then I grab the second element of array.
That's a kind of verbal shorthand that's easy to take, since, you know, the alternative
is sounding out each pair of parentheses.
No one really wants to do that.
And while it's rare to try and read an entire program aloud, it's distinctly more common
to tell a co-conspirator, hey, my code is breaking at that line. Yeah, you know, where I try to get
the second element of array. That's how it is in the normal world. But we're in intercal land now.
Everything is seen as a parody through a funhouse mirror. Woods and Lyon started constructing
that funhouse by making punctuation worse. I'm going to try and do my best not to give a syntax
rundown since, well, this is an audio podcast and there are better resources for that, but I have to
share some of the greatest hits. In the 73 manual, the pound sign is renamed
mesh, just so later on, the equal sign can be called half mesh. The exclamation mark,
which Intercal uses as a shorthand for single quote period, is just called wow.
I don't know why, but something about that gets me.
After the broken symbols, the rest of the language was hastily drafted.
And, oh boy, Intercal is quite the language.
It systematically breaks all four parts of Sean's theorem of programming language rationality.
And hey, that's what makes it fun.
No one's expected to actually use this
language, right? Right? To kick things off, Intercal has an evil variable system. You know
how much I care about memory? Well, Woods and Lyon do my precious memory dirty. Okay, so you get four data types, each with explicitly identified gross notation.
A 16-bit integer variable's name starts with a period, aka a spot. A 32-bit integer starts with a colon, called in Intercal a two-spot. You also have arrays of each variable. So for 16-bit arrays, you use the tail, aka the comma, and for 32-bit arrays, you use the hybrid, also known as the semicolon.
That part's just kind of dumb, but it really isn't that bad. Some languages, Perl is one example, follow a similar practice in order to differentiate variable types. In Perl, scalars all start with a dollar sign. That's
variables like numbers or strings. Arrays all start with the at sign. It's done partly to keep
namespaces separate and partly to make reading code easier.
In a Perl program, you can immediately tell what type of data you're working with.
In Intercal, this is done for a very different reason.
It makes code less readable.
The kicker here is that variable names in Intercal are all numeric.
Any number that would be a valid integer
is also a valid variable name. So to wrap this together, you get beautiful variables like
semicolon 5, the array of 32-bit integers. That's nonsense. Now, 16-bit variables are the most egregious. Since they all have to start with a period, they look like decimal numbers, so you get examples like .1 or .5. It just rubs me the wrong way. We could overthink this a little, which, I mean, come on, you know me.
Named variables are actually really crucial to programming languages.
This is one of those obvious things, but it's worth slowing down to look at.
A variable is really just a name that a programmer gives to some location in memory,
some chunk of bits in RAM. The job of a language is to
deal with that name-to-memory-location shuffle for us. This is so fundamental that most assembly
languages even support naming chunks of memory. That should be a sign that it's basic. In this
lower level, it's usually called a label, but I think the analogy holds.
The underlying address for a variable is just a number, some address in memory, like 1,
2, 3, 5, 8, 9, some nonsense.
But we prefer to call it x, or number of oranges, something we can actually remember that has
meaning to a human.
Intercal strips that away in what I think is a funny way.
Sure, it gives you named variables.
It's just that those names happen to be, well, they're numbers.
The result is you get this beautiful semicolon 5,
which is a named variable that's stored at some address
in memory. But chances are good that it's not even stored at the actual numeric address 5.
Operators in Intercal are honestly worse than variables. At least the variable system is
workable. You can assign values around, even though
you have to use an arrow instead of an equals and dumb numbered names. Intercal gives the programmer
five operators: interleave, which is also called mingle; select; logical AND; logical OR; and logical exclusive OR. The first two here are Intercal originals, which you should know by now means trouble.
Mingle takes two 16-bit numbers and it mingles the bits.
That's really the best way I can describe it.
It takes the binary representation of each argument and then constructs
a result by alternating bits from each argument. A mingle of 111 and 000 would be 101010.
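If you want to see what that actually does, here's mingle sketched in Python. This is just my reading of the manual's description, not code from any real compiler; I'm assuming the first operand supplies the higher bit of each pair, which matches the example above.

def mingle(a, b):
    # Interleave the bits of two 16-bit numbers into one 32-bit result.
    result = 0
    for i in range(15, -1, -1):                  # most significant bit first
        result = (result << 1) | ((a >> i) & 1)  # a bit from the first operand
        result = (result << 1) | ((b >> i) & 1)  # then a bit from the second
    return result

print(bin(mingle(0b111, 0b000)))   # prints 0b101010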
That might sound deep, but that's not a very useful operation. I hate to break it to you, no one needs to mingle two numbers together,
it doesn't mean anything. Select is in a similar vein, but more complicated. I just don't like select, so I'm going to pull its definition from the manual. Quote, the select operator takes from
the first operand whichever bits correspond to ones in the second operand and packs these bits to the right in the result. End quote. Now, select could almost be useful.
It's really close to a mask operation.
That's an operation that lets you extract the value of certain bits from a larger number.
That last part, the packing all the bits to the right, means that SELECT can't be used as a normal
mask. To use mingle and SELECT to do anything, you have to fight with the language. You have to work
around Intercal's design. The three other logic operations are roughly standard, they're just implemented the
wrong way around. By all sane accounts, logic operations are binary, that is, they take two
operations and produce one result. Intercal's logic operations are unary. They take one operand
and produce one result. As the manual describes it, a logic operation in Intercal is equivalent to
rotating the operand's bits one place to the right, then performing the operation in question between the original and the rotated copy.
Once again, this does technically do something, it's just not useful.
Intercal gives you tools, they just suck.
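Here are both of those sketched in Python, again going off the manual's descriptions rather than any real implementation. Notice how select packs the chosen bits to the right, which is exactly what ruins it as a mask, and how the unary operator pairs the operand with a rotated, ends-around copy of itself.

def select(a, b):
    # Take the bits of a wherever b has a one, then pack them to the right.
    result, out_pos = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out_pos
            out_pos += 1          # the packing is what destroys bit positions
    return result

def unary_and(x, width=16):
    # AND the operand with itself rotated right one place, ends-around.
    rotated = ((x >> 1) | (x << (width - 1))) & ((1 << width) - 1)
    return x & rotated

print(bin(select(0b11011010, 0b01010101)))   # 0b1100: four bits, packed right
print(unary_and(77))                         # prints 4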
Woods and Lyon even managed to make the overall structure of Intercal gross.
One way to make programming faster is to get rid of repetitive tasks or repetitive code.
That helps make programming easier to do and easier to teach.
Flip that around, and you have the DO / PLEASE DO / PLEASE statement.
This is the most legendary part of Intercal. Each line has to start with do. For variety,
you're allowed to use please do or just please in place of any do. What's not mentioned in the
manual is that you have to use some pleases in your program.
If you don't use enough, the intercal compiler will throw an error, citing, quote,
programmer is insufficiently polite.
However, if you use too many pleases, then you run the risk of invoking another error, quote, programmer is overly polite. It's an arbitrary stylistic restriction that doesn't impact your final program, but does insult the compiler.
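As a toy illustration, here's what a politeness check could look like in Python. The exact thresholds are my own assumption; the original manual never states them, and the numbers below are only loosely modeled on what later compilers are said to use.

def politeness_error(statements):
    # Count how many statements lead with PLEASE (or PLEASE DO).
    pleases = sum(1 for s in statements if s.upper().startswith("PLEASE"))
    ratio = pleases / len(statements)
    if ratio < 0.2:                    # assumed threshold, not from the manual
        return "PROGRAMMER IS INSUFFICIENTLY POLITE"
    if ratio > 0.34:                   # likewise an assumption
        return "PROGRAMMER IS OVERLY POLITE"
    return None                        # politeness level acceptable

print(politeness_error(["DO .1 <- #1"] * 10))          # no PLEASE at all
print(politeness_error(["PLEASE DO .1 <- #1"] * 10))   # nothing but PLEASE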
Now, there are three final pieces to this messed-up language that we have to go over: stash, ignore, and abstain. So far, everything's been silly and frustrating,
but these are features that I actually find really interesting for a number of reasons.
Stash should be familiar if you've worked with any computer or language that supports a stack.
Now, for context here, a stack is basically exactly what it sounds like.
It's a data structure where you can push on a number and then pop that number off. So you can build up this pile, this stack of data for later use.
When you stash a variable, it's pushed onto a stack. You can then retrieve it at a later date.
But Intercal can't be normal, so it implements this in a strange way. Each variable has its own dedicated stack, so Intercal actually keeps you from mixing up stashed variables.
This is really useful since it lets you save the state of a variable before you start in on some
code that could mess up that state. It's exactly the kind of feature you want to make subroutines safer.
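Here's the idea modeled in Python, as a sketch of the behavior rather than anything resembling real Intercal internals. Each variable name maps to its own private stack.

from collections import defaultdict

stashes = defaultdict(list)    # each variable name gets its own private stack
variables = {".1": 42}

def stash(name):
    stashes[name].append(variables[name])   # save the current value

def retrieve(name):
    variables[name] = stashes[name].pop()   # restore the saved value

stash(".1")
variables[".1"] = 99     # some subroutine clobbers the variable
retrieve(".1")
print(variables[".1"])   # back to 42, no matter what else was stashed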
Stash struck a weird nerve with me for one big reason.
And this is a little callback to a recent episode.
A few months ago, I produced a two-parter on Lisp and its predecessor language, IPL.
This older language, the information processing language,
was also awful to work with. Like Intercal, it had gross variable names, although
IPL allowed for an entire single letter followed by numbers. But strangely enough,
IPL had something really similar to Stash. Each variable in IPL was treated as a list, an array, and you were expected
to push data onto those lists. That's exactly what Intercal is doing here. It's treating each
variable like a secret array. There is a good question to be had. Did Woods and Lyon actually take positive inspiration from IPL? I think we
can safely say no, but that opens up something interesting. Lisp is on the list of languages
that Intercal was built to mock. Now, Lisp doesn't do the whole stack thing that IPL did. In fact,
Lisp was partly designed as a response to IPL, a bad response. McCarthy,
Lisp's author, did not like IPL. So, in kind of a funny way, Woods and Lyon may have been
reverse-engineering a weird feature here. So, we've established that Stash has a
serious-world counterpart, but Ignore and Abstain are another story.
These are modifiers that change how Intercal functions.
Ignore will freeze the value of a variable
until you tell Intercal to remember that variable.
Once ignored, that variable's value won't change
no matter what you do to it.
That's not really a feature that's found
in any other language. It's more like a rule for Calvinball. Abstain works in a similar way,
but for language features. You can basically break keywords. If you want to, say, prevent
some code from jumping, then you could tell Intercal, please abstain from
nexting. That is 100% valid code right there. Reinstate is used to revert an abstention,
but here's the neat part. Or, I guess the awful part, actually. You can abstain from reinstating. You just have to say,
do please abstain from reinstating.
Thus, you make all abstentions irreversible.
Intercal can really be used to break itself.
In the context of intercal, these are, frankly,
at best useless and at worst dangerous to the programmer's health.
The careful use of abstain and ignore could be used to build up conditional expressions,
which is neat but definitely scary.
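To show what I mean, here's a hedged Python model of that trick. The enabled flags mirror how abstention is described above; using them as a makeshift conditional is my own illustration, not something spelled out in the manual.

# Each statement carries an enabled flag that other statements can flip.
program = [
    {"label": "(1)", "enabled": True, "action": lambda: print("branch one")},
    {"label": "(2)", "enabled": True, "action": lambda: print("branch two")},
]

def abstain(label):
    for stmt in program:
        if stmt["label"] == label:
            stmt["enabled"] = False   # the statement still exists, it just won't run

abstain("(2)")             # a crude "if": disable one branch, keep the other
for stmt in program:
    if stmt["enabled"]:
        stmt["action"]()   # prints only branch one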
However, these two keywords strike me as interesting debugging tools.
At least, I can see a glimmer of usefulness here. The ability to hold a variable
constant could definitely help suss out certain issues in your code. Turning off, say, a set of
function calls could serve a similar purpose. I think that's what makes Intercal work so well
as a parody of language. Its features aren't just dumb and useless and funny, they have the gleam
of smart design. On paper, in total isolation, the mingle operation sounds like it's gotta be
useful for something. It sounds fresh, new, and perhaps powerful. It's the kind of thing that you
hear someone talking about in a bar and you're like, oh, I've never thought about using mingle
before, maybe I should do some research. But when you stop and think about it, you realize that no,
mingle is nonsense; it can't do anything useful. The Intercal manual itself helps to
heighten this experience. It's a parody of contemporary language documentation, at least
in some places. It doesn't hide the parody well. There's one page
that's labeled as Table 1, Logical and Other Functions that's just a drawing of a table with
random functions on it. Even the language's name is a dumb joke that's trying to trick you.
Intercal sounds important, right? It lines up fairly well next to Fortran, APL, Lisp, or COBOL.
It's even in all caps, so, you know, it's gotta be serious.
What does it stand for?
Well, according to the docs, quote,
The full name of the compiler is
Compiler Language with No Pronounceable Acronym,
which, for obvious reasons, abbreviated Intercal.
The package deal here is, I think, pretty funny. You get a language with a nonsense name. You get lines of code like,
please do not give up, which I assure you is totally valid Intercal. Those lines are next to code that's literally unreadable. Here, let me just
give you a sample to prove it. Do spot one less than worm rabbit ears wow or one mingle rabbit
ears mesh three spark. That does something? I know it's a nested mingle operation.
It may also summon a demon as a byproduct.
Completing the joke is the fact that Woods and Lyon actually implemented a full compiler for Intercal.
It was called, appropriately, Ick.
Now, there's probably a lot you could say about Ick that I'm going to gloss over.
It was written in SPITBOL, which, yes, isn't a joke, that's an actual programming language. What's important is that SPITBOL was a language primarily designed for processing string data. It was a tool for slicing up and making sense of text.
A compiler, as I'm sure I've said over and over again on this podcast, is a really complicated
program. There's a whole field of study called compiler theory just to give
you an idea how involved this all is. But hey, what if you don't have any realistic concerns for
things like performance? Then you can pull off some cool tricks. According to Intercal's docs,
Ick first converts nonsense source code into SPITBOL, then calls the SPITBOL compiler to make that a finalized binary file.
It's definitely a shortcut, and it has its problems, but it is a neat solution.
The resulting program is slow and bloated, to be sure,
but the shortcut saved Woods and Lyon from having to write a real compiler. The code
could be much simpler. I can only imagine that it helped make Intercal possible. I think it's fair
to say that Intercal is a bad programming language. It breaks all the rules, it's hard to use,
and in some cases I'd call it downright evil. But Intercal is intentionally bad.
Its evilness is inbuilt as a parody of other programming languages.
This leads me to a larger question.
We've already discussed how Intercal is making fun of some existing issues that Lyon and Woods had with contemporary languages.
That made for a compelling retort.
But what other evil is lurking out there?
Was Intercal the first high-level programming language that broke all the rules, or were there
accidental disasters made in the wild? This puts us in a bit of a weird situation.
I don't think it's normally fair to say that there's such a thing as a bad programming language.
I know that is a bit of a cop-out, but let me explain.
Sure, Intercal is bad, it's gross and useless, but that was by design, so I'll give it a pass.
We could say that Intercal is good, as much as it pains me to say that,
because it lived up to its design goals. To do so,
it had to pick up a lot of traits you'd normally call bad. I know that's a little roundabout,
but that's the best explanation I can think of. I'd like to think I've done a reasonable job
explaining what those bad traits are, or to put it another way, the usual design goals lurking
behind a programming language. You want a language to be easy to use, easy to learn, easy to debug,
and hopefully easy to port. We've seen how Intercal intentionally inverts three of the four.
But what about languages that inadvertently miss the mark? Well, here's the crux of this weird position.
Those four rules that make up my theorem can be relative.
Easy is a pretty subjective term, after all.
Easy depends on what you're comparing to.
For Woods and Lyon, easy meant Fortran, SPITBOL, Lisp, COBOL, and a host of other languages popular
in 1972. What would happen if we went back further? Now, there are a number of routes we could take
here, but I'm pulling from a short list on the Esolang wiki over at esolangs.org. It's a really wonderful source of information about esoteric languages, and it offers some theoretical pre-Intercal
candidates that fit this esoteric mold. The example I'm going to discuss here is TRAC,
the Text Reckoning and Compiling Language. Development on TRAC started in 1959, which
pushes the language pretty far back into the timeline. I'd like to present TRAC as an example of a language that was designed
for good reasons, but despite that, shares many properties with Intercal. TRAC was developed by Calvin Mooers as what he calls a keyboard-oriented language. Now, before we go any further, this is not the same as the Moore of Moore's Law. Once again, different spellings.
Now, for the time, the idea of a keyboard-based language was a pretty revolutionary one.
The workflow for early programming kind of sucked. The usual approach was to punch up some cards
containing your source code, load up a compiler, then pass
your code deck through. Once that was done, you'd have binary on either some tape or maybe another
deck of cards. Then you can load that up and run it. That's fine. It works, but it takes a lot of
steps. TRAC was Mooers' attempt to simplify that problem. He envisioned a language that would run as an environment, accessible over a terminal.
The programmer could then sign on and be dropped to a prompt just waiting for code to be typed.
It honestly sounds a lot like BASIC, and timeline-wise, TRAC was roughly a contemporary of BASIC.
At least, their development periods have some overlap.
I think it's crucial to keep context in mind here.
We have a common problem, namely, how to make programming easier.
Hey, that's right on my big list.
BASIC had other design goals, sure, but better utilization of text-based terminals was central
to BASIC's success.
So, what about TRAC?
How did Mooers approach this same problem?
And is TRAC really that strange of a language?
In a word, yes.
TRAC is some high strangeness.
On first glance, TRAC looks like it would fit nicely as a predecessor to other inscrutable esoteric
languages. The syntax and character set is strange. There seem to be weird repetitions.
Most lines of code start with either a single or a double pound sign and an open parenthesis.
It looks maybe a little like Lisp. There are a lot of parentheses, but data seems to be broken up by commas and even less than and greater than symbols.
To the outsider, even a seasoned keyboard pro like yours truly, TRAC does seem inscrutable.
Sounds like a step in the right direction.
But we can keep going.
The core design of TRAC makes me uncomfortable. There's something unsettling about it to me. TRAC is all about strings. That's the core data type.
And that's a fine choice for the medium. Terminals are all about text, so why not make a language for terminals all about text? It sounds like a nice
symmetry, but here's where it loses me. Strings are to TRAC as lists are to Lisp. That's my short explanation. It's going to take me a bit longer to unpack that. I'm working off a 1965 paper describing TRAC that Mooers and a co-author submitted to the ACM. Now, notice that this has had a pretty long development period. In the paper, Mooers admits to drawing some inspiration from Lisp. I mean, at the time, you only really had three big ways to go. You had Lisp, Fortran, or ALGOL. I'm not going to go too deep into Lisp
since I just covered that language recently, but let me just say this. Lists are crucial to the
language. Lisp is designed to process list data, specifically linked list data. The language is
built to do that in a very practical and unified way.
In Lisp, data and code are represented in the same way.
This means that, among other things, a Lisp program can really handily write and execute another Lisp program.
Everything's just data, after all.
Now take that short summary and replace all instances of list with string and Lisp with TRAC. All your data is stored as strings, just chunks of text. Your program is also,
shocker of shockers, just a big string. So far this is fine, right? All code at some level is just a blob of text, right? Well, here's where we start
the descent. Mooers explains that one of the key benefits here is that you can redefine your program in real time. Pulling from the text, quote, because TRAC procedures and text have the same
representation inside and outside the processor,
the term homoiconic is applicable, from homo meaning the same and iconic meaning representation.
End quote.
Now, I do like that term.
You could also call Lisp homoiconic, in a sense, due to the whole data-code symmetry.
But here's the spooky part. The processor is what
Mooers calls the interpreter, the program that takes TRAC code and executes it for you. Now,
in the normal world, interpreters and compilers are complicated and high-powered beasts.
The common approach when dealing with code is to take raw text and break it up into
something more easily dealt with. That could be a lexical tree, something in binary, or
even just a list. It doesn't have to be that complicated. The idea is to turn the human
readable text into a convenient form for the computer to start working with. This is that awkward, roughly hewn bridge I was talking about at the top.
TRAC doesn't do that. It's fully homoiconic. TRAC is all strings, all the time.
This is where my concern reached dangerous levels.
Since TRAC doesn't really turn your code into anything but, well, a string,
you can pull some cool tricks.
Runtime rewrites are one example.
In practice, for the programmer, you can stop a TRAC program,
make some adjustments, and then set it back running.
You normally can't do that.
With other languages, you have to restart execution from the beginning.
But TRAC keeps your code as code. It's all a string. So you're good to go. Why is that a bad thing?
Well, it comes down to implications, as always. Since TRAC doesn't do anything fancy to your code, it has to pull some of its own tricks to actually work. When you're in the territory of pulling tricks to operate, that's a bad time. TRAC actually dynamically rewrites your code as it's executed. That should sound bad. If it doesn't immediately make you bristle, then let me break it down a little bit. In TRAC, every operation you perform is a function, meaning it returns some value.
Now, there are exceptions.
Mooers points out that the command to print a string is technically not a function, but
it does return a null value, so it counts most of the time.
I'm going to pull an example from the ACM paper for this, but rephrase some things for the sake of clarity.
Let's say you define some variable x as the string hello world.
TRAC has this idiosyncrasy where you can't reference a variable directly because any unadorned chunk of text is treated as a literal string.
You have to grab a variable by name using the call function.
So to print your variable's value,
you would write something along the lines of print call of x.
Here's where the whole rewriting code thing comes into play.
When track goes to execute that line,
it starts with the innermost function. In
this case, that's call. TRAC runs call, grabbing the value of x. Now, in most other interpreters,
they would do something like, say, move on to the next function in the call tree, passing up
the value of the last function. These normie interpreters would think, oh hey, I have the value of x stored
in some temporary storage or I have a pointer. I just have to pass that on up.
TRAC doesn't do that. It doesn't have any deeper inner machinations. Instead, TRAC replaces call x with the value of x, so your program is rewritten by the TRAC interpreter
as print hello world. This is the most simple example, and hopefully you can see the special
approach here. TRAC doesn't so much execute your program in any traditional sense. The interpreter reduces your program, so to speak, until you're left with no more executable code. TRAC even has special words for all of this.
Strings that can be executed are called active. Everything else is neutral. So,
hello world is a neutral string, since TRAC's interpreter can't reduce that anymore.
There's just nothing left to do.
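Here's that reduction loop sketched in Python. Fair warning: this is a toy that handles just two primitives and ignores TRAC's real parsing rules, but the core move is faithful to the description above. The interpreter literally splices each result back into the program string and rescans.

import re

forms = {"x": "hello world"}    # stored strings, which TRAC calls forms

def step(program):
    # Find an innermost #(...) call, one with no nested call inside it.
    match = re.search(r"#\(([^#()]*)\)", program)
    if not match:
        return program          # fully neutral: nothing left to reduce
    name, *args = match.group(1).split(",")
    if name == "cl":            # cl: call, fetch a stored form by name
        value = forms[args[0]]
    elif name == "ps":          # ps: print string
        print(args[0])
        value = ""              # ps returns the null string
    else:
        raise ValueError("unknown primitive: " + name)
    # The rewrite itself: splice the result into the program text.
    return program[:match.start()] + value + program[match.end():]

program = "#(ps,#(cl,x))"
while "#(" in program:          # #(ps,#(cl,x)) -> #(ps,hello world) -> empty
    program = step(program)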
So yeah, your program is constantly changing as it's executed.
I have to admit there are some upsides here.
TRAC can be remarkably transparent as an environment.
Nothing is really hidden because everything's happening out in the open.
A programmer could, in theory, watch exactly how their program runs.
And I guess that does hit the easier-to-debug item.
Another upside is that the interpreter for TRAC could be simplified.
The ACM paper describes the execution process as a series of string manipulations,
so relatively easy compared to other interpreters.
But I can't get away from the fact that your program is dynamic. It sounds like something
right out of a bad sci-fi film. Cue a teenager in a trench coat turning away from his desk,
looking into the camera with horror in his eyes and saying, the computer, it's rewriting my code.
All jokes aside, let me try to explain why this gets to me so much. Control is an important
aspect of programming. This matters in the obvious sense. A programmer uses code to enact
control over a machine. At the same time, a programmer needs a controlled
environment in order to work. Pulling that control out from under you is one of the jokes
that Intercal plays. You have to use some undefined number of pleases in your code or
it won't compile. There's also random checks in Iick that will just choose to not compile your program.
There's an error number for it and everything.
These are arbitrary rules, something unexpected,
and something that can leave a programmer feeling like they've lost control of the computer.
A system that's built around rewriting and modifying your own code,
that has to warp stuff around to function, that just feels uncomfortable.
Just like the politeness conditions, TRAC's dynamic code execution makes a computer sound like a
less reasonable and less controllable machine. It's unexpected, it's unique, and if it was played
as a joke, I'd be laughing along. The matter of TRAC's syntax adds a little more spice to the whole ordeal.
To start with, the syntax had some fun wiggle room. From the 65 paper, quote,
In practice, the function indicator sign pound, as well as other TRAC control or meta characters, should be chosen in accordance with the ease in typing on the particular keyboard that will be used. End quote. And I get it. It's keyboard-oriented.
Mooers is trying to make a flexible language.
But this kills me. In effect, this means that
there is no standard TRAC, at least not according to this early description. This means that, at least in theory, one implementation of TRAC could look totally different than another.
In this era, we didn't really have standardized character sets, and an IBM mainframe used a different set of text-mode characters than, say, a DEC. So there is a reason here, but the side effect is that there's no standard look for TRAC syntax. Moving on to other parts of the language,
TRAC has no fewer than three different ways to deal with functions. Each function call is wrapped in parentheses, but
must be preceded by either 0, 1, or 2 pound signs. If I understand TRAC correctly, then
no pound sign just groups code. A single pound means execute normally, and a double pound means
to execute but treat the return string as neutral. All TRAC primitives, the functions that make up the bulk of the language,
are abbreviated to two characters.
Print string is just PS.
Call is CL and so on.
To cap things off, we get fun unmatched punctuation.
TRAC treats all arguments as strings, no need for wrapping anything in quotes. TRAC also supports unmatched parentheses in strings.
So hey, that's neat.
Each line is also supposed to end in a so-called meta-character.
The 1965 doc recommends using the single quote for that purpose.
You know, without a matching close quote.
Pull this together, shake it up, and you arrive at syntax that could also summon a demon if spoken
aloud. The heavy use of parentheses makes you want to say it looks like Lisp, but looking closer,
you can tell it's different. Does that alone make TRAC a bad language in the same sense as Intercal?
Well, I think it's hard to say. Let me try and give a fair opinion here. To start with,
remember what I said about easy being a relative term? We need to be careful here, since in the
far-flung 21st century, we have pretty nice programming languages.
I mean, setting all things aside, just look at C.
That's really become the standard for a lot of languages.
Most modern languages draw their syntax from it.
But TRAC predates that lineage, right?
We have to look at the contemporary languages around when Mooers was developing his own language. I already touched on BASIC, so I'm not going to go further down that line.
Luckily, we have a list of good languages.
The intercal manual gives us a rundown of the languages it was trying to break away from,
so I think that's a starting point.
Let me just filter down that list for things that would have been available circa 59 to 64, let's say. Now, this actually surprised me. Woods and Lyon list 12 boring, normal languages in the 1973 Intercal manual. Just to give the full listing, since I think it's a little fun, we have FORTRAN, BASIC, COBOL, ALGOL, SNOBOL, SPITBOL, FOCAL, SOLVE, TEACH, APL, LISP, and PL/I.
I think at least 10 of these are real. I can't find anything on SOLVE and TEACH, so
that might be a joke or they might just be obscure. From that list,
five languages were developed before 1964. What's interesting is the languages in question.
We get Fortran, COBOL, ALGOL, SNOBOL, and Lisp. SNOBOL would go through a lot of revisions leading up to 1967, so that one I'm going to discount. Let's just say that we have those four languages that will form our axis of easiness.
This is where my earlier statement about not judging TRAC by modern expectations
kind of falls apart. Fortran is, you know, Fortran. It's old and weird, but it has the hallmarks of modern language syntax.
We get named variables, lines of code,
normal mathematical notation that lines up with handwritten equations,
that sort of stuff.
Things we take as a given.
In this era, Fortran was a huge player,
so anyone using TRAC would have seen Fortran.
ALGOL is, by and large, a modern-looking tongue.
That's because ALGOL would go on to inspire C, which in turn cemented a lot of programming convention.
Initially, ALGOL was published as a proposal for a universal language.
This proposal and subsequent revisions
to it were passed around academic circles, so anyone that was interested in a new cutting-edge
language like TRAC would have read about ALGOL. They would have seen something that looked like
a modern language. COBOL and LISP are kind of the odd ones out here, since despite having their own good qualities,
they each have pretty unique syntax. The only languages that really, like, truly look like LISP are ones that are derived directly from LISP. And, well, nothing quite looks like COBOL. It's
a really verbose and wordy language. The bottom line here is that the concept of easy, at least when it comes to
what a language looks like, was at least in the process of being established in the late 50s and early
60s. I think ALGOL is the best touchstone here, but FORTRAN gives some good backup.
So, would I say that TRAC is easy to use compared to its contemporaries?
No.
Easy to learn?
No.
Easy to debug? I think that one's more of a wash, since the weird code modification may actually help
with transparency.
And easy to port?
Yes, but that comes with the caveat mentioned in the ACM paper of variable character sets.
I think at this point I'm beating the drum a little hard,
but TRAC breaks a lot of the same rules that Intercal does.
And while easy is relative, there isn't that much wiggle room.
So does this make TRAC an even earlier esoteric language?
Well, I wouldn't be so quick to say yes. For one,
I think esoterica is a matter of intent. Mooers intended TRAC as a serious language for serious
use by, presumably, very serious people who never smiled. Want some fun proof of this? Well,
he registered a trademark for the language's name.
That's right.
Every time I've said TRAC, go ahead and put a little TM after it. I'd say that's proof enough that Mooers designed this language with a straight face.
So while TRAC has the outward hallmarks of an esoteric language,
while it inverts a lot of the tropes that make a reasonable
language experience, it's not meant as a joke. It's not a parody. And it's not hacker folk art.
It's just a kind of spooky and uncomfortable thing. Now, there is one final thing to consider.
That's the precedent that languages like TRAC set. Intercal can be funny because there are examples of gross languages out there. Parody doesn't work without a target. TRAC may not have been the specific language that Woods and Lyon set out to mock, but I think it's emblematic
of some of the strange and baffling issues with programming languages at large.
And really, the best thing to do when you get frustrated, flustered,
or even become uncomfortable with a language is just laugh it off.
Alright, with that, we've come to the end of today's episode.
Call them jokes or call them art,
esoteric programming languages offer an interesting critique of the serious side of the discipline. Personally, I can say I really
like Intercal. You wouldn't catch me dead using the language for any real work, but there's just
something compelling and charming about it. The fact is, programming can really suck. Humans weren't exactly built as logical machines.
A good programming language can help bridge the gap between us and our digital counterparts,
but issues will always remain. For me, what makes Intercal work so well is its self-awareness.
As we've seen, that's kind of the crucial point in pulling off the parody.
Woods and Lyon were able to identify the absurd and frustrating parts of contemporary languages,
crank those up to the max, and then throw it all together with a grin.
It's that knowing wink that really makes it all work.
Take away the laughs, meshes, and splats, and you're left with a monster. And that has happened. There is a
precedent for languages that fit a really similar scary mold. TRAC, I think, gives us a good example
of the worst case here. It has a lot of good ideas. It sounds profound on paper. While I was
working on this episode, I was trying to explain TRAC to one
of my friends. His initial reaction was that, oh, that self-modifying code must be great for AI.
As soon as I showed him some of the actual code, he quickly changed his tune. It's like
Intercal's mingle operator. On paper, it sounds like it has to be useful. It has to have some niche. But mingle isn't an operation
anyone wants to use. The weird syntax just makes it that much worse. The difference between
Intercal's mingle and TRAC's self-modifying code is intent. One was meant as a parody.
One is a registered trademark. Now, if any of this sounds hilarious, then I highly recommend giving Intercal a try for yourself.
There's a modern compiler for the language.
There's actually a few.
C-INTERCAL is probably the most well-known. For a quicker fix, there are even online Intercal environments.
I'll link to at least one in the show description.
Just remember, your program has to be polite to compile.
Thanks for listening to Advent of Computing. I'll be back in two weeks time with another piece of
computing's past. And hey, if you like the show, there are now a few ways you can support it.
If you know someone else who'd be interested in the show, then why not take a minute to share it
with them? You can also rate and review the show on Apple Podcasts. And if you want to be a super fan, then you can support the show directly through Advent of Computing merch or becoming a patron on Patreon.
I actually have a new bonus episode that I just put up on Patreon on the Oak Treat language, so go give it a listen if you're signed up.
Patrons get early access to episodes, polls for direction of the show, and a bunch of
bonus content. You can find links to everything on my website, adventofcomputing.com. If you have
any comments or suggestions for a future show, then go ahead and shoot me a tweet. I'm at
adventofcomp on Twitter. And as always, have a great rest of your day.