Advent of Computing - Episode 78 - INTERCAL and Esoterica
Episode Date: March 20, 2022

Programming doesn't have to be a very serious discipline. In fact, sometimes it's better if it's a little silly. Today we are talking about INTERCAL, the first esoteric programming language. Is it a... joke? Is it a form of hacker folk art? Is it even a good language? To answer those questions we need to assess what makes a programming language "good" in the first place.

Program INTERCAL online today! (https://www.tutorialspoint.com/compile_intercal_online.php)

Selected Sources:

https://archive.org/details/intercal-ref/mode/1up?view=theater - 1973 INTERCAL Manual

https://esoteric.codes/blog/don-woods - Interview with Don Woods

https://sci-hub.se/10.1145/800197.806048 - 1965 TRAC paper
Transcript
Work with computers long enough and you start to notice how absurd everything is.
Personally, I think too much exposure to digital machines does something maybe bad to a person.
On a very fundamental level, computers aren't conducive to human use.
We don't think in terms of logic operations and instruction pipelines. Those are very artificial
constructs that we designed to help simulate math using silicon. We've developed some good tools for
managing these weird machines, but bridging that gap from the natural to the artificial
will always be a rough process. This roughly hewn bridge has many manifestations,
you could call them splinters. You can see it in the annoying and seemingly endless quirks
of computers. Maybe your printer just plain won't listen to you today. Maybe your hard drive is
fragmented, whatever that practically means. Or maybe you've been working on a program all day,
and through seemingly no fault of your own, it just won't run. This affliction hits programmers
particularly hard. Programming languages, for all their spit and polish, can really be a pain to use.
What's the difference between pass-by-reference and pass-by-value?
Are lists and arrays always interchangeable? What if they're not? What happens then? Why aren't all
errors fatal? And why do I need to end each line with a semicolon again? It just all seems so
arbitrary. It can become frustrating very easily.
So what's an intrepid binary basher to do?
I've heard it said that laughter is the best medicine, and I think that's true in this case.
When you grow tired of real programming, when C just isn't cutting it,
or PHP has gotten a little long in the tooth,
why not turn to something
a little more fun? Why not give an esoteric programming language a try?
Welcome back to Advent of Computing. I'm your host, Sean Haas, and this is episode 78, Intercal and Esoterica.
Today is a very serious episode, so I'll abide no grins and definitely no chuckles. We're going
to be talking about a type of programming language that breaks the mold, innovates new features,
and fascinates us serious practitioners. Today we're looking at esoteric programming
languages. Now, it can be a little tricky to sum up esoteric languages. It's kind of a
I'll-know-it-when-I-see-it kind of thing. I've often seen them referred to as joke programming
languages, and while I can understand where that analysis is coming from, I think it
minimizes these languages. I personally prefer to view esoteric languages as a form of art.
In general, an esoteric language is one made without really practical goals. This can,
in some cases, be done as a joke. One example would be the modern LOLCODE, a very vexing language
that pulls its syntax from memes. Or whitespace, a language that uses only spaces, tabs, and new
lines. But there's more than just a sensible chuckle here. The often canonical example of this art is a language known as Brain Expletive, shortened to BF for mixed company. I'd say the full name, but I'm trying to keep the show from being marked as explicit. BF is a Turing-complete language, meaning it can be used to program anything programmable. That's a fun, recursive definition. The schtick is that it's about as close to a Turing machine as it gets.
You don't have addressable memory.
You have an infinite tape with a head.
You can increment or decrement the value of the tape at the head's location.
You can move the head, you can do a simple loop, and you can do input and output.
But that's it.
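To make that concrete, here's a minimal sketch of a BF interpreter in Python. This isn't any particular real implementation, just the handful of commands as I described them. The tape is bounded here for practicality, and cells wrap at 256, which is a common but not universal convention.

def run_bf(code, user_input=""):
    tape = [0] * 30000          # the "infinite" tape, bounded for practicality
    head = 0                    # the read/write head
    pc = 0                      # position in the program text
    inp = iter(user_input)
    out = []
    jumps, stack = {}, []       # pre-match brackets so loops can jump both ways
    for i, c in enumerate(code):
        if c == "[":
            stack.append(i)
        elif c == "]":
            j = stack.pop()
            jumps[i], jumps[j] = j, i
    while pc < len(code):
        c = code[pc]
        if c == "+":
            tape[head] = (tape[head] + 1) % 256   # increment the cell at the head
        elif c == "-":
            tape[head] = (tape[head] - 1) % 256   # decrement it
        elif c == ">":
            head += 1                             # move the head right
        elif c == "<":
            head -= 1                             # move the head left
        elif c == ".":
            out.append(chr(tape[head]))           # output one character
        elif c == ",":
            tape[head] = ord(next(inp, "\0"))     # read one character of input
        elif c == "[" and tape[head] == 0:
            pc = jumps[pc]                        # the one and only loop construct
        elif c == "]" and tape[head] != 0:
            pc = jumps[pc]
        pc += 1
    return "".join(out)

# 8 * 8 + 1 = 65, which is ASCII for "A"
print(run_bf("++++++++[>++++++++<-]>+."))

Even that tiny sample program should give you a feel for why BF code is so hard to read: every instruction is one character, and none of them tell you what the program is for.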
BF is minimalistic, to say the least, but it can be used to program
anything you could write in C or Java or Fortran. BF just happens to take a little more code to get
things done. That and all the instructions are a single character, so the code ends up being almost indecipherable. So why would anyone want
anything to do with BF? Why would you want anything to do with any of these esoteric languages?
Partly, it's just fun to jump through the looking glass on occasion. And partly,
I think it makes for a good challenge. For me, I'm a huge fan of BF.
Last year, I actually wrote an interpreter for the language that runs under MS-DOS as,
you know, a challenge.
It was a fun exercise.
Working with these weird languages is a really neat way to stretch your imagination.
Esoteric languages also offer a neat way to deconstruct and recontextualize programming. I know,
more pretentious language, but I have to use it somewhere. Programming in BF, you soon learn what
really matters to a computer, or at least to a basic Turing machine. I'd argue that's not only
useful, but there's something pleasing about it. Esoteric languages can make you think about your everyday
life much in the same way artwork can. The final factor here that pushes esoteric languages out
of the realm of mere jokes is the time involved in their creation. I'm all for putting a lot of
effort into a good prank, but there has to be a limit, right? Esoteric languages are designed,
implemented, and programmed in the same way as their more serious cousins. That's really a lot
of work for a joke, no matter how you look at it. So in this episode, we're going to be digging into
why esoteric languages are fun and interesting. I think it's only right that we take
a bit of a meandering approach here, so prepare for a bit of a fun and funny episode. The other
complicating factor is most esoteric languages really come to prominence in the 90s, but instead
we're going to be focusing on the early era. We're going to start off with one language called Intercal. How were programming
languages, a very serious and academic endeavor, able to turn into jokes and art? What made that
change possible? And I think this is a fun thing to really turn around in your head. How were
esoteric languages similar to their better-known and much more serious counterparts? We're going
down a weird branch of the programming language evolutionary tree, so I think we need to reassess
the roots that we're coming from. To start with, at the most basic level, what's the point of a
programming language? To paint with a very broad brush, I think there are four main reasons for high-level
programming languages to exist, that is, any language that's not assembly or machine code.
To begin with, high-level languages make programming easier and, by extension, faster.
That was at the root of really early programming developments. Rear Admiral Grace Hopper's work on compilers came out of a need to reduce programming overhead,
or, to paraphrase from her, because she was lazy.
Fancy computers aren't that useful if it takes months to write a single program.
At their core, programming languages are really productivity tools.
That might make them sound boring, maybe like they're Excel, but that's really the root of it.
A programming language, if it's composed properly, will make you more productive.
Second, high-level languages are easier to learn.
That counts for new inductees to the dark arts, as well as seasoned pros
picking up a new language. This was one of the core factors behind the development of
Fortran. John Backus wanted a way for engineers and mathematicians to use computers. And hey,
a big fancy computer isn't really of much use if you're the only one who can use it. Third, high-level languages are easier to debug.
And when I say easier, I'm comparing them to machine code and assembly language. That's the
baseline comparison we have to keep in mind. I know debugging C can be a huge pain, but
trust me here, it's leaps and bounds better than trying to deal with low-level bugs.
Debugging as part of the programming cycle is just as important as, well, you know, programming.
Once again, cribbing from Hopper, a program in a high-level language will have fewer lines of code.
Each line will compile out to multiple, usually many, processor instructions.
That means there are fewer spots for human error.
A fancy machine is useless if you don't have a working program.
And finally, four, high-level languages make it easier to port a program to another computer.
This doesn't come up until later into the development of languages.
When there's just 10 computers in the world, it doesn't really matter that your code only works for your nearest
machine. Once we get to 100 systems, or maybe 1,000, then you start to think about sharing your
work. Assembly language is explicitly tied to the architecture you're working on. You're directly
telling the computer what to do.
That means programs in assembly aren't well suited to jump the gap from one type of machine to another.
High-level languages, if designed properly, can easily jump that gap.
The prime example here, despite its actual lack of adoption, is ALGOL.
The language was built explicitly to run on any machine out
there. I guess the snappy tagline for this point is that a big fancy computer can be made more
useful if it has friends. So there we have my, admittedly partly stolen, views on why programming
languages need to exist. Call it Sean's rationalization theorem. One of the keys to
the success of high-level languages is that these four goals have a compound effect. They form
something greater than the sum of their parts. This is especially true once portability enters
the mix. With good programming languages comes better software. Advances in programming have made more advances possible
in an unending cycle that continues to this day.
If code is easier to write, then people will write more code.
If more people are writing code, then, well, we get more code.
If code is easier to debug, then we get more working code.
The working part is surprisingly important.
And if code is portable, then, once again, more people are able to contribute.
Projects can get bigger, collaboration becomes more widespread,
and we all live happily ever after.
It's a nice little vision.
Of course, the path to that rainbow isn't an easy one.
So how do language designers push us towards this brighter future?
This is the kind of thing that really varies from language to language, but there are a
few common routes.
Clean and easy to understand syntax is the most common denominator.
By syntax here, I just mean how the language works, which characters
and words and structures it uses to, you know, lay out its code. The whole point of a language, if
we even simplify further, is to protect programmers from the deeper machinations of a computer.
You can do that by replacing a series of archaic instructions with, say, a nice and simple plus sign.
Clean syntax. Easy.
Now, there are two major syntax goals that most languages strive towards.
English-like syntax and traditional mathematical notation.
Both of these goals run into problems, but even Fortran looks closer to English than the bits and
bytes that a computer actually reads. A language like BASIC takes this further by trying to limit
things down to using only short, pronounceable English words. BASIC is actually a neat example
here because its designers tried to limit the use of non-alphanumeric characters. The result is a language that you can
read aloud. At least 90% of the language can be turned into spoken word, and that really makes it
nice for classrooms. Then we have the factors of consistency and accuracy. Programming can be seen
as more of an art than a science, sure, but that's only really possible
when the language makes some self-consistent sense.
You should never be surprised by the language itself.
That's, to me at least, one of the most important factors in deciding if I want to use a language
or not.
Maybe you get shocked by some really sweet code, but the linguistic component should be as mundane as possible.
Now, that's not to say every language is fully self-consistent.
I can think of a short list of languages I've used that violate this rule, but they shall remain unnamed for now.
The point here is that if a language is boring, that's not a bad thing.
Accuracy here partly overlaps with consistency, hence why I'm grouping them.
You want your language to be powerful enough that a programmer can accurately describe
their problem in that language.
That can mean including good support for mathematics, handling order of operations consistently,
or even just having
nice clean syntax for jumps. Stuff needs to work the same way every time, and the programmer needs
to be able to arrive at a reasonable chunk of code every time. Another tool, or maybe a meta tool,
is specialization. Hopper was of the view that we shouldn't try to work towards one
unified programming language, but instead that we should create languages specialized to specific
jobs. Call it a Hopperian specialization, if you will. By limiting your design goals,
you can focus your work. That's good in any project, and especially useful for programming languages. Fortran,
once again, is the shining example of this mindset. The language was tailor-made for math
and engineering, so much so that it's still in use in those niches today. Now, that's a super
fast crash course in why we use programming languages and what makes them quote-unquote good.
At least, I think this gives us a good measuring stick for ruthlessly judging and picking apart programming languages.
This has all been to set up the introduction to a joke.
Now, I know, the setup-to-punchline ratio here is totally whack, but I want this to have the proper impact.
Quote,
The Intercal programming language was designed the morning of May 26, 1972,
by Donald R. Woods and James M. Lyon at Princeton University. Exactly when in the morning will become apparent in the course of this manual.
It was inspired by one
ambition, to have a compiler language which has nothing at all in common with other major
programming languages. End quote. Get your jimmies rustled, computer nerds! That's from the introduction
to the 1973 Intercal Manual. Like I said, the entire setup here is to give us the context
needed to talk about this language. Take everything I've said so far and flip it.
Intercal isn't easy to program in. It's not easy to learn, and it's not easy to debug. It's
theoretically portable, but why would you want to port it? It doesn't have clear syntax. Each line
is an unreadable and unreasonable mess. It's not even consistent or accurate in execution.
You can write a program that doesn't act the same on two subsequent runs. It even has the native
ability to break itself so badly it's unusable. What on earth is this anti-language?
Well, according to some, Intercal is the first esoteric programming language. At least,
it's been retrospectively labeled that. You could call it a joke and be done with it, but
there's more going on. Intercal is a response to the state of programming in the early 70s.
I've seen some describe it as hacker folk art, and I don't think that's too far off from the truth.
I actually really like that descriptor here.
As the manual explains, Intercal was drafted on a coffee-fueled morning in 1972.
But the seeds to the language were planted a little earlier.
Its creators, Don Woods and James Lyon, were undergrad students at Princeton.
Woods' name should sound a little familiar.
After graduating from Princeton, Woods moved out to Stanford's AI lab,
where he became an early contributor to the jargon file.
In 1977, he partnered up with Will Crowther to improve and expand Colossal Cave Adventure. You could call him a bit of an icon. Lyon is more low-profile. When I initially ran into his name, I assumed it must be the Lions of Lions' Commentary on UNIX, but that's just not the case. Just different spellings, Lyon versus Lions, and different first names too, actually. Aside from work on Intercal, Lyon is a bit of an unknown quantity. Luckily,
Woods has been forthcoming with details about the language. According to a 2020 interview
with Esoteric.codes, Intercal began exactly as one might have expected: a joke. Quote,
In truth, it all began with the renamed punctuation. I no longer recall what specifically
led the conversation down that path, but we were trying to come up with names that all sounded
alike/alliterative, so that reading a line of code became a sort of tongue twister. I think the SP terms were first: spot, spark,
splat for period, apostrophe, and asterisk. And then we added other silliness such as the
embrace/bracelet pair. At some point, we decided there should be a programming language
that used this character set. End quote. Honestly, most programming languages are already tongue twisters. These
languages are all meant to be shoved into a computer for processing, at least that's always
been the primary goal. What Woods is getting at here, and what will be crucial going forward,
is the idea that programming languages are more than these isolated tools for talking to a machine.
What do you do if you want to collaborate with another programmer? How do you talk about software?
I think this is an interesting point, especially during the era where in-person collaboration was the best you could get. You don't want to just turn in your chair, look someone dead in the eyes, and start gibbering off some kind of incantation. In Intercal's manual, Woods and Lyon list off the languages that they're
modeling against, so to speak. So this gives us a good point of reference. It's a relatively big
list, there's a dozen entries, but it starts with Fortran, so I'm pulling from that. And hey, I know Fortran, so it's easy. Let's just look at
the syntax for an array. To access the second element of an array in Fortran, you would write
array, open parenthesis, two, close parenthesis. It's kind of clunky to say out loud, right?
It's better to just say, yeah, and then I grab the second element of array.
That's a kind of verbal shorthand that's easy to take, since, you know, the alternative
is sounding out each pair of parentheses.
No one really wants to do that.
And while it's rare to try and read an entire program aloud, it's distinctly more common
to tell a co-conspirator, hey, my code is breaking at that line. Yeah, you know, where I try to get
the second element of array. That's how it is in the normal world. But we're in intercal land now.
Everything is seen as a parody through a funhouse mirror. Woods and Lyon started constructing
that funhouse by making punctuation worse. I'm going to try and do my best not to give a syntax
rundown since, well, this is an audio podcast and there are better resources for that, but I have to
share some of the greatest hits. In the 73 manual, the pound sign is renamed
mesh, just so later on, the equal sign can be called half mesh. The exclamation mark,
which Intercal uses as a shorthand for single quote period, is just called wow.
I don't know why, but something about that gets me.
After the broken symbols, the rest of the language was hastily drafted.
And, oh boy, Intercal is quite the language.
It systematically breaks all four parts of Sean's theorem of programming language rationality.
And hey, that's what makes it fun.
No one's expected to actually use this
language, right? Right? To kick things off, Intercal has an evil variable system. You know
how much I care about memory? Well, Woods and Lyon do my precious memory dirty. Okay, so you get four data types, each with explicitly identified gross notation.
A 16-bit integer variable's name starts with a period, aka a spot. A 32-bit integer starts with a colon, called in Intercal a two-spot. You also have arrays of each variable. So for 16-bit arrays, you use the tail, aka the comma, and for 32-bit arrays, you use the hybrid, also known as the semicolon.
That part's just kind of dumb, but it really isn't that bad. Some languages, Perl is one example, follow a similar practice in order to differentiate variable types. In Perl, scalars all start with a dollar sign. That's
variables like numbers or strings. Arrays all start with the at sign. It's done partly to keep
namespaces separate and partly to make reading code easier.
In a Perl program, you can immediately tell what type of data you're working with.
In Intercal, this is done for a very different reason.
It makes code less readable.
The kicker here is that variable names in Intercal are all numeric.
Any number that would be a valid integer
is also a valid variable name. So to wrap this together, you get beautiful variables like
semicolon 5, the array of 32-bit integers. That's nonsense. Now, 16-bit variables are the most egregious. Since they all have to start with a period, they look like decimal numbers, so you get examples like .1 or .5. It just rubs me the wrong way. We could overthink this a little, which, I mean, come on, you know me.
Named variables are actually really crucial to programming languages.
This is one of those obvious things, but it's worth slowing down to look at.
A variable is really just a name that a programmer gives to some location in memory,
some chunk of bits in RAM. The job of a language is to
deal with that name-to-memory-location shuffle for us. This is so fundamental that most assembly
languages even support naming chunks of memory. That should be a sign that it's basic. In this
lower level, it's usually called a label, but I think the analogy holds.
The underlying address for a variable is just a number, some address in memory, like 1,
2, 3, 5, 8, 9, some nonsense.
But we prefer to call it x, or number of oranges, something we can actually remember that has
meaning to a human.
Intercal strips that away in what I think is a funny way.
Sure, it gives you named variables.
It's just that those names happen to be, well, they're numbers.
The result is you get this beautiful semicolon 5,
which is a named variable that's stored at some address
in memory. But chances are good that it's not even stored at the actual numeric address 5.
Operators in Intercal are honestly worse than variables. At least the variable system is
workable. You can assign values around, even though
you have to use an arrow instead of an equals and dumb numbered names. Intercal gives the programmer
five operators: interleave, which is also called mingle; select; logical AND; logical OR; and logical exclusive OR. The first two here are Intercal originals, which you should know by now means trouble.
Mingle takes two 16-bit numbers and it mingles the bits.
That's really the best way I can describe it.
It takes the binary representation of each argument and then constructs
a result by alternating bits from each argument. A mingle of 111 and 000 would be 101010.
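If you want to see what that actually does, here's mingle sketched in Python. This is just my reading of the manual's description, not code from any real compiler; I'm assuming the first operand supplies the higher bit of each pair, which matches the example above.

def mingle(a, b):
    # Interleave the bits of two 16-bit numbers into one 32-bit result.
    result = 0
    for i in range(15, -1, -1):                  # most significant bit first
        result = (result << 1) | ((a >> i) & 1)  # a bit from the first operand
        result = (result << 1) | ((b >> i) & 1)  # then a bit from the second
    return result

print(bin(mingle(0b111, 0b000)))   # prints 0b101010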
That might sound deep, but that's not a very useful operation. I hate to break it to you, no one needs to mingle two numbers together,
it doesn't mean anything. Select is in a similar vein, but more complicated. I just don't like select, so I'm going to pull its definition from the manual. Quote, the select operator takes from
the first operand whichever bits correspond to ones in the second operand and packs these bits to the right in the result. End quote. Now, select could almost be useful.
It's really close to a mask operation.
That's an operation that lets you extract the value of certain bits from a larger number.
That last part, the packing all the bits to the right, means that SELECT can't be used as a normal
mask. To use mingle and SELECT to do anything, you have to fight with the language. You have to work
around Intercal's design. The three other logic operations are roughly standard, they're just implemented the
wrong way around. By all sane accounts, logic operations are binary, that is, they take two
operations and produce one result. Intercal's logic operations are unary. They take one operand
and produce one result. As the manual describes it, a logic operation in Intercal is equivalent to
rotating the operand's bits one place to the right, then performing the operation in question between the original and the rotated copy.
Once again, this does technically do something, it's just not useful.
Intercal gives you tools, they just suck.
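Here are both of those sketched in Python, again going off the manual's descriptions rather than any real implementation. Notice how select packs the chosen bits to the right, which is exactly what ruins it as a mask, and how the unary operator pairs the operand with a rotated, ends-around copy of itself.

def select(a, b):
    # Take the bits of a wherever b has a one, then pack them to the right.
    result, out_pos = 0, 0
    for i in range(32):
        if (b >> i) & 1:
            result |= ((a >> i) & 1) << out_pos
            out_pos += 1          # the packing is what destroys bit positions
    return result

def unary_and(x, width=16):
    # AND the operand with itself rotated right one place, ends-around.
    rotated = ((x >> 1) | (x << (width - 1))) & ((1 << width) - 1)
    return x & rotated

print(bin(select(0b11011010, 0b01010101)))   # 0b1100: four bits, packed right
print(unary_and(77))                         # prints 4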
Woods and Lyon even managed to make the overall structure of Intercal gross.
One way to make programming faster is to get rid of repetitive tasks or repetitive code.
That helps make programming easier to do and easier to teach.
Flip that around, and you have the DO / PLEASE DO / PLEASE statement.
This is the most legendary part of Intercal. Each line has to start with do. For variety,
you're allowed to use please do or just please in place of any do. What's not mentioned in the
manual is that you have to use some pleases in your program.
If you don't use enough, the intercal compiler will throw an error, citing, quote,
programmer is insufficiently polite.
However, if you use too many pleases, then you run the risk of invoking another error, quote, programmer is overly polite. It's an arbitrary stylistic restriction that doesn't impact your final program, but does insult the compiler.
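As a toy illustration, here's what a politeness check could look like in Python. The exact thresholds are my own assumption; the original manual never states them, and the numbers below are only loosely modeled on what later compilers are said to use.

def politeness_error(statements):
    # Count how many statements lead with PLEASE (or PLEASE DO).
    pleases = sum(1 for s in statements if s.upper().startswith("PLEASE"))
    ratio = pleases / len(statements)
    if ratio < 0.2:                    # assumed threshold, not from the manual
        return "PROGRAMMER IS INSUFFICIENTLY POLITE"
    if ratio > 0.34:                   # likewise an assumption
        return "PROGRAMMER IS OVERLY POLITE"
    return None                        # politeness level acceptable

print(politeness_error(["DO .1 <- #1"] * 10))          # no PLEASE at all
print(politeness_error(["PLEASE DO .1 <- #1"] * 10))   # nothing but PLEASE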
Now, there are three final pieces to this messed-up language that we have to go over: stash, ignore, and abstain. So far, everything's been silly and frustrating,
but these are features that I actually find really interesting for a number of reasons.
Stash should be familiar if you've worked with any computer or language that supports a stack.
Now, for context here, a stack is basically exactly what it sounds like.
It's a data structure where you can push on a number and then pop that number off. So you can build up this pile, this stack of data for later use.
When you stash a variable, it's pushed onto a stack. You can then retrieve it at a later date.
But Intercal can't be normal, so it implements this in a strange way. Each variable has its own dedicated stack, so Intercal actually keeps you from mixing up stashed variables.
This is really useful since it lets you save the state of a variable before you start in on some
code that could mess up that state. It's exactly the kind of feature you want to make subroutines safer.
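Here's the idea modeled in Python, as a sketch of the behavior rather than anything resembling real Intercal internals. Each variable name maps to its own private stack.

from collections import defaultdict

stashes = defaultdict(list)    # each variable name gets its own private stack
variables = {".1": 42}

def stash(name):
    stashes[name].append(variables[name])   # save the current value

def retrieve(name):
    variables[name] = stashes[name].pop()   # restore the saved value

stash(".1")
variables[".1"] = 99     # some subroutine clobbers the variable
retrieve(".1")
print(variables[".1"])   # back to 42, no matter what else was stashed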
Stash struck a weird nerve with me for one big reason.
And this is a little callback to a recent episode.
A few months ago, I produced a two-parter on Lisp and its predecessor language, IPL.
This older language, the information processing language,
was also awful to work with. Like Intercal, it had gross variable names, although
IPL allowed for an entire single letter followed by numbers. But strangely enough,
IPL had something really similar to Stash. Each variable in IPL was treated as a list, an array, and you were expected
to push data onto those lists. That's exactly what Intercal is doing here. It's treating each
variable like a secret array. There is a good question to be had. Did Woods and Lyon actually take positive inspiration from IPL? I think we
can safely say no, but that opens up something interesting. Lisp is on the list of languages
that Intercal was built to mock. Now, Lisp doesn't do the whole stack thing that IPL did. In fact,
Lisp was partly designed as a response to IPL, a bad response. McCarthy,
Lisp's author, did not like IPL. So, in kind of a funny way, Woods and Lyon may have been
reverse-engineering a weird feature here. So, we've established that Stash has a
serious-world counterpart, but Ignore and Abstain are another story.
These are modifiers that change how Intercal functions.
Ignore will freeze the value of a variable
until you tell Intercal to remember that variable.
Once ignored, that variable's value won't change
no matter what you do to it.
That's not really a feature that's found
in any other language. It's more like a rule for Calvinball. Abstain works in a similar way,
but for language features. You can basically break keywords. If you want to, say, prevent
some code from jumping, then you could tell Intercal, please abstain from
nexting. That is 100% valid code right there. Reinstate is used to revert an abstention,
but here's the neat part. Or, I guess the awful part, actually. You can abstain from reinstating. You just have to say,
do please abstain from reinstating.
Thus, you make all abstentions irreversible.
Intercal can really be used to break itself.
In the context of intercal, these are, frankly,
at best useless and at worst dangerous to the programmer's health.
The careful use of abstain and ignore could be used to build up conditional expressions,
which is neat but definitely scary.
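To show what I mean, here's a hedged Python model of that trick. The enabled flags mirror how abstention is described above; using them as a makeshift conditional is my own illustration, not something spelled out in the manual.

# Each statement carries an enabled flag that other statements can flip.
program = [
    {"label": "(1)", "enabled": True, "action": lambda: print("branch one")},
    {"label": "(2)", "enabled": True, "action": lambda: print("branch two")},
]

def abstain(label):
    for stmt in program:
        if stmt["label"] == label:
            stmt["enabled"] = False   # the statement still exists, it just won't run

abstain("(2)")             # a crude "if": disable one branch, keep the other
for stmt in program:
    if stmt["enabled"]:
        stmt["action"]()   # prints only branch one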
However, these two keywords strike me as interesting debugging tools.
At least, I can see a glimmer of usefulness here. The ability to hold a variable
constant could definitely help suss out certain issues in your code. Turning off, say, a set of
function calls could serve a similar purpose. I think that's what makes Intercal work so well
as a parody of language. Its features aren't just dumb and useless and funny, they have the gleam
of smart design. On paper, in total isolation, the mingle operation sounds like it's gotta be
useful for something. It sounds fresh, new, and perhaps powerful. It's the kind of thing that you
hear someone talking about in a bar and you're like, oh, I've never thought about using mingle
before, maybe I should do some research. But when you stop and think about it, you realize that no,
mingle is nonsense; it can't do anything useful. The Intercal manual itself helps to
heighten this experience. It's a parody of contemporary language documentation, at least
in some places. It doesn't hide the parody well. There's one page
that's labeled as Table 1, Logical and Other Functions that's just a drawing of a table with
random functions on it. Even the language's name is a dumb joke that's trying to trick you.
Intercal sounds important, right? It lines up fairly well next to Fortran, APL, Lisp, or COBOL.
It's even in all caps, so, you know, it's gotta be serious.
What does it stand for?
Well, according to the docs, quote,
The full name of the compiler is
Compiler Language with No Pronounceable Acronym,
which, for obvious reasons, abbreviated Intercal.
The package deal here is, I think, pretty funny. You get a language with a nonsense name. You get lines of code like,
please do not give up, which I assure you is totally valid Intercal. Those lines are next to code that's literally unreadable. Here, let me just
give you a sample to prove it. Do spot one less than worm rabbit ears wow or one mingle rabbit
ears mesh three spark. That does something? I know it's a nested mingle operation.
It may also summon a demon as a byproduct.
Completing the joke is the fact that Woods and Lyon actually implemented a full compiler for Intercal.
It was called, appropriately, Ick.
Now, there's probably a lot you could say about Ick that I'm going to gloss over.
It was written in SPITBOL, which, yes, isn't a joke, that's an actual programming language. What's important is that SPITBOL was a language primarily designed for processing string data. It was a tool for slicing up and making sense of text.
A compiler, as I'm sure I've said over and over again on this podcast, is a really complicated
program. There's a whole field of study called compiler theory just to give
you an idea how involved this all is. But hey, what if you don't have any realistic concerns for
things like performance? Then you can pull off some cool tricks. According to Intercal's docs,
Ick first converts nonsense source code into SPITBOL, then calls the SPITBOL compiler to make that a finalized binary file.
It's definitely a shortcut, and it has its problems, but it is a neat solution.
The resulting program is slow and bloated, to be sure,
but the shortcut saved Woods and Lyon from having to write a real compiler. The code
could be much simpler. I can only imagine that it helped make Intercal possible. I think it's fair
to say that Intercal is a bad programming language. It breaks all the rules, it's hard to use,
and in some cases I'd call it downright evil. But Intercal is intentionally bad.
Its evilness is inbuilt as a parody of other programming languages.
This leads me to a larger question.
We've already discussed how Intercal is making fun of some existing issues that Lyon and Woods had with contemporary languages.
That made for a compelling retort.
But what other evil is lurking out there?
Was Intercal the first high-level programming language that broke all the rules, or were there
accidental disasters made in the wild? This puts us in a bit of a weird situation.
I don't think it's normally fair to say that there's such a thing as a bad programming language.
I know that is a bit of a cop-out, but let me explain.
Sure, Intercal is bad, it's gross and useless, but that was by design, so I'll give it a pass.
We could say that Intercal is good, as much as it pains me to say that,
because it lived up to its design goals. To do so,
it had to pick up a lot of traits you'd normally call bad. I know that's a little roundabout,
but that's the best explanation I can think of. I'd like to think I've done a reasonable job
explaining what those bad traits are, or to put it another way, the usual design goals lurking
behind a programming language. You want a language to be easy to use, easy to learn, easy to debug,
and hopefully easy to port. We've seen how Intercal intentionally inverts three of the four.
But what about languages that inadvertently miss the mark? Well, here's the crux of this weird position.
Those four rules that make up my theorem can be relative.
Easy is a pretty subjective term, after all.
Easy depends on what you're comparing to.
For Woods and Lyon, easy meant Fortran, SPITBOL, Lisp, COBOL, and a host of other languages popular
in 1972. What would happen if we went back further? Now, there are a number of routes we could take
here, but I'm pulling from a short list on the Esolang wiki over at esolangs.org. It's a really wonderful source of information about esoteric languages, and it offers some theoretical pre-Intercal
candidates that fit this esoteric mold. The example I'm going to discuss here is TRAC,
the Text Reckoning and Compiling Language. Development on TRAC started in 1959, which
pushes the language pretty far back into the timeline. I'd like to present TRAC as an example of a language that was designed
for good reasons, but despite that, shares many properties with Intercal. TRAC was developed by Calvin Mooers as what he calls a keyboard-oriented language. Now, before we go any further, this is not the same as the Moore of Moore's Law. Once again, different spellings.
Now, for the time, the idea of a keyboard-based language was a pretty revolutionary one.
The workflow for early programming kind of sucked. The usual approach was to punch up some cards
containing your source code, load up a compiler, then pass
your code deck through. Once that was done, you'd have binary on either some tape or maybe another
deck of cards. Then you can load that up and run it. That's fine. It works, but it takes a lot of
steps. TRAC was Mooers' attempt to simplify that problem. He envisioned a language that would run as an environment, accessible over a terminal.
The programmer could then sign on and be dropped to a prompt just waiting for code to be typed.
It honestly sounds a lot like BASIC, and timeline-wise, TRAC was roughly a contemporary of BASIC.
At least, their development periods have some overlap.
I think it's crucial to keep context in mind here.
We have a common problem, namely, how to make programming easier.
Hey, that's right on my big list.
BASIC had other design goals, sure, but better utilization of text-based terminals was central
to BASIC's success.
So, what about TRAC?
How did Mooers approach this same problem?
And is TRAC really that strange of a language?
In a word, yes.
TRAC is some high strangeness.
On first glance, TRAC looks like it would fit nicely as a predecessor to other inscrutable esoteric
languages. The syntax and character set is strange. There seem to be weird repetitions.
Most lines of code start with either a single or a double pound sign and an open parenthesis.
It looks maybe a little like Lisp. There are a lot of parentheses, but data seems to be broken up by commas and even less than and greater than symbols.
To the outsider, even a seasoned keyboard pro like yours truly, TRAC does seem inscrutable.
Sounds like a step in the right direction.
But we can keep going.
The core design of TRAC makes me uncomfortable. There's something unsettling about it to me. TRAC is all about strings. That's the core data type.
And that's a fine choice for the medium. Terminals are all about text, so why not make a language for terminals all about text? It sounds like a nice
symmetry, but here's where it loses me. Strings are to TRAC as lists are to Lisp. That's my short explanation. It's going to take me a bit longer to unpack that. I'm working off a 1965 paper describing TRAC that Mooers and a co-author submitted to the ACM. Now, notice that this has had a pretty long development period. In the paper, Mooers admits to drawing some inspiration from Lisp. I mean, at the time, you only really had three big ways to go. You had Lisp, Fortran, or ALGOL. I'm not going to go too deep into Lisp
since I just covered that language recently, but let me just say this. Lists are crucial to the
language. Lisp is designed to process list data, specifically linked list data. The language is
built to do that in a very practical and unified way.
In Lisp, data and code are represented in the same way.
This means that, among other things, a Lisp program can really handily write and execute another Lisp program.
Everything's just data, after all.
Now take that short summary and replace all instances of list with string and Lisp with TRAC. All your data is stored as strings, just chunks of text. Your program is also,
shocker of shockers, just a big string. So far this is fine, right? All code at some level is just a blob of text, right? Well, here's where we start
the descent. Mooers explains that one of the key benefits here is that you can redefine your program in real time. Pulling from the text, quote, because TRAC procedures and text have the same
representation inside and outside the processor,
the term homoiconic is applicable, from homo meaning the same and iconic meaning representation.
End quote.
Now, I do like that term.
You could also call Lisp homoiconic, in a sense, due to the whole data-code symmetry.
But here's the spooky part. The processor is what
Mooers calls the interpreter, the program that takes TRAC code and executes it for you. Now,
in the normal world, interpreters and compilers are complicated and high-powered beasts.
The common approach when dealing with code is to take raw text and break it up into
something more easily dealt with. That could be a lexical tree, something in binary, or
even just a list. It doesn't have to be that complicated. The idea is to turn the human
readable text into a convenient form for the computer to start working with. This is that awkward, roughly hewn bridge I was talking about at the top.
TRAC doesn't do that. It's fully homoiconic. TRAC is all strings, all the time.
This is where my concern reached dangerous levels.
Since TRAC doesn't really turn your code into anything but, well, a string,
you can pull some cool tricks.
Runtime rewrites are one example.
In practice, for the programmer, you can stop a TRAC program,
make some adjustments, and then set it back running.
You normally can't do that.
With other languages, you have to restart execution from the beginning.
But TRAC keeps your code as code. It's all a string. So you're good to go. Why is that a bad thing?
Well, it comes down to implications, as always. Since TRAC doesn't do anything fancy to your code, it has to pull some of its own tricks to actually work. When you're in the territory of pulling tricks to operate, that's a bad time. TRAC actually dynamically rewrites your code as it's executed. That should sound bad. If it doesn't immediately make you bristle, then let me break it down a little bit. In TRAC, every operation you perform is a function, meaning it returns some value.
Now, there are exceptions.
Mooers points out that the command to print a string is technically not a function, but
it does return a null value, so it counts most of the time.
I'm going to pull an example from the ACM paper for this, but rephrase some things for the sake of clarity.
Let's say you define some variable x as the string hello world.
TRAC has this idiosyncrasy where you can't reference a variable directly because any unadorned chunk of text is treated as a literal string.
You have to grab a variable by name using the call function.
So to print your variable's value,
you would write something along the lines of print call of x.
Here's where the whole rewriting code thing comes into play.
When track goes to execute that line,
it starts with the innermost function. In
this case, that's call. TRAC runs call, grabbing the value of x. Now, in most other interpreters,
they would do something like, say, move on to the next function in the call tree, passing up
the value of the last function. These normie interpreters would think, oh hey, I have the value of x stored
in some temporary storage or I have a pointer. I just have to pass that on up.
TRAC doesn't do that. It doesn't have any deeper inner machinations. Instead, TRAC replaces call x with the value of x, so your program is rewritten by the TRAC interpreter
as print hello world. This is the most simple example, and hopefully you can see the special
approach here. TRAC doesn't so much execute your program in any traditional sense. The interpreter reduces your program, so to speak, until you're left with no more executable code. TRAC even has special words for all of this.
Strings that can be executed are called active. Everything else is neutral. So,
hello world is a neutral string, since TRAC's interpreter can't reduce that anymore.
There's just nothing left to do.
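Here's that reduction loop sketched in Python. Fair warning: this is a toy that handles just two primitives and ignores TRAC's real parsing rules, but the core move is faithful to the description above. The interpreter literally splices each result back into the program string and rescans.

import re

forms = {"x": "hello world"}    # stored strings, which TRAC calls forms

def step(program):
    # Find an innermost #(...) call, one with no nested call inside it.
    match = re.search(r"#\(([^#()]*)\)", program)
    if not match:
        return program          # fully neutral: nothing left to reduce
    name, *args = match.group(1).split(",")
    if name == "cl":            # cl: call, fetch a stored form by name
        value = forms[args[0]]
    elif name == "ps":          # ps: print string
        print(args[0])
        value = ""              # ps returns the null string
    else:
        raise ValueError("unknown primitive: " + name)
    # The rewrite itself: splice the result into the program text.
    return program[:match.start()] + value + program[match.end():]

program = "#(ps,#(cl,x))"
while "#(" in program:          # #(ps,#(cl,x)) -> #(ps,hello world) -> empty
    program = step(program)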
So yeah, your program is constantly changing as it's executed.
I have to admit there are some upsides here.
TRAC can be remarkably transparent as an environment.
Nothing is really hidden because everything's happening out in the open.
A programmer could, in theory, watch exactly how their program runs.
And I guess that does hit the easier-to-debug item.
Another upside is that the interpreter for TRAC could be simplified.
The ACM paper describes the execution process as a series of string manipulations,
so relatively easy compared to other interpreters.
But I can't get away from the fact that your program is dynamic. It sounds like something
right out of a bad sci-fi film. Cue a teenager in a trench coat turning away from his desk,
looking into the camera with horror in his eyes and saying, the computer, it's rewriting my code.
All jokes aside, let me try to explain why this gets to me so much. Control is an important
aspect of programming. This matters in the obvious sense. A programmer uses code to enact
control over a machine. At the same time, a programmer needs a controlled
environment in order to work. Pulling that control out from under you is one of the jokes
that Intercal plays. You have to use some undefined number of pleases in your code or
it won't compile. There's also random checks in Iick that will just choose to not compile your program.
There's an error number for it and everything.
These are arbitrary rules, something unexpected,
and something that can leave a programmer feeling like they've lost control of the computer.
A system that's built around rewriting and modifying your own code,
that has to warp stuff around to function, that just feels uncomfortable.
Just like the politeness conditions, TRAC's dynamic code execution makes a computer sound like a
less reasonable and less controllable machine. It's unexpected, it's unique, and if it was played
as a joke, I'd be laughing along. The matter of TRAC's syntax adds a little more spice to the whole ordeal.
To start with, the syntax had some fun wiggle room. From the 65 paper, quote,
In practice, the function indicator sign pound, as well as other TRAC control or meta characters, should be chosen in accordance with the ease in typing on the particular keyboard that will be used. End quote. And I get it. It's keyboard-oriented.
Mooers is trying to make a flexible language.
But this kills me. In effect, this means that
there is no standard TRAC, at least not according to this early description. This means that, at least in theory, one implementation of TRAC could look totally different than another.
In this era, we didn't really have standardized character sets, and an IBM mainframe used a different set of text-mode characters than, say, a DEC. So there is a reason here, but the side effect is that there's no standard look for TRAC syntax. Moving on to other parts of the language,
TRAC has no fewer than three different ways to deal with functions. Each function call is wrapped in parentheses, but
must be preceded by either 0, 1, or 2 pound signs. If I understand TRAC correctly, then
no pound sign just groups code. A single pound means execute normally, and a double pound means
to execute but treat the return string as neutral. All TRAC primitives, the functions that make up the bulk of the language,
are abbreviated to two characters.
Print string is just PS.
Call is CL and so on.
To cap things off, we get fun unmatched punctuation.
TRAC treats all arguments as strings, no need for wrapping anything in quotes. TRAC also supports unmatched parentheses in strings.
So hey, that's neat.
Each line is also supposed to end in a so-called meta-character.
The 1965 doc recommends using the single quote for that purpose.
You know, without a matching close quote.
Pull this together, shake it up, and you arrive at syntax that could also summon a demon if spoken
aloud. The heavy use of parentheses makes you want to say it looks like Lisp, but looking closer,
you can tell it's different. Does that alone make TRAC a bad language in the same sense as Intercal?
Well, I think it's hard to say. Let me try and give a fair opinion here. To start with,
remember what I said about easy being a relative term? We need to be careful here, since in the
far-flung 21st century, we have pretty nice programming languages.
I mean, setting all things aside, just look at C.
That's really become the standard for a lot of languages.
Most modern languages draw their syntax from it.
But TRAC predates that lineage, right?
We have to look at the contemporary languages around when Mooers was developing his own language. I already touched on BASIC, so I'm not going to go further down that line.
Luckily, we have a list of good languages.
The intercal manual gives us a rundown of the languages it was trying to break away from,
so I think that's a starting point.
Let me just filter down that list for things that would have been available circa 59 to 64, let's say. Now, this actually surprised me. Woods and Lyon list 12 boring, normal languages in the 1973 Intercal manual. Just to give the full listing, since I think it's a little fun, we have FORTRAN, BASIC, COBOL, ALGOL, SNOBOL, SPITBOL, FOCAL, SOLVE, TEACH, APL, LISP, and PL/I.
I think at least 10 of these are real. I can't find anything on SOLVE and TEACH, so
that might be a joke or they might just be obscure. From that list,
five languages were developed before 1964. What's interesting is the languages in question.
We get Fortran, COBOL, ALGOL, SNOBOL, and Lisp. SNOBOL would go through a lot of revisions leading up to 1967, so that one I'm going to discount. Let's just say that we have those four languages that will form our axis of easiness.
This is where my earlier statement about not judging TRAC by modern expectations
kind of falls apart. Fortran is, you know, Fortran. It's old and weird, but it has the hallmarks of modern language syntax.
We get named variables, lines of code,
normal mathematical notation that lines up with handwritten equations,
that sort of stuff.
Things we take as a given.
In this era, Fortran was a huge player,
so anyone using TRAC would have seen Fortran.
ALGOL is, by and large, a modern-looking tongue.
That's because ALGOL would go on to inspire C, which in turn cemented a lot of programming convention.
Initially, ALGOL was published as a proposal for a universal language.
This proposal and subsequent revisions
to it were passed around academic circles, so anyone that was interested in a new cutting-edge
language like TRAC would have read about ALGOL. They would have seen something that looked like
a modern language. COBOL and LISP are kind of the odd ones out here, since despite having their own good qualities,
they each have pretty unique syntax. The only languages that really, like, truly look like LISP are ones that are derived directly from LISP. And, well, nothing quite looks like COBOL. It's
a really verbose and wordy language. The bottom line here is that the concept of easy, at least when it comes to
what a language looks like, was at least in the process of being established in the late 50s and early
60s. I think ALGOL is the best touchstone here, but FORTRAN gives some good backup.
So, would I say that TRAC is easy to use compared to its contemporaries?
No.
Easy to learn?
No.
Easy to debug? I think that one's more of a wash, since the weird code modification may actually help
with transparency.
And easy to port?
Yes, but that comes with the caveat mentioned in the ACM paper of variable character sets.
I think at this point I'm beating the drum a little hard,
but TRAC breaks a lot of the same rules that Intercal does.
And while easy is relative, there isn't that much wiggle room.
So does this make TRAC an even earlier esoteric language?
Well, I wouldn't be so quick to say yes. For one,
I think esoterica is a matter of intent. Mooers intended TRAC as a serious language for serious
use by, presumably, very serious people who never smiled. Want some fun proof of this? Well,
he registered a trademark for the language's name.
That's right.
Every time I've said TRAC, go ahead and put a little TM after it. I'd say that's proof enough that Mooers designed this language with a straight face.
So while TRAC has the outward hallmarks of an esoteric language,
while it inverts a lot of the tropes that make a reasonable
language experience, it's not meant as a joke. It's not a parody. And it's not hacker folk art.
It's just a kind of spooky and uncomfortable thing. Now, there is one final thing to consider.
That's the precedent that languages like TRAC set. Intercal can be funny because there are examples of gross languages out there. Parody doesn't work without a target. TRAC may not have been the specific language that Woods and Lyon set out to mock, but I think it's emblematic
of some of the strange and baffling issues with programming languages at large.
And really, the best thing to do when you get frustrated, flustered,
or even become uncomfortable with a language is just laugh it off.
Alright, with that, we've come to the end of today's episode.
Call them jokes or call them art,
esoteric programming languages offer an interesting critique of the serious side of the discipline. Personally, I can say I really
like Intercal. You wouldn't catch me dead using the language for any real work, but there's just
something compelling and charming about it. The fact is, programming can really suck. Humans weren't exactly built as logical machines.
A good programming language can help bridge the gap between us and our digital counterparts,
but issues will always remain. For me, what makes Intercal work so well is its self-awareness.
As we've seen, that's kind of the crucial point in pulling off the parody.
Woods and Lyon were able to identify the absurd and frustrating parts of contemporary languages,
crank those up to the max, and then throw it all together with a grin.
It's that knowing wink that really makes it all work.
Take away the laughs, meshes, and splats, and you're left with a monster. And that has happened. There is a
precedent for languages that fit a really similar scary mold. TRAC, I think, gives us a good example
of the worst case here. It has a lot of good ideas. It sounds profound on paper. While I was
working on this episode, I was trying to explain TRAC to one
of my friends. His initial reaction was that, oh, that self-modifying code must be great for AI.
As soon as I showed him some of the actual code, he quickly changed his tune. It's like
Intercal's mingle operator. On paper, it sounds like it has to be useful. It has to have some niche. But mingle isn't an operation
anyone wants to use. The weird syntax just makes it that much worse. The difference between
Intercal's mingle and TRAC's self-modifying code is intent. One was meant as a parody.
One is a registered trademark. Now, if any of this sounds hilarious, then I highly recommend giving Intercal a try for yourself.
There's a modern compiler for the language.
There's actually a few.
C-INTERCAL is probably the most well-known. For a quicker fix, there are even online Intercal environments.
I'll link to at least one in the show description.
Just remember, your program has to be polite to compile.
Thanks for listening to Advent of Computing. I'll be back in two weeks time with another piece of
computing's past. And hey, if you like the show, there are now a few ways you can support it.
If you know someone else who'd be interested in the show, then why not take a minute to share it
with them? You can also rate and review the show on Apple Podcasts. And if you want to be a super fan, then you can support the show directly through Advent of Computing merch or becoming a patron on Patreon.
I actually have a new bonus episode that I just put up on Patreon on the Oak Treat language, so go give it a listen if you're signed up.
Patrons get early access to episodes, polls for direction of the show, and a bunch of
bonus content. You can find links to everything on my website, adventofcomputing.com. If you have
any comments or suggestions for a future show, then go ahead and shoot me a tweet. I'm at
adventofcomp on Twitter. And as always, have a great rest of your day.