Advent of Computing - Episode 40 - Spam, Email, and Best Intentions
Episode Date: October 4, 2020

Spam emails are a fact of modern life. Who hasn't been sent annoying and sometimes cryptic messages from unidentified addresses? To understand where spam comes from we need to look at the origins of email itself. Email has had a long and strange history, and so too have some of its most dubious uses. Like the show? Then why not head over and support me on Patreon. Perks include early access to future episodes, and stickers: https://www.patreon.com/adventofcomputing
Transcript
Stop me if this one hits a little bit too close to home.
Imagine for a moment, you're minding your own business.
Maybe you're trying to focus on work, when all of a sudden your phone beeps into life.
Now, it's probably nothing, but you decide to reach down and check just in case it's actually an important message.
You see one new email. The subject reads, URGENT. LAST CHANCE. You're surprised the message is even
in all caps. That has to be important, right? But once you open the email, you realize much
to your disgust, it's just an advertisement. You've become one of the latest victims of spam.
We all use email on a daily basis. Spam is just one of the unintended
consequences of that technology. Email is used for nearly everything, from billing notifications
to catching up with friends, sharing small files to mailing lists. It's a really interesting case
of technology that was just such a good idea it was bound to happen. In fact, there are many possible origins for the
first email system. And when you get down to it, email is just a really natural response to a
mundane need. It also happens to be a great example of using some of the most cutting-edge
technology to do something as simple as sending out an all-caps ad. In the 21st century, email
spam is kind of just a fact of life. As long as
there's someone selling something or promoting something, the messages will keep flowing.
In the physical world, we have the equivalent of junk mail. And as the internet became more
ubiquitous, it was really just a matter of time before these same tactics made their way into
the digital world. But how did that jump occur?
Well, to answer that question, we need to look at how email itself came to be.
Welcome back to Advent of Computing.
I'm your host, Sean Haas, and this is episode 40, Spam, Email,
and Best Intentions. We're now in October, and that means it's once again time for Spook
Month here on the podcast. That means that this month's episodes will be at least an
attempt at scrounging up something close to digital ghost stories. Last year, I offered
two slightly on-theme episodes, one covering the early history of computer viruses and another on the game Colossal Cave Adventure.
Now, the operative issue at hand is that computers aren't actually all that scary,
but they can certainly be strange and frustrating.
So to kick things off, we're taking a dive into one of the more frustrating aspects of the internet.
That's email.
The fact of the
matter is that email was a wonderful idea. But as more people hopped onto the platform,
it became pretty easy to abuse. Spam mail is just one prevalent issue. Viruses are very easily
spread over email. Scams also thrive in the medium. Who hasn't gotten an email that offers you millions of
dollars if you just pay some nominal pesky bank fees? Literally everyone and their mother has an
email account. So there are plenty of targets for these types of messages. It's some actually scary
stuff. But stepping away from all the modern issues, how did email actually come into being?
Well, that's where things veer into the strange and slightly complicated. Broadly speaking, there are two
phases to the story. Localized mail and eventually networked mail. The idea of email is really
simple. It's just a means to send messages from one user to another. Something similar has been
invented, used, and then lost
countless times throughout the development of computers. So teasing out the real origin of
email gets a little bit dubious. It wouldn't take shape into what we know until ARPANET steps in,
and from there it would spread as a core part of the growing internet. So let's take a closer look
at email, from its contested origins and parallel development
to its eventual standardization. And along the way, we'll try to find out just why junk mail
came into the digital world. And if we're lucky, maybe even why junk mail stuck around for so long.
Pinning down the first email system is actually really hard. I'd be willing to bet that it's impossible to get a 100% definitive
answer. The reason for that is that there's no single origin to point to. Email and email-like
systems were developed in isolation for years, on totally unrelated computers by totally
unconnected people. The idea of sending digital messages is just so natural that it seems to just pop up on its own.
That being said, we can examine some of the earliest mailing systems to see where the idea started to take more of an official shape.
The whole concept of electronic mail starts to form just after the creation of timesharing systems.
So that's sometime in the early 1960s.
These systems allowed for a single computer's resources
to be shared between multiple users.
For the first time, more than one person could actually use a computer at once,
something that was previously impossible.
When you have a system that can only ever be used by one person,
sending messages around doesn't make all that much sense.
I mean, what are you going to do,
leave a note for the next person to turn on the computer?
You don't need special programs for that.
But once you get to this point of shared infrastructure, the possibility becomes a lot more enticing.
Suddenly there's more than one person using the same computer.
Maybe even enough users to hold a conversation.
The other factor is how these early timeshare systems were actually used.
To log into the system, you would first sit down at a personal terminal which was connected up to
the computer. Some of these terminals were wired in more or less directly, but they didn't have to
be. Most often, users were connected over a telephone line using a modem. Thanks to the
robust phone grid in the States, that meant that a user could log in from basically anywhere in the country. Work was no longer constrained to a single server room.
Instead, you could work from basically anywhere. This represented a total shift in how computers
could be used, and there'd be some big consequences. One of the first of these multi-user systems was MIT's Compatible Time-Sharing System,
or simply put, CTSS.
Completed in 1961, CTSS allowed for up to 30 users to connect to a shared IBM 7094 at
once.
While all the software that made CTSS is impressive, I just want to focus on one small aspect of
the system, and that's how it handled user accounts.
One of the big problems in any timesharing system is how to keep one user's data and programs from interfering with those of another.
This becomes really, really important when multiple programs are running simultaneously.
If one program can interfere with another, then conceivably the whole mainframe could go down.
This specific problem is usually solved using memory protection.
It's basically tight controls over how a program can actually read and write to the mainframe's memory.
Now, a parallel problem exists for files.
While one user messing around with another user's files may not harm the computer much,
it could lead to some more personalized disasters. I know
if someone wrote over my files, I wouldn't be very pleased to say the least. Just like with other
resources, the solution was to break storage up into chunks and control access to each chunk.
In CTSS, each user had a personal directory tied to their account, and with each account being
password protected, only that user
could access that directory. This worked pretty well for keeping files safely isolated. Runaway programs couldn't tamper with someone else's files, and prying eyes couldn't easily snoop on data.
But as CTSS gained more users and became much more complex, this system ran into some weird issues.
What do you do, for instance,
if someone wants to share data with another user? To us today, that sounds pretty normal. We pass
around files all the time. But in the early 60s, this was totally new. Programmers were just
starting to figure out how to make cohabitation on a computer possible. So even something as
mundane as sharing a file, well,
that was breaking new ground. CTSS implemented this in two different ways. The first and most
simple was the concept of common files. The 1963 Programmer Manual described common files like this,
quote, to allow convenient cooperation between programmers, such as students in classes or group projects, there's a feature which makes it possible to have files common to several different users, end quote. Essentially, common files were stored in a directory that anyone could access.
Any user could read or write to common files, make new files, or even run programs stored in this shared directory. It made sharing data between programs really easy,
and it made collaboration between programmers very possible.
The feature itself is simple, but it paved the way for much more.
Just by having a shared digital space,
it became possible to make larger community-managed datasets,
or for teams of programmers to work together
without actually having to be near one another.
However, common files weren't a catch-all.
All this data was still public.
If you had a CTSS account, you could have access to every common file.
The second solution filled the gap by allowing users to securely or privately share files.
On CTSS, this feature was called a linked file and allowed one user to share
a file to any number of other users. Once linked, that file could be read only by users it was
linked with, but could only be written to by its initial owner. So unlike common files, a linked
file was relatively safe from outsiders. But it still had a problem. Linked files were only useful for sending data on a one-way trip.
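To make the two sharing models concrete, here's a minimal sketch in modern Python. This is purely illustrative — CTSS itself long predates Python, and the class and method names here are my own inventions, not anything from the actual system. It just models the access rules as described: common files are readable and writable by everyone, while linked files are readable only by the users they're shared with and writable only by their owner.

```python
# Illustrative model of CTSS-style file sharing. All names here are
# modern inventions for the sake of the sketch, not CTSS code.

class SharedFile:
    def __init__(self, owner, common=False):
        self.owner = owner        # user who created the file
        self.common = common      # common files: anyone may read or write
        self.linked_to = set()    # linked files: explicit read-only grants

    def link(self, user):
        """Owner shares the file with another user, read-only."""
        self.linked_to.add(user)

    def can_read(self, user):
        return self.common or user == self.owner or user in self.linked_to

    def can_write(self, user):
        # Common files are writable by everyone; a linked file is
        # writable only by its initial owner.
        return self.common or user == self.owner

# A common file: open to absolutely everyone, which is exactly the
# privacy problem described above.
board = SharedFile("tom", common=True)
assert board.can_read("anyone") and board.can_write("anyone")

# A linked file: private, and read-only for the users it's linked with.
note = SharedFile("tom")
note.link("noel")
assert note.can_read("noel") and not note.can_write("noel")
assert not note.can_read("stranger")
```

The asymmetry in `can_write` is the "one-way trip" problem: a linked file lets the owner push data out, but the recipient can't write a reply back into the same file.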
Now, everything I've outlined so far were official features of CTSS,
things that were designed and built into the system's core.
But with so many people sharing a computer,
interesting behavior and conventions start to form.
Sometime in the mid-1960s, Tom Van Vleck, a programmer on the
CTSS project, described a particularly strange custom forming around these shared files.
Quoting from Van Vleck, quote, this new ability encouraged users to share information in new ways.
When geographically separated, CTSS users wanted to pass messages to each other.
They sometimes created files with names like TO TOM and put them in common file directories.
The recipient could log into CTSS later from any terminal and look for the file and print it out if it was there.
CTSS users were doing more than just programming on the mainframe.
They were starting to communicate at long distances.
And really, it makes a lot of sense that this would happen.
Humans like to talk, and common files were a really easy way to share data.
The feature was already being used for programmed data, so why not adapt it for more personal
information?
But there were some issues with this ad hoc message passing.
Throwing messages into common directories was really just like posting a letter on a bulletin board.
Anyone who passes by can read it, or even scribble on their own note.
As long as everyone agrees to play nice, then that's fine.
But with an increasingly growing system, the chance of interference was there.
And what would happen if there was more than one person named Tom on the server at once?
At the same time that users were casually sending around messages, a more serious need was starting to rise.
CTSS was a big system, with a lot of different programmers contributing software.
So as with any complicated project, documentation was very key to its
success. Source code had to be documented, new commands had to be documented, and someone had
to manage all that pile of documentation. The process at MIT was pretty standard. The author
of any new software on the system was required to write up the proper docs on the new code.
That rough draft was then sent to an editor to, well, you know, edit the rough draft and eventually enshrine it into the authoritative manual. Sometimes, depending on what was submitted,
the editor would need to get back in touch with the author for clarifying questions or critiques.
And even once a new command was documented and put into the system, it was important
to keep a dialogue open.
Users are really good at finding bugs in code, so having some way to handle user feedback
is essential to the health of a project as large as CTSS.
But with users now separated out geographically, there were some logistical problems in handling
feedback.
The first attempt to address these problems came in the form of a memo titled Programming Staff
Note, Proposed Minimum System Documentation. Now, the actual note doesn't have a date on it, which
I don't like. If you put a document anywhere, you should have a date on it for future reference by someone like me.
The best guess I can give is that it was written either in late 64 or maybe early 65.
The bulk of the memo deals with improving the documentation workflow.
One of the improvements proposed is a little program called Mail.
Quoting from the docs, quote, a new command should be written to allow a user to send a private message to another user, which may be delivered at the recipient's convenience.
This will be useful for the system to notify a user that all or some of his files have been
backed up. It can also be useful for users to send authors any criticisms, end quote.
The initial purpose for mail is twofold,
get user feedback to programmers and send alerts to users while they're offline.
Just some interesting things to note here, even in this rough draft state, we have a proposed robot sending emails to users. That's been around a lot longer than one might hope.
Anyway, there are a few little details that I want to tease out of that description. Firstly, mail was planned as a way for users to
get messages, quote, at the recipient's convenience, end quote. As in, a way for users to be sent messages without
being logged into the system. That's not new on CTSS. Users were already doing
that with common files. The other interesting point is that the memo explicitly says that these
are private messages. That's the part the common files were missing, and a sorely needed feature
for a practical messaging system. But work on mail wouldn't start immediately. In the spring of 1965, Tom Van Vleck and Noel Morris had just joined the CTSS project.
The two ran across the mail memo pretty soon after joining.
But they didn't find anything else about the command on the system.
Quoting Van Vleck,
When we read the PSN document about the proposed CTSS mail command, we asked,
Where is it? And we were told there was no one to write it, end quote. Well, the two of them were very much available to write up some new code.
So by summer, the mail command was added to CTSS, and users were finally able to send secure private messages.
And as with a lot of technology, the devil really is in the
details. Even in 1965, we start to see the bones of what would become email. When you get down to
it, mail was really just a codification and improvement of the existing common file conventions.
But it does so in a bit of an interesting way. Earlier I mentioned that CTSS had another way to share files between users, the so-called
linked files.
It seems like that could have been one way to implement a message passing system.
But Morris and VanVleck didn't go that route.
The issue with using linked files is that they only work for one-way communication.
So to get a conversation going, you'd have to set up a series of multiple linked files. That's why common files were a much more popular option. Everyone can read
and write to them. No setup needed. Mail did things just a little bit different. To send a message,
you only had to type out who the message was going to and then the message to send them. Of course,
it wasn't that simple. The message had to be the
contents of a file, so in practice, you first had to draft a message, save it to disk, and then send
it. The recipient field was also a little bit more obtuse. You couldn't just say to Tom anymore.
In CTSS, each user had a programmer number and belonged to some problem number. Think of it like a user ID and a
group ID they belong to. To send a message, you had to know both of those magic numbers.
The upshot was that this combination of numbers was totally unique. You can have more than one
Tom, but Van Vleck's address of M14162962, well, that's a little longer, but totally unambiguous.
Another handy feature was that Mail could accept a list of recipients.
You could specify those when you ran the command or store the list in a file.
In this way, you could message all of your friends at once, or even set up a crude mailing list.
That all sounds pretty familiar, but Mail had a little bit of a trick hidden up its
sleeve that we don't have today.
The program also accepts wildcards for recipients.
For a normal user, this meant that you could shoot off a message to everyone in your group.
But if you were a more privileged user, then you could send a message to everyone on CTSS.
Once a message was actually sent, it showed up, where else, but in the recipient's mailbox.
I mean that very literally. Mail would drop all messages into a file called mailbox in a user's
private directory. Each message started with the date and time it was sent and then the user
sending it, followed by the actual message body. If you already had messages in your mailbox, then the new messages would just be appended on. And, as an added feature,
a user could even turn off their mailbox, by just setting the file as read-only.
The final piece of the puzzle was to alert users when they got a new message. On login,
the system would check if the mailbox file had any data in it. And if so, it flashed a quick,
you have mailbox.
That way, a user was less likely to miss an important communique.
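The delivery mechanics just described can be sketched in a few lines of modern Python — again, a hedged illustration rather than actual CTSS code, with the file name, timestamp layout, and permission check all being approximations of the behavior described above: append each message, headed by date, time, and sender, to a mailbox file in the recipient's directory; refuse delivery if the mailbox has been made read-only; and flash a notice at login if the file has any data in it.

```python
import os
from datetime import datetime

def deliver(message, sender, recipient_dir):
    """Append a message to the recipient's mailbox file, CTSS-style."""
    mailbox = os.path.join(recipient_dir, "MAILBOX")
    # A read-only mailbox means the user has turned off their mail.
    if os.path.exists(mailbox) and not os.access(mailbox, os.W_OK):
        return False
    with open(mailbox, "a") as f:
        # Each message starts with the date and time it was sent and
        # the sending user, followed by the message body.
        f.write(f"{datetime.now():%m/%d %H%M.%S} {sender}\n{message}\n")
    return True

def check_on_login(recipient_dir):
    """At login, flash a notice if the mailbox has any data in it."""
    mailbox = os.path.join(recipient_dir, "MAILBOX")
    if os.path.exists(mailbox) and os.path.getsize(mailbox) > 0:
        print("YOU HAVE MAIL")
```

Note how little machinery is involved: delivery is just a file append, and the "new mail" check is just a test for a non-empty file, which is essentially how the transcript describes it.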
In the coming years, mail would become a staple of life on CTSS.
It deftly solved the more official problem of user feedback and documentation management.
But more importantly, it gave users a way to privately communicate. This kind of digital
chit-chat may seem unimportant, but systems like mail marked a fundamental change in how computers
were used. It made very early digital communications possible, and suddenly a computer was more than
just a machine for running programs. Morris and Van Vleck aren't necessarily the creators of email.
There are a few major caveats to keep in mind. If you want to be pedantic, then at this point,
it wasn't being called email. But more importantly, there were a slew of roughly similar systems being
developed around that same time. Anywhere timesharing was in use, some kind of messaging
system was bound to be developed. But the MIT rendition has become one of the more well-known
and influential. We can even see its impact on more modern mail systems. For instance,
modern Unix-like systems handle their messages almost the same way. The biggest caveat here
is actually shared by mail and all its contemporaries. It's
missing one core feature that makes email, well, email. None of these programs could actually send
messages to another computer. You were stuck talking with users on the same mainframe.
In the 1960s, that wasn't really a big deal. Networking was still in its very earliest days. The cutting-edge
networks included maybe a handful of computers at most. The idea of sending messages between
mainframes did come up, but no one really saw it as necessary. Just like how timesharing made these
early message-passing systems inevitable, it would take a new technology to trigger the final push towards email. It would take the ARPANET. There is so much that we could say about ARPANET.
It was the direct predecessor to the modern internet, so a lot of the technology that we
use today was codified on the earlier network. The project started in 1966, with the first part
of the network going online just before the end of the decade.
ARPANET was a massive project.
Most of the core management and funding came from the US government, but the grunt work was actually done by a huge cast of universities and contractors.
There had been other networks, both in the United States and abroad, but ARPANET was the first to really catch on, and a big part of its success was thanks to its design. The fine details of exactly how a network
works don't really matter that much for us today, but I think it's worth having a passing
understanding at least. ARPANET worked so well because it was designed as a distributed network.
There's no one hub of traffic that everyone has to connect to.
Instead, ARPANET, and now the internet, is arranged sort of like a web of routers.
ARPANET used a special purpose machine called an Interface Message Processor to manage all
traffic on the network. When you made a request to some server, these IMPs handled finding the
best route to your final destination.
Sometimes it's right next door, but most likely your request has to hop from one IMP off to another until it eventually reaches its final recipient.
To make everything work out safely, each computer on the network has a unique hostname, something like Sean's Big Expensive Mainframe.
When you boil it down, the network really just wires up
a bunch of computers. Simple, right? But getting there was a huge undertaking. One of the key
players in its development was BBN, a research and development firm that had major ties to early
development of computers. In the 1960s, their research would become crucial to ARPANET. You can see it right in the network's infrastructure.
The IMPs that handled all traffic on the network were designed and built by BBN.
And one programmer at BBN by the name of Ray Tomlinson would develop the first networked email.
How Tomlinson got involved with ARPANET is actually a bit of an interesting story in itself.
Ray almost made it through college without running into a computer. In the early 60s,
he was enrolled in the electrical engineering program at Rensselaer Polytechnic Institute.
Part of his studies included time working at IBM's main office. But he wasn't working on
IBM's computers. Instead, he was working with analog test equipment.
It sounds like something close to intern grunt work.
Tomlinson really enjoyed his time at IBM, but he could only spend so much time at the office before being drawn into the more digital realm.
Quoting Tomlinson,
After about two or three years of this, I finally saw a computer down the hall that was available for engineers to use, and I decided to learn to program it.
End quote.
For Ray, that was the moment he was bit by the bug.
It would be an experience that stuck with him for years, a small introduction to the larger world of computing.
In 1963, he moved on to grad school at MIT, still on the electrical engineering track. In grad school,
he started to specialize in analog speech synthesis. At the time, computers hadn't yet
found their voice, so the field was still fully analog. The prevailing notion was that computers
didn't really have the processing power to handle human speech. But Tomlinson, well, he didn't really
buy that excuse. He decided early on that his
thesis project would be a computer-controlled speech synthesizer. The design called for a
computer handling most of the work, and controlling a complex analog device to actually generate the
final sound. However, a problem soon appeared. Tomlinson was a lot more interested in working on his thesis project than
actually going to grad school, so much so that he stopped attending classes in favor of programming.
His thesis advisor was, understandably, a little bit concerned at this. The advisor recommended
that he pick up some side work for BBN in their speech lab. Maybe some consulting would serve as a more
healthy outlet for this young obsessive programmer. And it seemed to have worked. By 1965,
Tomlinson earned his master's degree and soon became a full employee at BBN. Initially,
he would continue his work on speech synthesis, as well as research into human-machine interactions.
But a new project was looming large on the horizon,
and as BBN's staff worked on ARPANET, Ray came along for the ride.
Early on, ARPANET was, to put it bluntly, totally unrecognizable.
The modern internet is all about shipping around large amounts of data,
and moving any kind of data.
But things started out very small and very restricted.
Once packets and test messages were reliably moving through ARPANET, the first big use was
remote computer access. Using the network, researchers were able to log into any other
computer connected. While useful, there were bigger plans for ARPANET, and those plans would
actually take a lot of work to see
through. At BBN, Tomlinson had shifted from speech synthesis work into ARPANET-related programming.
More specifically, Ray worked on the team responsible for Tenex, that's BBN's own
timesharing system. Tenex was actually pretty well represented on the new network, no doubt
thanks to BBN's hand inside ARPANET.
So it should come as no surprise that a lot of new network-specific software was being
developed specifically for Tenex.
It was a platform positioned right at the center of a lot of big changes.
One of the biggest challenges faced at the time was figuring out how to get the most
out of ARPANET.
Researchers now had access to an
ever-expanding network of mainframes, but for the time being, all they could actually do was connect
up and log into some far-off machine. Everyone knew that more was possible, but no one had gotten to the more part quite yet. That's where Tomlinson enters the picture. The next logical step after remote
login was to devise a way to transfer files around the network. This would open up a lot
of possibilities. Imagine something like CTSS's common files but shared between hundreds of
mainframes and tens of thousands of users. The fast and easy transfer of data between machines
would be a very big deal.
It was just a matter of working out the little details. Luckily, Tenex proved a great playground
for Ray to turn idle musings into actually working code. Quote,
I wrote a little program to open a file here, open the other file in the other place, and
send files back and forth, and that was called Copynet. And it was pretty simple-minded, you know. One, you put out a string, this is what the file name is that you're writing to, and the other one intercepts that and in turn opens the file at the other end, and then it just streams the data through. End quote. Now, saying Copynet was simple may be the wrong term. Rudimentary or inflexible is a little bit better.
Tomlinson's program was only possible as long as both sender and receiver were on a Tenex system.
For a large swath of ARPANET, there wasn't much of a problem there.
You can just shuttle data around between similar enough mainframes.
But if you're running CTSS or Multics, then you're kind of out
of luck. That being said, Copynet was a big step towards more codified file transfer protocols.
But Copynet wasn't the only part of the equation. It was a means to move data around. But what
exactly could that data be? The final push that led Ray down the path to email came in a pretty strange form.
In the wacky world of ARPANET, and eventually internet, administrivia, there aren't really memos.
Instead, there are requests for comments, aka RFCs.
These are essentially, well, they're just memos with a fancier name.
Each has a sequential number, which makes them really easy to reference.
The actual content of RFCs is something like an open letter to those managing the ARPANET.
They can range from short notes to full-on specifications to the more bizarre.
Allow me to introduce the poetically named RFC-196, a mailbox protocol. Now, despite
the name, this is not a proposed email system. Oh, but it's so close I can't ignore it. Reading
from the introduction, quote, the purpose of this protocol is to provide at each site a standard mechanism to receive sequential files for immediate or deferred printing or other uses.
The files for deferred printing would probably be stored on intermediate disk files.
It later continues, quote,
Multiple mailboxes, up to 128, are allowed at each site and are identified as described below.
The default mailbox number is zero for use with the standard mail printer.
End quote.
Published in 1971, just scant years after the ARPANET starts,
RFC-196 is a proposal for a physical mail protocol.
Now, this should sound like some high strangeness to anyone following along.
It lays out a protocol for sending sequential text-based data over a network, storing it in
some digital mailbox, and then finally printing out the message. Yes, printing out the message
on a printer, presumably to eventually be put in some physical mailbox in someone's office.
Now, all joking aside, RFC-196 is a really interesting document for a few reasons.
It actually proposes a full email system. Sort of, at least. The idea is that you could send
a message over the ARPANET to a receiving server and that message
would then be placed in some file awaiting use. It's just that the use for RFC-196 is printing
that onto paper. The proposal is a wonderful example of researchers coming to grips with
totally new technology and the strange paths they sometimes take. What's one use for computers? Well,
printing. So why not let remote users do that over the ARPANET? It's this strange mix of
state-of-the-art technology and bland practicality that I find really interesting. Instead of using
ARPANET for something radically new, 196 describes a way to use it as an upgrade to existing
practices. Luckily, I'm not the only person puzzled by this RFC. Tomlinson was just as perplexed.
It was clear that the proposal was hitting on something really important. Of course people
want to send messages over ARPANET. But the design just wasn't fully formed.
Interested in the prospect, Ray started to deconstruct the proposal and see if the rough idea was workable.
Under 196's prescription, each mainframe would have up to 128 mailboxes,
with the number on the box acting somewhat analogously to an address.
Sure, that can work, but it also kind of sucks.
The RFC was looking at mail more on the level of a mainframe, but you don't send mail to a computer,
you send it to a person. One of the big changes that Ray made was scrapping the whole idea of
numbered mailboxes. Instead, each user on every server would have their own address.
And luckily, there was already some precedent for him to work off of. As I mentioned earlier,
CTSS's mail was just one example of a mainframe mail system. Most timesharing operating systems
had some version of mail. On Tenex, that program was called SendMessage. While largely the same
as the earlier CTSS system, there were some changes that had occurred over the years.
One big one was usernames. By 1971, most operating systems had actual human-readable usernames.
Sure, somewhere in the code each user still had an ID number, but you also had the option to use a more friendly name.
Instead of logging in as user number M12598, you were much more likely to just log in as Sean.
That simple change was visible throughout these timesharing systems.
Local mail clients now readily accepted user names instead of ID numbers, making it much easier to send a colleague a quick message.
But there was a caveat.
Usernames still had to be unique.
So you may not be able to have your first name as your username,
but you could have a recognizable one.
But just a username will only get you so far.
On a single server, that was fine.
But by 1971, there were around
a dozen mainframes on ARPANET. So how can you distinguish between a Sean at UCLA and
that one other Sean over at MIT? Once again, Tomlinson was able to draw from existing standards
to make his life a little bit easier. Every mainframe on the ARPANET had a unique host name, basically an
authoritative identifier for the computer. That way, if you wanted to connect to the installation
at UCLA, you could just call up UCLA. No fuss or crossed wires needed. Combine local usernames with
mainframe hostname, and you get a unique and totally unambiguous address. One final piece was missing from this puzzle, though.
How should those two names actually be combined?
Well, for Ray, that was a no-brainer.
It had to be the at sign, the curly A in a circle, quoting from Tomlinson.
The purpose of the at sign in English was to indicate a unit price.
For example, 10 items at $1.95.
I used the at sign to indicate that a user was at some other host
rather than being local, end quote.
It also helped that, at least on Tenex,
the at sign didn't have some other special meaning.
Putting that all together, you end up with a really readable address.
To email Tomlinson, you might type out ray at bbn.
It's simple, you can read it aloud, and it's totally unambiguous.
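To make that convention concrete, here's a minimal sketch in Python of how the user-at-host scheme composes and decomposes an address. The function names are my own invention for illustration, not anything Tomlinson wrote; only the at-sign convention itself comes from the episode.

```python
# A sketch of Tomlinson's addressing convention: local username,
# at sign, ARPANET hostname. Nothing here is historical code.

def make_address(user: str, host: str) -> str:
    """Combine a local username and a hostname into one unambiguous address."""
    return f"{user}@{host}"

def parse_address(address: str) -> tuple[str, str]:
    """Split an address back into its (user, host) parts."""
    user, _, host = address.partition("@")
    return user, host

# The example address from the episode: Ray at BBN.
addr = make_address("ray", "bbn")
print(addr)                  # ray@bbn
print(parse_address(addr))   # ('ray', 'bbn')
```

The point the episode makes survives even in this toy version: neither piece is new, but the combination gives every user on every host a globally unique name.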
Today, it's just second nature to see email addresses written out in this format.
It makes good sense.
And frankly, that's because it's a good format.
What Tomlinson did here is something that a lot of
programmers have learned and mastered. Instead of spinning something totally new, he combined
existing technology and convention. Nothing in an email address was new in 1971. But putting all
those pieces together led to something that was actually pretty new. Incidentally, the same user
at host convention would be adopted by many other
protocols on ARPANET. When it came time to implement the software side of things,
Tomlinson took a very similar tack. He already had access to SendMessage, and he already had
access to CPYNET, a file transfer program he had just written. Mail on Tenex, and really mail on most systems,
was just thrown into a file. And if you
can transfer files around on a network, then it's a very small step to transferring mail. Tomlinson
sent the first networked email before 1971 was out. That first message didn't have far to travel.
Quoting from Tomlinson, the first message was sent between two machines that were literally
side by side.
The only physical connection they had, aside from the floor they sat on, was through the ARPANET.
I sent a number of test messages to myself from one machine to the other.
The test messages were entirely forgettable, and I have, therefore, forgotten them.
End quote.
While we don't know the actual content of the first email, we know where
things went after 1971. The following year, a new version of Tenex was finalized and shipped out.
Notably, it included Tomlinson's new networked SendMessage. Email was adopted at a breakneck
pace. Sure, it may not have been groundbreaking technologically speaking, but it was a great
use of existing systems. But more than any of that, it filled a very important need.
Just like on CTSS, the ability to communicate was sorely missing from the early days of
ARPANET. Email brought the human factor a lot deeper into the mix. Now you could do
more than program or crunch data on the network.
It was a sign of what the internet would become years later. In 1973, a new RFC came out,
number 561, titled Standardizing Network Mail Headers. The paper represented the first
enshrined email standard, based off Tomlinson's earlier work. That same year, email accounted for roughly 75%
of traffic on ARPANET. The fact was, people plain liked sending emails. It made work easier.
It opened up a new line of communications. Everyone from the newest intern to even the
Queen of England was firing off messages. Email had emerged, and it would stick with us into the modern day.
I guess that leads to the final question. Where did it all go wrong? How did we go from
happy programmers chatting away to a morass of spam mail and ads? Maybe unsurprisingly,
spam appeared in inboxes almost immediately.
To see the first spam message, we need to head back to MIT and CTSS.
One of the interesting little features of Morris and Van Vleck's mail program was that it could send out mass messages.
Like I mentioned earlier, the average user could specify asterisks in the programmer number field,
and that would send out a message to everyone within your group. But for a privileged user, someone such as Van Vleck, you could specify
an asterisk as the recipient group. In other words, if you had a privileged account on CTSS,
you could send a message out to everyone connected to the mainframe. Normally, this kind of mass mail
would be used for system alerts, information
about downtime, or other important and hopefully not very frequent administrative messages. No one
wants to get a load of mail in their inbox, so it's the kind of feature that would have to be
used sparingly and appropriately. But despite the restraint on the side of CTSS's administrators,
the feature was still there.
It opened up a tempting possibility for anyone with a privileged account.
It was only a matter of time before something happened.
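The two wildcard behaviors described above can be sketched in a few lines. Everything here is hypothetical: the usernames, the group and programmer-number layout, and the function itself are invented for illustration; only the rule that anyone may broadcast within their group, while mailing the whole mainframe takes a privileged account, comes from the episode.

```python
# Hypothetical sketch of the CTSS mail recipient rules.
USERS = {
    ("M1416", "1234"): "van_vleck",   # (group, programmer number) -> username
    ("M1416", "5678"): "morris",
    ("M8000", "0001"): "someone_else",
}

PRIVILEGED = {"van_vleck"}  # accounts allowed to mail every user on the system

def resolve_recipients(sender: str, group: str, prog_num: str) -> list[str]:
    if group == "*":
        # Mail to everyone on the mainframe: privileged accounts only.
        if sender not in PRIVILEGED:
            raise PermissionError("only privileged users may mail everyone")
        return sorted(USERS.values())
    if prog_num == "*":
        # Any user may broadcast within a single group.
        return sorted(u for (g, _), u in USERS.items() if g == group)
    return [USERS[(group, prog_num)]]
```

As the episode notes, the check is purely one of privilege, not of content, so a privileged user with something "important" to say faced no technical barrier at all.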
In 1971, every CTSS user was greeted by an alert that they had mail in their mailbox.
Upon checking, they all saw the same message. As Van Vleck recalled, it started something like
this, quote, there is no way to peace. Peace is the way, end quote. Of course, as with all
unsolicited mail, it's in all caps. The message was sent by Peter Bos, a programmer on the CTSS
team who happened to have the right access to fully unlock the power of the mail
command. When Van Vleck found out, he was not pleased. By this point, he was running the system
programmer group on CTSS, so the misuse happened on his watch. Van Vleck eventually confronted Bos
about the violation. Quote, I pointed out to him that this was inappropriate and possibly unwelcome.
And he said, but this is important. End quote. Now, while this anecdote is a little sparse,
I think there is something interesting here. Morris and Van Vleck had apparently worried
about the possibility of something like this happening. But really, there's only so much
they could do. The ability
to send mail to every user at once was useful to the team, so removing it wasn't really an option.
And even today, it's a little bit hard to tell spam from meaningful mail. The other thing that
really strikes me is Bos's response. To him, informing his fellow users on the way of peace was important.
It sounds like he had the best intentions possible.
He wasn't looking at his message as junk mail.
Unlike today, where we have programs that handle sending spam,
this early period had a discernibly human touch to it.
But this was still just on a single machine.
What about over on ARPANET?
Well, jump forward a few years, and we get into some interesting territory.
In 1975, RFC-706 was published, titled On the Junk Mail Problem.
Now, this one falls more into the bucket of short memo.
The entire thing's only a page long.
And surprisingly, 706 is concerned with a totally different kind of spam than you might expect.
Quoting from the RFC,
In the ARPA network host IMP interface protocol, there is no mechanism for the host to selectively refuse messages.
This means that a host which desires to receive some particular messages must read all messages addressed to it. Such a host could be sent many messages by a malfunctioning host.
This would constitute a denial of service to the normal users of this host. End quote.
Ah, back to the lovely prose of RFCs. The takeaway here is that the junk mail problem isn't an issue of mass unsolicited mail.
It comes down to the technical issue that seems to have popped up on ARPANET.
If a server were to receive enough email traffic, possibly due to some faulty code,
then the computer could be rendered useless. More than anything, this RFC is talking about
an early form of a DDoS attack.
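What RFC 706 is asking for, in modern terms, is a way for a host to selectively refuse traffic before it eats resources. Here's a hypothetical sketch of the simplest version of that idea, a per-sender counter that starts refusing messages from a host that floods the interface. This is a modern illustration of the concept, not anything from the RFC or from real IMP-interface code.

```python
from collections import Counter

class MessageFilter:
    """Drop messages from any sending host that exceeds a fixed quota."""

    def __init__(self, limit: int):
        self.limit = limit      # max messages accepted per sender
        self.seen = Counter()   # messages received so far, per sending host

    def accept(self, sender: str) -> bool:
        """Return True to read the message, False to refuse it."""
        self.seen[sender] += 1
        return self.seen[sender] <= self.limit

# A malfunctioning host hammers the interface; the filter cuts it off.
f = MessageFilter(limit=3)
results = [f.accept("faulty-host") for _ in range(5)]
print(results)   # [True, True, True, False, False]
```

The ARPANET of 1975 had no such mechanism, which is exactly the gap the memo was pointing at.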
Once again, the sources are, sadly, pretty sparse, but from reading the memo, it sounds like the RFC is referencing some specific malfunction that was observed on the
network. A day-to-day error isn't something that always makes it into the historical record.
Anyway, that's just my best guess. What's important here is that 706 shows programmers on ARPANET were aware of a problem in the works.
The possibility of spam, or at least overwhelming amounts of mail, was there.
It's just that the recognizable form was yet to develop.
But don't worry.
It wouldn't take that long.
In 1978, we get our first verifiable spam email sent over
a network. And, well, once again, we stray back into the realm of failed best intentions. 78 would
be a big year for email, but it was also a big year for Digital Equipment Corporation. A new model of the DEC System 20 was coming out,
an incremental improvement but still an exciting new offering. All that was left to do was drum
up some sales for the new mainframe. In 1978, the person for the job was Gary Thuerk, a DEC salesman.
One tried-and-true method for selling new mainframes was through demos. An announcement for the product would be sent out.
A time and place would be set, and then crowds of interested people would come see the new computer in action.
Once the audience was sufficiently wowed, it was a lot easier to move units.
But the system wasn't perfect.
It actually took a lot of planning to make a demo work out well.
In 1978, it was common to invite people either by mail, through ads, or by calling them directly.
These were pretty well-worn tactics by this point in history, and the way Thuerk saw it, it was time for a change.
Now, Gary knew roughly who he wanted to invite.
If everyone came, it would be a really big crowd. The problem was
sending out letters to every prospective attendee would be expensive. Inviting them by phone wouldn't
be much better. Tracking down so many phone numbers and then placing each call would take
a long time. There was a better solution close at hand that Thuerk decided to turn to. Email.
DEC did have mainframes connected up to the ARPANET,
so sending out a message would be relatively easy. But there was one other key to Thuerk's plan,
one that we can't exploit today. Back in the early days of ARPANET, there were actually
printed email directories, kind of like a digital phone book. Today, the very notion of that is laughable.
But the network was small enough that a directory of all active users could be compiled relatively easily.
So Thuerk grabbed a copy and formed a plan.
He had two demos lined up, one in LA and one in San Mateo.
Leafing through his handy ARPANET directory, he found the section for West Coast users.
Those would be the recipients of his
email invitation. It came out to nearly 400 email addresses, all printed on physical paper. Thuerk
handed off the task of typing each address to one of his co-workers, and he set about drafting
the actual email. On May 1st, 1978, the message went out. And things degraded from there. The first problem was that DEC's email
client wasn't actually built to handle so many email addresses. It may have been possible if
all the addresses were entered into a file to form a mailing list. But they were each input
to the program over the course of a few days of work. So things go a little, well, strange. After the first 320 emails,
the recipient field overflowed, putting the balance of email addresses into the body of the message.
So the lucky ones who actually received the email, well, it started off with a long list of
failed addresses. The following day, Thuerk would correct the problem and send out the email to all the
overflowed addresses. Reading past the accidental garbage, the body of the email started with,
quote, Digital will be giving a product presentation of the newest members of the
DEC System 20 family. The DEC System 2020, 2020T, 2060, and 2060T. The DEC System 20 family of computers has evolved from the Tenex operating system and the DEC System 10 computer architecture.
Both the DEC System 2060T and 2020T offer full ARPANET support.
And it continues on like that for quite a while.
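The recipient-field overflow described a moment ago is easy to sketch. The 320-address capacity and the roughly 400 addresses come from the episode; the code itself, including the function name and the exact spill behavior, is invented for illustration, not a reconstruction of DEC's actual mail client.

```python
# Sketch of the failure mode: a recipient field with fixed capacity,
# with the balance of the addresses spilling into the message body.

def compose(addresses: list[str], body: str, capacity: int = 320):
    to_field = addresses[:capacity]   # what the header can actually hold
    overflow = addresses[capacity:]   # the balance spills over
    # Overflowed addresses end up prepended to the body text.
    full_body = "\n".join(overflow + [body])
    return to_field, full_body

# Roughly 400 invitees, as in Thuerk's mailing.
addrs = [f"user{i}@host" for i in range(400)]
to, body = compose(addrs, "DIGITAL WILL BE GIVING A PRODUCT PRESENTATION...")
print(len(to))                # 320 recipients actually addressed
print(body.splitlines()[0])   # the first body line is a spilled address
```

So the lucky first 320 got the ad proper, and everyone's copy opened with a wall of stray addresses, which matches the garbage the recipients complained about.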
As is custom, it's in all caps.
The caps thing is a little strange to me.
Some terminals or mainframes only handled capital letters back in the day, but this is 1978.
I can't tell if the cruise control here is intentional or accidental.
That part aside, the email is undeniably an ad for DEC's latest and greatest hardware.
As far as I'm aware, this was the first time a large company tried marketing something over the ARPANET.
And it turns out that not many people were happy about that.
Complaints were immediate, but things would escalate and get pretty out of control over the course of the coming days.
This ad was a totally
new and innovative use for the network, and it seemed the Thuerk ad actually struck a nerve.
There were two main complaints from users. Firstly, it was unsolicited junk mail, and secondly,
it was way too long of a message to be sending via email. The latter actually led to some more technical problems.
The aforementioned RFC-706 had never been implemented, so there was no way to deny
incoming large emails. The message arrived and got pushed down to disk immediately.
And with the state of storage at the time, this led to some problems. Thuerk would later recall,
quote,
Some people said they didn't want the email and that it ate up all of the free space on their computer. It used up all the disk space on a professor's computer at the University of Utah,
end quote. It seems that the fears referenced in RFC 706 were coming true, courtesy of a mass
mailer ad. There were some who defended Thuerk's spam mail, but that wouldn't last for long.
One early defender of note was none other than Richard Stallman, the future founder of the Free Software Foundation.
Now, he was not one of the original recipients of the spam.
But as with all things, Stallman has a certain style.
He has to decide and form an opinion as quickly as
possible. So in a vacuum of any actual information, Stallman had this to say, quote, I didn't receive
the DEC message, but I can't imagine I would have been bothered if I had. I get tons of
uninteresting mail and system announcements about babies being born, etc. At least a demo might have been interesting,
end quote. But even RMS quickly changed his mind. A day after coming to Thuerk's defense,
Stallman got wise to what was up, quoting again, well, Jeff forwarded me a copy of the DEC message
and I eat my words. I sure would have minded it. Nobody should be allowed to send a
message with a header that long, no matter what it's about. End quote. Well, I think it's a pretty
accurate example of how a lot of ARPANET users felt. They objected both to the form and the
function of Thuerk's email. Oh, but the backlash actually gets worse than some angry programmers and academics.
You see, there's a little nitpicky problem here. The ARPANET was a government-funded project.
Most of the money came from tax dollars. As such, there were some pretty strict rules about its use.
All traffic had to be related to government communications, military work,
or research. An ad for the new DEC System 20 was none of those. Due to either lack of knowledge
or lack of care, Thuerk had actually committed a federal crime. And it wouldn't take long for
news to travel. Scant days after the message was sent out, the government was on the case. According to
one Major Raymond Czahor, quote, this was a flagrant violation of the use of the ARPANET as the network
is to be used for official U.S. government business only. Appropriate action is being taken to preclude
its occurrence again, end quote. Once again, just as a note, that message against spam is also in all caps.
Anyway, some angry and threatening calls were made to Thuerk's boss, but despite the strong language, the government actually didn't do much else.
The repercussions for DEC were surprisingly light. At the time, it was customary for ARPANET users to self-regulate
the network, and in the wake of Thuerk's spam, that self-regulation had to be tightened. It would
take years for more formal anti-spam measures to be put in place. But I think there's a bigger
takeaway here. Spam mail makes a lot of people really angry, and it has since day one. So why does it persist? By 1978,
we can already see two big reasons for this. Firstly, spam was very successful. It was a
great way to get your message out. The response to Peter Bos's peace mail isn't that well documented,
but using CTSS, he was undoubtedly able to get more people thinking about pacifism.
In Thuerk's case, his spam ad made the upcoming DEC System 20 demo very successful.
Sure, some people were turned off, but others were actually interested.
By his own account, the controversial email netted DEC tens of millions of dollars in sales.
The other reason spam keeps flowing comes down to the people sending it,
and this point matters a little less now that everything in the world is automated, but
bear with me. Thuerk didn't have any malicious intent when he drafted his inaugural spam mail.
Later interviews make it very clear he genuinely thought that people would be interested in seeing
the upcoming demos. He would probably put it in different words, but share the same sentiment as Peter Boss.
Why did he hit send?
Well, because to him, the message was important.
All right, that does it for this episode.
As we've seen, email is one of those ideas that was just too good for programmers to pass up.
Communication between users is such a basic need that digital mail systems crop up as soon as multi-user systems appear.
The specific catalyst early on was time-sharing operating systems.
But anything that allowed for multiple users on shared infrastructure would have probably led to the same result.
And once ARPANET gets going, it doesn't take long for email to adapt.
By the end of the 70s, email was a staple on the network, and from there, its survival was guaranteed.
And with email's existence, be it local or over a network, it was only a matter of time before unsolicited messages started to flow. However,
I think it's fair to say that early mail spam had a distinctive flavor to it, if you'll pardon the pun. In the more recent era of junk mail, things are just more annoying and much more malicious.
But early on, it was different. Email itself was created with the best of intentions, and surprisingly so, spam was the same way.
Thanks for listening to Advent of Computing.
I'll be back in two weeks' time with another episode.
And since it is Spook Month, I'm hoping to have another slightly frustrating or spooky episode on the docket.
And if you like the show, there are now a few ways you can support it.
If you know someone else who's interested in computing's past,
then why not share the show with them?
You can rate and review on Apple Podcasts.
And if you want to be a super fan,
then you can now support the show
through Advent of Computing merch
or signing up as a patron on Patreon.
Patrons get early access to episodes,
polls on the direction of the show, and some
bonus content. You can find links to everything on my website, adventofcomputing.com. If you have
any comments or suggestions for a future episode, then shoot me a tweet. I'm at Advent of Comp on
Twitter. And as always, have a great rest of your day.