Advent of Computing - Episode 14 - Creeping Towards Viruses

Episode Date: October 6, 2019

Computer viruses today pose a very real threat. However, it turns out that their origins are actually very non-threatening. Today, we are going to look at some of the first viruses. We will see how they developed from technical writing, to pulp sci-fi, to traveling code.

I talk about The Scarred Man by Gregory Benford in this episode. You can read the full short story here: http://www.gregorybenford.com/extra/the-scarred-man-returns/

Like the show? Then why not head over and support me on Patreon. Perks include early access to future episodes, and stickers: https://www.patreon.com/adventofcomputing

Important dates in this episode:
1949: John Von Neumann Writes 'Theory and Organization of Complex Automata'
1969: 'The Scarred Man' Written by Gregory Benford, Coined Term 'Virus'
1971: Creeper Virus Unleashed

Transcript
Starting point is 00:00:00 It's the summer of 2000. You sit down at your desk and load up Outlook to check your emails for the day. You scroll through the usual spam, a couple bills maybe, and then you see it. The subject line simply reads, ILOVEYOU. All caps, all one word. And it looks like it came from a friend, or maybe even a classmate. Excited and intrigued, you open up and read the email. It says,
Starting point is 00:00:26 Kindly check the attached love letter coming from me. The mystery deepens as you see a file attached: LOVE-LETTER-FOR-YOU.TXT.vbs. Could this be it? Are your deepest secret feelings finally being returned? You download the attachment and hurriedly open the file. But the love letter never opens. Instead, you hear your computer's hard drive start to spin up, and things just start to slow down. You think the old machine must just be acting up, so you go to reboot it. It turns off just fine, but the darn thing just won't boot back up to Windows. It would seem that you've been had, and now your computer is one of 50 million
Starting point is 00:01:11 victims of the I Love You Virus. From large scale attacks like WannaCry or CryptoLocker to more targeted intruders such as Stuxnet, viruses are part of the landscape and subconscious of modern day computing. But viruses didn't start out as such a force. In fact, they didn't even start out in computers themselves. A lot of the first viruses existed as purely science fiction and, in the earliest cases, technical writings. Years before any computer would be infected. So what's the story of the computer virus? Welcome back to Advent of Computing.
Starting point is 00:01:56 I'm your host, Sean Haas, and this is episode 14, Creeping Towards Viruses. I'd like to start out with a programming note, and somewhat of an announcement. That announcement is, Halloween is the best holiday. It's also my favorite holiday, and it just always has been. So this October, I want to present some spooky episodes. And, well, it turns out there's some problems with that. There's not really that many computer ghost stories, at least that have information behind them. What does exist is often uninteresting, unsubstantiated, related to some sort of finance conspiracy,
Starting point is 00:02:36 or more existentially scary than fun and slightly spooky. Not really much you can tell around a campfire. That being said, I finally settled on a lineup for this episode and the next that, while maybe not frightening, I think will be Halloween-ish enough. So to kick things off for spook month on Advent of Computing, we're going to be talking about viruses. Because really, what's more spooky than losing control of your own computer and possibly years of personal data that you never remember to back up? All jokes aside, the history of the computer virus is a lot more strange than I initially thought.
Starting point is 00:03:15 In the modern day, viruses are a very real thing and a real threat. But it turns out that the first description of a virus predates any attack by decades. So today, I want to examine some of the earliest writings on the matter, leading up to the eventual first viral outbreak. And along the way, I'd like to find out just how close modern viruses are to their pen and paper ancestors. To aid in that conversation, we're going to need a measuring stick of what makes a virus a virus. And if you boil it down, there's really only one criterion that needs to be met. Just like any real-life virus, it needs to be able to self-replicate and spread. Of course, any self-respecting virus will do a little more than just that.
Starting point is 00:03:59 But I think just being able to spread on its own is a good enough start. The I love you virus that I mentioned earlier is a perfect example. Once a computer is infected, the virus hijacks the victim's email account and sends a copy of itself to all the contacts it finds. Once emails are sent and the spread part is taken care of, it starts to randomly overwrite files on the victim's computer. Simple and effective. So with the framework taken care of, let's dive into the strange origins of the computer
Starting point is 00:04:29 virus. So if the true identifying feature of a virus is the ability to self-replicate, then we can trace its origin back as far as 1949 and the lecture series titled Theory and Organization of Complex Automata by none other than John von Neumann. Now, if you're at all familiar with mathematics, physics, or computer science, then you probably recognize the name. Von Neumann's contribution to computing alone is widely considered as foundational. It wouldn't be too out of the question to say that
Starting point is 00:05:06 he was one of the most important scientists of the 20th century. In his lifetime, he would be involved with the Manhattan Project, atomic and hydrogen bomb development, and the development of the EDVAC computer. He invented the merge sort algorithm and the entire field of game theory. And those are just some of the highlights. But right now, I'm not concerned with the wider picture. Instead, let me tell you about just one of his lectures. The lectures themselves were given at the University of Illinois in 1949, but would become widely circulated after von Neumann's death.
Starting point is 00:05:43 In 1966, Arthur Burks, himself an accomplished computer scientist, compiled the lecture series along with other related papers and manuscripts into the book Theory of Self-Reproducing Automata. Just the name of the book is a mouthful. And really, so are its contents. Von Neumann worked on both the theory and application of computer science, but most of his talks and writings were firmly on the theory side of things. Now, I'm ashamed to admit it, but I hadn't really read anything written by von Neumann before preparing for this episode. I've read digested versions of his works before, but never the source. And I gotta say, it's even more dense
Starting point is 00:06:26 than I imagined somehow. But once you get past the writing style, many of von Neumann's insights are still impressive over half a century later. So, the book is called Theory of Self-Reproducing Automata, but how does that relate to viruses, or even computers? The automata part is what may throw you off the scent a little. An automaton is just an abstracted or more generic way of talking about a machine that can do a set task. So that means anything from a computer to a robot to a computer program. The other part of the title, self-reproducing, is a little easier to understand. That just means a program that can make a copy of itself, essentially.
Starting point is 00:07:10 So a more sensationalized and less accurate title would be along the lines of The Robot That Could Make More Robots. A little cooler, but, like I said, not quite as accurate. Now, in the book, von Neumann lays out a theoretical framework for how a machine could be designed to create identical copies of itself. The implication here is that once you make a single, self-replicating machine, then you'd pretty quickly have unlimited copies of that machine. Just tweak the machine or automata part into the word program and boom. That's a description of a computer virus, at least at its core. But that's only part of what makes the source interesting.
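As a small, harmless illustration of "a program that can make a copy of itself", here is a minimal sketch of a quine, a program whose only output is its own source text. This is not von Neumann's construction, which is far more general. The names QUINE, QUINE_SOURCE, and run are made up for this example.

```python
import contextlib
import io

# Template for a two-line self-reproducing program (a "quine").
# The program stores a template of itself in `s`, then prints the
# template filled in with its own repr.
QUINE = 's = %r\nprint(s %% s)'
QUINE_SOURCE = QUINE % QUINE


def run(source: str) -> str:
    """Execute Python source text and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


output = run(QUINE_SOURCE)

# The program's output is its own source (print adds one trailing newline),
# and running that output again yields the same text once more.
print(output == QUINE_SOURCE + "\n")
print(run(output) == output)
```

The trick is the same one the episode describes in words: the program carries a description of itself and uses it to build a copy. Nothing here spreads anywhere; the copy only ever lands in a string.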
Starting point is 00:07:52 The lecture series specifically captures all of this in a larger comparison between 1940s computing technology and the human brain. In the era, that was a common conversation. Computers were just starting to get good enough to actually use for anything besides idle research. And fields like artificial intelligence were just starting to explore how computers could be made more in our image. And this may sound like an aside or just some fanciful writing to get a point across, but really, this is where the core of von Neumann's argument comes from. Why not make computers even more like a human? To quote from von Neumann,
Starting point is 00:08:33 Anybody who looks at living organisms knows perfectly well that they reproduce other organisms like themselves. This is their normal function. They wouldn't exist if they didn't do this, and it's plausible that this is the reason why they abound in the world. In other words, living organisms are very complicated aggregations of elementary parts, and by any reasonable theory of probability or thermodynamics, highly improbable. That they should occur in the world at all is a miracle of the first magnitude. The only thing which removes or mitigates this miracle is that
Starting point is 00:09:12 they reproduce themselves. End quote. No matter how good computers get, organic organisms will always have a key advantage. A computer can't make a copy of another computer, but a human can do just that. Well, two humans, but that's beside the point. So to follow von Neumann's logic, why not find a way for computers to do what they can't? To make their own progeny. The lecture ends with a theoretical discussion of how to design a self-replicating automaton. This is von Neumann after all, so there has to be a good dose of theory. None of this is a direct description of a computer virus, but to me it really sounds like von Neumann was only a few steps away from just that. Even the word virus in the context comes from personifying computers, and Theory of Self-Reproducing Automata is full of that
Starting point is 00:10:05 exact allegory. I think that if old Johnny von Neumann had been writing even a decade later, he may well have coined the term computer virus himself. It would be another 20 years before virus was actually used to describe a self-replicating program. Or at least that's one claim. There are actually a handful of competing theories as to where the idea of calling malicious self-replicating code a computer virus originates.
Starting point is 00:10:32 But the predominant claim, and the one that seems to be the most credible, is that it was first coined in a sci-fi story written in 1969. That story is The Scarred Man, written by Gregory Benford, and, well, it's not that well written. And that's not just my opinion, that's according to both critics and the author himself. The Scarred Man was one of Benford's earlier stories. It would be republished in an anthology of his works called Worlds Vast and Various in 2000. And just to note, that book is where I'm getting a lot of the quotes from Benford in regards to The Scarred Man from. In the author notes section for The Scarred Man, he wrote, quote, The only story in here from my early period, written in 1969, I include for two reasons. From this possibly bad start, Benford would go on to become a successful and prolific author.
Starting point is 00:11:40 But just as with von Neumann, I'd like to set that aside and focus on this one work. The Scarred Man is only about 5,000 words long, and first saw circulation in Venture Science Fiction, a pulp-style sci-fi magazine. Overall, it's unassuming. The tale paints a relatively standard backdrop of a near-future dystopia, set in the far-off year of the 1990s. The International Computer Syndicate, basically a super IBM, is the closest thing to a villain. ICS is one part computer manufacturer and one part mob-style monopoly on the market. And to make things even spicier, the core of the story is a conversation in a
Starting point is 00:12:25 smoky bar held between a smuggler and a businessman on vacation in future Antarctica. It may sound a little bit absurd, but you gotta love these dime store style sci-fi stories. Anyway, hidden inside the framing is the actual meat of the tale. Sapiro, the eponymous scarred man, is an ex-ICS employee. Partway through his tenure, Sapiro and a coworker figure out a way to rip off the syndicate that would go, hopefully, undetected. Their plan? Program and plant a malicious program deep inside a new ICS mainframe.
Starting point is 00:13:06 And here's where we get back on track. The malicious program was named Virus. To quote from The Scarred Man as to what Virus actually did, The program he logged in instructed the computer to dial a 7-digit telephone number at random. Now, most phones are operated by people, but quite a few belong to computers and are used to transfer information and programming instructions to other computers. Whenever a computer picks up the receiver, metaphorically I mean, there's a special signal that says it's a computer, not a human.
Starting point is 00:13:40 Another computer can recognize the signal, see? Sapiro's computer just kept dialing at random, hanging up on humans until it got a fellow computer of the same type as itself. Then it would send a signal that said, in effect, do this job and charge it to the charge number you were using when I called. And then it would transmit the same program Sapiro had programmed into it. So there we have it. A program that can self-replicate and spread to other computers.
Starting point is 00:14:11 That's sounding pretty darn close to a modern computer virus. The details of the attack are also similar to real-life cases. Virus started out only on one computer and then spread over a network. In this case, via telephone lines. So I'd say this meets the most open definition of a computer virus already. But for the extra points, how was virus making back money? What evil deeds was it told to do once it was on a victim's computer? Now, this part of the story I find very interesting.
Starting point is 00:14:47 Virus was designed to steal computer time. For that to make any sense, you have to know what computer time is, or rather what it was. Back when mainframes were king and all programs had to be scheduled out to run on large shared computers, users were limited by computer time. That's how long you have for your program to run on the system, and you'd be charged by that metric. So indirectly, Virus is designed to steal money from people by taking away chunks of their computer time. But there's more to the grift than that. As Virus spread undetected, ICS systems started to slow down. It was taking computer time at random, so processes would randomly take longer to complete. And due to the plot armor of virus, no one could find the
Starting point is 00:15:33 source of the problem. Now to get some money out of the whole debacle, Sapiro and his companion started a consulting firm, the only company that was able to fix the seemingly random bug in ICS systems. Their solution was a program called Vaccine, and it was written specifically to find and remove Virus from a single system. And there we have it, the earliest full description of a computer virus attack, complete with the name. But if this was the first written work concerning the matter, then where did Benford's idea come from? It turns out that he may have written a computer virus as well. Well, sort of. The explanation of Benford's inspiration only comes from himself, there isn't any corroborating evidence that I could find at least. So this has to be looked at with some skepticism. Now, from 1967 to 69,
Starting point is 00:16:27 Benford was working as a researcher at Lawrence Radiation Laboratory. While there, he learned to program in Fortran. He cites this time in his life as the inspiration for Virus. Quoting once again from the author's notes, quote, there was a pernicious problem when programs got sent around for use. Bad code that arose from researchers included, maybe accidentally, pieces of programming that threw things awry. One day I was struck by the thought that one might do so intentionally, making a program that deliberately made copies of itself elsewhere. The biological analogy was obvious. Evolution would favor such code, especially if it was designed to use clever methods of hiding itself and using others' energy, computing time, to further its own genetic ends.
Starting point is 00:17:17 So I wrote simple code and sent it along in my next transmission. Just a few lines in Fortran told the computer to attach these lines to programs being transmitted to a certain terminal. End quote. The conception of viruses as not intentionally evil, but instead intentionally poor code, just strikes me as really charming and funny. However, unlike his fictional equivalent, Benford would pull the plug and remove his experiment before it went too far. He also goes on in the author's notes to mention that he transferred a similar program to Los Alamos Labs during the early days of ARPANET.
Starting point is 00:18:01 Not to steal computer time, but to make a point that security should be considered on the fledgling network. While there's no way to tell if Benford ever read von Neumann's earlier work, a similar line of thinking is present. If living creatures can reproduce, why not make programs or computers that do the same? Von Neumann talked about self-replication in more general terms, but Benford imagined it as being a tool, a means to a larger end. The other component that Benford added was the idea of an antivirus. Von Neumann's writing never mentioned the idea of anti-self-reproducing automata. And calling the solution to Virus Vaccine, while maybe not medically accurate, has a certain satisfying logic to it.
Starting point is 00:18:48 At least it sounds more inspired than a program called antiviral. So that's the theory part down. The idea of a computer virus, even the modern name, existed as far back as the 1960s. Then, when does the first virus actually appear in the wild, so to speak? And how does the reality of a virus attack stack up to the fiction? It turns out that it wouldn't be long after The Scarred Man that we see our first example. In 1971, the world met Creeper, the first computer virus, or computer worm if we're being specific. The Creeper worm ends up looking markedly similar to earlier descriptions of viruses,
Starting point is 00:19:32 and its origin is just as interesting. Creeper was originally written by Bob Thomas, a researcher who worked at BBN starting in the early 70s. BBN stands for Bolt, Beranek and Newman Incorporated, or more charmingly, just called Bolt. And it's one of those unique types of companies that sprang up just in time for the early days of the computing boom in the United States. It's a research and development firm, so their trade is in doing research and development for other companies and institutions. A lot of these R&D outfits, like Bolt, share a single primary patron. That's the U.S. government. And as computers started to become more of a reality in the 50s and 60s, Bolt and similar companies became flush with government funds for research into computing.
Starting point is 00:20:27 As the 70s rolled around, BBN shifted a lot of its computing research into networking research. And during that era, that meant ARPANET. Now eventually, ARPANET would develop into our modern-day internet. But in the 70s, it existed as a government-owned network primarily used for connecting researchers across the country. BBN specifically was working on a type of technology that most network users would never use or even see, distributed computing. Normally an ARPANET or even an internet user today would connect up to a server, get whatever information they need, and log off. To boil that down and gloss over some complexity, you have a human talking to a computer. On the most simple level, distributed computing is trying to get computers to talk to each other.
Starting point is 00:21:18 It's usually used for linking up machines to calculate larger problems than a single computer can handle. But since there isn't a human in the equation, well, except for the programmers who made it, the kind of network traffic used for distributed computing flies under the radar, so to speak. Today, a lot of internet infrastructure actually uses distributed computing, or at least concepts from the discipline, in the background. But in the 70s, it was a totally new idea. And just with distributed computing,
Starting point is 00:21:53 we're already one step closer to the situation depicted in The Scarred Man. Bob Thomas was one of many researchers working with distributed computing at BBN. Specifically, he and his colleagues were working on an air traffic simulator called McROSS, or the Multi-Computer Route-Oriented Simulation System. Now, at first glance, that doesn't sound too terribly related to distributed computing, but it turns out that it's actually a pretty interesting puzzle to solve using the method. The code worked something like this. You had a network of computers with each system simulating one chunk of airspace that, together, made up a larger sky. Simulated planes could fly anywhere in the sky. A flight path could even go from one computer's airspace into another's. Of course, that meant that all computers had to be able to communicate with each other,
Starting point is 00:22:45 and, somehow, pass off a plane and all its information from one system to another. Thomas was in charge of how that pass-off would function, and this is where Creeper comes into the picture. To quote Thomas, To begin to address the technical problem posed by moving part of the ongoing distributed computation, I decided to see if I could build a simpler, non-distributed program, which I named Creeper. Creeper's job after being started on one computer was to pick itself up and move to another by sending itself across the ARPANET to the other computer. To make its job a little harder, I required Creeper to perform a simple task without error as it moved from computer to computer. Creeper's simple task was to continuously print a file on a console without missing or repeating any characters. What I love about this is how familiar it sounds to me on a personal level.
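The task Thomas describes, printing a file while moving between machines without missing or repeating a character, comes down to carrying the program's state along with it. The following is a minimal single-process sketch of that idea, not Thomas's code: the "hosts" are just labels, nothing crosses a network, and PrintJob, run_on_host, tour, and the host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrintJob:
    """Everything the program must carry with it when it changes hosts."""
    text: str
    position: int = 0  # index of the next character still to be printed


def run_on_host(job: PrintJob, chunk: int) -> tuple[str, PrintJob]:
    """Print up to `chunk` characters, then hand back the updated state."""
    end = min(job.position + chunk, len(job.text))
    return job.text[job.position:end], PrintJob(job.text, end)


def tour(text: str, hosts: list[str], chunk: int) -> list[tuple[str, str]]:
    """Pass the job around `hosts` in turn until the whole text is printed."""
    if chunk < 1 or not hosts:
        raise ValueError("need at least one host and a positive chunk size")
    job = PrintJob(text)
    transcript = []
    hop = 0
    while job.position < len(job.text):
        host = hosts[hop % len(hosts)]
        printed, job = run_on_host(job, chunk)
        transcript.append((host, printed))
        hop += 1
    return transcript


MESSAGE = "I'm the Creeper, catch me if you can."
transcript = tour(MESSAGE, ["host-a", "host-b", "host-c"], chunk=7)
for host, printed in transcript:
    print(f"{host}: {printed!r}")

# Stitching the consoles together in hop order recovers the text exactly,
# so no character was skipped or printed twice.
print("".join(printed for _, printed in transcript) == MESSAGE)
```

The design point is that the position counter travels with the job. If each host started from its own idea of where the printing had got to, characters would be dropped or repeated at every hand-off, which is the same problem a simulated plane faces when it crosses from one computer's airspace to another's.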
Starting point is 00:23:50 The entire air traffic control simulation was, of course, super complicated. So it makes things a lot easier to try out some new code in a more simple program. It's an approach that a lot of programmers, myself included, use on a regular basis. That, and the fact that printing some output to the screen, or teletype in this case, is the easiest way to test if something worked. It feels like in the right time and place, almost any programmer could have had the idea. And that's what I find so interesting and so compelling about the story of Creeper. So what exactly did Creeper print out as it ran? Turns out that's very simple.
Starting point is 00:24:29 To quote from Creeper, the first virus, I'm the Creeper, catch me if you can. Now, back to Thomas. Quote, After getting Creeper to work, I did two things. One was to integrate the techniques it used into McROSS, so that parts of a distributed simulation could move around the ARPANET as the simulation was ongoing. The other was to hack Creeper, giving it the capability to wander aimlessly and endlessly throughout the various DEC PDP-10 computers on the ARPANET, picking its next host at random.
Starting point is 00:25:05 End quote. Just a quick note there, the DEC PDP-10 was the type of computer in use at BBN at the time, and it was a relatively popular system, so a lot of them were connected up to the ARPANET. And there we have the next part of the prophecy, as it were. Just like with Benford's Virus, Creeper was now able to spread from computer to computer aimlessly and wreak havoc, or, well, at least announce its presence to the user. But a component was still missing. This early Creeper only moved from computer to computer,
Starting point is 00:25:41 so it wasn't self-replicating, but instead more of a traveling program. Only one computer could be infected at a time. Soon after, Ray Tomlinson, another researcher at Bolt who would go on to develop the first email protocol, modified Creeper to keep a copy of itself on computers that it traveled through. And that put the final piece in place. Now Creeper was a fully realized virus. It's not clear how many computers Creeper infected, but at the time, there weren't any preventative measures in place. It had to be at least enough to be noticed,
Starting point is 00:26:19 and possibly enough to be an annoyance. Tomlinson speculated that at max, Creeper could have infected 28 whole computers. In a final twist, life would once again imitate art. One of the programmers, sources don't agree on if it was Tomlinson or Thomas, would create a way to remove Creeper, a program that they dubbed Reaper, the first antivirus. Really, Reaper was more fighting fire with fire. The program was another worm, but it had a more complicated job. Once on a new computer, Reaper would look for the code that made up Creeper and remove it. Once that was done, it would go on and spread to another computer,
Starting point is 00:26:58 trying to search out more copies of the Creeper virus. Ultimately though, the stakes with Creeper and Reaper were relatively low. With the small size of ARPANET in 1971, and the fact that Creeper was a demo and not actively malicious code, there was never a real threat. I think Tomlinson sums up the situation well. When asked why he wrote Reaper in a 2014 interview, he replied, quote, The motive was purely to get the satisfaction of having done it. End quote. Alright, I think this is as good a place as any to end the episode.
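The difference between the original traveling Creeper, Tomlinson's copying variant, and Reaper can be sketched as a toy simulation. This is only an illustration under loud assumptions: the "network" is a dictionary of flags inside one process, no code is copied anywhere, the host names and function names are invented, and the Reaper described in the episode traveled from machine to machine rather than sweeping a list.

```python
import random


def hop(network: dict[str, bool], current: str, rng: random.Random,
        keep_copy: bool) -> str:
    """Flag a randomly chosen other host; optionally clear the current one."""
    # Assumes at least two hosts, otherwise there is nowhere to hop to.
    target = rng.choice([name for name in network if name != current])
    network[target] = True
    if not keep_copy:
        network[current] = False  # traveling behaviour: leave nothing behind
    return target


def wander(host_count: int, hops: int, keep_copy: bool,
           seed: int = 0) -> dict[str, bool]:
    """Start on host-0 and hop `hops` times; return the final flags."""
    rng = random.Random(seed)
    network = {f"host-{i}": False for i in range(host_count)}
    current = "host-0"
    network[current] = True
    for _ in range(hops):
        current = hop(network, current, rng, keep_copy)
    return network


def reaper_sweep(network: dict[str, bool]) -> int:
    """Clear every flag and report how many hosts were flagged."""
    removed = sum(network.values())
    for name in network:
        network[name] = False
    return removed


moving = wander(host_count=5, hops=10, keep_copy=False)
copying = wander(host_count=5, hops=10, keep_copy=True)
print("traveling variant, flagged hosts:", sum(moving.values()))
print("copying variant, flagged hosts:", sum(copying.values()))
print("removed by sweep:", reaper_sweep(copying))
print("flagged after sweep:", sum(copying.values()))
```

With keep_copy set to False exactly one host is ever flagged, which matches the point that only one computer could be infected at a time. Setting it to True is the one-line change that turns a traveling program into a spreading one.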
Starting point is 00:27:42 That's the story of how the first computer virus appeared. The idea went from abstract theory to fictional dramatization and eventually to practice over some 20 or more years. And when you get down to it, as soon as the computer virus was named, most of its aspects were already set in stone. Viruses today still operate by replicating and spreading from computer to computer. The vector and intent may be different, but the underlying idea is the same. My take from all this is that not every event in the history of computing is entirely real. The space between the first theory and implementation for the computer virus makes up 20 plus years of no hardware or software. If you want to read
Starting point is 00:28:27 The Scarred Man for yourself, I'll include a link to the story on Benford's own website in the description. Thanks for listening to Advent of Computing. I'll be back in two weeks time with another hopefully spooky episode to round out October. Until then, if you like the show, why not take a minute to share it with your friends? You can also rate and review on Apple Podcasts. If you want to be a super fan, then you can now support the show by buying our hot new merch.
Starting point is 00:28:56 I'll have a link to the TeePublic site in the description for the episode. If you have any comments or suggestions for a future topic, go ahead and shoot me a tweet. I'm at AdventOfComp on Twitter. And as always, have a great rest of your day.
