Advent of Computing - Episode 122 - To Edit Text

Starting point is 00:00:00 Programming is full of these internecine conflicts. I, personally, think that computers somehow rot the brain. As a wonderful example, us code slingers will fight about basically anything. We'll argue over which language is best, which operating system someone should be using, or even how code should be formatted. Never mention how you indent text in front of a programmer. Depending on your choice, they may burst a blood vessel. One of these ongoing struggles is the so-called editor war. It's a fight over which text editor is the best. Historically, this has

Starting point is 00:00:40 meant the fight between two particular editors, Vi and Emacs. Vi comes from UC Berkeley and first appeared in some of the earliest distributions of Unix. Emacs is a classic MIT job, which only migrated over to Unix after a long career in the fabled Artificial Intelligence Lab. Now, this will probably mark me for hate, but I don't really have a side here. I've used both, and both have their merits. Personally, right now, I use a lot of Vi, because that's what comes pre-installed on most Linux distributions. The editor war is still a pretty hot conflict. In the modern day, it has come to encompass more than just Vi and Emacs. I've heard programmers fight over Atom vs. Sublime vs. Visual Studio Code and a few more exotic choices.

Starting point is 00:01:33 I've even heard someone that likes Nano, if you can believe it. But why fight over something so seemingly trivial? Well, the simple answer is computers tend to rot the brain. But speaking more seriously, programmers spend a huge amount of time inside text editors. It's one of our biggest windows into the digital world. It's kind of where we live most of our days. We tend to get territorial about our digital homes. With the level of emotion here, you'd imagine that text editors are part of our digital heritage, one of those things that's existed forever and always been the fabric of programming. I mean, it's how we all program, right? Surely,

Starting point is 00:02:20 text editors have existed since the dawn of programming, right? Well, that's not really the case. Just like everything else, they had to be created. And that creation, in fact, came pretty late into the overall story of computing. Welcome back to Advent of Computing. I'm your host, Sean Haas, and this is episode 122, To Edit Text. Today, we'll be stepping into a realm that I only know tangentially, that is, the history of text editing. But first, we need to talk about something I'm more familiar with. That something being hypertext. Many early hypertext systems appear almost identical to modern text editors. Here,

Starting point is 00:03:13 of course, I'm talking about full-screen, point-and-click style editors. The type that we use daily today. Take Doug Engelbart's NLS as an example. That system's main interface is a blank space that you can type text into. All the hypermedia goodness comes on top of that text interface. HES, a hypertext system co-created by Ted Nelson, uses a very similar interface. Its primary purpose is to hold together text, so the principal interface is a screen that you write text into. I think that all follows pretty simply. There is an argument to be made that these systems look like text editors because, at the time, that was the best media a computer could display. In theory, hypertext can be extended to include all media

Starting point is 00:04:06 or any type of information. In the modern day, the internet does just that. You can have images that fit into the larger hypertext structure of the web. But back in the 1960s and 70s, computers tended to have issues displaying images. So by necessity, we first get hypertext instead of hypermedia proper. That's one plank in this text platform, but there is another. That's the primacy of the text editor. Now, this may be where you have to enter my mind of madness for a moment. Just stick with me. There are a number of later user interfaces that are actually just text editors. These aren't mainstream, mind you,

Starting point is 00:04:54 but they are influential in their own right. The two that come to mind, for me at least, are Terry Davis' TempleOS and the Rio interface from Bell Labs' Plan 9. Now, I know, that's two pretty weird deep cuts. And I think, as usual, TempleOS is just included as a nod to an oddity. Plan 9 is the cool example here. This was an operating system developed at Bell Labs as something kind of like a spiritual successor to Unix. It's one of those big, smart projects that's founded on rock-solid computer science principles. One of those principles is that everything is treated as a file.

Starting point is 00:05:37 That includes everything, from programs to devices to even other computers on the network. Topping this all off, completing the theme is Rio. Plan 9 technically has a number of compatible user interfaces, but Rio is the one most associated with the operating system. In Rio, all windows are actually just instances of a simple text editor. When you run a command, you're actually just opening a file and typing into it. The point I'm getting to is that text editors aren't just boring office tools. Instead, they show up in these strange places. They show up in big, fancy, and smart software.

Starting point is 00:06:18 There's that weird divide here of text editors as very bland tools, but also as handy interfaces. A lot's going on, which is usually indicative of an interesting story. So what is the story? How do text editors first enter the picture? Is there some early fork in their evolution that explains this duality? Or are text editors, like so many other technologies, a logical extension of the digital form? I want to start off with another deep cut. Have you ever seen a stack of punch cards? That's the format that old school programmers worked in. Each 80-column card contained a single line of code. A program was composed of many such cards, one after another. On the surface, this seems very simple, really archaic. But this was

Starting point is 00:07:15 what folk had to work with. For many programmers, this was the medium that their code ultimately lived on. Due to their ubiquity, certain customs and cultures arose around these cards. One custom was the diagonal slash. Now, to be clear, I'm making up a dramatic name here. From what I can tell, there wasn't a ubiquitous name for this practice. When a deck of cards was finished, the programmer would take a felt-tipped pen and mark a diagonal line across one side of the deck. This gave programmers a way to quickly check if a deck was in the proper order. If every card was in its correct position, then the slash would be unbroken, a nice uniform

Starting point is 00:07:59 diagonal line. If cards were out of order, however, that slash would have breaks and jumps in it. We could say that the slash was a symptom of a larger issue. Punch cards have to be in a certain order to work. One card has to come after another, just as one line of code must come after another. And these cards have no inbuilt way to stay in order. If you drop a deck, then, well, you're kind of screwed. You have to painstakingly reorganize the entire deck of cards by hand. That's one of those horror stories that we don't really have to live through today. On the other hand, we can look at the slash as a fascinating part of early programming culture.

Starting point is 00:08:46 It's one of those lost relics that we can scry for information. Here's a little bit of knowledge straight off the stack. Some decks of punch cards have more than one diagonal line. If you look closely, you'll see that only one of those lines is perfectly formed, an unbroken mark from top to bottom. The other lines, however, will appear broken and jumbled. This is the result of manual editing. Despite their limitations, punch cards allowed programmers to easily replace and shuffle lines of code. Each card represents, really, encodes, a single line of code. So updating one line was as simple as swapping out a card. Entire chunks of a program could be moved from one part of the deck to another by, you know, just physically moving it. New code could

Starting point is 00:09:42 be spliced in in a very similar manner. Just punch up your new cards and slam them into the deck wherever you want them. When these operations were completed, a new line would be drawn, since the older slashes were now incorrect. This is one of those beautiful cases where the medium itself, with a little bit of off-spec ingenuity, proved to be a valuable tool. The reorganizable nature of punch cards meant that you could edit a deck by hand. The fact that it was just a stack of paper meant that a permanent marker could be used to help organize a deck and track if they had been edited. This might not have been the most fun operation in the world, but you could edit a program that was stored on punch cards. In fact, many programmers did.

Starting point is 00:10:30 For decades, punch cards were the medium of choice when it came to programming. In fact, you didn't really have another choice. It was kind of punch cards or bust. That started to change in 1959 with the release of the PDP-1, sometimes called the first minicomputer. This was a machine that could be operated by a single human being. It came as one big main unit instead of a room full of modules. It even used interchangeable parts. And it loaded programs from paper tape.

Starting point is 00:11:03 I know, I know, big deal, right? Some nerds decide to use an only slightly different paper-based medium. They're still shoving tree byproducts into computers. This may sound like a really small change, but it actually makes all the difference in the world. Paper tape is, just as the name sounds, a long, continuous strip of paper. It can be coiled into a reel or folded up like an accordion. Data is encoded on the tape as, what else, but punched holes. It sounds a lot like punch cards. It is paper with holes in it, after all,

Starting point is 00:11:40 but the two mediums are wildly different. Paper tape can't easily be subdivided into chunks of data like a deck of punch cards can. The data on a tape is continuous. One bit comes after another. That order is enforced by, well, the tensile strength of paper itself. So check this out. Lines don't have the same meaning on a paper tape. In a deck of punch cards, each card is a single 80-character line of code, at least assuming we're dealing with software. But if you use paper tape, well, there's no inbuilt distinction between lines of code. This has pros and cons. On the one hand, this means the paper tape is technically less wasteful.

Starting point is 00:12:33 Let's say you have a line of code that's only five characters long. On paper tape, that only requires five groups of punches plus another punch to signify the end of a line. Whereas on a punch card, that five-character line takes up the same physical space as an 80-character line. You're wasting a lot of paper. A downside, though, is that you can't edit data that's been punched onto a tape. Once the data is there, it's just there. You can't splice out a line or shuffle around code unless you pull out some scissors and glue. Even then, a poor splice job would probably destroy a tape reader. I haven't actually heard of any examples of people doing this, so let's just assume people aren't splicing

Starting point is 00:13:19 paper tape like you would, say, a magnetic tape. The jump to this new medium meant the loss of the older punch card ways. Gone was the diagonal slash, and gone was the possibility of manual editing. This lines up with another technological change, or at least the wider adoption of some new technology. The PDP-1 was also a hotbed for early interactive computing. It was among the first computers to be wired up to a terminal meant for interactive use. If you want the long story that explains all those caveats, I ran an episode on the history of the command line a while back. Basically, it all starts at or around MIT. In the very late 1950s, some programmers at MIT, including John McCarthy of Lisp fame, are experimenting with interactive computing.

Starting point is 00:14:13 This is partly related to early work that would lead to the Lisp programming language itself. At the time, they were using an IBM machine, but pretty soon, a PDP-1 arrives on campus, and some of that research and spirit spills over to the new hardware. We get this collision of new ways of thinking. The PDP-1 comes with an electric typewriter wired directly into the machine. MIT nerds are seeing what they can do with these so-called online typewriters. There's a new medium that's, in theory, more efficient than punch cards, but at the cost of editing flexibility. What's a programmer to do? The first attempt to splice these technologies together comes in 1960. It's a program called

Starting point is 00:14:57 Colossal Typewriter, developed by John McCarthy and Roland Silver. Now, technically speaking, I don't think this was actually developed at MIT. Colossal Typewriter was mostly developed at BBN, but McCarthy was also working at MIT at the same time. It's one of those timeline messes where things bleed over. Anyway, back to the point. Colossal Typewriter is probably the first text editor. But let me be clear, this is a very weird program. It doesn't fit cleanly into expected categories because, well, it was created before those categories existed.

Starting point is 00:15:38 You see, Colossal Typewriter is a tape editor, as in it was meant specifically for editing data on tape. This can be a little weird to wrap your head around. At least, at first. There's a bit of a wall that obscures it, but we'll see how we can circumvent that a little later. Anyway, we don't really use tape anymore. We have all these tools for breaking up text into logical chunks, but in the era of the PDP-1, those tools didn't exist.

Starting point is 00:16:09 So Colossal Typewriter can appear somewhat alien, since it's operating in this mode where it considers data directly on a tape. General operations go something like this. You start with the tape you want to edit. Colossal Typewriter reads that tape into a buffer in memory. Once the tape is loaded, you can make your edits. When you're done, Colossal Typewriter will punch out a fresh tape. That's the most simple explanation I can give you.

Starting point is 00:16:37 The weird details all come out in the editing step. And to explain that, we need to get a little more technical. Colossal Typewriter describes everything in terms of two pointers, A and B. Everything you do, all your inserts and edits and whatnot, are done in reference to these points. Each works like a cursor on a more graphically inclined editor. You start with A pointing to the beginning of your tape, and B pointing to the next free byte after the end of the tape. And for basic text entry, that's about it. You can start typing right away. Everything you type will be added to the end of the buffer, advancing B with each keypress. Editing is a little more tricky. Colossal

Starting point is 00:17:23 Typewriter's interface is modal, meaning it has different modes of operation. It starts off in text mode, which just waits for keystrokes to store in the buffer. Manipulation of the buffer is all done in control mode. In this mode, you can move around the A and B cursors, print or punch the buffer, and read in more data from paper tapes. That's about as much control as you get. It's simple, but it's workable. With this, it was possible to prepare programs online with a PDP-1. That's online, as in you could actually sit down and type out a program while the computer was running. These types of modal interfaces tend to have some... issues. They're just primitive things. Fast to develop, but not always the best

Starting point is 00:18:13 to use. For instance, let us consider how you switch modes in Colossal Typewriter. It's actually very simple. If you're in text mode, you can enter control mode by simply typing the greater than symbol. That's easy enough, but it creates a bit of a problem. What if you want to actually type the greater than symbol into the buffer? To do that, you have to type a triple equals first, which the manual calls a quote symbol. Old typewriters had weird symbols, what can I say? Now, this kind of sucks, since if you want to write anything with a greater than, you actually have to type triple equals greater than. You just have to know that or else you get kicked to control mode. And while that particular symbol doesn't come up much in the written word, it does show up a lot in source code.
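To make the mode switch and the quote symbol a little more concrete, here is a minimal sketch in Python. This is not Colossal Typewriter's actual code. The class name, the method names, and the use of the "identical to" character as a stand-in for the triple equals are all assumptions made for illustration, and control mode commands are not modeled at all.

```python
QUOTE = "\u2261"   # stand-in glyph for the manual's "quote symbol" (assumed)
TO_CONTROL = ">"   # typing this in text mode switches to control mode


class TapeBuffer:
    """Toy model of the buffer: A marks the start, B the next free slot."""

    def __init__(self, tape=""):
        self.data = list(tape)   # characters read in from the tape
        self.a = 0               # pointer A: beginning of the text
        self.b = len(self.data)  # pointer B: first free position after the text
        self.mode = "text"
        self._quoted = False     # True right after the quote symbol is typed

    def key(self, ch):
        """Feed one keystroke while in text mode."""
        if self.mode != "text":
            raise RuntimeError("control mode commands are not modeled here")
        if self._quoted:
            # The previous key was the quote symbol, so store this one literally.
            self._append(ch)
            self._quoted = False
        elif ch == QUOTE:
            self._quoted = True
        elif ch == TO_CONTROL:
            self.mode = "control"
        else:
            self._append(ch)

    def _append(self, ch):
        self.data.append(ch)
        self.b += 1              # B advances with every stored character

    def punch(self):
        """Return the text between A and B, as if punching a fresh tape."""
        return "".join(self.data[self.a:self.b])


def type_text(buf, text):
    """Convenience helper: feed a whole string one keystroke at a time."""
    for ch in text:
        buf.key(ch)


if __name__ == "__main__":
    buf = TapeBuffer()
    type_text(buf, "a" + QUOTE + ">b")   # the quoted ">" lands in the buffer
    print(buf.punch(), buf.mode)
    buf.key(">")                         # an unquoted ">" leaves text mode
    print(buf.mode)
```

The point of the sketch is the small state machine in `key`: one flag remembers that the quote symbol was just typed, and that is all it takes to let a reserved character through.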

Starting point is 00:19:07 There would be certain languages that would be almost impossible to write using Colossal Typewriter. There is one other special character that you have to quote. That's the close square bracket. In many languages, that bracket is used to denote the end of a list or array. It's a very, very basic feature. But to Colossal Typewriter, in text mode, that symbol is a command to print the buffer. This would make the program almost impossible to use for many languages. Beyond that, Colossal Typewriter was just a very simplistic program. It worked, and it would have been usable, but only just, and only in the very specific context

Starting point is 00:19:54 it was developed. Before we move forward in our story, I want to take us back. I've already described one way that programmers would have worked with punch cards. Some programmers would have been able to manually work on their decks, flipping things around by hand. But, to be clear, not all programmers worked that way. Some would have had very limited access to the equipment that could have punched cards in the first place. And some may not have had the expertise required to ride the diagonal line. Some programmers just had to work through intermediaries. One practice was to use a so-called coding sheet, or coding form. These were sheets of paper that were pre-formatted for code, as in hand-written

Starting point is 00:20:37 code. A Fortran coding sheet, for instance, had each line broken into 80 columns. Those columns each had handy labels, since some parts of a punch card are actually meant for special purposes, especially when it comes to Fortran. Coding sheets would also have fields for entering the programmer's name, the program name, and other identification information. I actually have a stack of my dad's old coding sheets that he used for a class on RPG II. It's a weird language that I need to dive into at some point. Anyway, from what I've read and what I've seen, programmers filled these sheets out by hand. Most examples that I can find show pencil marks, but it is likely that some brave folk used pen.

Starting point is 00:21:23 You'd write out your program, then submit it to be punched. These forms would then be passed off to a typist, who would sit down at a keypunch and type out the program. These keypunches were pretty cool machines in their own right. They were these huge desk-sized beasts that had a normal-looking keyboard, but instead of typing to paper, they punched your keystrokes onto cards. Edits, in this case, may have been done on the coding form itself. If a program failed to run as expected or needed to be extended, then the programmer would go through

Starting point is 00:21:58 the whole process again. The code would be written out by hand or maybe edited with an eraser and a pencil, and then resubmitted to a typist. Once again, like with manual punch card editing, this is a workable process, but it has a lot of hangups. It's not the most convenient or quick thing in the world. This coding sheet style of programming was a way to deal with limited resources or to control access to hardware. Take the example of someone in a programming class. A college student in the 60s or even 70s wouldn't have known the first thing about a computer, at least not initially. So not only would it be a little foolhardy to drop them into a machine room, it would also be a

Starting point is 00:22:42 massive waste of computer time. By working through an intermediary, by writing on a coding sheet and submitting jobs, computer time could be better allocated. Students could be better served without disrupting a very limited resource on campus. We can look at this workflow as a type of offline editing, as in, the editing process can be carried out without access to the computer, without the computer even running. The only time the computer was involved

Starting point is 00:23:11 was for compilation and execution of the final program. But, of course, this comes at its own cost. When you get down to it, offline editing may not actually save much time, at least not for experienced programmers. Just think about the time wasted with the whole back and forth of this process, but I guess that's a whole separate conversation about time resource management. These forms of offline editing were huge, partly because of that limited resource issue I mentioned before. Mainframes were expensive

Starting point is 00:23:46 machines that were often leased on a monthly basis. Even punch card equipment, like keypunches, cost a pretty penny. It wasn't always possible to offer up computer time to everyone. Even experienced programmers were kept at arm's length from machines in the 50s and 60s. It wasn't always possible to have a keypunch on every desk. But, once again, the PDP-1 and other small computers opened up new avenues. As a smaller and cheaper machine, it was, at least in theory, a less limited resource. There could be more computers and more time to go around. The first pass at an editor, Colossal Typewriter, was very limited. That much is very clear to see.

Starting point is 00:24:35 But this wouldn't be the only text editor developed for the new PDP-1. A few years after the Colossal program came Expensive Typewriter. Now, as I said, the provenance around Colossal Typewriter is a little annoying. It was written at BBN while McCarthy was serving a stint there, but before he went back to MIT. Expensive Typewriter, on the other hand, is pure MIT hackery. As such, it was developed by one of the original gangs of hackers. Here's a fun question to consider. What are the most important types of programs? I'd wager it's not big, fancy software used by scientists. It's not mathematical models used by

Starting point is 00:25:21 engineers. It's not even healthcare information systems or banking software. No. The most important programs in the world are all tools. Programs that make it easier or more convenient to use a computer. Software that makes a programmer's job easier. Without quality tools, you can't write quality software. This means you can figure out how good a programmer's life is by how good the tools they have are. So what was the state of tools on the PDP-1 back in the early 60s? Steven Piner, one of the co-creators of Expensive Typewriter, explained it this way in an oral history interview. Quote, Very limited tools at the time. It was a paper tape, punched paper tape input device, punched paper tape output device,

Starting point is 00:26:11 and a CRT display and a Soroban electric typewriter. The programs were prepared offline on Flexowriters or at Teletypes with a tape punch. Our immediate goal was to get functioning software to help us do what we wanted to do with it. End quote. For many platforms, this was the reality. When you got a computer, you didn't necessarily have nice tools to use with it. IBM was probably the exception to the rule, but even Big Blue took years and years to build up a library of quality programming tools. So these nerds over

Starting point is 00:26:47 at MIT had a PDP-1, but that was about all they had. Piner further explains in his oral history that they did have a very simple assembler and debugger. That was basically the bare minimum. Without a viable text editor, programmers had to punch code directly onto paper tape. This was all done, as he explained, offline via a Flexowriter. These were essentially the tape-based equivalent of a keypunch, but with added features. A Flexowriter could be used as a terminal for online operations. In offline mode, a Flexowriter could be set to punch keystrokes to a paper tape. One of the first tasks that these programmers took up was to create a better set of tools for their new computer. According to Piner, this included an improved assembler, a better debugger,

Starting point is 00:27:39 a very simple operating system, and a text editor. That editor would become known as Expensive Typewriter. How does this second editor differ from its Colossal counterpart? That's a place where things get kind of weird. Piner seems to recall Colossal Typewriter having some sort of line orientation. He implies that users were able to work one line of code at a time. But, from the documentation I've seen, Colossal Typewriter only worked one character at a time. The 1960 manual for the program only has one command that cares about lines, and that's the command for reading in tape data.

Starting point is 00:28:26 So there is either an improved version of Colossal Typewriter kicking around MIT, which is very possible, or Piner is misremembering something. Expensive Typewriter would be line-oriented. Like I said, that may be a new thing or it may have been inspired by earlier software. In practice, this meant that a user was able to navigate text one line at a time. Compare that to the documentation for Colossal Typewriter, which worked one character at a time. That may seem like a small distinction, but I assure you, this matters in a huge way. I want to point out something very specific from that Piner oral history. This is from the same part of the conversation as the previous quote.

Starting point is 00:29:13 Quote, Of course, we were programming purely in machine language at the time. We had no compilers or high-level languages. There was an assembler, and there were some minimal debugging tools and such. End quote. At MIT, in the PDP-1 lab, prior to the development of better tools, programmers were using machine language. Once again, tiny distinction, but these are the kinds of distinctions that I find fascinating, and that, after a while, tend to add up. I'll be making some leaps in logic here, but I think my thinking is sound. Okay, machine language is all numeric. It's a language that the computer speaks itself,

Starting point is 00:29:54 which turns out to be these unending blobs of gross binary numbers. That tends to be a little inscrutable. It's not the kind of thing that you can easily sit down and type on a keyboard. Programming in machine code is tedious, slow, and very error-prone. Since computer time costs actual money in this era, I'd wager that meant that there's little incentive for an online machine code editor. Things change if we're talking about programming in assembly language. Assembly is a very primitive language, but it is more human-friendly. These languages offer fancy mnemonics for each instruction. These mnemonics are one-to-one replacements for

Starting point is 00:30:40 machine code instructions, so the code is still verbose, but it's better. You get to use the word add instead of operation code number 40, for instance. In machine code, everything is packed into raw binary data. You just get bits in this unending stream of digital consciousness. In assembly languages, however, you get to separate out your operations. This is done with a new line. You will actually have distinct lines of code, just like in Fortran or some other fancy language. So here's the spin. In machine code, the smallest unit of code is an instruction. The same is true for assembly language. But if we're speaking lexically about how the language is actually written, then the smallest unit of

Starting point is 00:31:30 code in assembly could also be viewed as a single line of text. It's an instruction followed by a line break. You could totally work with machine code where every instruction ends in a line break, but that will need some massaging before you execute it. You'll have to take out the line breaks before running the program. At that point, there's no reason to stick with machine code. Assembly language code also has to be massaged into machine code, but you get a pile of side benefits. It does more than just removing new lines. This is all a very rambling way of saying that if you're using assembly language, then you could actually make use of a line editor. It makes the prospect of an online, line-based editing tool pretty nice. But, Piner did say

Starting point is 00:32:20 that he and his buddies were working in machine code. Well, that brings us back to the discussion of tools. These PDP-1 hackers were developing a better assembler for the new machine. This would help them migrate away from pure machine code. Now, crucially, this was a macro assembler. That comes with its own pile of very interesting implications. First off, let me explain what a macro assembler actually is. This is one of those programs that kind of just does what the name says. It's an assembler with support for macros. A basic assembler will just translate mnemonics into machine code while adding a little bit of extra spice. You can usually label sections of your code with names, which can be then used in

Starting point is 00:33:11 place of constants. One example is just labeling the beginning of your program as start, and then you could set a register to the address start instead of setting it to the address some string of incomprehensible numbers. That's about as complex as a normal assembler gets. You get mnemonics and you get labels. The macro part just allows you to define a set of macros. This is purely a text replace kind of thing. You define a macro as a chunk of assembly code, and you give it a catchy name. Then, in your actual program, you can call up that macro by its name. When your assembly is being translated to machine code, any reference to a macro is just replaced with the code of that macro itself. This is a handy shortcut for commonly used code. You could use this

Starting point is 00:34:03 to keep track of constants or to do quick little tasks like checking the state of a toggle switch. So here's the cool part. If you're using a macro assembler, you are basically speaking a different language than stock standard assembly. Normal assembly is very repetitive. Chunks of code tend to look the same. You get a lot of ADDs and CMPs and MOVs and so on. You get a lexical soup where everything is pretty much consistently gross. They're just three-letter words over and over again. There's not a very nice way to tell one ADD from another. This is helped, slightly, by labels. You can name chunks of your program in assembly language.

Starting point is 00:34:50 Technically, these actually just label addresses in memory, but that's the same as labeling part of your code. These labels can be used to set register values, or they can be jumped to. And, crucially, you can name them pretty freely. There are restrictions here, especially on older computers, but you get nice, unique names. That helps to make part of your code unique and easy to identify at a glance. Macros take us to the next level. Here, you have unique words that can invoke chunks of code. Instead of staring at a sea of ADDs punctuated by some fancy labels,

Starting point is 00:35:35 you can have a whole passage of code written out in more or less intelligible English. You can easily remember and identify where something is going on. Let's say you're looking for the place where you check the state of a toggle switch, and if it's flipped, jump to another location. Well, that's very easy. You look for where you use the check switch macro. Once again, I'm kind of re-arguing the wheel here. Of course, adding English-like words makes a programming language easier to use. That's the whole point of a high-level language.
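Since macro expansion is just text replacement, it can be sketched in a few lines of Python. This is only an illustration under some loud assumptions: the function name, the macro name `checkswitch`, and the mnemonics in its body are all invented for this example and are not real PDP-1 instructions. A real macro assembler would also handle arguments and nested macros, which this sketch skips.

```python
def expand_macros(source, macros):
    """Replace each line that names a macro with that macro's body lines."""
    out = []
    for line in source.splitlines():
        name = line.strip()
        if name in macros:
            # A macro call: splice in the stored chunk of assembly.
            out.extend(macros[name])
        else:
            # Anything else passes through untouched.
            out.append(line)
    return "\n".join(out)


# Hypothetical macro table: one catchy name standing in for three instructions.
MACROS = {
    "checkswitch": ["load switches", "skip_if_zero", "jump flipped"],
}


if __name__ == "__main__":
    program = "start\ncheckswitch\nadd total"
    print(expand_macros(program, MACROS))
```

Notice that before expansion the program text contains the word `checkswitch` exactly where the switch gets checked. That one distinctive word is what makes the code easy to scan by eye.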

Starting point is 00:36:03 And at this point, we already have Fortran. Folk were already programming in cushy languages. So why does the whole macro thing matter? Well, it's for a very micro reason. Expensive Typewriter introduces a feature that, I argue, was made specifically because it was designed for programming with a macro assembler. You see, navigation in Expensive Typewriter was mainly done by searching for text patterns. The user interface here is very similar to what we saw with Colossal Typewriter.

Starting point is 00:36:38 Expensive Typewriter is modal, with a control mode and a text mode. All commands are entered as cryptic characters. Mode switching is annoying. All that jazz. The key difference is that Expensive Typewriter doesn't really have a cursor in the same sense as Colossal Typewriter. Rather, it just tracks the current line you're editing. You can change the current line, which lets you move back and forth in the document, but that's the boring way to navigate. The cool way to move around is by searching. In control mode, you can search for any pattern of characters. There are some limitations to this, however, or maybe I should say controls. Expensive Typewriter breaks its text up into these virtual pages. Each page is composed of lines.

Starting point is 00:37:27 The manual describes this somewhat like a coordinate system. Each character can be uniquely addressed by its page, line, and location within the line. The editor provides a few different types of search operations. You can either search within the current page or search within the entire text buffer. There are also options for controlling the line number the search starts at. When a hit is found, its location is loaded up into a set of internal variables that function kind of like a three-dimensional text cursor. You can then start editing at the virtual cursor, or you can go looking for more matches. There's something really neat going on here. Think back to punch card editing for a second.
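The page, line, and location scheme can be sketched in a few lines of Python. This is only an illustration under assumptions: the buffer is modeled as a list of pages, each a list of line strings, the coordinates count from zero, and the `search` function and its parameters are hypothetical names, not Expensive Typewriter's actual commands.

```python
def search(pages, pattern, start_page=0, start_line=0, current_page_only=False):
    """Return (page, line, column) of the first match, or None if there is none.

    pages is a list of pages, and each page is a list of line strings.
    All three coordinates are zero-based in this sketch.
    """
    if current_page_only:
        last_page = min(start_page + 1, len(pages))
    else:
        last_page = len(pages)
    for p in range(start_page, last_page):
        # Only the first page searched honors the starting line number.
        first_line = start_line if p == start_page else 0
        for l in range(first_line, len(pages[p])):
            column = pages[p][l].find(pattern)
            if column != -1:
                return (p, l, column)
    return None

buffer = [
    ["START", "CHECKSWITCH", "AND TOTAL"],
    ["CHECKSWITCH", "ADD TOTAL"],
]
print(search(buffer, "AND"))
print(search(buffer, "ADD", current_page_only=True))
```

The returned tuple plays the role of the three-dimensional cursor: once you have it, you know exactly which character on which line of which page to start editing at.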

Starting point is 00:38:11 That was a line-by-line affair. You could call it line-oriented if you want. Imagine that you're trying to fix a syntax error. You wrote AND instead of ADD on a line. If you know the line number, well, that's easy. You can just pull out the correct card and be done with it. But if you only know that the text is wrong, well, you have to manually search through the entire program one line at a time.

Starting point is 00:38:40 Expensive Typewriter automates away that particular operation. You just have to sit down at the PDP-1, enter control mode, and tell the editor to search for AND. It would drop you at the line with the typo, ready for your correction. This, however, can only go so far for assembly language. In this particular case, AND is a valid operation, and it's a common one. On the PDP-1, AND performs a logical AND, something that programmers, for some reason, really like to do. Searching for that word would probably give you a lot of hits.

Starting point is 00:39:18 This is a bit of a digital tongue twister. It can be really hard to keep things straight when you're programming with two operations that use similar names. Imagine looking for the typo in a sea of ANDs and ADDs. You know, best of luck, what can I say? This is the kind of problem that high-level programming can help with, and it's also something that macros and labels can help with. Adding in user-named instructions changes the landscape. It means that we can think of a program in terms of named regions. Those names show up in the text of the code, which means they can be searched for. The ability to search changes, fundamentally, how you as a programmer relate to your code. Now, here, I really am

Starting point is 00:40:07 reaching, but stick with me. Ted Nelson, across his body of work, rails against the idea of sequential data. His core reason for this aversion is that humans don't think sequentially. Rather, we think associatively. We file away ideas in terms of their connections and relationships to other ideas. I argue that Expensive Typewriter is an example of associative thinking. Just by adding text searching, it allows a programmer to think of their code in different ways. Instead of labels and macro names serving as purely functional features, they can now be landmarks. You can associate all parts of your program that check the status of toggle switches by the fact that those sections all call the check switch macro.

Starting point is 00:40:57 Expensive Typewriter even automates that grouping for you. While this isn't hypertext, I think it has a similar feel to it. At least, to me, this speaks to programmers looking for better ways to organize information. Okay, that's probably enough of my hypertext tangent. Back to the actual editor. Expensive Typewriter can do more than just search. In fact, you could almost look at its command mode as its own language. Remember how I said searching fills out a set of internal variables? Well, you can actually reference those variables from control mode. Once a match is found, its page, line, and location within the line are loaded into a set of variables. The typewriter also tracks the end of the matching string,

Starting point is 00:41:43 so a search actually gives you the exact window where that match is found. You also have access to variables that store the number of pages and number of lines in the current page. The idea is that you can use those variables to build out complex commands. You can tell the program to replace very specific chunks of text. You could search for AND, grab the proper coordinates, and then replace it with the word ADD. It's a very cool feature set, but its implementation, well, it's kind of gross. It's basically a digital form of chicken scratch.

Starting point is 00:42:22 The first issue is just the choice of variable names. For whatever reason, everything in control mode specifically uses one-character commands. They are terse and confusing. Let me give you an example straight from the manual. Right arrow has the value of the number of pages in the buffer, i.e. the number of the last page. Double quote has the value of the number of characters from the beginning of the line at which the most recent search command found a match to the first character of the match. End quote. Using variables and commands, you can do all kinds of text manipulation.

Starting point is 00:42:58 You can do a wild amount of processing, but you have to know this really arcane kind of demonic-seeming language. If you do know that, however, you can switch around lines of text. You can move the results of searches or delete the line that immediately follows a certain match. You get this very robust rule set for editing text, but it's just not user-friendly at all. The language here is so deep that it even has constructs like loops. Honestly, it's a wild feature set for a text editor, and it makes sense. It's an editor written by programmers for editing programs. There is one other feature that I have to at least touch on. Expensive Typewriter can, at a single keystroke, fire up the team's fancy macro assembler. The assembler then takes the file you were editing, assembles it, and runs the finalized

Starting point is 00:43:55 program. I'm sure there are some weird issues with this setup, especially if you write bad code, but this is a really nice feature to have. It means that a programmer can, effectively, live inside the typewriter. You can program, assemble, and execute code from within one interface. You could also configure Expensive Typewriter to run an arbitrary program at another keystroke. This is something like a hotkey or a macro key. So in theory, you could load up a debugger. You could have a fully integrated development environment in the early 1960s. That's wild to think about because that's a really modern tool set to have all packed into

Starting point is 00:44:39 one interface. But this itself leads to another issue. At this point in our chronology, we're still in the early 1960s, and we're going to be staying at MIT. This has kind of turned into a story about software on one computer at one lab at MIT, but that's how it goes sometimes, you know? Online editing, powered by Expensive Typewriter, was becoming popular. The editor worked very nicely, despite its quirks. As Dan Murphy, one MIT programmer, explained, With the PDP-1, the code-compile-debug cycle was shortened from hours to minutes. I could do it several times on the machine in a one-hour session. Good homespun tools and the overall design of the PDP-1 meant a more personal computer.

Starting point is 00:45:29 It also meant a more productive computer, at least in a lab setting. But despite this, computer time was still scheduled. It was still a limited resource. Online editing with Expensive Typewriter was cool, and it was a very useful tool, but there was a problem with that. It's even referenced in the name. Using an online editor could eat up a lot of time. It turned out to be a very expensive type of program to use. Thus, a kind of fun problem crops up. The PDP-1 is still a limited resource, enough so that programmers can't get unlimited time with the machine. But on the other hand, there are tools that are pulling users to use the PDP-1 in more interactive ways.

Starting point is 00:46:14 This isn't a batch machine where you can drop off a program and wait for results. This is something that's on the cusp of personal, if only there were more hours in the day and more hardware in the lab. One solution, and the solution that the aforementioned Murphy chose, was a return to offline editing. At least, kinda. The narrative here isn't some nice linear story. Expensive Typewriter was, at least ostensibly, meant as something of a demo. It did see use, but due to the overall climate of computing, it wasn't exactly a staple. It was something nice that could be used sparingly. Some folk, probably the vast majority at MIT, still prepared programs offline. Murphy figured that both options kind of sucked, so he started developing a third path. This leads to a program

Starting point is 00:47:05 called the Tape Editor and Corrector, or the Text Editor and Corrector, or just TECO. And, dear listener, this is where we start to go fully off the rails. First, let me outline the top-of-the-line workflow, assuming that you were using Expensive Typewriter. You would come into the computer lab for your scheduled time slot. You'd sit down, load up the editor, load in your program, which was most likely on tape. You could then enter the so-called code-compile-debug cycle. This means that you would make changes to your program, compile it, try to run it, then try to sort out why it wasn't working. This is a cycle because once you ran through some debugging,

Starting point is 00:47:50 you'd have to go back and edit your program and then compile it and then debug it some more. Eventually, your hour runs out. You get kicked out of the cycle and kicked out of the lab. This is where things get interesting. When you're dropped from that loop, more than likely, you aren't done working on your program. You still have ideas in your head, you still have bugs to fix, you still have edits that you need to make. Under the status quo, there were two options. The first is to sit down at a Flexowriter and retype your entire program, making the corrections that you have in your head.

Starting point is 00:48:24 That kinda sucks. The second is to wait for your next scheduled window, sit down at the PDP-1, and use Expensive Typewriter to make your edits. That's much fancier, but making those edits will eat into that time window. So that option also kinda sucks. Each of these approaches has its benefits, however. Expensive Typewriter lets you make very fine edits interactively, and offline editing gives you time to mull over the changes you want to make without squandering computer hours. Murphy wanted to get the best of both

Starting point is 00:48:57 worlds. As he described in The Beginnings of TECO, he started working on a, quote, program to help me write other programs. Initially, TECO sat in this weird gray zone between online and offline editing. The best way I can describe TECO is as a text-oriented programming language. A user would prepare a tape that described the edits they wanted to make, sometimes called the correction tape. That would be done offline at a Flexowriter. Then, once their scheduled window came up, they would load up TECO, load up the code they wanted to edit, and then load up that correction tape. This was kind of

Starting point is 00:49:37 like storing up a bunch of commands for another editor's control mode and then running all of those control sequences at once. This leads to a really simple, yet very powerful tool. Many of the basic commands are similar to those that we saw in Expensive Typewriter. You can search for strings, remove and replace chunks of text, add new text, all that stuff. Where things differ, however, is the matter of line orientation. This is something I didn't expect and something that I never really thought about until reading Murphy's article on TECO. He argues that line-oriented editors are actually a form of legacy technology. Expensive Typewriter treats the line as a key unit of data. In my head, that's always made sense. Programs are executed line by line,

Starting point is 00:50:26 after all. But here's the thing. Murphy's argument is that separating data into lines is an artifact of punch card technology. I kinda like this line of thinking. Expensive Typewriter treats lines of text as if they are somehow special entities, as if they're a special unit of data. TECO, on the other hand, treats a file as a long stream of text, like a tape. Lines exist only as newline characters somewhere inside that dataset. This meshes better with the underlying medium, with the paper tape, and it also simplifies editing. TECO itself has an internal cursor that points to one character

Starting point is 00:51:12 in the text stream. To make edits, you move that cursor to wherever you want to change some text. This can either be done by advancing the cursor a set number of characters, by searching for a text pattern, or by some more arcane means involving macros, registers, and searching. The point is, you're effectively sliding this pointer up and down the tape. The almost language here is a very powerful thing. If you wanted to, you could do all your programming from TECO. There's a command to insert text and new lines, so you can easily work up a tape that just spits out whatever program you want. You could then, after some debugging, work up a TECO tape that would splice out all the nasty

Starting point is 00:51:56 bugs you find. Let's call up today's repetitive example. Let's assume you fat-fingered ANDs where you meant ADDs, and you know all those fat fingers show up right after you call the check switch macro. Maybe you intended to grab the value of a switch and then add it to some running total somewhere. You just happened to mess things up by one character, and you think it happened a few times. This kind of problem is just begging for some slick TECO. You can write up a correction tape to look for each instance of the check switch macro, advance past the newline character, and then replace the next AND with an ADD. That tape, all on its own, could correct your entire

Starting point is 00:52:40 program. That's some real power. But with great power comes, well, a bad interface. There is a fatal flaw with TECO's design. The actual language, if you can even call it that, is close to incomprehensible. It's not even a programming language in the strictest sense. It's basically just an extended version of a text editor's control mode. Each operation in TECO is one or two characters. Those operations are executed immediately. So there really isn't syntax, so to speak. It's just a list of operations to run.
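For a rough feel of what that check switch correction tape does, here is a sketch in Python rather than in actual TECO commands. It is an illustration under assumptions: the program is held as one long string, the way the tape model treats it, and the `fix_after_macro` function, the CHECKSWITCH macro name, and the sample program are all invented. One deliberate choice: it only replaces an AND that sits right at the cursor after the newline, so a correct ADD following the macro is left alone.

```python
def fix_after_macro(text, macro="CHECKSWITCH", wrong="AND", right="ADD"):
    """Slide a cursor along one long string, fixing typos that follow a macro call."""
    cursor = 0
    while True:
        # Search forward from the cursor for the next call to the macro.
        hit = text.find(macro, cursor)
        if hit == -1:
            return text
        # Advance the cursor past the newline that ends the macro's line.
        newline = text.find("\n", hit)
        if newline == -1:
            return text
        cursor = newline + 1
        # Replace the typo only if it sits right at the cursor.
        if text.startswith(wrong, cursor):
            text = text[:cursor] + right + text[cursor + len(wrong):]
            cursor += len(right)

tape = "START\nCHECKSWITCH\nAND TOTAL\nAND MASK\nCHECKSWITCH\nADD TOTAL\n"
print(fix_after_macro(tape))
```

Notice that lines never get special treatment here. The newline is just another character the cursor has to step over.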

Starting point is 00:53:22 You can't dress up and format your TECO in any comprehensible manner. I'll spare you trying to read some TECO code on air because, well, I think that might actually summon a demon. You can find examples of programs online. They are concerning. Murphy himself says it best, so let me cut back to him. Quote, In the short term, however, offline command preparation proved to be impractical. It was impossible for most mortals, myself included, to consistently prepare command tapes that would work right. Like any other program, they would have bugs that would only become evident while on the computer. It's a beautiful situation to consider. The program that's meant to fix bugs in another program can come with its own bugs.

Starting point is 00:54:06 And this is where things take a twist. Murphy intended TECO as this middle ground between online and offline editing. It was supposed to be this tool that takes the best of both approaches. But TECO's complexity meant it required its own code-compile-debug type of loop. It was best used interactively. In short order, TECO was modified to be an online text editor, just as a way to deal with these issues. There were some modifications made to the control language,

Starting point is 00:54:37 but the biggest jump here was breaking up the workflow. Running an entire correction tape meant running one modification after another. It meant running a very non-interactive program. By going online, a user could run a single command, check that the changes were correct, and then think and prepare for the next command. The fun part is this switch to online operation got us back to the same issue that Murphy was trying to solve. TECO now used computer time. And worse, it used interactive computer time. But that turned out to be a worthwhile move.

Starting point is 00:55:15 At this point in the mid-sixties, TECO was basically just a better Expensive Typewriter. It can do everything the earlier editor can do, and more. Now there is one final thing I want to go back to. That's the whole data representation in TECO. Like I mentioned earlier, it treats text as a long string of characters, whereas line-oriented editors treat text as a collection of discrete lines. Line editors would persist, but they've become something of a dead end. The text stream approach is, say it with me now, much more flexible. This data representation makes no assumptions about the data itself, which is a good thing to do.

Starting point is 00:55:59 It may have newlines, it might not. It might not even be text. It could just be a pile of data, and if you're editing just a stream of bytes, oh, that's fine too. This has proven a more durable approach, at least in some applications. Over in Unix land, data streams are hugely important. The shell, Unix's main user interface, has built-in support for text streams, or generic data streams, rather. You can pass around streams of data from one program to another, each program processing the stream as it flows through. I use these features nearly every day in my professional life.
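The idea of each program processing the stream as it flows through can be sketched with Python generators. This is only an analogy under assumptions: the `grep` and `substitute` functions below are hypothetical stand-ins named after the kind of job they do, not implementations of the real Unix tools, and chaining the calls stands in for a shell pipe.

```python
def grep(lines, pattern):
    """Pass along only the items that contain the pattern."""
    for line in lines:
        if pattern in line:
            yield line

def substitute(lines, old, new):
    """Rewrite each item as it flows past."""
    for line in lines:
        yield line.replace(old, new)

source = ["CHECKSWITCH", "AND TOTAL", "JMP START"]
# Each stage consumes the previous stage's output one item at a time.
result = list(substitute(grep(source, "TOTAL"), "AND", "ADD"))
print(result)
```

Neither stage knows or cares what comes before or after it. Each one just takes a stream in and hands a stream out.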

Starting point is 00:56:34 Some of you listening probably do the same. If you've ever piped, grepped, sed'ed, awked, or perled, then TECO probably sounds familiar to you. Alright, that brings us to the end of this episode. Text editors are a simple fact of life, but they appear from a very specific set of circumstances. They are one of those technologies that we could look at and think, yeah, of course, that's always existed, why wouldn't it? But the fact is, text editors were developed due to this very specific set of circumstances. It comes down to hackers that were left alone with a smaller computer that couldn't read punch cards. Some early editors, like Expensive Typewriter, were meant to replace the feel of editing

Starting point is 00:57:25 punch cards, or at least, they ended up replacing that type of workflow. They worked line by line, just like a deck of punch cards. Others, like TECO and Colossal Typewriter, worked a character at a time. This was more in line with the underlying medium, at least at first. Over time, storage media changed, as did text editors. Both approaches end up having their days. Now, I want to leave you with a question to mull over, as is my sometime practice. Would text editors have developed without the PDP-1? What if DEC had released a more traditional machine, one that read punch cards? Would online editors have appeared in the early 60s? Would there have been a need for online editing at all?

Starting point is 00:58:12 I think that's an interesting counterfactual to consider. Maybe we'd still get editors, but I bet they would look and feel very different, at least at first. Thanks for listening to Advent of Computing. I'll be back in two weeks' time with another piece of computing's past. And hey, if you like the show, there are a few ways you can support it. If you know someone else who'd be interested in the history of computing, then please take a minute to share the podcast with them. You can also rate and review the show on Apple Podcasts. If you want to be a super fan, you can actually support the show directly through Advent of Computing merch or signing up as a patron on Patreon. Patrons get early access to

Starting point is 00:58:49 episodes, polls for the direction of the show, and bonus content. You can find links to everything on my website, adventofcomputing.com. If you have any comments or suggestions for a future episode, then go ahead and shoot me a tweet. I'm @adventofcomp on Twitter. And as always, have a great rest of your day.
