Advent of Computing - Episode 156 - RPG, a Different Paradigm?

Starting point is 00:00:00 I've told the story before of how I learned to program. My father had earned a degree in computer information systems back in the 1980s, but he never really used it. A side effect was a shelf full of programming textbooks in my childhood home. That's a very dangerous thing for a child to stumble upon. If you have a similar situation, I recommend locking them up. The books, I mean, not necessarily the child. When I finally found these texts, I caught the bug, and I've never been the

Starting point is 00:00:32 same since. In the ensuing years, those books have migrated to my own shelves right behind me. I remember having long conversations with my father about programming. That is, once I started down the path. One of the things that stuck in my mind was the time he told me about this language called RPG, the Report Program Generator. It was a language meant to run on big machines, not the dinky home computer I was working on. It did all kinds of serious business stuff, whatever that meant in practice, I had no idea. And it was programmed by filling out a form. That last part sounded odd to me.

Starting point is 00:01:16 Even as a novice, I knew that wasn't how you program. One is supposed to, of course, open a text editor and type out code line by line. But a form? That didn't exactly compute for me. A few days later, my father actually produced a pad of those forms and showed me some example code. Now, I couldn't read it. I couldn't make heads or tails of it, but I could tell he was telling the truth. This RPG thing was, somehow, fundamentally

Starting point is 00:01:46 different than what I was currently struggling to learn. I like to think that I've grown as a programmer since then. I've mastered, or at least butchered, many a programming language. RPG, however, is something that I still think about from time to time. It's one of those big mainframe things that us mortals can't really use, right? But what exactly is RPG? And is it truly bound to some arcane form? Welcome back to Advent of Computing.

Starting point is 00:02:26 I'm your host, Sean Haas, and this is episode 156, RPG, a different paradigm? Or perhaps you could call it, what my father never taught me about RPG. Today we're looking at a language that I know of by name, but not much else. It seems like when it comes to RPG, that's the case for many, or at least a few of the folk I've run into. I've brought up the language in conversation a number of times, you know, as one does, and I've been surprised when someone says, oh yeah, I've heard of that, but I've never actually used RPG.

Starting point is 00:03:03 But hey, you know, I keep weird company, I'm weird company myself, so maybe it's not much to go on there. My plan for today is to remedy this situation. I want us to get a working understanding of what RPG is, where it fits into the larger timeline and kind of how it developed. RPG becomes interesting as soon as we scratch the surface. To begin with, it's developed in 1959. That is very, very early in the history of programming languages. This puts it around the same age as Lisp.

Starting point is 00:03:40 It also means that, probably, we're in for some weirdness. As I kept mentioning in the recent Acts episode, much of programming is built on tradition. Early programming languages exist prior to the formation of many traditions, so things get weird. Another way to look at it is that programming isn't quite nailed down yet, so there's more room for variation and experiments. The language is also actively supported today. The most recent version of RPG came out in 2020, so in addition to being a very early language, it's also a very long-lived language. The final little hook here is that RPG, in some ways, was developed to emulate older punch card tabulators. For now, I'm going to leave it at that since I really want to go deep on this.

Starting point is 00:04:33 How do you even develop a program that works like a punch card machine? Our path here is simple. At least, I think so. We have a strange old language, right? Well, I want to complicate that a bit. From what I've seen so far, RPG is very unique. When faced with technology that's so different, I like to ask, is there anything here for us to learn? Is RPG a diamond in the rough? Is it something everyone should learn and be taught in college?

Starting point is 00:05:03 Or is there an obvious reason that so few have been exposed to this art? RPG is designed initially for the IBM 1401. To make sense of the language, we have to talk at least a little bit about its first platform. The 1401 is released in 1959. The computer is, essentially, a giant and very expensive programmable unit record machine. Now, don't feel too bad if you don't know that term. Unit record equipment is the overarching term for punch card machines. That includes tabulators, key punches, sorting machines, anything that touches a

Starting point is 00:05:45 punch card. The 1401 can do all of that, and it's all controllable via a program. That makes it at once powerful and terrible to behold. The 1401 is explicitly a punch card machine. It's meant to replace older workflows that were managed by older unit record equipment. The central processor connects up to card readers, card punches, printers, and tape drives. It serves as this nerve center that munches up data and spits out results. While the 1401 is an advancement over older non-computerized equipment, it's also compatible with many older workflows. It's a replacement and an upgrade, but in many ways, it's designed to slot into place.

Starting point is 00:06:35 There are, however, some caveats to that, which we will be addressing in a little bit here. First, though, I want to address what exactly people are doing with older unit record equipment. What exactly is the 1401 supposed to be replacing? Well, would you believe very boring back office stuff? I'm gonna use an example from the 1964 RPG handbook that I'm currently living out of. That book follows a case study that has to do with invoice processing, and it's very indicative of the kind of work people would have been doing on tabulators when the 1401 came out.

Starting point is 00:07:16 The problem is pretty simple conceptually. You work for some company. It's, once again, Widgetco. They're a very good employer of podcast listeners. Every time there's a purchase, an invoice is filled out. That includes what item was sold, who it was sold to, the quantity and the unit price, the final price, and its internal cost. You need to take this raw invoice data and generate a series of reports. At the end of the day, you should have total revenue for a month, net revenue, and order volume. That is a very, very simplified but very common use case for data processing.

Starting point is 00:07:59 To actually be processed automatically, this needs to be put onto punch cards. You could do this by hand, but Widgetco's been on the upswing, so they have some automation to deal with the sheer volume of data. One issue, however, is that this is too much data for a single punch card. An invoice entry just won't fit.

Starting point is 00:08:21 Now, this is actually a persistent issue with punch cards as a format. An IBM card holds 80 columns of data. That roughly means 80 characters of data. Certain characters and certain special characters take up more than one column and some columns are used for formatting. So 80 is the absolute maximum. In practice, it's probably more like 78 or 79. When you're dealing with data entries that are more than 80 text columns, one solution is to use multiple cards. That's usually done by describing different types of cards and then using one or two columns to denote which type of card you're dealing with.

Starting point is 00:09:06 For example, the invoice case study uses column 80 to represent card type. Type 1 cards, those with the number 1 punched in that column, describe the item sold, which includes cost and price information. Type 2 cards, those with a 2 in column 80, are order cards, which describe the date of the order, who sold it, and who bought it. This goes on down the line until you have enough space to transform an entire invoice into a set of punch cards. Thus, you end up with a stack of punch cards that represent all your invoices for a month.

Starting point is 00:09:43 In the old regime, using unit record equipment, you would run to a tabulator once your deck is prepared. These were the workhorses of any punch card installation. A tabulator reads in cards, performs operations on their data, and then spits out cards or makes printouts. A crucial component here is that tabulators

Starting point is 00:10:05 have mathematical circuits. They can do mathematics and they contain registers for storing data to operate on. For the invoice example, you'd set up the tabulator to only look at item and order cards, disregarding any other cards. Then you'd read in some fields, item price, item cost, numbers of

Starting point is 00:10:25 units sold in an order. You'd multiply price by units and then add that to one register. Then you do the same for cost times units and store that in another register. When all cards are processed you would then have something wired up to print the total revenue, total cost, and take the difference to get your net. That's your final report. That sounds like a very computer-ish operation. But tabulators are not computers. They're not Turing-complete.
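To make that run concrete, here's a rough sketch of the same job in Python. Only the card type in column 80 comes from the case study as described; the field positions (price in columns 1-6 and cost in columns 7-12 of an item card, units in columns 1-4 of an order card), the helper names, and the idea that each order card follows its item card are all assumptions made up for illustration.

```python
def make_card(text, card_type):
    """Build an 80-column card: text padded to 79 columns, type in column 80."""
    return text.ljust(79)[:79] + card_type


def field(card, first, last):
    """Return columns first..last of a card (1-indexed, inclusive)."""
    return card[first - 1:last]


def tabulate(deck):
    """Accumulate revenue and cost (in cents) the way the wired-up tabulator would."""
    revenue = 0               # register 1: running total of price * units
    cost = 0                  # register 2: running total of cost * units
    price = unit_cost = 0     # held over from the most recent item card
    for card in deck:
        kind = field(card, 80, 80)
        if kind == "1":       # item card (hypothetical field layout)
            price = int(field(card, 1, 6))
            unit_cost = int(field(card, 7, 12))
        elif kind == "2":     # order card (hypothetical field layout)
            units = int(field(card, 1, 4))
            revenue += price * units
            cost += unit_cost * units
        # every other card type is disregarded
    return revenue, cost, revenue - cost


deck = [
    make_card("000250000100", "1"),  # price 2.50, cost 1.00
    make_card("0010", "2"),          # 10 units sold
    make_card("SHIP TO: ...", "3"),  # some other card type, ignored
    make_card("001000000400", "1"),  # price 10.00, cost 4.00
    make_card("0003", "2"),          # 3 units sold
]

print(tabulate(deck))  # (5500, 2200, 3300)
```

Note that this sketch already leans on things a tabulator doesn't have, like a general-purpose if statement. It's only meant to show the shape of the data flow: filter by card type, pull fields out by column, accumulate, and report at the end.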

Starting point is 00:10:56 While they can make decisions, those decisions are very limited. You can't control the execution of a procedure based off some decision. This leads to another limitation here. Tabulators aren't exactly programmable. The operations I described sound a lot like a computer program, but they're not one. Tabulators and other unit record equipment are configured. This is done via a plug board and patch cables. A tabulator has internal registers,

Starting point is 00:11:30 a reader that exposes the value from a punch card, comparison circuits and math circuits. The plug board has hookups for all these inputs and outputs for those circuits. Then you physically wire those components together. So each time you set up a tabulator, you're essentially wiring up a big circuit from these internal components. You can't do conditional execution, because there isn't really a concept of execution.

Starting point is 00:11:59 Data flows in a series of steps from circuit to circuit. You have conditionals that can block or inhibit a certain path, or maybe redirect data in a different path. You can't, however, tell it to run a certain mathematical operation until a loop reaches some condition. So it's not Turing complete, it's not really a computer despite having many features of a computer. The 1401 offered a different solution to this problem. It could do all the same operations as a tabulator, but under control of a program instead of a patch cable. The 1401's card reader turns data on punch cards into digital numbers that are fed into the computer. Then your fancy program can crunch those numbers and create a printout. You'd probably even use similar

Starting point is 00:12:51 logic. Choose which cards to process via conditionals, extract fields into variables, do some math, run some totals, and then finalize and print your numbers. But here's the caveat that I mentioned earlier. A tabulator and a computer operate very differently. A tabulator marches to a beat as cards flow through. I will say I don't have personal experience using a tabulator, but I've read about them quite a bit. If I understand them correctly, they work on a series of phases as each card is processed. All operations are pegged to part of this processing cycle. You read in

Starting point is 00:13:30 a card, go through phase 1, phase 2, phase 3. At each phase, certain circuits are turned on or turned off. A computer fundamentally marches to a different beat. For the 1401, that's 87 kHz. You get another step every dozen or so microseconds, and it has nothing to do with the beat of incoming punch cards. As a result, you have to use different methods for processing incoming data. There's also the whole programming aspect. All the 1401 circuits are already connected, at least somewhat. You flip the connections using arcane invocations that instruct a series of transistors to turn off and on. You carry out these operations by, as I said, issuing instructions by writing code.

Starting point is 00:14:21 You don't add to a register by wiring up its input to a certain output. You add to a register using the add instruction. And that enters you into the domain of Turing completeness. The 1401 can do things like executing a loop that breaks under a certain condition. You can run part of a program for one card and run a different part of a program for another card. You have conditional execution. It's much more sophisticated, it's much more flexible, and at the same time it's much more complicated. From our standpoint, that is a good thing. That's great. Who wouldn't want to replace a tabulator with a whole living breathing computer? I mean imagine the possibilities, you could run Doom on

Starting point is 00:15:09 it. But in practice, this was a double-edged sword. Imagine the accounting department at Widgetco. They've been using tabulators since the 1920s. They have all their plug boards set up just right. All their reports are wired up and stored in these big custom-built shelves that hold dozens of plug boards. When they upgrade to a 1401, all those years of experience mean nothing. Suddenly they need this thing called a programmer to do payroll. And what even is a programmer? There are also certain things that a tabulator can do more easily than a 1401. Take formatting as a very concrete example.

Starting point is 00:15:56 A late model tabulator, like an IBM Model 407, has all kinds of inbuilt formatting options. That's how you do nice printed reports or tables of data. There are methods built into the tabulator for handling formatting. It's almost as sophisticated as some kind of typesetting system. A 1401 can do the exact same thing. A programmable computer can be made to do anything that's computable,

Starting point is 00:16:27 but it has to be programmed to do so. You couldn't just take the right formatting tape and push it inside of your 1401's processing cabinet. That wouldn't work. You'd have to write a program that would know how to transfer around control and format tapes and format printouts in exactly the right way. That's doable, but requires a totally different skill set than formatting on a tabulator. Again, you're trading complexity for flexibility. We have at once the outline of this problem, but the solution is at hand. With a computer, you can solve all problems. You can compute anything computable. And it turns out that anything computable is a very large array of things. The solution in this case was a series of report-generating

Starting point is 00:17:19 programs that mimicked old-school tabulators. In other words, programs that made the 1401 function even more like the tabulators it was replacing. Perhaps the first of these programs was 9PAC. Now this actually predates the 1401. 9PAC was developed for the IBM 709. I'm going to largely skip this, because it doesn't add all that much to the story, plus it's only tangentially related to RPG. The reason for that is hardware. At the time, IBM had two main lines of computers, their scientific series, the 700/7000 machines, and the new up-and-coming 1401 series of business computers.

Starting point is 00:18:06 These lines were not compatible, so not much software was shared. I just want to mention 9PAC because there is this earlier tradition. This is a problem that's been well known since computers started hitting accounting departments. The first 1401 report generator, and the one we're going to dive further into, was called FARGO. That stands for the Fourteen-o-one Automatic Report Generation Operation, because all IBM names in this period have to be very strained abbreviations. FARGO was designed to emulate the old IBM 407 tabulator. And that's not just me looking at two manuals and going, oh yeah, that seems similar. It explicitly

Starting point is 00:18:54 says this in Fargo's manual. Quote, "The statements used to supply the information are based on the logic of the 407. Thus, any person with sufficient knowledge to develop IBM 407 specifications for a given report can learn and apply the Fargo language with very little training." It's explicitly a transitionary tool. If you work for a company that just upgraded from an old 407 to a 1401, then you can use Fargo to get up and running. You don't need one of those fancy programmers, just a card punch. Where this gets interesting is in concrete implementation. So how exactly do you take a computer and make it act a little more, well, dumb?

Starting point is 00:19:51 Fargo isn't a programming language. There's a crucial part that I want to be clear about before we continue. The IBM manuals for Fargo will occasionally call it the Fargo programming language or refer to it as a language, but the manuals don't really conceptualize it like other programming languages. This isn't just a case of the text existing before the language around that was codified. Fortran is already out and about. IBM has already written a lot of manuals for Fortran. Fargo is a reporting tool that can be made to act very similar to a programming language by using

Starting point is 00:20:33 external tools. I recognize that that's a very subtle distinction that I'm laying down. I'm going to be striving to make that more clear as we go forward. First of all, a user doesn't write a program. They code a report. And it's done using, and get this, standardized forms. Now, I'm not saying that with any tongue in any form of cheek here.

Starting point is 00:21:07 IBM actually had numbered forms that you filled out to describe a Fargo report. This already goes a long way towards answering one of my key questions. Yes, people were really using forms to generate reports on IBM hardware. But how exactly did that work? How do you feed a form letter into a computer? Again, it all goes down to punch cards. The 1401's primary input method was a punch card reader. That's how everything from data to Fortran source code, even to binaries, made its way into the computer. So to get these Fargo forms into the 1401, they had to be transferred to punch cards. This was how all programming, or at least a lot of programming, worked in

Starting point is 00:21:57 this era. Some languages would have transition steps. You'd go from handwritten notes to some normal standardized form and then to punch cards. Fortran even worked this way in the period. You could use a standardized Fortran programming form. But Fargo went direct. Each Fargo form was an 80-column-wide grid of fields. When a form was filled out, it could be transferred column by column to a set of punch cards.

Starting point is 00:22:31 Each column had to line up exactly with a card's column, and each row represented one punch card. That means that whatever you write down in a Fargo form is literally one-to-one with what the computer reads on a card. This also represents a very different way of programming. Now, I know, I just said that Fargo isn't a programming language, but I'm gonna be looking at it through the lens of a programming language. I hope my reasoning for that will become clear as we move on.

Starting point is 00:23:06 Fargo doesn't exactly have syntax. It also doesn't really have the concept of a line of code. Everything is a form, with each row representing a card in a sequence. The meaning of a card is determined by one or two characters in the card's first column. A T in column 1, for example, marks a total card. Those are used to carry out mathematical operations. Once you mark that first column, it dictates the form for the rest of the card. Let's say we're using a T for total as the example. A T card represents some kind of math operation at some specific phase of the reporting process. The operation is put in column 6.

Starting point is 00:23:54 Add, subtract, multiply, divide, or move. Arguments are specified in columns 7 through 9, and 10 to 12. The first field here is a source and the second is a destination. If you punch an add card, for instance, when that's executed, it will add the value of the source to the destination accumulator. That makes this sound very much like a normal language, but I assure you it's not,

Starting point is 00:24:22 at least it's not represented in a normal way. I want to start us out from this point of similarity and then build on it, so let me start to diverge. The phrases in use here are going to be strange. In Fargo, a variable is called an accumulator, or a constant. These terms are interchangeable. They're declared on the Constant Control card, aka Phase 1, Form Number X24-6557. This is exactly what I mean when I say that Fargo is programmed using standardized forms. IBM literally gives you numbered and labeled forms to use. A Constant Control card is pretty simple. It starts with a C, then you punch the number of positions that constant occupies. That's essentially its size in bytes, roughly. You then give the constant a three-character name and its initial value. So you might have a row that says C04BAR1024.

Starting point is 00:25:32 That would be a constant that's four characters long called bar that stores the value 1024. I wanna point out two initial things. First, I wouldn't really call this syntax, at least not in a classical sense. When we look at a language, we usually talk about its syntax, how the language looks and how it's constructed. But for Fargo, everything depends on how characters are positioned. To use a technical term, it comes down to how data is packed. That's pretty unique. We don't get things like tokens or token separators. That's usually how you break up

Starting point is 00:26:13 the pieces of a line of code. Instead, you just get characters that are packed in a very specific form. Point two, Fargo is kind of visual. This may sound counterintuitive at first, but hear me out, I swear I'm not going crazy. All programming languages have multiple representations. For Fargo, we get two, the forms and the card. There is also always the representation that lives in the heart of a programmer, but that one doesn't count right now. The card representation is inherently visual, since it's composed of a series of holes

Starting point is 00:26:53 punched on cardstock. That's something that you can't really write down or speak aloud easily. The form representation is only readable when the text is, you know, placed in the form. You have to look at the form to see how fields line up, what different parts of the entry mean. You could render a line as C04BAR1024, but what does that mean? There's nothing to help you parse or understand that besides knowing exactly how a C card is structured character by character.
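Here's a minimal Python sketch of what "packed by position" means in practice. There's no tokenizer, just slicing at fixed columns. The total card positions (T in column 1, operation in column 6, source in columns 7-9, destination in columns 10-12) are the ones described above. The exact columns of the constant card are an assumption: the sketch simply treats C04BAR1024 as contiguous fields starting at column 1. The single-letter operation code "A" for add and the accumulator name TOT are also made up for illustration.

```python
def columns(card, first, last):
    """Return columns first..last of a card (1-indexed, inclusive)."""
    return card[first - 1:last]


def parse_constant_card(card):
    """Split a constant card by position.

    Assumed layout: column 1 'C', columns 2-3 size, columns 4-6 name,
    then 'size' columns of initial value starting at column 7.
    """
    if columns(card, 1, 1) != "C":
        raise ValueError("not a constant card")
    size = int(columns(card, 2, 3))
    return {
        "name": columns(card, 4, 6),
        "size": size,
        "value": columns(card, 7, 6 + size),
    }


def parse_total_card(card):
    """Split a total card: op in column 6, source 7-9, destination 10-12."""
    if columns(card, 1, 1) != "T":
        raise ValueError("not a total card")
    return {
        "op": columns(card, 6, 6),        # hypothetical one-letter op code
        "source": columns(card, 7, 9),
        "dest": columns(card, 10, 12),
    }


print(parse_constant_card("C04BAR1024"))
print(parse_total_card("T    ABARTOT"))
```

Shift any field by a single column and the card means something else entirely, which is exactly why the printed form matters so much.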

Starting point is 00:27:33 If you don't have a form handy, if you don't have this written down in the right X24 stroke whatever form, that loses all context. You need that visual component. Constant cards are just one type of card in the entire system. The next kind that I want to talk about is the Detail Control card. That's Phase 3, number X24 stroke 6559. Okay, I'm done giving the form names. For this card to make sense, we have to grab another piece of tabulator lore.

Starting point is 00:28:09 In Punch Card Lingo, any card that holds a single data point is called a Detail Card. One card from my invoice deck would be a Detail Card, for instance. This is in contrast to cards that hold the result of calculations. Those are called total cards. I'm sure there's some joke there. It just feels like you can make a pun, but it's a little out of my reach. I'll leave that as an exercise to the reader. Anyway, the Detail Control card tells Fargo what to do with each punch card that's read into the 1401. Now, let that sink in for a second because this is very different

Starting point is 00:28:50 than a normal programming language. These Detail Control cards are executed every time a new data point is fed into the computer. It's something that happens to a beat that's dictated by cards being read into the machine. The structure here is fundamentally different from a normal computer program. There's not even a comparison. Something like Fortran does not have an equivalent to the Detail Control card.

Starting point is 00:29:22 This is functioning exactly like an IBM tabulator. Now these cards are much more complex than constant cards. A big reason for that is that detail cards can come in many different forms. Punch cards don't really specify much about how data is structured. IBM just specifies a way to encode information in terms of holes and 80 little columns. It's up to the operator to break those columns up into fields or combine them in all kinds of ways to make meaningful information. Fargo's Detail Control cards specify how that data is extracted and loaded into the computer. They also describe modifications for each data point and can punch incoming data onto new cards

Starting point is 00:30:09 or even print data. The loading part is the most simple. In that case, you just tell Fargo which columns of a card represent a field, where you want to load that field and what operation you want to use during that load operation. You could load the value as is, or you could add it to an accumulator,

Starting point is 00:30:28 or even do something more sophisticated. By more sophisticated, really, I just mean multiply or divide, but that is a lot more computationally complex than an addition. This is also where we reach our first conditionals. Now, I want to emphasize this is different than general purpose conditionals that we would use in a normal program. A Detail Control card can specify a test for incoming data.

Starting point is 00:30:57 If the card being read matches that test, then the Detail Control card is executed. Those conditionals are encoded. You basically have to pick a condition from a big menu of options. You might want to only load a field if the card has a certain mark in column 80, for instance. This is how you would filter out specific cards from our invoice deck over at Widgetco's accounting department.
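One way to picture this is as a table of rules that gets checked once per incoming card. The Python sketch below is only an analogy, not Fargo's actual card format: the DetailRule fields, the "load" and "add" operation names, and the accumulator names TOT and QTY are all hypothetical. The one detail taken from the discussion above is the test on a mark in column 80.

```python
from dataclasses import dataclass


@dataclass
class DetailRule:
    test_column: int   # column to test on the incoming card (1-indexed)
    test_value: str    # punch that must be there for the rule to fire
    first: int         # first column of the field to extract
    last: int          # last column of the field to extract
    op: str            # "load" replaces the accumulator, "add" sums into it
    target: str        # name of the accumulator to update


def run_details(rules, deck, accumulators):
    """Apply every matching rule to every card, in card order."""
    for card in deck:                      # the arrival of a card drives the cycle
        for rule in rules:
            if card[rule.test_column - 1] != rule.test_value:
                continue                   # test failed, this rule is skipped
            value = int(card[rule.first - 1:rule.last])
            if rule.op == "load":
                accumulators[rule.target] = value
            elif rule.op == "add":
                accumulators[rule.target] += value
            else:
                raise ValueError(f"unknown operation {rule.op!r}")
    return accumulators


rules = [
    DetailRule(80, "1", 1, 6, "add", "TOT"),   # sum columns 1-6 of type 1 cards
    DetailRule(80, "2", 1, 4, "load", "QTY"),  # load columns 1-4 of type 2 cards
]
deck = [
    "000250".ljust(79) + "1",
    "000100".ljust(79) + "1",
    "0007".ljust(79) + "2",
    "999999".ljust(79) + "9",   # no rule matches this card type
]

print(run_details(rules, deck, {"TOT": 0, "QTY": 0}))
```

The thing to notice is that there's no control flow written by the user. The rules are just data, and the only "loop" is the stream of cards itself.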

Starting point is 00:31:25 But perhaps my favorite field in the Detail Control Card specification is held in columns 7-13. How could you forget that field? I mean, this is a joke, but really I think that speaks to just how... alien this is. Now, this field, columns 7-13, is a special space where you can put a machine code instruction. That lets you embed machine code, raw and unadorned hexadecimal, in your Fargo forms. The manual mentions you could use this to call a subroutine outside the program. So in theory, you can link a Fargo program with air quotes to a real program. That would

Starting point is 00:32:15 result in some chunk of code getting executed every time the right kind of card is read. In that way, you could make Fargo Turing complete, but you're using an external tool, a pre-compiled program that's just crammed into Fargo. The final piece to Fargo is how it outputs reports. This is controlled by the printer spacing chart. This is used to set up the layout of your final report. You use special cards to show where you want certain data printed. Those spaces are then filled in either by Detail Control cards or Total Control cards. Details aside, this is essentially a form of typesetting. You're telling the computer how you want your text laid out.

Starting point is 00:33:00 It's primitive, but it's the same basic idea as LaTeX or HTML. The spacing charts used by Fargo are actually the same charts used on an IBM 407 tabulator. This is specifically called out in the manual. It says if you have an existing chart, it can be reused. Thus, you can make reports for the 1401 that look identical to your old 407 reports. Again, Fargo is very explicitly a transitionary technology here. You can even see this transitionary feel in Fargo's physical manual. In preparing this episode, I've been reading the manual for the IBM 407 tabulator and for

Starting point is 00:33:46 Fargo. Flipping back and forth, it's clear to see the similarities. One that stands out is how these manuals display their programs, so to speak. The 407 is, of course, configured by a plugboard. When the manual describes a certain configuration, it shows a diagram of the plugboard with these big bold arrows showing where patch cables need to be strung, with little circles to indicate emphasis and notes. Fargo's manual shows diagrams of forms when it's describing programs. As we've outlined, forms can be interconnected. A constant is defined in one form and then used on a few others. Those connections are shown with the same big bold arrows and little circles with notes as we saw in the IBM 407 manual.

Starting point is 00:34:37 So even stylistically, there's this very deep connection. Technically, it is possible to create a full normal program with Fargo. And I know I've been kind of hammering home that this isn't a real programming language. But the trick here is linking. At least kind of. I don't know, this gets really weird. So remember how I mentioned there are fields where you can insert machine code, and then you can use that to execute a pre-compiled binary? Well, Fargo also has a way to link those existing binaries, compiled programs, to your Fargo program.

Starting point is 00:35:19 At least, kind of. This is where we get into how Fargo is executed in practice, and this makes the idea of linking a little strange. So the reason I'm using the word linking here is because normally when you have a compiled programming language, you can take another compiled program and with a special little tool, shove them together, you can link them. But Fargo isn't compiled. It's also not executed exactly like any language I've ever heard of.

Starting point is 00:35:57 It's interpreted and it's done so in phases. Your Fargo cards are broken up into four possible phases. Those have to be carefully constructed with a few other decks of cards. Luckily, the Fargo Manual gives us a nice diagram to go off of. Unluckily, this is where we get some linguistic confusion. Fargo is the name of the reporting system, your code, as well as the name of the program that generates reports. Fargo, the program, also comes in four chunks.

Starting point is 00:36:30 Those are Fargo Phases 1, 2, 3, and 4. When you have your Fargo program prepared, the control cards you punch from forms, you have to interleave it with the proper phase of the Fargo program deck. You end up with a pile that starts with Fargo phase one program, then the cards for the phase one of your report, then Fargo program phase two, the cards for that phase, and so on until you reach phase four. Afterwards, if you configure things correctly, you have the binary program, the external code you want to link. After that are your detail cards, the raw data you process. The stack is then fed into the 1401's card reader.
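That stacking order can be sketched in a few lines of Python. This is just a model of the deck order described above, with each card stood in for by a string label; the function name and the labels are made up, and an empty list stands for a phase or a linked binary that a given report doesn't use.

```python
def assemble_deck(fargo_phases, report_phases, binary_deck, detail_cards):
    """Interleave Fargo's own phase decks with the report's control cards.

    fargo_phases:  four lists of cards, Fargo program phases 1 to 4
    report_phases: four lists of control cards punched from the forms
    binary_deck:   optional external program to link (may be empty)
    detail_cards:  the raw data to process
    """
    if len(fargo_phases) != 4 or len(report_phases) != 4:
        raise ValueError("expected exactly four phases of each kind")
    deck = []
    for program_cards, control_cards in zip(fargo_phases, report_phases):
        deck.extend(program_cards)   # Fargo program, phase n
        deck.extend(control_cards)   # your control cards for phase n
    deck.extend(binary_deck)         # linked binary, if any
    deck.extend(detail_cards)        # data goes last
    return deck


fargo = [[f"FARGO-P{n}"] for n in range(1, 5)]
report = [["C04BAR1024"], [], ["DETAIL-CTL"], ["TOTAL-CTL"]]
stack = assemble_deck(fargo, report, ["BINARY"], ["DATA-1", "DATA-2"])
print(stack)
```

Every run of the report means feeding this whole stack through the reader again, interpreter and all.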

Starting point is 00:37:16 A few things to point out here. One, there's no compilation step. Your control cards, the Fargo code that you write and punch, are interpreted piecemeal by the different phases of Fargo. The only output is your final report. This could be described as something like an interpretive process, except with the added complication of this weird link phase. But that leads to another issue. Fargo isn't very efficient.

Starting point is 00:37:48 That's kind of the bugaboo with interpreter systems in general. Every time you run the program, you have to pass it through the interpreter, Fargo. That's not the fastest process in the world. It means that if you're running the same report over and over, you will always have some wasted overhead. Fargo was probably being used for something like payroll or inventory or taxes.

Starting point is 00:38:13 That kind of program isn't going to change very often, if ever. So that overhead is always sitting there wasting computer cycles. On a modern computer, you can get away with that. It doesn't really matter. We tend to have cycles to spare. But on an older machine, that wasn't the case. Every second counted. Computer time was a scarce and very valuable resource.

Starting point is 00:38:39 So these interpretive systems represented a very tangible issue. With the foundation laid, the next step is for us to approach RPG itself. That, however, is fraught with problems. For one, we don't have much historical detail on RPG. We don't know who exactly worked on the language even. This is actually frustrated by Wikipedia. If you read the article on RPG, it claims that a programmer named Wilf Hay created the language. That in turn cites an older web encyclopedia which has no citations. The scant articles that talk about the early days of RPG all cite either Wikipedia or that unbacked source. So you will often

Starting point is 00:39:32 see Wilf Hay credited as the creator of RPG. I have not been able to find any proof that backs this up. He was a journalist who wrote for PC Plus magazine in the 1990s. I found nothing that connects Hay to IBM, besides a few articles he wrote about the PC in the 90s. I have no earthly idea why he is claimed to be the mind behind RPG. As for other candidates, well, I have a list. There's this lovely Web 1.0 site, Van Snyder's homepage, that has a big list of folk who worked on and around the IBM 1401. It includes annotations and contact details. It seems that in the mid-2000s, Van Snyder was tracking down info on the 1401.

Starting point is 00:40:25 That site lists 10 people who may have worked on RPG. I do have phone numbers and I've tried to get in touch with some of them, but I haven't had any luck. The best I can do is prove that most of them worked at IBM roughly in the period. What I've been able to find is that RPG was co-created by Bernie Silkovitz and Barbara Wood. To quote from IBM's Early Computers, from MIT Press, "In 1959, two members of the applied programming department of the general products division were assigned the task of providing a software method of easing the transition from the wiring of control panels

Starting point is 00:41:06 to the preparation of programs for the 1401." The book continues, "They responded by creating a language whereby the user could specify features in any report and a generator that could convert those specifications to a 1401 program capable of producing the report. The name RPG was applied to both the language and the generator program." The project that becomes RPG starts in 1959 and it's programmed by these two. However, we have almost no information about these programmers. I found a newspaper article that confirms Silkovitz worked at IBM and also he was married and liked to play tennis. We know even less about Barbara Wood. The little that I have found points to the same origin

Starting point is 00:41:56 as Fargo. IBM was worried that customers would have issues making the leap from tabulators to computers, so they introduced a transitionary tool. But here's a fun wrinkle we have to consider. Did Fargo inspire RPG? I've seen it claimed in a number of places. I've even read one article that calls Fargo RPG version 0. However, again, we don't have any real evidence to go off of. We don't even have a proposed author for Fargo. The timeline is also kind of strained. I've seen some sources that claim Fargo came out in 1959. Others claim it wasn't around until 61. The same goes for RPG. So the matter of influence is a little bit hazy at best. Whatever the case, we can easily and definitely say that RPG is its own language.

Starting point is 00:42:53 However, there are a lot of similarities to Fargo. The most evident is the fixation on forms. RPG is written by filling out a series of forms called specifications and then transferring those 80-column forms onto 80-column punch cards. RPG even uses four different forms, one for input, one for data, one for calculations, and one for output formats. The differences, however, are much more interesting. First of all, RPG is compiled. That, right away, gives it a huge advantage. Once you write an RPG program, you pass it through the RPG compiler, and that spits out a binary that can be used over and over again.

Starting point is 00:43:35 That worked really well for reporting work. Compilers give you a huge advantage when you're dealing with programs that are used many times without modification. Like I mentioned earlier, you're unlikely to change your payroll program very often. So a compiled report generator makes a lot of sense. That just saves computer time. This is also a very special purpose compiler. It may be more accurate to say

Starting point is 00:44:00 that you're configuring a binary, if that makes any sense. You aren't writing this general purpose program. Rather, you describe inputs, how to transform them, and then how to generate outputs. That's then used to create your binary, your final program. You can't write a game in RPG, for instance, because it's only ever able to output a report generator. If we take a step back a little bit, this is actually very similar to one of the first, well, approaches to a practical compiler. In the very early 1950s, while working at Univac,

Starting point is 00:44:38 Betty Holberton developed a wild sorting program. The user would describe their data and describe how they wanted to sort it. Then the program would spit out a sorting program customized for that data. It literally created a new program for the user. This technique, called generative programming, is what inspired Grace Hopper to write her first compilers. RPG sounds a lot like these early generative programs. It takes a set of specifications and then outputs a fancy customized program

Starting point is 00:45:12 to meet those specs. In that way, we are very far outside the realm of the normal for a programming language. And again, this puts RPG in a strange regressive position. So I guess we should run up an outline of RPG, right? There are a pile of different versions, not to mention later iterations. I'm going to be using a 1964 manual for disc RPG. This will give us a good mix of early design as well as the more fancy codified features, I think.
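
Before getting into the manual, here's a tiny sketch of that generative idea in Python: a program that takes a description of the data and writes out a new, specialized sorting program. To be clear, this is a modern toy and not a reconstruction of the early Univac sort generator. The function name, the record layout, and the use of `exec` to stand in for "running the generated program" are all my own assumptions.

```python
def generate_sorter(name, key_start, key_length, descending=False):
    """Return Python source for a sort routine specialized to one record layout.

    Hypothetical generator: the key is a fixed-width slice of each record.
    """
    key_end = key_start + key_length
    return (
        f"def {name}(records):\n"
        f"    # key occupies columns {key_start}..{key_end - 1} (0-based)\n"
        f"    return sorted(records, key=lambda r: r[{key_start}:{key_end}], "
        f"reverse={descending})\n"
    )


# Step 1: describe the data, get a brand new program back as text.
source = generate_sorter("sort_by_account", key_start=0, key_length=4)

# Step 2: "load" the generated program and run it as often as you like.
namespace = {}
exec(source, namespace)
sort_by_account = namespace["sort_by_account"]

records = ["0042 WIDGETS", "0007 GADGETS", "0100 SPROCKETS"]
print(source)
print(sort_by_account(records))
```

The generator itself never sorts anything. It only emits a program that does, which is the same split between RPG the generator and the report program it produces.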

Starting point is 00:45:48 To start with, RPG is able to handle different types of media. Early versions were, like Fargo, restricted to cardstock, but as the language matured, it added support for tape drives and, eventually, hard drive files. Disc RPG supported the IBM RAMAC disc drive, this big beast of a spinning platter machine. You can actually see a running one down at the Computer History Museum. This support worked both ways. RPG could read from card, tape, or disc, as well as writing to the same. That, plus its traditional hard copy printouts. This means that RPG could function as one cog in a larger project. You

Starting point is 00:46:34 could use an RPG program to read a file, do some reporting, and spit out a summarized report. Then you could have another program that would read from that report. That second phase program could even be another RPG program. It's, at least to my mind, similar to the approach used by a lot of Unix tools, where you're able to take an input and create an output and then pass that on to another program as a new input. Okay, back to RPG itself. The first step of making an RPG program is the input specification. That's form X24-6590. It tells RPG how to break up incoming data. In modern parlance, we'd call these parsing rules.

Starting point is 00:47:21 This is needed because all data sources to the IBM 1401 are technically serial. The computer just gets a long string of numbers that mean, on their own, nothing. The machine has to know how to chop up those numbers into meaningful information. That's described by the input specification form. This also introduces a new concept, the conditional, or rather, it elevates the concept. Each input form statement consists of two very poorly named parts, the record codes and the control fields. Of course, these are all in caps, and yes, old mainframe terminology is strange to read. You get those two fields plus a name for the incoming data. Record codes are conditionals, plain and simple.

Starting point is 00:48:16 That said, these are very basic. You can specify up to six of these conditions per card and you can do more if you use fancy coding to use multiple lines. These record codes check if a column of an incoming card matches a certain character. It's simple filtering. Multiple conditions function as a logical AND, so you could build up a check for multiple columns at once. You could even say, check for a card whose first columns read, hello world. The twist here is that you can save the results of those checks as a variable. That variable, actually called a condition by RPG, is saved in memory and can be referenced later

Starting point is 00:49:00 in specific places. If all checks pass for a card, then RPG moves on to the control fields. These specify how to extract data. You give RPG information about the location and size of a field, and if the field is broken up into multiple chunks. So, altogether, this is a filter and extract operation. At the end of the line, fields are either ignored or placed into a named variable. These record names are a little weird. In general, RPG's data conventions are weird. You name records, right? So those must be synonymous with variable names. Well, no, no they are not. Those are names only for incoming data. You also name conditionals, right? So those must also be variables. Again, no. These are very special

Starting point is 00:49:56 purpose conditional names that can only be used in very specific places. For true variables, you have to use the data specification sheet, form X24 stroke... you get the idea. In that form you declare and initialize variables. Variables are either numeric or not numeric. Really? That encompasses everything in the world! Jokes aside, that's the actual typing system. I don't know what to tell you. It's a little odd, but workable. A neat part of the data spec sheet is that you can compose a variable from multiple input fields. Let's say you have a set of fields for day, month, and year. You can declare variables that are filled with each of those fields individually.
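
Stepping back to the input specification for a second, that filter and extract behavior is easy to sketch in Python. Treat this as a loose model under my own assumptions: the class names, the 1-based column numbering, and the sample card layout are invented for illustration and are not taken from the RPG manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordCode:
    """One check: a given card column must hold a given character."""
    column: int      # 1-based card column (assumed convention)
    character: str


@dataclass(frozen=True)
class ControlField:
    """Where to find one named field on the card."""
    name: str
    start: int       # 1-based first column (assumed convention)
    length: int


def parse_card(card, record_codes, control_fields):
    """Return the extracted fields if every record code matches, else None."""
    card = card.ljust(80)  # a punch card always has 80 columns
    # Multiple record codes act as a logical AND.
    if not all(card[rc.column - 1] == rc.character for rc in record_codes):
        return None
    return {
        f.name: card[f.start - 1 : f.start - 1 + f.length]
        for f in control_fields
    }


# Hypothetical layout: "SO" in columns 1-2 marks a sales order.
sales_codes = [RecordCode(1, "S"), RecordCode(2, "O")]
sales_fields = [ControlField("ITEM", 3, 6), ControlField("PRICE", 9, 5)]

print(parse_card("SOWIDGET00125", sales_codes, sales_fields))
print(parse_card("PYSMITH 01000", sales_codes, sales_fields))
```

The first card passes both checks and gets chopped into fields. The second fails the filter, so nothing is extracted from it at all.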

Starting point is 00:50:48 You can also declare a variable that's all three of those fields combined. That's something you could do with a normal programming language, but it would take special code. RPG just supports this operation as a matter of course. I think that's really interesting. But that's not where it ends. An RPG form is, plain and simple, not simple. The data spec sheet can do a lot more than just move around numbers. You can also perform certain mathematical operations. There's an operation column that specifies how to apply the incoming field to the variable. You could choose to just move the variable around or you could choose to add it. That would keep

Starting point is 00:51:31 a running total. It'd be useful for, say, totaling the income from a pile of invoices. You can also choose subtract, multiply, and a few other mathematical operations. Data spec entries could also be controlled via conditionals. So in effect, each line is similar to an if statement followed by an operation. If this card is for sales orders, then add its price to income. This is exposed because, again, it's how tabulators worked. On a 407, incoming data would be moved to a waiting accumulator following whatever path you choose via your patch cables. That could include a mathematical operation.
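
Here's a hedged sketch of how a couple of data spec lines might behave, written in Python. Each line is a condition, an operation, some source fields, and a target variable. The dictionary layout, the `MOVE` and `ADD` operation names, and the `SALES` condition are all stand-ins I've made up. The real form uses fixed columns, not anything like this.

```python
def apply_data_spec(spec, variables, fields, conditions):
    """Apply each data-spec line whose condition is on for the current card.

    Hypothetical model: `conditions` is the set of condition names that the
    input spec switched on for this card.
    """
    for line in spec:
        if line["when"] not in conditions:
            continue  # the conditional inhibits this line
        # Several input fields can be composed into a single value.
        value = "".join(fields[name] for name in line["from"])
        if line["op"] == "MOVE":
            variables[line["to"]] = value
        elif line["op"] == "ADD":
            variables[line["to"]] = variables.get(line["to"], 0) + int(value)
        else:
            raise ValueError(f"unsupported operation: {line['op']}")
    return variables


spec = [
    # Compose day, month, and year into one variable.
    {"when": "SALES", "op": "MOVE", "from": ["DAY", "MONTH", "YEAR"], "to": "DATE"},
    # "If this card is for sales orders, then add its price to income."
    {"when": "SALES", "op": "ADD", "from": ["PRICE"], "to": "INCOME"},
]

variables = {}
cards = [
    ({"SALES"}, {"DAY": "03", "MONTH": "11", "YEAR": "64", "PRICE": "00125"}),
    (set(), {"DAY": "04", "MONTH": "11", "YEAR": "64", "PRICE": "99999"}),
    ({"SALES"}, {"DAY": "05", "MONTH": "11", "YEAR": "64", "PRICE": "00075"}),
]
for conditions, fields in cards:
    apply_data_spec(spec, variables, fields, conditions)

print(variables)
```

The middle card isn't a sales order, so its price never touches the running total.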

Starting point is 00:52:14 This setup seems strange for a language, but it was just how tabulators worked. It was how tabulator operators thought. Okay, so the next sheet for us to consider is the calculation specification sheet. Now, like Fargo's total sheets, this is where you do, well, calculations. That part isn't too interesting, and it's not too different from Fargo.

Starting point is 00:52:39 It's with the calc spec sheet that something clicked for me, though. Maybe you've already spotted it since I'm not always the fastest on the uptake. RPG almost has scope. Let me explain because I think this is neat as all get out. So scope. If you know, you know. Basically, scope is how a language decides which parts of your code can see which other parts of your code. Scope determines what data is accessible at any point during execution.

Starting point is 00:53:12 You can have two functions that both have some variable named i tucked inside of them, and your program won't get them confused. That's because of scope. For most languages, a function is treated as its own isolated scope. There are better examples, but they all have to do with mathematical proofs, so we will ignore them. Scope is a core concept to all remotely modern languages.
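
If that's a little abstract, here's the two-functions case as a minimal Python sketch. The function names are just placeholders. Both functions use a variable called `i`, and one calls the other mid-loop without anything getting clobbered.

```python
def inner_total(n):
    """Sum the integers below n, using a local counter named i."""
    i = 0
    total = 0
    while i < n:
        total += i
        i += 1
    return total


def outer(n):
    """Call inner_total repeatedly while using its own, separate i."""
    i = 0
    results = []
    while i < n:
        # inner_total has its own i, so this call doesn't disturb ours.
        results.append(inner_total(i))
        i += 1
    return results


print(outer(4))  # [0, 0, 1, 3]
```

In a language with no scope, both loops would be fighting over a single `i`, and `outer` would never behave.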

Starting point is 00:53:37 It's also a very subtle concept. I still make scoping mistakes as a consummate professional. Many early languages and many simple languages do not have a concept of scope. That means that all data is accessible from all parts of the program. RPG, despite its antiquity, well it's built different. The calculation spec sheet only has access to variables from the data spec sheet or from the calculation spec sheet itself. You cannot access input fields. That right there is scope. At least it's very similar to scope. That's not exactly something I expected to see

Starting point is 00:54:19 in a language from this period. Scope first shows up in Algol 60. In fact, you can read some beautiful nerd fights over scope in the Algol Bulletin from this period. But RPG is developed at the same time as Algol 60. While folk are arguing over scope, Silkovitz and Wood are hammering out spec sheets. Is this a case of independent invention? That's the weird thing, right? I'm not sure if this scope-like behavior in RPG

Starting point is 00:54:52 was intentional. Again, let's think in terms of tabulators. The variables described by the data sheet are equivalent to a tabulator's accumulators. Those are the physical devices that hold numbers, and it's where operations are actually carried out on a tabulator. Incoming fields from punch cards are exposed as read-only data. That's because, well, it's physically read-only. You can't change the value of a column on a card once it's been punched. So when you operate on data, besides that first read, that first grab, you operate on accumulators.

Starting point is 00:55:30 You operate on variables described by the data spec sheet. If we take that view of things, then scope is the only logical answer to a problem. You can't talk about input fields in the calc sheet because those are read-only, and the calculation sheet is read-write. You have to carry that data into an accumulator, a variable, before you work with it. So scope is less a theoretical concept and more a harsh reality of punch card living. Thus I'd argue that this isn't really a case of independent invention. Rather, I think it speaks to scope being something deeper than pure theory.
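
One way to picture that restriction is with a small Python model. This is my own analogy, not how the RPG processor was built: the `CalcSheet` class and the variable names are hypothetical. Input fields are a read-only mapping, like a punched card, and calculations only get to see the accumulators.

```python
from types import MappingProxyType


class CalcSheet:
    """Calculations may only touch data-spec variables (the 'accumulators')."""

    def __init__(self, variables):
        self._variables = variables

    def get(self, name):
        if name not in self._variables:
            raise NameError(f"{name} is not a data-spec variable")
        return self._variables[name]

    def set(self, name, value):
        self._variables[name] = value


# A punched card can't be changed, so model its fields as read-only.
input_fields = MappingProxyType({"PRICE": 125})

# The data spec is the only place a field gets carried into a variable.
variables = {"INCOME": 0}
variables["INCOME"] += input_fields["PRICE"]

calc = CalcSheet(variables)
calc.set("TAX", calc.get("INCOME") // 10)
print(variables)

try:
    calc.get("PRICE")          # input fields are invisible to calculations
except NameError as err:
    print("blocked:", err)

try:
    input_fields["PRICE"] = 0  # and the card itself can't be rewritten
except TypeError as err:
    print("read-only:", err)
```

Nothing here was designed as "scope". It just falls out of which data is writable and where it lives.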

Starting point is 00:56:10 Here we have a concept very, very similar to modern ideas about scope. It falls out of a stack of cardstock and some weird machines built by IBM. Data gets isolated and restricted inside the physical tabulator for practical reasons. When tabulator technology is transferred over to a computer, that isolation is retained, it's emulated. That acts very similar to scope in Algol, but it's not created for the same reason. At least, it doesn't appear to have been created for the same reason. The final sheet is the format specification. Again, it's really similar to Fargo. It describes where to output

Starting point is 00:56:49 which variables and how to format them. Everything's tagged with conditionals. And that's all of RPG. Now, from what I've described, I want you to ask yourself this. Is RPG Turing complete? Is it an actual 100% general purpose programming language? The answer is technically yes, but it's a painful yes. I'm going to leave this as an exercise for the listener. The trick has to do with the fact that RPG commands will loop. Calculations will run each time a card is read, and you can use conditional fields to inhibit execution of an order. From that, you can prove Turing completeness.
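
I won't spoil the exercise, but the two ingredients are easy to sketch in Python: a loop the programmer never writes, and conditions that can switch individual lines off. This is a toy model under my own assumptions. The function name, the lambda-based calculation lines, and the idea of using integers as cards are all invented, and it doesn't prove anything about Turing completeness on its own.

```python
def run_report(cards, calculations, state):
    """Run every calculation line once per card, unless its condition is off."""
    for card in cards:                  # the loop the programmer never writes
        for condition, operation in calculations:
            if condition(card, state):  # a false condition inhibits the line
                operation(card, state)
    return state


# Hypothetical calculation lines: (condition, operation) pairs.
calculations = [
    # Only positive cards are added to the total.
    (lambda card, s: card > 0, lambda card, s: s.update(total=s["total"] + card)),
    # Once the total reaches 10, flip a flag that later lines could test.
    (lambda card, s: s["total"] >= 10, lambda card, s: s.update(over_ten=True)),
]

state = run_report([4, -1, 5, 3], calculations, {"total": 0, "over_ten": False})
print(state)
```

State carries over from one card to the next, and a condition set on one pass can change what runs on a later one. That's the raw material the argument is built from.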

Starting point is 00:57:34 Kind of. Bear in mind that RPG changes a lot over the years. RPG II comes out in 1969, and that has full control structures. It's completely Turing complete. What we've been looking at is just the original. It's just RPG as described for the 1401. It may sound underwhelming, but as it turns out, it really wasn't. Allow me to turn to an 11th hour source that I ran into. One of my favorite magazines is Datamation. It's a computer trade rag that started in the 1950s, so articles go way back. You can

Starting point is 00:58:12 usually find a Datamation article that covers just about any commercial hardware or software. That includes RPG. In 1967, one Lou M. Friedberg published an article in the magazine titled, RPG, the coming of age. It's his testimonial as an early adopter of RPG. Lou's story starts following what I outlined to a T. He worked for a company that switched from a 407 tabulator to a 1401 mainframe because computers just have to be better, right? They ran into an issue almost immediately. Quote, "Now, however, our report requirements

Starting point is 00:58:54 could be satisfied only by a programmer with a capital P, and they were in extremely short supply. We needed help and fast." End quote. They tried Fargo, but it wasn't powerful enough, it was too slow. They tried COBOL, but it wasn't quite ready for general use yet. So Lou turned to RPG. Crucially, he was not a programmer. He was in for a world of hurt. What's neat to hear is that early versions of RPG's processor

Starting point is 00:59:26 just didn't work. There were bugs, glitches, and all sorts of issues. But by late 1962, IBM had sorted things out, and Friedberg was able to compile an RPG program. In his words, he was sold. Then we reach something that I never even considered. Quoting again, "The first real test came immediately with a three program system for a large chemicals manufacturer. The programming department estimated it would require three weeks of billable effort or approximately $1,500 to complete the package. Because of computer scheduling, this represented six weeks of elapsed time. Furthermore, no programmer was available for at least two weeks. We decided

Starting point is 01:00:11 to gamble on RPG." This, to my mind, is fascinating. Let me explain why this gets me. Tabulators were very much in-house doohickeys. You wire up a plugboard to do your company's accounting and payroll and all that internal mess. You don't go sell that plugboard. But you can sell software. An RPG program, once compiled, is just like any other. So why not use RPG for normal pay-for software, or in this case, for contracting?

Starting point is 01:00:52 This means non-programmers could write and sell software. That's wild! That's a totally new angle! Friedberg would teach 12 other non-programmers how to use RPG, and like that, we're now onto the promise of a lot of later high-level languages. You'll spend less time programming and debugging so you can work on more complex software, but in this case, dedicated professional programmers could spend less time programming trivial reports because of RPG.

Starting point is 01:01:25 Actual practitioners never had to touch RPG to gain that benefit. Rather, the uninitiated were doing that all on their own. Thus, the programmers were freed. They had more time. That's wild! I mean, that's just wild! A lot of languages promise this sort of thing, that they'll be so easy to use and so practical that you'll save wild amounts of time, that it gets new

Starting point is 01:01:50 people programming. Very few deliver. But RPG somehow hits the mark. It's a weird language. It's this transitionary thing between archaic plugboards and modern computers. But that's just what some people needed in this period. Alright, that does it for the tale of RPG, or at least what we've been able to dig up. The language is still in use today on both IBM and non-IBM hardware. It's changed considerably in its long lifetime. RPG IV, the current version, is so different that old-school RPG may as well be another language. But what about original RPG?

Starting point is 01:02:39 It's clear to see that RPG is special, as is Fargo and all the other report generators of this period. These aren't general purpose languages, rather they're very specialized. This is something that we should expect to run into. As Rear Admiral Grace Hopper posited, there should be special purpose languages for all kinds of special purpose applications. That said, I still find it a little startling when I run into them. RPG is a very clear and extreme example of this concept. It's so specialized that normal language

Starting point is 01:03:15 features like syntax just don't exist. It's so rigid that each RPG program is basically just a variation of each other. Read, calculate, format, print. Read, calculate, format, print. But even inside that regime, we can still make out the shape of a language. If anything, I think that's what we can learn from RPG. If we want a specialized language for a specialized task, then we should really lean into it. Don't just make a version of C that's good at strings, or a new Python distribution bundled with special math libraries. Specialization can go as far as coming in the form of, well, a radically different form. Thanks for listening to Advent of Computing.

Starting point is 01:04:01 I'll be back in two weeks. You can find links to all my stuff at adventofcomputing.com. And as always, have a great rest of your day.
