Advent of Computing - Episode 179 - Programming Block by Block

Starting point is 00:00:00 Can I admit something to you? Something embarrassing. It took me way too long to understand exactly what a mathematical model was. I know that may not sound all that embarrassing. The context you're missing is I went to college to study physics and I worked on a pile of computational projects. I wrote simulations from scratch. I maintained simulation code.

Starting point is 00:00:29 At one point, I was paid to do that, and I even interpreted and visualized the results of simulations. I didn't really figure out what folk meant by mathematical model until I was near graduation. I think what was happening is I'd hear a professor say model, and I just figured it was some detail that didn't really matter. That doesn't affect me, right? Well, would it surprise you to know that a simulation? uses a model. If that is shocking, don't worry, I can keep a secret. I won't embarrass you in front of your math friends.

Starting point is 00:01:10 So here's the skinny. A model is just a way to represent something. This is often done mathematically, hence a math model, but it can also be done as a simple list of steps and rules. You can model how a pendulum swings using a simple equation. But models don't have to be equations. You could also model a game of chess or checkers. For that, you'd write code that describes what moves are legal

Starting point is 00:01:37 and how the game transitions from one turn to another. A simulation then uses one or more models and runs them through their paces. One simulation I wrote back in college was all about heating in a galaxy's gas disk. The simulation had a model for how particles were attracted by gravity, how they were slowed down by drag and how friction heated them up. Those were all equations. The simulation strung all those equations together to give me an answer. It turns out that that kind of model is a little too complicated as one massive equation,

Starting point is 00:02:14 but it will work pretty well if you break it down into a couple different models and string them together. Just take it step by step. In this case, I was trying to figure out how hot gas could get, and how hot the dust around it would get under specific circumstances. One of the big pain points on that project was actually programming the thing. I wrote that whole simulation, models and all, in Fortran. It crunched all the numbers, took all the steps, and followed all my crazy, wacky rules.

Starting point is 00:02:47 But when it came time to analyze the data, I had to use another tool. I ended up writing a few Python programs to make sense of all the numbers, Fortran spit out. I also had a Python program that would work up data to feed into Fortran. It ended up being a bit of a mess. What I didn't know back then, well, besides what the word model actually meant, was that there was a better tool for simulation. Fortran, it turns out, is so last year. Welcome back to advent of computing. I'm your host, Sean Hass, and this is episode 170 programming block by block. Today, we're talking about a language that I've never heard of before,

Starting point is 00:03:41 hence, it's a language I'm excited about. The tongue is GPSS, the general purpose simulation language. But before we get into the episode, I have a plug for another podcast. You may have heard already that advent of computing now has an official after show. It is called Adjunct of Computing. The show is co-hosted by myself and Joe, a former listener and now collaborator. The format is a lot more loose than the main show. It's a Q&A-style discussion and just general ramble about last episode's topic.

Starting point is 00:04:22 Episode two just came out where we talked about the Programma 101 and dig a little bit more into the programming model and just what exactly the thing is. If you like Adron of Computing but have ever thought, gee, Sean, I'd like to ask you a few questions, then you'll really like adjunct. I'll have links to that in the show's description, and I recommend you go give it a try. Anyway, on to the topic at hand. Let me lay out the scene. Late one night, I cracked open my worn copy of Proceedings of the ACM's History of Programming Languages, Volume 1. I was cruising for something interesting, something new, and I came across a paper I hadn't seen before. That was Jeffrey Gordon's The Development of GPSS.

Starting point is 00:05:11 I was immediately interested because, well, like I said, it's totally new to me. GPSS is a simulation language, which is something I know basically nothing about. I know that simulation languages exist and that's the end of my knowledge. The final nail in the nerd coffin is that GPSS is structured in a very unique way. Its programs are a series of blocks. Each block operates on a stream of transactions. That's just about as different as a programming language can be. And what can I say?

Starting point is 00:05:50 I think Dan Timkin has influenced me to be on the lookout for weird things that don't feel like normal programming languages. GPSS is designed to be very special purpose. It's for simulating systems. It's for modeling how things flow through a system, like traffic through a grid of streets, or long hairs through a barbershop, or calls through a switchboard. I personally subscribe to Grace Hopper's view of programming that specialized languages are needed for use in specialized tasks. Anytime I run into a special purpose language, I want to know more. What's its task? How is it adapted to the niche? And, most important of all, is it really worth having a special language for this task? While I do hold with Hopper, I still believe it's important to try and have evidence to back up your views. If you can't

Starting point is 00:06:45 contest a viewpoint you have and make it out the other side convinced, then maybe you need to change your viewpoint. Will GPSS be another piece of evidence? in my growing toolkit, to find out we have to trace its roots. Luckily, and this is another reason I'm excited to produce this episode, Gordon gives us all the tools we need. Today, my main source is going to be his history of programming languages paper. This paper traces the language's origins very well. It also cites other papers about GPSS's use in real-world projects. The last part is something that I rarely get. Suffice to say, I'm excited. Will I be singing this same tune by the end of the

Starting point is 00:07:32 episode? There's only one way to find out. Our main man today is the aforementioned Jeffrey Gordon. He eventually develops the first version of GPSS. But the lead-up to that takes a while. Gordon initially cut his teeth working with, of all things, electronic analog computers. This is way back in the early 50s from history of programming languages. Quote, No one who has worked with analog simulation can fail to have been impressed by the way an analog computer gets its user involved with the problem solving. Putting together the elements of an analog computer to study a system feels almost like

Starting point is 00:08:19 building the system itself, and it is extremely gratifying to get results that can be immediately related to elements of the system, end quote. this sentiment is super consistent with what I've read about analog computing. People that use electronic analog systems for simulation really liked them and really liked how hands-on these machines were. Now, Gordon makes the jump into digital by the latter part of the decade, but this experience with analog sticks with them. At least, it does something to the guy.

Starting point is 00:08:55 Analog computers were a huge factor in the early digital era. They don't exist anymore, at least not at any scale, so it's easy to forget they ever existed. But there must have been many like Gordon who jumped from analog into digital, but still remembered how gratifying earlier analog systems were. One of the huge downgrades with digital was the fact that users were more abstract, from their problems. The act of programming is, well, it's much less directly involved than the act of configuring an analog computer. If you want to simulate a circuit on an electronic analog machine,

Starting point is 00:09:39 you build up that circuit. If you want to simulate a pendulum swinging on a spring, you need to make a circuit that acts like a pendulum on a spring. There's a very physical and visceral connection between the machine, its user, and the problem you're working out. You'll actually have little components for simulating the pendulum, the spring, the force of gravity, the mass of the pendulum, and so on. Digital computers just flat out do not work this way. And that, dear listener, leads to a winding series of issues. Some of the first applications of digital computers are simulation.

Starting point is 00:10:23 That is, using a digital computer to describe and analyze some physical phenomenon, to simulate some phenomenon. Let's take ballistics as one of the earliest examples. That's what INIAC was designed to do. A projectile will follow a very well-defined arc. That arc is dictated by things like the projectiles, shape, speed, weight, starting velocity, and initial angle. If you want to figure out where the projectile will land, you need to run a simulation. It's pretty simple, it ends up being a single equation, but it's still a simulation. On an analog machine, you had modeled this directly. You connect

Starting point is 00:11:06 components that model the force of gravity pulling down the projectile, the force of drag slowing it down, and the initial oomph put into blasting it off. You get a neat little system that directly represents the model. You can actually point to this box, which simulates gravity in the system. The calculation for an analog circuit is more or less instant. It happens as soon as electrons flow through that circuit. On a digital computer, you have something very different.

Starting point is 00:11:38 You start with your equation. You then break that down into smaller operations the machine can understand. Then you have to convert all your starting conditions into values that can be expressed as binary. That will often include approximations. You finally run the program, and then you confer at its outputs from binary into something a human can read.

Starting point is 00:12:00 The digital process is much more abstract, more removed from the reality of a simulation. You can't point to a little box in your computer and say, that's where gravity's modeled. You'd have to point to a couple of lines of code, maybe a function or something. This abstraction is necessary because of how digital computers work. They're very flexible and very general.

Starting point is 00:12:25 In this case, that generality comes at the cost of clarity, and this is something historically that a lot of analog aficionados complained about. That said, the digital machine can be more direct for certain types of simulations, namely discrete systems. These are systems that can naturally be broken down into steps, systems that are made up of separate and individual parts. The example here is easy. The first program to ever run on ENIAC was a simulation of nuclear fusion. This used a newfangled approach called the Monte Carlo method.

Starting point is 00:13:05 Go listen to episode 116 if you want the long story. I swear, it is actually very interesting. The short story is that the Monte Carlo method is an approach to simulation that works kind of like a game. The game has turns and it has simple rules. Each turn, you roll some dice and, depending on the roll and the current state of the game, apply a rule. By repeating that process over and over you slowly and randomly sample all possible game states. By applying the Monte Carlo method to a simple model of nuclear fusion, researchers are figured out what was needed to make a hydrogen bomb.

Starting point is 00:13:47 You could apply the same method to a model of shoots and ladders and figure out which shoot was the most dangerous. Both of those models of fusion and shoots and ladders are discrete because there are specific steps and specific states. There are only so many squares on a board of shoots and ladders and you can only move in a non-infinite number of ways. So it's discrete, it's not continuous. These discrete models with their finite steps and finite things

Starting point is 00:14:20 are trivially easy to build on a digital computer. They also end up being almost impossible to do on an analog machine. Analog doesn't really do steps. It does flow and ratios and continuous values with infinite possibilities. Different tools here truly have special applications. The jump from analog to digital isn't just a new technology, it's a jump to a different tool that's good and bad at different things. Gordon made the digital jump in 1954. He would bounce around a few projects but stayed in the realm of simulation.

Starting point is 00:15:00 Initially, these were still continuous simulations. He was writing software to calculate similar things to what he would have done with analog circuits. In 1956, he entered Bell Lue. Labs. While there, he would complete his digital transition. At Bell, Gordon started working with discrete simulations on very discrete machines. Now, Bell Labs is always a bit of a funny place to me. Naminally, they are a lab attached to a telephone company. They also fund pure research, truly blue sky stuff. They also fund research targeted at Telephony. Some, that telephony research turns into truly blue sky research. I mean, sometimes we get transistors,

Starting point is 00:15:49 sometimes we get Unix, and sometimes we get the national phone grid. Gordon was very much on the telephone side of things. He wound up writing simulations of switching systems. He was essentially simulating components that make up the telephone grid. I'm sure this was one of those classic projects where the problem seems huge until you start working on it. Gordon figured out a nice trick to make the simulation actually pretty easy. Quote, it became apparent that the individual items of equipment, whether they were switching device, communication channels, or message storage units, could be represented as simple server units,

Starting point is 00:16:32 with their own service, discipline, and capacity. The system models were essentially networks of such units. end quotes. There are only so many switches they can be used for routing phone calls, so you can simplify the project a lot. Instead of writing one big simulation, Gordon wrote a handful of very small ones. He just needed a way to simulate each kind of switch. Then the larger simulation was just a matter of describing how those switches were connected. I love this approach because it's general purpose and it's modular. The only special part of the program is the simulated wiring,

Starting point is 00:17:15 the code that describes how switches are connected to each other. The actual switch simulation, that can be reused. That kind of modular design can save you a mountain of work later. This is also a somewhat analog approach. At least, this points to some analog thinking. Electronic analog computers aren't really programmed. The computer consists of a pile of reusable modules. These are building blocks you can use for adding values, integrating equations, or even generating signals.

Starting point is 00:17:50 The act of programming an analog computer with quotes boils down to wiring these modules together. Your program is actually those connections. Gordon would reuse his switching simulations like analog modules. He even describes keeping the code on some extra punch cards and carrying those cards from one program to another. He was naturally building up a library of reusable tools, of these little simulated modules that you could wire together with imaginary wires. The path from here to GPSS is really just a matter of refinement.

Starting point is 00:18:29 In 1959, Gordon got involved with a project to write a tool for sequence diagrams. That's another form of discrete simulation. Since this was a tool, it was a little more general purpose than Gordon's handy deck of switch simulators. So he was pushed in a more general direction. The final leap comes in 1960. That year, Gordon joined IBM. He moved around the company a little, but he ends up in the mathematics and programming department. As he put it, one of the major goals of this department was to use mathematical analysis.

Starting point is 00:19:05 to solve design problems within IBM. But it wasn't possible to use normal mathematics. Consider, if you will, time sharing. In a time sharing system, a computer splits its time between a number of users. A user will get some slice of time, usually fractions of a second, for their program to run on the computer. After that time slices over, the computer suspends their program, and switches over to some other program.

Starting point is 00:19:37 But a user doesn't always operate a computer in clean time slices. There's some amount of time spent typing at a keyboard, thinking, and reading terminal outputs. It also takes time for signals to propagate between their terminal and the computer. That's not to mention things like I.O. devices. So what's the best way to split up time? How many users can share a certain computer before performance degrades? You could try to work up a mathematical equation for this, but that would be really complicated to do.

Starting point is 00:20:10 A more realistic solution could be found using simulation. This is a pretty small system, a computer, some terminals, and some users. And you don't even need to simulate all that much here. Each user gets to use the computer for a certain amount of time before they're switched out. Each user spends some amount of time reading and typing. Each terminal has some delay. This is actually a very approachable problem if you have the right tools. In 1960, those tools didn't really exist, at least not inside IBM.

Starting point is 00:20:46 So Gordon was tasked with developing a tool. Quote, I began writing a program implementing a block diagram language, choosing a relatively simple, table-oriented interpretive structure that would make changes easy, end quote. This was the start of the Gordon simulator. It would later be renamed the General Purpose Simulator, and still later renamed to the General Purpose Simulation System or GPSS. There's already a valuable lesson here. Gordon was trying to solve a specific set of problems. He only needed a tool that would work for IBM's woes. Specifically, when the project initially starts,

Starting point is 00:21:31 it's dealing with communications on buses inside mainframes. But that's super specific. A tool for that case might be used a handful of times. So instead, he decided to be more general. GPSS ends up being a very well-loved and well-used language because he designed it as a general-purpose tool. It's made inside IBM, but it can be used to model systems outside of Big Blues walls.

Starting point is 00:22:01 Thinking in this kind of modular and general-purpose way is powerful. Please, I implore you. If you write software, this should always be somewhere in your head. It will save you so much time in the future. What is a programming language? That's a question that I've reported. repeatedly ask, and that I will continue to ask. It's also something that you should all think about very carefully. GPSS initially caught my attention, partly because it doesn't look or feel like

Starting point is 00:22:38 other languages. Hey, maybe after all these years I'm so jaded that I need to find increasingly strange things to hold my interest. But, like I was saying last episode, I truly do think that to know a thing, you need to test its boundaries. When do we cross from a calculator becoming sophisticated enough to become a computer? Or when does a computer become so simple? It's actually just a calculator. By that same token, when does a tool become sophisticated enough to be considered a programming language? We have hard and fast measurements we can use, but those tend to get blurry on the edges. That's why the edge will always be so interesting to me.

Starting point is 00:23:19 One thing that struck me about GPSS is how it's discussed. In Gordon's history of programming languages paper, he talks a lot about GPSS, but he shows no code. He describes as development, how it's structured, how it executes, how it's been expanded, but he never shows a single line of GPSS code. That's odd because every other paper in the history of programming language volumes show at least a little code. The first paper on GPSS doesn't show any code for seven pages. This is a trend in just about every source on GPSS that I've come across. I was skimming a primer on the language to try and see how it was presented back in the day,

Starting point is 00:24:10 and it took 43 pages before the author introduced a single line of GPSS. For a programming language, that just, that don't feel right. That feels odd. Why would you do this? Why would you shift focus away from the actual text of language? I haven't seen anything explicitly giving reasoning, but I read enough to have a few good ideas as to why. In GPSS, the language itself is a means to an end. It's how you load your model into a computer. As such, it actually plays second fiddle to the deeper workings of GPSS itself. This is odd, but it doesn't make GPSS unique.

Starting point is 00:25:01 In fact, Gordon himself gives us a clue here. Remember how he said he chose a table-oriented structure for the language? The actual text of GPSS doesn't so much have syntax as it has structure. An early GPSS program looks like a form that's been filled out. That form tells the computer how to run a simulation. There's this other IBM language that uses the same trick, RPG. That programming language is also table-oriented. You fill out a series of forms that tell the RPG compiler what to do.

Starting point is 00:25:40 RPG is a report generation tool, so those forms specify how to read, manipulate, format, and output data. This is, in the most technical sense, a programming language. It's just structured in a very simplistic and very different way. A programming language is usually defined by its syntax, that is, how the code looks. Code has to be structured in a very specific way to actually hold meaning. That structure is modeled as to. after human language. At least how we discuss that structure is modeled after natural human languages. You have things like parts of speech and words. You even have idioms and phrases. You have special

Starting point is 00:26:23 characters for grouping ideas, ways to signify special meaning, and ways to end a statement. None of that is present in early versions of GPSS or RPG. A line of code is actually just a set of values that fit into some predefined columns. You program by filling in something that looks like a spreadsheet. The first column is a location. The second is a column name and on down the line. Meaning is defined by position, not so much by any syntax. I know it's not technically correct,

Starting point is 00:27:02 but I like to think of these as syntax-free languages, because what they're doing just isn't traditional syntax. Technically, you could make the argument this is a form of syntax, but it really, really doesn't feel like that. So if GPSS doesn't put code first, then how are GPSS programs described? How can you have a book about GPSS that shows no code for 43 pages? Well, easy.

Starting point is 00:27:34 It uses flowcharts. That's GPSS's most striking feature. The language is actually a way of representing flow charts. I don't mean this in the abstract sense. I actually mean that GPSS is designed as a way to build executable flow charts. One of the earlier sources I've been working out of is a 1964 article in the IBM Systems Journal. The article describes GPSS2.

Starting point is 00:28:05 It's a revision to the language. and it's really when the language starts to see use. It follows the usual pattern of not showing any code until we're most of the way through the article. What it does show, however, are a lot of flow charts. These old IBM Systems journals use a really neat format that I really, really dig. The outer margin of each page is extra large.

Starting point is 00:28:32 In that space, there are section headers and figures. It's used a great effect here. Each page has a few of these skinny flowcharts. The paper builds you up to code by describing each block and building up more complex charts. Then, and only then, does it hit you with code? This isn't just a bit of rhetorical flourish. I mean, you can do this with any language. You could start with a flowchart and then use that to lead readers into actual code text.

Starting point is 00:29:03 but GPSS is uniquely suited for this. A GPSS program is composed of two things, blocks and transactions. Your immediate reaction to this should be, huh? A block is just a box on a flow chart. The block is connected to other blocks, and it does some kind of operation. Transactions flow through the chart and are operated on by blocks.

Starting point is 00:29:33 That's the execution model. This is another great place where you may be thinking, Huh? GPSS doesn't work like other languages. In fact, its design is a lot closer to that of an analog electric computer than that of a programming language. The only difference here is that GPSS operates on discrete data. Actually, there is one thing that's closer to GPSS than an analog computer.

Starting point is 00:30:03 That's Inniak. That very, very early computer was an oddball in a number of ways, but one is that it wasn't really programmed, per se. Iniac was composed of a series of modules that all operated on data. The program was actually in how you wired those modules together. Execution was accomplished by sending groups of pulses into the wires. GPSS basically operates in the same. way, just in software instead of in hardware. The actual code of GPSS describes how the network of blocks is formed and how many

Starting point is 00:30:44 transactions you want to feed into the model. This is, well, this is all very abstract, right? So what exactly do these blocks do? What can you do with GPSS? Well, if you thought my description so far is abstract, then buckle up. GPSS is so generalized that talking about it feels slippery. So, here we go. There are two non-block things to understand in GPSS.

Starting point is 00:31:15 These are facilities and storage. Facilities are entities that can only be used by one transaction at a time. Storagees are entities that can be used by multiple transactions. Crucially, these aren't represented as well. blocks. Blocks describe an event, something that happens. Blocks can operate on facilities, storages, and transactions. And behold, this has been my frustration with the literature around GPSS. It's meant to be completely general purpose. That means everything reads in this super vague way. Let me hit you with a trivial example. That should help.

Starting point is 00:32:01 For this, we're actually going back to Widget Co. I've recently added a reception area to the main office. If you want to see me, you have to first wait in that reception area. It even has artificial plants and a water cooler. Now, I only accept one supplicant in my office at a time. I need to figure out how many visitors I can see in a day. I would model my office as a facility. My waiting room is a storage.

Starting point is 00:32:30 A GPSS program describes how transactions are generated and how they flow. So I'd start off with a generate block. That tells GPSS how many transactions to generate and how often to generate them. That's how I'd model how many people arrive at Widget Co and what kind of cadence they arrive at. I'd next use an interblock to put incoming transactions into the waiting room. Since the waiting room is a storage, it has a maximum occupancy. That you have to find ahead of time. I personally only let five people in my waiting room at any one time.

Starting point is 00:33:11 What can I say? It's a small office. If a sixth person shows up and tries to enter, well, that's no good. They will be forced to wait outside in the rain. You see, it's always raining around Widget Co. In GPSS terms, that means the transaction is blocked. Once someone moves out of the waiting room, the transaction that's out in the rain will fall out of that queue and fall into my waiting room. Next, I'd add a seize block. That's how a transaction uses a facility.

Starting point is 00:33:47 Seas is special because it's immediately blocking. If a facility has been seized, if someone has stepped into my office, that's it. The door is barred until the person leaves. In GPSS, this can be modeled in a few ways. Most simply, you use an advanced block. That tells GPSS that we have to wait around for some amount of time. I could guess that my usual argument with a customer takes 15 minutes. That would be easy to plug into the model. But there's some variance. So I could tell GPSS to add some error bars. You literally write that you want to wait for 15 minutes plus or minus a few. But you see, I have some special requirements.

Starting point is 00:34:35 A supplicant to Wigico has to bring in a list of grievances. I'll only discuss each grievance for five minutes. I know each supplicant will have between one and ten grievances with an average of three. GPSS can actually model this. GPSS can model this pretty easily. A transaction can have a set of parameters assigned to it, and those can be randomized. To do this, you would use an assign block,

Starting point is 00:35:07 which will put a parameter value onto a transaction. Then you can make that advanced block reference a transaction's parameter. That way, you can wait for a variable amount of time depending on what values are assigned to that transaction. Another neat feature of Widget Co is the Fubar Gold membership. Members of the Fubar program have a number of perks. They get one free doodad for every 10 widgets they purchase. They get an exclusive newsletter, and they can skip the line to my office.

Starting point is 00:35:43 GPSS can also model this. Each transaction has a priority, and higher priority transactions can cut in front of others. priority can be assigned as well using a, well, get this, priority block. If 10% of my customers are priority customers, then that just takes a single line of GPSS to model. I now have my model. It's a sophisticated waiting room with lists of grievances, foobar gold members, commenters, and a single office. I'm using this office example for a simple reason. This is basically GPSS's Hello World program.

Starting point is 00:36:24 You can't really print Hello World in GPSS. So to show off its features at a glance, you need something simple. That's usually something like an office with a waiting room or a barbershop with a line or a checkout line at a grocery store. What happens when you actually run the model? Well, it doesn't print Hello World, that's for sure. You end up with a report. The report by default will tell you details about how many transactions went through the model,

Starting point is 00:36:53 how much simulated time that took, and how well utilized each facility and storage was. You can use special blocks like tabulate to adding your own calculations. This is profoundly different from a normal programming language. That should tell you just how special purpose GPSS really is. The whole point of this type of modeling is to generate a report on how the model performs. It's to work out how many people I can see a day at WidgetCo, or to calculate the maximum load of a time sharing system, or to figure out how much data a telephone switchboard can move in an hour

Starting point is 00:37:34 and then work out how you could change those circuits to make it better. You're not going to write a version of colossal cave adventure in GPSS. It's simply not the right tool. Besides, we already have a tool for that. It's called Fortran. The whole point is that GPSS is just for simulation. It's more like a special purpose tool that's using a programming language as its input. In that regard, it's very similar to RPG.

Starting point is 00:38:05 That said, GPSS is turning complete. This is a full-blown programming language. You can do conditional execution. The trick is to make a block diagram that has a circle. You loop blocks back on themselves, and GPSS has conditional blocks that will redirect transactions. You can even route a transaction based on its parameters. So you can use GPSS to do anything. It's just that I can't imagine why you'd want to.

Starting point is 00:38:36 Again, we have a tool for more general purpose programming. It's called Fortran. I know I just harped on how the syntax isn't important to GPS. but I do want to point out how miserable it is. Take facilities in storage, as an example. In early versions of GPSS, those entities don't get names. They have numbers. Worse, only storages are predefined,

Starting point is 00:39:04 meaning that if you want to seize a facility, you'd write something like seize one. Of course, in practice, that would be more like, Well, putting C's in column 2 and the number 1 in column 3. But you get the idea. GPSS sees that 1 hasn't been used before, assumes it's a facility, and sets it all up. Functions are another oddity. GPSS supports functions in a way I've never imagined.

Starting point is 00:39:35 You see, they're tables, as in lookup tables. A function is actually defined like an array in another language. It's a list of values, and then you can look up values from that list. It's... well, it's something. GPSS does this because it doesn't actually need full-on functions. It just needs a way to assign specific values. Back at WidgetCo, I know that if a customer has five grievances, they're going to need three extra minutes to complain.

Starting point is 00:40:08 But a one grievance customer, well, they have no extra room to complain at all. A GPSS function would give me a way to map out that data. It's workable, but again, it looks miserable. There's also just the fact that the execution model of GPSS is crazy. At least compared to other languages, GPSS is just completely different. Normal programming languages just do not work this way, and I'm going to leave it at that. Now, this divergence from the norm was intentional. From Gordon, quote,

Starting point is 00:40:47 My early experience in using the program convinced me that if it were properly organized and documented, engineers and analysts would be able to use the program themselves, even if they were not trained in programming. As far as possible, using the program was to be like using a special purpose machine, avoiding programming terminology and practices. This potential elimination of an intermediary in the form of a programmer who was to interpret the system designer's ideas was considered a highly desirable characteristic. When Gordon says the program in that quote, he's referring to the GPSS interpreter.

Starting point is 00:41:28 Now, this to me is where things get really interesting. The mission statement here is very similar to that of Fortran. Fortran was developed so that engineers and mathematicians could use a computer without having to learn to program in machine code or assembly language and without having to work with a special programmer. GPSS is developed after the advent of high-level programming languages. I know I keep joking about how, oh, Fortran exists at this point, but that's a crucial point. Gordon designs GPSS so engineers can use a computer without learning to program. It's intentionally designed to not really be a programming language. It just uses some of the generalities of a language.

Starting point is 00:42:19 When he says that he wants users without programming training to use GPSS, he quite literally means people who don't have training in Fortran. Anyone spot the elephant in the room yet? GPSS may actually be another example of an early fourth-generation language. A while back, I ran an episode on fourth-gen languages. They are, in short, application-specific languages that are more abstract than your run-of-the-mill high-level language. The term 4GL is coined in 1981, but the category existed previous to the name. During that episode, I ran through a pretty big list of languages developed before 81 that fit the category.

Starting point is 00:43:06 SQL is a great example here. It's application-specific, very abstract, and it's used in conjunction with a non-language tool. In the case of SQL, the language is a gateway into a database. GPSS is neat here because of how explicit Gordon is. He said he wanted GPSS as a tool for non-nobsts. programmers. It's a super high-level language. A single line of GPSS does a wild amount of computation. And it's definitely a gateway into a larger non-language system. For GPSS, that's the simulation model. It's execution model of transactions flowing through events and facilities is a non-language

Starting point is 00:43:51 tool in a sense. But together, it's plain to see that GPSS would be considered a four fourth-generation language. And it's clear that Gordon wanted to place GPSS in its own category. Heck, maybe even a better term for 4GL would be a tool language. GPSS is designed as a tool for non-programmers. It's a tool that can do anything, but it's specifically meant for one type of simulation. I realize I'm still being pretty abstract here. The slipperiness of GPSS has kind of sunk into my psyche.

Starting point is 00:44:34 So let me get more concrete. What was GPSS actually used for? This is where we luck out. One of the reasons I really like Gordon's history of programming languages paper is that he cites examples of projects that use GPSS. Not all of those citations are available to me, but enough are. So we can firmly and clearly see what people were doing with GPSS. In 1966, E.C. Smith Jr. published a paper in, again, IBM, Systems Journal, about his work

Starting point is 00:45:10 using GPSS. This is just as GPSS has taken shape, so it's a very early use case. Something to note is the Smith paper says that the simulation is just presented as an example, but it slips into language that makes it clear that the model was actually used in some capacity. So I think this is a valid example. So what was Smith doing with GPSS? His first example, the one I'm pulling from, is a computer room. He describes a setup where an IBM 7040 mainframe is supervising the execution of a 7090 mainframe. The setup here is pretty slick.

Starting point is 00:45:52 The 7040, the smaller machine, prepares programs for the larger 7090 to run. Those programs are sent out to the larger machine by tape. The 7040 then monitors execution and completion of the program and will even reset the 7090 if a program goes rogue or hogs too much time. This is actually a real setup. IBM would sell you a system like this where a smaller computer was the front end for a larger machine. It's an automated way to handle a machine room. Users, or, more likely, staff would interact with the 40 to set up jobs, then the 40 would make

Starting point is 00:46:33 sure the 90 was spinning efficiently and correctly. And thus, we reach an optimization problem. You want the 70-90 to be working 100% of the time. That's the optimal case. We're limited by the speed of communication channels. The 40 has to prepare and then transfer data into the 90. Can the smaller machine keep up? That's a very important question. The outcome of the model, however, isn't a simple yes-no. It's not just a report that says, yep, the system will work.

Starting point is 00:47:10 As Smith puts it, quote, manipulation of the model produces the result. What exactly does that mean? Basically, that a GPSS program cannot be static. A single run of a GPSS program cannot be static. A single run of a GPSS program produces one report. In the case of this model, that report will tell you the 7090s utilization. But that's not the actual answer to the question. What you really want to know is how changes to the system will change that number. So to actually make use of GPSS, you have to manipulate the model. You tweak parameters, you change connections, or you come up with new ideas. This is another aspect of GPSS that's somewhat unique. Traditionally, a program is static, or at least, that's the idea, right?

Starting point is 00:48:04 It's idealized as a static result. When you start working on a program, you're working towards some completed state, some final static state. Once you reach that state, the program's done. Its code, in a perfect world, will never change again. That program is then used and reused, run and rerun. Think of something like an old video game on a cartridge, like Pokemon Blue version. That program is done, it's static.

Starting point is 00:48:34 It's never going to change again. You don't plug the cartridge in one day and there's some new Pokemon. That doesn't work. The completed product is the program itself burned into silicon. That's not the case for GPSS. A simulation program may only be executed once before it's rewritten or modified. There's no final version of a GPSS program. Rather, there are models that are manipulated and changed to explore the system.

Starting point is 00:49:08 The product of a GPSS program is a decision. How do I configure my 7040 and my 7090 to work together? Or it's a deeper understanding of a system. How does the speed of a certain bus impact the utilization of my mainframe? I'm not going to go into superfine detail on how the model worked, but I do want to hit a highlight. The system model was modular. Smith describes nine different sub-modules. There's one for simulating the 70-90 running a program, but the rest simulate different data channels.

Starting point is 00:49:47 card reader to disk, terminal to disk, disk to 7090 to disk, and so on. The problem is all about throughput. How fast data can be shoved in or pulled out of the mainframe. So the model reflects that. Its transactions are the packets that travel along these simulated channels. These sub-modules, these smaller models, are linked together in one large model that basically wires up all the different communication channels. By taking this approach, the model can do a lot more than just say,

Starting point is 00:50:25 how well used your mainframe is. It will spit out how each data channel is being used. That helps to identify bottlenecks, and it gives you feedback as you manipulate the model. You could do something like this in Fortran. In fact, if you cruise the literature, you can find examples of these types of simulations in all kinds of languages. But those aren't using purpose-built tools.

Starting point is 00:50:52 What GPSS is really bringing to the equation is that specialization. You can treat Fortran like a modeling tool. I've done that before, but you have to write a lot of code. You end up being forced to build up your own simulation framework. In my case, I had to write a lot of ancillary programs to wrap around my Fortran model. Or you could just use GPSS. That already solves a lot of those issues for you. There are some more concrete examples in other IBM Systems Journal articles. In fact, there's a whole series of case studies that was published in 64. This series includes a pretty

Starting point is 00:51:32 neat simulation of a steel mill. It models how different grades of steel are processed, turn into products and shipped out. But when you get down to it, this is very similar to the machine room model we just talked about. If you've seen one model, then you've seen, well, maybe not all, but you've seen many models. So I'm going to jump to something more radical. In 1963, a group of IBMers created this beast

Starting point is 00:51:59 called the Generalized Monitor System, or GMX, from Gordon. Quote, it had occurred to me that there was a strong similarity between tasks performed by the GPSS simulation algorithm and the tasks performed by a control program. In simulating a control program, the simulation is maintaining records of the system resources being simulated that essentially duplicate the records kept by the control program. Also, the simulation algorithm, like the control program, must be organized to respond to stochastic events, even though the randomness is internally generated

Starting point is 00:52:39 rather than being the result of an interrupt from an external source." Yeah, vague in general, classic for GPSS. Gordon is saying that he realized GPSS's execution model how it actually ran could be used to write a time-sharing system. GPSS dealt with all the same problems that it time-sharing system did. So, why not see if GPSS could run a computer all on its own? If you don't see the connection, let me explain. All a time-sharing system or a control program

Starting point is 00:53:16 or a real-time monitor, whatever you want to call it, all it really does is manage a computer's resources. If a program wants time, it has to wait in some kind of queue. Let's say a program wants to read from a tape deck. That device can only be used by one program at a time. So the time-sharing system needs to figure out if someone's using it. If they are, then it has to have some mechanism to make a program wait in line for that resource. Does that sound, perhaps, something like the waiting room at Widget Co? The basic problem here is the same. It's all a matter of figuring out who gets what resource and when. and a time-sharing system, the who is a process.

Starting point is 00:54:02 At Widget Co, the who is a supplicant. In GPSS, the who is a transaction. The trick here, what Gordon realizes, is that a time-sharing system has to keep track of its processes. So internally, there are two separate entities, the actual binary code of the process, and in some entry in a table that represents that process. The time-sharing system doesn't actually track the binary very much.

Starting point is 00:54:31 When the system makes decisions about what process should be running, it's really just looking at this internal table. The only connection to the process is binary is that sometimes the system decides, well, it's time for that binary to run, and then that it's time for that binary to stop running. GPSS was only one step away from handling time sharing. That step was a new block. GMX is really just GPSS

Starting point is 00:55:01 with this new thing called a process block. This block allows GMX to let a process run for some amount of time. In technical terms, it's a context switch. There is a technical paper written about this experiment, but I haven't been able to track it down. All the details we get on GMX come from Gordon himself. That said, what we do have is,

Starting point is 00:55:25 That's pretty cool. Transactions were used to model processes. The process state, registers and all, were stored in the transactions parameters. Then a normal GPSS model was used to decide which process to run. The model handled figuring out what different I.O. devices were in use. What memory was in use, everything. It was an operating system written in GPSS. That's fascinating on an academic level, right? Here we have a very, very high-level programming language that, with some expansion, is used to write an operating system. Very cool, very strange, very unique. Gordon viewed this in a much more practical light. With GMX, it would be possible to simulate some theoretical computer. Let's say you just want a 70-90 that runs a little faster or was an extra tape channel,

Starting point is 00:56:24 or maybe even a totally new design altogether. You could write a time-sharing system for it, simulate how well that system would work, and make tweaks all along the way. That would let you know how well an entire computer system would work and what kind of software should be written for it before a single wire is soldered. A GMX simulation, by necessity,

Starting point is 00:56:49 would have to simulate things like performance and bandwidth of your components. then once the computer was actually built, that same simulation model could be used as its operating system. That, dear listener, is a wild future. It also drops GPSS back into this more traditional role. In the case of GMX, there would be a final program. You could have a complete working state that was used again and again. This is also a wild amount of flexibility. You would be using the same code to simulate a system under design

Starting point is 00:57:29 and then to run that very system. I can't give it a better word than wild. It's simulation as a cornerstone of design and operations. I'd love to get my hands on the GMX technical report and learn more. The biggest point of interest for me is implementation. From the start, GPSS was an interpreted language. It wasn't ever compiled. According to Gordon, that made implementation much easier, and I'm just going to believe that one. The use pattern of GPSS also means you don't really take a performance hit from interpretation, at least not usually.

Starting point is 00:58:09 Interpreters tend to be slower than compilers when it comes to execution. Your program is read and executed at the same time. That can bog down a computer. Compiled code executes faster because you end up with a pure binary. Once you compile a program, you can run it again and again without recompiling it. But that compilation step can be slow. So there's this break-even point. If you want to run a finished program over and over again, a compiler is going to stay the time. It's the obvious choice.

Starting point is 00:58:42 If you're going to be changing the code very often, an interpreter could make more sense. since. GPSS definitely falls in the second camp, but GMX breaks that pattern. It's a finalized program that's executed again and again. So all the gains GPSS gets by being interpreted fly out the window. That's partly why I want to know more. Just how much did GMX modify GPSS? Was it still interpreted? Was it actually compiled? Or was it pulling some kind of trick to get around performance issues. Gordon makes it sound like GMX was only a slight modification, but without sourcing, we can't say what slight really means. I'm also pretty curious if GMX ran on top of another operating system or if it had the computer to itself. This is more nuanced than you may

Starting point is 00:59:35 first assume. In the era GMX was written, operating systems were very different from what we used today. GMX was developed for the 7090, so if it ran under an operating system, it would have probably been IBIS. That system took input from punch cards. It really just monitored execution of jobs and batch mode. I'm not sure how much that would have helped GMX. The point is, GMX is really, really neat, and I have some very pointed questions about it. If you happen to have a copy of IBM Technical Report 17-1-1-15, by Bean at All, please give me a scan.

Starting point is 01:00:16 I want to know more. All right, that does it for a dive into GPSS. If you want to read more, I'm going to link to Gordon's History of Programming Languages paper in the description. But really, that paper is more of a jumping off point. I'd recommend flipping to the last page and cruising the citations if you really want to go deep. This has been my first dive into the world of simulation link. For my first exposure, GPSS is pretty wild.

Starting point is 01:00:51 It really isn't like anything I've seen before. It's also, from my limited understanding, a bit of a language isolate. It's not the first simulation language, but it's an early one, and it was developed in relative isolation. Gordon even admits he didn't know about other simulation languages when he started the project. The bottom line is that GPSS is odd. It stretches the definition of a programming language to something near its breaking point. But there are other languages that are equally stretchy.

Starting point is 01:01:25 As I continue to travel down these weird paper trails, I'm excited to see if I run into another block-structured language that deals in the flow of transactions. Until then, thanks for listening to Advent of Computing. I'll be back in a week with the after show for this over at the ATSA. adjunct of Computing podcast feat. Then I'll be back a week after that, with another episode of Advent of Computing. You can find links to everything at advent of computing.com. And as always, have a great rest of your day.

Advent of Computing - Episode 179 - Programming Block by Block

There aren't comments yet for this episode. Click on any sentence in the transcript to leave a comment.