Advent of Computing - Episode 115 - Digital Lifeforms
Episode Date: August 27, 2023

I will admit, the title here is a bit of clickbait. In the early 1950s a researcher named Nils Aall Barricelli started in on a bold project. His goal was to simulate evolution on a computer and, in doing so, create a perfect lab to study evolutionary processes. What he found was astonishing. Given a simple rule set, these interesting patterns emerged. He called them symbioorganisms. Despite being simple numeric constructs, they exhibited many properties of living things. Did Barricelli create a digital form of life?

Selected Sources:
https://sci-hub.se/10.1007/BF01556771 - Numerical Testing of Evolution Theories. Please, just read this paper and be amazed!
Transcript
Have you ever had one of those near-miss kind of moments?
You know, something big almost happens, but there's a missing spark to ignite that fire.
You get close to disaster or perhaps even some great revelation, but you never quite get there.
I personally tend to get these out in the woods.
A buddy and I were hiking up this mountain a while back.
This particular peak is one of my great enemies.
I've attempted a few summits, but I keep getting turned back right at the end.
This time, I had brought some backup, and we thought that today was going to be the day.
Conditions were great, the weather was wonderful, we were well-rested and ready to go,
and the trail had been easy all morning long.
That is, until we met my
next misfortune on this mountain. We turned a corner and ran into a black bear. It was probably
15 or 20 feet ahead of us on the trail. Luckily, we ended up being pretty loud and pretty scary.
The bear locked eyes with us for a moment, then turned and ran. Now, the trail itself bears
mentioning here. It's one of those switchback trails that has sheer walls on each side. To one
side is a steep drop-off that no one really wants to lumber into, be they bear or human. The other
side is, invariably, a very steep cliff that none of the three of us could
actually climb that well. We eventually decided to carefully continue forward, looking out for
our friend the bear. But that didn't last too long. We turned the next switch back and ran into
some very concerning territorial markings. We decided, I think quite rationally, to turn back and go
home. We were obviously in someone else's house, and that someone was much bigger and much stronger
than the both of us combined. I think this was a near miss because, in retrospect, in the cold
light of the drive back, things could have gone very poorly. If we continued on, well,
we would have just been going deeper and deeper into a bear's territory and wandering further
and further away from any possible help. These near-miss events are interesting to me precisely
because of the power of retrospect. Looking back, we can see that certain events could have been
turning points.
A bear attack, for instance, would have been a big turning point in my life.
At least, personally, it would have been a big turning point.
You may be saying to yourself, why is Sean sharing this?
What on earth does this have to do with computers?
Well, that's a very easy one.
We can use our own powers of retrospect to find all kinds of turning points in the history of computing.
Today, I want to take you on a journey to find one of these near-misses.
It's a program that came very close to a breakthrough, but, thanks to its creator's unique perspective, veered in a somewhat unexpected direction just at the end.
Welcome back to Advent of Computing.
I'm your host, Sean Haas, and this is episode 115, Digital Lifeforms.
Now, I know, this is not the promised Monte Carlo episode. I'm still waiting to receive my beautiful scans. If all goes well, then the next episode, or maybe the one after that,
will cover the method. So, while we wait around, I figured we should look at something that's
somewhat similar. I don't really do seasons on the show. This is mainly because I don't want to commit to one single topic or genre for too long.
I prefer to give myself a little freedom.
I also don't really do planning ahead, so seasons don't really work for me.
Instead, I poorly plan out future episodes.
It's all kind of haphazard over here at Advent of Computing HQ.
One result is that I tend to get stuck in similar topics for a little bit. I'll read something
that's tangentially related to an episode I'm working on, think it's neat, then throw it onto
my ever-growing spreadsheet. Thus, you get these weird groupings of episodes that are only related by my wild train of thought.
It's all very memetic, if you think about it.
This episode will mainly be discussing a paper called Numerical Testing of Evolution Theories by Nils Aall Barricelli.
How is this connected to last episode and to the Monte Carlo method?
Well, it's very tangentially connected, of course.
The paper is all about a computational model used to simulate evolution,
a digitally defined form of life, if you will.
So, we're in Algorithmville today.
John von Neumann also makes cameos in all of these topics, so,
you know, there is that connection. It's always nice to see old Johnny von Neumann around.
Now, I first heard of Baricelli's research in, of all places, a TED Talk given by George Dyson.
A friend sent me a video a while back, and it's kind of been kicking around in my head ever since.
Dyson, for those unfamiliar, is a historian and the author of Turing's Cathedral. That's one of the big-name books in the history of computing. Now, and this may be a shock to some listeners,
I haven't actually read all of Turing's Cathedral. I've tried at a few points to pick up the book.
It just doesn't really click with me for some reason.
Preparing this episode has, once again, reinforced that feeling.
Dyson is a good author.
His writing style just isn't my style, if that makes any sense.
Back to the topic at hand.
This TED Talk is titled The Birth of the Computer.
It's a nice and generic title.
It discusses, among other things, Baricelli's attempt to create digital lifeforms.
Dyson calls it something along the lines of biology inside the machine.
And these weren't just attempts.
Apparently, Baricelli saw results.
That, of course, piqued my interest. It sounds like some old
hacker folklore. Man creates life inside his computer. Civilizations rise and fall, all in
the digital realm. The only thing that's missing is a tragic ending that teaches some lesson about
programmers playing God. Of course, I had to track down and read the papers
about this for myself. I mean, come on. It's about a researcher creating artificial life
way back in the 1950s. What's not to love? Well, it opens with this, to quote,
It is not the intention of the author to face the reader with a new type of life or new
living forms.
Barricelli continues,
If the reader at any stage should fall for the temptation to attribute to the numerical
symbio-organisms a little too many of the properties of living beings, please do not
make the mistake of believing that this was the intention of the author. End quote. That, you know, it's a little different than what Dyson said.
Perhaps he fell for the temptation, so to speak.
So what's going on here?
Is this some tale right out of the jargon file of a programmer creating some new entity?
Or is this actually some more reserved and reasonable study of the effects of evolution?
This is what we're going to be unraveling this episode. What exactly was Baricelli doing with
his digital forms of life? Were they forms of life even? And how does this fit into the larger story
of genetics on the computer? Let us start by discussing evolution, or at least a highly
simplified view of the phenomenon. I will admit I'm no biologist, I'm barely a scientist at this
point, but I do have some understanding of evolution. This actually
comes mainly from working with genetic algorithms. Did I mention I've been in a bit of an algorithm
mood lately? Now, a genetic algorithm is a fantastic tool. It's a way to leverage evolution
to make better software. The crux of Darwinian evolution is this process of natural
selection, sometimes put as survival of the fittest. The idea being that, over multiple
generations, the best adapted individuals in a population will pass their genes onto more
offspring. If you get a wolf that's better adapted to winter, then it's going to have more kids.
Those children,
if all goes well, will carry that adaptation forward to more generations. Thus, you end up with a better wolf. Genes, the chunks of DNA that actually encode things, can change as a product
of reproduction. Each reproducing organism provides some of their genes to make new organisms. I hear
that that's also a fun way to spend a Friday night. Anyway, the resultant organism will have
a combination of genes from both its parents. More successful parents, evolutionarily speaking,
will create more offspring. That, in turn, will pass on more successful genes.
This process of mixing genes is sometimes called crossover,
since, you know, you're crossing genes to get a new gene.
There's one more mechanism that makes evolution work.
Mutation.
DNA, the molecule that encodes life, is pretty big.
This molecule encodes data as base pairs.
Think of these like the equivalent of binary, but biological.
Instead of storing some electrical signal, these base pairs store data as certain chemicals.
The human genome, the sum total of all the DNA that makes us us,
contains about 3 billion of these base pairs.
That's a whole lot of data.
But crucially, DNA doesn't necessarily replicate itself with 100% accuracy.
Mutation can occur.
This can happen on its own, while DNA is just sitting there.
It can happen while DNA is replicating inside the organism,
or it can happen during crossover. Things can get a little weird. These mutations bring in
an element of chance. It's very possible for mutations to, over time, lead to better adapted
organisms. It adds a level of genetic diversity that's good for the overall evolutionary trajectory
of a species. Taken in total, the evolutionary process is really just an algorithm. That is,
it's a series of steps that help you arrive at a conclusion. Here, the conclusion is that,
after a number of years, you reach an organism that's better adapted to its role. Put in more abstract
terms, evolution can be viewed as an optimization problem. You can start as an organism that's not
very well optimized at all and, over time, become more optimal. The classic example here is Darwin's
finches. This is a group of birds found in the Galapagos Islands. Each species of finch in the
group, about 15 in total, is adapted to eat a specific type of food found on these islands.
It's a pretty easy adaptation to see since each bird has a unique beak. Some are adapted to eating
bugs, some for seeds, and some for plants. In other words, we see a group of birds that have been
optimized for certain purposes. Some grew bigger beaks for breaking seeds, while others evolved
longer beaks for rooting around for bugs. I think it's fair to say that this is a very
biological process. In fact, it might be the most biological process there is. Evolution is the driving force that has led to
everything on the planet. We don't get folk, trees, bears, or bees without this process.
So we must ask the question, how can we exploit this? How can we make evolution work for us?
Since time immemorial, or at least since computers immemorial, researchers have been
taking cues from nature. For an easy example, we can just pull from As We May Think by Vannevar
Bush, my perennial example of almost everything, actually. The whole point of that paper is that
we need to start organizing data in a more human-like manner.
That data storage systems should be modeled after human thought processes.
You know, as we may think.
We also see this in early forms of artificial intelligence.
Go back and listen to my prologue series if you don't believe me.
During the early days of automatic translation, programmers were
pulling very heavily from cognitive research. I didn't really talk about this in the episode,
but a lot of the early papers I referenced have citations to Noam Chomsky's works on linguistics.
That's another very humanistic approach to programming. This is, I think, the real power of programming.
Computers are, by their very nature, hugely interdisciplinary. They're such a new technology
that, especially during the 20th century, nearly any type of scientist could be pulled in by their
allure. That's how we get linguists, mathematicians, physicists, cognitive researchers,
really any number of experts contributing to the early history of computing.
So, in that spirit of interdisciplinary studies,
I'm going to introduce you to something that's a little slippery.
Let me tell you about genetic algorithms.
These are a class of programs that are modeled off evolution. These programs
aren't necessarily used to study evolution, but rather to use evolution as a tool in a larger
process. Now, I ask you, if you had to place this in a department at a university, where should it
go? You are the dean, after all. You get to pick. Should biology professors have
to teach about optimizing problems? Should mathematicians lecture on the wonders of
evolution? Or should we just throw it into the computer science department and leave it to
languish out of sight? This kind of question really gets to the heart of what makes computers so cool to me.
Genetic algorithms were developed by looking at the natural world,
then applying those ideas to computers.
That right there covers a whole pile of departments.
But, you may ask, what exactly is a genetic algorithm?
Simply put, it takes the rough outline of evolution that I gave earlier
and implements it in software. This is used to find an optimal solution to some problem.
So, we need a problem. Let's say I'm working at Sean's Widget Co. again. I'm trying to figure
out how to make a five-sided polygon with the most surface area. My constraints are simple.
I just want to figure out the largest surface area possible for a 5-sided shape. I personally think this optimized
pentagon is going to sell really well during the holiday season. So you start off with a population
of organisms. Of course, these aren't real organisms. These are just piles of data in a computer.
Most often, it's a list of numbers. That list, that gene, can be interpreted in some way further
down the line. For our pentagon example, a gene would be five points on a grid, five coordinates.
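In code, a gene like that can be nothing more than a list of coordinate pairs. Here's a quick sketch of my own; the grid size and population size are arbitrary choices for illustration, not anything from a real library:

```python
import random

def random_gene(grid=10):
    """One gene for the pentagon problem: five (x, y) points on a grid."""
    return [(random.uniform(0, grid), random.uniform(0, grid)) for _ in range(5)]

# A starting population is just a pile of these random genes.
population = [random_gene() for _ in range(20)]
```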
Next, you work up what's called a fitness function. This is a chunk of code that,
given a gene, will tell you how fit it is, how well adapted it is. For WidgetCo, this function
is simple. We just need to calculate the area of a pentagon. The arguments are taken as the gene
itself. Once we apply the fitness function, we can figure out which gene is best, then sort our
population accordingly. Next comes operating on the genes. This is the tricky part that takes some
trial and error to work out. In this step, we apply some kind of mutation. This is actually
really important for genetic algorithms because it helps keep us out of ruts. Sometimes a genetic algorithm will
zero in on a solution that seems pretty fit, even if it's not the best possible solution.
Think of a valley on a hillside. You might be able to reach into that valley and you might get stuck
there when you're actually trying to reach the very bottom of the slope. A healthy level of
mutation can prevent that
kind of outcome. This is most often done by throwing some random numbers into genes,
but you have to use a steady hand or you risk ruining your progress.
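One hedged sketch of that "steady hand": nudge each point with small Gaussian noise, and only with low probability, so most of a gene survives each generation unchanged. The rate and spread values here are my own inventions for illustration:

```python
import random

def mutate(gene, rate=0.1, spread=0.5):
    """Randomly nudge a few points in a gene.

    A gene is a list of (x, y) tuples. Each point independently has a
    `rate` chance of being shifted by Gaussian noise of size `spread`.
    Keeping rate and spread small is the 'steady hand' part.
    """
    mutated = []
    for x, y in gene:
        if random.random() < rate:
            mutated.append((x + random.gauss(0, spread), y + random.gauss(0, spread)))
        else:
            mutated.append((x, y))
    return mutated
```

Crank the rate up too high and every generation scrambles its genes, wiping out whatever progress the population had made.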
Crossover is an equally tricky thing to deal with. In this part of the process,
you first have to select which genes will cross over, which genes should be carried
over to the next generation.
Then you mix those genes together.
This mimics the reproductive step of evolution.
You'll often pair up the most fit genes in the population, but there are other possible
strategies to consider.
For Widget Co., we're taking the easy route.
I'm going to sort my population by fitness, and then pair
off genes in groups of two. The most fit gene gets paired with the second most fit, the third with
the fourth, and so on. Crossover is also simple for me. I'm just going to swap a few points between
genes. That's the whole simulation part done. We start with random genes, gauge how good each gene
is, decide if we want to mutate any, then mix up the genes based on fitness.
Then we go back to step two, gauge the fitness of our new population, and repeat.
We do this again and again and again.
After a few thousand generations, maybe more, WidgetCo gets its answer and we can now produce the largest pentagon possible.
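To make the whole recipe concrete, here's a minimal, self-contained sketch of a WidgetCo run in Python. To be clear, none of this comes from Barricelli or any particular library; the population size, mutation rate, and pairing scheme are all invented for illustration, and this toy version doesn't guard against self-intersecting pentagons, which a real run would have to penalize:

```python
import random

def area(gene):
    """Fitness function: area of the polygon traced by a list of (x, y) points,
    via the shoelace formula."""
    total = 0.0
    for i in range(len(gene)):
        x1, y1 = gene[i]
        x2, y2 = gene[(i + 1) % len(gene)]  # wrap back around to the first point
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def random_gene():
    """A gene: five points on a 10x10 grid."""
    return [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(5)]

def mutate(gene, rate=0.1):
    """Nudge a few points at random -- the 'steady hand' mutation."""
    return [(x + random.gauss(0, 0.5), y + random.gauss(0, 0.5))
            if random.random() < rate else (x, y)
            for x, y in gene]

def crossover(a, b):
    """Swap a few points between two genes: the child mixes both parents."""
    return [a[i] if i % 2 == 0 else b[i] for i in range(len(a))]

population = [random_gene() for _ in range(20)]
for generation in range(500):
    # Step two: gauge fitness and sort, best first.
    population.sort(key=area, reverse=True)
    # Pair off: first with second, third with fourth, and so on.
    children = [mutate(crossover(population[i], population[i + 1]))
                for i in range(0, len(population) // 2, 2)]
    # Children replace the least fit genes; the best survive unchanged.
    population = population[:len(population) - len(children)] + children

best = max(population, key=area)
```

Because the fittest genes carry over unchanged each generation, the best area never gets worse; it just creeps upward until the population settles near an optimum.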
This is, of course, a very silly example.
A Widget Co. employee could probably work out their own design in an afternoon. Maybe a day,
but it would be quick. Genetic algorithms are a lot more useful when it comes to solving
non-trivial problems. In other words, when it comes to solving things
that we can't just plug equations together to solve.
My favorite example is the so-called evolved antenna.
These are antennae that are designed by genetic algorithms.
The genes in this program are possible antenna designs,
so you might have a series of points in 3D space.
The fitness function here is a radio simulation. Instead of doing some simple math, these antennas
are simulated and tested, all inside the computer. These simulations aren't trivial things. We can
model how an antenna will pick up a signal under different scenarios, but that doesn't mean we can
just pull the best possible antenna out of that model. You have to solve some pretty heavy-duty math to get
that kind of result. But a genetic algorithm can, in many cases, find the optimal solution.
These computer-designed antennas also just look kind of wild. One example, the X-band radio antenna used in NASA's ST5 probes,
is this weird unevenly bent wire. It looks like someone unfolded a paperclip and kind of scrunched
it around. Despite the seemingly haphazard design, it works really well. The genetic algorithm was
able to arrive at a solution that human engineers
could have never considered. Now, we've talked quite a bit about old-school artificial intelligence
on the podcast before. Genetic algorithms are interesting in part because they represent a
different approach to artificial intelligence. A good genetic algorithm isn't necessarily intelligent per se, it's just taking
advantage of natural processes. It's easy to just lump together genetic algorithms and artificial
intelligence. But I hope I've made it clear that there are ideological differences at play here.
I can mark an even bigger distinction. Genetic algorithms and AI have different origins.
This fancy genetic code isn't just some offshoot of AI research.
It didn't appear at the MIT AI Lab, or the Stanford AI Lab, or really any artificial
intelligence lab at all.
So here's the skinny.
Artificial intelligence as a field starts around 1956. That's when the
first conference on thinking software is held, and also when the term AI is coined and adopted.
There are some grumblings of AI a little earlier, but 56 is a solid start date.
Genetic algorithms, or rather, evolution-inspired software, first appears in the early 1950s.
Not only does that predate AI, very handily in fact, but there's not really much of a researcher overlap.
I'm getting that 56 date from a summer workshop that was organized at Dartmouth College.
This was the first time a bunch of nerds got
together to discuss thinking computers. We even have a list of everyone who was in attendance
at this workshop. It's a good selection of researchers. That said, it's missing someone.
At the same time this workshop was going on, Nils Barricelli was in a lab in Princeton
laying the groundwork for genetic
algorithms. Not only was he left uninvited, but I seriously doubt anyone at the workshop even knew
his name. In other words, the two seemingly similar fields start in totally different ways.
It's also doubtful that Baricelli would have even considered his work artificially intelligent.
So then, we must ask, how did genetic algorithms evolve?
This is a twisty tale, so prepare yourselves.
Our story starts with Nils Aall Barricelli.
The precise details of his pre-computer life are a little hazy.
There is a paper trail.
Somewhere.
It's just hard to track down.
So most of the finer details come second or third hand.
What's clear is that Baricelli had a unique way of looking at the world.
I've seen the words genius and maverick used to describe him.
Apparently, at one point, he actually walked
away from a PhD because his review board wouldn't read his 500-page thesis. Instead of shortening it,
he just gave up on the degree program. That's one of the most repeated stories about Baricelli,
and I think it gives us a good idea of his personality. This also leads to a bit of a fun categorization problem.
We can't really slot Baricelli into any one academic box. I mean, he wasn't even a true
academic, right? He didn't have a doctorate, so we can't call him a doctor of math or something
easy. He published in the fields of virology and bacteriology, as well as mathematics and physics.
He probably also published in other fields, but once again, paper trail.
What we can say is that Baricelli was a somewhat eccentric generalist.
I've also seen him described as somewhat untethered from any one location.
This, once again, makes the paper trail harder
to deal with. During most of his career, Baricelli worked out of a university. That's usually a nice
thing for me. It means there's a nice university archive or collection that I can fall back on.
But not for Baricelli. He traveled around quite a lot. He worked at the University of Oslo,
Princeton, and Vanderbilt, at least. Probably more if I were to carry out a full accounting.
So his papers are kind of scattered to the winds. We will primarily be focusing on the Princeton
years. That's when Baricelli's world would change
fundamentally and irrevocably. Leading up to this period, he had become interested in Darwinian
evolution. Specifically, he had a problem with Darwin. This seems to have been a theme with
Baricelli. If you've been around scientists for long, you might know this kind of person.
They like to pick one-way fights with established research,
get into feuds with someone who doesn't even know they exist, or can't know they exist.
Barricelli had done this with Gödel's Incompleteness Theorem a few years prior to joining Princeton.
If you could find his journals, I bet you'd find a pile of very similar arguments and feuds.
His problem with Darwinian evolution was that, frankly, the numbers just didn't add up.
Crossing and mutation alone shouldn't be enough to create truly complex life.
The argument here is actually pretty interesting.
And, sure enough for Baricelli, it takes a unique approach.
Look at it this way. Genes are responsible for genetic variation in organisms. A gene itself
is just a collection of data about how to build an organism. The encoding is a little weird here.
Chunks of genetic data are grouped into these
things called alleles. Each allele can come in multiple forms, but we can just reduce that to
either on or off. Maybe one allele is responsible for telling the body to sweat. So you might get
a sweaty dude if it's on, but if it's off, you get a dry dude. Since these alleles are what actually matter
for gene expression, we can take a shortcut. We don't have to model down to the molecular level.
We just have to deal with alleles. Call it a genetic byte as opposed to a bit.
That's the setup, and here's where Baricelli drops the theory. For evolution to hold for complex organisms, it also has to hold for simple organisms.
Since, after all, simple organisms must have evolved into complex ones.
So any organism with only a few alleles should be able to, if given enough time,
evolve into something more complicated.
Take a population of these simple lifeforms, let them reproduce, maybe set up a UV lamp
to add in some mutations, and you should arrive at humans after a while.
Or maybe not.
Baricelli argues that these two processes, reproduction and mutation, can't supply enough variability
for rapid evolution. Genetic code won't change very quickly if the only possible changes are
crossover and random mutations. A third process is needed to make evolution really work. That
process, according to Baricelli's argument, is symbiosis. This is the idea that
unrelated organisms can impact each other's development. And purely on mathematics alone,
this makes sense. Crossing doesn't lead to new genes, it just leads to different genes.
You still end up with the same number of alleles, the same number of bytes. Same with mutation.
That just changes things up, but it doesn't add anything.
Symbiosis, however, can lead to bigger genes.
Think of this as taping two unrelated genes together.
All of a sudden, you have a longer strand of DNA, which allows for more genetic variability.
With that added variability, evolution can occur more quickly, you can get more complex
life forms.
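Barricelli's counting argument can be shown in a few lines. Treating a gene as a string of on/off alleles (my own toy representation, not his), crossing and mutation only shuffle what's already there, while taping two genes together actually makes a longer one:

```python
import random

def cross(a, b):
    """Crossing: each allele comes from one parent; length is unchanged."""
    return [random.choice(pair) for pair in zip(a, b)]

def mutate(gene):
    """Mutation: flip one allele at random; length is still unchanged."""
    i = random.randrange(len(gene))
    return gene[:i] + [1 - gene[i]] + gene[i + 1:]

def symbiosis(a, b):
    """Symbiosis, in Barricelli's sense: tape two unrelated genes together."""
    return a + b

a = [1, 0, 1, 1]
b = [0, 0, 1, 0]
assert len(cross(a, b)) == 4      # same number of genetic "bytes"
assert len(mutate(a)) == 4        # still the same
assert len(symbiosis(a, b)) == 8  # only this adds room for new variability
```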
Baricelli isn't just making this up from nothing.
There was already evidence of this type of symbiosis.
In fact, this was something that Baricelli was familiar with and publishing quite a bit
about.
Now, this is going to be a place where nomenclature is a
little weird. During the first half of the 1950s, or the late 40s, Baricelli published a number of
papers on bacterial and viral crossbreeding. Now, this is distinct from crossing. Crossing,
we must remember, is what happens to DNA during reproduction. Crossbreeding, at least
as described in this period, is when DNA from different organisms is crossed to form a new
strand. This phenomenon was observed at least as early as 1952, but I think probably earlier.
I'm no DNA expert, so please excuse any mistake here, since I'm working off citations in later papers.
In short, researchers were finding and creating bacteria that had the DNA of multiple different strains of bacteria.
The experiments aren't up there with modern gene splicing, at least not from what I've read.
It sounds more like they were just throwing spare DNA into bacteria, blasting it with UV light to damage
some stuff, and then seeing what happened. Similar work was done with virus strains in this same time
period. Baricelli wasn't personally carrying out these experiments, but he was in conversation with them. He wasn't really an experimentalist.
That said, we do get a nice paper trail here. He had symbiosis on the mind. Once again,
there's a bottom-up angle here. If this kind of symbiosis occurs on the smaller scale,
then that must have ramifications for the larger story of evolution.
Now, this is all really low-level stuff. These are things that show up on the chromosomal level. Researchers are just seeing
bacteria that have too many chromosomes, or their chromosomes are a little too long.
But I can give you a better example of symbiosis. Look no further than the mitochondria, the powerhouse of the cell.
Mitochondria is an interesting part of our cells because it has its own DNA.
That DNA is distinct from the DNA in our cell's nucleus.
The reason for this is pretty wild.
As the theory goes, mitochondria started out as a separate organism. Over time,
that organism started living inside other cells. The larger cell provides raw material to the
mitochondria, and the mitochondria turns that into energy for the cell. It's a nice arrangement.
This presents another type of symbiosis that should be considered.
In this case, DNA hasn't been blended and extended.
Rather, we just have two organisms that require help from each other to function.
That does, in effect, greatly increase genetic diversity.
Taken as a whole, we have a more complex system, a more evolved organism.
That's the whole reason that Baricelli is interested in symbiosis. It's a complement to the classic reproduction and mutation of Darwin's
theory. At least, that's the theory. There is a bit of a complicating factor here. Baricelli was
a whole lot of things, but one thing he wasn't was an
experimentalist. In the wild and woolly world of science, there are really two kinds of folk you
run into. Experimentalists and theorists. Experimentalists are what you probably think
of when you hear the word scientist. These kinds of researchers will sit at a telescope late into the night.
They'll work with a pipette and beakers full of dangerous chemicals, or crossbreed peas in a
fancy greenhouse. On the other side of things are the theorists. These researchers work in a world
of their own. They crunch numbers, they work up ideas, they sort through data, and in some cases, even plan experiments.
Baricelli was firmly a theorist, but he had his own twist to this. Most scientists tend to stick
in their lane with a certain type of reverence, for lack of a better term. Back in my research
days, I was a theorist. I didn't touch any instruments. For me, this was simple fear. I mainly worked with
data from radio telescopes, and I personally believe that those kinds of instruments work
off some type of magic. It's a little beyond my understanding. I've talked to experimentalists
that feel the same way about the spooky math that I used to whip up. Baricelli didn't stay in the realm of
theory due to a lack of understanding. He just didn't think that experimental research was
practical, mainly because it wasn't easy to reproduce. There were too many uncontrollable
variables. He preferred to work up purely mathematical experiments. This is the origin of his numerical work.
Dyson tells us that in the early 50s, right as the interest in crossbreeding and viruses was brewing,
Baricelli was working up a fascinating experiment. He wanted to simulate evolution all on paper.
This would present all kinds of advantages over experimental research in evolution. It would
be faster than trying to breed fancy bacteria in a lab. It would also be perfectly reproducible,
since Baricelli could control every aspect of his simulated world. The issue, however,
was number-crunching power. Apparently, Baricelli first approached this problem with
pen and paper alone. That did have some advantages, but it was painfully slow and difficult.
Baricelli had stumbled upon an algorithm that needed a new type of tool. This state of affairs
would collide with an interesting debacle.
Shortly before the end of World War II, mere months in fact,
one of the most important computers in history emerged.
Or, at least, its designs emerged.
This machine was called EDVAC, and it was the direct successor to ENIAC. It was designed by the same team that had initially created ENIAC at the University of Pennsylvania. The big difference, and the one that makes EDVAC
so crucially important, is the fact that it was a stored program computer. That means that EDVAC
could actually be programmed with code that was stored in memory. This should have been the
crowning achievement of the ENIAC team,
which was headed by John Mauchly and John Eckert. But, well, things would go a little sideways,
as they tend to do. This is a story that I've covered on the podcast a number of times.
It all comes down to an internal memo called the First Draft of a Report on EDVAC. This was meant as a very internal
thing, not for public consumption. It was written in June of 1945 by John von Neumann, but it was
based off research done by the entire ENIAC team. I mean, it's an internal memo on their research,
it's just a summary of what they've all been doing.
But, and here's the kicker, the byline only had one name on it. That name was, of course,
Johnny Von Neumann. It was internal. It was a draft. Of course it would only have one name on it. But shortly after the draft was completed, it was leaked to the outside world.
That leak leads directly to two hugely important outcomes. First, we get a pile of new computers
based off the EDVAC report. There's this whole class of machines that, while not compatible,
are all based off the same design. This also leads to the popularization of the so-called
von Neumann architecture. He was the only author on the draft, after all. Almost all computers
nowadays follow this architecture. It prescribes that code and data should be stored in the same
big block of memory, and code and data should be treated identically by the computer.
The name is a misnomer, but it does sound like von Neumann actually really liked this architecture.
Maybe we should just call it von Neumann's favorite architecture instead.
Shortly after World War II, in 1946, von Neumann and a group of collaborators attempted to create a machine based off the EDVAC report.
This machine was built at the Institute for Advanced Studies, or IAS, in Princeton. So, it was fittingly called the IAS machine. Very imaginative names back in the day.
Now, it would take about five years before the machine was operable. So, come 1951,
there was a lab over in New Jersey with a very
early programmable computer. We know that IAS, or at least similar machines, were on Baricelli's
radar. In 1951, while still loosely associated with the University of Oslo, Nils applied for a
Fulbright Fellowship. This would have let him travel to the United States to work,
hopefully, with a computer. His first application didn't work out. Nor did his second attempt,
but eventually he would make it stateside. Apparently, a friend of Baricelli wrote von Neumann, who threw around some weight to help bring Baricelli to the states.
So why would von Neumann be interested in numeric evolution?
To put it simply, he was willing to do just about anything with a computer. He just wanted to throw
any ideas he could find against a digital wall. In this period, computers were so new that
essentially any application would be of massive importance. Dyson describes the IAS machine as this place where all kinds
of researchers gathered to get computer time, and von Neumann was willing to host almost anything
that sounded interesting. It's in this climate that Baricelli reached Princeton. From there,
his simulation software took root. As we move forward, I want to give the same disclaimer that Nils gives.
We aren't going to be talking about some kind of new digital lifeforms.
At times, I may use that wording, but don't get it twisted.
This is a simulation of evolution.
Baracelli's goal was never to create some organism that lived inside a computer.
That wasn't the outcome of his research, either.
We are, instead, dealing with simulated evolution of simulated lifeforms. A string of numbers isn't any more alive in a computer than it is on a piece of paper, but it can exhibit certain
lifelike properties. So how, exactly, can a string of numbers act like a lifeform?
I certainly haven't seen numbers that appear any more alive than others.
Well, it all comes down to a set of rules, an algorithm, if you will.
In other words, yes, we are dealing with a rule-based system.
If you want, you could actually just call this a number game.
Baracelli's numeric world starts out as simply a blank sheet of grid paper. This will be used to represent a one-dimensional
world, where each row is a snapshot in time. He starts by placing random numbers and blanks on
the top row. What do these numbers represent? Well, that's a little bit ambiguous. These aren't
exactly genes and they aren't exactly organisms. Rather, they're some abstract, self-reproducing
entities. Remember that we're in the world of theory. Just think of these numbers as little
dudes that are programmed to follow evolutionary rules. Darwin dudes, if you will. Once the first
row is in place, it's time to apply these all-important rules to generate the next row.
There are a number of rule sets that he used over the years, but by the time he publishes
numerical testing, he had settled into a rule set that he liked, but he still discusses other possible sets. I'm going with the big
complicated setup since, well, that's where the cool stuff is hidden. The first rule is reproduction.
This is pretty simple. You slide each number over by its value. Each number can be positive or
negative, so some numbers actually slide to the left, some to the right. If you have
a 2 in the first cell, for instance, then that will appear two spaces to the right in the next row.
A minus 2 would similarly appear two spaces to the left. Numbers can also slide off the map,
so to speak. So if you have a negative number in the first cell, then it wouldn't even show up in the next generation. It's just gone, never to be seen again. That, in a way, represents a form of
death. So far, nothing fancy. You just get numbers that slide around on their own. This is, in part,
modeling reproduction, or rather, self-replication. This isn't something we see too often on the big macro level.
Folk don't really spit out new folk on their own without outside intervention. But self-reproduction
does show up on the smallest levels of lifeforms. Many cells self-reproduce, making perfect copies
of themselves in the process. Even DNA itself will reproduce.
So as far as evolution is concerned, this is perfectly kosher.
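If you want to see the rule in motion, here's a quick Python sketch of just the reproduction step. To be clear, this is my own toy version based on the description above, not Barricelli's actual program, and it ignores collisions and symbiosis for the moment:

```python
def reproduce(row):
    """Barricelli's reproduction rule, sketched: each number slides
    by its own value. Positive numbers move right, negatives move
    left, and anything that slides off the grid dies."""
    width = len(row)
    new_row = [None] * width  # None marks an empty cell
    for i, n in enumerate(row):
        if n is None:
            continue
        dest = i + n
        if 0 <= dest < width:
            new_row[dest] = n
        # else: the number fell off the map, a form of death
    return new_row

# A 2 in the first cell shows up two spaces to the right;
# a -2 in the first cell slides off the grid entirely.
print(reproduce([2, None, None, None]))   # → [None, None, 2, None]
print(reproduce([-2, None, None, None]))  # → [None, None, None, None]
```

Run that over a few generations and you can watch numbers march off the edges of the one-dimensional world.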
The next rule is mutation. That's what Baricelli calls it, but it's kind of a misleading name.
This is how Baricelli models crossing. If two numbers land on the same space, then their values are added. In this way,
a collision between a 1 and a 2 would make a 3. A 1 and a negative 2 would make a negative 1,
and so on. The offspring number, if you want to call it that, is a mix of its parents. In other
words, this is a highly simplified form of heredity.
So, we're still on the evolutionary train.
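As a rough Python sketch of both rules together (again, my own reading of the description, not Barricelli's code), a collision just means adding the two values that land on the same cell:

```python
def step(row):
    """One generation with reproduction plus the 'mutation' rule:
    each number shifts by its own value, and any numbers that land
    on the same cell are summed into a single offspring."""
    width = len(row)
    new_row = [None] * width
    for i, n in enumerate(row):
        if n is None:
            continue
        dest = i + n
        if not (0 <= dest < width):
            continue  # slid off the grid: death
        if new_row[dest] is None:
            new_row[dest] = n
        else:
            new_row[dest] += n  # collision: a 1 and a 2 make a 3
    return new_row

# The 2 at index 2 and the 1 at index 3 both land on index 4,
# so they merge into a 3.
print(step([None, None, 2, 1, None, None]))  # → [None, None, None, None, 3, None]
```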
If you're following along at home, it's at this point that Baricelli hits us with the first simulation on paper.
A tiny digital world.
You see, these are all the rules you need to model evolution.
To quote,
In this manner, we have created a class of numbers which are able to reproduce and to undergo hereditary changes. Only the numbers adapted to the environment created in figure 1 by the rules stated above will survive. The other numbers will be eliminated little by little. A process of adaptation to the environmental conditions, that is,
a process of Darwinian evolution, will take place. End quote. With this rule set, you will,
eventually, end up with patterns of numbers optimized to stay on the grid. That is,
at least in theory. You can
actually work up one of these simulations pretty quickly. I think it actually took me less than an
hour to throw something together in JavaScript. It works, but it's janky. Within a few generations,
everything dies out. That's kind of what you see in figure one. You usually wind up with ones and negative ones
down in the last generations, which, to be fair, is optimizing to stay alive. It's just not very
impressive. Baricelli felt the same way. There's something missing from the equation. That something,
he argues, is symbiosis. Now, this is where the rule set gets weird. We're in pure math land,
so it doesn't actually matter how things would shake out in the real world. The symbiosis rule
states that if a number falls in a cell that has a different number above it, then symbiosis occurs.
In other words, if you fall in a space that was occupied during the last generation,
and it was occupied by a different number, that constitutes symbiosis. If the condition is met,
then we get to make a copy of the organism that moved into that cell. Let's call the moving
organism N and its symbiosis partner, the one above it, M. The copy of N is placed M spaces to the right of N's initial location.
If that isn't super clear, then please, do not worry.
Baricelli's model of symbiosis just isn't super clear.
You kind of have to sit with the paper and try to program this thing for it to make a lot of sense.
What does matter is that symbiosis cares about the state of the last generation and how the last generation is organized compared to the current generation.
If symbiosis occurs, then we get an extra copy of a number.
We get to propagate again.
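For the brave, here's one possible Python reading of the symbiosis rule. I want to stress that this is my interpretation of the description, with only a single level of symbiosis and none of the cascading; Barricelli's actual rule set is subtler than this sketch:

```python
def symbiosis_step(prev_row, row):
    """One generation with reproduction, collisions, and a single
    level of symbiosis. If a number n lands on a cell that a
    different number m occupied last generation, an extra copy of
    n is placed m cells from n's starting position."""
    width = len(row)
    new_row = [None] * width

    def put(pos, n):
        if 0 <= pos < width:  # off-grid placements just die
            new_row[pos] = n if new_row[pos] is None else new_row[pos] + n

    for src, n in enumerate(row):
        if n is None:
            continue
        dest = src + n                 # reproduction: shift by own value
        put(dest, n)
        if 0 <= dest < width:
            m = prev_row[dest]         # who held this cell last generation?
            if m is not None and m != n:
                put(src + m, n)        # symbiosis: an extra copy of n
    return new_row

# The 1 lands on a cell that held a 2 last generation, so an
# extra 1 is placed two cells right of where the 1 started.
prev = [None, None, None, 2, None, None]
curr = [None, None, 1, None, None, None]
print(symbiosis_step(prev, curr))  # → [None, None, None, 1, 1, None]
```

Even this simplified version shows the feedback at work: the new generation now depends on how the old one was laid out.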
When you add in symbiosis, something really interesting happens. Instead of the grid
emptying out, you start to see complex patterns take form. It's clear to see why this would happen.
Symbiosis, as formulated by Baricelli, provides a type of feedback. Future generations are now
more dependent on past generations. Symbiosis also has a cascading
effect. When you go to place a number, it might trigger symbiosis once. The copy of that initial
tile can trigger another level of symbiosis. So if things are placed right, then a single number
can quickly propagate across the board. This is actually where I hit a bit of a rabbit hole. I kind of lost a lot of productivity because of this. As I've alluded to, I've been toying around with my own implementation
of Baricelli's world. And, I must admit, it really is engrossing. I've spent way too much time
adjusting my code and watching simulations. I can totally see why he got
sucked into this stuff himself. It's really satisfying software. Anyway, back to the point,
out of my tar pit. Symbiosis really kicks things up to a whole nother level. This is where we reach
what Baricelli calls the symbio-organism. Or symbi-organism? I'm probably going to use both
pronunciations because this isn't a real word. So, a symbi-organism is as close as we get to
a digital life form, if we want to stretch the truth a little. A symbi-organism is a string of
numbers, a group of organisms, if you will, that remains constant between generations.
They are these numbers that, barring outside interference, will stay on the grid paper.
This is all possible because, under the rules of symbiosis, you get to make extra copies of numbers.
That just means that if you have a well-crafted string of numbers, it will make copies of itself. If you really wanted to, you could actually sit down with some graph paper and work out a few symbio-organisms
of your own. It would be a challenge, but there's no reason you couldn't. However, we don't need to
waste our brainpower on that. These things, it turns out, will form on their own through the process of evolution, or, you know, simulated evolution.
It turns out that a static string of numbers is actually preferred in this simulation.
Anything that moves without copying itself will die out.
Without symbiosis, that means that each simulation ends with 1s and minus 1s.
With symbiosis, we can have somewhat stationary number strings,
so that becomes the dominant result over time.
To put it another way, Baricelli didn't make symbio-organisms.
He discovered them inside his simulation.
That, to me, is kind of the coolest thing about this.
The simulation itself leads to these digital lifeforms, for lack of a better term. Now,
I am throwing that term around pretty loosely. Baracelli did stress that readers should avoid
this trap, but here's the thing. These symbio-organisms exhibit a number of properties of honest-to-goodness lifeforms.
I've already implied two ways that symbio-organisms act alive.
First is that they self-reproduce.
That's a biggie.
That means that, left to their own devices, a symbio-organism will either continue
to exist in some cases, or it will make copies of itself. This lines up nicely with single-cell
organisms. They will reproduce on their own. The second point is that symbio-organisms will
spontaneously appear. In other words, given enough time, these nice numbers will form.
This is crucial because it shows that, just as with evolution, simple organisms will lead to
more complex ones. One symbiorganism that exhibits these properties is my favorite, good ol' 5, 3, 1, negative 3, 0, negative 3, 1, 0.
If you plug that into the simulation, it will proceed to make copies of itself over and over again until the board's full of that string of numbers.
I'm sharing the actual lifeform with you not so you remember these sacred numbers, but rather to point out something.
These are not lifeforms in any classical sense.
This is a string of numbers that, under the right ruleset, under the right conditions,
exhibits some neat properties. Once again, keep Baricelli's warning in the front of your mind.
He wasn't trying to make digital friends, he was creating a tool to study evolution. There's actually this whole pile of lifelike behaviors, but I want to share just a few more. One of the most exciting
ones is crossbreeding. If two symbio-organisms run into each other, they will, in some cases,
create a new organism. This can occur in a few ways. Baricelli observed crossings where both parents
and offspring survive, where one parent survives, or where neither parent survives, only leaving an
offspring. In successful crossings, the offspring is, itself, a symbio-organism. It exhibits all
of the lifelike properties of its parents.
Crossbreeding in the simulation is particularly cool because it brings us full circle.
Baricelli had initially become interested in symbiosis after researching bacterial and viral crossbreeding.
His simulation isn't explicitly programmed for crossbreeding, not at all.
It simply simulates reproduction, mutation, and symbiosis on a very small scale.
That leads, on its own, to organisms that can be crossbred to make new organisms.
I think that really speaks volumes to the validity of this software.
Baricelli also observed what he called parasitism. This is a little more abstract,
but really interesting to consider. Some of his symbiorganisms were unstable if left alone.
They wouldn't self-reproduce, instead dying out or scattering after a few generations.
However, if they were near another organism, these parasites would actually reproduce.
As they reproduced, the parasite eats up its neighbor.
Once the neighbor, the host, is consumed, the parasite goes back to being unstable.
It falls apart.
The final big behavior is, well, of course, it's the biggie.
It's evolution.
Baricelli proposed that symbio-organisms could undergo evolution themselves.
And, I mean, why not?
The rule set in place models evolution on a very small, number-by-number scale.
As numbers group into larger organisms, the same rules seem to apply.
Symbiorganisms will reproduce, mutate, and engage in symbiosis with one another.
Over time, this leads to newer, more complex symbiorganisms.
Eventually, a string of numbers powerful enough to take over the one-dimensional universe will emerge.
In fact, that's just the problem that Baricelli ran into. Baricelli's world was, of course, modeled on the IAS machine. For the time,
this was a pretty powerful computer, but come on, we're still talking small scale here.
The initial simulations used a 512-column universe.
That's 512 spaces for organisms to live and move around in.
Not really much space at all.
This small size was just a limitation in the IAS machine's memory.
The computer only had 5 kilobytes of RAM.
At least, that's once we convert to modern 8-bit bytes.
Old computers were a little strange when it came to word and byte sizes.
Anyway, Baricelli had to restrict the size of his simulations.
Early simulations were only 512 columns across, and they ran for around 5,000 generations each.
That would work out to, I think, around 32k of memory,
which means that Baricelli must have been pulling some kind of trick.
Most likely, he was just dropping older rows from memory and writing them out to punch
cards as newer generations were, well, generated.
This means that he could have actually kept a pretty comfortable buffer of organisms in
memory and still had some space for the program itself.
So there were limitations, but nothing unworkable. There was another strange limitation that I can't entirely
explain. Baricelli's simulations would start with a random set of numbers and blank spaces.
He explains that these were determined by drawing from a deck of cards or flipping coins.
Now, here's the thing that kind of gets me.
Baricelli is always described as something of an eccentric genius.
Apparently, he took to the IAS machine very naturally.
There are even accounts of him finding bugs in the computer's very design.
He sounds like a really good programmer.
The IAS machine didn't have a native way to generate random numbers. By this, I mean you
couldn't just say, hey computer, give me a number between 1 and 10. However, that shouldn't have
stopped anyone from writing a program to make random numbers. In fact, von Neumann himself was
something of an old hand at this. He had first developed a program for generating pseudo-random
numbers all the way back in 1946. Since then, he had published at least one paper on the topic.
He was even involved in early Monte Carlo algorithm research, which relies very heavily on generating boatloads of random numbers.
So why, then, did Baricelli flip coins and draw cards? I don't have many of Baricelli's early
papers to draw on here. Maybe he explains it in one of those. That said, I do feel comfortable
speculating here. Like I've mentioned, I think I know Baricelli. Well,
rather, I've known Baricellis. I think he was avoiding programmatically generated random numbers
because they weren't good enough for him. Let me explain. The algorithms that von Neumann
and his colleagues were using generated pseudo-random numbers. These are only
mostly random, and there's a big difference between mostly random and all random. A computer
is a deterministic machine. If given a set of instructions, it will carry them out one after
another and always arrive at the same conclusions. That's not really random at all.
Pseudo-random number generators are just programs,
so they can't actually generate truly random numbers.
They follow the same steps and arrive at the same conclusions every time.
Von Neumann's program started with a seed number,
then, through a complicated equation,
generated a new quote-unquote random number.
That new number became the seed for the next iteration.
These numbers would look random, but over time, you could see patterns.
And those patterns, in fact, would always be the same.
Given the same seed, the algorithm would gladly spit out the same random number.
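The generator von Neumann is best known for is the so-called middle-square method: square the current seed, then pull the middle digits out as the next "random" number. A tiny Python sketch makes the determinism obvious; the four-digit setup here is just for illustration, not the exact parameters he used:

```python
def middle_square(seed, digits=4):
    """One step of the middle-square method: square the seed,
    pad to twice the digit count, and keep the middle digits."""
    squared = str(seed * seed).zfill(digits * 2)
    start = len(squared) // 2 - digits // 2
    return int(squared[start:start + digits])

# The same seed always walks the same path, which is exactly
# why a fixed seed would make every simulation run identical.
seed = 1234
for _ in range(3):
    seed = middle_square(seed)
    print(seed)  # → 5227, then 3215, then 3362
```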
If you apply this to Baricelli's universe, then every run would be identical.
Now, there's a really easy way around this.
Baricelli could have drawn cards or thrown dice to generate the initial seed.
But like I said, I've known Baricellis. Once he learned
that the IAS machine was spitting out subpar random numbers, I bet he just threw up his hands,
cursed a vacuum tube, maybe threw a few punch cards at a wall, and walked out. No way these
fake random numbers would ever touch his beautiful code.
I've known a number of theorists, a number of scientists, that very much act that way.
They kind of have a chip on their shoulder about doing things a specific way,
even if it's harder than it needs to be.
And I kind of get that vibe from reading about Baricelli.
Now, to properly set up my digression, I need to throw in one more puzzle piece.
If you look up Baricelli, you will undoubtedly see these big technicolor swatches.
These are visualizations of his experiments.
Most of these are actual modern recreations.
Like I said, it's relatively easy to recreate his simulation software.
The symbiosis step is the only part that should give you any trouble. It's become customary to show each number, each organism, as a single pixel of some set color. A 1 is blue, a 2 is green,
and so on. Use any colors you want. I have seen slides that Baricelli supposedly generated and some printouts, but I'm
not sure how these were made. That's a bit of an outstanding question that I'm curious about.
As far as I can tell, Baricelli worked off of printouts of numbers, and perhaps straight off
of punch cards in some cases. Each of these visualizations is good for spotting patterns.
Anyway, the colorful visualizations are neat because you can see the patterns really easily.
If you take a minute, you can actually spot symborganisms moving around from generation to generation.
You can even see things like crossings if you really sit with these images.
These images also make the final phenomenon very clear. It's that problem that
Baricelli kept running into.
After a while, some organism would develop that was uniquely adapted to Baricelli's world.
It would take over the entire row, outcompeting every other symbiorganism.
This makes sense as an end stage for the simulation.
Eventually, you should get an optimized solution to the problem.
But it gets more weird.
Check this out.
Quote,
attempts to modify the mutation rule did not remove this difficulty, but they often led to
a different type of final symbioorganism better adapted to the new mutation rule. End quote.
Baricelli spent a lot of time toying around with his mutation rule. That was kind of his main variable during much of his research.
He found that by changing the mutation rule, new organisms would evolve and, after enough time, would solve the rule set, so to speak.
The simulation would always produce some optimal solution to the problem Baricelli formulated. You can actually see this occurring in the big graphical outputs that folk like to generate. If you look down at
the bottom of one of these images, you usually see a uniform pattern. Sometimes it's a checkerboard,
sometimes it's all one color, sometimes it's a series of lines. That pattern is the optimized organism, the solution
to Baricelli's rule set, the lifeform that's able to beat Baricelli's game. This takes us right back
to where we started this episode. Baricelli didn't set out to create genetic algorithms.
He wanted to model evolution. He wanted to make a tool. He wanted to build his
own digital laboratory where he could carry out precise and reproducible experiments.
In that, he succeeded. The fact that multiple programmers, including myself, have been able to
reproduce his experiments is a testament to that. Code that I wrote this very week can give the same results as code
Baricelli wrote on one of the first digital computers. That's just really, really cool.
But he was missing a key piece of the puzzle. Baricelli only wanted to apply his research to
evolution, to take the model and use it to reason about reality. What he had stumbled on
was a way to exploit evolution. The model itself could be used for solving more generic problems.
His little problem of super-optimized organisms is actually the key. Symbio-organisms kept solving
whatever rule set Baricelli wrote up. He saw that as a problem and
would eventually find a way to break and outsmart his own creations. In later experiments, Baricelli
actually ran multiple universes at the same time. Occasionally, he would swap out regions from each
realm every few hundred generations. That added genetic diversity and shook things up enough
that no symborganism could beat the rules. If he had taken a different angle to this,
if he didn't see these solutions as a problem, then the AI revolution could have taken a whole
different shape. This is a case where a project gets tantalizingly close to something revolutionary,
but it's
missing just the right push to make the leap.
Alright, this ends our dive into Baricelli's digital world, today's story of near-missing.
I think this should serve as a good primer if you want to dig a bit deeper.
The paper I've been working off, Numerical Testing of Evolution Theories, is one of those short but
dense articles. I've covered the big stuff, but there's a whole lot more to get into.
This is one of those times where I do recommend reading the paper for yourself.
I'll leave a link in the description. That'll give you all the gritty details that I've glossed over, and even a full-on
symbio-organism tournament that Baricelli constructed. I'm not kidding or exaggerating
when I say this is a truly wild paper. So where does this leave us? Well, let me take you straight to the deep end for a second.
For me, with my 21st century vision, I can just say that Baricelli's work is the precursor
to genetic algorithms.
I have all the history at hand.
I know where things go and how things develop, so that fills in the whole story for me.
By the 1970s, this type of research has flowered
into genetic algorithms, into these beautiful tools for solving problems, for optimizing solutions.
But that's totally anachronistic. I want to leave you with a passage from the conclusion
of numerical testing. Quote, There are very few important limitations concerning temperature and other environmental conditions
required for development of symbio-organisms. Neither the low temperatures of the moons of
Uranus nor the high temperatures of the sun-side face of Mercury are sufficient arguments to exclude
the possibility of symbiogenetic phenomena. The experiment recorded in figure 24 is a clear
demonstration that symbiogenesis can not only take place on a planet or a satellite, but even in the
memory of a high-speed computer. To maintain that only conditions similar to those prevailing on
Earth could permit symbiogenetic processes would obviously be too great a pretension.
This is, in part, where Baricelli's research took him. Life as we know it only needs to follow
three simple rules. Those rules apply to everything living on earth, from cells up to folk.
But through simulation, it's possible to show that
anything following those rules
could partake in a little bit of evolution of its own.
This implies, among other things,
that we aren't alone in the universe.
That may be a little more lofty
than designing fancy antennas.
Thanks for listening to Advent of Computing. I'll be back in two weeks' time
with another piece of computing's past. And hey, if you like the show, there are a few ways you can
support it. If you know someone else who's interested in the history of computing, then
please take a minute to share the show with them. You can also rate and review the show on Apple
Podcasts. If you want to be a superfan, then you can support the show through Advent of Computing merch or signing up as a patron on Patreon. Patrons get early access to episodes, polls for
the direction of the show, and bonus content. You can find links to everything on my website,
adventofcomputing.com. If you have any comments or suggestions for a future episode, then go ahead
and shoot me a tweet. I'm at Advent of Comp on Twitter. And as always, have a great rest of your day.