The Joy of Why - Why Do We Need Error-Correcting Codes?

Episode Date: August 7, 2025

Every time data travels — from smartphones to the cloud, or across the vacuum of space — it relies on a silent but vigilant guardian in the form of error-correcting codes. These codes, baked into nearly every digital system, are designed to detect and repair any errors that noise, interference or cosmic rays might inflict. In this episode of The Joy of Why, Stanford computer scientist Mary Wootters joins co-host Steven Strogatz to explain how these codes work, and why they matter more than ever. Wootters discusses the evolving list of codes that keep modern communication resilient, and the frontiers in which error correction may have a crucial role, including distributed cloud systems, quantum computing and even DNA-based data storage.

Transcript
Starting point is 00:00:00 Hey there. Do you like learning wild and wonderful new things about science? Well, have I got a show for you. I'm Rachel Feltman, the host of Scientific American's podcast Science Quickly. Every Monday, Wednesday and Friday, Science Quickly offers bite-sized deep dives into the latest breakthroughs in science and tech. Whether you're into sexual health or supernovas, quick news roundups, or immersive narratives, we've got you covered. Check out Science Quickly on your favorite podcast app today. We promise we'll make it worth your time. I'm Steve Strogatz. And I'm Janna Levin. And this is The Joy of Why, a podcast from Quanta Magazine exploring some of the biggest unanswered questions in math and science today. Hi there, Janna. Hey. How's it going? Good. What do you have for me today?
Starting point is 00:00:52 Well, we're more or less about the same age, so I'm guessing you remember what CDs are. Oh, yeah. The kids don't necessarily know. I guess some of them are using vinyl now, aren't they? Records. Back to vinyl. Yeah. Nobody's back to CDs, though.
Starting point is 00:01:07 I don't think anyone's like got a new CD Walkman, you know. But do you remember back when CDs were the thing and people would demonstrate that you could scratch them and they would still play? Yeah. I'm bringing it up because the fact that when you play a CD that's been mangled, if not too totally messed up, it'll still play is because of a wonderful technology called error-correcting code that was built in to that kind of device. And that was the first time I ever ran across the concept. But apparently it's really out there in our modern world today when you're streaming TV shows or cell phones. Probably this communication we're having with each other right now. There are probably error-correcting codes running behind
Starting point is 00:01:50 the scenes, making sure that all our bits, you know, our zeros and ones are getting where they're supposed to be, even though there's always noise. Think how we're going to be. Think how we're weird it would be if your light switch suddenly switched itself off. That doesn't really happen in our everyday macroscopic world. But apparently there's a lot of things that can cause bits to flip. Like ones become zeros when they shouldn't. They get hit by a cosmic ray or your cell phone signal is bouncing off the walls and it can interfere with itself. But you have to add more bits to help correct the inevitable errors caused by all the noise and all the potential sources of corruption in the environment. And so error correcting codes is the story.
Starting point is 00:02:28 of math fighting noise, how to use those extra bits as efficiently as possible using redundancy, but in very clever ways. And would this be a computer scientist's arena? Yeah, totally. That's totally their bailiwick. They're very ingenious. This is a lot of discreet, really clever stuff about algebra, about graph theory. And do they ever get it really wrong?
Starting point is 00:02:54 Do they ever make the errors worse? Do the error-correcting people need to be corrected? Right. Well, from what I hear, some of the error-correcting codes can recover messages in astonishing ways. But maybe we should dive in and let Mary Wooters, an assistant professor in electrical engineering and computer science at Stanford. Tell us a little bit about the grand story of error-correcting codes. Hi, Mary. Hi, Mary.
Starting point is 00:03:26 much for having me. I'm very happy to meet you and see you here. I was wondering a little bit about this book of yours that you wrote with your co-author Aviad about algorithms for toddlers. You want to tell us a little bit about that? Sure. Algorithms for toddlers is a whole semester of undergraduate algorithms condensed into 14 illustrated pages with rhyming couplets, which you can find, I would like to say, at all fine bookstores, but actually just online at Barnes & Noble.com because we self-published. So if that is for you or for your inner toddler, yeah, it's a good thing. I enjoyed the few bits of it that I saw on YouTube. That's right. Yeah, I have my best kindergarten teacher voice on. If you want to see me reading the book on YouTube. Do you have an illustrator
Starting point is 00:04:08 or did one of you illustrated? I illustrated it. You illustrated it. So this is a book co-authored with my colleague here at Stanford, Aviod Rubinstein. Aviat and I both occasionally teach the undergraduate Algorithms course here at Stanford, and Aviat has young kids, and so he thought it might be a good idea to try to make a book to teach algorithms to them. I see. You do seem to have some passion for helping convey the wonders of computer science, of algorithms, coding, to a pretty broad audience. I mean, you're cornering the toddler market. We're going where are the niches, yeah. But so do you think it's important to share this kind of information in a very broad or accessible way? What's motivating you to do that kind of work?
Starting point is 00:04:53 It's not so common. That's a good question. I mean, I think you're right. This is something that I really care about. I get really excited about my work about the mathematical and technical ideas in it and I want to share the joy, right? I want to communicate that to others. So I think that's where that comes from in me is basically there's like this little five-year-old kid inside being like, oh my God, it's so cool. Beautiful. That's an honest answer. Well, so we have you. with us to talk about error-correcting codes, and I would love to dive into that. It's not something I'm very familiar with, and it's possible our listeners are not either. So if you don't mind, I'm wondering how did you get interested in that? What inspired you to start learning about
Starting point is 00:05:33 them? Yeah, so maybe just for listeners who have no idea what the heck these things are, error-correcting codes are sort of a fundamental tool for protecting data from noise. So basically, whenever you send data, think of your cell phone talking to a cell tower, or if you want to store data on a hard drive or something like that, like errors are going to happen. Radio waves get interfered with, hard disks get scratched, or things go wrong. And so you need to add redundancy, basically, in order to make sure you don't lose data. And an error-correcting code is a way to do that. And how did I get into it?
Starting point is 00:06:07 One of the things I really like about this area is it's super interdisciplinary. So my background's in math, and there's a lot of really beautiful mathematics that goes on with error correcting codes, but now I'm in a computer science department and electrical engineering department. There's also a bunch of folks in CS&E who think about these things too in lots of other communities where error correcting codes are interesting. So I think that's one of the things that really drew me in is just the people and the opportunity to deploy a wide range of toolkits to study this one type of object. Great. Okay. You've mentioned a little bit about why error correction is important that there's always some source of corruption or noise or something. We hear the word
Starting point is 00:06:43 noise a lot. I mean, in ordinary speech, noise means I'm talking to someone at a cocktail party and there's other people making noise or there's music. How should we think of noise generally? It's a great question, and there's not one answer, but it really depends on the application. So in communication, you have your signal that you want to send and you have lots of other signals that might interfere with it, which is not that different, at least conceptually, with the cocktail party. When things are digitized on either ends, that can be manifest as bit flips, for example. like I meant to send a zero and actually a one gets received. Or possibly you get some kind of uncertainty where you're like,
Starting point is 00:07:19 ah, an 85% sure it was a one, but it could have been a zero. So that's one sort of model of noise. But you could also have different types of noise depending on the context. So for example, you might have something that's called an erasure rather than an error, which would mean that a bit just turns into a question mark. That would be the formal version of it. Sort of like a blank or something? There's nothing there anymore.
Starting point is 00:07:40 Exactly. For example, in a distributed storage system or something, something if a server goes down or if a storage node goes down, you know which one went down, but you still don't know the data on it. So that would be like an erasure type of model. But there are lots of other models. You could have synchronization errors. Like I'm supposed to receive a string of length 7 and I actually receive one of length 6. I know one bit went missing somewhere, but I don't know where. Oh, wow. Or the other way around, things could get inserted in or insertions and deletions. There's really a rich class of errors. And that's kind of
Starting point is 00:08:10 what makes this problem very interesting. So there's something really mysterious about how you could even possibly do error correction, which is if something is missing, like a one or a zero was supposed to be there but it's not, or a one got flipped to a zero for whatever reason. How could I possibly know that? What is an error correcting code, I guess, is what I'm asking? Great. Yeah, so first you're totally right.
Starting point is 00:08:33 If we don't do anything to the data, there's nothing that can be done. If I'm going to be sending you a random string of zeros and one and something gets erased and turned into a question mark, you have no hope of learning what that is. But the idea of an error-correcting code is that you're going to add a little bit of redundancy to help. And so the most naive way to do this is just to repeat the data a whole bunch of times. And then you'd hope that enough copies are there that corruptions don't matter too much. And that works in most situations, but it's extremely costly. If you're talking about communication systems or story systems or something like that, that's more power, more time, more space. That's not great. And so instead,
Starting point is 00:09:07 we use something called error-correcting codes to have the same idea, but add a little bit less redundancy. To show how this works, maybe I can give a quick toy example illustrating the principle, and then I could talk a little more generally about some geometric intuition behind what's going on here. So as a first example, let's suppose that I want to send you two bits, so A and B, and they're either going to be zero or one. And I know that despite all the effort that we've done to get the sound just right on this podcast and everything, unfortunately, whatever I tell you in the next like 10 seconds, one bit that I tell you is going to get erased is going to get replaced with a question mark. And so in this setting, if I want to
Starting point is 00:09:43 send you two bits, the naive thing to do is to send you four bits total. I'll send you A and B, and then A and B again. And then no matter what gets erased, you'll still get both A and B. For example, I could send you. If you hear A question mark A, B, well, you still have at least one copy of A and one copy of B, so you know what's going on. Okay, so that's the baseline. That's four bits. But here's a better scheme that I could do. I could send you A and B, and then A plus B mod 2, or AX or B. Okay, wait, let's think about that. Go slowly.
Starting point is 00:10:13 Let's see why that works. So, just to give a quick example, if A and B were 0 and 1, then I would send you 0-1-1 because A plus B is 1, mod 2 is 1. Or if A and B were both 1, I'd send you 1-1-0 because A-plus B is 2, which is 0-1-2. right? So I'm going to send you now three bits. And why does this work? Well, suppose that one of these bits gets erased, just for a sake of example, say it's the second one. So you hear A question mark A plus B mod 2. And when you get both of those pieces of information, you can actually figure out what B was meant to be, even though it got erased. If you know that A was zero and A plus B mod 2 was
Starting point is 00:10:55 one, then you know that B must have been one. That's the only thing that you can add to A to get 1 mod 2. So you can always solve the system to figure out what you are missing. I see. And so just in case anyone listening isn't familiar with the idea of mod 2, I guess a simple way to say it would be, is it even? Yeah, that's exactly right. So you add those together and you'd say, is it even? If so, that's a 0, if it's odd, that's at 1. At the receiving end, am I supposed to know that you're doing this A plus B mod 2? Like, is that understood? Great. Yes, that is understood. So we've agreed beforehand on what scheme we're going to use. Okay.
Starting point is 00:11:29 So this was just a very simple example, but it illustrates that you can do better than sort of the naive repetition. Real error-correcting codes are generally much bigger than this, but there's a big class called linear error-correcting codes would have basically this idea. So you start with your message, let's say instead of two bits, I have K-bits, and I want to add some redundancy. The redundancy that I add is by taking parity checks of my message. So instead of taking all of them mod two, that's one parody check, but I could also take the first plus the third plus the seventh mod two and the seventh plus the eighth plus the ninth mod two and so on.
Starting point is 00:12:05 And the game is kind of how do you design these parity checks so that at the end of the day you can correct against the flavor of error that you're interested in, be that erasures or errors or synchronization errors or other sorts of things. Here's a geometric way to think about this previous example. So what's going on? There are four possible messages that I might want to send to you. There's 0-0-1-1-0 or-1-1. And correspondingly, there's four possible strings that I might send to you. These are now three-bit strings. So I might send you 0-0-1-1-101 or 1-1-1-1-1.
Starting point is 00:12:42 Those are the allowable strings in this silly example, okay? So now I've got these four strings of length three. These four special strings that I just said, they have the property that they all differ from each other in at least two places. So it takes two bit flips to get from one to the other. Oh, that's nice. Okay, this is helpful. And so what I've just done here is I've designed a set of strings so that they all differ in at least two places. And we can see why this means that they can correct one erasure.
Starting point is 00:13:11 And you can generalize this to talk about errors as well. So instead of something getting replaced with a question mark, I would have. have like a zero getting replaced with a one or a one getting replaced with a zero, but I don't know where. But if I have such a design of strings, all three apart, I can actually correct one bit flip error. So you could put in a lot of redundancy, but then you're not sending much message. Right. Tell us how you would characterize the trade-off. So there's many, many things that one could care about, but a key trade-off that one cares about here is this notion of distance. So how far apart are my special strings in terms of
Starting point is 00:13:48 bit flip error. And then also, how many strings are there given the length of the string? I only get to say one message per string. So the more strings there are, the more messages I get to send. Oh, okay. That's a nice way to say it. So let's say I'm going to have a bunch of strings of length, I don't know, 100. Right. And I want them all to be at least 10 apart from each other. I want to be able to correct at least four errors or something like that. Then the more length 10 binary strings that I can fit in while never making any pair of them too close to each other, the more efficient that code is, basically the more expressive my language is. It occurs to me that maybe this is trouble with a mathematician talking to another
Starting point is 00:14:33 mathy person, that we may want to back up for a second and just go a little more palpable. As a grad student, I remember someone showing me a CD. I don't know if our listeners even remember what CDs are. But it was a big deal back then to take a coin or something and scratch the CD and then unlike vinyl records, you didn't hear it. The CD player would take care of it somehow. That is almost certainly something called Reed Solomon Error Correction, which is very pretty stuff based on polynomials. That is a perfect example of error correction at work. Uh-huh. And do you have some others for us? I mean, if I'm watching some TV show or movie, are they doing error correction when they send it to me? Yeah, basically anywhere where
Starting point is 00:15:14 data is stored or transmitted today, there's going to be error correction. There are also other things you can do on top, like for Netflix or something. They're trying to send you these relatively big packets or something. So if a packet doesn't come through, your computer will just say, hey, I didn't get that packet and they can send it again or something. I should also say, I'm not an expert on streaming video. I don't know exactly how that works, but that's my understanding. But yeah, there is error correction at some level, basically anywhere data's communicated or stored. So CDs are a great example, cell phone communication, satellite communication. It's absolutely very interesting. Error-correcting codes are just a very useful sort of combinatorial and
Starting point is 00:15:50 algorithmic primitive. They have uses in algorithm design. They have uses in complexity theory in theoretical computer science, pseudorandumness, things like that. Well, you've mentioned some error correcting codes by name already. Is there sort of a broad category of codes that we should know about other than those? Yeah, so there are lots and lots of different types of error-correcting codes out there. One family is algebraic. I mentioned read Solomon codes earlier. This would fit into that family. So algebraic codes are codes that are mostly based on polynomials over finite fields. So that's some jargon. But the basic idea is if our goal is to come up with a whole bunch of strings that don't agree with each other too much. And imagine, I were to come up with a big set of such strings.
Starting point is 00:16:37 by taking a whole bunch of quadratic polynomials and evaluating them at a bunch of points. And these are going to give me a bunch of strings in this example at strings of real numbers, 1, 2, 3, 4, 5, 6, 7, 8, and so on. And then I might ask, well, okay, how far apart are these strings from each other? Well, any two quadratic polynomials
Starting point is 00:16:57 can only intersect in two points. You can kind of see this for yourself if you, like, imagine trying to draw parabolas and, like, how many points can you get them to intersect in? It's no more than two. So that means that if I were to build a bunch of strings this way, they can't agree in more than two points. I see. That's a nice picture. And so that's the basic idea that's at the core of a lot of these algebraic constructions.
Starting point is 00:17:18 Another family is what I would call graph-based codes. So these are codes that are based on bipartite graphs. And so the idea here is I'm going to view all of the elements of my code words, that is the elements of these special vectors that I will send. So maybe I have a binary string of length N. I'm going to view each one of those N positions as a vertex and a graph. Then I'm going to have a bunch of other vertices, which I'll call check vertices, floating around, that are connected to some of these what's called variable vertices.
Starting point is 00:17:53 So the vertices that are indexing positions of my code were. And I'm going to say that I need to come up with a big set of binary strings that are all pairwise far apart. The set of strings I'm going to consider are strings so that when I label these n vertices with those numbers, zeros or ones, then all of the other vertices, these checked vertices, should always see something nice. Let's say they should always see an even number of ones or something like that. Depending on the types of constraints you put on these extra nodes, you can define a whole lots of classes of codes like this, and then you can use ideas from graph theory to analyze
Starting point is 00:18:33 how far apart these strings are from each other, and you can also use that to design efficient algorithms. This is something maybe we haven't talked about so much is that you actually don't just want a big set of strings that all fall apart from each other. You also want them to have enough structure so that you can correct errors efficiently with an efficient algorithm.
Starting point is 00:18:51 And sometimes these graph-based codes give us very efficient algorithms for doing that. Okay, so we've got graph-based codes, algebraic codes. Yeah, and I think those have been the two big ones for a long time, but relatively recently, there's a new class of codes, which was more information theoretically motivated, called polar codes. These were invented in 2008, something like that. And these don't maybe quite fit into
Starting point is 00:19:12 either of those categories. They're now in wireless standards and they're very good. Does the word polar have any meaning that we can understand? Is it a very technical thing or what's polar about it? Yes, so there's some really great intuition about it. So imagine that I have some noisy channel between you and me, something that's going to introduce some errors. What polar codes utilize is a transformation that basically will transform a whole bunch of channels that are kind of equally not great into a few channels that are fantastic and a few channels that are really, really bad. So that's where polar comes from. It kinds of polarizes into having a few really good channels and a few
Starting point is 00:19:58 channels that are really not great. And then you just throw away the channels that are really not great and use the really good ones. And that kind of gives you your code. Wow. It seems like a big gulf between that, you know, intersecting parabolas and when your friend's trying to tell you the end of their dramatic story from last night. Yeah, I have to admit, I mean, this is very far out, sophisticated stuff. So can I run a few analogies by you? Yeah, please. They're not perfect, but I think they might get some of the essence of what she was trying to say. For instance, she mentioned first algebraic codes. So here's my analogy.
Starting point is 00:20:33 It's sort of like I'm trying to send you a message, and I'm going to do it by sending you a secret math puzzle. Okay. Okay. The secret puzzle is going to take the form of a function. She mentioned algebraic objects like parabolas, you know, quadratic polynomials. So if there's some ingenious way that I can translate my message into a curve, that's already a cool idea, that my message is now going to have a shape.
Starting point is 00:20:56 And I'm going to send you a few points on that curve. And then the idea is that even if some of the points get messed up, because there's so much structure built into the way I'm sending it to you, you'll be able to reconstruct the whole curve and thereby recover the intended message. Right. That's very helpful. Yeah. Let me give you another one. So she mentioned graph-based codes. Okay.
Starting point is 00:21:18 The analogy here would be that this is like a group project with a lot of cross-checking. You know, like think of the kids. They have to do group work. Usually they hate it because some kids are loafing. Other kids are doing all the work. But here, it's not going to be like that. Each bit we're going to think of as like a person in the group. And each person is going to work on several parts of the project.
Starting point is 00:21:41 And then small subgroups of the whole group are going to compare their answers. So that if one person messes up, that's like an error, then the group can spot the error and fix it. there's a network structure in the problem that decides who's checking who, and through all those interactions, somehow errors can get corrected collectively. It's kind of like the opposite of a game of telephone. Something like that. Right. That's just a chain and the error just has no way to correct.
Starting point is 00:22:10 It just gets worse and worse and worse. Yeah. But in a more complicated graph, there's a potential for correction. It's interesting. That's the idea that the design of the graph can help you with the error correction. And then the last thing that Mary mentioned was polar codes, those are pretty deep and modern. As she says, it's only 2008. Those are apparently close to theoretically optimal.
Starting point is 00:22:30 It's like if I have a bunch of junkie walkie-talkies, and we're going to communicate with those. And there's some magical way to make some of the walkie-talkies fantastically clear and good while others become totally terrible and useless. And somehow you then make sure to send your message only over the good ones. Whereas the bad ones, you don't have to throw them away. They turn out to act like fillers for the rest of the structures. They play a sort of supporting role, almost like if you built a complicated bridge to do a different analogy, where you had some strong beams and some really bad beams. The bad beams are needed to make the whole bridge, but you're not going to put any weight on those. You're only going to drive over the strong beams.
Starting point is 00:23:13 It's something like that in the polar codes. You polarize the system into really good and really bad stuff and try to just use the good stuff. I'm not driving on that bridge, man, is all I'm saying, though. No, it's the best bridge you could be on in the face of noise, turns out. Right. Anyway, so it's really deep stuff. And we're going to hear more for Mary Wooters after the break about how these ideas get used in things like cloud servers, how they may help in the future for quantum computing, and just how practical it is. So stick around.
Starting point is 00:23:44 We'll see you after the break. Welcome back to the joy of why. We're speaking with Stanford computer scientist, Mary Wooters. From a little bit of the preparation I did to talk to you, I got the impression I got the impression that there are some folks in your field who might think of the whole subject as a little bit old-fashioned. Are there people making claims like that? Or what's your take on that? Yeah. So error correction has been around since probably late 40s, early 50s, so Richard Hamming, Claude Shannon, and then a whole bunch of other people, you know, sort of foundational
Starting point is 00:24:39 in this area. And all of these NASA missions and stuff like that, is there a correction going on there, too. In my view, I think air correction is a very classical area. but I think it's also still a very interesting and important area for a couple of reasons. So the first reason is that there are actually a lot of questions that people were asking back in the 50s that we still don't know the answers to. There's been progress on that over the past 75 years or so. There's still very foundational and fundamental questions that are left to be answered. The second reason is that technology changes.
Starting point is 00:25:11 And so the types of error correction that was developed in the 50s or 60s or 70s or 80s is no longer necessarily the type of error correction. that we want today. One big example of this is, now that everything is much more distributed, we might care a lot more about the communication costs of our error correction. This is not something that people cared about as much in the 50s. Communication between the parts of this distributed system? What would be the example? Yeah, so for example, if I want to store something in a distributed storage system, what does that mean? I have my data, and it's broken up and sent all over the cloud. And the types of errors that I might be worried about in this context are some
Starting point is 00:25:48 storage node goes down. I don't want to lose information if that happens. And traditionally, when such a thing were to happen, if a server were to go down, you'd want to set up a replacement node or something like that. And traditionally, that would involve downloading all of the data from elsewhere in the system, doing the error correction, figuring out what belongs on that node and putting it back there. That's a lot of sort of internal communication within the system. And one thing that Richards has been working on for a little while more recently is schemes for doing that whole repair process, but in a low bandwidth way. So a way where the internal nodes don't need to talk to each other all that much.
Starting point is 00:26:25 So this language of nodes, servers, I'm sure it's very tangible to you, but I'm never exactly sure what people are talking about. So a server is a kind of computer with a particular purpose or a node is another name. What are these things mean in plain speak? What I have in mind here is that you've got a hard drive over here, you've got a hard drive over there, and your data is getting split between those two hard drives. That's sort of the model. Okay.
Starting point is 00:26:52 And so that's something that didn't use to happen when different hard drives didn't talk to each other in the old days. It was just a computer sitting on a desk with a hard drive. And now you're saying a lot of distributed systems, and you refer to the cloud. I'm never exactly sure what the cloud is either. The cloud is what? Lots of computers in different locations, so-called server farms and stuff like that. Yeah, that's right. That's the model to have in mind here.
Starting point is 00:27:16 You have lots of different computers, either compute nodes. So these would be like computers that are actually doing tasks for you or storage nodes. These are just computers that are storing stuff for you. And you have many of them, and they're all talking to each other. So the error correction, then, that is a modern problem that we didn't use to have, is dealing with the errors between all the parts of this distributed system as they talk to each other. Yeah, like if you just forget that they're different computers and imagine they're all the same computer, then this is the same question as before.
Starting point is 00:27:45 but now our objective function is a little bit different. I don't just want whatever correction algorithm I'm doing to run fast. I also want it to run fast so that the computers don't have to talk to each other too much because that sort of intra-system communication is costly, or more costly than it used to be. Okay. Now, so given that we're now talking about new challenges or new-ish challenges, one that I'm sure is on the minds of some of our listeners is quantum computing, which we keep hearing about. that that is going to be a real thing sometime, soon, and we should be prepared for it.
Starting point is 00:28:22 Is there anything really different that we need to think about if we're doing error correction for quantum computing? And also, why would we need to do it for quantum computing? Definitely, error correction for quantum computing is a big thing. And at least at the moment, it's arguably much more important than error correction for classical computing, because my understanding is that the implementations we have of quantum computers are very, very noisy. And so it's really important to do error correction. There's a couple of different ways to get such things. One of them is based on classical coding theory.
Starting point is 00:28:54 So there's something called a CSS code, which basically you take two classical error-correcting codes that have some nice relationship with each other, and you can put them together to correct quantum errors. But then there's other sorts of types of error correction. You know, are there a special thing? I'm not an expert. But I have been thinking about it a little bit. It's a really fascinating problem, and I'm getting more and more into it recently. But I would say also, I guess quantum error correction is another great example.
Starting point is 00:29:23 Like, why should we still study error correction in this modern day and age? Didn't we know about it since the 50s? But, like, yeah, quantum computers are a great example of a technology that at least we hope will be around fairly soon that certainly wasn't around. Back then, and there's something called DNA storage, which basically you're using DNA as a storage system. So the idea here is we can synthesize DNA and we can sequence it. So basically, that means we can write and we can read. That allows us potentially to build storage systems out of it. And so the proposed application for such systems would be for very cold storage, so things
Starting point is 00:29:56 that you want to store for a very long time, because it is very costly to access. You need to sequence some DNA, which is not as fast as reading something from a hard disk. But it is extremely stable and extremely small, so you could archive things for a very long time, very efficiently this way. And for this application, like any storage system, you want to design error correction for it and what sorts of errors do you would expect precisely these synchronization errors like bases drop out or bases get added depending on the technology that you're using to sequence the DNA. Oh, that's an interesting idea. So rather than doing nanotechnology, this is like natural nanotechnology. Exactly, yeah. Wow. People are seriously thinking about
Starting point is 00:30:33 this? I mean, it seems so inaccessible to have stored things in DNA. Yeah, not only are people thinking about it, but people had done it. This is definitely possible. But what kinds of things would get stored that way? Well, any data you could want. So you could store a movie, you could store an image, you could store text files. It seems outrageous. True. Well, I mean, so one important thing to note is that this is not like DNA in a living organism. It's not like you're walking around with the library of converse, like on your elbow. This is just the molecules, not in a living organism. Right. There was a field some years
Starting point is 00:31:05 ago that people called DNA computing. So I guess the thinking is we've got to this naturally occurring wonderful storage device that evolution has used for probably literally billions of years. So why not use it ourselves? Yeah. And it brings up lots of really interesting challenges from the air correction point of view. Because we were talking earlier about these sorts of synchronization errors, like the types of errors that are introduced when doing DNA storage are different than the types of errors that you expect in silicon devices or something like that. And you have to designing different codes to deal with them. Right. I mean, the biologists know quite a bit about that.
Starting point is 00:31:41 Like we talked earlier about frame shift errors where you might be off by a single base and you're just shifted. A whole string is shifted. But they also have something called intercalation errors, right, where a wrong base can get wedged in there and split up a string. And there are other sorts of constraints as well. Like, you can't just write any old molecule you want or any old string of AGs and T's because some of them will have weird properties. Well, they'll then like kink up back on themselves or things like that. And so there are some constraints also for how you design your code just based on the biology. Well, okay, so you're giving us a pretty good broad survey here of this.
Starting point is 00:32:16 And I wonder when you mention the foundational questions that still exist in error correction, theoretically, is that where your heart is? Like, do you like to do the real world part of error correction? Do you like the fundamental theoretical computer science parts of it? What whines your watch? I'd really like these fundamental math questions that sort of remain open, I think, are very very important, not just from an error-correcting code standpoint, but also just from a mathematical standpoint. But then it's also sometimes nice to put on a more applied hat and at least go talk to folks who are actually implementing stuff, both because that means I get to learn new things,
Starting point is 00:32:53 which is fantastic. And it also, I think, informs different ways of thinking about these very classical questions. Like if you get an injection of a new more applied problem, you're like, oh, that is a question I hadn't thought to ask before. It is really interesting. So you can sort get new fundamental questions that way. Uh-huh. So we did talk earlier about your book, algorithms for toddlers, but I see that you've also designed games, that game about monsters that grew from beans?
Starting point is 00:33:20 That's true. Yeah, so a project I had with some of my friends from graduate school, it's a puzzle game, is called Bean in Nothingness. Oh, I see. Now I'm getting the joke. Hearing you saying it that way, now I get the joke. It's a philosophy pun. Ah.
Starting point is 00:33:35 Ah, being in nothingness. What? Is that a Jean-Pos ultra book or something? Yeah. Jeopardy question right there. Yeah, so being in nothingness is a famous book, Being and Nothingness, B-E-A-N, is our video game. And what's the idea there?
Starting point is 00:33:51 Well, I think it's a very good puzzle game. Personally, I find it very difficult. The storyline has a little bit to do with being a math graduate student because we were math graduate students when we started making it, although I think it took us nine years to finish, because once we graduated, it's dragged out for quite some time. But the basic premise is you get dropped in a puzzle that's totally static like nothing's going on. You have to get to an end tile.
Starting point is 00:34:15 And what you have is a magic wand and some recipes. And the recipes give you ways of turning beans into monsters, which will then act and interact with each other and with you in interesting ways. And they can kill you, but they can also help you in ways like destroying blocks or interacting with each other or things like this to help you get to the end. That's pretty separate from my academic work, but it's a lot of fun to make, and if your listeners enjoy difficult puzzle games, I would definitely recommend it. I have fun planning. Does that kind of playful energy come into your mathematical and scientific work? You mentioned to us that more generally you have some interest in illustration, but some people think of art and visual things as being kind of holistic right brain stuff, whereas very logical reasoning of the type I imagine you need to do in coding theory is kind of left brain. Maybe that's all bogus. Maybe you don't buy any of that. You're using both halves of your brain all the time, and you don't make that distinction. That's a good question. It definitely comes into my work in that the things that I produce in my teaching materials and also in my more academic scholarly work typically has silly illustrations.
Starting point is 00:35:21 But when I'm thinking about a problem, I think my first approach is very holistic and you want to understand the whole picture, you think about new ideas, and then at some point it's time to sit down and do all the nitty-gritty details, but that's the same when you're making a picture or something. For me, at least, you know, there is a time when you have to go down, you have to make all the little lines just write and do the inking and stuff. So to me, it seems like there's both big picture stuff and details stuff in both types of tasks.
Starting point is 00:35:50 It does to me, too, that you have to make a mess and then you have to clean it up. Right. Or something, right? It's sort of in that sequence. You've mentioned that you like the collaborative aspect of learning new things, how you can import those ideas back to fundamental theory. How would you put it, though? What is it that gives you the most pleasure in your subject?
Starting point is 00:36:10 Gosh, I'm fortunate that there are many things that bring me joy in my job. The human aspect I really love. Another aspect is the joy of discovery. Like, you know, solving puzzles is fun. Like, that's enjoyable. But then I think also there's this larger project that all researchers are contributing to the sum of human knowledge, stuff like that. You know, I think some of these problems are important. And even if I can't solve them perfectly, maybe I can make a dent.
Starting point is 00:36:35 And that brings me joy as well. Well, Mary, it's been a real pleasure to talk to you about error correcting codes. And thanks a lot for joining us today. Thanks so much for having me. I'm struck how we always come back to the human dimension. Here, in some sense, you're mathematicizing human communication in a way that's become so difficult, so challenging, and so abstract. And yet, at the end of the day, it kind of comes back to this human capacity to see things on this large scale and this small scale. Uh-huh.
Starting point is 00:37:06 Is it this desire to play is that, do you think a productive impulse for a scientist, is it an essential impulse? what do you make of that? There's no question that my colleagues will say, hey, you know, it might be really fun to look into, I don't know, something crazy, a non-orientable manifold or whatever, right? This is very common in how scientists talk to each other. You know, I'm not done playing around with how that looks in that space time. So I think it's anecdotally a really crucial part of how we interact with the work and why we pursue certain things. Yeah, even the word puzzle, I think, is indicative of a spirit. If you think of what you're doing as solving puzzles. That puts you in a childlike frame of mind where it's okay to make a mistake. You can explore. If you think you're doing an important problem, that feels more grown up and scarchy. I definitely think there's a childlike character to a lot of the scientists I know, and that there has to be a moment where you really are just playing in the sandbox before you can figure out, okay, what's the actual direction to go in? Well, as always, Jana, it's a pleasure to hang out and play with you.
Starting point is 00:38:13 Exactly. Until next time. Take care. Bye. Bye. Bye, bye. The Joy of Why is a podcast from Quantum Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence
Starting point is 00:38:51 on the selection of topics, guests, or other editorial decisions in this podcast or in Quantum Magazine. The Joy of Why is produced by PRX Productions. The production team is Caitlin Foltz, Livia Brock, Genevieve Sponsler, and Merritt Jacob. The executive producer of PRX productions is Jocelyn Gonzalez. Edwin Ochoa is our project manager. From Quanta magazine, Simon France and Samir Patel provided editorial guidance with support from Matt Carlstrom, Samuel Velasco, Simone Barr, and Michael Canyungalow. Samir Patel is Quanta's editor-in-chief. Our theme music is from APM Music. The episode art is by Peter Greenwood and our logo is by Jackie King and Christina Armitage. Special thanks to the Columbia
Starting point is 00:39:40 Journalism School in the Cornell Broadcast Studios. I'm your host, Steve Strogatz. If you have any questions or comments for us, please email us at quanta at simonsfoundation.org. Thanks for listening. From PRX.
