Stuff You Should Know - Short Stuff: DNA Data Storage

Episode Date: June 7, 2023

One of the more futuristic things around right now on Earth is research into encoding data into strings of genetic code.

Transcript
Starting point is 00:00:00 So, there is a ton of stuff they don't want you to know. Yeah, like does the US government really have alien technology? Or what about the future of AI? What happens when computers actually learn to think? Could there be a serial killer in your town? From UFOs to psychic powers and government cover-ups, from unsolved crimes to the bleeding edge of science, history is riddled with unexplained events. Listen to stuff they don't want you to know on the iHeartRadio app,
Starting point is 00:00:29 Apple Podcasts, or wherever you find your favorite shows. Hey, and welcome to the Short Stuff. I'm Josh and there's Chuck and Jerry's here too, and so's Dave in spirit, and we're coming at you from the future of right now. This is one of those where it's so interesting, so cool, so mind-blowing and so promising, and then you get to the very end and you're like, oh. To me that just meant just give it a little more time. No, and a lot of times that is the case and probably will be in this case. But it was such an "oh," and you'll see in about, you know, 12ish minutes what we're talking about.
Starting point is 00:01:13 So essentially what we're talking about first is data. We've got a lot of data. Like, anytime somebody says something, thinks something, writes something down, somebody comes up with a new recipe or a new patent or whatever, that gets encoded. It's data that gets saved. We don't really throw stuff away anymore. And so we're kind of awash in data. And if you want to take that data and you want to save it, you want to preserve it, let's say it's really important, like you're the Library of Congress. Sure. Get this, man. I did not realize this. What you do is you take that data and you transfer it onto the same kind of magnetic reels that those old room-size computer mainframes used to read and write data. Yeah, you put it on tape. Yeah, exactly. I didn't realize that,
Starting point is 00:01:57 but it's just the proven go-to means of long-term, it's called archival storage, of the kind of data that you don't really need to access anytime soon. It's called low-touch data. You're just putting it literally in cold storage. Yeah, I mean, it's been around for a long time. Very dependable, very durable, very reliable. It doesn't cost a lot of money.
Starting point is 00:02:24 It can hold a ton of data. One tape can hold between 1 million and 15 million gigabytes, or 1 to 15 petabytes. That's a lot of stuff. And the problem is, you know, it's all relative, but they're kind of big, but not big, big. They're three inches by three inches, and you're like, ah, Chuck, that's not very big at all. But that is big when you talk about potentially billions of these things, and having to store them in a place that is, like you said, cold storage.
Starting point is 00:02:57 So it's the cost of building these cold storage buildings that is the issue when it comes to this three-by-three-inch thing. That, and then also, you know, they've been around for three quarters of a century, so we know they last that long if you keep them in cold storage, but we don't know exactly how long they will last. So there's also a question of that. So that combined, so cost, questions about how long it will last, and then also just the enormous amounts of information we're adding every year, are making people look for other ways to encapsulate data, to encode data in ways that are cheaper,
Starting point is 00:03:37 that are smaller, that require less money to keep cold, and what they've come up with Chuck, for anybody who has looked at the title of this episode, they won't be very surprised, but DNA. That's right. I know it's early, but we got to take a break right there, right? Agreed. Alright, we'll be right back. There's a ton of stuff they don't want you to know.
Starting point is 00:04:11 Does the US government really have alien technology? And what about the future of artificial intelligence, AI? What happens when computers learn to think? Could there be a serial killer in your town? From UFOs to psychic powers and government cover-ups, from unsolved crimes to the bleeding edge of science, history is riddled with unexplained events. We spent a decade applying critical thinking
Starting point is 00:04:34 to some of the most bizarre phenomena in civilization and beyond. Each week, we dive deep into unsolved mysteries, conspiracy theories and actual conspiracies. You've heard about these things, but what's the full story? Listen to stuff they don't want you to know on the iHeartRadio app, Apple Podcasts, or wherever you find your favorite shows. What's up fam? I'm Brian Ford, Artisan Baker, and host of the new podcast, Flaky Biscuit. On this podcast, I'm gonna get to know my guests by cooking up their favorite nostalgic meal.
Starting point is 00:05:12 It could be anything from Twinkies to mom's Thanksgiving dressing. Sometimes I might get it wrong, sometimes I'll get it right. I'm so happy it's good because if it wasn't, I'd be like, you know, everybody not my mom. Either way, we will have a blast. You'll have access to every recipe so you can cook and bake alongside me. As I talk to artists, musicians, and chefs about how this meal guided them to success.
Starting point is 00:05:38 And these nostalgic meals, fam, they inspire one of a kind conversations. When I bake this recipe, it hit me like a ton of bricks. Does this podcast come with a therapist? He can. Listen to Flaky Biscuit every Tuesday on the iHeartRadio app, Apple Podcasts, or wherever you get your podcasts. All right, so you dropped a pretty big truth bomb on everyone. I'm sure there are people that for 60 seconds were like, what? Storing data on DNA, dude, that's in my body.
Starting point is 00:06:22 What are you talking about? Putting data in my body? You got that straight. You don't have that straight, but here's a pretty good, as far as how much this stuff can hold. This is pretty staggering stuff. And this is from a couple of dudes from the Los Alamos National Lab.
Starting point is 00:06:39 And I think you got it from Scientific American. Here's how much DNA can hold. 74 million, million bytes of information, which is basically the Library of Congress. That's a lot. That's a lot. You can put that, if you were putting it on DNA, into the size of something as big as a poppy seed, 6,000 times over. Right. Said another way, if you split that seed in half, you could store all of the data on Facebook.
Starting point is 00:07:10 Yeah, and then by 2025, the size of the data that humanity's generated will reach an estimated 33 zettabytes. So 33 followed by 21 zeros of bytes of information. A lot of bytes. If you can transcribe that all into DNA, you could fit the whole thing into a ping pong ball. Not a three-by-three plastic cartridge, multiple times over. A single ping pong ball could hold all of the world's data and you can make multiple ping pong balls as backups too. Yeah, and you don't need to, and it's pretty easy to duplicate them, apparently, and you
Starting point is 00:07:52 don't need to keep them in the fridge, even though you could put it in an egg carton and be set. You don't even have to. It's going to last a long time not being in cold storage and probably even longer in cold storage. You could give a ping pong ball to every living human to keep in their fridge and it would have no problem whatsoever. Be like, here, you keep this cold for 150 years. And only half of them would eat that ping pong ball thinking it was an egg. Yeah. Yeah. So you'd still be left with all those backups.
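To put the sizes quoted above on one scale, here is a minimal sketch of the unit arithmetic. The figures ("74 million, million bytes", "6,000 times over", 33 zettabytes) come from the episode; the decimal SI units and the variable names are assumptions made for illustration.

```python
# Decimal (SI) units are assumed here; the episode does not say which it means.
TERABYTE = 10**12
PETABYTE = 10**15
ZETTABYTE = 10**21

# Figures as quoted in the episode.
library_of_congress = 74 * 10**6 * 10**6   # "74 million, million bytes"
poppy_seed = 6_000 * library_of_congress   # that amount "6,000 times over"
world_2025 = 33 * ZETTABYTE                # estimated data generated by 2025

print(library_of_congress / TERABYTE, "terabytes in the Library of Congress figure")
print(poppy_seed / PETABYTE, "petabytes in a poppy-seed-sized volume of DNA")
print(world_2025 // library_of_congress, "Library-of-Congress-sized collections in 33 zettabytes")
```

Read this way, the Library of Congress figure is 74 terabytes, and the ping pong ball in the comparison would have to hold several hundred million collections of that size.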
Starting point is 00:08:24 Here's where it gets super interesting though, because, you know, as most people listening probably are, like I said, as I was reading this, I was like, okay, that's a cool idea, but like how in the world does this work? And it turns out that it's not that mind-blowing or difficult. I'm not saying I could go out and do it, but it makes a lot of sense to wrap your head around because DNA, as we all know, is composed of four nucleotides, or at least combinations of guanine, thymine, adenine, and cytosine. Just remember GTAC. Gattaca. Yeah. Oh, ironically. All this digital data that's out there in the world, though, is encoded, as everyone knows, in ones and zeros. Mm-hmm. So it's, you know,
Starting point is 00:09:15 it sounds like, you know, it can be any combination of ways. But when you really break it down, you can really just have zero-zero, zero-one, one-zero, or one-one, uh-huh, as far as those combinations go. And that's four things. And there are those four nucleotides. So if you just say, hey, each one of these nucleotides is going to be assigned a different number, then that's all you need. There's the key to your map.
Starting point is 00:09:37 Yeah. So say adenine stands for 0, 0, and guanine stands for 1, 1, and so forth. Each one stands for one of those pairs of possible combinations. Then you can take any string of binary data, zeros and ones, and turn it into genetic code based on those nucleotides. So you would just go from a string of ones and zeros to a string of As, Ts, Gs, and Cs. That's it. The thing is, you're not turning ones and zeros into letters. You're actually transcribing the ones and zeros from binary code into physical genetic material. You're actually putting a base of adenine right there. You're putting a base of thymine next to it. Like, depending on how the code reads
Starting point is 00:10:26 with the ones and the zeros and what order they're in, you're actually physically creating genetic material, DNA, but rather than encoding the information to build a living thing, you're encoding the information for the entire catalog of Stuff You Should Know. And honestly, isn't that the first thing we should preserve in DNA? Sure. Good. After the movies of Gene Wilder. How about at the same time as the movies of Gene Wilder?
Starting point is 00:10:56 Can we just agree to that? How dare you? Hey, I think highly of us and Gene Wilder. I don't know why he's been on my mind lately, but he has been. He's been shaking it for you in your head? He's been shaking it for me. So this all sounds great.
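The two-bits-per-nucleotide mapping the hosts describe can be sketched in a few lines. This is a minimal illustration, not how any lab actually does it: A = 00 and G = 11 follow the episode, while C = 01 and T = 10 are assumptions made only to complete the table, and the function names `encode` and `decode` are hypothetical.

```python
# Hypothetical 2-bit mapping. A=00 and G=11 follow the episode;
# C=01 and T=10 are assumptions chosen only to complete the table.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "T", "11": "G"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of nucleotide letters, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Reverse encode(): nucleotide letters back to the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"SYSK")
print(strand)          # four bases per byte, so 16 letters for 4 bytes
print(decode(strand))  # round-trips back to the original bytes
```

The sketch only produces letters. As the hosts stress, the real process goes one step further and synthesizes the physical bases in that order.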
Starting point is 00:11:12 And like I've mentioned at the very beginning, this is one of those things where we're like, holy cow. This is the future, this is it. And then the "oh" at the end. And the "oh" is that it's really expensive to do this. Like, we can do this. We figured out how to do this. It's possible.
Starting point is 00:11:29 We have the tech to do this. But here's a tape name, LTO-9. It's a magnetic storage tape. You can get it for $8 and you can get one petabyte of storage. That would cost you about a trillion dollars to do for DNA. Yeah, there was a guy who was interviewed in Ars Technica named Hyunjun Park. He's the CEO of a data storage company called Catalog, and he even estimated, he said, let's say it costs you three cents to print a single nucleotide. Yes, that's cheap, but for each base pair now you're up to six cents, and then now you're translating gigabytes.
Starting point is 00:12:02 You're entering millions of dollars. So, if it costs millions of dollars to translate a gigabyte, it'd cost trillions of dollars to do a petabyte, and the other problem of it too, Chuck, is that it's really, really slow, right? It's super slow. So, this is a clear case of one of those things, like you mentioned, which is like, just wait, because like with any technology, it's going to get quicker, it's going to get cheaper. I don't know if this is like 100 years in the future, but at this point, the cost is just so outrageous that there's
Starting point is 00:12:35 no government that's going to fund something like this. I mean, a trillion dollars for one petabyte of information, you're not going to sell that very easily. And then, yeah, like I was saying, the speed. If you're transferring information from one of those magnetic storage tapes, you're transferring it at about a gigabyte per second, typically. If it takes even like a second to print a single nucleotide,
Starting point is 00:13:00 which still is very fast, but you're thinking on human-level fast. We need to think on, like, how many ones and zeros are in the average gigabyte of code. Now you're talking about decades to transfer a petabyte disk's worth of information using DNA technology. Yeah. So yes, it's very slow right now. It's very expensive right now, but I don't think we're a hundred years off, Chuck, because we're able to do this now relatively cheaply
Starting point is 00:13:30 because the Human Genome Project came along. That was 20 years ago. Think about how far we've come. And this is like the hardest chunk, the first 20 years. I think it's just gonna get faster and easier. I don't think we're gonna be waiting a hundred years to see DNA data storage. Does that mean that Stuff You Should Know is in the hardest chunk?
Starting point is 00:13:49 We're near 15. I think so. Yeah, it feels like it. Okay. I'm kidding. Well, I'm teasing Chuck, right? Yeah. Just teasing, which means, of course, Short Stuff is out. Stuff You Should Know is a production of iHeartRadio.
Starting point is 00:14:08 For more podcasts from iHeartRadio, visit the iHeartRadio app, Apple Podcasts, or wherever you listen to your favorite shows.
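The cost and speed figures quoted in the episode can be checked with a quick back-of-the-envelope sketch. The three cents per nucleotide, one second per nucleotide, and one gigabyte per second for tape are the numbers mentioned in the conversation. The decimal gigabyte, the two bits per nucleotide, and the single writer working strictly one base at a time are assumptions, and the last of these makes the write time far longer than any parallel process would take.

```python
BYTES_PER_GB = 10**9          # assumption: decimal gigabyte
GB_PER_PB = 10**6             # "1 million gigabytes" per petabyte, as in the episode
BITS_PER_NUCLEOTIDE = 2       # from the four-letter mapping described in the episode
COST_PER_NUCLEOTIDE = 0.03    # dollars, the hypothetical figure quoted in the episode
SECONDS_PER_NUCLEOTIDE = 1.0  # "even like a second"; one serial writer is assumed
TAPE_GB_PER_SECOND = 1.0      # tape transfer rate quoted in the episode
SECONDS_PER_YEAR = 365.25 * 24 * 3600
SECONDS_PER_DAY = 24 * 3600

nucleotides_per_gb = BYTES_PER_GB * 8 // BITS_PER_NUCLEOTIDE
cost_per_gb = nucleotides_per_gb * COST_PER_NUCLEOTIDE
cost_per_pb = cost_per_gb * GB_PER_PB
dna_years_per_gb = nucleotides_per_gb * SECONDS_PER_NUCLEOTIDE / SECONDS_PER_YEAR
tape_days_per_pb = GB_PER_PB / TAPE_GB_PER_SECOND / SECONDS_PER_DAY

print(f"{nucleotides_per_gb:,} nucleotides per gigabyte")
print(f"${cost_per_gb:,.0f} to write one gigabyte as DNA")
print(f"${cost_per_pb:,.0f} to write one petabyte as DNA")
print(f"{dna_years_per_gb:,.1f} years to write one gigabyte, one base per second")
print(f"{tape_days_per_pb:,.1f} days to stream one petabyte from tape")
```

Under these assumptions the cost comes out at millions of dollars per gigabyte and well into the trillions per petabyte, the same order of argument the hosts make. Doubling the per-nucleotide cost to the six cents per base pair mentioned in the episode simply doubles both totals.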
