Science Friday - AI Is Coming Up With Brand New Molecules, Fueling Drug Discovery

Starting point is 00:00:03 This is Science Friday. I'm Flora Lichtman. Today on the podcast, exploring the frontier of AI and drug design. We've gone from a situation where I would say kind of we were on the lunatic fringe and everyone thought it was crazy to now kind of in the mainstream. It's a little bit weird. Last week, we briefly touched on a recent Nature paper where scientists unveiled new proteins that can neutralize deadly snake venom, a type of toxin found in cobras and their relatives that can be difficult to treat. A new anti-venom is a cool discovery on its own, but there is a twist to this snake tale. These new proteins were designed by AI. This is a fast-growing area of research, using AI to discover or design the building blocks of drugs. Another team at Penn is using AI to search for new potential antibiotics in the genomes of extinct species like Neanderthals and woolly mammoths. I know it sounds almost too sci-fi to be on SciFri,

Starting point is 00:01:03 but it is happening. Here to tell us more are two pioneers in this field. Dr. Cesar de la Fuente, bioengineer and presidential associate professor at the University of Pennsylvania in Philadelphia, and Nobel laureate, Dr. David Baker, director of the Institute for Protein Design and professor at the University of Washington in Seattle. Welcome to you both. It's great to be here. Thank you. David, you focus on designing new proteins, which I think can feel a little abstract for the non-protein designers out there. How should we think about them? How are you interested in using them? Proteins in nature carry out essentially all the important functions in our bodies and in all living things. And they solve a really enormously broad range of problems, ranging from powering

Starting point is 00:01:53 movement to thinking to capturing solar energy. So if you think about all the problems that we face today, and you know a little bit about what proteins do in nature, you sort of are brought to the conclusion that a lot of the problems that we face could be solved by new proteins, and we don't want to wait hundreds of millions of years for new proteins to evolve. And so the really exciting thing about protein design now is we can make brand new proteins and we can make them to solve a really wide range of different problems, ranging from snake bite to completely different problems like degrading plastic or getting methane out of the atmosphere or making improved vaccines.

Starting point is 00:02:31 So it's just a really exciting time now. Let's talk about your anti-venom proteins. Walk me through the process of how you got to them. We've been designing proteins to bind to other proteins for quite a few years. In the last three years or so, we've developed AI methods for doing this, which are kind of analogous to the way that DALL-E, the image generator, works. So in DALL-E, you might say, generate an image of a cat sitting on a table. Instead, in the case of the snake venom, what we did was, or what Susanna

Starting point is 00:03:04 Vasquez Torres, the brilliant student who did the work did, was she took the snake venom proteins whose structures were known, and she basically told the generative AI, generate a protein which binds to this site on the venom. The generative AI then builds up a completely new protein, which kind of fits perfectly against the snake venom toxin, just like a key would fit into a lock. Proteins in nature, they're encoded in DNA in our genome, so each protein in our bodies is encoded by a gene. Since the proteins that Susanna was making were brand new, she had to make brand new synthetic DNA that encoded them. She put them into bacteria, and the bacteria produced the proteins, and then she could determine whether they block the venom from killing

Starting point is 00:03:53 animals. Okay, so the algorithm gives you possible proteins that would work. And like, how many outputs do you get? Is it like 10, 3, 1,000? You get many thousands, but you can't test all of the designs that the computer suggests. And so Susanna developed a way of selecting a smaller subset of designs to test. And she tested about 100 designs for each venom, in some cases less. And she was able to find amongst those brand new proteins invented by the AI, very potent inhibitors of the toxin. Do they always work, like right off the shelf like that? Well, no, because many designs don't work at all. It's a computer fantasy, but they don't actually work in the real world. And sometimes they work, but they don't work well enough. They don't stick tightly enough to the venom.

Starting point is 00:04:45 And so then what Susanna did in those cases was to carry out a second round of design where you basically tell the AI, generate new proteins that look like this one but aren't exactly the same. And then it will kind of explore that region of that new protein it invented. And generally, when that is done, you find much better, much tighter binders than what you do in the first round. Cesar, your focus is on discovering new antibiotics. Tell me your process. Yes, in our case, about a decade ago, we wanted to introduce computational methods in our ability to discover new antibiotics. And typically, using traditional methods to come up with new antibiotics, it's a really physical process. You go around nature and you take soil samples or water samples and then you try to purify active compounds from all of that complex organic matter.

Starting point is 00:05:40 As you can imagine, this is a process that is very reliant on trial and error experimentation. So instead of doing that, we proposed why not take advantage of the decades' worth of biological data that we have at our disposal in the form of genomes, proteomes, metagenomes that have been sequenced, and why not try to use AI and develop the right algorithms to explore all this biological data digitally to be able to accelerate our ability to discover new antibiotics. So I want to understand this better. Are you saying that in our genomes or in the genomes of organisms, there are genes that create natural antibiotics? I think I'm familiar with how we fight infection using T cells and B cells. But are we producing antibiotics as well? Exactly. We're producing a lot of molecules, including proteins and including small proteins called peptides that can be used as antibiotics or as immunomodulators. So the idea here was

Starting point is 00:06:39 thinking of biology as an information source. All of biology, you can conceptualize it as a bunch of code, essentially. You know, if you think about DNA, it can be thought of as a bunch of nucleotide code, or if you think about proteins and peptides, they're composed of amino acids. And so in the end, it's all code, just like the code that we use to communicate with each other through the alphabet. And so it's code that can be searched using the correct algorithms. And if we can come up with those AIs, then we can systematically and very rapidly browse through all this vast amount of genetic data to try to find new molecules. What genomes are you looking in?

Starting point is 00:07:19 So we started by exploring the human proteome for the first time. The human proteome is essentially all the proteins encoded in our genome. And there we found thousands of new antibiotics that were previously undescribed. So we came up with this idea that perhaps we could identify similar compounds all throughout evolutionary history, including in our closest ancestors, Neanderthals and Denisovans. So we explored their genetic code, and we identified new antibiotics there, such as Neanderthalin. We had to come up with new names for all these new molecules because they had not really been described before.

Starting point is 00:07:57 Neanderthalin is a great name. Like, I can hear the drug commercial in my mind. Yeah, and then we don't only do the AI work or the computational work, but we also do all the experimental work. So then the computer gives us a number of sequences, which is essentially code, and then we have these chemical robots that can make these small peptides. Basically, we tell them, you know, make this sequence of amino acids, and the robots are capable of making that particular compound in the laboratory. And then we can actually test all those molecules against real bacteria that are clinically relevant that we have access to in my lab. And then if those work,

Starting point is 00:08:35 And if we do toxicity studies and the toxicity profiles look good, we can go to preclinical infection models, which is what we've done with Neanderthalin and many, many other compounds that we found all across the tree of life. So you are resurrecting extinct proteins. In some cases we are. In some cases, we don't see any homology, meaning we can't find them in any living organism in the biological world today. So in those particular instances, we are sort of resurrecting them, if you will, using chemistry in the lab.
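
The search Cesar describes, treating a proteome as text and scanning it for peptide-sized candidates, can be sketched roughly as follows. This is only an illustration under stated assumptions: the scoring rule (net positive charge plus hydrophobic fraction) is a toy heuristic chosen for this sketch, not the lab's model, and the function names and the example sequence are made up.

```python
from typing import Iterator, List, Tuple

# Toy residue classes (an assumption for this sketch); a trained model
# would learn far richer features from experimental data.
POSITIVE = set("KR")
NEGATIVE = set("DE")
HYDROPHOBIC = set("AILMFWVY")


def windows(sequence: str, length: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, fragment) for every fragment of the given length."""
    for start in range(len(sequence) - length + 1):
        yield start, sequence[start:start + length]


def toy_score(peptide: str) -> float:
    """Hypothetical score: net charge plus fraction of hydrophobic residues."""
    charge = sum(aa in POSITIVE for aa in peptide) - sum(aa in NEGATIVE for aa in peptide)
    hydrophobic_fraction = sum(aa in HYDROPHOBIC for aa in peptide) / len(peptide)
    return charge + hydrophobic_fraction


def top_candidates(sequence: str, length: int = 8, keep: int = 3) -> List[Tuple[float, int, str]]:
    """Score every window and return the best few as (score, start, fragment)."""
    scored = [(toy_score(frag), start, frag) for start, frag in windows(sequence, length)]
    # Highest score first; ties broken by earliest position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:keep]


if __name__ == "__main__":
    made_up_protein = "MDEEAGSKRLKKWLRKAGDETSE"  # invented sequence, not a real protein
    for score, start, frag in top_candidates(made_up_protein):
        print(f"{start:2d} {frag} {score:.2f}")
```

In a real pipeline, the candidates ranked this way would only be a shortlist: as described above, they still have to be synthesized and tested against bacteria in the lab.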

Starting point is 00:09:12 And we've gone beyond ancient humans. We've also developed a new AI model that we call Apex that enables us to actually sample every single extinct organism known to science. And so this Apex model has enabled us to discover new antibiotic compounds in ancient penguins, magnolia trees that disappeared throughout evolution, and also even the woolly mammoth or giant sloth. David and Cesar, you know, you both have been in this world using AI for biology, using AI to sort of create these new proteins to search for antibiotics. Did either of you encounter any skepticism when you were just starting out? Well, yes, when we first started trying to design new proteins completely from scratch with new functions,

Starting point is 00:10:03 It seemed really crazy because the only proteins that humans knew about were the proteins that have come down through nature, through evolution. It's kind of like the ancient elven ruins that have these magical properties. You know, natural proteins have these complicated names and they're very exotic. And so the idea that you could make brand new proteins up on the computer that would actually solve hard problems. It seemed kind of crazy. And indeed, for a long time, we couldn't make proteins that were very good at anything.

Starting point is 00:10:28 But particularly with the latest generation of AI methods that we've developed, now the proteins we can make are actually, they're starting to look more and more interesting. So I would say we've gone from a situation where I would say kind of we were on the lunatic fringe and everyone thought it was crazy to now kind of in the mainstream. It's a little bit weird. Everyone's talking about the protein design revolution and there are companies starting up every day to try and design new proteins. So it's really come full circle or maybe 180 degrees, I guess I should say. Cesar? Yeah, we also faced a lot of skepticism initially. I remember when I got recruited to MIT to do my postdoc.

Starting point is 00:11:05 I originally proposed this idea that we could maybe create an antibiotic on the computer. And at the time, MIT was a mecca for AI research. But most people were applying AI systems to pattern recognition algorithms, things like recognizing faces and sounds. But the idea of applying it to biology or to antibiotic discovery seemed kind of crazy at the time. The general consensus was that it was impossible, that biology was just too complex, too chaotic. There were too many variables for an algorithm to be of any use.

Starting point is 00:11:40 And perhaps because maybe I was younger at the time, I sort of ignored that skepticism. And I continued, you know, with my collaborators and my colleagues working on this area. And we were able to actually design an antibiotic on the computer that when synthesized, it was capable of killing some of the most dangerous pathogens in our society. And then we showed that it could reduce infections in preclinical mouse models. And so that was the beginning of convincing ourselves and others that this could be a whole new area of research where we could do antibiotic design and antibiotic discovery using machines. None of these antibiotics have made it to the drugstore yet. What will it take to get them there? Not yet. I think what we've been able to do so far with

Starting point is 00:12:27 AI is really dramatically accelerate our ability to discover new antibiotics. So instead of having to wait for years with traditional methods, which, you know, can take more than the time that it takes to complete a PhD program to come up with some candidates, today on the computer we can discover hundreds of thousands of candidates within a few hours. So just on any given day, just to give you a... Hundreds of thousands of candidates in a few hours? In a few hours. So it's quite remarkable. So on any given day, like this morning, for example, I came into the lab. I had a cup of coffee. And by lunchtime, my team has already told me that we have thousands of new molecules to sort through. And by dinner time, we're going to have a lot more. And so it's an

Starting point is 00:13:09 amazing playground for a scientist like myself that has been dreaming about really coming up with new antibiotics for so long. And now with AI for the last several years, we've been able to really help dramatically accelerate discovery. Don't go away. When we come back, how do we make sure this technology is used for good? For the foreseeable future, the methods I'm describing are going to be much more powerfully deployed to combat nature's pandemic viruses and perhaps bioweapons. Stick around. David, you're using AI to design good things to help humanity, but would it be equally easy

Starting point is 00:13:57 to use these algorithms to design bad stuff, like a bioweapon or something like that? It could be, but for better, well, for worse, I guess I should say, nature has already perfected ways of doing bad things. If you take something like Ebola virus or the 1918 Spanish flu, whose sequence is now publicly available, you know, it has this amazing and incredibly dangerous ability to, you know, infect a person and then spread in a population. And that involves many, many different biological functions that individually are quite a challenge to design.

Starting point is 00:14:32 So I think currently, and for the foreseeable future, the methods I'm describing are going to be much more powerfully deployed to combat nature's pandemic viruses and perhaps bioweapons, because there's plenty of stuff that's bad already out in biology. What we don't have are good ways to protect against viruses, for example. David, what's your bluest of blue sky ideas for using this approach? Well, I have a lot of them. And one of the fun things now, with the rest of the world now using our methods to do sort of the easier design problems, we're really focusing on the bluer sky problems. But I'll give you an example of one of them. What if we could design nanomachines that could circulate in our bodies and use a fuel that was present in our diet?

Starting point is 00:15:18 So, for example, a triglyceride or something else that might be part of your diet. And that nanomachine would do things like unclog arteries, you know, untangle amyloid plaques, basically be a much more active cleaner-upper than current drugs are. Current protein medicines are things like antibodies, which just bind to a target and block an interaction. But what if we could make medicines that actually actively reconstruct and fix damaged tissue and perhaps could help with some of the problems in aging? Wow. Cesar, where do you think this field's going to be in five years or where do you hope it will be? Well, my greatest dream is that hopefully some of the things that we've come up with will transition into the clinic and eventually help people. That's what

Starting point is 00:16:01 really drives us every single day to do the work that we do. I think one thing perhaps for the future is that we need better data sets to train AI models. If we want to see really an explosion of success stories of AI in biology and in chemistry, we're going to need really good standardized high-quality data sets. And I've been talking to NSF and NIH to try to convince them to maybe start funding data set generation projects, not only hypothesis-driven projects, in order to be able to train the next AI models that will continue fueling this revolution. Just echoing what Cesar said about the importance of data sets, the work we've done on designing new proteins rests entirely on the really hard work done by generations of graduate students and postdocs

Starting point is 00:16:48 and scientists solving protein structures and putting them in the Protein Data Bank. So they're really the unsung heroes of the advances in AI for protein design and protein structure prediction, because they generated the really high-quality data that the protein design methods, for example, that I described were trained on. You know, every crucial decision in my lab, it takes into account the recommendation from the machine, but also the recommendation made by the human scientists that actually generate the data and then can inform decisions made by the algorithm. Yeah, that's very true.

Starting point is 00:17:24 In our case, I mean, we have to decide what problem to try to solve, and that's very much a human decision, and then the AI will generate some number of solutions, and then you have to decide which ones you're going to test. That's a human decision. You have to make them in the lab. That's a human action. And then you have to decide what to do with them once they work, and that's a human decision as well. It's really a tight collaboration between humans and machines at this point. A tight collaboration for now, anyway. Yes. For now. For now, we're still very helpful. Yes. Thank you both for taking time to talk to me today. Thank you so much. Thank you. Dr. Cesar de la Fuente, bioengineer and presidential associate professor at the University of Pennsylvania in Philadelphia,

Starting point is 00:18:04 Nobel laureate Dr. David Baker, director of the Institute for Protein Design, and professor at the University of Washington in Seattle. Before we go, L.A. dwellers, we are working on a segment about the toxins left behind by the fires, and we want to hear from you. How is this affecting you? What are you concerned about? How are you approaching cleanup? Let us know your questions. Leave us a voicemail at 646-76767-6532, or send us an email at scifri@sciencefriday.com.

Starting point is 00:18:41 And that is about all we have time for. Lots of folks helped make the show happen, including Jordan Smoczyk, Charles Bergquist, George Harper, John Dankosky. I'm Flora Lichtman. Thanks for listening.
