Science Friday - AI Is Coming Up With Brand New Molecules, Fueling Drug Discovery
Episode Date: January 27, 2025
A recent study in the journal Nature unveiled new proteins that can neutralize the deadliest of snake venoms. They’re “new” in that they aren’t found in nature: they were created in a lab, dreamed up by AI.
Using AI to discover, or design, the building blocks of drugs is a fast-growing area of research. Another team of scientists out of Philadelphia is using AI to discover new antibiotics by resurrecting long-lost molecules from extinct species like Neanderthals and woolly mammoths.
We know what you’re thinking: It sounds too sci-fi to be true.
Flora Lichtman talks with two pioneers in the field about how AI is supercharging drug discovery: Dr. César de la Fuente, bioengineer and presidential associate professor at the University of Pennsylvania in Philadelphia, and Nobel laureate Dr. David Baker, director of the Institute for Protein Design and professor at the University of Washington in Seattle.
Transcripts for each segment will be available after the show airs on sciencefriday.com.
For our Los Angeles listeners: We’re working on a story about the toxins left behind by the fires and we want to hear from you. How is this affecting you? Are you worried about the air and water and soil? How are you approaching clean-up? And what questions do you have?
Leave us a voicemail at 1-646-767-6532 or send us an email at scifri@sciencefriday.com.
Subscribe to this podcast. Plus, to stay updated on all things science, sign up for Science Friday's newsletters.
Transcript
This is Science Friday. I'm Flora Lichtman. Today on the podcast, exploring the frontier of AI and drug design.
We've gone from a situation where I would say kind of we were on the lunatic fringe and everyone thought it was crazy to now kind of in the mainstream. It's a little bit weird.
Last week, we briefly touched on a recent Nature paper where scientists unveiled new proteins that can neutralize deadly snake venom, a type of toxin found in cobras and their relatives that can be
difficult to treat. A new anti-venom is a cool discovery on its own, but there is a twist to this
snake tale. These new proteins were designed by AI. This is a fast-growing area of research,
using AI to discover or design the building blocks of drugs. Another team at Penn is using AI to
search for new potential antibiotics in the genomes of extinct species like Neanderthals and
woolly mammoths. I know it sounds almost too sci-fi to be on SciFri,
but it is happening. Here to tell us more are two pioneers in this field. Dr. Cesar de la Fuente,
bioengineer and presidential associate professor at the University of Pennsylvania in Philadelphia,
and Nobel laureate, Dr. David Baker, director of the Institute for Protein Design and professor
at the University of Washington in Seattle. Welcome to you both. It's great to be here.
Thank you. David, you focus on designing new proteins, which I think can feel a little
abstract for the non-protein designers out there. How should we think about them? How are you interested in
using them? Proteins in nature carry out essentially all the important functions in our bodies and in all
living things. And they solve a really enormously broad range of problems, ranging from powering
movement to thinking to capturing solar energy. So if you think about all the problems that we
face today, and you know a little bit about what proteins do in nature, you sort of are brought to
the conclusion that a lot of the problems that we face could be solved by new proteins,
and we don't want to wait hundreds of millions of years for new proteins to evolve.
And so the really exciting thing about protein design now is we can make brand new proteins
and we can make them to solve a really wide range of different problems, ranging from snake
bite to completely different problems like degrading plastic or getting methane out of the atmosphere
or making improved vaccines.
So it's just a really exciting time now.
Let's talk about your anti-venom proteins.
Walk me through the process of how you got to them.
We've been designing proteins to bind to other proteins for quite a few years.
In the last three years or so, we've developed AI methods for doing this,
which are kind of analogous to the way that DALL-E, the image generator, works.
So in DALL-E, you might say, generate an image of a cat,
sitting on a table. Instead, in the case of the snake venom, what we did was, or what Susana
Vázquez Torres, the brilliant student who did the work, did, was she took the snake venom proteins
whose structures were known, and she basically told the generative AI, generate a protein which binds
to this site on the venom. The generative AI then builds up a completely new protein, which
kind of fits perfectly against the snake venom toxin, just like a key would fit into a lock.
Proteins in nature, they're encoded in DNA in our genome, so each protein in our bodies is
encoded by a gene. Since the proteins that Susana was making were brand new, she had to make
brand new synthetic DNA that encoded them. She put them into bacteria, and the bacteria
produced the proteins, and then she could determine whether they block the venom from killing
animals. Okay, so the algorithm gives you possible proteins that would work. And like, how many
outputs do you get? Is it like 10, 3, 1,000? You get many thousands, but you can't test all of the
designs that the computer suggests. And so Susana developed a way of selecting a smaller subset
of designs to test. And she tested about 100 designs for each venom, in some cases less. And she was
able to find amongst those brand new proteins invented by the AI, very potent inhibitors of the
toxin. Do they always work, like right off the shelf like that? Well, no, because many designs
don't work at all. It's a computer fantasy, but they don't actually work in the real world. And
sometimes they work, but they don't work well enough. They don't stick tightly enough to the venom.
And so then what Susana did in those cases was to carry out a second round of design where you basically tell the AI, generate new proteins that look like this one but aren't exactly the same.
And then it will kind of explore that region of that new protein it invented.
And generally, when that is done, you find much better, much tighter binders than what you do in the first round.
Cesar, your focus is on discovering new antibiotics.
Tell me your process.
Yes, in our case, about a decade ago, we wanted to introduce computational methods in our ability to discover new antibiotics.
And typically, using traditional methods to come up with new antibiotics, it's a really physical process.
You go around nature and you take soil samples or water samples and then you try to purify active compounds from all of that complex organic matter.
As you can imagine, this is a process that is very reliant on trial and error experimentation.
So instead of doing that, we proposed why not take advantage of the decades' worth of biological data that we have at our disposal in the form of genomes, proteomes, metagenomes that have been sequenced,
and why not try to use AI and develop the right algorithms to explore all this biological data digitally to be able to accelerate our ability to discover new
antibiotics. So I want to understand this better. Are you saying that in our genomes or in the
genomes of organisms, there are genes that create natural antibiotics? I think I'm familiar
with how we fight infection using T cells and B cells. But are we producing antibiotics as well?
Exactly. We're producing a lot of molecules, including proteins and including small proteins
called peptides that can be used as antibiotics or as immunomodulators. So the idea here was
thinking of biology as an information source. All of biology, you can conceptualize it as a bunch of
code, essentially. You know, if you think about DNA, it can be thought of as a bunch of nucleotide code,
or if you think about proteins and peptides, they're composed of amino acids. And so in the end,
it's all code, just like the code that we use to communicate with each other through the alphabet.
And so it's code that can be searched using the correct algorithms. And if we can come up with those AIs,
then we can systematically and very rapidly browse through all this vast amount of genetic data
to try to find new molecules.
What genomes are you looking in?
So we started by exploring the human proteome for the first time.
The human proteome are essentially all the proteins encoded in our genome.
And there we found thousands of new antibiotics that were previously undescribed.
So we came up with this idea that perhaps we could identify similar compounds,
all throughout evolutionary history, including in our closest ancestors, Neanderthals and Denisovans.
So we explored their genetic code, and we identified new antibiotics there, such as neanderthalin.
We had to come up with new names for all these new molecules because they had not really been
described before.
Neanderthalin is a great name.
Like, I can hear the drug commercial in my mind.
Yeah, and then we don't only do the AI work or the computational work, but we also do all the
experimental work. So then the computer gives us a number of sequences, which is essentially code,
and then we have these chemical robots that can make these small peptides. Basically, we tell them,
you know, make this sequence of amino acids, and the robots are capable of making that
particular compound in the laboratory. And then we can actually test all those molecules against
real bacteria that are clinically relevant that we have access to in my lab. And then if those work,
and if we do toxicity studies and the toxicity profiles look good, we can go to preclinical
infection models, which is what we've done with neanderthalin and many, many other
compounds that we found all across the tree of life.
So you are resurrecting extinct proteins.
In some cases we are.
In some cases, we don't see any homology, meaning we can't find them in any living organism
in the biological world today.
So in those particular instances, we are sort of resurrecting them, if you will, using chemistry in the lab.
And we've gone beyond ancient humans.
We've also developed a new AI model that we call APEX that enables us to actually sample every single extinct organism known to science.
And so this APEX model has enabled us to discover new antibiotic compounds in ancient penguins, magnolia trees
that disappeared throughout evolution, and also even the woolly mammoth or giant sloths.
David and Cesar, you know, you both have been in this world using AI for biology,
using AI to sort of create these new proteins to search for antibiotics.
Did either of you encounter any skepticism when you were just starting out?
Well, yes, when we first started trying to design new proteins completely from scratch with new functions,
it seemed really crazy because the only proteins that humans knew about were the proteins
that have come down through nature, through evolution.
It's kind of like the ancient elven ruins that have these magical properties.
You know, natural proteins have these complicated names and they're very exotic.
And so the idea that you could make brand new proteins up on the computer that would actually
solve hard problems, it seemed kind of crazy.
And indeed, for a long time, we couldn't make proteins that were very good at anything.
But particularly with the latest generation of AI methods that we've developed, now the
proteins we can make are actually, they're starting to look more and more interesting. So I would say
we've gone from a situation where I would say kind of we were on the lunatic fringe and everyone thought
it was crazy to now kind of in the mainstream. It's a little bit weird. Everyone's talking about
the protein design revolution and there are companies starting up every day to try and design new
proteins. So it's really come full circle or maybe 180 degrees, I guess I should say.
Cesar? Yeah, we also faced a lot of skepticism initially. I remember when I got recruited to MIT,
to do my postdoc.
I originally proposed this idea that we could maybe create an antibiotic on the computer.
And at the time, MIT was a mecca for AI research.
But most people were applying AI systems to pattern recognition algorithms, things like
recognizing faces and sounds.
But the idea of applying it to biology or to antibiotic discovery seemed kind of crazy
at the time.
The general consensus was that it was impossible, that biology was
just too complex, too chaotic. There were too many variables for an algorithm to be of any use.
And perhaps because maybe I was younger at the time, I sort of ignored that skepticism. And I
continued, you know, with my collaborators and my colleagues working on this area. And we were
able to actually design an antibiotic on the computer that when synthesized, it was capable of
killing some of the most dangerous pathogens in our society. And then we showed that it could
reduce infections in preclinical mouse models. And so that was the beginning of convincing ourselves
and others that this could be a whole new area of research where we could do antibiotic design
and antibiotic discovery using machines. None of these antibiotics have made it to the drugstore
yet. What will it take to get them there? Not yet. I think what we've been able to do so far with
AI is really dramatically accelerate our ability to discover new antibiotics. So instead of having
to wait for years with traditional methods, which, you know, it can take more than the time
that it takes to complete a PhD program to come up with some candidates. Today on the computer,
we can discover hundreds of thousands of candidates within a few hours. So just on any given day,
just to give you a... Hundreds of thousands of candidates in a few hours? In a few hours. So it's
quite remarkable. So on any given day, like this morning, for example, I came into the lab. I had a
cup of coffee. And by lunchtime, my team has already told me that we have thousands of new
molecules to sort through. And by dinner time, we're going to have a lot more. And so it's an
amazing playground for a scientist like myself that has been dreaming about really coming up
with new antibiotics for so long. And now with AI for the last several years, we've been
able to really help dramatically accelerate discovery.
Don't go away. When we come back, how do we make sure this technology is used for good?
For the foreseeable future, the methods I'm describing are going to be much more powerfully deployed
to combat nature's pandemic viruses and perhaps bioweapons.
Stick around.
David, you're using AI to design good things to help humanity, but would it be equally easy
to use these algorithms to design bad stuff, like a bioweapon or something like that?
It could be, but for better, well, for worse,
I guess I should say, nature has already perfected ways of doing bad things.
If you take something like Ebola virus or the 1918 Spanish flu, whose sequence is now
publicly available, you know, it has this amazing and incredibly dangerous ability to, you know,
infect a person and then spread in a population.
And that involves many, many different biological functions that individually are quite a
challenge to design.
So I think currently, and for the foreseeable future, the methods I'm describing are going to be
much more powerfully deployed to combat nature's pandemic viruses and perhaps bioweapons,
because there's plenty of stuff that's bad already out in biology. What we don't have are good
ways to protect against viruses, for example. David, what's your bluest of blue sky ideas
for using this approach? Well, I have a lot of them. And one of the fun things now, with the
rest of the world now using our methods to do sort of the easier design problems, we're really
focusing on the bluer sky problems. But I'll give you an example of one of them. What if we could
design nanomachines that could circulate in our bodies and use a fuel that was present in our diet?
So, for example, a triglyceride or something else that might be part of your diet. And that
nanomachine would do things like unclog arteries, you know, untangle amyloid plaques,
basically be a much more active cleaner-upper than current drugs are. Current protein
medicines are things like antibodies, which just bind to a target and block an interaction. But what
if we could make medicines that actually actively reconstruct and fix damaged tissue and perhaps
could help with some of the problems in aging? Wow. Cesar, where do you think this field's going to be
in five years or where do you hope it will be? Well, my greatest dream is that hopefully some of the
things that we've come up with can transition into the clinic and eventually help people. That's what
really drives us every single day to do the work that we do. I think one thing perhaps for the future is
that we need better data sets to train AI models. If we want to see really an explosion of success
stories of AI in biology and in chemistry, we're going to need really good, standardized, high-quality
data sets. And I've been talking to NSF and NIH to try to convince them to maybe start funding
data set generation projects, not only hypothesis-driven projects, in order to be able
to train the next AI models that will continue fueling this revolution.
Just echoing what Cesar said about the importance of data sets, the work we've done on designing
new proteins rests entirely on the really hard work done by generations of graduate students and postdocs
and scientists solving protein structures and putting them in the protein structure data bank.
So they're really the unsung heroes of the advances in AI for protein design and protein structure
prediction, because they generated the really high-quality data that the protein design methods,
for example, that I described were trained on.
You know, every crucial decision in my lab, it takes into account the recommendation from
the machine, but also the recommendation made by the human scientists that actually generate
the data and then can inform decisions made by the algorithm.
Yeah, that's very true.
In our case, I mean, we have to decide what problem to try to solve, and that's very much
a human decision, and then the AI will generate some number of solutions, and then you have to decide
which ones you're going to test. That's a human decision. You have to make them in the lab. That's a
human action. And then you have to decide what to do with them once they work, and that's a human
decision as well. It's really a tight collaboration between humans and machines at this point.
A tight collaboration for now, anyway. Yes. For now. For now, we're still very helpful.
Yes. Thank you both for taking time to talk to me today. Thank you so much. Thank you.
Dr. Cesar de la Fuente, bioengineer and presidential associate professor at the University of Pennsylvania in Philadelphia,
Nobel laureate Dr. David Baker, director of the Institute for Protein Design,
and professor at the University of Washington in Seattle.
Before we go, L.A. dwellers, we are working on a segment about the toxins left behind by the fires,
and we want to hear from you.
How is this affecting you? What are you concerned about?
How are you approaching cleanup?
Let us know your questions.
Leave us a voicemail at 646-767-6532, or send us an email at scifri@sciencefriday.com.
And that is about all we have time for.
Lots of folks helped make the show happen, including Jordan Smoczyk, Charles Bergquist,
George Harper, John Dankosky.
I'm Flora Lichtman.
Thanks for listening.
