Short Wave - Can AI Crack The Biology Code?
Episode Date: June 3, 2025
As artificial intelligence seeps into some realms of society, it rushes into others. One area where it's making a big difference is protein science, as in the "building blocks of life," proteins! Producer Berly McCoy talks to host Emily Kwong about the newest advance in protein science: AlphaFold 3, an AI program from Google DeepMind. Plus, they talk about the wider field of AI protein science and why researchers hope it will solve a range of problems, from disease to the climate.
Have other aspects of AI you want us to cover? Email us at shortwave@npr.org.
Transcript
You're listening to Short Wave from NPR.
Hey, hey, Short Wavers.
Emily Kwong here with producer Berly McCoy.
What's up, Berly?
Hey, Emily.
Hello, what do you have for us today?
Okay, so Emily, today I want to dig into how AI has shaken up the field of protein science,
as in the fundamental building blocks of life, proteins.
I've heard of them.
Yeah.
I mean, this is like what you studied back in your scientist days.
Yes, yes.
I love proteins.
Oh.
We love that you love them. How has AI moved the needle in this field, though?
Well, scientists have used it to dig into a problem that protein scientists have struggled with for more than 60 years. And that is, what do these building blocks, of which there are millions, look like?
Like their shape?
Like their shape, yeah, exactly.
And why is that so important?
Well, the ability of a protein to do its specific job, so like carry oxygen through your body or turn light into sugar, that relies wholly on its unique, complicated shape. So to understand how it works, you need to know its shape.
But why can't scientists just run an experiment to determine the shape?
They can for some proteins, but those experiments can take years and years.
And Emily, that's because a scientist essentially needs to take the equivalent of a molecular
photo of the protein to map its complicated shape. But getting the protein to cooperate to get
that photo, so like to hold still, for example, without falling apart, that can be super
tricky, and it could take a grad student's entire PhD program to figure out a single protein.
And other proteins were just abandoned because they would never cooperate.
Proteins sound difficult, honestly. So the challenge is how do you figure out a protein's
shape without running these super tedious experiments? Is this where AI comes in?
Yeah, and to give you a sense of kind of how AI has changed the protein game, there's
this protein competition that scientists run every other year.
Get out. A protein competition? Okay.
Yeah, and they've run it for the past 30 years where groups will basically compete on who can accurately guess the most protein shapes.
It's like nerd central for sure.
We love.
And for most of that 30-year history, participants have really only made incremental progress.
But in 2020, Google DeepMind used AlphaFold 2, that's its AI protein prediction model, and Emily,
AlphaFold 2 blew the competition out of the water completely.
Wow. Okay.
Game changer.
And now the Google DeepMind team has taken this AI tool to the next level by expanding it beyond proteins.
So today on the show, how scientists have taken a huge step to understanding the building blocks of life using AI.
Plus, how other researchers are using the tech to design brand new proteins, ones never before seen
in nature. And how AI could help us solve the biggest problems we face today, from disease
to climate. You are listening to Short Wave, the science podcast from NPR. Okay, Berly, so scientists,
it seems, have been trying to figure out the complicated shapes of proteins for decades to better
understand how they work. Why has this been such a complicated thing to figure out? Well, the short
answer, Emily, is that there are so many theoretical ways a single protein could fold that it's a
big problem to solve. So if you unfolded a protein, it would kind of look like a bunch of beads
on a long string. Those beads are little molecules called amino acids. Oh, I remember this from biology.
There are like 20 types of amino acids. Yep. Each one is a little different. Right. So each one has a
slightly different shape. And that kind of dictates how that part of the string can be folded up.
Because proteins often have a hundred or more amino acids, you can see how imagining all the
ways it could fold would get complicated.
Yeah, it just sounds like thousands of different shapes or what, hundreds of thousands of
different shapes.
Okay, try billions of trillions, Emily.
Like, there are theoretically more ways for one single protein to fold than there are stars in
our night sky.
This sounds like a glorious nightmare.
Right?
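The scale of that number can be checked with a back-of-envelope calculation. The sketch below is an illustration, not something from the episode: it assumes each amino acid can take about 3 local conformations (a hypothetical figure) and that every position along the chain is independent.

```python
# Back-of-envelope count of possible folds.
# ASSUMPTION: about 3 local conformations per amino acid; the episode does
# not give this number, and real chains are not fully independent.
CONFORMATIONS_PER_RESIDUE = 3
CHAIN_LENGTH = 100  # "a hundred or more amino acids", per the episode


def count_conformations(per_residue: int, length: int) -> int:
    """Number of chain conformations if every residue varies independently."""
    return per_residue ** length


total = count_conformations(CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH)
BILLIONS_OF_TRILLIONS = 10**9 * 10**12  # 1e21

print(f"{total:.2e} possible conformations")
print(f"{total / BILLIONS_OF_TRILLIONS:.2e} times 'billions of trillions'")
```

Under these assumptions the count for a 100-amino-acid chain is a 48-digit number, far beyond "billions of trillions," which is why checking every possible fold one at a time is not practical.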
I'm so curious.
Okay.
So you said that AI has helped us make some leaps and bounds towards a solution.
How does this technology work?
So this AlphaFold model is a type of AI called a deep learning program, which is this huge network of data processing points called nodes.
And the purpose of this network is to learn and then make predictions based on what it's learned.
In AlphaFold's case and other models like it, it learns about proteins from a huge collection of protein structures that scientists have been building on for decades from their experimental data.
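The "learn, then predict" idea can be sketched with a single node. This is a toy under stated assumptions, not AlphaFold's architecture: the six data points are invented, and one node trained by gradient descent stands in for what the episode describes as a huge network.

```python
import math

# ASSUMPTION: six invented training points; label is 1 when the input is large.
inputs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
labels = [0, 0, 0, 1, 1, 1]


def node(x, weight, bias):
    """One node: weighted input plus bias, squashed to a score between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


def train(xs, ys, steps=2000, learning_rate=1.0):
    """Learn weight and bias by gradient descent on the prediction error."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [node(x, weight, bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias


weight, bias = train(inputs, labels)

# Predict for inputs the node never saw during training.
for x in (0.1, 0.9):
    print(x, round(node(x, weight, bias), 3))
```

A deep learning model chains many such nodes together in layers, but the loop is the same in spirit: compare predictions to known answers, nudge the parameters, repeat.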
Okay.
So the idea is that after these
models use all of that carefully gathered experimental data to learn, they can then predict the shapes
of proteins they do not know yet. Exactly. Okay, and going back to the protein competition in 2020,
how did AlphaFold blow away the competition? So they essentially changed the whole architecture
of their model. They had been using AI before, but remember the beads on a string analogy?
If amino acids are the beads, even if one bead is far from another on the string, when it all folds up,
they could be right next to each other.
So with AlphaFold 2, the model looked at distances between all the different amino acids
and previous knowledge from solved protein structures.
Awesome.
And the accuracy and speed of the predictions went way up.
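The beads-on-a-string point, that beads far apart on the string can end up side by side once folded, can be sketched with a toy distance table. The coordinates and the contact cutoff below are invented for illustration; real models work with 3D structures, and this is not AlphaFold's actual method.

```python
import math

# Toy 2D "hairpin": six beads run left to right, then six run back above them.
# ASSUMPTION: coordinates and the 1.5-unit contact cutoff are made up.
coords = [(float(i), 0.0) for i in range(6)] + [(float(5 - i), 1.0) for i in range(6)]


def distance_matrix(points):
    """Pairwise Euclidean distances between every pair of beads."""
    return [[math.dist(p, q) for q in points] for p in points]


def long_range_contacts(dist, min_separation=4, cutoff=1.5):
    """Pairs far apart along the string but close together in space."""
    n = len(dist)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + min_separation, n)
        if dist[i][j] <= cutoff
    ]


dist = distance_matrix(coords)
contacts = long_range_contacts(dist)
print(contacts)
```

In this toy fold, the first and last beads are eleven positions apart on the string yet sit one unit apart in space, which is the kind of relationship a table of all pairwise distances captures.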
Okay.
And I'm assuming that made a huge difference for scientists everywhere studying proteins.
Totally.
Julien Bergeron, a structural biologist at King's College London, is one of them.
He studies the tail-like appendage that propels
bacteria. It's called a flagellum, and it's pretty complicated.
It's this huge assembly. So it's longer than the bacterial cell itself. It consists of 20 to 25
different proteins, but many of them have hundreds of thousands of copies of that protein.
And these huge propeller machines are what gives some bacteria the ability to make you sick
or build plaque on your teeth. So Julian's lab is trying to figure out how these giant machines work,
what their pieces look like and how it all fits together.
And so when the AlphaFold 2 model came out, he just had to try it.
And I input a sequence, and then a few hours later, I had the model, and I was like,
oh my God, this just did it.
And we'd been struggling with that problem for, you know, months, if not years.
And all of a sudden, I messaged my lab, and I said, we model everything.
And we've had dozens of projects that immediately progressed thanks to this.
Okay, so it sounds like overnight AlphaFold changed the trajectory of his lab.
Yeah.
But how did he know that using AlphaFold 2 would actually work?
Yeah, so the accuracy is super important, right?
Especially when you're basing all of your other experiments on the results.
And it's important to note that like other AI, AlphaFold 2 isn't right 100% of the time.
so you can't just take the results at face value.
But unlike some other AI, included in the results is a score,
basically telling you how accurate each part of the structure is.
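A per-position confidence score like that can be used to flag the stretches of a structure to treat with caution. The sketch below assumes a 0-to-100 score per amino acid with cut points at 90, 70, and 50, following the pLDDT bands shown in the AlphaFold Protein Structure Database; the scores themselves are invented and the band names are informal.

```python
# ASSUMPTION: invented scores on a 0-100 scale; the 90/70/50 cut points follow
# the pLDDT bands used by the AlphaFold Protein Structure Database.
def confidence_band(score: float) -> str:
    """Map one per-residue score to an informal confidence label."""
    if score >= 90:
        return "very high"
    if score >= 70:
        return "confident"
    if score >= 50:
        return "low"
    return "very low"


def flag_uncertain_regions(scores, threshold=70.0):
    """Return inclusive, 0-based (start, end) ranges scoring below threshold."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score < threshold and start is None:
            start = i  # a low-confidence stretch begins here
        elif score >= threshold and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:  # stretch runs to the end of the chain
        regions.append((start, len(scores) - 1))
    return regions


scores = [95.2, 93.1, 88.4, 65.0, 48.3, 42.7, 71.9, 91.0]
print([confidence_band(s) for s in scores])
print(flag_uncertain_regions(scores))
```

A lab could use a check like this to decide which parts of a predicted structure are solid enough to build experiments on and which need independent confirmation.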
Okay.
And are others in the field using AlphaFold, too?
Yeah.
So this is something that actually sets AlphaFold apart from other protein prediction AI models.
It's extremely user-friendly.
So essentially, anyone who works on a protein or even just has a sequence of a protein can plug it in and get results.
I talked to Pushmeet Kohli, Vice President of Research at Google DeepMind, and he told me why it was important for them to make this tool open access.
The mission statement that we have for the science program at Google DeepMind is to leverage AI to accelerate and advance science.
Okay, so I'm scrolling through the AlphaFold website, and I'm seeing scientists using this model for all kinds of things.
They're working on malaria and cancer research, drug discovery, plastic eating enzymes.
And last week, DeepMind released a new version, AlphaFold 3, which can predict the 3D structure of proteins and other kinds of biomolecules that they attach to.
Why are those other biomolecules important?
Yes.
So I know we talked about how proteins are super important.
I love them.
But I have to admit they rarely work alone.
And if we actually want to know how biology works as a whole, we need to understand how proteins work with their
partner molecules. So it really gives you a more detailed and more accurate picture of what is
happening inside the body where proteins are not just sort of existing in isolation.
They are interacting in a very rich biological space or soup of RNA and DNA and small molecules
and it really sheds light into those rich interactions.
Now, previous versions of this protein prediction software would model where each amino
acid was located, but in this new version, AlphaFold 3, it maps things on an even smaller level.
So it models where individual atoms are.
Wow.
So they can predict the structure of multi-protein complexes like the bacterial flagellum or something like proteins in the blood, which attach to iron atoms.
That is powerful. Okay.
What are the limits to AlphaFold predictions?
Yeah, there are definitely limitations.
Pushmeet says that the model works best when a protein
has a single defined structure.
But some proteins have more than one shape or they have sections that are kind of flimsy,
think cooked versus uncooked spaghetti.
Okay.
So it sounds like the model has some trouble with prediction in some cases, and the results show that.
Yeah.
So the idea is that these results would say, hey, I'm not so confident in this area of the protein.
Just so like users know.
Oh.
And another limitation is that the prediction ability depends on the amount of what's
called training data available.
Uh-huh.
So I mentioned that there's a lot of training data for proteins, but...
Some categories have much less training data available.
For example, there's much less structural data available for RNAs.
Okay, so the prediction is only as good as the data.
Exactly, exactly.
But Emily.
But Berly.
There's another way scientists can use AI in the protein world.
Okay, what's that?
To generate brand new proteins.
Ones, like, not found in nature anywhere.
Humans face new problems today,
and, you know, we live longer.
We're polluting and heating up the planet.
And it's reasonable to think that with millions more years of evolution,
some of these problems would be solved,
but we don't want to wait that long.
So the idea is that we can now create completely new proteins
that solve these problems that weren't really relevant during evolution,
to make the world a better place.
So this is David Baker.
He's a biochemist and the director of the Institute for Protein Design at the University of Washington.
And he's been working on proteins for years.
He actually developed one of the earlier protein prediction models.
His lab has a similar AI program to AlphaFold 3.
It's called RoseTTAFold All-Atom.
But his big focus is designing these brand new proteins.
This sounds so futuristic.
Right?
Like what kind of new proteins?
So far, they've done things like design new protein antibodies, which are important for fighting infections, in this case to fight influenza.
They've made something called a switch protein that could be used as an environmental sensor.
And they've also made proteins that could help store carbon, which is a huge hurdle for fighting climate change.
I think really across medicine, sustainability, technology, I think there's huge opportunities to
transform the current ways we do things with protein design.
So these predictive and generative AI models have fundamentally changed the protein science
landscape.
And again, there's definitely room for improving the prediction power.
But with what the field has shifted to, like in terms of prediction accuracy and design
potential, I mean, it's really gotten this retired protein fanatic, like missing my science days.
Berly?
Thank you so much for bringing us this big, big story about the little things in life.
Thanks, Emily.
This episode was produced by Rachel Carlson.
It was edited by our showrunner, Rebecca Ramirez.
Berly checked the facts.
Ko Takasugi-Czernowin was the audio engineer.
Special thanks to Geoff Brumfiel.
Beth Donovan is our senior director, and Collin Campbell is our senior vice president of podcasting strategy.
I'm Emily Kwong. Thank you for listening to Short Wave from NPR.
