TED Talks Daily - How AI is saving billions of years of human research time | Max Jaderberg

Episode Date: December 2, 2024

Can AI compress the yearslong research time of a PhD into seconds? Research scientist Max Jaderberg explores how “AI analogs” simulate real-world lab work with staggering speed and scale, unlocking new insights on protein folding and drug discovery. Drawing on his experience working on Isomorphic Labs' and Google DeepMind's AlphaFold 3 — an AI model for predicting the structure of molecules — Jaderberg explains how this new technology frees up researchers' time and resources to better understand the real, messy world and tackle the next frontiers of science, medicine and more.

Transcript
Starting point is 00:00:00 You're listening to TED Talks Daily, where we bring you new ideas to spark your curiosity every day. I'm your host, Elise Hu. Today, a breakthrough in science thanks to the neural networks of AI that is saving years of human research time. In his 2024 talk, AI researcher Max Jaderberg makes the case for something called AI analogs, and he explains what these advances can do to allow for more experimentation, understanding, and new knowledge.
Starting point is 00:00:39 Coming up after the break. Support for the show comes from Airbnb. I've got a trip to Asia planned for this December. I booked an Airbnb. They are always the most cozy and inviting after such a long journey. My own home will be empty while I'm gone. So I was looking into hosting on Airbnb myself. I'm having fun thinking of some small touches
Starting point is 00:01:06 I might add for potential guests, like the ones I've received at Airbnbs in the past. And with the extra income from hosting, I can make my next trip abroad even longer. Your home might be worth more than you think. Find out how much at airbnb.ca slash host. And now our TED Talk of the Day. So a while ago now, I did a PhD. And I actually thought it'd be quite easy to do research. Turns out it was really hard.
Starting point is 00:01:36 My PhD was spent coding up neural network layers and writing CUDA kernels, very much computer-based science. And at that time, I had a friend who worked in a lab doing real messy science. He was trying to work out the structure of proteins experimentally. And this is a really difficult thing to do. It can take a whole PhD's worth of work just to work out the structure of a single new protein system.
Starting point is 00:02:09 And then ten years later, the field that I was in, machine learning, revolutionized his world of protein structure. A neural network called AlphaFold was created by DeepMind that can very accurately predict the structure of proteins, solving the 50-year challenge of protein folding. And just two weeks ago, this won the Nobel Prize in Chemistry. And it's estimated that since the release of this model, we've saved over a billion years of research time. A billion years.
Starting point is 00:02:52 A whole PhD's worth of work is now approximated by a couple of seconds of neural network time. And to my friend, this might sound a bit depressing, and I'm sorry about that, but to me this is just really an incredible thing. The sheer scale of new knowledge about our protein universe that we now have access to, due to an AI model that's able to replace the need for real-world experimental lab work, and that frees up our precious human time
Starting point is 00:03:20 to begin probing the next frontiers of science. Now, some people say that this is a one-time-only event and that we can't expect to see these sort of breakthroughs in science with AI to be repeated. And I disagree. We will continue to see breakthroughs in understanding our real messy world with AI. Why?
Starting point is 00:03:45 Because we now have the neural network architectures that can eat up any data modality that you throw at them. And we have tried and tested recipes of incorporating any possible signal in the world into these learning algorithms. And then we have the engineering and infrastructure to scale these models to whatever size is needed to take advantage of the massive amount of compute power that we can create.
Starting point is 00:04:12 And finally, we're always creating new ways to record and measure every detail of our real messy world that then creates even bigger data sets that help us train even richer models. And so this is a new paradigm in front of us, that of creating AI analogs of our real messy world. This new AI paradigm takes our real messy natural world and learns to recreate the elements of it with neural networks. And why these AI analogs are so powerful is that it's not just about understanding, approximating, or simulating the world for the sake of understanding, but this actually
Starting point is 00:04:55 gives us a little virtual world that we can experiment in at scale to ultimately create new knowledge. And you can imagine that this experimentation against our AI analogs can also happen in silico, in a computer, with other agents in a loop of in silico open-ended discovery, ultimately to create new knowledge that we can take back out and change the world around us. And this isn't science fiction.
Starting point is 00:05:33 Right now, we have thousands of graphics cards burning, training foundational models of our own microbiological world, and then agents that are probing these AI analogs to design new molecules that could be potential new drugs. And I want to show you exactly how this process works for us, because I believe it can serve as a blueprint to bring about a whole new wave of the future of AI-driven scientific and technological progress. Now, drug design is such an important area to focus on,
Starting point is 00:06:06 because it's actually becoming harder and harder to design new drugs. And why is this, given that we've had so much technological progress over this time? Well, actually, in the past, we didn't have very good ways to design new drugs. It was very empirical. We were basically playing chemical roulette. It was literally called shotgun research. And so whatever drugs we did find and develop ended up hitting the easier, low-hanging fruit, which leaves us now with much harder diseases to treat.
Starting point is 00:06:36 Now, during this same time period, we've had a huge amount of advancement in the capabilities of AI, driven by a whole host of algorithmic breakthroughs. But one of the secret sources of this advancement in AI has also been that of Moore's law, that the amount of computing power has just been exponentially increasing over time. And these days, it perhaps isn't Moore's law that we should care about, but Jensen's law,
Starting point is 00:07:01 Jensen Huang being the CEO of NVIDIA, for the exponential increase in GPU flops that are now powering our neural networks. So really the question is how do we bring this world of AI and machine learning to that of drug design? Can we think about using our AI analogs to reverse this curse of Eroom's law and jump on this exponential wave of GPU flops
Starting point is 00:07:26 powering our neural networks? Actually bringing these worlds together and driving this change is the day-to-day responsibility that I feel. So how can we go about modeling biology? Well, if we were in the world of physics, for example, modeling the universe, then we can actually write down a lot of the theory by hand with maths
Starting point is 00:07:49 and very accurately predict, for example, the unfolding of the universe even millions of light years away. But we can't do that for the incredibly complex dynamics within our cells. We can't just write down some equations for our cells. We can perhaps write down the theory of how atoms interact, that's physics. But then simulating these interactions on the scale of trillions of atoms within our cells is just completely unfeasible. And then we haven't worked out how to describe these complex dynamics
Starting point is 00:08:23 in coarser and simpler terms that we could write down with maths. It's just crazy to think that we can model the universe so far away, but not the cells at our fingertips. But AI and machine learning can be the perfect abstraction for a biological world. Using the snippets of data that we can record from our cells, we can then learn the equations and theories and abstractions implicitly within the activations of our neural networks.
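To put that "completely unfeasible" point in rough numbers, here is a back-of-the-envelope sketch in Python. Every figure in it is an order-of-magnitude assumption for illustration (atom count, timestep, target timescale, cost per atom per step), not a measurement from the talk:
```python
# Order-of-magnitude estimate of brute-force, atom-level simulation of one cell.
# All numbers below are rough assumptions for illustration only.
atoms_per_cell = 1e13        # "trillions of atoms", per the talk
timestep_s = 1e-15           # ~1 femtosecond, a typical molecular-dynamics step
target_time_s = 1e-3         # one millisecond of cellular behaviour
ops_per_atom_per_step = 1e2  # optimistic per-atom cost with neighbour lists

steps = target_time_s / timestep_s            # 1e12 integration steps
total_ops = steps * atoms_per_cell * ops_per_atom_per_step
print(f"{total_ops:.0e} operations")          # ~1e27
print(f"{total_ops / 1e18 / 3.15e7:.0f} years on an exaflop machine")  # decades
```
Even with generous assumptions, one millisecond of one cell at atomic resolution lands decades of compute beyond today's largest machines, which is why a learned abstraction is attractive.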
Starting point is 00:08:56 In fact, our company is called Isomorphic Labs. Isomorphic because we believe there is an isomorphism, a fundamental symmetry that we can create between the biological world and the world of information science, machine learning, and AI. And now, back to the episode. So to see how we are using these AI analogs today, I want to dive into the body and have a look into cells and think about proteins. Now, proteins are one of the fundamental building blocks of life,
Starting point is 00:09:37 and these proteins carry different functions in the body. And if we can modulate the function of a protein, then we are well on our way to creating a new drug. Proteins are made up of a sequence of amino acids, and there are about 20 different amino acids. An amino acid is a collection of atoms, a molecule, and these molecules are joined together into a linear sequence. And the function of a protein is not just due to this sequence of amino acids, but also due to the three-dimensional shape that these proteins fold up into. And there are thousands of proteins inside of us, each with their own unique sequences and their own unique 3D shape.
Starting point is 00:10:22 And remember, trying to work out experimentally that 3D shape can take months or even years of lab work. But with the breakthrough of AlphaFold and AlphaFold 2 in 2020, we now have a model that can take the sequence of amino acids as input and then very accurately predict the 3D structure of a protein as the output. And this allows us to actually fill in the gaps of our known protein universe. It's our AI analog of proteins.
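As a concrete picture of that input/output contract, here is a minimal sketch: a protein is a string over the 20 standard amino-acid letters, and a structure predictor maps that string to one 3D coordinate per residue. The `predict_structure` function below is a hypothetical placeholder so the example runs; it is not the real AlphaFold API, which is a learned neural network rather than a formula.
```python
# Sketch of the sequence-in, structure-out interface described above.
# `predict_structure` is a stand-in placeholder, NOT the actual AlphaFold model.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes

def predict_structure(sequence: str) -> list[tuple[float, float, float]]:
    """Return one (x, y, z) coordinate per residue. A real predictor learns the
    fold from data; this placeholder just spaces residues along a line."""
    assert set(sequence) <= AMINO_ACIDS, "unknown amino-acid code"
    return [(3.8 * i, 0.0, 0.0) for i in range(len(sequence))]  # ~3.8 A spacing

coords = predict_structure("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ")  # seconds, not a PhD
print(len(coords), "residues placed in 3D")
```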
Starting point is 00:10:59 So proteins carry their function. But these proteins don't actually act in isolation. They're part of bigger molecular machines: these proteins interacting with other proteins, as well as other biomolecules like DNA, RNA and small molecules. Now in drug design, what we want to do is either make molecular machines work better or actually stop them from working. And in this case, for cancer, we actually want to stop this particular DNA repair protein from working,
Starting point is 00:11:32 because in cancerous cells, there is no backup DNA repair mechanism. And so if we stop this from working, then cancerous cells will die, leaving just healthy cells remaining. So what would a drug actually look like for this protein? Well, a drug is something that comes in and modulates a molecular machine. And this could be a drug molecule that goes into the body, goes into the cell, and then sticks to this protein. And this drug molecule actually glues the DNA repair protein's clamp shut, so it can't do effective DNA repair,
Starting point is 00:12:08 causing cancerous cells to die and leaving just healthy cells remaining. Now, to design such an amazing drug molecule completely rationally, we'd have to understand how all of these biomolecular elements come together. We would need an AI analog of all and any biomolecular systems. Earlier this year, we had a breakthrough. We developed a new version of AlphaFold called AlphaFold 3 that can model the structure of almost all biomolecules coming together with unprecedented accuracy.
Starting point is 00:12:48 This model takes as input the protein sequence, the DNA sequence, and the molecule atoms. And these inputs are fed to a neural network that has a large processing trunk based on transformers. Now, unlike a large language model that operates on one-dimensional sequences, our model uses what's called a Pairformer and operates on a 2D interaction grid of the input sequences. And this allows our model to explicitly reason about every pairwise interaction that could occur in this biomolecular system.
Starting point is 00:13:23 And so we can use the features of this processing trunk to condition a diffusion model. You might know diffusion models as these amazing image-generation models. Just as those diffuse the pixels of an image, our diffusion model diffuses the 3D atom coordinates of our biomolecular system. So now this gives us a completely malleable, virtual biomolecular world. It's our AI analog that we can probe as if it's the real world.
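Here is a heavily simplified schematic of that data flow in PyTorch: a pairwise interaction grid over all N input tokens (residues, bases, ligand atoms), a small trunk block standing in for the Pairformer, and a diffusion-style head that predicts the noise on 3D atom coordinates conditioned on the trunk's features. The layer sizes, block count, and pooling are illustrative assumptions, not the published AlphaFold 3 architecture.
```python
# Simplified sketch of the pair-grid trunk + coordinate-diffusion idea (assumed
# shapes and layers for illustration; not the real AlphaFold 3 implementation).
import torch
import torch.nn as nn

class ToyPairBlock(nn.Module):
    """Updates the (N, N, C) pair grid so every token pair can exchange
    information -- a rough stand-in for one Pairformer block."""
    def __init__(self, c: int):
        super().__init__()
        self.norm = nn.LayerNorm(c)
        self.row_attn = nn.MultiheadAttention(c, num_heads=4, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(c, 4 * c), nn.ReLU(), nn.Linear(4 * c, c))

    def forward(self, pair: torch.Tensor) -> torch.Tensor:  # pair: (N, N, C)
        h = self.norm(pair)
        attn, _ = self.row_attn(h, h, h)   # attend along each row of the grid
        pair = pair + attn
        return pair + self.mlp(self.norm(pair))

class ToyDiffusionHead(nn.Module):
    """One denoising step: predict the noise on 3D coordinates, conditioned on
    per-token features pooled from the pair grid."""
    def __init__(self, c: int):
        super().__init__()
        self.proj = nn.Linear(c + 3, 3)

    def forward(self, noisy_xyz: torch.Tensor, pair: torch.Tensor) -> torch.Tensor:
        token_feats = pair.mean(dim=1)                      # (N, C)
        return self.proj(torch.cat([token_feats, noisy_xyz], dim=-1))

N, C = 64, 32                                # 64 tokens, 32 pair channels (toy sizes)
pair = torch.randn(N, N, C)                  # initial pair features from the inputs
for block in [ToyPairBlock(C), ToyPairBlock(C)]:
    pair = block(pair)
noise_pred = ToyDiffusionHead(C)(torch.randn(N, 3), pair)
print(noise_pred.shape)                      # torch.Size([64, 3])
```
Training pushes the predicted noise toward the noise that was actually added, so that at inference the model can start from random coordinates and iteratively denoise them into a predicted 3D structure.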
Starting point is 00:13:59 We can make changes to the inputs, changes to the molecule designs, and see how that changes the output structure. So let's use this model to design a new drug for our DNA repair protein. We can take a small molecule that's been recorded to stick to this protein and make changes to its design. We want to change the molecule design so that this molecule makes more interactions with the protein and that will make it stick to this protein stronger. And so you can imagine that this gives a human drug designer a perfect game to play.
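A minimal sketch of that design game as code follows, using hypothetical helper functions (propose_edits, predict_complex, count_contacts) as stand-ins for the real chemistry tools and model calls a drug-design pipeline would use; the stubs exist only so the loop runs.
```python
# Greedy design loop: propose small edits, score each against the AI analog,
# keep whichever design makes the most interactions. All helpers are stubs.
import random

def propose_edits(molecule: str) -> list[str]:
    """Stub: a real tool would enumerate small chemical modifications."""
    return [molecule + atom for atom in ("C", "N", "O")]

def predict_complex(protein: str, molecule: str) -> dict:
    """Stub for the AI analog: a real call returns a predicted 3D structure."""
    return {"protein": protein, "molecule": molecule}

def count_contacts(structure: dict) -> int:
    """Stub score: a real scorer counts protein-ligand interactions in 3D."""
    return len(structure["molecule"]) + random.randint(0, 2)

def design_loop(seed: str, protein: str, n_rounds: int = 5) -> str:
    best, best_score = seed, count_contacts(predict_complex(protein, seed))
    for _ in range(n_rounds):
        for candidate in propose_edits(best):
            score = count_contacts(predict_complex(protein, candidate))  # seconds, not months
            if score > best_score:
                best, best_score = candidate, score
    return best

print(design_loop("CCO", "MKTAYIAK"))
```
Each pass through the inner loop is the step that used to cost a lab cycle; against the AI analog it is a single model call.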
Starting point is 00:14:30 How do I change the design of this molecule to create more interactions? Now, normally, a drug designer would have to wait months to get results back from a real lab at each step of this design game. But for us, using this AI analog, this takes just seconds. And this is the reality of what our drug designers back in London are doing right now. So we have this beautiful game that's being played by our drug designers who are using this AI analog of biomolecular systems to rationally design potential new drug molecules. But you can imagine that we don't have to just limit this game
Starting point is 00:15:13 to human drug designers. Earlier in my career, I worked on training agents to beat the top human professionals at the game of StarCraft, and we created game-playing agents for the games of Go and Capture the Flag. So why can't we create agents that instead play the game that our human drug designers are playing? So now our AI analog becomes the game environment.
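One way to picture that switch is to wrap the analog as a reinforcement-learning-style environment, so an agent proposes molecule edits and receives the analog's predicted interaction score as reward. The class and the toy reward below are illustrative assumptions, not a description of the actual training setup, and many copies of such an environment could run in parallel.
```python
# Sketch of the drug-design "game" as an agent environment. The reward here is a
# toy stand-in; a real setup would score the AI analog's predicted complex.
class DrugDesignEnv:
    def __init__(self, protein: str, seed_molecule: str):
        self.protein = protein
        self.start = seed_molecule
        self.molecule = seed_molecule

    def reset(self) -> str:
        self.molecule = self.start
        return self.molecule

    def step(self, edit: str) -> tuple[str, float]:
        """Apply the agent's proposed edit and return (new state, reward)."""
        self.molecule = self.molecule + edit   # toy action: append an atom
        reward = float(len(self.molecule))     # placeholder for an interaction score
        return self.molecule, reward

env = DrugDesignEnv("MKTAYIAK", "CCO")
state = env.reset()
for edit in ("C", "N", "O"):                   # a trivial scripted "agent"
    state, reward = env.step(edit)
    print(state, reward)
```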
Starting point is 00:15:39 And we can train agents against that. And we already have some incredibly powerful agents that are already doing this today. Now in this setup, all of the drug design is happening on a computer. So what happens if we have access to many, many computers? Well, instead of having one human drug designer working on some new molecule designs,
Starting point is 00:16:04 instead we can have thousands of agents doing molecule design in parallel. Just imagine what impact that could have on patients suffering from a rare type of cancer, the speed that we could get to a potential new molecule to address this medical need, or the ability to go after many diseases in parallel. Cancer is often caused by mutations of proteins, and even within the same type of cancer,
Starting point is 00:16:38 each patient can have different mutations. And that means that one drug molecule won't work for all patients. But what if we could go in and measure each individual patient's protein mutations, and then have a whole team of molecule design agents working on that individual's protein mutations? Then we could create a molecule tailored for each individual patient.
Starting point is 00:17:08 Now, this is still far away from patients, and there's a huge amount of complexity in drug design left to tackle. But this really does give us a glimpse at the future that is to come. So we've seen how this new AI paradigm is driving our progression in drug design, and you can also see this paradigm being played out in materials science, in creating new forms of energy, and in chemistry. The ability to take our real messy world and then create our own AI analogs to then, on a computer, do open-ended scientific discovery to create new knowledge that we can take back out
and change the world around us. This is an incredibly powerful paradigm and one that will bring about a whole new wave of scientific and technological advancements. And we're going to need as many people as possible, especially those working in machine learning, AI and technology, to help drive this new wave of progression. Thank you.
That was Max Jaderberg speaking at TED AI San Francisco in 2024. If you're curious about TED's curation, find out more at ted.com slash curation guidelines. And that's it for today.
Starting point is 00:19:05 TED Talks Daily is part of the TED Audio Collective. This episode was produced and edited by our team, Martha Estefanos, Oliver Friedman, Brian Green, Autumn Thompson, and Alejandra Salazar. It was mixed by Christopher Faisy-Bogan. Additional support from Emma Taubner and Daniela Ballarezo. I'm Elise Hu. I'll be back tomorrow with a fresh idea for your feed.
Starting point is 00:19:26 Thanks for listening. PRX.
