Microsoft Research Podcast - 129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling

Episode Date: July 20, 2021

Unlocking the challenge of molecular simulation has the potential to yield significant breakthroughs in how we tackle such societal issues as climate change, drug discovery, and the treatment of disease, and Microsoft is ramping up its efforts in the space. In this episode, Chris Bishop, Lab Director of Microsoft Research Cambridge, welcomes renowned machine learning researcher Max Welling to the Microsoft Research team as head of the new Amsterdam lab. Connecting over their shared physics background and vision for molecular simulation, Bishop and Welling explore several fascinating topics, including a future in which machine learning and quantum computing will be used in tandem to model molecules, the power of machine learning to provide “on demand” data in this space, and goals for the first year and beyond at the Amsterdam lab. https://www.microsoft.com/research

Transcript
Starting point is 00:00:00 It's kind of strange. Molecules are basically everything around us except for light and a few other forces that we can't really see. But everything else is made of molecules and yet we don't really understand them. We can't really predict their properties. So if we start to understand molecules better, then a number of applications become within reach, we can start to design better catalysts, for instance to help the hydrogen economy. If you want to split water into hydrogen and oxygen, it actually costs a lot of energy. If you find a catalyst to reduce the amount of energy that you need to use, you will make that process a lot more efficient and boost the possibility to use water as a
Starting point is 00:00:46 battery in some sense, where you can store energy in hydrogen. In general, you can make better materials. Welcome to the Microsoft Research Podcast. My name is Chris Bishop and I'm the Lab Director of Microsoft Research in Cambridge, UK. One of the most exciting areas of machine learning right now is its application to the field of molecular simulation. Today I'm joined by Max Welling, one of the world's leading researchers in machine learning and someone for whom molecular simulation has become something of a passion. I'm delighted to announce that Max will be joining Microsoft Research in September and
Starting point is 00:01:22 that we'll be opening a new lab in Amsterdam, which Max will be leading. Max, a very warm welcome to the podcast. Hi, Chris. Thank you very much for inviting me. I'm very excited to chat with you today. That's great to have you. We'll talk a lot more about molecular simulation in a moment, but first, Max, can you tell us a bit about your background? Yes, absolutely. So I was educated in physics, like yourself, actually. And as I understand, we have somewhat similar backgrounds. So I did my bachelor's, my master's and my PhD in physics. I did my PhD actually in two-dimensional quantum gravity. And then I decided I wanted to move to something with more societal impact. In the beginning, I was working in computer vision at Caltech with Pietro Perona. And then I migrated slowly to machine learning, working with Geoff Hinton at UCL in London. And then when he moved to the University of Toronto, I moved with him. And after that, I became professor at the University of California in Irvine. And then in 2013, I moved to the University of Amsterdam to become a research chair there.
Starting point is 00:02:30 So I'm very excited, actually, to join Microsoft in September. Yeah, we're really excited to have you join us. I thought it might be fun just to share the story of how you and I came to be partnering in this, because it was clear to me that a lot of your very impressive research is directly relevant to the problem of molecular simulation. And that's an area that we've been very interested in for a while in Microsoft Research in Cambridge.
Starting point is 00:02:55 And so I guess it was a few months now, wasn't it, when I called you up and I seem to recall there was just an amazing meeting of minds. We suddenly realized we were both thinking along exactly the same direction and it was quite an energizing call. No, absolutely. It was strange in a way that I was already working on pivoting my research in that direction and seeing the opportunities there. And then getting that call right at that moment, it was indeed a meeting of
Starting point is 00:03:22 the minds. And it was immediately clear that I wanted to do this. Also, because for me, it was very important that I wanted to spend sort of the second half, let's say, of my career on climate change. I see this as a major challenge for the world. And I also see a huge opportunity here to actually make a dent in that problem through computational chemistry. And of course, you know, in order to achieve anything, you need colleagues who are smart and are willing to engage with you on this particular journey, but also a lot of compute. And so I know Microsoft has an amazing amount of compute infrastructure that we can leverage
Starting point is 00:04:03 to try to solve this problem. Absolutely. I mean, there's so many exciting application areas for this, but you're right, climate change is such a pressing need. I recently read Bill Gates's book, How to Avoid a Climate Disaster. I found it very inspirational. He doesn't spare any punches. I mean, he highlights very clearly the depth of the challenge and the extent of the challenge, but also talks very systematically through all the different opportunities we have to use technology to help us address this. And in many cases, there are clear opportunities for molecular simulation to play a role, whether it's developing catalysts or new electrodes for
Starting point is 00:04:40 driving the hydrogen economy and so on. I read that book too, actually, before you called me, I think. And I also found it very inspirational. I really liked the way he approaches this in a super practical, economical way. And it actually gave me hope that this thing is solvable. So in fact, I do think that we can solve it together. Yeah, and I guess we should say a little bit about what molecular simulation is and why we think it's so exciting right now and what it's got to do with machine learning, because this is possibly a moment when machine learning is about to disrupt the field of molecular simulation in much the same way that it's already transformed fields like computer vision or speech recognition or natural language understanding. And for me, the real excitement here is the fact that there is almost a new frontier opening up of intellectually very deep research that combines machine learning with
Starting point is 00:05:31 quantum physics and chemistry and molecular biology and so on, but also has this tremendous potential to have very important real world impact, not just in climate change, but also in domains like healthcare, in drug discovery, in medicine, in understanding biology and understanding disease and helping us to treat disease. Yeah, there's a lot of things coming together. That's the beauty of this research. And it's kind of strange. Molecules are basically everything around us, except for light and a few other forces that we can't really see. But everything else is made of molecules, and yet we don't really understand them. We can't really predict their properties. So if we start to understand molecules better, then a number of applications become within reach. We can start
Starting point is 00:06:17 to design better catalysts, for instance, to help the hydrogen economy. If you want to split water into hydrogen and oxygen, it actually costs a lot of energy. If you find a catalyst to reduce the amount of energy that you need to use, you will make that process a lot more efficient and boost the possibility to use water as a battery in some sense, where you can store energy in hydrogen. In general, you can make better materials. We can maybe make plastic, which doesn't sort of pollute the environment all that much. And maybe you can tell me something about this, Chris, you know, design new drugs. I understand at Microsoft Research, there is already quite a big effort in that direction. That's right. We've been very
Starting point is 00:07:01 interested in drug design, collaborating with some major pharma companies and looking at how machine learning can disrupt that process of drug discovery because that space of potential molecules is so vast. And surely within that space are some really interesting and powerful drugs that don't have side effects and are not toxic and so on. But discovering them is tremendously challenging and we really see machine learning as having a big impact on that. A lot of the work we've been doing has been using machine learning driven by experimental data but we're very excited to augment that and amplify that by using molecular simulation where we're creating data more from first principle simulation of the quantum physics of molecules, of proteins folding and interacting with other proteins and so on. And so we see this as an area with tremendous potential, not just drug discovery, but just more broadly in the life sciences. It's an area of great interest to us. For example, we have a collaboration with
Starting point is 00:07:56 Adaptive Biotech that recently we pivoted to look at COVID-19. And in fact, just a few months ago, the so-called T-Detect COVID developed by Adaptive Biotech using some of our machine learning technology was granted emergency FDA approval, and it's become the world's first test for past COVID infections based not on antibodies, but on T-cells, which are a key part of the adaptive immune system, and which actually may be much longer lived than antibodies. So again, that's a lovely example of very interesting research, but which also leads to very important real world application. It is such an exciting field, isn't it? It also feels that it's almost like an ideal application of machine learning. I mean, molecular modelling has been around for a good few years now. And it's this,
Starting point is 00:08:44 it offers this amazing computational microscope where you can see deep inside the workings of living organisms, for example. But it's just so mind-bogglingly expensive from a computational point of view. And I think one of the things that we're all very excited about is the idea that machine learning could really speed this up by many orders of magnitude. And it just feels like a great application for machine learning because the training data, in large part, can even be synthetic. We can simulate the systems using more conventional techniques of solving the equations of quantum physics,
Starting point is 00:09:15 generate synthetic data to train machine learning, and then use those trained models to run other simulations, but very much more efficiently. And, of course, in machine learning, we're very dependent on data. We like lots of data, but it can be expensive to obtain. It can be hard to find. It can be time consuming to label. And yet somehow in this field, we can generate more data on demand, as much perfectly labeled data as we wish, just by spending more computation, as it were. So it really feels like just a great application domain.
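The workflow described here (label data with a slower conventional method, train a model on it, then use the fast model in its place) can be sketched in a few lines. This is a minimal toy illustration, not anything used at Microsoft Research: the Lennard-Jones formula stands in for an expensive quantum calculation, and the function names `reference_energy`, `generate_dataset`, `fit_surrogate` and `surrogate_energy` are hypothetical.

```python
import random

def reference_energy(r, epsilon=1.0, sigma=1.0):
    """Assumed stand-in for an expensive first-principles calculation.
    Here it is just the Lennard-Jones pair energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def generate_dataset(n, seed=0):
    """Generate n perfectly labeled (distance, energy) pairs on demand."""
    rng = random.Random(seed)
    distances = [rng.uniform(0.9, 2.5) for _ in range(n)]
    return [(r, reference_energy(r)) for r in distances]

def fit_surrogate(data):
    """Least-squares fit of E(r) ~ a * r**-12 + b * r**-6,
    solved through the 2x2 normal equations."""
    s11 = s12 = s22 = t1 = t2 = 0.0
    for r, energy in data:
        f1, f2 = r ** -12, r ** -6
        s11 += f1 * f1
        s12 += f1 * f2
        s22 += f2 * f2
        t1 += f1 * energy
        t2 += f2 * energy
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b

# Train on synthetic data, then query the cheap model instead of the reference.
a, b = fit_surrogate(generate_dataset(200))

def surrogate_energy(r):
    """Fast learned approximation of reference_energy."""
    return a * r ** -12 + b * r ** -6

print(surrogate_energy(1.3), reference_energy(1.3))
```

The surrogate here has exactly the right functional form, so the fit is essentially exact; a real learned model is only approximate, and more labeled data can be generated wherever it is inaccurate by spending more computation on the reference method.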
Starting point is 00:09:45 But as you say, one which has so much potential for real world impact. So you have this background in theoretical physics in much the same way that I do. And I noticed that shines through in a lot of your research. I mean, you've done some pretty amazing things over the years. But one thing that I think you're quite well known for is work on invariances and equivariances in machine learning. And maybe you could just say a few words about what that means and why that might be relevant to this challenge of molecular simulation. Yeah, certainly. So in physics, we think about symmetries. Basically, almost all of our physical theories are built around symmetries. We like to write down all our equations in such a way that they
Starting point is 00:10:26 look the same, whichever observer is used to describe that system. And that has led to revolutions. It has led to the realization that electricity and magnetism are actually two sides of the same coin that are transformed into each other by changing the observer from one that is standing still to one that is flying by. And Einstein went actually one step further and said, well, actually one observer, instead of experiencing acceleration, might explain that as gravity. And so he equated acceleration and gravity together, which led to the general theory of relativity. And in fact, the whole Standard Model is built out of, you know,
Starting point is 00:11:05 particles which are organized according to these symmetry transformations. That particular principle, you know, in 2016, when I started to do research with Taco Cohen, actually that principle we wanted to also implement in neural network research. And convolutional neural networks already implement to some degree. They basically have this idea that if you move an object from one place to the next place, which is a translation, the output of the neural network should either be invariant, you know, a cat is a cat, you know, if you see it on the left or the right of your image,
Starting point is 00:11:37 or, you know, if you want to segment a cat out, the segmentation mask should move with the cat as you move the input. So that's called equivariance. And we were thinking about how to extend these groups, basically saying, well, if I rotate the object, you know, my prediction should still be invariant under those rotated objects. A cat upside down is still a cat after all, right? And that's particularly important for molecular simulation because a molecule, if you rotate it, you would still think has the same properties as if you see it in some other orientation. And building that particular inductive bias, this particular prior knowledge into your model is what we have recently been doing in what's called graph neural networks, where you can think of the atoms as the nodes in a graph and the atoms that interact with each other as edges. And these atoms are sending messages to
Starting point is 00:12:32 each other, which is similar to doing a convolution. So in that graph neural net, we made them sort of symmetric under these rotations and then applied it to describe molecules. And that's amazingly successful. So the interesting thing is that you can train on data sets to predict the properties of these molecules and the predictions are amazingly accurate. And we are now starting as a community to realize the enormous impact that that can have in the future. Yeah, symmetries are so powerful and fundamental in physics.
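To make the idea of a rotation-invariant graph network concrete, here is a minimal sketch under stated assumptions. It is not the architecture from Max Welling's group: it is a toy message-passing scheme in which atoms are nodes, pairs closer than a cutoff are edges, and messages depend only on interatomic distances. Distances do not change when the molecule is rotated or shifted, so the prediction does not change either. The geometry, the feature values and the names `predict_property`, `rotate_z` and `translate` are all hypothetical.

```python
import math

def rotate_z(points, angle):
    """Rotate 3D points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the same 3D vector."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in points]

def predict_property(features, positions, cutoff=2.0, rounds=2):
    """Toy message passing: each atom sums distance-weighted messages from
    its neighbors, updates its state, and the states are summed at the end."""
    h = list(features)
    n = len(positions)
    for _ in range(rounds):
        updated = []
        for i in range(n):
            message = 0.0
            for j in range(n):
                if i == j:
                    continue
                d = math.dist(positions[i], positions[j])
                if d < cutoff:  # an edge connects atoms that interact
                    message += h[j] * math.exp(-d)
            updated.append(math.tanh(h[i] + message))
        h = updated
    return sum(h)

# Hypothetical water-like geometry: one heavy atom and two light atoms.
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
features = [0.8, 0.1, 0.1]

original = predict_property(features, positions)
moved = predict_property(
    features, translate(rotate_z(positions, 1.234), (5.0, -2.0, 3.0))
)
print(abs(original - moved) < 1e-9)
```

A scalar property such as an energy should be invariant in this way. Vector quantities such as forces should instead rotate along with the molecule, which is the equivariant case described above for segmentation masks.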
Starting point is 00:13:04 It's really interesting to see them play such a fundamental role in machine learning as well. We're seeing right now a lot of interest in machine learning in this space and a lot of activities and really interesting research projects spinning up. But if we look further out, what about the field of quantum computing? What impact do you see quantum computing having in the domain of molecular simulation? I think it's a very interesting question because molecules are inherently quantum systems. The electrons in particular are well described by quantum mechanics.
Starting point is 00:13:35 And a quantum computer is in some sense a natural sort of quantum simulation. So Feynman sort of first noticed that you can think of a quantum computer as doing some kind of quantum experiment, some kind of quantum simulation. And people think that the first actual applications of quantum computing are going to happen in this field of simulating quantum mechanics itself. Now, quantum computing is still at its infant stages. And, you know, we really need fault-tolerant quantum computing for it to be truly useful. And that might be more than 10 years away. But hopefully already somewhat earlier, it's hard to predict precisely how much
Starting point is 00:14:17 earlier, we can use sort of more noisy quantum devices to start modeling molecules using quantum computing. And I think the most exciting part of this is that I believe that quantum computing and machine learning are going to collaborate together in order to make this really useful. So you could imagine a machine learning algorithm doing a bit of computation and then asking a quantum computer to do a particular computation, say, why don't you use this energy for me or this Hamiltonian for me? Do a computation, give the result back to me, and I'll process that information to then try and figure out what the next thing is that you should compute. And I think it's in this interaction that we're going to see a lot of action in the future. But it might be still, you know, more than five years out. So really, we have two pieces of news today, as well as the news that you'll be joining Microsoft Research.
Starting point is 00:15:23 We're also announcing the opening of a new research lab in Amsterdam, which you'll be leading. And we're going to be hiring extensively in molecular simulation, both in Cambridge and in Amsterdam over the coming years. Tell us a little bit about your early thoughts as to what the Amsterdam lab might be focusing on. So first of all, personally, I'm super grateful that Microsoft has decided to open a new lab in Amsterdam. I mean, this is somewhat of a dream come true for me. You know, I love Amsterdam. It's a great place to live and a great place to set up an office, actually, I think, because there's a lot of talent running around here. Yeah, so what are we going to do?
Starting point is 00:16:01 Well, the first piece of business is, of course, to fill the lab with excellent researchers and to build a very diverse team in Amsterdam. That will be a significant effort. But also, I really want to start the actual science, you know, collaborating with the Cambridge teams and other teams around the globe to start thinking about precisely, you know, what we want to achieve over the next couple of years. I hope that we can make a start with building a system that can predict properties of molecules, can generate molecules with certain properties,
Starting point is 00:16:36 and can search through this enormous space of these molecules. There's so many possible molecules, more than possible atoms in the universe. And you have to somehow search through that space if you're looking for molecules with certain properties. And it's only through machine learning that we can now begin to search fast through this space. And I want to make a beginning with that sort of program to build that software stack that can achieve that. That sounds really exciting. Actually, we've already made our first hire into the Amsterdam lab, of course, Rianne van den Berg.
Starting point is 00:17:09 She spent about three months with us in Cambridge as a visiting researcher a few years ago while she was a postdoc with you at the University of Amsterdam. So obviously, you know Rianne very well. Yeah, I know her very well. And I'm extremely impressed with her technical achievements. She was a physicist like ourselves. Then she moved to work with us on interesting graph neural network problems. She's made a major impact in the field and of course given the fact that she is both sort of a physicist and a sort of machine learner, she's the perfect hire for the Amsterdam team to work with us on these types of problems. That sounds great.
Starting point is 00:17:50 As well as joining Microsoft Research, you're also going to have a joint appointment with the University of Amsterdam. Microsoft Research has worked with you for several years in that capacity, funding students and postdocs. And of course, we've hired some great graduates from your lab over the years. And we've collaborated with your lab on various research projects. Can you tell us a bit more about the work of your university group? Yeah, so I'm very grateful,
Starting point is 00:18:14 actually, that Microsoft allows me to have that joint position. I personally believe it is very important, actually, to have this influx from academia and to have one leg in one and one leg in the other. The research lab is called AMLab. It's about 50 researchers right now, with five or so faculty. We've been working a lot on sort of deep learning problems, but also graphical models, which was sort of the hype maybe ten years ago. You know, you've written a book about it. We all teach from your book, actually, at the university.
Starting point is 00:18:52 And connected to that also generative models, normalizing flows, VAEs, and also causality research. And I really hope that Microsoft can fruitfully collaborate with the University of Amsterdam, but also with the other knowledge institutes in the Amsterdam ecosystem. And in particular, I'm thinking about some of these ICAI labs, which is the Innovation Center for AI, where we could possibly start one of these labs
Starting point is 00:19:20 and work on the more academic problems with students in that lab and hopefully inspire them to join Microsoft at some point. It's a very impressive group. I mean, some amazing researchers have really come out of your team in Amsterdam. And so it's very exciting to have that close partnership. So finally, Max, you'll be joining us in September. If we look ahead, say 10 years, what do you hope we will have accomplished? So that's an excellent question. And it's a bit of a glass ball to be looking into. But I certainly know my dreams. So my dreams are that in 10 years, we will have cracked the problem of understanding molecules. We will be able to
Starting point is 00:20:00 design new materials on the fly. We will have designed new catalysts to feed the green economy. We will be able to design new drugs for all sorts of diseases that we cannot treat right now. I hope also that by that time we have grown and inspired a whole lot of people around the world in joining us in this very important effort. Well, I've no doubt at all, Max, that that combination of hiring you, as well as opening a new research lab in continental Europe, and also defining this major new mission that I think really combines intellectually very deep science together with the opportunity for major social impact.
Starting point is 00:20:42 I think that whole package will be very inspiring for many scientists and engineers who'll want to come and join us in this exciting endeavor. Thank you for your time, Max. And thanks also to our listeners. For those of you interested in joining Microsoft Research to work with us on molecular simulation, we've just made several job postings
Starting point is 00:21:02 for both the Cambridge and the Amsterdam labs. And you can learn more about Microsoft Research and the great work coming out of its labs at microsoft.com forward slash research. And be sure to subscribe for new episodes of the Microsoft Research podcast. Until next time.
