Microsoft Research Podcast - 129 - Machine learning, molecular simulation, and the opportunity for societal good with Chris Bishop and Max Welling
Episode Date: July 20, 2021
Unlocking the challenge of molecular simulation has the potential to yield significant breakthroughs in how we tackle such societal issues as climate change, drug discovery, and the treatment of disease, and Microsoft is ramping up its efforts in the space. In this episode, Chris Bishop, Lab Director of Microsoft Research Cambridge, welcomes renowned machine learning researcher Max Welling to the Microsoft Research team as head of the new Amsterdam lab. Connecting over their shared physics background and vision for molecular simulation, Bishop and Welling explore several fascinating topics, including a future in which machine learning and quantum computing will be used in tandem to model molecules, the power of machine learning to provide “on demand” data in this space, and goals for the first year and beyond at the Amsterdam lab. https://www.microsoft.com/research
Transcript
It's kind of strange. Molecules are basically everything around us except for light and a few
other forces that we can't really see. But everything else is made of molecules and yet
we don't really understand them. We can't really predict their properties. So if we start to
understand molecules better, then a number of applications become within reach. We can start to design better catalysts,
for instance to help the hydrogen economy.
If you want to split water into hydrogen and oxygen, it actually costs a lot of energy.
If you find a catalyst to reduce the amount of energy that you need to use, you will make
that process a lot more efficient and boost the possibility to use water as a
battery in some sense, where you can store energy in hydrogen. In general, you can make better
materials.
Welcome to the Microsoft Research Podcast. My name is Chris Bishop and I'm the
Lab Director of Microsoft Research in Cambridge, UK. One of the most exciting areas of machine
learning right now is its application to the
field of molecular simulation.
Today I'm joined by Max Welling, one of the world's leading researchers in machine learning
and someone for whom molecular simulation has become something of a passion.
I'm delighted to announce that Max will be joining Microsoft Research in September and
that we'll be opening a new lab in Amsterdam, which Max will be leading. Max, a very warm welcome to the podcast.
Hi, Chris. Thank you very much for inviting me. I'm very excited to chat with you today.
That's great to have you. We'll talk a lot more about molecular simulation in a moment,
but first, Max, can you tell us a bit about your background?
Yes, absolutely. So I was educated in physics, like yourself, actually. And as I understand, we have somewhat similar backgrounds. So I did my bachelor, my master and my PhD in physics. I did my PhD actually in two-dimensional quantum gravity. And then I decided I wanted to move to something with more societal impact. In the beginning, I was working in computer vision at Caltech with
Pietro Perona. And then I migrated slowly to machine learning, working with Geoff Hinton at
UCL in London. And then when he moved to the University of Toronto, I moved with him.
And after that, I became professor at the University of California in Irvine. And then in 2013, I moved to the University of Amsterdam to become a research chair there.
So I'm very excited, actually, to join Microsoft in September.
Yeah, we're really excited to have you join us.
I thought it might be fun just to share the story of how you and I came to be partnering in this,
because it was clear to me
that a lot of your very impressive research
is directly relevant to the problem of molecular simulation.
And that's an area that we've been very interested in
for a while in Microsoft Research in Cambridge.
And so I guess it was a few months now, wasn't it,
when I called you up and I seem to recall
there was just an amazing meeting of minds.
We suddenly realized we were both thinking
along exactly the same direction
and it was quite an energizing call.
No, absolutely. It was strange in a way
that I was already working on pivoting my research in that direction and seeing the
opportunities there. And then getting that call right at that moment, it was indeed a meeting of
the minds. And it was immediately clear that I wanted to do this.
Also, because for me, it was very important that I wanted to spend sort of the second half,
let's say, of my career on climate change. I see this as a major challenge for the world.
And I also see a huge opportunity here to actually make a dent in that problem through
computational chemistry.
And of course, you know, in order to achieve anything, you need colleagues who are smart
and are willing to engage with you on this particular journey, but also a lot of compute.
And so I know Microsoft has an amazing amount of compute infrastructure that we can leverage
to try to solve this problem.
Absolutely. I mean, there's so many exciting application areas for this, but you're right,
climate change is such a pressing need. I recently read Bill Gates's book,
How to Avoid a Climate Disaster. I found it very inspirational. He doesn't spare any punches. I mean,
he highlights very clearly the depth of the challenge and the extent of the challenge,
but also talks very systematically through all the different opportunities we have to
use technology to help us address this. And in many cases, there are clear opportunities for
molecular simulation to play a role, whether it's developing catalysts or new electrodes for
driving the hydrogen economy and so on.
I read that book too, actually, before you called me, I think.
And I also found it very inspirational.
I really liked the way he approaches this in a super practical, economical way.
And it actually gave me hope that this thing is solvable.
So in fact, I do think that we can solve it together.
Yeah, and I guess we should say a little bit about what molecular simulation is
and why we think it's so exciting right now and what it's got to do with machine learning, because this is possibly a moment when machine learning is about to disrupt the field of molecular simulation in much the same way that it's already transformed fields like computer vision or speech recognition or natural language understanding. And for me, the real excitement here is the fact that there is almost
a new frontier opening up of intellectually very deep research that combines machine learning with
quantum physics and chemistry and molecular biology and so on, but also has this tremendous
potential to have very important real world impact, not just in climate change, but also in domains
like healthcare, in drug discovery, in medicine,
in understanding biology and understanding disease and helping us to treat disease.
Yeah, there's a lot of things coming together. That's the beauty of this research. And it's
kind of strange. Molecules are basically everything around us, except for light and a few other forces
that we can't really see. But everything else is made of molecules, and yet we don't really understand them. We can't really predict their properties. So if we start to
understand molecules better, then a number of applications become within reach. We can start
to design better catalysts, for instance, to help the hydrogen economy. If you want to split water into hydrogen and oxygen,
it actually costs a lot of energy. If you find a catalyst to reduce the amount of energy that
you need to use, you will make that process a lot more efficient and boost the possibility to use
water as a battery in some sense, where you can store energy in hydrogen.
In general, you can make better materials. We can maybe make
plastic, which doesn't sort of pollute the environment all that much. And maybe you can
tell me something about this, Chris, you know, design new drugs. I understand at Microsoft
Research, there is already quite a big effort in that direction.
That's right. We've been very
interested in drug design, collaborating with some major pharma companies and looking at how machine learning can disrupt that process of drug discovery because that space of potential molecules is so vast.
And surely within that space are some really interesting and powerful drugs that don't have side effects and are not toxic and so on.
But discovering them is tremendously challenging and we really see machine learning as having a big impact on that. A lot of the work we've been doing has been using machine learning driven by experimental data but
we're very excited to augment that and amplify that by using molecular simulation where we're
creating data more from first-principles simulation of the quantum physics of molecules, of proteins
folding and interacting with other proteins and so on. And so we see this
as an area with tremendous potential, not just drug discovery, but just more broadly in the
life sciences. It's an area of great interest to us. For example, we have a collaboration with
Adaptive Biotech that recently we pivoted to look at COVID-19. And in fact, just a few months ago, the so-called T-Detect COVID developed by Adaptive
Biotech using some of our machine learning technology was granted emergency FDA approval,
and it's become the world's first test for past COVID infections based not on antibodies,
but on T-cells, which are a key part of the adaptive immune system, and which actually
may be much longer lived than antibodies. So again, that's a lovely example of very interesting research,
but which also leads to very important real world application. It is such an exciting field,
isn't it? It also feels that it's almost like an ideal application of machine learning. I mean,
molecular modelling has been around for a good few years now. And
it offers this amazing computational microscope where you can see deep inside the workings
of living organisms, for example.
But it's just so mind-bogglingly expensive from a computational point of view.
And I think one of the things that we're all very excited about is the idea that machine
learning could really speed this up by many orders of magnitude.
And it just feels like a great application for machine learning because the training data, in large part, can even be synthetic.
We can simulate the systems using more conventional techniques
of solving the equations of quantum physics,
generate synthetic data to train machine learning,
and then use those trained models to run other simulations,
but very much more efficiently.
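The workflow just described (simulate with a conventional method, train on the synthetic labels, then reuse the trained model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `reference_energy` is a toy Morse potential with arbitrary parameters standing in for an expensive quantum calculation, and the "learned" model is a simple Gaussian-kernel (Nadaraya-Watson) regressor rather than any method named in the conversation.

```python
import math

def reference_energy(r):
    """Stand-in for an expensive first-principles calculation (assumption):
    a Morse potential with arbitrary illustrative parameters."""
    depth, width, r_eq = 1.0, 1.5, 1.2
    return depth * (1.0 - math.exp(-width * (r - r_eq))) ** 2 - depth

def make_training_set(n_points, r_min=0.8, r_max=3.0):
    """Generate perfectly labeled data on demand by calling the reference."""
    step = (r_max - r_min) / (n_points - 1)
    distances = [r_min + i * step for i in range(n_points)]
    return distances, [reference_energy(r) for r in distances]

def fit_surrogate(distances, energies, bandwidth=0.03):
    """Return a predictor: a Gaussian-kernel weighted average of the labels."""
    def predict(r):
        weights = [math.exp(-0.5 * ((r - d) / bandwidth) ** 2) for d in distances]
        return sum(w * e for w, e in zip(weights, energies)) / sum(weights)
    return predict

# Spend computation to obtain labels, then train once.
train_r, train_e = make_training_set(200)
surrogate = fit_surrogate(train_r, train_e)

# Compare the surrogate with the reference at distances away from the edges.
test_r = [1.0 + 0.015 * i for i in range(101)]  # covers 1.0 to 2.5
max_error = max(abs(surrogate(r) - reference_energy(r)) for r in test_r)
print(f"max abs error on test distances: {max_error:.4f}")
```

In this toy the reference is itself cheap, so nothing is gained in speed; the point is only the shape of the loop, in which more training data costs nothing but more calls to the simulator.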
And, of course, in machine learning,
we're very dependent on data. We like lots of data, but it can be expensive to obtain. It can
be hard to find. It can be time consuming to label. And yet somehow in this field, we can generate
more data on demand, as much perfectly labeled data as we wish, just by spending more computation,
as it were. So it really feels like just a great application domain.
But as you say, one which has so much potential for real world impact. So you have this background
in theoretical physics in much the same way that I do. And I noticed that shines through in a lot
of your research. I mean, you've done some pretty amazing things over the years. But one thing that
I think you're quite well known for is work on invariances and equivariances in machine learning.
And maybe you could just say a few words about what that means and why that might be relevant to this challenge of molecular simulation.
Yeah, certainly.
So in physics, we think about symmetries.
Basically, almost all of our physical theories are built around symmetries. We like to write down all our equations in such a way that they
look the same, whichever observer is used to describe that system. And that has led to
revolutions. That has led to the realization that electricity and magnetism are actually
two sides of the same coin that are transformed into each other by changing the observer from
one that is standing still to
one that is flying by. And Einstein went actually one step further and said, well,
actually one observer, instead of experiencing acceleration, might explain that as gravity.
And so he equated acceleration and gravity together, which led to the general theory
of relativity. And in fact, the whole standard model is built out of, you know,
particles which are organized according to these symmetry transformations.
That particular principle, you know, in 2016,
when I started to do research with Taco Cohen,
actually that principle we wanted to also implement in neural network research.
And convolutional neural networks already implement it to some degree.
They basically have this idea that if you move an object from one place to the next place,
which is a translation, the output of the neural network should either be invariant,
you know, a cat is a cat, you know, if you see it on the left or the right of your image,
or, you know, if you want to segment a cat out,
the segmentation mask should move with the cat as you move the input.
So that's called equivariance. And we were thinking about how to extend these groups, basically saying, well,
if I rotate the object, you know, my prediction should still be invariant under those rotated
objects. A cat upside down is still a cat after all, right? And that's particularly important for
molecular simulation because a molecule, if you rotate it, you would still think has the same properties than if you see it in some other orientation.
And building that particular inductive bias, this particular prior knowledge into your model is what we have recently been doing in what's called graph neural networks, where you can think of the atoms as the nodes in a graph
and the atoms that interact with each other as edges. And these atoms are sending messages to
each other, which is similar to doing a convolution. So in that graph neural net,
we made them sort of symmetric under these rotations and then applied it to describe
molecules. And that's amazingly successful. So the interesting thing is that you can train on data sets
to predict the properties of these molecules
and the predictions are amazingly accurate.
And we are now starting as a community
to realize the enormous impact that that can have in the future.
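The idea of atoms as nodes that exchange messages, with the prediction unchanged when the molecule is rotated, can be sketched in a few lines. This is a toy under stated assumptions, not the architecture discussed here: the geometry, the initial features, the `exp(-d)` message function and the cutoff are all hypothetical, and invariance is obtained in the simplest possible way, by letting messages depend only on interatomic distances.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def message_passing_round(features, positions, cutoff=2.0):
    """Each atom (node) sums messages from atoms within the cutoff (edges).

    A message depends only on the sender's feature and the interatomic
    distance, so rotating or translating the molecule cannot change it.
    """
    updated = []
    for i, pos_i in enumerate(positions):
        incoming = 0.0
        for j, pos_j in enumerate(positions):
            if i != j:
                d = distance(pos_i, pos_j)
                if d < cutoff:
                    incoming += features[j] * math.exp(-d)
        updated.append(math.tanh(features[i] + incoming))
    return updated

def predict_property(features, positions, rounds=2):
    """Run a few rounds of message passing, then sum over atoms."""
    for _ in range(rounds):
        features = message_passing_round(features, positions)
    return sum(features)

def rotate_about_z(positions, angle):
    """Rotate every atom position about the z axis by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in positions]

# Hypothetical water-like geometry and made-up initial node features.
positions = [(0.0, 0.0, 0.0), (0.96, 0.0, 0.0), (-0.24, 0.93, 0.0)]
features = [0.8, 0.1, 0.1]

original = predict_property(features, positions)
moved = [(x + 1.0, y - 2.0, z + 0.5)
         for x, y, z in rotate_about_z(positions, 1.1)]
transformed = predict_property(features, moved)
print(abs(original - transformed) < 1e-9)
```

A scalar property such as an energy should be invariant, as checked here. A vector output such as the force on each atom should instead rotate along with the molecule, which is the equivariant case.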
Yeah, symmetries are so powerful and fundamental in physics.
It's really interesting
to see them play such a fundamental role in machine learning as well.
We're seeing right now a lot of interest in machine learning in this space and a lot of
activities and really interesting research projects spinning up.
But if we look further out, what about the field of quantum computing?
What impact do you see quantum computing having in the domain of molecular simulation?
I think it's a very interesting question because molecules are inherently quantum systems.
The electrons in particular are well described by quantum mechanics.
And a quantum computer is in some sense a natural sort of quantum simulation.
So Feynman sort of first noticed that you can think of a quantum computer
as doing some kind of quantum experiment, some kind of quantum simulation.
And people think that the first actual applications of quantum computing are going to happen in
this field of simulating quantum mechanics itself.
Now, quantum computing is still at its infant stages. And, you know,
we really need fault-tolerant quantum computing for it to be truly useful. And that might be more than
10 years away. But hopefully already somewhat earlier, it's hard to predict precisely how much
earlier, we can use sort of more noisy quantum devices to start modeling molecules using quantum computing.
And I think the most exciting part of this is that I believe that quantum computing and
machine learning are going to collaborate together in order to make this really useful.
So you could imagine a machine learning algorithm doing a bit of computation and then asking
a quantum computer to do a particular computation, say, why don't you use this energy for me or this Hamiltonian for me?
Do a computation, give the result back to me, and I'll process that information to then try and figure out what the next thing is that you should compute.
And I think it's in this interaction that we're going to see a lot of action in the future. But it might be still, you know, more than five years out.
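One existing pattern that has this back-and-forth shape is a variational loop, in which a classical routine proposes parameters, a quantum device returns an energy for a given Hamiltonian, and the classical side uses the result to choose the next query. The sketch below is only an illustration of that pattern under loud assumptions: the "quantum device" is simulated classically for a single qubit, the Hamiltonian H = Z + 0.5 X is a made-up toy, and the classical side is plain gradient descent rather than a learned model.

```python
import math

def quantum_energy(theta):
    """Stand-in for the quantum device, simulated classically here.

    Prepares the one-qubit state Ry(theta)|0> = (cos(theta/2), sin(theta/2))
    and returns the expectation value of the toy Hamiltonian H = Z + 0.5 X.
    """
    a, b = math.cos(theta / 2), math.sin(theta / 2)
    expect_z = a * a - b * b
    expect_x = 2 * a * b
    return expect_z + 0.5 * expect_x

def classical_step(theta, learning_rate=0.4):
    """Classical side: query the device twice, then choose the next theta.

    The gradient comes from the parameter-shift rule, which needs only
    energy evaluations at shifted parameter values.
    """
    gradient = 0.5 * (quantum_energy(theta + math.pi / 2)
                      - quantum_energy(theta - math.pi / 2))
    return theta - learning_rate * gradient

theta = 0.1  # arbitrary starting parameter
for _ in range(100):
    theta = classical_step(theta)

energy = quantum_energy(theta)
exact_ground_energy = -math.sqrt(1.0 + 0.5 ** 2)  # lowest eigenvalue of H
print(f"estimated {energy:.6f}, exact {exact_ground_energy:.6f}")
```

On real hardware each call to `quantum_energy` would be a noisy estimate from repeated measurements, which is one place where a learned model on the classical side could help decide what to compute next.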
So really, we have two pieces of news today. As well as the news that you'll be joining Microsoft Research,
we're also announcing the opening of a new research lab in Amsterdam, which you'll be leading. And we're going to be hiring
extensively in molecular simulation, both in Cambridge and in Amsterdam over the coming years.
Tell us a little bit about your early thoughts as to what the Amsterdam lab might be focusing on.
So first of all, personally, I'm super grateful that Microsoft has decided to open a new lab in Amsterdam.
I mean, this is somewhat of a dream come true for me.
You know, I love Amsterdam.
It's a great place to live and a great place to set up an office, actually, I think, because there's a lot of talent running around here.
Yeah, so what are we going to do?
Well, the first piece of business is, of course, to fill the lab with excellent researchers
and to build a very diverse team in Amsterdam.
That will be a significant effort.
But also, I really want to start the actual science, you know, collaborating with the
Cambridge teams and other teams around the globe to start thinking about precisely, you
know, what we want to achieve over the next couple of years. I hope that we can make a start with building a system
that can predict properties of molecules,
can generate molecules with certain properties,
and can search through this enormous space of these molecules.
There's so many possible molecules,
more than possible atoms in the universe.
And you have to somehow search through that space if you're looking for molecules with certain properties.
And it's only through machine learning that we can now begin to search fast through this space.
And I want to make a beginning with that sort of program to build that software stack that can achieve that.
That sounds really exciting.
Actually, we've already made our first hire into the Amsterdam lab, of course, Rianne van den Berg.
She spent about three months with us in Cambridge as a visiting researcher a few years ago while
she was a postdoc with you at the University of Amsterdam. So obviously, you know Rianne very well.
Yeah, I know her very well. And I'm extremely impressed with her technical achievements. She was a physicist like
ourselves. Then she moved to work with us on interesting graph neural network problems.
She's made a major impact in the field and of course given the fact that she is both
sort of a physicist and a sort of machine learner, she's the perfect hire for the Amsterdam team
to work with us on these types of problems.
That sounds great.
As well as joining Microsoft Research,
you're also going to have a joint appointment
with the University of Amsterdam.
Microsoft Research has worked with you for several years
in that capacity, funding students and postdocs.
And of course, we've hired some great graduates
from your lab over the years. And we've collaborated with your lab on various research projects.
Can you tell us a bit more about the work of your university group?
Yeah, so I'm very grateful, actually, that Microsoft allows me to have that joint position. I personally believe it is very
important, actually, to have this influx from academia and to have one leg in one
and one leg in the other. The research lab is called AMLab. It's about 50
researchers right now, with five or so faculty. We've been working a lot on sort
of deep learning problems, but also graphical models, which was sort of the
hype maybe ten years ago.
You know, you've written a book about it.
We all teach from your book, actually, at the university.
And connected to that also generative models, normalizing flows, VAEs, and also causality research.
And I really hope that Microsoft can fruitfully collaborate
with the University of Amsterdam,
but also with the other knowledge institutes
in the Amsterdam ecosystem.
And in particular, I'm thinking about some of these ICAI labs,
which is the Innovation Center for AI,
where we could possibly start one of these labs
and work on the more academic problems with students in that lab
and hopefully inspire them
to join Microsoft at some point.
It's a very impressive group. I mean,
some amazing researchers have really come out of your team in Amsterdam. And so it's very exciting
to have that close partnership. So finally, Max, you'll be joining us in September. If we look
ahead, say 10 years, what do you hope we will have accomplished?
So that's an excellent question.
And it's a bit of a glass ball to be looking into. But I certainly know my dreams. So my dreams are
that in 10 years, we will have cracked the problem of understanding molecules. We will be able to
design new materials on the fly. We will have designed new catalysts to feed the green
economy. We will be able to design new drugs for all sorts of diseases that we cannot treat right
now. I hope also that by that time we have grown and inspired a whole lot of people around the
world in joining us in this very important effort.
Well, I've no doubt at all, Max, that that combination of hiring you,
as well as opening a new research lab in continental Europe,
and also defining this major new mission that I think really combines intellectually very deep science
together with the opportunity for major social impact.
I think that whole package will be very inspiring
for many scientists and engineers
who'll want to come and join us in this exciting endeavor.
Thank you for your time, Max.
And thanks also to our listeners.
For those of you interested in joining Microsoft Research
to work with us on molecular simulation,
we've just made several job postings
for both the Cambridge and the Amsterdam labs.
And you can learn more about Microsoft Research and the great work coming out of its labs at microsoft.com/research.
And be sure to subscribe for new episodes of the Microsoft Research Podcast.
Until next time.