Microsoft Research Podcast - Abstracts: March 21, 2024
Episode Date: March 21, 2024
Senior Researcher Chang Liu discusses M-OFDFT, a variation of orbital-free density functional theory (OFDFT) that leverages deep learning to help identify molecular properties in a way that minimizes the tradeoff between accuracy and efficiency.
Transcript
Welcome to Abstracts,
a Microsoft Research podcast that puts
the spotlight on world-class research in brief.
I'm Dr. Gretchen Huizinga.
In this series,
members of the research community at Microsoft give us
a quick snapshot or a podcast abstract
of their new and noteworthy papers.
Today, I'm talking to Dr. Chang Liu,
a senior researcher from Microsoft Research AI for Science.
Dr. Liu is co-author of a paper called
Overcoming the Barrier of Orbital-Free Density Functional Theory
for Molecular Systems Using Deep Learning.
Chang Liu, thanks for joining us on Abstracts.
Thank you. Thank you for this opportunity to share our work.
So in a few sentences, tell us about the issue or problem your paper addresses
and why people should care about this research.
Sure. Since this is an AI for science work, let's start from this perspective.
In science, people always want to understand
the properties of matter, such as why some substances can cure disease and why some
materials are heavy or conductive. For a very long period of time, these properties could only
be studied by observation and experiment, and the outcomes just looked like magic to us.
If we can understand the underlying mechanism
and calculate these properties on our computer,
then we can do the magic ourselves
and it can hence accelerate industries
like medicine development and material discovery.
Our work aims to develop a method
that handles the most fundamental part
of such property calculation
with better accuracy and efficiency. If you zoom in on the problem, the properties of matter
are determined by the properties of molecules that constitute the matter. For example, the energy of
a molecule is an important property. It determines which structure the molecule mostly takes,
and the structure indicates whether it can bind to a disease-related biomolecule.
You may know that molecules consist of atoms, and atoms consist of nuclei and electrons,
so properties of a molecule are the result of the interaction among the nuclei and the electrons in the molecule.
The nuclei can be treated as classical particles, but electrons exhibit significant quantum effects.
You can imagine this as the electrons moving so fast that they appear like a cloud or mist spreading over space. To calculate the properties of the molecule,
you need to first solve the electronic structure,
that is, how the electrons spread over space.
This is governed by an equation that is hard to solve.
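For reference, the equation being referred to here is the time-independent electronic Schrödinger equation, which in LaTeX notation reads

\hat{H}\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) \;=\; E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N),

where the Hamiltonian \hat{H} collects the electrons' kinetic energy and the Coulomb interactions among the electrons and the nuclei, and the electron density is obtained from the squared wavefunction |\Psi|^2.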
The target of our research is hence to develop a method
that solves the electronic structure more accurately and more efficiently
so that properties of molecules can be calculated at a higher level of accuracy and efficiency,
which leads to better ways to solve industrial problems.
Well, most research owes a debt to work that went before, but also moves the science forward. So
how does your approach
build on and or differ from related research in this field?
Yes, there are indeed quite a few methods that can solve the electronic structure,
but they show a harsh trade-off between accuracy and efficiency. Currently,
density functional theory, often called DFT, achieves a preferred balance for
most cases and is perhaps the most popular choice.
But DFT still requires a considerable cost for large molecular systems.
It has a cubic cost scaling.
We hope to develop a method that scales with a milder cost increase. We noted an alternative type of method
called orbital-free DFT, or OFDFT, which has a lower order of cost scaling. But existing OFDFT
methods cannot achieve satisfactory accuracy on molecules. So our work leverages deep learning to achieve an accurate OFDFT method.
The method can achieve the same level of accuracy as conventional DFT
while inheriting the cost scaling of OFDFT, and hence it is more efficient than conventional DFT.
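In rough asymptotic terms, and only as a schematic summary of the scalings mentioned in this conversation,

t_{\text{conventional DFT}} \sim O(N^{3}), \qquad t_{\text{OFDFT}} \sim O(N^{\alpha}) \ \text{with}\ \alpha < 3,

where N measures the size of the molecular system; the empirical scaling reported later in this episode for M-OFDFT is below quadratic.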
Okay, so we're moving acronyms from DFT to OFDFT, and you've got an acronym that goes M-OFDFT. What does that stand for?
The M represents molecules, since it is especially hard for classical or existing OFDFT to achieve good accuracy on molecules.
So our development tackles that challenge.
Great. And I'm eager to hear about your methodology and your findings.
So let's go there.
Tell us a bit about how you conducted this research and what your methodology was.
Yeah, regarding methodology, let me delve a bit into some details. We follow the formulation of OFDFT, which solves the electronic structure by optimizing the electron density, where the
optimization objective is to minimize the electronic energy. The challenge in OFDFT is
part of the electronic energy, specifically the kinetic energy. It is hard to calculate
accurately, especially for molecular systems. Existing computation formulas are based on
approximate physical models, but the approximation accuracy is not satisfactory.
Our method uses a deep learning model to calculate the kinetic energy.
We train the model on labeled data, and thanks to its powerful learning ability, the model can give a more accurate result.
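Schematically, the optimization being described can be written as follows, using generic OFDFT notation rather than necessarily the exact decomposition used in the paper:

\rho^{*} \;=\; \arg\min_{\rho \,\ge\, 0,\ \int \rho(\mathbf{r})\,d\mathbf{r} \,=\, N_e} \Big\{\, T_{\theta}[\rho] \;+\; E_{\text{ext}}[\rho] \;+\; E_{\text{H}}[\rho] \;+\; E_{\text{xc}}[\rho] \,\Big\},

where T_{\theta} is the deep learning model for the kinetic energy and the remaining terms are the usual external, Hartree, and exchange-correlation energies.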
This is the general idea, but there are many technical challenges.
For example, since the model is used as an optimization objective, it needs to capture the overall landscape of the function.
The model cannot recover the landscape if only one labeled data point is provided.
For this, we made a theoretical analysis on the data generation method and found a way to generate multiple labeled data points for each molecular structure.
Moreover, we can also calculate a gradient label for each data point, which provides the slope information on the landscape.
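As a rough illustration of how energy and gradient labels can be used together in training, here is a minimal sketch in PyTorch; the model, dimensions, and loss are made up for illustration and are not the paper's actual architecture or objective.

import torch

# Hypothetical learned kinetic-energy functional T_theta(c), where c is a
# vector representation of the electron density (e.g., expansion coefficients).
model = torch.nn.Sequential(
    torch.nn.Linear(64, 256), torch.nn.SiLU(),
    torch.nn.Linear(256, 256), torch.nn.SiLU(),
    torch.nn.Linear(256, 1),
)

def loss_fn(c, energy_label, grad_label):
    # Supervise both the predicted energy and its gradient with respect to the
    # density representation, so the model also learns the slope of the landscape.
    c = c.clone().requires_grad_(True)
    pred_energy = model(c).sum()
    (pred_grad,) = torch.autograd.grad(pred_energy, c, create_graph=True)
    return (pred_energy - energy_label) ** 2 + ((pred_grad - grad_label) ** 2).sum()

# One made-up labeled data point: a density representation, its energy label,
# and the gradient label carrying the local slope information.
c = torch.randn(64)
energy_label = torch.randn(())
grad_label = torch.randn(64)
loss = loss_fn(c, energy_label, grad_label)
loss.backward()  # gradients w.r.t. the model parameters, ready for an optimizer step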
Another challenge is that the kinetic energy has a strong non-local effect,
meaning that the model needs to account for the interaction between any pair of points in space.
This incurs a significant cost if using the conventional way to represent the density, that is, using a grid.
For this challenge, we choose to expand the density function on a set of basis functions
and use the expansion coefficients to represent the density.
The benefit is that this greatly reduces the representation dimension,
which in turn reduces the cost of the non-local calculation.
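In symbols, and in generic notation rather than the paper's exact basis, the density is written as

\rho(\mathbf{r}) \;\approx\; \sum_{\mu=1}^{M} c_{\mu}\, \omega_{\mu}(\mathbf{r}),

so the model only has to work with the coefficient vector (c_1, \dots, c_M), whose dimension M is much smaller than the number of grid points a grid representation would need.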
These two examples are also the differences from other deep learning OFDFT works. There are more technical designs, and you may check them in the paper.
So talk about your findings after you completed and analyzed what you did. What were your major takeaways or findings?
Yeah, let's dive into the details, into the empirical findings. We find that our deep learning OFDFT, abbreviated as M-OFDFT,
is much more accurate than existing OFDFT methods, with tens to hundreds of times lower error,
and achieves the same level of accuracy as conventional DFT. On the other hand, the speed is indeed improved over conventional DFT.
For example, on a protein molecule with more than 700 atoms, our method achieves a nearly 30-times speedup.
The empirical cost scaling is lower than quadratic and is one order less than that of conventional DFT.
So the speed advantage would be more significant on larger molecules.
I'd also like to mention an interesting observation.
Since our method is based on deep learning,
a natural question is how accurate the method would be
if applied to much larger molecules than those used for training the deep learning model.
This is a generalization challenge and is one of the major challenges of deep learning methods for molecular science applications. We investigated this question for our method and found that the
error increases slower than linearly with molecular size. Although this is not perfect,
since the error is still increasing,
it is better than using the same model
to predict the property directly,
which shows an error that increases faster than linearly.
This shows, to some extent, the benefit
of leveraging the OFDFT framework
when using a deep learning method
to solve molecular tasks.
Well, let's talk about real-world impact for a second.
You've got this research going on in the lab, so to speak.
How does it impact real-life situations?
Who does this work help the most, and how?
Since our method achieves the same level of accuracy as conventional DFT,
but runs faster, it could accelerate molecular property calculations and molecular dynamics
simulations, especially for large molecules. Hence, it has the potential to accelerate solving
problems such as medicine development and material discovery. Our method
also shows that AI techniques can create new opportunities for other electronic structure
formulations, which could inspire more methods to break the long-standing trade-off between
accuracy and efficiency in this field. So if there was one thing you wanted our listeners
to take away, just one little nugget from your research, what would that be?
If only one thing, it would be that we developed a method that calculates molecular properties more
accurately and efficiently than the current portfolio of available methods.
So finally, Chang, what are the big unanswered questions and unsolved problems that remain in
this field? And what's next on your research agenda?
Yeah, sure. There indeed remain problems and challenges. One remaining challenge mentioned
above is the generalization to molecules much larger than those in training.
Although the OFDFT method is better than directly predicting properties, there is still room to improve.
One possibility is to follow the success of large language models by including more abundant and more diverse data in training, and
using a large model to digest all the data.
This can be costly, but it may give us a surprise.
Another way we may consider is to incorporate mathematical structures of the learning target functional into the model, such as
convexity, lower and upper bounds, and some invariances. Such structures
could regularize the model when it is applied to larger systems than it has seen
during training. We have actually incorporated some such structures into the model, for example, geometric invariance.
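As a toy illustration of what building in geometric invariance can look like, here is a minimal sketch, not the paper's actual construction: feed the model only rotation- and translation-invariant quantities, such as pairwise interatomic distances, so that rotating or translating the molecule cannot change the prediction.

import math
import torch

def invariant_features(positions: torch.Tensor) -> torch.Tensor:
    # Pairwise interatomic distances are unchanged by any rotation or translation
    # of the molecule, so a model built on them is geometrically invariant by construction.
    diff = positions.unsqueeze(0) - positions.unsqueeze(1)  # shape (n, n, 3)
    dist = diff.norm(dim=-1)                                # shape (n, n)
    idx = torch.triu_indices(dist.shape[0], dist.shape[1], offset=1)
    return dist[idx[0], idx[1]]                             # upper-triangle distances

# Quick check: rotating the molecule about the z-axis leaves the features unchanged.
pos = torch.randn(5, 3)
c, s = math.cos(0.7), math.sin(0.7)
rot = torch.tensor([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
assert torch.allclose(invariant_features(pos), invariant_features(pos @ rot.T), atol=1e-5)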
But other mathematical properties are non-trivial to incorporate.
We discuss this in the paper,
and we will keep working in that direction in the future.
The ultimate goal underlying this technical development
is to build a computational method
that is universally fast and accurate, so that we can simulate molecular systems of any kind.
Well, Chang Liu, thanks for joining us today. And to our listeners, thanks for tuning in.
If you want to read this paper, you can find a link at aka.ms forward
slash abstracts. You can also read it on arXiv, or you can check out the March 2024 issue of
Nature Computational Science. See you next time on Abstracts.