Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- Recent years have seen fast improvement in neural architectures that learn potential energy surfaces from molecular data.
- Message Passing Neural Network (MPNN) is a key driver of this success.
- MPNN relies on a spatial distance limit on messages, which can impede the learning of long-range interactions.
- Ewald message passing is proposed to address this drawback, which limits interactions via a cutoff on frequency instead of distance.
- Ewald message passing is computationally cheap and agnostic to other architectural details.
- Robust improvements in energy mean absolute errors are observed across all models and datasets, averaging 10% on OC20 and 16% on OE62.
- Improvements have an outsize impact on structures with high long-range contributions to the ground truth energy.
Paper Content
Introduction
- Graph neural networks (GNNs) can learn molecular properties from quantum chemical reference data quickly and accurately
- GNNs have difficulty learning long-ranged physical interactions between atoms
- GNNs are in contrast to molecular dynamics, which uses long-range interactions
- Ewald summation decomposes physical interactions into short-range and long-range parts
- Ewald message passing is a general framework to complement existing GNN models
- Experiments show robust improvements in energy mean absolute errors
- Ewald message passing recovers the effects of a hand-crafted long-range correction term
Background
- MPNNs are a conceptual framework for molecular GNN architectures
- Molecules are represented as graphs with atoms as vertices
- Atoms are adjacent if they are within a certain distance
- Embeddings are created based on local atom properties
- Messages are sent between atoms to model interactions
- Outputs are generated from the embeddings
- PBC is used to approximate an infinite system with a finite atom collection
Ewald message passing
- Analogy between electrostatics-type interactions and continuous filter convolutions
- Previous work has highlighted this analogy as a useful source of inductive bias
- Ewald summation evaluates potentials given a PBC lattice, a generic interaction kernel and generic atom “charges”
- Ewald message passing flips the rationale behind Ewald summation
- Ewald message passing combines short-range and long-range message passing
- Long-range sum is rewritten as a sum over Fourier frequencies
- Fourier coefficients are related to the 3D Fourier transform of the kernel
- Frequency filters are parametrized as linear combinations of radial basis functions in the aperiodic case
- Non-radial filtering strategy is used in the periodic case to make use of the PBC frame
Related work
- Hand-crafted long-range corrections are used to augment short-ranged models in computational chemistry
- DFT and MPNN potentials cannot correctly express long-range effects
- Ewald message passing is designed to fill the same demand for a cheap, largely out-of-the-box correction in the MPNN context
- Non-local learning schemes are used to target non-local interactions in a structure
- Ewald summation is a computational technique for the summation of long-range interactions
Experiments
- OC20 dataset features adsorption energies and atom forces for 265 million structures
- OE62 dataset features DFT-computed energies for 62,000 large organic molecules
- OC20 features PBC, OE62 is aperiodic
- Aim to upgrade baseline architectures with Ewald summation
- Test if Ewald message passing leads to significant and robust improvements of energy mean absolute errors
- Measure average runtime per structure
- Increase settings for distance cutoff and maximum number of message-passing neighbors to compare
- Primary role of short-range model choice in GNN design landscape
- Consistency of accuracy gains through Ewald MP encourages its use
- Runtime overheads are comparable to, if not smaller than, the runtime differences between baselines
- Average Ewald improvements across all models are 16.1% for OE62 and 10.3% for OC20
- Ewald MP achieves better cost-performance tradeoff than large cutoff setting on OC20
- Ewald MP appears to be more efficient on OC20
- Improvements of force MAEs on OC20 are consistent but small
- Ewald MP learns long-range effects in particular
Conclusion
- Ewald message passing is a general message passing framework based on Ewald summation
- It is computationally efficient and model-agnostic
- It improves energy targets across models and datasets
- It has an outsize impact on structures with high long-range energy contributions
- It is especially useful for large or periodic structures containing a diverse set of atoms
- It is implemented with real and imaginary parts
- It uses damping values and filtering schemes
- It has hyperparameters for the aperiodic and periodic cases
- It uses a linear combination of energy and force loss mean absolute errors as its loss function
- It has ablation studies to test its robustness in hyperparameter space