Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Recent years have seen fast improvement in neural architectures that learn potential energy surfaces from molecular data.
  • Message Passing Neural Network (MPNN) is a key driver of this success.
  • MPNN relies on a spatial distance limit on messages, which can impede the learning of long-range interactions.
  • Ewald message passing is proposed to address this drawback, which limits interactions via a cutoff on frequency instead of distance.
  • Ewald message passing is computationally cheap and agnostic to other architectural details.
  • Robust improvements in energy mean absolute errors are observed across all models and datasets, averaging 10% on OC20 and 16% on OE62.
  • Improvements have an outsize impact on structures with high long-range contributions to the ground truth energy.

Paper Content

Introduction

  • Graph neural networks (GNNs) can learn molecular properties from quantum chemical reference data quickly and accurately
  • GNNs have difficulty learning long-ranged physical interactions between atoms
  • GNNs are in contrast to molecular dynamics, which uses long-range interactions
  • Ewald summation decomposes physical interactions into short-range and long-range parts
  • Ewald message passing is a general framework to complement existing GNN models
  • Experiments show robust improvements in energy mean absolute errors
  • Ewald message passing recovers the effects of a hand-crafted long-range correction term

Background

  • MPNNs are a conceptual framework for molecular GNN architectures
  • Molecules are represented as graphs with atoms as vertices
  • Atoms are adjacent if they are within a certain distance
  • Embeddings are created based on local atom properties
  • Messages are sent between atoms to model interactions
  • Outputs are generated from the embeddings
  • PBC is used to approximate an infinite system with a finite atom collection

Ewald message passing

  • Analogy between electrostatics-type interactions and continuous filter convolutions
  • Previous work has highlighted this analogy as a useful source of inductive bias
  • Ewald summation evaluates potentials given a PBC lattice, a generic interaction kernel and generic atom “charges”
  • Ewald message passing flips the rationale behind Ewald summation
  • Ewald message passing combines short-range and long-range message passing
  • Long-range sum is rewritten as a sum over Fourier frequencies
  • Fourier coefficients are related to the 3D Fourier transform of the kernel
  • Frequency filters are parametrized as linear combinations of radial basis functions in the aperiodic case
  • Non-radial filtering strategy is used in the periodic case to make use of the PBC frame
  • Hand-crafted long-range corrections are used to augment short-ranged models in computational chemistry
  • DFT and MPNN potentials cannot correctly express long-range effects
  • Ewald message passing is designed to fill the same demand for a cheap, largely out-of-the-box correction in the MPNN context
  • Non-local learning schemes are used to target non-local interactions in a structure
  • Ewald summation is a computational technique for the summation of long-range interactions

Experiments

  • OC20 dataset features adsorption energies and atom forces for 265 million structures
  • OE62 dataset features DFT-computed energies for 62,000 large organic molecules
  • OC20 features PBC, OE62 is aperiodic
  • Aim to upgrade baseline architectures with Ewald summation
  • Test if Ewald message passing leads to significant and robust improvements of energy mean absolute errors
  • Measure average runtime per structure
  • Increase settings for distance cutoff and maximum number of message-passing neighbors to compare
  • Primary role of short-range model choice in GNN design landscape
  • Consistency of accuracy gains through Ewald MP encourages its use
  • Runtime overheads are comparable to, if not smaller than, the runtime differences between baselines
  • Average Ewald improvements across all models are 16.1% for OE62 and 10.3% for OC20
  • Ewald MP achieves better cost-performance tradeoff than large cutoff setting on OC20
  • Ewald MP appears to be more efficient on OC20
  • Improvements of force MAEs on OC20 are consistent but small
  • Ewald MP learns long-range effects in particular

Conclusion

  • Ewald message passing is a general message passing framework based on Ewald summation
  • It is computationally efficient and model-agnostic
  • It improves energy targets across models and datasets
  • It has an outsize impact on structures with high long-range energy contributions
  • It is especially useful for large or periodic structures containing a diverse set of atoms
  • It is implemented with real and imaginary parts
  • It uses damping values and filtering schemes
  • It has hyperparameters for the aperiodic and periodic cases
  • It uses a linear combination of energy and force loss mean absolute errors as its loss function
  • It has ablation studies to test its robustness in hyperparameter space