Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Atomic partial charges are important for molecular dynamics simulations.
  • Traditionally, partial charges are assigned using quantum chemical methods.
  • A hybrid physical/graph neural network-based approach is proposed to approximate the widely popular AM1-BCC charge model.
  • This hybrid approach is orders of magnitude faster and maintains accuracy comparable to differences in AM1-BCC implementations.
  • The hybrid approach scales linearly with the number of atoms.
  • Source code is available at a specified URL.

Paper Content

Theory: espaloma graph neural networks for chemical environment perception, charge equilibration (qeq), and espalomacharge

  • Espaloma uses graph neural networks to assign continuous latent representations of chemical environments to atoms
  • GNNs are used to assign symmetry-preserving parameters for atomic, bond, angle, torsion, and improper force terms
  • Initial features associated with nodes are determined based on resonance-independent atomic chemical features
  • Message-passing step of GNN consists of edge update, neighborhood aggregation, and node update
  • Hyperparameters are optimized during training to produce robust models
  • Espaloma framework is used to predict atomic parameters
  • Partial charges are predicted by predicting electronegativity and hardness of each atom
  • Runtime complexity of EspalomaCharge is ( )

Experiments: espalomacharge accurately reproduces am1-bcc charges at a fraction of its cost

  • EspalomaCharge is comparable to or better than AmberTools and OpenEye in terms of charge RMSE
  • EspalomaCharge is generalizable to other molecules of significance to chemical and biophysical modeling
  • EspalomaCharge is most accurate where there is abundant data in the training set
  • EspalomaCharge is 300 to 3000 times faster than AmberTools and 15 to 75 times faster than OpenEye
  • EspalomaCharge has ( ) complexity and is capable of parameterizing peptides of a few hundred residues within seconds
  • EspalomaCharge provides a seamless way to accelerate parameterization by distributing calculations on GPU hardware
  • EspalomaCharge provides statistically indistinguishable performance compared to AmberTools and the OpenEye toolkit on both RMSE and R2 metrics

Discussion

  • EspalomaCharge assigns high-quality conformation-independent AM1-BCC charges
  • Uses modern machine learning infrastructure
  • Supports accelerated hardware
  • Runtime complexity is ( ) with respect to number of atoms
  • Small discrepancies to high-quality AM1-BCC reference implementations
  • Assigns charges to biopolymers with hundreds of residues
  • Python-based machine learning framework
  • One-hot element encoding prevents model from perceiving elemental similarities
  • Future work could aim to reduce error for uncommon chemistries
  • Incorporate ESP as a target in the training process
  • Multi-objective strategy to include multiple targets and charge regularization terms

Funding

  • Research was supported by the National Institute for General Medical Sciences of the National Institutes of Health
  • YW was funded by NIH grant R01GM132386 and the Sloan Kettering Institute
  • JDC was funded by NIH grants R01GM132386 and R01GM140090

Disclaimer

  • Authors are responsible for the content
  • Content does not represent the official views of the National Institutes of Health

Disclosures

  • JDC is a member of the Scientific Advisory Board of four companies
  • The Chodera laboratory receives funding from multiple sources
  • Conformers were generated using RDKit and MMFF94 charges
  • ESPs were calculated using OpenFF Recharge
  • Induced solvent potential was calculated using OpenEye ZAP
  • Hydration free energies were calculated using a modified protocol
  • Charges were assigned using a hybrid physical/GNN model
  • EspalomaCharge showed smaller average charge RMSE than AmberTools
  • EspalomaCharge is fast, even for large systems
  • EspalomaCharge can be used in batch mode
  • EspalomaCharge introduced little error to explicit hydration free energy prediction