Abstract

  • AlphaFold2 was presented at CASP14 in December 2020, revolutionizing protein structure prediction
  • The AlphaFold2 code was released in summer 2021; it can accurately predict the structure of most proteins and protein-protein interactions
  • The release sparked an explosion of development in the field, improving AI-based methods for protein complexes, disordered regions, and protein design

Paper Content

Background

  • Protein structure prediction is divided into two categories: template-based and de-novo
  • In the 1980s, template-based methods were developed that predict a structure by copying coordinates from the experimentally determined structure of a related protein
  • Increase in sequence and structure data has enabled more proteins to be modelled accurately
  • The boundary between template-based and de-novo modelling has blurred over the last two decades, as de-novo methods build models from fragments of known structures
  • In the 1990s, co-evolutionary signals in a multiple sequence alignment were proposed to predict structure
  • In 1999, a solution was proposed to increase the accuracy of contact predictions by separating direct from indirect contacts
  • About ten years ago, the idea of separating direct from indirect correlations was rediscovered as direct coupling analysis (DCA)
  • Combining DCA with machine learning improved contact prediction further
  • AlphaFold (version 1) was introduced at CASP13; it used a deeper architecture and predicted distance probabilities
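
The co-evolutionary signal mentioned above can be illustrated with mutual information between two alignment columns. The following is a minimal sketch, not the method of any particular paper: the function name and the toy alignment are made up for illustration, and plain mutual information is exactly the kind of score that mixes direct and indirect correlations, which is the problem DCA was designed to address.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in nats) between columns i and j of an alignment.

    `msa` is a list of equal-length aligned sequences (hypothetical toy input).
    """
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)
    counts_j = Counter(seq[j] for seq in msa)
    counts_ij = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), n_ab in counts_ij.items():
        p_ab = n_ab / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 3 always change together,
# columns 0 and 2 vary independently, column 1 is fully conserved.
toy_msa = ["ACDA", "ACEA", "GCDG", "GCEG"]
covarying = column_mutual_information(toy_msa, 0, 3)
independent = column_mutual_information(toy_msa, 0, 2)
```

In this toy case the co-varying pair scores higher than the independent pair, which is the signal contact predictors look for. Real alignments additionally need sequence weighting and corrections for phylogenetic and sampling bias, none of which is shown here.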

AlphaFold v2.0

  • DeepMind presented AlphaFold2 at CASP14 with impressive results
  • AlphaFold2 source code was released with an open-source license
  • AlphaFold2 consists of two main modules: the Evoformer block and the structure module
  • AlphaFold2 input is a “raw” multiple sequence alignment
  • The Evoformer block uses row-wise and column-wise attention mechanisms and triangle updates
  • The structure module is equivariant to local translations and rotations
  • The structure module contains information about sidechains and uses Frame Aligned Point Error (FAPE) as a loss function
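
The idea behind Frame Aligned Point Error can be sketched in a few lines: every point is expressed in the local coordinate system of every frame, and predicted and true local coordinates are compared. The code below is a simplified pure-Python illustration, not AlphaFold's implementation. The function names are hypothetical, and the clamp and scale of 10 Å and the small `eps` term are assumptions taken from the published description of the loss rather than from this summary.

```python
import math


def to_local(rotation, translation, point):
    """Express a global point in a local frame: R^T (x - t).

    `rotation` is a 3x3 rotation matrix as nested lists, `translation` and
    `point` are 3-element lists.
    """
    d = [point[k] - translation[k] for k in range(3)]
    # Multiplying by the transpose of R: sum over rows r for each column c.
    return [sum(rotation[r][c] * d[r] for r in range(3)) for c in range(3)]


def fape(pred_frames, pred_points, true_frames, true_points,
         clamp=10.0, scale=10.0, eps=1e-4):
    """Simplified FAPE: mean clamped distance over all (frame, point) pairs.

    Frames are (rotation, translation) tuples. The default values of
    `clamp`, `scale` and `eps` are assumptions, see the text above.
    """
    total = 0.0
    count = 0
    for (rot_p, trans_p), (rot_t, trans_t) in zip(pred_frames, true_frames):
        for x_pred, x_true in zip(pred_points, true_points):
            local_pred = to_local(rot_p, trans_p, x_pred)
            local_true = to_local(rot_t, trans_t, x_true)
            sq = sum((a - b) ** 2 for a, b in zip(local_pred, local_true))
            total += min(clamp, math.sqrt(sq + eps))
            count += 1
    return total / (scale * count)
```

Because all comparisons happen in local frames, moving the whole predicted structure rigidly (rotating and shifting frames and points together) leaves the loss unchanged, which fits the equivariance property of the structure module listed above. The clamp bounds the contribution of any single badly placed point.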

The field after the release of AlphaFold2

  • AlphaFold was released as open source in July 2021
  • ColabFold became an essential tool for running AlphaFold
  • AlphaFold was used to predict protein structures and protein-peptide interactions
  • AlphaFold2 was developed to predict the structure of single protein chains
  • AlphaFold2 was tricked into predicting the structure of dimers and higher multimers
  • AlphaFold2 was used to study and design protein-peptide interactions
  • DeepMind developed a version of AlphaFold aimed at predicting the structure of complexes
  • AlphaFold2 can accurately predict about half of all complexes with up to six chains
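
One widely used way of "tricking" the single-chain model into predicting a complex was to concatenate the chains into one sequence and insert a large jump in the residue numbering at each chain boundary, so that the model does not treat the chains as covalently connected. The sketch below only shows how such an input could be prepared. The function name and the gap of 200 are illustrative assumptions, and this is not part of the AlphaFold API.

```python
def join_chains(chains, gap=200):
    """Concatenate chains and build a residue index with a gap between chains.

    `chains` is a list of amino-acid sequences (strings). `gap` is the
    number of skipped index positions at each chain break (assumed value).
    Returns the concatenated sequence and one index per residue.
    """
    sequence = "".join(chains)
    residue_index = []
    offset = 0
    for chain in chains:
        residue_index.extend(range(offset, offset + len(chain)))
        offset += len(chain) + gap
    return sequence, residue_index


# Two short made-up chains: the index jumps between them.
sequence, residue_index = join_chains(["ACD", "EF"], gap=200)
```

An alternative version of the same trick joins the chains with a flexible glycine-rich linker instead of an index gap. Dedicated multimer models made both workarounds largely unnecessary.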

AlphaFold2 clones/copies

  • The release of the AlphaFold code and description inspired work to reproduce it
  • RoseTTAFold was published at the same time, but initially performed worse than AlphaFold
  • Later implementations of RoseTTAFold rival AlphaFold in accuracy
  • AlphaFold uses a multiple sequence alignment to predict the structure of a single protein
  • OmegaFold and ESMFold are similar to AlphaFold but do not use an MSA
  • AlphaFold has been used for protein design, and machine-learning tools have been developed for protein-ligand interactions
  • Most groups used AlphaFold in some way in CASP15
  • There are two ways to improve over standard AlphaFold: use templates more efficiently or increase sampling
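
The "increase sampling" strategy amounts to running the predictor many times with different random seeds and keeping the model that the predictor itself scores as most confident. The sketch below shows only that selection loop. `toy_predict` is a hypothetical stand-in that returns a random confidence, not a call into AlphaFold, and a real pipeline would rank models by a predicted quality score.

```python
import random


def best_of_n(predict, seeds):
    """Run `predict` once per seed and keep the highest-confidence result.

    `predict(seed)` must return a (model, confidence) pair.
    Returns (model, confidence, seed) for the best run.
    """
    best = None
    for seed in seeds:
        model, confidence = predict(seed)
        if best is None or confidence > best[1]:
            best = (model, confidence, seed)
    return best


def toy_predict(seed):
    # Hypothetical stand-in for a structure predictor: the "model" is just
    # a label and the confidence is a seeded random number in [0, 1].
    rng = random.Random(seed)
    return f"model_{seed}", rng.uniform(0.0, 1.0)


few = best_of_n(toy_predict, range(5))
many = best_of_n(toy_predict, range(50))
```

Since the larger seed set contains the smaller one, the best confidence found with more samples can never be lower. Whether higher confidence means a better model depends on how well the confidence score is calibrated.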

Conclusions

  • AlphaFold was released in 2021 and has since been used in hundreds of papers.
  • AlphaFold can produce models close to experimental quality for most proteins and protein complexes.