Link to paper
The full paper is available here.
You can also find the paper on PapersWithCode here.
Abstract
- AlphaFold2 was presented at CASP14 in December 2020 and revolutionized protein structure prediction
- The AlphaFold2 code was released in summer 2021; it can accurately predict the structure of most proteins and many protein-protein interactions
- The release sparked an explosion of development in the field, improving AI-based methods for protein complexes, disordered regions, and protein design
Paper Content
Background
- Protein structure prediction is divided into two categories: template-based and de-novo
- In the 1980s, methods were developed to predict structure by copying coordinates from an experimentally determined protein structure
- Increase in sequence and structure data has enabled more proteins to be modelled accurately
- The boundary between template-based and de-novo modelling has blurred over the last two decades, as de-novo methods began to use fragments of known structures
- In the 1990s, co-evolutionary signals in a multiple sequence alignment were proposed to predict structure
- In 1999, a solution was proposed to increase the accuracy of contact predictions by separating direct from indirect contacts
- About ten years before the paper, this idea of removing indirect correlations was rediscovered as direct coupling analysis (DCA)
- Combining DCA with machine learning further improved contact prediction
- AlphaFold (version 1) was introduced at CASP13; it used a deeper architecture and predicted distance probabilities rather than binary contacts
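The co-evolution idea in the bullets above can be illustrated with a small sketch. It scores every pair of alignment columns by mutual information, which is a simpler statistic than DCA: it detects co-variation but does not separate direct from indirect couplings. The toy alignment and the function names are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def column(msa, i):
    """Return the residues found at alignment position i."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    counts_i = Counter(column(msa, i))
    counts_j = Counter(column(msa, j))
    counts_ij = Counter(zip(column(msa, i), column(msa, j)))
    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        mi += p_ab * math.log(p_ab / ((counts_i[a] / n) * (counts_j[b] / n)))
    return mi


def rank_column_pairs(msa):
    """Rank all column pairs from strongest to weakest co-variation."""
    length = len(msa[0])
    scores = {
        (i, j): mutual_information(msa, i, j)
        for i, j in combinations(range(length), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy alignment: columns 0 and 1 always change together (A with K, G with R),
# column 2 is fully conserved, column 3 varies independently.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD", "AKLE", "GRLD"]
top_pair, top_score = rank_column_pairs(toy_msa)[0]
print(top_pair, round(top_score, 3))
```

In a real alignment, high-scoring pairs are candidates for residues in spatial contact, but chains of correlated positions also score highly, which is the problem that separating direct from indirect contacts addresses.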
AlphaFold v2.0
- DeepMind presented AlphaFold2 at CASP14 with impressive results
- AlphaFold2 source code was released with an open-source license
- AlphaFold2 consists of two main modules: the EvoFormer block and the structure module
- AlphaFold2 input is a “raw” multiple sequence alignment
- EvoFormer block uses row and column-wise attention mechanism and triangle updates
- The structure module is translationally and rotationally equivariant, working in local residue frames
- Structure module contains information about sidechains and uses Frame Aligned Point Error as a loss function
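To make the Frame Aligned Point Error idea concrete, here is a minimal sketch in plain Python. It assumes each residue frame is a (rotation matrix, translation) pair, compares every point in the local coordinates of every frame, and uses a clamp and scale of 10 Å as illustrative defaults. It omits side chains and all training details, and the helper names are hypothetical, so it should be read as an illustration rather than AlphaFold2's implementation.

```python
import math


def transpose(m):
    return [list(row) for row in zip(*m)]


def matvec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def rot_z(angle):
    """Rotation matrix for a rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_local(frame, point):
    """Express a global point in a frame's local coordinates: R^T (x - t)."""
    rotation, translation = frame
    shifted = [point[k] - translation[k] for k in range(3)]
    return matvec(transpose(rotation), shifted)


def fape(pred_frames, pred_points, true_frames, true_points,
         clamp=10.0, scale=10.0):
    """Mean clamped distance between points seen from every local frame."""
    total = 0.0
    for pred_frame, true_frame in zip(pred_frames, true_frames):
        for pred_point, true_point in zip(pred_points, true_points):
            error = math.dist(to_local(pred_frame, pred_point),
                              to_local(true_frame, true_point))
            total += min(clamp, error)
    return total / (scale * len(pred_frames) * len(pred_points))


def move_globally(frames, points, rotation, shift):
    """Apply one rigid transform to every frame and every point."""
    moved_frames = [
        (matmul(rotation, r),
         [x + s for x, s in zip(matvec(rotation, t), shift)])
        for r, t in frames
    ]
    moved_points = [[x + s for x, s in zip(matvec(rotation, p), shift)]
                    for p in points]
    return moved_frames, moved_points


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
true_frames = [(identity, [0.0, 0.0, 0.0]), (rot_z(0.5), [3.8, 0.0, 0.0])]
true_points = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]]
pred_frames = [(identity, [0.0, 0.0, 0.0]), (rot_z(0.4), [3.8, 0.5, 0.0])]
pred_points = [[0.0, 0.0, 0.0], [3.8, 0.5, 0.0], [7.0, 1.0, 0.0]]

loss = fape(pred_frames, pred_points, true_frames, true_points)
moved_frames, moved_points = move_globally(
    pred_frames, pred_points, rot_z(1.2), [5.0, -2.0, 9.0])
loss_after_move = fape(moved_frames, moved_points, true_frames, true_points)
```

Because every point is compared in the local coordinates of each frame, rotating and translating the whole prediction leaves the loss unchanged (`loss` and `loss_after_move` agree up to floating-point rounding), so no global superposition onto the reference structure is needed.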
The field after the release of AlphaFold2
- AlphaFold2 was released as open source in July 2021
- ColabFold became an essential tool for running AlphaFold
- AlphaFold was used to predict protein structures and protein-peptide interactions
- AlphaFold2 was developed to predict the structure of single protein chains
- AlphaFold2 was tricked into predicting the structure of dimers and higher multimers
- AlphaFold2 was used to study and design protein-peptide interactions
- DeepMind then developed a version of AlphaFold, AlphaFold-Multimer, aimed specifically at predicting the structure of complexes
- AlphaFold2 can accurately predict about half of all complexes up to 6 chains
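One way the single-chain model was "tricked" into handling complexes was to concatenate the chains into one sequence and insert a large jump in the residue numbering at each chain boundary, so that the network does not treat the chains as covalently linked. The sketch below shows only that bookkeeping step. The function name is hypothetical and the gap of 200 is an illustrative value, not a parameter taken from the paper.

```python
def concatenate_chains(chains, index_gap=200):
    """Join chains into one sequence with a numbering jump between chains.

    index_gap is an illustrative assumption; it only needs to be large
    enough that residues across the boundary look far apart in sequence.
    """
    sequence = "".join(chains)
    residue_index = []
    offset = 0
    for chain in chains:
        residue_index.extend(range(offset, offset + len(chain)))
        offset += len(chain) + index_gap
    return sequence, residue_index


sequence, residue_index = concatenate_chains(["MKV", "GA"])
print(sequence)       # MKVGA
print(residue_index)  # [0, 1, 2, 203, 204]
```

The concatenated sequence and the modified residue index would then be passed to the predictor in place of the usual single-chain inputs.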
AlphaFold2 clones/copies
- The release of the AlphaFold code and method description inspired efforts to reproduce it
- RoseTTAFold was published at the same time but initially performed worse than AlphaFold
- Later implementations of RoseTTAFold rival AlphaFold in accuracy
- AlphaFold relies on a multiple sequence alignment (MSA) to predict the structure of a single protein
- OmegaFold and ESMFold are similar to AlphaFold but replace the MSA with a protein language model
- AlphaFold has been used for protein design and in machine-learning tools for protein-ligand interactions
- Most groups used AlphaFold in some way in CASP15
- Two ways to improve over standard AlphaFold: use templates more efficiently or increase sampling
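The "increase sampling" strategy amounts to generating many candidate models and keeping the one the predictor is most confident about. The sketch below shows that selection loop only. `mock_predict` is a hypothetical stand-in that returns a random confidence score; in practice it would be replaced by a real structure predictor and its own confidence measure.

```python
import random


def mock_predict(sequence, seed):
    """Hypothetical stand-in for a predictor: returns a fake confidence."""
    rng = random.Random(seed)
    return {"seed": seed, "confidence": rng.uniform(0.0, 1.0)}


def best_of_n(sequence, predict, n_samples):
    """Run the predictor with n_samples different seeds and keep the best."""
    candidates = [predict(sequence, seed) for seed in range(n_samples)]
    return max(candidates, key=lambda model: model["confidence"])


few = best_of_n("MKVGA", mock_predict, n_samples=5)
many = best_of_n("MKVGA", mock_predict, n_samples=50)
print(few["confidence"] <= many["confidence"])
```

Because the larger run includes every seed of the smaller one, its best score can never be lower. The open question in practice is whether the confidence score ranks models well enough for the extra compute to pay off.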
Conclusions
- AlphaFold was released in 2021 and has since been used in hundreds of papers.
- AlphaFold can produce models close to experimental quality for most proteins and protein complexes.