Google AI offshoot DeepMind has taken a pioneering step towards solving one of biology’s greatest challenges: determining a protein’s 3D shape from its amino acid sequence.
Proteins are the building blocks of life, responsible for most functions within a cell. They are large, complex molecules made up of amino acids, and their functions depend largely on their unique 3D structure. Moreover, proteins adopt these shapes without help, guided only by the laws of physics. For decades, researchers have wondered how a protein’s constituent parts map out the many twists and folds of its 3D structure. Early attempts in the 1980s and 1990s to predict protein structures by computer performed poorly. In 1994, Professor John Moult and Professor Krzysztof Fidelis founded CASP (Critical Assessment of Structure Prediction), a biennial assessment intended to catalyse research and establish the state of the art in accurately predicting protein structures. The event challenges teams to predict the structures of proteins that have only very recently been determined experimentally but have not yet been published.
In the 14th CASP assessment (CASP14), DeepMind’s AlphaFold 2 system achieved a median score of 92.4 GDT (Global Distance Test) across all targets. GDT ranges from 0 to 100 and is, roughly, the percentage of amino acid residues in the predicted structure that lie within a threshold distance of their correct position.
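To make the GDT idea concrete, here is a minimal Python sketch of a GDT-style score. It is an illustration under stated assumptions, not the official CASP evaluation code: it assumes the predicted and experimental structures are already superimposed and are given as lists of alpha-carbon coordinates in ångströms, and it averages over the 1, 2, 4 and 8 Å cutoffs commonly used for the GDT_TS variant. The function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS-style score on a 0-100 scale.

    Assumption: both structures are already superimposed. The real measure
    also searches over superpositions to maximise the residue counts.
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("structures must be non-empty and of equal length")

    # Distance between each predicted residue and its true position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]

    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]

    # Average the fractions and express the result as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)

# Toy example: four residues, displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]

print(gdt_ts(predicted, reference))  # 56.25
```

In this toy case one, two, three and three of the four residues fall within the 1, 2, 4 and 8 Å cutoffs respectively, so the score is the average of 25, 50, 75 and 75, or 56.25. A perfect prediction would score 100.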
For the latest version of AlphaFold, the team created an attention-based neural network system that attempts to interpret the structure of a spatial graph, in which the amino acid residues are the nodes and edges connect residues that lie close together. Specifically, it uses evolutionarily related sequences, multiple sequence alignments and a representation of amino acid residue pairs to refine this graph. By iterating this process, the system develops strong predictions of the underlying physical structure of the protein and is able to determine highly accurate structures. The team trained the system on publicly available data consisting of ~170,000 protein structures.
DeepMind’s AlphaFold 2 significantly outperformed the other teams at CASP14 and also improved on the results of its own previous version at the preceding CASP.
In summary, these results open up the potential for biologists to use computational structure prediction as a tool in critical research. Many of the most challenging aspects of developing treatments for diseases are fundamentally linked to proteins and the roles they play. This breakthrough demonstrates the impact AI can have on scientific discovery. It also underlines AI’s potential to dramatically accelerate progress in some of the most fundamental fields of science.
Image credit: By Image Team – canva.com