Source code for AlphaFold 2, an algorithm that predicts 3D protein structure with unprecedented accuracy, is now freely available.
Proteins are essential cellular constituents, directing most of the biological processes that sustain life. They are made of chains of amino acids that naturally fold into 3D structures, which in turn determine protein function. For decades, how amino acid sequences dictate 3D protein structures has befuddled scientists.
To address this, researchers have long endeavoured to predict protein structure computationally, with only limited success. That changed when DeepMind, an AI company owned by Google's parent Alphabet, developed the AlphaFold algorithm. The latest version, AlphaFold 2, placed first in CASP14, the 2020 round of the Critical Assessment of protein Structure Prediction competition.
AlphaFold 2
DeepMind has now released the source code for the newest version of AlphaFold, and the accompanying paper describing the method is freely accessible in Nature.
AlphaFold is an attention-based neural network system that predicts protein 3D structures from their amino acid sequences. It aligns a protein’s amino acid sequence with evolutionarily related sequences to estimate the 3D coordinates of the atoms in a protein. The predictions achieve near-experimental accuracy even when no similar protein structure is known.
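To give a feel for what comparing a sequence with its evolutionary relatives provides, here is a minimal Python sketch. It is not taken from AlphaFold's source code: the toy alignment, the function name column_conservation and the conservation score are all assumptions made for illustration. It measures how strongly each aligned position is preserved across related sequences.

```python
from collections import Counter

# Hypothetical toy alignment: the first row stands in for the query protein,
# the others for made-up evolutionarily related sequences. '-' marks a gap.
TOY_ALIGNMENT = [
    "MKTAY",
    "MKSAY",
    "MRTAY",
    "MK-AF",
]


def column_conservation(alignment):
    """For each aligned column, return the fraction of non-gap residues
    that match the most common residue in that column."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            # A column made only of gaps carries no signal.
            scores.append(0.0)
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / len(residues))
    return scores


conservation = column_conservation(TOY_ALIGNMENT)
for position, score in enumerate(conservation, start=1):
    print(f"position {position}: conservation {score:.2f}")
```

Positions that barely change across relatives tend to matter for structure or function. AlphaFold itself passes the alignment to its neural network rather than reducing it to a single summary score like this one, so the sketch only illustrates the kind of input involved.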
This version of AlphaFold is about 16 times faster than its predecessors, producing structures in minutes to hours, depending on protein size. Some earlier versions took days.
Prospects
With AlphaFold’s source code now accessible to the scientific community, researchers can work to further enhance the algorithm. In the future, AlphaFold may be able to determine the structure of multi-protein complexes and help design novel proteins. This would permit an even greater mechanistic understanding of the role of proteins in health and disease, and could in turn aid the development of new, efficacious therapies for currently untreatable diseases.
AlphaFold also sets the stage for interpreting the backlog of genomic data accumulated by advances in next-generation sequencing. Though sequencing has revealed thousands of disease-causing gene mutations, it falls short of elucidating their impact on protein function. Until recently, experimental approaches, such as X-ray crystallography and cryo-electron microscopy, were the only way to resolve protein structure. However, these are laborious and expensive, severely hindering the translation of sequencing data into protein structure. AlphaFold represents a rapid, computational solution to this problem and may finally allow the genetic architecture of disease to be functionally interpreted. Computationally resolving protein structures may usher in a new era of scientific discovery and medical breakthroughs.
Image credit: Debstar – Wikimedia Commons