Recently, researchers have developed a novel method that uses convolutional neural networks to detect adaptive introgression in the human genome.
Neanderthals are an extinct species of archaic humans that lived in Eurasia until about 40,000 years ago, when they were replaced by early European modern humans. Ancient DNA studies have shown that, before their extinction, Neanderthals interbred extensively with modern humans. However, the overall fate of this inherited genetic material remains largely unknown.
It is thought that over 40% of the Neanderthal genome has survived in present-day humans of non-African descent. However, this material is probably spread thinly across the population, so that any individual modern human genome is only around 2% Neanderthal.
Adaptive introgression is the incorporation of a genetic variant from another population or species that increases the fitness of the recipient population. Beneficial genetic material was frequently introduced into the human gene pool in this way, particularly after the species expanded rapidly across the globe.
Although a large proportion of Neanderthal ancestry was selected against, recent analyses have found that adaptive introgression also occurred in a small proportion of the genome.
Despite advances in deep learning methods for population genetics, there are few clear frameworks for modelling adaptive introgression from genomic sequence data. However, researchers from the GLOBE Institute at the University of Copenhagen have recently developed a new method that uses convolutional neural networks (CNNs) to detect adaptive introgression along the genome.
Using CNNs to search the human genome
A CNN is a type of deep learning model that is commonly used for image and video recognition. CNNs have also outperformed alternative methods for classification and prediction in numerous areas of population genetics.
Using thousands of simulations, the researchers from Copenhagen trained a CNN to identify the patterns that adaptive introgression would leave in genomic data. The team then applied the trained CNN to human genomic datasets to detect candidates for adaptive introgression: essentially, genes that may have shaped human evolutionary history.
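To make the idea of treating genomic data like an image more concrete, here is a minimal Python sketch. It is not the genomatnn implementation. It assumes a simple encoding in which each row is one haplotype, each column is a fixed-width bin along a genomic window, and each cell counts the derived alleles that fall in that bin. The function names `bin_haplotypes` and `convolve_valid`, and the toy inputs, are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def bin_haplotypes(
    haplotypes: Sequence[Sequence[int]],
    positions: Sequence[int],
    window_start: int,
    window_end: int,
    n_bins: int,
) -> List[List[int]]:
    """Turn 0/1 haplotypes into a fixed-width, image-like count matrix.

    haplotypes[i][j] is 1 if haplotype i carries the derived allele at
    variant j, and positions[j] is that variant's genomic coordinate.
    This encoding is an assumption made for the sketch.
    """
    width = (window_end - window_start) / n_bins
    matrix = [[0] * n_bins for _ in haplotypes]
    for col, pos in enumerate(positions):
        if not (window_start <= pos < window_end):
            continue  # ignore variants outside the window
        bin_index = min(int((pos - window_start) / width), n_bins - 1)
        for row, hap in enumerate(haplotypes):
            matrix[row][bin_index] += hap[col]
    return matrix


def convolve_valid(
    matrix: Sequence[Sequence[float]],
    kernel: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Slide a small kernel over the matrix without padding.

    This is the cross-correlation that a single CNN filter computes
    before any bias term or activation function is applied.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(matrix) - k_rows + 1
    out_cols = len(matrix[0]) - k_cols + 1
    return [
        [
            sum(
                matrix[i + a][j + b] * kernel[a][b]
                for a in range(k_rows)
                for b in range(k_cols)
            )
            for j in range(out_cols)
        ]
        for i in range(out_rows)
    ]


# Toy data: three haplotypes, four variants, a 100 bp window split into 2 bins.
toy_haplotypes = [[1, 0, 1, 1], [0, 0, 1, 0], [1, 1, 0, 0]]
toy_positions = [10, 30, 60, 90]
image = bin_haplotypes(toy_haplotypes, toy_positions, 0, 100, 2)
feature_map = convolve_valid(image, [[1, 1]])
```

A real CNN learns many such kernels from the simulated training data and stacks several layers of them. The hand-written kernel here only shows the arithmetic that one filter performs on one window.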
The deep learning method was named ‘genomatnn’. It confirmed certain human genetic mutations that were already suspected to have arisen from adaptive introgression. Remarkably, the approach also discovered previously unreported mutations that were not known to be introgressed. Some of these mutations are involved in core pathways in human metabolism, blood-related diseases and immunity.
Future of the Neanderthal genome
The genomatnn CNN architecture achieved 95% accuracy on the simulated data. This demonstrates how deep learning can be a powerful tool for population genetic inference. Similar techniques could be useful resources for future studies that aim to further unravel human history, especially as technology advances. Gaining knowledge about advantageous mutations and deepening our understanding of how best to study the human genome will ultimately give scientists tools to improve human health and well-being.
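For readers unfamiliar with the metric, accuracy on simulated data is the fraction of simulated windows that the classifier labels correctly. The short Python sketch below shows that calculation. The function name `accuracy`, the 0.5 decision threshold, and the four example values are assumptions made for illustration and are not taken from the study.

```python
from typing import Sequence


def accuracy(
    true_labels: Sequence[int],
    predicted_probs: Sequence[float],
    threshold: float = 0.5,
) -> float:
    """Fraction of simulations classified correctly.

    true_labels are 1 for simulations with adaptive introgression and 0
    otherwise. predicted_probs are the classifier's output probabilities.
    The 0.5 threshold is an illustrative assumption.
    """
    if len(true_labels) != len(predicted_probs):
        raise ValueError("labels and predictions must have the same length")
    if not true_labels:
        raise ValueError("at least one simulation is required")
    correct = sum(
        int(prob >= threshold) == label
        for label, prob in zip(true_labels, predicted_probs)
    )
    return correct / len(true_labels)


# Made-up values: three of the four simulations are classified correctly.
example = accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.1])  # 0.75
```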
Fernando Racimo, an Associate Professor at the GLOBE Institute, said: “Our method is highly accurate and outcompetes previous approaches in power. We applied it to various human genomic datasets and found several candidate beneficial gene variants that were introduced into the human gene pool.”
The researchers are currently following up on the function of the novel genomic variants uncovered during the project. The team also aims to adapt the CNN method to more complex demographic and selection scenarios, in a bid to understand the overall fate of Neanderthal genetic material.