
Scientists Pit Variant Prediction Platforms Against Each Other in Head-to-Head Study

Written by Kirsty Oswald, Science Writer 

Researchers have assessed which computational methods and platforms have the strongest performance for predicting the functional relevance of genetic variants. 

The findings should help scientists to select the best approach for their research and could lead to the development of better predictive methods, say the team, led by Dong Wang at the Harbin Institute of Technology, China.

A head-to-head comparison 

For the study, researchers compiled two independent datasets from the ClinVar and VariBench databases and used them to compare 14 functional impact prediction methods.

They found that the highest performing prediction methods were CADD and REVEL.  
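As a rough illustration of what a head-to-head comparison of this kind can involve, the sketch below ranks predictors by ROC AUC on a set of labelled variants. The choice of AUC as the metric, the function names and the toy scores are all assumptions made for illustration; they are not taken from the study.

```python
def roc_auc(labels, scores):
    """Chance that a random pathogenic variant outscores a random benign one.

    labels: 1 = pathogenic, 0 = benign. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one variant of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_methods(labels, scores_by_method):
    """Return (method, AUC) pairs sorted from best to worst."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_method.items()}
    return sorted(aucs.items(), key=lambda kv: kv[1], reverse=True)


# Made-up toy data: six variants, two hypothetical predictors.
labels = [1, 1, 1, 0, 0, 0]
toy_scores = {
    "method_a": [0.9, 0.8, 0.7, 0.3, 0.2, 0.1],
    "method_b": [0.9, 0.4, 0.6, 0.5, 0.2, 0.1],
}
ranking = rank_methods(labels, toy_scores)
```

In practice the labels would come from a curated benchmark and each score list from a real predictor, with one score per variant in the same order as the labels.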

The CADD (Combined Annotation-Dependent Depletion) framework integrates genome annotations to score any possible human single-nucleotide variant or small insertion-deletion event, and had excellent performance on multiple types of variants.

By contrast, the REVEL method takes an ensemble approach, integrating multiple functional prediction and sequence conservation scores, among them many of the methods evaluated in this study. This approach had excellent predictive performance specifically for missense variants.

A tailored approach 

No method had excellent performance both generally, across all types of SNPs, and specifically, on one type of variant. Therefore, the team say that researchers may need to select different methods to harness their advantages in different situations.

For example, specific methods such as M-CAP and REVEL may be best suited when the user is looking at missense variants, whereas CADD and FATHMM-MKL may be a better choice for predicting functional impact in a dataset containing many variants of uncertain type.
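The guidance above could be encoded as a simple lookup, sketched below. The method names come from the article; the function name, the category keys and the rule that any non-missense or unknown type counts as "mixed or uncertain" are illustrative assumptions, not part of the study.

```python
# Suggested methods per situation, as described in the article.
SUGGESTED = {
    "missense": ["M-CAP", "REVEL"],
    "mixed_or_uncertain": ["CADD", "FATHMM-MKL"],
}


def suggest_methods(variant_types):
    """Pick candidate predictors from the variant types present in a dataset.

    Assumption: only a purely missense dataset gets the missense-specific
    methods; anything else falls back to the general-purpose ones.
    """
    kinds = set(variant_types)
    if kinds == {"missense"}:
        return SUGGESTED["missense"]
    return SUGGESTED["mixed_or_uncertain"]
```

For instance, `suggest_methods(["missense", "missense"])` returns the missense-specific pair, while a dataset that also contains a variant of unknown type falls back to CADD and FATHMM-MKL.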

The great challenge for genomics 

Deciphering which genetic variants influence health and human functioning is one of the great challenges in the era of next-generation sequencing, but it can be an inefficient and often time-consuming process. Researchers have therefore developed a number of computational methods and platforms to help prioritise variants. However, Wang and colleagues say that the predictive performance of these platforms is not well established.

The team note that, in their findings, methods employing deep-learning technology did not outperform more traditional approaches. But they say that in the future, with greater availability of labelled variants for deep learning, the technology could become more influential and contribute to the improved performance of computational prediction methods.  
