Using data derived from the UK Biobank, researchers have found that polygenic scores have low portability between genetic populations.
Polygenic scores
A polygenic score is a summary of an individual’s genetic component for a particular trait or disease. It is derived by aggregating information from many genetic variants into a single score. In most cases, this information comes from the summary statistics of large meta-analyses such as genome-wide association studies (GWAS). A score can also be derived directly from individual-level data, although when the individual-level dataset is small, predictions for most phenotypes are poor. However, the development and availability of large datasets, such as the UK Biobank, have given researchers access to individual-level data that can be used to derive polygenic scores.
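The aggregation described above is commonly computed as a weighted sum: each variant's effect size multiplied by the number of copies of the effect allele an individual carries. The following is a minimal sketch of that idea, not the method used in any particular study. The variant names, effect sizes and dosages are hypothetical values chosen purely for illustration.

```python
# Hypothetical per-variant effect sizes (for example, GWAS betas).
# These names and numbers are illustrative only, not real data.
effect_sizes = {"rs_a": 0.12, "rs_b": -0.05, "rs_c": 0.30}


def polygenic_score(dosages, weights):
    """Return the sum of effect size x allele dosage.

    Only variants present in both `weights` and `dosages` contribute.
    """
    return sum(weights[v] * dosages[v] for v in weights if v in dosages)


# Allele dosages (0, 1 or 2 copies of the effect allele) for one made-up individual.
individual = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_score(individual, effect_sizes)
print(f"Polygenic score: {score:.2f}")
```

In practice the weights would come from summary statistics or from a model fitted on individual-level data, and the sum would run over many thousands of variants.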
One of the biggest concerns regarding polygenic scores is how well they transfer to other ancestries. Most polygenic scores are derived from individuals of European ancestry and are therefore less predictive of traits and diseases in individuals of other ancestries, such as African ancestry. Prediction accuracy has also been shown to decay with genetic distance from the training population and with increased admixture. This lack of portability is thought to be due primarily to differences in linkage disequilibrium and allele frequencies between populations.
Portability
In a recent study, published in the American Journal of Human Genetics, researchers used UK Biobank data to examine how transferable polygenic scores are between ancestries. The team specifically derived polygenic scores for 245 curated traits from the UK Biobank data and applied them in nine ancestry groups from the same cohort. Using the UK Biobank data as both the training and test group reduced the risk of environmental and genotyping confounding effects.
Overall, the team found a systematic and significant reduction in the portability of polygenic scores from UK ancestry to other ancestries. For example, relative to individuals of Northwestern European ancestry, the phenotypic variance explained by the polygenic scores was only 64.7% for the South Asia group, 48.6% for East Asia and 18% for West Africa. The results also showed that prediction accuracy already dropped within Europe, for example in Northeast and South Europe compared to Northwest Europe.
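To make these relative figures concrete, the sketch below scales a reference variance explained (R²) by the reported percentages. The reference R² of 0.10 is a hypothetical value chosen for illustration and does not come from the study. The sketch also assumes the reported ratios apply uniformly to a single trait, which is a simplification.

```python
# Relative variance explained, as reported in the article
# (fraction of the Northwestern European value).
relative_accuracy = {
    "South Asia": 0.647,
    "East Asia": 0.486,
    "West Africa": 0.18,
}

# Hypothetical variance explained (R^2) in the reference group.
# This number is an assumption for illustration, not a study result.
reference_r2 = 0.10

# Implied R^2 in each group if the reported ratio held for this trait.
implied_r2 = {
    group: reference_r2 * ratio for group, ratio in relative_accuracy.items()
}

for group, r2 in implied_r2.items():
    print(f"{group}: implied R^2 = {r2:.4f}")
```

Under these assumptions, a score explaining 10% of trait variance in the reference group would explain under 2% in the West Africa group.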
The researchers found that this decay in the variance explained by the polygenic scores is roughly linear in genetic distance from the training population. In other words, prediction accuracy fell globally in proportion to genetic distance. There were a few exceptions to this, including hair colour and some blood measurements. Nonetheless, this study provides valuable insights into the issue of polygenic score portability, which will be important when considering the clinical application of these scores.