Researchers have used exome sequencing data from 77,184 individuals to generate penetrance estimates and assess the utility of polygenic variation in risk prediction of monogenic variants.
Interpreting sequencing data
With advanced technologies and declining costs, healthcare providers and researchers are increasingly faced with interpreting genetic sequence data from asymptomatic individuals. Reporting of whole exome and genome sequencing results often involves risk assessment of genetic variation in conditions of known relevance to the individual. But it can also identify impactful variation unrelated to the primary indication, so-called secondary genetic findings. As a result, predicting the risk conferred by genetic findings in individuals who are not known to have the relevant conditions is vital.
In addition, the majority of interpreted genetic variation within clinical practice is related to rare monogenic diseases that have large predictive effect sizes. This leaves the vast majority of the genome, particularly common variants, unassessed. A number of studies have suggested that a high burden of common genetic variants could confer increased disease risk similar to that of rare monogenic variants. However, many have questioned this equivalency, and it remains unclear how to integrate polygenic scores into medical practice.
Penetrance and expressivity
The detailed guidelines from ACMG/AMP regarding variant interpretation have been implemented by ~95% of clinical laboratories internationally. Nevertheless, the probability that individuals who carry such variants will manifest the given condition (termed penetrance) is uncertain for the vast majority of reported pathogenic variants. In addition, even individuals with the same genotype can exhibit variable degrees of phenotype expression (termed variable expressivity). To date, studies estimating penetrance and expressivity have largely focussed on individuals with a given condition and their family members. However, this approach suffers from ascertainment bias, because carriers are recruited precisely because they or their relatives are affected. Moreover, the lack of data available to assess penetrance further complicates this estimation. As an alternative, researchers have been exploring the use of large-scale population-based and cohort studies to estimate penetrance and expressivity. These studies offer both sequence and phenotype data with less bias than family or case-control studies.
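To make the definition of penetrance concrete, the sketch below treats it as the proportion of variant carriers in a cohort who manifest the condition, and attaches a Wilson score interval to show how uncertain the estimate is when carriers are few. This is a minimal illustration, not the method used in any particular study; the function name `penetrance_estimate` and the carrier counts in the usage lines are hypothetical.

```python
from math import sqrt


def penetrance_estimate(affected_carriers, total_carriers, z=1.96):
    """Estimate penetrance as affected carriers / all carriers.

    Returns (point_estimate, lower, upper), where lower and upper are the
    bounds of a Wilson score interval (z=1.96 gives roughly 95% coverage).
    """
    if total_carriers <= 0:
        raise ValueError("total_carriers must be positive")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected_carriers must be between 0 and total_carriers")

    n = total_carriers
    p = affected_carriers / n

    # Wilson score interval for a binomial proportion.
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom

    return p, max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical counts, for illustration only: 30 affected among 50 carriers.
estimate, lower, upper = penetrance_estimate(30, 50)
print(f"penetrance = {estimate:.2f} (interval {lower:.2f} to {upper:.2f})")
```

With small carrier counts the interval is wide, which is one reason larger population cohorts are attractive for this kind of estimate.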
Determinants of penetrance and variable expressivity
In a recent study, published in Nature Communications, researchers performed exome sequencing on 77,184 adult individuals. This cohort included 38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank (UKB). The team then applied clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. These included diabetes (maturity-onset diabetes of the young (MODY), neonatal diabetes, autosomal dominant lipodystrophy) and disorders of LDL cholesterol, HDL cholesterol, triglycerides, and obesity. These traits have complex genetic architectures, involving both rare and common genetic variants. The team also calculated polygenic scores in the UKB dataset, allowing direct comparisons between monogenic and polygenic risk.
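For readers unfamiliar with polygenic scores, the sketch below shows the general idea: a score is commonly computed as a weighted sum of an individual's risk-allele counts across many common variants, and individuals can then be ranked to find, for example, the top 1% of the distribution. This is a generic illustration under stated assumptions, not the study's pipeline; the function names, dosages and weights are all hypothetical.

```python
import math


def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1 or 2 copies per variant).

    `weights` are per-variant effect sizes, assumed here to come from an
    external association study.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def top_fraction_indices(scores, fraction=0.01):
    """Return the indices of individuals whose scores fall in the top fraction."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_top = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_top])


# Hypothetical example: three variants, four individuals.
weights = [0.5, -0.25, 1.0]
cohort = [[0, 1, 2], [2, 0, 0], [1, 1, 1], [0, 2, 0]]
scores = [polygenic_score(person, weights) for person in cohort]
print(scores, top_fraction_indices(scores, fraction=0.25))
```

Comparing outcomes in the top-scoring group with those in carriers of a rare monogenic variant is one way to put the two kinds of risk on a common footing.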
The team found that rare variants causing monogenic diabetes and dyslipidaemias displayed effect sizes significantly larger than the top 1% of corresponding polygenic scores. However, they also found that penetrance estimates for monogenic variant carriers averaged 60% or lower for most conditions. Furthermore, they demonstrated that including polygenic variation significantly improved biomarker estimation for two monogenic dyslipidaemias.
This study emphasises the importance of careful interpretation of monogenic variation. Over the next few years, our access to larger sequencing studies will enable researchers to assess increasingly rare variants. In addition, an improved understanding of monogenic variant expressivity will require broader incorporation of genetic variation across the spectrum and integration of environmental factors. This in turn will improve modelling of disease risk and optimise patient genetic counselling and management.