
Exome sequencing data from UK Biobank links protein-coding variants to health

A massive effort using exome sequencing data from more than 450,000 UK Biobank participants has revealed rare and common gene variants linked to various health-related traits.

Variation and proteins

A major aim in human genetics is to identify natural variation within the population and use this to understand the effects of altering protein-coding genes. To link genetic variants to human health, researchers must first collect genetic and health-related data on as many individuals as possible (and in enough detail).

In 2016, the Exome Aggregation Consortium presented aggregated whole-exome sequencing (WES) data for over 60,000 people, which was later expanded to include exome and genome data from more than 140,000 individuals. While these studies provided insight into the biological impact of genetic variants, that impact could not be linked to specific phenotypes because clinical information was lacking. Other analyses, such as the DiscovEHR collaboration and the Trans-Omics for Precision Medicine Program, combined exome data with detailed phenotypic data. However, these analyses indicated that expanding sample sizes further would be even more informative.

Exome variants and health-related traits

In a recent article published in Nature, the UK Biobank Exome Sequencing Consortium reported sequencing the exomes of 454,787 participants, with 95.8% of targeted bases covered at a depth of 20X or greater. The team demonstrated that by increasing the sample size by less than ten-fold, they were able to identify around 20 times more gene-phenotype associations.

In total, they identified 12 million coding variants, including ~1 million loss-of-function and ~1.8 million deleterious missense variants. Almost all of the 12 million variants were rare (occurring in less than 1% of people across all ancestries) and the team observed roughly half in only one person in the dataset. This emphasises the value of using WES at such a large scale.
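To make the "rare" and "seen in only one person" categories concrete, here is a minimal sketch in Python. It is not the consortium's pipeline: the `Variant` class, the `classify` function, the toy variant names and the carrier counts are all hypothetical, and it assumes "rare" simply means a variant carried by fewer than 1% of participants, as described above.

```python
from dataclasses import dataclass

RARE_THRESHOLD = 0.01  # "less than 1% of people", as stated in the article


@dataclass
class Variant:
    variant_id: str  # hypothetical identifier, for illustration only
    carriers: int    # number of participants carrying the variant


def classify(variant: Variant, cohort_size: int) -> str:
    """Label a variant as 'singleton', 'rare' or 'common'.

    A singleton is a special case of a rare variant: it is carried
    by exactly one person in the cohort.
    """
    if variant.carriers == 1:
        return "singleton"
    if variant.carriers / cohort_size < RARE_THRESHOLD:
        return "rare"
    return "common"


cohort_size = 454_787  # participants sequenced, from the article

# Made-up carrier counts, chosen only to hit each category.
toy_variants = [
    Variant("toy_variant_a", 1),
    Variant("toy_variant_b", 120),
    Variant("toy_variant_c", 9_000),
]

labels = {v.variant_id: classify(v, cohort_size) for v in toy_variants}
print(labels)
```

In a cohort of this size, a variant can be carried by several thousand people and still fall under the 1% threshold, which is part of why so many of the variants count as rare.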

The researchers then tested these variants for associations with 3,994 health-related traits and found 564 genes with trait associations. More specifically, they discovered several risk-increasing associations with traits related to liver disease, eye disease and cancer. They also found novel risk-lowering associations for hypertension (SLC9A3R2), diabetes (MAP3K15, FAM234A) and asthma (SLC27A3). Additionally, six genes were associated with brain imaging phenotypes, including two genes involved in neural development (GBE1, PLD1).

What next

Overall, these findings illustrate the ability of exome sequencing to identify novel gene-trait associations at scale. The team have now made the WES resources publicly available.

Despite the exceptionally large scale of this WES study, the authors noted that their data still had insufficient statistical power to find more of the rare protective variants. This emphasises that future strategies must continue to expand sample sizes. In addition, shifting from WES to whole-genome sequencing will uncover a broader range of variants that may impact health and disease. Most importantly, the involvement of diverse populations is key. The proportion of participants of non-European ancestry in this study is relatively small. Greater ancestral diversity will be necessary to discover more variation and achieve equity in human genetics.

Image credit: canva
