The UK Biobank Exome Sequencing Consortium
UKB-ESC (The UK Biobank Exome Sequencing Consortium) is a unique private/public partnership between the UK Biobank and eight biopharma companies. The Consortium aim to sequence the exomes of all ~500,000 UK Biobank participants.
The probability of drug candidates progressing past Phase 1 clinical trials into approval and launch remains near 10%. Most clinical failures are attributable to safety concerns or lack of efficacy. As a result, drug developers’ attention has turned to human genetics to increase the likelihood of successful drug discovery.
The development of large-scale biobanks has created a unique opportunity for the scientific community to accelerate the use of human genetics to inform drug discovery. The UK Biobank is of particular interest to drug developers as not only does it contain human genetic data, it also contains longitudinal phenotype data. Researchers can link both of these data types together to explore the biological consequences of genetic variation.
The aim of UKB-ESC is to gain a comprehensive assessment of protein-coding genetic variation. This will extend genetic investigations to include rare and private variants that are not already captured in existing UK Biobank chip-based genotypes.
Lessons learned from the UK Biobank Exome Sequencing Consortium
Large-scale collaborative projects can drive transformative scientific discoveries. Such collaborations provide a unique opportunity to unify scientific communities. The UKB-ESC seeks to further strengthen the relationship between academia and industry through a pre-competitive collaboration. Its members hope that this partnership will catalyse novel scientific discoveries, accelerate the development of new therapies and ultimately improve patient outcomes. The authors noted the following principles as essential to the success of this effort:
- The collaboration is enabled by the UK Biobank’s open data access policy
- The scope of the project will enable unique, valuable scientific discoveries
- UK Biobank’s data access and contribution terms invite pre-competitive collaboration by industry partners
- Engagement in a large pre-competitive industry collaborative project presents additional value to the participating institutions
The first 200,000 exome sequences
The release of the first ~200,000 sequenced exomes represents an important milestone in the availability of large-scale genomic data. The team found approximately 10 million variants within the targeted regions. These included 8,086,176 SNPs, 370,958 indels and 1,596,984 multi-allelic variants. Of the ~8.5 million SNPs and indels observed, 84.5% were coding variants. These included 2,139,318 (25.3%) synonymous, 4,549,694 (53.8%) missense and 453,733 (5.4%) predicted loss-of-function (LOF) variants affecting at least one coding transcript. Restricting the analysis to variants that affected canonical transcripts, the team observed a median of 142 LOF variants per individual.
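The counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article. The choice of denominator (SNPs plus indels, excluding multi-allelic variants) is an assumption inferred from the arithmetic, because it is the total that reproduces the quoted percentages. The variable and function names are illustrative.

```python
# Variant counts as quoted in the article.
snps = 8_086_176
indels = 370_958
multi_allelic = 1_596_984

# Coding-variant classes as quoted in the article.
coding = {
    "synonymous": 2_139_318,
    "missense": 4_549_694,
    "predicted LOF": 453_733,
}

# All variants within the targeted regions (the "approximately 10 million").
total_variants = snps + indels + multi_allelic

# Assumption: percentages are taken over SNPs plus indels, since this
# denominator reproduces the quoted 25.3%, 53.8%, 5.4% and 84.5%.
denominator = snps + indels


def percent(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: percent(count, denominator) for name, count in coding.items()}
coding_share = percent(sum(coding.values()), denominator)

print(f"total variants: {total_variants:,}")
for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"all coding classes: {coding_share}%")
```

Using SNPs alone as the denominator would give roughly 26.5% synonymous and 56.3% missense, which does not match the quoted figures.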
The UKB-ESC is making important contributions to the scientific community. While exome sequencing is nearly complete, the team hope that the key features of this collaboration will be adopted for similar projects in the future. These large-scale collaborations play an essential role in generating accessible resources that a diverse community can use to address critical questions and advance human health. Additionally, the team expect that the value of whole-exome sequencing (WES) data will be enhanced by layering deeper and richer phenotypes. The exome sequencing data from the first 200,643 UK Biobank participants is now accessible to the research community.