Mark Effingham (Deputy CEO, UK Biobank) joined us at The Festival of Genomics and Biodata 2021 to give us an update on the UK Biobank. He discussed the invaluable nature of this resource and the work they have been doing during the pandemic.
What is the UK Biobank?
The UK Biobank is a prospective cohort of half a million individuals, aged 40 to 69 at recruitment, who joined between 2006 and 2010. Those recruited gave consent for their data to be used, even knowing that they would get no personalised feedback. They acted purely altruistically.
The data gathered from their biological samples is available, with non-preferential access, to “all bona fide researchers who want to do research in the good of public health.”
Despite the name, the UK Biobank is cemented as a major international biomedical resource.
Enabling scientific discoveries that improve human health
Since 2016, there has been a three-fold increase in researchers applying to access the resource, and there have been over 18,000 approved registrations globally. The release of the genotyping dataset led to a major increase in requests to access the data, including a tripling of the number of international requests.
There are over 2,000 ongoing global research projects with over 1,400 papers published using UK Biobank data. Nearly 200 of these have been published in Nature.
The pivot to COVID-19 research
Many have had to pivot their work during the pandemic, and the UK Biobank is no exception. Building on its established links to test results, hospitalisation and death records, the UK Biobank was able to secure more regular updates from these sources, providing researchers with an invaluable resource. Since April 2020, 700 research groups have applied to access the data, and 80 papers have been published so far.
Papers include research into ethnic disparities in hospitalisation due to COVID-19, and into genetic risk factors (such as blood type).
Serology and Multi-Imaging Study
Not content with simply providing a frequently updated resource that includes primary care data, the UK Biobank set up its own studies to determine the extent of COVID-19 and the effects of infection in people across the UK.
The serology study involved 20,000 participants. The team used blood samples from 10,000 of the original UK Biobank participants, and another 10,000 from their adult children and grandchildren, to monitor and measure antibodies to SARS-CoV-2. Once again, the public were willing to get involved, and over 100,000 people expressed interest in joining the study. Each participant gave a monthly blood sample for the duration of the study.
This data was used to build a seroprevalence map by region, age, sex and ethnicity, and to track changes in antibody levels over time. The 6-month study has been published, and more information can be found on the UK Biobank’s website.
The UK Biobank was also in the process of conducting a multi-imaging study on the original participants, but it was paused halfway through due to COVID-19. This study is now being resumed to create a unique dataset of imaging from before and after COVID-19. This matters because, although COVID-19 is primarily a respiratory disease, it is also associated with multi-organ injury. The centres will shortly be re-opening to support this, and the data will be made available at regular intervals.
Democratising access to the UK Biobank
Data volume has grown exponentially. The original estimate was half a petabyte of data in 2022. With the addition of whole-exome sequencing data and more, the new estimate is for over 15 petabytes of data!
This enormous amount of data means a new approach and a new platform are needed. To remove the limits imposed by researchers' own computational capabilities, the UK Biobank is bringing the researchers to the data. This ensures that, as Mark said, “researchers are only constrained by their imagination.”
This platform, supported by DNAnexus and Amazon Web Services (AWS), is currently in development and is invitation-only. Researchers only have to pay for the data they use. To help broaden access, AWS is also providing research credits for early-career researchers and for those from low- and middle-income countries.
Mark gave us an update on some of the upcoming data releases researchers can expect from the Biobank, including:
- A whole-exome sequencing release of 300,000 exomes mid-year, with the whole dataset released towards the end of the year.
- Whole-genome sequencing of 500,000 healthy volunteers, to be made available partly at the end of this year and fully by the end of 2022.
- Telomere lengths for half a million participants, to be released shortly.
- Metabolomics assay data of 225 biomarkers from 120,000 participants, also to be released shortly.
Lastly, the UK Biobank is moving beyond genomics into proteomics.
It has launched one of the largest studies of circulating proteins. The study has the ambitious goal of measuring nearly 1,500 plasma proteins in an initial phase of 53,000 participants. This will help to better understand the link between genetics and disease.
Thanks to Mark Effingham for providing insights and updates on the UK Biobank for 2021. We can’t wait to see what they produce next!
Registration for on-demand access to watch this talk and all our other talks from the Festival will end on February 12th.