Using Healthcare Data in Research to Improve Lives

Caroline Cake (CEO, Health Data Research UK) joined us at The Festival of Genomics and Biodata 2021 to talk about the work HDRUK has been doing during the pandemic with healthcare data and the importance of patients and members of the public in this work.

Healthcare data and HDRUK

HDRUK is the national institute for health data science, with a mission to improve people’s lives by enabling discoveries through data. Caroline started her talk by highlighting some of the characteristics of health data that make it both difficult and interesting to work with, emphasising that, at the heart of it all, this is about us.

She specifically noted 10 characteristics of health data:

  1. Personal
  2. Identifiable
  3. Secondary purpose
  4. Access, opt-out and consent
  5. Public, patient and practitioner trust
  6. Multi-Modal & complex
  7. Linkage & Scope
  8. Scale
  9. Federation
  10. Timeliness

HDRUK addresses these characteristics in 3 main ways: by uniting, improving and using health data. Together, these aim to deliver the biggest health and care impact by connecting HDRUK with its users (researchers, industry, NHS) and with the public, patients and practitioners.

How is HDRUK set up?

HDRUK is a federated institute, combining research and training hubs across the four nations of the UK, with 86 organisations in 32 locations. The aim is to bring together and grow a community of data custodians to drive discovery and change.

UK Biobank and Genomics England, which focus on genomics data, are members of this alliance, but it also includes many more members from charities, the NHS and universities.

Hubs – How is the data utilised and made useful for research?

Health Data Research Hubs are more focussed research areas that shape discovery. Currently, there are 8 data hubs across the UK covering eye disease, respiratory conditions, digitising clinical trials and much more.

PIONEER is a hub that focusses on acute care (unplanned medical care) and on how to innovate and improve this care, as demand on UK acute health services is currently unsustainable for the NHS.


One of the recurring themes is how to make data accessible to researchers and innovators, which is where the Gateway comes in. The Gateway is a resource of datasets, tools and training, with over 550 datasets that share a common and transparent approach to data access. It abides by the Five Safes Framework, which aims to encourage data decisions that lead to safe use.

COVID-19 – Data and Connectivity Programme

During the COVID-19 pandemic, the need for data has never been greater. But the data must also meet FAIR standards (findable, accessible, interoperable and reusable) so that these priority datasets can inform policy and operational decisions. By making health data available, the programme can link together multiple studies and areas of research, connecting surveillance and epidemiology study data to immunity and clinical trials.

The programme has also funded projects to ensure researchers can support the COVID-19 response. Over 1,000 pre-prints and 87 published papers have resulted from this healthcare data through the programme.


This trial has been focussing on therapeutics for COVID-19 in older people, specifically in a community setting rather than in hospital. This has made the tracking of COVID-19 cases much more difficult.

The question was whether, once someone had been identified as testing positive in the community, the trial team could contact them within 24-48 hours, determine eligibility and offer participation.

This involved working with NHS DigiTrials, Test and Trace and members of the public to understand how best to contact and support these people in a trustworthy way. Over 3 months, with the involvement of HDRUK, the number of participants has nearly quadrupled, leading to more cases, more trial arms and a streamlined workflow.

Lessons from COVID

During this time, healthcare data has led to many discoveries about COVID-19. For example, data from the COVID Symptom App helped loss of smell and taste to be recognised as a key indicator, leading to a change in NHS guidance. Others include identifying that people from minority backgrounds in the UK are more likely to get COVID-19 and to have worse outcomes.

However, the lessons learnt focus on the importance of data, people, and collaboration.

The key lesson has been the enthusiastic role of the public, whether as survey respondents, volunteers or participants. As Caroline summarised, “engaging members of the public and patients doesn’t slow research up, it makes it more effective, avoids pitfalls at later stages and it develops better research coming out of it.”

As the main goal of the Festival of Genomics and Biodata is to bring the benefits of genomics to patients faster, we couldn’t agree more with Caroline!

Registration for on-demand access to watch this talk and all our other talks from the Festival will end on February 12th.
