At the Festival of Genomics and Biodata, we are lucky enough to be joined by some of the biggest names in the business. In January, we sat down with a few of our esteemed speakers to chat about their backgrounds, roles and the work of their organisations. In this interview, we speak with Serena Scollen (Head of Human Genomics and Translational Data, ELIXIR) about cross-border data sharing, sustainability and the importance of education.
Please note: this transcript has been edited for brevity and clarity.
Interview originally conducted by Lauren Robertson.
FLG: Thank you for coming to speak at the Festival. It would be great if you could tell me a little about yourself and your role.
I’m a genetic epidemiologist by background. After my postdoc, I went into the pharmaceutical industry and spent some time at Pfizer, where I worked on using genomics in the drug development pipeline, mostly to support drug target selection for the Pain and Sensory Disorders Research Unit.
I then joined the ELIXIR Hub in 2016 as the Head of Human Genomics and Translational Data, where we enable researchers to access and analyse human data. ELIXIR is a unique setting in many respects. For example, it is a research infrastructure with 23 member countries called Nodes, with each Node composed of one or more institutions (totalling over 250 research institutes!). ELIXIR’s activities are driven by the Nodes, while the ELIXIR Hub coordinates and disseminates Node contributions and ensures everyone is connected.
FLG: So you’re driving the EU Beyond 1 Million Genomes Project. Could you tell me a little more about that project and its aims?
ELIXIR is involved in a number of different EU projects – and we have a large portfolio for human genomics! Before diving into the B1MG (Beyond 1 Million Genomes) project, I need to introduce the 1+ Million Genomes (1+MG) initiative. The 1+MG initiative is working towards creating a European data infrastructure for genomic data and implementing national rules enabling federated data access. Since 2018, 26 countries have signed a declaration to do so. Soon after it was launched, the European Commission issued a call to support and oversee implementation – this became the B1MG project. ELIXIR was well placed to take a coordinating role, with key expertise in the area, tools and services in development, and many relevant existing partnerships.
At ELIXIR we have been considering how to address many of the challenges in this field. For example, it is great that individual countries are realising the benefits and developing national genomic programmes for healthcare, but this means more of the data will be generated in a healthcare setting. Because that data isn’t generated specifically for research, researchers often don’t have access to it in their own countries, let alone at the European level.
Data access is critical. Take rare disease as an example: to identify patients with similar phenotypes, researchers need access to data on a global scale. Similarly, for large genetic epidemiology studies, researchers must combine data from hundreds of thousands of people. While GWAS (genome-wide association studies) have been successful, they rely on highly powered studies to pin down polygenic risk scores. The question becomes how we can access data across borders whilst respecting European regulations and data access frameworks.
Returning to the B1MG project, we received four million Euros to provide coordination and support, and to produce guidelines and recommendations to support the creation of a pan-European genome-based health data infrastructure, encompassing data quality and exchange standards, access protocols and legal guidance.
This meant we had to speak to many different stakeholders to understand their requirements; the full stakeholder list is on our website. It’s critical to understand stakeholder requirements and incorporate them into the technical recommendations. Whilst specific citizen engagement funding was limited, we ran a workshop with citizen engagement experts to discuss recommendations for ongoing work.
The European Genomic Data Infrastructure (GDI) project is the next step in the implementation of the 1+MG initiative. The GDI project overlaps with the B1MG project by about eight months and it aims to bring signatory countries to a position of technical readiness. The project will build a physical data infrastructure, not just hardware, that is ready to handle genomic data. However, as with other research projects, data management needs national investment. The GDI project is 50% funded – partners have to put in 50% and the European Commission pays the remainder. The national investment helps sustain the outcomes and resources developed from the project after it ends.
ELIXIR has been working in the area of human genomics since its first scientific programme in 2014 and we have already made great headway. By coordinating these specific projects we act as a neutral broker (for the countries that signed the Declaration), a role that couldn’t be fulfilled by one country alone.
FLG: One of the challenges I often hear when I’m talking to people in the data space is standardisation. How does that work when it’s across so many countries in the EU? What are the key challenges? And how have you overcome them? Or how are you planning to overcome them?
We work with standards organisations like GA4GH (the Global Alliance for Genomics and Health). Most expert groups across Europe, including the ELIXIR community and EU projects, adhere to GA4GH standards, and ELIXIR is one of the largest organisations rolling them out. If everybody uses community-endorsed open standards, the data from the research will be interoperable in the future.
For example, Australian genomic scientists are using GA4GH standards, and whatever Australian bioinformaticians develop will then be compatible with European research activities. This is important because researchers in Europe and EU-funded projects need to share results with countries outside Europe for the impact of genomics to truly be realised.
ELIXIR does not specify the standards that need to be put in place; instead, we make recommendations. Countries can then review the recommendations and develop their own standards which are interoperable with those already in use. This approach helps strengthen collaborations and ends up with infrastructures that are constantly communicating with each other. If you have one organisation or country that has a completely different standard, then data accessibility becomes really hard. We are not preventing new developments: something better could come along, and that is fine as long as it is interoperable!
FLG: One of the key points of your talk was research-driven versus healthcare-driven data. What does that really mean? And why is that such an important factor to overcome?
We need a better flow between the healthcare system and research to enable research data to feed into healthcare implementation. There will be many benefits if we understand more about the disease, or even the prevention of disease.
We must also take into account that vast amounts of data in the future will be generated in the healthcare setting and not optimised for research. Currently, genetic data are predominantly generated in research organisations where there is significant data management expertise. But how do we manage data generated in a healthcare setting and ensure it can be accessed by researchers securely and on a global scale? We need to create a whole ecosystem that feeds data back and forth between research and healthcare so that citizens can benefit.
FLG: What’s the key challenge there? What’s the bottleneck?
One bottleneck is the pressure on national initiatives to focus on their national healthcare rather than on accessing data across borders for research. Additional investments will be needed to overcome the many technical challenges of pan-European collaboration, and these investments must feed back to healthcare to show economic benefit. Whilst the economic benefit of data in research is commonly accepted, it is not easy to demonstrate via economic models, so this is another challenge. It is clear that long-term investments in genomic research yield benefits, with the UK and the NHS benefitting from Genomics England’s work. To secure investment, the government of each country needs to be convinced that the research will be of benefit, but this is not easy.
FLG: What would you like to see happen in the next few years and maybe even longer term to help overcome some of these things we’ve discussed?
There are a few things. One would be education. If we don’t educate children in schools, how will they make informed decisions about giving consent for their data to support future research? Likewise, healthcare providers need sufficient training in genomics to be prepared for the advances round the corner.
As a researcher, one of my biggest challenges was accessing genomic data at the scale needed for the analysis. By the time all contracts and agreements were in place, the research needs had changed. Reducing the time to access data brings benefits to everyone.
From the ELIXIR perspective, we must work to sustain the infrastructure, which includes tools, services and workflows. Many resources are developed within time-limited projects and it is important to sustain them after the projects end. Resources that have taken years to develop should be available to benefit research but often risk being lost completely.
FLG: These are big challenges to overcome. I think COVID has really opened people’s eyes to timelines and collaborations, and how we can get things done if we want to.
Massive amounts of COVID data were openly shared because there was an urgent need. We could do that for genomics data!
FLG: Yes! Thank you for speaking to us, and thank you again for speaking at the Festival!