Since the advent of single cell sequencing, cellular heterogeneity has been heavily studied in the search for novel insights into disease onset. Coupled with spatial context in tissue samples, the granularity of this research has reached even finer details.
At the Festival of Genomics and Biodata 2022, we hosted a single cell data analysis workshop with several experts in the field. Below, we share some of the key insights from this session.
The gold standard dataset
The first topic raised focused on the pace of technological advancement in single cell analysis and, specifically, the longevity of datasets. Often it is desirable to have a ‘gold standard’ dataset against which to compare subsequent data. This is the case with simpler, more mature technologies that have had time to develop, such as whole genome sequencing (WGS).
However, our panellists pointed out that if one were to try to generate a ‘gold standard’ dataset using current single cell techniques, it could quickly become outdated. Technology is improving at such a rate that a new ‘gold standard’ dataset would need to be produced every few years. Additionally, it is much harder to obtain reproducible results from single cell analysis of a tissue. With typical WGS, you can repeatedly sequence a clonal population to check the accuracy and reproducibility of your technique, because the clonal cells differ from each other only slightly, if at all. Currently, this is very challenging at single cell resolution in complex tissues.
One way in which experts might expect a single cell ‘gold standard’ to evolve is through the inclusion of multi-modal data. The most common and advanced form of data analysis in single cells is RNA-seq, mostly due to its availability. This will likely change in the future, even though single cell WGS is not currently up to the same standard.
Another modality raised was single cell epigenomics. This is an interesting angle for analysis, especially when examining the extent to which it is coupled with the transcriptome. While new techniques are making it easier to generate more interesting single cell data, the complexity of these data is also much higher. It is harder to combine complex single cell data of different modalities than to, for example, combine long read and short read DNA sequencing data.
Defining cell types and subtypes
The discussion on different data modalities was very much intertwined with another key insight from the workshop: defining cell types and subtypes.
The question of how to define a cell type is almost a philosophical one. Our workshop panellists mentioned that many argue that there is no ‘ground truth’ to defining cell types, which makes consensus and data integration between different groups challenging. How do you define something? Which ontology do you use? Can these ontologies evolve fast enough to keep pace with the constant discovery of new cell types?
One of our panellists postulated that, to an extent, we are submitting to a dogma of cell type classification based on traditional biomarkers that is not necessarily aligned with data produced through single cell techniques. Another panellist raised the subtleties of assigning cell subtypes versus cell states, noting that there are different perspectives on where to draw the line. It was suggested that using spatial data to understand more fully the context in which potential cell subtypes reside is a particularly powerful avenue for research. One hypothetical route to reducing the problem of defining cell types is to establish a future cell census effort for all tissues in the body, either via a digital platform or an organisation.
Translation into clinical settings
A final broad topic area raised was the relationship between single cell research and clinical implementation, a common area of discussion in many fields of research. The core question raised was whether single cell technologies will mostly be implemented directly as clinical assays in the future, or whether they will instead be used to create maps or cell atlases that help develop more effective clinical assays.
According to our panellists, there are problems with standardisation and sample collection in current clinical settings. Often samples are not collected or treated in a way that is amenable to single cell analysis. Our panellists saw a hybrid strategy as the future of single cell analysis in the clinic, using it both directly as an assay and as a method for refining other assays.
This article was written using insights gathered from a Single Cell Analysis workshop at the Festival of Genomics & Biodata, in late January 2022. We would like to thank the following individuals for their contribution to this workshop, and – by extension – the thoughts and ideas shared in this article:
- Irene Papatheodorou, Team Leader, EMBL-EBI
- Enrique Sapena Ventura, Bioinformatician, EMBL-EBI
- Oliver Stegle, Divisional Head, DKFZ German Cancer Research Center
- Omer Bayraktar, Group Leader, Wellcome Sanger Institute
Image credit: Canva