
A guide to cancer genomics

According to research published in the British Journal of Cancer, one in two of us will develop cancer in our lifetime. Despite this devastating statistic, there is hope on the horizon. Genomics has transformed our understanding of cancer, providing researchers with increasingly detailed information on tumour heterogeneity and enabling clinicians to better monitor the success of certain treatments.

Rapid development and innovation have widened access to next-generation sequencing platforms among cancer researchers and enabled genome analysis on a scale that was unimaginable just a few years ago. Could we be standing on the verge of the next revolution in cancer genomics?

The basics of cancer biology

Cancer is a disease of the genome. Environmental factors can certainly influence the growth and spread of cancer, but the changes that first lead to this devastating disease originate inside the cell. Cancer was once believed to be a single disease, but we now know that it is in fact a group of related diseases characterised by cells dividing uncontrollably and spreading into surrounding tissues.


Oncogenesis encompasses a wide range of biological changes that result in a normal cell becoming cancerous. This process is often the result of a gene mutation, either in the form of an oncogene (a mutated gene that, instead of regulating normal cell division, drives tumour growth) or a mutated tumour suppressor gene (a gene that usually inhibits cell proliferation and tumour development).

Mutations come in many shapes and sizes, but the two main types are germline and somatic mutations. Germline mutations occur in sperm or egg cells and can therefore be inherited. Somatic mutations occur after conception and are the result of damage to genes in a single body cell. Though these mutations cannot be passed on to offspring, they are much more common than those found in the germline and are a major focus of research in cancer genomics, prevention, and treatment.
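In practice, the germline/somatic distinction is often drawn by sequencing a tumour sample alongside a matched normal sample from the same patient. The sketch below illustrates that idea under simplifying assumptions: variants are represented as hypothetical (chromosome, position, ref, alt) tuples, anything seen in both samples is treated as germline, and anything seen only in the tumour is treated as somatic. The variant records and the function name `classify_variants` are illustrative and do not come from any particular tool.

```python
from typing import Dict, Set, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref allele, alt allele)


def classify_variants(tumour: Set[Variant], normal: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Split variant calls into germline and somatic using a matched normal sample."""
    return {
        # Present in normal tissue as well, so likely inherited.
        "germline": tumour & normal,
        # Present only in the tumour, so likely acquired in that cell lineage.
        "somatic": tumour - normal,
    }


# Hypothetical variant calls, for illustration only.
normal_calls = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")}
tumour_calls = {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr7", 3000, "T", "A")}

result = classify_variants(tumour_calls, normal_calls)
print("germline:", sorted(result["germline"]))
print("somatic:", sorted(result["somatic"]))
```

Real pipelines also have to account for sequencing errors, tumour purity, and uneven coverage, so a plain set comparison like this is only a starting point.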

Tumour classification

Tumours are classified depending on the cell type from which they arise. The five main categories are carcinoma, sarcoma, leukaemia, lymphoma and myeloma (classified together), and central nervous system cancers.

Approximately 90% of human cancers fall under the carcinoma category, consisting of malignancies that arise in epithelial cells. Tumours can also be further classified depending on their tissue or organ of origin; for example, erythroid leukaemias arise from precursors of erythrocytes.

Cancer metastasis

Normal cells will migrate through the body until they contact another cell, get stuck, and create a uniform array of cells. On the other hand, tumour cells exhibit a reduced expression of cell surface adhesion molecules, meaning that when they contact other cells, they don’t get stuck. Instead, tumour cells continue to migrate over and around other cells, and (in culture) will grow in a disorderly and often multi-layered pattern. This lack of adhesion molecules plays an important role in the proliferation, invasion, and metastasis of cancer.

Another unique feature of cancer cells is their ability to evade apoptosis. While normal cells will undergo programmed cell death at even the faintest whiff of DNA damage, tumours (or parts of tumour populations) can survive even the immense stress of chemotherapy or irradiation, creating a significant problem in terms of treatment resistance.

The tumour microenvironment

The tumour microenvironment (TME) consists of the extracellular matrix, surrounding blood vessels, immune cells, fibroblasts, and signalling molecules. Tumour cells can influence the TME by secreting extracellular paracrine signals into their environment, inducing peripheral immune tolerance and supporting angiogenesis.


Cancer genomics sequencing options

We’ve come a long way since the dawn of Sanger sequencing in 1977. The rapid development of next-generation sequencing (NGS) approaches over the last decade has vastly altered the cancer genomics landscape, providing researchers with the ability to assess multiple genes simultaneously.

Whole-genome sequencing

Whole-genome sequencing (WGS) enables researchers to analyse the entire genome, base by base. This high-resolution view offers a wealth of data on the function of genes and their potential role in diseases such as cancer.

Supported by NGS technologies, WGS can provide base-pair level information about the mutations present in cancer cells and enables discovery of cancer-associated variants (single nucleotide polymorphisms, copy number variations, insertions/deletions, and structural variants). When combined with transcriptome analysis, WGS can also give researchers a comprehensive view of cancer as it progresses in response to therapy.

Short-read sequencing

Short-read technologies break DNA into small fragments that are then amplified and sequenced to produce “reads”. Short-read sequencing can be categorised as either single-molecule based (involving the sequencing of a single molecule) or ensemble based (the sequencing of multiple identical copies of a DNA molecule that have usually been amplified together on isolated beads).

The relatively high accuracy of short-read sequencing enables researchers to identify small genetic variations that may have a role in cancer progression and treatment response. However, there are inherent limitations in sequencing shorter stretches of DNA. Since the strands must be fragmented and amplified in NGS, there is a high potential to introduce bias into the samples. Short-read sequencing can also fail to generate sufficient overlap between DNA fragments to produce a full genome for a sample, meaning sequencing of a highly complex and repetitive genome (like that of human cancers) can be challenging.
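The link between read length, read count, and how completely a genome is covered can be made concrete with the Lander-Waterman approximation. This guide does not describe that model; it is brought in here as a common back-of-envelope estimate. The sketch assumes reads land uniformly at random across the genome, which real libraries do not quite achieve because of the amplification bias noted above. The read length, read counts, and function names are illustrative assumptions.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of reads covering each base: c = N * L / G."""
    return num_reads * read_length / genome_size


def expected_uncovered_fraction(coverage: float) -> float:
    """Under a Poisson model, the chance that a given base is hit by no read is e^-c."""
    return math.exp(-coverage)


genome_size = 3_000_000_000  # approximate human genome size in base pairs
read_length = 150            # an assumed short-read length

for num_reads in (20_000_000, 200_000_000, 600_000_000):
    c = mean_coverage(num_reads, read_length, genome_size)
    print(f"{num_reads:>11,} reads -> {c:4.1f}x coverage, "
          f"{expected_uncovered_fraction(c):.2%} of bases expected uncovered")
```

The model ignores repeats entirely, so it gives an optimistic picture for the highly repetitive genomes described above.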

Long-read sequencing

In comparison to traditional short-read sequencing (SRS), long-read sequencing (LRS) allows for the analysis of much longer (>10,000 bp) reads. This overcomes the amplification bias of SRS by sequencing a single molecule, and the longer reads overlap more extensively, which makes for better assembly.

Compared with short-read NGS, long-read sequencing allows for better overall resolution of highly repetitive genomic sequences, allowing the assembly of large and complex genomes. It makes the task of assembling a complete picture of the 3-billion-base-pair human genome much simpler, with less ambiguity and error. LRS can also be combined with other omics approaches, such as the detection of epigenetic modifications or RNA sequencing.
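A toy example can show why read length matters for repetitive sequence. The sketch below builds a made-up sequence containing a tandem repeat and counts how many positions each read could have come from. The sequence, the reads, and the helper `count_placements` are all hypothetical, and exact string matching stands in for a real aligner.

```python
def count_placements(genome: str, read: str) -> int:
    """Count every position in the genome where the read matches exactly."""
    return sum(
        1
        for start in range(len(genome) - len(read) + 1)
        if genome[start:start + len(read)] == read
    )


# Toy sequence: two unique flanks joined by a tandem repeat of "CAG".
left_flank, repeat, right_flank = "ATTGCCGTA", "CAG" * 6, "TTAGGCTAC"
genome = left_flank + repeat + right_flank

# A short read that lies entirely inside the repeat.
short_read = "CAGCAG"
# A long read that spans the whole repeat plus part of each unique flank.
long_read = left_flank[-4:] + repeat + right_flank[:4]

print("short read placements:", count_placements(genome, short_read))
print("long read placements:", count_placements(genome, long_read))
```

The short read matches at several positions within the repeat, so its true origin is ambiguous, whereas the long read is anchored by unique sequence on both sides and matches only once.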

Single-cell and spatial genomics

In recent years, single-cell and spatial sequencing have emerged, promising researchers the ability to create 3D cellular atlases of entire tissues and analyse hundreds of patient samples. Though costs remain relatively high compared to other sequencing technologies, they are increasingly utilised in oncology for their ability to detect heterogeneity among individual cells, distinguish between small numbers of cells, and delineate cell maps.

Single-cell sequencing provides granularity and resolution at the single-cell level, making it possible to determine different cell populations, types, and states, a level of detail that is lost in bulk sequencing. Pooling this information together to infer the spatial relationships between cells in tissues has significant promise for the future of cancer genomics.
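The loss of detail in bulk sequencing can be illustrated with a deliberately simple sketch. It assumes hypothetical expression values for a single gene in eight cells drawn from two populations, then compares the one averaged value a bulk measurement would report with a naive per-cell split. The numbers, the threshold rule, and the function names are assumptions for illustration; real single-cell analysis works across thousands of genes and uses proper clustering.

```python
from statistics import mean

# Hypothetical expression values for one gene across eight cells from one sample.
# Cells 0-3 are assumed to be one population, cells 4-7 another.
expression = [0.0, 0.2, 0.1, 0.1, 9.8, 10.1, 10.0, 9.9]


def bulk_signal(values):
    """Bulk sequencing effectively reports one averaged value per gene."""
    return mean(values)


def split_by_threshold(values, threshold):
    """Naive single-cell view: group cell indices by whether expression is below or above a threshold."""
    low = [i for i, v in enumerate(values) if v < threshold]
    high = [i for i, v in enumerate(values) if v >= threshold]
    return low, high


bulk = bulk_signal(expression)
low_cells, high_cells = split_by_threshold(expression, threshold=bulk)
print(f"bulk average: {bulk:.2f}")
print(f"low-expressing cells: {low_cells}, high-expressing cells: {high_cells}")
```

The bulk average sits between the two groups and matches no individual cell, while the per-cell view recovers both populations.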


Immunotherapy and precision oncology

In a healthy individual, the immune system responds to “foreign” cells (such as cancer) by attacking and eliminating them. Unfortunately, cancer cells have their own strategies for evading this immune response, leading to further proliferation and potential metastasis. The traditional course of action is to treat the disease using surgery, chemotherapy, or radiotherapy (or some combination of these). However, many patients simply do not respond to these established therapies.

Immunotherapy – a type of biological therapy – is a treatment strategy focused on harnessing the power of the patient’s immune system to attack cancer and stunt its development. It shows great promise as a bespoke therapy for cancers that do not respond to traditional treatments and could improve quality of life for many patients.

Checkpoint inhibitors

Immune checkpoints are proteins that dampen T-cell activation and prevent hyperactivation of the immune system. Immune checkpoint inhibitors are drugs that block these proteins, releasing the brakes on T cells so that they can attack cancer. The most well-known examples are antibodies that block the cytotoxic T lymphocyte antigen 4 (CTLA-4) and programmed cell death 1 (PD-1) proteins.

These drugs are used to treat melanoma, renal cell carcinomas, colorectal cancers, non-small cell lung cancer, head and neck cancer, cervical cancer, endometrial cancer, bladder cancer and breast cancer – with more cancer types on the horizon.

CAR T-cell therapy

Chimeric antigen receptor (CAR) T-cell therapy, a type of T-cell transfer therapy, is a specialised immunotherapy in which changes are made to the genes of a patient’s T-cells to increase their efficiency in recognising and destroying cancer.

Once these tweaks have been made in the lab, the T-cells are grown in batches and put back into the body via an intravenous drip. CAR T-cell therapy is currently used to treat some children with leukaemia and some adults with lymphoma.

Cancer vaccines

There are two types of cancer vaccines: prophylactic and therapeutic. Prophylactic vaccines are more similar to a traditional vaccine and are used to prevent infection by an oncogenic virus. One common example is the human papillomavirus vaccine against cervical cancer.

Therapeutic vaccines harness tumour-associated antigens to help the immune system eliminate cancer cells. Non-cancerous cells are protected from this attack as they either do not display these antigens or do not possess the antigens in high enough numbers to be targeted.

Cancer genomics in precision oncology

In the past, cancer was defined in terms of the tissue of origin: if a cancer originated in the lung, it was lung cancer. With the dawn of tumour sequencing, we now have an insight into the many different subsets of cells within a cancer and how these are defined based on their patterns of genetic alterations.

In the clinic, this has seen us move from treatment determined by the location of the tumour in the body, to considering the molecular patterns present in cancers and treating them accordingly. In clinical trials, we have seen a similar shift towards small, focused patient populations to test treatments, and drugs that are matched to specific mutations in a patient’s cancer. Ultimately, this has led to better responses to treatment.

NGS and genomic data

NGS data have proven instrumental in developing targeted therapies (drugs that attack cancer directly by interfering with specific molecules, such as the products of crucial oncogenes) and immunotherapies in precision oncology. NGS methods and bioinformatics platforms have generated oceans of cancer genomics data, which have been used to target aggressive cancers that do not respond, or respond poorly, to conventional treatment options.

NGS was used to obtain massive amounts of genomic data from patients with acute myeloid leukaemia, an effort that later expanded to solid tumours and now forms The Cancer Genome Atlas (TCGA). NGS profiles from a host of tumours can assist in the creation of targeted therapies by identifying mutations in signalling pathways and blocking them with existing or novel drugs.


Drug discovery and development

Drug discovery is a time-consuming and costly process, particularly given the high number of trials that ultimately “fail” or have negative outcomes. A high percentage of negative trial outcomes is to be expected in early-phase (I or II) trials, given they are mostly used as a proof-of-concept. However, the estimated 50% negative outcome rate in phase III trials represents a significant burden of cost in the drug development pipeline – and is a key target of genomics research.

Facilitating drug development with genomics

There are various ways that genomic information can help accelerate and improve drug development. Conceptual approaches in genetics and genomics help with target identification, prioritisation, and tractability, as well as predicting outcomes of pharmacological perturbations. Population genomics initiatives can also aid in target identification. Bulk and single-cell gene expression data are useful for understanding the biological relevance of drug targets. Genome-wide CRISPR editing can screen for loss of function or activation of genes – a valuable tool for prioritising drug targets.

Genome sequencing and genotyping

Genome-wide association studies (GWAS) use high-density genotyping of common variants and test them for association with disease, relying on linkage disequilibrium between genotyped and causal variants. Exome sequencing captures the coding region of the human genome (about 1.5% of the entire genome). Whole-genome sequencing achieves good coverage (around 85%) of the whole genome.
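To put these percentages in terms of sequence length, the short calculation below converts them into base pairs. It assumes a human genome of roughly 3 billion base pairs and takes the 1.5% and 85% figures at face value; the constant and function names are illustrative.

```python
GENOME_BP = 3_000_000_000  # assumed approximate size of the human genome

# Percentages quoted in the text, expressed as integer per-mille values
# so the arithmetic stays exact.
EXOME_PER_MILLE = 15   # about 1.5% of the genome
WGS_PER_MILLE = 850    # around 85% of the genome


def bases_covered(per_mille: int, genome_bp: int = GENOME_BP) -> int:
    """Number of base pairs represented by a per-mille share of the genome."""
    return genome_bp * per_mille // 1000


exome_bp = bases_covered(EXOME_PER_MILLE)
wgs_bp = bases_covered(WGS_PER_MILLE)
print(f"exome: ~{exome_bp / 1e6:.0f} Mb; WGS: ~{wgs_bp / 1e9:.2f} Gb")
```

On these assumptions the exome amounts to about 45 million base pairs, against roughly 2.55 billion for whole-genome sequencing, which is one reason the two approaches differ so much in cost and data volume.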

Exome sequencing and WGS are useful in identifying specific rare disease-associated variants that may be causal in cancer. Technical specifications of each technology may determine their success in translating variant discovery into actionable targets.


Transcriptomics

Transcriptional profiling of cells and tissues is a common technique in drug discovery, with high relevance to cancer drug and therapeutic development. Its use in supporting drug development includes mapping responses to compounds, interrogating tissues and cells for expression of target variants, and identifying causal variants of clinical phenotypes. It can also be used as a source of biomarkers to stratify patients for clinical trials.

Transcriptomics offers insights into the mechanisms of action and off-target effects of drugs. RNA sequencing is not constrained by cell types or numbers, meaning accurate physiological models can be selected. This flexibility is derived from protocols ranging across low inputs, bulk or single-cell interrogation, and spatial transcriptomics.

CRISPR-based technologies

CRISPR-based genome editing facilitates the creation of targeted genetic perturbations at scale and can screen for a phenotype of interest. RNA-programmable genome targeting by CRISPR/Cas9 has been used to inhibit or activate transcription, edit nucleotides, and modify epigenetic states.

Screening for disease-relevant or drug mechanism-of-action targets is limited by the suitability and scalability of available model systems. CRISPR screens have, nonetheless, driven target prioritisation for various disease models and clarified targets, enhancers, and resistance genes for existing drugs.
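One common way to read out a pooled CRISPR screen is to compare how often each guide RNA is sequenced in treated versus control cells. The sketch below assumes that setup, which this guide does not spell out. The gene names and counts are hypothetical, and the scoring is a plain log2 fold change averaged per gene; it skips the library-size normalisation and statistics that a real analysis would need.

```python
import math
from collections import defaultdict

# Hypothetical read counts per guide RNA: (gene, control count, treated count).
guide_counts = [
    ("GENE_A", 500, 60),
    ("GENE_A", 450, 50),
    ("GENE_B", 480, 510),
    ("GENE_B", 520, 495),
    ("GENE_C", 300, 1250),
    ("GENE_C", 310, 1190),
]


def log2_fold_change(control: int, treated: int, pseudocount: float = 1.0) -> float:
    """log2 ratio of treated to control counts; the pseudocount avoids division by zero."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def rank_genes(counts):
    """Average the per-guide log2 fold changes for each gene, most depleted first."""
    per_gene = defaultdict(list)
    for gene, control, treated in counts:
        per_gene[gene].append(log2_fold_change(control, treated))
    scores = {gene: sum(vals) / len(vals) for gene, vals in per_gene.items()}
    return sorted(scores.items(), key=lambda item: item[1])


ranked = rank_genes(guide_counts)
for gene, score in ranked:
    print(f"{gene}: {score:+.2f}")
```

In this toy data, guides against GENE_A drop out under treatment, suggesting the gene is needed for survival, while guides against GENE_C are enriched, the pattern expected when losing a gene confers resistance.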


Further reading

Report: Cancer Genomics

An overview of cancer biology

Sequencing options for cancer genomics

Cancer immunotherapy and precision oncology

Cancer drug discovery and development