A new study, published in Nature Communications, has developed a machine learning model that can identify the evolutionary pressures acting on mutations in ageing blood. The findings, which uncovered mechanisms that drive cancer, have the potential to significantly advance early disease detection and treatment.
Mutations in blood accumulate with age
As people age, mutations can accumulate in blood stem cells and their clones. This process is known as age-related clonal haematopoiesis (ARCH). Some mutations will confer a proliferative advantage to the cells they arise in. Cells that carry these so-called “driver” mutations will then be positively selected, resulting in a rise in frequency. This causes an imbalance in cell lineage representation within the blood cell pool.
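The rise in frequency of a positively selected clone can be illustrated with a minimal sketch. It assumes a simple deterministic selection model, which is not the model used in the study. The function name `clone_frequency`, the selection coefficient `s` and the starting frequency are hypothetical choices made for illustration only.

```python
def clone_frequency(p0, s, generations):
    """Return the frequency of a mutant clone at each generation.

    Illustrative assumptions (not from the study):
    p0: starting frequency of the clone, between 0 and 1
    s:  selection coefficient (s > 0 advantage, s = 0 neutral,
        s < 0 mildly damaging)
    """
    p = p0
    history = [p]
    for _ in range(generations):
        # Mutant cells leave (1 + s) descendants for every 1 left by
        # other cells, so their share of the pool is rescaled each step.
        p = p * (1 + s) / (1 + p * s)
        history.append(p)
    return history
```

Calling `clone_frequency(0.01, 0.1, 50)` traces a clone with a hypothetical 10% advantage as it expands, while `s = 0` leaves the frequency unchanged and a negative `s` makes the clone shrink.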
ARCH is a risk factor for acute myeloid leukaemia (AML), which affects around one million people globally. However, while some individuals with specific ARCH mutations do not develop AML, others with the same mutations do. The reason for this has been a mystery. Understanding the mechanisms that drive AML development is key to improving the early detection and treatment of this cancer.
Using machine learning to model evolution
In this study, the team used deep learning techniques to model the interaction of positive and negative evolutionary selection acting on mutations in blood stem cells. Their deep neural network model was able to identify with high accuracy whether mutations were positively selected, negatively selected, subject to both (combination selection) or neutral. Due to their proliferative advantage, driver mutations are positively selected. Mutations that accumulate in non-driver genes, known as “passenger” mutations, are mostly neutral. However, some will be mildly damaging and will be negatively selected.
The trained machine learning model was then used to analyse blood samples that had undergone deep genomic sequencing. Samples were taken from individuals who subsequently developed AML, known as preleukemic cases, and from controls: healthy individuals with ARCH mutations.
Protective passenger mutations
First, because ARCH is an age-associated phenomenon, the researchers asked whether age influenced the selective pressures on blood cells. They found that there was a clear correlation between age and positively selected mutations. Interestingly, the preleukemic samples showed evidence of positive selection at a younger age than controls. This indicates that driver mutations occurred earlier in the patients’ lives. The team suggested that driver mutations arising at a young age will have a greater fitness advantage, as there are fewer mildly damaging passenger mutations in the population.
Surprisingly, the researchers also found positively selected driver mutations in healthy samples. The team proposed that these individuals did not develop disease because of the ratio of driver to passenger mutations. They found that the control samples had an increased proportion of mutations in passenger genes compared to driver genes in the combination selection model. This implies that negative selection acting on mildly damaging passenger mutations plays a protective role in preventing cells with driver mutations from proliferating out of control.
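The proposed protective effect can be sketched numerically. This toy example assumes, purely for illustration, that fitness effects multiply: one driver contributes a factor of `1 + driver_advantage` and each mildly damaging passenger contributes a factor of `1 - passenger_cost`. The function names, parameters and values are hypothetical and are not taken from the study.

```python
def net_fitness(driver_advantage, passenger_cost, n_passengers):
    """Relative fitness of a clone carrying one driver and n passengers.

    Assumes multiplicative fitness effects (an illustrative assumption).
    A value above 1 means the clone still expands; 1 or below means the
    passengers have cancelled out the driver's advantage.
    """
    return (1 + driver_advantage) * (1 - passenger_cost) ** n_passengers


def passengers_to_offset(driver_advantage, passenger_cost):
    """Smallest number of passengers that brings net fitness to 1 or below."""
    if not 0 < passenger_cost < 1:
        raise ValueError("passenger_cost must be between 0 and 1")
    n = 0
    while net_fitness(driver_advantage, passenger_cost, n) > 1:
        n += 1
    return n
```

With a hypothetical 10% driver advantage and a 2% cost per passenger, `passengers_to_offset(0.1, 0.02)` returns 5. Under these assumptions, a driver arising in a cell with fewer than five such passengers keeps a net advantage, which mirrors the team’s suggestion that drivers arising at a young age are fitter.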
The functional impact of mutations
Next, the team scored mutations by the functional impact they had on the genome. Overall, they found that mutations under positive or combination selection in driver genes scored higher than mutations in non-driver genes. In addition, passenger mutations under negative selection scored significantly lower than passenger mutations under neutral selection. This finding suggests that negative selection can remove highly damaging mutations and decrease overall pathogenicity.
The protective role of negative selection was further supported by the finding that when ARCH occurred in the absence of positive selection, patients had a lower risk of progression to AML. Individuals with ARCH mutations that had signatures of both negative and positive selection had an approximately two-fold increased risk of progressing to AML compared to those with ARCH mutations fitting neutral models of evolution. These results can be used in the future to differentiate between patients with ARCH who are at increased risk of disease, and those who are not.
Conclusion and future implications
The model developed in this study was the first neural network able to discriminate between negative selection and neutrality. This enabled the researchers to evaluate the impact of both positive and negative selection on mutations in human blood cell populations. Importantly, they found that negative selection appears to play a protective role in preventing the proliferation of cells with driver mutations.
Future studies over longer periods of time, with larger sample sizes, will be needed to validate this study’s findings. However, the team hope that their machine learning tool can be used in future for early disease detection.
Co-lead investigator, Dr. Philip Awadalla, said: “In the future, we can anticipate screening blood samples for early detection of disease and blood cancers. With these tools we can more proactively monitor people’s health. Early detection of cancer is critical with respect to prevention and effectiveness of treatment.”