AI identifies genetic targets for therapy with limited transcriptomic data

Understanding the regulatory networks between genes is a powerful way of uncovering mechanisms and therapeutic targets for disease.

A study published in Nature, led by researchers from the Broad Institute, reports an artificial intelligence (AI) model that can identify therapeutic targets for rare diseases by modelling these gene networks in human cells.

Unpicking gene networks

With thousands of genes in every human cell, mapping their extensive set of interactions requires large amounts of gene expression data. But what about rare diseases and difficult-to-sequence tissues in which gene expression data is limited?

The researchers in this study used AI to solve this problem. Specifically, they developed a deep-learning model called ‘Geneformer’, which can transfer what it has already learned about human gene networks to model networks in novel situations where gene expression data is sparse.

Teaching an old AI new tricks

Their deep-learning model was pretrained on publicly available expression profiles of 30 million human cells. By training on such vast numbers of cells originating from a variety of tissues, cell types, disease states and developmental stages, Geneformer effectively learned to ‘understand’ the general dynamics of human gene networks and how the expression levels of individual genes change between cells and disease states. This general understanding meant Geneformer could predict the consequences of deleting a gene from a gene network and, perhaps more importantly, whether removing a gene from a diseased cell network would shift the network back towards a healthier state and hence offer a therapeutic target.
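To make the idea of deleting a gene and checking whether the cell moves towards a healthier state more concrete, here is a minimal toy sketch. It is not Geneformer's actual method: the gene names, the `WEIGHTS` table and the `embed` function are hypothetical stand-ins for what a pretrained model would learn, and the scoring rule (reduction in distance to a healthy reference) is an assumption made for illustration.

```python
import math

# Hypothetical stand-in for a pretrained model: each gene contributes a fixed
# 2-D vector to the cell's embedding. A real model would learn this mapping.
WEIGHTS = {
    "GENE_A": (1.0, 0.0),
    "GENE_B": (0.0, 1.0),
    "GENE_C": (0.5, 0.5),
}

def embed(cell):
    """Embed a cell (dict of gene -> expression level) as a weighted sum."""
    x = sum(WEIGHTS[gene][0] * level for gene, level in cell.items())
    y = sum(WEIGHTS[gene][1] * level for gene, level in cell.items())
    return (x, y)

def rank_deletions(diseased, healthy):
    """Rank genes by how far deleting each one moves the diseased cell's
    embedding towards the healthy reference (larger score = bigger shift)."""
    target = embed(healthy)
    baseline = math.dist(embed(diseased), target)
    scores = {}
    for gene in diseased:
        # In silico deletion: drop the gene from the profile and re-embed.
        perturbed = {g: v for g, v in diseased.items() if g != gene}
        scores[gene] = baseline - math.dist(embed(perturbed), target)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy profiles (assumed values): the diseased cell over-expresses GENE_C.
healthy = {"GENE_A": 1.0, "GENE_B": 1.0}
diseased = {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 2.0}

ranking = rank_deletions(diseased, healthy)
print(ranking[0][0])  # the top-ranked candidate gene to delete
```

In this toy setup, deleting `GENE_C` returns the diseased profile exactly to the healthy embedding, so it ranks first. A real analysis would replace `embed` with the pretrained model's cell representation.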

The real power of this model is its adaptability. Using transfer learning, Geneformer can be fine-tuned with new data from diseases and tissues for which expression data is limited. This means that Geneformer can transfer its understanding of general gene networks to predict gene network behaviour in cases where there is too little data to train a deep-learning model from scratch.
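The general pattern behind transfer learning can be sketched in a few lines: keep a pretrained feature extractor fixed and train only a small task-specific layer on the handful of labelled examples available. The sketch below is a generic illustration under stated assumptions, not the authors' fine-tuning procedure: `pretrained_features` is a hypothetical frozen encoder, the four "cells" and their labels are made-up toy values, and the logistic-regression head is one simple choice among many.

```python
import math

def pretrained_features(cell):
    """Hypothetical frozen encoder standing in for a pretrained model.
    It is never updated during fine-tuning below."""
    a, b = cell
    return (a - b, a + b)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fine_tune_head(cells, labels, epochs=1000, lr=0.1):
    """Train only a small logistic-regression head on the frozen features,
    using plain stochastic gradient descent."""
    w = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for cell, y in zip(cells, labels):
            f = pretrained_features(cell)
            p = sigmoid(w[0] * f[0] + w[1] * f[1] + bias)
            err = p - y  # gradient of the log-loss with respect to the logit
            w[0] -= lr * err * f[0]
            w[1] -= lr * err * f[1]
            bias -= lr * err
    return w, bias

def predict(cell, w, bias):
    """Return 1 if the head assigns the cell to the positive class, else 0."""
    f = pretrained_features(cell)
    return 1 if sigmoid(w[0] * f[0] + w[1] * f[1] + bias) >= 0.5 else 0

# Only four labelled toy examples (assumed values): 1 = diseased, 0 = healthy.
cells = [(2.0, 0.5), (1.8, 0.4), (0.5, 2.0), (0.3, 1.7)]
labels = [1, 1, 0, 0]

w, bias = fine_tune_head(cells, labels)
predictions = [predict(cell, w, bias) for cell in cells]
print(predictions)
```

Because the encoder already supplies useful features, the head has only three parameters to learn, which is why a few examples can be enough in this toy setting.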

From computer to bench to bedside

In principle, this approach could identify genetic targets for therapy. As a demonstration, the team of scientists fine-tuned Geneformer using sequencing data for cardiomyopathy. This rare disease has very little sequencing data to make predictions from, but Geneformer used its understanding of gene networks to identify target genes that, if removed, would shift cardiomyopathic cells towards a ‘healthy’ network state. What’s more, this computational prediction held up when tested in the lab: the group used CRISPR to delete these target genes in isolated heart muscle cells, which improved the cells’ ability to contract.

This in silico tool is cheaper than conventional approaches and isn’t limited to identifying gene candidates for rare heart diseases. As lead author Christina V. Theodoris states, “Geneformer can be applied to diverse questions to accelerate the discovery of key network regulators and candidate therapeutic targets when data are limited”. Ultimately, Geneformer provides scientists with a tool to speed up the search for treatment targets using the adaptability of deep-learning AI.