A new study has combined SNP-to-gene linking strategies to pinpoint genes that drive disease. The developed framework has identified novel disease genes, highlighting potential therapeutic targets.
Disease-associated SNPs
Genome-wide association studies (GWAS) have identified thousands of disease-associated common single nucleotide polymorphisms (SNPs). However, most disease SNPs are regulatory. For regulatory SNPs, GWAS alone cannot identify the underlying causal variants or the genes they affect. This has greatly limited the ability to translate GWAS findings into discoveries that will enhance disease treatment. Therefore, identifying target genes for disease-associated SNPs is a critical challenge.
Many SNP-to-gene (S2G) linking strategies have been developed to attempt to connect regulatory SNPs with their target genes, including transcriptome-wide association studies. However, so far these methods have not been applied in a way that allows interpretation of common disease risk variants.
Combining S2G strategies
In this study, published as a preprint on medRxiv, researchers developed a framework for evaluating and combining different S2G strategies to optimise their informativeness for human disease risk. This framework was then applied to GWAS summary statistics for 63 diseases and complex traits.
The framework evaluated 50 S2G strategies. Next, the seven most accurate S2G strategies were combined to create an optimal combined S2G strategy (cS2G). The cS2G was then applied to fine-mapping results for 49 diseases and complex traits from the UK Biobank. In total, 7,111 causal SNP-gene-disease triplets were predicted. Of these predictions, 64% were identified as the correct causal SNP and target gene, which is 1.98 times higher than when using an individual S2G strategy.
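The idea of combining several S2G strategies into one can be illustrated with a toy sketch. This is not the study's actual method: the strategy names, gene names, weights and scores below are all hypothetical, and the weighted-sum-then-normalise rule is simply one plausible way to merge per-strategy linking scores so that each SNP's scores sum to 1.

```python
from collections import defaultdict


def combine_s2g(strategy_links, weights):
    """Merge per-strategy SNP-to-gene scores into one normalised score per SNP.

    strategy_links: {strategy_name: {snp: {gene: score}}}
    weights:        {strategy_name: weight}  (hypothetical weights)
    """
    combined = defaultdict(lambda: defaultdict(float))
    for name, links in strategy_links.items():
        w = weights.get(name, 0.0)
        for snp, genes in links.items():
            for gene, score in genes.items():
                # Weighted sum of the evidence each strategy gives for this link
                combined[snp][gene] += w * score

    result = {}
    for snp, genes in combined.items():
        total = sum(genes.values())
        if total > 0:
            # Normalise so the linking scores for each SNP sum to 1
            result[snp] = {g: s / total for g, s in genes.items()}
    return result


def top_gene(combined, snp):
    """Return the gene with the highest combined linking score for a SNP."""
    genes = combined[snp]
    return max(genes, key=genes.get)


# Hypothetical inputs: two made-up strategies scoring two made-up SNPs
strategy_links = {
    "strategy_a": {"snp1": {"GENE_X": 1.0}, "snp2": {"GENE_Y": 0.5, "GENE_Z": 0.5}},
    "strategy_b": {"snp1": {"GENE_X": 0.5, "GENE_W": 0.5}, "snp2": {"GENE_Z": 1.0}},
}
weights = {"strategy_a": 2.0, "strategy_b": 1.0}

combined = combine_s2g(strategy_links, weights)
for snp in sorted(combined):
    print(snp, top_gene(combined, snp), combined[snp])
```

In this toy setup, a gene supported by more than one strategy, or by a more heavily weighted one, ends up with the largest share of a SNP's combined score, which is the intuition behind preferring a combined strategy over any single one.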
From the data collected, the researchers discovered causal genes involved in human diseases and were also able to rank these genes by heritability.
Pinpointing novel disease genes
In asthma, two independent SNPs were linked by cS2G to the target gene BCL6. The response of interleukin 4, which is known to be involved in asthma, is regulated by BCL6. However, this gene has not previously been implicated in asthma.
The study also uncovered new genes in eczema and high-density lipoprotein (HDL) cholesterol. In eczema, an SNP was linked to PDCD1, an immune-inhibitory receptor expressed in activated T cells. PDCD1 has also been previously connected to skin cancer and autoimmune diseases, but not eczema.
In HDL cholesterol, the gene LAMP1 was linked to a disease-associated SNP. Deficiency of lysosome-associated membrane protein 1, which is encoded by LAMP1, has been linked to high cholesterol in mice but, until now, not in humans.
These novel discoveries highlight the benefit of using cS2G rather than S2G strategies on their own. In addition, the uncovered genes can hopefully be targeted in future therapeutic treatments.
Assessing disease omnigenicity
Next, the researchers set out to use their framework to assess disease omnigenicity. Previous studies have proposed an ‘omnigenic model’ of complex disease. This model posits that human gene regulatory networks are so interconnected that all of the individual genes expressed in disease-critical cells impact the function of core disease genes, and therefore disease heritability.
The omnigenic model has piqued interest in how much each gene contributes to disease heritability. Using the cS2G, the researchers in this study were able to investigate this by ranking genes by their contribution to heritability.
They concluded that the top 1% of ranked genes explained approximately half of the heritability linked to all genes. This is much less than is predicted by other S2G methods. Their findings indicate that a relatively small number of genes are the driving force behind disease heritability.
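To make the "top 1% of genes" figure concrete, the calculation can be sketched as follows. This is a minimal illustration with invented numbers, not the study's data or code: the function name `top_share` and the per-gene heritability values are assumptions chosen so that the arithmetic is easy to follow.

```python
import math


def top_share(h2_by_gene, fraction=0.01):
    """Share of total gene-linked heritability explained by the top `fraction` of genes."""
    # Rank genes from largest to smallest heritability contribution
    values = sorted(h2_by_gene.values(), reverse=True)
    total = sum(values)
    if total <= 0:
        raise ValueError("total heritability must be positive")
    # Always keep at least one gene in the top set
    n_top = max(1, math.ceil(fraction * len(values)))
    return sum(values[:n_top]) / total


# Hypothetical data: 200 genes, two of which carry most of the signal
h2_by_gene = {f"gene_{i}": 1 for i in range(198)}
h2_by_gene["gene_big_1"] = 99
h2_by_gene["gene_big_2"] = 99

share = top_share(h2_by_gene, fraction=0.01)
print(f"Top 1% of genes explain {share:.0%} of gene-linked heritability")
```

Here the top 1% corresponds to two of the 200 toy genes, and together they account for half of the total, mirroring the kind of concentration the researchers describe.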
Limitations and future studies
The cS2G developed in this study represents a technological advancement over previous approaches that link SNPs to their target genes. However, the study has limitations. The accuracy of some S2G strategies could not be evaluated precisely. In addition, the biosample size available for cS2G was limited. Once larger datasets become available, further studies should be carried out to verify this study’s findings.
Nevertheless, the results collected demonstrate the advantages of using combined S2G strategies. The disease genes pinpointed by the framework provide great hope for the future of disease treatment development.