Researchers at the Medical Research Council (MRC) Epidemiology Unit have carried out a genome-proteome-wide association study targeting 4,775 distinct proteins, enabling them to generate a proteo-genomic map of human health covering 1,859 gene-protein-phenotype connections.
Proteins are essential functional units of the human body. They are the central layer of information transfer from the genome to any given phenotype. Therefore, characterisation of the genetic regulation of proteins is essential for understanding the mechanisms behind diseases and for developing novel therapies to improve patient outcomes.
Recently, studies have begun to uncover how sequence variation in the human genome affects protein concentrations measured in biofluids such as blood. These investigations, called protein quantitative trait loci (pQTL) studies, are now being used to better understand the causes of disease. To date, however, there has been little focus on the clinical relevance of such pQTL studies.
Generating a proteo-genomic map
Now, a group of researchers at the MRC Epidemiology Unit has carried out a genome-proteome-wide association study that targeted 4,775 distinct proteins measured in over 10,000 participants. The study, published in Science, aimed to combine genetic data with information about proteins circulating in the blood to produce a map showing how genetic variants link to diseases.
The team identified 10,674 genetic associations for 3,892 plasma proteins. This revealed that over half of all pQTLs close to protein-encoding genes, otherwise known as cis-pQTLs, overlapped with gene expression or splicing in various tissues. In turn, this enabled the researchers to gain functional insights within tissues by integrating genetics with circulating blood proteomics.
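To make the cis-pQTL definition above concrete, the following minimal Python sketch labels a variant as "cis" when it lies near the gene that encodes the measured protein, and "trans" otherwise. The article only says "close to", so the 1 Mb window, the class names and the `classify_pqtl` helper are illustrative assumptions, not the study's actual criteria or code.

```python
from dataclasses import dataclass

# Assumed window size: the article does not state a threshold, so
# 1,000,000 base pairs is used here purely for illustration.
CIS_WINDOW_BP = 1_000_000


@dataclass(frozen=True)
class Gene:
    """Hypothetical record for the gene encoding the measured protein."""
    chrom: str
    start: int  # first base of the gene
    end: int    # last base of the gene


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for a genetic variant associated with the protein."""
    chrom: str
    pos: int


def classify_pqtl(variant: Variant, gene: Gene, window: int = CIS_WINDOW_BP) -> str:
    """Return 'cis' if the variant is on the same chromosome as the gene and
    within `window` base pairs of it; otherwise return 'trans'."""
    if variant.chrom != gene.chrom:
        return "trans"
    if gene.start - window <= variant.pos <= gene.end + window:
        return "cis"
    return "trans"


# Example with made-up coordinates
gene = Gene(chrom="2", start=5_000_000, end=5_050_000)
nearby = Variant(chrom="2", pos=5_500_000)
distant = Variant(chrom="2", pos=7_000_000)
other_chrom = Variant(chrom="11", pos=5_020_000)
```

With these made-up coordinates, `nearby` would be labelled cis, while `distant` and `other_chrom` would be labelled trans. A real analysis would choose the window and gene annotations to match the study design.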
Dr Claudia Langenberg, from the MRC Epidemiology Unit, explained:
“An extreme example we discovered of how one protein can be connected to several diseases is the protein Fibulin-3, which we connected to 37 conditions, including hypermobility, hernias, varicose veins, and a lower risk of carpal tunnel syndrome. A likely explanation is atypical formation of elastic fibres covering our organs and joints, leading to differences in elasticity of soft and connective tissues. This is also in line with features that others have observed in mice where this gene was deleted.”
Additionally, the team generated a proteo-genomic map of human health, covering 1,859 gene-protein-phenotype connections. This provided insights into the shared molecular mechanisms driving various diseases and delivered biological context for new or emerging disorders.
The future of proteo-genomic connections in disease research
Overall, this study has identified numerous proteo-genomic connections within and between diseases and established the usefulness of annotating causal disease genes with cis-protein variants. These findings may help to explain why diverse diseases can be caused by the same underlying protein or mechanism, and in turn, could address some of the major barriers for clinical translation of genetic discoveries.
This information could also point to new strategies for treating a variety of conditions, as many proteins act as drug targets. For example, the genomic region known as KAT8 has long been linked to Alzheimer’s disease, but scientists have not yet identified exactly which gene is involved. By combining genomic and proteomic data, the researchers in the current study pinpointed PRSS8 within the KAT8 region. This gene encodes prostasin and is now considered a novel candidate gene for Alzheimer’s disease.
Dr Eleanor Wheeler, at the MRC Epidemiology Unit, said:
“For most genomic regions associated with disease risk, the underlying causal gene and mechanism are not known. Our work demonstrates the distinctive value of proteins to zoom in on the causal gene for a disease and helps us to understand the mechanism through which genetic variation can cause disease. We envisage that the large amount of information we are sharing with the scientific community will help ongoing and emerging efforts to connect genes to diseases more directly via the encoded protein, thus facilitating accelerated identification of drug targets.”