Researchers have developed a free-access software tool called Tractor that increases the discovery power of genomics in understudied populations.
Admixed groups, whose genomes comprise ancestry from more than one ancestral population, make up over one-third of the US population. Many common heritable diseases, such as prostate cancer and cardiovascular disorders, are enriched in admixed populations. However, only a small proportion of studies currently address the genetic architecture of complex traits in these groups. This limits the clinical utility of large-scale data-collection efforts for minorities and exacerbates existing health disparities. As a result, efforts have shifted to collecting data from diverse groups with higher levels of admixture.
In genome-wide association studies (GWAS), experts are concerned about false-positive hits that can arise in admixed populations. These occur because allele frequencies differ across populations, so a variant can appear to be associated with a trait simply because both track ancestry. Most researchers attempt to control for this by using principal components, but this method has several limitations.
Studying diverse populations not only reduces disparities but also benefits genetic analysis. For example, linkage disequilibrium patterns from multiple ancestries can offer more refined mapping to localise GWAS signals.
In a paper published in Nature Genetics, researchers presented a statistical framework and software package, Tractor, to facilitate the inclusion of admixed individuals in association studies by leveraging local ancestry.
Tractor allows researchers to account for ancestry in a precise manner. It labels segments of each person’s chromosomes according to their ancestral origin, which researchers can infer from reference genome sequences, and uses this information in a new GWAS model. The software also provides estimates of ancestry-specific effect sizes, which a standard GWAS cannot produce. Another advantage is that Tractor can improve the power of GWAS by detecting risk gene variants across multiple ancestries.
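The idea of splitting a variant's allele count by the ancestry of the chromosome segment it sits on can be illustrated with a small simulation. The sketch below is not Tractor's code or interface. It assumes a quantitative trait, a single locus, a two-way admixed cohort, and made-up ancestry proportions, allele frequencies and effect sizes, and it fits an ordinary least-squares model with NumPy. The function name simulate_and_fit and every parameter value are hypothetical.

```python
import numpy as np


def simulate_and_fit(n=20000, seed=0):
    """Simulate one locus in a two-way admixed cohort and fit a
    regression with ancestry-specific allele dosages.

    Illustrative only: all frequencies and effect sizes are invented.
    Returns [intercept, local-ancestry effect, beta_afr, beta_eur].
    """
    rng = np.random.default_rng(seed)

    # Local ancestry of each person's two haplotypes at this locus:
    # True = African segment, False = European segment (assumed 80% / 20%).
    anc = rng.random((n, 2)) < 0.8

    # The risk allele is assumed to be more common on African segments.
    freq = np.where(anc, 0.3, 0.1)
    allele = rng.random((n, 2)) < freq

    # Ancestry-specific dosages: copies of the allele on each background.
    dos_afr = (allele & anc).sum(axis=1)
    dos_eur = (allele & ~anc).sum(axis=1)
    n_afr = anc.sum(axis=1)  # number of African haplotypes at the locus

    # Phenotype with a different (assumed) effect on each background.
    true_beta_afr, true_beta_eur = 0.5, 0.1
    y = true_beta_afr * dos_afr + true_beta_eur * dos_eur + rng.normal(0.0, 1.0, n)

    # Design matrix: intercept, local ancestry count, and the two dosages.
    X = np.column_stack([np.ones(n), n_afr, dos_afr, dos_eur])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    intercept, b_anc, b_afr, b_eur = simulate_and_fit()
    print(f"estimated effect on African background:  {b_afr:.3f}")
    print(f"estimated effect on European background: {b_eur:.3f}")
```

When the true effects differ between backgrounds, as assumed here, a model with a single pooled dosage term would return one blended estimate. Splitting the dosage by local ancestry lets the two effects be estimated separately.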
The team tested Tractor with simulated and empirical two-way admixed African-European cohorts. The software was able to generate accurate ancestry-specific effect-size estimates and P values. It also boosted GWAS power and improved the resolution of association signals. The team were able to replicate known hits for blood lipids and also discovered novel hits missed by standard GWAS.
This software advances the existing methodologies for studying the genetics of complex disorders in diverse and minority populations. The team hope that this software will increase the inclusion of admixed participants in large-scale association studies in the future.