Researchers at Tohoku University have now completed and released the first ever Japanese reference genome (JG1).
Since its completion in 2003, the human genome sequence has been an invaluable resource for both basic research in human genetics and clinical diagnosis. The reference genome is used as a target for mapping reads generated by next-generation sequencing (NGS) technologies. This mapping step is important for calling single-nucleotide variants (SNVs) and short insertions and deletions (indels). The reference genome is maintained and continually updated by the Genome Reference Consortium.
Although the reference genome is undeniably a source of unparalleled value, several of its characteristics are not ideal for NGS analyses, particularly for some populations. The reference genome contains rare and even private variants, and including such variants can lead to errors in short-read mapping or variant calling. The reference allele is also often treated as the healthy or major allele at any variable site, so the presence of rare alleles in the reference may confuse subsequent interpretation. Another issue is that the samples used to construct the reference genome are biased towards African and European ancestries. Recent studies have also revealed that the reference genome lacks population-specific sequences.
Japanese reference genome
In this study, published in Nature Communications, researchers constructed a reference genome by integrating de novo assemblies of three Japanese individuals. The team used a hybrid scaffolding strategy that combined PacBio long reads with Bionano Genomics optical maps. After merging the three haploid assemblies, the team identified the major variants among the three individuals and adopted them as the reference alleles. JG1 is contiguous, accurate, and carries the Japanese major allele at most loci.
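The idea of adopting the major variant among three assemblies as the reference allele can be illustrated with a simple majority vote. The following is a minimal sketch only, not the pipeline used in the study: the function name `major_allele`, the site labels, the example alleles, and the tie-breaking rule (fall back to the first assembly) are all assumptions made for illustration.

```python
from collections import Counter


def major_allele(alleles, fallback):
    """Return the allele carried by most assemblies, or `fallback` on a tie.

    The tie rule is an assumption for this sketch; the article does not
    say how ties were resolved in the actual study.
    """
    counts = Counter(alleles).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


# Hypothetical alleles observed at three variable sites, one entry per
# haploid assembly (three individuals).
sites = {
    "site_1": ["A", "A", "G"],
    "site_2": ["T", "C", "C"],
    "site_3": ["G", "G", "G"],
}

# Pick the majority allele at each site; ties fall back to the first assembly.
consensus = {
    site: major_allele(alleles, fallback=alleles[0])
    for site, alleles in sites.items()
}
print(consensus)
```

With three haploid assemblies, a single-nucleotide site with two alleles always has a clear majority, so the fallback only matters when all three assemblies differ.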
The team also demonstrated the utility of JG1 as a reference genome in NGS analyses aimed at identifying the causal variants of several rare diseases. They found that re-analysis using JG1 reduced the total number of candidate variant calls compared with GRCh37 while retaining the disease-causing variants.
These results show that integrating multiple genomes from a single population can help with genome analyses of that particular population.