The Telomere-to-Telomere Consortium has finished the first truly complete sequence of a human genome, spanning 3.05 billion base pairs. This work represents the largest improvement to the human reference genome since its initial release in 2003.
The Human Genome Project
In 2003, after 13 years of hard work and nearly $3 billion in funding, researchers announced that they had finally mapped the first human genome sequence. This announcement was a historic breakthrough in the field and set the stage for a new era of genomics. Despite the excitement, the initial draft and subsequent updates of the human genome sequence were not 100% complete.
To close these gaps, Karen Miga from the University of California, Santa Cruz and Adam Phillippy from NHGRI set up the Telomere-to-Telomere (T2T) Consortium in late 2019 with the goal of producing a complete human genome sequence. The team harnessed long-read sequencing technologies, such as PacBio and Oxford Nanopore, to assemble the hard-to-sequence regions of the genome.
The complete human genome sequence
In their paper, published as a preprint on bioRxiv, the researchers report that they have resolved the remaining 8% of the human genome that was missing from earlier references.
One of the first articles I wrote for Front Line Genomics was about the complete assembly of chromosome 8. Nearly a year later, the same team has presented the new T2T-CHM13 reference, which increases the number of DNA bases from 2.92 billion to 3.05 billion. The CHM13 cell line is derived from a hydatidiform mole, a growth that forms from an abnormal fertilised egg. This occurs when an ovum without a nucleus is fertilised by a single sperm, whose genome then duplicates. The result is a 46,XX karyotype in which all chromosomes come from the paternal line. The team used these cells because the two copies of each chromosome are essentially identical, which made the computational work of assembly much simpler.
The new reference includes gapless assemblies for all 22 autosomes as well as chromosome X. It has also corrected a number of errors and introduced nearly 200 million base pairs of novel sequence. Importantly, the completed regions include all centromeric regions and the short arms of all five acrocentric chromosomes. Sequencing these regions opens the door to potential new regulatory and functional insights.
This work is a step towards assembling hundreds of complete genomes from ethnically diverse individuals. While the paper has yet to undergo peer review, these findings mark a new age of genomics!