Since the discovery of the double-helix structure of DNA in the 1950s, scientists have devoted a huge amount of effort to determining the sequences of a variety of genomes. It took over 10 years to produce the final draft of the human genome using Sanger sequencing. Now, thanks to the advent of next generation sequencing (NGS), it is possible to sequence an entire genome in a fraction of that time: just one day. These sequencing technologies are continually evolving. Currently, they are used in several applications, such as whole genome sequencing, metagenomics and RNA sequencing.
The history of DNA sequencing
The structure of the DNA double helix was discovered in 1953 by James Watson and Francis Crick. Rosalind Franklin, an English X-ray crystallographer, also contributed a great deal and was central to the understanding of the molecular structure of DNA. Watson and Crick received a Nobel Prize in 1962 for their astounding contribution to science, followed by Robert Holley in 1968 for being the first person to sequence an RNA molecule. Unfortunately, Franklin passed away before her work was fully appreciated, and she is now often referred to as ‘the dark lady of DNA’. The combination of these pioneering discoveries paved the way for the sequencing of DNA.
Here are some of the most defining moments for DNA sequencing:
- 1972: Paul Berg developed the first technology that permitted the isolation of defined DNA fragments, leading to the development of modern genetic engineering. Before this, only phage or viral DNA was available for sequencing.
- 1973: The first nucleotide sequence was published by Walter Gilbert, consisting of 24 base pairs of the DNA lac operator.
- 1977: Frederick Sanger was the first to sequence the complete DNA genome of a bacteriophage, called phi X174. He also developed ‘DNA sequencing with chain-terminating inhibitors’.
- 1977: Walter Gilbert produced ‘DNA sequencing by chemical degradation’.
- 1986: Leroy Hood, at the California Institute of Technology, announced the invention of the first semi-automated DNA sequencing machine. This became a key tool for mapping and sequencing genetic material.
- 1987: Applied Biosystems in the US marketed the first automated sequencing machine, called ABI370. This was a significant advance for several scientific research projects, including mapping the human genome.
- 1990: The Human Genome Project formally began, involving the US, UK, France, Germany, Japan and China. It was expected to take around 15 years.
- 1998: ‘Method of nucleic acid amplification’ was developed by Eric Kawashima, Laurent Farinelli and Pascal Mayer at the Geneva Biomedical Research Institute, then run by Glaxo Wellcome. This was one of the major milestones in developing NGS technologies.
- 2000: A ‘rough draft’ of the human genome was finished by the Human Genome Project, mainly due to advances in the genomics field, especially in sequence analysis.
A timeline illustrating the milestones in major genome assembly achievements. The background colour indicates what type of sequencing was used: Red represents early sequencing methods, yellow is Sanger-based shotgun methods, green is NGS and blue represents third-generation sequencing. Image credit: Formenti et al., 2020
Evolution of NGS
NGS became available at the beginning of the 21st century. Perhaps the biggest advance that NGS offered was the ability to produce a huge amount of data, alongside a highly efficient, rapid, low-cost and accurate approach to DNA sequencing that was beyond the reach of traditional Sanger methods.
These are the key moments in the evolution of NGS:
- 2000: Lynx Therapeutics launched the first of the NGS technologies, called massively parallel signature sequencing (MPSS). The company later merged with Solexa, which was in turn bought by Illumina.
- 2004: 454 Life Sciences marketed a new generation pyrosequencing technology, called the Roche GS20. It was the first NGS platform to reach the market and revolutionised DNA sequencing because it could produce up to 20 million base pairs in a single run.
- 2008: The first paper describing a human genome sequenced using NGS was published. James Watson’s personal genome sequence was handed to him on a hard drive, and the project was estimated to cost $1 million. This was the first of numerous individual genomes to be sequenced using a variety of NGS methods.
- 2014: Illumina launched a new technology, called the HiSeq X Ten Sequencer, and claimed to have produced the first $1,000 genome. However, tens of millions of dollars in upfront investment were needed to reach that milestone.
- 2014: It was announced that Illumina had effectively monopolised the industry, holding 70% of the market for DNA sequencers and accounting for over 90% of all the DNA data produced globally.
- 2018: Veritas Genetics offered whole genome sequencing for just $199, limited to 1,000 customers.
- 2019: The National Human Genome Research Institute reported that the price of sequencing a complete human genome was $942, beating the Moore’s Law prediction.
The cost of sequencing a single human genome from 2001 to 2020. The graph shows that, since around 2008, the cost of sequencing a human genome has consistently been lower than predicted by Moore’s Law. Image credit: National Human Genome Research Institute
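The comparison with Moore’s Law can be made concrete with a little arithmetic. The sketch below assumes the usual reading of Moore’s Law as a benchmark in which cost halves roughly every two years; the function names and the two-year halving period are illustrative assumptions, not something defined by the National Human Genome Research Institute.

```python
def moore_projection(start_cost, start_year, year, halving_years=2.0):
    """Cost expected in `year` if cost halves every `halving_years` years.

    The two-year halving period is an assumption used for illustration.
    """
    elapsed = year - start_year
    return start_cost / 2 ** (elapsed / halving_years)


def beats_moore(start_cost, start_year, year, actual_cost):
    """True if the actual cost is below the Moore's Law-style projection."""
    return actual_cost < moore_projection(start_cost, start_year, year)
```

Under this assumption, 18 years (2001 to 2019) corresponds to nine halvings, a 512-fold reduction. A 2019 price of $942 therefore beats the projection for any 2001 starting cost above roughly $482,000 (942 × 512).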
Future prospects of NGS
Reaching the ‘$1000 genome’ made genome sequencing newly accessible to millions of people around the world, but companies did not stop there. Organisations such as Illumina, Pacific Biosciences, 454 Life Sciences and Oxford Nanopore Technologies are all still working tirelessly to reduce the price even further. The industry is continuing to expand, with companies now marketing NGS benchtop platforms to bring these technologies into as many laboratories as possible.
NGS will continue to become increasingly efficient and affordable, revolutionising several fields related to genomics. At the moment, all NGS approaches require library preparation. In this protocol, the DNA is first fragmented and adapters are then attached to the ends of each fragment. This is usually followed by a DNA amplification step, resulting in a library that can then be sequenced by the NGS platform.
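The library preparation steps described above (fragmentation, adapter attachment, amplification) can be sketched as a toy simulation. This is a simplified illustration only: the adapter sequences, fragment sizes and function names are all hypothetical, and real protocols differ considerably between platforms.

```python
import random

# Hypothetical adapter sequences, chosen purely for illustration.
ADAPTER_5 = "ACACGACG"
ADAPTER_3 = "AGATCGGA"


def fragment(dna, min_len, max_len, rng):
    """Cut a sequence into consecutive pieces of random length.

    The final piece may be shorter than min_len.
    """
    pieces, pos = [], 0
    while pos < len(dna):
        size = rng.randint(min_len, max_len)
        pieces.append(dna[pos:pos + size])
        pos += size
    return pieces


def ligate_adapters(pieces):
    """Attach an adapter to both ends of every fragment."""
    return [ADAPTER_5 + piece + ADAPTER_3 for piece in pieces]


def amplify(library, cycles):
    """Idealised amplification: every molecule is duplicated in each cycle."""
    return library * (2 ** cycles)


def prepare_library(dna, min_len=50, max_len=100, cycles=3, seed=0):
    """Run fragmentation, adapter ligation and amplification in order."""
    rng = random.Random(seed)
    pieces = fragment(dna, min_len, max_len, rng)
    return amplify(ligate_adapters(pieces), cycles)
```

Calling `prepare_library("ACGT" * 100)` returns a list of adapter-flanked fragments, with each fragment present in 8 copies after the default three cycles. Real amplification is not perfectly uniform, which is the source of the amplification bias that single-molecule methods avoid.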
A new class of DNA sequencing, called third generation sequencing (TGS), is currently under active development. Generally, these technologies are capable of sequencing single DNA molecules without amplification, and they produce reads much longer than those of NGS. Pacific Biosciences and Oxford Nanopore Technologies dominate the sector with their systems, called Single Molecule, Real-Time (SMRT) sequencing and nanopore sequencing, respectively. Each of these technologies can rapidly generate very long reads of up to 15,000 bases from single molecules of DNA and RNA. This means that the complete sequence of smaller genomes can be obtained without amplification bias, reducing the time and cost of the process. Accuracy and cost will continue to improve, making TGS an option for a broad range of applications in genomics.
For more information on NGS platforms check out our Sample Preparation Report. It runs through a cohort of sequencing platforms from a variety of companies and expands on the library preparation techniques that each one needs.
Image credit: Enzo Life Sciences