In a recent paper published in Nature Biotechnology, scientists at the Wellcome Sanger Institute have developed a machine learning tool to improve the efficiency of prime editing – a gene editing technology. The tool is designed to reduce the trial and error involved in optimizing prime editing and to help researchers design the best fix for a given genetic flaw, potentially accelerating the development of new therapies for a range of diseases and expanding the applications of prime editing.
Prime editing vs CRISPR
Prime editing and CRISPR are both used to manipulate DNA sequences, but these technologies differ in their approach and potential applications. CRISPR-Cas9 editing, developed in 2012, is often described as “molecular scissors” that can disrupt target genes. Base editors, developed in 2016, build on the CRISPR-Cas9 technology and function more like “molecular pencils,” substituting single nucleotides. Prime editors, developed in 2019, are able to insert short DNA sequences without generating double-strand breaks or requiring an external template. Therefore, prime editors are sometimes described as “molecular word processors” that are able to perform “search and replace” operations directly on the genome.
This technology has the potential to revolutionize biotechnology and medicine, as small insertions of DNA sequences can have a profound impact on gene expression, protein function, and disease treatment. These small sequences can encode protein tags for purification and visualization, or manipulate protein function by altering protein localization, half-life or interaction profiles. Over 16,000 small deletion variants have been linked to disease, and prime editing could be used to restore the missing sequence. However, the prime editing system is complex and still being optimized, with further research needed to understand what factors affect its efficiency and the potential length range of insertions feasible with this technology.
Boosting efficacy
Improving the efficacy of prime editing is crucial for advancing the technology and realizing its potential in biotechnology and medicine. However, the efficiency of each step in prime editing is affected by various factors, such as the length and composition of the inserted sequence. Eliminating unnecessary trial and error in optimizing prime editing could accelerate the development of new therapies and applications. That’s what the research team at the Wellcome Sanger Institute set out to do, by developing a new machine learning tool to “help researchers design the best fix for a given genetic flaw.”
“Put simply, several different combinations of three DNA letters can encode for the same amino acid in a protein. That’s why there are hundreds of ways to edit a gene to achieve the same outcome at the protein level,” explained Julianne Weller, an author of the study. “By feeding these potential gene edits into a machine learning algorithm, we have created a model to rank them on how likely they are to work. We hope this will remove much of the trial and error involved in prime editing and speed up progress considerably.”
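Weller's point about codon degeneracy can be sketched in a few lines of code. The codon table entries below come from the standard genetic code; the peptide, function name, and script itself are purely illustrative and not part of the study:

```python
from itertools import product

# Partial standard genetic code: amino acid -> synonymous DNA codons.
CODON_TABLE = {
    "M": ["ATG"],                                     # methionine: 1 codon
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],  # leucine: 6 codons
    "K": ["AAA", "AAG"],                              # lysine: 2 codons
}

def synonymous_sequences(peptide):
    """Enumerate every DNA sequence that encodes the given peptide."""
    codon_choices = (CODON_TABLE[aa] for aa in peptide)
    return ["".join(codons) for codons in product(*codon_choices)]

# The tripeptide "MLK" can be encoded 1 * 6 * 2 = 12 different ways
# at the DNA level, all with the same outcome at the protein level.
candidates = synonymous_sequences("MLK")
print(len(candidates))  # 12
```

In the study's framing, each candidate is a potential gene edit with an identical protein-level outcome, and the machine learning model's job is to rank such candidates by how likely they are to be inserted efficiently.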
The researchers designed 3,604 DNA sequences of various lengths and measured the frequency of their insertion into four genomic sites in three human cell lines, using different prime editor systems in varying DNA repair contexts. The study found that the length, nucleotide composition and secondary structure of the insertion sequence all affect insertion rates. Mismatch repair also affected thousands of insertion sequences, and the 3′ flap nucleases TREX1 and TREX2 abolished the insertion of longer sequences. In summary, the specific features of sequences and DNA repair pathway activity explained most of the variation in insertion rate. Using these data, the researchers trained a machine learning model to accurately predict the editing outcomes for novel sequences. The team also used the model to select optimal reagents for new insertions and to create a catalogue of a hundred useful sequences and their insertion rates.
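The kinds of sequence properties the study found predictive – length, nucleotide composition and a proxy for structure – are simple to compute. The feature set and function below are an illustrative sketch only, not the study's actual model inputs or method:

```python
def insertion_features(seq):
    """Compute simple features of a candidate insertion sequence, of the
    general kind the study links to insertion rate. Illustrative only."""
    seq = seq.upper()
    # Nucleotide composition, summarized here as GC fraction.
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    # Crude structural proxy: the longest run of a single nucleotide.
    longest_run, run = 1, 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest_run = max(longest_run, run)
    return {"length": len(seq), "gc_fraction": gc, "longest_run": longest_run}

features = insertion_features("ATGGGGCCTA")
print(features)  # {'length': 10, 'gc_fraction': 0.6, 'longest_run': 4}
```

A model trained on features like these, together with the DNA repair context of the target cells, can then score and rank novel insertion sequences, which is the role the study's tool plays.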
Bringing prime editing closer to the clinic
Leopold Parts, senior author of the study, noted that “the potential of prime editing to improve human health is vast, but first, we need to understand the easiest, most efficient, and safest ways to make these edits. It’s all about understanding the rules of the game, which the data and tool resulting from this study will help us to do.”
This study has revealed some of those “rules,” and the ability to accurately predict the efficiency of prime editing has the potential to greatly reduce the time and resources needed, making genetic editing therapies more feasible for clinical use. This may help to streamline the development of gene therapies and bring them closer to clinical application, ultimately benefiting patients who suffer from genetic diseases.
While the results of this study are promising, the potential benefits of genetic therapies must be balanced against the risks, and rigorous safety and efficacy testing must be conducted before such therapies are approved for clinical use. There is still a long way to go in understanding the complexities of gene editing and developing safe, effective therapies that can be widely used in clinical settings.