Bioinformatics assists in identifying SARS-CoV-2 mutations
A new bioinformatics approach to classifying SARS-CoV-2 genetic variants helps address questions such as whether one or more of the circulating strains of the virus differ in virulence and, if so, which of those strains should be used for developing therapeutics.
In a paper published April 19 in Transboundary and Emerging Diseases, Australian researchers explored this approach.
RNA viruses have proven capable of evolving rapidly within hosts, giving rise to a wide range of quasispecies (populations of viruses with a broad spread of genomes). This viral diversity has been attributed to low-fidelity polymerases and a lack of error-correcting ability.
Coronaviruses, however, express an exoribonuclease that enables high-fidelity replication of their large genomes and therefore lower mutation rates. Even so, a range of strains with unknown functional differences poses a complex problem for the rapid development and assessment of diagnostics, vaccines, antivirals and antibody therapies.
Advances in sequencing
Advances in genomic sequencing technology and the international community's willingness to share information in the public domain have allowed researchers to track SARS-CoV-2, the agent responsible for the COVID-19 pandemic. In Australia, researchers from CSIRO developed a bioinformatics approach intended to enhance epidemiological and response efforts by synthesizing complex information more efficiently and systematically.
To compare SARS-CoV-2 isolates visually, the researchers calculated the frequency of 10-mers (decamers, subsequences of length 10) in each genome and then applied principal component analysis (PCA). First, they showed that the approach separates SARS-CoV-2 from 17 severe acute respiratory syndrome (SARS) and six Middle East respiratory syndrome (MERS) isolates, with each of the coronaviruses forming its own distinct group. PCA was then used to examine whether SARS-CoV-2 sequences cluster or form distinct strains.
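To make the k-mer and PCA steps more concrete, here is a minimal Python sketch. It is not the CSIRO team's code: the function names (`kmer_frequencies`, `feature_matrix`, `pca_scores`), the made-up toy sequences and the choice of k=3 rather than 10 (so that very short strings still share k-mers) are all assumptions for illustration. PCA is computed here by simple power iteration using only the standard library; a real analysis would work on full genomes and would most likely use a numerical library.

```python
from collections import Counter
import math
import random


def kmer_frequencies(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def feature_matrix(seqs, k):
    """One row per sequence, one column per k-mer seen in any sequence."""
    profiles = [kmer_frequencies(s, k) for s in seqs]
    vocab = sorted(set().union(*profiles))
    return [[p.get(kmer, 0.0) for kmer in vocab] for p in profiles]


def pca_scores(rows, n_components=2, iters=500):
    """Project rows onto their top principal components.

    Works on the n x n Gram matrix of the centred data: its eigenvectors,
    scaled by the square root of the eigenvalues, are the PCA scores.
    Eigenvectors are found by power iteration with deflation.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    G = [[sum(a * b for a, b in zip(X[i], X[j])) for j in range(n)]
         for i in range(n)]
    rng = random.Random(0)  # fixed seed so runs are repeatable
    components = []
    for _ in range(n_components):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:  # no variance left to explain
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(G[i][j] * v[j] for j in range(n))
                  for i in range(n))
        components.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate: remove the component just found before looking for the next
        G = [[G[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return [list(sample) for sample in zip(*components)]


# Made-up toy sequences: two similar pairs (NOT real viral genomes)
toy = [
    "ACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGAACGT",
    "TTGGCCAATTGGCCAATTGG",
    "TTGGCCAATTGGCCTATTGG",
]
scores = pca_scores(feature_matrix(toy, k=3))
for seq, (pc1, pc2) in zip(toy, scores):
    print(f"{seq}  PC1={pc1:+.3f}  PC2={pc2:+.3f}")
```

In this toy example the first two sequences fall on one side of the first principal component and the last two on the other, which is the kind of separation the article describes between SARS-CoV-2, SARS and MERS isolates.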
“Globally there is now a huge amount of individual virus sequences,” said Denis Bauer, PhD, CSIRO’s bioinformatics team leader and honorary associate professor at Macquarie University, in a statement. “Assessing the evolutionary distance between these data points and visualizing it helps researchers find out about the different strains of the virus, including where they came from and how they continue to evolve.”
Of the four Australian SARS-CoV-2 isolates the team analyzed, two clustered closely with Wuhan-Hu-1 (the reference genome), reflecting the fact that their core sequences differ only minimally from Wuhan-Hu-1. This agrees with the phylogenetic results. Because of several deletions in their sequences, PCA placed the other two isolates further from Wuhan-Hu-1 than the phylogenetic tree did.
The scientists suggest that the k-mer approach may reflect more accurately the fluidity of changes, that is, the cloud of variants. Furthermore, because the alignment-free method looks at changes across the whole genome rather than at specific locations, the information gathered in this analysis can reveal high-level similarities between genomes, such as separate genomic islands with common functions or recombination events.
Determining isolates for preclinical models
Nextstrain, an open-source bioinformatics toolkit, is a powerful tool for viewing circulating SARS-CoV-2 strains in real time, but it relies solely on phylogeny. The researchers showed that phylogeny alone may not fully capture the evolutionary changes of the virus and its newly emerging strains. When selecting a strain for use in preclinical models, researchers aim to identify the most representative and appropriate option.
SARS-CoV-2 phylogenies show three main clusters but, as the PCA reveals, these do not adequately cover the evolutionary space of the virus. The researchers built a consensus sequence (the most common residue at each position across a set of sequences) for each of the major and emerging SARS-CoV-2 clusters in the phylogenetic trees and ran PCA on those consensus sequences.
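The idea of a consensus sequence can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `consensus` is hypothetical, the input is assumed to be already aligned (all sequences the same length), gap characters are treated like any other symbol, and ties are broken by character order purely to keep the output deterministic. The article does not say how the researchers handled gaps or ties.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Most common residue at each column of equal-length aligned sequences.

    Ties are broken by character order (an arbitrary choice for this sketch).
    """
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        top = max(counts.values())
        result.append(min(r for r, n in counts.items() if n == top))
    return "".join(result)


# Made-up aligned fragments standing in for the members of one cluster
cluster = ["ACGT", "ACGA", "TCGA"]
print(consensus(cluster))  # ACGA
```

A synthetic sequence built this way need not match any single isolate, which is why it can stand in for a whole cluster when the clusters are compared.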
They found that the alignment-free approach could suggest alternative strains that cover all the new clusters. The analysis identified a strain (USA/WA1) that could be a good choice for preclinical testing because of its central position in the analysis and its ability to represent newly emerging clusters.
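The article does not spell out how "central position" was judged. One hypothetical way to put a number on it, shown below as a sketch only, is to pick the candidate isolate whose PCA coordinates have the smallest mean Euclidean distance to the cluster consensus points. The function name `most_central`, the isolate labels and every coordinate are invented for illustration and are not results from the study.

```python
import math


def most_central(candidates, cluster_points):
    """Name of the candidate with the smallest mean distance to the clusters.

    candidates: dict mapping isolate name -> (pc1, pc2) coordinates
    cluster_points: list of (pc1, pc2) coordinates, one per cluster consensus
    """
    def mean_distance(point):
        return sum(math.dist(point, c) for c in cluster_points) / len(cluster_points)

    return min(candidates, key=lambda name: mean_distance(candidates[name]))


# Invented coordinates, purely for illustration
candidates = {"isolate_A": (0.0, 0.0), "isolate_B": (5.0, 5.0)}
cluster_points = [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0)]
print(most_central(candidates, cluster_points))  # isolate_A
```

Mean distance is just one possible criterion; using the maximum distance instead would favor an isolate that is not too far from any single cluster.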
The researchers showed that synthetic consensus sequences can be used to visualize the evolutionary space the virus already occupies and can more effectively inform the choice of isolates for diagnostics, vaccines and other countermeasures.