Open Access
ARTICLE
A Vicenary Analysis of SARS-CoV-2 Genomes
1 Department of Mathematics, Pingla Thana Mahavidyalaya, Paschim Medinipur, 721140, India
2 Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, 190006, J&K, India
3 Department of Computer Science and Engineering, SRM University, Amaravati, AP, 522502, India
4 School of Computer Science and Engineering, Taylor’s University, Subang Jaya, 47500, Malaysia
5 Department of Computer Science and Engineering, Aliah University, Kolkata, India
6 Materials Science Research Institute, King Abdulaziz City for Science and Technology (KACST), Riyad, 6086, Kingdom of Saudi Arabia
7 General Administration of Research and Development Laboratories, King Abdulaziz City for Science and Technology (KACST), Riyad, 6086, Kingdom of Saudi Arabia
* Corresponding Author: Thamer A. Tabbakh. Email:
Computers, Materials & Continua 2021, 69(3), 3477-3493. https://doi.org/10.32604/cmc.2021.017206
Received 24 January 2021; Accepted 01 May 2021; Issue published 24 August 2021
Abstract
Coronaviruses are responsible for various diseases ranging from the common cold to severe infections such as Middle East respiratory syndrome and severe acute respiratory syndrome. A new coronavirus, SARS-CoV-2, the cause of COVID-19, has developed into a pandemic, resulting in an ongoing global public health crisis. Therefore, there is a need to understand the genomic transformations that occur within this family of viruses in order to limit disease spread and develop new therapeutic targets. The nucleotide sequences of SARS-CoV-2 consist of several bases, which can be classified into purines and pyrimidines according to their chemical composition. Purines include adenine (A) and guanine (G), while pyrimidines include cytosine (C) and thymine (T). There is a need to understand the spatial distribution of these bases along the nucleotide sequence to facilitate the development of antivirals (including neutralizing antibodies) and the epitopes necessary for vaccine development. This study aimed to evaluate all the purine and pyrimidine associations within the SARS-CoV-2 genome sequence by measuring mathematical parameters including Shannon entropy, the Hurst exponent, and the guanine-cytosine (GC) content. Shannon entropy is used to identify closely associated sequences, whereas the Hurst exponent is used to identify the auto-correlation of purine-pyrimidine bases even if their organization differs. Different frequency patterns can be used to determine the distribution of all four proteins and the density of each base. The GC-content is used to understand the stability of the DNA. The relevant genome sequences were extracted from the National Center for Biotechnology Information (NCBI) virus database. Furthermore, the phylogenetic properties of the COVID-19 virus were characterized to compare its closeness with other coronaviruses by evaluating the purine and pyrimidine distribution.
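As an illustration of two of the measures named in the abstract, the following minimal Python sketch computes the GC-content and the Shannon entropy of a nucleotide string, and maps the string to a purine/pyrimidine track. It is not the authors' implementation: the function names, the +1/-1 encoding of purines and pyrimidines, and the toy sequence are all assumptions made for this example, and the Hurst exponent is not computed here.

```python
import math
from collections import Counter

PURINES = {"A", "G"}      # adenine, guanine
PYRIMIDINES = {"C", "T"}  # cytosine, thymine


def gc_content(seq: str) -> float:
    """Fraction of A/C/G/T bases that are G or C; other symbols are skipped."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def shannon_entropy(symbols) -> float:
    """Shannon entropy in bits of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def purine_pyrimidine_track(seq: str) -> list:
    """Encode purines as +1 and pyrimidines as -1 (an assumed encoding).

    Symbols outside A/C/G/T are dropped.
    """
    track = []
    for b in seq.upper():
        if b in PURINES:
            track.append(1)
        elif b in PYRIMIDINES:
            track.append(-1)
    return track


if __name__ == "__main__":
    toy = "ATGCGGCTA"  # hypothetical fragment, not taken from a real genome
    print("GC-content:", gc_content(toy))
    print("Base entropy (bits):", shannon_entropy(toy))
    print("Purine/pyrimidine track:", purine_pyrimidine_track(toy))
```

With four equally frequent bases the entropy reaches its maximum of 2 bits, so lower values indicate a more skewed base composition. A numeric track such as the +1/-1 series above is the kind of input on which a long-range correlation measure like the Hurst exponent could then be estimated.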
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.