Open Access


A Vicenary Analysis of SARS-CoV-2 Genomes

Sk Sarif Hassan1, Ranjeet Kumar Rout2, Kshira Sagar Sahoo3, Nz Jhanjhi4, Saiyed Umer5, Thamer A. Tabbakh6,*, Zahrah A. Almusaylim7
1 Department of Mathematics, Pingla Thana Mahavidyalaya, Paschim Medinipur, 721140, India
2 Computer Science & Engineering, National Institute of Technology Srinagar, Hazratbal, 190006, J&K, India
3 Department of Computer Science and Engineering, SRM University, Amaravati, AP, 522502, India
4 School of Computer Science and Engineering, Taylor’s University, Subang Jaya, 47500, Malaysia
5 Department of Computer Science and Engineering, Aliah University, Kolkata, India
6 Materials Science Research Institute, King Abdulaziz City for Science and Technology (KACST), Riyad, 6086, Kingdom of Saudi Arabia
7 General Administration of Research and Development Laboratories, King Abdulaziz City for Science and Technology (KACST), Riyad, 6086, Kingdom of Saudi Arabia
* Corresponding Author: Thamer A. Tabbakh. Email:

Computers, Materials & Continua 2021, 69(3), 3477-3493.

Received 24 January 2021; Accepted 01 May 2021; Issue published 24 August 2021


Coronaviruses are responsible for various diseases ranging from the common cold to severe infections such as Middle East respiratory syndrome and severe acute respiratory syndrome. However, the new coronavirus strain responsible for COVID-19 has developed into a pandemic, resulting in an ongoing global public health crisis. Therefore, there is a need to understand the genomic transformations that occur within this family of viruses in order to limit disease spread and develop new therapeutic targets. The nucleotide sequences of SARS-CoV-2 consist of several bases. These bases can be classified into purines and pyrimidines according to their chemical composition. Purines include adenine (A) and guanine (G), while pyrimidines include cytosine (C) and thymine (T). There is a need to understand the spatial distribution of these bases along the nucleotide sequence to facilitate the development of antivirals (including neutralizing antibodies) and the epitopes necessary for vaccine development. This study aimed to evaluate all the purine and pyrimidine associations within the SARS-CoV-2 genome sequence by measuring mathematical parameters including Shannon entropy, the Hurst exponent, and the guanine-cytosine (GC) content. The Shannon entropy is used to identify closely associated sequences, whereas the Hurst exponent is used to identify the auto-correlation of purine-pyrimidine bases even if their organization differs. Different frequency patterns can be used to determine the distribution of all four bases and the density of each base. The GC-content is used to understand the stability of the genome sequence. The relevant genome sequences were extracted from the National Center for Biotechnology Information (NCBI) virus database. Furthermore, the phylogenetic properties of the COVID-19 virus were characterized to compare its closeness to other coronaviruses by evaluating the purine and pyrimidine distribution.
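As a rough illustration of two of the parameters named in the abstract, the following minimal Python sketch computes the GC-content and a single-base Shannon entropy for a nucleotide string, and maps the bases to purines (R) and pyrimidines (Y). The function names and the single-base entropy definition are assumptions made for illustration; the abstract does not specify the exact formulation used in the paper, which may differ (for example, by working on longer words or on the purine-pyrimidine sequence).

```python
import math
from collections import Counter

# Base classes as stated in the abstract: purines A, G; pyrimidines C, T.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(1 for base in seq if base in "GC") / len(seq)


def shannon_entropy(seq):
    """Shannon entropy (in bits) of the single-base frequency distribution.

    Assumption: entropy is taken over individual symbols, not over k-mers.
    """
    seq = seq.upper()
    n = len(seq)
    counts = Counter(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def purine_pyrimidine_string(seq):
    """Map each base to 'R' (purine) or 'Y' (pyrimidine); skip other symbols."""
    return "".join(
        "R" if base in PURINES else "Y"
        for base in seq.upper()
        if base in PURINES or base in PYRIMIDINES
    )
```

For a toy sequence such as "ATGC", each base occurs once, so the GC-content is one half and the single-base entropy reaches its four-symbol maximum of 2 bits. A real analysis would apply these functions to full genome sequences, such as those retrieved from the NCBI virus database.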


Fractal dimension; Shannon entropy; Hurst exponent; GC-content; SARS-CoV-2

Cite This Article

S. Sarif Hassan, R. Kumar Rout, K. Sagar Sahoo, N. Jhanjhi, S. Umer et al., "A vicenary analysis of SARS-CoV-2 genomes," Computers, Materials & Continua, vol. 69, no. 3, pp. 3477–3493, 2021.


This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.