Miguel-Angel Sicilia1,*, Elena García-Barriocanal1, Marçal Mora-Cantallops1, Salvador Sánchez-Alonso1, Lino González2
CMC-Computers, Materials & Continua, Vol.68, No.2, pp. 1661-1672, 2021, DOI:10.32604/cmc.2021.015874
- 13 April 2021
Abstract: Existing studies have challenged the current definition of named bacterial species, especially in the case of highly recombinogenic bacteria. This has led to considering the use of computational procedures to examine potential bacterial clusters that are not identified by species naming. This paper describes the use of sequence data obtained from MLST databases as input for a k-means algorithm extended to deal with housekeeping gene sequences as a metric of similarity for the clustering process. An implementation of the k-means algorithm has been developed based on an existing source code implementation, and it has been …
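To make the idea of a k-means variant over gene sequences concrete, the following is a minimal sketch and not the authors' implementation. It assumes each isolate is represented by one aligned string of equal length (for example, concatenated housekeeping gene sequences), uses Hamming distance as the dissimilarity, and replaces the numeric mean with a per-position consensus sequence, which makes it closer to k-modes than to classical k-means. The function names `hamming`, `consensus`, and `kmeans_sequences` are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def consensus(seqs):
    """Build a centroid by taking the most common character per position."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


def kmeans_sequences(seqs, k, max_iter=100, seed=0):
    """Cluster aligned sequences; returns (labels, centroids).

    Assumption: centroids start as k distinct input sequences chosen at
    random, so k must not exceed the number of distinct sequences.
    """
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(seqs)), k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid by Hamming distance.
        new_labels = [
            min(range(k), key=lambda j: hamming(s, centroids[j])) for s in seqs
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not change
        labels = new_labels
        # Update step: recompute each centroid as the cluster consensus.
        for j in range(k):
            members = [s for s, lab in zip(seqs, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = consensus(members)
    return labels, centroids


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "AATA", "GGGG", "GGGC", "GCGG"]  # made-up data
    print(kmeans_sequences(toy, k=2))
```

As with ordinary k-means, the result depends on the initial centroids, so in practice several seeds would be tried and the partition with the lowest total within-cluster distance kept.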