Open Access
ARTICLE
Impact of Distance Measures on the Performance of AIS Data Clustering
1 Department of Maritime Telecommunications, Gdynia Maritime University, Morska 81-87, 81-225, Gdynia, Poland
2 Department of Information Systems, Gdynia Maritime University, Morska 81-87, 81-225, Gdynia, Poland
* Corresponding Author: Marta Mieczyńska. Email:
Computer Systems Science and Engineering 2021, 36(1), 69-82. https://doi.org/10.32604/csse.2021.014327
Received 13 September 2020; Accepted 01 November 2020; Issue published 23 December 2020
Abstract
Automatic Identification System (AIS) data stream analysis is based on the AIS data of different vessel’s behaviours, including the vessels’ routes. When the AIS data consists of outliers, noises, or are incomplete, then the analysis of the vessel’s behaviours is not possible or is limited. When the data consists of outliers, it is not possible to automatically assign the AIS data to a particular vessel. In this paper, a clustering method is proposed to support the AIS data analysis, to qualify noises and outliers with respect to their suitability, and finally to aid the reconstruction of the vessel’s trajectory. In this paper, clustering results have been obtained using selected algorithms, including k-means, k-medoids, and fuzzy c-means. Based on the clustering results, it is possible to decide on the qualification of data with outliers and on their usefulness in the reconstruction of the vessel trajectory. The main aim of this paper is to answer how different distance measures during a clustering process can influence AIS data clustering quality. The main core question is whether or not they have an impact on the process of reconstruction of the vessel trajectories when the data are damaged. The research question during the computational experiments asked whether or not distance measure influence AIS data clustering quality. The computational experiments have been carried out using original AIS data. In general, the experiment and the results confirm the usefulness of the cluster-based analysis when the data include outliers that are derived from the natural environment. It is also possible to monitor and to analyse AIS data using clustering when the data include outliers. The computational experiment results confirm that the k-means with Euclidean distance has the best performance.Keywords
Cite This Article
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.