Open Access
ARTICLE
Design of a Web Crawler for Water Quality Monitoring Data and Data Visualization
1 College of Engineering and Design, Hunan Normal University, Changsha, 410081, China
2 LIHERO Technology (Hunan) Co., Ltd., Changsha, 410205, China
3 Big Data Institute, Hunan University of Finance and Economics, Changsha, 410205, China
4 University Malaysia Sabah, Sabah, 88400, Malaysia
* Corresponding Author: Jianjun Zhang.
Journal on Big Data 2022, 4(2), 135-143. https://doi.org/10.32604/jbd.2022.031024
Received 08 April 2022; Accepted 14 June 2022; Issue published 31 October 2022
Abstract
Many countries are paying increasing attention to the protection of water resources, and how to protect them has drawn wide public concern. Water quality monitoring is central to water resources protection, so collecting and analyzing monitoring data efficiently is an important part of that work. In this paper, Python and regular expressions were used to design a web crawler that acquires water quality monitoring data from Global Freshwater Quality Database (GEMStat) sites, and multi-threaded parallelism was added to speed up downloading and parsing. To analyze and process the crawled data, Pandas and Pyecharts were used to visualize it and to show its intrinsic correlations and spatiotemporal relationships.
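The crawling approach summarized in the abstract (regular-expression parsing combined with multi-threaded downloading) can be illustrated with a minimal sketch. It is not the authors' implementation: the HTML row layout matched by `ROW_RE`, the function names `fetch`, `parse`, and `crawl`, and the `max_workers` value are all assumptions made for illustration, and the real GEMStat pages may be structured differently.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Assumed (hypothetical) row layout:
# <tr><td>station</td><td>parameter</td><td>numeric value</td></tr>
ROW_RE = re.compile(
    r"<tr>\s*<td>(?P<station>[^<]+)</td>\s*"
    r"<td>(?P<parameter>[^<]+)</td>\s*"
    r"<td>(?P<value>-?\d+(?:\.\d+)?)</td>\s*</tr>"
)


def fetch(url, timeout=10):
    """Download one page and return its text (default fetcher)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse(html):
    """Extract one record per table row that matches ROW_RE."""
    return [
        {
            "station": m["station"].strip(),
            "parameter": m["parameter"].strip(),
            "value": float(m["value"]),
        }
        for m in ROW_RE.finditer(html)
    ]


def crawl(urls, fetcher=fetch, max_workers=8):
    """Download and parse pages in a thread pool; results keep the URL order."""

    def work(url):
        return parse(fetcher(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(work, urls))
    # Flatten the per-page lists into a single list of records.
    return [record for page in pages for record in page]
```

Threads suit this workload because downloading is I/O-bound, so workers spend most of their time waiting on the network. The resulting list of dictionaries can be passed directly to `pandas.DataFrame(records)` for the analysis and charting step.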
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.