Open Access


Design of a Web Crawler for Water Quality Monitoring Data and Data Visualization

Ziwen Yu1, Jianjun Zhang1,*, Wenwu Tan1, Ziyi Xiong1, Peilun Li1, Liangqing Meng2, Haijun Lin1, Guang Sun3, Peng Guo4

1 College of Engineering and Design, Hunan Normal University, Changsha, 410081, China
2 LIHERO Technology (Hunan) Co., Ltd., Changsha, 410205, China
3 Big Data Institute, Hunan University of Finance and Economics, Changsha, 410205, China
4 University Malaysia Sabah, Sabah, 88400, Malaysia

* Corresponding Author: Jianjun Zhang. Email:

Journal on Big Data 2022, 4(2), 135-143.


Many countries are paying increasing attention to the protection of water resources, and the question of how to protect them has drawn wide public interest. Water quality monitoring is key to water resources protection, and collecting and analyzing monitoring data efficiently is an important part of that work. In this paper, Python and regular expressions are used to design a web crawler that acquires water quality monitoring data from the Global Freshwater Quality Database (GEMStat) sites, with multi-threading added to improve the efficiency of downloading and parsing. The crawled water quality data are then processed and visualized with Pandas and Pyecharts to show the intrinsic correlations and spatiotemporal relationships in the data.
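The crawler design summarized above (regular-expression parsing combined with multi-threaded downloading) can be illustrated with a minimal sketch. It is not the authors' implementation: the HTML row layout, the `ROW_RE` pattern, the `parse_page` and `crawl` functions, and the `fake_fetch` stand-in for an HTTP download are all assumptions made for illustration, since the abstract does not describe the GEMStat page structure.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Hypothetical row layout: station, date, parameter, value in four <td> cells.
# The real GEMStat markup is not given in the source and will differ.
ROW_RE = re.compile(
    r"<tr>\s*<td>(?P<station>[^<]+)</td>\s*"
    r"<td>(?P<date>\d{4}-\d{2}-\d{2})</td>\s*"
    r"<td>(?P<parameter>[^<]+)</td>\s*"
    r"<td>(?P<value>-?\d+(?:\.\d+)?)</td>\s*</tr>"
)


def parse_page(html):
    """Extract measurement records from one page using a regular expression."""
    records = []
    for match in ROW_RE.finditer(html):
        record = match.groupdict()
        record["value"] = float(record["value"])
        records.append(record)
    return records


def crawl(urls, fetch, max_workers=8):
    """Download and parse pages in parallel threads.

    `fetch` is any callable mapping a URL to its HTML text, so a real
    HTTP client can be swapped in. Results follow the order of `urls`.
    """
    def worker(url):
        return parse_page(fetch(url))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        pages = list(pool.map(worker, urls))
    return [record for page in pages for record in page]


# Stand-in for a real download step (assumed data, for demonstration only).
FAKE_PAGES = {
    "page/1": "<table><tr><td>ST001</td><td>2020-01-15</td>"
              "<td>pH</td><td>7.4</td></tr></table>",
    "page/2": "<tr> <td>ST002</td> <td>2020-02-01</td> "
              "<td>DO</td> <td>8.15</td> </tr>",
}


def fake_fetch(url):
    return FAKE_PAGES[url]


records = crawl(["page/1", "page/2"], fake_fetch)
```

Threads suit this kind of workload because downloading is I/O-bound, so several requests can wait on the network at once. The resulting list of dictionaries could then be loaded into a Pandas DataFrame for the analysis and visualization step.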


Cite This Article

Z. Yu, J. Zhang, W. Tan, Z. Xiong, P. Li et al., "Design of a web crawler for water quality monitoring data and data visualization," Journal on Big Data, vol. 4, no. 2, pp. 135–143, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.