Open Access
ARTICLE
A Composite Transformer-Based Multi-Stage Defect Detection Architecture for Sewer Pipes
1 School of Computer Science and Engineering, Macau University of Science and Technology, Macau, 999078, China
2 School of Environmental Science and Engineering, Sun Yat-sen University, Guangzhou, 510275, China
3 Guangdong AIKE Environmental Science and Technology Co., Ltd., Zhongshan, 528400, China
* Corresponding Author: Xianfeng Li. Email:
Computers, Materials & Continua 2024, 78(1), 435-451. https://doi.org/10.32604/cmc.2023.046685
Received 11 October 2023; Accepted 14 November 2023; Issue published 30 January 2024
Abstract
Urban sewer pipes are vital infrastructure in modern cities, and their defects must be detected in time to prevent malfunction. In recent years, deep learning models have been introduced to identify potential defects automatically and reduce the manual effort required of human experts. However, these models fall short in dataset complexity, model versatility, and performance. Our work addresses these issues with a multi-stage defect detection architecture built on a composite backbone Swin Transformer. The model based on this architecture is trained on a more comprehensive dataset containing more classes of defects. Ablation studies on the composite backbone Swin Transformer, the multi-stage detector, test-time data augmentation, and model fusion show that each module improves detection accuracy in a different respect. The model incorporating all of these modules achieves a mean Average Precision (mAP) of 78.6% at an Intersection over Union (IoU) threshold of 0.5. This represents an improvement of 14.1% over the ResNet50 Faster Region-based Convolutional Neural Network (R-CNN) model and of 6.7% over You Only Look Once version 6 (YOLOv6)-large, the best-performing of the YOLO methods. Direct comparison with other defect detection models for sewer pipes is infeasible because their private datasets are unavailable; nevertheless, our results are obtained from a more comprehensive dataset and have superior generalization capabilities.
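The reported mAP is measured at an IoU threshold of 0.5, meaning a predicted box is matched to a ground-truth defect only when the two overlap sufficiently. The following is a minimal sketch of that overlap test, not the authors' evaluation code. The function names `iou` and `is_match`, the `(x1, y1, x2, y2)` corner format, and the use of `>=` rather than `>` at the threshold are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed format).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count.

    Whether the comparison is >= or > varies between benchmarks; >= is an
    assumption here.
    """
    return iou(pred_box, gt_box) >= threshold
```

For example, two 10 by 10 boxes offset horizontally by 2 units share 80 of 120 units of combined area and count as a match at this threshold, whereas an offset of 5 units (50 of 150) does not.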
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.