Open Access
ARTICLE
SGT-Net: A Transformer-Based Stratified Graph Convolutional Network for 3D Point Cloud Semantic Segmentation
1 Faculty of Robot Science and Engineering, Northeastern University, Shenyang, 110167, China
2 State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
3 Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, 110169, China
4 SIASUN Robot & Automation Co., Ltd., Shenyang, 110169, China
* Corresponding Author: Suyi Liu. Email:
(This article belongs to the Special Issue: Advanced Artificial Intelligence and Machine Learning Frameworks for Signal and Image Processing Applications)
Computers, Materials & Continua 2024, 79(3), 4471-4489. https://doi.org/10.32604/cmc.2024.049450
Received 08 January 2024; Accepted 15 April 2024; Issue published 20 June 2024
Abstract
In recent years, semantic segmentation on 3D point cloud data has attracted much attention. Unlike 2D images where pixels distribute regularly in the image domain, 3D point clouds in non-Euclidean space are irregular and inherently sparse. Therefore, it is very difficult to extract long-range contexts and effectively aggregate local features for semantic segmentation in 3D point cloud space. Most current methods either focus on local feature aggregation or long-range context dependency, but fail to directly establish a global-local feature extractor to complete the point cloud semantic segmentation tasks. In this paper, we propose a Transformer-based stratified graph convolutional network (SGT-Net), which enlarges the effective receptive field and builds direct long-range dependency. Specifically, we first propose a novel dense-sparse sampling strategy that provides dense local vertices and sparse long-distance vertices for subsequent graph convolutional network (GCN). Secondly, we propose a multi-key self-attention mechanism based on the Transformer to further weight augmentation for crucial neighboring relationships and enlarge the effective receptive field. In addition, to further improve the efficiency of the network, we propose a similarity measurement module to determine whether the neighborhood near the center point is effective. We demonstrate the validity and superiority of our method on the S3DIS and ShapeNet datasets. Through ablation experiments and segmentation visualization, we verify that the SGT model can improve the performance of the point cloud semantic segmentation.Keywords
Cite This Article
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.