Open Access
ARTICLE
Segmentation of Head and Neck Tumors Using Dual PET/CT Imaging: Comparative Analysis of 2D, 2.5D, and 3D Approaches Using UNet Transformer
1 Information and Computer Science Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
2 Software Engineering Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
3 Computer Engineering Department, College of Computer Science and Engineering, University of Ha’il, Ha’il, 55476, Saudi Arabia
4 Department of Computer and Information Sciences, Universiti Teknologi Petronas, Seri Iskandar, 32610, Malaysia
5 Center for Research in Computer Vision (CRCV), University of Central Florida, Orlando, FL 32816, USA
* Corresponding Author: Rizwan Qureshi. Email:
(This article belongs to the Special Issue: Artificial Intelligence Emerging Trends and Sustainable Applications in Image Processing and Computer Vision)
Computer Modeling in Engineering & Sciences 2024, 141(3), 2351-2373. https://doi.org/10.32604/cmes.2024.055723
Received 05 July 2024; Accepted 11 September 2024; Issue published 31 October 2024
Abstract
The segmentation of head and neck (H&N) tumors in dual Positron Emission Tomography/Computed Tomography (PET/CT) imaging is a critical task in medical imaging, providing essential information for diagnosis, treatment planning, and outcome prediction. Motivated by the need for more accurate and robust segmentation methods, this study addresses key research gaps in the application of deep learning techniques to multimodal medical images. Specifically, it investigates the limitations of existing 2D and 3D models in capturing complex tumor structures and proposes an innovative 2.5D UNet Transformer model as a solution. The primary research questions guiding this study are: (1) How can the integration of convolutional neural networks (CNNs) and transformer networks enhance segmentation accuracy in dual PET/CT imaging? (2) What are the comparative advantages of 2D, 2.5D, and 3D model configurations in this context? To answer these questions, we aimed to develop and evaluate advanced deep-learning models that leverage the strengths of both CNNs and transformers. Our proposed methodology involved a comprehensive preprocessing pipeline, including normalization, contrast enhancement, and resampling, followed by segmentation using 2D, 2.5D, and 3D UNet Transformer models. The models were trained and tested on three diverse datasets: HeckTor2022, AutoPET2023, and SegRap2023. Performance was assessed using metrics such as Dice Similarity Coefficient, Jaccard Index, Average Surface Distance (ASD), and Relative Absolute Volume Difference (RAVD). The findings demonstrate that the 2.5D UNet Transformer model consistently outperformed the 2D and 3D models across most metrics, achieving the highest Dice and Jaccard values, indicating superior segmentation accuracy. For instance, on the HeckTor2022 dataset, the 2.5D model achieved a Dice score of 81.777 and a Jaccard index of 0.705, surpassing other model configurations. 
The 3D model showed strong boundary delineation performance but exhibited variability across datasets, while the 2D model, although effective, generally underperformed compared to its 2.5D and 3D counterparts. Compared to related literature, our study confirms the advantages of incorporating additional spatial context, as seen in the improved performance of the 2.5D model. This research fills a significant gap by providing a detailed comparative analysis of different model dimensions and their impact on H&N segmentation accuracy in dual PET/CT imaging.
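For readers unfamiliar with the overlap and volume metrics named in the abstract, the following is a minimal sketch of how the Dice Similarity Coefficient, Jaccard Index, and Relative Absolute Volume Difference are commonly computed for a pair of binary masks. It is an illustration under stated assumptions, not the authors' evaluation code: the function names, the toy masks, and the convention of returning 1.0 when both masks are empty are all assumptions made here. ASD is omitted because it requires surface extraction and distance transforms.

```python
import numpy as np


def dice_coefficient(pred, target):
    """Dice = 2 * |P intersect T| / (|P| + |T|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = int(pred.sum()) + int(target.sum())
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * int(np.logical_and(pred, target).sum()) / denom


def jaccard_index(pred, target):
    """Jaccard = |P intersect T| / |P union T| for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = int(np.logical_or(pred, target).sum())
    if union == 0:
        return 1.0  # assumed convention, as above
    return int(np.logical_and(pred, target).sum()) / union


def relative_absolute_volume_difference(pred, target):
    """RAVD = |V_pred - V_target| / V_target, with volumes counted in voxels."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    v_target = int(target.sum())
    if v_target == 0:
        raise ValueError("RAVD is undefined for an empty reference mask")
    return abs(int(pred.sum()) - v_target) / v_target


# Toy 3D example (hypothetical data): two 32-voxel boxes offset by one voxel.
target = np.zeros((4, 8, 8), dtype=bool)
target[1:3, 2:6, 2:6] = True
pred = np.zeros((4, 8, 8), dtype=bool)
pred[1:3, 3:7, 2:6] = True

print(dice_coefficient(pred, target))                     # 0.75
print(jaccard_index(pred, target))                        # 0.6
print(relative_absolute_volume_difference(pred, target))  # 0.0
```

For a single mask pair the two overlap metrics are related by J = D / (2 - D), which the toy example satisfies (0.75 / 1.25 = 0.6). The identity need not hold for scores averaged over many cases, and RAVD can be zero even when the masks are misaligned, which is why it is usually reported alongside an overlap metric.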
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.