Tech Science Press - Publisher of Open Access Journals

Open Access

ARTICLE

SwinHCAD: A Robust Multi-Modality Segmentation Model for Brain Tumors Using Transformer and Channel-Wise Attention

Seyong Jin¹, Muhammad Fayaz², L. Minh Dang³, Hyoung-Kyu Song³, Hyeonjoon Moon²,*

CMC-Computers, Materials & Continua, Vol.86, No.1, pp. 1-23, 2026, DOI:10.32604/cmc.2025.070667 - 10 November 2025

Abstract Brain tumors require precise segmentation for diagnosis and treatment planning because of their complex morphology and heterogeneous characteristics. Although MRI-based automatic brain tumor segmentation reduces the burden on medical staff and provides quantitative information, existing methods, including recent models, still struggle to accurately capture and classify the fine boundaries and diverse morphologies of tumors. To address these challenges and maximize segmentation performance, this research introduces a novel SwinUNETR-based model that integrates a new decoder block, the Hierarchical Channel-wise Attention Decoder (HCAD), into a powerful SwinUNETR encoder. The HCAD…

Open Access

ARTICLE

Pyramid–MixNet: Integrate Attention into Encoder-Decoder Transformer Framework for Automatic Railway Surface Damage Segmentation

Hui Luo, Wenqing Li*, Wei Zeng

CMC-Computers, Materials & Continua, Vol.84, No.1, pp. 1567-1580, 2025, DOI:10.32604/cmc.2025.062949 - 09 June 2025

Abstract Rail surface damage is a critical concern for high-speed railway infrastructure, directly affecting train operational stability and safety. Existing methods face limitations in accuracy and speed for small-sample, multi-category, and multi-scale target segmentation tasks. To address these challenges, this paper proposes Pyramid-MixNet, an intelligent segmentation model for high-speed rail surface damage, leveraging dataset construction and expansion alongside a feature pyramid-based encoder-decoder network with multi-attention mechanisms. The encoding network integrates Spatial Reduction Masked Multi-Head Attention (SRMMHA) to enhance global feature extraction while reducing trainable parameters. The decoding network incorporates Mix-Attention (MA), enabling multi-scale structural understanding and…

Open Access

ARTICLE

FS-MSFormer: Image Dehazing Based on Frequency Selection and Multi-Branch Efficient Transformer

Chunming Tang*, Yu Wang

CMC-Computers, Materials & Continua, Vol.83, No.3, pp. 5115-5128, 2025, DOI:10.32604/cmc.2025.062328 - 19 May 2025

Abstract Image dehazing aims to generate clear images critical for subsequent visual tasks. CNNs have made significant progress in the field of image dehazing. However, due to the inherent limitations of convolution operations, it is challenging to model global context and long-range spatial dependencies effectively. Although the Transformer can address this issue, it faces the challenge of excessive computational requirements. Therefore, we propose the FS-MSFormer network, an asymmetric encoder-decoder architecture that combines the advantages of CNNs and Transformers to improve dehazing performance. Specifically, the encoding process employs two branches for multi-scale feature extraction. One branch…

Open Access

ARTICLE

DMHFR: Decoder with Multi-Head Feature Receptors for Tract Image Segmentation

Jianuo Huang¹,², Bohan Lai², Weiye Qiu³, Caixu Xu⁴, Jie He¹,⁵,*

CMC-Computers, Materials & Continua, Vol.82, No.3, pp. 4841-4862, 2025, DOI:10.32604/cmc.2025.059733 - 06 March 2025

Abstract The self-attention mechanism of Transformers, which captures long-range contextual information, has demonstrated significant potential in image segmentation. However, their ability to learn local, contextual relationships between pixels requires further improvement. Previous methods face challenges in efficiently managing multi-scale features of different granularities from the encoder backbone, leaving room for improvement in their global representation and feature extraction capabilities. To address these challenges, we propose a novel Decoder with Multi-Head Feature Receptors (DMHFR), which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities: coarse, fine-grained, and full set.…

Open Access

ARTICLE

ACSF-ED: Adaptive Cross-Scale Fusion Encoder-Decoder for Spatio-Temporal Action Detection

Wenju Wang¹, Zehua Gu¹,*, Bang Tang¹, Sen Wang², Jianfei Hao²

CMC-Computers, Materials & Continua, Vol.82, No.2, pp. 2389-2414, 2025, DOI:10.32604/cmc.2024.057392 - 17 February 2025

Abstract Current spatio-temporal action detection methods lack sufficient capabilities in extracting and comprehending spatio-temporal information. This paper introduces an end-to-end Adaptive Cross-Scale Fusion Encoder-Decoder (ACSF-ED) network to predict the action and locate the object efficiently. In the Adaptive Cross-Scale Fusion Spatio-Temporal Encoder (ACSF ST-Encoder), the Asymptotic Cross-scale Feature-fusion Module (ACCFM) is designed to address the issue of information degradation caused by the propagation of high-level semantic information, thereby extracting high-quality multi-scale features to provide superior features for subsequent spatio-temporal information modeling. Within the Shared-Head Decoder structure, a shared classification and regression detection head is constructed. A…

Open Access

ARTICLE

Image Captioning Using Multimodal Deep Learning Approach

Rihem Farkh¹,*, Ghislain Oudinet¹, Yasser Foued²

CMC-Computers, Materials & Continua, Vol.81, No.3, pp. 3951-3968, 2024, DOI:10.32604/cmc.2024.053245 - 19 December 2024

Abstract The process of generating descriptive captions for images has witnessed significant advancements in recent years, owing to the progress in deep learning techniques. Even so, the task of thoroughly grasping image content and producing coherent, contextually relevant captions continues to pose a substantial challenge. In this paper, we introduce a novel multimodal method for image captioning by integrating three powerful deep learning architectures: YOLOv8 (You Only Look Once) for robust object detection, EfficientNetB7 for efficient feature extraction, and Transformers for effective sequence modeling. Our proposed model combines the strengths of YOLOv8 in detecting objects,…

Open Access

ARTICLE

A Video Captioning Method by Semantic Topic-Guided Generation

Ou Ye, Xinli Wei, Zhenhua Yu*, Yan Fu, Ying Yang

CMC-Computers, Materials & Continua, Vol.78, No.1, pp. 1071-1093, 2024, DOI:10.32604/cmc.2023.046418 - 30 January 2024

Abstract In video captioning methods based on an encoder-decoder, an encoder extracts limited visual features, and a decoder generates a natural sentence describing the video content. However, this kind of method depends on a single video input source and few visual labels, and it suffers from poor semantic alignment between the video content and the generated sentences, which makes it unsuitable for accurately comprehending and describing the video content. To address this issue, this paper proposes a video captioning method by semantic topic-guided generation. First, a 3D convolutional neural network…

Open Access

ARTICLE

A Method of Integrating Length Constraints into Encoder-Decoder Transformer for Abstractive Text Summarization

Ngoc-Khuong Nguyen¹,², Dac-Nhuong Le¹, Viet-Ha Nguyen², Anh-Cuong Le³,*

Intelligent Automation & Soft Computing, Vol.38, No.1, pp. 1-18, 2023, DOI:10.32604/iasc.2023.037083 - 26 January 2024

Abstract Text summarization aims to generate a concise version of the original text. The longer the summary is, the more detail it retains from the original text, and the appropriate length depends on the intended use. Therefore, generating summaries of a desired length is vital for putting the research into practice. To solve this problem, in this paper, we propose a new method to integrate the desired length of the summarized text into the encoder-decoder model for the abstractive text summarization problem. This length parameter is integrated into the encoding phase…

Open Access

ARTICLE

Optimizing Fully Convolutional Encoder-Decoder Network for Segmentation of Diabetic Eye Disease

Abdul Qadir Khan¹, Guangmin Sun¹,*, Yu Li¹, Anas Bilal², Malik Abdul Manan¹

CMC-Computers, Materials & Continua, Vol.77, No.2, pp. 2481-2504, 2023, DOI:10.32604/cmc.2023.043239 - 29 November 2023

Abstract In the emerging field of image segmentation, Fully Convolutional Networks (FCNs) have recently become prominent. However, their effectiveness is intimately linked with the correct selection and fine-tuning of hyperparameters, which can often be a cumbersome manual task. The main aim of this study is to propose a more efficient, less labour-intensive approach to hyperparameter optimization in FCNs for segmenting fundus images. To this end, our research introduces a hyperparameter-optimized Fully Convolutional Encoder-Decoder Network (FCEDN). The optimization is handled by a novel Genetic Grey Wolf Optimization (G-GWO) algorithm. This algorithm employs the Genetic Algorithm (GA) to…

Open Access

ARTICLE

Traffic Scene Captioning with Multi-Stage Feature Enhancement

Dehai Zhang*, Yu Ma, Qing Liu, Haoxing Wang, Anquan Ren, Jiashu Liang

CMC-Computers, Materials & Continua, Vol.76, No.3, pp. 2901-2920, 2023, DOI:10.32604/cmc.2023.038264 - 08 October 2023

Abstract Traffic scene captioning technology analyzes input traffic scene images and automatically generates one or more sentences describing their content, supporting road safety while providing an important decision-making function for sustainable transportation. In order to provide a comprehensive and reasonable description of complex traffic scenes, a traffic scene semantic captioning model with multi-stage feature enhancement is proposed in this paper. In general, the model follows an encoder-decoder structure. First, multi-level granularity visual features are used for feature enhancement during the encoding process, which enables the model to learn…
