Search Results (14)
  • Open Access

    ARTICLE

    A Dual-Stream Framework for Landslide Segmentation with Cross-Attention Enhancement and Gated Multimodal Fusion

    Md Minhazul Islam1,2, Yunfei Yin1,2,*, Md Tanvir Islam1,2, Zheng Yuan1,2, Argho Dey1,2

    CMC-Computers, Materials & Continua, Vol.86, No.3, 2026, DOI:10.32604/cmc.2025.072550 - 12 January 2026

    Abstract Automatic segmentation of landslides from remote sensing imagery is challenging because traditional machine learning and early CNN-based models often fail to generalize across heterogeneous landscapes, where segmentation maps contain sparse and fragmented landslide regions under diverse geographical conditions. To address these issues, we propose a lightweight dual-stream siamese deep learning framework that integrates optical and topographical data fusion with an adaptive decoder, guided multimodal fusion, and deep supervision. The framework is built upon the synergistic combination of cross-attention, gated fusion, and sub-pixel upsampling within a unified dual-stream architecture specifically optimized for landslide segmentation, enabling efficient…

  • Open Access

    ARTICLE

    Bearing Fault Diagnosis Based on Multimodal Fusion GRU and Swin-Transformer

    Yingyong Zou*, Yu Zhang, Long Li, Tao Liu, Xingkui Zhang

    CMC-Computers, Materials & Continua, Vol.86, No.1, pp. 1-24, 2026, DOI:10.32604/cmc.2025.068246 - 10 November 2025

    Abstract Fault diagnosis of rolling bearings is crucial for ensuring the stable operation of mechanical equipment and production safety in industrial environments. However, due to the nonlinearity and non-stationarity of collected vibration signals, single-modal methods struggle to capture fault features fully. This paper proposes a rolling bearing fault diagnosis method based on multi-modal information fusion. The method first employs the Hippopotamus Optimization Algorithm (HO) to optimize the number of modes in Variational Mode Decomposition (VMD) to achieve optimal modal decomposition performance. It combines Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU) to extract temporal features…

  • Open Access

    REVIEW

    A Systematic Review of Multimodal Fusion and Explainable AI Applications in Breast Cancer Diagnosis

    Deema Alzamil1,2,*, Bader Alkhamees2, Mohammad Mehedi Hassan2,3

    CMES-Computer Modeling in Engineering & Sciences, Vol.145, No.3, pp. 2971-3027, 2025, DOI:10.32604/cmes.2025.070867 - 23 December 2025

    Abstract Breast cancer diagnosis relies heavily on many kinds of information from diverse sources, such as mammogram images, ultrasound scans, patient records, and genetic tests, but most AI tools look at only one of these at a time, which limits their ability to produce accurate and comprehensive decisions. In recent years, multimodal learning has emerged, enabling the integration of heterogeneous data to improve performance and diagnostic accuracy. However, doctors cannot always see how or why these AI tools make their choices, which is a significant bottleneck for their reliability and for their adoption in clinical settings. Hence, people are adding…

  • Open Access

    REVIEW

    Bridging 2D and 3D Object Detection: Advances in Occlusion Handling through Depth Estimation

    Zainab Ouardirhi1,2,*, Mostapha Zbakh2, Sidi Ahmed Mahmoudi1

    CMES-Computer Modeling in Engineering & Sciences, Vol.143, No.3, pp. 2509-2571, 2025, DOI:10.32604/cmes.2025.064283 - 30 June 2025

    Abstract Object detection in occluded environments remains a core challenge in computer vision (CV), especially in domains such as autonomous driving and robotics. While Convolutional Neural Network (CNN)-based two-dimensional (2D) and three-dimensional (3D) object detection methods have made significant progress, they often fall short under severe occlusion due to depth ambiguities in 2D imagery and the high cost and deployment limitations of 3D sensors such as Light Detection and Ranging (LiDAR). This paper presents a comparative review of recent 2D and 3D detection models, focusing on their occlusion-handling capabilities and the impact of sensor modalities such…

  • Open Access

    ARTICLE

    Low-Rank Adapter Layers and Bidirectional Gated Feature Fusion for Multimodal Hateful Memes Classification

    Youwei Huang, Han Zhong*, Cheng Cheng, Yijie Peng

    CMC-Computers, Materials & Continua, Vol.84, No.1, pp. 1863-1882, 2025, DOI:10.32604/cmc.2025.064734 - 09 June 2025

    Abstract A hateful meme is a multimodal medium that combines images and text. The potential hate content of such memes has caused serious problems for social media security. The current hateful memes classification task faces significant data scarcity, and direct fine-tuning of large-scale pre-trained models often leads to severe overfitting. In addition, it is challenging to understand the underlying relationship between text and images in hateful memes. To address these issues, we propose a multimodal hateful memes classification model named LABF, which is based on low-rank adapter layers and bidirectional gated feature fusion.

  • Open Access

    ARTICLE

    Image Style Transfer for Exhibition Hall Design Based on Multimodal Semantic-Enhanced Algorithm

    Qing Xie*, Ruiyun Yu

    CMC-Computers, Materials & Continua, Vol.84, No.1, pp. 1123-1144, 2025, DOI:10.32604/cmc.2025.062712 - 09 June 2025

    Abstract Although existing style transfer techniques have made significant progress in image generation, challenges remain in exhibition hall design. Existing style transfer methods mainly focus on transforming single-dimensional features and ignore the deep integration of content and style features in exhibition hall design. In addition, existing methods are deficient in detail retention, especially in accurately capturing and reproducing local textures and details while preserving the content image structure. Moreover, point-based attention mechanisms tend to ignore the complexity and diversity of image features…

  • Open Access

    ARTICLE

    DMF: A Deep Multimodal Fusion-Based Network Traffic Classification Model

    Xiangbin Wang1, Qingjun Yuan1,*, Weina Niu2, Qianwei Meng1, Yongjuan Wang1, Chunxiang Gu1

    CMC-Computers, Materials & Continua, Vol.83, No.2, pp. 2267-2285, 2025, DOI:10.32604/cmc.2025.061767 - 16 April 2025

    Abstract With the rise of encrypted traffic, traditional network analysis methods have become less effective, leading to a shift towards deep learning-based approaches. Among these, multimodal learning-based classification methods have gained attention due to their ability to leverage diverse feature sets from encrypted traffic, improving classification accuracy. However, existing research predominantly relies on late fusion techniques, which hinder the full utilization of deep features within the data. To address this limitation, we propose a novel multimodal encrypted traffic classification model that synchronizes modality fusion with multiscale feature extraction. Specifically, our approach performs real-time fusion of modalities…

  • Open Access

    ARTICLE

    Lightweight Classroom Student Action Recognition Method Based on Spatiotemporal Multimodal Feature Fusion

    Shaodong Zou1, Di Wu1, Jianhou Gan1,2,*, Juxiang Zhou1,2, Jiatian Mei1,2

    CMC-Computers, Materials & Continua, Vol.83, No.1, pp. 1101-1116, 2025, DOI:10.32604/cmc.2025.061376 - 26 March 2025

    Abstract The task of student action recognition in the classroom is to precisely capture and analyze the actions of students in classroom videos, providing a foundation for intelligent and accurate teaching. However, the complexity of the classroom environment makes student action recognition difficult. To handle real classroom scenarios, where students are prone to occlusion and computing resources are restricted, this article proposes a lightweight multi-modal fusion action recognition approach. The proposed method is capable of enhancing the…

  • Open Access

    ARTICLE

    Fusion of Hash-Based Hard and Soft Biometrics for Enhancing Face Image Database Search and Retrieval

    Ameerah Abdullah Alshahrani*, Emad Sami Jaha, Nahed Alowidi

    CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 3489-3509, 2023, DOI:10.32604/cmc.2023.044490 - 26 December 2023

    Abstract The utilization of digital picture search and retrieval has grown substantially in numerous fields for different purposes during the last decade, owing to the continuing advances in image processing and computer vision approaches. In multiple real-life applications, for example, social media, content-based face picture retrieval is a well-invested technique for large-scale databases, where there is a significant necessity for reliable retrieval capabilities enabling quick search in a vast number of pictures. Humans widely employ faces for recognizing and identifying people. Thus, face recognition through formal or personal pictures is increasingly used in various real-life applications,…

  • Open Access

    ARTICLE

    MFF-Net: Multimodal Feature Fusion Network for 3D Object Detection

    Peicheng Shi1,*, Zhiqiang Liu1, Heng Qi1, Aixi Yang2

    CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 5615-5637, 2023, DOI:10.32604/cmc.2023.037794 - 29 April 2023

    Abstract In complex traffic environments, it is very important for autonomous vehicles to accurately perceive the dynamic information of surrounding vehicles in advance. The accuracy of 3D object detection is affected by problems such as illumination changes, object occlusion, and object detection distance. To this end, we address these challenges by proposing a multimodal feature fusion network for 3D object detection (MFF-Net). This paper first uses the spatial transformation projection algorithm to map the image features into the feature space, so that the image features are in the…

Displaying 1-10 of 14 results (page 1).