Advanced Search

Search Results (9)
  • Open Access

    ARTICLE

    TGICP: A Text-Gated Interaction Network with Inter-Sample Commonality Perception for Multimodal Sentiment Analysis

    Erlin Tian1, Shuai Zhao2,*, Min Huang2, Yushan Pan3,4, Yihong Wang3,4, Zuhe Li1

    CMC-Computers, Materials & Continua, Vol.85, No.1, pp. 1427-1456, 2025, DOI:10.32604/cmc.2025.066476 - 29 August 2025

    Abstract With the increasing importance of multimodal data in emotional expression on social media, mainstream methods for sentiment analysis have shifted from unimodal to multimodal approaches. However, extracting high-quality emotional features and achieving effective interaction between different modalities remain two major obstacles in multimodal sentiment analysis. To address these challenges, this paper proposes a Text-Gated Interaction Network with Inter-Sample Commonality Perception (TGICP). Specifically, we utilize an Inter-Sample Commonality Perception (ICP) module to extract common features from similar samples within the same modality, and use these common features to enhance the original features of…
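
    A rough sketch of the text-gating idea the abstract describes: text features produce a sigmoid gate that modulates another modality's features. All names and dimensions below are hypothetical, not the authors' implementation.

      # Text-gated interaction: text produces a per-feature gate in (0, 1)
      # that modulates a non-text modality (e.g., visual) feature vector.
      import torch
      import torch.nn as nn

      class TextGatedFusion(nn.Module):
          def __init__(self, text_dim: int, modal_dim: int):
              super().__init__()
              self.gate = nn.Linear(text_dim, modal_dim)  # text -> gate logits

          def forward(self, text_feat, modal_feat):
              g = torch.sigmoid(self.gate(text_feat))  # gate values in (0, 1)
              return g * modal_feat                    # text controls the flow

      fusion = TextGatedFusion(text_dim=768, modal_dim=512)
      text = torch.randn(4, 768)    # batch of text features
      visual = torch.randn(4, 512)  # batch of visual features
      print(fusion(text, visual).shape)  # torch.Size([4, 512])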

  • Open Access

    REVIEW

    Transformers for Multi-Modal Image Analysis in Healthcare

    Sameera V Mohd Sagheer1,*, Meghana K H2, P M Ameer3, Muneer Parayangat4, Mohamed Abbas4

    CMC-Computers, Materials & Continua, Vol.84, No.3, pp. 4259-4297, 2025, DOI:10.32604/cmc.2025.063726 - 30 July 2025

    Abstract Integrating multiple medical imaging techniques, including Magnetic Resonance Imaging (MRI), Computed Tomography (CT), Positron Emission Tomography (PET), and ultrasound, provides a comprehensive view of a patient's health status. Each of these methods contributes unique diagnostic insights, enhancing the overall assessment of the patient's condition. Nevertheless, the amalgamation of data from multiple modalities presents difficulties due to disparities in resolution, data collection methods, and noise levels. While traditional models like Convolutional Neural Networks (CNNs) excel in single-modality tasks, they struggle to handle multi-modal complexities, lacking the capacity to model global relationships. This research presents a novel approach for…
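
    One transformer fusion pattern such reviews cover is cross-attention between token sequences from two imaging modalities; a minimal sketch with illustrative shapes, not taken from the paper:

      # Cross-attention: one modality's patch tokens query another's.
      import torch
      import torch.nn as nn

      attn = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
      mri_tokens = torch.randn(2, 196, 256)  # queries: MRI patch embeddings
      pet_tokens = torch.randn(2, 196, 256)  # keys/values: PET patch embeddings
      fused, _ = attn(query=mri_tokens, key=pet_tokens, value=pet_tokens)
      print(fused.shape)  # torch.Size([2, 196, 256]) - MRI tokens enriched by PET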

  • Open Access

    REVIEW

    Research Progress on Multi-Modal Fusion Object Detection Algorithms for Autonomous Driving: A Review

    Peicheng Shi1,*, Li Yang1, Xinlong Dong1, Heng Qi2, Aixi Yang3

    CMC-Computers, Materials & Continua, Vol.83, No.3, pp. 3877-3917, 2025, DOI:10.32604/cmc.2025.063205 - 19 May 2025

    Abstract As the number and complexity of sensors in autonomous vehicles continue to rise, multimodal fusion-based object detection algorithms are increasingly being used to detect 3D environmental information, significantly advancing the development of perception technology in autonomous driving. To further promote the development of fusion algorithms and improve detection performance, this paper discusses the advantages and recent advancements of multimodal fusion-based object detection algorithms. Starting from single-modal sensor detection, the paper provides a detailed overview of typical sensors used in autonomous driving and introduces object detection methods based on images and point clouds. For image-based detection…
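
    A basic operation behind the camera-LiDAR fusion methods such reviews survey is projecting 3D points into the image plane with a pinhole camera model; a small sketch with made-up calibration values:

      # Project LiDAR points (already in the camera frame) to pixel coordinates.
      import numpy as np

      K = np.array([[700.0,   0.0, 320.0],   # intrinsics: focal lengths and
                    [  0.0, 700.0, 240.0],   # principal point (illustrative)
                    [  0.0,   0.0,   1.0]])
      points_cam = np.array([[ 1.0, 0.5, 10.0],   # 3D points in camera frame
                             [-2.0, 0.1, 15.0]])
      uv = (K @ points_cam.T).T        # homogeneous image coordinates
      uv = uv[:, :2] / uv[:, 2:3]      # perspective divide -> pixel (u, v)
      print(uv)                        # where each point lands in the image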

  • Open Access

    ARTICLE

    Multi-Modal Named Entity Recognition with Auxiliary Visual Knowledge and Word-Level Fusion

    Huansha Wang*, Ruiyang Huang*, Qinrang Liu, Xinghao Wang

    CMC-Computers, Materials & Continua, Vol.83, No.3, pp. 5747-5760, 2025, DOI:10.32604/cmc.2025.061902 - 19 May 2025

    Abstract Multi-modal Named Entity Recognition (MNER) aims to better identify meaningful textual entities by integrating information from images. Previous work has focused on extracting visual semantics at a fine-grained level, or obtaining entity-related external knowledge from knowledge bases or Large Language Models (LLMs). However, these approaches ignore the poor semantic correlation between visual and textual modalities in MNER datasets and do not explore different multi-modal fusion approaches. In this paper, we present MMAVK, a multi-modal named entity recognition model with auxiliary visual knowledge and word-level fusion, which aims to leverage the Multi-modal Large Language Model…
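
    The word-level fusion idea can be pictured as each text token attending over a small set of visual-knowledge vectors; a hedged sketch with hypothetical shapes, not the MMAVK implementation:

      # Each of 32 word representations queries 8 visual-knowledge vectors.
      import torch
      import torch.nn as nn

      xattn = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
      tokens = torch.randn(1, 32, 768)   # word-level text representations
      vis_kb = torch.randn(1, 8, 768)    # visual-knowledge vectors (e.g., MLLM output)
      fused, weights = xattn(query=tokens, key=vis_kb, value=vis_kb)
      fused = fused + tokens             # residual keeps the original word signal
      print(fused.shape, weights.shape)  # (1, 32, 768) and (1, 32, 8)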

  • Open Access

    ARTICLE

    MMCSD: Multi-Modal Knowledge Graph Completion Based on Super-Resolution and Detailed Description Generation

    Huansha Wang*, Ruiyang Huang*, Qinrang Liu, Shaomei Li, Jianpeng Zhang

    CMC-Computers, Materials & Continua, Vol.83, No.1, pp. 761-783, 2025, DOI:10.32604/cmc.2025.060395 - 26 March 2025

    Abstract Multi-modal knowledge graph completion (MMKGC) aims to complete missing entities or relations in multi-modal knowledge graphs, thereby discovering more previously unknown triples. Due to the continuous growth of data and knowledge and the limitations of data sources, the visual knowledge within the knowledge graphs is generally of low quality, and some entities suffer from missing visual modality. Nevertheless, previous studies of MMKGC have primarily focused on how to facilitate modality interaction and fusion while neglecting the problems of low modality quality and missing modalities. In this case, mainstream MMKGC models only use…
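
    For context on the completion task itself, a minimal TransE-style triple scorer in which the head entity naively fuses a structural and a visual embedding; illustrative only, not the MMCSD model:

      # Score a (head, relation, tail) triple: higher = more plausible.
      import torch

      def transe_score(h, r, t):
          return -torch.norm(h + r - t, p=2, dim=-1)  # h + r should be near t

      e_struct = torch.randn(128)      # structural embedding of the head entity
      e_visual = torch.randn(128)      # visual embedding (may be low quality)
      h = 0.5 * (e_struct + e_visual)  # naive modality fusion
      r, t = torch.randn(128), torch.randn(128)
      print(transe_score(h, r, t))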

  • Open Access

    ARTICLE

    Research on Fine-Grained Recognition Method for Sensitive Information in Social Networks Based on CLIP

    Menghan Zhang1,2, Fangfang Shan1,2,*, Mengyao Liu1,2, Zhenyu Wang1,2

    CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 1565-1580, 2024, DOI:10.32604/cmc.2024.056008 - 15 October 2024

    Abstract With the emergence and development of social networks, people can stay in touch with friends, family, and colleagues more quickly and conveniently, regardless of their location. This ubiquitous digital internet environment has also led to large-scale disclosure of personal privacy. Due to the complexity and subtlety of sensitive information, traditional sensitive-information identification technologies cannot thoroughly address the characteristics of each piece of data and thus weaken the deep connections between text and images. In this context, this paper adopts the CLIP model as a modality discriminator. By using contrastive learning between sensitive image descriptions and…
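
    The paper builds on CLIP's image-text similarity; a minimal sketch of that building block using an off-the-shelf checkpoint (checkpoint name and candidate descriptions are illustrative, not the authors' setup):

      # Rank candidate sensitive/non-sensitive descriptions against an image.
      import torch
      from PIL import Image
      from transformers import CLIPModel, CLIPProcessor

      model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
      processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

      image = Image.new("RGB", (224, 224))  # stand-in for a social-media image
      texts = ["a photo showing personal identity documents",
               "an ordinary landscape photo"]
      inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
      with torch.no_grad():
          logits = model(**inputs).logits_per_image  # shape (1, 2)
      print(logits.softmax(dim=-1))  # higher = better-matching description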

  • Open Access

    REVIEW

    A Comprehensive Survey on Deep Learning Multi-Modal Fusion: Methods, Technologies and Applications

    Tianzhe Jiao, Chaopeng Guo, Xiaoyue Feng, Yuming Chen, Jie Song*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 1-35, 2024, DOI:10.32604/cmc.2024.053204 - 18 July 2024

    Abstract Multi-modal fusion technology has gradually become a fundamental task in many fields, such as autonomous driving, smart healthcare, sentiment analysis, and human-computer interaction, and is rapidly becoming a dominant research direction due to its powerful perception and judgment capabilities. In complex scenes, multi-modal fusion technology utilizes the complementary characteristics of multiple data streams to fuse different data types and achieve more accurate predictions. However, achieving outstanding performance is challenging because of equipment performance limitations, missing information, and data noise. This paper comprehensively reviews existing methods based on multi-modal fusion techniques and completes a detailed and in-depth analysis.…
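
    Two canonical schemes that fusion surveys commonly distinguish are feature-level (early) and decision-level (late) fusion; a hedged sketch with illustrative dimensions:

      # Early fusion: concatenate features, classify once.
      # Late fusion: classify per modality, then combine the decisions.
      import torch
      import torch.nn as nn

      x_img = torch.randn(4, 512)  # image features
      x_txt = torch.randn(4, 256)  # text features

      early_head = nn.Linear(512 + 256, 3)
      early_logits = early_head(torch.cat([x_img, x_txt], dim=-1))

      img_head, txt_head = nn.Linear(512, 3), nn.Linear(256, 3)
      late_logits = 0.5 * img_head(x_img) + 0.5 * txt_head(x_txt)
      print(early_logits.shape, late_logits.shape)  # both torch.Size([4, 3])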

  • Open Access

    ARTICLE

    Fake News Detection Based on Text-Modal Dominance and Fusing Multiple Multi-Model Clues

    Lifang Fu1, Huanxin Peng2,*, Changjin Ma2, Yuhan Liu2

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 4399-4416, 2024, DOI:10.32604/cmc.2024.047053 - 26 March 2024

    Abstract In recent years, efficiently and accurately identifying multi-modal fake news has become more challenging. First, multi-modal data provides more evidence, but not all of it is equally important. Second, social structure information has proven effective in fake news detection, and how to incorporate it while reducing noise is critical. Unfortunately, existing approaches fail to handle these problems. This paper proposes a multi-modal fake news detection framework based on Text-Modal Dominance and Fusing Multiple Multi-Model Clues (TD-MMC), which utilizes three valuable multi-modal clues: text-modal importance, text-image complementarity, and text-image inconsistency. TD-MMC is…
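
    One of the named clues, text-image inconsistency, can be pictured as low similarity between projected embeddings; a sketch with hypothetical projection layers, not TD-MMC itself:

      # Low cosine similarity in a shared space suggests a mismatched pair.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      proj_t = nn.Linear(768, 256)  # project text into a shared space
      proj_v = nn.Linear(512, 256)  # project image into the same space

      text_emb = torch.randn(4, 768)
      img_emb = torch.randn(4, 512)
      consistency = F.cosine_similarity(proj_t(text_emb), proj_v(img_emb), dim=-1)
      inconsistency_clue = 1.0 - consistency  # higher = more inconsistent
      print(inconsistency_clue.shape)         # torch.Size([4])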

  • Open Access

    ARTICLE

    Cross-Modal Relation-Aware Networks for Fake News Detection

    Hui Yu, Jinguang Wang*

    Journal of New Media, Vol.4, No.1, pp. 13-26, 2022, DOI:10.32604/jnm.2022.027312 - 21 April 2022

    Abstract With the rapid development of the Internet and the widespread use of social media, so many creators publish posts on social media platforms that fake news detection has become a challenging task. Although some works use deep learning methods to capture the visual and textual information of posts, most existing methods cannot explicitly model the binary relations among image regions or text tokens to deeply mine the global relation information within a single modality, such as an image or text. Moreover, they cannot fully exploit supplementary cross-modal information, including image and text relations, to supplement…
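
    The binary relations among image regions that the abstract refers to are the kind of pairwise structure attention scores capture; a sketch of a region-by-region relation matrix with illustrative shapes:

      # Scaled dot-product scores give a 36 x 36 relation matrix over regions.
      import torch

      regions = torch.randn(1, 36, 512)           # 36 image-region features
      scores = regions @ regions.transpose(1, 2)  # pairwise scores (1, 36, 36)
      relations = torch.softmax(scores / 512 ** 0.5, dim=-1)
      print(relations.shape)  # torch.Size([1, 36, 36])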
