Search Results (68)
  • Open Access

    ARTICLE

    Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation

    Shiwen Song, Rui Zhang, Min Hu*, Feiyao Huang

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5243-5271, 2024, DOI:10.32604/cmc.2024.050879

    Abstract Fine-grained recognition of ships in remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. With the emergence of massive high-resolution multi-modality images, using multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images places higher demands on dataset samples. The key problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in the image, resulting in a…
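
    As a rough illustration of the attention-based fusion idea in the abstract above, the sketch below lets visible and near-infrared feature tokens attend to each other and concatenates the pooled results for classification. It is a minimal sketch only: the PyTorch layers, feature sizes, and placeholder class count are assumptions for illustration, not the dataset or model from the paper.

        # Hypothetical sketch: attention-based fusion of visible (VIS) and
        # near-infrared (NIR) feature tokens; all dimensions are illustrative.
        import torch
        import torch.nn as nn

        class CrossModalFusion(nn.Module):
            def __init__(self, dim=256, heads=4, num_classes=42):  # class count is a placeholder
                super().__init__()
                # Each modality attends to the other so it can pick out
                # complementary information before the features are fused.
                self.vis_to_nir = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.nir_to_vis = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.classifier = nn.Linear(2 * dim, num_classes)

            def forward(self, vis_tokens, nir_tokens):
                # vis_tokens, nir_tokens: (batch, num_patches, dim)
                vis_enh, _ = self.vis_to_nir(vis_tokens, nir_tokens, nir_tokens)
                nir_enh, _ = self.nir_to_vis(nir_tokens, vis_tokens, vis_tokens)
                # Pool each enhanced sequence and concatenate for classification.
                fused = torch.cat([vis_enh.mean(dim=1), nir_enh.mean(dim=1)], dim=-1)
                return self.classifier(fused)

        model = CrossModalFusion()
        logits = model(torch.randn(2, 49, 256), torch.randn(2, 49, 256))
        print(logits.shape)   # torch.Size([2, 42])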

  • Open Access

    ARTICLE

    An Immune-Inspired Approach with Interval Allocation in Solving Multimodal Multi-Objective Optimization Problems with Local Pareto Sets

    Weiwei Zhang, Jiaqiang Li, Chao Wang, Meng Li, Zhi Rao*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 4237-4257, 2024, DOI:10.32604/cmc.2024.050430

    Abstract In practical engineering, multi-objective optimization often encounters situations where multiple Pareto sets (PS) in the decision space correspond to the same Pareto front (PF) in the objective space, known as Multi-Modal Multi-Objective Optimization Problems (MMOP). Locating multiple equivalent global PSs is a significant challenge in real-world applications, especially when local PSs also exist; effectively identifying and locating both global and local PSs is therefore a central difficulty. To tackle this issue, we introduce an immune-inspired reproduction strategy designed to produce more offspring in less crowded, promising regions and regulate the number of offspring in areas…
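
    The crowding-based idea in the abstract above can be pictured with a small sketch: solutions in sparser regions of the decision space receive a larger share of the offspring budget. The k-nearest-neighbour crowding measure and the proportional allocation rule below are illustrative assumptions, not the paper's exact immune-inspired operators.

        # Hypothetical sketch: give solutions in less crowded regions of the
        # decision space a larger share of the offspring budget (immune-style
        # cloning); the crowding measure and allocation rule are illustrative.
        import numpy as np

        def crowding(pop, k=3):
            # Mean distance to the k nearest neighbours in decision space.
            dist = np.linalg.norm(pop[:, None, :] - pop[None, :, :], axis=-1)
            dist.sort(axis=1)
            return dist[:, 1:k + 1].mean(axis=1)         # skip the zero self-distance

        def allocate_offspring(pop, total=20):
            sparseness = crowding(pop)
            share = sparseness / sparseness.sum()        # sparser => larger share
            counts = np.floor(share * total).astype(int)
            counts[np.argmax(share)] += total - counts.sum()   # absorb rounding error
            return counts

        rng = np.random.default_rng(0)
        population = rng.random((10, 2))                 # 10 solutions, 2 decision variables
        print(allocate_offspring(population))            # offspring assigned to each solution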

  • Open Access

    ARTICLE

    Enhancing Cross-Lingual Image Description: A Multimodal Approach for Semantic Relevance and Stylistic Alignment

    Emran Al-Buraihy, Dan Wang*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3913-3938, 2024, DOI:10.32604/cmc.2024.048104

    Abstract Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language…

  • Open Access

    ARTICLE

    Multimodal Deep Neural Networks for Digitized Document Classification

    Aigerim Baimakhanova*, Ainur Zhumadillayeva, Bigul Mukhametzhanova, Natalya Glazyrina, Rozamgul Niyazova, Nurseit Zhunissov, Aizhan Sambetbayeva

    Computer Systems Science and Engineering, Vol.48, No.3, pp. 793-811, 2024, DOI:10.32604/csse.2024.043273

    Abstract As digital technologies have advanced, the number of paper documents converted into digital form has increased exponentially. In response to the urgent need to categorize this growing volume of digitized documents, real-time classification of digitized documents was identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret those traits, which were not…

  • Open Access

    ARTICLE

    Cross-Modal Consistency with Aesthetic Similarity for Multimodal False Information Detection

    Weijian Fan*, Ziwei Shi

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2723-2741, 2024, DOI:10.32604/cmc.2024.050344

    Abstract With the explosive growth of false information on social media platforms, automatic detection of multimodal false information has received increasing attention. Recent research has contributed significantly to multimodal information exchange and fusion, with many methods attempting to integrate unimodal features into multimodal news representations. However, these methods have yet to fully explore the hierarchical and complex semantic correlations between contents of different modalities, which severely limits their performance in detecting multimodal false information. This work proposes a two-stage framework for multimodal false information detection, called ASMFD, which is based on image aesthetic similarity to segment and…
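
    One way to picture the cross-modal consistency idea mentioned above is to project text and image features into a shared space and score their agreement. The sketch below uses placeholder encoders and a cosine-similarity score; it is a generic stand-in, not the ASMFD architecture.

        # Hypothetical sketch: project text and image features into a shared
        # space and use cosine similarity as a cross-modal consistency score;
        # a low score is one possible cue for false information.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ConsistencyScorer(nn.Module):
            def __init__(self, text_dim=768, image_dim=2048, shared_dim=256):
                super().__init__()
                self.text_proj = nn.Linear(text_dim, shared_dim)
                self.image_proj = nn.Linear(image_dim, shared_dim)

            def forward(self, text_feat, image_feat):
                t = F.normalize(self.text_proj(text_feat), dim=-1)
                v = F.normalize(self.image_proj(image_feat), dim=-1)
                return (t * v).sum(dim=-1)        # cosine similarity in [-1, 1]

        scorer = ConsistencyScorer()
        scores = scorer(torch.randn(4, 768), torch.randn(4, 2048))
        print(scores.shape)                       # torch.Size([4]), one score per post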

  • Open Access

    ARTICLE

    FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection

    Li Wang*, Mingshan Xia*, Hao Hu, Jianfang Li, Fengyao Hou, Gang Chen

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2991-3006, 2024, DOI:10.32604/cmc.2024.048637

    Abstract With the rapid development of mobile communication and the Internet, previous web anomaly detection and identification models were built on security experts' empirical knowledge and attack features. Although this approach can achieve high detection performance, it requires substantial human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still have the problem of semantic expression singularity, as…
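
    A toy sketch of feature engineering driven by the data itself, rather than a hand-maintained attack-feature library, is shown below. The character n-gram TF-IDF representation and the IsolationForest detector are generic substitutes chosen for illustration, not the FusionNN model.

        # Hypothetical sketch: derive features automatically from raw request
        # strings (character n-grams as a crude semantic representation) instead
        # of maintaining a hand-built attack-feature library, then flag outliers.
        from sklearn.ensemble import IsolationForest
        from sklearn.feature_extraction.text import TfidfVectorizer

        requests = [
            "GET /index.html?id=12",
            "GET /product?page=3",
            "GET /index.php?id=1' OR '1'='1",     # SQL-injection-like request
            "GET /search?q=shoes",
        ]

        vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
        features = vectorizer.fit_transform(requests)

        detector = IsolationForest(contamination=0.25, random_state=0)
        labels = detector.fit_predict(features.toarray())   # -1 marks a suspected anomaly
        print(labels)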

  • Open Access

    ARTICLE

    MAIPFE: An Efficient Multimodal Approach Integrating Pre-Emptive Analysis, Personalized Feature Selection, and Explainable AI

    Moshe Dayan Sirapangi, S. Gopikrishnan*

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2229-2251, 2024, DOI:10.32604/cmc.2024.047438

    Abstract Medical Internet of Things (IoT) devices are becoming increasingly common in healthcare. This has created a pressing need for advanced predictive health modeling strategies that can make effective use of the growing amount of multimodal data to identify potential health risks early and assist individuals in a personalized way. Existing methods, while useful, are limited in predictive accuracy, delay, personalization, and user interpretability, requiring a more comprehensive and efficient approach to harness modern medical IoT devices. MAIPFE is a multimodal approach integrating pre-emptive analysis, personalized feature selection, and explainable AI for real-time health…
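
    The personalized feature selection and explainability steps named above can be pictured with the generic sketch below. The mutual-information selector, random-forest importances, and synthetic data are stand-ins for illustration, not the MAIPFE components.

        # Hypothetical sketch: select the most informative features for an
        # individual's data, train a model on them, and report which features
        # drive the prediction as a crude explainability step.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, mutual_info_classif

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 8))                   # 8 simulated vital-sign features
        y = (X[:, 2] + 0.5 * X[:, 5] > 0).astype(int)   # synthetic "health risk" label

        selector = SelectKBest(mutual_info_classif, k=3).fit(X, y)
        X_selected = selector.transform(X)

        model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_selected, y)
        for idx, importance in zip(selector.get_support(indices=True), model.feature_importances_):
            print(f"feature {idx}: importance {importance:.2f}")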

  • Open Access

    ARTICLE

    Multimodal Social Media Fake News Detection Based on Similarity Inference and Adversarial Networks

    Fangfang Shan*, Huifang Sun, Mengyi Wang

    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 581-605, 2024, DOI:10.32604/cmc.2024.046202

    Abstract As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news that relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers…
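
    A minimal sketch of the text-image pairing is given below: BERT encodes the text, a ResNet encodes the image, and cosine similarity measures cross-modal agreement. The projection sizes and the similarity check are assumptions for illustration; the paper's similarity reasoning and adversarial components are not reproduced here.

        # Hypothetical sketch: encode the news text with BERT and the attached
        # image with a ResNet, then check how well the two modalities agree.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F
        from torchvision.models import resnet50
        from transformers import BertModel, BertTokenizer

        tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        text_encoder = BertModel.from_pretrained("bert-base-uncased").eval()  # downloads weights
        image_encoder = resnet50(weights=None)
        image_encoder.fc = nn.Identity()          # expose the 2048-d pooled features
        image_encoder.eval()

        text_proj = nn.Linear(768, 256)           # project both modalities to a shared space
        image_proj = nn.Linear(2048, 256)

        inputs = tokenizer("Storm floods city centre overnight", return_tensors="pt")
        with torch.no_grad():
            text_feat = text_encoder(**inputs).pooler_output          # (1, 768)
            image_feat = image_encoder(torch.randn(1, 3, 224, 224))   # (1, 2048), dummy image

        similarity = F.cosine_similarity(text_proj(text_feat), image_proj(image_feat))
        print(similarity.item())   # low text-image agreement is one cue for fake content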

  • Open Access

    ARTICLE

    Multimodality Medical Image Fusion Based on Pixel Significance with Edge-Preserving Processing for Clinical Applications

    Bhawna Goyal, Ayush Dogra, Dawa Chyophel Lepcha, Rajesh Singh, Hemant Sharma, Ahmed Alkhayyat, Manob Jyoti Saikia*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 4317-4342, 2024, DOI:10.32604/cmc.2024.047256

    Abstract Multimodal medical image fusion has gained immense popularity in recent years as a robust technology for clinical diagnosis. It fuses multiple images into a single image that retains the significant information of each source, improving image quality and aiding practitioners in diagnosing and treating many diseases. However, recent image fusion techniques face several challenges, including fusion artifacts, algorithmic complexity, and high computing costs. To solve these problems, this study presents a novel medical image fusion strategy that combines the benefits of pixel significance with edge-preserving processing to achieve the best fusion performance. First,…
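
    The combination of pixel significance with edge-preserving processing can be pictured as a two-layer decomposition and recombination, sketched below. The bilateral-filter settings and the max-absolute-detail selection rule are illustrative choices, not the fusion strategy proposed in the paper.

        # Hypothetical sketch: split each source image into a smooth base layer
        # and a detail layer with an edge-preserving (bilateral) filter, keep the
        # detail pixel with the larger magnitude, and recombine the layers.
        import cv2
        import numpy as np

        def fuse(img_a, img_b):
            a, b = img_a.astype(np.float32), img_b.astype(np.float32)
            base_a = cv2.bilateralFilter(a, 9, 50, 50)   # diameter, sigmaColor, sigmaSpace
            base_b = cv2.bilateralFilter(b, 9, 50, 50)
            detail_a, detail_b = a - base_a, b - base_b
            base = 0.5 * (base_a + base_b)               # average the smooth base layers
            detail = np.where(np.abs(detail_a) >= np.abs(detail_b), detail_a, detail_b)
            return np.clip(base + detail, 0, 255).astype(np.uint8)

        ct = np.random.randint(0, 256, (128, 128), dtype=np.uint8)    # dummy CT slice
        mri = np.random.randint(0, 256, (128, 128), dtype=np.uint8)   # dummy MRI slice
        print(fuse(ct, mri).shape)                                     # (128, 128)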

  • Open Access

    ARTICLE

    Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications

    Shuting Ge, Jin Ren*, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3215-3245, 2024, DOI:10.32604/cmc.2023.046746

    Abstract In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunication and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between the speech and text modalities during the encoding phase. Furthermore, it is challenging to model long-range acoustic context dependencies because speech sequences are much longer than their text counterparts, especially for the extended ATCC data. To address these issues,…
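
    A minimal sketch of a dual-tower set-up is shown below: a speech tower and a text tower map their inputs into a shared space so the two modalities can be aligned during encoding. The tiny GRU and embedding towers and the cosine alignment loss are placeholders, not the paper's architecture.

        # Hypothetical sketch: a speech tower and a text tower project their
        # inputs into one space; pulling paired embeddings together encourages
        # speech-text alignment at the encoder stage.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SpeechTower(nn.Module):
            def __init__(self, feat_dim=80, hidden=256):
                super().__init__()
                self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)

            def forward(self, frames):                 # (batch, time, feat_dim) filterbanks
                _, h = self.rnn(frames)
                return F.normalize(h[-1], dim=-1)      # one vector per utterance

        class TextTower(nn.Module):
            def __init__(self, vocab=5000, hidden=256):
                super().__init__()
                self.emb = nn.Embedding(vocab, hidden)
                self.rnn = nn.GRU(hidden, hidden, batch_first=True)

            def forward(self, tokens):                 # (batch, seq_len) token ids
                _, h = self.rnn(self.emb(tokens))
                return F.normalize(h[-1], dim=-1)

        speech, text = SpeechTower(), TextTower()
        s = speech(torch.randn(4, 300, 80))            # 4 utterances, 300 frames each
        t = text(torch.randint(0, 5000, (4, 20)))      # matching transcripts
        align_loss = 1 - F.cosine_similarity(s, t).mean()   # pull paired embeddings together
        print(align_loss.item())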

Displaying results 1-10 of 68 (page 1).