Search Results (72)
  • Open Access

    ARTICLE

    A Recurrent Neural Network for Multimodal Anomaly Detection by Using Spatio-Temporal Audio-Visual Data

    Sameema Tariq1, Ata-Ur- Rehman2,3, Maria Abubakar2, Waseem Iqbal4, Hatoon S. Alsagri5, Yousef A. Alduraywish5, Haya Abdullah A. Alhakbani5,*

    CMC-Computers, Materials & Continua, Vol.81, No.2, pp. 2493-2515, 2024, DOI:10.32604/cmc.2024.055787 - 18 November 2024

    Abstract In video surveillance, anomaly detection requires training machine learning models on spatio-temporal video sequences. However, sometimes the video-only data is not sufficient to accurately detect all the abnormal activities. Therefore, we propose a novel audio-visual spatiotemporal autoencoder specifically designed to detect anomalies for video surveillance by utilizing audio data along with video data. This paper presents a competitive approach to a multi-modal recurrent neural network for anomaly detection that combines separate spatial and temporal autoencoders to leverage both spatial and temporal features in audio-visual data. The proposed model is trained to produce low reconstruction error…

  • Open Access

    ARTICLE

    Efficient User Identity Linkage Based on Aligned Multimodal Features and Temporal Correlation

    Jiaqi Gao1, Kangfeng Zheng1,*, Xiujuan Wang2, Chunhua Wu1, Bin Wu2

    CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 251-270, 2024, DOI:10.32604/cmc.2024.055560 - 15 October 2024

    Abstract User identity linkage (UIL) refers to identifying user accounts belonging to the same identity across different social media platforms. Most of the current research is based on text analysis, which fails to fully explore the rich image resources generated by users, and the existing attempts touch on the multimodal domain, but still face the challenge of semantic differences between text and images. Given this, we investigate the UIL task across different social media platforms based on multimodal user-generated contents (UGCs). We innovatively introduce the efficient user identity linkage via aligned multi-modal features and temporal correlation…

  • Open Access

    REVIEW

    Evolution and Prospects of Foundation Models: From Large Language Models to Large Multimodal Models

    Zheyi Chen1, Liuchang Xu1, Hongting Zheng1, Luyao Chen1, Amr Tolba2,3, Liang Zhao4, Keping Yu5,*, Hailin Feng1,*

    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 1753-1808, 2024, DOI:10.32604/cmc.2024.052618 - 15 August 2024

    Abstract Since the 1950s, when the Turing Test was introduced, there has been notable progress in machine language intelligence. Language modeling, crucial for AI development, has evolved from statistical to neural models over the last two decades. Recently, transformer-based Pre-trained Language Models (PLM) have excelled in Natural Language Processing (NLP) tasks by leveraging large-scale training corpora. Increasing the scale of these models enhances performance significantly, introducing abilities like context learning that smaller models lack. The advancement in Large Language Models, exemplified by the development of ChatGPT, has made significant impacts both academically and industrially, capturing widespread…

  • Open Access

    ARTICLE

    GAN-DIRNet: A Novel Deformable Image Registration Approach for Multimodal Histological Images

    Haiyue Li1, Jing Xie2, Jing Ke3, Ye Yuan1, Xiaoyong Pan1, Hongyi Xin4, Hongbin Shen1,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 487-506, 2024, DOI:10.32604/cmc.2024.049640 - 18 July 2024

    Abstract Multi-modal histological image registration tasks pose significant challenges due to tissue staining operations causing partial loss and folding of tissue. Convolutional neural network (CNN) and generative adversarial network (GAN) are pivotal in medical image registration. However, existing methods often struggle with severe interference and deformation, as seen in histological images of conditions like Cushing’s disease. We argue that the failure of current approaches lies in underutilizing the feature extraction capability of the discriminator in GAN. In this study, we propose a novel multi-modal registration approach GAN-DIRNet based on GAN for deformable histological image registration. To…

  • Open Access

    ARTICLE

    Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation

    Shiwen Song, Rui Zhang, Min Hu*, Feiyao Huang

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5243-5271, 2024, DOI:10.32604/cmc.2024.050879 - 20 June 2024

    Abstract Fine-grained recognition of ships based on remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. Currently, with the emergence of massive high-resolution multi-modality images, the use of multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images imposes higher requirements on the dataset samples. The key to the problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model to pinpoint the key information in the image, resulting in a…

  • Open Access

    ARTICLE

    An Immune-Inspired Approach with Interval Allocation in Solving Multimodal Multi-Objective Optimization Problems with Local Pareto Sets

    Weiwei Zhang1, Jiaqiang Li1, Chao Wang2, Meng Li3, Zhi Rao4,*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 4237-4257, 2024, DOI:10.32604/cmc.2024.050430 - 20 June 2024

    Abstract In practical engineering, multi-objective optimization often encounters situations where multiple Pareto sets (PS) in the decision space correspond to the same Pareto front (PF) in the objective space, known as Multi-Modal Multi-Objective Optimization Problems (MMOP). Locating multiple equivalent global PSs poses a significant challenge in real-world applications, especially considering the existence of local PSs. Effectively identifying and locating both global and local PSs is a major challenge. To tackle this issue, we introduce an immune-inspired reproduction strategy designed to produce more offspring in less crowded, promising regions and regulate the number of offspring in areas…

  • Open Access

    ARTICLE

    Enhancing Cross-Lingual Image Description: A Multimodal Approach for Semantic Relevance and Stylistic Alignment

    Emran Al-Buraihy, Dan Wang*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3913-3938, 2024, DOI:10.32604/cmc.2024.048104 - 20 June 2024

    Abstract Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language…

  • Open Access

    ARTICLE

    Multimodal Deep Neural Networks for Digitized Document Classification

    Aigerim Baimakhanova1,*, Ainur Zhumadillayeva2, Bigul Mukhametzhanova3, Natalya Glazyrina2, Rozamgul Niyazova2, Nurseit Zhunissov1, Aizhan Sambetbayeva4

    Computer Systems Science and Engineering, Vol.48, No.3, pp. 793-811, 2024, DOI:10.32604/csse.2024.043273 - 20 May 2024

    Abstract As digital technologies have advanced more rapidly, the number of paper documents converted into a digital format has increased exponentially. To respond to the urgent need to categorize the growing number of digitized documents, the classification of digitized documents in real time has been identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as Deep Learning are now combined with segmentation to study and interpret those traits, which were not…

  • Open Access

    ARTICLE

    Cross-Modal Consistency with Aesthetic Similarity for Multimodal False Information Detection

    Weijian Fan1,*, Ziwei Shi2

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2723-2741, 2024, DOI:10.32604/cmc.2024.050344 - 15 May 2024

    Abstract With the explosive growth of false information on social media platforms, the automatic detection of multimodal false information has received increasing attention. Recent research has significantly contributed to multimodal information exchange and fusion, with many methods attempting to integrate unimodal features to generate multimodal news representations. However, they have yet to fully explore the hierarchical and complex semantic correlations between different modal contents, severely limiting their performance in detecting multimodal false information. This work proposes a two-stage detection framework for multimodal false information detection, called ASMFD, which is based on image aesthetic similarity to segment and…

  • Open Access

    ARTICLE

    FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection

    Li Wang1,2,3,*, Mingshan Xia1,2,*, Hao Hu1, Jianfang Li1,2, Fengyao Hou1,2, Gang Chen1,2,3

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2991-3006, 2024, DOI:10.32604/cmc.2024.048637 - 15 May 2024

    Abstract With the rapid development of mobile communication and the Internet, previous web anomaly detection and identification models were built on security experts’ empirical knowledge and attack features. Although this approach can achieve high detection performance, it requires substantial human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still have the problem of semantic expression singularity, as…

Displaying 1-10 on page 1 of 72.