Search Results (68)
  • Open Access

    ARTICLE

    Fine-Grained Ship Recognition Based on Visible and Near-Infrared Multimodal Remote Sensing Images: Dataset, Methodology and Evaluation

    Shiwen Song, Rui Zhang, Min Hu*, Feiyao Huang

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5243-5271, 2024, DOI:10.32604/cmc.2024.050879

    Abstract Fine-grained recognition of ships in remote sensing images is crucial to safeguarding maritime rights and interests and maintaining national security. With the emergence of massive high-resolution multi-modality images, using multi-modality images for fine-grained recognition has become a promising technology. Fine-grained recognition of multi-modality images places higher demands on dataset samples. The key problem is how to extract and fuse the complementary features of multi-modality images to obtain more discriminative fusion features. The attention mechanism helps the model pinpoint the key information in the image, resulting in a…
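
    As a rough illustration of the attention-based fusion idea in the abstract above, the sketch below lets visible and near-infrared feature tokens attend to each other and concatenates the pooled results for classification. It is a minimal sketch only: the PyTorch layers, feature sizes, and placeholder class count are assumptions for illustration, not the dataset or model from the paper.

        # Hypothetical sketch: attention-based fusion of visible (VIS) and
        # near-infrared (NIR) feature tokens; all dimensions are illustrative.
        import torch
        import torch.nn as nn

        class CrossModalFusion(nn.Module):
            def __init__(self, dim=256, heads=4, num_classes=42):  # class count is a placeholder
                super().__init__()
                # Each modality attends to the other so it can pick out
                # complementary information before the features are fused.
                self.vis_to_nir = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.nir_to_vis = nn.MultiheadAttention(dim, heads, batch_first=True)
                self.classifier = nn.Linear(2 * dim, num_classes)

            def forward(self, vis_tokens, nir_tokens):
                # vis_tokens, nir_tokens: (batch, num_patches, dim)
                vis_enh, _ = self.vis_to_nir(vis_tokens, nir_tokens, nir_tokens)
                nir_enh, _ = self.nir_to_vis(nir_tokens, vis_tokens, vis_tokens)
                # Pool each enhanced sequence and concatenate for classification.
                fused = torch.cat([vis_enh.mean(dim=1), nir_enh.mean(dim=1)], dim=-1)
                return self.classifier(fused)

        model = CrossModalFusion()
        logits = model(torch.randn(2, 49, 256), torch.randn(2, 49, 256))
        print(logits.shape)   # torch.Size([2, 42])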

  • Open Access

    ARTICLE

    An Immune-Inspired Approach with Interval Allocation in Solving Multimodal Multi-Objective Optimization Problems with Local Pareto Sets

    Weiwei Zhang, Jiaqiang Li, Chao Wang, Meng Li, Zhi Rao*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 4237-4257, 2024, DOI:10.32604/cmc.2024.050430

    Abstract In practical engineering, multi-objective optimization often encounters situations where multiple Pareto sets (PS) in the decision space correspond to the same Pareto front (PF) in the objective space, known as Multi-Modal Multi-Objective Optimization Problems (MMOP). Locating multiple equivalent global PSs is a significant challenge in real-world applications, especially when local PSs also exist; effectively identifying and locating both global and local PSs is therefore a central difficulty. To tackle this issue, we introduce an immune-inspired reproduction strategy designed to produce more offspring in less crowded, promising regions and regulate the number of offspring in areas…
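
    The crowding-based idea in the abstract above can be pictured with a small sketch: solutions in sparser regions of the decision space receive a larger share of the offspring budget. The k-nearest-neighbour crowding measure and the proportional allocation rule below are illustrative assumptions, not the paper's exact immune-inspired operators.

        # Hypothetical sketch: give solutions in less crowded regions of the
        # decision space a larger share of the offspring budget (immune-style
        # cloning); the crowding measure and allocation rule are illustrative.
        import numpy as np

        def crowding(pop, k=3):
            # Mean distance to the k nearest neighbours in decision space.
            dist = np.linalg.norm(pop[:, None, :] - pop[None, :, :], axis=-1)
            dist.sort(axis=1)
            return dist[:, 1:k + 1].mean(axis=1)         # skip the zero self-distance

        def allocate_offspring(pop, total=20):
            sparseness = crowding(pop)
            share = sparseness / sparseness.sum()        # sparser => larger share
            counts = np.floor(share * total).astype(int)
            counts[np.argmax(share)] += total - counts.sum()   # absorb rounding error
            return counts

        rng = np.random.default_rng(0)
        population = rng.random((10, 2))                 # 10 solutions, 2 decision variables
        print(allocate_offspring(population))            # offspring assigned to each solution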

  • Open Access

    ARTICLE

    Enhancing Cross-Lingual Image Description: A Multimodal Approach for Semantic Relevance and Stylistic Alignment

    Emran Al-Buraihy, Dan Wang*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3913-3938, 2024, DOI:10.32604/cmc.2024.048104

    Abstract Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, showcase remarkable performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language…

  • Open Access

    ARTICLE

    Multimodal Deep Neural Networks for Digitized Document Classification

    Aigerim Baimakhanova*, Ainur Zhumadillayeva, Bigul Mukhametzhanova, Natalya Glazyrina, Rozamgul Niyazova, Nurseit Zhunissov, Aizhan Sambetbayeva

    Computer Systems Science and Engineering, Vol.48, No.3, pp. 793-811, 2024, DOI:10.32604/csse.2024.043273

    Abstract As digital technologies have advanced, the number of paper documents converted into digital form has increased exponentially. In response to the urgent need to categorize this growing volume of digitized documents, real-time classification of digitized documents was identified as the primary goal of our study. Paper classification is the first stage in automating document control and efficient knowledge discovery with little or no human involvement. Artificial intelligence methods such as deep learning are now combined with segmentation to study and interpret those traits, which were not…

  • Open Access

    ARTICLE

    Cross-Modal Consistency with Aesthetic Similarity for Multimodal False Information Detection

    Weijian Fan*, Ziwei Shi

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2723-2741, 2024, DOI:10.32604/cmc.2024.050344

    Abstract With the explosive growth of false information on social media platforms, automatic detection of multimodal false information has received increasing attention. Recent research has contributed significantly to multimodal information exchange and fusion, with many methods attempting to integrate unimodal features into multimodal news representations. However, these methods have yet to fully explore the hierarchical and complex semantic correlations between contents of different modalities, which severely limits their performance in detecting multimodal false information. This work proposes a two-stage framework for multimodal false information detection, called ASMFD, which is based on image aesthetic similarity to segment and…
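
    One way to picture the cross-modal consistency idea mentioned above is to project text and image features into a shared space and score their agreement. The sketch below uses placeholder encoders and a cosine-similarity score; it is a generic stand-in, not the ASMFD architecture.

        # Hypothetical sketch: project text and image features into a shared
        # space and use cosine similarity as a cross-modal consistency score;
        # a low score is one possible cue for false information.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ConsistencyScorer(nn.Module):
            def __init__(self, text_dim=768, image_dim=2048, shared_dim=256):
                super().__init__()
                self.text_proj = nn.Linear(text_dim, shared_dim)
                self.image_proj = nn.Linear(image_dim, shared_dim)

            def forward(self, text_feat, image_feat):
                t = F.normalize(self.text_proj(text_feat), dim=-1)
                v = F.normalize(self.image_proj(image_feat), dim=-1)
                return (t * v).sum(dim=-1)        # cosine similarity in [-1, 1]

        scorer = ConsistencyScorer()
        scores = scorer(torch.randn(4, 768), torch.randn(4, 2048))
        print(scores.shape)                       # torch.Size([4]), one score per post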

  • Open Access

    ARTICLE

    FusionNN: A Semantic Feature Fusion Model Based on Multimodal for Web Anomaly Detection

    Li Wang*, Mingshan Xia*, Hao Hu, Jianfang Li, Fengyao Hou, Gang Chen

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2991-3006, 2024, DOI:10.32604/cmc.2024.048637

    Abstract With the rapid development of mobile communication and the Internet, previous web anomaly detection and identification models were built on security experts' empirical knowledge and attack features. Although this approach can achieve high detection performance, it requires substantial human labor and resources to maintain the feature library. In contrast, semantic feature engineering can dynamically discover new semantic features and optimize feature selection by automatically analyzing the semantic information contained in the data itself, thus reducing dependence on prior knowledge. However, current semantic features still have the problem of semantic expression singularity, as…
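
    A toy sketch of feature engineering driven by the data itself, rather than a hand-maintained attack-feature library, is shown below. The character n-gram TF-IDF representation and the IsolationForest detector are generic substitutes chosen for illustration, not the FusionNN model.

        # Hypothetical sketch: derive features automatically from raw request
        # strings (character n-grams as a crude semantic representation) instead
        # of maintaining a hand-built attack-feature library, then flag outliers.
        from sklearn.ensemble import IsolationForest
        from sklearn.feature_extraction.text import TfidfVectorizer

        requests = [
            "GET /index.html?id=12",
            "GET /product?page=3",
            "GET /index.php?id=1' OR '1'='1",     # SQL-injection-like request
            "GET /search?q=shoes",
        ]

        vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
        features = vectorizer.fit_transform(requests)

        detector = IsolationForest(contamination=0.25, random_state=0)
        labels = detector.fit_predict(features.toarray())   # -1 marks a suspected anomaly
        print(labels)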

  • Open Access

    ARTICLE

    MAIPFE: An Efficient Multimodal Approach Integrating Pre-Emptive Analysis, Personalized Feature Selection, and Explainable AI

    Moshe Dayan Sirapangi, S. Gopikrishnan*

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2229-2251, 2024, DOI:10.32604/cmc.2024.047438

    Abstract Medical Internet of Things (IoT) devices are becoming increasingly common in healthcare. This has created a pressing need for advanced predictive health modeling strategies that can make effective use of the growing amount of multimodal data to identify potential health risks early and assist individuals in a personalized way. Existing methods, while useful, are limited in predictive accuracy, delay, personalization, and user interpretability, requiring a more comprehensive and efficient approach to harness modern medical IoT devices. MAIPFE is a multimodal approach integrating pre-emptive analysis, personalized feature selection, and explainable AI for real-time health…
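
    The personalized feature selection and explainability steps named above can be pictured with the generic sketch below. The mutual-information selector, random-forest importances, and synthetic data are stand-ins for illustration, not the MAIPFE components.

        # Hypothetical sketch: select the most informative features for an
        # individual's data, train a model on them, and report which features
        # drive the prediction as a crude explainability step.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.feature_selection import SelectKBest, mutual_info_classif

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 8))                   # 8 simulated vital-sign features
        y = (X[:, 2] + 0.5 * X[:, 5] > 0).astype(int)   # synthetic "health risk" label

        selector = SelectKBest(mutual_info_classif, k=3).fit(X, y)
        X_selected = selector.transform(X)

        model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_selected, y)
        for idx, importance in zip(selector.get_support(indices=True), model.feature_importances_):
            print(f"feature {idx}: importance {importance:.2f}")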

  • Open Access

    ARTICLE

    Multimodal Social Media Fake News Detection Based on Similarity Inference and Adversarial Networks

    Fangfang Shan*, Huifang Sun, Mengyi Wang

    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 581-605, 2024, DOI:10.32604/cmc.2024.046202

    Abstract As social networks become increasingly complex, contemporary fake news often includes textual descriptions of events accompanied by corresponding images or videos. Fake news in multiple modalities is more likely to create a misleading perception among users. While early research primarily focused on text-based features for fake news detection, there has been relatively limited exploration of learning shared representations in multimodal (text and visual) contexts. To address these limitations, this paper introduces a multimodal model for detecting fake news that relies on similarity reasoning and adversarial networks. The model employs Bidirectional Encoder Representations from Transformers…
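
    A minimal sketch of the text-image pairing is given below: BERT encodes the text, a ResNet encodes the image, and cosine similarity measures cross-modal agreement. The projection sizes and the similarity check are assumptions for illustration; the paper's similarity reasoning and adversarial components are not reproduced here.

        # Hypothetical sketch: encode the news text with BERT and the attached
        # image with a ResNet, then check how well the two modalities agree.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F
        from torchvision.models import resnet50
        from transformers import BertModel, BertTokenizer

        tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        text_encoder = BertModel.from_pretrained("bert-base-uncased").eval()  # downloads weights
        image_encoder = resnet50(weights=None)
        image_encoder.fc = nn.Identity()          # expose the 2048-d pooled features
        image_encoder.eval()

        text_proj = nn.Linear(768, 256)           # project both modalities to a shared space
        image_proj = nn.Linear(2048, 256)

        inputs = tokenizer("Storm floods city centre overnight", return_tensors="pt")
        with torch.no_grad():
            text_feat = text_encoder(**inputs).pooler_output          # (1, 768)
            image_feat = image_encoder(torch.randn(1, 3, 224, 224))   # (1, 2048), dummy image

        similarity = F.cosine_similarity(text_proj(text_feat), image_proj(image_feat))
        print(similarity.item())   # low text-image agreement is one cue for fake content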

  • Open Access

    ARTICLE

    Multimodality Medical Image Fusion Based on Pixel Significance with Edge-Preserving Processing for Clinical Applications

    Bhawna Goyal, Ayush Dogra, Dawa Chyophel Lepcha, Rajesh Singh, Hemant Sharma, Ahmed Alkhayyat, Manob Jyoti Saikia*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 4317-4342, 2024, DOI:10.32604/cmc.2024.047256

    Abstract Multimodal medical image fusion has gained immense popularity in recent years as a robust technology for clinical diagnosis. It fuses multiple images into a single image that retains the significant information of each source, improving image quality and aiding practitioners in diagnosing and treating many diseases. However, recent image fusion techniques face several challenges, including fusion artifacts, algorithmic complexity, and high computing costs. To solve these problems, this study presents a novel medical image fusion strategy that combines the benefits of pixel significance with edge-preserving processing to achieve the best fusion performance. First,…
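
    The combination of pixel significance with edge-preserving processing can be pictured as a two-layer decomposition and recombination, sketched below. The bilateral-filter settings and the max-absolute-detail selection rule are illustrative choices, not the fusion strategy proposed in the paper.

        # Hypothetical sketch: split each source image into a smooth base layer
        # and a detail layer with an edge-preserving (bilateral) filter, keep the
        # detail pixel with the larger magnitude, and recombine the layers.
        import cv2
        import numpy as np

        def fuse(img_a, img_b):
            a, b = img_a.astype(np.float32), img_b.astype(np.float32)
            base_a = cv2.bilateralFilter(a, 9, 50, 50)   # diameter, sigmaColor, sigmaSpace
            base_b = cv2.bilateralFilter(b, 9, 50, 50)
            detail_a, detail_b = a - base_a, b - base_b
            base = 0.5 * (base_a + base_b)               # average the smooth base layers
            detail = np.where(np.abs(detail_a) >= np.abs(detail_b), detail_a, detail_b)
            return np.clip(base + detail, 0, 255).astype(np.uint8)

        ct = np.random.randint(0, 256, (128, 128), dtype=np.uint8)    # dummy CT slice
        mri = np.random.randint(0, 256, (128, 128), dtype=np.uint8)   # dummy MRI slice
        print(fuse(ct, mri).shape)                                     # (128, 128)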

  • Open Access

    ARTICLE

    Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications

    Shuting Ge, Jin Ren*, Yihua Shi, Yujun Zhang, Shunzhi Yang, Jinfeng Yang

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3215-3245, 2024, DOI:10.32604/cmc.2023.046746

    Abstract In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunication and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between the speech and text modalities during the encoding phase. Furthermore, it is challenging to model long-range acoustic context dependencies because speech sequences are much longer than their text counterparts, especially for the extended ATCC data. To address these issues,…
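
    A minimal sketch of a dual-tower set-up is shown below: a speech tower and a text tower map their inputs into a shared space so the two modalities can be aligned during encoding. The tiny GRU and embedding towers and the cosine alignment loss are placeholders, not the paper's architecture.

        # Hypothetical sketch: a speech tower and a text tower project their
        # inputs into one space; pulling paired embeddings together encourages
        # speech-text alignment at the encoder stage.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class SpeechTower(nn.Module):
            def __init__(self, feat_dim=80, hidden=256):
                super().__init__()
                self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)

            def forward(self, frames):                 # (batch, time, feat_dim) filterbanks
                _, h = self.rnn(frames)
                return F.normalize(h[-1], dim=-1)      # one vector per utterance

        class TextTower(nn.Module):
            def __init__(self, vocab=5000, hidden=256):
                super().__init__()
                self.emb = nn.Embedding(vocab, hidden)
                self.rnn = nn.GRU(hidden, hidden, batch_first=True)

            def forward(self, tokens):                 # (batch, seq_len) token ids
                _, h = self.rnn(self.emb(tokens))
                return F.normalize(h[-1], dim=-1)

        speech, text = SpeechTower(), TextTower()
        s = speech(torch.randn(4, 300, 80))            # 4 utterances, 300 frames each
        t = text(torch.randint(0, 5000, (4, 20)))      # matching transcripts
        align_loss = 1 - F.cosine_similarity(s, t).mean()   # pull paired embeddings together
        print(align_loss.item())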

Displaying results 1-10 of 68 (page 1).