Search Results (170)
  • Open Access

    REVIEW

    A Comprehensive Survey of Recent Transformers in Image, Video and Diffusion Models

    Dinh Phu Cuong Le1,2, Dong Wang1, Viet-Tuan Le3,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 37-60, 2024, DOI:10.32604/cmc.2024.050790

    Abstract Transformer models have emerged as dominant networks for various tasks in computer vision compared to Convolutional Neural Networks (CNNs). The transformers demonstrate the ability to model long-range dependencies by utilizing a self-attention mechanism. This study aims to provide a comprehensive survey of recent transformer-based approaches in image and video applications, as well as diffusion models. We begin by discussing existing surveys of vision transformers and comparing them to this work. Then, we review the main components of a vanilla transformer network, including the self-attention mechanism, feed-forward network, position encoding, etc. In the main part of…

  • Open Access

    ARTICLE

    A Dual Domain Robust Reversible Watermarking Algorithm for Frame Grouping Videos Using Scene Smoothness

    Yucheng Liang1,2,*, Ke Niu1,2,*, Yingnan Zhang1,2, Yifei Meng1,2

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 5143-5174, 2024, DOI:10.32604/cmc.2024.051364

    Abstract The proposed robust reversible watermarking algorithm addresses the compatibility challenges between robustness and reversibility in existing video watermarking techniques by leveraging scene smoothness for frame grouping videos. Grounded in the H.264 video coding standard, the algorithm first employs traditional robust watermark stitching technology to embed watermark information in the low-frequency coefficient domain of the U channel. Subsequently, it utilizes histogram migration techniques in the high-frequency coefficient domain of the U channel to embed auxiliary information, enabling successful watermark extraction and lossless recovery of the original video content. Experimental results demonstrate the algorithm’s strong imperceptibility, with…

  • Open Access

    ARTICLE

    A Unified Model Fusing Region of Interest Detection and Super Resolution for Video Compression

    Xinkun Tang1,2, Feng Ouyang1,2, Ying Xu2,*, Ligu Zhu1, Bo Peng1

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3955-3975, 2024, DOI:10.32604/cmc.2024.049057

    Abstract High-resolution video transmission requires a substantial amount of bandwidth. In this paper, we present a novel video processing methodology that innovatively integrates region of interest (ROI) identification and super-resolution enhancement. Our method commences with the accurate detection of ROIs within video sequences, followed by the application of advanced super-resolution techniques to these areas, thereby preserving visual quality while economizing on data transmission. To validate and benchmark our approach, we have curated a new gaming dataset tailored to evaluate the effectiveness of ROI-based super-resolution in practical applications. The proposed model architecture leverages the transformer network framework,…

  • Open Access

    ARTICLE

    Workout Action Recognition in Video Streams Using an Attention Driven Residual DC-GRU Network

    Arnab Dey1,*, Samit Biswas1, Dac-Nhuong Le2

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 3067-3087, 2024, DOI:10.32604/cmc.2024.049512

    Abstract Regular exercise is a crucial aspect of daily life, as it enables individuals to stay physically active, lowers the likelihood of developing illnesses, and enhances life expectancy. The recognition of workout actions in video streams holds significant importance in computer vision research, as it aims to enhance exercise adherence, enable instant recognition, advance fitness tracking technologies, and optimize fitness routines. However, existing action datasets often lack diversity and specificity for workout actions, hindering the development of accurate recognition models. To address this gap, the Workout Action Video dataset (WAVd) has been introduced as a significant…

  • Open Access

    ARTICLE

    Customized Convolutional Neural Network for Accurate Detection of Deep Fake Images in Video Collections

    Dmitry Gura1,2, Bo Dong3,*, Duaa Mehiar4, Nidal Al Said5

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 1995-2014, 2024, DOI:10.32604/cmc.2024.048238

    Abstract The motivation for this study is that the quality of deep fakes is constantly improving, which leads to the need to develop new methods for their detection. The proposed Customized Convolutional Neural Network method involves extracting structured data from video frames using facial landmark detection, which is then used as input to the CNN. The customized Convolutional Neural Network method is a data-augmentation-based CNN model to generate ‘fake data’ or ‘fake images’. This study was carried out using Python and its libraries. We used 242 films from the dataset gathered by the Deep Fake…

  • Open Access

    ARTICLE

    A HEVC Video Steganalysis Method Using the Optimality of Motion Vector Prediction

    Jun Li1,2, Minqing Zhang1,2,*, Ke Niu1, Yingnan Zhang1, Xiaoyuan Yang1,2

    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 2085-2103, 2024, DOI:10.32604/cmc.2024.048095

    Abstract Among steganalysis techniques, detection against MV (motion vector) domain-based video steganography in the HEVC (High Efficiency Video Coding) standard remains a challenging issue. To improve detection performance, this paper proposes a steganalysis method that can perfectly detect MV-based steganography in HEVC. Firstly, we define the local optimality of MVP (Motion Vector Prediction) based on the technology of AMVP (Advanced Motion Vector Prediction). Secondly, we show that in HEVC video, message embedding using either the MVP index or the MVD (Motion Vector Difference) may destroy the above optimality of MVP. And then, we define…

  • Open Access

    ARTICLE

    Machine-Learning Based Packet Switching Method for Providing Stable High-Quality Video Streaming in Multi-Stream Transmission

    Yumin Jo1, Jongho Paik2,*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 4153-4176, 2024, DOI:10.32604/cmc.2024.047046

    Abstract Broadcasting gateway equipment generally uses a method of simply switching to a spare input stream when a failure occurs in a main input stream. However, when the transmission environment is unstable, problems such as reduction in the lifespan of equipment due to frequent switching and interruption, delay, and stoppage of services may occur. Therefore, applying a machine learning (ML) method, which can automatically judge and classify network-related service anomalies, and switch multi-input signals without dropping or changing signals by predicting or quickly determining the time of error occurrence for smooth stream switching when…

  • Open Access

    ARTICLE

    A Hybrid Machine Learning Approach for Improvised QoE in Video Services over 5G Wireless Networks

    K. B. Ajeyprasaath, P. Vetrivelan*

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3195-3213, 2024, DOI:10.32604/cmc.2023.046911

    Abstract Video streaming applications have grown considerably in recent years. As a result, they have become one of the most significant contributors to global internet traffic. According to recent studies, the telecommunications industry loses millions of dollars due to poor video Quality of Experience (QoE) for users. Among the standard proposals for standardizing the quality of video streaming over internet service providers (ISPs) is the Mean Opinion Score (MOS). However, accurately determining QoE by MOS is subjective and laborious, and it varies depending on the user. A fully automated data analytics framework is required to…

  • Open Access

    ARTICLE

    Video Summarization Approach Based on Binary Robust Invariant Scalable Keypoints and Bisecting K-Means

    Sameh Zarif1,2,*, Eman Morad1, Khalid Amin1, Abdullah Alharbi3, Wail S. Elkilani4, Shouze Tang5

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3565-3583, 2024, DOI:10.32604/cmc.2024.046185

    Abstract Due to the exponential growth of video data, aided by rapid advancements in multimedia technologies, it has become difficult for the user to obtain information from a large video series. The process of providing an abstract of the entire video that includes the most representative frames is known as static video summarization. This method results in rapid exploration, indexing, and retrieval of massive video libraries. We propose a framework for static video summarization based on Binary Robust Invariant Scalable Keypoints (BRISK) and the bisecting K-means clustering algorithm. The current method effectively recognizes relevant frames using BRISK…

  • Open Access

    REVIEW

    Trends in Event Understanding and Caption Generation/Reconstruction in Dense Video: A Review

    Ekanayake Mudiyanselage Chulabhaya Lankanatha Ekanayake1,2, Abubakar Sulaiman Gezawa3,*, Yunqi Lei1

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 2941-2965, 2024, DOI:10.32604/cmc.2024.046155

    Abstract Video description generates natural language sentences that describe the subject, verb, and objects of the targeted video. Video description has been used to help visually impaired people understand the content. It is also playing an essential role in developing human-robot interaction. Dense video description is more difficult than simple video captioning because of object interactions and event overlapping. Deep learning is changing the shape of computer vision (CV) technologies and natural language processing (NLP). There are hundreds of deep learning models, datasets, and evaluations that can improve the gaps…
