Recent Advances in Signal Processing and Computer Vision

Submission Deadline: 30 June 2025

Guest Editors

Dr. Bo Yang

Email: boyang@uestc.edu.cn

Affiliation: School of Automation Engineering, University of Electronic Science and Technology of China, China

Research Interests: Computer Vision, Surgical Robotics, Surgical (Endoscopic) Vision, and Medical Image Processing


Dr. Chao Liu

Email: liu@lirmm.fr

Affiliation: CNRS (French National Center for Scientific Research), France

Research Interests: Visual Augmentation and Reconstruction, 3D Reconstruction of Deformable Surfaces, Haptics in Human-Machine Interaction, Multimodal Sensor-Based Analysis of Manipulation Skills, Surgical Robotics, and Medical Image Processing


Summary

Over the past decade, artificial intelligence technologies led by deep learning have made remarkable progress, especially in signal processing and computer vision. In these domains, deep learning-based methods are being iterated on and commercialized at an unprecedented rate, dramatically changing the way people live, learn, and work.

Since the resurgence of convolutional neural networks in the early 2010s, computer vision has been one of the most dynamic areas for deep learning. Recently, image and video synthesis and generation, 3D vision, vision-language models, and multimodal learning have become research hotspots in the field. The transformative wave of Large Language Models (LLMs) in Natural Language Processing (NLP) has inspired further exploration of their potential in computer vision. At the same time, AI technologies are increasingly moving from the virtual realm to the physical one, integrating with automated devices and machinery to create embodied, intelligent entities capable of physical interaction, known as Embodied Artificial Intelligence (EAI). In this context, AI must work closer to low-level signals and hardware information. In signal processing, AI intersects with traditional control, automation, and robotics technologies, leading to a fusion of these disciplines.

In short, AI has made great strides from initially recognizing the world (traditional vision tasks such as classification and recognition) to simulating the world (generative models) and changing the world (embodied intelligence). This special issue aims to document and advance this trend by focusing on the latest advances in AI technology for signal processing and computer vision. We seek original research articles, reviews, and survey papers that explore the latest developments, challenges, and solutions in these rapidly evolving areas. Potential topics include, but are not limited to, the following:

· Multimodal artificial intelligence

· 2D & 3D generative models

· Image & video segmentation

· 3D reconstruction

· Large models and their applications in signal processing and computer vision

· Visual question answering (VQA), visual reasoning

· Meta-learning, transfer learning, few-shot learning

· Embodied Artificial Intelligence

· Reinforcement Learning

· Medical image processing

· Medical robotics

· Vision Foundation Models

· Efficient and robust AI



Published Papers


  • Open Access

    ARTICLE

    A Dual-Layer Attention Based CAPTCHA Recognition Approach with Guided Visual Attention

    Zaid Derea, Beiji Zou, Xiaoyan Kui, Alaa Thobhani, Amr Abdussalam
    CMES-Computer Modeling in Engineering & Sciences, Vol.142, No.3, pp. 2841-2867, 2025, DOI:10.32604/cmes.2025.059586
    (This article belongs to the Special Issue: Recent Advances in Signal Processing and Computer Vision)
    Abstract: Enhancing website security is crucial to combat malicious activities, and CAPTCHA (Completely Automated Public Turing tests to tell Computers and Humans Apart) has become a key method to distinguish humans from bots. While text-based CAPTCHAs are designed to challenge machines while remaining human-readable, recent advances in deep learning have enabled models to recognize them with remarkable efficiency. In this regard, we propose a novel two-layer visual attention framework for CAPTCHA recognition that builds on traditional attention mechanisms by incorporating Guided Visual Attention (GVA), which sharpens focus on relevant visual features. We have specifically adapted the…

  • Open Access

    REVIEW

    A Survey on Enhancing Image Captioning with Advanced Strategies and Techniques

    Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Sajid Shah, Mohammed ELAffendi
    CMES-Computer Modeling in Engineering & Sciences, Vol.142, No.3, pp. 2247-2280, 2025, DOI:10.32604/cmes.2025.059192
    (This article belongs to the Special Issue: Recent Advances in Signal Processing and Computer Vision)
    Abstract: Image captioning has seen significant research efforts over the last decade. The goal is to generate meaningful semantic sentences that describe visual content depicted in photographs and are syntactically accurate. Many real-world applications rely on image captioning, such as helping people with visual impairments to see their surroundings. To formulate a coherent and relevant textual description, computer vision techniques are utilized to comprehend the visual content within an image, followed by natural language processing methods. Numerous approaches and models have been developed to deal with this multifaceted problem. Several models prove to be state-of-the-art solutions…
