
Metaheuristics, Soft Computing, and Machine Learning in Image Processing and Computer Vision

Submission Deadline: 31 May 2024 (closed)

Guest Editors

Dr. Diego Oliva, Universidad de Guadalajara, Mexico
Dr. Saul Zapotecas-Martinez, Instituto Nacional de Astrofísica, Óptica y Electrónica, Tonantzintla, Puebla, Mexico
Dr. Seyed Jalaleddin Mousavirad, Mid Sweden University, Sweden

Summary

The use of images and videos has increased widely in the last few years. Cameras are an extension of the human sense of vision. However, analyzing scenes can be time-consuming, especially for some specific tasks. Intelligent algorithms can help analyze the scenes in images or videos. Computational tools such as metaheuristics, soft computing, and machine learning are employed to overcome the drawbacks of classical image processing tools. This Special Issue is a collection of implementations and hybridizations of machine learning and metaheuristics for solving complex problems in image processing and computer vision. Recent advances in these areas are included. Literature reviews and surveys are also included to study the importance of the related areas and applications extensively.


Keywords

Metaheuristic Algorithms, Evolutionary Computation, Single Objective Methods, Multi-Objective Optimization, Image Processing, Computer Vision, Machine Learning

Published Papers


  • Open Access

    ARTICLE

    Image Captioning Using Multimodal Deep Learning Approach

    Rihem Farkh, Ghislain Oudinet, Yasser Foued
    CMC-Computers, Materials & Continua, DOI:10.32604/cmc.2024.053245
    (This article belongs to the Special Issue: Metaheuristics, Soft Computing, and Machine Learning in Image Processing and Computer Vision)
    Abstract The process of generating descriptive captions for images has witnessed significant advancements in recent years, owing to the progress in deep learning techniques. Despite these advancements, the task of thoroughly grasping image content and producing coherent, contextually relevant captions continues to pose a substantial challenge. In this paper, we introduce a novel multimodal method for image captioning by integrating three powerful deep learning architectures: YOLOv8 (You Only Look Once) for robust object detection, EfficientNetB7 for efficient feature extraction, and Transformers for effective sequence modeling. Our proposed model combines the strengths of YOLOv8 in detecting objects,…

  • Open Access

    ARTICLE

    Sports Events Recognition Using Multi Features and Deep Belief Network

    Bayan Alabdullah, Muhammad Tayyab, Yahay AlQahtani, Naif Al Mudawi, Asaad Algarni, Ahmad Jalal, Jeongmin Park
    CMC-Computers, Materials & Continua, Vol.81, No.1, pp. 309-326, 2024, DOI:10.32604/cmc.2024.053538
    Abstract In the modern era of a growing population, it is arduous for humans to monitor every aspect of sports, events occurring around us, and scenarios or conditions. The recognition of different types of sports and events has increasingly incorporated machine learning and artificial intelligence. This research focuses on detecting and recognizing events in sequential photos characterized by several factors, including the size, location, and position of people’s body parts in those pictures, and the influence around those people. Common approaches utilized here are feature descriptors such as MSER (Maximally Stable Extremal Regions),…

  • Open Access

    ARTICLE

    Ghost-YOLO v8: An Attention-Guided Enhanced Small Target Detection Algorithm for Floating Litter on Water Surfaces

    Zhongmin Huangfu, Shuqing Li, Luoheng Yan
    CMC-Computers, Materials & Continua, Vol.80, No.3, pp. 3713-3731, 2024, DOI:10.32604/cmc.2024.054188
    Abstract Addressing the challenges in detecting surface floating litter in artificial lakes, including complex environments, uneven illumination, and susceptibility to noise and weather, this paper proposes an efficient and lightweight Ghost-YOLO (You Only Look Once) v8 algorithm. The algorithm integrates advanced attention mechanisms and a small-target detection head to significantly enhance detection performance and efficiency. Firstly, an SE (Squeeze-and-Excitation) mechanism is incorporated into the backbone network to fortify the extraction of resilient features and precise target localization. This mechanism models feature channel dependencies, enabling adaptive adjustment of channel importance, thereby improving recognition of floating litter targets.…

  • Open Access

    ARTICLE

    Vehicle Head and Tail Recognition Algorithm for Lightweight DCDSNet

    Chao Wang, Kaijie Zhang, Xiaoyong Yu, Dejun Li, Wei Xie, Xinqiao Wang
    CMC-Computers, Materials & Continua, Vol.80, No.3, pp. 4451-4473, 2024, DOI:10.32604/cmc.2024.051764
    Abstract In the model of the vehicle recognition algorithm implemented by the convolutional neural network, the model needs to compute and store a lot of parameters. Too many parameters occupy a lot of computational resources, making it difficult to run on computers with poor performance. Therefore, obtaining more efficient feature information from the target image or video with better accuracy on computers with limited arithmetic power becomes the main goal of this research. In this paper, a lightweight, densely connected and deeply separable convolutional network (DCDSNet) algorithm is proposed to achieve this goal. Visual Geometry Group (VGG)…

  • Open Access

    ARTICLE

    GDMNet: A Unified Multi-Task Network for Panoptic Driving Perception

    Yunxiang Liu, Haili Ma, Jianlin Zhu, Qiangbo Zhang
    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 2963-2978, 2024, DOI:10.32604/cmc.2024.053710
    Abstract To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving, capable of performing drivable area segmentation, lane detection, and traffic object detection. Firstly, in the encoding stage, features are extracted, and the Generalized Efficient Layer Aggregation Network (GELAN) is utilized to enhance feature extraction and gradient flow. Secondly, in the decoding stage, specialized detection heads are designed; the drivable area segmentation head employs DySample to expand feature maps, the lane detection head merges early-stage features and processes the output through the Focal Modulation…

  • Open Access

    ARTICLE

    ED-Ged: Nighttime Image Semantic Segmentation Based on Enhanced Detail and Bidirectional Guidance

    Xiaoli Yuan, Jianxun Zhang, Xuejie Wang, Zhuhong Chu
    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 2443-2462, 2024, DOI:10.32604/cmc.2024.052285
    Abstract Semantic segmentation of driving scene images is crucial for autonomous driving. While deep learning technology has significantly improved daytime image semantic segmentation, nighttime images pose challenges due to factors like poor lighting and overexposure, making it difficult to recognize small objects. To address this, we propose an Image Adaptive Enhancement (IAEN) module comprising a parameter predictor (Edip), multiple image processing filters (Mdif), and a Detail Processing Module (DPM). Edip combines image processing filters to predict parameters like exposure and hue, optimizing image quality. We adopt a novel image encoder to enhance parameter prediction accuracy by…

  • Open Access

    ARTICLE

    Attention Guided Food Recognition via Multi-Stage Local Feature Fusion

    Gonghui Deng, Dunzhi Wu, Weizhen Chen
    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 1985-2003, 2024, DOI:10.32604/cmc.2024.052174
    Abstract The task of food image recognition, a nuanced subset of fine-grained image recognition, grapples with substantial intra-class variation and minimal inter-class differences. These challenges are compounded by the irregular and multi-scale nature of food images. Addressing these complexities, our study introduces an advanced model that leverages multiple attention mechanisms and multi-stage local fusion, grounded in the ConvNeXt architecture. Our model employs hybrid attention (HA) mechanisms to pinpoint critical discriminative regions within images, substantially mitigating the influence of background noise. Furthermore, it introduces a multi-stage local fusion (MSLF) module, fostering long-distance dependencies between feature maps at…

  • Open Access

    ARTICLE

    A Hybrid Feature Fusion Traffic Sign Detection Algorithm Based on YOLOv7

    Bingyi Ren, Juwei Zhang, Tong Wang
    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 1425-1440, 2024, DOI:10.32604/cmc.2024.052667
    Abstract Autonomous driving technology has entered a period of rapid development, and traffic sign detection is one of its important tasks. Existing target detection networks have difficulty adapting to scenarios where target sizes are seriously imbalanced, and traffic sign targets are small and have unclear features, which makes detection more difficult. Therefore, we propose a Hybrid Feature Fusion Traffic Sign detection algorithm based on YOLOv7 (HFFT-YOLO). First, a self-attention mechanism is incorporated at the end of the backbone network to calculate feature interactions within scales. Second, the cross-scale fusion part of the neck introduces a…

  • Open Access

    ARTICLE

    Monocular Distance Estimated Based on PTZ Camera

    Qirui Zhong, Xiaogang Cheng, Yuxin Song, Han Wang
    CMC-Computers, Materials & Continua, Vol.79, No.2, pp. 3417-3433, 2024, DOI:10.32604/cmc.2024.049992
    Abstract This paper introduces an intelligent computational approach for extracting salient objects from images and estimating their distance information with PTZ (Pan-Tilt-Zoom) cameras. PTZ cameras have found wide applications in numerous public places, serving various purposes such as public security management, natural disaster monitoring, and crisis alarms, particularly with the rapid development of Artificial Intelligence and global infrastructural projects. In this paper, we combine Gauss optical principles with the PTZ camera’s capabilities of horizontal and pitch rotation, as well as optical zoom, to estimate the distance of the object. We present a novel monocular object distance…

  • Open Access

    ARTICLE

    Braille Character Segmentation Algorithm Based on Gaussian Diffusion

    Zezheng Meng, Zefeng Cai, Jie Feng, Hanjie Ma, Haixiang Zhang, Shaohua Li
    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 1481-1496, 2024, DOI:10.32604/cmc.2024.048002
    Abstract Optical braille recognition methods typically employ existing target detection models or segmentation models for the direct detection and recognition of braille characters in original braille images. However, these methods need improvement in accuracy and generalizability, especially in densely dotted braille image environments. This paper presents a two-stage braille recognition framework. The first stage is a braille dot detection algorithm based on Gaussian diffusion, targeting Gaussian heatmaps generated by the convex dots in braille images. This is applied to the detection of convex dots in double-sided braille, achieving high accuracy in determining the central coordinates of…

  • Open Access

    ARTICLE

    Hybrid Optimization Algorithm for Handwritten Document Enhancement

    Shu-Chuan Chu, Xiaomeng Yang, Li Zhang, Václav Snášel, Jeng-Shyang Pan
    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3763-3786, 2024, DOI:10.32604/cmc.2024.048594
    Abstract The Gannet Optimization Algorithm (GOA) and the Whale Optimization Algorithm (WOA) demonstrate strong performance; however, there remains room for improvement in convergence and practical applications. This study introduces a hybrid optimization algorithm, named the adaptive inertia weight whale optimization algorithm and gannet optimization algorithm (AIWGOA), which addresses challenges in enhancing handwritten documents. The hybrid strategy integrates the strengths of both algorithms, significantly enhancing their capabilities, whereas the adaptive parameter strategy mitigates the need for manual parameter setting. By amalgamating the hybrid strategy and parameter-adaptive approach, the Gannet Optimization Algorithm was refined to yield the AIWGOA.
