Search Results (3)
  • Open Access

    ARTICLE

    A Concise and Varied Visual Features-Based Image Captioning Model with Visual Selection

    Alaa Thobhani1,*, Beiji Zou1, Xiaoyan Kui1, Amr Abdussalam2, Muhammad Asim3, Naveed Ahmed4, Mohammed Ali Alshara4,5

    CMC-Computers, Materials & Continua, Vol.81, No.2, pp. 2873-2894, 2024, DOI:10.32604/cmc.2024.054841 - 18 November 2024

    Abstract Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, providing image captioning models with the capability of selecting the most relevant visual features from the input image and attending to them can significantly improve the utilization of these features. Consequently, this leads to enhanced captioning network performance. In light…

  • Open Access

    ARTICLE

    PF-YOLOv4-Tiny: Towards Infrared Target Detection on Embedded Platform

    Wenbo Li, Qi Wang*, Shang Gao

    Intelligent Automation & Soft Computing, Vol.37, No.1, pp. 921-938, 2023, DOI:10.32604/iasc.2023.038257 - 29 April 2023

    Abstract Infrared target detection models increasingly need to be deployed on embedded platforms, which calls for models with lower memory consumption and better real-time performance while still considering accuracy. To address these challenges, we propose PF-YOLOv4-Tiny, a modified You Only Look Once (YOLO) algorithm. The algorithm incorporates spatial pyramid pooling (SPP) and squeeze-and-excitation (SE) visual attention modules to enhance target localization. PANet-based feature pyramid networks (P-FPN) are proposed to transfer semantic and location information simultaneously and so improve detection accuracy. To lighten the network, the standard convolutions other than the backbone…

  • Open Access

    ARTICLE

    Effective Video Summarization Approach Based on Visual Attention

    Hilal Ahmad1, Habib Ullah Khan2, Sikandar Ali3,*, Syed Ijaz Ur Rahman1, Fazli Wahid3, Hizbullah Khattak4

    CMC-Computers, Materials & Continua, Vol.71, No.1, pp. 1427-1442, 2022, DOI:10.32604/cmc.2022.021158 - 03 November 2021

    Abstract Video summarization is applied to reduce redundancy and develop a concise representation of the key frames in a video. More recently, video summaries have been produced through visual attention modeling. In these schemes, the frames that stand out visually are extracted as key frames based on theories of human attention modeling. Schemes for modeling visual attention have proven effective for video summarization. Nevertheless, their high computational cost restricts their usability in everyday situations. In this context, we propose a method based on the KFE (key frame extraction) technique, which is recommended…
