Search Results (12)
  • Open Access

    ARTICLE

    A Concise and Varied Visual Features-Based Image Captioning Model with Visual Selection

    Alaa Thobhani1,*, Beiji Zou1, Xiaoyan Kui1, Amr Abdussalam2, Muhammad Asim3, Naveed Ahmed4, Mohammed Ali Alshara4,5

    CMC-Computers, Materials & Continua, Vol.81, No.2, pp. 2873-2894, 2024, DOI:10.32604/cmc.2024.054841 - 18 November 2024

    Abstract Image captioning has gained increasing attention in recent years. Visual characteristics found in input images play a crucial role in generating high-quality captions. Prior studies have used visual attention mechanisms to dynamically focus on localized regions of the input image, improving the effectiveness of identifying relevant image regions at each step of caption generation. However, providing image captioning models with the capability of selecting the most relevant visual features from the input image and attending to them can significantly improve the utilization of these features. Consequently, this leads to enhanced captioning network performance. In light… More >
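    How such a selection step might look in practice: the sketch below (PyTorch) scores each region feature against the current decoder state, keeps only the top-k regions, and attends over those. The module name, shapes, and the top-k rule are illustrative assumptions, not the authors' exact formulation.

```python
# Illustrative sketch only: top-k visual feature selection before attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveVisualAttention(nn.Module):  # hypothetical name
    def __init__(self, feat_dim, hidden_dim, k=8):
        super().__init__()
        self.score = nn.Linear(feat_dim + hidden_dim, 1)
        self.k = k

    def forward(self, feats, hidden):
        # feats: (B, N, feat_dim) region features; hidden: (B, hidden_dim) decoder state
        h = hidden.unsqueeze(1).expand(-1, feats.size(1), -1)
        logits = self.score(torch.cat([feats, h], dim=-1)).squeeze(-1)  # (B, N)
        # keep only the k highest-scoring regions; mask the rest out of the softmax
        topk = logits.topk(self.k, dim=-1).indices
        mask = torch.full_like(logits, float("-inf")).scatter(1, topk, 0.0)
        weights = F.softmax(logits + mask, dim=-1)            # zeros outside top-k
        return (weights.unsqueeze(-1) * feats).sum(dim=1)     # (B, feat_dim) context

ctx = SelectiveVisualAttention(2048, 512)(torch.randn(2, 36, 2048), torch.randn(2, 512))
```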

  • Open Access

    ARTICLE

    PCATNet: Position-Class Awareness Transformer for Image Captioning

    Ziwei Tang1, Yaohua Yi2,*, Changhui Yu2, Aiguo Yin3

    CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 6007-6022, 2023, DOI:10.32604/cmc.2023.037861 - 29 April 2023

    Abstract Existing image captioning models usually build the relation between visual information and words to generate captions, which lack spatial information and object classes. To address the issue, we propose a novel Position-Class Awareness Transformer (PCAT) network which can serve as a bridge between the visual features and captions by embedding spatial information and awareness of object classes. In our proposal, we construct our PCAT network by proposing a novel Grid Mapping Position Encoding (GMPE) method and refining the encoder-decoder framework. First, GMPE includes mapping the regions of objects to grids, calculating the relative distance among… More >
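    The two steps the abstract names for GMPE, mapping object regions onto a grid and computing relative distances among the mapped positions, can be sketched as follows; the 7×7 grid and Euclidean metric are assumptions for illustration, not the paper's exact encoding.

```python
# Illustrative sketch only: map box centers to grid cells, then pairwise distances.
import torch

def grid_position_encoding(boxes, grid=7):
    # boxes: (N, 4) as (x1, y1, x2, y2), normalized to [0, 1]
    cx = (boxes[:, 0] + boxes[:, 2]) / 2                 # box center x
    cy = (boxes[:, 1] + boxes[:, 3]) / 2                 # box center y
    gx = (cx * grid).long().clamp(max=grid - 1)          # grid column per object
    gy = (cy * grid).long().clamp(max=grid - 1)          # grid row per object
    pos = torch.stack([gx, gy], dim=-1).float()          # (N, 2) grid coordinates
    rel = torch.cdist(pos, pos, p=2)                     # (N, N) relative distances
    return pos, rel

boxes = torch.rand(5, 4).sort(dim=-1).values             # 5 random well-formed boxes
pos, rel = grid_position_encoding(boxes)
```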

  • Open Access

    ARTICLE

    Fine-Grained Features for Image Captioning

    Mengyue Shao1, Jie Feng1,*, Jie Wu1, Haixiang Zhang1, Yayu Zheng2

    CMC-Computers, Materials & Continua, Vol.75, No.3, pp. 4697-4712, 2023, DOI:10.32604/cmc.2023.036564 - 29 April 2023

    Abstract Image captioning involves two different major modalities (image and sentence) that convert a given image into a language that adheres to visual semantics. Almost all methods first extract image features to reduce the difficulty of visual semantic embedding and then use the caption model to generate fluent sentences. The Convolutional Neural Network (CNN) is often used to extract image features in image captioning, and the use of object detection networks to extract region features has achieved great success. However, the region features retrieved by this method are object-level and do not pay attention to fine-grained… More >
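    The contrast drawn above, object-level region features versus finer-grained detail, can be illustrated by taking grid features straight from a CNN feature map and placing them alongside detector region features. The ResNet-50 backbone and simple token concatenation below are assumptions for illustration, not the paper's method.

```python
# Illustrative sketch only: fine-grained grid tokens next to object-level region tokens.
import torch
import torchvision

backbone = torchvision.models.resnet50(weights=None)
extractor = torch.nn.Sequential(*list(backbone.children())[:-2])  # keep the conv map

img = torch.randn(1, 3, 224, 224)
fmap = extractor(img)                          # (1, 2048, 7, 7) spatial feature map
grid_feats = fmap.flatten(2).transpose(1, 2)   # (1, 49, 2048) fine-grained grid tokens

region_feats = torch.randn(1, 36, 2048)        # stand-in for detector region features
tokens = torch.cat([region_feats, grid_feats], dim=1)  # (1, 85, 2048) for the captioner
```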

  • Open Access

    ARTICLE

    Enhanced Image Captioning Using Features Concatenation and Efficient Pre-Trained Word Embedding

    Samar Elbedwehy1,3,*, T. Medhat2, Taher Hamza3, Mohammed F. Alrahmawy3

    Computer Systems Science and Engineering, Vol.46, No.3, pp. 3637-3652, 2023, DOI:10.32604/csse.2023.038376 - 03 April 2023

    Abstract One of the issues in Computer Vision is the automatic development of descriptions for images, sometimes known as image captioning. Deep Learning techniques have made significant progress in this area. The typical architecture of image captioning systems consists mainly of an image feature extractor subsystem followed by a caption generation linguistic subsystem. This paper aims to find optimized models for these two subsystems. For the image feature extraction subsystem, the research tested eight different concatenations of pairs of vision models to find among them the most expressive feature vector for the image. For the… More >
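    The concatenation idea itself is easy to sketch: run two pretrained backbones on the same image, strip their classification heads, and concatenate the pooled feature vectors. The ResNet-50/ResNet-18 pairing below is an arbitrary stand-in; the paper evaluates eight different pairings.

```python
# Illustrative sketch only: concatenating two vision models' feature vectors.
import torch
import torchvision

def global_features(model, img):
    body = torch.nn.Sequential(*list(model.children())[:-1])  # drop the classifier
    return body(img).flatten(1)                               # pooled feature vector

img = torch.randn(1, 3, 224, 224)
f1 = global_features(torchvision.models.resnet50(weights=None), img)  # (1, 2048)
f2 = global_features(torchvision.models.resnet18(weights=None), img)  # (1, 512)
fused = torch.cat([f1, f2], dim=1)  # (1, 2560) descriptor fed to the caption generator
```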

  • Open Access

    ARTICLE

    Red Deer Optimization with Artificial Intelligence Enabled Image Captioning System for Visually Impaired People

    Anwer Mustafa Hilal1,*, Fadwa Alrowais2, Fahd N. Al-Wesabi3, Radwa Marzouk4,5

    Computer Systems Science and Engineering, Vol.46, No.2, pp. 1929-1945, 2023, DOI:10.32604/csse.2023.035529 - 09 February 2023

    Abstract The problem of producing a natural language description of an image to describe its visual content has gained more attention in natural language processing (NLP) and computer vision (CV). It can be driven by applications like image retrieval or indexing, virtual assistants, image understanding, and support of visually impaired people (VIP). Though VIP use other senses, such as touch and hearing, for recognizing objects and events, their quality of life is lower than the standard level. Automatic image captioning generates captions that can be read aloud to VIP, thereby realizing matters happening… More >

  • Open Access

    ARTICLE

    Natural Language Processing with Optimal Deep Learning-Enabled Intelligent Image Captioning System

    Radwa Marzouk1, Eatedal Alabdulkreem2, Mohamed K. Nour3, Mesfer Al Duhayyim4,*, Mahmoud Othman5, Abu Sarwar Zamani6, Ishfaq Yaseen6, Abdelwahed Motwakel6

    CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 4435-4451, 2023, DOI:10.32604/cmc.2023.033091 - 31 October 2022

    Abstract Recent developments in Multimedia Internet of Things (MIoT) devices, empowered with Natural Language Processing (NLP) models, point to a promising future for smart devices. NLP plays an important role in industrial models such as speech understanding, emotion detection, home automation, and so on. If an image needs to be captioned, then the objects in that image, their actions and connections, and any salient feature that remains under-projected or missing from the image should be identified. The aim of the image captioning process is to generate a caption for the image. In the next step, the… More >

  • Open Access

    ARTICLE

    Oppositional Harris Hawks Optimization with Deep Learning-Based Image Captioning

    V. R. Kavitha1, K. Nimala2, A. Beno3, K. C. Ramya4, Seifedine Kadry5, Byeong-Gwon Kang6, Yunyoung Nam7,*

    Computer Systems Science and Engineering, Vol.44, No.1, pp. 579-593, 2023, DOI:10.32604/csse.2023.024553 - 01 June 2022

    Abstract Image captioning is an emerging research topic in the domain of artificial intelligence (AI). It integrates Computer Vision (CV) and Natural Language Processing (NLP) to generate image descriptions. It finds use in several application areas, such as recommendations in editing applications and virtual assistants. The development of NLP and deep learning (DL) models has proven useful for bridging visual details and textual semantics. In this view, this paper introduces an Oppositional Harris Hawks Optimization with Deep Learning based Image Captioning (OHHO-DLIC) technique. The OHHO-DLIC technique involves… More >
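    Of the pieces named here, only the "oppositional" ingredient lends itself to a short sketch: opposition-based learning evaluates each candidate together with its mirror point inside the search bounds and keeps the better of the two. The objective and bounds below are toy placeholders; the Harris Hawks update rules themselves are omitted.

```python
# Illustrative sketch only: the opposition-based learning step of an OHHO-style search.
import numpy as np

def oppositional_step(pop, lb, ub, fitness):
    opposite = lb + ub - pop                      # mirror each candidate in [lb, ub]
    keep = fitness(pop) <= fitness(opposite)      # minimization: keep the better point
    return np.where(keep[:, None], pop, opposite)

rng = np.random.default_rng(0)
lb, ub = -5.0, 5.0
pop = rng.uniform(lb, ub, size=(20, 3))           # 20 candidates, 3 hyperparameters
sphere = lambda p: (p ** 2).sum(axis=1)           # toy objective function
pop = oppositional_step(pop, lb, ub, sphere)
```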

  • Open Access

    ARTICLE

    Image Captioning Using Detectors and Swarm Based Learning Approach for Word Embedding Vectors

    B. Lalitha1,*, V. Gomathi2

    Computer Systems Science and Engineering, Vol.44, No.1, pp. 173-189, 2023, DOI:10.32604/csse.2023.024118 - 01 June 2022

    Abstract IC (Image Captioning) is a crucial part of visual data processing that aims at understanding an image in order to provide captions verbalizing its important elements. However, in existing works, labelling remains a major problem for researchers because of the complexity of images, neglected relations between objects in an image, and poor image quality. Hence, this work attempts to overcome these challenges by proposing a novel framework for IC. Its main contribution is a framework consisting of three phases: image understanding, textual understanding, and… More >

  • Open Access

    ARTICLE

    Efficient Image Captioning Based on Vision Transformer Models

    Samar Elbedwehy1,*, T. Medhat2, Taher Hamza3, Mohammed F. Alrahmawy3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1483-1500, 2022, DOI:10.32604/cmc.2022.029313 - 18 May 2022

    Abstract Image captioning is an emerging field in machine learning. It refers to the ability to automatically generate a syntactically and semantically meaningful sentence that describes the content of an image. Image captioning requires a complex machine learning process, as it involves two sub-models: a vision sub-model for extracting object features and a language sub-model that uses the extracted features to generate meaningful captions. Attention-based vision transformer models have recently had a great impact on the vision field. In this paper, we studied the effect of using vision transformers in the image captioning process by evaluating… More >
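    Using a vision transformer as the vision sub-model can be sketched in a few lines: take a ViT, drop its classification head, and hand the resulting image embedding to the language sub-model. The ViT-B/16 backbone and GRU decoder stub are assumptions for illustration, not the specific configurations the paper evaluates.

```python
# Illustrative sketch only: a ViT image encoder feeding a small language decoder.
import torch
import torch.nn as nn
import torchvision

vit = torchvision.models.vit_b_16(weights=None)
vit.heads = nn.Identity()                     # keep the 768-d class-token embedding

img = torch.randn(1, 3, 224, 224)
img_emb = vit(img)                            # (1, 768) image representation

decoder = nn.GRU(input_size=768, hidden_size=512, batch_first=True)
out, h = decoder(img_emb.unsqueeze(1))        # image embedding as the first input step
```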

  • Open Access

    ARTICLE

    Low Complexity Encoder with Multilabel Classification and Image Captioning Model

    Mahmoud Ragab1,2,3,*, Abdullah Addas4

    CMC-Computers, Materials & Continua, Vol.72, No.3, pp. 4323-4337, 2022, DOI:10.32604/cmc.2022.026602 - 21 April 2022

    Abstract The advanced development of multimedia-on-demand traffic in the different forms of audio, video, and images has shifted the vision of the Internet of Things (IoT) from scalar data to the Internet of Multimedia Things (IoMT). Since Unmanned Aerial Vehicles (UAVs) generate a massive quantity of multimedia data, they have become part of the IoMT and are commonly employed in diverse application areas, especially for capturing remote sensing (RS) images. At the same time, the interpretation of captured RS images is also a crucial issue, which can be addressed by multi-label classification… More >
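    The multi-label classification part is straightforward to sketch: a lightweight encoder with an independent sigmoid output per label, so one RS image can carry several labels at once. The MobileNetV3 backbone and label count below are illustrative assumptions.

```python
# Illustrative sketch only: low-complexity encoder with a multi-label sigmoid head.
import torch
import torch.nn as nn
import torchvision

num_labels = 17                                # hypothetical land-cover label count
net = torchvision.models.mobilenet_v3_small(weights=None)
net.classifier[-1] = nn.Linear(net.classifier[-1].in_features, num_labels)

img = torch.randn(1, 3, 224, 224)
probs = torch.sigmoid(net(img))                # independent per-label probabilities
predicted = (probs > 0.5).nonzero(as_tuple=True)[1]  # indices of predicted labels
# training would use nn.BCEWithLogitsLoss on the raw logits
```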

Displaying 1-10 of 12 results.