Search Results (2)
  • Open Access

    ARTICLE

    Efficient Image Captioning Based on Vision Transformer Models

    Samar Elbedwehy1,*, T. Medhat2, Taher Hamza3, Mohammed F. Alrahmawy3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1483-1500, 2022, DOI:10.32604/cmc.2022.029313 - 18 May 2022

    Abstract Image captioning is an emerging field in machine learning. It refers to the ability to automatically generate a syntactically and semantically meaningful sentence that describes the content of an image. Image captioning requires a complex machine learning process, as it involves two sub-models: a vision sub-model for extracting object features and a language sub-model that uses the extracted features to generate meaningful captions. Attention-based vision transformer models have recently had a great impact on the vision field. In this paper, we studied the effect of using vision transformers on the image captioning process by evaluating…

  • Open Access

    ARTICLE

    Instance Retrieval Using Region of Interest Based CNN Features

    Jingcheng Chen1, Zhili Zhou1,2,*, Zhaoqing Pan1, Ching-nung Yang3

    Journal of New Media, Vol.1, No.2, pp. 87-99, 2019, DOI:10.32604/jnm.2019.06582

    Abstract Recently, image representations derived from convolutional neural networks (CNNs) have achieved promising performance for instance retrieval, and they outperform traditional hand-crafted image features. However, most existing CNN-based features are designed to describe entire images, and thus they are less robust to background clutter. This paper proposes a region of interest (RoI)-based deep convolutional representation for instance retrieval. It first detects the regions of interest (RoIs) in an image, and then extracts a set of RoI-based CNN features from the fully-connected layer of the CNN. The proposed RoI-based CNN feature describes the patterns of…
