Search Results (1)
  • Open Access

    REVIEW

    A Comprehensive Survey of Recent Transformers in Image, Video and Diffusion Models

    Dinh Phu Cuong Le1,2, Dong Wang1, Viet-Tuan Le3,*

    CMC-Computers, Materials & Continua, Vol.80, No.1, pp. 37-60, 2024, DOI:10.32604/cmc.2024.050790

    Abstract: Transformer models have emerged as dominant networks for various tasks in computer vision compared to Convolutional Neural Networks (CNNs). The transformers demonstrate the ability to model long-range dependencies by utilizing a self-attention mechanism. This study aims to provide a comprehensive survey of recent transformer-based approaches in image and video applications, as well as diffusion models. We begin by discussing existing surveys of vision transformers and comparing them to this work. Then, we review the main components of a vanilla transformer network, including the self-attention mechanism, feed-forward network, position encoding, etc. In the main part of …
