Search Results (11)
  • Open Access

    ARTICLE

    FPGA Accelerators for Computing Interatomic Potential-Based Molecular Dynamics Simulation for Gold Nanoparticles: Exploring Different Communication Protocols

    Ankitkumar Patel1, Srivathsan Vasudevan1,*, Satya Bulusu2,*

    CMC-Computers, Materials & Continua, Vol.80, No.3, pp. 3803-3818, 2024, DOI:10.32604/cmc.2024.052851 - 12 September 2024

    Abstract Molecular Dynamics (MD) simulation for computing Interatomic Potential (IAP) is a very important High-Performance Computing (HPC) application. MD simulation on particles of experimental relevance takes huge computation time, even on an expensive high-end server. Heterogeneous computing, a combination of a Field Programmable Gate Array (FPGA) and a computer, is proposed as a solution to compute MD simulation efficiently. In such heterogeneous computation, communication between the FPGA and the computer is necessary. One such MD simulation, explained in the paper, is the Artificial Neural Network (ANN)-based IAP computation of gold (Au147 and Au309) nanoparticles. MD simulation calculates the forces…

  • Open Access

    ARTICLE

    A Novel Quantization and Model Compression Approach for Hardware Accelerators in Edge Computing

    Fangzhou He1,3, Ke Ding1,2, Dingjiang Yan3, Jie Li3,*, Jiajun Wang1,2, Mingzhe Chen1,2

    CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 3021-3045, 2024, DOI:10.32604/cmc.2024.053632 - 15 August 2024

    Abstract The massive computational complexity and memory requirements of artificial intelligence models impede their deployment on edge computing devices of the Internet of Things (IoT). While Power-of-Two (PoT) quantization has been proposed to improve the efficiency of edge inference for Deep Neural Networks (DNNs), existing PoT schemes require a huge amount of bit-wise manipulation and have large memory overhead, and their efficiency is bounded by the bottleneck of computation latency and memory footprint. To tackle this challenge, we present an efficient inference approach on the basis of PoT quantization and model compression. An integer-only scalar PoT quantization (IOS-PoT)…

  • Open Access

    ARTICLE

    FPGA Optimized Accelerator of DCNN with Fast Data Readout and Multiplier Sharing Strategy

    Tuo Ma, Zhiwei Li, Qingjiang Li*, Haijun Liu, Zhongjin Zhao, Yinan Wang

    CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 3237-3263, 2023, DOI:10.32604/cmc.2023.045948 - 26 December 2023

    Abstract With the continuous development of deep learning, the Deep Convolutional Neural Network (DCNN) has attracted wide attention in the industry due to its high accuracy in image classification. Compared with other DCNN hardware deployment platforms, the Field Programmable Gate Array (FPGA) offers programmability, low power consumption, parallelism, and low cost. However, the enormous amount of calculation of DCNN and the limited logic capacity of FPGA restrict the energy efficiency of the DCNN accelerator. The traditional sequential sliding window method can improve the throughput of the DCNN accelerator by data multiplexing, but this method’s…

  • Open Access

    ARTICLE

    CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses

    Hyun-Wook Son1, Ali A. Al-Hamid1,2, Yong-Seok Na1, Dong-Yeong Lee1, Hyung-Won Kim1,*

    CMC-Computers, Materials & Continua, Vol.76, No.2, pp. 1665-1687, 2023, DOI:10.32604/cmc.2023.038760 - 30 August 2023

    Abstract This paper presents the architecture of a Convolution Neural Network (CNN) accelerator based on a new processing element (PE) array called a diagonal cyclic array (DCA). As demonstrated, it can significantly reduce the burden of repeated memory accesses for feature data and weight parameters of the CNN models, which maximizes the data reuse rate and improves the computation speed. Furthermore, an integrated computation architecture has been implemented for the activation function, max-pooling, and activation function after convolution calculation, reducing hardware resource usage. To evaluate the effectiveness of the proposed architecture, a CNN accelerator has been…

  • Open Access

    ARTICLE

    A Low-Power 12-Bit SAR ADC for Analog Convolutional Kernel of Mixed-Signal CNN Accelerator

    Jungyeon Lee1, Malik Summair Asghar1,2, HyungWon Kim1,*

    CMC-Computers, Materials & Continua, Vol.75, No.2, pp. 4357-4375, 2023, DOI:10.32604/cmc.2023.031372 - 31 March 2023

    Abstract As deep learning techniques such as Convolutional Neural Networks (CNNs) are widely adopted, the complexity of CNNs is rapidly increasing due to the growing demand for CNN accelerator system-on-chip (SoC). Although conventional CNN accelerators can reduce the computational time of learning and inference tasks, they tend to occupy large chip areas due to many multiply-and-accumulate (MAC) operators when implemented in complex digital circuits, incurring excessive power consumption. To overcome these drawbacks, this work implements an analog convolutional filter consisting of an analog multiply-and-accumulate arithmetic circuit along with an analog-to-digital converter (ADC). This paper introduces the…

  • Open Access

    ARTICLE

    A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization

    Hadee Madadum*, Yasar Becerikli

    Intelligent Automation & Soft Computing, Vol.33, No.2, pp. 681-695, 2022, DOI:10.32604/iasc.2022.023831 - 08 February 2022

    Abstract Convolutional Neural Network (ConNN) implementations on Field Programmable Gate Arrays (FPGAs) are being studied since the computational capabilities of FPGAs have improved recently. Model compression is required to enable ConNN deployment on resource-constrained FPGA devices. Logarithmic quantization is one of the efficient compression methods that can compress a model to very low bit-width without significant deterioration in performance. It is also hardware-friendly because it uses bitwise operations for multiplication. However, logarithmic quantization suffers from low resolution at high inputs due to its exponential properties. Therefore, we propose a modified logarithmic quantization method with a fine resolution…

  • Open Access

    ARTICLE

    Preparation and Performance of a Fluorine-Free and Alkali-Free Liquid Accelerator for Shotcrete

    Jianbing Zhang1, Rongjin Liu1,2,3,*, Siyuan Fu1, Tianyu Gao1, Zhongfei Zhang1

    Journal of Renewable Materials, Vol.9, No.11, pp. 2001-2013, 2021, DOI:10.32604/jrm.2021.015812 - 04 June 2021

    Abstract Based on aluminum sulfate, a fluorine-free and alkali-free liquid accelerator (FF-AF-A) was prepared in this study. The setting time and compressive strength of three cement types with different FF-AF-A dosages were fully investigated. The compatibility of the FF-AF-A with the superplasticizers was also investigated, and the early hydration behavior and morphology of the hydration products of reference cement paste with the FF-AF-A were explored by hydration heat, X-ray diffractometry (XRD), and scanning electron microscopy (SEM). Test results indicated that adding the FF-AF-A at 8 wt% of the cement weight resulted in 2 min 35 s…

  • Open Access

    ARTICLE

    Effect of Mitigating Strength Retrogradation of Alkali Accelerator by the Synergism of Sodium Sulfate and Waste Glass Powder

    Yongdong Xu, Tingshu He*

    Journal of Renewable Materials, Vol.9, No.11, pp. 1991-1999, 2021, DOI:10.32604/jrm.2021.015931 - 04 June 2021

    Abstract This work aims to utilize waste glass powder (WGP) as a supplementary material to mitigate the strength shrinkage caused by the alkaline accelerator. Waste glass powder was used to replace cement by 0%, 10%, and 20% to evaluate its effect on the alkaline accelerator’s strength retrogradation. The results show that the strength improvement effect of unitary glass powder is inconspicuous. An innovative method is proposed that uses sodium sulfate and waste glass powder synergistically, exploiting the activity of amorphous silica in glass powder. Compared with the reference group, the compressive strength of 28d mortar…

  • Open Access

    ARTICLE

    A Parallel Approach to Discords Discovery in Massive Time Series Data

    Mikhail Zymbler*, Alexander Grents, Yana Kraeva, Sachin Kumar

    CMC-Computers, Materials & Continua, Vol.66, No.2, pp. 1867-1878, 2021, DOI:10.32604/cmc.2020.014232 - 26 November 2020

    Abstract A discord is a refinement of the concept of an anomalous subsequence of a time series. Being one of the topical issues of time series mining, discords discovery is applied in a wide range of real-world areas (medicine, astronomy, economics, climate modeling, predictive maintenance, energy consumption, etc.). In this article, we propose a novel parallel algorithm for discords discovery on a high-performance cluster with nodes based on many-core accelerators in the case when the time series cannot fit in the main memory. We assumed that the time series is partitioned across the cluster nodes and achieved parallelization…

  • Open Access

    ARTICLE

    A Dynamically Reconfigurable Accelerator Design Using a Sparse-Winograd Decomposition Algorithm for CNNs

    Yunping Zhao, Jianzhuang Lu*, Xiaowen Chen

    CMC-Computers, Materials & Continua, Vol.66, No.1, pp. 517-535, 2021, DOI:10.32604/cmc.2020.012380 - 30 October 2020

    Abstract Convolutional Neural Networks (CNNs) are widely used in many fields. Due to their high throughput and high level of computing characteristics, however, an increasing number of researchers are focusing on how to improve the computational efficiency, hardware utilization, or flexibility of CNN hardware accelerators. Accordingly, this paper proposes a dynamically reconfigurable accelerator architecture that implements a Sparse-Winograd F(2×2, 3×3)-based high-parallelism hardware architecture. This approach not only eliminates the pre-calculation complexity associated with the Winograd algorithm, thereby reducing the difficulty of hardware implementation, but also greatly improves the flexibility of the hardware; as a result,…
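
For readers unfamiliar with the Winograd algorithm named in the last abstract above, the following is a minimal sketch of the standard one-dimensional minimal filtering case F(2, 3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. It is not the sparse, two-dimensional, reconfigurable design from the paper; the function names `winograd_f23` and `direct_f23` are hypothetical and chosen only for illustration.

```python
def winograd_f23(d, g):
    """Winograd F(2, 3): two outputs of a 3-tap filter from four inputs.

    d: four input samples, g: three filter taps (illustrative names).
    """
    d0, d1, d2, d3 = d
    g0, g1, g2 = g

    # Filter transform: depends only on g, so it can be computed once per filter.
    u0 = g0
    u1 = (g0 + g1 + g2) / 2
    u2 = (g0 - g1 + g2) / 2
    u3 = g2

    # Input transform followed by the four element-wise multiplications.
    m1 = (d0 - d2) * u0
    m2 = (d1 + d2) * u1
    m3 = (d2 - d1) * u2
    m4 = (d1 - d3) * u3

    # Output transform: recombine the products into the two filter outputs.
    return [m1 + m2 + m3, m2 - m3 - m4]


def direct_f23(d, g):
    """Reference sliding-window computation using six multiplications."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

Both functions should agree on any input, for example `winograd_f23([1, 2, 3, 4], [2, 4, 6])` and `direct_f23([1, 2, 3, 4], [2, 4, 6])`. The two-dimensional variant nests this transform over rows and columns of a tile.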
