Tech Science Press - Publisher of Open Access Journals

Open Access

ARTICLE

Low-Complexity Hardware Architecture for Batch Normalization of CNN Training Accelerator

Go-Eun Woo, Sang-Bo Park, Gi-Tae Park, Muhammad Junaid, Hyung-Won Kim^*

CMC-Computers, Materials & Continua, Vol.84, No.2, pp. 3241-3257, 2025, DOI:10.32604/cmc.2025.063723 - 03 July 2025

Abstract On-device Artificial Intelligence (AI) accelerators capable of not only inference but also training neural network models are in increasing demand in the industrial AI field, where frequent retraining is crucial due to frequent production changes. Batch normalization (BN) is fundamental to training convolutional neural networks (CNNs), but its implementation in compact accelerator chips remains challenging due to computational complexity, particularly in calculating statistical parameters and gradients across mini-batches. Existing accelerator architectures either compromise the training accuracy of CNNs through approximations or require substantial computational resources, limiting their practical deployment. We present a hardware-optimized BN accelerator…

Open Access

ARTICLE

FPGA Accelerators for Computing Interatomic Potential-Based Molecular Dynamics Simulation for Gold Nanoparticles: Exploring Different Communication Protocols

Ankitkumar Patel^1, Srivathsan Vasudevan^1,*, Satya Bulusu^2,*

CMC-Computers, Materials & Continua, Vol.80, No.3, pp. 3803-3818, 2024, DOI:10.32604/cmc.2024.052851 - 12 September 2024

Abstract Molecular Dynamics (MD) simulation for computing Interatomic Potential (IAP) is a very important High-Performance Computing (HPC) application. MD simulation on particles of experimental relevance takes a huge amount of computation time, even on an expensive high-end server. Heterogeneous computing, a combination of a Field Programmable Gate Array (FPGA) and a computer, is proposed as a solution to compute MD simulation efficiently. In such heterogeneous computation, communication between the FPGA and the computer is necessary. One such MD simulation, explained in the paper, is the Artificial Neural Network (ANN)-based IAP computation of gold (Au₁₄₇ & Au₃₀₉) nanoparticles. MD simulation calculates the forces…

Open Access

ARTICLE

A Novel Quantization and Model Compression Approach for Hardware Accelerators in Edge Computing

Fangzhou He^1,3, Ke Ding^1,2, Dingjiang Yan^3, Jie Li^3,*, Jiajun Wang^1,2, Mingzhe Chen^1,2

CMC-Computers, Materials & Continua, Vol.80, No.2, pp. 3021-3045, 2024, DOI:10.32604/cmc.2024.053632 - 15 August 2024

Abstract Massive computational complexity and memory requirement of artificial intelligence models impede their deployability on edge computing devices of the Internet of Things (IoT). While Power-of-Two (PoT) quantization is proposed to improve the efficiency for edge inference of Deep Neural Networks (DNNs), existing PoT schemes require a huge amount of bit-wise manipulation and have large memory overhead, and their efficiency is bounded by the bottleneck of computation latency and memory footprint. To tackle this challenge, we present an efficient inference approach on the basis of PoT quantization and model compression. An integer-only scalar PoT quantization (IOS-PoT)…

Open Access

ARTICLE

FPGA Optimized Accelerator of DCNN with Fast Data Readout and Multiplier Sharing Strategy

Tuo Ma, Zhiwei Li, Qingjiang Li^*, Haijun Liu, Zhongjin Zhao, Yinan Wang

CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 3237-3263, 2023, DOI:10.32604/cmc.2023.045948 - 26 December 2023

Abstract With the continuous development of deep learning, Deep Convolutional Neural Network (DCNN) has attracted wide attention in the industry due to its high accuracy in image classification. Compared with other DCNN hardware deployment platforms, Field Programmable Gate Array (FPGA) has the advantages of being programmable, low power consumption, parallelism, and low cost. However, the enormous amount of calculation of DCNN and the limited logic capacity of FPGA restrict the energy efficiency of the DCNN accelerator. The traditional sequential sliding window method can improve the throughput of the DCNN accelerator by data multiplexing, but this method’s…

Open Access

ARTICLE

CNN Accelerator Using Proposed Diagonal Cyclic Array for Minimizing Memory Accesses

Hyun-Wook Son^1, Ali A. Al-Hamid^1,2, Yong-Seok Na^1, Dong-Yeong Lee^1, Hyung-Won Kim^1,*

CMC-Computers, Materials & Continua, Vol.76, No.2, pp. 1665-1687, 2023, DOI:10.32604/cmc.2023.038760 - 30 August 2023

Abstract This paper presents the architecture of a Convolutional Neural Network (CNN) accelerator based on a new processing element (PE) array called a diagonal cyclic array (DCA). As demonstrated, it can significantly reduce the burden of repeated memory accesses for feature data and weight parameters of the CNN models, which maximizes the data reuse rate and improves the computation speed. Furthermore, an integrated computation architecture has been implemented for the activation function, max-pooling, and activation function after convolution calculation, reducing hardware resource usage. To evaluate the effectiveness of the proposed architecture, a CNN accelerator has been…

Open Access

ARTICLE

A Low-Power 12-Bit SAR ADC for Analog Convolutional Kernel of Mixed-Signal CNN Accelerator

Jungyeon Lee^1, Malik Summair Asghar^1,2, HyungWon Kim^1,*

CMC-Computers, Materials & Continua, Vol.75, No.2, pp. 4357-4375, 2023, DOI:10.32604/cmc.2023.031372 - 31 March 2023

Abstract As deep learning techniques such as Convolutional Neural Networks (CNNs) are widely adopted, the complexity of CNNs is rapidly increasing due to the growing demand for CNN accelerator system-on-chip (SoC). Although conventional CNN accelerators can reduce the computational time of learning and inference tasks, they tend to occupy large chip areas due to many multiply-and-accumulate (MAC) operators when implemented in complex digital circuits, incurring excessive power consumption. To overcome these drawbacks, this work implements an analog convolutional filter consisting of an analog multiply-and-accumulate arithmetic circuit along with an analog-to-digital converter (ADC). This paper introduces the…

Open Access

ARTICLE

A Resource-Efficient Convolutional Neural Network Accelerator Using Fine-Grained Logarithmic Quantization

Hadee Madadum^*, Yasar Becerikli

Intelligent Automation & Soft Computing, Vol.33, No.2, pp. 681-695, 2022, DOI:10.32604/iasc.2022.023831 - 08 February 2022

Abstract Convolutional Neural Network (ConNN) implementations on Field Programmable Gate Array (FPGA) are being studied since the computational capabilities of FPGA have been improved recently. Model compression is required to enable ConNN deployment on resource-constrained FPGA devices. Logarithmic quantization is one of the efficient compression methods that can compress a model to very low bit-width without significant deterioration in performance. It is also hardware-friendly by using bitwise operations for multiplication. However, logarithmic quantization suffers from low resolution at high inputs due to its exponential properties. Therefore, we propose a modified logarithmic quantization method with a fine resolution…

Open Access

ARTICLE

Preparation and Performance of a Fluorine-Free and Alkali-Free Liquid Accelerator for Shotcrete

Jianbing Zhang^1, Rongjin Liu^1,2,3,*, Siyuan Fu^1, Tianyu Gao^1, Zhongfei Zhang^1

Journal of Renewable Materials, Vol.9, No.11, pp. 2001-2013, 2021, DOI:10.32604/jrm.2021.015812 - 04 June 2021

Abstract Based on aluminum sulfate, a fluorine-free and alkali-free liquid accelerator (FF-AF-A) was prepared in this study. The setting time and compressive strength of three cement types with different FF-AF-A dosages were fully investigated. The compatibility of the FF-AF-A with the superplasticizers was also investigated, and the early hydration behavior and morphology of the hydration products of reference cement paste with the FF-AF-A were explored by hydration heat, X-ray diffractometry (XRD), and scanning electron microscopy (SEM). Test results indicated that adding the FF-AF-A at 8 wt% of the cement weight resulted in 2 min 35 s…

Open Access

ARTICLE

Effect of Mitigating Strength Retrogradation of Alkali Accelerator by the Synergism of Sodium Sulfate and Waste Glass Powder

Yongdong Xu, Tingshu He^*

Journal of Renewable Materials, Vol.9, No.11, pp. 1991-1999, 2021, DOI:10.32604/jrm.2021.015931 - 04 June 2021

Abstract This work aims to utilize waste glass powder (WGP) as a supplementary material to mitigate the strength shrinkage caused by the alkaline accelerator. Waste glass powder was used to replace cement at 0%, 10%, and 20% to evaluate its effect on the alkaline accelerator’s strength retrogradation. The results show that the strength improvement effect of unitary glass powder is inconspicuous. An innovative method is proposed that uses the synergism of sodium sulfate and waste glass powder, exploiting the activity of amorphous silica in the glass powder. Compared with the reference group, the compressive strength of 28 d mortar…

Open Access

ARTICLE

A Parallel Approach to Discords Discovery in Massive Time Series Data

Mikhail Zymbler^*, Alexander Grents, Yana Kraeva, Sachin Kumar

CMC-Computers, Materials & Continua, Vol.66, No.2, pp. 1867-1878, 2021, DOI:10.32604/cmc.2020.014232 - 26 November 2020

Abstract A discord is a refinement of the concept of an anomalous subsequence of a time series. Being one of the topical issues of time series mining, discords discovery is applied in a wide range of real-world areas (medicine, astronomy, economics, climate modeling, predictive maintenance, energy consumption, etc.). In this article, we propose a novel parallel algorithm for discords discovery on a high-performance cluster with nodes based on many-core accelerators in the case when the time series cannot fit in the main memory. We assumed that the time series is partitioned across the cluster nodes and achieved parallelization…
