Deterministic Convergence Analysis for GRU Networks via Smoothing Regularization
Qian Zhu1, Qian Kang1, Tao Xu2, Dengxiu Yu3,*, Zhen Wang1
1 School of Cybersecurity, Northwestern Polytechnical University, Xi’an, 710072, China
2 Unmanned System Research Institute, Northwestern Polytechnical University, Xi’an, 710072, China
3 School of Artificial Intelligence, Optics and Electronics (iOPEN), Northwestern Polytechnical University, Xi’an, 710072, China
* Corresponding Author: Dengxiu Yu. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.061913
Received 06 November 2024; Accepted 11 February 2025; Published online 17 March 2025
Abstract
In this study, we present a deterministic convergence analysis of Gated Recurrent Unit (GRU) networks enhanced by a smoothing L1 regularization technique. While GRU architectures effectively mitigate gradient vanishing/exploding issues in sequential modeling, they remain prone to overfitting, particularly under noisy or limited training data. Traditional L1 regularization, despite enforcing sparsity and accelerating optimization, introduces non-differentiable points into the error function, leading to oscillations during training. To address this, we propose a novel smoothing L1 regularization framework that replaces the non-differentiable absolute value function with a quadratic approximation, ensuring gradient continuity and stabilizing the optimization landscape. Theoretically, we rigorously establish three key properties of the resulting smoothing L1-regularized GRU (SL1-GRU) model: (1) monotonic decrease of the error function across iterations, (2) weak convergence, characterized by vanishing gradients as the number of iterations approaches infinity, and (3) strong convergence of the network weights to fixed points under finite conditions. Comprehensive experiments on benchmark datasets spanning function approximation, classification (KDD Cup 1999 Data, MNIST), and regression (Boston Housing, Energy Efficiency) demonstrate the superiority of SL1-GRU over baseline models (RNN, LSTM, GRU, L1-GRU, L2-GRU). Empirical results show that SL1-GRU achieves 1.0%–2.4% higher test accuracy in classification and 7.8%–15.4% lower mean squared error in regression than the unregularized GRU, while reducing training time by 8.7%–20.1%. These outcomes validate the method's efficacy in balancing computational efficiency and generalization capability, and they strongly corroborate the theoretical results. The proposed framework not only resolves the non-differentiability challenge of L1 regularization but also provides a theoretical foundation for convergence guarantees in recurrent neural network training.
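As a concrete illustration of the smoothing idea summarized above, the minimal NumPy sketch below shows one standard way to smooth the L1 penalty: the absolute value is replaced by a quadratic segment inside a small band [-a, a], so the gradient is continuous at zero. The specific polynomial, the band width a, and the regularization weight lambda_reg are illustrative assumptions for this sketch; the exact surrogate used in the paper is the one defined in its methods section.

```python
import numpy as np

def smoothed_abs(w, a=0.1):
    """Smoothed surrogate for |w|: quadratic inside [-a, a], exact |w| outside.

    This Huber-style form is one common choice; the paper's surrogate may use a
    different polynomial, but the principle is the same: the kink of |w| at 0
    is replaced by a differentiable segment.
    """
    w = np.asarray(w, dtype=float)
    inside = np.abs(w) <= a
    return np.where(inside, w**2 / (2 * a) + a / 2, np.abs(w))

def smoothed_abs_grad(w, a=0.1):
    """Gradient of the surrogate: w/a inside the band, sign(w) outside.

    Unlike the subgradient of |w|, this is continuous at w = 0, which is what
    removes the oscillations during gradient-based training.
    """
    w = np.asarray(w, dtype=float)
    inside = np.abs(w) <= a
    return np.where(inside, w / a, np.sign(w))

# Example: applying the smoothed penalty to a GRU weight matrix W
# (W and lambda_reg are assumed placeholders for this illustration).
rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 4))
lambda_reg = 1e-3
penalty = lambda_reg * smoothed_abs(W).sum()       # term added to the error function
penalty_grad = lambda_reg * smoothed_abs_grad(W)   # term added to dE/dW
print(penalty, penalty_grad.shape)
```

At the band edges the quadratic meets |w| in both value and slope, so the resulting error function is continuously differentiable everywhere, which is the property the convergence analysis relies on.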
Keywords
Gated recurrent unit; L1 regularization; convergence