Open Access

ARTICLE

Frequency-Quantized Variational Autoencoder Based on 2D-FFT for Enhanced Image Reconstruction and Generation

Jianxin Feng1,2,*, Xiaoyao Liu1,2

1 School of Information Engineering, Dalian University, Dalian, 116622, China
2 Key Laboratory of Communication and Networks, Dalian University, Dalian, 116622, China

* Corresponding Author: Jianxin Feng.

Computers, Materials & Continua 2025, 83(2), 2087-2107. https://doi.org/10.32604/cmc.2025.060252

Abstract

As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAEs) are increasingly applied to generative and multimodal tasks because they are easy to embed and have strong representational capacity. However, existing VQ-VAEs typically perform quantization in the spatial domain, which ignores global structural information and can lead to codebook collapse and information coupling. This paper proposes a frequency-quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method uses a 2D fast Fourier transform (2D-FFT) to express image features as linear combinations in the frequency domain, then adaptively quantizes these frequency components to preserve the image's global relationships. The codebook is dynamically optimized according to the usage frequency and dependency of its code vectors, which avoids collapse and information coupling. Furthermore, we introduce a post-processing module based on graph convolutional networks to further improve reconstruction quality. Experimental results on four public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of Structural Similarity Index (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Reconstruction Fréchet Inception Distance (rFID). On the CIFAR-10 dataset, the proposed method improves these three metrics by 4.9%, 36.4%, and 52.8%, respectively, compared with the VQ-VAE baseline.
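
The core idea of quantizing in the frequency domain can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a plain nearest-neighbour codebook lookup in place of the paper's adaptive quantization, and it omits the dynamic codebook optimization and the graph-convolutional post-processing. The function name `quantize_frequency`, the choice of one code vector per frequency bin, and the stacking of real and imaginary parts are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize_frequency(features, codebook):
    """Illustrative frequency-domain vector quantization (assumed design).

    features: real array of shape (C, H, W), e.g. an encoder feature map.
    codebook: real array of shape (K, 2*C); each row holds the real parts
              followed by the imaginary parts of one code vector.
    Returns the (H, W) map of code indices and the (C, H, W) reconstruction.
    """
    c, h, w = features.shape

    # 2D-FFT over the spatial axes: every frequency bin mixes all pixels,
    # so each coefficient carries global information about the feature map.
    spectrum = np.fft.fft2(features, axes=(-2, -1))

    # One vector per frequency bin: real and imaginary parts of all channels.
    vecs = np.concatenate([spectrum.real, spectrum.imag], axis=0)
    vecs = vecs.reshape(2 * c, h * w).T                     # (H*W, 2C)

    # Nearest code vector by squared Euclidean distance (plain VQ lookup).
    dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)                              # (H*W,)

    # Rebuild the complex spectrum from the selected code vectors.
    quantized = codebook[idx].T.reshape(2 * c, h, w)
    q_spectrum = quantized[:c] + 1j * quantized[c:]

    # Inverse 2D-FFT back to the spatial domain; any imaginary residue
    # introduced by quantization is discarded.
    recon = np.fft.ifft2(q_spectrum, axes=(-2, -1)).real
    return idx.reshape(h, w), recon

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16, 16))       # toy feature map
    book = rng.normal(size=(64, 16))           # toy codebook, K=64, 2C=16
    indices, recon = quantize_frequency(feats, book)
    print(indices.shape, recon.shape)
```

With a random codebook, as in the usage lines, the reconstruction is poor; the sketch only shows where the transform and the lookup sit relative to each other. In a trained model the codebook would be learned jointly with the encoder and decoder.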

Keywords

VAE; 2D-FFT; image reconstruction; image generation




Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.