Frequency-Quantized Variational Autoencoder Based on 2D-FFT for Enhanced Image Reconstruction and Generation

Jianxin Feng^1,2,*, Xiaoyao Liu^1,2
1 School of Information Engineering, Dalian University, Dalian, 116622, China
2 Key Laboratory of Communication and Networks, Dalian University, Dalian, 116622, China
* Corresponding Author: Jianxin Feng. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.060252

Received 28 October 2024; Accepted 08 February 2025; Published online 02 April 2025

Abstract

As a form of discrete representation learning, Vector Quantized Variational Autoencoders (VQ-VAE) have increasingly been applied to generative and multimodal tasks due to their ease of embedding and representative capacity. However, existing VQ-VAEs often perform quantization in the spatial domain, which ignores global structural information and can lead to codebook collapse and information coupling. This paper proposes a frequency-quantized variational autoencoder (FQ-VAE) to address these issues. The proposed method transforms image features into linear combinations in the frequency domain using a 2D fast Fourier transform (2D-FFT) and performs adaptive quantization on these frequency components to preserve the image's global relationships. The codebook is dynamically optimized to avoid collapse and information coupling by considering the usage frequency and dependency of code vectors. Furthermore, we introduce a post-processing module based on graph convolutional networks to further improve reconstruction quality. Experimental results on four public datasets demonstrate that the proposed method outperforms state-of-the-art approaches in terms of Structural Similarity Index (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Reconstruction Fréchet Inception Distance (rFID). On the CIFAR-10 dataset, the proposed method improves these three metrics by 4.9%, 36.4%, and 52.8%, respectively, compared to the baseline VQ-VAE.
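
The general idea of quantizing in the frequency domain rather than the spatial domain can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the codebook layout (real and imaginary parts concatenated per frequency bin), and the plain nearest-neighbour assignment are all assumptions made for illustration. The adaptive quantization, dynamic codebook optimization, and graph-convolutional post-processing described in the abstract are not modeled here.

```python
import numpy as np

def to_frequency_vectors(features):
    """Hypothetical helper: 2D-FFT a (C, H, W) feature map and return one
    real vector of length 2*C per frequency bin, shape (H*W, 2*C)."""
    c, h, w = features.shape
    spectrum = np.fft.fft2(features, axes=(-2, -1))  # complex, (C, H, W)
    stacked = np.concatenate([spectrum.real, spectrum.imag], axis=0)  # (2C, H, W)
    return stacked.reshape(2 * c, h * w).T

def quantize_in_frequency(features, codebook):
    """Assumed scheme: replace each frequency-bin vector with its nearest
    codebook entry (squared Euclidean distance), then invert the FFT.
    codebook has shape (K, 2*C)."""
    c, h, w = features.shape
    vecs = to_frequency_vectors(features)  # (H*W, 2C)
    # Pairwise squared distances between bins and code vectors: (H*W, K)
    dists = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    quantized = codebook[indices].T.reshape(2 * c, h, w)
    q_spectrum = quantized[:c] + 1j * quantized[c:]
    # The quantized spectrum is generally not conjugate-symmetric, so the
    # inverse transform is complex; this sketch keeps only the real part.
    recon = np.fft.ifft2(q_spectrum, axes=(-2, -1)).real
    return recon, indices.reshape(h, w)
```

Because every frequency bin is a linear combination of all spatial positions, replacing a single bin with a code vector changes the whole reconstructed feature map, which is one way to read the abstract's claim that frequency-domain quantization preserves global relationships. As a sanity check, a codebook that contains the exact frequency vectors of the input reproduces the input after the inverse transform.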

Keywords

VAE; 2D-FFT; image reconstruction; image generation
