Open Access

ARTICLE

Denoising Letter Images from Scanned Invoices Using Stacked Autoencoders

Samah Ibrahim Alshathri1,*, Desiree Juby Vincent2, V. S. Hari2

1 Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 84428, Saudi Arabia
2 Department of Electronics, College of Engineering Chengannur, Kerala Technological University, Chengannur, 689121, India

* Corresponding Author: Samah Ibrahim Alshathri. Email: email

Computers, Materials & Continua 2022, 71(1), 1371-1386. https://doi.org/10.32604/cmc.2022.022458

Abstract

Invoice document digitization is crucial for efficient management in industry. Scanned invoice images are often noisy for various reasons, which reduces optical character recognition (OCR) accuracy. In this paper, letter images extracted from invoice scans are denoised using a modified autoencoder-based deep learning method. A stacked denoising autoencoder (SDAE) is implemented with two hidden layers each in the encoder network and the decoder network. To capture the most salient features of the training samples, an undercomplete autoencoder is designed with non-linear encoder and decoder functions. The autoencoder is regularized for the denoising application using a combined loss function that considers both mean square error and binary cross entropy. A dataset of 59,119 letter images, containing English letters (upper and lower case) and digits (0 to 9), is prepared from many scanned invoice images and Windows TrueType (.ttf) files and is used to train the neural network. Performance is analyzed in terms of Signal to Noise Ratio (SNR), Peak Signal to Noise Ratio (PSNR), Structural Similarity Index (SSIM) and Universal Image Quality Index (UQI), and is compared with other filtering techniques such as the nonlocal means filter, anisotropic diffusion filter, Gaussian filter and mean filter. The denoising performance of the proposed SDAE is also compared with an existing SDAE that uses a single loss function, in terms of SNR and PSNR values. Results show the superior performance of the proposed SDAE method.
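As a rough illustration of the combined loss and two of the reported metrics, the following NumPy sketch may help. It is not the authors' implementation: the abstract does not say how the two loss terms are combined, so the weighted sum and the weight `alpha` are assumptions, as are the function names, the 28 x 28 image size, and the synthetic Gaussian noise used in the example. Pixel values are assumed to be scaled to [0, 1].

```python
import numpy as np

def mse(clean, recon):
    """Mean square error between a clean image and its reconstruction."""
    return float(np.mean((clean - recon) ** 2))

def binary_cross_entropy(clean, recon, eps=1e-7):
    """Pixel-wise binary cross entropy; recon is clipped to avoid log(0)."""
    recon = np.clip(recon, eps, 1.0 - eps)
    return float(-np.mean(clean * np.log(recon)
                          + (1.0 - clean) * np.log(1.0 - recon)))

def combined_loss(clean, recon, alpha=0.5):
    """Weighted sum of MSE and BCE. The weighting scheme and alpha are
    hypothetical; the abstract only says that both terms are considered."""
    return alpha * mse(clean, recon) + (1.0 - alpha) * binary_cross_entropy(clean, recon)

def psnr(clean, recon, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with maximum value `peak`."""
    err = mse(clean, recon)
    return float("inf") if err == 0 else float(10.0 * np.log10(peak ** 2 / err))

def snr(clean, recon):
    """Signal-to-noise ratio in dB: signal power over residual power."""
    noise_power = float(np.sum((clean - recon) ** 2))
    if noise_power == 0:
        return float("inf")
    return float(10.0 * np.log10(np.sum(clean ** 2) / noise_power))

# Synthetic example (assumed data): a random binary "letter" image
# corrupted with additive Gaussian noise, then clipped back to [0, 1].
rng = np.random.default_rng(0)
clean = (rng.random((28, 28)) > 0.5).astype(float)
noisy = np.clip(clean + rng.normal(0.0, 0.2, clean.shape), 0.0, 1.0)

print("combined loss:", combined_loss(clean, noisy))
print("PSNR (dB):", psnr(clean, noisy))
print("SNR (dB):", snr(clean, noisy))
```

In a training setup, a loss of this shape would be evaluated between the clean target and the network output for a noisy input; here the noisy image simply stands in for a reconstruction so that the functions can be exercised without a trained model.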

Cite This Article

S. Ibrahim Alshathri, D. Juby Vincent and V. S. Hari, "Denoising letter images from scanned invoices using stacked autoencoders," Computers, Materials & Continua, vol. 71, no. 1, pp. 1371–1386, 2022.



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.