Open Access
ARTICLE
Intelligent Image Text Detection via Pixel Standard Deviation Representation
1 LIAP Laboratory, University of El Oued, El Oued, Algeria
2 Information and Computer Science Department, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
3 National Higher School of Mathematics, Scientific and Technology Hub of Sidi Abdellah, P.O. Box 75, Algiers, 16093, Algeria
* Corresponding Author: Mohammad Hammoudeh. Email:
Computer Systems Science and Engineering 2024, 48(4), 915-935. https://doi.org/10.32604/csse.2024.046414
Received 30 September 2023; Accepted 27 February 2024; Issue published 17 July 2024
Abstract
Artificial intelligence is now applied in many domains. Despite its advantages, several limitations prevent artificial intelligence techniques from being deployed in specific domains and locations. Accuracy, the poor quality of gathered data, and processing time are major concerns when implementing machine learning techniques, particularly on low-end smart devices. This paper introduces a novel pre-treatment technique for image text detection that uses the divergence and similarity of image pixels to reduce the image size. Reducing the image size while preserving its features shortens model training time while maintaining an acceptable accuracy rate. The reduction is achieved by merging similar image pixels into a single pixel based on calculated values of the standard deviation σ, where two pixels are considered similar if they have approximately the same σ values. The work proposes a new pipeline approach that reduces the size of the image in the input and intermediate layers of a deep learning model by operating on pixels merged according to their standard deviation values instead of on the whole image. The experimental results show that this technique significantly improves the performance of existing text detection methods, particularly in challenging scenarios such as low-end IoT devices that capture images with low contrast or noisy backgrounds. Compared with other techniques, the proposed technique can potentially be exploited for text detection in IoT-gathered multimedia data, with reasonable accuracy and a short computation time. Evaluation on the MSRA-TD500 dataset demonstrates the performance of our approach, Standard Deviation Network (σNet), with precision and recall values of 93.8% and 85.6%, respectively, which outperform recent research results.
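To make the idea of a standard deviation representation concrete, the following is a minimal sketch and not the σNet implementation described in the article. It assumes a grayscale image stored as a list of rows and replaces each non-overlapping tile with its population standard deviation, which shrinks the image by the tile size in each dimension. The names `sigma_map` and `similar`, the 2×2 tile size, and the tolerance `tol` are hypothetical choices made for illustration only.

```python
from statistics import pstdev


def sigma_map(image, block=2):
    """Collapse each non-overlapping block x block tile into one value:
    the population standard deviation of the tile's pixel intensities.
    Trailing rows/columns that do not fill a whole tile are dropped.
    (Illustrative assumption, not the article's exact procedure.)"""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows - rows % block, block):
        out_row = []
        for c in range(0, cols - cols % block, block):
            tile = [image[r + dr][c + dc]
                    for dr in range(block)
                    for dc in range(block)]
            out_row.append(pstdev(tile))
        out.append(out_row)
    return out


def similar(sigma_a, sigma_b, tol=1.0):
    """Treat two sigma values as 'approximately the same' when they
    differ by at most tol (a hypothetical threshold)."""
    return abs(sigma_a - sigma_b) <= tol


# A 4x4 toy image: three flat tiles and one high-contrast tile.
image = [
    [10, 10, 0, 255],
    [10, 10, 255, 0],
    [5, 5, 7, 7],
    [5, 5, 7, 7],
]
reduced = sigma_map(image)  # 2x2 map of per-tile standard deviations
```

In this toy example the flat tiles map to a σ of zero and would count as similar to one another, while the high-contrast tile, the kind of region where text edges tend to appear, stands out with a large σ.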
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.