Quranic Script Optical Text Recognition Using Deep Learning in IoT Systems

Mahmoud Badry; Mohammed Hassanin; Asghar Chandio; Nour Moustafa

doi:10.32604/cmc.2021.015489

Open Access

ARTICLE

Mahmoud Badry^1,*, Mohammed Hassanin^1,2, Asghar Chandio^2,3, Nour Moustafa^2

1 School of Engineering and Information Technology, UNSW Canberra. ACT, Newcastle NSW, 2620, Australia
2 Faculty of Computers and Information, Fayoum University, Fayoum, Egypt
3 Quaid-e-Awam University of Engineering, Science and Technology, Nawabshah, Pakistan

* Corresponding Author: Mahmoud Badry.

Computers, Materials & Continua 2021, 68(2), 1847-1858. https://doi.org/10.32604/cmc.2021.015489

Received 17 November 2020; Accepted 22 December 2020; Issue published 13 April 2021

Abstract

With the worldwide spread of internet-connected devices and rapid advances in Internet of Things (IoT) systems, much research has applied machine learning methods to recognizing IoT sensor data. This is particularly the case for optical character recognition of handwritten scripts. Recognizing text in images has several useful applications, including content-based image retrieval, search, and document archiving. Arabic is one of the most widely used languages in the world. However, Arabic text recognition in images, especially of handwritten text, is still at a nascent stage. This is mainly due to the complexity of the language, its different writing styles, variations in character shape, diacritics, and the connected nature of Arabic script. In this paper, two deep learning models are proposed. The first is based on sequence-to-sequence recognition, while the second is based on a fully convolutional network. To measure the performance of these models, a new dataset called QTID (Quran Text Image Dataset) was devised. This is the first Arabic dataset that includes Arabic diacritics. It consists of 309,720 different 192 × 64 annotated Arabic word images taken from the Holy Quran, comprising 2,494,428 characters in total. The annotated images were randomly divided into 90%, 5%, and 5% sets for training, validation, and testing, respectively. Both models were set up to recognize the Arabic Othmani font in the QTID. Experimental results show that the proposed methods achieve state-of-the-art outcomes. Furthermore, the proposed models perform strongly in terms of character recognition rate, F1-score, average precision, and recall, and they outperform leading Arabic text recognition engines such as Tesseract and ABBYY FineReader.
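The random 90%/5%/5% division described in the abstract can be illustrated with a minimal sketch. The abstract gives only the proportions and the total of 309,720 images; the function name `split_indices`, the fixed seed, and the shuffle-then-slice approach are assumptions made for illustration and are not taken from the paper.

```python
import random


def split_indices(n_items, percents=(90, 5, 5), seed=0):
    """Shuffle indices 0..n_items-1 and cut them into train/val/test lists.

    Hypothetical helper: the seed and the shuffle-then-slice strategy are
    illustrative assumptions, not the authors' procedure.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # reproducible random order
    # Integer arithmetic avoids floating-point rounding in the cut points.
    n_train = n_items * percents[0] // 100
    n_val = n_items * percents[1] // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train, val, test = split_indices(309_720)
print(len(train), len(val), len(test))  # prints: 278748 15486 15486
```

Under these assumptions, the split yields 278,748 training images and 15,486 images each for validation and testing, which together account for all 309,720 images.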

Keywords

OCR; Quranic script; IoT; deep learning

Cite This Article

APA Style

Badry, M., Hassanin, M., Chandio, A., Moustafa, N. (2021). Quranic Script Optical Text Recognition Using Deep Learning in IoT Systems. Computers, Materials & Continua, 68(2), 1847–1858. https://doi.org/10.32604/cmc.2021.015489

Vancouver Style

Badry M, Hassanin M, Chandio A, Moustafa N. Quranic Script Optical Text Recognition Using Deep Learning in IoT Systems. Comput Mater Contin. 2021;68(2):1847–1858. https://doi.org/10.32604/cmc.2021.015489

IEEE Style

M. Badry, M. Hassanin, A. Chandio, and N. Moustafa, “Quranic Script Optical Text Recognition Using Deep Learning in IoT Systems,” Comput. Mater. Contin., vol. 68, no. 2, pp. 1847–1858, 2021. https://doi.org/10.32604/cmc.2021.015489

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
