Open Access
ARTICLE
Urdu Ligature Recognition System: An Evolutionary Approach
1 Department of Computer Science, Institute of Management Sciences, Peshawar, 25000, Pakistan
2 Department of Information Technology, Hazara University, Mansehra, 21120, Pakistan
3 School of Electrical and Computer Engineering, Seoul National University, Seoul, 08826, South Korea
4 Tecnologico de Monterrey, School of Engineering and Sciences, Zapopan, 45201, Mexico
5 Department of Computer Science, Prince Sattam Bin Abdulaziz University, As Sulayyil, 11991, Saudi Arabia
6 Electrical Engineering Department, College of Engineering, Prince Sattam Bin Abdulaziz University, Wadi Addwasir, 11991, Saudi Arabia
7 Electrical Engineering Department, Faculty of Engineering, Aswan University, Aswan, 81542, Egypt
* Corresponding Author: Naila Habib Khan. Email:
Computers, Materials & Continua 2021, 66(2), 1347-1367. https://doi.org/10.32604/cmc.2020.013715
Received 10 August 2020; Accepted 24 August 2020; Issue published 26 November 2020
Abstract
Recognizing cursive text in Arabic script-based languages such as Urdu is extremely difficult because of the script's diverse and complex characteristics. Evolutionary approaches such as genetic algorithms have been used in the past for various optimization and pattern recognition tasks, with exceptional reported results. The proposed Urdu ligature recognition system uses a genetic algorithm for optimization and recognition. Overall, the system comprises pre-processing, segmentation, feature extraction, hierarchical clustering, classification rules, and genetic algorithm optimization and recognition. The pre-processing stage removes noise from the sentence images, whereas the segmentation stage splits the sentences into ligature components. Fifteen features are extracted from each segmented ligature image. Intra-feature hierarchical clustering is then performed, producing clustered data. Next, classification rules are used to represent the clustered data. The genetic algorithm optimizes the classification rules used for recognition of Urdu ligatures by means of multi-level sorting of the clustered data. Experiments conducted on the benchmark UPTI dataset yield promising results, with the proposed system achieving a recognition rate of 96.72%.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.