Muhammad Shahid Bhatti1,*, Azmat Ullah1, Rohaya Latip2, Abid Sohail1, Anum Riaz1, Rohail Hassan3
CMC-Computers, Materials & Continua, Vol.71, No.1, pp. 125-141, 2022, DOI:10.32604/cmc.2022.020083
- 03 November 2021
Abstract Text classification for low-resource languages is a non-trivial and challenging problem. This paper discusses the process of Urdu news classification and Urdu document similarity. Urdu is one of the most widely spoken languages in Asia. The use of computational methodologies for text classification has increased over time. However, Urdu has received little research attention and lacks readily available datasets, which is the primary reason that research on it, and the application of the latest methodologies to it, remains limited. To overcome these obstacles, a medium-sized dataset having six categories is…