Political Optimizer with Probabilistic Neural Network-Based Arabic Comparative Opinion Mining

Najm Alotaibi; Badriyya Al-onazi; Mohamed Nour; Abdullah Mohamed; Abdelwahed Motwakel; Gouse Mohammed; Ishfaq Yaseen; Mohammed Rizwanullah

doi:10.32604/iasc.2023.033915

icon Open Access

ARTICLE

Political Optimizer with Probabilistic Neural Network-Based Arabic Comparative Opinion Mining

Najm Alotaibi¹, Badriyya B. Al-onazi², Mohamed K. Nour³, Abdullah Mohamed⁴, Abdelwahed Motwakel^5,*, Gouse Pasha Mohammed⁵, Ishfaq Yaseen⁵, Mohammed Rizwanullah⁵

1 Prince Saud AlFaisal Institute for Diplomatic Studies, Riyadh, 13369, Saudi Arabia
2 Department of Language Preparation, Arabic Language Teaching Institute, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh, 11671, Saudi Arabia
3 Department of Computer Sciences, College of Computing and Information System, Umm Al-Qura University, Makkah, 24211, Saudi Arabia
4 Research Centre, Future University in Egypt, New Cairo, 11845, Egypt
5 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, AlKharj, 16242, Saudi Arabia

* Corresponding Author: Abdelwahed Motwakel. Email: email

Intelligent Automation & Soft Computing 2023, 36(3), 3121-3137. https://doi.org/10.32604/iasc.2023.033915

Received 01 July 2022; Accepted 14 October 2022; Issue published 15 March 2023

Abstract

Opinion Mining (OM) studies in Arabic are limited though it is one of the most extensively-spoken languages worldwide. Though the interest in OM studies in the Arabic language is growing among researchers, it needs a vast number of investigations due to the unique morphological principles of the language. Arabic OM studies experience multiple challenges owing to the poor existence of language sources and Arabic-specific linguistic features. The comparative OM studies in the English language are wide and novel. But, comparative OM studies in the Arabic language are yet to be established and are still in a nascent stage. The unique features of the Arabic language make it essential to expand the studies regarding the Arabic text. It contains unique features such as diacritics, elongation, inflection and word length. The current study proposes a Political Optimizer with Probabilistic Neural Network-based Comparative Opinion Mining (POPNN-COM) model for the Arabic text. The proposed POPNN-COM model aims to recognize comparative and non-comparative texts in Arabic in the context of social media. Initially, the POPNN-COM model involves different levels of data pre-processing to transform the input data into a useful format. Then, the pre-processed data is fed into the PNN model for classification and recognition of the data under different class labels. At last, the PO algorithm is employed for fine-tuning the parameters involved in this model to achieve enhanced results. The proposed POPNN-COM model was experimentally validated using two standard datasets, and the outcomes established the promising performance of the proposed POPNN-COM method over other recent approaches.

Keywords

Comparative opinion mining; Arabic text; social media; parameter tuning; machine learning; political optimizer

1 Introduction

Social media has emerged as a powerful platform for obtaining different types of data from different fields. The advancement and the application of computerized techniques in the extraction of information and knowledge from Arabic text in mass media are still in a nascent stage due to the complex nature of the language [1]. Various techniques such as normalization, learning algorithms, text cleaning and the Natural Language Processing (NLP) method have recently evolved. Arabic is a Semitic language that closely relates to Islam and the Arabic culture. The Arabic language is used to write the verses in the holy book of Muslims [2]. It is an official language for 422 million people spread across the globe. Along with cultural uniqueness and rich history, the Arabic language has a distinct structure and nature compared to other languages, such as English. For instance, Arabic is written from right to left [3,4]. With a total of 28 letters, the language has three vowels and diacritics. The shapes of the letters in Arabic text vary under their position in a word and the absence of capitalization [5]. Arabic is a structured language with a large vocabulary. Its morphology serves a significant role. Further, the words are frequently built in a complicated manner [6,7] and might contain agglutinative words, drop features and affixes. The Arabic language’s characteristics and complexities make its opinion mining process a highly challenging one [8].

Businesses acquire insights about their customer purchase behaviour to gain an advantage over their competitors. These insights explain why customers are not purchasing a company’s products and prefer their competitors’ products [9]. Such crucial information gives the companies an overview of whether a new product or a new service is liked or not by the consumer. Public Facebook review pages, customer product review pages and blogs are filled with opinions of the end users instead of the entities that compare the outcomes against their competitors [10,11]. Mostly, the political reviews that compare a government’s performance are important for the decision-makers in order to ensure that the people find their government’s performance satisfactory and the country remains economically stable. This study examined different types of comparative sentences used in Arabic to extract and analyse the sentiments expressed in the public domain [12]. Comparative relations denote the extraction of the comparative opinion components. Many research works conducted in the comparative opinion mining domain are concerned with identifying comparative relationships and general classification of such relationships under pre-defined comparative classes [13]. Even though such efforts are highly significant in opinion mining, it becomes insufficient when the comparative opinions are recognized in relation to the argument [14]. Numerous research articles have been published on comparative opinions in the English language. However, the studies concerning the Arabic language focused only on the detection of comparative relations so far.

Aljedani et al. [15] proposed a hierarchical multi-label classification framework for the Arabic language. This study projected a Hierarchical Multi-label Arabic Text Classification (HMATC) technique with Machine Learning (ML). The study evaluated the impact of the feature set dimensions and the Feature Selection (FS) approaches on the classification performance. The outcomes of the Hierarchy of Multilabel ClassifiER (HOMER) approach were maximized by analysing different sets of multi-label classifiers, clustering methods and various clusters to enhance the performance of the hierarchical classification process. Boukil et al. [16] introduced a creative approach to Arabic text classification in which an Arabic stemming technique was used for extraction, selection and reducing the required features. Afterwards, the Term Frequency-Inverse Document Frequency (TF-IDF) method was leveraged as a feature weighting algorithm. At last, CNN, a Deep Learning technique, was used for classification. Though this technique is highly effective in other domains like pattern recognition and image processing, it is hardly utilized in text mining.

Bahassine et al. [17] modelled an enhanced algorithm for the classification of Arabic texts in which the Chi-square Feature Selection (FS) (referred to here as ImpCHI) method was used to enhance the classification performance. The study compared the outcomes of the enhanced chi-square method using three conventional FS metrics: the Chi-square method, mutual information and the information gain approach. Chantar et al. [18] projected an enhanced binary Grey Wolf Optimizer (GWO) wrapper FS method to handle the Arabic text classification process appropriately. The presented binary GWO was helpful as a wrapper-related FS method. The outcomes of the projected technique were compared with that of the established learning methods such as the Decision Tree (DT), k-Nearest Neighbour (KNN), Naïve Bayes (NB) and the Support Vector Machine (SVM) model. Muaad et al. [19] modelled a method to recognize and represent the Arabic text at the character level with the help of a Deep Convolutional Neural Network (CNN) method. In this study, the CNN method was authenticated through a 5-fold cross-validation test for classifying the Arabic text documents. In addition, the authors employed a mechanism to evaluate the Arabic texts. Finally, the ArCAR mechanism revealed the proposed method’s capability in character-level Arabic text classification.

The current study proposes a Political Optimizer with Probabilistic Neural Network-based Comparative Opinion Mining (POPNN-COM) model to classify the Arabic text. The goal of the proposed POPNN-COM model is to recognize comparative and non-comparative texts in Arabic social media content. Initially, the POPNN-COM model involves different levels of data pre-processing to transform the input data into a useful format. For classification, the pre-processed data is fed into the PNN model to recognize the class labels. At last, the PO algorithm is employed for fine-tuning the parameters involved in the model to achieve enhanced results. The proposed POPNN-COM model was experimentally validated using two datasets, and the results were discussed under different measures.

2 The Proposed Model

In the current study, a novel POPNN-COM model has been developed for opinion mining from the Arabic text. The goal of the proposed POPNN-COM model is to recognize comparative and non-comparative texts in the context of Arabic social media content.

2.1 Data Pre-Processing

In this stage, the POPNN-COM model involves different levels of data pre-processing to transform the input data into a useful format. The pre-processing stage aims to remove the noise from the text and reduce the number of features during text presentation. This process results in fewer storage requirements and high classification efficiency. The text pre-processing stage involves multiple sub-processes for the Arabic language. At first, the symbols such as ‘‘#” (hashtag), URLs, ‘‘@” (mention), EOL (End Of Line) and RT (retweet) are removed. Then, the Arabic stopwords are eliminated with the help of the Python NLTK (Natural Language Toolkit) library. The purpose of normalization is to convert every letter variant to a single form (viz., Ï !i). This process is applied to unify different letter formats, such as ya, alif and waw in Arabic. The speech effects and the single letters are repeatedly reiterated. For example, ‘Hellooo’ is one of the common features that appear in social media posts, which is mainly intended for emphasizing a conversation. Therefore, if such terms are removed, it produces a single form of the words, i.e., ‘Hellooo’ and ‘Helloooooo’ and so on, are mapped altogether against the word, ‘Hello’.

TF-IDF is a weighting factor that reflects the significance of a word in a corpus document. It measures a certain word’s overall number of occurrences in the provided document and is termed ‘Term Frequency’. Nonetheless, certain words, such as the stop-words, commonly emerge in every single document; such content should be discounted in a systematic manner. This procedure is named ‘Inverse Document Frequency’. The lesser the value is, as a sign of differentiating a specific document, the higher the amount of documents will be, in which the word appears. Then, the TF-IDF approach is applied as a part of the Scikit-learn library and is evaluated as given below [20].

$w_{i j} = t f_{i j} \times log (\frac{N}{d f_{i}}),$ $wij=tfij×log⁡(Ndfi),$ (1)

Now, the number of times a word $i$ $i$ occurs in the document $j$ $j$ is represented by tfij, the number of documents that has the word $i$ $i$ is denoted by dfi, and N refers to overall number of documents.

2.2 Opining Mining Using PNN Model

For classification, the pre-processed data is fed into the PNN model to recognize the class labels. The PNN method is the most commonly-utilized data mining approach and is employed in several pattern recognition and classification problems [21]. In these kinds of Neural networks (NN), the operation is pre-arranged in a multi-layered network that contains four layers such as the output layer, input layer, pattern layer and summation layer. In the input layer, a dimension $(p)$ $(p)$ of the input vector reflects the dimension of a layer. In the pattern layer, the dimension of the number of instances in the training subset is equivalent to the dimension of a layer. The 3rd layer (summation) contains several classes in the group. The 4th layer (output) and the validation examples are categorized based on the number of classes. Fig. 1 depicts the infrastructure of the PNN technique.

images

Figure 1: Architecture of the PNN approach

The operation formula of the PNN method includes all four important layers.

In the input layer, all the neurons have a prediction parameter. Its values are fed into every neuron in the hidden layer.

In the pattern layer, for all the training samples, a single layer formulates the product associated with an input vector $x$ $x$ by involving the weight of the vector, $w_{i}, z_{i} = x \cdot w_{i}^{T}$ $wi,zi=x⋅wiT$ . Then, the succeeding non-linear process is performed as given below.

$exp [(\frac{(- w_{i} - x) . (w_{i} - x)^{T}}{(2 α^{2})})],$ $exp⁡[((−wi−x).(wi−x)T(2α2))],$ (2)

In Eq. (2), $i$ $i$ denotes a pattern number, $T$ $T$ indicates the overall number of the training patterns, $X$ $X$ shows the $i$ $i$ - $t h$ $th$ training pattern from a class, and $a$ $a$ refers to a smoothing variable.

The summation layer aggregates the enhanced values of all the input classes and produces the network output as a probability vector (Eq. (3)):

$\sum_{i} exp [(\frac{(- w_{i} - x) . (w_{i} - x)^{T}}{(2 α^{2})})] .$ $∑iexp⁡[((−wi−x).(wi−x)T(2α2))].$ (3)

The output layer produces different classes that depend on the decision classes, such as $Ω_{r}$ $Ωr$ and $Ω_{σ}$ $Ωσ$ . Here, $r \neq s,$ $r≠s,$ $r,$ $r,$ $s = 1, 2, . . ., q$ $s=1,2,...,q$ and a classification condition is shown in Eq. (4).

$\sum_{i} exp [(\frac{(- w_{i} - x) . (w_{i} - x)^{T}}{(2 α^{2})})] > \sum_{j} exp [(\frac{(- w_{j} - x) . (w_{j} - x)^{T}}{(2 α^{2})})]$ $∑iexp⁡[((−wi−x).(wi−x)T(2α2))]>∑jexp⁡[((−wj−x).(wj−x)T(2α2))]$ (4)

This node possesses a single weight $C$ $C$ to denote the probability of the preceding membership and is composed of a number of trained instances under each class $C$ $C$ . Further, it is represented as the cost parameter as given herewith.

$C = - (\frac{h_{s} l_{s}}{h_{r} l_{r}} . \frac{n_{r}}{n_{s}}),$ $C=−(hslshrlr.nrns),$ (5)

In Eq. (5), $h_{s}$ $hs$ indicates the previous prospect in which the present sample proceeds to a group, $n$ $n$ whereas $c_{n}$ $cn$ represents the misclassified cost.

2.3 Parameter Optimization Using PO

At last, the PO algorithm is employed for fine-tuning the parameters involved in the model to enhance the outcomes. The PO algorithm is an intelligent optimization technique developed based on political election procedures followed by human society [22]. In the PO algorithm, every party member is regarded as a solution candidate. The election performance of the party members is regarded as the assessment function. The PO algorithm searches for an optimum solution with the help of a multiple-phase iteration technique. Initially, the whole population encompassing $n^{2}$ $n2$ individuals is separated into $n$ $n$ parties. There exists an $n$ $n$ member, i.e., a solution candidate in all the parties. Furthermore, each party member acts as an election candidate, viz., one member from every party is carefully chosen to form a constituency. The blue-dotted line specifies the partition of a constituency, whereas the red-dotted line shows the partition of a political party. When mapping the population partition to align with the arithmetic modelling, the whole population is separated into $n$ $n$ political parties, and each party contains $n$ $n$ party members as follows.

$P = {P_{1}, P_{2}, P_{3}, \dots, P_{n}}$ $P={P1,P2,P3,…,Pn}$ (6)

$P_{i} = {p_{i}^{1}, p_{i}^{2}, p_{i}^{3}, \dots, p_{i}^{n}}$ $Pi={pi1,pi2,pi3,…,pin}$ (7)

Here, every party member acts as an election candidate. As a result, the whole population is viewed as $n$ $n$ constituency, as shown in the following equations. Also, a party member, i.e., a member of the constituency, should be highlighted, though the logical division is dissimilar. The membership of every constituency is classified as given below.

$C = {C_{1}, C_{2}, C_{3}, \dots, C_{n}}$ $C={C1,C2,C3,…,Cn}$ (8)

$C_{j} = {p_{1}^{j}, p_{2}^{j}, p_{3}^{j}, \dots, p_{n}^{j}}$ $Cj={p1j,p2j,p3j,…,pnj}$ (9)

Besides, the leader of the $i$ $i$ - $t h$ $th$ party, after calculating the fitness of every member, is distinguished as $p_{i}^{*}$ $pi∗$ , and the group of every party leader is characterized by $P^{*}$ $P∗$ as demonstrated in Eq. (10). Likewise, after the election process is over, $C^{*}$ $C∗$ represents the winner from every constituency and is called as a ‘parliamentarian’ as illustrated in Eq. (11). Here, $c_{j}^{*}$ $cj∗$ represents the winner of the $j$ $j$ - $t h$ $th$ constituency.

$P^{*} = {p_{1}^{*}, p_{2}^{*}, p_{3}^{*}, \dots, p_{n}^{*}}$ $P∗={p1∗,p2∗,p3∗,…,pn∗}$ (10)

$C^{*} = {c_{1}^{*}, c_{2}^{*}, c_{3}^{*}, \dots, C_{n}^{*}}$ $C∗={c1∗,c2∗,c3∗,…,Cn∗}$ (11)

Campaigning is a fundamental phase in the election process and is accountable for the updated position of the search agent. In this work, a particular presentation is that a party member changes their position based on their leader $P^{*}$ $P∗$ of the party and the winner $C^{*}$ $C∗$ of the constituency. Furthermore, as discussed below, they acquire knowledge about the previous election over the original position-updating method. The major concept of the RPPUS method is to forecast a favourable area over the mathematical relationships among the subgroups using an optimum solution, i.e., constituency winner or party leader, with the existing and earlier fitness values of the searching agent.

$pi,kj(t+1)={m∗+r(m∗−pi,kj(t)),ifpi,kj(t−1)≤pi,kj(t)≤m∗orpi,kj(t−1)≥pi,kj(t)≥m∗m∗+(2r−1)|m∗−pi,kj(t)|,ifpi,kj(t−1)≤m∗≤pi,kj(t)orpi,kj(t−1)≥m∗≥pi,kj(t)m∗+(2r−1)|m∗−pi,kj(t−1)|,ifm∗≤pi,kj(t−1)≤pi,kj(t)orm∗≥pi,kj(t−1)≥pi,kj(t)$ (12)

$pi,kj(t+1)={ m*+(2r−1)|m*−pi,kj(t)|,if pi,kj(t−1)≤pi,kj(t)≤m*orpi,kj(t−1)≥pi,kj(t)≥m*pi,kj(t−1)+r(pi,kj(t)−pi,kj(t−1)),if pi,kj(t−1)≤m*≤pi,kj(t)orpi,kj(t−1)≥m*≥pi,kj(t)m*+(2r−1)|m*−pi,kj(t−1)|,if m*≤pi,kj(t−1)≤pi,kj(t)orm*≥pi,kj(t−1)≥pi,kj(t)$ (13)

In this expression, $m∗$ shows the party’s leader or the constituency’s winner, $r$ signifies an arbitrary number in the range of [0,1], and $T$ characterizes the existing iteration count. Fig. 2 demonstrates the steps involved in the PO technique.

images

Figure 2: Steps involved in the PO algorithm

The party-switching stage is executed to create a balance between the exploitation and the exploration phases. It presents an adaptive variable $λ$ termed a ‘party switching rate’. Every party member tends to switch and gets selected into an arbitrarily-chosen party. The switching probability is represented by $λ$ , which is primarily 1. It gets reduced linearly to $0$ as follows.

$λ=(1−tT)∗λmax$ (14)

Here, the fitness of every solution candidate is described, whereas the party leader and the constituency winner are upgraded.

$q=arg⁡min1≤j≤nf(Pij)pi∗=Piq$ (15)

$q=arg⁡min1≤i≤nf(Pij)cj∗=pqj$ (16)

The party-switching stage aims to alter a party’s perception, whereas the parliamentary affair stage is the modification of a constituency’s perspectives. A constituency winner interacts with others to enhance their fitness values. Every constituency winner uses the following formula to update their location based on the rest of the arbitrarily-designated constituencies. It is to be noted that the movement is employed when the fitness of $cj∗$ becomes highly efficient.

images

3 Results and Discussion

The current chapter assesses the proposed POPNN-COM model’s performance using two datasets, Corpus and Corpus+. The details of the datasets are given in Table 1. For experimental validation, ten-fold cross-validation was executed.

images

Fig. 3 demonstrates the confusion matrices generated by the POPNN-COM model on the Corpus dataset using distinct Training (TR) and Testing (TS) datasets. The figure indicates that the proposed POPNN-COM model properly recognized both CT and the NCT samples in all aspects. Table 2 and Fig. 4 portray the CT and the NCT classification outcomes achieved by the proposed POPNN-COM model on the Corpus dataset. Upon the entire dataset, the proposed POPNN-COM model produced average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 95.50%, 95.52%, 95.50%, 95.50% and 95.50%, respectively. Simultaneously, upon 70% of the TR data, the presented POPNN-COM model achieved average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 95.93%, 95.95%, 95.92%, 95.93% and 95.92% correspondingly. Concurrently, on 30% of the TS data, the POPNN-COM approach resulted in average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 94.50%, 94.50%, 94.54%, 94.50% and 94.54% correspondingly.

images

Figure 3: Confusion matrices of the POPNN-COM approach on Corpus dataset (a) Entire dataset, (b) 70% of the TR data and (c) 30% of the TS data

images

Figure 4: Analytical results of the POPNN-COM approach on the Corpus dataset

Both Training Accuracy (TA) and Validation Accuracy (VA) values attained by the proposed POPNN-COM method on Corpus dataset are illustrated in Fig. 5. The experimental outcomes denote that the proposed POPNN-COM technique achieved the maximal TA and VA values. In contrast, the VA values were higher than the TA values.

images

Figure 5: TA and VA analyses results of the POPNN-COM approach on the Corpus dataset

Both Training Loss (TL) and Validation Loss (VL) values, achieved by the proposed POPNN-COM approach on Corpus dataset, are shown in Fig. 6. The experimental outcomes imply that the presented POPNN-COM algorithm accomplished the least TL and VL values. In contrast, the VL values were lesser than the TL values.

images

Figure 6: TL and VL analyses results of the POPNN-COM approach on Corpus dataset

A clear precision-recall analysis was conducted on the proposed POPNN-COM method using the Corpus dataset, and the results are portrayed in Fig. 7. The figure indicates that the proposed POPNN-COM method produced the enhanced precision-recall values under all the classes.

images

Figure 7: Precision-recall analysis results of the POPNN-COM approach on the Corpus dataset

Fig. 8 establishes the confusion matrices created by the proposed POPNN-COM model on Corpus+ dataset. The figure represents that the proposed POPNN-COM model properly recognized both CT and the NCT samples in all aspects.

images

Figure 8: Confusion matrices of the POPNN-COM approach on Corpus+ dataset (a) Entire dataset, (b) 70% of the TR data and (c) 30% of the TS data

Table 3 and Fig. 9 signify the CT and the NCT classification outcomes achieved by the proposed POPNN-COM model on Corpus+ dataset. Upon the entire dataset, the proposed POPNN-COM model produced average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 98.50%, 98.50%, 98.50%, 98.50% and 98.50%, respectively. At the same time, on 70% of the TR data, the presented POPNN-COM model achieved average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 98.51%, 98.51%, 98.51%, 98.51% and 98.51% correspondingly. In parallel, on 30% of the TS data, the proposed POPNN-COM model attained average $accuy$ , $precn$ , $recal$ , $Fscore$ and $AUCscore$ values such as 98.47%, 98.48%, 98.47%, 98.47% and 98.47% correspondingly.

images

Figure 9: Analytical results of the POPNN-COM approach on the Corpus+ dataset

Both TA and VA values, obtained by the proposed POPNN-COM method on the Corpus+ dataset, are illustrated in Fig. 10. The experimental outcomes portray that the proposed POPNN-COM technique achieved the maximal TA and VA values. In contrast, the VA values were higher than the TA values.

images

Figure 10: TA and VA analyses results of the POPNN-COM approach on the Corpus+ dataset

Both TL and VL values, achieved by the presented POPNN-COM approach on the Corpus+ dataset, are established in Fig. 11. The experimental outcomes denote that the proposed POPNN-COM algorithm accomplished the least TL and VL values. In contrast, the VL values were lower than the TL values.

images

Figure 11: TL and VL analyses results of the POPNN-COM approach on the Corpus+ dataset

Table 4 showcases the comparative analytical results achieved by the proposed POPNN-COM technique and other existing techniques on the Corpus+ dataset. With respect to $accuy,$ the proposed POPNN-COM model achieved a maximum $accuy$ of 94.50%, whereas the Naïve Bayes (NB), Jrip, J48, POS, KC and the POS-KC models accomplished the least $accuy$ values such as 80.97%, 88.36%, 89.96%, 87.04%, 90.91% and 90.47% respectively. Also, with respect to $precn$ , the POPNN-COM model accomplished a maximum $precn$ of 94.50%, whereas the NB method, Jrip, J48, POS, KC and the POS-KC models accomplished the least $precn$ values such as 74.74%, 91.09%, 92.97%, 92.78%, 98.98% and 92.73% correspondingly. Followed by, with respect to $recal$ , the proposed POPNN-COM model attained a maximum $recal$ of 94.54%, whereas the NB, Jrip, J48, POS, KC and the POS-KC model achieved the least $recal$ values such as 84.72%, 77.12%, 78.60%, 75.48%, 83.14% and 84.16% respectively. With respect to $Fscore$ , the proposed POPNN-COM model achieved a maximum $Fscore$ of 94.50%, whereas the NB method, Jrip, J48, POS, KC and the POS-KC model accomplished the least $Fscore$ values such as 79.53%, 85.47%, 86.98%, 83.40%, 88.66% and 88.31% correspondingly.

images

Table 5 portrays the comparison study outcomes of the proposed POPNN-COM technique and other existing techniques on the Corpus+ dataset. With respect to $accuy,$ the POPNN-COM model achieved an increased $accuy$ of 98.47%, whereas the NB model, Jrip, J48, POS, KC and the POS-KC model achieved the least $accuy$ values such as 80.99%, 89.77%, 90.43%, 86.80%, 91.88% and 91.40% correspondingly. Moreover, with respect to $precn$ , the proposed POPNN-COM model accomplished a maximum $precn$ of 98.48%, whereas the NB approach, Jrip, J48, POS, KC and the POS-KC model accomplished the least $precn$ values such as 73.70%, 93.89%, 97.85%, 92.53%, 94.84% and 92.89% correspondingly. Then, with respect to $recal$ , the proposed POPNN-COM model exhibited an increase in the $recal$ value up to 98.47%. However, the NB approach, Jrip, J48, POS, KC and the POS-KC model accomplished the least $recal$ values such as 87.33%, 81.54%, 79.33%, 76.19%, 85.82% and 86.06%, correspondingly. With respect to $Fscore$ , the proposed POPNN-COM model achieved an increased $Fscore$ of 98.47%. However, the NB approach, Jrip, J48, POS, KC and the POS-KC model accomplished the least $Fscore$ values such as 79.83%, 87.01%, 87.46%, 83.61%, 90.33% and 89.47%, correspondingly.

images

4 Conclusion

In the current study, a novel POPNN-COM model has been developed for opinion mining from the Arabic text. The proposed POPNN-COM model aims to recognize comparative and non-comparative texts in Arabic social media content. Initially, the POPNN-COM model involves different levels of data pre-processing to transform the input data into a useful format. For classification, the pre-processed data is fed into the PNN model to recognize the class labels. At last, the PO algorithm is employed for fine-tuning the parameters involved in the model to enhance the results. The proposed POPNN-COM model was experimentally validated using two datasets, and the outcomes establish the promising performance of the presented POPNN-COM algorithm over other recent approaches. Thus, the POPNN-COM model has been established as an effective tool for comparative opinion mining. It can be used in the future for real-time applications. Further, future studies can also extend the presented model for hate speech detection.

Funding Statement: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R263), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: 22UQU4310373DSR56.

Conflicts of Interest: The authors declare they have no conflicts of interest to report regarding the present study.

References

1. A. Elnagar, R. Al-Debsi and O. Einea, “Arabic text classification using deep learning models,” Information Processing & Management, vol. 57, no. 1, pp. 102121, 2020. [Google Scholar]

2. S. L. M. Sainte and N. Alalyani, “Firefly algorithm based feature selection for arabic text classification,” Journal of King Saud University-Computer and Information Sciences, vol. 32, no. 3, pp. 320–328, 2020. [Google Scholar]

3. A. M. F. Al-Sbou, “A survey of Arabic text classification models,” International Journal of Informatics and Communication Technology, vol. 8, no. 1, pp. 25–28, 2018. [Google Scholar]

4. F. N. Al-Wesabi, “Entropy-based watermarking approach for sensitive tamper detection of arabic text,” Computers Materials & Continua, vol. 67, no. 3, pp. 3635–3648, 2021. [Google Scholar]

5. A. Wahdan, S. A. L. Hantoobi, S. A. Salloum and K. Shaalan, “A systematic review of text classification research based on deep learning models in Arabic language,” International Journal of Electrical and Computer Engineering, vol. 10, no. 6, pp. 6629, 2020. [Google Scholar]

6. F. N. Al-Wesabi, “A hybrid intelligent approach for content authentication and tampering detection of arabic text transmitted via internet,” Computers Materials & Continua, vol. 66, no. 1, pp. 195–211, 2021. [Google Scholar]

7. M. A. Ahmed, R. A. Hasan, A. H. Ali and M. A. Mohammed, “The classification of the modern arabic poetry using machine learning,” TELKOMNIKA, vol. 17, no. 5, pp. 2667, 2019. [Google Scholar]

8. F. N. Al-Wesabi, “A smart english text zero-watermarking approach based on third-level order and word mechanism of Markov model,” Computers, Materials & Continua, vol. 65, no. 2, pp. 1137–1156, 2020. [Google Scholar]

9. A. El Kah and I. Zeroual, “The effects of pre-processing techniques on Arabic text classification,” International Journal of Advanced Trends in Computer Science and Engineering, vol. 10, no. 1, pp. 41–48, 2021. [Google Scholar]

10. F. N. Al-Wesabi, A. Abdelmaboud, A. A. Zain, M. M. Almazah and A. Zahary, “Tampering detection approach of arabic-text based on contents interrelationship,” Intelligent Automation & Soft Computing, vol. 27, no. 2, pp. 483–498, 2021. [Google Scholar]

11. H. El Rifai, L. Al Qadi and A. Elnagar, “Arabic text classification: The need for multi-labeling systems,” Neural Computing and Applications, vol. 34, no. 2, pp. 1135–1159, 2022. [Google Scholar] [PubMed]

12. F. S. Al-Anzi and D. AbuZeina, “Beyond vector space model for hierarchical Arabic text classification: A Markov chain approach,” Information Processing & Management, vol. 54, no. 1, pp. 105–115, 2018. [Google Scholar]

13. A. Adel, N. Omar, M. Albared and A. Al-Shabi, “Feature selection method based on statistics of compound words for arabic text classification,” The International Arab Journal of Information Technology, vol. 16, no. 2, pp. 178–185, 2019. [Google Scholar]

14. F. A. Abdulghani and N. A. Z. Abdullah, “A survey on arabic text classification using deep and machine learning algorithms,” Iraqi Journal of Science, vol. 63, no. 1, pp. 409–419, 2022. [Google Scholar]

15. N. Aljedani, R. Alotaibi and M. Taileb, “HMATC: Hierarchical multi-label Arabic text classification model using machine learning,” Egyptian Informatics Journal, vol. 22, no. 3, pp. 225–237, 2021. [Google Scholar]

16. S. Boukil, M. Biniz, F. Adnani, L. Cherrat and A. Moutaouakkil, “Arabic text classification using deep learning technics,” International Journal of Grid and Distributed Computing, vol. 11, no. 9, pp. 103–114, 2018. [Google Scholar]

17. S. Bahassine, A. Madani, M. Al-Sarem and M. Kissi, “Feature selection using an improved Chi-square for Arabic text classification,” Journal of King Saud University-Computer and Information Sciences, vol. 32, no. 2, pp. 225–231, 2020. [Google Scholar]

18. H. Chantar, M. Mafarja, H. Alsawalqah, A. A. Heidari, I. Aljarah et al., “Feature selection using binary grey wolf optimizer with elite-based crossover for Arabic text classification,” Neural Computing and Applications, vol. 32, no. 16, pp. 12201–12220, 2020. [Google Scholar]

19. A. Y. Muaad, M. A. Al-antari, S. Lee and H. J. Davanagere, “A novel deep learning arcar system for arabic text recognition with character-level representation,” in Proc. of The 1st Int. Electronic Conf. on Algorithms, pp. 14, 2021. [Google Scholar]

20. M. C. Manning, D. K. Chou, B. M. Murphy, R. W. Payne and D. S. Katayama, “Stability of protein pharmaceuticals: An update,” Pharmaceutical Research, vol. 27, no. 4, pp. 544–575, 2010. [Google Scholar] [PubMed]

21. G. Capizzi, G. L. Sciuto, C. Napoli, D. Polap and M. Wozniak, “Small lung nodules detection based on fuzzy-logic and probabilistic neural network with bioinspired reinforcement learning,” IEEE Transactions on Fuzzy Systems, vol. 28, no. 6, pp. 1178–1189, 2020. [Google Scholar]

22. Q. Askari and I. Younas, “Political optimizer based feedforward neural network for classification and function approximation,” Neural Processing Letters, vol. 53, no. 1, pp. 429–458, 2021. [Google Scholar]

Cite This Article

APA Style

Alotaibi, N., Al-onazi, B.B., Nour, M.K., Mohamed, A., Motwakel, A. et al. (2023). Political optimizer with probabilistic neural network-based arabic comparative opinion mining. Intelligent Automation & Soft Computing, 36(3), 3121–3137. https://doi.org/10.32604/iasc.2023.033915

Vancouver Style

Alotaibi N, Al-onazi BB, Nour MK, Mohamed A, Motwakel A, Mohammed GP, et al. Political optimizer with probabilistic neural network-based arabic comparative opinion mining. Intell Automat Soft Comput. 2023;36(3):3121–3137. https://doi.org/10.32604/iasc.2023.033915

IEEE Style

N. Alotaibi et al., “Political Optimizer with Probabilistic Neural Network-Based Arabic Comparative Opinion Mining,” Intell. Automat. Soft Comput., vol. 36, no. 3, pp. 3121–3137, 2023. https://doi.org/10.32604/iasc.2023.033915

BibTex EndNote RIS

Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Table of Content

Political Optimizer with Probabilistic Neural Network-Based Arabic Comparative Opinion Mining

Abstract

Keywords

References

Cite This Article

1009

826

0

Related articles

Further Information

Guidelines

Follow Us

Join Us

Share Link