New Spam Filtering Method with Hadoop Tuning-Based MapReduce Naïve Bayes

Ji, Keungyeup; Kwon, Youngmi

doi:10.32604/csse.2023.031270

Open Access

ARTICLE

by Keungyeup Ji, Youngmi Kwon*

Department of Radio and Information Communications Engineering, Chungnam National University, Daejeon, 34134, Korea

* Corresponding Author: Youngmi Kwon. Email: email

Computer Systems Science and Engineering 2023, 45(1), 201-214. https://doi.org/10.32604/csse.2023.031270

Received 13 April 2022; Accepted 08 June 2022; Issue published 16 August 2022

Abstract

As the importance of email increases, the volume of malicious email is also increasing, and so is the need to filter it. It is more economical to combine commodity hardware (medium-sized servers or PCs) with a virtual environment, use it as a single server resource, and filter malicious email with machine learning techniques. We therefore used the Hadoop MapReduce framework together with Naïve Bayes for malicious email filtering. Naïve Bayes was selected because it is one of the top machine learning methods (Support Vector Machine (SVM), Naïve Bayes, K-Nearest Neighbor (KNN), and Decision Tree) in terms of execution time and accuracy. Malicious email was filtered in two ways: with MapReduce programs applying Naïve Bayes, a supervised machine learning method, in a performance-tuned Hadoop framework; and with a Python program applying Naïve Bayes on a bare-metal server without Hadoop. Comparing the accuracy and prediction error rates of the two methods, the Hadoop MapReduce Naïve Bayes method improved the accuracy of spam and ham email identification by a factor of 1.11 and the prediction error rate by a factor of 14.13 relative to the non-Hadoop Python Naïve Bayes method.
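The general idea of expressing Naïve Bayes training as a map step and a reduce step can be illustrated with a minimal single-process sketch. This is not the authors' implementation and does not use Hadoop: the function names (`map_phase`, `reduce_phase`, `train`, `classify`), the whitespace tokenizer, the add-one (Laplace) smoothing, and the toy emails are all assumptions made for illustration.

```python
import math
from collections import Counter

def map_phase(label, text):
    """Mapper (hypothetical): emit ((label, word), 1) for each token of one email."""
    for word in text.lower().split():
        yield (label, word), 1

def reduce_phase(pairs):
    """Reducer (hypothetical): sum the counts for each (label, word) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals

def train(emails):
    """emails is a list of (label, text); returns per-class word counts and class counts."""
    pairs = [kv for label, text in emails for kv in map_phase(label, text)]
    word_counts = reduce_phase(pairs)
    label_counts = Counter(label for label, _ in emails)
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest log posterior, using add-one smoothing (an assumption)."""
    vocab = {word for (_, word) in word_counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        n_words = sum(c for (l, _), c in word_counts.items() if l == label)
        score = math.log(label_counts[label] / total_docs)  # log prior
        for word in text.lower().split():
            # Counter returns 0 for unseen (label, word) pairs
            score += math.log((word_counts[(label, word)] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data, invented for this sketch only
emails = [
    ("spam", "win money now"),
    ("spam", "free money offer"),
    ("ham", "meeting schedule today"),
    ("ham", "project meeting notes"),
]
word_counts, label_counts = train(emails)
print(classify("free money", word_counts, label_counts))
print(classify("meeting notes", word_counts, label_counts))
```

In a real Hadoop job the mapper and reducer would run as separate distributed tasks over HDFS input splits; here they are ordinary functions so that the counting logic can be read in one place.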

Keywords

Hadoop; Hadoop distributed file system (HDFS); MapReduce; configuration parameter; malicious email filtering; Naïve Bayes

Cite This Article

APA Style

Ji, K., Kwon, Y. (2023). New spam filtering method with Hadoop tuning-based MapReduce Naïve Bayes. Computer Systems Science and Engineering, 45(1), 201–214. https://doi.org/10.32604/csse.2023.031270

Vancouver Style

Ji K, Kwon Y. New spam filtering method with Hadoop tuning-based MapReduce Naïve Bayes. Comput Syst Sci Eng. 2023;45(1):201–214. https://doi.org/10.32604/csse.2023.031270

IEEE Style

K. Ji and Y. Kwon, “New Spam Filtering Method with Hadoop Tuning-Based MapReduce Naïve Bayes,” Comput. Syst. Sci. Eng., vol. 45, no. 1, pp. 201–214, 2023. https://doi.org/10.32604/csse.2023.031270

Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
