Open Access

ARTICLE

VPN and Non-VPN Network Traffic Classification Using Time-Related Features

Mustafa Al-Fayoumi1, Mohammad Al-Fawa’reh2, Shadi Nashwan3,*

1 King Hussein School of Computing Sciences, Princess Sumaya University for Technology (PSUT), Amman, Jordan
2 College of Information Technology and Computer Science, Yarmouk University, Amman, Jordan
3 College of Computer and Information Sciences, Jouf University, Aljouf, Saudi Arabia

* Corresponding Author: Shadi Nashwan.

Computers, Materials & Continua 2022, 72(2), 3091-3111. https://doi.org/10.32604/cmc.2022.025103

Abstract

The continual growth in the use of technological appliances during the COVID-19 pandemic has resulted in a massive volume of data flow on the Internet, as many employees have transitioned to working from home. Furthermore, as more people adopt encrypted data transmission through a Virtual Private Network (VPN) or the Tor Browser (dark web) to keep their data private and hidden, network traffic encryption is rapidly becoming a universal approach. This complicates quality of service (QoS), traffic monitoring, and network security as provided by Internet Service Providers (ISPs), particularly for analysis and anomaly detection approaches that depend on the nature of the network traffic. Categorizing encrypted traffic is one of the most challenging issues introduced by VPNs, which are used to bypass censorship as well as to gain access to geo-locked services. An efficient approach is therefore needed that identifies encrypted network traffic by extracting and selecting valuable features, so as to improve quality of service and network management and to oversee overall performance. In this paper, the classification of network traffic into VPN and non-VPN traffic is studied based on the efficiency of time-based features extracted from network packets. The paper proposes two machine learning models that categorize network traffic into encrypted and non-encrypted traffic. The proposed models utilize statistical features (SF), Pearson Correlation (PC), and a Genetic Algorithm (GA), preprocessing the traffic samples into net flow traffic to accomplish the experiment’s objectives. The GA-based method uses a stochastic search based on natural genetics and biological evolution to extract essential features. The PC-based method performs well in removing different features of network traffic.
With a per-packet prediction time on the order of a microsecond, the best model achieved an accuracy of more than 95.02 percent in the most demanding traffic classification task, a drop in accuracy of only 2.37 percent in comparison to the entire statistical-based machine learning approach. This is extremely promising for the development of real-time traffic analyzers.
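
As an illustration of the Pearson Correlation idea mentioned in the abstract, the following is a minimal sketch of one common form of correlation-based feature filtering: a feature is dropped when it is almost perfectly correlated with a feature that has already been kept. This is not the authors' implementation. The feature names (`mean_iat_us`, `max_iat_us`, `duration_s`), the sample values, the greedy selection order, and the 0.9 threshold are all assumptions made for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if its absolute correlation
    with every already-kept feature is at most `threshold` (assumed value)."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-flow time-related features (one value per flow).
flows = {
    "mean_iat_us": [120.0, 340.0, 95.0, 410.0, 150.0],
    "max_iat_us": [240.0, 680.0, 190.0, 820.0, 300.0],  # exactly 2x mean_iat_us
    "duration_s": [5.0, 1.2, 5.1, 3.3, 0.4],
}

selected = drop_correlated(flows)
print(selected)
```

In this toy data `max_iat_us` is a scaled copy of `mean_iat_us`, so the filter discards it, while `duration_s` is only weakly correlated with the kept feature and survives. A real pipeline would compute such features from captured flows and would need to choose the threshold empirically.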


Cite This Article

M. Al-Fayoumi, M. Al-Fawa’reh and S. Nashwan, "VPN and non-VPN network traffic classification using time-related features," Computers, Materials & Continua, vol. 72, no. 2, pp. 3091–3111, 2022. https://doi.org/10.32604/cmc.2022.025103



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.