  • E-ISSN 2508-7894
  • KCI

Korean Journal of Artificial Intelligence, (E)2508-7894
2023, v.11 no.2, pp.19-27
https://doi.org/10.24225/kjai.2023.11.2.19
Hae-Duck Joshua Jeong

Abstract

Internet traffic has grown exponentially in recent years, driven by advances in the Internet of Things, sensor-equipped mobile networks, and communication functions embedded in various devices. The COVID-19 pandemic has further led to an explosion of social network traffic. In this context, research on machine-learning-based network traffic analysis has drawn considerable attention. In this paper, we design and develop a new machine learning framework for network traffic analysis that distinguishes normal traffic from abnormal traffic. To achieve this, we combine well-known machine learning algorithms with network traffic analysis techniques. Using KDD CUP'99, one of the most widely used datasets, in the Weka and Apache Spark environments, we compare and investigate results obtained from time-series analysis of various aspects, including malicious code, feature extraction, data formalization, and the implementation of a network traffic measurement tool. Experimental analysis showed that both the logistic regression and the support vector machine algorithms performed excellently in the performance evaluation, with logistic regression performing the better of the two. The quantitative analysis results of our proposed machine learning framework show that this approach is reliable and practical, and the performance of the proposed system is compared with that reported in another paper. In addition, we determined that the framework processes data much faster in the Apache Spark environment than in Weka as the amount of data used to create and classify machine learning models increases.
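
As a rough illustration of the normal/abnormal classification task described above, the following minimal sketch trains a logistic regression classifier in plain Python. It is not the framework from the paper: it uses neither Weka, Apache Spark, nor KDD CUP'99. The records are synthetic, and the two features (a scaled connection rate and a scaled error rate), the function names, and the hyperparameters are all assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_records(n, seed=0):
    """Hypothetical stand-in for labeled connection records (NOT KDD CUP'99).

    Each record has two made-up, pre-scaled features; label 1 = abnormal.
    """
    rng = random.Random(seed)
    records = []
    for _ in range(n):
        abnormal = rng.random() < 0.5
        center = 2.0 if abnormal else 0.0
        features = [rng.gauss(center, 0.5), rng.gauss(center, 0.5)]
        records.append((features, 1 if abnormal else 0))
    return records


def score(w, b, x):
    """Linear score w.x + b for one record."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_logistic_regression(data, lr=0.1, epochs=200):
    """Fit weights with plain stochastic gradient descent on the log loss."""
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            err = sigmoid(score(w, b, x)) - y  # gradient of log loss w.r.t. score
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def predict(w, b, x):
    """Return 1 (abnormal) if the predicted probability is at least 0.5."""
    return 1 if sigmoid(score(w, b, x)) >= 0.5 else 0


def accuracy(w, b, data):
    """Fraction of records whose predicted label matches the true label."""
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


train = make_synthetic_records(400, seed=0)
test = make_synthetic_records(200, seed=1)
w, b = train_logistic_regression(train)
acc = accuracy(w, b, test)
print(f"held-out accuracy on synthetic data: {acc:.3f}")
```

The accuracy printed here reflects only the well-separated synthetic data and says nothing about the results reported in the paper. A comparison with a support vector machine, as in the abstract, would swap the log loss for a hinge loss while keeping the same train/evaluate structure.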

Keywords
Network traffic measurement, Machine learning, Network traffic analysis, Logistic regression, Support vector machine
