Analyzing and Classifying Network Attacks Using Machine Learning on the NSL-KDD Dataset

Authors

  • John Stevens, Department of Computer Science, University of West Florida, Pensacola, FL 32514
  • Sikha Bagui, Department of Computer Science, University of West Florida, Pensacola, FL 32514
  • Subhash Bagui, Department of Mathematics and Statistics, University of West Florida, Pensacola, FL 32514

Keywords:

Machine Learning, OneR, J48 Decision Tree, Naïve Bayes, NSL-KDD, Information Gain, Network Based Intrusion Detection Systems

Abstract

This work presents a thorough study of the benchmark NSL-KDD dataset, with the aim of building an efficient network-based Intrusion Detection System. The novelty of the work lies in determining, using Machine Learning, the minimal number of features necessary to classify each individual attack as well as each attack category in the NSL-KDD dataset. No previous analysis has been done at the individual attack level. Feature selection is performed using Information Gain, and classification is then performed using machine learning algorithms, specifically J48 Decision Tree, Naïve Bayes, and a less commonly used classifier, OneR. The most important features for classifying each individual attack and each attack category, as determined by Information Gain, are presented, along with the classification results. High classification accuracies, mostly over 99%, were achieved with the J48, Naïve Bayes, and OneR classifiers. The number of attributes needed to reach a true positive rate of 100% or very close to 100% is also presented, as is the number needed to reach a recall of 100% or very close to 100%.
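
As a rough illustration of the feature-selection step described in the abstract, the sketch below ranks discrete features by Information Gain, that is, the entropy of the class labels minus the expected entropy after splitting on a feature. It is a minimal sketch, not the authors' implementation: the helper names (entropy, information_gain) and the four toy records are assumptions made up for illustration, and only the column names are borrowed from NSL-KDD. Continuous features would need to be discretized first, which this sketch does not do.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the expected label entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy records invented for illustration only; they are NOT taken from NSL-KDD.
records = [
    {"protocol_type": "tcp", "flag": "S0", "label": "attack"},
    {"protocol_type": "udp", "flag": "S0", "label": "attack"},
    {"protocol_type": "tcp", "flag": "SF", "label": "normal"},
    {"protocol_type": "udp", "flag": "SF", "label": "normal"},
]

labels = [r["label"] for r in records]
features = ["protocol_type", "flag"]

# Score every feature, then rank from most to least informative.
scores = {f: information_gain([r[f] for r in records], labels) for f in features}
ranking = sorted(features, key=lambda f: scores[f], reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

In this toy data the flag column separates the two classes perfectly while protocol_type carries no information about the label, so flag ranks first. A classifier could then be trained on only the top-ranked features, with the number of features increased until the target true positive rate is reached.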

Published

2021-12-02