AN ENSEMBLE CLASSIFICATION APPROACH WITH SELECTIVE UNDER AND OVER SAMPLING OF IMBALANCE INTRUSION DETECTION DATASET

[ 31 Dec 2019 | vol. 13 | no. 4 | pp. 41-50 ]

About Authors:

Priyanka Tripathi1 and Rajni Ranjan Singh Makwana2
-1Madhav Institute of Technology and Science, CSE & IT, Gwalior (M.P.), India
-2Madhav Institute of Technology and Science, CSE & IT, Gwalior (M.P.), India

Abstract:

KDD CUP 99 is a popular benchmark dataset that was introduced at the Third International Knowledge Discovery and Data Mining Tools Competition. It is widely used for developing and improving intrusion detection strategies. The attacks in the dataset are divided into four categories: Probe, DoS, R2L and U2R. In addition to these attack categories, a fifth category, normal, represents normal traffic. The R2L and U2R categories contain very few tuples in comparison with the others, so they need to be oversampled. Similarly, the remaining categories should be undersampled to mitigate the class imbalance of the dataset. The synthetic minority oversampling technique (SMOTE) is applied with ratios from 50% to 1000% to the rare classes U2R and R2L, and the resampled data are supplied to the ensemble classifier (AdaBoost and random forest). The experiments using machine-learning techniques were conducted using the best ratios. The results of the proposed method were significantly better than those of the previous approach and other related work.
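To make the resampling step concrete, the following is a minimal sketch of SMOTE-style oversampling at a given percentage, together with random undersampling. It is not the authors' implementation (the keywords suggest the experiments were run in Weka). The function names `smote` and `undersample`, the toy data, and the choice to cycle through the minority samples in order are all assumptions made for illustration.

```python
import math
import random


def smote(minority, percent, k=5, seed=0):
    """Create synthetic minority samples by interpolating between neighbours.

    minority: list of numeric feature vectors (lists of floats).
    percent:  integer oversampling amount; 200 yields two synthetic
              samples per original sample.
    """
    n = len(minority)
    if n < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, n - 1)
    total = n * percent // 100  # number of synthetic samples to generate
    synthetic = []
    for i in range(total):
        idx = i % n  # simplification (assumption): cycle through samples in order
        base = minority[idx]
        others = [p for j, p in enumerate(minority) if j != idx]
        # k nearest minority neighbours of the base sample (Euclidean distance)
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


def undersample(majority, keep, seed=0):
    """Randomly keep `keep` samples of a majority class, without replacement."""
    return random.Random(seed).sample(majority, keep)


# Toy data standing in for a rare class (e.g. U2R) and a frequent one (e.g. DoS).
rare = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frequent = [[float(i), float(i % 7)] for i in range(100)]

# Size of the rare class after oversampling at several ratios.
for percent in (50, 100, 500, 1000):
    print(percent, len(rare) + len(smote(rare, percent)))
# Size of the frequent class after undersampling.
print(len(undersample(frequent, 20)))
```

In a full pipeline, the resampled classes would be merged into one training set and passed to the ensemble classifier. The sketch stops before that step because the abstract does not specify how the classifiers are combined or configured.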

Keywords:

Data mining, Intrusion Detection System (IDS), Network-based Intrusion Detection System (NIDS), Weka, SMOTE, Attacks, Rare Class, Imbalanced Data, KDD Cup 1999

 

About this Article: