NADIA Articles

Advancement Through Research

COMBINED RULE FILTERING AND MAPREDUCE BASED HADOOP SVM APPROACH FOR DETECTING ATTACKS IN BIG DATA SECURITY ANALYTICS

[ 31 Dec 2019 | vol. 12 | no. 4 | pp. 1-18]

About Authors:

Ramakrishna Miryala (1, 2) and Vijaya Saradhi T (3)
(1) Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India
(2) Department of CSE, Sreenidhi Institute of Science and Technology, Ghatkesar, Hyderabad – 501 301, India
(3) Department of CSE, Sreenidhi Institute of Science and Technology, Ghatkesar, Hyderabad – 501 301, India

Abstract:

The large number of layers and the log files generated by different servers make Virtualized Infrastructures (VIs) in cloud computing vulnerable to cyber-attacks, so these attacks need to be detected within the computing environment. Existing techniques use graph-based event correlation for feature extraction in combination with MapReduce, a classifier, and a propagation model. Although effective, such techniques place a heavy burden on MapReduce, which leads to long computation times and gives attackers ample time to carry out malicious activity. This paper therefore proposes an approach that minimizes detection time and maximizes accuracy by employing a rule filter and an optimal classifier. Network and web user logs are acquired using an event log analyzer and stored in the Hadoop Distributed File System (HDFS). A relation graph is formed from the event logs, and a rule filter is applied to the graph. The rule filter discards irrelevant and redundant correlations, and the simplified graph is passed to the MapReduce parser, which identifies potential attack paths based on similarities in source IPs and source ports. The port numbers and application IDs of the potential attack paths are fed to a two-stage Support Vector Machine (SVM) classifier, which determines whether an attack exists. The approach is evaluated by comparing its accuracy, specificity, sensitivity, and computation time with those of existing methods. The results show that the inclusion of rule filtering and the inherent ability of the SVM to classify linear data reduce the computation time for detecting attacks without lowering accuracy compared to other methods.
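The filtering and grouping steps described in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' implementation: the record fields, the two filter rules, the `min_events` threshold, and all function names are assumptions made for illustration, and the map and reduce phases are simulated in memory rather than run on Hadoop. The two-stage SVM is omitted; the grouped (destination port, application ID) values stand in for the features that would be passed to it.

```python
from collections import defaultdict

# Hypothetical event-log records; the field names are assumptions,
# not the schema used in the paper.
EVENTS = [
    {"src_ip": "10.0.0.5", "src_port": 4444, "dst_port": 80, "app_id": "http"},
    {"src_ip": "10.0.0.5", "src_port": 4444, "dst_port": 80, "app_id": "http"},  # duplicate
    {"src_ip": "10.0.0.5", "src_port": 4444, "dst_port": 22, "app_id": "ssh"},
    {"src_ip": "10.0.0.9", "src_port": 5050, "dst_port": 443, "app_id": "https"},
    {"src_ip": "", "src_port": 0, "dst_port": 53, "app_id": "dns"},  # no usable source
]


def rule_filter(events):
    """Drop records without a usable source and exact duplicates (illustrative rules)."""
    seen = set()
    kept = []
    for e in events:
        if not e["src_ip"] or e["src_port"] <= 0:
            continue  # treated as irrelevant in this sketch
        key = (e["src_ip"], e["src_port"], e["dst_port"], e["app_id"])
        if key in seen:
            continue  # redundant correlation
        seen.add(key)
        kept.append(e)
    return kept


def map_phase(events):
    """Emit ((src_ip, src_port), (dst_port, app_id)) pairs, one per event."""
    for e in events:
        yield (e["src_ip"], e["src_port"]), (e["dst_port"], e["app_id"])


def reduce_phase(pairs, min_events=2):
    """Group pairs by source key; keep sources with at least min_events events.

    The threshold is an assumed stand-in for the paper's similarity criterion.
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {k: v for k, v in groups.items() if len(v) >= min_events}


def candidate_paths(events):
    """Run filter, map, and reduce to get candidate attack paths per source."""
    return reduce_phase(map_phase(rule_filter(events)))


if __name__ == "__main__":
    for source, features in candidate_paths(EVENTS).items():
        print(source, features)
```

Filtering before the map phase is the point of the design as the abstract describes it: fewer records reach the parser, so the grouping step has less work to do.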