TRANSCRIPT
Adaptive Hybrid Model for Network Intrusion Detection and Comparison Among Machine Learning Algorithms
Md. Enamul Haque
Department of Computer Engineering
King Fahd University of Petroleum and Minerals
Saudi Arabia
[email protected]
Supervised by Dr. Talal Alkharobi
May 21, 2014
Md. Enamul Haque (KFUPM) COE 551 May 21, 2014 1 / 28
Coming Up
Today's agenda.
Network Intrusion Detection.
Objective.
Proposed Model.
Algorithms Used.
Classifier Overview.
Dataset Description.
Results.
Conclusion.
Network Intrusion Detection
Let's review

Host-based intrusion detection: monitors and analyzes the internal interfaces.
Network-based intrusion detection:
Misuse based: searches for known intrusive patterns.
Anomaly based: supervised, unsupervised, and hybrid anomaly detection.
Attack Types
Broad category
DOS: Denial of service.
R2L: Unauthorized access to the local system from a remote host.
U2R: Unauthorized access to the root of a local system.
Probe: Sensing network from outside to detect vulnerabilities.
Anomaly Types
Broad classification

Table: Anomaly Types

Attack Type | Exploits
DOS         | back, land, neptune, pod, smurf, teardrop
U2R         | buffer overflow, load module, perl, rootkit
R2L         | ftp write, guess pass, imap, multi hop, phf, spy, warezclient, warezmaster
Probe       | ip sweep, saint, satan, nmap
Exploits Category
Sample Information

Feature name  | Description                                                | Type
duration      | length (number of seconds) of the connection               | Continuous
protocol type | type of the protocol, e.g. tcp, udp, icmp                  | Discrete
land          | 1 if connection is from/to the same host/port; 0 otherwise | Discrete
urgent        | number of urgent packets                                   | Continuous
hot           | number of "hot" indicators                                 | Continuous
Objective
Let's be clear about what we wanted to do.

We have intrusion-classified data and incoming traffic.
Classify the incoming traffic to detect any abnormality.
If an abnormality is present, classify it into a specific category.
Motivation
Reinventing the wheel, or what?

Build a more accurate prediction model.
Adaptive learning for the model.
Detect novel intrusions.
Performance comparison among existing learning models.
Artificial Neural Networks and Support Vector Machines have already been used.
Proposed Model
Overview

Figure: Network Intrusion Detection Model
Algorithm
In brief

Figure: Network Intrusion Detection Model
Classifiers Used
Three major classifiers were used.

Figure: Classifiers
Naive Bayes Classifier
How does it work, in simple terms?

The value of a particular feature is unrelated to the presence or absence of any other feature, given the class variable.
Example: a fruit may be considered to be an apple if it is red, round, and about 3 inches in diameter.
Each of these features is considered to contribute independently to the probability that this fruit is an apple,
regardless of the presence or absence of the other features.
It can be trained very efficiently in a supervised learning setting.
It requires only a small amount of training data to estimate the parameters (means and variances of the variables) necessary for classification.
Naive Bayes Classifier
How does it work, in mathematical terms?

Bayes' theorem:

$$p(C \mid F_1, \dots, F_n) = \frac{p(C)\, p(F_1, \dots, F_n \mid C)}{p(F_1, \dots, F_n)}$$

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$

In our problem:
C = Anomaly / Normal
F_1, ..., F_n = the features
n = number of features

Figure: Prediction based on recent events
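The independence assumption above can be sketched in a few lines of Python. This is a minimal Gaussian Naive Bayes illustration on hypothetical toy data, not the thesis model or the actual KDD features: each class gets a prior plus one independent Gaussian per feature, and prediction picks the class with the largest log posterior.

```python
import math

# Minimal Gaussian Naive Bayes sketch (toy data, not the real dataset).
def fit(X, y):
    params = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = len(rows) / len(X)          # p(C)
        stats = []
        for j in range(len(X[0])):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9  # avoid zero variance
            stats.append((mean, var))
        params[c] = (prior, stats)
    return params

def predict(params, x):
    def log_posterior(c):
        prior, stats = params[c]
        lp = math.log(prior)
        for v, (mean, var) in zip(x, stats):
            # log of the Gaussian likelihood for this feature, taken independently
            lp += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return lp
    return max(params, key=log_posterior)

X = [[0.1, 1.0], [0.2, 1.1], [5.0, 9.0], [5.2, 8.8]]
y = ["normal", "normal", "anomaly", "anomaly"]
model = fit(X, y)
print(predict(model, [0.15, 1.05]))  # → normal
```

Working in log space avoids the numerical underflow that multiplying many small likelihoods would cause; the denominator p(F_1, ..., F_n) is the same for every class, so it can be dropped when comparing posteriors.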
Random Forests
How does it work?

Training set X = x_1, ..., x_n with class labels / responses Y = y_1, ..., y_n.
For b = 1, ..., B: sample with replacement from the n training examples (X, Y); call these (X_b, Y_b).
Train a decision or regression tree f_b on (X_b, Y_b).
Predictions for an unseen sample x' are made by averaging the predictions from all the individual trees on x':

$$\hat{f}(x') = \frac{1}{B} \sum_{b=1}^{B} \hat{f}_b(x')$$

Figure: Random forests
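The bagging-and-averaging step above can be sketched as follows. This is an illustrative toy, not the thesis model: it uses one-feature depth-1 "stumps" as the base trees, and it omits the per-split random feature subsampling that full random forests add on top of bagging.

```python
import random

# Sketch of bagging: B bootstrap samples, one regression stump per
# sample, predictions averaged (f_hat(x') = (1/B) * sum_b f_b(x')).
def train_stump(X, y):
    # pick the threshold on the single feature that minimizes squared error
    best = None
    for t in sorted(set(X)):
        left = [yi for xi, yi in zip(X, y) if xi <= t] or [0.0]
        right = [yi for xi, yi in zip(X, y) if xi > t] or [0.0]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        err = sum((yi - (lmean if xi <= t else rmean)) ** 2 for xi, yi in zip(X, y))
        if best is None or err < best[0]:
            best = (err, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def random_forest_predict(X, y, x_new, B=25, seed=0):
    rng = random.Random(seed)
    preds = []
    for _ in range(B):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # sample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        preds.append(train_stump(Xb, yb)(x_new))
    return sum(preds) / B  # average over the B trees

X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
print(random_forest_predict(X, y, 5.5))  # typically close to 1.0
```

Averaging many trees trained on different bootstrap samples reduces the variance of a single deep tree, which is the main reason random forests generalize well.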
k-Nearest Neighbor
Instance-based k-NN (IBk)

Classify an unknown example with the most common class among its k closest examples.
"Tell me who your neighbors are, and I will tell you who you are!"
Example: k = 3, 2 sea bass, 1 salmon.
Classified as sea bass.

Figure: Simple example of the idea.
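The majority-vote idea above fits in a few lines. A minimal sketch on hypothetical toy points (not the real traffic data), reusing the sea bass / salmon example:

```python
import math
from collections import Counter

# Label an unknown point by majority vote among its k nearest neighbors.
def knn_classify(train, query, k=3):
    # sort the labeled points by Euclidean distance to the query
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((1.0, 1.0), "sea bass"), ((1.2, 0.9), "sea bass"),
         ((1.1, 1.2), "sea bass"), ((3.0, 3.0), "salmon"),
         ((3.2, 2.8), "salmon")]
print(knn_classify(train, (1.05, 1.0), k=3))  # → sea bass
```

Note that k-NN does no training at all; every query pays the cost of scanning the stored instances, which is why WEKA files it under "lazy" classifiers.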
k-NN Distance Selection
Worst-case scenario

Feature 1 gives the correct class: 1 or 2.
Feature 2 gives an irrelevant number from 100 to 200.
Training dataset: [1 150], [2 110].
Classify [1 100]:

$$D([1\ 100], [1\ 150]) = \sqrt{(1-1)^2 + (100-150)^2} = 50 \tag{1}$$

$$D([1\ 100], [2\ 110]) = \sqrt{(1-2)^2 + (100-110)^2} \approx 10.05 \tag{2}$$

[1 100] is misclassified!
The denser the samples, the less severe this problem.
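The two distances in Eqs. (1) and (2) can be checked directly:

```python
import math

# The irrelevant second feature dominates the Euclidean distance.
d_same_class = math.dist([1, 100], [1, 150])   # Eq. (1): same class
d_other_class = math.dist([1, 100], [2, 110])  # Eq. (2): different class
print(d_same_class)            # 50.0
print(round(d_other_class, 2)) # 10.05
# d_other_class < d_same_class, so the nearest neighbor of [1 100]
# is the class-2 example: misclassified.
```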
k-NN: Feature Normalization
Equalizing the scale of the features

Notice that the two features are on different scales:
The first feature takes values between 1 and 2.
The second feature takes values between 100 and 200.
Idea: normalize the features to be on the same scale.
There are different normalization approaches.
Linearly scale the range of each feature to be, say, in the range [0, 1]:

$$f_{\text{new}} = \frac{f_{\text{old}} - f_{\text{old}}^{\min}}{f_{\text{old}}^{\max} - f_{\text{old}}^{\min}} \tag{3}$$
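Eq. (3) applied to a column of values, using the slide's second feature as sample input:

```python
# Min-max scaling per Eq. (3): map each feature linearly onto [0, 1].
def min_max_scale(column):
    lo, hi = min(column), max(column)
    return [(v - lo) / (hi - lo) for v in column]

feature2 = [150, 110, 100, 200]
print(min_max_scale(feature2))  # [0.5, 0.1, 0.0, 1.0]
```

After scaling, both features contribute comparably to the Euclidean distance, which removes the misclassification shown in the worst-case example.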
k-NN: How to Choose k?
Is there any standard?

Figure: Sometimes, due to noise, 1-NN gives an erroneous outcome.
Figure: 3-NN provides better classification accuracy than 1-NN in this case.

Rule of thumb: k < √n, where n is the number of examples.
In practice, k = 1 is often used for efficiency, but it can be sensitive to noise.
A larger k may improve performance.
Dataset Distributions
Anomaly and normal quantities

Table: Dataset Used in the Experiment

Category | No. of Instances
Normal   | 67343
Anomaly  | 58630
Total    | 125973

Table: Distribution of Reduced Dataset for Anomaly Class

Category | No. of Instances
DOS      | 9234
U2R      | 11
R2L      | 209
Probe    | 2289
Feature Reduction
Too much for computation

Table: Feature Reduction

Attribute Evaluator | Search Method       | No. of Selected Attributes | Selected Attributes
CFS                 | Genetic Search      | 15 | 4,5,6,8,10,12,17,23,26,29,30,32,37,38,39
CFS                 | PSO Search          | 9  | 4,5,6,12,26,29,30,37,39
CFS                 | Best First          | 6  | 4,5,6,12,26,30
CFS                 | Evolutionary Search | 18 | 3,4,5,6,8,17,19,23,25,26,29,30,33,34,37,38,39,41
Consistency Subset  | Greedy Stepwise     | 10 | 1,3,4,5,14,23,32,34,35,37

Reduce the features without affecting accuracy, to lower the computational cost.
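Applying one of the selected subsets is just column selection. A sketch using the CFS + Best First row, on a hypothetical dummy record (the attribute numbers on the slide are 1-based, as in WEKA):

```python
# Keep only the selected attributes (given as 1-based indices) of each row.
def select_attributes(rows, selected_1based):
    idx = [i - 1 for i in selected_1based]          # convert to 0-based
    return [[row[i] for i in idx] for row in rows]

row = list(range(1, 42))  # one dummy record with 41 features
reduced = select_attributes([row], [4, 5, 6, 12, 26, 30])
print(reduced)  # [[4, 5, 6, 12, 26, 30]]
```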
Detailed Accuracy by Class
10-fold Cross-Validation for Random Forest

Table: Detailed Accuracy by Class: 10-fold Cross-Validation for Random Forest

TP Rate | FP Rate | Precision | Recall | F-Measure | MCC   | ROC Area | PRC Area | Class
0.999   | 0.002   | 0.998     | 0.999  | 0.999     | 0.998 | 1.000    | 1.000    | normal
0.998   | 0.001   | 0.999     | 0.998  | 0.999     | 0.998 | 1.000    | 1.000    | anomaly

Table: Confusion Matrix for Random Forest

a     | b     | Classified As
67308 | 35    | a = normal
117   | 58513 | b = anomaly
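The headline rates can be recomputed from the confusion matrix, taking anomaly as the positive class:

```python
# Counts from the confusion matrix above (anomaly = positive class).
tn, fp = 67308, 35      # actual normal:  predicted normal / anomaly
fn, tp = 117, 58513     # actual anomaly: predicted normal / anomaly

accuracy = (tp + tn) / (tp + tn + fp + fn)
tp_rate_anomaly = tp / (tp + fn)    # recall for the anomaly class
precision_anomaly = tp / (tp + fp)
print(round(accuracy, 4))           # 0.9988
print(round(tp_rate_anomaly, 3))    # 0.998
print(round(precision_anomaly, 3))  # 0.999
```

The recomputed TP rate (0.998) and precision (0.999) match the anomaly row of the per-class table.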
Classification Accuracy
Based on the confusion matrix

Figure: Classification accuracy for different learning/classification algorithms (Naive Bayes, PART, Random Forest, Grading, AdaBoost, IBk); y-axis: accuracy (%), 0 to 100. The major parameters were tuned for each execution.
Tools and Equipment
Those that came in handy

KDD Cup 1999 dataset.
MySQL: data preprocessing.
MATLAB: algorithm testing and graph generation.
WEKA 3.7.9: where the actual classification was performed.
References
Herrero, Álvaro, et al. RT-MOVICAB-IDS: Addressing real-time intrusion detection. Future Generation Computer Systems 29.1 (2013): 250-261.

McHugh, John. Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Transactions on Information and System Security 3.4 (2000): 262-294.

Tavallaee, Mahbod, et al. A detailed analysis of the KDD CUP 99 data set. Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and Defence Applications. 2009.

Kim, Gisung, Seungmin Lee, and Sehun Kim. A novel hybrid intrusion detection method integrating anomaly detection with misuse detection. Expert Systems with Applications 41.4 (2014): 1690-1700.

Luo, Bin, and Jingbo Xia. A novel intrusion detection system based on feature generation with visualization strategy. Expert Systems with Applications (2014).

Fung, Carol J., and Raouf Boutaba. Design and management of collaborative intrusion detection networks. Integrated Network Management (IM 2013), 2013 IFIP/IEEE International Symposium on. IEEE, 2013.
Future Directions
Let's think about the next level!

Classify the anomaly class into further specific divisions.
Use unsupervised learning methods.
Develop a knowledge base.
Questions? Suggestions?