improving accuracy of malware detection by …...ffri, inc. background – malware and its detection...
Post on 22-Aug-2020
11 Views
Preview:
TRANSCRIPT
FFRI, Inc.
Fourteenforty Research Institute, Inc.
FFRI, Inc. http://www.ffri.jp
Improving accuracy of malware detection by filtering evaluation dataset based on its similarity
Junichi Murakami Director of Advanced Development
FFRI, Inc.
• This slides was used for a presentation at CSS2013
– http://www.iwsec.org/css/2013/english/index.html
• Please refer the original paper for the detail data
– http://www.ffri.jp/assets/files/research/research_papers/MWS2013_paper.pdf (Written in Japanese but the figures are common)
• Contact information
– research-feedback@ffri.jp
– @FFRI_Research (twitter)
Preface
2
FFRI, Inc.
• Background
• Problem
• Scope and purpose
• Experiment 1
• Experiment 2
• Experiment 3
• Consideration
• Conclusion
Agenda
3
FFRI, Inc.
Background – malware and its detection
4
Increasing
malware
Targeted Attack
(Unknown malware)
Malware generators
Obfuscators
Limitation of
signature matching
other methods
Heuristic
Could reputation
Machine learning Bigdata
FFRI, Inc.
Background – Related works
5
Features
Static information
Dynamic information
Hybrid
Algorithms
SVM
Naive bayes
Perceptron, etc.
Evaluation
TPR/FRP, etc.
ROC-curve, etc.
Accuracy, Precision
• Mainly focusing on a combination of the factors below
– Features selection and modification, parameter settings
• Some good results are reported (TRP:90%+, FRP:1%-)
FFRI, Inc.
• General theory of machine learning:
– Accuracy of classification declines if trends of training and testing data are different
• How about malware and benign files
Problem
6
? ?
FFRI, Inc.
① Investigating differences between similarities of malware and benign files(Experiment-1)
② Investigating an effect for accuracy of classification by the difference(Experiment-2)
③Based on the result above, confirming an effect of removing data whose similarity with a training data is low (Experiment-3)
Scope and purpose
7
FFRI, Inc.
• Used FFRI Dataset 2013 and benign files we collected as datasets
• Calculated the similarity of each malware and benign files (Jubatus, MinHash)
• Feature vector: A number of 4-gram of sequential API calls
– ex: NtCreateFile_NtWriteFile_NtWriteFile_NtClose: n times NtSetInformationFile_NtClose_NtClose_NtOpenMutext: m times
Experiment-1(1/3)
8
malware
benign A B C ...
A
B
C
...
A B C ...
A ー 0.8 0.52 ...
B ー ー 1.0 ...
C ー ー ー ...
... ー ー ー ー
FFRI, Inc.
Grouping malware and benign files based on their similarities
Experiment-1(2/3)
9
Threshold of similarity (0.0 - 1.0) benign
malware
FFRI, Inc.
Experiment-1(3/3)
10
0%
20%
40%
60%
80%
100%
正常
系
マル
ウェ
ア
正常
系
マル
ウェ
ア
正常
系
マル
ウェ
ア
正常
系
マル
ウェ
ア
正常
系
マル
ウェ
ア
0.8 0.85 0.9 0.95 1
仲間無
仲間有
Threshold of similarity
It is more difficult to find similar benign files compared to malware
malw
are
malw
are
malw
are
malw
are
malw
are
benig
n
benig
n
benig
n
benig
n
benig
n
unique
not unique
FFRI, Inc.
• How much does the difference affect a result?
• 50% of malware/benign are assigned to a training, the others are to a testing dataset(Jubatus, AROW)
Experiment-2(1/3)
11
benign
malware
train
jubatus
classify
jubatus TPR: ?
FPR: ?
TPR: True Positive Rate FPR: False Positive Rate
train
testi
ng
FFRI, Inc.
Experiment-2(2/3)
12
benign
malware
train
jubatus
classify
jubatus TPR: ?
FPR: ?
train
testi
ng
• How much does the difference affect a result?
• 50% of malware/benign are assigned to a training, the others are to a testing dataset(Jubatus, AROW)
FFRI, Inc.
The accuracy declines if trends of training and testing data are different
Experiment-2(3/3)
13
0 50 100 0 1 2 3 4 5
■TPR ■FPR
97.996(not unique)
81.297(unique)
0.624(not unique)
4.49(unique)
-16.699
+3.866
% %
FFRI, Inc.
14
benign(train) malware(train)
benign(test) malware(test)
dividing line
Experiment-3(1/6) – After a training
malware
benign
FFRI, Inc.
Experiment-3(2/6) – After a classification
15
benign(train) malware(train)
benign(test) malware(test)
dividing line
FFRI, Inc.
16
FP
FN
Experiment-3(2/6) – After a classification
benign(train) malware(train)
benign(test) malware(test)
dividing line
FFRI, Inc.
Experiment-3(3/6) – Low similarity data
17
TP(accidentally)
FN
FN
benign(train) malware(train)
benign(test) malware(test)
dividing line
FFRI, Inc.
Experiment-3(4/6) – Effect to TPR
18
0.88
0.90
0.92
0.94
0.96
0.98
1.00
0
200
400
600
800
1000
1200
1400
0 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1
TP
FN
TPR
Threshold of similarity
The n
um
ber
of cla
ssifie
d d
ata
FFRI, Inc.
Experiment-3(5/6) – Effect to FPR
19
0.000
0.002
0.004
0.006
0.008
0.010
0.012
0.014
0
500
1000
1500
2000
2500
0 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1
TN
FP
FPR
The n
um
ber
of cla
ssi
fied d
ata
Threshold of similarity
FFRI, Inc.
Experiment-3(6/6)
20
0%
20%
40%
60%
80%
100%
120%
0 0.6 0.65 0.7 0.75 0.8 0.85 0.9 0.95 1
マルウェア 正常系ソフトウェア
Threshold of similarity
The n
um
ber
of cla
ssifie
d d
ata
/ The n
um
ber
of to
tal te
stin
g d
ata
Transition of the number of classified data
malware benign
FFRI, Inc.
• In real scenario:
– trying to classify an unknown file/process whether it is benign files or not
• If we apply Experiment-3:
– Files are classified only if similar data is already trained
– If not, files are not classified which results in
• FN if the files is malware
• TF if the files is benign (All right as a result)
• Therefore it is a problem about “TPR for unique malware” (Unique malware is likely to be undetectable)
Consideration(1/3)
21
FFRI, Inc.
• If malware have many variants as the current
– ML-based detection works well
• Having many variants ∝ malware generators/obfuscators
• We have to investigate
– Trends of usage of the tools above
– Possibility of anti-machine learning detection
Consideration(2/3)
22
FFRI, Inc.
• How to deal with unclassified (filtered) data
1. Using other feature vectors
2. Enlarging a training dataset (Unique → Not unique)
3. Using other methods besides ML
Consideration(3/3)
23
FFRI, Inc.
• Distribution of similarity for malware and benign are difference (Experiment-1)
• Accuracy declines if trends of training and testing data are different (Experiment-2)
• TPR of unique malware declines when we remove low similarity data (Experiment-3)
• Continual investigation for trends of malware and related tools are required
• (Might be necessary to develop technology to determine benign files)
Conclusion
24
top related