hossein_rahimi-10-95011
TRANSCRIPT
-
8/3/2019 hossein_rahimi-10-95011
1/24
Hossein Rahimi, Student, Shiraz University
April 14th / Security for the Next Generation 2011
Iterative System Call Patterns Blow the Malware Cover
M. Ahmadi, A. Sami, H. Rahimi, B. Yadegari
-
Outline
Overall organization
Introduction
Previous Works
  Static & Dynamic Analysis
Methodology
  Motivation
  Our Method
  Monitoring
  Iterative Pattern Extraction
  Feature Selection
Dataset and Experiments
  Data Format
  Experimentation Results
Conclusion and Future Works
Acknowledgement
References
| 24 April 2011PAGE 2 |Iterative System Call Patterns Blow the Malware
Cover
-
Introduction
Antivirus products rely mainly on signature-based methods for malware detection.
To detect polymorphic, metamorphic, and new malware, they depend on:
Regular signature updates.
Executable collection and analysis laboratories.
Our focus is on Microsoft Windows Portable Executable (PE) files.
Dynamic analysis is a cumbersome task:
It is simpler to analyze only a PE's interactions with the OS API.
Our approach is based on data mining and machine learning:
Learn from the behavior of existing malicious and benign software samples.
Given an unseen executable, guess its intent accurately.
The novel idea is to use iterative API call patterns to identify malware.
-
Previous Works
Static and dynamic approaches.
-
Previous works
A static approach.
Malware Detection Based On Mining API Calls
By: Ashkan Sami, Hossein Rahimi, Babak Yadegari, Naser Peiravian, Sattar Hashemi, Ali Hamze (Shiraz University)
ACM Symposium on Applied Computing, April 2010.
Approach:
Static approach.
Extracts API calls from the PE's Import Address Table.
Accuracy of 98.3% and detection rate of 99.7%.
Weaknesses:
Custom and rare packers.
Fake API calls and unreachable code.
[Figure: PE -> PE analyzer -> API calls dataset -> Feature analyzer -> Classifier -> Alert!]
-
Previous works
Dynamic approach for software fault detection.
Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach
By: D. Lo, H. Cheng, J. Han, S. Khoo, C. Sun
KDD 2009
Approach:
Dynamic Approach.
Software traces are logged and mined for iterative patterns.
Failing runs detected.
Accuracy near 100% in some cases.
-
Previous works
A dynamic approach.
Effective and Efficient Malware Detection at the End Host
Clemens Kolbitsch, Paolo Milani Comparetti, Christopher Kruegel, Engin Kirda, Xiaoyong Zhou (Secure System Labs, California)
Usenix, August 2009.
Approach:
Dynamically build dependency graphs as signatures.
Accuracy of about 63%.
Drawbacks:
High computational complexity.
Low accuracy.
-
Previous works
A dynamic approach.
Efficient Virus Detection Using Dynamic Instruction Sequences
Jianyong Dai, Ratan Guha, Joohan Lee (University of Central Florida)
Journal of Computers, May 2009.
Approach:
Dynamic; abstracts the frequent instruction sequences.
Accuracy of 91%.
Drawbacks:
Weak against metamorphic malware, which alters its instructions in many ways.
-
Methodology
Motivations, ideas and techniques behind this work.
-
Motivation
Detection of polymorphic and metamorphic malware in the presence of static and dynamic evasion.
Making the malware analyst's work easier by selecting the most suspicious PE samples to analyze next.
Avoiding complex graph mining and graph isomorphism computations:
Graphs of program runs are huge.
Finding minable and useful sub-graphs is computationally expensive.
API calls are not as easily replaceable as instructions:
e.g. one can replace a SUB instruction with a NEG and an ADD.
Investigating iterative system calls can be a good means of guessing a program's intent.
Hiding API calls is much harder in a dynamic context than in a static one.
-
Our Method
[Figure: architecture of the method. Components: Malware and Benigns (input samples); Controlled Environment with Monitor and Windows DLLs; APIs Log; Train Dataset and Test Dataset; Mine Closed Patterns; Select Discriminative Patterns; New Train Dataset and New Test Dataset; Detection Engine with Classification and Prediction; Predicted Samples.]
-
Monitoring
Backdoor.Win32.Agent.cy
RegOpenCurrentUser,OpenProcessToken,AllocateAndInitializeSid,CheckTokenMembership,FreeSid,
RegOpenKeyExW,RegQueryValueExW,RegCloseKey,RegCloseKey,RegOpenKeyExW,RegOpenKeyExA,
mmRegQueryValueExA,RegCloseKey,RegOpenKeyExW,RegOpenKeyExW,InitializeSecurityDescriptor,
InitializeAcl,AddAccessAllowedAce,AddAccessAllowedAce,SetSecurityDescriptorDacl,
MD4Init,MD4Update,MD4Update,MD4Update,MD4Final,
OpenSCManagerA,OpenServiceA,CreateServiceA,StartServiceA,CloseServiceHandle,CloseServiceHandle
-
Iterative Pattern Extraction
Consider a pattern P().
A sample database of two PEs:
Identifier   Sequence of API calls
Benign1
Malware1
Inst(P) denotes the set of instances of P, each given as (sequence index, start position, end position):
e.g. {(1, 3, 5), (1, 6, 8), (2, 3, 5), (2, 8, 9)}
Multiple occurrences of an iterative pattern are considered to reflect repetition of an iterative behavior (e.g. many worms scan a range of network addresses to replicate themselves).
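The notion of an instance can be made concrete with a short sketch. It assumes the instance semantics of iterative patterns described in reference [3]: between two consecutive events of the pattern, no event that belongs to the pattern may occur, and an instance is reported as (sequence index, start, end) with 1-based positions. The function name `find_instances`, the toy database, and the API names chosen for it are illustrative assumptions, not the data shown on the slide.

```python
from typing import Sequence, Set, Tuple

Instance = Tuple[int, int, int]  # (sequence index, start, end), all 1-based


def find_instances(db: Sequence[Sequence[str]], pattern: Sequence[str]) -> Set[Instance]:
    """Collect iterative-pattern instances of `pattern` in every sequence of `db`.

    Assumed semantics: after matching one pattern event, events outside the
    pattern are skipped, and the next event that belongs to the pattern must
    be the next pattern event; otherwise the attempt is abandoned.
    """
    if not pattern:
        return set()
    alphabet = set(pattern)
    found: Set[Instance] = set()
    for seq_idx, seq in enumerate(db, start=1):
        for start, event in enumerate(seq):
            if event != pattern[0]:
                continue
            k = 1        # number of pattern events matched so far
            pos = start  # position of the last inspected event
            while k < len(pattern) and pos + 1 < len(seq):
                pos += 1
                if seq[pos] == pattern[k]:
                    k += 1
                elif seq[pos] in alphabet:
                    break  # a pattern event out of order ends this attempt
            if k == len(pattern):
                found.add((seq_idx, start + 1, pos + 1))
    return found


# Hypothetical two-trace database, loosely modeled on registry API calls.
TOY_DB = [
    ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey", "RegOpenKeyExW", "RegCloseKey"],
    ["RegOpenKeyExW", "RegOpenKeyExW", "RegCloseKey"],
]
TOY_PATTERN = ["RegOpenKeyExW", "RegCloseKey"]
```

On this toy input, `find_instances(TOY_DB, TOY_PATTERN)` yields the three instances (1, 1, 3), (1, 4, 5) and (2, 2, 3). The first `RegOpenKeyExW` of the second trace starts no instance, because another pattern event follows it before `RegCloseKey` is reached.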
-
Closed Frequent Pattern
An iterative pattern P is frequent if the number of its instances in APIDB meets the threshold min_sup, i.e., $|\mathrm{Inst}(P, \mathit{APIDB})| \geq \mathit{min\_sup}$.
A frequent iterative pattern P is closed if there exists no super-sequence Q such that:
P and Q have the same support, and $\mathrm{Inst}(P) \approx \mathrm{Inst}(Q)$.
-
Feature Selection
To evaluate the discriminative power of every feature, the Fisher discrimination score was used.
Name         Stands for
$n_i$        Number of instances of class i in the dataset.
$\mu$        Average of the feature value over the whole dataset.
$\mu_i$      Average of the feature value in class i.
$\sigma_i$   Standard deviation of the feature values in class i.
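The scoring formula itself is not preserved in this transcript. A minimal sketch, assuming the commonly used form of the Fisher score, $F = \sum_i n_i (\mu_i - \mu)^2 / \sum_i n_i \sigma_i^2$, which uses exactly the four quantities listed in the table, could look like this. The function name, the use of the population standard deviation, and the sample values are assumptions made for illustration.

```python
from statistics import fmean, pstdev
from typing import Dict, Sequence


def fisher_score(values_by_class: Dict[str, Sequence[float]]) -> float:
    """Fisher score of one feature, given its values grouped by class label."""
    all_values = [v for values in values_by_class.values() for v in values]
    mu = fmean(all_values)  # average of the feature over the whole dataset
    between = 0.0  # sum over classes of n_i * (mu_i - mu)^2
    within = 0.0   # sum over classes of n_i * sigma_i^2
    for values in values_by_class.values():
        n_i = len(values)
        mu_i = fmean(values)
        sigma_i = pstdev(values)  # population standard deviation (assumption)
        between += n_i * (mu_i - mu) ** 2
        within += n_i * sigma_i ** 2
    if within == 0.0:
        # No spread inside any class: perfectly separating or constant feature.
        return float("inf") if between > 0.0 else 0.0
    return between / within


# Hypothetical support values of one pattern in three malware and three benign traces.
EXAMPLE_FEATURE = {"malware": [7, 8, 9], "benign": [1, 2, 3]}
```

For `EXAMPLE_FEATURE` the overall mean is 5, the between-class term is 54 and the within-class term is 4, so the score is 13.5. A feature with a higher score separates the two classes more sharply and would be ranked higher during selection.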
-
Dataset and Experiments
Experiment results and dataset format.
-
Data Format
Name of PE   Single API 1     Single API 2     Closed Frequent API Pattern 1   Closed Frequent API Pattern 2
Benign 1     Total of API 1   Total of API 2   Support                         Support
Benign 2     Total of API 1   Total of API 2   Support                         Support
Malware 1    Total of API 1   Total of API 2   Support                         Support
Malware 2    Total of API 1   Total of API 2   Support                         Support
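One row of this table can be assembled as sketched below. The sketch assumes that the support of each closed frequent pattern in a given trace has already been computed by the pattern miner and is passed in as a dictionary; the function name, the pattern identifiers `P1` and `P2`, and the sample trace are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence, Union

Cell = Union[str, int]


def build_row(
    name: str,
    trace: Sequence[str],
    single_apis: Sequence[str],
    pattern_ids: Sequence[str],
    pattern_support: Dict[str, int],
) -> List[Cell]:
    """Return [name, total per single API ..., support per pattern ...]."""
    totals = Counter(trace)  # how many times each API call occurs in the trace
    row: List[Cell] = [name]
    row.extend(totals[api] for api in single_apis)
    # A pattern that never occurs in this trace gets support 0.
    row.extend(pattern_support.get(pid, 0) for pid in pattern_ids)
    return row


# Hypothetical sample: three logged calls and precomputed pattern supports.
EXAMPLE_ROW = build_row(
    name="Malware 1",
    trace=["RegOpenKeyExW", "RegCloseKey", "RegOpenKeyExW"],
    single_apis=["RegOpenKeyExW", "RegCloseKey"],
    pattern_ids=["P1", "P2"],
    pattern_support={"P1": 1},
)
```

`EXAMPLE_ROW` is `["Malware 1", 2, 1, 1, 0]`: two totals for the single API columns followed by two pattern supports, in the same column order as the table.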
-
Experimentation Results
10-fold cross-validation was used in the classification phase.
Results of running the classifier on a dataset with 269 malicious samples and 211 benign programs.
S (min. support)  TP  FP  TN  FN  DR (%)  ACC (%)
0.1 243 26 179 32 90.33 87.91
0.15 247 22 180 31 91.8 88.95
0.2 247 22 180 31 91.8 88.95
0.25 245 24 173 38 91.1 87.08
0.3 247 22 171 40 91.8 87.08
0.35 242 27 177 34 90 87.29
0.4 243 26 176 35 90.3 87.29
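The slide does not state how DR and ACC are derived. The printed values are consistent with DR = TP / 269 (the number of malicious samples) and ACC = (TP + TN) / 480 (all samples), so the sketch below assumes those two formulas. The function and constant names are illustrative.

```python
N_MALICIOUS = 269  # malicious samples in the dataset (from the slide)
N_BENIGN = 211     # benign programs in the dataset (from the slide)


def detection_rate(tp: int, n_malicious: int = N_MALICIOUS) -> float:
    """Percentage of malicious samples that were flagged (assumed definition)."""
    return 100.0 * tp / n_malicious


def accuracy(tp: int, tn: int, n_total: int = N_MALICIOUS + N_BENIGN) -> float:
    """Percentage of all samples that were classified correctly (assumed definition)."""
    return 100.0 * (tp + tn) / n_total


# Row S = 0.15 of the table: TP = 247, TN = 180.
DR_015 = detection_rate(247)
ACC_015 = accuracy(247, 180)
```

For the S = 0.15 row this gives a detection rate of about 91.82 and an accuracy of about 88.96, which matches the printed 91.8 and 88.95 if the slide truncates instead of rounding.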
-
Minimum Support vs. False Alarm Rate
[Figure: false alarm rate (y-axis, 0 to 0.2) plotted against minimum support (x-axis, 0.1 to 0.4).]
-
Conclusion and Future Works
Due to the rapid growth of malware production, a set of tools to speed up the analysis and detection of unseen malware is needed.
Signature-based anti-virus methods are not enough to provide acceptable protection against new threats.
Signature-based methods can gain more power with the aid of heuristic-based detection systems.
In this work we used iterative patterns and statistical measures to further improve the classification results.
Our best result is an accuracy of 88.95% and a detection rate of 91.8%.
Future work: improving the classification by using more informative data structures.
-
Acknowledgement
I am grateful to the people who contributed their best to this research.
Mansour Ahmadi Dr. Ashkan Sami Babak Yadegari
(main contribution)
-
Questions and Answers
Thanks for your patience
Q/A
-
Thank You
-
References
[1] A. Sami, H. Rahimi, B. Yadegari, N. Peiravian, S. Hashemi, A. Hamze, "Malware Detection Based On Mining API Calls," ACM Symposium on Applied Computing, Data Mining Track, Sierre, Switzerland, 2010.
[2] C. Csallner, Y. Smaragdakis, T. Xie, "DSD-Crasher: A Hybrid Analysis Tool for Bug Finding," ACM Transactions on Software Engineering and Methodology, Vol. 17, Issue 2, pp. 345-371, July 2008.
[3] D. Lo, S.-C. Khoo, C. Liu, "Efficient Mining of Iterative Patterns for Software Specification Discovery," 13th SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'07), California, Aug 12-15, 2007.
[4] M. Christodorescu, S. Jha, S. Seshia, D. Song, R. Bryant, "Semantics-aware malware detection," IEEE Symposium on Security and Privacy, 2005.
[5] P. Szor, The Art of Computer Virus Research and Defense, Addison Wesley, 2005.
[6] M. Christodorescu, S. Jha, "Static analysis of executables to detect malicious patterns," USENIX Security Symposium, 2003.
[7] T. Yetiser, "Polymorphic Viruses: Implementation, detection, and protection," 1993.
[8] VX Heavens, computer virus collection. Available: http://vx.netlux.org/vl.php
[9] C. Kolbitsch, P. M. Comparetti, C. Kruegel, E. Kirda, X. Zhou, X. Wang, "Effective and Efficient Malware Detection at the End Host," 18th Usenix Security Symposium, 2009.
[10] J. Dai, R. Guha, J. Lee, "Efficient Virus Detection Using Dynamic Instruction Sequences," Journal of Computers, 2009.