hossein_rahimi-10-95011

Upload: muhammadammar

Post on 06-Apr-2018


  • 8/3/2019 hossein_rahimi-10-95011

    1/24

    Hossein Rahimi, Student, Shiraz University

    April 14th/ Security for The Next Generation 2011

    Iterative System Call Patterns Blow the Malware Cover

    M. Ahmadi, A. Sami, H. Rahimi, B. Yadegari


    Outline

    Overall organization

    Introduction

    Previous Works

    Static & Dynamic Analysis

    Methodology

    Motivation

    Our Method

    Monitoring

    Iterative Pattern Extraction

    Feature Selection

    Dataset and Experiments

    Data Format

    Experimentation Results

    Conclusion and Future Works

    Acknowledgement

    References

    | 24 April 2011 | PAGE 2 | Iterative System Call Patterns Blow the Malware Cover


    Introduction

    Antivirus products mainly use signature-based methods for malware detection.

    Detecting polymorphic/metamorphic and new malware requires:

    Regular signature updates.

    Executable collection and analysis laboratories.

    Our focus is on the Microsoft Windows portable executable (PE) files.

    Dynamic analysis is a cumbersome task:

    It is simpler to just analyze PEs' interactions with the OS API.

    Our approach is based on data mining and machine learning:

    Learn from the behavior of existing malicious and benign software samples.

    Given an unseen executable, guess its intent accurately.

    A novel idea of using iterative API call patterns to identify malware.


    Previous Works

    Static and dynamic approaches.


    Previous works

    A static approach.

    Malware Detection Based On Mining API Calls

    By: Ashkan Sami, Hossein Rahimi, Babak Yadegari, Naser Peiravian, Sattar Hashemi, Ali Hamze (Shiraz University)

    ACM Symposium on Applied Computing, April 2010.

    Approach:

    Static approach.

    Extracts API calls from the PE's Import Address Table.

    Accuracy of 98.3% and detection rate of 99.7%.

    Weaknesses:

    Custom and rare packers.

    Fake API calls and unreachable code.


    [Figure: processing pipeline. PE, PE analyzer, API calls dataset, Feature analyzer, Classifier, Alert!]


    Previous works

    Dynamic approach for software fault detection.

    Classification of Software Behaviors for Failure Detection: A Discriminative Pattern Mining Approach

    By: D. Lo, H. Cheng, J. Han, S. Khoo, C. Sun

    KDD 2009

    Approach:

    Dynamic Approach.

    Software traces are logged and mined for iterative patterns.

    Failing runs detected.

    Accuracy of near 100% in some cases.


    Previous works

    A dynamic approach.

    Effective and Efficient Malware Detection at the End Host

    Clemens Kolbitsch, Paolo Milani Comparetti, Christopher Kruegel, Engin Kirda, Xiaoyong Zhou (Secure System Labs, California)

    USENIX, August 2009.

    Approach:

    Dynamically build dependency graphs as signatures.

    Accuracy of about 63%.

    Drawbacks:

    High computational complexity.

    Not accurate.


    Previous works

    A dynamic approach.

    Efficient Virus Detection Using Dynamic Instruction Sequences

    Jianyong Dai, Ratan Guha, Joohan Lee (University of Central Florida)

    Journal of Computers, May 2009.

    Approach:

    Dynamic; abstracts the frequent instruction sequences.

    Accuracy = 91%

    Drawbacks: Weak against metamorphic malware, which alters its instructions in many ways.


    Methodology

    Motivations, ideas and techniques behind this work.


    Motivation

    Detection of polymorphic and metamorphic malware in case of static and dynamic evasion.

    Making the malware analyst's work easier by selecting the most suspicious PE samples to analyze next.

    Avoiding the complex graph mining and graph isomorphism computations:

    Program run graphs are huge.

    Finding minable and useful sub-graphs is computationally expensive.

    API calls are not as easily replaceable as instructions:

    e.g., one can replace a SUB instruction with a NEG and an ADD.

    Investigating iterative system calls can be a good means of guessing a program's intent.

    Hiding calls to the API is much more complex in a dynamic context than in a static one.


    Our Method


    [Figure: overview of the method. Diagram labels: Malware, Benigns, Controlled Environment, Monitor, Windows DLLs, APIs Log, Train Dataset, Test Dataset, Detection Engine, Mine Closed Patterns, Select Discriminative Patterns, New Train Dataset, New Test Dataset, Classification, Prediction, Predicted Samples.]


    Monitoring


    Backdoor.Win32.Agent.cy

    RegOpenCurrentUser,OpenProcessToken,AllocateAndInitializeSid,CheckTokenMembership,FreeSid,RegOpenKeyExW,RegQueryValueExW,RegCloseKey,RegCloseKey,RegOpenKeyExW,RegOpenKeyExA,mmRegQueryValueExA,RegCloseKey,RegOpenKeyExW,RegOpenKeyExW,InitializeSecurityDescriptor,InitializeAcl,AddAccessAllowedAce,AddAccessAllowedAce,SetSecurityDescriptorDacl,MD4Init,MD4Update,MD4Update,MD4Update,MD4Final,OpenSCManagerA,OpenServiceA,CreateServiceA,StartServiceA,CloseServiceHandle,CloseServiceHandle
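A monitor log like the one above is an ordered, comma-separated list of API names. The following is a minimal sketch of turning such a log into a sequence for later mining, assuming this flat format. The function name `parse_api_log` and the shortened sample string are illustrative and are not taken from the authors' tool.

```python
def parse_api_log(log_text):
    """Split a comma-separated API log into an ordered list of call names.

    Order and repeated calls are preserved, because iterative patterns
    depend on both. Empty fields and surrounding whitespace are dropped.
    """
    return [name.strip() for name in log_text.split(",") if name.strip()]


# Shortened excerpt of the Backdoor.Win32.Agent.cy log shown on the slide.
sample_log = (
    "RegOpenCurrentUser,OpenProcessToken,AllocateAndInitializeSid,"
    "CheckTokenMembership,FreeSid,RegOpenKeyExW,RegQueryValueExW,"
    "RegCloseKey,RegCloseKey"
)

calls = parse_api_log(sample_log)
print(len(calls), calls[0], calls[-1])
```

Keeping the list in call order, with duplicates such as the two consecutive `RegCloseKey` entries, is the point of the sketch: a set or a histogram alone would lose the information that pattern mining relies on.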


    Iterative Pattern Extraction

    Consider a pattern P()

    A sample database of two PEs:

    Inst(P) denotes the set of instances of P:

    e.g., {(1, 3, 5), (1, 6, 8), (2, 3, 5), (2, 8, 9)}

    Multiple occurrences of an iterative pattern are considered to reflect repetition of an iterative behavior (e.g., many worms scan a range of network addresses to replicate themselves).


    [Table: sample database of two PEs. Columns: Identifier, Sequence of API calls. Rows: Benign1, Malware1.]
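The instance set on this slide can be reproduced with a short sketch. It assumes the iterative-pattern semantics of Lo, Khoo and Liu [3]: an instance is a triple (sequence id, start position, end position), and between two consecutive events of the pattern no event from the pattern's own alphabet may occur. The pattern and the two sequences on the slide were not recoverable, so the letters and sequences below are hypothetical, constructed only so that the result matches the example set. `find_instances` is an illustrative name.

```python
def find_instances(pattern, sequences):
    """Return Inst(pattern) as (sequence id, start, end) triples, 1-indexed.

    Assumed semantics (after Lo, Khoo and Liu [3]): between two consecutive
    events of the pattern, no event from the pattern's alphabet may occur.
    """
    alphabet = set(pattern)
    instances = []
    for seq_id, seq in enumerate(sequences, start=1):
        for start, event in enumerate(seq):
            if event != pattern[0]:
                continue
            matched = 1  # number of pattern events matched so far
            end = start
            pos = start + 1
            while matched < len(pattern) and pos < len(seq):
                if seq[pos] == pattern[matched]:
                    matched += 1
                    end = pos
                elif seq[pos] in alphabet:
                    break  # an out-of-order pattern event ends this attempt
                pos += 1
            if matched == len(pattern):
                instances.append((seq_id, start + 1, end + 1))
    return instances


# Hypothetical database: each letter stands in for one API call.
api_db = [list("DBAFBAFBCE"), list("DBAFBCEAB")]
print(find_instances(("A", "B"), api_db))
# [(1, 3, 5), (1, 6, 8), (2, 3, 5), (2, 8, 9)]
```

The number of triples returned is the support of the pattern, which is the quantity compared against min_sup when frequent patterns are selected.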


    Closed Frequent Pattern

    An iterative pattern P is frequent if the number of its instances in API_DB reaches a threshold min_sup, i.e.,

    $|Inst(P, API_{DB})| \geq min\_sup$

    A frequent iterative pattern P is closed if there exists no super-sequence Q such that:

    P and Q have the same support and Inst(P) ≈ Inst(Q)


    Feature Selection


    To evaluate the discriminative power of every feature, the Fisher discrimination score was used.

    Name | Stands for
    n_i | Number of instances of class i in the dataset.
    μ | Average of the feature value over the whole dataset.
    μ_i | Average of the feature value in class i.
    σ_i | Standard deviation of the feature values in class i.
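The formula itself did not survive on this slide. A common form of the Fisher score that uses exactly the four quantities in the table is $F = \sum_i n_i (\mu_i - \mu)^2 / \sum_i n_i \sigma_i^2$. The sketch below assumes that form and uses the population variance for $\sigma_i^2$; both choices are assumptions, and `fisher_score` and the sample values are illustrative.

```python
from statistics import mean, pvariance


def fisher_score(values_by_class):
    """Fisher score of one feature, given its values grouped by class label.

    Assumed form: sum_i n_i * (mu_i - mu)^2 / sum_i n_i * sigma_i^2,
    with sigma_i^2 taken as the population variance within class i.
    """
    all_values = [v for values in values_by_class.values() for v in values]
    mu = mean(all_values)
    between = sum(
        len(values) * (mean(values) - mu) ** 2
        for values in values_by_class.values()
    )
    within = sum(
        len(values) * pvariance(values) for values in values_by_class.values()
    )
    if within == 0:
        # No spread inside any class: perfectly separating or constant feature.
        return float("inf") if between > 0 else 0.0
    return between / within


# Hypothetical supports of one pattern in four benign and four malware traces.
feature = {"benign": [0, 1, 1, 2], "malware": [4, 5, 5, 6]}
print(fisher_score(feature))
```

A higher score means the class means are far apart relative to the spread inside each class, so features can be ranked by this value and the top ones kept.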


    Dataset and Experiments

    Experiment results and dataset format.


    Data Format


    Name of PE | Single API 1 | Single API 2 | Closed Frequent API Pattern 1 | Closed Frequent API Pattern 2
    Benign 1 | Total of API1 | Total of API2 | Support | Support
    Benign 2 | Total of API1 | Total of API2 | Support | Support
    Malware 1 | Total of API1 | Total of API2 | Support | Support
    Malware 2 | Total of API1 | Total of API2 | Support | Support
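One row of this table can be assembled as in the sketch below. It assumes that the per-trace support of each selected pattern has already been computed elsewhere and is passed in as a dictionary. `build_row`, the chosen API names and the support value are hypothetical and serve only to show the layout.

```python
from collections import Counter


def build_row(api_calls, single_apis, patterns, pattern_support):
    """Build one feature row: totals of single APIs, then pattern supports.

    api_calls       -- ordered list of API names logged for one PE
    single_apis     -- API names used as single-call features (column order)
    patterns        -- selected patterns as tuples (column order)
    pattern_support -- dict mapping a pattern to its support in this PE
    """
    totals = Counter(api_calls)
    row = [totals[api] for api in single_apis]
    row += [pattern_support.get(pattern, 0) for pattern in patterns]
    return row


# Hypothetical trace and feature choices for a single PE.
trace = ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey",
         "RegOpenKeyExW", "RegCloseKey"]
single_apis = ["RegOpenKeyExW", "RegCloseKey"]
patterns = [("RegOpenKeyExW", "RegCloseKey")]
support = {("RegOpenKeyExW", "RegCloseKey"): 2}

row = build_row(trace, single_apis, patterns, support)
print(row)
```

Fixing the column order through the `single_apis` and `patterns` lists keeps training and test rows aligned, which any downstream classifier needs.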


    Experimentation Results

    10-fold cross validation was used in the classification phase.

    Results of running the classifier on a dataset with 269 malicious samples and 211 benign programs.


    S TP FN TN FP DR (%) ACC (%)

    0.1 243 26 179 32 90.33 87.91

    0.15 247 22 180 31 91.8 88.95

    0.2 247 22 180 31 91.8 88.95

    0.25 245 24 173 38 91.1 87.08

    0.3 247 22 171 40 91.8 87.02

    0.35 242 27 177 34 90 87.29

    0.4 243 26 176 35 90.3 87.29
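In every row the first two counts sum to the 269 malicious samples and the last two to the 211 benign programs. The sketch below therefore reads the four counts as detected malware, missed malware, correctly passed benign programs, and falsely flagged benign programs. Under that reading the DR and ACC columns follow from the usual definitions. The false alarm rate is taken as falsely flagged benign programs over all benign programs, which is an assumption because the slides do not define it.

```python
def detection_metrics(detected, missed, passed, flagged):
    """Return (detection rate, accuracy, false alarm rate) in percent.

    detected -- malware classified as malware
    missed   -- malware classified as benign
    passed   -- benign programs classified as benign
    flagged  -- benign programs classified as malware
    """
    total = detected + missed + passed + flagged
    detection_rate = 100 * detected / (detected + missed)
    accuracy = 100 * (detected + passed) / total
    false_alarm_rate = 100 * flagged / (flagged + passed)
    return detection_rate, accuracy, false_alarm_rate


# Counts from the S = 0.1 row of the table.
dr, acc, far = detection_metrics(243, 26, 179, 32)
print(round(dr, 2), round(acc, 2), round(far, 2))
```

For the S = 0.1 row this gives a detection rate of about 90.33% and an accuracy of about 87.92%, in line with the tabulated 90.33 and 87.91.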


    Minimum Support vs. False Alarm Rate


    [Figure: chart of false alarm rate (y-axis, 0 to 0.2) against minimum support (x-axis, 0.1 to 0.4).]


    Conclusion and Future Works

    Due to the rapid growth of malware production, tools that speed up the analysis and detection of unseen malware are needed.

    Signature-based anti-virus methods alone do not provide acceptable protection against new threats.

    Signature-based methods can gain more power with the aid of heuristic-based detection systems.

    In this work we used iterative patterns and statistical measures to further improve the classification results.

    Our best achievement is an accuracy of 88.95% and a detection rate of 91.5%.

    Future work: improving the classification using more informative data structures.


    Acknowledgement

    I am proud to thank the people who contributed their best to this research.

    Mansour Ahmadi, Dr. Ashkan Sami, Babak Yadegari

    (main contribution)


    Questions and Answers

    Thanks for your patience

    Q/A


    Thank You


    References

    [1] Ashkan Sami, Hossein Rahimi, Babak Yadegari, Naser Peiravian, Sattar Hashemi, Ali Hamze, "Malware Detection Based On Mining API Calls," ACM Symposium on Applied Computing, Data Mining Track, Sierre, Switzerland, 2010.

    [2] Christoph Csallner, Yannis Smaragdakis, Tao Xie, "DSD-Crasher: A Hybrid Analysis Tool for Bug Finding," ACM Transactions on Software Engineering and Methodology, Vol. 17, Issue 2, pp. 345-371, July 2008.

    [3] David Lo, Siau-Cheng Khoo, Chao Liu, "Efficient Mining of Iterative Patterns for Software Specification Discovery," 13th SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'07), California, Aug 12-15, 2007.

    [4] M. Christodorescu, S. Jha, S. Seshia, D. Song, R. Bryant, "Semantics-aware malware detection," IEEE Symposium on Security and Privacy, 2005.

    [5] P. Szor, The Art of Computer Virus Research and Defense, Addison Wesley, 2005.

    [6] M. Christodorescu, S. Jha, "Static analysis of executables to detect malicious patterns," USENIX Security Symposium, 2003.

    [7] T. Yetiser, "Polymorphic Viruses Implementation, detection, and protection," 1993.

    [8] VX Heavens, computer virus collection. Available: http://vx.netlux.org/vl.php

    [9] Clemens Kolbitsch, Paolo Milani Comparetti, Christopher Kruegel, Engin Kirda, Xiaoyong Zhou, XiaoFeng Wang, "Effective and Efficient Malware Detection at the End Host," 18th USENIX Security Symposium, 2009.

    [10] J. Dai, R. Guha, J. Lee, "Efficient Virus Detection Using Dynamic Instruction Sequences," Journal of Computers, 2009.
