exploiting diverse observation perspectives to get insights on the malware landscape
Post on 25-Feb-2016
EXPLOITING DIVERSE OBSERVATION PERSPECTIVES TO GET INSIGHTS ON THE MALWARE LANDSCAPE
Corrado Leita, Symantec Research Labs
Ulrich Bayer, Technical University Vienna
Engin Kirda, Institute Eurecom @ iSecLab
ADLab Meeting 2
Outline
Introduction
Related Work
SGNET and EPM Clustering
Results
Conclusion
2010/7/20
INTRODUCTION
Introduction
30,000 samples per day are submitted to the VirusTotal website
  On the order of millions of samples per month
Malware writers can generate new code from existing code bases, or by re-packing the binaries with code obfuscation tools
  e.g., the Allaple worm
Introduction
A complete picture of the complexity of the malware landscape is possible only by discerning polymorphic instances from new variants
Get quantitative insights on the interrelations among the different families, and on the extent to which malware writers share code and produce patches to known variants
Introduction
SGNET dataset
Combine clustering techniques based on either static or behavioral characteristics of the malware samples
RELATED WORK
Related Work
Gheorghescu, 2005
  Disassembling the samples and comparing their basic blocks
Kolter and Maloof, 2006
  Comparing a hex dump of their code segments
Wicherski, 2009, peHash
  Polymorphic binaries receive the same hash value
  According to the portions of the PE header that are not mutated
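
The idea of hashing only the parts of a header that are not mutated can be sketched as below. This is a toy illustration, not Wicherski's peHash algorithm: the field names, the sample values, and the choice of which fields count as stable are all assumptions made for the example.

```python
import hashlib

# Hypothetical set of header fields assumed to survive polymorphic mutation.
STABLE_FIELDS = ("machine", "subsystem", "number_of_sections", "section_names")

def structural_hash(header: dict) -> str:
    """Hash only the fields assumed to be stable across mutated instances."""
    parts = [f"{name}={header[name]}" for name in STABLE_FIELDS]
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()

# Made-up header summaries; 'checksum' stands in for a field that mutates.
a = {"machine": 0x14C, "subsystem": 2, "number_of_sections": 3,
     "section_names": (".text", ".data", ".rsrc"), "checksum": 0x1A2B}
b = dict(a, checksum=0x9F00)        # same structure, mutated field
c = dict(a, number_of_sections=4)   # structural change

same_family = structural_hash(a) == structural_hash(b)
different = structural_hash(a) != structural_hash(c)
```

Under these assumptions `a` and `b` collapse to one hash value while `c` does not, which is the property the slide describes for polymorphic binaries.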
Related Work
Lee and Mody, 2006
  Based on system call traces
  First attempts to cluster malware according to its behavior
Bailey et al., 2007
  The first to build a clustering system that describes a sample’s behavior in more abstract terms
  O(n^2)
Related Work
Anubis
  http://anubis.iseclab.org/
  Data tainting
  The tracking of sensitive compare operations
  Dynamic analysis system for capturing a sample’s behavior
SGNET AND EPM CLUSTERING
SGNET and EPM Clustering
SGNET focuses on the collection of detailed information on code injection attacks and on the sources responsible for these attacks
VirusTotal
Anubis
SGNET and EPM Clustering
SGNET
ScriptGen
  Learning 0-day behavior
Argos
  Program flow hijack detection
Nepenthes
  Shellcode emulation
  Malware download
SGNET and EPM Clustering
Sensor: ScriptGen FSM
Sample Factory: Argos
Shellcode handlers: Nepenthes
EPM CLUSTERING
EPM Clustering
Epsilon-Gamma-Pi-Mu (EGPM) model
  Exploit (ε)
  Bogus control data (γ)
  Payload (π)
  Malware (μ)
Assumption: any randomization performed by the attacker has a limited scope
γ is not considered, due to the lack of host-based information in the SGNET dataset
EPM Clustering
Phase 1: feature definition
EPM Clustering
Pi
  PUSH-based interaction
  PULL-based interaction
  Central repository
Mu
  PE header characteristics seem to be more difficult to mutate
  A change in their value is likely to be associated with a modification or recompilation of an existing codebase
EPM Clustering
Clearly, all of the features taken into account for the classification could be easily randomized by the malware writer
More complex (costly) polymorphic approaches might appear in the future
EPM Clustering
Phase 2: invariant discovery
An invariant value is a value that is not specific to a certain:
  Attack instance
  Attacker
  Destination
Threshold-based:
  At least 10 different attack instances
  At least 3 different attackers
  At least 3 honeypot IPs
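
The threshold rule above can be sketched as a small filter. The three thresholds come from the slide; the event layout (keys `id`, `attacker`, `honeypot`), the helper name, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

# Thresholds taken from the slide.
MIN_INSTANCES, MIN_ATTACKERS, MIN_HONEYPOTS = 10, 3, 3

def find_invariants(events, feature):
    """Return the values of `feature` that pass all three thresholds.

    `events` is an iterable of dicts with hypothetical keys
    'id', 'attacker', 'honeypot', plus one key per feature.
    """
    seen = defaultdict(lambda: (set(), set(), set()))
    for ev in events:
        instances, attackers, honeypots = seen[ev[feature]]
        instances.add(ev["id"])
        attackers.add(ev["attacker"])
        honeypots.add(ev["honeypot"])
    return {
        value
        for value, (instances, attackers, honeypots) in seen.items()
        if len(instances) >= MIN_INSTANCES
        and len(attackers) >= MIN_ATTACKERS
        and len(honeypots) >= MIN_HONEYPOTS
    }

# Made-up events: one shared port, but a different MD5 in every instance.
events = [
    {"id": i, "attacker": f"a{i % 4}", "honeypot": f"h{i % 3}",
     "port": 9988, "md5": f"md5-{i}"}
    for i in range(12)
]
port_invariants = find_invariants(events, "port")
md5_invariants = find_invariants(events, "md5")
```

In this made-up data the port is seen in 12 instances from 4 attackers on 3 honeypots, so it qualifies as an invariant; each MD5 appears once, so none does.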
EPM Clustering
Phase 3: pattern discovery
T = (v1, v2, v3, …, vn)
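
One way to read the tuple T is that each position holds either an invariant value or a "do not care" marker. The sketch below works under that assumption; the wildcard symbol, the feature names, and the helper functions are hypothetical and not taken from the slides.

```python
WILDCARD = "*"  # hypothetical "do not care" marker

def to_pattern(instance, features, invariants):
    """Build T = (v1, ..., vn): keep invariant values, wildcard the rest."""
    return tuple(
        instance[f] if instance[f] in invariants.get(f, set()) else WILDCARD
        for f in features
    )

def discover_patterns(instances, features, invariants):
    """Collect the distinct patterns observed over all attack instances."""
    return {to_pattern(inst, features, invariants) for inst in instances}

features = ("protocol", "port", "md5")
# Assumed output of the invariant discovery phase.
invariants = {"protocol": {"push"}, "port": {9988}, "md5": set()}
instances = [
    {"protocol": "push", "port": 9988, "md5": "aa"},
    {"protocol": "push", "port": 9988, "md5": "bb"},
    {"protocol": "push", "port": 4444, "md5": "cc"},
]
patterns = discover_patterns(instances, features, invariants)
```

Here the first two instances collapse into one pattern, because their MD5 values are not invariants and become wildcards.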
EPM Clustering
Phase 4: pattern-based classification
Clustering
  Multiple patterns could match the same instance
  Each instance is always associated with the most specific pattern matching its feature values
  All the instances associated with the same pattern are said to belong to the same EPM cluster
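
The classification rule can be sketched as follows. Measuring specificity as the number of non-wildcard positions is an assumption, as are the wildcard symbol and the sample patterns; ties are broken by list order in this sketch.

```python
from collections import defaultdict

WILDCARD = "*"  # hypothetical "do not care" marker

def matches(pattern, values):
    """A pattern matches when every non-wildcard position is equal."""
    return all(p == WILDCARD or p == v for p, v in zip(pattern, values))

def specificity(pattern):
    """Assumed measure: number of positions that are not wildcards."""
    return sum(p != WILDCARD for p in pattern)

def classify(instances, patterns):
    """Map each instance to the most specific pattern that matches it."""
    clusters = defaultdict(list)
    for name, values in instances.items():
        candidates = [p for p in patterns if matches(p, values)]
        if candidates:
            clusters[max(candidates, key=specificity)].append(name)
    return dict(clusters)

patterns = [("push", WILDCARD), ("push", 9988), (WILDCARD, WILDCARD)]
instances = {"i1": ("push", 9988), "i2": ("push", 4444), "i3": ("pull", 80)}
clusters = classify(instances, patterns)
```

Instance `i1` matches all three patterns but lands in the cluster of `("push", 9988)`, the most specific one.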
EPM Clustering
E-clusters Exploit
P-clusters Payload
M-clusters Malware
EPM Clustering
B-clusters
  Anubis
  Compare two samples based on their behavioral profile
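
The slides do not say how two behavioral profiles are compared. As a minimal sketch, assume each profile is a set of abstract behavior features and use Jaccard similarity; the feature names below are made up for the example.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical behavioral profiles, one set of features per sample.
p1 = {"file_create", "registry_set_run", "irc_connect"}
p2 = {"file_create", "registry_set_run", "dns_query"}
p3 = {"smtp_send"}

sim_12 = jaccard(p1, p2)  # 2 shared features out of 4 distinct ones
sim_13 = jaccard(p1, p3)  # nothing in common
```

Samples whose similarity exceeds a chosen threshold would then be placed in the same B-cluster; the threshold itself is a tuning choice not given in the slides.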
RESULTS
Results
Data: Jan 2008 to May 2009, collected by the SGNET deployment
6353 malware samples
  Only 5165 could be correctly executed in Anubis
  Some malware samples could not be downloaded correctly by Nepenthes
Results
39 E-clusters
27 P-clusters
260 M-clusters
972 B-clusters
Results
Results
#(exploit/payload combinations) is low
  Most malware variants seem to share few distinct exploitation routines for propagation
#(B-clusters) is lower than #(M-clusters)
  Some M-clusters are likely to correspond to variations of the same codebase
Results
Clustering anomalies
860 B-clusters are composed of a single malware sample and are associated with a single attack instance in the SGNET dataset
A small number of size-1 B-clusters have a 1-1 association with a static M-cluster
Mostly…
Results
Results
P-pattern 45:
  PUSH-based download
  TCP port 9988
Results
M-cluster 13:
Results
M-cluster 13 is a polymorphic malware associated with several different B-clusters
  MD5 is not an invariant
  Allaple mutates its content at each attack instance
Results
Each behavioral profile corresponds to an execution time of 4 minutes
Bot?
Honeypots may help!
Results
Results
Allaple worm
  Exploiting MS04-007
  DoS attacks
Results
IRC servers
CONCLUSION
Conclusion
Combine different clustering techniques
Improve effectiveness in building intelligence on the threats economy