ZOZZLE: FAST AND PRECISE IN-BROWSER JAVASCRIPT MALWARE DETECTION. Charles Curtsinger (UMass at Amherst), Benjamin Livshits and Benjamin Zorn (Microsoft Research), Christian Seifert (Microsoft). 20th USENIX Security Symposium (August, 2011)

ZOZZLE: FAST AND PRECISE IN-BROWSER JAVASCRIPT MALWARE DETECTION

Charles Curtsinger

UMass at Amherst

Benjamin Livshits and Benjamin Zorn

Microsoft Research

Christian Seifert

Microsoft

20th USENIX Security Symposium (August, 2011)

ZOZZLE: LOW-OVERHEAD MOSTLY STATIC JAVASCRIPT MALWARE DETECTION

Charles Curtsinger

UMass at Amherst

Benjamin Livshits and Benjamin Zorn

Microsoft Research

Christian Seifert

Microsoft

Microsoft Research Technical Report (November, 2010)

A Seminar at Advanced Defense Lab

Outline

Introduction
Observation on Offline Nozzle
Design
Experiment
Evaluation

2011/5/24

Introduction

In the last several years, we have seen mass-scale exploitation of memory-based vulnerabilities migrate towards heap spraying attacks.

But many solutions are not lightweight enough to be integrated into a commercial browser.

About Nozzle

The overhead of this runtime technique may be 10% or higher.

This paper is based on our experience using NOZZLE for offline scanning.

Offline scanning is also not as effective against transient malware that appears and disappears frequently.

About Zozzle

ZOZZLE is integrated with the browser’s JavaScript engine to collect and process JavaScript code that is created at runtime.

Our focus in this paper is on creating a very low false positive, low overhead scanner.

Observation on Offline Nozzle

Once we determined that JavaScript was malicious, we invested considerable effort in examining the code by hand and categorizing it in various ways.

We investigated 169 malware samples.

Distribution of Different Exploit Samples

Transience of Detected Malicious URLs

JavaScript eval Unfolding

Distribution of Context Counts

Design

Training Data Extraction and Labeling

We start by augmenting the JavaScript engine in a browser with a “deobfuscator” that extracts and collects individual fragments of JavaScript.

Detours

jscript.dll

Compile function (COlescript::Compile())
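
The idea of collecting code at the point where it is compiled can be illustrated in plain JavaScript. The sketch below is only an analogy: the slide describes hooking the native Compile function in jscript.dll with Detours, whereas this example wraps a stand-in compile callback. The names `makeCollector`, `hookedCompile`, and `contexts` are hypothetical and chosen for illustration.

```javascript
// Hypothetical userland analogue of the deobfuscator: every string that
// reaches the compile step is recorded as a separate code context.
function makeCollector(compile) {
  const contexts = [];
  function hookedCompile(source) {
    contexts.push(source);   // record the already unfolded code
    return compile(source);  // then hand it to the real compiler
  }
  return { hookedCompile, contexts };
}

// Stand-in "engine": compiling a string means building a Function from it.
const collector = makeCollector((src) => new Function(src));

// An obfuscated script eventually has to pass its payload to the compiler.
const payload = unescape("%72%65%74%75%72%6e%20%34%32"); // "return 42"
const compiled = collector.hookedCompile(payload);
const result = compiled();
```

Because the hook sits at the compile step, the collected text is whatever the obfuscation finally unfolds to, regardless of how many layers of encoding preceded it.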

Feature Extraction

We create features based on the hierarchical structure of the JavaScript abstract syntax tree (AST).
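
A minimal sketch of AST-based feature extraction follows. It assumes that a feature pairs the nearest enclosing construct with the text of a node; that pairing, the node shape `{ type, text, children }`, and the list of context types are assumptions made for illustration. The tree is written by hand rather than produced by a parser.

```javascript
// Hand-built toy AST for:  for (...) { shellcode = unescape("%u9090"); }
// The node shape ({ type, text, children }) is a hypothetical simplification.
const ast = {
  type: "Program", text: "", children: [
    { type: "Loop", text: "for", children: [
      { type: "Assign", text: "shellcode", children: [
        { type: "Call", text: "unescape", children: [
          { type: "String", text: "%u9090", children: [] },
        ] },
      ] },
    ] },
    { type: "Call", text: "alert", children: [] },
  ],
};

// Constructs treated as "contexts" in this sketch (an assumption).
const CONTEXT_TYPES = new Set(["Loop", "Function", "Conditional", "TryCatch"]);

// Walk the tree and emit "context:text" features, where context is the
// nearest enclosing construct from CONTEXT_TYPES (or "Top" if none).
function extractFeatures(node, context = "Top", out = new Set()) {
  if (node.text) out.add(context + ":" + node.text);
  const inner = CONTEXT_TYPES.has(node.type) ? node.type : context;
  for (const child of node.children) extractFeatures(child, inner, out);
  return out;
}

const features = extractFeatures(ast);
```

Here `unescape` inside a loop becomes the feature `Loop:unescape`, which is distinct from the same call at top level; that distinction is what the tree structure adds over flat string matching.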

Feature Selection

χ² test

            With feature   Without feature
malicious   A              C
benign      B              D

$$\chi^2 = \frac{(A D - B C)^2}{(A + C)(B + D)(A + B)(C + D)}$$

Features with $\chi^2 \geq 10.83$ are selected (99.9% confidence).
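
The selection step can be sketched as follows. This sketch uses the textbook Pearson statistic for a 2×2 table, which multiplies the ratio by the total count N = A + B + C + D; that factor is an assumption of the sketch, included because 10.83 is the critical value of the N-scaled statistic with one degree of freedom. The function name `chiSquared` and all counts are invented for illustration.

```javascript
// Pearson chi-squared statistic for a 2x2 contingency table.
// a: malicious with feature, b: benign with feature,
// c: malicious without feature, d: benign without feature.
function chiSquared(a, b, c, d) {
  const n = a + b + c + d;
  const denom = (a + c) * (b + d) * (a + b) * (c + d);
  if (denom === 0) return 0; // degenerate table: feature carries no signal
  return (n * (a * d - c * b) ** 2) / denom;
}

const THRESHOLD = 10.83; // 99.9% confidence, one degree of freedom

// Hypothetical counts, invented for illustration only.
const candidates = [
  { name: "Loop:unescape", a: 80, b: 5, c: 20, d: 95 },
  { name: "Top:var", a: 50, b: 52, c: 50, d: 48 },
];

// Keep only the features whose statistic reaches the threshold.
const selected = candidates
  .filter((f) => chiSquared(f.a, f.b, f.c, f.d) >= THRESHOLD)
  .map((f) => f.name);
```

With these made-up counts, a feature that appears almost equally often in both classes falls far below the threshold and is dropped, while one concentrated in malicious contexts is kept.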

Classifier Training

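Naïve Bayesian classifier

$$P(L_i \mid F_1, \ldots, F_n) = \frac{P(L_i)\, P(F_1, \ldots, F_n \mid L_i)}{P(F_1, \ldots, F_n)}$$

$$P(F_1, \ldots, F_n \mid L_i) = \prod_{k=1}^{n} P(F_k \mid F_1, \ldots, F_{k-1}, L_i)$$

Assume the features to be conditionally independent

$$P(L_i \mid F_1, \ldots, F_n) = \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid F_1, \ldots, F_{k-1}, L_i)}{P(F_1, \ldots, F_n)} = \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i)}{P(F_1, \ldots, F_n)}$$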

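Naïve Bayesian classifier

$$C_{\text{script}} = \arg\max_{L_i \in \text{possible labels}} P(L_i \mid F_1, \ldots, F_n) = \arg\max_{L_i} \frac{P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i)}{P(F_1, \ldots, F_n)} = \arg\max_{L_i} P(L_i) \prod_{k=1}^{n} P(F_k \mid L_i)$$

Complexity: linear time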
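
The decision rule can be sketched in a few lines. The sketch works in log space to avoid underflow, treats each feature as present or absent, and uses add-one smoothing; all three choices, the function names `train` and `classify`, and the toy data are assumptions made for illustration, not details taken from the slides.

```javascript
// Train: estimate P(L) and P(F_k | L) from labeled feature sets.
// Add-one (Laplace) smoothing is an assumption of this sketch.
function train(samples, featureList) {
  const model = {};
  for (const label of new Set(samples.map((s) => s.label))) {
    const rows = samples.filter((s) => s.label === label);
    const condProb = {};
    for (const f of featureList) {
      const hits = rows.filter((r) => r.features.includes(f)).length;
      condProb[f] = (hits + 1) / (rows.length + 2);
    }
    model[label] = { prior: rows.length / samples.length, condProb };
  }
  return model;
}

// Classify: argmax over labels of log P(L) + sum_k log P(F_k | L).
// The shared denominator P(F_1..F_n) is dropped; one pass over the features.
function classify(model, featureList, present) {
  let best = null;
  let bestScore = -Infinity;
  for (const [label, { prior, condProb }] of Object.entries(model)) {
    let score = Math.log(prior);
    for (const f of featureList) {
      const p = condProb[f];
      score += Math.log(present.has(f) ? p : 1 - p);
    }
    if (score > bestScore) { bestScore = score; best = label; }
  }
  return best;
}

// Invented toy data.
const featureList = ["Loop:unescape", "Top:alert"];
const samples = [
  { label: "malicious", features: ["Loop:unescape"] },
  { label: "malicious", features: ["Loop:unescape", "Top:alert"] },
  { label: "benign", features: ["Top:alert"] },
  { label: "benign", features: [] },
];
const model = train(samples, featureList);
const verdict = classify(model, featureList, new Set(["Loop:unescape"]));
```

Each label costs one pass over the feature list, so classification time grows linearly with the number of features, which matches the complexity noted on the slide.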

Fast Pattern Matching

Fast Pattern Matching (cont.)

Experiment

Malicious Samples
919 deobfuscated malicious contexts

Benign Samples
Alexa top 50 URLs
7,976 contexts

Feature Selection

hand-picked vs. automatically selected

Evaluation

HP xw4600 workstation
Intel Core2 Duo 3.16 GHz
4 GB memory
Windows 7 64-bit Enterprise

Effectiveness

Training Set Size

Feature Set Size

Comparison with Other Techniques

Performance: Context Size

Performance: Feature Set

THANK YOU

JAVASCRIPT OBFUSCATION

I think these are all of them…

unescape("%48%65%6c%6c%6f%57%6f%72%6c%64")

"\u0048\u0065\u006C\u006C\u006F\u0057\u006F\u0072\u006C\u0064"

document.write("alert('1')"); eval("alert(1)");

"H976e246l3l2o19W42o45r7l88d734".replace(/[0-9]/g, "")

If I want to eval…

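<script>
Function("alert('1')")();                 // Function constructor
setTimeout("alert('1')");                 // string argument is evaluated
execScript("alert('1')", "javascript");   // Internet Explorer only
[].constructor.constructor('alert(1)')(); // Array -> Function constructor
window["eval"]("alert('1')");             // eval reached by property lookup
</script>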

On the web, I found …

<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+

[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]])(+!+[])

</script>
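
The script above is written using only the characters `[`, `]`, `(`, `)`, `!` and `+`. A small sketch of how its building blocks decode is given below; the constant names are hypothetical and only serve readability. The last bracketed group of the script spells the property name "alert" in exactly this way.

```javascript
// Building blocks: type coercion turns [], ! and + into strings and numbers.
const FALSE = ![] + [];            // boolean false + empty array -> "false"
const TRUE = !![] + [];            // boolean true + empty array  -> "true"
const ONE = +!+[];                 // +[] is 0, !0 is true, +true is 1
const TWO = !+[] + !+[];           // true + true
const THREE = !+[] + !+[] + !+[];  // true + true + true

// Indexing those strings yields individual letters.
const word = FALSE[ONE] + FALSE[TWO] + TRUE[THREE] + TRUE[ONE] + TRUE[+[]];
```

A scanner that only looks at the raw source text sees no identifiers or string literals here, which is why inspecting code after it has been unfolded by the engine matters.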

THE END
