Pimp My PE: Parsing Malicious and Malformed Executables
Virus Bulletin 2007
Authors
• Sunbelt Software, Tampa FL
• Anti-Malware SDK team:
  – Casey Sheehan, lead developer
  – Nick Hnatiw, developer / researcher
  – Tom Robinson, developer / researcher
  – Nick Suan, developer / researcher
Purpose
• Chronicles the early development of our detection engine
  – Specifically, the PE parser
  – Building enterprise infrastructure to support development
• Technical issues:
  – Understand malformations prevalent in wild PEs
  – Methods for identifying malicious PEs
  – Reliably parsing PEs
• Pietrek’s article “An In-Depth Look into the Win32 Portable Executable File Format” [3] is a great intro, but much more is needed to successfully process modern PEs
• Virtually all commercial analysis tools have serious issues parsing malicious PEs
Overview
• Introduction
• Technical Background
• Infrastructure
• Image parsing in depth
Part 1: Introduction
The Need
• Ability to parse any PE into a robust internal representation
• Ability to detect and remediate threats
The Problem
• Initial assumption: parsing is easy
  – A simple parser should be able to cope with all samples
• Reality: malicious samples break the parser constantly
• Reaction: are these all corrupted PEs?
• Realization: Windows loader behavior is a valuable comparison metric
  – If Windows loads an image, we had better parse it
  – Corrupted images are, at the very least, suspicious
• In summary:
  – Implementations in the literature perform poorly versus threats in the wild; they generally cope poorly with “malformed” images
  – A large percentage of images in the wild are malformed (68%)
The Problem (cont’d)
• The actual problem: building a parser to effectively process modern, malicious PEs
• Key hurdles:
  – Qualify behavior of Windows loader for comparison purposes
  – Analyze and categorize “anomalous” characteristics of sample images which identify malformed images
  – Iteratively improve parser performance (i.e., avoid performance regression)
The Solution
• Iteratively build and test parser
• Constant regression testing
  – Ensure new features don’t cause overall performance to regress
• Verify performance vs. Windows loader
  – Gauge parser performance in absolute terms
Image Anomalies
• Anomaly:
  – specific structural malformation; a particular field malformed a particular way
  – frequently inconsistent with PE specification, or just unusual or suspicious
• Analysis of anomalies and other structural characteristics provides key insight into common image malformations
Part 2: Background
Basic PE Structure
• PE Header
• PE Sections
• Overlay (optional)
[Diagram: image layout, in order: header, sect 1, sect 2, …, sect n, overlay]
Alignment
• Alignment applies to section mapping
• PE header specifies two sectional alignment values
  – File alignment (the FileAlignment field) specifies file mapped alignment
  – Virtual alignment (the SectionAlignment field) specifies virtual mapped alignment
Image Mapping
• Windows loader performs “map and load” operation:
  – Map:
    – Size the view
    – Create view in process VA space
    – Allocate storage
  – Load image section by section
• Our parser mimics this behavior
  – “Source representation”
    • Frequently file mapped (linker output)
    • However, we may be given a memory mapped image with no corresponding file image
  – “Target representation”
    • Typically virtual mapped
Mapping Translation
• Need to handle both file- and virtual-mapped images
• cImageStream class
  – Accepts any source representation
  – Translates to requested target representation
  – Manages all stream-related details
Section Size
• Fundamental concept when dealing with sections due to variable section alignment
  – Applies to header and sections
• 3 unique size concepts:
  – Raw size: unpadded data size
  – File size: RoundUp(raw_size, file_align)
    • “File cave”; persistent
  – Virtual size: RoundUp(file_size, virtual_align)
    • “Virtual cave”; transient
  – Be precise!
    • Always explicitly state the size type in source code
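The three size concepts can be sketched in a few lines of Python. This is only an illustration of the RoundUp relationships stated on the slide; the function names and the alignment values (0x200 and 0x1000, chosen because they are common in practice) are assumptions, not taken from the talk.

```python
def round_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return ((value + alignment - 1) // alignment) * alignment


def section_sizes(raw_size: int, file_align: int, virtual_align: int) -> dict:
    """Return raw, file and virtual sizes plus the padding ("cave") each step adds."""
    file_size = round_up(raw_size, file_align)
    virtual_size = round_up(file_size, virtual_align)
    return {
        "raw_size": raw_size,
        "file_size": file_size,
        "virtual_size": virtual_size,
        "file_cave": file_size - raw_size,          # padding stored on disk
        "virtual_cave": virtual_size - file_size,   # padding that exists only when mapped
    }


# Illustrative values only: 0x1234 bytes of real content.
sizes = section_sizes(raw_size=0x1234, file_align=0x200, virtual_align=0x1000)
for name, value in sizes.items():
    print(f"{name:13} 0x{value:X}")
```

Naming each variable after its size type, as the slide recommends, keeps the three quantities from being confused at call sites.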
Section Size (cont’d)
• Interesting (and annoying) that raw section size is unavailable
  – Important if you want size of REAL content!
  – E.g., when parsing structures in the header
  – … Or instructions (atoms) in a code section
• In practice, file aligned size is often treated as synonymous with raw size
• Demo:
  – Dump basic white file; identify raw, file, virtual sizes
PE Structure
• PE header:
  – Documents “explicit” image structure
  – Vs. “implicit” structure
• PE section
  – Primary image content
  – Code, data, etc.
  – Described in header’s section table
• Overlay: non-loadable data, appended to PE image
  – Certificates
  – Debug info
  – Malware-specific payload
  – Demo: Ganda
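A common way to locate an overlay is to find where the last section's on-disk data ends and treat anything after that point as appended data. The sketch below assumes well-formed section table values and uses a hypothetical `find_overlay` helper with made-up offsets; it is not the method used by the authors' parser, which has to cope with malformed tables.

```python
from typing import List, Optional, Tuple


def find_overlay(file_size: int,
                 sections: List[Tuple[int, int]],
                 header_size: int) -> Optional[Tuple[int, int]]:
    """Return (offset, length) of data appended after the image, or None.

    sections holds (raw_offset, raw_size_on_disk) pairs from the section table.
    """
    end_of_image = header_size
    for raw_offset, raw_size in sections:
        if raw_size == 0:
            continue  # section occupies no space on disk
        end_of_image = max(end_of_image, raw_offset + raw_size)
    if end_of_image >= file_size:
        return None
    return end_of_image, file_size - end_of_image


# Hypothetical layout: two sections, followed by 0x600 bytes of appended data.
layout = [(0x400, 0x1000), (0x1400, 0x600)]
overlay = find_overlay(file_size=0x2000, sections=layout, header_size=0x400)
print(overlay)
```

With wild section offsets or sizes the computed end of image can exceed the file size, which is why the function reports no overlay in that case rather than a negative length.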
PE Structural Abstractions
• Metasection:
  – abstraction for header, section, overlay components
• Metadata:
  – predefined data types
  – enumerated in the Data Directory (“DD”)
  – scattered throughout the image (and overlay)
Part 3: Enterprise Infrastructure: Data Management & Analysis
Infrastructure Overview
[Diagram labels: Black Files, White Files, Analysis DB, Regression DB, Data Warehouse, Analysis Tools (PeSweep, PEiD, Regression Suite)]
Data Repositories
• PE repository consists of
  – ~9,000 known good PEs (“white collection”)
  – ~70,000 known malicious PEs (“black collection”)
• Images processed through two tools
  – PEiD packer identifier [1]
  – Proprietary static analyzer PeSweep
• Post-process tool output, import into DB
• Mine DB for interesting correlations
  – Data mining is speculative, iterative, time-consuming
  – Results shown here are tip of iceberg
PeSweep Analysis
• Analyzes a single file or a directory, with optional recursion
• For every file processed, generates info on:
  – Whether Windows is able to load it (inferred)
  – How much of the structure the parser is able to parse
  – Entropy values on a sectional basis
  – Header structure
  – Anomaly bits
• Able to create both file and virtually mapped target mappings of the image
• Fully parses “explicit” content (header + metadata): import, export, relocation, resource, etc. values
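The slides do not say which entropy measure PeSweep reports. A minimal sketch, assuming Shannon entropy over byte values (a common choice for spotting packed or encrypted sections), could look like this; the section layout and names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform data."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def sectional_entropy(image: bytes, sections):
    """Compute entropy per section; sections is an iterable of (name, offset, size)."""
    return {
        name: shannon_entropy(image[offset:offset + size])
        for name, offset, size in sections
    }


# Synthetic image: 256 zero bytes followed by every byte value once.
image = bytes(256) + bytes(range(256))
report = sectional_entropy(image, [(".zero", 0, 256), (".mixed", 256, 256)])
for name, value in report.items():
    print(f"{name}: {value:.2f} bits/byte")
```

Values near 8 bits per byte usually indicate compressed or encrypted content, while padding and sparse data score close to 0.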
Sample Analysis Results
Section Name Frequency
Sectional Analysis
Overlay Prevalence
Anomaly Frequency
Analysis Summary
• We’re profiling characteristics of known-bad and known-good images
• Distilling these results into general rules for filtering files at runtime
• These rules could help identify suspicious files
  – E.g., the more suspicious a file, the more analysis resources it receives
Analysis Use Case 1
• Goal: Identify Loadable PEs
  – Classify PEs as valid / invalid at runtime
• Approach: synthesized “loader test”
  – Indicates whether Windows will run the file
  – Composed of CreateProcess / LoadLibraryEx calls
  – Run across NT, 2000, XP, Vista
Loader Test Results
Analysis Data Use Case 2
• Goal: Identify Malicious PEs
  – Obviously a runtime heuristic generating a reliable “Is Suspicious” flag is valuable
• Single query of anomaly bits
  – Identifies 67% of black list
  – Identifies 1.4% of white list
• This could be improved dramatically by increasing the sophistication of our query.
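A single query over anomaly bits amounts to testing each file's bit field against a mask and measuring the hit rate per collection. The sketch below shows that shape only: the flag names, the mask, and the tiny sample collections are all invented for illustration and do not reproduce PeSweep's bit assignments or the percentages above.

```python
from enum import IntFlag


class Anomaly(IntFlag):
    # Hypothetical flag names; the talk does not list the real bits.
    SECTION_RVA_IN_HEADER = 1 << 0
    WILD_SECTION_SIZE = 1 << 1
    HEADERS_OVERLAP = 1 << 2
    WILD_DATA_DIRECTORY = 1 << 3


# Assumed query: flag a file if either of these bits is set.
SUSPICIOUS_MASK = Anomaly.SECTION_RVA_IN_HEADER | Anomaly.HEADERS_OVERLAP


def is_suspicious(bits: Anomaly, mask: Anomaly = SUSPICIOUS_MASK) -> bool:
    """Return True if any anomaly bit selected by the mask is set."""
    return bool(bits & mask)


def hit_rate(collection, mask: Anomaly = SUSPICIOUS_MASK) -> float:
    """Fraction of a collection flagged by the query."""
    if not collection:
        return 0.0
    return sum(is_suspicious(bits, mask) for bits in collection) / len(collection)


# Made-up sample data, not measured results.
black = [
    Anomaly.HEADERS_OVERLAP,
    Anomaly.WILD_SECTION_SIZE,
    Anomaly.SECTION_RVA_IN_HEADER | Anomaly.WILD_DATA_DIRECTORY,
    Anomaly(0),
]
white = [Anomaly(0), Anomaly(0), Anomaly.WILD_SECTION_SIZE, Anomaly(0)]

print(f"black hit rate: {hit_rate(black):.0%}")
print(f"white hit rate: {hit_rate(white):.0%}")
```

A more sophisticated query, as the slide suggests, could weight bits differently or require combinations instead of a single OR mask.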
Part 4: Image Parsing in Depth
PE Parser Class Organization
PE Parsing Flowchart
ImageStream Initialization
• Same MapAndLoad process as before
• Calculate target stream size
  – Sum source stream metasection sizes, according to target stream mapping
• Construct target stream
  – Copy each source metasection at computed offset in target stream
  – Delicate process due to possible structural anomalies
• Parse anomalies are tracked throughout entire parsing process
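The size-then-copy sequence can be sketched as follows. This is a deliberately simplified model, not the cImageStream implementation: it assumes metasections are laid out back to back in table order at a single target alignment, whereas a real virtual mapping places each section at the RVA declared in the section table. The function name and the small alignment value are illustrative only.

```python
from typing import Dict, List, Tuple


def round_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return ((value + alignment - 1) // alignment) * alignment


def build_target_stream(metasections: List[Tuple[str, bytes]],
                        target_align: int) -> Tuple[bytes, Dict[str, int]]:
    """Size the target stream, then copy each metasection to its computed offset."""
    # Pass 1: compute each offset and the total size under the target alignment.
    offsets: Dict[str, int] = {}
    cursor = 0
    for name, data in metasections:
        offsets[name] = cursor
        cursor += round_up(len(data), target_align)

    # Pass 2: allocate zero-filled storage and copy the content in.
    stream = bytearray(cursor)
    for name, data in metasections:
        start = offsets[name]
        stream[start:start + len(data)] = data
    return bytes(stream), offsets


# Tiny synthetic source: header plus two sections (alignment kept small for readability).
source = [("header", b"MZ" + bytes(30)), (".text", b"\x90" * 10), (".data", b"\x01\x02\x03")]
stream, offsets = build_target_stream(source, target_align=0x20)
print(len(stream), offsets)
```

Sizing before copying means a malformed size can be detected and recorded as an anomaly before any buffer is written.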
Stream Normalization
• Problem: MapAndLoad process is fragile
  – Image structure can be corrupted in a myriad of different ways
  – Non-validated fields can lead to crashes during mapping and loading
• Solution: preliminary scan of header
  – “Normalization” pass through the header to fix obviously illegal values
  – Guarantee subsequent parse pass succeeds
• Initial results were promising!
Stream Normalization (cont’d)
• Sample “illegal” values:
  – Section table entry RVA falls within the header
  – Section table entry has wild RVA and size values
  – Header structures overlap
  – Wild DD entries
• TinyPE breaks them all! [2]
  – File ends before nominal end of OptHdr!
• Demo
• Summary:
  – Normalization must allow many degenerate cases
  – Less is more
    • none is best ☺
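In keeping with the "less is more" conclusion, one option is to report suspicious section table values without rewriting them. The sketch below checks a single entry against a few of the sample conditions listed above; the `SectionEntry` type, the field names, and the bounds used are simplified assumptions rather than the authors' parser code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SectionEntry:
    # Hypothetical, simplified view of one section table entry.
    name: str
    rva: int
    virtual_size: int
    raw_offset: int
    raw_size: int


def scan_section(entry: SectionEntry, header_size: int,
                 file_size: int, image_size: int) -> List[str]:
    """Report anomalies for one entry; the entry itself is never modified."""
    found = []
    if entry.rva < header_size:
        found.append("section RVA falls within the header")
    if entry.rva + entry.virtual_size > image_size:
        found.append("wild RVA or virtual size")
    if entry.raw_offset + entry.raw_size > file_size:
        found.append("raw data extends past end of file")
    return found


bad = SectionEntry(".text", rva=0x200, virtual_size=0x5000, raw_offset=0x400, raw_size=0x3000)
good = SectionEntry(".data", rva=0x1000, virtual_size=0x1000, raw_offset=0x400, raw_size=0x600)

for entry in (bad, good):
    print(entry.name, scan_section(entry, header_size=0x400, file_size=0x2000, image_size=0x4000))
```

Because Windows loads many images that trip checks like these, the results are better treated as anomaly records for later analysis than as grounds for rejecting or "repairing" the image.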
In Summary
• Anomaly Mechanism
  – Useful source of info for analysis engine
• Parser Design
  – Hope there are some useful nuggets here...
• Infrastructure
  – Supports ongoing technology improvement and QA
  – Insight into malformations prevalent in the wild
  – Proven useful for technology refinement
Future Work
• Extend
  – Infrastructure
  – Analysis
• Refine heuristics for identifying malware and “suspicious” images
• Build additional tools
  – GUI version of PeSweep
• For now, SDK resources available at http://research.sunbelt-software.com/ViperSDK/
  – PeSweep (cmdline binary; no source ☺)
  – Presentation
Thanks!
References:
[1] PEiD homepage (http://peid.has.it/)
[2] TinyPE (http://www.phreedom.org/solar/code/tinype/)
[3] Matt Pietrek, Under The Hood, An In-Depth Look into the Win32 Portable Executable File Format, MSDN Magazine, April 2002, http://msdn.microsoft.com/msdnmag/issues/02/02/PE