Transcript
Page 1: Asm2Vec: Boosting Static Representation …...Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization Steven H

Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization

Steven H. H. Ding

Data Mining and Security Lab

School of Information Studies

McGill University

Montreal, Canada

Benjamin C. M. Fung

Data Mining and Security Lab

School of Information Studies

McGill University, Montreal, Canada

Philippe Charland

Mission Critical Cyber Security Section

Defence R&D Canada – Valcartier

Quebec, Canada

Page 2:

Reverse engineer

Manual analysis

Reverse engineering

Did anyone analyze something similar before? Is it a library function?

f1 f2 f3

LDR R3, [R11,#sct]
LDR R2, [R3,#0xC]
LDR R3, [R11,#applet_no]
CMP R2, R3
BEQ loc_DFD0
LDR R3, [R11,#sct]
LDR R3, [R3]
STR R3, [R11,#sct]
loc_DFC0
LDR R3, [R11,#sct]
CMP R3, #0
BNE loc_DFA0

Disassemble

A binary file

Page 3:

With Kam1n0

LDR R3, [R11,#sct]
LDR R2, [R3,#0xC]
LDR R3, [R11,#applet_no]
CMP R2, R3
BEQ loc_DFD0
LDR R3, [R11,#sct]
LDR R3, [R3]
STR R3, [R11,#sct]
loc_DFC0
LDR R3, [R11,#sct]
CMP R3, #0
BNE loc_DFA0

Commented assembly function

LDR R3, [R11,#sct]
LDR R2, [R3,#0xC]
LDR R3, [R11,#applet_no]
CMP R2, R3
BEQ loc_DFD0
LDR R3, [R11,#sct]
LDR R3, [R3]
STR R3, [R11,#sct]
loc_DFC0
LDR R3, [R11,#sct]
CMP R3, #0
BNE loc_DFA0

Labeled library function

Page 4:

Type I: Exact clone

0x1FE69C0  PUSH ebp
0x1FE69C1  MOV ebp, esp
0x1FE69C3  MOV ecx, [ebp+arg_0]
0x1FE69C6  PUSH ebx
0x1FE69C7  MOV ebx, [ebp+arg_8]
0x1FE69CA  PUSH esi
0x1FE69CB  MOV esi, ecx
0x1FE69CD  AND ecx, 0FFFFh
0x1FE69D3  SHR esi, 10h
0x1FE69D6  CMP ebx, 1
0x1FE69D9  JNZ loc_1FE6A0C

0x1FE69C0  PUSH ebp
0x1FE69C1  MOV ebp, esp
0x1FE69C3  MOV ecx, [ebp+arg_0]
0x1FE69C6  PUSH ebx
0x1FE69C7  MOV ebx, [ebp+arg_8]
0x1FE69CA  PUSH esi
0x1FE69CB  MOV esi, ecx
0x1FE69CD  AND ecx, 0FFFFh
0x1FE69D3  SHR esi, 10h
0x1FE69D6  CMP ebx, 1
0x1FE69D9  JNZ loc_1FE6A0C

Page 5:

Type II: Syntactically equivalent

0x1FE05B0  PUSH ebp
0x1FE05B1  MOV ebp, esp
0x1FE05B3  MOV ecx, [ebp+arg_0]
0x1FE05B6  PUSH ebx
0x1FE05B7  MOV ebx, [ebp+arg_8]
0x1FE05BA  PUSH esi
0x1FE05BB  MOV esi, ecx
0x1FE05BD  AND ecx, 0FFFFh
0x1FE05B3  SHR esi, 10h
0x1FE05B6  CMP ebx, 1
0x1FE05B9  JNZ loc_1FE05BC

0x1FE69C0  PUSH ebp
0x1FE69C1  MOV ebp, esp
0x1FE69C3  MOV eax, [ebp+msg_0]
0x1FE69C6  PUSH edx
0x1FE69C7  MOV edx, [ebp+msg_1]
0x1FE69CA  PUSH esi
0x1FE69CB  MOV esi, eax
0x1FE69CD  AND eax, 0FFFFh
0x1FE69D3  SHR esi, 10h
0x1FE69D6  CMP edx, 1
0x1FE69D9  JNZ loc_1FE6A0C
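The two Type II listings differ only in register names, stack-variable names, and jump targets. One way to make that concrete is to abstract those tokens away and compare what is left. The following is a minimal sketch under stated assumptions: the register set, the placeholder names (REG, VAR, CONST, LOC), and the function name `normalize` are illustrative choices, not something taken from the slides or from Kam1n0.

```python
import re

# Assumption: a small x86 register set, enough for the listings on this page.
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}
# Identifiers, numeric constants (e.g. 0FFFFh), or single punctuation marks.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d\w*|[^\s\w]")

def normalize(instruction):
    """Replace registers, variables, constants and labels by placeholders."""
    mnemonic, _, operands = instruction.strip().partition(" ")
    parts = []
    for tok in TOKEN.findall(operands):
        if tok.lower() in REGISTERS:
            parts.append("REG")
        elif tok.startswith("loc_"):
            parts.append("LOC")
        elif tok[0].isdigit():
            parts.append("CONST")
        elif tok[0].isalpha() or tok[0] == "_":
            parts.append("VAR")
        else:
            parts.append(tok)  # keep brackets, commas, plus signs
    return (mnemonic.upper() + " " + "".join(parts)).strip()

left = ["PUSH ebp", "MOV ebp, esp", "MOV ecx, [ebp+arg_0]", "PUSH ebx",
        "MOV ebx, [ebp+arg_8]", "PUSH esi", "MOV esi, ecx", "AND ecx, 0FFFFh",
        "SHR esi, 10h", "CMP ebx, 1", "JNZ loc_1FE05BC"]
right = ["PUSH ebp", "MOV ebp, esp", "MOV eax, [ebp+msg_0]", "PUSH edx",
         "MOV edx, [ebp+msg_1]", "PUSH esi", "MOV esi, eax", "AND eax, 0FFFFh",
         "SHR esi, 10h", "CMP edx, 1", "JNZ loc_1FE6A0C"]

# Raw text differs, but the normalized instruction sequences are identical.
is_type2_clone = [normalize(i) for i in left] == [normalize(i) for i in right]
print(is_type2_clone)
```

Mapping every register to the same placeholder is deliberately coarse: it ignores whether a renaming is applied consistently, so it can also equate listings that are not truly equivalent.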

Page 6:

Type III: Minor modification

0x1FE05B0  PUSH ebp
0x1FE05B1  MOV ebp, esp
(blank)
(blank)
0x1FE05B7  MOV ebx, [ebp+arg_8]
0x1FE05BA  PUSH esi
0x1FE05BB  MOV esi, ecx
0x1FE05BD  AND ecx, 0FFFFh
0x1FE05B3  MOV eax, ecx
0x1FE05B6  SHR esi, 10h
0x1FE05B9  CMP ebx, 1
0x1FE05C1  JNZ loc_1FE05BC

0x1FE69C0  PUSH ebp
0x1FE69C1  MOV ebp, esp
0x1FE69C3  MOV eax, [ebp+msg_0]
0x1FE69C6  PUSH edx
0x1FE69C7  MOV edx, [ebp+msg_1]
0x1FE69CA  PUSH esi
0x1FE69CB  MOV esi, eax
0x1FE69CD  AND eax, 0FFFFh
0x1FE69D3  SHR esi, 10h
0x1FE69D6  CMP edx, 1
0x1FE69D9  JNZ loc_1FE6A0C
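For a Type III clone, exact comparison fails because instructions were added or removed, so a similarity score is needed instead of an equality test. The sketch below scores the two listings by their mnemonic sequences with Python's standard `difflib.SequenceMatcher`. This is only an illustration of the idea of "mostly the same sequence"; the function name `clone_similarity` is hypothetical, and this is not the matching method used by Asm2Vec or Kam1n0.

```python
from difflib import SequenceMatcher

def clone_similarity(a, b):
    """Similarity in [0, 1] between two mnemonic sequences (1.0 = identical)."""
    return SequenceMatcher(None, a, b).ratio()

# Mnemonics of the first listing (two instructions missing, one MOV inserted).
modified = ["PUSH", "MOV", "MOV", "PUSH", "MOV", "AND", "MOV",
            "SHR", "CMP", "JNZ"]
# Mnemonics of the second listing.
reference = ["PUSH", "MOV", "MOV", "PUSH", "MOV", "PUSH", "MOV", "AND",
             "SHR", "CMP", "JNZ"]

score = clone_similarity(modified, reference)
print(round(score, 3))  # high, but below 1.0
```

A threshold on such a score separates "minor modification" from "unrelated", but choosing it is a judgment call, and sequence-based scores degrade quickly once a compiler or obfuscator reorders or substitutes instructions.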

Page 7:

original / clone

Page 8:

Obfuscation and Optimization - Challenges

Page 9:

Obfuscation and Optimization - Problems

•  P1: The relationships among assembly tokens
    •  xmm0 (SSE) register vs. SSE operations such as movaps
    •  fclose vs. fopen
    •  strcpy vs. memcpy

•  P2: Token combination weights
    •  Reverse engineers look for 'interesting patterns' (higher weight).
    •  Regular, random, or repeated patterns are not interesting (lower weight).

•  Sounds so familiar in NLP!
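P2 has a well-known counterpart in NLP: tokens that occur everywhere carry little information, while rare tokens are more telling. A minimal sketch of that intuition uses inverse document frequency (IDF), treating each function as a "document". IDF is a stand-in chosen here for illustration, not the weighting the talk proposes, and the three token lists are made-up examples.

```python
import math
from collections import Counter

def idf_weights(functions):
    """Map each token to log(N / number of functions containing it)."""
    n = len(functions)
    doc_freq = Counter(tok for fn in functions for tok in set(fn))
    return {tok: math.log(n / count) for tok, count in doc_freq.items()}

# Hypothetical tokenized functions: a shared prologue plus one distinct tail.
corpus = [
    ["push", "ebp", "mov", "ebp", "esp", "movaps", "xmm0"],
    ["push", "ebp", "mov", "ebp", "esp", "call", "fopen"],
    ["push", "ebp", "mov", "ebp", "esp", "call", "memcpy"],
]

weights = idf_weights(corpus)
# The prologue tokens appear in every function, so their weight is zero;
# tokens seen in a single function get the highest weight.
for tok in ("push", "call", "xmm0"):
    print(tok, round(weights[tok], 3))
```

Hand-set weights like these address P2 only; they say nothing about P1, the relationships between tokens, which is what learned vector representations are meant to capture.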

Page 10:

Learning English

1) The cat ____ on the mat.

A: food   B: sat   C: sitting   D: is speaking
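The fill-in-the-blank question is the same task that context-prediction models train on: given the neighboring words, predict the missing one. The sketch below builds such (context, target) training pairs from a sentence. The window size and the function name `context_target_pairs` are assumptions made for illustration.

```python
def context_target_pairs(tokens, window=2):
    """For each position, pair the surrounding tokens with the token itself."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, target))
    return pairs

sentence = "the cat sat on the mat".split()
pairs = context_target_pairs(sentence)

# The pair for the blank in "The cat ____ on the mat."
context, target = pairs[2]
print(context, "->", target)
```

A model trained on many such pairs has to learn which words fit which contexts, which is why "sat" can be preferred over "food" without any hand-written grammar rules.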

Page 11:

Paragraph Vector (p2vec):

king - man + woman = queen

bad - good = maniacal_killer*

* Example collected from Andreas Mueller @amuellerml
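The analogy arithmetic can be reproduced on a toy scale. In the sketch below the three-dimensional vectors are hand-made so that the analogy works out; real embeddings are learned from data and have many more dimensions. The function names `cosine` and `analogy` are illustrative.

```python
import math

# Hand-made toy vectors (assumption): roughly (royalty, gender, other).
vectors = {
    "king":  [0.9, 0.9, 0.1],
    "queen": [0.9, -0.9, 0.1],
    "man":   [0.1, 0.9, 0.0],
    "woman": [0.1, -0.9, 0.0],
    "mat":   [0.0, 0.0, 1.0],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def analogy(a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    query = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))

print(analogy("king", "man", "woman"))
```

Excluding the input words from the candidates is the usual convention for this kind of query; without it, the nearest vector is often one of the inputs themselves.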

Page 12:

Asm2Vec:

Page 13:

T-SNE Visualization

Page 14:

T-SNE Visualization

Page 15:

Evaluation (Quantitative)

Page 16:

Evaluation (Quantitative)

Page 17:

Evaluation (Case Studies)

Vulnerability retrieval

Page 18:

Evaluation (Case Studies)

Page 19:

Asm2Vec (IEEE S&P '19)

+ Robust against obfuscation and optimization.
+ Even better than the most recent dynamic approach.
+ Static approach: efficient and scalable.
- Binary diffing (interpretability?)
- Static approach: cannot recognize jump tables, etc.
- Assembly code must come from the same processor family.

Page 20:

The Kam1n0 2.x Binary Analysis Platform

Page 21:

Subgraph clone

Page 22:

Sym1n0

Page 23:

Thank you. Questions?

