deobfuscation - optimice

Upload: rexplantado

Post on 02-Jun-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/10/2019 Deobfuscation - optimice

    1/28

  • 8/10/2019 Deobfuscation - optimice

    2/28

    Overview

    Why?Project goal?How?

    DisassemblyInstruction semanticsOptimizationsAssembling

    DemoQuestions?

  • 8/10/2019 Deobfuscation - optimice

    3/28

    Why?

    To name a fewX86 is complex

    2 books, ~1600 pages of instructions

    Obfuscated code = complexity++Disassembly is not always prettyDebugging can help mitigate some problemsbut not all

  • 8/10/2019 Deobfuscation - optimice

    4/28

  • 8/10/2019 Deobfuscation - optimice

    5/28

    Why?

    Now what?

  • 8/10/2019 Deobfuscation - optimice

    6/28

    Why?

    No public/opensource tool for deobfuscationNo framework to analyze instruction semanticsFun thing to do?

    To speed up thingsReuse code for some other projects

  • 8/10/2019 Deobfuscation - optimice

    7/28

    Project goal?

    Rewrite code to fix disassembly representationproblemsBuild framework to analyze instruction tainting andsemantics

    Extend it for automatic deobfuscationExpose API

    Ease development of heuristic deobfuscation rules andcode transformationsExperiment with code transformations

  • 8/10/2019 Deobfuscation - optimice

    8/28

    How? - Disassembly

    Main disassembly unit is a functionFunction representation should have all instructionsvisibleProblems (for reversers)

    Basic block scatteringNot a real problem for disassembler

    Fake paths in conditional jumps lead to broken disassembly(opaque predicates)Instruction overlapping

    Not a real problem for disassembler

  • 8/10/2019 Deobfuscation - optimice

    9/28

    How? - Disassembly

    JCC path leads to broken disassemblyReplace it with RET , add comment to instruction and continueThis way code can be transformed to a function

  • 8/10/2019 Deobfuscation - optimice

    10/28

    How? - Disassembly

    Instruction overlapping hides code pathsDisassembly graph should contain all instructions

  • 8/10/2019 Deobfuscation - optimice

    11/28

  • 8/10/2019 Deobfuscation - optimice

    12/28

    How? - Disassembly

    Nodes represent InstructionsInstruction contains the following information:

    OriginEA, Mnemonic, Disassembly, Prefix, Operands,Opcode, Operand types

    Instruction information populated from two sources:Information from IDA API

    GetMnem (), GetOpnd (), GetOpType ()

    Information derived from GetDisasm () API

  • 8/10/2019 Deobfuscation - optimice

    13/28

  • 8/10/2019 Deobfuscation - optimice

    14/28

    How? Disassembly - Functions

    Function abstracted as a ClassFunction = Basic Blocks + CFGCFG stored as two graphs

    References from location (GetRefsFrom)References to location (GetRefsTo)

    Some of the exposed functions are: GetRefsFrom (),GetRefsTo (), DFSFalseTraverseBlocks ),

    CFG optimizations mainly operate on Function class

  • 8/10/2019 Deobfuscation - optimice

    15/28

    How? Disassembly Basic Blocks

    Basic Block implemented as linked list in FunctionClassEach entry is an Instruction Class instanceInstruction stores relevant instruction data:

    Prefix, mnemonic, operands, types, values, comments

    Stores instruction information from two sources:IDAGetOp *() functionsParsing of GetDisasm () string, regex style

  • 8/10/2019 Deobfuscation - optimice

    16/28

  • 8/10/2019 Deobfuscation - optimice

    17/28

    How? Instruction Semantics

    Implemented in TaintInstr ClassContains information about:

    Source and destination operands (displayed and hidden)Flag modificationSide effects (e.g. ESP+4 for POP )Ring association ( LLDT )

    BlockTainting Class automates process on blocks

    Tainting information necessary to perform safeoptimizations

  • 8/10/2019 Deobfuscation - optimice

    18/28

    How? Overall

    FunctionCFG information

    Basic BlocksInstruction grouping

    InstructionOpcode information

    Taint informationOperands information

    Function

    Basic Block

    InstructionTaint

    Information

  • 8/10/2019 Deobfuscation - optimice

    19/28

    How? Optimizations

    We have foundation to analyze codeIts time to exploit some algorithms Four main types of optimizations:

    CFG reductionsJCC reductionJMP merging

    Dead code removalHeuristic rulesConstant propagation and folding (TODO)

  • 8/10/2019 Deobfuscation - optimice

    20/28

    How? Optimizations - CFG

    JCC reductionsJCC path depends on flags statusUse tainting information to detect constant flagsReplace JCC with JMPe.g. AND , clears OF and CF flags

    [JO, JNO, JC, JB, ] all take single path

    Results in smaller graphs and better/more

    precise disassemblyRemoves fake paths that break creation offunctions and mess up disassembly

  • 8/10/2019 Deobfuscation - optimice

    21/28

    How? Optimizations - CFG

    JMP mergingIf current block ends in JMP andnext block has only single reference then merge them

    Increases block size, reduces CFG complexityCode optimizations are block based so merging caninfluence a lot the final code quality

  • 8/10/2019 Deobfuscation - optimice

    22/28

    JCC

    How? Optimizations - CFG

    Two staged CFG optimization:

    BLOCK 2

    BLOCK 3

    JMP

    BLOCK 2

    BLOCK 1

    BLOCK 1,2

    BLOCK 1

  • 8/10/2019 Deobfuscation - optimice

    23/28

  • 8/10/2019 Deobfuscation - optimice

    24/28

    How? Optimizations Rule based

    There is obfuscation which bothers you and isntautomagically removed?Adding rule based optimization is easy?

  • 8/10/2019 Deobfuscation - optimice

    25/28

    How? Assembling

    idaapi.Assemble () ?, we do not support it. It is very limited and canhandle only some trivial instructions. We do nothave plans to improve or modify it . Ilfak

    Sensitive to syntaxRemember GetMnem ()?IDA before 5.5 cant assemble JCC easily BUT, it can handle most instructions if you play nice(Batch(1) is your friend)

  • 8/10/2019 Deobfuscation - optimice

    26/28

    DEMO TIME

  • 8/10/2019 Deobfuscation - optimice

    27/28

    Conclusion

    It can remove static obfuscationsYou can feed it data from disassembler for betterresults

    Tool chaining

    Work in progressIt has bugs :D send samples and will fix themGot ideas? Share them.

    You can extend, improve, contributeShouts: n00ne, bzdrnja, tox, haarp, MazeGen,RolfRolles, all gnoblets, reddit/RE

  • 8/10/2019 Deobfuscation - optimice

    28/28

    Thank you for your attentionQuestions?

    http://code.google.com/p/optimice