inside vmprotect - webtv université de lille

57
Inside VMProtect Introduction Internal Analysis VM Logic Conclusion Samuel Chevet Inside VMProtect Samuel Chevet 16 January 2015

Upload: others

Post on 25-Mar-2022

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Inside VMProtect

Samuel Chevet

16 January 2015

Page 2: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Agenda

Describe what VMProtect isIntroduce code virtualization in software protectionMethods for circumventionVM logic

Page 3: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Warning

Some assumptions are made in this presentationOnly few binaries have been studiedMostly 64 bits target

Page 4: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Plan

1 Introduction

Page 5: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Software-based protection

Content of the executable’s sections is encryptedand/or compressedAppend new code for decrypting/decompressing thesectionsAdd all kinds of anti-debug, anti-vm, . . .Executable’s entrypoint is redirected into this newcodeExecution is transferred back to the originalentrypoint after decrypt/decomp

Page 6: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect

Memory protectionAllows protection of the file image in memory fromany changesIntegrity is checked before giving execution to theoriginal entry point

Page 7: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect

Import protectionAll entries used by the original binary are removedfrom Import TableAppend code redirection for API callReplace CALL DWORD PTR[@IAT] / CALLQWORD PTR[@IAT] (Encoded on 6 bytes)By CALL VMProtect.section (Encode on 5 bytes)

1 byte left: two variationsBefore: Fake push (Stack will be readjusted duringredirection)After: Dead code (Increment the return addressduring redirection)

Page 8: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect

Page 9: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect

Resource protectionEncrypt resources: except icons, manifest and someother system typesHook:

LoadStringA/WLdrFindResource_ULdrAccessResource

License managerTrack your sales online and manage serial numbersI have never worked on it

Page 10: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Code virtualization

In simple packer native code is simply encryptedand/or compressedDisassemble native code and compile it intoproprietary bytecodeExecuted in a custom interpreter at run-timeInterpreter: Fetch, Decode, ExecuteOriginal native code has disappearedEfficient way for anti-reverse engineering

Page 11: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Code virtualization

Page 12: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Code virtualization

VM must fully reproduce correctly CPU instructionsSave/Restore correctly the context of the applicationbefore/after emulationCare about correct result in EFLAGS|RFLAGSAny error in the emulation is not acceptable

Page 13: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect

VMProtect doesn’t decrypt the code at allNative code is compiled into a proprietarypolymorphic bytecodeFrom one binary to another one, VM will not be thesameOr even different VM inside the same binary!

Page 14: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Agenda

Questions?What is the architecture of the virtual CPUgenerated?Is the VM generated randomly?VM bytecode obfuscated?Difficult to recontruct original bytecode?. . .

Page 15: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Plan

2 Internal

Page 16: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Virtual machine architecture

Virtualization obfuscator is Reduced Instruction SetComputing (RISC)One Complex Instruction Set Computing (CISC)instruction will be translated in multiple virtualizedinstructions

lea ecx, [ecx + ebx * 4 + 42]

Translated into several virtual instructions1 Fetch ebx2 Multiply ebx by 43 Fetch ecx4 Add theses two registers5 Add 426 Store result in ecx

Page 17: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Virtual machine architecture

Language used by virtualization obfuscator isStack-BasedA stack machine implements registers with a stackThe operands of the arithmetic logic unit (ALU) arealways the top two registers of the stackResult from the ALU is stored in the top register ofthe stackReconstruction original native code will involveremoving stack machine feature

Page 18: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Virtual machine context

Before entering into the virtualization obfuscator,host’s registeres and flags must be saved into VM’scontext structure

VMProtect context structure8/16 for VM-registers2 for Relocation-Difference andSECURITY_CONSTANT6 for temporal usage (mostly EFLAGS|RFLAGS)0x80 bytes free for pushed variables

sub esp, 0C0h ; 32bitsub rsp, 140h ; 64bit

Page 19: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Interesting fact

Register EDI|RDI holds VM contextRegister EBP|RBP holds VM stack

Maximum value of RBP|EBP64 bit: RDI + 0xE0, 32 bit: EDI + 0x50If this value is reached, reserve more space on thestack and copy VM context and pushed variables

Page 20: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Interesting fact

Register EDI|RDI holds VM contextRegister EBP|RBP holds VM stack

Maximum value of RBP|EBP64 bit: RDI + 0xE0, 32 bit: EDI + 0x50If this value is reached, reserve more space on thestack and copy VM context and pushed variables

Page 21: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VMProtect Context

VM context is accessed by EDI|RDI (via memlocation)Index register is EDI|RDIIndex base is stored in opcode operands (can beencrypted, see later)From one VM to another, VM registers will not bestored at the same index!It makes VM context totally random

Page 22: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Virtual machine implementations

VM LoopRead the bytecode at instruction pointerCompute opcode handlerCall the handlerCan have two variations

Down-read VM-BytesUp-read VM-Bytes

ESI|RSI: VM instruction pointer

Page 23: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Bytecode encryption

If encryption key is presentStart of code virtualization depends on anencryption keyVM Loop depends on this key to decrypt opcodeHandler depends on this key to decrypt operandsKey is updated during VM Loop and opcodehandler executionImpossible to study code virtualization at a chosenpointEBX|RBX holds the encryption key

Page 24: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Logical & Arithmetic operations

Some logical and arithmetic opcode handler mustcare of EFLAGS|RFLAGSEach of them has code to store them after theoperation; ... operationpushfqpop qword ptr [rbp+0]

After such handler, VM will call an handler to POPthem in VM registerGUESS: there is VM opcode pairs

Page 25: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Block Start

Push all registers, and EFLAGS|RFLAGSOrder is totally randomPush SECURITY_CONSTANTPush Relocation-DifferenceDecrypt SECURITY_CONSTANTStore all pushed registers, flags & others into VMcontextIndex in VM context is totally random

Page 26: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Block End

Push from VM context registers to stackIF VM_EXIT

Pop all registers and EFLAGS|RFLAGS and returnELSE

Encrypt SECURITY_CONSTANTPush SECURITY_CONSTANTPush Relocation-DifferenceJump to next VM_Block

Page 27: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Internal registers

What we knowEBX|RBX: encryption keyEDI|RDI: VM contextESI|RSI: VM instruction pointerEBP|RBP: VM stackEDX|RDX: arithmetic/result operation of handleraddressEAX|RAX: opcode valueR13: relocation-differenceR12: opcode handler table

Page 28: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Plan

3 Analysis

Page 29: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Dynamic analysis

Now that we know how it "works"Before using symbolic execution to solve thisproblemWe have to write an "intelligent code tracer"So we will be sure our symbolic execution is notbuggy

Page 30: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Dynamic analysis

Trace full execution will take too much timeLocate the VM LoopInject DLL that setup a HBP on execution at VM LoopStore in DB:

VM StackVM Context

Make a local WebService to output result (Diffbetween two states on VM_STACK, VM_CONTEXT)Initialize VM Context with default value

Page 31: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Dynamic analysis

Page 32: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Dynamic analysis

Page 33: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

Now we can know the VM context at any point(perfect for debugging)We want to be able to reconstruct original bytecodeAutomate taskUse metasm framework(https://github.com/jjyg/metasm)

Page 34: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Metasm

Ruby open source frameworkAssembler, disassembler, compiler, linker, . . .Description of the semantics for each instructionAllowing us to compute the semantic of a set ofinstructionscode_binding

Page 35: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Metasm

code_binding examplerax => (byte ptr [rsi-1]&0ffffff00h)|((((((byte ptr [rsi-1]>>1)&7fffff80h)|(((((byte ptr [rsi-1]>>1)&7fffff80h)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))>>1)&7fh)^7fffffffh)&7fh))^30h)&7fh)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))<<7)&80h))rdx => qword ptr [rbp]&0ffffffffffffffffhrbx => (rbx&0ffffffffffffff00h)|((rbx-((((((byte ptr [rsi-1]>>1)&7fffff80h)|(((((byte ptr [rsi-1]>>1)&7fffff80h)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))>>1)&7fh)^7fffffffh)&7fh))^30h)&7fh)|((((byte ptr [rsi-1]&0ffffffffh)-(rbx&0ffh))<<7)&80h)))&0ffh)rbp => (rbp+8)&0ffffffffffffffffhrsi => (rsi-1)&0ffffffffffffffffh

Just need to replace (inject) inside expression theknown value, so expression can be reduced

RSI: bytecode_ptrRBX: encryption keyRBP: VM_stack

Page 36: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM’s symbolic internal

VM symbolic is hugeAll VM registers must implem each size of operand(byte, word, dword, qword)VM context contains lot of internals registers

vm_symbolism = {:rax => :opcode,:rbx => :vmkey,:rsi => :bytecode_ptr,:rbp => :vm_stack,Indirection[[:vm_stack], 8, nil] => :QWORD_OP_1,...Indirection[[:vm_stack, :+, 0x8], 8, nil] => :QWORD_OP_02,Indirection[[:rdi], 8, nil] => :qword_vm_r0,Indirection[[:rdi, :+, 0x8], 1, nil] => :byte_vm_r0,...Indirection[[:rdi, :+, 0x8], 8, nil] => :qword_vm_r1,...Indirection[[:rdi, :+, 0x18], 8, nil] => :qword_vm_r3,...

Page 37: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

Setup start context of the VMDisassemble and compute semantic of the currentopcode handlerCompute next state with solved semanticsLoop if not VM_EXIT

ProblemWe need to know the end address for thecode_bindingCheck list of basic block, if one basic block match thecheck on maximum value of VM_STACK (EBP) =>STOP ADDRIf not we are back to the VM_LOOP or RETN(VM_EXIT)

Page 38: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

Setup start context of the VMDisassemble and compute semantic of the currentopcode handlerCompute next state with solved semanticsLoop if not VM_EXIT

ProblemWe need to know the end address for thecode_bindingCheck list of basic block, if one basic block match thecheck on maximum value of VM_STACK (EBP) =>STOP ADDRIf not we are back to the VM_LOOP or RETN(VM_EXIT)

Page 39: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

VM Context requirement at startBytecode pointer start address: arg_00 ofVM_ENTRY (constant unfolding can be applied onit)Key stored in EBX|RBX necessary to decrypt bytcodeis equal to original PE ImageBase + RVA of bytecodepointerOpcode handler table (normally stored in r12)

With our dynamic analysis we know those 3parameters at any point!

Page 40: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

Remove native register / not interesting VM registerfrom solved binding

Keep only operation on EDI|RDI or VM_STACK

Thanks to the RISC architecture and stack-basedlanguageCheck if VM_STACK has been incremented ordecremented

[+] solved bindingqword ptr [rsp] => 140087e62hdword ptr [rsp-8] => dword ptr [rsp-8]+0e4018d37hQWORD_IMM => 0ffffffffe4018d37h # Indirection[[:vm_stack, :+, -0x8], 8, nil] => :QWORD_IMMvirt_rax => 0ffffffffe4018d37hvm_stack => (vm_stack-8)&0ffffffffffffffffhbytecode_ptr => 140087e01h

########### After Remove registerQWORD_IMM => 0ffffffffe4018d37hvm_stack => (vm_stack-8)&0ffffffffffffffffh

[DISAS]: PUSH 0XFFFFFFFFE4018D37

Page 41: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Static analysis

With that we can start to disassemble the wholebytecode VMCheck "Pratical Reverse Engineering" (Chapter 5) forcomplete example on how to use metasm

Page 42: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Plan

4 VM Logic

Page 43: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Logic

ReminderLangage used is stack-basedNext opcode after logical or arithmetic operation willstore EFLAGS|RFLAGS inside VM context

For all the following slides we will use the followingsyntax:

QWORD_OP_1: [RBP + 0] ; operand 01QWORD_OP_2: [RBP + 8] ; operand 02

Page 44: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Disassembler

...[DISAS]: PUSH 0xC2666C77B83B1153[DISAS]: PUSH vm_r6[DISAS]: PUSH 0x000000014014A631[DISAS]: ADD QWORD_OP_1, QWORD_OP_2[DISAS]: POP vm_r2[DISAS]: MOV QWORD_OP_1, [QWORD_OP_1][DISAS]: ADD QWORD_OP_1, QWORD_OP_2[DISAS]: POP vm_r14...

Page 45: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Optimization

We need to remove stack machine "feature"Replace push ; pop by assignement statementTrack stack pointerCheck if the destination size match!

Page 46: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Logic

All push, pop with all different size & mem derefAddDiv, IdivMulRcl, RcrShl, ShrShld, Shrd

Page 47: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Logic

Inside VM handlers, operation likeAND|SUB|OR|NOT seems not supportedIn fact all those operations are managed by onehandler "NOR" logical gate:

Native semantic of this handlerNOT QWORD_OP_1NOT QWORD_OP_2AND QWORD_OP_1, QWORD_OP_2MOV QWORD_OP_2, QWORD_OP_1MOV QWORD_OP_1, RFLAGS

Page 48: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Logic

Lot of logical instruction will use this "NOR" logicalgate handler:

NOT(OP_00) = NOR(OP_00, OP_00)AND(OP_00, OP_01) = NOR(NOT(OP_00),NOT(OP_01))XOR(OP_00, OP_01) = NOR(NOR(OP_00, OP_01),AND(OP_00, OP_01))SUB(OP_00, OP_01) = NOR(ADD(OP_01,NOT(OP_01))). . .

Page 49: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM Logic

VM_ADC(OP_00, OP_01) = VM_ADD(OP_00,(OP_01 + CARRY))VM_SUB(OP_00, OP_01) = VM_NOT(VM_ADD(B,VM_NOT(A)))VM_CMP = VM_SUBVM_NEG(OP_00) = VM_SUB(0, OP_00). . .Original bytecode is sometimes converted to morethan 50 VM opcodes . . .

Page 50: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM block entry

VM block entryPOP REG ; relocation-differencePUSH IMMEDIATEADD QWORD_OP_1, QWORD_OP_2 ; compute security constantPOP REG ; flagsPOP REG ; pop result...; POP ALL HOST REGISTER (SAVE CONTEXT)...

Page 51: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM conditional jump

VM jcc1 Push two vm_offset2 Push VM_stack3 Convert EFLAGS|RFLAGS for adjustement 0 or 4|84 Adjust pointer from result (ADD operation)5 Prepare to load next vm block from [VM_STACK]

We will have to reconstruct JCC correctly

Page 52: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM CRC

VM CRCThere is a special opcode for making CRCOp_01: Mem pointer, Op_02: SizeCheck VM integrity, executable integrityCollision :)

rcx = rax = 0;for (i = 0; i < Size; i++) {

rcx = raxrcx = rcx >> 0x19rax = (rax << 0x07) | rcxrax = (rax & 0xFFFFFF00) | (rax & 0xFF) ^ buf[i]

}

Compared with SECURITY_CONSTANTFound the same checksum in all samples

Page 53: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

VM CPUID

VM CPUIDThere is a special opcode for making CPUIDinstructionOp_01: ValueSave 0x0C on VM_STACK (EBP) for storing eax, ebx,ecx, edx

Page 54: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Listing opcodes

Try to compute set of all list of opcodes to reconstructthe correct original oneReally long task, I didn’t have finish it at this time

1 bored2 need more samples

The mapping between the set of VM bytecode andoriginal one will work directly on all binaries

Page 55: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Plan

5 Conclusion

Page 56: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Conclusion

Always the same architecture: RISC + stack machineVirtualMachine are generated in a random wayDifficult to make a static disassembler, prefer to usesymbolic executionBefore having the question: no toolz is going to bereleasedVMProtect is a cool challenge (start by 64 bits binary,"obfuscation" is not difficult)

Page 57: Inside VMProtect - WebTV Université de Lille

Inside VMProtect

Introduction

Internal

Analysis

VM Logic

Conclusion

Samuel Chevet

Questions ?

Thank you for your attention