VMprotect VM_logic (in v1.8 demo)
Posted on July 6th, 2009 at 14:57
+ VM_opcode listings in attachment

VMprotect hides CPU instructions by splitting a single instruction into many VM_opcodes. But a correct VM must fully reproduce the CPU instructions and take care of the correct result in EFlags, so an approximate simulation is not acceptable!

Let's look at the VM_handlers.

VM_handlers ( ~71 )    (SA = StackAdd , SS = StackSub)

    AddByteByte_SS2  AddWordWord_SS2  AddDwordDword
    Div168_SS2  Div3216_SS2  DIV6432
    ExitVM
    IDIV168_SS2  IDIV3216_SS2  IDIV6432
    IMUL88_SS2  IMUL1616_SS4  IMUL3232_SS4
    MUL88_SS2  MUL1616_SS4  MUL3232_SS4
    NotNotAndByte_SS2  NotNotAndWord_SS2  NotNotAndDword
    PopBP  PopEBP
    PopfD_SA4   (mostly used on VM_STD, VM_CLD)
    LoadVmIP_SA4
    PopMemByte_SA6  PopMemByteSS_SA6  PopMemByteES_SA6
    PopMemWord_SA6  PopMemWordSS_SA6  PopMemWordES_SA6
    PopMemDword_SA8  PopMemDwordSS_SA8  PopMemDwordES_SA8   (also can be for CS,FS,GS case)
    PopByteToVMRegsImmID_SA2  PopWordToVMRegsImmID_SA2  PopDwordToVMRegsOpcID_SA4
    PushByteFromVMRegsImmID_SS2  PushWordFromVMRegsImmID_SS2  PushDwordFromVMRegsOpcID_SS4
        ^^^^ both groups: byte/word/dword part access in the VM-Registers (EAX, AX, AL)
    PushBP_SS2   (push VM_SP)
    PushEBP_SS4  (push VM_ESP)
    PushwImmUByte_SS2  PushdImmSByte_SS4  PushwImmWord_SS2  PushdImmSWord_SS4  PushImmDword_SS4
    PushMemByte_SA2  PushMemByteSS_SA2  PushMemByteES_SA2
    PushMemWord_SA2  PushMemWordSS_SA2  PushMemWordES_SA2
    PushMemDword  PushMemDwordSS  PushMemDwordES   (also can be for CS,FS,GS case)
    RclByte_SS2  RclWord_SS2  RclDword_SS2
    RcrByte_SS2  RcrWord_SS2  RcrDword_SS2
    SHLD_SA2  SHRD_SA2
    ShlByte_SS2  ShlDword_SS2  ShlWord_SS2
    ShrByte_SS2  ShrDword_SS2  ShrWord_SS2

  tool-handlers
    PushRDTSC_SS8
    PushCPUID_SS12 (value)
    CRCsum_SA4 (pmem, size)

** All logical & arithmetic handlers, which must take care of EFlags, contain code to store EFlags:

    pushfd
    pop d, [ebp+0]

After such a handler the VM always calls PopDwordToVMRegsOpcID_SA4 to store EFlags into the VM-Registers (intermediate or main), so we can say they form VM-opcode pairs.

*** In the VM_handlers list we see no dedicated handlers for the And/Or/Not/Xor/Sub/Rol... instructions. How are they emulated!?

For the logical instructions the author built one main VM-handler, "NotNotAnd"; its assembly code looks like this:

    Mov eax [ebp+0]
    Mov edx [ebp+4]
    Not eax
    Not edx
    And eax edx
    Mov [ebp+4] eax
    Pushfd
    Pop d,[ebp+0]

NotNotAnd (var1, var2) = And (Not var1) (Not var2), and this is a NOR LOGIC GATE!
(Below I keep the name "NotNotAnd"; I wrote this before searching, but you can search the internet for "NOR LOGIC" and see it all in images!)
The other main logical instructions are built from this NOR gate. The sequence produces a valid result in EFlags for the emulated logical instructions, so no further work on EFlags is needed.

VM_NOT (A)    = NotNotAnd (A, A)
                PushEBP_SS4 + PushMemDwordSS = push dword[esp] ; usually used in VM_NOT, to avoid double calculation
VM_AND (A, B) = NotNotAnd {VM_NOT (A), VM_NOT (B)}
              = NotNotAnd {NotNotAnd (A, A) , NotNotAnd (B, B)}
VM_TEST       = VM_AND ; result value stored in intermediate VM-regs, discarded
VM_OR (A, B)  = VM_NOT [NotNotAnd (A, B)]
              = NotNotAnd {NotNotAnd (A, B) , <SamePushed>}
VM_XOR (A, B) = NotNotAnd {NotNotAnd (A, B)} {VM_AND (A, B)}
              = NotNotAnd {NotNotAnd (A, B)} {NotNotAnd [NotNotAnd (A, A) , NotNotAnd (B, B)]}

VM_AND also has a truncated variant, used when one parameter is an immediate value. VMprotect compiles the immediate value already inverted, so the VM_NOT(Immediate) part is skipped in the VM_AND construction. (See the "AND ecx 7" example; the same trick appears in EFlags management.)

Rol, Ror, Sar are emulated via the SHLD & SHRD handlers.
VM_RCL and VM_RCR are handled by the RclDword_SS2 & RcrDword_SS2 handlers, for which the Carry-flag must first be extracted from VM_regs_Eflags; the instruction "SHR CH, 1" in the handler code then loads the extracted CFlag and the handler does the RCL/RCR.

VM_ADD is a normal addition. For the other arithmetic instructions the VM uses VM_ADD + logic constructions for EFlags management (but for decompiling they are useless junk!).
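The NotNotAnd constructions above can be checked mechanically. The following is a minimal Python sketch that models only the result value of the Dword handler, not EFlags; the function names are chosen here for illustration and are not taken from VMprotect.

```python
MASK = 0xFFFFFFFF  # 32-bit operands, as in the NotNotAndDword handler


def not_not_and(a, b):
    """NotNotAnd(a, b) = And(Not a, Not b), i.e. a bitwise NOR."""
    return (~a & ~b) & MASK


def vm_not(a):
    return not_not_and(a, a)


def vm_and(a, b):
    return not_not_and(vm_not(a), vm_not(b))


def vm_or(a, b):
    return vm_not(not_not_and(a, b))


def vm_xor(a, b):
    return not_not_and(not_not_and(a, b), vm_and(a, b))


def vm_and_imm(a, inverted_imm):
    # Truncated variant: the immediate is assumed to arrive already inverted,
    # so the VM_NOT on that side is skipped.
    return not_not_and(vm_not(a), inverted_imm)
```

For instance, `vm_and_imm(ecx, ~7 & MASK)` gives the same value as `ecx & 7`, which is the shortcut described for the "AND ecx 7" case.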
VM_ADC (A, B) = VM_ADD (A, (B+Carry_flag))
VM_SUB (A, B) = VM_NOT [VM_ADD {B, (VM_NOT A)}]
                EFlags = [And (0815, VM_ADD>>EFlags)] + [And ({Not (0815)}, final-VM_NOT>>EFlags)]
                ; 0815 (hex) selects CF, PF, AF and OF: these are taken from the VM_ADD, all other flags from the final VM_NOT
                (virtualized into 36 VM-bytes)
VM_SBB (A, B) = VM_SUB (A, (B+Carry_flag))
VM_CMP        = VM_SUB ; result value stored in intermediate VM-regs, discarded
VM_NEG (A)    = VM_SUB (0, A) ; (constant 0 is already inverted)

The Inc & Dec instructions of the CPU do not affect the Carry-flag, so the Carry-flag must be left in its previous state.
VM_INC (A)    = VM_ADD (1, A)   + Carry-flag restore in EFlags
VM_DEC (A)    = VM_ADD (-1, A)  + Carry-flag restore in EFlags | Adjust-flag (AF) management
......
VMprotect also virtualizes the CPU's complex instructions, if they can be represented by simple instructions.

VM_SETLE  (virtualized into 80 VM-bytes!)
          there is heavy EFlags testing, and the produced result_byte is copied into the destination.
VM_CMOVLE
          the same kind of EFlags testing as SETLE, + VM_Conditional_Jump
VM_MOVSB  (for example)
          this complex instruction is re-expressed as a group of simple assembly instructions, then this group is virtualized.
VM_BSWAP  is done in the following way (27 VM-bytes)
          HiWord(result) = HiWord {Shl (LoWord_LoWord) 8}
          LoWord(result) = HiWord {Shl (HiWord_HiWord) 8}
VM_XADD
          the VM does the same as the CPU
VM_XCHG   !?
          While the VMprotect author cares about the LOCK prefix and does not virtualize instructions that carry it, he made a mistake and virtualized the XCHG instruction (which locks implicitly when one operand is memory).. oops!
          To prevent XCHG virtualization, the author recommends adding a LOCK prefix.
......
For the FLD, FSTP instructions the memory content is copied onto the stack, and the load/store is done from there.
......
The VM-Registers space is 16 dwords.
8 of them are for Eax,Ecx,Edx,Ebx,Ebp,Esi,Edi,EFlags ; Esp is directly assigned to VM_stack(Ebp) ;
2 are used for Relocation-Difference & passed_Mem_pointer ;
the other 6 are used for temporary storage: mostly for intermediate EFlags, also for intermediate or temporary results (VM_TEST, VM_CMP), and also for cleaning up the VM-stack. Look at VM_SUB, where 2 intermediate EFlags are used in the calculation.
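The VM_SUB and VM_BSWAP constructions above can be checked the same way. This is a rough Python sketch under stated assumptions: the flag model is written here from the architectural definitions of ADD, AND and SUB, AF after the NOR step is simply left at 0, and all function names are hypothetical. The two flag sets are merged with OR, which equals the "+" in the formula because the two masks do not overlap.

```python
MASK = 0xFFFFFFFF
CF, PF, AF, ZF, SF, OF = 0x001, 0x004, 0x010, 0x040, 0x080, 0x800


def result_flags(r):
    """ZF, SF and PF as derived from a 32-bit result."""
    f = ZF if r == 0 else 0
    if r & 0x80000000:
        f |= SF
    if bin(r & 0xFF).count("1") % 2 == 0:
        f |= PF
    return f


def add_with_flags(x, y):
    r = (x + y) & MASK
    f = result_flags(r)
    if x + y > MASK:
        f |= CF
    if (x & 0xF) + (y & 0xF) > 0xF:
        f |= AF
    if (~(x ^ y) & (x ^ r)) & 0x80000000:
        f |= OF
    return r, f


def nor_with_flags(x, y):
    # NotNotAnd ends with AND, which clears CF and OF.
    r = ~x & ~y & MASK
    return r, result_flags(r)


def vm_sub(a, b):
    not_a, _ = nor_with_flags(a, a)              # VM_NOT A
    total, add_flags = add_with_flags(b, not_a)  # VM_ADD {B, (VM_NOT A)}
    r, not_flags = nor_with_flags(total, total)  # final VM_NOT
    return r, (0x815 & add_flags) | (~0x815 & not_flags)


def sub_reference(a, b):
    """Plain SUB a, b with its flags, for comparison."""
    r = (a - b) & MASK
    f = result_flags(r)
    if a < b:
        f |= CF
    if (a & 0xF) < (b & 0xF):
        f |= AF
    if ((a ^ b) & (a ^ r)) & 0x80000000:
        f |= OF
    return r, f


def vm_bswap(v):
    lo, hi = v & 0xFFFF, v >> 16
    hi_word = (((lo << 16 | lo) << 8) & MASK) >> 16  # HiWord {Shl (LoWord_LoWord) 8}
    lo_word = (((hi << 16 | hi) << 8) & MASK) >> 16  # HiWord {Shl (HiWord_HiWord) 8}
    return hi_word << 16 | lo_word
```

In this model `vm_sub(a, b)` agrees with `sub_reference(a, b)` on both value and flags, which shows why the mask 0815 takes exactly CF, PF, AF and OF from the addition.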
The place of the real registers in the VM-Registers space differs not only from one VM to another, but can also change inside one VM! A register is read from one slot and afterwards CAN be stored into an intermediate slot, while the old slot becomes intermediate. So VM-Registers tracking is needed!
......
VM-entry works like this:
we are at the current stack position; let's call it TOP-ESP.
At the place of the original opcode VMprotect puts a call to the VM:

    push offset-VM_IP
    call VM-StartCode
    ,,,,
  VM-StartCode:
    push Registers, EFlags  ; << Order of push CAN differ from the order of pop on ExitVM!
    push [passed_pointer_for_security + "crypt-constant" ]
                            ; <<new from 1.8, passed from StartupVM, which allocates this memory,
                            ; resolves imports, does file CRC-check,
    push 0                  ; Relocation-Difference
    mov esi, [esp+030]      ; offset-VM_IP
    mov ebp, esp            ; ebp becomes the VM-stack
    sub esp, 0C0            ; 040 bytes reserved for 16 VM-Registers, the other free 080 bytes are used
                            ; for user-pushed variables. If the VM-stack grows too low, the VM-Registers
                            ; are moved down
    mov edi, esp            ; edi holds the VM-Registers pointer
    add esi, [ebp+0]        ; add Relocation-Difference to offset-VM_IP
                            ; the LoadVmIP handler also jumps here

and now the code is in the VM_main_loop:
{VM_main_loop has 2 variations: reading VM-bytes downwards as below, or inverse, reading upwards}

    mov al,[esi]
    movzx eax,al
    inc esi
    jmp [JumpTable + eax*4]

Here VM_BLOCK execution starts; it moves all Registers/EFlags/others pushed by VM-StartCode into the VM-Registers space, until the VM-Stack(ebp) reaches TOP-ESP.
Now execution of the virtualized user code starts.
......
VM-exit works like this:
VM-stack(Ebp) is at TOP-ESP (it can be above the start value, if Ret_nn was emulated or Esp was changed).
Now the VM executes the VM_BLOCK-Epilog-bytes, which move all required values (+ return_IP) from the VM-Registers space onto the stack, and Ebp is ready for the ExitVM-handler;
then the last VM-byte calls the ExitVM-handler, which pops everything from the stack to Registers/EFlags and does Ret to return_IP.
......
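To make the fetch/dispatch step of the VM_main_loop concrete, here is a toy Python model. It is only a sketch: the opcode numbers, the three handlers and the register indices are invented for illustration, only the Carry-flag is modelled, and the register index is read as an immediate byte, which may not match how the real handlers encode it.

```python
import struct

MASK = 0xFFFFFFFF


def run_vm(bytecode):
    """Fetch one VM-byte, index the jump table, run the handler, repeat."""
    stack = []           # stand-in for the VM-stack addressed through ebp
    vm_regs = [0] * 16   # the 16-dword VM-Registers space (edi)
    ip = 0               # stand-in for esi, the VM_IP

    def push_imm_dword():
        nonlocal ip
        (value,) = struct.unpack_from("<I", bytecode, ip)
        ip += 4
        stack.append(value)

    def add_dword_dword():
        b, a = stack.pop(), stack.pop()
        total = a + b
        stack.append(total & MASK)              # result, like [ebp+4]
        stack.append(1 if total > MASK else 0)  # "EFlags" on top, like [ebp+0]

    def pop_dword_to_vm_regs():
        nonlocal ip
        index = bytecode[ip]
        ip += 1
        vm_regs[index] = stack.pop()

    jump_table = [push_imm_dword, add_dword_dword, pop_dword_to_vm_regs]
    exit_vm = 3  # hypothetical opcode number for ExitVM

    while True:
        opcode = bytecode[ip]  # mov al,[esi] / movzx eax,al
        ip += 1                # inc esi
        if opcode == exit_vm:
            return vm_regs
        jump_table[opcode]()   # jmp [JumpTable + eax*4]


program = (
    bytes([0]) + struct.pack("<I", 0xFFFFFFFF)  # push 0FFFFFFFF
    + bytes([0]) + struct.pack("<I", 1)         # push 1
    + bytes([1])                                # add, leaves result and flags
    + bytes([2, 8])                             # pop flags into slot 8
    + bytes([2, 0])                             # pop result into slot 0
    + bytes([3])                                # ExitVM
)
regs = run_vm(program)
```

The add handler followed directly by a pop into the VM-Registers mirrors the VM-opcode pairs: every flag-producing handler is followed by a store of the flags dword.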
Because of its original way of managing jumps, VMprotect divides executable code into VM_BLOCKs: from Start or a Jump_label to End or a Jump.

VM_BLOCK starts with VM_BLOCK-Prolog_bytes. The Prolog-bytes:
    1. "decrypt" passed_pointer_for_security ;
    2. pop everything from the stack into the VM_Registers ;
VM_BLOCK ends with VM_BLOCK-Epilog_bytes. The Epilog-bytes:
    1. push from the VM_Registers to the stack ;
    2. IF VM-exit: go to the ExitVM-handler ;
       ELSE: "crypt" passed_pointer_for_security ; (for the next VM_BLOCK entry)

VM_JUMP (conditional) works like this:
the VM places 2 VM_BLOCK_IP offsets on the stack, then does Push_VM_ESP, so the VM_Stack holds a pointer to the lower one. Then VMprotect converts the EFlags state into an adjustment value of 4 or 0, so the pointer in the VM_Stack is either adjusted to the second VM_BLOCK_IP or left as is.

example for VM_JNZ/VM_JZ

    PushdImmD_SS4   offset_JumpIf_Zf=1
    PushdImmD_SS4   offset_JumpIf_Zf=0
    PushEBP_SS4     (Push Esp)
    PushwImmUB_SS2  04 (shift count for the Zf bit)
    VM_AND          (040, EFlags)
    Shr_D_SS2       040 04 (shift count for the Zf bit) ; Zf=1, so result=4
    Add_DD          04 012FFB8                    ; pointer adjusted
    Pushm_D_SS      [012FFBC]
    PopVR_D_SA4     024 offset_JumpIf_Zf=1        ; pop into VM_Regs

After that follow the VM_BLOCK-Epilog_bytes + LoadVmIP.
It is our job to discover whether it is JZ or JNZ!

A single-flag condition is the easy case! A complex condition requires weird calculations to convert the conditions into a single adjustment value. The longest calculation is for the GREATER / LOWER_or_EQUAL conditions.
The same kind of work is done on EFlags for the CMOVcc & SETcc instructions.

If we see VM_BLOCK-Epilog_bytes directly before VM_BLOCK-Prolog_bytes, then it is most likely a label for a jump.
......
VM_CALL is done in the way we often do it in assembly:
Push Arguments, Return address, Call address & then Retn = ExitVM.
......
If the VM can't virtualize an instruction, it has 2 ways:
    1. if the instruction does not affect Memory/Registers/EFlags (like EMMS, FPU-commands), it can be included in any free VM-handler and an appropriate VM-byte is assigned;
    2. else the VM does a full VM-exit to this instruction & calls the VM again after its execution.
......
A memory effective address is calculated step by step.
example: mov eax D$edi+ecx*8+088888888
All Regs/Values are pushed & added. For (ecx*8) the ShlDword_SS2 handler is used.
......
Many instructions are virtualized in these ways, in the hope of hiding them.
OK, but now the generic question is: does complex-opcode virtualization matter!?!?
For example, let's say we decompiled VM_MOVSB into a group of simple instructions & we do not guess that they meant the initial opcode. But is that a problem!? No! The code keeps all its functionality anyway!
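Going back to the VM_JNZ/VM_JZ example and the effective-address example above, both reduce to a few integer operations. The Python sketch below assumes the stack layout described in the text (the Zf=1 offset pushed first, so it sits 4 bytes above the slot the pushed VM_ESP points to); the function and parameter names are hypothetical.

```python
MASK = 0xFFFFFFFF
ZF = 0x40  # Zf is bit 6 of EFlags, i.e. the mask 040 used by VM_AND


def pick_vm_block(eflags, offset_if_zf1, offset_if_zf0):
    """Select the next VM_BLOCK_IP offset without any real branch."""
    # Keys are distances in bytes from the pushed VM_ESP pointer.
    stack_slots = {0: offset_if_zf0, 4: offset_if_zf1}
    adjust = (eflags & ZF) >> 4  # 040 >> 4 = 4 when Zf=1, otherwise 0
    return stack_slots[adjust]


def effective_address(edi, ecx, disp=0x88888888):
    """edi + ecx*8 + displacement, built step by step as the VM does."""
    scaled = (ecx << 3) & MASK   # ecx*8 done as a shift (ShlDword_SS2)
    return (edi + scaled + disp) & MASK
```

Whether the original instruction was JZ or JNZ only shows in which of the two offsets is the fall-through block, so a decompiler has to compare the selected offsets with the address of the code that follows.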









