
Thread: vm for the masses - a vm compiler incl source

  1. #16
    i think it's one of the rare sources that went public and is really great...

    thx again ..

  2. #17
    Join Date
    Mar 2007
    Location
    Australia
    Posts
    27
    do i get this right that coco only generates the parser and scanner for you, but you have to write the compiler yourself?

    what i understand so far is that coco is run on a language to produce some sort of output. is the coco output already the code that gets executed by the virtual machine, or is it processed further to create virtual machine byte code?

    i'm a bit lost here (even after having a look at the sources), so maybe someone can point me in the right direction.
    -------
    nothing
    -------

  3. #18
    coco generates the source code of the compiler; it is configured by the grammar file xm.atg

    so basically i don't write the compiler source myself, i just write a config file for coco. based on this config, coco generates the sources for the compiler, which are then used

  4. #19
    so the compiler generated by coco transforms your instructions into this for example:
    00000000 mov temp_0000, 0
    00000001 mov i, temp_0000
    00000002 mov temp_0000, i
    00000003 mov_data temp_0000, src
    00000004 mov temp_0001, 0
    00000005 not_equal temp_0000, temp_0001
    (taken from your strcpy snippet in the bigpicture.txt)

    and this is then executed by the vm? or is it processed further into some sort of binary code? which of the methods in the package actually executes the instructions?

  5. #20
    b3n, I don't think following 0rp's code is going to help you with what you want. It might actually make it harder to understand.

    0rp, no reflection on your code, just that b3n and I had quite a detailed discussion about VMs via privmsg.
    Still here...

  6. #21
    hi silver,
    it's not really related to what we talked about, i just want to understand how 0rp's code works, and i couldn't figure that out yet.

  7. #22
    let's assume you have this expression:

    1 + 2 * 3

    the coco-generated compiler (aka the frontend) transforms this expression into:

    Code:
    00000000    mov temp_0000, 1
    00000001    mov temp_0001, 2
    00000002    mov temp_0002, 3
    00000003    mul temp_0001, temp_0002
    00000004    add temp_0000, temp_0001
    (if you prefer stack machines, this code is equivalent to
    Code:
    push 1
    push 2
    push 3
    mul
    add
    actually, the first xm generation was a stack machine)
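The frontend step described above can be sketched in a few lines. This is not the Coco-generated compiler from xm.atg, only a hand-written recursive-descent stand-in for `+` and `*` over integer literals; the function name `compile_expr` and the temp allocation rule (one fresh temp per literal, result left in the left operand) are assumptions chosen so that the output has the same shape as the listing.

```python
import re


def compile_expr(source):
    """Compile + and * expressions into two-operand temp code (illustrative only)."""
    tokens = re.findall(r"\d+|[+*()]", source)
    pos = 0
    code = []
    next_temp = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        nonlocal next_temp
        tok = take()
        if tok == "(":
            temp = expr()
            take()  # consume the closing ")"
            return temp
        # Assumption: every literal is loaded into a fresh temp.
        temp = f"temp_{next_temp:04d}"
        next_temp += 1
        code.append(f"mov {temp}, {int(tok)}")
        return temp

    def term():
        left = factor()
        while peek() == "*":
            take()
            right = factor()
            code.append(f"mul {left}, {right}")  # result stays in the left temp
        return left

    def expr():
        left = term()
        while peek() == "+":
            take()
            right = term()
            code.append(f"add {left}, {right}")
        return left

    expr()
    return code


for index, line in enumerate(compile_expr("1 + 2 * 3")):
    print(f"{index:08x}    {line}")
```

Because `term` binds tighter than `expr`, the `mul` is emitted before the `add`, which is the order shown in the frontend listing.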




    this frontend code is given to the backend, which transforms it into real, down-to-the-metal vm instructions:

    Code:
      00000000    mov temp_0000, 1
      ---------------------------------------------------------
      10    00000126    MOV_TEMP_CONST      00000064,  00000001
      11    00000ca0    MOV_TEMP_CONST      00000078,  00000000
      12    0000020a    ADD                 00000078,  00000008
      13    00000ce5    MOV_MEM_TEMP        00000078,  00000064
    
    
    
    
      00000001    mov temp_0001, 2
      ---------------------------------------------------------
      14    000003f1    MOV_TEMP_CONST      00000064,  00000002
      15    00000944    MOV_TEMP_CONST      00000078,  00000004
      16    0000074c    ADD                 00000078,  00000008
      17    0000031a    MOV_MEM_TEMP        00000078,  00000064
    
    
    
      00000002    mov temp_0002, 3
      ---------------------------------------------------------
      18    00000f62    MOV_TEMP_CONST      00000064,  00000003
      19    00000d2e    MOV_TEMP_CONST      00000078,  00000008
      1a    000008fd    ADD                 00000078,  00000008
      1b    00000ff0    MOV_MEM_TEMP        00000078,  00000064
    
    
    
    
    
    
      00000003    mul temp_0001, temp_0002
      ---------------------------------------------------------
      1c    00001187    MOV_TEMP_CONST      00000078,  00000004
      1d    000011cc    ADD                 00000078,  00000008
      1e    00000d73    MOV_TEMP_MEM        00000064,  00000078
      1f    0000125a    MOV_TEMP_CONST      00000078,  00000008
      20    00000c59    ADD                 00000078,  00000008
      21    00000e46    MOV_TEMP_MEM        00000068,  00000078
      22    0000081f    MUL                 00000064,  00000068
      23    0000004f    MOV_TEMP_CONST      00000078,  00000004
      24    00000a62    ADD                 00000078,  00000008
      25    00000a19    MOV_MEM_TEMP        00000078,  00000064
    
    
    
    
      00000004    add temp_0000, temp_0001
      ---------------------------------------------------------
      26    000012b3    MOV_TEMP_CONST      00000078,  00000000
      27    00000989    ADD                 00000078,  00000008
      28    000003a8    MOV_TEMP_MEM        00000064,  00000078
      29    00000507    MOV_TEMP_CONST      00000078,  00000004
      2a    00000f1b    ADD                 00000078,  00000008
      2b    00000094    MOV_TEMP_MEM        00000068,  00000078
      2c    000012f8    ADD                 00000064,  00000068
      2d    00000ed6    MOV_TEMP_CONST      00000078,  00000000
      2e    0000066c    ADD                 00000078,  00000008
      2f    000009d0    MOV_MEM_TEMP        00000078,  00000064
    (each frontend instruction is followed by the vm instructions it requires)

    as you can see, a lot of vm instructions are required for one frontend instruction (e.g. add temp, temp requires 10 vm instructions)
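The expansion pattern in the listing can be reproduced with a short sketch. Everything here is inferred from the operand columns of the listing, not taken from the xm backend source: the slot constants (0x64, 0x68, 0x78, 0x08), the 4-byte temp spacing, and the helper names are assumptions, and the instruction addresses in the second column are not modelled at all.

```python
# Slot offsets as they appear in the listing; their meaning is an assumption.
REG_A = 0x64     # first scratch slot
REG_B = 0x68     # second scratch slot
REG_ADDR = 0x78  # slot used to build the address of a temp
REG_BASE = 0x08  # slot added to every temp offset (assumed: base of the temp area)
TEMP_SIZE = 4    # temp_0000 -> 0, temp_0001 -> 4, temp_0002 -> 8


def temp_offset(name):
    """Map 'temp_0001' to its byte offset (4)."""
    return int(name.split("_")[1]) * TEMP_SIZE


def address_of(temp):
    """Two vm instructions that leave the address of `temp` in REG_ADDR."""
    return [
        ("MOV_TEMP_CONST", REG_ADDR, temp_offset(temp)),
        ("ADD", REG_ADDR, REG_BASE),
    ]


def expand_mov_const(dst, value):
    """Expand 'mov temp, const' into 4 vm instructions."""
    return (
        [("MOV_TEMP_CONST", REG_A, value)]
        + address_of(dst)
        + [("MOV_MEM_TEMP", REG_ADDR, REG_A)]
    )


def expand_binary(op, dst, src):
    """Expand 'op temp, temp' into 10 vm instructions: load, load, op, store."""
    return (
        address_of(dst) + [("MOV_TEMP_MEM", REG_A, REG_ADDR)]
        + address_of(src) + [("MOV_TEMP_MEM", REG_B, REG_ADDR)]
        + [(op, REG_A, REG_B)]
        + address_of(dst) + [("MOV_MEM_TEMP", REG_ADDR, REG_A)]
    )


for name, a, b in expand_binary("MUL", "temp_0001", "temp_0002"):
    print(f"{name:<18}  {a:08x},  {b:08x}")
```

The loop prints the same operand pairs as instructions 1c to 25 in the listing, which is where the count of 10 vm instructions per two-operand frontend instruction comes from.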

  8. #23
    thanks for that explanation 0rp, that made it a lot clearer. i'm still digging through the code, commenting as much as i can, but i haven't found the method that executes the vm instructions yet. where is the generated backend code executed? or is the backend code generated and executed on the fly when the frontend instructions are read?

    edit:
    am i right if i assume the following snippet of vm code would translate to the instructions shown below?


    10 00000126 MOV_TEMP_CONST 00000064, 00000001
    11 00000ca0 MOV_TEMP_CONST 00000078, 00000000
    12 0000020a ADD 00000078, 00000008
    13 00000ce5 MOV_MEM_TEMP 00000078, 00000064


    mov dword [ebx+0xededed00], 0xededed01
    mov dword [ebx+0xededed00], 0xededed01
    mov eax, [ebx+0xededed01]
    add [ebx+0xededed00], eax
    mov eax, [ebx+0xededed01]
    mov ecx, [ebx+0xededed00]
    mov [ecx], eax

    i don't know what 0xededed00 and 0xededed01 are used for, could you please explain that to me?

    [--MOV_TEMP_CONST--]
    //initialize temp reg with 1 (ebx+0xededed00 points to the first temp reg?)
    //is 00000064 in ebx?
    mov dword [ebx+0xededed00], 0xededed01
    [--END MOV_TEMP_CONST--]

    [--MOV_TEMP_CONST--]
    //same as above, initialize second temp reg with 0
    mov dword [ebx+0xededed00], 0xededed01
    [--END MOV_TEMP_CONST--]

    [--ADD--]
    //move value of temp reg 2 into eax
    mov eax, [ebx+0xededed01]

    //probably add the value in eax to the first temp reg, but i'm not sure what
    //the 00000008 in the vm code stands for
    add [ebx+0xededed00], eax
    [--END ADD--]

    [--MOV_MEM_TEMP--]
    //move value of second temp reg into eax
    mov eax, [ebx+0xededed01]

    //move address of first temp reg in ecx
    mov ecx, [ebx+0xededed00]

    //save eax at address of first temp reg
    mov [ecx], eax
    [--END MOV_MEM_TEMP--]
    Last edited by b3n; April 21st, 2007 at 21:41.

  9. #24
    the instructions themselves are executable: when the vm is entered, it goes straight to the first opcode; this opcode knows which one is next and jumps to it, and so on

    the edededXX values are markers. i compile the opcode source into .bin and overwrite the edededXX markers with their real values (done in void Backend::writeParam)

    example:

    ADD TEMP_0064, TEMP_0078

    add opcode source:
    mov eax, [ebx+0xededed01]
    add [ebx+0xededed00], eax

    which becomes:
    mov eax, [ebx+0x78]
    add [ebx+0x64], eax


    so 0xededed01 (the source operand) is replaced with 0x78 during generation, and 0xededed00 (the dest) is replaced with 0x64



    and you are right with your example of those 4 instructions and their real asm
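The marker replacement can be sketched on the raw bytes of the add opcode. The template bytes below are the standard 32-bit encodings of the two instructions quoted above (`8B 83 disp32` and `01 83 disp32`, with the markers stored little-endian). The function name `patch_markers` and its behaviour are assumptions for illustration; the thread only says the real work happens in `Backend::writeParam`, whose signature is not shown.

```python
import struct

# mov eax, [ebx+0xededed01]  ->  8B 83 01 ED ED ED
# add [ebx+0xededed00], eax  ->  01 83 00 ED ED ED
ADD_TEMPLATE = bytes.fromhex("8b83" "01ededed" "0183" "00ededed")


def patch_markers(template, params):
    """Return a copy of `template` with marker 0xEDEDED00+i replaced by params[i].

    Hypothetical helper: only the first occurrence of each marker is patched.
    """
    code = bytearray(template)
    for index, value in enumerate(params):
        marker = struct.pack("<I", 0xEDEDED00 + index)
        at = code.find(marker)
        if at < 0:
            raise ValueError(f"marker {index} not found in template")
        code[at:at + 4] = struct.pack("<I", value)
    return bytes(code)


# ADD TEMP_0064, TEMP_0078: param 0 is the dest (0x64), param 1 the source (0x78).
patched = patch_markers(ADD_TEMPLATE, [0x64, 0x78])
print(patched.hex(" "))
```

The patched bytes decode to `mov eax, [ebx+0x78]` followed by `add [ebx+0x64], eax`, matching the example in the post.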

  10. #25
    thanks 0rp!

    so do i get this right:
    1. you let the compiler generate the vm instructions from the input script
    2. the vm runs over this script and executes the matching instructions

    so:
    ADD TEMP_0064, TEMP_0078
    will be executed by the vm like:
    1. find out instruction (in this case add)
    2. look up the compiled opcode
    3. patch the 0xededed00 and 0xededed01 markers
    4. execute the opcode instructions
    5. get next instruction

    did i get this right?

  11. #26
    the replacement of the edededXX markers is done during generation, not during execution

    so when generation is done, you have a big block of executable x86 code that makes up the individual steps; somewhere it will contain
    mov eax, [ebx+0x78]
    add [ebx+0x64], eax
    which was required for something



    here is what the final generation result looks like without encryption:

    mov temp_64, 1:
    0049E845 mov dword ptr [ebx+64h],1
    0049E84F mov ecx,4FCh
    0049E854 mov edx,19h
    0049E859 add ecx,dword ptr [ebx+2Ch]
    0049E85C jmp ecx



    mov temp_78, 0:
    0049ECDC mov dword ptr [ebx+78h],0
    0049ECE6 mov ecx,0C8h
    0049ECEB mov edx,1Bh
    0049ECF0 add ecx,dword ptr [ebx+2Ch]
    0049ECF3 jmp ecx



    add temp_78, temp_8:
    0049E8A8 mov eax,dword ptr [ebx+8]
    0049E8AE add dword ptr [ebx+78h],eax
    0049E8B4 mov ecx,516h
    0049E8B9 mov edx,1Dh
    0049E8BE add ecx,dword ptr [ebx+2Ch]
    0049E8C1 jmp ecx



    so the vm instructions end up as small, executable, customized x86 blocks (the edededXX markers are replaced) that are chained together
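The chaining can be simulated to see the control flow. In the disassembly each block ends by loading the next block's offset into ecx, adding a base taken from `[ebx+2Ch]`, and jumping. The sketch below models that with a dictionary keyed by block offset. The offsets 0x4FC, 0xC8 and 0x516 come from the listing; 0x65 for the first block is derived by subtracting the base implied by the other addresses; the content of the block at 0x516, the value in slot 0x08, and all function names are assumptions. The role of edx is not explained in the thread (in this listing it happens to equal the byte length of the next block), so it is ignored here.

```python
def mov_temp_const(dst, value):
    """Build a block with its operands already baked in (generation time)."""
    def block(ctx, mem):
        ctx[dst] = value
    return block


def add(dst, src):
    def block(ctx, mem):
        ctx[dst] += ctx[src]
    return block


def mov_mem_temp(addr_slot, src):
    def block(ctx, mem):
        mem[ctx[addr_slot]] = ctx[src]
    return block


def run_chain(blocks, entry, ctx, mem):
    """Follow the chain; the loop stands in for each block's own 'jmp ecx'."""
    visited = []
    offset = entry
    while offset is not None:
        work, next_offset = blocks[offset]
        work(ctx, mem)
        visited.append(offset)
        offset = next_offset
    return visited


# offset -> (customized block, offset of the next block)
blocks = {
    0x065: (mov_temp_const(0x64, 1), 0x4FC),
    0x4FC: (mov_temp_const(0x78, 0), 0x0C8),
    0x0C8: (add(0x78, 0x08), 0x516),
    0x516: (mov_mem_temp(0x78, 0x64), None),  # assumed content; ends the demo
}

ctx = {0x08: 0x1000}  # assumption: slot 0x08 holds the base of the temp area
mem = {}
order = run_chain(blocks, 0x065, ctx, mem)
print([hex(o) for o in order], mem)
```

The blocks sit at scattered offsets and execution order is defined only by the link stored in each block, so there is no central table of opcodes to look up at run time.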

  12. #27
    i see, so the compiled opcode snippets are just small templates of code that get customized by the vm environment and put together to form the final program? the way i see it, the backend is kind of a compiler too, which produces the final binary as output. the final program is then run by executing the first instruction in the instruction chain?

  13. #28
    yes, exactly

  14. #29
    why did you decide to create a final binary version of the input program instead of letting the vm execute the vm instructions at runtime, as a kind of interpreter? if you have a binary version of the input program, what do you need the vm for? (maybe i missed something on the way, but that's what i ask myself)

  15. #30
    there was an xm version that worked the way you suggest

    it had a fixed set of generic opcodes (add, mov, mul, ...) that were parameterized. therefore the vm also contained a big parameter stream

    i didn't like this idea too much, because you can easily replace the fixed set of opcodes with your own hacked opcodes and do whatever you want
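For contrast, the earlier design described in this post (a fixed set of generic opcodes plus a separate parameter stream) might look roughly like the sketch below. The opcode numbering, the two-parameters-per-opcode layout, and all names are assumptions; the thread gives no details of that xm version.

```python
def op_mov_temp_const(ctx, dst, value):
    ctx[dst] = value


def op_add(ctx, dst, src):
    ctx[dst] += ctx[src]


def op_mul(ctx, dst, src):
    ctx[dst] *= ctx[src]


# Fixed handler table shared by every instruction of the same kind.
HANDLERS = {0: op_mov_temp_const, 1: op_add, 2: op_mul}


def interpret(opcode_stream, param_stream, ctx):
    """Classic interpreter loop: generic handlers read operands from a stream."""
    params = iter(param_stream)
    for opcode in opcode_stream:
        dst = next(params)
        src = next(params)
        HANDLERS[opcode](ctx, dst, src)
    return ctx


# 1 + 2 * 3 using three hypothetical slots 0x64, 0x68, 0x6c
opcodes = [0, 0, 0, 2, 1]
params = [0x64, 1, 0x68, 2, 0x6C, 3, 0x68, 0x6C, 0x64, 0x68]
result = interpret(opcodes, params, {})
print(result[0x64])
```

One way to read the objection in the post: every add in the program goes through the single `HANDLERS[1]` entry, so swapping that one handler intercepts all of them, whereas the generated chain has a separately customized block per instruction.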
