i think it's one of the rare sources that came to the public and is really great...
thx again ..
Welcome to the new Woodmann RCE Messageboards Regroupment
Please be patient while the rest of the site is restored.
To all Members of the old RCE Forums:
In order to log in, it will be necessary to reset your forum login password ("I forgot my password") using the original email address you registered with. You will be sent an email with a link to reset your password for that member account.
The old vBulletin forum was converted to phpBB format, requiring the passwords to be reset. If this is a problem for some because of a forgotten email address, please feel free to re-register with a new username. We are happy to welcome old and new members back to the forums! Thanks.
All new accounts are manually activated before you can post. Any questions can be PM'ed to Kayaker.
vm for the masses - a vm compiler incl source
do i get this right that coco only generates the parser and scanner for you, but you have to write the compiler yourself?
from what i understand so far, coco is run on a language to produce some sort of output. is the coco output already the code that gets executed by the virtual machine, or is it processed further to create virtual machine byte code?
i'm a bit lost here (even after having a look at the sources), so maybe someone can point me in the right direction.
-------
so the compiler generated by coco transforms your instructions into this, for example:
00000000 mov temp_0000, 0
00000001 mov i, temp_0000
00000002 mov temp_0000, i
00000003 mov_data temp_0000, src
00000004 mov temp_0001, 0
00000005 not_equal temp_0000, temp_0001
(taken from your strcpy snippet in bigpicture.txt)
and this is then executed by the vm? or is it processed further into some sort of binary code? which of the methods in the packages actually executes the instructions?
-------
let's assume you have this expression:
1 + 2 * 3
the coco-generated compiler (aka frontend) transforms this expression into:
00000000 mov temp_0000, 1
00000001 mov temp_0001, 2
00000002 mov temp_0002, 3
00000003 mul temp_0001, temp_0002
00000004 add temp_0000, temp_0001
(if you prefer stack machines, this code is identical to
push 1
push 2
push 3
mul
add
actually the first xm generation was a stack machine)
this frontend code is given to the backend, which transforms it into real down-to-the-metal vm instructions
(first the frontend instruction, followed by the required vm instructions):
00000000 mov temp_0000, 1
---------------------------------------------------------
10 00000126 MOV_TEMP_CONST 00000064, 00000001
11 00000ca0 MOV_TEMP_CONST 00000078, 00000000
12 0000020a ADD 00000078, 00000008
13 00000ce5 MOV_MEM_TEMP 00000078, 00000064
00000001 mov temp_0001, 2
---------------------------------------------------------
14 000003f1 MOV_TEMP_CONST 00000064, 00000002
15 00000944 MOV_TEMP_CONST 00000078, 00000004
16 0000074c ADD 00000078, 00000008
17 0000031a MOV_MEM_TEMP 00000078, 00000064
00000002 mov temp_0002, 3
---------------------------------------------------------
18 00000f62 MOV_TEMP_CONST 00000064, 00000003
19 00000d2e MOV_TEMP_CONST 00000078, 00000008
1a 000008fd ADD 00000078, 00000008
1b 00000ff0 MOV_MEM_TEMP 00000078, 00000064
00000003 mul temp_0001, temp_0002
---------------------------------------------------------
1c 00001187 MOV_TEMP_CONST 00000078, 00000004
1d 000011cc ADD 00000078, 00000008
1e 00000d73 MOV_TEMP_MEM 00000064, 00000078
1f 0000125a MOV_TEMP_CONST 00000078, 00000008
20 00000c59 ADD 00000078, 00000008
21 00000e46 MOV_TEMP_MEM 00000068, 00000078
22 0000081f MUL 00000064, 00000068
23 0000004f MOV_TEMP_CONST 00000078, 00000004
24 00000a62 ADD 00000078, 00000008
25 00000a19 MOV_MEM_TEMP 00000078, 00000064
00000004 add temp_0000, temp_0001
---------------------------------------------------------
26 000012b3 MOV_TEMP_CONST 00000078, 00000000
27 00000989 ADD 00000078, 00000008
28 000003a8 MOV_TEMP_MEM 00000064, 00000078
29 00000507 MOV_TEMP_CONST 00000078, 00000004
2a 00000f1b ADD 00000078, 00000008
2b 00000094 MOV_TEMP_MEM 00000068, 00000078
2c 000012f8 ADD 00000064, 00000068
2d 00000ed6 MOV_TEMP_CONST 00000078, 00000000
2e 0000066c ADD 00000078, 00000008
2f 000009d0 MOV_MEM_TEMP 00000078, 00000064
as you can see, a lot of vm instructions are required for one frontend instruction (e.g. add temp, temp requires 10 vm instructions)
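To make the frontend step concrete, here is a minimal sketch in Python (not the Coco-generated compiler from the package) that lowers an expression of integers, + and * to the same register-style code and the equivalent stack-machine code shown in the post. The function name compile_expr, the helper names, and the decimal numbering of the temps are assumptions made for illustration.

```python
import re

def compile_expr(source):
    """Lower integers combined with '+' and '*' to two equivalent forms."""
    tokens = re.findall(r"\d+|[+*]", source)
    pos = 0
    three_address = []   # register-style code: "op dest, src"
    stack_code = []      # equivalent stack-machine code
    temp_count = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def number():
        # Load one integer literal into a fresh temporary.
        nonlocal pos, temp_count
        literal = peek()
        if literal is None or not literal.isdigit():
            raise SyntaxError("expected a number")
        pos += 1
        temp = "temp_%04d" % temp_count   # assumption: decimal temp numbering
        temp_count += 1
        three_address.append("mov %s, %s" % (temp, literal))
        stack_code.append("push %s" % literal)
        return temp

    def binary(operand, operator, mnemonic):
        # Parse "operand (operator operand)*"; the result stays in the left temp.
        nonlocal pos
        left = operand()
        while peek() == operator:
            pos += 1
            right = operand()
            three_address.append("%s %s, %s" % (mnemonic, left, right))
            stack_code.append(mnemonic)
        return left

    def term():
        return binary(number, "*", "mul")   # '*' binds tighter than '+'

    def expr():
        return binary(term, "+", "add")

    expr()
    return three_address, stack_code

tac, stack = compile_expr("1 + 2 * 3")
for line in tac:
    print(line)
```

For "1 + 2 * 3" this prints the five mov/mul/add lines from the post (without the leading instruction numbers), and stack holds push 1, push 2, push 3, mul, add.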
thanks for that explanation 0rp, that made it a lot clearer. i'm currently still digging through the code, commenting as much as i can, but i haven't found the method that executes the vm instructions yet. where is the generated backend code executed? or is the backend code generated and executed on the fly when the frontend instructions are read?
edit:
am i right if i assume the following snippet of vm code would translate to the instructions shown below?
10 00000126 MOV_TEMP_CONST 00000064, 00000001
11 00000ca0 MOV_TEMP_CONST 00000078, 00000000
12 0000020a ADD 00000078, 00000008
13 00000ce5 MOV_MEM_TEMP 00000078, 00000064
i don't know what 0xededed00 and 0xededed01 are used for, could you please explain that to me?
[--MOV_TEMP_CONST--]
//initialize temp reg with 1 (ebx+0xededed00 points to the first temp reg?)
//is 00000064 in ebx?
mov dword [ebx+0xededed00], 0xededed01
[--END MOV_TEMP_CONST--]
[--MOV_TEMP_CONST--]
//same as above, initialize second temp reg with 0
mov dword [ebx+0xededed00], 0xededed01
[--END MOV_TEMP_CONST--]
[--ADD--]
//move value of temp reg 2 into eax
mov eax, [ebx+0xededed01]
//probably add the value in eax to the first temp reg, but im not sure what
//the 00000008 in the vm code stands for
add [ebx+0xededed00], eax
[--END ADD--]
[--MOV_MEM_TEMP--]
//move value of second temp reg into eax
mov eax, [ebx+0xededed01]
//move address of first temp reg in ecx
mov ecx, [ebx+0xededed00]
//save eax at address of first temp reg
mov [ecx], eax
[--END MOV_MEM_TEMP--]
-------
the instructions themselves are executable. when the vm is entered, it goes straight to the first opcode; this opcode knows which one is next and jumps to it, and so on
these edededXX values are markers. i compile the opcode source into .bin and overwrite the edededXX markers with their real values (done in void Backend::writeParam)
example:
ADD TEMP_0064, TEMP_0078
add opcode source:
mov eax, [ebx+0xededed01]
add [ebx+0xededed00], eax
which becomes:
mov eax, [ebx+0x78]
add [ebx+0x64], eax
so 0xededed01 (the source operand) is replaced with 0x78 during generation, and 0xededed00 (the dest) is replaced with 0x64
and you are right with your example of those 4 instructions and their real asm
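The marker replacement described above can be sketched in a few lines of Python. This is not the code of Backend::writeParam, only an illustration of the idea under stated assumptions: the template bytes are the standard 32-bit x86 encodings of the two instructions with a 4-byte displacement, the markers are stored little-endian, and the names ADD_TEMPLATE, marker and patch_markers are made up for this example.

```python
import struct

# mov eax, [ebx+0xededed01]  ->  8B 83 <disp32>
# add [ebx+0xededed00], eax  ->  01 83 <disp32>
ADD_TEMPLATE = bytes.fromhex("8b83" "01ededed" "0183" "00ededed")

def marker(index):
    # Marker for parameter number `index`, as it appears in the compiled bytes.
    return struct.pack("<I", 0xEDEDED00 | index)

def patch_markers(template, params):
    """Return a copy of `template` with marker N replaced by params[N]."""
    code = bytearray(template)
    for index, value in enumerate(params):
        needle = marker(index)
        pos = code.find(needle)
        if pos < 0:
            raise ValueError("marker %d not found in template" % index)
        while pos >= 0:
            code[pos:pos + 4] = struct.pack("<I", value)
            pos = code.find(needle, pos + 4)
    return bytes(code)

# ADD TEMP_0064, TEMP_0078: parameter 0 is the dest, parameter 1 the source.
patched = patch_markers(ADD_TEMPLATE, [0x64, 0x78])
print(patched.hex())  # 8b8378000000018364000000
```

The patched bytes decode to mov eax, [ebx+0x78] / add [ebx+0x64], eax. A plain byte search like this would misfire if an already patched value happened to look like a later marker, so a real generator would more likely record the marker positions once per template.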
thanks 0rp!
so do i get this right:
1. you let the compiler generate the vm instructions from the input script
2. the vm runs over this script and executes the matching instructions
so:
ADD TEMP_0064, TEMP_0078
will be executed by the vm like:
1. find out the instruction (in this case add)
2. look up the compiled opcode
3. patch the 0xededed00 and 0xededed01 markers
4. execute the opcode instructions
5. get next instruction
did i get this right?
-------
this replacement of edededXX is done during generation, not during execution
so, when generation is done, you have a big block of executable x86 code that makes up the single steps, so somewhere it will contain
mov eax, [ebx+0x78]
add [ebx+0x64], eax
which was required for something
here is what the final generation result looks like without encryption:
mov temp_64, 1:
0049E845 mov dword ptr [ebx+64h],1
0049E84F mov ecx,4FCh
0049E854 mov edx,19h
0049E859 add ecx,dword ptr [ebx+2Ch]
0049E85C jmp ecx
mov temp_78, 0:
0049ECDC mov dword ptr [ebx+78h],0
0049ECE6 mov ecx,0C8h
0049ECEB mov edx,1Bh
0049ECF0 add ecx,dword ptr [ebx+2Ch]
0049ECF3 jmp ecx
add temp_78, temp_8:
0049E8A8 mov eax,dword ptr [ebx+8]
0049E8AE add dword ptr [ebx+78h],eax
0049E8B4 mov ecx,516h
0049E8B9 mov edx,1Dh
0049E8BE add ecx,dword ptr [ebx+2Ch]
0049E8C1 jmp ecx
so the vm instructions end up as a chain of small, customized (the edededXX markers are replaced) executable x86 blocks, each jumping to the next
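The chaining in the listing above can be modelled in Python. Each block ends by loading the offset of its successor into ecx, adding the value at [ebx+2Ch] and jumping there. From the addresses shown, 0x49ECDC - 0x4FC = 0x49E7E0 and 0x49E8A8 - 0xC8 = 0x49E7E0, so [ebx+2Ch] appears to hold a code base of 0x49E7E0; that base is inferred from the listing, not stated in the thread. The names run_chain, BLOCKS and the value put in slot 8 are made up for this sketch, and the edx value is left out because its purpose is not explained here.

```python
MASK32 = 0xFFFFFFFF
CODE_BASE = 0x49E7E0   # inferred from the listing: 0x49ECDC - 0x4FC

def mov_64_const_1(frame):
    frame[0x64] = 1                                    # mov dword ptr [ebx+64h],1

def mov_78_const_0(frame):
    frame[0x78] = 0                                    # mov dword ptr [ebx+78h],0

def add_78_8(frame):
    frame[0x78] = (frame[0x78] + frame[0x08]) & MASK32  # add [ebx+78h],[ebx+8]

# offset of each block -> (what it does, offset of the block it jumps to)
BLOCKS = {
    0x065: (mov_64_const_1, 0x4FC),
    0x4FC: (mov_78_const_0, 0x0C8),
    0x0C8: (add_78_8, 0x516),   # 0x516 is not part of the listing
}

def run_chain(blocks, entry, frame):
    """Follow the chain until it leaves the known blocks; return visited addresses."""
    visited = []
    offset = entry
    while offset in blocks:
        visited.append(frame[0x2C] + offset)   # models: add ecx,[ebx+2Ch] / jmp ecx
        action, next_offset = blocks[offset]
        action(frame)
        offset = next_offset                   # models: mov ecx,<next offset>
    return visited

frame = {0x08: 0x1000, 0x2C: CODE_BASE}        # slot 8 value is an arbitrary example
trace = run_chain(BLOCKS, 0x065, frame)
print([hex(address) for address in trace])     # ['0x49e845', '0x49ecdc', '0x49e8a8']
```

There is no central dispatch loop in the real thing: the loop here only stands in for the jmp ecx at the end of every block.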
i see, so the compiled opcode snippets are just small templates of code that get customized by the vm environment and put together to form the final program? the way i see it, the backend is kind of a compiler too, which produces the final binary as output. the final program is then run by executing the first instruction in the instruction chain?
-------
why did you decide to create a final binary version of the input program instead of letting the vm execute the vm instructions at runtime, as a kind of interpreter? if you have a binary version of the input program, what do you need the vm for? (maybe i missed something on the way, but that's what i ask myself)
-------
there was an xm version that worked like you suggested
it had a fixed number of generic opcodes (add, mov, mul, ...) that were parameterized. therefore the vm also contained a big parameter stream
i didn't like this idea too much, because you can easily replace the fixed set of opcodes with your own hacked opcodes and do whatever you want
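For comparison, the older design described above (a small fixed set of generic handlers plus a separate parameter stream) might look roughly like this in Python. This is only a sketch of the general idea, not the earlier xm code; the handler names, the two-parameters-per-opcode layout and the hook are assumptions chosen for illustration.

```python
from collections import defaultdict

MASK32 = 0xFFFFFFFF

def op_mov_temp_const(frame, dest, value):
    frame[dest] = value & MASK32

def op_add(frame, dest, src):
    frame[dest] = (frame[dest] + frame[src]) & MASK32

# The whole instruction set is this small, fixed table of generic handlers.
HANDLERS = {"MOV_TEMP_CONST": op_mov_temp_const, "ADD": op_add}

def interpret(opcodes, parameter_stream, handlers=HANDLERS):
    """Run the opcodes, pulling two parameters per opcode from the stream."""
    frame = defaultdict(int)
    params = iter(parameter_stream)
    for name in opcodes:
        first, second = next(params), next(params)
        handlers[name](frame, first, second)
    return frame

opcodes = ["MOV_TEMP_CONST", "MOV_TEMP_CONST", "ADD"]
parameters = [0x64, 1, 0x78, 2, 0x64, 0x78]
frame = interpret(opcodes, parameters)
print(frame[0x64])  # 3

# The weakness: swapping one generic handler observes every use of that opcode.
seen = []

def hooked_add(frame, dest, src):
    seen.append((frame[dest], frame[src]))   # log both operand values
    op_add(frame, dest, src)

interpret(opcodes, parameters, dict(HANDLERS, ADD=hooked_add))
print(seen)  # [(1, 2)]
```

With one customized block per instruction, as in the generated code shown earlier in the thread, there is no single handler to hook in this way.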