With the help of this GDB script:
file ./program
b *0x12345
run
while 1
x/i $pc
ni
end
quit
I got a trace log of an obfuscated program:
...
0x484e0: bx lr
?? ()
0x43d88: b 0x43db8
?? ()
0x43db8: ldr r3, [r11, #-16]
?? ()
0x43dbc: mov r0, r3
?? ()
0x43dc0: sub sp, r11, #12
?? ()
0x43dc4: pop {r4, r5, r11, pc}
?? ()
0x3fb94: ldr r3, [r11, #-8]
?? ()
0x3fb98: mov r0, r3
?? ()
0x3fb9c: sub sp, r11, #4
?? ()
0x3fba0: pop {r11, pc}
?? ()
0x3da68: ldr r3, [r11, #-8]
...
Kris Kaspersky writes that it is a good idea to pass the tracer's log through an optimizing compiler to understand such a program better. That way I would get an equivalent executable with more readable disassembly. But I have no idea which compilers to use, or how to use them.
P.S. How do I get rid of the unnecessary "?? ()" lines? And how do I redirect GDB's output to a file?
Welcome to the new Woodmann RCE Messageboards Regroupment
How to pass an obfuscated program's trace log through an optimizing compiler?
Hi Cristianu,
I don't want to offend you, but you are obviously not aware of what you are talking about. There is no such tool that lets you take a gdb trace log, paste it into a file, and have it processed by something that does compiler optimization.
Kris Kaspersky is talking theoretically about which technologies could be used to build a proper deobfuscator by "misusing" compiler optimization algorithms, although I doubt that he has more than a POC.
If you really want to do something like that, keep in mind that you will have to write your own tools; there is no way around this. Nowadays most people use available frameworks for tasks like this, and the most used one is the LLVM project. But be prepared to study that stuff for the next year at least ;D
Another possible, but in my opinion less professional, approach would be to write your own deobfuscator that works on the text output of gdb. Have a look at the blogs here and go back one or two years. I think you will find a project that did something like what you want, to deobfuscate obfuscated virtual machine handlers. The only difference is that the guy who wrote that deobfuscator dealt with x86 code instead of ARM.
If we are talking about only a few hundred lines of code, you could also use a piece of paper and a pencil!
Nevertheless, all of this will end up as a long project!
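As a hedged illustration of the text-processing route: a first step could be to split the trace into straight-line runs, so that every jump in the executed sequence becomes visible. The Python sketch below assumes fixed 4-byte ARM instructions (not Thumb) and a trace already parsed into (address, text) pairs. `split_runs` is a hypothetical helper name, not part of any existing tool.

```python
def split_runs(trace, insn_size=4):
    """Group (address, text) pairs into runs of consecutive addresses.

    Assumption: every instruction is insn_size bytes long (ARM mode).
    A new run starts whenever the next address is not the fall-through one.
    """
    runs = []
    for addr, text in trace:
        if runs and addr == runs[-1][-1][0] + insn_size:
            runs[-1].append((addr, text))  # sequential: extend the current run
        else:
            runs.append([(addr, text)])    # control transfer: start a new run
    return runs


# A few lines from the trace in the first post
sample = [
    (0x484E0, "bx lr"),
    (0x43D88, "b 0x43db8"),
    (0x43DB8, "ldr r3, [r11, #-16]"),
    (0x43DBC, "mov r0, r3"),
    (0x43DC0, "sub sp, r11, #12"),
    (0x43DC4, "pop {r4, r5, r11, pc}"),
    (0x3FB94, "ldr r3, [r11, #-8]"),
]
runs = split_runs(sample)
```

Each run is a candidate block to simplify on its own; this says nothing yet about which instructions are junk.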
- Reverse Engineering can be everything, but sometimes it's more than nothing. Really rare moments but then they appear to last ages... -
i don't find google showing me where kris kaspersky talks about feeding a raw disassembly to some compiler optimizer and getting back a super disassembly
so i refrained from replying earlier
since ohpen has burst the bubble i too will chime in and say there doesn't exist a method that would get you an almost reassemblable disassembly from an obfuscated disassembly
yes, many individual efforts exist, and afaik they are primarily x86, and they all still have a long long way to go before they can be declared near perfect
anyway i'll answer the minor questions, leaving the compiler optimization part aside; the example below is for x86,
convert it to arm yourself
first, as to redirecting output to a file
if your gdb is a newer version
you can give it a file name with set logging file and then do set logging on
if you have a linux like my DAMN SMALL LINUX running with low ram in a vm on a windows host
where the gdb package that is available is old and does not have the set logging command
you can use the following method
Code: Select all
$ cat helloworld.c
#include<stdio.h>
int main (void){
printf("hi Damn Small Linux This is My First Proggie\n");
return 0;
}
$ gcc helloworld.c -o cristianu
$ ./cristianu
hi Damn Small Linux This is My First Proggie
$
$ cat foo
file ./cristianu
set disassembly-flavor intel
set annotate 0
set max-symbolic-offset 0
set print address off
set complaints 0
b main
run
while 1
x/i $pc
ni
end
quit
$
$ gdb -q < foo > cristlog >&1
gdb: Symbol `emacs_ctlx_keymap' has different size in shared object, consider re-linking
No symbol table is loaded. Use the "file" command.
No registers.
$
$ cat cristlog
(gdb) Reading symbols from ./cristianu...(no debugging symbols found)...done.
(gdb) (gdb) (gdb) (gdb) (gdb) (gdb) Breakpoint 1 at 0x804838a
(gdb) Starting program: /home/dsl/cristianu
(no debugging symbols found)...(no debugging symbols found)...
Breakpoint 1, main ()
(gdb) > > >0x804838a <main+6>: and esp,0xfffffff0
main ()
0x804838d <main+9>: mov eax,0x0
main ()
0x8048392 <main+14>: sub esp,eax
main ()
0x8048394 <main+16>: mov DWORD PTR [esp],0x80484e0
main ()
0x804839b <main+23>: call 0x80482b0 <_init+56>
main ()
0x80483a0 <main+28>: mov eax,0x0
main ()
0x80483a5 <main+33>: leave
main ()
0x80483a6 <main+34>: ret
__libc_start_main () from /lib/libc.so.6
0x4002ee3e <__libc_start_main+206>: mov DWORD PTR [esp],eax
__libc_start_main () from /lib/libc.so.6
0x4002ee41 <__libc_start_main+209>: call 0x40044a30 <exit>
hi Damn Small Linux This is My First Proggie
Program exited normally.
(gdb) $
$ grep -i main+ cristlog > cristasm
$ cat crist
cristasm cristianu cristlog
$ cat cristasm
(gdb) > > >0x804838a <main+6>: and esp,0xfffffff0
0x804838d <main+9>: mov eax,0x0
0x8048392 <main+14>: sub esp,eax
0x8048394 <main+16>: mov DWORD PTR [esp],0x80484e0
0x804839b <main+23>: call 0x80482b0 <_init+56>
0x80483a0 <main+28>: mov eax,0x0
0x80483a5 <main+33>: leave
0x80483a6 <main+34>: ret
0x4002ee3e <__libc_start_main+206>: mov DWORD PTR [esp],eax
0x4002ee41 <__libc_start_main+209>: call 0x40044a30 <exit>
$
$ sed s/.*main+.*:.//g cristasm > prettycristasm
$ cat prettycristasm
and esp,0xfffffff0
mov eax,0x0
sub esp,eax
mov DWORD PTR [esp],0x80484e0
call 0x80482b0 <_init+56>
mov eax,0x0
leave
ret
mov DWORD PTR [esp],eax
call 0x40044a30 <exit>
$
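The grep/sed cleanup above depends on the `main+` symbol appearing in every line, which is not the case for the stripped ARM trace in the first post, where the frame lines read "?? ()". A minimal Python sketch of a more general filter follows. It assumes each instruction line has the shape `ADDRESS [<symbol+offset>]: instruction`, possibly preceded by leftover `(gdb)`, `>` or `=>` markers; `extract_instructions` is a hypothetical name, not a GDB feature.

```python
import re

# An instruction line: optional prompt leftovers, an address, an optional
# <symbol+offset>, a colon, then the disassembled instruction text.
INSN_LINE = re.compile(
    r"^(?:\(gdb\)|=>|>|\s)*(0x[0-9a-fA-F]+)(?:\s*<[^>]*>)?:\s*(.+?)\s*$"
)


def extract_instructions(log_text):
    """Return (address, instruction) pairs; frame lines like '?? ()' are dropped."""
    result = []
    for line in log_text.splitlines():
        match = INSN_LINE.match(line)
        if match:
            result.append((int(match.group(1), 16), match.group(2)))
    return result
```

Lines such as "?? ()", "main ()" or "Breakpoint 1 at ..." do not start with an address, so they simply fail to match and are skipped.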
Ok, thank you for the responses!
Now it is clear that developing deobfuscation tools is rather a thing of the near future.
Nevertheless, a trace listing gives us some benefits: we have the real sequence of executed instructions.
It is possible to write a gdb script that outputs each instruction together with the current register state.
Then it is possible to find the necessary value with the help of Ctrl+F.
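A minimal sketch of such a script, written here as a Python helper that only builds the text of a GDB command file. It assumes a GDB recent enough to have the `set logging` commands; the program path, start address and log file name are placeholders taken from or modelled on the first post, and `make_trace_script` is a hypothetical name.

```python
def make_trace_script(program, start, logfile, with_registers=True):
    """Build a GDB command file that single-steps and logs every instruction."""
    body = ["x/i $pc"]                  # print the instruction about to execute
    if with_registers:
        body.append("info registers")   # dump the register state before it runs
    body.append("ni")                   # execute one instruction, stepping over calls
    lines = [
        f"file {program}",
        "set pagination off",           # never stop for the "press enter" prompt
        "set confirm off",
        f"set logging file {logfile}",
        "set logging overwrite on",     # start with a fresh log on each run
        "set logging on",
        f"break *{start}",
        "run",
        "while 1",
        *("  " + command for command in body),
        "end",
        "quit",
    ]
    return "\n".join(lines) + "\n"


script = make_trace_script("./program", "0x12345", "trace.log")
```

The text could be saved as, say, trace.gdb and run with `gdb -q -batch -x trace.gdb`. When the traced program exits, `ni` fails and the loop stops; in batch mode GDB should then exit on its own.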

It is not surprising that Google does not find it: Kris Kaspersky wrote about it in his Russian book "Art of disassembling" (a literal translation of the title).
Could you advise me some books to improve my skills in reverse engineering (ARM-oriented books are preferable)?
I would like to become a superhacker.
What should I research? Compilers, cryptography, deobfuscation theory, what else?
Any help would be appreciated.

i don't know about arm, never had the need to hack arm, but i believe i would be able to hack it in a few sessions if i put my head down to it
as the basics are what must be solid, not the implementation details; x86 is an implementation just like arm is, is what i think
anyway adopting the kiss principle (keep it simple and <......> (sir, stupid sir, straightforward sir, shitty sir, s......sir)
i would go about it like this
grab a simple crackme
find ways to run it, as in installing the os, framework, etc
when it runs, find ways to open it raw and visually look at its guts, ie using any text readers, binary readers
then put it in a comatose state and look at its guts sequentially, ie using debuggers, disassemblers, descripters, dewhatevers
when i am comfortable with its inner workings (as in i can say in my dreams what ldr r3 #somereg, r18 would mean in any context)
i would start poking into its interaction with the os / framework / vm (ie a few round trips into ring 0, as they would say in x86)
and henceforth simply try trapping everything in ring 0, where simple ring 3 obfuscations won't matter
hope i live up to my nick's real meaning

Cristianu wrote: Could you advise me some books to improve my skills in reverse engineering (ARM-oriented books are preferable).
Well, there is Steve Furber's book, widely known as "the ARM bible":
http://www.amazon.co.uk/exec/obidos/ASI ... 09-1571011
and you can start here:
http://www.ee.ic.ac.uk/pcheung/teaching/ee2_computing/
Have fun.
Regards
darkelf
I flout Chuck Norris, Spongebob barbecues underwater!
In response to your original question, one idea would be as follows:
- Take your original disassembly and create a C program with it (using __asm__).
- Compile the C program into a binary.
- Use Hex-Rays on the binary to decompile that program.
- Now take the decompilation, and create a new C program with that decompiled C code.
- Disassemble that new program, and you should have your "optimized" disassembly.
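The first step of that list can be sketched in Python as a small generator that wraps cleaned-up instruction text in a C function using basic `__asm__`. This is only an illustration under assumptions: `trace_to_c` and the function name `traced_block` are hypothetical, and the result will assemble only if the instructions are valid for the target assembler. Branches to absolute addresses copied from a trace would first have to be rewritten as labels.

```python
def trace_to_c(instructions, func_name="traced_block"):
    """Wrap assembly instruction strings in a C function with a basic __asm__ block."""

    def c_string(insn):
        # Escape backslashes and quotes so the text is a valid C string literal.
        escaped = insn.replace("\\", "\\\\").replace('"', '\\"')
        return f'        "{escaped}\\n\\t"'

    lines = [f"void {func_name}(void) {{", "    __asm__ volatile ("]
    lines += [c_string(insn) for insn in instructions]
    lines += ["    );", "}"]
    return "\n".join(lines) + "\n"


source = trace_to_c(["ldr r3, [r11, #-16]", "mov r0, r3"])
```

Writing `source` to a .c file gives something a cross compiler can be asked to build.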
I've just tried to optimize this example in such a way:
Code: Select all
#include <stdio.h>
int main(int argc,char** argv) {
__asm__ ( "movl $10, %eax;"
"movl $10, %eax;"
"movl $10, %eax;"
"movl $10, %eax;"
"movl $20, %ebx;"
"addl %ebx, %eax;"
);
}
I tried -O3, -O2 and -O1; the result is the same:
Code: Select all
<main>
"movl $10, %eax;"
"movl $10, %eax;"
"movl $10, %eax;"
"movl $10, %eax;"
"movl $20, %ebx;"
"addl %ebx, %eax;"
...
What is wrong?
I guess compiler optimization should delete the first three lines:
Code: Select all
"movl $10, %eax;"
Whoohoo, that was a quick jump away from ARM, wasn't it?
Now, x86 coding is on the menu, right?
OK, to make it short and sweet here is a little quote:
"The presence of an __asm block affects optimization in several ways. First, the compiler doesn't try to optimize the __asm block itself. What you write in assembly language is exactly what you get."
How should the compiler know what you are trying to do? When you use inline asm you are on your own. In general, the use of inline asm is discouraged, because it gets in the compiler's way and prevents overall optimization. So if you use it, you are expected to know what you are doing.
Best regards
darkelf
I flout Chuck Norris, Spongebob barbecues underwater!
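The removal Cristianu expected (dropping the repeated stores to %eax) is a dead-store elimination. The compiler will not do it inside `__asm__`, but a text-level pass can. The Python sketch below is a deliberately tiny model, not a real optimizer: it only understands register/immediate forms of `movl` and `addl` in AT&T syntax, treats any other line as reading every register, and is valid only for straight-line code such as a trace. `drop_dead_moves` is a hypothetical name.

```python
import re

# Two-operand AT&T syntax, e.g. "movl $10, %eax;"
INSN = re.compile(r"^\s*(\w+)\s+([^,\s]+)\s*,\s*([^,\s;]+)\s*;?\s*$")


def effects(line):
    """Return (op, reads, writes) for a modelled instruction, else None."""
    match = INSN.match(line)
    if not match:
        return None
    op, src, dst = match.groups()
    if op not in ("movl", "addl") or not dst.startswith("%"):
        return None
    if src.startswith("%"):
        reads = {src}
    elif src.startswith("$"):
        reads = set()
    else:
        return None              # memory operands are not modelled
    if op == "addl":
        reads = reads | {dst}    # add also reads its destination
    return op, reads, {dst}


def drop_dead_moves(lines):
    """Remove movl instructions whose result is overwritten before being read."""
    overwritten = set()          # registers rewritten later with no read in between
    kept = []
    for line in reversed(lines):
        info = effects(line)
        if info is None:
            overwritten = set()  # unknown instruction: assume it reads everything
            kept.append(line)
            continue
        op, reads, writes = info
        if op == "movl" and writes <= overwritten:
            continue             # dead store: nothing reads this value
        overwritten = (overwritten | writes) - reads
        kept.append(line)
    kept.reverse()
    return kept


example = ["movl $10, %eax;"] * 4 + ["movl $20, %ebx;", "addl %ebx, %eax;"]
cleaned = drop_dead_moves(example)
```

On the example from the previous post, the walk from the end keeps the last store to %eax and drops the three before it. Registers are assumed live at the end of the sequence, so the final instructions are never removed.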
darkelf wrote: Whoohoo, that was a quick jump away from ARM, wasn't it?
Now, x86 coding is on the menu, right?

If it doesn't work with x86, it doesn't work with ARM.
Am I right?
ARM coding is still on the menu.

Best regards
Cristianu