How to pass an obfuscated program's trace log through a compiler optimizer?

Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am


Post by Cristianu »

With the help of this GDB script:

file ./program
b *0x12345
run
while 1
x/i $pc
ni
end
quit



I got a trace log of an obfuscated program:


...

0x484e0: bx lr
?? ()
0x43d88: b 0x43db8
?? ()
0x43db8: ldr r3, [r11, #-16]
?? ()
0x43dbc: mov r0, r3
?? ()
0x43dc0: sub sp, r11, #12
?? ()
0x43dc4: pop {r4, r5, r11, pc}
?? ()
0x3fb94: ldr r3, [r11, #-8]
?? ()
0x3fb98: mov r0, r3
?? ()
0x3fb9c: sub sp, r11, #4
?? ()
0x3fba0: pop {r11, pc}
?? ()
0x3da68: ldr r3, [r11, #-8]

...


Kris Kaspersky writes that it is a good idea to pass the tracer's log through a compiler optimizer to understand the program better. In that case I would get an equivalent executable with more readable disassembled code. But I have no idea which compilers to use or how to use them.

P.S. How do I get rid of the unnecessary "?? ()" lines? And how do I redirect GDB's output to a file?
OHPen
Posts: 399
Joined: Wed Nov 06, 2002 1:20 pm
Location: .text

Post by OHPen »

Hi Cristianu,

I don't want to offend you, but you are obviously not aware of what you are talking about. There is no tool that lets you take a gdb trace log, paste it into a file and have that file processed by something that does compiler optimization.
Kris Kaspersky is talking theoretically about which technologies could be used to build a proper deobfuscator by "misusing" compiler optimization algorithms, although I doubt that he has more than a POC ;)

If you really want to do something like that, keep in mind that you will have to write your own tools; there is no way around this. Nowadays most people use existing frameworks for tasks like this, and the most used one is the LLVM project. But be prepared to study that stuff for the next year at least ;D

Another possible, but in my opinion less professional, approach would be to write your own deobfuscator that works on the text output of gdb. Have a look at the blogs here, going back one or two years. I think you will find a project that did something like what you want, to deobfuscate obfuscated virtual machine handlers. The difference is simply that the guy who wrote that deobfuscator dealt with x86 code instead of ARM.

If we are talking about only a few hundred lines of code, you could also use a piece of paper and a pencil!

Nevertheless, all of this will end up as a long project!
- Reverse Engineering can be everything, but sometimes it's more than nothing. Really rare moments but then they appear to last ages... -
blabberer
Senior Member
Posts: 1535
Joined: Wed Dec 08, 2004 11:12 am

Post by blabberer »

I can't find anything on Google where Kris Kaspersky talks about feeding a raw disassembly to some compiler optimizer and getting back a super disassembly,

so I refrained from replying earlier.

Since OHPen has burst the bubble, I'll chime in too and say that there is no method that would get you an almost reassemblable disassembly from an obfuscated disassembly.

Yes, many individual efforts exist, but afaik they are primarily x86 and they all still have a long, long way to go before being declared near perfect.

Anyway, I'll answer the minor questions and leave the compiler-optimization part alone. My example is x86; convert it to ARM yourself.

First, redirecting the output to a file:

If your gdb is a newer version, you can use "set logging file <name>" followed by "set logging on".

If you have a Linux like my DAMN SMALL LINUX, running with low RAM in a VM on a Windows host :), where the available gdb package is old and does not have the set logging command, you can use the following method:

Code: Select all


[email protected]:~$ cat helloworld.c
#include <stdio.h>

int main(void)
{
    printf("hi Damn Small Linux This is My First Proggie\n");
    return 0;
}


[email protected]:~$ gcc helloworld.c -o cristianu


[email protected]:~$ ./cristianu
hi Damn Small Linux This is My First Proggie
[email protected]:~$ 



[email protected]:~$ cat foo
file ./cristianu
set disassembly-flavor intel
set annotate 0
set max-symbolic-offset 0
set print address off
set complaints 0
b main
run
while 1
x/i $pc
ni
end
quit
[email protected]:~$ 

[email protected]:~$ gdb -q < foo > cristlog >&1
gdb: Symbol `emacs_ctlx_keymap' has different size in shared object, consider re-linking
No symbol table is loaded.  Use the "file" command.
No registers.
[email protected]:~$ 




[email protected]:~$ cat cristlog
(gdb) Reading symbols from ./cristianu...(no debugging symbols found)...done.
(gdb) (gdb) (gdb) (gdb) (gdb) (gdb) Breakpoint 1 at 0x804838a
(gdb) Starting program: /home/dsl/cristianu 
(no debugging symbols found)...(no debugging symbols found)...
Breakpoint 1, main ()
(gdb)  > > >0x804838a <main+6>: and    esp,0xfffffff0
main ()
0x804838d <main+9>:     mov    eax,0x0
main ()
0x8048392 <main+14>:    sub    esp,eax
main ()
0x8048394 <main+16>:    mov    DWORD PTR [esp],0x80484e0
main ()
0x804839b <main+23>:    call   0x80482b0 <_init+56>
main ()
0x80483a0 <main+28>:    mov    eax,0x0
main ()
0x80483a5 <main+33>:    leave  
main ()
0x80483a6 <main+34>:    ret    
__libc_start_main () from /lib/libc.so.6
0x4002ee3e <__libc_start_main+206>:     mov    DWORD PTR [esp],eax
__libc_start_main () from /lib/libc.so.6
0x4002ee41 <__libc_start_main+209>:     call   0x40044a30 <exit>
hi Damn Small Linux This is My First Proggie

Program exited normally.
(gdb) [email protected]:~$ 

[email protected]:~$ grep -i main+ cristlog > cristasm
[email protected]:~$ cat crist  
cristasm   cristianu  cristlog
[email protected]:~$ cat cristasm
(gdb)  > > >0x804838a <main+6>: and    esp,0xfffffff0
0x804838d <main+9>:     mov    eax,0x0
0x8048392 <main+14>:    sub    esp,eax
0x8048394 <main+16>:    mov    DWORD PTR [esp],0x80484e0
0x804839b <main+23>:    call   0x80482b0 <_init+56>
0x80483a0 <main+28>:    mov    eax,0x0
0x80483a5 <main+33>:    leave  
0x80483a6 <main+34>:    ret    
0x4002ee3e <__libc_start_main+206>:     mov    DWORD PTR [esp],eax
0x4002ee41 <__libc_start_main+209>:     call   0x40044a30 <exit>
[email protected]:~$ 


[email protected]:~$ sed s/.*main+.*:.//g cristasm > prettycristasm
[email protected]:~$ cat prettycristasm
and    esp,0xfffffff0
mov    eax,0x0
sub    esp,eax
mov    DWORD PTR [esp],0x80484e0
call   0x80482b0 <_init+56>
mov    eax,0x0
leave  
ret    
mov    DWORD PTR [esp],eax
call   0x40044a30 <exit>
[email protected]:~$ 
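
The grep and sed steps above key on "main+", so they would miss the ARM trace from the first post, where GDB prints bare addresses and "?? ()" frames. A minimal Python sketch of the same filtering that keys on the address instead is shown here. The function name extract_instructions and the regular expression are my own choices for illustration, not something GDB provides, and the pattern is only checked against the two log shapes quoted in this thread.

```python
import re

# One `x/i $pc` line: an address, an optional <symbol+offset> tag,
# a colon, then the instruction text. Prompt debris such as
# "(gdb)  > > >" in front of the address is skipped by using search().
INSN_RE = re.compile(r"(0x[0-9a-fA-F]+)(?:\s+<[^>]*>)?:\s*(.+?)\s*$")


def extract_instructions(log_text):
    """Return (address, instruction) pairs found in a GDB trace log.

    Lines without an "address:" prefix, such as "?? ()" or "main ()",
    are dropped.
    """
    result = []
    for line in log_text.splitlines():
        match = INSN_RE.search(line)
        if match:
            address = int(match.group(1), 16)
            # Collapse the column padding GDB puts between mnemonic and operands.
            instruction = " ".join(match.group(2).split())
            result.append((address, instruction))
    return result


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python filter_trace.py < cristlog
    for address, instruction in extract_instructions(sys.stdin.read()):
        print("0x%x: %s" % (address, instruction))
```

Keeping the address next to each instruction makes it easier to spot where the trace jumps between functions.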
Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am

Post by Cristianu »

OK, thank you for the responses!
Now it is clear that deobfuscation tooling is still a thing of the near future.
Nevertheless, a trace listing gives us some benefits: we have the real sequence of executed instructions.
It is possible to write a gdb script that outputs each instruction together with the current register state.
Then it is possible to find the value you need with Ctrl+F.
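
The Ctrl+F step can also be scripted. Below is a rough Python sketch that assumes the gdb loop runs "info registers" right after "x/i $pc", so each instruction line is followed by register lines of the usual "name  hex  natural" shape. That log layout and the function name find_value are assumptions of mine; nobody in this thread posted such a log.

```python
import re

# Instruction line printed by `x/i $pc`.
INSN_RE = re.compile(r"(0x[0-9a-fA-F]+)(?:\s+<[^>]*>)?:\s*(.+?)\s*$")
# Register line in the assumed `info registers` shape: name, hex value, ...
REG_RE = re.compile(r"^\s*([a-z][a-z0-9_]*)\s+(0x[0-9a-fA-F]+)\b")


def find_value(log_text, wanted):
    """Return (address, instruction, register) for every register line
    that holds `wanted`, paired with the instruction line printed before it.

    With `x/i $pc` printed first, the registers show the state just
    before that instruction executes.
    """
    hits = []
    current = None  # last instruction line seen
    for line in log_text.splitlines():
        reg = REG_RE.match(line)
        if reg:
            if current is not None and int(reg.group(2), 16) == wanted:
                hits.append((current[0], current[1], reg.group(1)))
            continue
        insn = INSN_RE.search(line)
        if insn:
            current = (int(insn.group(1), 16), " ".join(insn.group(2).split()))
    return hits
```

Calling find_value(log_text, 0x2a) lists every point in the trace where some register held 0x2a, which is usually quicker than scrolling through thousands of steps by hand.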
blabberer wrote: i dont find google showing me where kris kaspersky is talking about inputing a raw disassembly to some compiler optimizer and getting back super disassembly
That is not surprising :)
He wrote about it in his Russian book "Art of disassembling" (a literal translation of the title).

Could you recommend some books to improve my skills in reverse engineering (ARM-oriented books are preferable)?
I would like to become a superhacker. :)
What should I study? Compilers, cryptography, deobfuscation theory, what else?
Any help would be appreciated. :)
blabberer
Senior Member
Posts: 1535
Joined: Wed Dec 08, 2004 11:12 am

Post by blabberer »

I don't know about ARM; I never had the need to hack ARM, but I believe I would be able to do it in a few sessions if I put my head down to it,

as I think it is the basics that must be solid, not the implementation details; x86 is one implementation, like ARM is another.

Anyway, adopting the KISS principle (keep it simple and <......> (sir, stupid sir, straightforward sir, shitty sir, s......sir)),

I would go about it like this:

grab a simple crackme

find ways to run it, as in installing the OS, framework, etc.

when it runs, find ways to open it raw and visually look at its guts, i.e. using any text readers, binary readers

then put it in a comatose state and look at its guts sequentially, i.e. using debuggers, disassemblers, descripters, dewhateverss

when I am comfortable with its inner workings (as in I can say in my dreams what ldr r3 #somereg, r18 would mean in any context)

I would start poking into its interaction with the OS / framework / VM (i.e. a few round trips into R0, ring 0, as they would say in x86)

and henceforth simply try trapping everything in R0, where simple r3 (ring 3) obfuscations won't matter :)

Hope I live up to my nick's real meaning :)
Darkelf
Posts: 222
Joined: Wed Jan 24, 2007 7:20 pm

Post by Darkelf »

Cristianu wrote: Could you recommend some books to improve my skills in reverse engineering (ARM-oriented books are preferable)?
Well, there is Steve Furber's book widely known as "the ARM bible":
http://www.amazon.co.uk/exec/obidos/ASI ... 09-1571011

and you can start here:
http://www.ee.ic.ac.uk/pcheung/teaching/ee2_computing/

Have fun.

Regards
darkelf
I flout Chuck Norris, Spongebob barbecues underwater!
Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am

Post by Cristianu »

Thank you for the responses, guys! :)
It was a very useful discussion.
Good luck to everyone! ;)
disavowed
Posts: 1290
Joined: Mon Apr 01, 2002 3:00 pm

Post by disavowed »

In response to your original question, one idea would be as follows:
  1. Take your original disassembly and create a C program with it (using __asm__).
  2. Compile the C program into a binary.
  3. Use Hex-Rays on the binary to decompile that program.
  4. Now take the decompilation, and create a new C program with that decompiled C code.
  5. Disassemble that new program, and you should have your "optimized" disassembly.
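
Step 1 of the list above can be automated. Here is a minimal Python sketch that wraps a list of instruction strings in a C function using a basic __asm__ statement. The names trace_to_c_source and traced_block are made up for the example. It only produces the text; whether the result assembles depends on the trace, and branches to absolute addresses or epilogues such as "pop {r11, pc}" would most likely have to be rewritten by hand first.

```python
def trace_to_c_source(instructions, func_name="traced_block"):
    """Return C source that places `instructions` in one basic __asm__ block.

    Basic asm (no operand lists) is passed to the assembler as written,
    so only backslashes and double quotes need escaping for the C string.
    """
    if not instructions:
        raise ValueError("need at least one instruction")
    lines = ["void %s(void) {" % func_name, "    __asm__ volatile ("]
    for insn in instructions:
        escaped = insn.replace("\\", "\\\\").replace('"', '\\"')
        # One string literal per instruction; adjacent literals are
        # concatenated by the C compiler.
        lines.append('        "%s\\n\\t"' % escaped)
    lines.append("    );")
    lines.append("}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(trace_to_c_source(["ldr r3, [r11, #-8]", "mov r0, r3"]))
```

The generated file would then go through steps 2 to 5 with a compiler for the matching target, for example an ARM cross compiler for the trace in the first post.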
Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am

Post by Cristianu »

disavowed, great reply!
It is just what I need!
Thank you.
Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am

Post by Cristianu »

I've just tried to optimize this example in this way:

Code: Select all

#include <stdio.h>

int main(int argc, char **argv) {

    __asm__ ("movl $10, %eax;"
             "movl $10, %eax;"
             "movl $10, %eax;"
             "movl $10, %eax;"
             "movl $20, %ebx;"
             "addl %ebx, %eax;"
    );

}
I tried -O3, -O2 and -O1; the result is the same:

Code: Select all

<main>	
	"movl $10, %eax;"
	"movl $10, %eax;"
	"movl $10, %eax;"
	"movl $10, %eax;"
        "movl $20, %ebx;"
        "addl %ebx, %eax;"
	        ...
What is wrong?
I expected the compiler's optimizer to delete the first three lines:

Code: Select all

"movl $10, %eax;"
Darkelf
Posts: 222
Joined: Wed Jan 24, 2007 7:20 pm

Post by Darkelf »

Whoohoo, that was a quick jump away from ARM, wasn't it?

Now, x86 coding is on the menu, right?
OK, to make it short and sweet here is a little quote:
The presence of an __asm block affects optimization in several ways. First, the compiler doesn't try to optimize the __asm block itself. What you write in assembly language is exactly what you get.
How should the compiler know what you are trying to do? When you use inline asm you are on your own. In general, the use of inline asm is discouraged, because it gets in the compiler's way and prevents overall optimization. So if you use it, you are expected to know what you are doing.

Best regards
darkelf
Cristianu
Junior Member
Posts: 7
Joined: Mon Apr 09, 2012 11:13 am

Post by Cristianu »

Darkelf wrote: Whoohoo, that was a quick jump away from ARM, wasn't it? Now, x86 coding is on the menu, right?
It was just a test. :)
If it doesn't work with x86, it won't work with ARM.

Am I right?
ARM coding is still on the menu. :)

Best regards
Cristianu
disavowed
Posts: 1290
Joined: Mon Apr 01, 2002 3:00 pm

Post by disavowed »

The optimizing C compiler optimizes C code. It doesn't optimize inline assembly.
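
Since the compiler leaves the asm text alone, the removal Cristianu expected has to happen on the instruction text itself. A toy Python sketch of that one rule follows: drop a "movl $imm, %reg" when the very next instruction is another immediate load into the same register. The function name drop_overwritten_loads is made up, the pattern only understands the AT&T syntax used in the test above, and the rule assumes a straight-line trace. It is nowhere near a real deobfuscator.

```python
import re

# AT&T-syntax immediate load, e.g. `movl $10, %eax;`
MOV_IMM_RE = re.compile(r"^\s*movl\s+\$(-?\w+)\s*,\s*%(\w+)\s*;?\s*$")


def drop_overwritten_loads(instructions):
    """Remove immediate loads whose destination register is overwritten
    by another immediate load in the very next instruction.

    Nothing can read the register between the two loads, and mov does
    not touch the flags, so the first load has no visible effect.
    """
    kept = []
    for index, insn in enumerate(instructions):
        current = MOV_IMM_RE.match(insn)
        following = None
        if index + 1 < len(instructions):
            following = MOV_IMM_RE.match(instructions[index + 1])
        if current and following and current.group(2) == following.group(2):
            continue  # dead load: same register is reloaded immediately
        kept.append(insn)
    return kept
```

On the six-instruction test above this keeps one "movl $10, %eax;", then "movl $20, %ebx;" and "addl %ebx, %eax;".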