
Thread: Watermarking by linking order


  1. #1


    Inspired by this thread:
    http://www.woodmann.com/forum/showthread.php?13913-Watermarking-application&p=88531#post88531

    and in particular from the contents of this post

    ...Others can correct me if I am wrong here, but I believe what IDA does, on top of what others have said, is change the link order of its various object files during the linking stage.

    For example if the compile process ended up with the following objects

    file1.o, file2.o, file3.o

    You could change the order in which they are linked together, giving an individualised watermark. Now imagine doing that with the hundreds of object files that IDA most likely has: you would have loads of combinations to use.

    And personally I don't think it's an easy task to remove since you would need to move the order of the linked in objects to alter the watermark which means relative addresses within the program would need to be updated.
    A mini-project is proposed to study how to reverse/defeat/handle this (clever) way of creating a watermark. As mentioned in the above post, it may not be easy to reorder the objects/functions in the executable (.exe/.dll), because addresses would then point to the wrong locations. It turns out that IDA and its scripting functionality (IDC) may be used to achieve the reordering without having to make it a BIG project. This is a mini-project.
    With IDA and IDC the reordering can be automated, which is quite convenient, because for applications with many object files it may not be safe to reorder just a subset of the object files. It would be safer to create a whole new watermark/permutation of all object files.
    This mini-project is just as much about getting hands-on experience with IDC and having fun.
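To put a rough number on the "loads of combinations" in the quoted post: n object files can be linked in n! different orders, so the link order can carry at most log2(n!) bits. Three objects give only 3! = 6 orders (about 2.6 bits), while 100 objects give roughly 525 bits. The sketch below is in Python rather than IDC so it can be run outside IDA. The function names and the idea of numbering copies with a `customer_id` are assumptions made for illustration; nothing here is known about how IDA is actually built.

```python
import math

def watermark_bits(n_objects):
    """Upper bound on the bits that the link order of n objects can encode."""
    return math.log2(math.factorial(n_objects))

def order_for_id(objects, customer_id):
    """Map an integer in [0, n!) to a unique link order (factorial number system)."""
    pool = list(objects)
    if not 0 <= customer_id < math.factorial(len(pool)):
        raise ValueError("customer_id out of range for this many objects")
    order = []
    for radix in range(len(pool), 0, -1):
        # Each choice among the remaining objects is one "digit" of the id.
        index, customer_id = divmod(customer_id, math.factorial(radix - 1))
        order.append(pool.pop(index))
    return order

def id_for_order(objects, order):
    """Inverse of order_for_id: recover the integer from an observed link order."""
    pool = list(objects)
    customer_id = 0
    for name in order:
        index = pool.index(name)
        customer_id += index * math.factorial(len(pool) - 1)
        pool.pop(index)
    return customer_id

if __name__ == "__main__":
    objs = ["main.obj", "file1.obj", "file2.obj"]
    for n in range(math.factorial(len(objs))):
        print(n, order_for_id(objs, n))
    print("bits for 100 objects:", round(watermark_bits(100)))
```

Under this (hypothetical) numbering, id 0 is the order main.obj, file1.obj, file2.obj and id 5 is the fully reversed order. Reading a watermark back is just `id_for_order` applied to the order observed in the binary.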

    In order to get started I have created a toy-application. All the application does is to print two strings.

    Code:
    /* main.c */
    
    extern void func1object1();
    extern void func1object2();
    
    void main()
    {
    	func1object1();
    	func1object2();
    }
    
    /* file1.c */
    
    #include <stdio.h>
    
    void func1object1()
    {
    	printf("Hello from object 1!\n");
    }
    
    /* file2.c */
    
    #include <stdio.h>
    
    void func1object2()
    {
    	printf("Hello from object 2!\n");
    }
    From these three very simple files, two applications are built; the only difference between them is the linking order of the object files. This makefile

    Code:
    SRCS = main.c file1.c file2.c
    
    OBJS1 = main.obj file1.obj file2.obj
    OBJS2 = file2.obj file1.obj main.obj 
    
    CC        = CL
    CCFLAGS   = /O2 /Oi /D "_MBCS" /FD /EHsc /MD /Gy /W3 /c /Zi /TC
                
    
    LINK       = link
    LINKFLAGS1 = "/OUT:watermark1.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 
    LINKFLAGS2 = "/OUT:watermark2.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 
    
    EC = echo
    RM = del
    
    default: all
    
    
    clean:
    	@$(RM) /F *.obj
    	@$(RM) /F *.idb
    	@$(RM) /F *.pdb
    	@$(RM) /F *.exe
    	@$(RM) /F *manifest*
    
    %.obj : %.c 
    	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
    	@$(EC) ************************************************
    	@$(EC) * Compiling $@
    	$(CC)  $(CCFLAGS) $<
    
    watermark1.exe: $(OBJS1)
    	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
    	$(LINK) $(LINKFLAGS1) $(OBJS1)
    	$(LINK) $(LINKFLAGS2) $(OBJS2)
    
    all: watermark1.exe
    creates the two .exe files, watermark1.exe and watermark2.exe (the watermark1.exe rule links both). A zip file containing all the files is attached.
    Maybe not surprisingly, for this example, the order of the objects in the binary corresponds to the order in which they are listed in the linker command. The idea is to create watermark2.exe from watermark1.exe.
    I hope this example is not too simple. Maybe it will be much harder with C++ code; I have no idea. I'm not sure whether it is possible to identify the objects themselves, but the functions can be identified (by IDA), and IDC (as far as I understand it now) provides functionality for jumping to a specified function, or just to the next function in the code given some virtual address.

    Does this make any sense at all?

  2. #2
    dELTA (Administrator; joined Oct 2000; location: Ring -1; 4,206 posts)
    Nice introduction and starting documentation. I'm very much looking forward to seeing your progress in this project.

    And yes, it makes sense indeed.
    "Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."

  3. #3
    Thanks for the encouragement.

    Just came back to this mini-project after I let myself be interrupted by a crackme (my first .NET reversing), and that crackme was driving me nuts. I had virtually the complete source code (dotfuscated) and I still couldn't solve it!? It was quite a frustrating struggle, as you can imagine.

    Anyway, have just written and run my first IDC script. The script is basically a copy of an example in this book http://www.idabook.com/ p. 268.

    The script enumerates the functions identified by IDA. It looks like this:

    Code:
    #include <idc.idc> // Mandatory include directive
    
    static main()
    {
        // Step one, enumerate/list functions
        GetFunctions();	
    }
    
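    // Walk all functions IDA has identified, in address order, and print each one.
    static GetFunctions()
    {
        auto addr, name;
        // NextFunction(addr) returns the start of the first function after addr,
        // or BADADDR when there are no more functions.
        for(addr = NextFunction(0); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);  // the name IDA assigned to this address
            Message("Function: %s at %x\n", name, addr);
        }
    }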
    When run on watermark1.exe it produces the following output (before you read on, guess how many functions IDA finds):

    Code:
    Compiling file 'C:\rce\LinkOrder\linkorder.idc'...
    Executing function 'main'...
    Function: _main at 401000
    Function: sub_401010 at 401010
    Function: sub_401020 at 401020
    Function: _pre_cpp_init at 40102d
    Function: ___tmainCRTStartup at 401078
    Function: $LN31 at 4011ee
    Function: start at 4012cf
    Function: ?__CxxUnhandledExceptionFilter@@YGJPAU_EXCEPTION_POINTERS@@@Z at 4012d9
    Function: $LN5 at 40131b
    Function: _amsg_exit at 40132a
    Function: __onexit at 401330
    Function: $LN8 at 4013cc
    Function: _atexit at 4013d5
    Function: sub_4013EC at 4013ec
    Function: sub_401412 at 401412
    Function: _XcptFilter at 401438
    Function: __ValidateImageBase at 401440
    Function: __FindPESection at 401480
    Function: __IsNonwritableInCurrentImage at 4014d0
    Function: _initterm at 40158e
    Function: _initterm_e at 401594
    Function: __SEH_prolog4 at 40159c
    Function: __SEH_epilog4 at 4015e1
    Function: __except_handler4 at 4015f5
    Function: __setdefaultprecision at 40161a
    Function: sub_401645 at 401645
    Function: ___security_init_cookie at 401648
    Function: ?terminate@@YAXXZ at 4016de
    Function: _unlock at 4016e4
    Function: __dllonexit at 4016ea
    Function: _lock at 4016f0
    Function: sub_4016F6 at 4016f6
    Function: _except_handler4_common at 401706
    Function: _invoke_watson at 40170c
    Function: _controlfp_s at 401712
    Function: ___report_gsfailure at 401718
    Function: _crt_debugger_hook at 40181e
    We wrote 3 simple functions but IDA identifies 37!
    It is not clear, at least not to me at this point, whether these extra functions can be filtered out or ignored for the reordering. At this stage they are ignored. Another thing that is not considered yet is whether the data is part of the watermark. Right now only the functions are considered.
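As a small aid for looking at such a listing outside IDA, the sketch below parses the "Function: name at addr" lines printed by the script and reports the distance from each function start to the next. It is Python rather than IDC so it can be run on a saved copy of the output window; the helper names are made up for this example. Note that the distance to the next function is only an upper bound on a function's size, since it includes any padding the linker inserted.

```python
import re

# One line of the IDC script's output, e.g. "Function: _main at 401000".
LINE = re.compile(r"^Function: (\S+) at ([0-9a-fA-F]+)$")

def parse_function_log(text):
    """Return [(name, start_address), ...] in the order the lines appear."""
    funcs = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            funcs.append((match.group(1), int(match.group(2), 16)))
    return funcs

def spans(funcs):
    """Pair each function (except the last) with the distance to the next start."""
    return [(name, start, nxt - start)
            for (name, start), (_, nxt) in zip(funcs, funcs[1:])]

if __name__ == "__main__":
    log = """\
Executing function 'main'...
Function: _main at 401000
Function: sub_401010 at 401010
Function: sub_401020 at 401020
Function: _pre_cpp_init at 40102d
"""
    for name, start, gap in spans(parse_function_log(log)):
        print("%-12s %x  next function after %d bytes" % (name, start, gap))
```

For the first lines of the output above, this shows the three functions we wrote sitting on 16-byte boundaries at the very start of the code, directly followed by the runtime's own functions.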

    I'm going to read some more to find out which IDA functions can be used for reordering the functions, and which data structure supported by IDA can be used to store the functions in preparation for the actual reordering.

  4. #4
    dELTA
    The other functions that were detected are most likely just standard library functions of the compiler/linker. You can see that IDA even identified a majority of them from its standard signatures.

    If I were you I'd ignore those in the first stage of this project (some of them could have some quite annoying optimizations that will make trouble at the beginning of a project like this), and first only focus on your own functions (they will most likely be adjacent in the binary, and thus possible to rearrange independently of the library functions).

    Btw, at a later stage you should probably take a look at import table reordering too, since this is a very simple and efficient way to watermark an exe file.

  5. #5
    Kayaker (Teach, Not Flame; joined Oct 2000; 4,124 posts)
    Boy, doesn't that illustrate the simple beauty of a program coded in ASM?

    I created MAP files of both exe's and compared them with UltraEdit/Text Compare. The only differences recorded were the following:

    Code:
    watermark1:
    
     0001:00000000       _main
     0001:00000020       sub_401020
     0002:000000E0       aHelloFromObject2
    
    
    watermark2:
    
     0001:00000000       sub_401000
     0001:00000020       _main
     0002:000000E0       aHelloFromObject1

    In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.

    I'm just thinking out loud here.. Let's say one devises a script to swap procs 1 and 3 (having determined that that's the strategy needed) and also fixes up the jump/call relative addresses. But add a small layer of complexity, i.e. say the next time procs 1 and 3 are of *different* byte lengths.. that means we also have to deal with moving/fixing proc 2 as well.

    Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.

    I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

    Simple:
    swap 2 identified procs of the same size - no functions in between are affected
    fix up relative jump/call addresses
    done?

    Not as simple:
    swap 2 identified procs of *different* size - all functions in between are affected
    fix up relative jump/call addresses of *all* affected code
    done?

    Crazy:
    swap around many procs of varying sizes, fixing up all affected code
    ?improbable?
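The "simple" case can be sketched on a toy byte image. This is Python, not IDC, and it is only an illustration under stated assumptions: the image is hand-built (it is not the real watermark1.exe), it contains only `call rel32` (opcode E8) instructions, and the list of call sites is assumed to come from a disassembler such as IDA rather than from scanning for E8 bytes. The one fact it relies on is that an E8 call encodes `target - (address_of_call + 5)`, so a call needs patching whenever the call itself or its target moves.

```python
import struct

CALL_OPCODE = 0xE8   # x86 "call rel32"
CALL_LEN = 5         # opcode byte + 4-byte signed displacement

def call_target(image, base, site):
    """Absolute target of the call rel32 located at absolute address `site`."""
    off = site - base
    if image[off] != CALL_OPCODE:
        raise ValueError("no call rel32 at %#x" % site)
    (rel,) = struct.unpack_from("<i", image, off + 1)
    return site + CALL_LEN + rel

def retarget_call(image, base, site, new_target):
    """Rewrite the displacement so the call at `site` reaches `new_target`."""
    off = site - base
    if image[off] != CALL_OPCODE:
        raise ValueError("no call rel32 at %#x" % site)
    struct.pack_into("<i", image, off + 1, new_target - (site + CALL_LEN))

def swap_equal_sized(image, base, a, b, size, call_sites):
    """Swap two functions of identical size and patch the listed call sites."""
    def moved(addr):
        if a <= addr < a + size:
            return addr - a + b
        if b <= addr < b + size:
            return addr - b + a
        return addr  # anything else (e.g. the middle function) stays put
    # Record targets before moving bytes, while the displacements are still valid.
    targets = {site: call_target(image, base, site) for site in call_sites}
    oa, ob = a - base, b - base
    image[oa:oa + size], image[ob:ob + size] = image[ob:ob + size], image[oa:oa + size]
    for site, target in targets.items():
        retarget_call(image, base, moved(site), moved(target))

def build_toy_image(base):
    """Three 16-byte 'functions': main (two calls + ret), f1 (ret), f2 (nop; ret)."""
    def call(site, target):
        return bytes([CALL_OPCODE]) + struct.pack("<i", target - (site + CALL_LEN))
    def pad(code):
        return code + b"\xCC" * (16 - len(code))
    main = call(base, base + 0x10) + call(base + 5, base + 0x20) + b"\xC3"
    return bytearray(pad(main) + pad(b"\xC3") + pad(b"\x90\xC3"))

if __name__ == "__main__":
    base = 0x401000
    image = build_toy_image(base)
    swap_equal_sized(image, base, base, base + 0x20, 16, [base, base + 5])
    for site in (base + 0x20, base + 0x25):  # the calls now live in the moved main
        print("call at %x -> %x" % (site, call_target(image, base, site)))
```

Absolute operands (such as a `push offset` of a string) are deliberately left out: they do not change when the code containing them moves, only when the thing they point at moves. Jumps, short branches and jump tables would need the same kind of treatment as the calls.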

    I suppose the other thing too is understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?


    Kayaker

  6. #6
    dELTA
    Glad to have you in the discussion Kayaker.


    Originally posted by Kayaker:
    In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.
    Yes, my viewpoint from the start has been that you must be prepared to move around all functions in the executable for a procedure like this, exactly because of such alignment problems combined with the fact that very few functions will be of the exact same size, and thus not "switchable in-place".


    Originally posted by Kayaker:
    Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.
    As long as you have generic code to relocate a function to any position, why would it really be so much worse to move them all around than to move just a few? I'm sure the computer won't complain too much about one for loop being iterated a few more times? The only possible problem I can think of that increases with the number of simultaneously relocated functions is that there might be functions that are "harder to relocate" (due to crazy compiler optimizations or dynamic address resolutions of different kinds, which IDA therefore won't catch when analyzing/decompiling it). Other than that, am I missing something?
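A minimal sketch of what such "generic code" might look like, in Python rather than IDC so it can be tried standalone. The function sizes in the usage example are hypothetical (they are not measured from watermark1.exe), and the 16-byte alignment is an assumption based on the addresses in the listing earlier in the thread. The point is only that once every function has a new start address, one translation function covers every rel32 fixup, however many functions are moved.

```python
def align_up(value, alignment):
    """Round `value` up to the next multiple of `alignment`."""
    return (value + alignment - 1) // alignment * alignment

def plan_layout(funcs, new_order, base, alignment=16):
    """funcs: {name: (old_start, size)}. Return {name: new_start} for new_order."""
    layout = {}
    cursor = base
    for name in new_order:
        cursor = align_up(cursor, alignment)
        layout[name] = cursor
        cursor += funcs[name][1]
    return layout

def make_translator(funcs, layout):
    """Build a function mapping an old address to its address after the move."""
    def translate(addr):
        for name, (start, size) in funcs.items():
            if start <= addr < start + size:
                return layout[name] + (addr - start)
        return addr  # not inside any relocated function: unchanged
    return translate

def new_rel32(translate, old_site, old_target, insn_len=5):
    """Displacement for a rel32 call/jmp when either end may have moved."""
    return translate(old_target) - (translate(old_site) + insn_len)

if __name__ == "__main__":
    # Hypothetical sizes, chosen so that the functions differ in length.
    funcs = {"main": (0x401000, 0x0B), "f1": (0x401010, 0x0D), "f2": (0x401020, 0x1D)}
    layout = plan_layout(funcs, ["f2", "f1", "main"], base=0x401000)
    translate = make_translator(funcs, layout)
    for name, start in layout.items():
        print("%-4s -> %x" % (name, start))
    # A call at the start of main that used to target f1:
    print("new displacement:", new_rel32(translate, 0x401000, 0x401010))
```

Two things this sketch does not do: it does not check that the new layout still fits in the space the functions originally occupied, and `insn_len=5` only matches E8/E9 instructions (near conditional jumps, for example, are 6 bytes long).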


    Originally posted by Kayaker:
    I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

    Simple:
    swap 2 identified procs of the same size - no functions in between are affected
    fix up relative jump/call addresses
    done?

    Not as simple:
    swap 2 identified procs of *different* size - all functions in between are affected
    fix up relative jump/call addresses of *all* affected code
    done?

    Crazy:
    swap around many procs of varying sizes, fixing up all affected code
    ?improbable?
    Again, as long as the "simple script" doesn't have hardcoded addresses for some special program or something stupid like that, and with my reservations above, I can't really see the problem, either in terms of coding complexity or execution time complexity. Please tell me what I'm missing, oh great god of the kayak!

    Originally posted by Kayaker:
    I suppose the other thing too is understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?
    First of all, there is one VERY big and important difference between CRC checks and watermarks, which is also exactly what makes watermarks such a pain in the ass. CRC checks are performed by the application itself, and can therefore, just as you say, be easily found, reversed and/or neutralized. The problem with watermarks is that the checking code is contained in a completely separate program, locked into a safe (or ok, most likely on a crappy unpatched Windows server, but anyway) inside the premises of the software author. It is only taken out and used locally at their office when the software author finds a leaked/warezed version of their software on the net, in order to subsequently sue the crap out of the person that the watermark reveals to be the source of the leak.

    Thus, no checking code is available for our analysis (unless you offer to burglarize the IDA Pro offices and steal it of course, which I'm sure would make you quite popular with lots of people here), and thus each and every bit of information inside the executable could potentially be part of a secret watermark, cleverly steganographed into functionally important parts of the application. So, contrary to the common solution for removing a CRC check in a program (patching the check, or in rarer cases reversing the CRC algo and adapting the patch data to result in the same checksum), the only way to "remove" watermarks is to mess up the binary file in each and every way and dimension in which you think information might be implicitly stored to form part of the watermark, while still keeping it fully functional, and that's why we're here today!

    As mentioned in the thread referenced at the top of this thread, there are apparently rumours that e.g. IDA Pro uses the linking order of its object files to create one (out of many?) such piece of watermark entropy for each IDA Pro copy. Thus the idea of this mini-project was born, with the primary scope of investigating how easy it would be to re-shuffle all the functions in an arbitrary executable, in order to create a generic "crack" for exactly that type of watermarking technology.

    Future steps (probably much needed in order to reach a practical result) in the "creation of the ultimate generic watermark defeater tool" would probably be a similar (but comparatively simpler) import table shuffler, export table shuffler, relocation table shuffler, PE resource shuffler, and a code-location-independent function and data area diffing tool. The latter would check for any differences within functions that are not related to their location (thus ignoring the differing call and jump addresses inside their code), e.g. to see if there are any differences in the instructions used in sub-areas of functions, differences in data ordering, or tracking data in PE headers or code caves.
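The location-independent diffing idea can be sketched very simply: blank out the bytes that are expected to differ only because code moved, then compare what is left. The Python below is an illustration only. It assumes the operand positions (`operand_spans`, a made-up name) are supplied by a disassembler such as IDA; finding them is the hard part and is not shown.

```python
def mask_operands(code, operand_spans):
    """Return a copy of `code` with every (offset, length) span zeroed out."""
    masked = bytearray(code)
    for offset, length in operand_spans:
        masked[offset:offset + length] = bytes(length)
    return bytes(masked)

def same_modulo_location(code_a, spans_a, code_b, spans_b):
    """True if two functions are identical once location-dependent bytes are ignored."""
    return mask_operands(code_a, spans_a) == mask_operands(code_b, spans_b)

if __name__ == "__main__":
    # The same "call rel32; ret" at two different places in two builds:
    build1 = bytes.fromhex("e80b000000c3")
    build2 = bytes.fromhex("e8ebffffffc3")
    # Same displacement as build1, but a nop where build1 has a ret:
    tampered = bytes.fromhex("e80b00000090")
    rel32 = [(1, 4)]  # the 4 displacement bytes following the E8 opcode
    print(same_modulo_location(build1, rel32, build2, rel32))
    print(same_modulo_location(build1, rel32, tampered, rel32))
```

Any function pair that still differs after masking is a candidate for carrying watermark data that has nothing to do with link order.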

    This mini project is both a great first step and a very good mini project though! Well, until you answer my questions above and tell me it's impossible, but anyway.
