
Thread: Watermarking by linking order

  1. #1

    Watermarking by linking order

    Inspired by this thread
    http://www.woodmann.com/forum/showthread.php?13913-Watermarking-application&p=88531#post88531

    and in particular from the contents of this post

    ...Others can correct me if I am wrong here but I believe what IDA does on top of what others have said is change the linker order of its various object files during the linking stage.

    For example if the compile process ended up with the following objects

    file1.o, file2.o, file3.o

    You could change the order they are linked together, giving an individualised watermark. Now imagine doing that with the hundreds of object files that IDA most likely has: you would have loads of combinations you can use.

    And personally I don't think it's an easy task to remove since you would need to move the order of the linked in objects to alter the watermark which means relative addresses within the program would need to be updated.
    A mini-project is proposed to study how to reverse/defeat/handle this (clever) way of creating a watermark. As mentioned in the above post, it may not be easy to reorder the objects/functions in the executable (.exe/.dll), because addresses would then point to the wrong locations. It turns out that IDA and its scripting language (IDC) may be used to achieve the reordering without having to make it a BIG project; this is a mini-project.
    With IDA and IDC the reordering can be automated, which is convenient: for applications with many object files it may not be safe to reorder just a subset of them. It would be safer to create a whole new watermark, i.e. a new permutation of all the object files.
    This mini-project is just as much about getting hands-on experience with IDC and having fun.
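
To get a feel for how much information a link order can carry, here is a minimal Python sketch. It is only an illustration and an assumption about how a vendor might do it: the helper names `order_for_id` and `id_for_order` are made up, and the factorial number system is just one convenient way to map an ID to a permutation. With n object files there are n! possible orders.

```python
import math

def order_for_id(objects, watermark_id):
    """Map an integer ID to one unique ordering of the object files."""
    n = len(objects)
    if not 0 <= watermark_id < math.factorial(n):
        raise ValueError("ID out of range for this many objects")
    pool = sorted(objects)  # canonical starting order
    order = []
    for i in range(n, 0, -1):
        # Pick the next object using one digit of the factorial number system.
        index, watermark_id = divmod(watermark_id, math.factorial(i - 1))
        order.append(pool.pop(index))
    return order

def id_for_order(order):
    """Inverse mapping: recover the ID from an observed link order."""
    pool = sorted(order)
    watermark_id = 0
    for i, name in enumerate(order):
        index = pool.index(name)
        watermark_id += index * math.factorial(len(order) - 1 - i)
        pool.pop(index)
    return watermark_id

objects = ["main.obj", "file1.obj", "file2.obj"]
print("orderings for 3 objects:", math.factorial(len(objects)))
print("bits with 200 objects:", math.log2(math.factorial(200)))
for customer_id in range(math.factorial(len(objects))):
    print(customer_id, order_for_id(objects, customer_id))
```

The inverse function is what the checking side would need: given a leaked binary whose object order has been recovered, it returns the ID that was assigned to that copy.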

    In order to get started I have created a toy-application. All the application does is to print two strings.

    Code:
    /* main.c */
    
    extern void func1object1(void);
    extern void func1object2(void);
    
    int main(void)
    {
    	func1object1();
    	func1object2();
    	return 0;
    }
    
    /* file1.c */
    
    #include <stdio.h>
    
    void func1object1(void)
    {
    	printf("Hello from object 1!\n");
    }
    
    /* file2.c */
    
    #include <stdio.h>
    
    void func1object2(void)
    {
    	printf("Hello from object 2!\n");
    }
    From these 3 very simple files two applications are built, the only difference being that the linking order of the object files is different. This makefile

    Code:
    SRCS = main.c file1.c file2.c
    
    OBJS1 = main.obj file1.obj file2.obj
    OBJS2 = file2.obj file1.obj main.obj 
    
    CC        = CL
    CCFLAGS   = /O2 /Oi /D "_MBCS" /FD /EHsc /MD /Gy /W3 /c /Zi /TC
                
    
    LINK       = link
    LINKFLAGS1 = "/OUT:watermark1.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 
    LINKFLAGS2 = "/OUT:watermark2.exe" "/MANIFESTUAC:level='asInvoker' uiAccess='false'" /OPT:REF /OPT:ICF /DYNAMICBASE /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 
    
    EC = echo
    RM = del
    
    default: all
    
    
    clean:
    	@$(RM) /F *.obj
    	@$(RM) /F *.idb
    	@$(RM) /F *.pdb
    	@$(RM) /F *.exe
    	@$(RM) /F *manifest*
    
    # Note: each recipe line runs in its own shell, so calling vcvars32.bat
    # from inside a recipe does not set up the environment for later lines.
    # Run vcvars32.bat once in the console before invoking make.
    %.obj : %.c 
    	@$(EC) ************************************************
    	@$(EC) * Compiling $@
    	$(CC)  $(CCFLAGS) $<
    
    watermark1.exe: $(OBJS1)
    	$(LINK) $(LINKFLAGS1) $(OBJS1)
    
    watermark2.exe: $(OBJS2)
    	$(LINK) $(LINKFLAGS2) $(OBJS2)
    
    all: watermark1.exe watermark2.exe
    creates the two .exe files watermark1.exe and watermark2.exe. A zip file containing all the files is attached.
    Perhaps not surprisingly, for this example the order of the objects in the binary corresponds to the order in which they are listed in the linker command. The idea is to create watermark2.exe from watermark1.exe.
    I hope this example is not too simple. Maybe it will be much harder with C++ code; I have no idea. I'm not sure whether it is possible to identify the objects themselves, but the functions can be identified (by IDA), and IDC (as far as I understand now) provides functionality for jumping to a specified function, or just to the next function in the code given some virtual address.

    Does this make any sense at all?

  2. #2
    Administrator dELTA's Avatar
    Join Date
    Oct 2000
    Location
    Ring -1
    Posts
    4,204
    Blog Entries
    5
    Nice introduction and starting documentation. I'm very much looking forward to seeing your progress in this project.

    And yes, it makes sense indeed.
    "Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."

  3. #3
    Thanks for the encouragement

    Just came back to this mini-project after I let myself be interrupted by a crackme (my first .NET reversing) and that crackme was driving me nuts. I had virtually the complete source code (dotfuscated) and I couldn't solve it anyway!? It was quite a frustrating struggle you can imagine

    Anyway, I have just written and run my first IDC script. The script is basically a copy of an example in this book, http://www.idabook.com/, p. 268.

    The script enumerates the functions identified by IDA. It looks like this:

    Code:
    #include <idc.idc> // Mandatory include directive
    
    static main()
    {
        // Step one, enumerate/list functions
        GetFunctions();	
    }
    
    static GetFunctions()
    {
        auto addr, name;
        addr = 0;
        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);
            Message("Function: %s at %x\n", name, addr);  
        }
    }
    When run on watermark1.exe it produces the following output (before you read on, guess how many functions IDA finds):

    Code:
    Compiling file 'C:\rce\LinkOrder\linkorder.idc'...
    Executing function 'main'...
    Function: _main at 401000
    Function: sub_401010 at 401010
    Function: sub_401020 at 401020
    Function: _pre_cpp_init at 40102d
    Function: ___tmainCRTStartup at 401078
    Function: $LN31 at 4011ee
    Function: start at 4012cf
    Function: ?__CxxUnhandledExceptionFilter@@YGJPAU_EXCEPTION_POINTERS@@@Z at 4012d9
    Function: $LN5 at 40131b
    Function: _amsg_exit at 40132a
    Function: __onexit at 401330
    Function: $LN8 at 4013cc
    Function: _atexit at 4013d5
    Function: sub_4013EC at 4013ec
    Function: sub_401412 at 401412
    Function: _XcptFilter at 401438
    Function: __ValidateImageBase at 401440
    Function: __FindPESection at 401480
    Function: __IsNonwritableInCurrentImage at 4014d0
    Function: _initterm at 40158e
    Function: _initterm_e at 401594
    Function: __SEH_prolog4 at 40159c
    Function: __SEH_epilog4 at 4015e1
    Function: __except_handler4 at 4015f5
    Function: __setdefaultprecision at 40161a
    Function: sub_401645 at 401645
    Function: ___security_init_cookie at 401648
    Function: ?terminate@@YAXXZ at 4016de
    Function: _unlock at 4016e4
    Function: __dllonexit at 4016ea
    Function: _lock at 4016f0
    Function: sub_4016F6 at 4016f6
    Function: _except_handler4_common at 401706
    Function: _invoke_watson at 40170c
    Function: _controlfp_s at 401712
    Function: ___report_gsfailure at 401718
    Function: _crt_debugger_hook at 40181e
    We wrote 3 simple functions but IDA identifies 37!
    It is not clear, at least not to me at this point, whether these extra functions can be filtered out or ignored for the reordering. At this stage they are ignored. Another thing that is not considered yet is whether the data is part of the watermark. Right now only the functions are considered.

    I'm going to read some more to find out which IDC functions can be used for reordering the functions, and which data structure supported by IDA can be used to store the functions in preparation for the actual reordering.

  4. #4
    Administrator dELTA's Avatar
    The other functions that were detected are most likely just standard library functions of the compiler/linker. You can see that IDA even identified a majority of them from its standard signatures.

    If I were you I'd ignore those in the first stage of this project (some of them could have some quite annoying optimizations that will make trouble at the beginning of a project like this), and first only focus on your own functions (they will most likely be adjacent in the binary, and thus possible to rearrange independently of the library functions).
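
As a rough illustration of "focus on your own functions first", the following Python sketch splits a function listing at the first recognised library function. It is a sketch under assumptions: the listing is an abridged copy of the script output above, the helper name `split_own_and_library` is made up, and using `_pre_cpp_init` as the cut point only works if the user's functions really are adjacent at the start of the code section, as they appear to be in this toy binary.

```python
# Abridged copy of the function listing printed by the IDC script.
LISTING = [
    ("_main", 0x401000),
    ("sub_401010", 0x401010),
    ("sub_401020", 0x401020),
    ("_pre_cpp_init", 0x40102D),
    ("___tmainCRTStartup", 0x401078),
    ("start", 0x4012CF),
]

def split_own_and_library(listing, first_library_name="_pre_cpp_init"):
    """Return (own, library): everything before the marker, and the rest."""
    names = [name for name, _ in listing]
    if first_library_name not in names:
        # Marker not found: treat everything as the user's own code.
        return list(listing), []
    cut = names.index(first_library_name)
    return listing[:cut], listing[cut:]

own, library = split_own_and_library(LISTING)
for name, addr in own:
    print(f"own function {name} at {addr:x}")
print(f"{len(library)} library functions left untouched")
```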

    Btw, at a later stage you should probably take a look at import table reordering too, since this is a very simple and efficient way to watermark an exe file.

  5. #5
    Teach, Not Flame Kayaker's Avatar
    Join Date
    Oct 2000
    Posts
    4,047
    Blog Entries
    5
    Boy, doesn't that illustrate the simple beauty of a program coded in ASM?

    I created MAP files of both exe's and compared them with UltraEdit/Text Compare. The only differences recorded were the following:

    Code:
    watermark1:
    
     0001:00000000       _main
     0001:00000020       sub_401020
     0002:000000E0       aHelloFromObject2
    
    
    watermark2:
    
     0001:00000000       sub_401000
     0001:00000020       _main
     0002:000000E0       aHelloFromObject1
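
The same comparison can be sketched in a few lines of Python instead of a text-compare tool. This is only an illustration: the two map excerpts below contain the entries shown above plus one unchanged entry (`sub_401010` at `0001:00000010`) that is assumed here for the sake of the example, and `parse_map` / `diff_maps` are made-up helper names that only understand this simple two-column layout.

```python
MAP1 = """\
 0001:00000000       _main
 0001:00000010       sub_401010
 0001:00000020       sub_401020
 0002:000000E0       aHelloFromObject2
"""

MAP2 = """\
 0001:00000000       sub_401000
 0001:00000010       sub_401010
 0001:00000020       _main
 0002:000000E0       aHelloFromObject1
"""

def parse_map(text):
    """Parse 'segment:offset  name' lines into a dict keyed by address."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and ":" in parts[0]:
            symbols[parts[0]] = parts[1]
    return symbols

def diff_maps(a, b):
    """Return {address: (name_in_a, name_in_b)} for entries that differ."""
    return {
        addr: (a.get(addr), b.get(addr))
        for addr in sorted(set(a) | set(b))
        if a.get(addr) != b.get(addr)
    }

differences = diff_maps(parse_map(MAP1), parse_map(MAP2))
for addr, (name1, name2) in differences.items():
    print(f"{addr}: {name1} -> {name2}")
```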

    In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.

    I'm just thinking out loud here.. Let's say one devises a script to swap procs 1 and 3 (having determined that that's the strategy needed) and also fixes up the jump/call relative addresses. But add a small layer of complexity, i.e. say the next time procs 1 and 3 are of *different* byte lengths.. that means we also have to deal with moving/fixing proc 2 as well.

    Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.

    I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

    Simple:
    swap 2 identified procs of the same size - no functions in between are affected
    fix up relative jump/call addresses
    done?

    Not as simple:
    swap 2 identified procs of *different* size - all functions in between are affected
    fix up relative jump/call addresses of *all* affected code
    done?

    Crazy:
    swap around many procs of varying sizes, fixing up all affected code
    ?improbable?
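
The difference between the "simple" and "not as simple" cases can be made concrete with a small layout planner in Python. This is a sketch under assumptions: functions are packed back to back on 16-byte boundaries (matching the `align 10h` seen in the toy binary), the helper names `align_up` and `plan_layout` are made up, and the second example uses invented sizes. It only computes new start addresses; fixing up the code itself is a separate step.

```python
def align_up(value, alignment=16):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment

def plan_layout(functions, new_order, base, alignment=16):
    """functions maps name -> (old_start, size_in_bytes).

    Returns name -> (old_start, new_start) for the requested order.
    """
    plan = {}
    cursor = base
    for name in new_order:
        old_start, size = functions[name]
        cursor = align_up(cursor, alignment)
        plan[name] = (old_start, cursor)
        cursor += size
    return plan

# Toy binary: three 13-byte functions, so swapping 1st and 3rd leaves the middle alone.
toy = {"_main": (0x401000, 13), "sub_401010": (0x401010, 13), "sub_401020": (0x401020, 13)}
toy_plan = plan_layout(toy, ["sub_401020", "sub_401010", "_main"], 0x401000)

# Hypothetical sizes: the first function is longer, so the middle one has to move too.
mixed = {"a": (0x401000, 20), "b": (0x401020, 13), "c": (0x401030, 13)}
mixed_plan = plan_layout(mixed, ["c", "b", "a"], 0x401000)

for label, plan in (("toy", toy_plan), ("mixed", mixed_plan)):
    for name, (old, new) in plan.items():
        print(f"{label}: {name} {old:x} -> {new:x}")
```

Once the plan exists, the number of functions being moved only changes how many entries are in the table; the open question is whether every reference to a moved function can be found and fixed.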

    I suppose the other thing too is, understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?


    Kayaker

  6. #6
    Administrator dELTA's Avatar
    Glad to have you in the discussion Kayaker.


    Quote: Originally posted by Kayaker
    In this "simple" case, we only have to worry about 3 procs, 401000, 401010 and 401020. The middle proc doesn't change, but if we were to swap the 1st and 3rd it could affect its alignment. In this particular case the number of bytes in the 1st and 3rd proc are the same so we can ignore the middle one, but even this shows how difficult fixing this up would be.
    Yes, my viewpoint from the start has been that you must be prepared to move around all functions in the executable for a procedure like this, exactly because of such alignment problems combined with the fact that very few functions will be of the exact same size, and thus not "switchable in-place".


    Quote: Originally posted by Kayaker
    Add a few more 'watermark' functions, different sizes, scattered all over a large amount of code, and now it just gets nasty to contemplate.
    As long as you have generic code to relocate a function to any position, why would it really be so much worse to move them all around than to move just a few? I'm sure the computer won't complain too much about one for loop being iterated a few more times? The only possible problem I can think of that increases with the number of simultaneously relocated functions is that there might be functions that are "harder to relocate" (due to crazy compiler optimizations or dynamic address resolutions of different kinds, that IDA therefore won't catch when analyzing/decompiling it). Other than that, am I missing something?


    Quote: Originally posted by Kayaker
    I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one.

    Simple:
    swap 2 identified procs of the same size - no functions in between are affected
    fix up relative jump/call addresses
    done?

    Not as simple:
    swap 2 identified procs of *different* size - all functions in between are affected
    fix up relative jump/call addresses of *all* affected code
    done?

    Crazy:
    swap around many procs of varying sizes, fixing up all affected code
    ?improbable?
    Again, as long as the "simple script" doesn't have hardcoded addresses for some special program or something stupid like that, and with my special reservations above, I can't really see the problem, neither coding-complexity wise nor execution time-complexity wise? Please, tell me what I'm missing, oh great god of the kayak!

    Quote: Originally posted by Kayaker
    I suppose the other thing too is, understanding how the watermarks are checked. CRC check of only specific watermark functions? Maybe not all the code needs to be handled. Mightn't the watermark-check-code be the weak link in all this if the goal is to "crack" such a protection?
    First of all, there is one VERY big and important difference between CRC checks and watermarks, which is also exactly what makes watermarks such a pain in the ass. CRC checks are performed by the application itself, and can therefore, just as you say, be easily found, reversed and/or neutralized. The problem with watermarks is that the checking code is contained in a completely separate program, locked into a safe (or ok, most likely in a crappy unpatched Windows server, but anyway) inside the premises of the software author, only to be taken out and used locally at their office when the same software author finds a leaked/warezed version of their software on the net, in order to be able to subsequently sue the crap out of the person that the watermark reveals to be the source of the leak. Thus, no checking code is available for our analysis (unless you offer to burglarize the IDA Pro offices and steal it of course, which I'm sure would make you quite popular with lots of people here), and thus, each and every bit of information inside the executable could potentially be part of a secret watermark, cleverly steganographed into functionally important parts of the application. So, contrary to the common solution for removing a CRC check in a program (patching the check, or in rarer cases reversing the CRC algo and adapting the patch data to result in the same checksum), the only way to "remove" watermarks is to mess up the binary file in each and every way and dimension in which you think information might be implicitly stored to form part of the watermark, while still keeping it fully functional, and that's why we're here today!

    As mentioned in the thread referenced at the top of this thread, there are apparently rumours saying that e.g. IDA Pro uses the linking order of its object files to create one (out of many?) such watermark entropy pieces for IDA Pro copies. Thus the idea of this mini project was born, with the primary scope of investigating how easy it would be to re-shuffle all the functions in an arbitrary executable, in order to create a generic "crack" for exactly that specific type of watermarking technology.

    Future (and probably well-needed in order to reach practical result) steps in the "creation of the ultimate generic watermark defeater tool" would probably be a similar (but comparatively more simple) import table shuffler, export table shuffler, relocation table shuffler, PE resource shuffler, and code-location-independent function and data area diffing tool, which checks for any differences within functions that are not related to their location (and thus neglecting different call and jump addresses inside their code related to that), e.g. to see if there are any differences in used instructions in sub areas of functions, differences in data ordering, or tracking data in PE headers or code caves.
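
The "code-location-independent diffing" idea can be sketched in Python as follows. This is a deliberately naive illustration, not a real tool: it assumes the offsets of the `E8 rel32` call instructions inside each function are already known (e.g. from a disassembler), the helper names `normalise` and `same_function` are made up, and the "relocated" and "re-encoded" byte strings are hypothetical variants of the toy `_main`. Masking the operands also throws away the call targets themselves, so a real tool would have to compare those separately.

```python
def normalise(code, call_offsets):
    """Zero the 4-byte rel32 operand of each E8 call so position no longer matters."""
    out = bytearray(code)
    for off in call_offsets:
        if out[off] != 0xE8:
            raise ValueError(f"no E8 call opcode at offset {off}")
        out[off + 1:off + 5] = b"\x00\x00\x00\x00"
    return bytes(out)

def same_function(a, b, call_offsets):
    """True if a and b differ only in their call displacements."""
    return normalise(a, call_offsets) == normalise(b, call_offsets)

CALLS = [0, 5]  # offsets of the two call instructions in the toy _main

original = bytes.fromhex("E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3")
# Hypothetical copy of _main moved elsewhere, with its displacements adjusted.
relocated = bytes.fromhex("E8 EB FF FF FF E8 D6 FF FF FF 33 C0 C3")
# Hypothetical copy where 'xor eax, eax' uses the alternative encoding 31 C0.
reencoded = bytes.fromhex("E8 0B 00 00 00 E8 16 00 00 00 31 C0 C3")

print("relocated matches original:", same_function(original, relocated, CALLS))
print("re-encoded matches original:", same_function(original, reencoded, CALLS))
```

The second comparison is the interesting one: a difference that survives the masking is not explained by code position and would be a candidate for another watermark channel.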

    This mini project is both a great first step and a very good mini project though! Well, until you answer my questions above and tell me it's impossible, but anyway.

  7. #7
    Teach, Not Flame Kayaker's Avatar
    Thanks for clarifying watermarking dELTA. I understood that it was to match a particular compilation to a particular person (so they might get the crap sued out of them as you say), but I was also envisioning it as being used as part of a "normal" protection scheme as well, which I guess doesn't necessarily have to be the case and obviously not part of this project.

    i.e. as a particular key file will only work with a particular compilation because the linking order is taken into account. In other words, the linking order fingerprint is embedded in the key file and some algorithm is used with it to verify the integrity of the program. (the CRC check comment was a simplistic example of that idea)

    If that's not the case, then what's the benefit of removing such a watermark? If I've got IDA and I'm able to steal YOUR IDA, then I can swap watermarks and YOU get blamed for the release, is that it?


    This mini project is both a great first step and a very good mini project though! Well, until you answer my questions above and tell me it's impossible, but anyway.
    No, actually I do have hope, that's why I said "I'm curious now how an IDC script to fix the simplest scenarios might fare with a more complex one."
    If you can reorder one function, in theory you should be able to reorder them all. In theory. That's the caveat that still needs to be addressed.


    This reminded me of a paper I had posted before
    http://www.woodmann.com/forum/showthread.php?9483-Article-Software-Security-Through-Targetted-Diversification

    Software Security Through Targetted Diversification
    http://www.cosic.esat.kuleuven.be/publications/thesis-122.pdf

    The paper is a thesis which discusses the idea of creating software which is distributed as polymorphised versions, in an effort to discourage automated or generic cracking of it. Specifically it suggests the use of Genetic Algorithm (GA) programming to create a diverse population of software for distribution to the masses.


    This suggests GA could also be used to create individualised programs. Change a few parameters, fitness/crossover values, record some unique aspect of the offspring (compiled program), and give it to its adopted parent (registered owner). If you find it outside of its new home (leaked), do a DNA analysis.

  8. #8
    The other functions that were detected are most likely just standard library functions of the compiler/linker. You can see that IDA even identified a majority of them from its standard signatures.
    I was thinking the same thing. That they are appended and as such appear last in image but this is just an assumption for now

    Kayaker, did you create those MAP files in IDA? (File->Produce File->Create MAP file...) I didn't think of creating MAP files, maybe because the files are so simple. Thanks for the tip

    About the length of the functions: my assumption is that we are dealing with one contiguous block of functions and alignment padding, and that in general all functions are moved. In this case the length of the functions does not matter when we do the reordering, I think. If the watermark is scattered over several distinct areas, with stuff in between that is not part of the watermark, then this complicates things because the length of the functions matters. The idea when starting the mini-project was to make things as simple as possible to begin with and understand how to deal with this. Then we can make things more complicated along the way. For now the simple scenario is challenging enough for me.

    Personally, I think this linking-order approach to watermarking is quite clever, mostly because I believe it is very practical (low-cost). There is no need for extra tools or going to write any assembler. All that is needed is a couple of additional lines in an already existing build system. So basically there is no extra work to be done by the software writer if the build system and version control system is already set. And I also think the watermark is not so easy to remove, but that is what we are hoping to find out

    The IDC script has been expanded a little so that it actually takes care of reordering the functions.

    Current IDC script
    Code:
    #include <idc.idc> // Mandatory include directive
    
    static EnumerateAndStoreFunctions(hfunctionnames)
    {
        auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
        addr = 0;
        fidx = 0; // function index
        widx = 0; // word idx
        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);
            
            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return fidx;
            }
            
            bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
            if(bsuccess == 0)
            {
                Message("Saving name of function %s failed.\n", name); 
            }
    	
            tmphandle = CreateArray(name);
            if(tmphandle == -1)
            {
                tmphandle = GetArrayId(name);
            }
    
            inextfunction = NextFunction(addr);
            if(inextfunction == BADADDR)
            {
                inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
            }
            
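            // Copy the function dword by dword up to the start of the next
            // function. If that span is not a multiple of 4 bytes, the last
            // dword also picks up bytes belonging to the next function.
            widx = 0;
            for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 4)
            {
                 SetArrayLong(tmphandle, widx, Dword(tmpaddr));
                 widx = widx + 1;
            }
            bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);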
            fidx = fidx + 1;        
        }
        return fidx;
    }
    
    static PrintFunctions(hfunctionnames, inumberoffunctions)
    {
        auto fidx;
        for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
        {
            Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
        }
    }
    
    static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr)
    {
        auto fidx, oidx, funcname, hopcodes, opcodeslen;
    
        // Walk the stored functions from last to first (reversed order)
        for(fidx = inumberoffunctions - 1; fidx >= 0; fidx = fidx - 1)
        {
            funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 
            opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 
            hopcodes    = GetArrayId(funcname);
            for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
            {
                PatchDword(iwriteaddr, GetArrayElement(AR_LONG, hopcodes, oidx));
                iwriteaddr = iwriteaddr + 4;
            }
        }
    }
    
    static main()
    {
        auto inumberoffunctions, hfunctionnames;
    
        // This array is populated with names of functions and
        // the length of the functions in dwords in the following
        // way [name1,length1,name2,length2,...]
        hfunctionnames = CreateArray("FunctionNames");
         
        if(hfunctionnames == -1)
        {
            // If the array already exists, get the handle with GetArrayId
            Message("hfunctionnames is -1.\n");
            hfunctionnames = GetArrayId("FunctionNames");
        }
     
        // Enumerate functions and store them in a persistent array
        inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);	
    
        // Print functions in IDA's output window
        PrintFunctions(hfunctionnames, inumberoffunctions);
        
        // Write back functions in reversed order
        WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000);
         
    }
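
What the script does to the bytes can be replayed outside IDA with a short Python sketch. The input bytes are the "Before" hex dump quoted further down in this post, the start addresses are taken from the function listing, and `replay` is a made-up helper that mimics the dword-wise copy and the reversed write-back. Treat it as a model of the script, not as the script itself.

```python
BASE = 0x401000
BEFORE = bytes.fromhex(
    "E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3 CC CC CC "
    "68 C8 20 40 00 FF 15 A0 20 40 00 59 C3 CC CC CC "
    "68 E0 20 40 00 FF 15 A0 20 40 00 59 C3 68 12 14 "
)
STARTS = [0x401000, 0x401010, 0x401020]  # _main, sub_401010, sub_401020
STOP = 0x40102D                          # _pre_cpp_init, first function not copied

def read_dwords(image, start, stop):
    """Read 4-byte chunks from start while the chunk address is below stop."""
    return [image[addr - BASE:addr - BASE + 4] for addr in range(start, stop, 4)]

def replay(image):
    """Store every function dword-wise, then write them back in reversed order."""
    bounds = STARTS + [STOP]
    saved = [read_dwords(image, bounds[i], bounds[i + 1]) for i in range(len(STARTS))]
    out = bytearray(image)
    write = 0
    for chunks in reversed(saved):
        for chunk in chunks:
            out[write:write + 4] = chunk
            write += 4
    return bytes(out)

AFTER = replay(BEFORE)
for row in range(0, len(AFTER), 16):
    print(AFTER[row:row + 16].hex(" ").upper())
print("bytes at 40102D before:", BEFORE[0x2D:0x30].hex(" ").upper())
print("bytes at 40102D after: ", AFTER[0x2D:0x30].hex(" ").upper())
```

The replay shows one side effect worth keeping in mind: the last function spans 13 bytes up to 0x40102D but is copied as four dwords, so three bytes of `_pre_cpp_init` travel along with it and the first three bytes at 0x40102D end up overwritten with `CC` padding. Copying byte by byte, or only up to the function end, would avoid that.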
    Watermark1.exe original

    Code:
    .text:00401000 ; =============== S U B R O U T I N E =======================================
    .text:00401000
    .text:00401000
    .text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
    .text:00401000 _main           proc near               ; CODE XREF: ___tmainCRTStartup+10Ap
    .text:00401000                 call    sub_401010
    .text:00401005                 call    sub_401020
    .text:0040100A                 xor     eax, eax
    .text:0040100C                 retn
    .text:0040100C _main           endp
    .text:0040100C
    .text:0040100C ; ---------------------------------------------------------------------------
    .text:0040100D                 align 10h
    .text:00401010
    .text:00401010 ; =============== S U B R O U T I N E =======================================
    .text:00401010
    .text:00401010
    .text:00401010 sub_401010      proc near               ; CODE XREF: _mainp
    .text:00401010                 push    offset Format   ; "Hello from object 1!\n"
    .text:00401015                 call    ds:printf
    .text:0040101B                 pop     ecx
    .text:0040101C                 retn
    .text:0040101C sub_401010      endp
    .text:0040101C
    .text:0040101C ; ---------------------------------------------------------------------------
    .text:0040101D                 align 10h
    .text:00401020
    .text:00401020 ; =============== S U B R O U T I N E =======================================
    .text:00401020
    .text:00401020
    .text:00401020 sub_401020      proc near               ; CODE XREF: _main+5p
    .text:00401020                 push    offset aHelloFromObj_0 ; "Hello from object 2!\n"
    .text:00401025                 call    ds:printf
    .text:0040102B                 pop     ecx
    .text:0040102C                 retn
    .text:0040102C sub_401020      endp
    Watermark1.exe modified with script

    Code:
    .text:00401000 ; =============== S U B R O U T I N E =======================================
    .text:00401000
    .text:00401000
    .text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
    .text:00401000 _main           proc near               ; CODE XREF: ___tmainCRTStartup+10Ap
    .text:00401000                 push    4020E0h
    .text:00401005                 call    ds:printf
    .text:0040100A                 add     [ecx-3Dh], bl
    .text:0040100C                 retn
    .text:0040100C _main           endp
    .text:0040100C
    .text:0040100C ; ---------------------------------------------------------------------------
    .text:0040100D                 align 10h
    .text:00401010
    .text:00401010 ; =============== S U B R O U T I N E =======================================
    .text:00401010
    .text:00401010
    .text:00401010 sub_401010      proc near               ; CODE XREF: _mainp
    .text:00401010                 push    offset Format   ; "Hello from object 1!\n"
    .text:00401015                 call    ds:printf
    .text:0040101B                 pop     ecx
    .text:0040101C                 retn
    .text:0040101C sub_401010      endp
    .text:0040101C
    .text:0040101C ; ---------------------------------------------------------------------------
    .text:0040101D                 align 10h
    .text:00401020
    .text:00401020 ; =============== S U B R O U T I N E =======================================
    .text:00401020
    .text:00401020
    .text:00401020 sub_401020      proc near               ; CODE XREF: _main+5p
    .text:00401020                 call    near ptr unk_4020A8-1078h ; "Hello from object 2!\n"
    .text:00401025                 call    near ptr loc_40103C+4
    .text:0040102B                 rol     bl, 0CCh
    .text:0040102C                 retn
    .text:0040102C sub_401020      endp
    I have double-checked things in Hex-view
    Code:
    Before (start 0x401000)
    E8 0B 00 00 00 E8 16 00  00 00 33 C0 C3 CC CC CC
    68 C8 20 40 00 FF 15 A0  20 40 00 59 C3 CC CC CC
    68 E0 20 40 00 FF 15 A0  20 40 00 59 C3 68 12 14
    After (start 0x401000)
    68 E0 20 40 00 FF 15 A0  20 40 00 59 C3 68 12 14
    68 C8 20 40 00 FF 15 A0  20 40 00 59 C3 CC CC CC
    E8 0B 00 00 00 E8 16 00  00 00 33 C0 C3 CC CC CC
    I don't understand the details of why for instance
    Code:
    xor     eax, eax
    in main becomes
    Code:
    rol     bl, 0CCh
    only that the addresses must be updated correspondingly in order to correct it, and this I think will be more difficult. I haven't yet thought about how to fix the addresses. One way may be to have a pre-processing stage, where the addresses that need to be updated after the reordering are labeled, and a post-processing stage after the actual reordering, where the addresses are fixed... I have to think some more about this.
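
Two observations may help here. First, the odd instruction looks like a display artefact: the listing still decodes at the old instruction boundaries, and at 0x40102B the bytes are now `C0 C3 CC`, which decode as `rol bl, 0CCh`, whereas the moved `xor eax, eax` (`33 C0`) starts one byte earlier. Second, the two `call` instructions in `_main` use the `E8 rel32` form, whose operand is relative to the end of the call instruction, so it has to be recomputed when the caller or the target moves. The Python sketch below shows that arithmetic for the toy `_main`. It is a sketch under assumptions: the call offsets are supplied by hand, and `retarget_calls` is a made-up helper, not an IDC function.

```python
import struct

def retarget_calls(code, old_start, new_start, call_offsets, moved):
    """Recompute E8 rel32 operands for a function moved from old_start to new_start.

    call_offsets: offsets of the E8 opcodes inside code.
    moved: dict old_address -> new_address for call targets that were relocated.
    """
    out = bytearray(code)
    for off in call_offsets:
        if out[off] != 0xE8:
            raise ValueError(f"no E8 call opcode at offset {off}")
        (rel,) = struct.unpack_from("<i", out, off + 1)
        old_target = old_start + off + 5 + rel         # target before the move
        new_target = moved.get(old_target, old_target)  # target after the move
        new_rel = new_target - (new_start + off + 5)
        struct.pack_into("<i", out, off + 1, new_rel)
    return bytes(out)

main_code = bytes.fromhex("E8 0B 00 00 00 E8 16 00 00 00 33 C0 C3")
# _main and sub_401020 trade places; sub_401010 stays where it is.
moved = {0x401000: 0x401020, 0x401020: 0x401000}
fixed = retarget_calls(main_code, 0x401000, 0x401020, [0, 5], moved)
print(fixed.hex(" ").upper())
```

The same recalculation would be needed for calls into the moved functions from code that stays put (e.g. the call to `_main` from `___tmainCRTStartup`). The `push offset ...` and `call ds:printf` instructions in the other two functions use absolute addresses, so their operands stay valid after a move, but since the binary is linked with /DYNAMICBASE the relocation table entries that point at those operands would have to follow them to their new location.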

  9. #9
    Teach, Not Flame Kayaker's Avatar
    Hi niaren,

    Nice start. If you edit the code it's always a good idea to get IDA to reanalyze. It will fix some things and point out errors in other sections. Try inserting something like the following at the end of main()

    Code:
        auto text_start, text_end, size;

        text_start = SegByBase(1);
        text_end = SegEnd(text_start);
        size = text_end - text_start;

        Message("text_start %x \n", text_start);
        Message("text_end %x \n", text_end);
        Message("size %x \n", size);

        MakeUnknown (text_start, size, 1);
        AnalyzeArea (text_start, text_end+1);
    or if you prefer to include the entire file it's even simpler to write just:

    Code:
        MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
        AnalyzeArea (MinEA(), MaxEA());

    You can test this manually as well. After applying your existing script, undefine('U') the affected sections and reanalyze with 'C'. You'll see that the rol bl, 0CCh is fixed back to xor eax, eax, but you'll also see that the middle proc was actually affected negatively, which you don't see if you don't reanalyze.

    Cheers,
    Kayaker

  10. #10
    Administrator dELTA's Avatar
    Quote Originally Posted by Kayaker View Post
    Thanks for clarifying watermarking dELTA. I understood that it was to match a particular compilation to a particular person (so they might get the crap sued out of them as you say), but I was also envisioning it as being used as part of a "normal" protection scheme as well, which I guess doesn't necessarily have to be the case and obviously not part of this project.
    Sure, it could of course be done, but it would be extremely stupid to reveal the watermark locations explicitly in the program's own code. A normal CRC will work just as well in that aspect, and be just as hard (easy) to patch out. You do of course understand this already, I'm just writing it here for reference.


    Quote Originally Posted by Kayaker View Post
    If that's not the case, then what's the benefit of removing such a watermark? If I've got IDA and I'm able to steal YOUR IDA, then I can swap watermarks and YOU get blamed for the release, is that it?
    The benefit is that everyone will have their own copy of IDA for every new release once people are no longer afraid of leaking a cracked version of their own copy, including you, when you (or whatever friend you're leeching it off) get tired of paying the yearly fee.

    Jokes aside (and before "someone" gets unnecessarily pissed at us), this thread and project are not about warezing IDA. Rather, they are about the theoretical challenge of defeating a more or less powerful "protection technique", for the pure hell (and learning experience) of it, just like all other discussions on this board. The IDA watermarks are one of the most highly regarded (and above all, practically efficient!) protection systems out there today, so of course it's fun to try to break them!


    Quote Originally Posted by Kayaker View Post
    This reminded me of a paper I had posted before
    http://www.woodmann.com/forum/showthread.php?9483-Article-Software-Security-Through-Targetted-Diversification

    ...

    The paper is a thesis which discusses the idea of creating software which is distributed as polymorphised versions, in an effort to discourage automated or generic cracking of it. Specifically it suggests the use of Genetic Algorithm (GA) programming to create a diverse population of software for distribution to the masses.


    This suggests GA could also be used to create individualised programs. Change a few parameters, fitness/crossover values, record some unique aspect of the offspring (compiled program), and give it to its adopted parent (registered owner). If you find it outside of its new home (leaked), do a DNA analysis.
    (rant start) Just for the record, I think the inclusion of "Genetic Algorithms" in that paper is just a stupid excuse to include some buzzwords, and I don't see the practical use for it at all. The primary use of Genetic Algorithms is to find (semi)optimal solutions to massively multidimensional problems, while the efficient polymorphing of code in order to effectively hide information is absolutely not that kind of problem. All it will result in is less efficient and less systematic information hiding, and much more easily corruptible watermarks, I think. It is very much like "artificial intelligence", which people also often try to apply to completely incompatible and unsuitable problems, just because it has a "cool ring to it". (rant stop)


    Quote Originally Posted by niaren View Post
    Kayaker, did you create those MAP files in IDA? (File->Produce File->Create MAP file...) I didn't think of creating MAP files, maybe because the files are so simple. Thanks for the tip
    I suspect he simply let the linker produce them, which would be much more efficient for use as "reference material" in a case like this. You will find options for it in your linker.


    Quote Originally Posted by niaren View Post
    I have double-checked things in Hex-view
    Code:
    Before (start 0x401000)
    E8 0B 00 00 00 E8 16 00  00 00 33 C0 C3 CC CC CC
    68 C8 20 40 00 FF 15 A0  20 40 00 59 C3 CC CC CC
    68 E0 20 40 00 FF 15 A0  20 40 00 59 C3 68 12 14
    After (start 0x401000)
    68 E0 20 40 00 FF 15 A0  20 40 00 59 C3 68 12 14
    68 C8 20 40 00 FF 15 A0  20 40 00 59 C3 CC CC CC
    E8 0B 00 00 00 E8 16 00  00 00 33 C0 C3 CC CC CC
    The optimal visualization method for your results would probably be to configure your IDA to show full opcode bytes directly in the disassembly listing. Then you would not need complementary hex dumps like this, and the somewhat confusing coincidences between the relative offsets and the string pointers in your disassembly listings above would be much more easily explained.


    Quote Originally Posted by Kayaker View Post
    Nice start. If you edit the code it's always a good idea to get IDA to reanalyze.

    ...

    You can test this manually as well. After applying your existing script, undefine('U') the affected sections and reanalyze with 'C'. You'll see that the rol bl, 0CCh is fixed back to xor eax, eax, but you'll also see that the middle proc was actually affected negatively, which you don't see if you don't reanalyze.
    When it comes to massive code permutations like this, I would never trust the results of a mere reanalysis of the live listing inside IDA. Rather, I would let the IDC script patch the raw mutated bytes right into a copy of the executable on disk, and load that one up in IDA individually. Otherwise, my guess is that you'll sooner or later be in a world of unnecessary pain and confusion.
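    To illustrate the idea of patching a copy on disk rather than the live listing, here is a minimal Python sketch. It is not from the thread: the helper name patch_copy and the file names in the usage note are made up, and the file offsets are assumed to have been computed already (translating a virtual address to a file offset is a separate step).

```python
import shutil


def patch_copy(src, dst, patches):
    """Copy src to dst, then overwrite bytes in dst only.

    patches is an iterable of (file_offset, data) pairs; the original
    file is never modified, so it stays available as a reference.
    """
    shutil.copyfile(src, dst)
    with open(dst, "r+b") as f:
        for offset, data in patches:
            f.seek(offset)
            f.write(data)
    return dst
```

    Hypothetical usage would be something like patch_copy("watermark1.exe", "watermark1_perm.exe", [(0x400, new_bytes)]), after which the patched copy is loaded into a fresh IDA session for a clean disassembly.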


    Quote Originally Posted by niaren View Post
    ...only that the addresses must be updated correspondingly in order to correct it. And this I think will be more difficult. I haven't yet thought about how to fix the addresses. One way maybe is to have a pre-processing stage where the addresses that need to be updated after the reordering are labeled and a post-processing stage after the actual reordering where the addresses are fixed
    Yes, you should definitely identify and keep track of all offsets and addresses in the code before starting to shuffle it around, and then adjust all of them accordingly after having chosen a new location for the function in question. I strongly advise you to make use of IDA's powerful analysis and metadata information about the code for this purpose, since it has already done most of the hard work for you in this regard, i.e. identifying all offsets, addresses and other constructs relevant for such an operation!

    Finally, very nice start niaren, keep up the good work, it will be very interesting to follow this!
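    For the simplest case, the 5-byte near call (opcode E8 followed by a 32-bit displacement relative to the end of the instruction), the bookkeeping can be sketched in Python as below. This is only an illustration and not code from the thread: the function names are invented, and the "new" addresses in the usage note are example values for a reordered image.

```python
import struct

CALL_REL32 = 0xE8  # opcode of the near relative call
CALL_LEN = 5       # one opcode byte plus a 4-byte displacement


def call_target(inst_addr, code):
    """Return the absolute target of an E8 rel32 call located at inst_addr."""
    if code[0] != CALL_REL32:
        raise ValueError("not an E8 rel32 call")
    (rel,) = struct.unpack_from("<i", code, 1)
    # The displacement is relative to the address of the NEXT instruction.
    return inst_addr + CALL_LEN + rel


def encode_call(new_inst_addr, new_target):
    """Re-encode the call for the new instruction and target addresses."""
    rel = new_target - (new_inst_addr + CALL_LEN)
    return bytes([CALL_REL32]) + struct.pack("<i", rel)
```

    With the bytes from the hex dump above, call_target(0x401000, bytes.fromhex("E80B000000")) gives 0x401010. If that call ends up at 0x40101D and its target at 0x40100D, encode_call(0x40101D, 0x40100D) yields E8 EB FF FF FF. Both the instruction and its target move, so both addresses have to go through the old-to-new mapping.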
    "Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."

  11. #11
    Thanks for all the feedback. It's a real pleasure

    Kayaker, I have tried to insert
    PHP Code:
    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
    AnalyzeArea (MinEA(), MaxEA()); 
    It does not really work, can't figure out why. I have to manually press 'U' 'C' as you said in order to get the disassembly to look right. This is of course unfortunate if we depend on IDA showing the correct disassembly. However, the approach used now in the script does not depend on IDA showing the correct disassembly, only initially.

    And yes you were absolutely right, there was a bug in the script
    The reason why I asked about the MAP files is because I had not foreseen you would actually build the files yourself

    The script seems to work now including patching the call instructions. The script works by

    - Creating an address translation lookup table [I made up that name myself, don't know what else to call it]
    - Patching the instructions in-place (those that need to be updated)
    - Finally, doing the reordering

    The address translation LUT takes an RVA as input and returns the RVA in the reordered image. For watermark1.exe the LUT looks like this:

    Code:
    Address 401000 mapped to 40101d
    Address 401005 mapped to 401022
    Address 40100a mapped to 401027
    Address 40100c mapped to 401029
    Address 401010 mapped to 40100d
    Address 401015 mapped to 401012
    Address 40101b mapped to 401018
    Address 40101c mapped to 401019
    Address 401020 mapped to 401000
    Address 401025 mapped to 401005
    Address 40102b mapped to 40100b
    Address 40102c mapped to 40100c
    The PatchInPlaceDebug function prints the LUT.
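
    The same table can be reproduced outside IDA with a few lines of Python, which may make the construction easier to follow. This is a sketch under assumptions, not the script itself: the instruction addresses are taken from the table above, the spans of the first two functions (0x10 bytes including padding) follow from their start addresses, and the span of the last function (0x0D) is inferred from the table, where the middle function lands at 40100d.

```python
def build_lut(functions, order):
    """Map every instruction address to its address after reordering.

    functions: list of (start, span, instruction_addresses) in original order,
               where span is the distance to the next function start.
    order:     function indices in the order they are laid out afterwards.
    """
    cursor = functions[0][0]  # the new layout starts where the old one did
    new_start = {}
    for idx in order:
        new_start[idx] = cursor
        cursor += functions[idx][1]

    lut = {}
    for idx, (start, _span, insts) in enumerate(functions):
        for addr in insts:
            # An instruction keeps its offset within its own function.
            lut[addr] = new_start[idx] + (addr - start)
    return lut


functions = [
    (0x401000, 0x10, [0x401000, 0x401005, 0x40100A, 0x40100C]),  # _main
    (0x401010, 0x10, [0x401010, 0x401015, 0x40101B, 0x40101C]),  # sub_401010
    (0x401020, 0x0D, [0x401020, 0x401025, 0x40102B, 0x40102C]),  # sub_401020
]
lut = build_lut(functions, [2, 1, 0])  # same hardcoded permutation as the script
for old in sorted(lut):
    print("Address %x mapped to %x" % (old, lut[old]))
```

    Because each function is moved as a block, only the function start changes; the offset of an instruction inside its function stays the same.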

    This is the script
    PHP Code:
    #include <idc.idc> // Mandatory include directive

    static GetNumberOfFunctions()
    {
        auto addr, name, fidx;
        addr = 0;
        fidx = 0; // function index
        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return fidx;
            }
            fidx = fidx + 1;
        }
        return fidx;
    }

    static CreatePermutation(inumberoffunctions)
    {
        auto hpermutation;

        hpermutation = CreateArray("Permutation");
        if(hpermutation == -1)
        {
            // If array already exist get the handle by GetArrayId
            hpermutation = GetArrayId("Permutation");
        }
        // Hardcoded permutation
        SetArrayLong(hpermutation, 0, 2);
        SetArrayLong(hpermutation, 1, 1);
        SetArrayLong(hpermutation, 2, 0);
        return hpermutation;
    }
        
    static GetFunctionAddresses()
    {
        auto addr, name, fidx, hfunctionaddresses;
        addr = 0;
        fidx = 0; // function index

        hfunctionaddresses = CreateArray("FunctionAddresses");
        if(hfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hfunctionaddresses = GetArrayId("FunctionAddresses");
        }

        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return hfunctionaddresses;
            }
            SetArrayLong(hfunctionaddresses, 2*fidx, addr);
            SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

            fidx = fidx + 1;
        }
        return hfunctionaddresses;
    }

    static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)
    {
        auto addr, pidx, fidx, hnewfunctionaddresses;
        addr = 0;
        pidx = 0;
        fidx = 0;

        hnewfunctionaddresses = CreateArray("NewFunctionAddresses");
        if(hnewfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");
        }

        // Address of first function
        addr = NextFunction(addr);

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
        SetArrayLong(hnewfunctionaddresses, fidx, addr);
        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

        for(pidx=1; pidx < inumberoffunctions; pidx++)
        {
            fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
            SetArrayLong(hnewfunctionaddresses, fidx, addr);
            addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
        }
        return hnewfunctionaddresses;
    }

    static CreateAddressTranslationLUT(hnewfunctionaddresses)
    {
        auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;
        addr = 0;
        fidx = 0;

        haddresstranslationlut = CreateArray("AddressTranslationLookupTable");
        if(haddresstranslationlut == -1)
        {
            // If array already exist get the handle by GetArrayId
            haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");
        }

        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return haddresstranslationlut;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            // Get new base address of function
            newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

            SetArrayLong(haddresstranslationlut, inst, newaddr);
            Message("haddresstranslationlut %x \n",haddresstranslationlut);
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            while(inst < end)
            {
                SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
            fidx = fidx + 1;
        }
        return haddresstranslationlut;
    }

    static PatchInPlaceDebug(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr;
        addr = 0;

        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while(inst < end)
            {
                Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static PatchInPlace(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;
        addr = 0;

        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while(inst < end)
            {
                opidx = 0;
                optype = GetOpType(inst,opidx);
                while(optype > 0)
                {
                    if(optype == 7)
                    {
                        // Immediate Near Address

                        // Maybe not necessary but check for call instruction
                        if(GetMnem(inst) == "call")
                        {
                            Message("Instruction at %x being patched.\n", inst);
                            nearaddr = LocByName(GetOpnd(inst, opidx));
                            newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);
                            PatchDword(inst+0x1, newrva+0x1);
                            if(nearaddr == BADADDR)
                            {
                                Message("Fatal error, error processing instruction at %x\n", inst);
                            }
                        }
                        else
                        {
                            Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));
                        }

                    }

                    opidx++;
                    optype = GetOpType(inst,opidx);
                }
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static EnumerateAndStoreFunctions(hfunctionnames)
    {
        auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
        addr = 0;
        fidx = 0; // function index
        widx = 0; // word idx
        for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if(name == "_pre_cpp_init")
            {
                return fidx;
            }

            bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
            if(bsuccess == 0)
            {
                Message("Saving name of function %s failed.",name);
            }

            tmphandle = CreateArray(name);
            if(tmphandle == -1)
            {
                tmphandle = GetArrayId(name);
            }

            inextfunction = NextFunction(addr);
            if(inextfunction == BADADDR)
            {
                inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
            }

            widx = 0;
            for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)
            {
                 SetArrayLong(tmphandle, widx, Byte (tmpaddr));
                 widx = widx + 1;
            }
            bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
            fidx = fidx + 1;
        }
        return fidx;
    }

    static PrintFunctions(hfunctionnames, inumberoffunctions)
    {
        auto fidx;
        for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
        {
            Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
        }
    }

    static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr)
    {
        auto fidx, oidx, funcname, hopcodes, opcodeslen;

        for(fidx = 2; fidx >= 0; fidx = fidx - 1)
        {
            funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx);
            opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1);
            hopcodes    = GetArrayId(funcname);
            for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
            {
                PatchByte(iwriteaddr, GetArrayElement(AR_LONG, hopcodes, oidx));
                iwriteaddr = iwriteaddr + 1;
            }
        }
    }

    static main()
    {
        auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;
        auto haddresstranslationlut;

        // Get number of functions
        inumberoffunctions = GetNumberOfFunctions();

        // DEBUG
        // Message("Number of functions %d\n",inumberoffunctions);

        // Create permutation array
        hpermutation = CreatePermutation(inumberoffunctions);

        // Get current function addresses
        hfunctionaddresses = GetFunctionAddresses();

        // Get addresses after permutation
        hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);

        // Pre-processing, create address translation lookup table
        haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

        PatchInPlace(haddresstranslationlut);

        //DEBUG
        //for(didx = 0; didx<inumberoffunctions; didx++)
        //{
        //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));
        //}
        //return;

        // This array is populated with names of functions and
        // the length of the functions in bytes in the following
        // way [name1,length1,name2,length2,...]
        hfunctionnames = CreateArray("FunctionNames");

        if(hfunctionnames == -1)
        {
            // If array already exist get the handle by GetArrayId
            Message("hfunctionnames is -1.\n");
            hfunctionnames = GetArrayId("FunctionNames");
        }

        // Enumerate functions and store them in a persistent array
        inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);

        // Print functions in IDA's output window
        PrintFunctions(hfunctionnames, inumberoffunctions);

        // Write Back functions in reversed order
        WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000);

        MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
        AnalyzeArea (MinEA(), MaxEA());
    }

    Watermark1.exe before

    Code:
    .text:00401000 ; int __cdecl main(int argc, const char **argv, const char **envp)
    .text:00401000 _main           proc near               ; CODE XREF: ___tmainCRTStartup+10Ap
    .text:00401000                 call    sub_401010
    .text:00401005                 call    sub_401020
    .text:0040100A                 xor     eax, eax
    .text:0040100C                 retn
    .text:0040100C _main           endp
    .text:0040100C
    .text:0040100C ; ---------------------------------------------------------------------------
    .text:0040100D                 align 10h
    .text:00401010
    .text:00401010 ; =============== S U B R O U T I N E =======================================
    .text:00401010
    .text:00401010
    .text:00401010 sub_401010      proc near               ; CODE XREF: _mainp
    .text:00401010                 push    offset Format   ; "Hello from object 1!\n"
    .text:00401015                 call    ds:printf
    .text:0040101B                 pop     ecx
    .text:0040101C                 retn
    .text:0040101C sub_401010      endp
    .text:0040101C
    .text:0040101C ; ---------------------------------------------------------------------------
    .text:0040101D                 align 10h
    .text:00401020
    .text:00401020 ; =============== S U B R O U T I N E =======================================
    .text:00401020
    .text:00401020
    .text:00401020 sub_401020      proc near               ; CODE XREF: _main+5p
    .text:00401020                 push    offset aHelloFromObj_0 ; "Hello from object 2!\n"
    .text:00401025                 call    ds:printf
    .text:0040102B                 pop     ecx
    .text:0040102C                 retn
    .text:0040102C sub_401020      endp
    watermark1.exe after (had to press 'U' 'C' on the last routine)

    Code:
    .text:00401000 _main           proc near               ; CODE XREF: .text:00401022p
    .text:00401000                                         ; .text:00401182p
    .text:00401000                 push    offset aHelloFromObjec ; "Hello from object 2!\n"
    .text:00401005                 call    ds:printf
    .text:0040100B                 pop     ecx
    .text:0040100C                 retn
    .text:0040100C _main           endp
    .text:0040100C
    .text:0040100D
    .text:0040100D ; =============== S U B R O U T I N E =======================================
    .text:0040100D
    .text:0040100D
    .text:0040100D sub_40100D      proc near               ; CODE XREF: .text:0040101Dp
    .text:0040100D                 push    offset aHelloFromObj_0 ; "Hello from object 1!\n"
    .text:00401012                 call    ds:printf
    .text:00401018                 pop     ecx
    .text:00401019                 retn
    .text:00401019 sub_40100D      endp
    .text:00401019
    .text:00401019 ; ---------------------------------------------------------------------------
    .text:0040101A                 db 3 dup(0CCh)
    .text:0040101D ; ---------------------------------------------------------------------------
    .text:0040101D                 call    sub_40100D
    .text:00401022                 call    _main
    .text:00401027                 xor     eax, eax
    .text:00401029                 retn
    The acid test must be to get a working exe. I was a little surprised to learn that File->Produce File->Create EXE file... shows an 'Unsupported' messagebox. The Entry Point also needs to be changed.

    I have also tested the script on watermark2.exe and it also seems to work there after forcing IDA to show the correct disassembly with 'U' 'C'.

    I will do some searching afterwards to see how I can get the changes made in IDA down to a file on disk as you mentioned dELTA.
    Another thing that I'm seriously considering is switching to IDAPython. The IDC script is a little messy, there is little code reuse and I can hardly find my way through the code myself. I hope all this will change when moving to Python. This mini-project was also about learning IDC, but I think there is enough IDC code now.
    Are you ready for Python?

  12. #12
    Teach, Not Flame Kayaker's Avatar
    No I just used the IDA feature to produce the MAP file. Rumours to the contrary are unfounded
    I mean, we couldn't relink to create a map file in real life, so one could hardly "cheat" in this mini project right?


    I see what you mean about AnalyzeArea not working very well. I tried adding a second instance after the first with
    PHP Code:
    ...
    Wait();  // Wait for the end of autoanalysis
    AnalyzeArea (MinEA(), MaxEA());
    The second pass produced further changes, but still didn't get everything correct. As dELTA alluded to, I guess it's not perfect and a full redisassembly of a patched file would probably produce better results.


    However, if you're interested in what else you can do to produce good (re)disassembly results using a script, you might want to look at the source of the very effective IDA_ExtraPass_PlugIn by Sirmabus. It handles things like 'align' blocks, stray blocks of code, undefined functions and such.

    http://www.woodmann.com/collaborative/tools/index.php/ExtraPass


    If interested, you might also like to look at the IDC scripts I wrote for analyzing a malware:

    http://www.woodmann.com/forum/entry.php?35-IDC-scripting-a-Win32.Virut-variant-Part-1

    I ended up doing several "clean-up" passes to make a readable disassembly. They are in 4 separate idc scripts just for clarity. The first was a standard AnalyzeArea reanalysis after doing some decrypting. The next step was a manual fix-up of embedded string pointers (I couldn't think of a "smart" script to handle that automatically).

    Then came a script to convert operands of the form "[ebp+xxxxxxh]" to a real offset, another one to clean up unwanted operand prefix/suffix text the disassembly produced, and one more to resolve API addresses. Finally we read in a C header file containing some undefined function prototypes and structures. If you read the full blog post, I mention a few more details about some useful idc commands and a few quirks I found while working with reanalysing a disassembly.

    It just goes to show that there are a fair number of things you can do to produce a "nice" looking and accurate disassembly using IDC/plw scripts.


    Python? Are you a masochist?

  13. #13
    Quote Originally Posted by Kayaker View Post
    Python? Are you a masochist?
    Hehe, had a good laugh

    I will take a look at all the goodies you referenced in order to get the disassembly right. However, I can't wait to test the script on a real exe so I have prioritized that for today. And I think I'm almost there now. Made a quick test on watermark1.exe and it looked alright. Just need to change the entry point and do some testing
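
    Getting the patched bytes into the exe on disk comes down to translating an address in the loaded image into an offset in the file, which is what the first helper functions in the IDC script that follows are for. As a rough illustration, the same arithmetic can be sketched in Python; this is not part of the script, the function name is made up, and it assumes a 32-bit PE with the standard optional header size, so that the section table starts at e_lfanew + 0xF8.

```python
import struct


def va_to_file_offset(image, va, section=0):
    """Translate a virtual address inside the given section to a file offset.

    image is the raw file content as bytes.
    """
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a valid PE file")

    # IMAGE_FILE_HEADER.NumberOfSections
    (num_sections,) = struct.unpack_from("<H", image, e_lfanew + 0x06)
    if section >= num_sections:
        raise ValueError("invalid section")

    # IMAGE_OPTIONAL_HEADER32.ImageBase
    (image_base,) = struct.unpack_from("<I", image, e_lfanew + 0x18 + 0x1C)

    # IMAGE_SECTION_HEADER entries are 0x28 bytes each
    header = e_lfanew + 0xF8 + section * 0x28
    (virtual_address,) = struct.unpack_from("<I", image, header + 0x0C)
    (pointer_to_raw_data,) = struct.unpack_from("<I", image, header + 0x14)

    return va - image_base - virtual_address + pointer_to_raw_data
```

    For example, with an image base of 0x400000 and a first section at RVA 0x1000 whose raw data starts at file offset 0x400 (typical values, assumed here for illustration), address 0x401000 corresponds to file offset 0x400.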

    PHP Code:
    #include <idc.idc> // Mandatory include directive

    static GetFileHandle(mode)
    {
        auto hFile;

        hFile = fopen(GetInputFilePath(), mode);
        if (0 == hFile)
        {
            Message("Cannot open \"" + GetInputFile() + "\"");
        }
        return hFile;
    }

    static GetPointerToPEHeader(hfile)
    {
        auto e_lfanew;

        // Seek to the e_lfanew field
        if (0 != fseek(hfile, 0x3C, 0))
        {
            Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);
        }

        // Read the value of e_lfanew
        e_lfanew = readlong(hfile, 0);

        // Seek to IMAGE_NT_HEADERS
        if (0 != fseek(hfile, e_lfanew, 0))
        {
            Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);
        }

        // Read the Signature
        if (0x00004550 != readlong(hfile, 0))
        {
            Message("Not a valid PE file");
        }
        return e_lfanew;
    }

    static GetImageBase(hfile, e_lfanew)
    {
        auto imageBase;

        // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field
        if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))
        {
            Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");
        }
        imageBase = readlong(hfile, 0);
        return imageBase;
    }

    static GetVirtualSectionOffset(hfile, e_lfanew, section)
    {
        auto numberOfSections, sectionRva;

        // Seek to the IMAGE_FILE_HEADER.NumberOfSections field
        if (0 != fseek(hfile, e_lfanew + 0x06, 0))
        {
            Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");
        }

        // Read the number of sections
        numberOfSections = readshort(hfile, 0);

        if (section >= numberOfSections)
        {
            Fatal("Invalid section");
        }

        // Seek to the desired section
        if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))
        {
            Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");
        }

        sectionRva = readlong(hfile, 0);
        return sectionRva;
    }

    static GetRawSectionOffset(hfile, e_lfanew, section)
    {
        auto pointerToRawData;

        // Seek to the desired section
        if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))
        {
            Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");
        }

        pointerToRawData = readlong(hfile, 0);
        return pointerToRawData;
    }

    static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)
    {
        return rva - imagebase - virtualsectionoffset + rawsectionoffset;
    }

    static GetNumberOfFunctions()
    {
        auto addr, name, fidx;
        addr = 0;
        fidx = 0; // function index
        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return fidx;
            }
            fidx = fidx + 1;
        }
        return fidx;
    }

    static CreatePermutation(inumberoffunctions)
    {
        auto hpermutation;

        hpermutation = CreateArray("Permutation");
        if (hpermutation == -1)
        {
            // If array already exist get the handle by GetArrayId
            hpermutation = GetArrayId("Permutation");
        }
        // Hardcoded permutation
        SetArrayLong(hpermutation, 0, 2);
        SetArrayLong(hpermutation, 1, 1);
        SetArrayLong(hpermutation, 2, 0);
        return hpermutation;
    }
        
    static GetFunctionAddresses()
    {
        auto addr, name, fidx, hfunctionaddresses;
        addr = 0;
        fidx = 0; // function index

        hfunctionaddresses = CreateArray("FunctionAddresses");
        if (hfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hfunctionaddresses = GetArrayId("FunctionAddresses");
        }

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return hfunctionaddresses;
            }
            SetArrayLong(hfunctionaddresses, 2*fidx, addr);
            SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

            fidx = fidx + 1;
        }
        return hfunctionaddresses;
    }

    static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)
    {
        auto addr, pidx, fidx, hnewfunctionaddresses;
        addr = 0;
        pidx = 0;
        fidx = 0;

        hnewfunctionaddresses = CreateArray("NewFunctionAddresses");
        if (hnewfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");
        }

        // Address of first function
        addr = NextFunction(addr);

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
        SetArrayLong(hnewfunctionaddresses, fidx, addr);
        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

        for (pidx = 1; pidx < inumberoffunctions; pidx++)
        {
            fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
            SetArrayLong(hnewfunctionaddresses, fidx, addr);
            addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
        }
        return hnewfunctionaddresses;
    }

    static CreateAddressTranslationLUT(hnewfunctionaddresses)
    {
        auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;
        addr = 0;
        fidx = 0;

        haddresstranslationlut = CreateArray("AddressTranslationLookupTable");
        if (haddresstranslationlut == -1)
        {
            // If array already exist get the handle by GetArrayId
            haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");
        }

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return haddresstranslationlut;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            // Get new base address of function
            newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

            SetArrayLong(haddresstranslationlut, inst, newaddr);
            Message("haddresstranslationlut %x \n",haddresstranslationlut);
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            while (inst < end)
            {
                SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
            fidx = fidx + 1;
        }
        return haddresstranslationlut;
    }

    static PatchInPlaceDebug(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr;
        addr = 0;

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while (inst < end)
            {
                Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static PatchInPlace(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;
        addr = 0;

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while (inst < end)
            {
                opidx = 0;
                optype = GetOpType(inst,opidx);
                while (optype > 0)
                {
                    if (optype == 7)
                    {
                        // Immediate Near Address

                        // Maybe not necessary but check for call instruction
                        if (GetMnem(inst) == "call")
                        {
                            Message("Instruction at %x being patched.\n", inst);
                            nearaddr = LocByName(GetOpnd(inst, opidx));
                            newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);
                            PatchDword(inst+0x1, newrva+0x1);
                            if (nearaddr == BADADDR)
                            {
                                Message("Fatal error, error processing instruction at %x\n", inst);
                            }
                        }
                        else
                        {
                            Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));
                        }
                    }

                    opidx++;
                    optype = GetOpType(inst,opidx);
                }
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static EnumerateAndStoreFunctions(hfunctionnames)
    {
        auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
        addr = 0;
        fidx = 0; // function index
        widx = 0; // word idx
        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return fidx;
            }

            bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
            if (bsuccess == 0)
            {
                Message("Saving name of function %s failed.",name);
            }

            tmphandle = CreateArray(name);
            if (tmphandle == -1)
            {
                tmphandle = GetArrayId(name);
            }

            inextfunction = NextFunction(addr);
            if (inextfunction == BADADDR)
            {
                inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
            }

            widx = 0;
            for (tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)
            {
                 SetArrayLong(tmphandle, widx, Byte(tmpaddr));
                 widx = widx + 1;
            }
            bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
            fidx = fidx + 1;
        }
        return fidx;
    }

    static PrintFunctions(hfunctionnames, inumberoffunctions)
    {
        auto fidx;
        for (fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
        {
            Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
        }
    }

    static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)
    {
        auto fidx, oidx, funcname, hopcodes, opcodeslen;
        auto imagebase, virtualsectionoffset, rawsectionoffset;
        auto writeerror, byte, hglobalvars, fileoffset;

        if (writetofile == 1)
        {
            hglobalvars          = GetArrayId("GlobalVars");
            imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);
            virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);
            rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);
        }

        // DEBUG
        Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);

        for (fidx = 2; fidx >= 0; fidx = fidx - 1)
        {
            funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx);
            opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1);
            hopcodes    = GetArrayId(funcname);
            for (oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
            {
                byte = GetArrayElement(AR_LONG, hopcodes, oidx);
                PatchByte(iwriteaddr, byte);
                if (writetofile == 1)
                {
                    fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);
                    writeerror = fseek(hfile, fileoffset, 0);
                    writeerror = fputc(byte, hfile);
                    if (writeerror == -1)
                    {
                        Message("Could not write to file (RVA %x)",iwriteaddr);
                        return;
                    }
                    Message("Write byte %x to file offset %x\n", byte, fileoffset);
                }

                iwriteaddr = iwriteaddr + 1;
            }
        }
    }

    static main()
    {
        auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;
        auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;
        auto haddresstranslationlut, hglobalvars;

        writetofile              = 1;

        // This is init stuff and should be wrapped into a separate init function
        if (writetofile == 1)
        {
            section              = 0;
            hfile                = GetFileHandle("rb");
            e_lfanew             = GetPointerToPEHeader(hfile);
            imagebase            = GetImageBase(hfile, e_lfanew);
            virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);
            rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);

            hglobalvars          = CreateArray("GlobalVars");
            if (hglobalvars == -1)
            {
                // If array already exist get the handle by GetArrayId
                hglobalvars = GetArrayId("GlobalVars");
            }
            SetArrayLong(hglobalvars, 0, imagebase);
            SetArrayLong(hglobalvars, 1, virtualsectionoffset);
            SetArrayLong(hglobalvars, 2, rawsectionoffset);
            fclose(hfile);
            hfile                = GetFileHandle("r+");
        }

        // Get number of functions
        inumberoffunctions = GetNumberOfFunctions();

        // DEBUG
        // Message("Number of functions %d\n",inumberoffunctions);

        // Create permutation array
        hpermutation = CreatePermutation(inumberoffunctions);

        // Get current function addresses
        hfunctionaddresses = GetFunctionAddresses();

        // Get addresses after permutation
        hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);

        // Pre-processing, create address translation lookup table
        haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

        PatchInPlace(haddresstranslationlut);

        //DEBUG
        //for(didx = 0; didx<inumberoffunctions; didx++)
        //{
        //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));
        //}
        //return;

        // This array is populated with names of functions and
        // the length of the functions in dwords in the following
        // way [name1,length1,name2,length2,...]
        hfunctionnames = CreateArray("FunctionNames");

        if (hfunctionnames == -1)
        {
            // If array already exist get the handle by GetArrayId
            Message("hfunctionnames is -1.\n");
            hfunctionnames = GetArrayId("FunctionNames");
        }

        //  Enumerate functions and store them i persistent array
        inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);

        // Print functions in IDA's output window
        PrintFunctions(hfunctionnames, inumberoffunctions);

        // Write Back functions in reversed order
        WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);

        if (writetofile == 1)
        {
            fclose(hfile);
        }

        MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
        AnalyzeArea (MinEA(), MaxEA());
    }
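
    The address bookkeeping done by GetNewFunctionAddresses, CreateAddressTranslationLUT and the call patch in PatchInPlace can also be tried outside IDA. The following is only a rough Python sketch of that arithmetic, not a port of the script: the names new_layout, translate and retarget_call are made up for illustration, the function sizes are invented, and it assumes every call is a 5-byte near call (opcode E8 followed by a rel32).

```python
import struct

def new_layout(funcs, order):
    """funcs: list of (old_start, size) in original order.
    order[k] is the index of the function placed k-th in the new layout.
    Returns a list with the new start address of each original function."""
    addr = funcs[0][0]              # the new layout starts where the first function was
    new_start = [None] * len(funcs)
    for idx in order:
        new_start[idx] = addr
        addr += funcs[idx][1]       # functions are packed back to back
    return new_start

def translate(va, funcs, new_start):
    """Map an old address inside a moved function to its new address."""
    for i, (start, size) in enumerate(funcs):
        if start <= va < start + size:
            return new_start[i] + (va - start)
    return va                       # address is outside the moved region

def retarget_call(call_va, target_va, funcs, new_start):
    """Return the little-endian rel32 bytes for an E8 call after both
    the call instruction and its target have been moved."""
    new_call = translate(call_va, funcs, new_start)
    new_target = translate(target_va, funcs, new_start)
    # rel32 is relative to the address of the instruction after the call.
    rel32 = (new_target - (new_call + 5)) & 0xFFFFFFFF
    return struct.pack("<I", rel32)

if __name__ == "__main__":
    # Three hypothetical functions, written back in reversed order (2, 1, 0).
    funcs = [(0x401000, 0x20), (0x401020, 0x30), (0x401050, 0x10)]
    starts = new_layout(funcs, [2, 1, 0])
    print([hex(a) for a in starts])
    print(retarget_call(0x401055, 0x401000, funcs, starts).hex())
```

    The order list plays the same role as the hardcoded "Permutation" array in the script: entry k names the function that ends up in slot k.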


  14. #14
    After running the script below on watermark1.exe I learned about base relocations. They prevent the 'dewatermarked' exe from running, because the relocation directory still refers to the old locations of the code that was moved.

    The script below now also corrects the entry point, so it should (in theory) work on an exe with a fixed base or with stripped relocation info.
    If I manually correct the relevant values in the relocation directory with a hex editor, the 'dewatermarked' exe runs without any problems. Since watermark1.exe and watermark2.exe were built with relocation information and a dynamic base, it would most likely be considered cheating if the script isn't updated to correct the relocation information as well.
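
    The manual hex-editor fix could in principle be automated. The following Python sketch (not IDC, and not part of the script in this post) shows one way to parse the raw bytes of a PE base relocation directory, move every entry through a caller-supplied address mapping, and rebuild the blocks. The function names and the translate_rva callback are assumptions made for illustration. The sketch only relocates the entries themselves; it does not touch the absolute addresses stored at those locations, and the caller would still have to check that the rebuilt data fits in the space of the original directory.

```python
import struct
from collections import defaultdict

IMAGE_REL_BASED_ABSOLUTE = 0   # padding entry, carries no relocation
IMAGE_REL_BASED_HIGHLOW = 3    # 32-bit absolute address fixup

def parse_relocs(data):
    """Return a list of (rva, type) for every non-padding entry."""
    entries = []
    pos = 0
    while pos + 8 <= len(data):
        page_rva, block_size = struct.unpack_from("<II", data, pos)
        if block_size < 8:
            break                                  # terminator or malformed block
        for i in range((block_size - 8) // 2):
            (word,) = struct.unpack_from("<H", data, pos + 8 + 2 * i)
            rtype, offset = word >> 12, word & 0x0FFF
            if rtype != IMAGE_REL_BASED_ABSOLUTE:
                entries.append((page_rva + offset, rtype))
        pos += block_size
    return entries

def build_relocs(entries):
    """Group entries by 4 KB page and emit relocation blocks."""
    pages = defaultdict(list)
    for rva, rtype in entries:
        pages[rva & ~0xFFF].append((rtype << 12) | (rva & 0xFFF))
    out = bytearray()
    for page in sorted(pages):
        words = sorted(pages[page], key=lambda w: w & 0x0FFF)
        if len(words) % 2:
            words.append(0)                        # pad block to a 4-byte multiple
        out += struct.pack("<II", page, 8 + 2 * len(words))
        out += struct.pack("<%dH" % len(words), *words)
    return bytes(out)

def remap_relocs(data, translate_rva):
    """Apply translate_rva to the location of every relocation entry."""
    moved = [(translate_rva(rva), rtype) for rva, rtype in parse_relocs(data)]
    return build_relocs(moved)
```

    Here translate_rva would be the RVA counterpart of the script's address translation lookup table: it takes the old RVA of a fixup and returns the RVA where the same bytes end up after reordering.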

    Hopefully there will be an Xmas version of the IDC script that takes care of the relocation information, but most likely it will be a New Year edition.

    PHP Code:
    #include <idc.idc> // Mandatory include directive

    static GetFileHandle(mode)
    {
        auto hFile;

        hFile = fopen(GetInputFilePath(), mode);
        if (0 == hFile)
        {
            Message("Cannot open \"" + GetInputFile() + "\"");
        }
        return hFile;
    }

    static GetPointerToPEHeader(hfile)
    {
        auto e_lfanew;

        // Seek to the e_lfanew field
        if (0 != fseek(hfile, 0x3C, 0))
        {
            Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);
        }

        // Read the value of e_lfanew
        e_lfanew = readlong(hfile, 0);

        // Seek to IMAGE_NT_HEADERS
        if (0 != fseek(hfile, e_lfanew, 0))
        {
            Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);
        }

        // Read the Signature
        if (0x00004550 != readlong(hfile, 0))
        {
            Message("Not a valid PE file");
        }
        return e_lfanew;
    }

    static GetImageBase(hfile, e_lfanew)
    {
        auto imageBase;

        // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field
        if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))
        {
            Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");
        }
        imageBase = readlong(hfile, 0);
        return imageBase;
    }

    static GetVirtualSectionOffset(hfile, e_lfanew, section)
    {
        auto numberOfSections, sectionRva;

        // Seek to the IMAGE_FILE_HEADER.NumberOfSections field
        if (0 != fseek(hfile, e_lfanew + 0x06, 0))
        {
            Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");
        }

        // Read the number of sections
        numberOfSections = readshort(hfile, 0);

        if (section >= numberOfSections)
        {
            Fatal("Invalid section");
        }

        // Seek to the desired section
        if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))
        {
            Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");
        }

        sectionRva = readlong(hfile, 0);
        return sectionRva;
    }

    static GetRawSectionOffset(hfile, e_lfanew, section)
    {
        auto pointerToRawData;

        // Seek to the desired section
        if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))
        {
            Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");
        }

        pointerToRawData = readlong(hfile, 0);
        return pointerToRawData;
    }

    static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)
    {
        return rva - imagebase - virtualsectionoffset + rawsectionoffset;
    }

    static GetNumberOfFunctions()
    {
        auto addr, name, fidx;
        addr = 0;
        fidx = 0; // function index
        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return fidx;
            }
            fidx = fidx + 1;
        }
        return fidx;
    }

    static CreatePermutation(inumberoffunctions)
    {
        auto hpermutation;

        hpermutation = CreateArray("Permutation");
        if (hpermutation == -1)
        {
            // If array already exist get the handle by GetArrayId
            hpermutation = GetArrayId("Permutation");
        }
        // Hardcoded permutation
        SetArrayLong(hpermutation, 0, 2);
        SetArrayLong(hpermutation, 1, 1);
        SetArrayLong(hpermutation, 2, 0);
        return hpermutation;
    }
        
    static GetFunctionAddresses()
    {
        auto addr, name, fidx, hfunctionaddresses;
        addr = 0;
        fidx = 0; // function index

        hfunctionaddresses = CreateArray("FunctionAddresses");
        if (hfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hfunctionaddresses = GetArrayId("FunctionAddresses");
        }

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return hfunctionaddresses;
            }
            SetArrayLong(hfunctionaddresses, 2*fidx, addr);
            SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);

            fidx = fidx + 1;
        }
        return hfunctionaddresses;
    }

    static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)
    {
        auto addr, pidx, fidx, hnewfunctionaddresses;
        addr = 0;
        pidx = 0;
        fidx = 0;

        hnewfunctionaddresses = CreateArray("NewFunctionAddresses");
        if (hnewfunctionaddresses == -1)
        {
            // If array already exist get the handle by GetArrayId
            hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");
        }

        // Address of first function
        addr = NextFunction(addr);

        fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
        SetArrayLong(hnewfunctionaddresses, fidx, addr);
        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);

        for (pidx = 1; pidx < inumberoffunctions; pidx++)
        {
            fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
            SetArrayLong(hnewfunctionaddresses, fidx, addr);
            addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
        }
        return hnewfunctionaddresses;
    }

    static CreateAddressTranslationLUT(hnewfunctionaddresses)
    {
        auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;
        addr = 0;
        fidx = 0;

        haddresstranslationlut = CreateArray("AddressTranslationLookupTable");
        if (haddresstranslationlut == -1)
        {
            // If array already exist get the handle by GetArrayId
            haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");
        }

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return haddresstranslationlut;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            // Get new base address of function
            newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);

            SetArrayLong(haddresstranslationlut, inst, newaddr);
            Message("haddresstranslationlut %x \n",haddresstranslationlut);
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            while (inst < end)
            {
                SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
            fidx = fidx + 1;
        }
        return haddresstranslationlut;
    }

    static PatchInPlaceDebug(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr;
        addr = 0;

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while (inst < end)
            {
                Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static PatchInPlace(haddresstranslationlut)
    {
        auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr;
        addr = 0;

        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return;
            }
            end  = GetFunctionAttr(addr, FUNCATTR_END);
            inst = addr;

            while (inst < end)
            {
                opidx = 0;
                optype = GetOpType(inst,opidx);
                while (optype > 0)
                {
                    if (optype == 7)
                    {
                        // Immediate Near Address

                        // Maybe not necessary but check for call instruction
                        if (GetMnem(inst) == "call")
                        {
                            Message("Instruction at %x being patched.\n", inst);
                            nearaddr = LocByName(GetOpnd(inst, opidx));
                            newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr) - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);
                            PatchDword(inst+0x1, newrva+0x1);
                            if (nearaddr == BADADDR)
                            {
                                Message("Fatal error, error processing instruction at %x\n", inst);
                            }
                        }
                        else
                        {
                            Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));
                        }
                    }

                    opidx++;
                    optype = GetOpType(inst,opidx);
                }
                inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
            }
        }
    }

    static EnumerateAndStoreFunctions(hfunctionnames)
    {
        auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
        addr = 0;
        fidx = 0; // function index
        widx = 0; // word idx
        for (addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
        {
            name = Name(addr);

            // Stop if name of function is _pre_cpp_init
            // It is assumed that compiler/linker generated functions
            // are appended in the end of image and that they start with the
            // _pre_cpp_init function
            if (name == "_pre_cpp_init")
            {
                return fidx;
            }

            bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
            if (bsuccess == 0)
            {
                Message("Saving name of function %s failed.",name);
            }

            tmphandle = CreateArray(name);
            if (tmphandle == -1)
            {
                tmphandle = GetArrayId(name);
            }

            inextfunction = NextFunction(addr);
            if (inextfunction == BADADDR)
            {
                inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
            }

            widx = 0;
            for (tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)
            {
                 SetArrayLong(tmphandle, widx, Byte(tmpaddr));
                 widx = widx + 1;
            }
            bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
            fidx = fidx + 1;
        }
        return fidx;
    }

    static PrintFunctions(hfunctionnames, inumberoffunctions)
    {
        auto fidx;
        for (fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
        {
            Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
        }
    }

    static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)
    {
        auto fidx, oidx, funcname, hopcodes, opcodeslen;
        auto imagebase, virtualsectionoffset, rawsectionoffset;
        auto writeerror, byte, hglobalvars, fileoffset;

        if (writetofile == 1)
        {
            hglobalvars          = GetArrayId("GlobalVars");
            imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);
            virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);
            rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);
        }

        // DEBUG
        Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);

        for (fidx = 2; fidx >= 0; fidx = fidx - 1)
        {
            funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx);
            opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1);
            hopcodes    = GetArrayId(funcname);
            for (oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
            {
                byte = GetArrayElement(AR_LONG, hopcodes, oidx);
                PatchByte(iwriteaddr, byte);
                if (writetofile == 1)
                {
                    fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);
                    writeerror = fseek(hfile, fileoffset, 0);
                    writeerror = fputc(byte, hfile);
                    if (writeerror == -1)
                    {
                        Message("Could not write to file (RVA %x)",iwriteaddr);
                        return;
                    }
                    Message("Write byte %x to file offset %x\n", byte, fileoffset);
                }

                iwriteaddr = iwriteaddr + 1;
            }
        }
    }

    static main()
    {
        auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;
        auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;
        auto haddresstranslationlut, hglobalvars, main, call2main, newrva, fileoffset, writeerror;

        writetofile              = 1;

        // This is init stuff and should be wrapped into a separate init function
        if(writetofile == 1)
        {
            section              = 0;
            hfile                = GetFileHandle("rb");
            e_lfanew             = GetPointerToPEHeader(hfile);
            imagebase            = GetImageBase(hfile, e_lfanew);
            virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);
            rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);

            hglobalvars          = CreateArray("GlobalVars");
            if(hglobalvars == -1)
            {
                // If the array already exists, get the handle with GetArrayId
                hglobalvars = GetArrayId("GlobalVars");
            }
            SetArrayLong(hglobalvars, 0, imagebase);
            SetArrayLong(hglobalvars, 1, virtualsectionoffset);
            SetArrayLong(hglobalvars, 2, rawsectionoffset);

            // Get address of main
            main                 = LocByName("_main");
            if(main == BADADDR)
            {
                Message("Could not find _main. Aborting...\n");
                return;
            }
            call2main = RfirstB(main);
            if(GetMnem(call2main) != "call")
            {
                Message("Expecting to find call to _main. Unsuccessful. Aborting...\n");
                return;
            }

            fclose(hfile);
            hfile                = GetFileHandle("r+");
        }

        // Get number of functions
        inumberoffunctions = GetNumberOfFunctions();

        // DEBUG
        // Message("Number of functions %d\n",inumberoffunctions);

        // Create permutation array
        hpermutation = CreatePermutation(inumberoffunctions);

        // Get current function addresses
        hfunctionaddresses = GetFunctionAddresses();

        // Get addresses after permutation
        hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);

        // Pre-processing, create address translation lookup table
        haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);

        PatchInPlace(haddresstranslationlut);

        // Fix call to _main
        if(writetofile == 1)
        {
            // 7 = o_near (immediate near address)
            if(GetOpType(call2main,0) != 7)
            {
                Message("Unexpected operand found at call2main. Aborting...\n");
                return;
            }
            // The rel32 operand of the call is (new address of _main) - (call2main + 5).
            // It is computed here as (target - (call2main + 6)) + 1, which is the same value.
            newrva     = GetArrayElement(AR_LONG, haddresstranslationlut, main) - (call2main+0x6);
            PatchDword(call2main+0x1, newrva+0x1);

            fileoffset = GetFileOffset(call2main+1, imagebase, virtualsectionoffset, rawsectionoffset);
            writeerror = fseek(hfile, fileoffset, 0);
            writeerror = writelong(hfile, newrva+0x1, 0);
            if(writeerror == -1)
            {
                Message("Could not patch call2main (newrva %x)", newrva);
                return;
            }
            Message("Write long %x to file offset %x\n", newrva, fileoffset);
        }

        //DEBUG
        //for(didx = 0; didx<inumberoffunctions; didx++)
        //{
        //    Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));
        //}
        //return;

        // This array is populated with the names of the functions and
        // the length of each function in bytes, in the following
        // way: [name1,length1,name2,length2,...]
        hfunctionnames = CreateArray("FunctionNames");

        if(hfunctionnames == -1)
        {
            // If the array already exists, get the handle with GetArrayId
            Message("hfunctionnames is -1.\n");
            hfunctionnames = GetArrayId("FunctionNames");
        }

        // Enumerate functions and store them in a persistent array
        inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);

        // Print functions in IDA's output window
        PrintFunctions(hfunctionnames, inumberoffunctions);

        // Write back functions in reversed order
        WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);

        if(writetofile == 1)
        {
            fclose(hfile);
        }

        MakeUnknown(MinEA(), MaxEA() - MinEA(), 1);
        AnalyzeArea(MinEA(), MaxEA());
    }
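
For anyone who wants to check the numbers outside IDA, here is a rough Python sketch of the two calculations the script leans on when it fixes the call to _main: the rel32 operand of a near call, and the translation from a virtual address to a file offset. Python is used only because IDC cannot run without IDA. The function names and the example addresses are made up for illustration, and the file-offset formula is an assumption about what GetFileOffset does, since its body is not shown in this part of the listing.

```python
import struct


def call_rel32(call_addr, target_addr):
    """rel32 stored in a 5-byte E8 call: target minus address of the next instruction."""
    return (target_addr - (call_addr + 5)) & 0xFFFFFFFF


def va_to_file_offset(va, imagebase, virtual_section_offset, raw_section_offset):
    """Assumed mapping: strip the image base, then swap the section's
    virtual offset for its raw (on-disk) offset."""
    return va - imagebase - virtual_section_offset + raw_section_offset


def patch_call(image, call_offset, call_addr, new_target):
    """Overwrite the rel32 operand of the E8 call found at call_offset in a bytearray."""
    if image[call_offset] != 0xE8:
        raise ValueError("not a near call (E8)")
    struct.pack_into("<I", image, call_offset + 1, call_rel32(call_addr, new_target))


# Made-up example: a call at 0x401100 that should now reach 0x401020.
code = bytearray(b"\xE8\x00\x00\x00\x00\x90")
patch_call(code, 0, 0x401100, 0x401020)
offset = va_to_file_offset(0x401100, 0x400000, 0x1000, 0x400)
```

The script writes the operand as (target - (call2main + 6)) + 1, which gives the same value as target - (call2main + 5) used here.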


  15. #15
    Administrator dELTA
    Join Date
    Oct 2000
    Location
    Ring -1
    Posts
    4,204
    Blog Entries
    5
    Very nice work niaren!

    Next steps would be to update all locations that IDA classifies as addresses. If the executable has a relocation table, they should all be listed there, so in that case you have already hit the jackpot. The relocs might be stripped from executables though (unlike DLLs), and that would make the script useless on them [first protector counter-measure, woo]. After that come the slightly more complicated inter-functional offsets (note: offsets != addresses). If you relocate code at function level rather than object-file level, you might even have to mutate the code in more complex ways than just patching addresses in order to fix these inter-functional offsets, since the new relocated offset might need more bits than the original one did. If you stay at object-file level, I would think you can ignore them completely, since there would not be any in that case.
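
To make the "more bits" problem concrete, here is a small Python sketch (not IDC, and not part of niaren's script) that checks whether a 2-byte short jump can still reach its target after both have been moved. It assumes the usual x86 short-jump form with a signed 8-bit displacement measured from the end of the instruction; the function names and addresses are made up for illustration.

```python
def rel8_displacement(jmp_addr, target_addr):
    """Displacement of a 2-byte short jump: target minus address of the next instruction."""
    return target_addr - (jmp_addr + 2)


def fits_in_rel8(displacement):
    """True if the displacement fits in a signed 8-bit operand."""
    return -128 <= displacement <= 127


def needs_widening(new_jmp_addr, new_target_addr):
    """True if the relocated short jump must be re-encoded with a rel32 operand."""
    return not fits_in_rel8(rel8_displacement(new_jmp_addr, new_target_addr))


# Made-up addresses: the first jump still fits, the second one does not.
still_short = needs_widening(0x401000, 0x401050)
must_grow = needs_widening(0x401000, 0x402000)
```

When needs_widening returns True, the jump has to be replaced by a longer encoding, which shifts every byte after it, so the address translation table would have to be rebuilt.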

    After this, I guess testing on more and more complex executables is the way to go, until one of them possibly crashes after your dewatermarking. Then analyze the crash/disassembly in IDA to see what kind of special case causes the script not to work, implement support for this special case, and iterate the procedure until the dewatermarking code shuffling produces a working IDA executable.

    After that, in order to make it a serious "generic dewatermarker", my suggested steps are probably these (as loosely mentioned in my previous posts too):

    • Import table shuffler (should be easy as long as all code that is statically linked to imports is correctly analyzed by IDA)
    • Export table shuffler (should be easy no matter what)
    • PE resource directory shuffler (should be easy under normal conditions, I think)
    • Relocation table shuffler (this might be covered by what you already mention as your planned next step, but I cannot bring myself to remember if the contents of the relocation table can have arbitrary order, or if they must be ordered by relocated address - in the former case you should always randomly reshuffle the order of the relocations too, to eradicate any watermark entropy that might be hidden in this ordering).
    • Code-location-independent function diffing tool (checking for any differences within functions that are not related to their location, and thus neglecting differences relating to call and jump addresses/offsets in the code but detecting all other differences, e.g. to see if there are any differences in the instructions used in sub-areas of functions). Do note that not all addresses/offsets can/should be ignored during this process though, only those related to jumps/calls in the code, since otherwise entropy can be hidden in e.g. the ordering of data in data sections!
    • Data area diffing tool (detecting differences in default data section contents).
    • Non PE-section data diffing tool (diffing the content of non-PE-section parts of executable files, e.g. PE headers, code caves or data inserted between, before or after PE section areas in the executable).
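
As a rough illustration of the code-location-independent diffing idea, here is a Python sketch. It assumes the functions have already been disassembled (e.g. by IDA) into a list of (raw_bytes, branch_operand_len) pairs, where branch_operand_len is the number of trailing bytes holding a call/jump displacement and 0 for everything else. That input format and all the names are invented for this example.

```python
from itertools import zip_longest


def normalize(instructions):
    """Zero out call/jump displacements so that only location-independent bytes remain."""
    normalized = []
    for raw, branch_operand_len in instructions:
        keep = len(raw) - branch_operand_len
        normalized.append(raw[:keep] + b"\x00" * branch_operand_len)
    return normalized


def differing_instructions(func_a, func_b):
    """Indices of instructions that differ once branch displacements are ignored."""
    pairs = zip_longest(normalize(func_a), normalize(func_b))
    return [i for i, (a, b) in enumerate(pairs) if a != b]


# Two made-up copies of one function: the call displacement differs (ignored),
# but the immediate loaded into eax differs as well (reported).
copy_1 = [(b"\x55", 0), (b"\xE8\x10\x00\x00\x00", 4), (b"\xB8\x01\x00\x00\x00", 0), (b"\xC3", 0)]
copy_2 = [(b"\x55", 0), (b"\xE8\x40\x02\x00\x00", 4), (b"\xB8\x02\x00\x00\x00", 0), (b"\xC3", 0)]
suspicious = differing_instructions(copy_1, copy_2)
```

Only branch displacements are masked here. Data addresses and immediates are left alone on purpose, in line with the caveat above that entropy can hide in the ordering of data.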


    For any detected differences in code sections, you must mutate the affected functions with a code obfuscation algorithm to hope to remove any watermarking entropy hidden in their original implementation. This algorithm should at least obfuscate/morph instruction ordering and substitute instructions or instruction sequences with semantically equivalent instructions or instruction sequences. Only "changing their CRC" with simple antivirus evasion-style obfuscation (insertion of nops, xor decryption layer etc) won't remove much entropy, since the watermark could still be recovered by manual analysis.

    All code differences must also be analyzed manually by the reverser running the script/differ though, since a final resort of the protector might be to generate semantically different code in each watermarked version (e.g. setting a register to a serial number in some stray instruction somewhere), which would then not be removed by the above code obfuscation techniques. It could then be concluded by the reverser to be superfluous to proper program operation though, and thus completely nopped out instead.

    If you follow this advice in your implementation, you'll have a pretty damn capable (and unique) generic dewatermarker tool in your hands I'd say.
    "Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."
