Watermarking by linking order

evaluator
Posts: 1538
Joined: Tue Sep 18, 2001 2:00 pm

Post by evaluator »

if program is not too complex, then re-implementing into new program is quite enough..
:P :P

P.S. well, & then selling that as yours..
disavowed
Posts: 1290
Joined: Mon Apr 01, 2002 3:00 pm

Post by disavowed »

This is a bit off-topic since it's not about watermarking by linking order, but you may want to take a look at http://www.woodmann.com/crackz/Tutorials/Tsehpida.htm ("Reversing IDA 4.01 - Watermarked protection scheme") nonetheless.
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

It's always good to be familiar with your history, thanks for the reference disa. :yay:
"Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."
niaren
Member
Posts: 70
Joined: Thu Dec 10, 2009 3:16 pm

Post by niaren »

dELTA, I haven't really thought about the case where relocation information is stripped, or about the inter-functional offset issue. That is a small setback :) Let's see; I have no clue right now how big a problem it is. As you also mention, let's do it by (small) step-by-step trial and error ;)
I also understand your point that the watermark leaves traces in the other directories, and that those would have to be 'taken care of' as well. I agree with you on everything :)

I've just set up IDAPython and will start rewriting the IDC script in Python to see whether that causes many problems. Unless there are problems, I hope this means working smarter, not harder, and if that means being a masochist then yes, I'm that! :p

At least this example works: one line that prints out all functions and their addresses :)

Code:

print [(hex(func),GetFunctionName(func)) for func in Functions(SegStart(ScreenEA()),SegEnd(ScreenEA()))]
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

Sounds great, then we are very much looking forward to your next status report, sitting here ready to answer any further questions. :yay:
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

Shortly after posting my last message above, I came to think about adding TLS directory shuffling to my list of things necessary for "complete de-watermarkation".

BanMe just mentioned something else regarding TLS in another thread, directed at you niaren, and seemingly more specifically aimed at the discussion in this thread, so I'll include it here too:
BanMe wrote:*side note to niaren* What happens if the image has tls functions and these are not included in the reloc section.. I saw a function that relocs Tls as well somewhere but just noted it for interest..
Actually, all parts of the PE specification should be gone through carefully to make sure that none are left in which entropy could be hidden. My short enumeration in the previous post was just approximate and off the top of my head.

You yourself (niaren) also mentioned above the more general problem of absolute addresses not being included in the reloc table (e.g. because there is no reloc table to begin with, due to reloc stripping).

I'd say that as long as all code in the executable is free from encryption/obfuscation, IDA will already have parsed up and classified all such addresses in the code for you, ripe and ready for your picking, so not much trouble at all really. Also, if any such addresses are part of a watermark (and thus differ between different copies of the executable) you will find them easily with the "Code-location-independent function diffing tool" that I mention above, since this tool will only ignore offsets and absolute addresses explicitly mentioned in the reloc table (or even just identified by IDA).

Finally, encrypted/obfuscated code containing watermark data will of course also be easily identified by the data and code diffing tools I mention.
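
To illustrate the idea of location-independent diffing, here is a minimal sketch (standalone Python, not tied to IDA). It assumes you already have the raw bytes of the same function from two copies of the executable, plus the offsets inside the function where 4-byte relocated or absolute addresses sit. The names mask_relocs and diff_functions are hypothetical, made up for this example, and not part of any existing tool.

```python
def mask_relocs(code, reloc_offsets, width=4):
    """Return a copy of `code` with `width` bytes zeroed at each relocated operand."""
    masked = bytearray(code)
    for off in reloc_offsets:
        for i in range(off, min(off + width, len(masked))):
            masked[i] = 0
    return bytes(masked)


def diff_functions(code_a, relocs_a, code_b, relocs_b):
    """Return the offsets where two function bodies differ once relocated
    operands are masked out, or None if the bodies have different sizes."""
    a = mask_relocs(code_a, relocs_a)
    b = mask_relocs(code_b, relocs_b)
    if len(a) != len(b):
        return None
    return [i for i in range(len(a)) if a[i] != b[i]]


# Two copies of "push imm32 ; ret" that differ only in the pushed address.
f1 = bytes.fromhex("6800104000c3")  # push 0x401000 ; ret
f2 = bytes.fromhex("6800204000c3")  # push 0x402000 ; ret

masked_diff = diff_functions(f1, [1], f2, [1])  # operand at offset 1 is masked
raw_diff = diff_functions(f1, [], f2, [])       # nothing masked
print(masked_diff, raw_diff)
```

With the operand masked the two bodies compare equal; without masking the difference shows up at the byte that encodes the address. Any difference that survives masking is a candidate for watermark data.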
niaren
Member
Posts: 70
Joined: Thu Dec 10, 2009 3:16 pm

Post by niaren »

dELTA, I keep getting distracted but I have begun the task of porting the IDC script to python :) Hopefully, I can do it this weekend. I will post here as soon as it is done.
Thanks a lot, you're extremely helpful :)
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

Sounds great, looking forward to your progress reports. :yay:

This is a very good learning project (IDA scripting, PE structure, code anatomy, etc.) and it would also be really cool if a good generic watermark detection/destruction tool came out of it.
niaren
Member
Posts: 70
Joined: Thu Dec 10, 2009 3:16 pm

In between update

Post by niaren »

To make things even simpler, the makefile has been changed so that:
- dependent functions are linked into the exe file (/MD -> /MT)
Without this modification, renaming the exe file would give an error, e.g. 'msvcr90.dll not found'.
- the manifest file option is disabled
- the exe file is linked with no dynamic base and a fixed base address
With this modification it is possible to test the code reshuffling without worrying about reloc info.

As a side comment, how many functions would you guess are identified/found by IDA inside the 'new' exe? See the answer below :)
(remember that we only defined 3 functions)

With these modifications, the IDC script and Python script below both reorder the functions (the 3 functions made by us) in the exe file, and the exe file actually runs afterwards :)
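
The core of the reordering is working out where each function ends up. A minimal sketch of that step, in standalone Python with made-up addresses: new_layout is a hypothetical helper name, and the "slot size" of a function is assumed to be the distance from its start to the start of the next function, so the padding travels with it.

```python
def new_layout(starts, slot_sizes, order):
    """Map each original function start to its start address after reordering.

    starts[i]     : original start address of function i
    slot_sizes[i] : bytes from the start of function i to the start of the
                    next function (function body plus trailing padding)
    order[k]      : index of the function placed at position k
    """
    mapping = {}
    cur = starts[0]  # the block of functions keeps its original base address
    for idx in order:
        mapping[starts[idx]] = cur
        cur += slot_sizes[idx]
    return mapping


# Three functions laid out back to back (addresses are illustrative only).
starts = [0x401000, 0x401020, 0x401050]
slot_sizes = [0x20, 0x30, 0x10]
layout = new_layout(starts, slot_sizes, [2, 1, 0])  # reversed order

for old, new in sorted(layout.items()):
    print(hex(old), "->", hex(new))
```

Because every slot size here is a multiple of 16, the functions stay paragraph aligned after the shuffle. If a slot size is not a multiple of the alignment, this simple scheme would need extra padding.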



Things to do (in prioritized order) before moving on to other parts of the exe, as pointed out by dELTA:
- automatically find the range of functions that the script can reorder (hardcoded now)
- make the Python code look nice (subclass of the PE class from pefile)
- make the IDA output look nice (Kayaker's ideas)

New makefile

Code:




SRCS = main.c file1.c file2.c

OBJS1 = main.obj file1.obj file2.obj
OBJS2 = file2.obj file1.obj main.obj 

CC        = CL
CCFLAGS   = /O2 /Oi /D "_MBCS" /FD /EHsc /MT /Gy /W3 /c /Zi /TC
            

LINK       = link
LINKFLAGS1 = "/OUT:watermark1.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 
LINKFLAGS2 = "/OUT:watermark2.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib 


EC = echo
RM = del


default: all


clean:
	@$(RM) /F *.obj
	@$(RM) /F *.idb
	@$(RM) /F *.pdb
	@$(RM) /F *.exe
	@$(RM) /F *manifest*

%.obj : %.c 
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	@$(EC) ************************************************
	@$(EC) * Compiling [email protected]
	$(CC)  $(CCFLAGS) $<

watermark1.exe: $(OBJS1)
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	$(LINK) $(LINKFLAGS1) $(OBJS1)
	$(LINK) $(LINKFLAGS2) $(OBJS2)



all: watermark1.exe
IDC script

Code:

#include <idc.idc> // Mandatory include directive

static GetFileHandle(mode)
{
	auto hFile;
	
	hFile = fopen(GetInputFilePath(), mode);
	if (0 == hFile)
	{
		Message("Cannot open \"" + GetInputFile() + "\"");
	}
	return hFile;
}

static GetPointerToPEHeader(hfile)
{
	auto e_lfanew;
	
	// Seek to the e_lfanew field 
	if (0 != fseek(hfile, 0x3C, 0))
	{
		Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);
	}

	// Read the value of e_lfanew
	e_lfanew = readlong(hfile, 0);

	// Seek to IMAGE_NT_HEADERS
	if (0 != fseek(hfile, e_lfanew, 0))
	{
		Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);
	}

	// Read the Signature
	if (0x00004550 != readlong(hfile, 0))
	{
		Message("Not a valid PE file");
	}
	return e_lfanew;
}

static GetImageBase(hfile, e_lfanew)
{
	auto imageBase;
	
	// Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field
	if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))
	{
		Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");
	}
	imageBase = readlong(hfile, 0);
	return imageBase;
}

static GetVirtualSectionOffset(hfile, e_lfanew, section)
{
	auto numberOfSections, sectionRva;
	
	// Seek to the IMAGE_FILE_HEADER.NumberOfSections field
	if (0 != fseek(hfile, e_lfanew + 0x06, 0))
	{
		Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");
	}

	// Read the number of sections
	numberOfSections = readshort(hfile, 0);
	
	if (section >= numberOfSections)
	{
		Fatal("Invalid section");
	}

	// Seek to the desired section
	if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))
	{
		Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");
	}

	sectionRva = readlong(hfile, 0);
	return sectionRva;
}

static GetRawSectionOffset(hfile, e_lfanew, section)
{
	auto pointerToRawData;
	
	// Seek to the desired section
	if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))
	{
		Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");
	}

	pointerToRawData = readlong(hfile, 0);
	return pointerToRawData;
}

static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)
{
	return rva - imagebase - virtualsectionoffset + rawsectionoffset;
}

static GetNumberOfFunctions()
{
    auto addr, name, fidx;
    addr = 0;
    fidx = 0; // function index
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if(name == "_pre_cpp_init")
        {
            return fidx;
        }
        fidx = fidx + 1;        
    }
    return fidx;
}

static CreatePermutation(inumberoffunctions)
{
    auto hpermutation;
    
	hpermutation = CreateArray("Permutation");
	if(hpermutation == -1)
    {
        // If array already exist get the handle by GetArrayId
        hpermutation = GetArrayId("Permutation");
    }
    // Hardcoded permutation
    SetArrayLong(hpermutation, 0, 2);
    SetArrayLong(hpermutation, 1, 1);
    SetArrayLong(hpermutation, 2, 0);
    return hpermutation;
}
	
static GetFunctionAddresses()
{
    auto addr, name, fidx, hfunctionaddresses;
    addr = 0;
    fidx = 0; // function index
    
    hfunctionaddresses = CreateArray("FunctionAddresses");
	if(hfunctionaddresses == -1)
    {
        // If array already exist get the handle by GetArrayId
        hfunctionaddresses = GetArrayId("FunctionAddresses");
    }

    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") ||  (fidx==3) )
        {
            return hfunctionaddresses;
        }
        SetArrayLong(hfunctionaddresses, 2*fidx, addr);
        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);
        
        fidx = fidx + 1;
    }
    return hfunctionaddresses;
}

static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)
{
	auto addr, pidx, fidx, hnewfunctionaddresses;
    addr = 0;
    pidx = 0;
	fidx = 0;
	
    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");
	if(hnewfunctionaddresses == -1)
    {
        // If array already exist get the handle by GetArrayId
        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");
    }
    
	// Address of first function
    addr = NextFunction(addr);
    
    fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 
    SetArrayLong(hnewfunctionaddresses, fidx, addr);
    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
    
    for(pidx=1; pidx < inumberoffunctions; pidx++)
    {
        fidx = GetArrayElement(AR_LONG, hpermutation, pidx); 
	    SetArrayLong(hnewfunctionaddresses, fidx, addr);
		addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
    }
    return hnewfunctionaddresses;
}

static CreateAddressTranslationLUT(hnewfunctionaddresses)
{
	auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;
    addr = 0;
    fidx = 0;
    
    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");
	if(haddresstranslationlut == -1)
    {
        // If array already exist get the handle by GetArrayId
        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");
    }
    
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx==3))
        {
            return haddresstranslationlut;
        }
	end  = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        
        // Get new base address of function
        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);
        
        SetArrayLong(haddresstranslationlut, inst, newaddr);
        Message("haddresstranslationlut %x -> %x \n", inst, newaddr);
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        while(inst < end)
        {
			SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));
                        Message("haddresstranslationlut %x -> %x \n", inst, newaddr + (inst-addr));
			inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        }
        fidx = fidx + 1;
    }
    return haddresstranslationlut;
}

static PatchInPlaceDebug(haddresstranslationlut)
{
	auto addr, name, end, inst, newaddr;
    addr = 0;
    
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if(name == "_pre_cpp_init")
        {
            return;
        }
		end  = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        
        while(inst < end)
        {
			Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));
			inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        }
    }
}

static PatchInPlace(haddresstranslationlut)
{
    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr, fidx;
    auto nearaddrnew;
    addr = 0;
    fidx = 0;
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init")  || (fidx == 3) )
        {
            return;
        }
		end  = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        
        while(inst < end)
        {
			opidx = 0;
			optype = GetOpType(inst,opidx);
			while(optype > 0)
			{
				if(optype == 7)
				{
					// Immediate Near Address
					
					// Maybe not necessary but check for call instruction
					if(GetMnem(inst) == "call")
					{
						Message("Instruction at %x being patched.\n", inst);
					    nearaddr = LocByName(GetOpnd(inst, opidx));
                        Message("Operand near address is %x\n", nearaddr);
                        nearaddrnew = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr);
                        Message("Looking up near address: %x\n", nearaddrnew);
                        if(nearaddrnew == 0)
                        {
                            if(nearaddr == BADADDR)
                            {
    							Message("Fatal error, error processing instruction at %x\n", inst);
                            }
                            nearaddrnew = nearaddr;
                        }
						newrva   = nearaddrnew - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);
						PatchDword(inst+0x1, newrva+0x1);
						
					}
					else
					{
						Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));
					}
						
				}
				
				opidx++;
				optype = GetOpType(inst,opidx);
			}  
			inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

        }
        fidx = fidx + 1;    
    } // end for-loop
    
} // end function

static EnumerateAndStoreFunctions(hfunctionnames)
{
    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
    addr = 0;
    fidx = 0; // function index
    widx = 0; // word idx
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx == 3) )
        {
            return fidx;
        }
        
	    bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
	    if(bsuccess == 0)
        {
            Message("Saving name of function %s failed.",name); 
        }
	
        tmphandle = CreateArray(name);
        if(tmphandle == -1)
        {
            tmphandle = GetArrayId(name);
        }

        inextfunction = NextFunction(addr);
        if(inextfunction == BADADDR)
        {
            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
        }
        
        widx = 0;
        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)
        {
             SetArrayLong(tmphandle, widx, Byte (tmpaddr));
             widx = widx + 1;
        }
		bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
        fidx = fidx + 1;        
    }
    return fidx;
}

static PrintFunctions(hfunctionnames, inumberoffunctions)
{
    auto fidx;
    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
    {
        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
    }
}

static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)
{
    auto fidx, oidx, funcname, hopcodes, opcodeslen;
	auto imagebase, virtualsectionoffset, rawsectionoffset;
	auto writeerror, byte, hglobalvars, fileoffset;
	
	if(writetofile == 1)
	{
		hglobalvars          = GetArrayId("GlobalVars");
		imagebase            = GetArrayElement(AR_LONG, hglobalvars, 0);
		virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);
		rawsectionoffset     = GetArrayElement(AR_LONG, hglobalvars, 2);
	}
	
	// DEBUG
	Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);
	
    for(fidx = 2; fidx >=0 ; fidx = fidx - 1)
    {
        funcname    = GetArrayElement(AR_STR, hfunctionnames, 2*fidx); 
        opcodeslen  = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1); 
	    hopcodes    = GetArrayId(funcname);
        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
        {
            byte = GetArrayElement(AR_LONG, hopcodes, oidx);
			PatchByte(iwriteaddr, byte);
			if(writetofile == 1)
			{
			    fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);
			    writeerror = fseek(hfile, fileoffset, 0);
				writeerror = fputc(byte, hfile);
				if(writeerror == -1)
				{
					Message("Could not write to file (RVA %x)",iwriteaddr);
					return;
				}
				Message("Write byte %x to file offset %x\n", byte, fileoffset);
			}
			
            iwriteaddr = iwriteaddr + 1;
        }
    }
}

static main()
{
    auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;
    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;
    auto haddresstranslationlut, hglobalvars, main, call2main, newrva, fileoffset, writeerror, fidx;
    
	writetofile              = 1;
	
	// This is init stuff and should be wrapped into a separate init function
	if(writetofile == 1)
	{
		section              = 0;
		hfile                = GetFileHandle("rb");
		e_lfanew             = GetPointerToPEHeader(hfile);
		imagebase            = GetImageBase(hfile, e_lfanew);
		virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);
		rawsectionoffset     = GetRawSectionOffset(hfile, e_lfanew, section);
	    
	    hglobalvars          = CreateArray("GlobalVars");
		if(hglobalvars == -1)
		{
			// If array already exist get the handle by GetArrayId
			hglobalvars = GetArrayId("GlobalVars");
		}
		SetArrayLong(hglobalvars, 0, imagebase);
		SetArrayLong(hglobalvars, 1, virtualsectionoffset);
		SetArrayLong(hglobalvars, 2, rawsectionoffset);

	    // Get address of main
		main                 = LocByName("_main");
		if(main == BADADDR)
		{
			Message("Could not find _main. Aborting...\n");
			return;
		}
		call2main = RfirstB(main);
		if(GetMnem(call2main) != "call")
		{
			Message("Expecting to find call to _main. Unsuccessful. Aborting...\n");
			return;
		}
		
		fclose(hfile);
		hfile                = GetFileHandle("r+");
	}
    
    // Get number of functions 
    inumberoffunctions = GetNumberOfFunctions();
    inumberoffunctions = 3;
        
    // DEBUG
    Message("Number of functions %d\n",inumberoffunctions);

	// Create permutation array
	hpermutation = CreatePermutation(inumberoffunctions);
    
    // Get current function addresses
    hfunctionaddresses = GetFunctionAddresses();
    for(fidx = 0; fidx < inumberoffunctions; fidx++)
    {
        Message("Function address: %x\n", GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx));
    }
    
        
    // Get addresses after permutation
    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);
    for(fidx = 0; fidx < inumberoffunctions; fidx++)
    {
        Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx));
    }
    
    
    // Pre-processing, create address translation lookup table
    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);
      
    PatchInPlace(haddresstranslationlut);
   

	// Fix call to _main
	if(writetofile == 1)
	{
		if(GetOpType(call2main,0) != 7)
		{
			Message("Unexpected operand found at call2main. Aborting...\n");
			return;
		}
		newrva   = GetArrayElement(AR_LONG, haddresstranslationlut, main) - (call2main+0x6);
		PatchDword(call2main+0x1, newrva+0x1);  

	    fileoffset = GetFileOffset(call2main+1, imagebase, virtualsectionoffset, rawsectionoffset);
	    writeerror = fseek(hfile, fileoffset, 0);
		writeerror = writelong(hfile, newrva+0x1, 0);
		if(writeerror == -1)
		{
			Message("Could not patch call2main (newrva %x)", newrva);
			return;
		}
		Message("Write long %x to file offset %x\n", newrva, fileoffset);
	}
	
	//DEBUG  
    //for(didx = 0; didx<inumberoffunctions; didx++)
    //{
	//	Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));
    //}
    //return;
    
    // This array is populated with names of functions and
    // the length of the functions in bytes in the following
    // way [name1,length1,name2,length2,...]
    hfunctionnames = CreateArray("FunctionNames");
     
    if(hfunctionnames == -1)
    {
        // If array already exist get the handle by GetArrayId
        Message("hfunctionnames is -1.\n");
        hfunctionnames = GetArrayId("FunctionNames");
    }
 
    //  Enumerate functions and store them in a persistent array
    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);	

    // Print functions in IDA's output window
    PrintFunctions(hfunctionnames, inumberoffunctions);
    
    // Write Back functions in reversed order
    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);
 
    if(writetofile == 1)
    {
	fclose(hfile);
    }
 
    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
    AnalyzeArea (MinEA(), MaxEA());      
}
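
As a side note on GetFileOffset above: the value passed in as "rva" is really a virtual address (image base included), which is why the image base is subtracted first. A minimal standalone sketch of the same arithmetic, with illustrative numbers that are assumptions and not read from watermark1.exe:

```python
def va_to_file_offset(va, image_base, section_rva, section_raw):
    """Translate a virtual address inside a section to an offset in the file.

    va          : virtual address as shown by IDA (image base included)
    image_base  : OptionalHeader.ImageBase
    section_rva : VirtualAddress of the section containing `va`
    section_raw : PointerToRawData of that section
    """
    rva = va - image_base                    # address relative to the image base
    return rva - section_rva + section_raw   # position within the section, then in the file


# Illustrative values: .text mapped at RVA 0x1000 and stored at file offset 0x400.
offset = va_to_file_offset(0x401234, 0x400000, 0x1000, 0x400)
print(hex(offset))
```

This only holds for addresses that lie inside the chosen section, and the hardcoded 0xF8 in the IDC script assumes a 32-bit PE with a standard-size optional header.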

Python script (this script creates a new permutation for each invocation)

Code:

import sys
import pefile
import random
from collections import defaultdict


intsize = 32
magic   = 2**intsize
# Map a negative displacement to its unsigned 32-bit two's-complement value
def dec2hex(x, m=magic):
    if x<0:
        return m+x
    else:
        return x
    
class DEWA(pefile.PE):
    
    def PrintStuff(self):
        print self.DOS_HEADER.e_lfanew
        print self.OPTIONAL_HEADER.ImageBase
        print self.sections[0].PointerToRawData
        print self.sections[0].VirtualAddress

dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')
dewa.PrintStuff()

main = LocByName("_main")
if main == BADADDR:  # LocByName returns BADADDR when the name is not found
    print "Unexpected result: could not find _main, LocByName(\"_main\")"
    sys.exit(1)
    
print "address of main is " + hex(main)
print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)
call2main = RfirstB(main)
if(GetMnem(call2main) != "call"):
    Message("Unexpected result: Expecting to find call to _main. Unsuccessful. Aborting...")
    sys.exit(1)

# Get functions
funcs = []
for idx,func in enumerate(Functions()):
    if Name(func) == "_pre_cpp_init":
        break
    if idx > 2:
        break
    funcs.append(func)

funclens  = []
func2func = []
# Loop over the function and check if they represent a contiguous block
for idx in range(len(funcs)-1):
    curfunc = funcs[idx]
    end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)
    funclens.append(end_addr - curfunc)
    func2func.append(funcs[idx+1] - curfunc)
    for byte in range(end_addr, funcs[idx+1]):
        if Byte(byte) != 204:  # 0xCC
            print "Warning: Test for continuity failed."
            print "func number %i at address %s" % (idx, hex(curfunc))

# Last function is treated separately
end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)
funclens.append(end_addr-funcs[-1])
last     = NextFunction(funcs[-1])
if last != BADADDR:  # NextFunction returns BADADDR when there is no next function
    func2func.append(last - funcs[-1])
else:
    func2func.append(funclens[-1])    
    
print funclens
print func2func
# Get number of functions
no_functions = len(funcs)

# Make a permutation
funcorder = range(no_functions)

# make a copy of funcorder
new_funcorder = funcorder[:]
random.shuffle(new_funcorder)

# check the permutation
while True:
    count = 0
    for idx in range(no_functions):
        if funcorder[idx] == new_funcorder[idx]:
            count = count + 1
    if count > 0:
        new_funcorder = funcorder[:]
        random.shuffle(new_funcorder)
    else:
        break

##print funcorder
##map(print hex(funcs), funcs)
funcmap = dict()
curaddr = funcs[0]
for idx in xrange(no_functions):
    curfunc = funcs[new_funcorder[idx]]
    funcmap[hex(curfunc)] = hex(curaddr)
    curaddr = curaddr + func2func[new_funcorder[idx]]
 
print new_funcorder    
print funcmap    

# Build Address Translation LUT
atlut = dict()
for func in funcs:
    func_end = GetFunctionAttr(func, FUNCATTR_END)
    newfuncaddr = funcmap[hex(func)]
    inst = func
    atlut[hex(inst)] = newfuncaddr
    inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)
    while inst < func_end:
        atlut[hex(inst)] = hex(int(newfuncaddr,16) + (inst-func))
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)

print atlut
print "*************************************"
# Patch in-place
for func in funcs:
    func_end = GetFunctionAttr(func, FUNCATTR_END)
    instr    = func
    while instr < func_end:
        print "Considering %s." % hex(instr)
        opidx  = 0
        optype = GetOpType(instr, opidx)
        while optype > 0:
            if(optype == 7):
                # Immediate near address
                # Maybe not necessary but check for call instruction
                if(GetMnem(instr) == 'call'):
                    print "Instruction at %s being patched." % hex(instr)
                    nearaddr = LocByName(GetOpnd(instr, opidx))
                    print hex(nearaddr)
                    print "atlut[hex(instr)] is " + atlut[hex(instr)]
                    if (hex(nearaddr) in atlut):
                        newrva   = int(atlut[hex(nearaddr)],16) - (int(atlut[hex(instr)],16)+0x6)
                    else:
                        if (nearaddr == BADADDR):
                            print "Fatal error! error processing instruction at %s" % hex(instr)
                            sys.exit(1)
                        else:
                            newrva   = nearaddr - (int(atlut[hex(instr)],16)+0x6)
                    print "newrva is " + str(newrva) + " in hex " + hex(newrva)
                    PatchDword(instr+1, dec2hex(newrva+1))
                else:
                    print "Unsupported! Unknown %s instruction needs to be patched" % GetMnem(instr)

            opidx = opidx + 1
            optype = GetOpType(instr, opidx)

        instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT);

# Enumerate and store functions
funcdict = defaultdict(list)
funcidx  = 0 # used for indexing into funclens list
for func in funcs:
    funclen = func2func[funcidx]
    for addr in range(func,func+funclen):
        funcdict[funcidx].append(Byte(addr))
    funcidx = funcidx + 1

print funcdict

imbase = dewa.OPTIONAL_HEADER.ImageBase
# layout the functions in the new order
writeaddr = funcs[0]
print hex(writeaddr)
for fidx in new_funcorder:
    print fidx
    for codebyte in funcdict[fidx]:
        print hex(writeaddr), hex(codebyte)
        PatchByte(writeaddr, codebyte)
        dewa.set_bytes_at_rva(writeaddr-imbase, chr(codebyte))
        writeaddr = writeaddr + 1

# Correct entry point
if(GetOpType(call2main,0) != 7):
    print "Unexpected operand found at call2main. Aborting...\n"
    sys.exit(1)

print atlut[hex(main)]
print int(atlut[hex(main)],16)
print call2main
newrva   = int(atlut[hex(main)],16) - (call2main+0x6)
print newrva
PatchDword(call2main+0x1, dec2hex(newrva+1))
dewa.set_dword_at_rva(call2main+1-imbase, dec2hex(newrva+1))

# Commit to file / save to disk
dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')



Answer: 235!?
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

Looking great niaren. :yay:

About this:
niaren wrote: automatically find the range of functions that the script can reorder (hardcoded now)
Shouldn't that always be "all of them"?
niaren
Member
Posts: 70
Joined: Thu Dec 10, 2009 3:16 pm

Post by niaren »

dELTA wrote: Shouldn't that always be "all of them"?
My assumption was that the 'overhead/auxiliary functions' introduced by the compiler are always appended to the exe file. As such, they don't contribute to the watermark. However, instead of creating a more complicated exe file for further development and testing, we might just as well try to reorder all the functions in the watermark1.exe file instead of just the 3 functions. I have tried to do that.

The result is an improved python script (see bottom). With this script it is possible to reorder a range of functions that obey a continuity constraint. As it turns out, the first 9 functions in the watermark1.exe file fulfill this constraint and can be reordered with the script.

Continuity constraint:
A consecutive sequence of functions, as identified by IDA, is said to be continuous if the space in between functions (if there is any) consists of sequences of 0xCC (int 3) bytes.

In other words, we expect to find 0xCC sequences in between functions in order to align functions to paragraphs. If this is not the case we have no real control of what is going on with the bytes in between the functions. It could very well be code.
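
The constraint can be checked outside of IDA as well. The following is only a minimal sketch, assuming the code bytes are available as a plain byte buffer and the function boundaries as (start, end) offsets; `find_discontinuities`, `image` and `funcs` are illustrative names and not part of the script further down:

```python
INT3 = 0xCC  # padding byte expected between functions

def find_discontinuities(image, funcs):
    """Return (end_of_func, start_of_next) pairs whose gap is not pure 0xCC.

    image -- bytes-like buffer holding the code, indexed by offset (assumed)
    funcs -- list of (start, end) offsets sorted by start, end exclusive
    """
    bad = []
    for (start, end), (next_start, _) in zip(funcs, funcs[1:]):
        gap = bytearray(image[end:next_start])
        # An empty gap is fine; any byte other than 0xCC breaks continuity
        if any(b != INT3 for b in gap):
            bad.append((end, next_start))
    return bad
```

An empty result means the whole range is continuous in the sense defined above; each returned pair marks a gap that may contain code or data and needs a manual look.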

I will now describe a scenario in the watermark1.exe file where this continuity constraint is not obeyed and which furthermore turns out to be relatively hard to deal with in de-watermarking terms (at least I think so :) ).

Below is the IDA output for watermark1, where sub_401276 is the 10th function (counting from the lowest address). The bytes following the end of sub_401276 are not identified (or at least not marked) by IDA as a function. We could regard these bytes as code and move them. However, there are no references (Xrefs) to the code, so we have no direct way of patching whatever calls this code, if it gets called at all.

Code:

.text:0040126C                 public $LN25
.text:0040126C $LN25           proc near             <-------------  OEP (not interesting)
.text:0040126C                 call    ___security_init_cookie
.text:00401271                 jmp     ___tmainCRTStartup
.text:00401271 $LN25           endp
.text:00401271
.text:00401276
.text:00401276 ; =============== S U B R O U T I N E =======================================
.text:00401276
.text:00401276
.text:00401276 sub_401276      proc near               ; CODE XREF: sub_40102C:loc_401063p
.text:00401276                                         ; sub_40102C+4Ep ...
.text:00401276                 mov     eax, offset off_40C008
.text:0040127B                 retn
.text:0040127B sub_401276      endp
.text:0040127B
.text:0040127C ; ---------------------------------------------------------------------------
.text:0040127C
.text:0040127C ___initstdio:                     <- instructions/data located in between functions
.text:0040127C                 mov     eax, dword_40EAC0
.text:00401281                 push    esi
.text:00401282                 push    14h
.text:00401284                 pop     esi
.text:00401285                 test    eax, eax
.text:00401287                 jnz     short loc_401290
.text:00401289                 mov     eax, 200h
.text:0040128E                 jmp     short loc_401296
In fact, this code gets called by the following piece of code (inside the __initterm_e function):

Code:

esi holds the address of an array of function pointers. The loop walks the array and calls every non-NULL entry, until a call returns non-zero in eax or esi reaches the end of the array ([ebp+arg_4]).
.text:004026C4 loc_4026C4:                             ; CODE XREF: __initterm_e+1Fj
.text:004026C4                 test    eax, eax
.text:004026C6                 jnz     short loc_4026D8
.text:004026C8                 mov     ecx, [esi]
.text:004026CA                 test    ecx, ecx
.text:004026CC                 jz      short loc_4026D0
.text:004026CE                 call    ecx
.text:004026D0
.text:004026D0 loc_4026D0:                             ; CODE XREF: __initterm_e+15j
.text:004026D0                 add     esi, 4
.text:004026D3
.text:004026D3 loc_4026D3:                             ; CODE XREF: __initterm_e+Bj
.text:004026D3                 cmp     esi, [ebp+arg_4]
.text:004026D6                 jb      short loc_4026C4
This seems like quite a difficult situation to handle from a de-watermarking point of view. One possible approach could be to search the data section for 0040127C values... but that doesn't seem very feasible.
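
For what it's worth, the brute-force search itself is short. A minimal sketch, assuming the raw bytes of the data section have already been read into `data` and that pointers are stored as 32-bit little-endian values; `find_pointer_slots` and its parameters are made-up names for illustration:

```python
import struct

def find_pointer_slots(data, target_va, section_va=0):
    """Return the VA of every little-endian dword in data equal to target_va.

    data       -- raw bytes of the section to scan (assumed input)
    target_va  -- absolute address to look for, e.g. 0x40127C
    section_va -- VA at which data[0] is mapped
    """
    needle = struct.pack("<I", target_va)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(section_va + pos)
        pos = data.find(needle, pos + 1)
    return hits
```

The weak point is not the search but the interpretation: a hit can be an ordinary data value that only happens to equal the address, so every hit would still have to be confirmed as a real pointer before it is patched.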

Updated python script

Code:

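import pefile
import random
from collections import defaultdict

# IDAPython API used below; normally pre-loaded when the script is run inside IDA
from idc import *
from idautils import Functions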


intsize = 32
magic   = 2**intsize
def dec2hex(x, m=magic):
    # Map a signed value to its unsigned 32-bit two's-complement form
    if x<0:
        return m+x
    else:
        return x
    
class DEWA(pefile.PE):

    def GetFunctionFromVA(self, va):
        for func in Functions():
            if func > va:
                f = PrevFunction(func)
                return f
        # No function starts above va: report failure explicitly instead of None
        return BADADDR

        
    def CheckInstr(self, instr):
        mnem = GetMnem(instr)
        if(mnem == "call") or (mnem == "jmp") :
            if(Byte(instr) == 0xeb):  # jmp short
                return False
            if(Byte(instr) == 0xff):  # call indirect
                return False
            
            # Immediate near address
            if(GetOpType(instr, 0) != 7):
                print "Unexpected %s operand %i found at %s. Aborting...\n" % (mnem, GetOpType(instr, 0), hex(instr))
                return False
            else:
                nearaddr = LocByName(GetOpnd(instr, 0))
                if(nearaddr == BADADDR):
                    print "Error locating near address at %s. Aborting...\n" % (hex(instr))
                    return False
                else:
                    # check that near address falls outside of current function
                    fcur = self.GetFunctionFromVA(instr)
                    if(fcur == BADADDR):
                        print "GetFunctionFromVA Error at %s. Aborting...\n" % (hex(instr))
                        return False
                    fnext = NextFunction(fcur)
                    if(fnext == BADADDR):
                        print "NextFunction Error at %s. Aborting...\n" % (hex(instr))
                        return False
                    # nearaddr is already an absolute address (from LocByName)
                    if((nearaddr >= fcur) and (nearaddr < fnext)):
                        print "instr %s, fcur %s, fnext %s\n" % (hex(instr),hex(fcur),hex(fnext))
                        return False
                    else:
                        return True
        else:
            return False


    def PatchInstr(self, instr, atlut):
        mnem = GetMnem(instr)
        if( (mnem == "call") or (mnem == "jmp") ):
            nearaddr = LocByName(GetOpnd(instr, 0))
            if (hex(nearaddr) in atlut):
                newrva   = int(atlut[hex(nearaddr)],16) - (int(atlut[hex(instr)],16)+0x6)
            else:
                newrva   = nearaddr - (int(atlut[hex(instr)],16)+0x6)
            print "oldrva: " + hex(nearaddr) + " newrva is " + str(newrva) + " in hex " + hex(newrva)
            PatchDword(instr+1, dec2hex(newrva+1))
            

    def PatchXrefInstr(self, instr, addr, atlut, imbase):
        newrva   = int(atlut[hex(addr)],16) - (instr+0x6)
        PatchDword(instr+0x1, dec2hex(newrva+1))
        self.set_dword_at_rva(instr+1-imbase, dec2hex(newrva+1))

    
    def PrintStuff(self):
        print self.DOS_HEADER.e_lfanew
        print self.OPTIONAL_HEADER.ImageBase
        print self.sections[0].PointerToRawData
        print self.sections[0].VirtualAddress


def idapymain():
    dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')
    dewa.PrintStuff()

    imbase = dewa.OPTIONAL_HEADER.ImageBase
    
    main = LocByName("_main")
    if (main == BADADDR):  # LocByName returns BADADDR when the name is not found
        print "Unexpected result: could not find _main, LocByName(\"_main\")"
        return
        
    print "address of main is " + hex(main)
    print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)

    # Get functions
    funcs = []
    for idx,func in enumerate(Functions()):
        if Name(func) == "_pre_cpp_init":
            break
        if idx > 9:
            break
        funcs.append(func)

    funclens  = []
    func2func = []
    # Loop over the functions and check if they represent a contiguous block
    for idx in range(len(funcs)-1):
        curfunc = funcs[idx]
        end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)
        funclens.append(end_addr - curfunc)
        func2func.append(funcs[idx+1] - curfunc)
        for byte in range(end_addr, funcs[idx+1]):
            if Byte(byte) != 204:  # 0xCC
                print "Warning: Test for continuity failed."
                print "func number %i at address %s" % (idx, hex(curfunc))

    # Last function is treated separately
    end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)
    funclens.append(end_addr-funcs[-1])
    last     = NextFunction(funcs[-1])
    if last != BADADDR:  # NextFunction returns BADADDR after the last function
        func2func.append(last - funcs[-1])
    else:
        func2func.append(funclens[-1])    
        
    print funclens
    print func2func
    # Get number of functions
    no_functions = len(funcs)

    # Make a permutation
    funcorder = range(no_functions)

    # make a copy of funcorder
    new_funcorder = funcorder[:]
    random.shuffle(new_funcorder)

    # check the permutation: reshuffle until no function keeps its original position
    while any(old == new for old, new in zip(funcorder, new_funcorder)):
        new_funcorder = funcorder[:]
        random.shuffle(new_funcorder)

    ##print funcorder
    ##map(print hex(funcs), funcs)
    funcmap = dict()
    curaddr = funcs[0]
    for idx in xrange(no_functions):
        curfunc = funcs[new_funcorder[idx]]
        funcmap[hex(curfunc)] = hex(curaddr)
        curaddr = curaddr + func2func[new_funcorder[idx]]
     
    print new_funcorder    
    print funcmap    

    # Build Address Translation LUT
    atlut = dict()
    for func in funcs:
        func_end = GetFunctionAttr(func, FUNCATTR_END)
        newfuncaddr = funcmap[hex(func)]
        inst = func
        atlut[hex(inst)] = newfuncaddr
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)
        while inst < func_end:
            atlut[hex(inst)] = hex(int(newfuncaddr,16) + (inst-func));
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);

    print atlut
    print "*** Build ATLUT end *** \n\n"
        
    print "*** Handle xrefs start *** \n"
    
    for idx,func in enumerate(funcs):
        xref = RfirstB(func)
        while(xref != BADADDR):
            if( hex(xref) not in atlut):
                if (dewa.CheckInstr(xref) == False):
                    print "Unexpected Xref instruction found at %s. Aborting...\n" % (hex(xref))
                    return
                print "Reference to %i %s found at %s\n" % (idx, hex(func), hex(xref))
                dewa.PatchXrefInstr(xref, func, atlut, imbase)
            xref = RnextB(func, xref)
            
    print "*** Handle xrefs end *** \n\n"
    
    print "*** Patch in-place *** \n"
    
    # Patch in-place
    for func in funcs:
        func_end = GetFunctionAttr(func, FUNCATTR_END)
        instr    = func
        while instr < func_end:
            if(dewa.CheckInstr(instr)==True):
                print "Instruction at %s being patched." % hex(instr)
                dewa.PatchInstr(instr, atlut)

            instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT);

    # Enumerate and store functions
    funcdict = defaultdict(list)
    funcidx  = 0 # used for indexing into func2func list
    for func in funcs:
        funclen = func2func[funcidx]
        for addr in range(func,func+funclen):
            funcdict[funcidx].append(Byte(addr))
        funcidx = funcidx + 1

    print funcdict

    # layout the functions in the new order
    writeaddr = funcs[0]
    print hex(writeaddr)
    for fidx in new_funcorder:
        print fidx
        for codebyte in funcdict[fidx]:
            #print hex(writeaddr), hex(codebyte)
            PatchByte(writeaddr, codebyte)
            dewa.set_bytes_at_rva(writeaddr-imbase, chr(codebyte))
            writeaddr = writeaddr + 1

    # Check entry point
    oep = dewa.OPTIONAL_HEADER.AddressOfEntryPoint + imbase
    print "OEP is %s\n" % hex(oep)
    if (hex(oep) in atlut):
        print "OEP is changed to %s\n" % atlut[hex(oep)]
        dewa.OPTIONAL_HEADER.AddressOfEntryPoint = int(atlut[hex(oep)],16) - imbase
    
    # Commit to file / save to disk
    dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')
    print "Done!\n"
    
idapymain()
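
A note on the displacement arithmetic in PatchInstr and PatchXrefInstr: subtracting (instr+0x6) and then adding 1 is the same as subtracting (instr+5), i.e. the address of the instruction following a 5-byte near call/jmp. A minimal sketch of that computation, assuming E8/E9 instructions only and an integer-keyed translation table (the script keys atlut by hex strings); `rel32` and `retarget` are illustrative names and not part of the script:

```python
MASK32 = 0xFFFFFFFF

def rel32(instr_va, target_va, instr_len=5):
    """Displacement stored in a near call/jmp (E8/E9) located at instr_va."""
    return (target_va - (instr_va + instr_len)) & MASK32

def retarget(old_instr, old_target, lut):
    """Displacement after the move; addresses missing from lut did not move."""
    new_instr = lut.get(old_instr, old_instr)
    new_target = lut.get(old_target, old_target)
    return rel32(new_instr, new_target)
```

If instruction and target move by the same amount (a jump inside one function), the displacement comes out unchanged, so only branches that cross a function boundary actually need new bytes.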
Attachments
watermark1.zip
(24.47 KiB) Downloaded 57 times
dELTA
Posts: 4209
Joined: Mon Oct 30, 2000 7:00 am
Location: Ring -1

Post by dELTA »

Nice to see your progress. :yay:

I understand that indirect calls that are not caught in the IDA xrefs will pose a big problem. But if the full address of the called function is read from the data section, it must also be in the reloc data of the executable, so as long as relocs are not stripped, we should still be ok, right?
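
To make the reloc idea a bit more concrete, here is a rough sketch of walking the raw base relocation data and picking out the fixups whose stored address points into the reordered range. It assumes the standard PE layout (blocks of page RVA, block size, then 16-bit entries with the type in the top 4 bits) and 32-bit HIGHLOW fixups; `highlow_rvas`, `fixups_into_range` and the `read_dword` callback are hypothetical names for illustration:

```python
import struct

IMAGE_REL_BASED_HIGHLOW = 3  # 32-bit absolute address fixup

def highlow_rvas(reloc):
    """Yield the RVA of every HIGHLOW fixup in raw base relocation bytes."""
    off = 0
    while off + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, off)
        if block_size < 8:
            break  # zero padding or a corrupt block: stop
        end = min(off + block_size, len(reloc))
        for pos in range(off + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, pos)
            if entry >> 12 == IMAGE_REL_BASED_HIGHLOW:
                yield page_rva + (entry & 0x0FFF)
        off += block_size

def fixups_into_range(reloc, read_dword, lo_va, hi_va):
    """RVAs of fixups whose stored 32-bit address lies in [lo_va, hi_va).

    read_dword -- caller-supplied function mapping an RVA to the dword there
    """
    return [rva for rva in highlow_rvas(reloc)
            if lo_va <= read_dword(rva) < hi_va]
```

Every RVA returned this way marks a dword that holds an absolute address inside the moved functions, including function pointer tables that never show up as xrefs.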

Also, for the cases with stripped relocs, maybe the de-watermarking tool could at least helpfully provide a list of all indirect call instructions in the entire program, so that they could be analyzed manually by the user, and then manually entered as resolved xrefs into IDA before proceeding with the final working de-watermarking procedure?

I have no idea how many of these there are in a normal program, but at least the program has then done all it can to help, which would be the goal - nothing more can be demanded. And if the user wants, he can then analyze/resolve/enter all of these and their resolved xrefs manually in IDA, and then again proceed with the tool to actually get a working de-watermarked executable!
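
As a starting point for such a list, indirect calls and jumps can be recognized from the opcode and ModRM byte alone. A minimal sketch, assuming the instruction boundaries come from the disassembler and ignoring prefix bytes; `indirect_branch_kind` is an illustrative helper, not an IDA function:

```python
def indirect_branch_kind(opcode, modrm):
    """Classify an x86 FF-group instruction from its opcode and ModRM byte.

    Returns a description for indirect call/jmp forms, otherwise None.
    """
    if opcode != 0xFF:
        return None
    reg = (modrm >> 3) & 7  # the /digit field selects the operation
    kinds = {
        2: "call near indirect",
        3: "call far indirect",
        4: "jmp near indirect",
        5: "jmp far indirect",
    }
    return kinds.get(reg)
```

For example, the `call ecx` in __initterm_e encodes as FF D1 and is reported as a near indirect call, while FF-group inc/dec/push forms return None.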