This is a bit off-topic since it's not about watermarking by linking order, but you may want to take a look at http://www.woodmann.com/crackz/Tutorials/Tsehpida.htm ("Reversing IDA 4.01 - Watermarked protection scheme") nonetheless.
It's always good to be familiar with your history, thanks for the reference disa.
"Give a man a quote from the FAQ, and he'll ignore it. Print the FAQ, shove it up his ass, kick him in the balls, DDoS his ass and kick/ban him, and the point usually gets through eventually."
dELTA, I haven't really thought about the case where relocation information is stripped, or about the inter-functional offset issue. That is a small setback. Let's see, I have no clue right now how big a problem it is. As you also mention, let's take it in small steps, by trial and error.
I also understand your point that the watermark leaves traces in the other directories, and that those would have to be 'taken care of' as well. I agree with you on everything.
I've just set up IDAPython and will start rewriting the IDC script in Python to see whether that causes many problems. Unless it does, I hope this means working smarter, not harder, and if that means being a masochist then yes, I'm that!
At least this example works, one line that prints out all functions and their addresses
PHP Code:
print [(hex(func),GetFunctionName(func)) for func in Functions(SegStart(ScreenEA()),SegEnd(ScreenEA()))]
Sounds great, then we are very much looking forward to your next status report, sitting here ready to answer any further questions.
Shortly after posting my last message above, I came to think about adding TLS directory shuffling to my list of things necessary for "complete de-watermarkation" above.
BanMe just mentioned something else regarding TLS in another thread, directed at you niaren, and seemingly more specifically aimed at the discussion in this thread, so I'll include it here too:
Originally Posted by BanMe
Actually, all parts of the PE specification should be gone through carefully to make sure that none are left in which entropy could be hidden. My short enumeration in the previous post was just approximate and off the top of my head.
You yourself (niaren) also mentioned above the more general problem of absolute addresses not being included in the reloc table (e.g. because there is no reloc table to begin with, due to reloc stripping).
I'd say that as long as all code in the executable is free from encryption/obfuscation, IDA will already have parsed up and classified all such addresses in the code for you, ripe and ready for your picking, so not much trouble at all really. Also, if any such addresses are part of a watermark (and thus differ between different copies of the executable) you will find them easily with the "Code-location-independent function diffing tool" that I mention above, since this tool will only ignore offsets and absolute addresses explicitly mentioned in the reloc table (or even just identified by IDA).
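As a rough illustration of the diffing idea described above, here is a minimal sketch that compares two copies of the same function byte by byte while skipping the 4-byte address fields. The name `masked_diff` and the way fixup offsets are passed in are assumptions made for this example; they are not taken from any existing tool.

```python
def masked_diff(code_a, code_b, fixup_offsets, width=4):
    """Return offsets at which two copies of a function differ, ignoring
    `width`-byte address fields that start at each fixup offset.

    fixup_offsets is hypothetical input: in practice it would come from the
    reloc table or from operands that IDA has classified as addresses.
    """
    if len(code_a) != len(code_b):
        raise ValueError("copies differ in length")
    masked = set()
    for off in fixup_offsets:
        masked.update(range(off, off + width))
    return [i for i, (a, b) in enumerate(zip(code_a, code_b))
            if a != b and i not in masked]


# Two copies of "call rel32; push imm8; ret". The call displacement differs
# (expected after a move) and so does the pushed constant (suspicious).
copy1 = bytes([0xE8, 0x10, 0x00, 0x00, 0x00, 0x6A, 0x01, 0xC3])
copy2 = bytes([0xE8, 0x40, 0x00, 0x00, 0x00, 0x6A, 0x02, 0xC3])
suspicious = masked_diff(copy1, copy2, fixup_offsets=[1])
print(suspicious)  # offsets that differ outside the masked address field
```

Anything this reports is a difference that cannot be explained by code location alone, so it is a candidate for watermark data.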
Finally, encrypted/obfuscated code containing watermark data will of course also be easily identified by the data and code diffing tools I mention.
dELTA, I keep getting distracted, but I have begun the task of porting the IDC script to Python. Hopefully, I can do it this weekend. I will post here as soon as it is done.
Thanks a lot, you're extremely helpful
Sounds great, looking forward to your progress reports.
This is a very good learning project (IDA scripting, PE structure, code anatomy, etc) and it would also be really cool if a good generic watermark detection/destruction tool would come out of it.
To make things even simpler, the makefile has been changed such that
- dependent functions are linked into the exe file (/MD -> /MT)
Without this modification, renaming the exe file would give an error, e.g. 'msvcr90.dll not found'
- the manifest file option is disabled
- the exe file is linked with no dynamic base and a fixed base address
With this modification it is possible to test the code reshuffling without worrying about reloc info.
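To double-check that a rebuilt exe really has no relocations to worry about, the two relevant header fields can be read directly. This is only a sketch: `link_flags` and `_fake_header` are names made up for this example, the offsets follow the PE32 header layout, and the expectation that /FIXED sets the "relocs stripped" bit while /DYNAMICBASE:NO clears the "dynamic base" bit is my assumption about the linker, not something verified on watermark1.exe.

```python
import struct

IMAGE_FILE_RELOCS_STRIPPED = 0x0001             # FileHeader.Characteristics
IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE = 0x0040  # OptionalHeader.DllCharacteristics


def link_flags(image):
    """Return (relocs_stripped, dynamic_base) for a PE image given as bytes."""
    e_lfanew = struct.unpack_from("<I", image, 0x3C)[0]
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    # Characteristics is the last field of the 20-byte file header.
    characteristics = struct.unpack_from("<H", image, e_lfanew + 4 + 18)[0]
    # DllCharacteristics sits at offset 70 of the optional header.
    dll_characteristics = struct.unpack_from("<H", image, e_lfanew + 24 + 70)[0]
    return (bool(characteristics & IMAGE_FILE_RELOCS_STRIPPED),
            bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE))


def _fake_header(characteristics, dll_characteristics, e_lfanew=0x80):
    """Build a synthetic header (demo only) carrying just the fields read above."""
    buf = bytearray(e_lfanew + 24 + 96)
    struct.pack_into("<I", buf, 0x3C, e_lfanew)
    buf[e_lfanew:e_lfanew + 4] = b"PE\0\0"
    struct.pack_into("<H", buf, e_lfanew + 22, characteristics)
    struct.pack_into("<H", buf, e_lfanew + 24 + 70, dll_characteristics)
    return bytes(buf)


fixed_image = _fake_header(0x0103, 0x0100)  # relocs stripped, NX compat only
aslr_image = _fake_header(0x0102, 0x0140)   # relocs kept, dynamic base set
print(link_flags(fixed_image), link_flags(aslr_image))
```

On a real file the call would be `link_flags(open(path, "rb").read())`.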
As a side comment, how many functions would you guess are identified/found by IDA inside the 'new' exe? See the answer below
(remember that we only defined 3 functions)
With these modifications, the IDC script and the Python script below both reorder the functions (the 3 functions made by us) in the exe file, and the exe file actually runs afterwards.
Things to do (in prioritized order) before moving on to other parts of the exe, as pointed out by dELTA
- automatically find the range of functions that the script can reorder (hardcoded now)
- make python code look nice (subclass of PE class from pefile)
- make IDA output look nice (kayaker ideas)
New makefile
Code:
SRCS = main.c file1.c file2.c
OBJS1 = main.obj file1.obj file2.obj
OBJS2 = file2.obj file1.obj main.obj
CC = CL
CCFLAGS = /O2 /Oi /D "_MBCS" /FD /EHsc /MT /Gy /W3 /c /Zi /TC
LINK = link
LINKFLAGS1 = "/OUT:watermark1.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
LINKFLAGS2 = "/OUT:watermark2.exe" /MANIFEST:NO /OPT:REF /OPT:ICF /DYNAMICBASE:NO /FIXED /NXCOMPAT /MACHINE:X86 kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
EC = echo
RM = del
default: all
clean:
	@$(RM) /F *.obj
	@$(RM) /F *.idb
	@$(RM) /F *.pdb
	@$(RM) /F *.exe
	@$(RM) /F *manifest*
%.obj : %.c
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	@$(EC) ************************************************
	@$(EC) * Compiling $@
	$(CC) $(CCFLAGS) $<
watermark1.exe: $(OBJS1)
	"C:\Program Files\Microsoft Visual Studio 9.0\VC\bin\vcvars32.bat"
	$(LINK) $(LINKFLAGS1) $(OBJS1)
	$(LINK) $(LINKFLAGS2) $(OBJS2)
all: watermark1.exe
IDC script (the Python script after it creates a new permutation for each invocation)
PHP Code:
#include <idc.idc> // Mandatory include directive

static GetFileHandle(mode)
{
    auto hFile;
    hFile = fopen(GetInputFilePath(), mode);
    if (0 == hFile)
    {
        Message("Cannot open \"" + GetInputFile() + "\"");
    }
    return hFile;
}

static GetPointerToPEHeader(hfile)
{
    auto e_lfanew;
    // Seek to the e_lfanew field
    if (0 != fseek(hfile, 0x3C, 0))
    {
        Message(" 1 Cannot seek in \"" + GetInputFile() + "\", handle: %x", hfile);
    }
    // Read the value of e_lfanew
    e_lfanew = readlong(hfile, 0);
    // Seek to IMAGE_NT_HEADERS
    if (0 != fseek(hfile, e_lfanew, 0))
    {
        Message(" 2 Cannot seek in \"" + GetInputFile() + "\", handle: %x, elfanew: %x\n", hfile, e_lfanew);
    }
    // Read the Signature
    if (0x00004550 != readlong(hfile, 0))
    {
        Message("Not a valid PE file");
    }
    return e_lfanew;
}

static GetImageBase(hfile, e_lfanew)
{
    auto imageBase;
    // Seek to the IMAGE_NT_HEADERS.OptionalHeader.ImageBase field
    if (0 != fseek(hfile, e_lfanew + 0x18 + 0x1C, 0))
    {
        Fatal(" 3 Cannot seek in \"" + GetInputFile() + "\"");
    }
    imageBase = readlong(hfile, 0);
    return imageBase;
}

static GetVirtualSectionOffset(hfile, e_lfanew, section)
{
    auto numberOfSections, sectionRva;
    // Seek to the IMAGE_FILE_HEADER.NumberOfSections field
    if (0 != fseek(hfile, e_lfanew + 0x06, 0))
    {
        Fatal(" 4 Cannot seek in \"" + GetInputFile() + "\"");
    }
    // Read the number of sections
    numberOfSections = readshort(hfile, 0);
    if (section >= numberOfSections)
    {
        Fatal("Invalid section");
    }
    // Seek to the desired section
    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x0C, 0))
    {
        Fatal(" 5 Cannot seek in \"" + GetInputFile() + "\"");
    }
    sectionRva = readlong(hfile, 0);
    return sectionRva;
}

static GetRawSectionOffset(hfile, e_lfanew, section)
{
    auto pointerToRawData;
    // Seek to the desired section
    if (0 != fseek(hfile, e_lfanew + 0xF8 + section * 0x28 + 0x14, 0))
    {
        Fatal(" 6 Cannot seek in \"" + GetInputFile() + "\"");
    }
    pointerToRawData = readlong(hfile, 0);
    return pointerToRawData;
}

static GetFileOffset(rva, imagebase, virtualsectionoffset, rawsectionoffset)
{
    return rva - imagebase - virtualsectionoffset + rawsectionoffset;
}

static GetNumberOfFunctions()
{
    auto addr, name, fidx;
    addr = 0;
    fidx = 0; // function index
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if(name == "_pre_cpp_init")
        {
            return fidx;
        }
        fidx = fidx + 1;
    }
    return fidx;
}

static CreatePermutation(inumberoffunctions)
{
    auto hpermutation;
    hpermutation = CreateArray("Permutation");
    if(hpermutation == -1)
    {
        // If the array already exists, get the handle with GetArrayId
        hpermutation = GetArrayId("Permutation");
    }
    // Hardcoded permutation
    SetArrayLong(hpermutation, 0, 2);
    SetArrayLong(hpermutation, 1, 1);
    SetArrayLong(hpermutation, 2, 0);
    return hpermutation;
}

static GetFunctionAddresses()
{
    auto addr, name, fidx, hfunctionaddresses;
    addr = 0;
    fidx = 0; // function index
    hfunctionaddresses = CreateArray("FunctionAddresses");
    if(hfunctionaddresses == -1)
    {
        // If the array already exists, get the handle with GetArrayId
        hfunctionaddresses = GetArrayId("FunctionAddresses");
    }
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx==3) )
        {
            return hfunctionaddresses;
        }
        // Even slots hold the start address, odd slots the distance to the next function
        SetArrayLong(hfunctionaddresses, 2*fidx, addr);
        SetArrayLong(hfunctionaddresses, 2*fidx+1, NextFunction(addr) - addr);
        fidx = fidx + 1;
    }
    return hfunctionaddresses;
}

static GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions)
{
    auto addr, pidx, fidx, hnewfunctionaddresses;
    addr = 0;
    pidx = 0;
    fidx = 0;
    hnewfunctionaddresses = CreateArray("NewFunctionAddresses");
    if(hnewfunctionaddresses == -1)
    {
        // If the array already exists, get the handle with GetArrayId
        hnewfunctionaddresses = GetArrayId("NewFunctionAddresses");
    }
    // Address of first function
    addr = NextFunction(addr);
    fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
    SetArrayLong(hnewfunctionaddresses, fidx, addr);
    addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
    for(pidx=1; pidx < inumberoffunctions; pidx++)
    {
        fidx = GetArrayElement(AR_LONG, hpermutation, pidx);
        SetArrayLong(hnewfunctionaddresses, fidx, addr);
        addr = addr + GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx+1);
    }
    return hnewfunctionaddresses;
}

static CreateAddressTranslationLUT(hnewfunctionaddresses)
{
    auto addr, haddresstranslationlut, name, end, inst, newaddr, fidx;
    addr = 0;
    fidx = 0;
    haddresstranslationlut = CreateArray("AddressTranslationLookupTable");
    if(haddresstranslationlut == -1)
    {
        // If the array already exists, get the handle with GetArrayId
        haddresstranslationlut = GetArrayId("AddressTranslationLookupTable");
    }
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx==3))
        {
            return haddresstranslationlut;
        }
        end = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        // Get new base address of function
        newaddr = GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx);
        SetArrayLong(haddresstranslationlut, inst, newaddr);
        Message("haddresstranslationlut %x -> %x \n", inst, newaddr);
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        while(inst < end)
        {
            // Every instruction keeps its offset relative to the function start
            SetArrayLong(haddresstranslationlut, inst, newaddr + (inst-addr));
            Message("haddresstranslationlut %x -> %x \n", inst, newaddr + (inst-addr));
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        }
        fidx = fidx + 1;
    }
    return haddresstranslationlut;
}

static PatchInPlaceDebug(haddresstranslationlut)
{
    auto addr, name, end, inst, newaddr;
    addr = 0;
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if(name == "_pre_cpp_init")
        {
            return;
        }
        end = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        while(inst < end)
        {
            Message("Address %x mapped to %x\n",inst,GetArrayElement(AR_LONG, haddresstranslationlut, inst));
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        }
    }
}

static PatchInPlace(haddresstranslationlut)
{
    auto addr, name, end, inst, newaddr, opidx, optype, newrva, nearaddr, fidx;
    auto nearaddrnew;
    addr = 0;
    fidx = 0;
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx == 3) )
        {
            return;
        }
        end = GetFunctionAttr(addr, FUNCATTR_END);
        inst = addr;
        while(inst < end)
        {
            opidx = 0;
            optype = GetOpType(inst,opidx);
            while(optype > 0)
            {
                if(optype == 7)
                {
                    // Immediate Near Address
                    // Maybe not necessary but check for call instruction
                    if(GetMnem(inst) == "call")
                    {
                        Message("Instruction at %x being patched.\n", inst);
                        nearaddr = LocByName(GetOpnd(inst, opidx));
                        Message("Operand near address is %x\n", nearaddr);
                        nearaddrnew = GetArrayElement(AR_LONG, haddresstranslationlut, nearaddr);
                        Message("Looking up near address: %x\n", nearaddrnew);
                        if(nearaddrnew == 0)
                        {
                            if(nearaddr == BADADDR)
                            {
                                Message("Fatal error, error processing instruction at %x\n", inst);
                            }
                            // Target is not being moved, keep its address
                            nearaddrnew = nearaddr;
                        }
                        newrva = nearaddrnew - (GetArrayElement(AR_LONG, haddresstranslationlut, inst)+0x6);
                        PatchDword(inst+0x1, newrva+0x1);
                    }
                    else
                    {
                        Message("Unsupported! Unknown %s instruction needs to be patched.\n", GetMnem(inst));
                    }
                }
                opidx++;
                optype = GetOpType(inst,opidx);
            }
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
        }
        fidx = fidx + 1;
    } // end for-loop
} // end function

static EnumerateAndStoreFunctions(hfunctionnames)
{
    auto addr, tmpaddr, name, fidx, widx, bsuccess, tmphandle, inextfunction;
    addr = 0;
    fidx = 0; // function index
    widx = 0; // word idx
    for(addr = NextFunction(addr); addr != BADADDR; addr = NextFunction(addr))
    {
        name = Name(addr);
        // Stop if name of function is _pre_cpp_init
        // It is assumed that compiler/linker generated functions
        // are appended in the end of image and that they start with the
        // _pre_cpp_init function
        if( (name == "_pre_cpp_init") || (fidx == 3) )
        {
            return fidx;
        }
        bsuccess = SetArrayString(hfunctionnames, 2*fidx, name);
        if(bsuccess == 0)
        {
            Message("Saving name of function %s failed.",name);
        }
        tmphandle = CreateArray(name);
        if(tmphandle == -1)
        {
            tmphandle = GetArrayId(name);
        }
        inextfunction = NextFunction(addr);
        if(inextfunction == BADADDR)
        {
            inextfunction = GetFunctionAttr(addr, FUNCATTR_END);
        }
        widx = 0;
        for(tmpaddr = addr; tmpaddr < inextfunction; tmpaddr = tmpaddr + 1)
        {
            SetArrayLong(tmphandle, widx, Byte (tmpaddr));
            widx = widx + 1;
        }
        bsuccess = SetArrayLong(hfunctionnames, 2*fidx+1, widx);
        fidx = fidx + 1;
    }
    return fidx;
}

static PrintFunctions(hfunctionnames, inumberoffunctions)
{
    auto fidx;
    for(fidx = 0; fidx < inumberoffunctions; fidx = fidx + 1)
    {
        Message("Function: %s\n", GetArrayElement(AR_STR, hfunctionnames, 2*fidx));
    }
}

static WriteBackFunctions(hfunctionnames, inumberoffunctions, iwriteaddr, writetofile, hfile)
{
    auto fidx, oidx, funcname, hopcodes, opcodeslen;
    auto imagebase, virtualsectionoffset, rawsectionoffset;
    auto writeerror, byte, hglobalvars, fileoffset;
    if(writetofile == 1)
    {
        hglobalvars = GetArrayId("GlobalVars");
        imagebase = GetArrayElement(AR_LONG, hglobalvars, 0);
        virtualsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 1);
        rawsectionoffset = GetArrayElement(AR_LONG, hglobalvars, 2);
    }
    // DEBUG
    Message("imagebase: %x, virtualsectionoffset: %x, rawsectionoffset: %x\n",imagebase,virtualsectionoffset,rawsectionoffset);
    for(fidx = 2; fidx >=0 ; fidx = fidx - 1)
    {
        funcname = GetArrayElement(AR_STR, hfunctionnames, 2*fidx);
        opcodeslen = GetArrayElement(AR_LONG, hfunctionnames, 2*fidx+1);
        hopcodes = GetArrayId(funcname);
        for(oidx = 0; oidx < opcodeslen; oidx = oidx + 1)
        {
            byte = GetArrayElement(AR_LONG, hopcodes, oidx);
            PatchByte(iwriteaddr, byte);
            if(writetofile == 1)
            {
                fileoffset = GetFileOffset(iwriteaddr, imagebase, virtualsectionoffset, rawsectionoffset);
                writeerror = fseek(hfile, fileoffset, 0);
                writeerror = fputc(byte, hfile);
                if(writeerror == -1)
                {
                    Message("Could not write to file (RVA %x)",iwriteaddr);
                    return;
                }
                Message("Write byte %x to file offset %x\n", byte, fileoffset);
            }
            iwriteaddr = iwriteaddr + 1;
        }
    }
}

static main()
{
    auto hfile, e_lfanew, imagebase, virtualsectionoffset, rawsectionoffset, writetofile, section;
    auto didx, inumberoffunctions, hfunctionnames, hpermutation, hfunctionaddresses, hnewfunctionaddresses;
    auto haddresstranslationlut, hglobalvars, main, call2main, newrva, fileoffset, writeerror, fidx;
    writetofile = 1;
    // This is init stuff and should be wrapped into a separate init function
    if(writetofile == 1)
    {
        section = 0;
        hfile = GetFileHandle("rb");
        e_lfanew = GetPointerToPEHeader(hfile);
        imagebase = GetImageBase(hfile, e_lfanew);
        virtualsectionoffset = GetVirtualSectionOffset(hfile, e_lfanew, section);
        rawsectionoffset = GetRawSectionOffset(hfile, e_lfanew, section);
        hglobalvars = CreateArray("GlobalVars");
        if(hglobalvars == -1)
        {
            // If the array already exists, get the handle with GetArrayId
            hglobalvars = GetArrayId("GlobalVars");
        }
        SetArrayLong(hglobalvars, 0, imagebase);
        SetArrayLong(hglobalvars, 1, virtualsectionoffset);
        SetArrayLong(hglobalvars, 2, rawsectionoffset);
        // Get address of main
        main = LocByName("_main");
        if(main == BADADDR)
        {
            Message("Could not find _main. Aborting...\n");
            return;
        }
        call2main = RfirstB(main);
        if(GetMnem(call2main) != "call")
        {
            Message("Expecting to find call to _main. Unsuccessful. Aborting...\n");
            return;
        }
        fclose(hfile);
        hfile = GetFileHandle("r+");
    }
    // Get number of functions
    inumberoffunctions = GetNumberOfFunctions();
    inumberoffunctions = 3;
    // DEBUG
    Message("Number of functions %d\n",inumberoffunctions);
    // Create permutation array
    hpermutation = CreatePermutation(inumberoffunctions);
    // Get current function addresses
    hfunctionaddresses = GetFunctionAddresses();
    for(fidx = 0; fidx < inumberoffunctions; fidx++)
    {
        Message("Function address: %x\n", GetArrayElement(AR_LONG, hfunctionaddresses, 2*fidx));
    }
    // Get addresses after permutation
    hnewfunctionaddresses = GetNewFunctionAddresses(hfunctionaddresses, hpermutation, inumberoffunctions);
    for(fidx = 0; fidx < inumberoffunctions; fidx++)
    {
        Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, fidx));
    }
    // Pre-processing, create address translation lookup table
    haddresstranslationlut = CreateAddressTranslationLUT(hnewfunctionaddresses);
    PatchInPlace(haddresstranslationlut);
    // Fix call to _main
    if(writetofile == 1)
    {
        if(GetOpType(call2main,0) != 7)
        {
            Message("Unexpected operand found at call2main. Aborting...\n");
            return;
        }
        newrva = GetArrayElement(AR_LONG, haddresstranslationlut, main) - (call2main+0x6);
        PatchDword(call2main+0x1, newrva+0x1);
        fileoffset = GetFileOffset(call2main+1, imagebase, virtualsectionoffset, rawsectionoffset);
        writeerror = fseek(hfile, fileoffset, 0);
        writeerror = writelong(hfile, newrva+0x1, 0);
        if(writeerror == -1)
        {
            Message("Could not patch call2main (newrva %x)", newrva);
            return;
        }
        Message("Write long %x to file offset %x\n", newrva, fileoffset);
    }
    //DEBUG
    //for(didx = 0; didx<inumberoffunctions; didx++)
    //{
    // Message("New Function address: %x\n", GetArrayElement(AR_LONG, hnewfunctionaddresses, didx));
    //}
    //return;
    // This array is populated with names of functions and
    // the length of the functions in bytes in the following
    // way [name1,length1,name2,length2,...]
    hfunctionnames = CreateArray("FunctionNames");
    if(hfunctionnames == -1)
    {
        // If the array already exists, get the handle with GetArrayId
        Message("hfunctionnames is -1.\n");
        hfunctionnames = GetArrayId("FunctionNames");
    }
    // Enumerate functions and store them in a persistent array
    inumberoffunctions = EnumerateAndStoreFunctions(hfunctionnames);
    // Print functions in IDA's output window
    PrintFunctions(hfunctionnames, inumberoffunctions);
    // Write Back functions in reversed order
    WriteBackFunctions(hfunctionnames, inumberoffunctions, 0x401000, writetofile, hfile);
    if(writetofile == 1)
    {
        fclose(hfile);
    }
    MakeUnknown (MinEA(), MaxEA() - MinEA(), 1);
    AnalyzeArea (MinEA(), MaxEA());
}
Python script (creates a new permutation for each invocation)
PHP Code:
import sys
import pefile
import random
from collections import defaultdict

intsize = 32
magic = 2**intsize

def dec2hex(x, m=magic):
    # Map a negative value to its unsigned 32-bit representation
    if x<0:
        return m+x
    else:
        return x

class DEWA(pefile.PE):
    def PrintStuff(self):
        print self.DOS_HEADER.e_lfanew
        print self.OPTIONAL_HEADER.ImageBase
        print self.sections[0].PointerToRawData
        print self.sections[0].VirtualAddress

dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')
dewa.PrintStuff()
main = LocByName("_main")
# LocByName returns BADADDR when the name is not found
if (main == BADADDR):
    print "Unexpected result: could not find _main, LocByName(\"_main\")"
    sys.exit(1)
print "address of main is " + hex(main)
print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)
call2main = RfirstB(main)
if(GetMnem(call2main) != "call"):
    Message("Unexpected result: Expecting to find call to _main. Unsuccessful. Aborting...")
    sys.exit(1)

# Get functions
funcs = []
for idx,func in enumerate(Functions()):
    if Name(func) == "_pre_cpp_init":
        break
    if idx > 2:
        break
    funcs.append(func)
funclens = []
func2func = []

# Loop over the functions and check if they represent a contiguous block
for idx in range(len(funcs)-1):
    curfunc = funcs[idx]
    end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)
    funclens.append(end_addr - curfunc)
    func2func.append(funcs[idx+1] - curfunc)
    for byte in range(end_addr, funcs[idx+1]):
        if Byte(byte) != 204: # 0xCC
            print "Warning: Test for continuity failed."
            print "func number %i at address %s" % (idx, hex(curfunc))

# Last function is treated separately
end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)
funclens.append(end_addr-funcs[-1])
last = NextFunction(funcs[-1])
# NextFunction returns BADADDR when there is no further function
if last != BADADDR:
    func2func.append(last - funcs[-1])
else:
    func2func.append(funclens[-1])
print funclens
print func2func

# Get number of functions
no_functions = len(funcs)

# Make a permutation
funcorder = range(no_functions)
# make a copy of funcorder
new_funcorder = funcorder[:]
random.shuffle(new_funcorder)
# check the permutation: reshuffle until no function keeps its position
while True:
    count = 0
    for idx in range(no_functions):
        if funcorder[idx] == new_funcorder[idx]:
            count = count + 1
    if count > 0:
        new_funcorder = funcorder[:]
        random.shuffle(new_funcorder)
    else:
        break
##print funcorder
##map(print hex(funcs), funcs)

# Map old function start -> new function start
funcmap = dict()
curaddr = funcs[0]
for idx in xrange(no_functions):
    curfunc = funcs[new_funcorder[idx]]
    funcmap[hex(curfunc)] = hex(curaddr)
    curaddr = curaddr + func2func[new_funcorder[idx]]
print new_funcorder
print funcmap

# Build Address Translation LUT
atlut = dict()
for func in funcs:
    func_end = GetFunctionAttr(func, FUNCATTR_END)
    newfuncaddr = funcmap[hex(func)]
    inst = func
    atlut[hex(inst)] = newfuncaddr
    inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)
    while inst < func_end:
        atlut[hex(inst)] = hex(int(newfuncaddr,16) + (inst-func));
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT);
print atlut
print "*************************************"

# Patch in-place
for func in funcs:
    func_end = GetFunctionAttr(func, FUNCATTR_END)
    instr = func
    while instr < func_end:
        print "Considering %s." % hex(instr)
        opidx = 0
        optype = GetOpType(instr, opidx)
        while optype > 0:
            if(optype == 7):
                # Immediate near address
                # Maybe not necessary but check for call instruction
                if(GetMnem(instr) == 'call'):
                    print "Instruction at %s being patched." % hex(instr)
                    nearaddr = LocByName(GetOpnd(instr, opidx))
                    print hex(nearaddr)
                    print "atlut[hex(instr)] is " + atlut[hex(instr)]
                    if (hex(nearaddr) in atlut):
                        newrva = int(atlut[hex(nearaddr)],16) - (int(atlut[hex(instr)],16)+0x6)
                    else:
                        if (nearaddr == BADADDR):
                            print "Fatal error! error processing instruction at %s" % hex(instr)
                            sys.exit(1)
                        else:
                            newrva = nearaddr - (int(atlut[hex(instr)],16)+0x6)
                    print "newrva is " + str(newrva) + " in hex " + hex(newrva)
                    # Add 1 before wrapping, as done for call2main further down
                    PatchDword(instr+1, dec2hex(newrva+1))
                else:
                    print "Unsupported! Unknown %s instruction needs to be patched" % GetMnem(instr)
            opidx = opidx + 1
            optype = GetOpType(instr, opidx)
        instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT);

# Enumerate and store functions
funcdict = defaultdict(list)
funcidx = 0 # used for indexing into the func2func list
for func in funcs:
    funclen = func2func[funcidx]
    for addr in range(func,func+funclen):
        funcdict[funcidx].append(Byte(addr))
    funcidx = funcidx + 1
print funcdict
imbase = dewa.OPTIONAL_HEADER.ImageBase

# layout the functions in the new order
writeaddr = funcs[0]
print hex(writeaddr)
for fidx in new_funcorder:
    print fidx
    for codebyte in funcdict[fidx]:
        print hex(writeaddr), hex(codebyte)
        PatchByte(writeaddr, codebyte)
        dewa.set_bytes_at_rva(writeaddr-imbase, chr(codebyte))
        writeaddr = writeaddr + 1

# Correct entry point
if(GetOpType(call2main,0) != 7):
    print "Unexpected operand found at call2main. Aborting...\n"
    sys.exit(1)
print atlut[hex(main)]
print int(atlut[hex(main)],16)
print call2main
print newrva
newrva = int(atlut[hex(main)],16) - (call2main+0x6);
PatchDword(call2main+0x1, dec2hex(newrva+1));
dewa.set_dword_at_rva(call2main+1-imbase, dec2hex(newrva+1))

# Commit to file / save to disk
dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')
Answer: 235!?
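For anyone following along, the call patching in both scripts comes down to recomputing the 32-bit displacement of a 5-byte near call (opcode E8) from the new location of the call and the new location of its target. Below is a small standalone sketch of that arithmetic. The function names and the example addresses are made up for illustration and are not taken from watermark1.exe.

```python
def call_rel32(call_addr, target_addr):
    """Displacement stored after the E8 opcode of a near call at call_addr.
    It is relative to the address of the next instruction (call_addr + 5)."""
    return (target_addr - (call_addr + 5)) & 0xFFFFFFFF


def script_style_rel32(call_addr, target_addr):
    """The same value written the way the scripts above compute it:
    subtract (address + 6), then add 1 when patching."""
    newrva = target_addr - (call_addr + 0x6)
    return (newrva + 1) & 0xFFFFFFFF


# Hypothetical move: the call ends up at 0x401040 and its target at 0x401020.
disp = call_rel32(0x401040, 0x401020)
print(hex(disp))
```

The mask is applied after the `+1` on purpose: in the corner case `newrva == -1`, wrapping first and adding 1 afterwards would give 2**32, which no longer fits in a dword.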
Looking great niaren.
About this:
Originally Posted by niaren
Shouldn't that always be "all of them"?
My assumption was that the 'overhead/auxiliary functions' introduced by the compiler are always appended to the exe file. As such they don't contribute to the watermark. However, instead of creating a more complicated exe file for further development and testing, we might just as well try to reorder all the functions in the watermark1.exe file instead of just the 3 functions. I have tried to do that.
The result is an improved Python script (see bottom). With this script it is possible to reorder a range of functions that obey a continuity constraint. As it turns out, the first 9 functions in the watermark1.exe file fulfill this constraint and can be reordered with the script.
Continuity constraint:
A consecutive sequence of functions, as identified by IDA, is said to be continuous if the space in between functions (if there is any) consists of sequences of 0xCC (int 3) bytes.
In other words, we expect to find 0xCC sequences in between functions in order to align functions to paragraphs. If this is not the case we have no real control of what is going on with the bytes in between the functions. It could very well be code.
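Outside of IDA, the constraint can be sketched on a raw byte buffer like this. It is a minimal illustration only: `continuity_breaks`, the toy image and its addresses are assumptions made for the example, and the function boundaries would in practice come from IDA.

```python
INT3 = 0xCC


def continuity_breaks(image, base, funcs):
    """funcs is a list of (start, end) addresses sorted by start, end exclusive.
    Return the indices i for which the gap between funcs[i] and funcs[i + 1]
    contains anything other than 0xCC padding bytes."""
    breaks = []
    for i in range(len(funcs) - 1):
        gap = image[funcs[i][1] - base:funcs[i + 1][0] - base]
        if any(b != INT3 for b in gap):
            breaks.append(i)
    return breaks


# Toy image: function A, five padding bytes, function B, two stray bytes, function C.
base = 0x401000
image = b"\x90\x90\xC3" + b"\xCC" * 5 + b"\xC3" + b"\xB8\x01" + b"\xC3"
funcs = [(0x401000, 0x401003), (0x401008, 0x401009), (0x40100B, 0x40100C)]
broken = continuity_breaks(image, base, funcs)
print(broken)  # gaps that are not pure padding
```

An empty gap (two functions back to back) passes the check, which matches the "if there is any" in the definition.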
I will now depict a scenario in the watermark1.exe file where this continuity constraint is not obeyed, and which furthermore turns out to be relatively hard to deal with in de-watermarking terms (at least I think so).
Below is shown the IDA output for watermark1, where sub_401276 is the 10th function (the first function being the one with the lowest address). We see that the bytes following the end of function sub_401276 are not identified (or at least not indicated) by IDA as a function. We could regard these bytes as code and move them. However, there are no references (Xrefs) to the code. Therefore we have no direct way to take care of patching the code that calls this code, if it gets called.
Code:
.text:0040126C public $LN25
.text:0040126C $LN25 proc near <------------- OEP (not interesting)
.text:0040126C call ___security_init_cookie
.text:00401271 jmp ___tmainCRTStartup
.text:00401271 $LN25 endp
.text:00401271
.text:00401276
.text:00401276 ; =============== S U B R O U T I N E =======================================
.text:00401276
.text:00401276
.text:00401276 sub_401276 proc near ; CODE XREF: sub_40102C:loc_401063p
.text:00401276 ; sub_40102C+4Ep ...
.text:00401276 mov eax, offset off_40C008
.text:0040127B retn
.text:0040127B sub_401276 endp
.text:0040127B
.text:0040127C ; ---------------------------------------------------------------------------
.text:0040127C
.text:0040127C ___initstdio: <- instructions/data located in between functions
.text:0040127C mov eax, dword_40EAC0
.text:00401281 push esi
.text:00401282 push 14h
.text:00401284 pop esi
.text:00401285 test eax, eax
.text:00401287 jnz short loc_401290
.text:00401289 mov eax, 200h
.text:0040128E jmp short loc_401296
In fact, this code gets called by this piece of code (inside the __initterm_e function)
Code:
esi holds the address of an array of addresses; the loop is executed until a non-NULL pointer is found, and execution is directed to the function with the corresponding address
.text:004026C4 loc_4026C4: ; CODE XREF: __initterm_e+1Fj
.text:004026C4 test eax, eax
.text:004026C6 jnz short loc_4026D8
.text:004026C8 mov ecx, [esi]
.text:004026CA test ecx, ecx
.text:004026CC jz short loc_4026D0
.text:004026CE call ecx
.text:004026D0
.text:004026D0 loc_4026D0: ; CODE XREF: __initterm_e+15j
.text:004026D0 add esi, 4
.text:004026D3
.text:004026D3 loc_4026D3: ; CODE XREF: __initterm_e+Bj
.text:004026D3 cmp esi, [ebp+arg_4]
.text:004026D6 jb short loc_4026C4
This seems like quite a difficult situation to handle from a de-watermarking point of view. One possible approach could be to search the data section for 0040127C values... but that doesn't seem very feasible.
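Just to make the data-section search concrete, here is a minimal sketch of what such a scan could look like on a raw copy of the section. `find_pointer_slots`, the section address 0x40C000 and the table contents are assumptions for illustration only. A hit only means that a dword happens to hold that value, so false positives are possible, which is part of why the approach feels shaky.

```python
import struct


def find_pointer_slots(data, data_va, target_va, step=4):
    """Return the virtual addresses of `step`-aligned little-endian dwords
    in `data` (loaded at data_va) whose value equals target_va."""
    needle = struct.pack("<I", target_va)
    return [data_va + off
            for off in range(0, len(data) - 3, step)
            if data[off:off + 4] == needle]


# Toy initializer table such as the one walked by the loop above:
# NULL, pointer to the code at 0x0040127C, another pointer, NULL.
table = struct.pack("<4I", 0, 0x0040127C, 0x00401300, 0)
slots = find_pointer_slots(table, 0x40C000, 0x0040127C)
print([hex(s) for s in slots])
```

Each slot found this way would have to be rewritten with the new address of the moved code, in the IDA database as well as in the file.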
Updated python script
PHP Code:
import pefile
import random
from collections import defaultdict
intsize = 32
magic = 2**intsize
def dec2hex(x, m=magic):
if x<0:
return magic+x
else:
return x
class DEWA(pefile.PE):
    def GetFunctionFromVA(self, va):
        # Start address of the function that precedes the first function above va
        for func in Functions():
            if func > va:
                f = PrevFunction(func)
                return f
        return BADADDR  # no function starts after va

    def CheckInstr(self, instr):
        # True only for near call/jmp instructions whose target lies outside
        # the function that contains them
        mnem = GetMnem(instr)
        if (mnem == "call") or (mnem == "jmp"):
            if Byte(instr) == 0xeb:  # jmp short
                return False
            if Byte(instr) == 0xff:  # call indirect
                return False
            # Immediate near address
            if GetOpType(instr, 0) != 7:
                print "Unexpected %s operand %i found at %s. Aborting...\n" % (mnem, GetOpType(instr, 0), hex(instr))
                return False
            else:
                nearaddr = LocByName(GetOpnd(instr, 0))
                if nearaddr == BADADDR:
                    print "Error locating near address at %s. Aborting...\n" % (hex(instr))
                    return False
                else:
                    # check that near address falls outside of current function
                    fcur = self.GetFunctionFromVA(instr)
                    if fcur == BADADDR:
                        print "GetFunctionFromVA Error at %s. Aborting...\n" % (hex(instr))
                        return False
                    fnext = NextFunction(fcur)
                    if fnext == BADADDR:
                        print "NextFunction Error at %s. Aborting...\n" % (hex(instr))
                        return False
                    # LocByName returns an absolute address, so compare it directly
                    if fcur <= nearaddr < fnext:
                        print "instr %s, fcur %s, fnext %s\n" % (hex(instr), hex(fcur), hex(fnext))
                        return False
                    else:
                        return True
        else:
            return False

    def PatchInstr(self, instr, atlut):
        # Recompute the rel32 of a call/jmp that lives inside a moved function
        mnem = GetMnem(instr)
        if (mnem == "call") or (mnem == "jmp"):
            nearaddr = LocByName(GetOpnd(instr, 0))
            if hex(nearaddr) in atlut:
                newrva = int(atlut[hex(nearaddr)], 16) - (int(atlut[hex(instr)], 16) + 0x6)
            else:
                newrva = nearaddr - (int(atlut[hex(instr)], 16) + 0x6)
            print "oldrva: " + hex(nearaddr) + " newrva is " + str(newrva) + " in hex " + hex(newrva)
            # +1 must be applied before the two's complement conversion
            PatchDword(instr + 1, dec2hex(newrva + 1))

    def PatchXrefInstr(self, instr, addr, atlut, imbase):
        # Retarget a call/jmp outside the moved functions, in the IDB and in the file
        newrva = int(atlut[hex(addr)], 16) - (instr + 0x6)
        PatchDword(instr + 0x1, dec2hex(newrva + 1))
        self.set_dword_at_rva(instr + 1 - imbase, dec2hex(newrva + 1))

    def PrintStuff(self):
        print self.DOS_HEADER.e_lfanew
        print self.OPTIONAL_HEADER.ImageBase
        print self.sections[0].PointerToRawData
        print self.sections[0].VirtualAddress
def idapymain():
    dewa = DEWA('C:\\rce\\LinkOrder\\manifestless\\watermark1.exe')
    dewa.PrintStuff()
    imbase = dewa.OPTIONAL_HEADER.ImageBase
    main = LocByName("_main")
    if main == BADADDR:
        print "Unexpected result: could not find _main, LocByName(\"_main\")"
        return
    print "address of main is " + hex(main)
    print "entry point read from file is " + hex(dewa.OPTIONAL_HEADER.AddressOfEntryPoint)
    # Get functions
    funcs = []
    for idx, func in enumerate(Functions()):
        if Name(func) == "_pre_cpp_init":
            break
        if idx > 9:
            break
        funcs.append(func)
    funclens = []
    func2func = []
    # Loop over the functions and check if they represent a contiguous block
    for idx in range(len(funcs) - 1):
        curfunc = funcs[idx]
        end_addr = GetFunctionAttr(curfunc, FUNCATTR_END)
        funclens.append(end_addr - curfunc)
        func2func.append(funcs[idx + 1] - curfunc)
        for byte in range(end_addr, funcs[idx + 1]):
            if Byte(byte) != 204:  # 0xCC
                print "Warning: Test for continuity failed."
                print "func number %i at address %s" % (idx, hex(curfunc))
    # Last function is treated separately
    end_addr = GetFunctionAttr(funcs[-1], FUNCATTR_END)
    funclens.append(end_addr - funcs[-1])
    last = NextFunction(funcs[-1])
    if last != BADADDR:
        func2func.append(last - funcs[-1])
    else:
        func2func.append(funclens[-1])
    print funclens
    print func2func
    # Get number of functions
    no_functions = len(funcs)
    # Make a permutation
    funcorder = range(no_functions)
    # make a copy of funcorder
    new_funcorder = funcorder[:]
    random.shuffle(new_funcorder)
    # check the permutation: reshuffle until no function keeps its position
    while True:
        count = 0
        for idx in range(no_functions):
            if funcorder[idx] == new_funcorder[idx]:
                count = count + 1
        if count > 0:
            new_funcorder = funcorder[:]
            random.shuffle(new_funcorder)
        else:
            break
    ##print funcorder
    ##map(print hex(funcs), funcs)
    funcmap = dict()
    curaddr = funcs[0]
    for idx in xrange(no_functions):
        curfunc = funcs[new_funcorder[idx]]
        funcmap[hex(curfunc)] = hex(curaddr)
        curaddr = curaddr + func2func[new_funcorder[idx]]
    print new_funcorder
    print funcmap
    # Build Address Translation LUT
    atlut = dict()
    for func in funcs:
        func_end = GetFunctionAttr(func, FUNCATTR_END)
        newfuncaddr = funcmap[hex(func)]
        inst = func
        atlut[hex(inst)] = newfuncaddr
        inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)
        while inst < func_end:
            atlut[hex(inst)] = hex(int(newfuncaddr, 16) + (inst - func))
            inst = FindCode(inst, SEARCH_DOWN | SEARCH_NEXT)
    print atlut
    print "*** Build ATLUT end *** \n\n"
    print "*** Handle xrefs start *** \n"
    for idx, func in enumerate(funcs):
        xref = RfirstB(func)
        while xref != BADADDR:
            if hex(xref) not in atlut:
                if dewa.CheckInstr(xref) == False:
                    print "Unexpected Xref instruction found at %s. Aborting...\n" % (hex(xref))
                    return
                print "Reference to %i %s found at %s\n" % (idx, hex(func), hex(xref))
                dewa.PatchXrefInstr(xref, func, atlut, imbase)
            xref = RnextB(func, xref)
    print "*** Handle xrefs end *** \n\n"
    print "*** Patch in-place *** \n"
    # Patch in-place
    for func in funcs:
        func_end = GetFunctionAttr(func, FUNCATTR_END)
        instr = func
        while instr < func_end:
            if dewa.CheckInstr(instr) == True:
                print "Instruction at %s being patched." % hex(instr)
                dewa.PatchInstr(instr, atlut)
            instr = FindCode(instr, SEARCH_DOWN | SEARCH_NEXT)
    # Enumerate and store functions
    funcdict = defaultdict(list)
    funcidx = 0  # used for indexing into func2func list
    for func in funcs:
        funclen = func2func[funcidx]
        for addr in range(func, func + funclen):
            funcdict[funcidx].append(Byte(addr))
        funcidx = funcidx + 1
    print funcdict
    # layout the functions in the new order
    writeaddr = funcs[0]
    print hex(writeaddr)
    for fidx in new_funcorder:
        print fidx
        for codebyte in funcdict[fidx]:
            #print hex(writeaddr), hex(codebyte)
            PatchByte(writeaddr, codebyte)
            dewa.set_bytes_at_rva(writeaddr - imbase, chr(codebyte))
            writeaddr = writeaddr + 1
    # Check entry point
    oep = dewa.OPTIONAL_HEADER.AddressOfEntryPoint + imbase
    print "OEP is %s\n" % hex(oep)
    if hex(oep) in atlut:
        print "OEP is changed to %s\n" % atlut[hex(oep)]
        dewa.OPTIONAL_HEADER.AddressOfEntryPoint = int(atlut[hex(oep)], 16) - imbase
    # Commit to file / save to disk
    dewa.write('C:\\rce\\LinkOrder\\manifestless\\dewatermark1.exe')
    print "Done!\n"

idapymain()
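As a side note on the displacement arithmetic in the script: a 5-byte near call or jmp (opcode E8/E9) stores target minus the address of the next instruction, so subtracting `instr+0x6` and then adding 1 comes out as the usual `instr+5`. The stand-alone sketch below, with made-up addresses and hypothetical helper names (`rel32`, `retarget`), shows the same calculation without IDA, using integer keys instead of the hex strings used in the script.

```python
def rel32(instr_va, target_va):
    """Displacement field of a 5-byte E8/E9 call/jmp located at instr_va,
    as an unsigned 32-bit value (two's complement for backward targets)."""
    return (target_va - (instr_va + 5)) & 0xFFFFFFFF

def retarget(instr_va, target_va, lut):
    """New displacement after moving code. `lut` maps old addresses to new
    ones; addresses missing from it are assumed not to have moved."""
    new_instr = lut.get(instr_va, instr_va)
    new_target = lut.get(target_va, target_va)
    return rel32(new_instr, new_target)

# Made-up layout: the function at 0x401000 moves to 0x401040 and the
# function at 0x401020 moves to 0x401000. The call sits 5 bytes into the
# first function, so it moves along with it.
lut = {0x401000: 0x401040, 0x401005: 0x401045, 0x401020: 0x401000}

print(hex(rel32(0x401005, 0x401020)))          # displacement before the move
print(hex(retarget(0x401005, 0x401020, lut)))  # displacement after the move
```

Masking with 0xFFFFFFFF plays the same role as dec2hex in the script.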
Nice to see your progress.
I understand that indirect calls that are not caught in the IDA xrefs will pose a big problem. But if the full address of the called function is read from the data section, it must also be in the reloc data of the executable, so as long as relocs are not stripped, we should still be ok, right?
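To make the reloc idea a bit more concrete, here is a rough sketch of walking the base relocation table and picking out the slots that hold an absolute address inside the moved code. It assumes a 32-bit PE, where each block of the table starts with a page RVA and a block size, followed by 16-bit entries whose top 4 bits are the type (3 = HIGHLOW) and low 12 bits the offset in the page. The function names, the RVA-indexed `image` buffer and the sample values are made up for illustration; with pefile the buffer could probably come from `get_memory_mapped_image()`.

```python
import struct

IMAGE_REL_BASED_HIGHLOW = 3  # 32-bit absolute address fixup

def highlow_reloc_rvas(reloc_data):
    """Return the RVAs of all HIGHLOW fixups in a raw base relocation table."""
    rvas = []
    pos = 0
    while pos + 8 <= len(reloc_data):
        page_rva, block_size = struct.unpack_from("<II", reloc_data, pos)
        if block_size < 8:  # malformed or terminating block
            break
        end = min(pos + block_size, len(reloc_data))
        for off in range(pos + 8, end - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc_data, off)
            if entry >> 12 == IMAGE_REL_BASED_HIGHLOW:
                rvas.append(page_rva + (entry & 0x0FFF))
        pos += block_size
    return rvas

def pointers_into(image, image_base, reloc_rvas, lo_va, hi_va):
    """List (slot VA, stored VA) for fixups whose value lies in [lo_va, hi_va).
    `image` is assumed to be a flat copy of the image indexed by RVA."""
    hits = []
    for rva in reloc_rvas:
        if rva + 4 > len(image):
            continue
        (value,) = struct.unpack_from("<I", image, rva)
        if lo_va <= value < hi_va:
            hits.append((image_base + rva, value))
    return hits

# Synthetic example: one block for page 0x3000 with a HIGHLOW entry at
# offset 4 plus one padding entry, and a pointer 0x0040127C stored there.
block = struct.pack("<IIHH", 0x3000, 12, (IMAGE_REL_BASED_HIGHLOW << 12) | 0x004, 0)
image = bytearray(0x3010)
struct.pack_into("<I", image, 0x3004, 0x0040127C)

rvas = highlow_reloc_rvas(block)
print(pointers_into(image, 0x400000, rvas, 0x401000, 0x402000))
```

Note that fixups also cover immediate operands inside instructions (for example a push of a function offset), so the hits are not limited to the data section.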
Also, for the cases with stripped relocs, maybe the de-watermarking tool could at least helpfully provide a list of all indirect call instructions in the entire program, so that they could be analyzed manually by the user, and then manually entered as resolved xrefs into IDA before proceeding with the final working de-watermarking procedure?
I have no idea how many of these there are in a normal program, but at least the tool has then done all it can to help, which would be the goal; nothing more can be demanded. And if the user wants, he can then analyze and resolve all of these, enter their resolved xrefs manually in IDA, and then run the tool again to actually get a working de-watermarked executable!
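A minimal sketch of such a listing, assuming the instruction bytes have already been collected per instruction (inside IDA that could be done with the same Functions()/FindCode() walk the script uses). On x86, opcode FF with ModRM reg field 2 or 3 is an indirect call and 4 or 5 an indirect jmp. The helper names and sample bytes are made up, and prefixes are ignored for simplicity.

```python
def indirect_branch_kind(code):
    """Classify the raw bytes of ONE instruction.
    Returns 'call', 'jmp' or None. Instruction prefixes are not handled."""
    code = bytearray(code)
    if len(code) < 2 or code[0] != 0xFF:
        return None
    reg = (code[1] >> 3) & 7  # ModRM reg field selects the FF sub-opcode
    if reg in (2, 3):
        return "call"
    if reg in (4, 5):
        return "jmp"
    return None  # FF /0, /1, /6 are inc, dec, push

def list_indirect_branches(instructions):
    """Given (address, bytes) pairs, return (address, kind) for every
    indirect call or jmp, ready to be shown to the user for manual review."""
    report = []
    for ea, code in instructions:
        kind = indirect_branch_kind(code)
        if kind is not None:
            report.append((ea, kind))
    return report

# Sample input; only the first entry is taken from the listing above,
# the remaining addresses are made up.
sample = [
    (0x004026CE, b"\xFF\xD1"),                      # call ecx
    (0x004026D0, b"\x83\xC6\x04"),                  # add esi, 4
    (0x00401000, b"\xFF\x15\x00\x30\x40\x00"),      # call dword ptr [403000h]
    (0x00401010, b"\xFF\x24\x85\x00\x30\x40\x00"),  # jmp dword ptr [eax*4+403000h]
    (0x00401020, b"\xFF\xC0"),                      # inc eax
]
for ea, kind in list_indirect_branches(sample):
    print("%08X indirect %s" % (ea, kind))
```

The report only says where the indirect branches are; working out their possible targets is still left to the user.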