[help] function size


roxaz
07-07-2008, 05:33 AM
I'm trying to find a way to get the size of a certain function, but I'm getting some very weird results =o The idea I'm trying to implement is this:
I have two functions, one right after the other; let's call them func1 and func2. I thought &func2 - &func1 would give me the size of func1. However, I got 5! I checked the addresses in a debugger, and the real size is 0x190. That kind of confused me. Does anyone have an idea how I could get the size of a function?

Yeah, I figured there would be curious guys asking 'what for?' ;] I'm trying to inject my code into another process. And yes, I know about DLL injection, but I want to be more sneaky ;]

naides
07-07-2008, 05:41 AM
Hi roxaz, welcome to the Board. Without further detail, this is what I guess is happening: the functions themselves are dynamically allocated in the heap, which is standard behavior in new compilers. &func2 - &func1 probably gives you the distance between the pointers in memory. I vaguely remember this same theme being discussed here previously. Try searching the forum; I don't remember any key search word that may be useful.

dELTA
07-07-2008, 07:47 AM
How can you acquire function pointers for non-exported functions from another process to begin with?

roxaz
07-07-2008, 07:51 AM
Those functions are inside my process. My process starts another process, and I want to copy some functions from my process into the started one.

dELTA
07-07-2008, 08:00 AM
Ah, I see. Another trick, if the compiler abstracts the code too much with jump tables etc., as might be the case with your problem, is to insert "signatures" of asm code at the beginning and end of the targeted functions, and then search for these in the entire code section of your program (this will still be very fast).

Example of such a signature:

jmp endlabel
dd signature_dword_1
dd signature_dword_2
...
endlabel:
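
A minimal sketch of the search side of this trick, assuming the markers are a short jmp (EB 08) over two signature dwords. The marker byte values, the names kStartSig/kEndSig, and both function names are made up for illustration; the code section is modelled as a plain byte vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical markers: "jmp short +8" followed by two arbitrary dwords.
static const std::uint8_t kStartSig[] = {0xEB, 0x08, 0xDE, 0xC0, 0xAD, 0xDE,
                                         0x11, 0x22, 0x33, 0x44};
static const std::uint8_t kEndSig[]   = {0xEB, 0x08, 0xDE, 0xC0, 0xAD, 0xDE,
                                         0x55, 0x66, 0x77, 0x88};
static const std::size_t kNotFound = static_cast<std::size_t>(-1);

// Offset of the first occurrence of sig at or after `from`, or kNotFound.
std::size_t FindSignature(const std::vector<std::uint8_t>& code,
                          const std::uint8_t* sig, std::size_t sigLen,
                          std::size_t from = 0)
{
    assert(sigLen > 0);
    if (from > code.size()) return kNotFound;
    auto first = code.begin() + static_cast<std::ptrdiff_t>(from);
    auto it = std::search(first, code.end(), sig, sig + sigLen);
    return it == code.end() ? kNotFound
                            : static_cast<std::size_t>(it - code.begin());
}

// Number of bytes strictly between the start marker and the end marker,
// or kNotFound if either marker is missing.
std::size_t SizeBetweenMarkers(const std::vector<std::uint8_t>& code)
{
    std::size_t start = FindSignature(code, kStartSig, sizeof kStartSig);
    if (start == kNotFound) return kNotFound;
    std::size_t bodyBegin = start + sizeof kStartSig;
    std::size_t end = FindSignature(code, kEndSig, sizeof kEndSig, bodyBegin);
    if (end == kNotFound) return kNotFound;
    return end - bodyBegin;
}
```

In a real program the byte vector would be replaced by a pointer and length covering your own code section, and the signature dwords should be values unlikely to occur in ordinary code.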

roxaz
07-07-2008, 08:11 AM
hehe, great idea, thx so much ;]

OHPen
07-07-2008, 10:01 AM
Hi,

As naides said, you are probably dealing with jump tables. If your functions inside the binary are really next to each other without displacement or padding, then you could simply extract the addresses from the jmp instructions in the table to get the real addresses of the functions.
You wouldn't even need a full disassembler for it, as simple jump/call detection would be sufficient in this case.

That might be another solution, but maybe the signature solution is the simpler one.

Regards,

OHPen
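
A rough sketch of the jump-table idea, assuming each table entry is a 5-byte `jmp rel32` (opcode E9). That would also explain why two adjacent entries are exactly 5 bytes apart. The function name ResolveJmpThunk is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// If p points at an E9 rel32 thunk, return the address it jumps to;
// otherwise assume p is already the real function start and return it.
const std::uint8_t* ResolveJmpThunk(const std::uint8_t* p)
{
    assert(p != nullptr);
    if (p[0] != 0xE9) return p;

    // Assemble the little-endian 32-bit displacement byte by byte.
    std::uint32_t u = static_cast<std::uint32_t>(p[1])
                    | (static_cast<std::uint32_t>(p[2]) << 8)
                    | (static_cast<std::uint32_t>(p[3]) << 16)
                    | (static_cast<std::uint32_t>(p[4]) << 24);
    std::int32_t rel = static_cast<std::int32_t>(u);

    // The displacement is relative to the end of the 5-byte instruction.
    return p + 5 + rel;
}
```

Subtracting the resolved address of func1 from the resolved address of func2 only gives the size of func1 if the linker really placed them back to back; any padding between them is counted too.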

Camus SoNiCo
07-07-2008, 10:11 AM
Did you try turning off Incremental Linking in the compiler?

Cheers,
Camus

roxaz
07-07-2008, 10:44 AM
great man, it worked perfectly. thank you so much ;]

naides
07-07-2008, 12:38 PM
What worked, roxaz: OHPen's suggestion, Camus's suggestion, or dELTA's suggestion?

roxaz
07-07-2008, 01:03 PM
The one by Camus, sorry, I forgot to quote ^^

darawk
07-07-2008, 02:18 PM
I wrote this code a while back to calculate the length of a function. It only works of course if all of the blocks are contiguous (not necessarily in execution order, but just overall contiguous):

Code:


/************************************************************************
Function length calculation algorithm - by Darawk:

1. Scan the function's code for branches, and record each branch. Stop
upon reaching an end-point*. This group of instructions constitutes
the current "block".
2. QSort the branch list
3. Recursively repeat steps 1 & 2 with each branch, skipping duplicates
and intra-block branches.

*end-point: A ret instruction or an unconditional backwards jump,
that jumps to a previous block.
************************************************************************/

// Supporting declarations (assumed; not part of the original post). mlde32()
// is the external length-disassembler engine: given a pointer to an
// instruction it returns that instruction's length in bytes. The INSTR_*
// opcode constants are the defines posted further down in this thread.
#include <windows.h>    // DWORD_PTR
#include <algorithm>    // std::sort
#include <vector>
using std::vector;

typedef unsigned char u8;
typedef signed char   s8;
typedef unsigned int  u32;
typedef int           s32;

extern "C" int mlde32(void *codeptr);   // assumed prototype, check the engine's header
void *GetFunctionEnd(void *func);
void *GetBranchListFromBlock(void *block, vector<void *> &branchList);
void *GetBranchAddress(u8 *instr);
bool  IsEndPoint(u8 *instr, void *curblock);

u32 GetFunctionLength(void *begin)
{
    // end points at the final end-point instruction, so its own
    // length has to be added to the distance.
    void *end = GetFunctionEnd(begin);
    u32 delta = (u32)((DWORD_PTR)end - (DWORD_PTR)begin);
    delta += mlde32(end);
    return delta;
}

void *GetFunctionEnd(void *func)
{
    void *block = func;
    vector<void *> branchList;
    // blockend now points to the end of this block
    void *blockend = GetBranchListFromBlock(block, branchList);

    // If there are no branches, then return
    // the end of this block. If we don't have this
    // here the loop will crash on an empty
    // branch list.
    if(branchList.size() == 0) return blockend;

    // Sort the list so that we can identify and
    // discard intra-block branches, and optimize
    // the removal of duplicates.
    std::sort(branchList.begin(), branchList.end());

    void *prev = NULL;
    vector<void *>::iterator branch;
    for(branch = branchList.begin(); branch != branchList.end(); branch++)
    {
        // Skip branches that jump into a block we've already
        // processed.
        if(*branch < blockend || *branch == prev)
            continue;

        blockend = GetFunctionEnd(*branch);
        prev = *branch;
    }

    return blockend;
}

void *GetBranchListFromBlock(void *block, vector<void *> &branchList)
{
    u8 *ptr = (u8 *)block;

    // If we reach an end-point, then this block is complete
    while(!IsEndPoint(ptr, block))
    {
        // Record all branching instructions that we encounter
        void *address = GetBranchAddress(ptr);
        if(address)
        {
            branchList.push_back(address);
        }

        // Next instruction
        ptr += mlde32(ptr);
    }

    return ptr;
}


void *GetBranchAddress(u8 *instr)
{
    s32 offset = 0;
    // This code will determine what type of branch it is, and
    // determine the address it will branch to. Relative offsets
    // count from the end of the instruction, so the instruction
    // length is added to each displacement.
    switch(*instr)
    {
    case INSTR_SHORTJMP:
    case INSTR_RELJCX:
        offset = (s32)(*(s8 *)(instr + 1));
        offset += 2;    // opcode + rel8
        break;
    case INSTR_RELJMP:
        offset = *(s32 *)(instr + 1);
        offset += 5;    // opcode + rel32
        break;
    case INSTR_NEAR_PREFIX:
        if(*(instr + 1) >= INSTR_NEARJCC_BEGIN && *(instr + 1) <= INSTR_NEARJCC_END)
        {
            offset = *(s32 *)(instr + 2);
            offset += 6;    // 0x0F prefix + opcode + rel32 (was 5)
        }
        break;
    default:
        // Check to see if it's in the valid range of JCC values.
        // e.g. ja, je, jne, jb, etc..
        if(*instr >= INSTR_SHORTJCC_BEGIN && *instr <= INSTR_SHORTJCC_END)
        {
            offset = (s32)*((s8 *)(instr + 1));
            offset += 2;
        }
        break;
    }

    if(offset == 0) return NULL;
    return instr + offset;
}

bool IsEndPoint(u8 *instr, void *curblock)
{
    void *address;
    s32 offset;
    switch(*instr)
    {
    case INSTR_RET:
    case INSTR_RETN:
    case INSTR_RETFN:
    case INSTR_RETF:
        return true;

    // The following two checks look for an instance in which
    // an unconditional jump returns us to a previous block,
    // thus creating a pseudo-endpoint. As above, the target is
    // relative to the end of the jump instruction.
    case INSTR_SHORTJMP:
        offset = (s32)(*(s8 *)(instr + 1));
        address = instr + offset + 2;
        if(address <= curblock) return true;
        break;
    case INSTR_RELJMP:
        offset = *(s32 *)(instr + 1);
        address = instr + offset + 5;
        if(address <= curblock) return true;
        break;
    default:
        return false;
    }

    return false;
}
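
The displacement arithmetic can be checked on raw bytes without a disassembler. The following is only a sketch with made-up names (ReadRel32, DecodeBranch): it reports the branch target as a delta from the start of the instruction, using the x86 encoding lengths of 2 bytes for rel8 forms, 5 bytes for E9 rel32, and 6 bytes for the 0F 8x rel32 forms.

```cpp
#include <cassert>
#include <cstdint>

// Read a little-endian signed 32-bit displacement.
static std::int32_t ReadRel32(const std::uint8_t* p)
{
    std::uint32_t u = static_cast<std::uint32_t>(p[0])
                    | (static_cast<std::uint32_t>(p[1]) << 8)
                    | (static_cast<std::uint32_t>(p[2]) << 16)
                    | (static_cast<std::uint32_t>(p[3]) << 24);
    return static_cast<std::int32_t>(u);
}

// On success, stores (target - instruction start) in delta and returns true.
// Returns false for anything that is not one of the handled relative branches.
bool DecodeBranch(const std::uint8_t* instr, std::int32_t& delta)
{
    assert(instr != nullptr);
    const std::uint8_t op = instr[0];

    // jmp short, jcxz/jecxz, and the short Jcc range: opcode + rel8.
    if (op == 0xEB || op == 0xE3 || (op >= 0x70 && op <= 0x7F)) {
        delta = static_cast<std::int8_t>(instr[1]) + 2;
        return true;
    }
    // jmp near: opcode + rel32.
    if (op == 0xE9) {
        delta = ReadRel32(instr + 1) + 5;
        return true;
    }
    // Near Jcc: 0x0F prefix + opcode + rel32.
    if (op == 0x0F && instr[1] >= 0x80 && instr[1] <= 0x8F) {
        delta = ReadRel32(instr + 2) + 6;
        return true;
    }
    return false;
}
```

Returning the result through a bool plus an out-parameter keeps a jump-to-self (delta 0, e.g. EB FE) distinguishable from "not a branch".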

roxaz
07-07-2008, 02:29 PM
That is a piece of artwork ^^ Gee, really, thx, best method of all. Maybe you could add the missing defines and tell me what the mlde32 func does?

deroko
07-07-2008, 02:54 PM
mlde32 is a length disassembler engine

http://vx.netlux.org/vx.php?id=em24

Kayaker
07-07-2008, 03:22 PM
Nice one darawk. I should have used that algorithm in my IceProbe disassembler. It wasn't needed for the job at hand, but that's a much nicer implementation to generate a complete disasm of a particular function with all the scattered code chunks.

What if the blocks are *not* contiguous, which may be the more common case? For example, 2 or 3 functions may jump to the same shared endpoint return code chunk, so that chunk couldn't be contiguous with at least one of the functions.

Would the algo you posted not work in that case, or could it be modified to work?

I knew there was a way to do it, I was just too lazy to figure it out
I may use that idea some day to update the disasm engine I use.

Thanks for the code.

Cheers,
Kayaker

darawk
07-07-2008, 03:37 PM
Yea, you could modify it to do that. This algorithm was designed for the purpose of supplying a length to a memcpy() so that I could copy arbitrary functions out of other modules, provided they didn't behave as you described. It would of course be possible to do the same with the more complex type of function that jumps all around the module (such as many in ntdll), but that is a little bit more difficult and wasn't necessary for what I was doing at the time.

roxaz
07-07-2008, 04:11 PM
Is it so difficult? We should calculate the length of every chunk and add them up. Sounds quite easy. Copying such a func should be hard though.
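
The adding-up part could look roughly like this, assuming some block walker has already produced the chunks as [begin, end) address ranges. The Range type and the function name TotalChunkBytes are invented for the example; overlapping ranges are counted once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// A chunk of code as a half-open address range [begin, end).
using Range = std::pair<std::uint32_t, std::uint32_t>;

// Total number of distinct bytes covered by the chunks.
std::uint32_t TotalChunkBytes(std::vector<Range> chunks)
{
    std::sort(chunks.begin(), chunks.end());   // order by begin address

    std::uint32_t total = 0;
    std::uint32_t coveredUpTo = 0;             // end of the bytes counted so far
    for (const Range& r : chunks) {
        assert(r.first <= r.second);
        // Skip the part of this chunk that an earlier chunk already covered.
        std::uint32_t begin = std::max(r.first, coveredUpTo);
        if (r.second > begin) {
            total += r.second - begin;
            coveredUpTo = r.second;
        }
    }
    return total;
}
```

This only counts bytes. It says nothing about how to lay the chunks out again in the target process.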

P.S. still it would be nice to get defines such as INSTR_RELJMP, INSTR_SHORTJCC_BEGIN, INSTR_SHORTJCC_END and many others ^_^

EDIT:
here ya go roxaz!
Quote:

#define INSTR_NEAR_PREFIX 0x0F
#define INSTR_FARJMP 0x2D // Far jmp prefixed with INSTR_FAR_PREFIX
#define INSTR_SHORTJCC_BEGIN 0x70
#define INSTR_SHORTJCC_END 0x7F
#define INSTR_NEARJCC_BEGIN 0x80 // Near's are prefixed with INSTR_NEAR_PREFIX byte
#define INSTR_NEARJCC_END 0x8F
#define INSTR_RET 0xC2
#define INSTR_RETN 0xC3
#define INSTR_RETFN 0xCA
#define INSTR_RETF 0xCB
#define INSTR_INT3 0xCC
#define INSTR_RELJCX 0xE3
#define INSTR_RELCALL 0xE8
#define INSTR_RELJMP 0xE9
#define INSTR_SHORTJMP 0xEB
#define INSTR_FAR_PREFIX 0xFF

gee, many thx to roxaz ;]]

darawk
07-08-2008, 11:50 AM
It shouldn't be difficult at all really. The only reason I didn't do it is because I was thinking too narrowly about the problem I was trying to solve. I wanted to be able to rip an arbitrary function out of any module on the fly, and at the time I wrote this, I shortsightedly limited myself to contiguous functions - but there really is no reason that this couldn't work on non-contiguous ones. All you'd have to do is re-order the blocks and tweak the jmp's to fit the new shape of the function if the function wasn't already contiguous (assuming your goal is copying the function - if it's just counting bytes or instructions, then this isn't necessary).
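
To illustrate the jmp tweaking: when a 5-byte `jmp rel32` or `call rel32` ends up at a new address but must still reach the same target, its displacement has to be recomputed from the new location. A small sketch, assuming 32-bit addresses; the names RelocatedRel32 and PatchRel32 are made up.

```cpp
#include <cassert>
#include <cstdint>

// Displacement that makes a 5-byte jmp/call located at newInstrAddr
// land on target. rel32 counts from the end of the instruction.
std::int32_t RelocatedRel32(std::uint32_t newInstrAddr, std::uint32_t target)
{
    return static_cast<std::int32_t>(target - (newInstrAddr + 5));
}

// Rewrite the displacement bytes of an E9 (jmp) or E8 (call) instruction
// that has been copied to newInstrAddr.
void PatchRel32(std::uint8_t* instr, std::uint32_t newInstrAddr,
                std::uint32_t target)
{
    assert(instr[0] == 0xE9 || instr[0] == 0xE8);
    std::uint32_t rel =
        static_cast<std::uint32_t>(RelocatedRel32(newInstrAddr, target));
    instr[1] = static_cast<std::uint8_t>(rel & 0xFF);          // little-endian
    instr[2] = static_cast<std::uint8_t>((rel >> 8) & 0xFF);
    instr[3] = static_cast<std::uint8_t>((rel >> 16) & 0xFF);
    instr[4] = static_cast<std::uint8_t>((rel >> 24) & 0xFF);
}
```

Branches whose source and target move together by the same amount keep their displacement. Only branches between blocks that move by different amounts need this, and short rel8 branches may have to be widened if the new distance no longer fits in a byte.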

EDIT: Oh, and here are my original definitions. We even made the same comment, lol.

Code:
#define INSTR_NEAR_PREFIX 0x0F
#define INSTR_SHORTJCC_BEGIN 0x70
#define INSTR_SHORTJCC_END 0x7F
#define INSTR_NEARJCC_BEGIN 0x80 // Near's are prefixed with a 0x0F byte
#define INSTR_NEARJCC_END 0x8F
#define INSTR_RET 0xC2
#define INSTR_RETN 0xC3
#define INSTR_RETFN 0xCA
#define INSTR_RETF 0xCB
#define INSTR_RELJCX 0xE3
#define INSTR_RELJMP 0xE9
#define INSTR_SHORTJMP 0xEB

deroko
07-08-2008, 12:36 PM
Quote:
[Originally Posted by Kayaker;75725]Nice one deroko. I should have used that algorithm in my IceProbe disassembler.


Not my code. It's from darawk; I just gave the reference for mlde32, which is used in the code.

Kayaker
07-08-2008, 03:46 PM
Oops, misplaced credit, sorry

homersux
07-27-2008, 10:07 PM
Quote:
[Originally Posted by deroko;75722]mlde32 is length disassembler engine

http://vx.netlux.org/vx.php?id=em24


Hello, beautiful code. Do you mind uploading this engine's zip file? The link you gave appears dead to me.

H

Kayaker
07-27-2008, 10:17 PM
If it's not readily available elsewhere it sounds like a candidate for CRCETL. Wasn't there an XDE engine as well?

deroko
07-28-2008, 10:08 AM
@homersux: indeed, I can't download it from that link either, but I found a copy of it in 29a e-zine #7, which you can download from: http://vx.org.ua/29a/main.html . I've also attached mlde32.

@Kayaker: yes, there is. I think it's in 29a #8, but I don't know if it's the updated version, as Zeljko Vrba described a little bug in it here: http://www.phrack.org/issues.html?id=13&issue=63

Quote:

----[ 3.4 - XDE bug


During the development, a I have found a bug in the XDE disassembler
engine: it didn't correctly handle the LOCK (0xF0) prefix. Because of the
bug XDE claimed that 0xF0 is a single-byte instruction. This is the
needed patch to correct the disassembler:

--- xde.c Sun Apr 11 02:52:30 2004
+++ xde_new.c Mon Aug 23 08:49:00 2004
@@ -101,6 +101,8 @@
       if (c == 0xF0)
       {
         if (diza->p_lock != 0) flag |= C_BAD; /* twice */
+        diza->p_lock = c;
+        continue;
       }
 
       break;

I also needed to remove __cdecl on functions, a 'feature' of Win32 C
compilers not needed on UNIX platforms.


Fyyre
07-28-2008, 06:55 PM
XDE version 1.02 fixes the lock prefix bug:

eXtended (XDE) disassembler engine
----------------------------------
version 1.02

History:

1.01 - 1st release
1.02 - lock prefix bug is fixed, thanx to www.core-dump.com.hr

http://vx.netlux.org/vx.php?id=ex01

homersux
07-28-2008, 07:53 PM
Thanks for the update, Fyyre.

H

dELTA
07-29-2008, 03:50 AM
Quote:
[Originally Posted by Kayaker;76203]If it's not readily available elsewhere it sounds like a candidate for CRCETL. Wasn't there an XDE engine as well?
Indeed, and that's why I already put it there the first time it was mentioned in this thread.

http://www.woodmann.com/collaborative/tools/index.php/Mlde32

And btw, XDE just magically appeared there too:

http://www.woodmann.com/collaborative/tools/index.php/EXtended_Disassembler_Engine_%28XDE%29

Interested parties should probably take a look at the entire "X86 Disassembler Libraries" category:

http://www.woodmann.com/collaborative/tools/index.php/Category:X86_Disassembler_Libraries

aqrit
09-14-2008, 03:31 PM
(1)
Include the code to inject as a binary resource
then just call SizeofResource()

(2)
use "#pragma section(...)" to put the code to inject in its own section
then walk the PE header to determine the size of the section

(3)
Put a label on the first instruction and another one right after the last instruction.
Subtract the addresses of the two labels to get
the size of the code between them.

A label has only function scope, which is a small pain...

Code:
DWORD StartCodeAddress, EndCodeAddress, CodeSize;

void someFunc(){
    // C++ can't get the address-of a label??
    // lea cannot store into a memory operand, so load each label
    // address into a register first and then store it.
    __asm mov eax, offset StartLabel
    __asm mov StartCodeAddress, eax
    __asm mov eax, offset EndLabel
    __asm mov EndCodeAddress, eax
    CodeSize = EndCodeAddress - StartCodeAddress;

    return;

    __asm{
    StartLabel: pop eax
                pop edx
                push eax
                mov eax,edx // dummy code
                mov edx,eax
                ret

    EndLabel:   nop // a label must point at code!
                // note that the nop is not included in the CodeSize
    }

}
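
For option (2), walking the PE header comes down to reading the section table. Below is a portable sketch that parses a PE image held in a byte vector and returns the VirtualSize of a named section. The function name SectionVirtualSize and the section name ".inject" used in the note are made up; the field offsets follow the PE/COFF layout (e_lfanew at 0x3C, 20-byte file header after the "PE\0\0" signature, 40-byte section headers).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static std::uint16_t Rd16(const std::vector<std::uint8_t>& b, std::size_t o)
{
    return static_cast<std::uint16_t>(b[o] | (b[o + 1] << 8));
}

static std::uint32_t Rd32(const std::vector<std::uint8_t>& b, std::size_t o)
{
    return static_cast<std::uint32_t>(b[o])
         | (static_cast<std::uint32_t>(b[o + 1]) << 8)
         | (static_cast<std::uint32_t>(b[o + 2]) << 16)
         | (static_cast<std::uint32_t>(b[o + 3]) << 24);
}

// VirtualSize of the section called `name`, or 0 if the image is
// malformed or has no such section.
std::uint32_t SectionVirtualSize(const std::vector<std::uint8_t>& image,
                                 const char* name)
{
    assert(name != nullptr);
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z') return 0;

    std::size_t pe = Rd32(image, 0x3C);                  // e_lfanew
    if (pe + 24 > image.size()) return 0;
    if (std::memcmp(&image[pe], "PE\0\0", 4) != 0) return 0;

    std::uint16_t numSections = Rd16(image, pe + 6);     // NumberOfSections
    std::uint16_t optSize = Rd16(image, pe + 20);        // SizeOfOptionalHeader
    std::size_t sec = pe + 24 + optSize;                 // first section header

    for (std::uint16_t i = 0; i < numSections; ++i, sec += 40) {
        if (sec + 40 > image.size()) return 0;
        // Section names are 8 bytes, NUL-padded, not always NUL-terminated.
        const char* secName = reinterpret_cast<const char*>(&image[sec]);
        if (std::strncmp(secName, name, 8) == 0)
            return Rd32(image, sec + 8);                 // VirtualSize
    }
    return 0;
}
```

Called as SectionVirtualSize(image, ".inject"), this gives the size of the whole section, so it only matches the size of the code to inject if nothing else is placed in that section.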