
Thread: Understanding something about why a compiler does this

  1. #1
    Technomancer
    Guest

    Understanding something about why a compiler does this

    I am sorry, but I am very new to reverse engineering. When I disassemble software, I often see something like this:

    * Reference To: VERSION.GetFileVersionInfoSizeA, Ord:0001h
    |
    :00477462 FF25F4934700 Jmp dword ptr [004793F4]

    Or it could be Call dword ptr [xxxxxxxx] too. I understand what it means technically. FF25 is Jmp dword ptr and F4934700 is 004793F4 in little endian format. So this instruction will cause EIP to be set to the dword stored at 004793F4 and will jump there.

    I understand this technically, but not the mechanism behind this.

    1. Why does the compiler do this? Isn't it kind of long winded?

    2. How does this relate to GetFileVersionInfoSize? Let's say the dword stored at 004793F4 is 00404000. That means this jump will bring you to the address 00404000. But how does that relate to GetFileVersionInfoSize? Basically, I just don't understand how it works, and I need to understand what happens from the point you jump to 00404000 onward and how it relates to GetFileVersionInfoSize.
    I promise that I have read the FAQ and tried to use the Search to answer my question.
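
    The byte-level reading in the question can be checked with a few lines of Python. This is only a sketch: the helper name decode_indirect is made up for illustration, and it understands nothing beyond the two 6-byte forms discussed in this thread (FF 25 for JMP and FF 15 for CALL through an absolute 32-bit memory operand).

```python
import struct

def decode_indirect(code: bytes):
    """Decode 'FF 25 disp32' (jmp) or 'FF 15 disp32' (call) on 32-bit x86.

    Hypothetical helper for illustration; a real disassembler handles far more.
    Returns (mnemonic, address of the dword that holds the real target).
    """
    if len(code) < 6 or code[0] != 0xFF or code[1] not in (0x15, 0x25):
        raise ValueError("not an indirect jmp/call with an absolute operand")
    # The operand is stored little-endian: F4 93 47 00 -> 0x004793F4.
    (slot,) = struct.unpack_from("<I", code, 2)
    return ("call" if code[1] == 0x15 else "jmp"), slot

mnemonic, slot = decode_indirect(bytes.fromhex("FF25F4934700"))
print(f"{mnemonic} dword ptr [{slot:08X}]")  # jmp dword ptr [004793F4]
```

    Note that the decoded value is only the address of a slot. Where execution actually ends up depends on what is stored in that slot at run time.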

  2. #2
    Super Moderator
    Join Date
    Dec 2004
    Posts
    1,511
    Blog Entries
    15
    The compiler does this because it cannot know what the address of GetVersion() will be while it is compiling.

    So it (strictly speaking, the linker) creates a section, typically called .rdata or .idata, and fills it with some information
    that the Windows loader can understand, resolve, and replace with the correct address.


    In effect the compiler says: "Hey loader, this guy wants to call GetVersion(), and GetVersion() is in kernel32.dll. When you have resolved the address of GetVersion(), dump it in this place so that I can use it when I am running." Apart from this information, it also leaves room for a timestamp and some more advanced fields such as the forwarder chain,

    and it places a jmp table at the end of the code.

    So call GetVersion() will jump to the jump table, which in turn jumps to the resolved address.

    So it is an array of five dwords for each dependency (DLL) that is to be resolved, each pointing to lists of imports terminated with a null dword, and the whole array is finally terminated with five null dwords to indicate that it is over. This is called the import table.

    There are names like OriginalFirstThunk, FirstThunk, etc. (I would suggest you get Luevelsmeyer's pe.txt
    http://spiff.tripnet.se/~iczelion/files/pe1.zip
    and give it a thorough reading, several times, until you understand the mechanisms and their names.)


    lets look at raw file and loaded file

    raw file (iczelions msgbox.exe)
    00000600 5C 20 00 00 00 00 00 00 78 20 00 00 00 00 00 00 \ ......x ......
    00000610 4C 20 00 00 00 00 00 00 00 00 00 00 6A 20 00 00 L ..........j ..
    00000620 00 20 00 00 54 20 00 00 00 00 00 00 00 00 00 00 . ..T ..........
    00000630 86 20 00 00 08 20 00 00 00 00 00 00 00 00 00 00 .. ..........
    00000640 00 00 00 00 00 00 00 00 00 00 00 00 5C 20 00 00 ............\ ..
    00000650 00 00 00 00 78 20 00 00 00 00 00 00 75 00 45 78 ....x ......u.Ex
    00000660 69 74 50 72 6F 63 65 73 73 00 4B 45 52 4E 45 4C itProcess.KERNEL
    00000670 33 32 2E 64 6C 6C 00 00 BB 01 4D 65 73 73 61 67 32.dll..Messag
    00000680 65 42 6F 78 41 00 55 53 45 52 33 32 2E 64 6C 6C eBoxA.USER32.dll
    00000690 00 00 ..


    loaded file

    00402000 >10 A1 3C 83 00 00 00 00 20 A1 3C 83 00 00 00 00 <.... <....
    00402010 4C 20 00 00 B4 C2 1F 37 00 00 F7 BF 6A 20 00 00 L ..7..j ..
    00402020 00 20 00 00 54 20 00 00 CD A1 20 37 00 00 F5 BF . ..T ..͡ 7..
    00402030 86 20 00 00 08 20 00 00 00 00 00 00 00 00 00 00 .. ..........
    00402040 00 00 00 00 00 00 00 00 00 00 00 00 5C 20 00 00 ............\ ..
    00402050 00 00 00 00 78 20 00 00 00 00 00 00 75 00 45 78 ....x ......u.Ex
    00402060 69 74 50 72 6F 63 65 73 73 00 4B 45 52 4E 45 4C itProcess.KERNEL
    00402070 33 32 2E 64 6C 6C 00 00 BB 01 4D 65 73 73 61 67 32.dll..Messag
    00402080 65 42 6F 78 41 00 55 53 45 52 33 32 2E 64 6C 6C eBoxA.USER32.dll
    00402090 00 00 ..

    You can see that the loader has resolved the 0x205C at the start of the table and replaced it with 0x833CA110, an address of the form 0x8####### (I am on 9x; if you are on NT or later you will have 0x7####### there).
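
    To make the layout in the raw dump concrete, here is a rough Python sketch that walks those bytes the way the descriptor fields suggest. Several things are assumptions read off the dump rather than taken from the PE header (which is not shown): that RVA 0x2000 sits at file offset 0x600, and that the first five-dword descriptor starts at RVA 0x2010. The function names are made up, and imports by ordinal are not handled.

```python
import struct

SECTION_RVA = 0x2000     # assumption: RVA 0x2000 corresponds to file offset 0x600
IMPORT_DIR_RVA = 0x2010  # assumption: first descriptor is at file offset 0x610

# Bytes 0x600..0x691 of the raw file, copied from the dump above.
RAW = bytes.fromhex("""
5C 20 00 00 00 00 00 00 78 20 00 00 00 00 00 00
4C 20 00 00 00 00 00 00 00 00 00 00 6A 20 00 00
00 20 00 00 54 20 00 00 00 00 00 00 00 00 00 00
86 20 00 00 08 20 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 5C 20 00 00
00 00 00 00 78 20 00 00 00 00 00 00 75 00 45 78
69 74 50 72 6F 63 65 73 73 00 4B 45 52 4E 45 4C
33 32 2E 64 6C 6C 00 00 BB 01 4D 65 73 73 61 67
65 42 6F 78 41 00 55 53 45 52 33 32 2E 64 6C 6C
00 00
""")

def cstring(rva):
    """Read a NUL-terminated ASCII string at the given RVA."""
    start = rva - SECTION_RVA
    return RAW[start:RAW.index(b"\0", start)].decode("ascii")

def walk_imports():
    """Return (dll, function, hint, IAT slot RVA) for every import by name."""
    found = []
    off = IMPORT_DIR_RVA - SECTION_RVA
    while True:
        # OriginalFirstThunk, TimeDateStamp, ForwarderChain, Name, FirstThunk
        oft, stamp, chain, name, first_thunk = struct.unpack_from("<5I", RAW, off)
        if not any((oft, stamp, chain, name, first_thunk)):
            break                      # five null dwords end the import table
        lookup = (oft or first_thunk) - SECTION_RVA
        slot = first_thunk             # slot the loader overwrites at load time
        while True:
            (entry,) = struct.unpack_from("<I", RAW, lookup)
            if entry == 0:
                break                  # a null dword ends this DLL's list
            (hint,) = struct.unpack_from("<H", RAW, entry - SECTION_RVA)
            found.append((cstring(name), cstring(entry + 2), hint, slot))
            lookup += 4
            slot += 4
        off += 20                      # next five-dword descriptor
    return found

for dll, func, hint, slot in walk_imports():
    print(f"{dll}!{func} (hint {hint:#x}) -> IAT slot at RVA {slot:#x}")
```

    On these bytes the walk finds ExitProcess from KERNEL32.dll with its slot at RVA 0x2000 and MessageBoxA from USER32.dll with its slot at RVA 0x2008, which are exactly the two dwords that differ at 00402000 and 00402008 in the loaded dump.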

  3. #3
    Naides is Nobody
    Join Date
    Jan 2002
    Location
    Planet Earth
    Posts
    1,647

    Welcome to RCE.

    The short answer would be to ask you to read up on the PE file structure and on how the import table (IT) and the import address table (IAT) work.

    But I am going to give you a little heads up:

    VERSION.GetFileVersionInfoSizeA, Ord:0001h is the NAME of a function imported from a DLL (here VERSION.dll).

    Your application needs to call this code, but it cannot tell in advance where GetFileVersionInfoSizeA will be in memory. In fact, depending on the OS or the version of the DLL, that address may differ between computers, and perhaps every time you run your program.

    So a construct like

    :00477462 Jmp 7432103

    or


    :00477462 Call 7432103

    will only work in the unlikely circumstance that GetFileVersionInfoSizeA is always loaded and located at the static address 7432103.


    Enter the OS loader

    When your application is loaded together with its DLLs, the loader learns where GetFileVersionInfoSizeA (and every other imported function) is located. Let us assume this time it happens to be 7432103.

    The loader scans the import table of your file and writes that address
    into a special table in your file called the IAT,
    at a particular slot, in this example [004793F4], where your application expects to find the address of the function GetFileVersionInfoSizeA.

    Then whenever your app wants to call it


    it runs

    00477462 FF25F4934700 Jmp dword ptr [004793F4]

    or

    00477462 FF15F4934700 call dword ptr [004793F4]

    which will always take you to the right destination, because the memory at

    004793F4 contains the value 7432103. If the function ends up somewhere else next time, the correct address will be found there instead.

    Those are the inner workings of dynamic linking, with one or two more convolutions, but you get the idea, eh?
    Last edited by naides; May 13th, 2006 at 13:11.
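
    As a toy illustration of the point above, the following Python sketch models memory as a dictionary and shows that the same jmp instruction lands in the right place no matter where the DLL was loaded. Everything apart from the slot address 004793F4 is made up for the example: the function names, the two DLL base addresses, and the offset 0x1A2B of the function inside the DLL.

```python
import struct

IAT_SLOT = 0x004793F4  # slot address taken from the example in this thread

def load(memory, dll_base, export_offset):
    """Pretend loader: write the resolved API address into the IAT slot."""
    memory[IAT_SLOT] = struct.pack("<I", dll_base + export_offset)

def jmp_dword_ptr(memory, slot):
    """Model 'jmp dword ptr [slot]': fetch the dword and return the new EIP."""
    (target,) = struct.unpack("<I", memory[slot])
    return target

memory = {}
for base in (0x77C00000, 0x10000000):   # hypothetical DLL load addresses
    load(memory, base, 0x1A2B)          # hypothetical offset of the function
    print(hex(jmp_dword_ptr(memory, IAT_SLOT)))
```

    The code bytes at 00477462 never change between the two runs; only the four bytes stored in the slot do.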

  4. #4
    <script>alert(0)</script> disavowed's Avatar
    Join Date
    Apr 2002
    Posts
    1,281
    Technomancer, see http://spiff.tripnet.se/~iczelion/pe-tut6.html for more info.

  5. #5
    You might read http://www.codebreakers-journal.com/viewarticle.php?id=74&layout=abstract .
    I want to know God's thoughts ...the rest are details.
    (A. Einstein)
    --------
    ..."a shellcode is a command you do at the linux shell"...

  6. #6
    Technomancer
    Guest
    Thanks a lot, guys.

    Just some more points to clarify. I am using WinXP, so it should be 7XXXXXXX.

    Let's say 00477462 FF25F4934700 Jmp dword ptr [004793F4]

    004793F4 will contain 7XXXXXXX, so basically we will be jumping to that. What I don't understand is:

    blabberer: What technically is a jump table, and where is the jump table? Is the 7XXXXXXX address part of a jump table? Also, as you stated, it will place a timestamp. So the timestamp is part of the "array of five dwords terminated with a null dword for each dependency thats to be resolved finally terminated with five null dwords", which is the IAT, right? How does the jump table relate to this, though?

    Maximus: I think that link is broken.
    I promise that I have read the FAQ and tried to use the Search to answer my question.

  7. #7
    Also, the reason why all imported calls go through the import table, instead of the loader modifying every jump/call instruction (which would achieve the same effect), is efficiency. Only one dword needs to be modified for each imported function, instead of many jumps/calls throughout the file.

    This is also useful to know for API-hooking purposes.

  8. #8
    Posted at the same time as me...
    Quote Originally Posted by Technomancer
    What technically is a jump table, and where is the jump table? Is the 7XXXXXXX address part of a jump table? Also, as you stated, it will place a timestamp. So the timestamp is part of the "array of five dwords terminated with a null dword for each dependency thats to be resolved finally terminated with five null dwords", which is the IAT, right? How does the jump table relate to this, though?
    A jump table is a table of JMP instructions pointing to the import slots, e.g.
    Code:
    JMP [00401600]
    JMP [00401604]
    JMP [00401608]
    JMP [0040160C]
    ...
    Many assemblers and compilers append such a table to the code, even though it would be a lot more efficient to call through the import slot directly.

    The 7XXXXXXX address is where the actual API function code resides.
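
    A jump table of this kind is nothing more than a run of 6-byte stubs, one per import slot. As a sketch (the function name build_jump_table is invented here, and this is not how any particular linker is implemented), the bytes for the four example entries above could be produced like this:

```python
import struct

def build_jump_table(slots):
    """Emit one 6-byte 'FF 25 <slot address>' stub per import slot."""
    return b"".join(b"\xFF\x25" + struct.pack("<I", slot) for slot in slots)

table = build_jump_table([0x00401600, 0x00401604, 0x00401608, 0x0040160C])
for i in range(0, len(table), 6):
    print(table[i:i + 6].hex(" ").upper())
```

    Each stub contains only the address of a slot, never the API address itself, so the table can be written at link time and needs no patching by the loader.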

  9. #9
    Naides is Nobody
    Join Date
    Jan 2002
    Location
    Planet Earth
    Posts
    1,647
    Another point:
    Addresses 7XXXXXXX correspond to the system DLLs.
    Your application may import functions from other, app-specific DLLs, which will be found at much lower addresses.

  10. #10
    Super Moderator
    Join Date
    Dec 2004
    Posts
    1,511
    Blog Entries
    15
    What's a jmp table?

    If you scroll down the listing in OllyDbg you will notice something like what you saw, starting with 0xFF25:

    Code:
    00401132   $-FF25 14204000  JMP DWORD PTR DS:[<&USER32.DialogBoxPara>
    00401138   $-FF25 10204000  JMP DWORD PTR DS:[<&USER32.EndDialog>]
    0040113E   $-FF25 20204000  JMP DWORD PTR DS:[<&USER32.GetDlgItem>]
    00401144   $-FF25 1C204000  JMP DWORD PTR DS:[<&USER32.GetDlgItemTex>
    0040114A   $-FF25 0C204000  JMP DWORD PTR DS:[<&USER32.MessageBoxA>]
    00401150   $-FF25 24204000  JMP DWORD PTR DS:[<&USER32.SendMessageA>>
    00401156   $-FF25 28204000  JMP DWORD PTR DS:[<&USER32.SetDlgItemTex>
    0040115C   $-FF25 18204000  JMP DWORD PTR DS:[<&USER32.SetFocus>]
    00401162   .-FF25 04204000  JMP DWORD PTR DS:[<&KERNEL32.ExitProcess>
    00401168   $-FF25 00204000  JMP DWORD PTR DS:[<&KERNEL32.GetModuleHa>
    
    or same thing in ollydbg without having it display the symbolic names
    
    00401132   $-FF25 14204000  JMP DWORD PTR DS:[402014]
    00401138   $-FF25 10204000  JMP DWORD PTR DS:[402010]
    0040113E   $-FF25 20204000  JMP DWORD PTR DS:[402020]
    00401144   $-FF25 1C204000  JMP DWORD PTR DS:[40201C]
    0040114A   $-FF25 0C204000  JMP DWORD PTR DS:[40200C]
    00401150   $-FF25 24204000  JMP DWORD PTR DS:[402024]
    00401156   $-FF25 28204000  JMP DWORD PTR DS:[402028]
    0040115C   $-FF25 18204000  JMP DWORD PTR DS:[402018]
    00401162   .-FF25 04204000  JMP DWORD PTR DS:[402004]
    00401168   $-FF25 00204000  JMP DWORD PTR DS:[402000]
    That's the jump table. It was put there by the compiler/assembler;
    one does not code it.
    One just says MessageBox(NULL,"blah","Blah",NULL);
    and the compiler will put an FF25 thingie at the end.

    if you ask ollydbg to resolve ip for any registers

    you can see

    Code:
    DS:[00402014]=898A7130, (Thunk to USER32.DialogBoxParamA)
    Local call from <ModuleEntryPoint>+20
    So this jump table entry is being called from ModuleEntryPoint+0x20.

    Double-clicking the address column gets you relative addressing mode in OllyDbg:

    Code:
    $+20 00401020 |. E8 0D010000 CALL 00401132 ; \DialogBoxParamA

    What's in 402014 after the file has been loaded?

    Code:
    00402014 >30 71 8A 89                                      0qŠ‰
    What was there in the raw file?

    Code:
    00000610              9C 20 00 00                              œ ..
    What did 0x209C originally point to, which the loader used to resolve the import?

    Code:
    00000690                                      92 00 44 69              ’.Di
    000006A0  61 6C 6F 67 42 6F 78 50 61 72 61 6D 41 00        alogBoxParamA.
    Which DLL was this DialogBoxParamA in?
    Code:
    00000630                                      16 21 00 00              !..
    What did the pointer 0x2116 (0x716 as a file offset) point to?
    Code:
    00000710                    55 53 45 52 33 32 2E 64 6C 6C        USER32.dll
    00000720  00 00                                            ..

    and so on

    The above examples are based on Iczelion's tut-10-2 (dialogbox.exe).

    It's absolutely simple once you grasp the basics.
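
    The chain in this post (relative call, then jump table stub, then IAT slot, then the same slot in the raw file) can be retraced numerically. The sketch below assumes an image base of 0x400000 and that RVA 0x2000 maps to file offset 0x600; both are inferred from the dumps, not read from the PE header, and the function names are made up.

```python
import struct

IMAGE_BASE = 0x400000                     # assumption, inferred from the addresses
SECTION_RVA, SECTION_RAW = 0x2000, 0x600  # assumption, inferred from the dumps

def call_target(addr, code):
    """Target of an 'E8 rel32' call at addr (relative to the next instruction)."""
    assert code[0] == 0xE8
    (rel,) = struct.unpack_from("<i", code, 1)
    return addr + 5 + rel

def stub_slot(code):
    """IAT slot address referenced by an 'FF 25 abs32' jump table stub."""
    assert code[:2] == b"\xFF\x25"
    (slot,) = struct.unpack_from("<I", code, 2)
    return slot

def va_to_file_offset(va):
    """Map a virtual address inside this one section to a raw file offset."""
    return va - IMAGE_BASE - SECTION_RVA + SECTION_RAW

stub = call_target(0x00401020, bytes.fromhex("E80D010000"))
slot = stub_slot(bytes.fromhex("FF2514204000"))
print(hex(stub), hex(slot), hex(va_to_file_offset(slot)))
```

    With those assumptions the call at 00401020 lands on the stub at 00401132, the stub reads the slot at 00402014, and that slot lives at file offset 0x614, which is where the raw dump shows 9C 20 00 00. The same mapping puts RVA 0x209C at file offset 0x69C and RVA 0x2116 at 0x716.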

  11. #11
    Quote Originally Posted by Technomancer
    Thanks a lot, guys.

    Just some more points to clarify. I am using WinXP, so it should be 7XXXXXXX.

    Let's say 00477462 FF25F4934700 Jmp dword ptr [004793F4]

    004793F4 will contain 7XXXXXXX, so basically we will be jumping to that. What I don't understand is:

    blabberer: What technically is a jump table, and where is the jump table? Is the 7XXXXXXX address part of a jump table? Also, as you stated, it will place a timestamp. So the timestamp is part of the "array of five dwords terminated with a null dword for each dependency thats to be resolved finally terminated with five null dwords", which is the IAT, right? How does the jump table relate to this, though?

    Maximus: I think that link is broken.
    A jump table introduces a level of indirection to solve the problem of dynamically loading functions in a Win32 environment. Indirection is a common idiom in computer science; it can resolve problems that seem impossible to tackle at first glance. Here, the heart of the problem is that the virtual address of GetFileVersionInfoSize cannot be known at compile time. The compiler simply defers the problem by introducing a data structure, the jump table, and lets the program loader fill in the proper virtual addresses of GetFileVersionInfoSize and the other imports at run time. As LLXX pointed out, this data structure is an ingenious way to engineer around the problem. There are other solutions, but the jump table is pretty neat.

  12. #12
    homersux,

    Your explanation is fine, but I'm still not happy. Here's what I understand:

    The code section is littered with calls to various API functions, the addresses of which are not known at compile time. Hence the calls are directed to a table which can be filled at run-time to solve everybody's problems. That's great.
    Now, the IAT (which is a null-separated, double-null-terminated array of DWORDs) is one example of the solution-by-indirection in that it is filled at run-time with the addresses of functions.
    Also, the 'jump table' (which is a table similar to the IAT, containing a series of JMP DWORD (PTR)s to all the APIs instanced) can solve the problem in much the same way.
    What I don't understand though is why both are necessary. The problem calls for one level of indirection only. Using two seems a little stupid. Maybe I'm missing something.

    Regards
    Admiral

  13. #13
    It must be due to some constraint on the compiler. Many of the programs I've inspected do not use a jump table (i.e. they call directly through the IAT), while an almost equally large number do (the API function is called with a CALL xxxxxxxx, which then proceeds to a JMP [xxxxxxxx] through the IAT). Perhaps the compiler finds it easier to generate JMP [xxxxxxxx] and call through that than to generate a "hypothetical" import table and either enforce that it be in the same position after linkage, or try to generate relocations for the imports (which might be an impossible task).

    Perhaps someone would experiment with different compilers and compile options to see whether or not a jump table is produced? I know for sure that MASM + LINK will generate a jump table, to the dismay of Asm programmers like me that find it a waste of space.

  14. #14
    You don't have to use a jump table. One level of indirection is fine. As long as you understand the reasoning behind this hassle. Details can be engineered.

  15. #15
    I know for sure that MASM + LINK will generate a jump table, to the dismay of Asm programmers like me that find it a waste of space.
    Actually, one reason to use JMP tables for calling imported functions is to SAVE space: a relative call (x86 opcode E8h) to a JMP table entry is 5 bytes, while a memory-indirect absolute call through the IAT (x86 opcode FF15h) requires 6 bytes of code.
    Of course, the 6-byte JMP table entry has to be paid for first, so this only pays off if an import is called from more than six different locations within the actual code.
    --
    Pyrae
    dead cafe owner
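
    The break-even point follows from simple arithmetic on the instruction sizes quoted above: n call sites cost 5n + 6 bytes with a jump table entry and 6n bytes without. A small sketch (the function name code_size is made up for illustration):

```python
def code_size(calls, use_jump_table):
    """Bytes of code spent on calling one import from 'calls' locations."""
    if use_jump_table:
        return 5 * calls + 6   # E8 rel32 per call site, plus one FF 25 stub
    return 6 * calls           # FF 15 abs32 at every call site

for n in (1, 6, 7, 20):
    print(n, code_size(n, True), code_size(n, False))
```

    At six call sites both variants cost 36 bytes; from the seventh call onward the jump table version is smaller by one byte per additional call.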
