Why are the addresses and code different when opening an .exe with a debugger and a hex editor?

November 4th, 2012, 10:51
When I open the same .exe in OllyDbg and in Hex Workshop, the addresses are totally different from each other, but the code bytes are pretty much the same except at a few points. For example:

in OllyDbg at 0020100C : A1 00 30 02 00 , mov eax, dword ptr ds:[23000]
in Hex Workshop at 0000040C : A1 00 30 40 00

Thank you.

November 4th, 2012, 15:10
Directly from our FAQ (which you apparently did not read...):

I am using my hex editor to patch a byte at address 40746Dh, but I am not able to find that address. Is that possible?

Sure it is. You have to understand the difference between a Virtual Address and a File Offset, because for a physical patch you need to tell your hex editor to go to a specific file offset and not to a specific Virtual Address.
- Virtual Address = 40746Dh is the memory address where the byte is located when the program is executed. It is also the address shown by debuggers.
- File Offset represents the distance (in number of bytes) between the beginning of the file and a byte. If you want to patch that byte, you have to translate the Virtual Address into a File Offset and then go to that specific offset.
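As a rough illustration of that translation, here is a minimal Python sketch. It assumes you already know the image base and the section table (for example from a PE viewer). The Section tuple, the function name and the section values are hypothetical: they were picked to mirror the numbers in the question above, assuming the image was loaded at 0x00200000 and that .text starts at RVA 0x1000 in memory and at offset 0x400 in the file.

```python
from typing import NamedTuple, Optional, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA: where the section starts, relative to the image base
    raw_pointer: int      # file offset where the section's bytes start on disk
    raw_size: int         # number of bytes the section occupies in the file


def va_to_file_offset(va: int, image_base: int,
                      sections: Sequence[Section]) -> Optional[int]:
    """Translate a virtual address into a file offset, or None if not file-backed."""
    rva = va - image_base  # strip the load address to get a relative address
    for s in sections:
        delta = rva - s.virtual_address
        if 0 <= delta < s.raw_size:
            return s.raw_pointer + delta
    return None


# Hypothetical section table (assumption, not read from a real file).
sections = [Section(".text", 0x1000, 0x400, 0x200)]

offset = va_to_file_offset(0x0020100C, 0x00200000, sections)
print(hex(offset))  # 0x40c
```

The same function works for any load address: only the relative part (VA minus image base) decides where the byte sits in the file.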

November 4th, 2012, 23:06
Sorry but I still don't understand the whole concept. Does anyone have a source which I can study?

November 5th, 2012, 04:37
there is no better source than what you have in front of you
and perseverance is an alternative name for any source
google is a sources source the god of source

let me ask you some simple questions
1) did you try to look what virtual means in english first ?
2) i mean did you open an English to your language dictionary and try to understand what virtual may mean literally
3) did you understand what physical means ?
4) did you bother to apply your latest gained knowledge to the existing situation in front of you
5) did you fantasize ?
6) did you open a few other exes and try to discern a pattern ?
7) what did you do and what you cannot understand ?
8) reversing is an ART it needs passion perseverance and dreams in your un slept eyes
it is not an objective type pick one of four exam paper which has several coaching classes and solved guides

9) reversing is plucking the mole out of molehill not making mountains out of molehills
10) do your home work first read the faq several several times google every word in the faq
read every pdf, doc, manuscripts that you can lay your hands on whether relevant or not
and then come back with a question that is joyous to answer
a question that shows you did your research and we have to reverse to answer your question

virtual ~ not real it only exists in some volatile dreams of some crap
physical == real or appears to be real

relative address = my friends house is 2 houses from my house
physical address = no 22 faq street , question city

only those who know you, your house and which friend you mean can come to your friend's house if you say come to my friend's
house it's 2 houses to the right of my house

any tom dick and harry can come to your friend's house if you say come to
no 22, faq street , question city

a file in its physical form has absolute addresses or physical offsets
a debugger maps it and turns everything into 2 houses to my right or ten houses to my left

now right is a direction which is relative: it depends on the side from which you enter a two way street, or on the way
one is facing your house
if he faces your house right is to his right, and if you face him his right is to your left

now that is called a fixup a relative address that is relative to another relative phenomenon

go google and come back

November 5th, 2012, 11:08
lol, you should change your nick, blabber jibber blabberer

November 16th, 2012, 18:22
[Originally Posted by blabberer;93623]
virtual ~ not real it only exists in some volatile dreams of some crap
physical == real or appears to be real ...

In a hex editor, you are seeing the file as it exists on disk, and it uses real offsets from 0x0, not the 0x00400000 base typically used in computer memory. Furthermore, when the file is loaded into memory, the code gets relocated from where it is found in the disk image.

Just to expand on what blabberer wrote, since my background is in hardware: a physical address is the address in a real memory chip. Since each process in windoze is assigned 4 gigabytes of address space, that is not going to work well on a system with 512 megabytes of RAM. Even 4 gigs of RAM won't help a lot. So, the 4 gigs assigned by windoze is not real, it is virtual...aka imaginary.

It works because memory from real physical memory can be paged out to a swap file on the hard drive, and recalled as required. Using smoke and mirrors, you can have gigabytes of memory to reference (virtual memory) when none of it exists except that in a real memory chip. Also, the memory from the real physical space is mapped to the virtual space via a file paging system, that handles all the ugly details of converting physical to virtual and vice-versa.
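To make the mapping idea a little more concrete, here is a toy Python sketch of a virtual-to-physical lookup. It is a deliberate simplification: a real x86 system uses multi-level page tables walked by the CPU, not a flat dictionary, and the page table contents below are invented for illustration. The only real-world assumption is the usual 4 KB page size.

```python
PAGE_SIZE = 0x1000  # 4 KB pages, the usual size on 32-bit x86


def translate(virtual_address: int, page_table: dict) -> int:
    """Map a virtual address to a physical one using a toy, flat page table."""
    page_number, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table.get(page_number)
    if frame is None:
        # The page is not in RAM: a real OS would fetch it (e.g. from the
        # swap file) and retry; this sketch just reports the fault.
        raise LookupError("page fault: virtual page %#x is not resident" % page_number)
    return frame * PAGE_SIZE + offset


# Hypothetical mapping: virtual page number -> physical frame number.
page_table = {0x400: 0x1A2, 0x401: 0x037}

print(hex(translate(0x0040100C, page_table)))  # 0x3700c
```

The offset inside the page (the low 12 bits) is unchanged by the translation; only the page number is swapped for a frame number.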

Most apps in focus have a base address of 0x400000, which is situated at 0x80000000 (someone correct me if I'm wrong) in virtual memory, but that can be changed at compile time. Please note that an address of 0x80000000 cannot exist with a RAM size of 512 meg. The app does not have to be loaded with its code at 401000, as usual. When you use a disassembler, it is using the image on the disk for disassembly, not the address at which the app is loaded in memory. Therefore, the disassembled file can have a different address space than what you find in a computer's memory.

The app's stuff is located between 0x80000000 and 0xC0000000, where system memory starts. There is a difference of 0x40000000 between those addresses and converting to decimal you get 1,073,741,824 bytes, which is exactly 2^30 bytes, a bit over a decimal gigabyte. Of course, it's not really there.

I am a bit rusty and won't take offense at someone correcting me.

November 22nd, 2012, 04:35
i normally avoid answering dead threads where the original questioner has disappeared

but since i see constants like 0x80000000 etc i just thought i would chime in, in case someone else is reading this thread in future

the following remarks assume you have xpsp3 as os on a 32 bit machine

the maximum user space is controlled by a boot switch in boot.ini: /3gb gives 3 gb of user space and 1 gb of kernel space
default is 2gb of user space and 2 gb of kernel space
this 2gb limit can be queried using several methods including some wmic queries, easiest is to use windbg

lkd> ? poi(nt!MmHighestUserAddress)
Evaluate expression: 2147418111 = 7ffeffff
(puzzle: what does the last page contain, why effff and not fffff, and what are the characteristics of the page that spans from 7fff0000 to 7fffffff ?)
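The numbers in that output can be checked with a few lines of Python. This is only arithmetic on the value windbg printed above; the 0x80000000 boundary is an assumption that holds for the default 2 gb / 2 gb split (no /3gb switch), and the variable names are made up for this sketch.

```python
# Value printed by windbg above.
highest_user_address = 0x7FFEFFFF

# Assumed user/kernel boundary for the default 2 GB / 2 GB split.
user_kernel_boundary = 0x80000000

# Bytes between the last usable user address and the start of kernel space.
gap = user_kernel_boundary - (highest_user_address + 1)

print(highest_user_address)   # 2147418111
print(hex(gap), gap // 1024)  # 0x10000 64
```

So the region the puzzle asks about, 7fff0000 to 7fffffff, is 64 KB long; what it is used for is left to the reader, as in the post.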

the kernel space is common to all processes

likewise physical pages also have a defined range, check the MmHighestPhysicalPage and MmLowestPhysicalPage globals
so if you exceed say the upper limit your machine might crash citing insufficient resources
the smoke and mirrors can only go so far, they do not extend into hyperspace

when a process is started the windows loader reads the executable's header, determines its image base and
starts mapping your executable at that address (the preferred image base). in case of helper objects like dlls this preferred base
might not be available, so they can be relocated to what the loader finds as the next available slot

the Imagebase is decided by certain switches to the linker (check /DRIVER /FIXED /DYNAMICBASE etc)

and based on certain other switches to linker relocation information is appended to the executable
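What the loader does with that relocation information can be sketched roughly as follows. This is a simplified illustration, not the real loader: it assumes 32-bit absolute fixups only, takes the list of fixup offsets as given instead of parsing a .reloc section, and uses invented values (a dll with preferred base 0x10000000 that ends up at 0x00350000).

```python
import struct


def apply_relocations(image: bytearray, fixup_offsets, preferred_base: int,
                      actual_base: int) -> bytearray:
    """Add (actual_base - preferred_base) to every 32-bit value listed as a fixup."""
    delta = (actual_base - preferred_base) & 0xFFFFFFFF
    for off in fixup_offsets:
        (value,) = struct.unpack_from("<I", image, off)   # little-endian dword
        struct.pack_into("<I", image, off, (value + delta) & 0xFFFFFFFF)
    return image


# mov eax, dword ptr [0x10003000] as it would sit in the file (hypothetical dll).
code = bytearray.fromhex("A100300010")

# The address operand starts one byte after the opcode, so the fixup is at offset 1.
apply_relocations(code, [1], preferred_base=0x10000000, actual_base=0x00350000)

print(code.hex(" "))  # a1 00 30 35 00
```

The opcode byte stays the same and only the embedded address changes, which is why the bytes seen in a debugger can differ from the bytes in the file at exactly those spots.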

an exe normally will not have a reloc section (probably because this file is the first to be mapped and it can be successfully mapped to
say a constant 0x400000 address, the default preferred imagebase for exes)

the space from 0 to imagebase is used by the loader for various activities that include mapping language support code pages and mapping
environment variables specific to the process

the mapped system dlls in a process are what is termed as smoke and mirrors (magic)

suppose you have 1000 processes and each process needs ntdll.dll: the loader gives each of the processes the same ntdll.dll map
which has been mapped only once and not 1000 times, and increments a counter saying ntdll is mapped 1000 times
as long as there is no write operation to ntdll from a process, a single copy can insert itself into every process

the loader plays rummy and uses a joker to substitute an ace in a triplet

if there is a write operation a separate copy of ntdll is provided to the specific process that writes
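The sharing and copy-on-write behaviour described above can be modelled with a toy Python sketch. It is only a model: the class names are invented, and it copies the whole image on the first write as the post describes, whereas the real mechanism works at page granularity (only the page that is written gets a private copy).

```python
class SharedImage:
    """One read-only copy of a dll shared by every process that maps it."""

    def __init__(self, name: str, data: bytes):
        self.name = name
        self.data = bytes(data)
        self.map_count = 0  # how many times the image has been mapped


class Process:
    def __init__(self, pid: int):
        self.pid = pid
        self.mappings = {}  # name -> SharedImage, or bytearray once written to

    def map_image(self, image: SharedImage) -> None:
        image.map_count += 1
        self.mappings[image.name] = image  # no copy is made here

    def read(self, name: str, offset: int) -> int:
        m = self.mappings[name]
        data = m.data if isinstance(m, SharedImage) else m
        return data[offset]

    def write(self, name: str, offset: int, value: int) -> None:
        m = self.mappings[name]
        if isinstance(m, SharedImage):
            # First write: give this process its own private copy.
            m = bytearray(m.data)
            self.mappings[name] = m
        m[offset] = value


ntdll = SharedImage("ntdll.dll", b"\x90\x90\x90\x90")
procs = [Process(pid) for pid in range(3)]
for p in procs:
    p.map_image(ntdll)

procs[0].write("ntdll.dll", 0, 0xCC)  # e.g. a debugger planting a breakpoint
print(hex(procs[0].read("ntdll.dll", 0)), hex(procs[1].read("ntdll.dll", 0)))  # 0xcc 0x90
```

Only the writing process sees the change; the other processes keep reading the single shared copy.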

now if all the 1000 processes write, the separate copies might eat the resources to the point of suffocation
(ever seen the not enough virtual memory, windows is increasing the virtual memory warning dialog ?) and
consequent death (bsod)

in the phase where windows states that it is increasing virtual memory it uses a mechanism called paging
whereby disk space is used as temporary ram

further blabbering on further questions

November 22nd, 2012, 17:56
[Originally Posted by blabberer;93760].... since i see constants like 0x80000000...
Blabberer...thanks for the exposé. I mentioned 0x80000000 only to emphasize that such an address does not exist on a machine with a physical memory of 512 meg, or even 1 gig. The original poster was wondering why there are different addresses between real memory and a disassembly (I think...I don't have it in front of me). You covered it well enough in your initial reply but my thing is real memory from a hardware background. I added my 2-bits in a feeble attempt to illustrate why an image on a disk is not the same as an image in memory, hence the difference between a disassembler reading a disk image and an OS loading the same image.

One of my pet peeves is that modern languages like C++ have gone out of their way to hide the hardware. As a result, when I step through code, I see scads of redundant code that could have been better implemented had the programmer the least bit of understanding of how it is implemented at the hardware level. I understand why they have obfuscated the language wrt making it more universal, but I think they have gone overboard by using terms like objects, which are nothing more than code.

I have tried to wade through several books on C++ only to be put off by authors who have not the slightest idea what they are talking about. I have seen the concept of a class beaten to death by authors who simply do not understand the concept. Finally, I read a book by Bjarne Stroustrup, who invented the language and he said very simply, "a class is a user-defined type". Suddenly the lights went on, after major frustration reading authors who obviously did not get that. If you read a lot of the authors writing on the subject, they cannot use a direct definition like that. They talk around the subject using thought experiments and inane examples, while thoroughly confusing the reader.

There are too many people who have learned computers through an OOP language and they have no idea what goes on under the hood. They think in a totally obfuscated world but when you read Stroustrup on obfuscation, he uses the word in a carefully defined manner. I wish someone would write a book on C++ and relate it directly to the CPU, or even just to assembler. There has to be a one to one relationship or the language would not work. Then again, I still use the term 'subroutine'.