
Thread: NTFS MFT Internals

  1. #91
    Super Moderator
    Join Date
    Dec 2004
    Blog Entries
    ah the secret stash isnt that much of a secret

    all you need is one xxx.c file, a winddk build environment, and a pdb to add your own type info

    :\>dir /b
    :\>type addtypeinfot.bat
    pushd ..
    @call C:\WinDDK\7600.16385.1\bin\setenv.bat C:\WinDDK\7600.16385.1\ fre x86 WXP
    cl.exe /Zi /Gz /c /Fd%1.pdb  /IC:\WinDDK\7600.16385.1\inc\crt %1.c

    the c file contains

    :\>type ntfs.c
    #include <windows.h>
    typedef struct _NTFSMFT {
        DWORD MAGIC; //4
        WORD UpdateSeqOffset; //6
        WORD FixupArrayEntries; //8
        DWORD64 $LogFileSeqNo; //0x10
        WORD SequenceNumber; //0x12
        WORD HardLinkCount; //0x14
        WORD AttributeOffset; //0x16
        WORD Flags; //0x18
        DWORD MftUsed; //0x1c
        DWORD MftAlloc; //0x20
        DWORD64 FileReference; //0x28
        WORD NextAttributeID; //0x2a
        WORD AlignNext4B; // 0x2c
        DWORD ThisMFTRecordNumber; //0x30
        BYTE UpdateSequence[0x8]; //0x38
    } NtfsMft, *PNtfsMft;
    typedef struct _ATTRIBUTE_HEADER {
        DWORD AttributeType; //0x04
        DWORD AttributeLength; //0x08
        BYTE Resident; //0x09
        BYTE NameLength; //0x0a
        WORD NameOffset; //0x0c
        WORD Flags; //0xe
        WORD AttributeNumber; //0x10
        DWORD AttributeContentLength; //0x14
        WORD AttributeContentStartOffset; //0x16
        WORD unk; //0x18
    } AttributeHeader, *PAttributeHeader;
    typedef struct _FILE_INFO_ATTRIBUTE_RECORD {
        DWORD64 ParentDirectory; //0x08
        DWORD64 FileCreationTime; //0x10
        DWORD64 FileModificationTime[0x02]; //0x20
        DWORD64 FileAccessTime; //0x28
        DWORD64 AllocatedSizeOfFile; //0x30
        DWORD64 RealSizeOfFile; //0x38
        DWORD64 Flags; //0x40
        BYTE FileNameLengthinUnicodeChars; //0x41
        BYTE NtfsNameSpace; //0x42
        wchar_t Filename[1];
    } FileInfoAttributeRecord, *PFileInfoAttributeRecord;
    typedef struct _NTATTR_INDEX_RECORD { // INDX record header (opening typedef line was missing in the original post; name assumed)
        char magicNumber[4];
        WORD updateSeqOffs;
        WORD sizeOfUpdateSequenceNumberInWords;
        DWORD64 logFileSeqNum;
        DWORD64 vcnOfINDX;
        DWORD indexEntryOffs;
        DWORD sizeOfEntries;
        DWORD sizeOfEntryAlloc;
        BYTE flags;
        BYTE padding[3];
        WORD updateSeq;
    } NtAttrIndexRecord, *PNtAttrIndexRecord;
    typedef struct _NTATTR_INDEX_RECORD_ENTRY {
        DWORD64 mftReference;
        WORD sizeOfIndexEntry;
        WORD sizeofStream;
        WORD flags;
        char padding[2];
        DWORD64 mftFileReferenceOfParent;
        DWORD64 creationTime;
        DWORD64 lastModified;
        DWORD64 lastModifiedForFileRecord;
        DWORD64 lastAccessTime;
        DWORD64 allocatedSizeOfFile;
        DWORD64 realFileSize;
        DWORD64 fileFlags;
        BYTE fNameLength;
        BYTE filenameNamespace;
        wchar_t FileName[10];
    } NTATTR_INDEX_RECORD_ENTRY, *P_NTATTR_INDEX_RECORD_ENTRY;

    // global instances so cl.exe emits the type info into the PDB
    PNtfsMft pntmft;
    PAttributeHeader pfirstheader;
    PFileInfoAttributeRecord pfinforecord;
    P_NTATTR_INDEX_RECORD_ENTRY pntindxrecordentry;
    the pdb is the one from the ms symbol server, which then gets modified to include the new type info
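    As a rough cross-check of those offsets, here is a minimal user-mode sketch of walking the attribute list of a raw FILE record in a buffer. It is mine, not from the post above: the function name is made up, and it uses portable stdint types instead of the DDK ones, but the offsets (attribute list start at the WORD at +0x14, each attribute headed by a DWORD type and a DWORD length at +4, list terminated by type 0xFFFFFFFF) follow the structs above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the attributes of one raw MFT FILE record held in memory.
   Returns -1 if the record does not start with the "FILE" magic. */
static int count_attributes(const uint8_t *rec, size_t rec_size)
{
    if (memcmp(rec, "FILE", 4) != 0) return -1;     /* not an MFT record */

    /* WORD at +0x14 = offset of the first attribute */
    uint16_t off = (uint16_t)(rec[0x14] | (rec[0x15] << 8));
    int n = 0;
    while ((size_t)off + 8 <= rec_size) {
        uint32_t type, len;
        memcpy(&type, rec + off, 4);                /* DWORD attribute type   */
        memcpy(&len,  rec + off + 4, 4);            /* DWORD attribute length */
        if (type == 0xFFFFFFFFu || len == 0) break; /* end-of-list marker     */
        n++;
        off = (uint16_t)(off + len);
    }
    return n;
}
```

    Feed it 0x400 bytes read from the raw volume at an MFT record boundary and it should report the same attribute count you see when you eyeball the record in a hex editor.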

  2. #92
    Quote Originally Posted by Kayaker View Post
    I'll see if I can babelfish that.
    I'll address this to both kayaker and Blabs and try to kill two babelfishes with one stone.

    I have read quite a bit on WinDbg and have a preliminary understanding of its use of breakpoints. In fact I did a lookup of _ntfscheckindexbuffer@8 in SoftICE and found a similar address (0xF89932FE).

    Thanks to you and blabs for elaborating on the usage. I could not figure out where the address came from and you guys have explained it adequately. The lights have come on...thanks.

    I am currently tracing through ntfs.sys using _ntfscheckindexbuffer@8 as a breakpoint and trying to equate blabberer's structure to what I am seeing. It's not perfectly clear yet and I am seeing offsets that extend beyond what blabs listed.

    When I first used the BP it went off before I could exit ICE and click on a file to load. However, an F5 led me back to the file manager and I am now back in the code related to my file after the BP fired again. Meantime I looked for an INDX structure that was more comprehensive, and that led me to a Python parsing program. I am looking at reloading Python after swearing I would never use it again.

  3. #93
    I use ida2ice to make nms files when I can't get a good one from Msoft's PDB file. However, the IDA structure gets annoying at times when they refer to register values using their in-house terminology. Also, they tend to replace useful values with names they have created such as structxxx.

  4. #94
    Release the spinlock, Hal.

  5. #95
    Super Moderator
    Join Date
    Dec 2004
    Blog Entries
    Release the spinlock, Hal.
    what was that ?

    lkd> x hal!*rel*spin*
    806d2868 hal!KeReleaseQueuedSpinLock = <no type information>
    806d87ec hal!KiReleaseSpinLock = <no type information>
    806d0554 hal!_imp_KiReleaseSpinLock = <no type information>
    806d0ab4 hal!HalpReleaseSystemHardwareSpinLock = <no type information>
    806d2850 hal!KeReleaseInStackQueuedSpinLock = <no type information>
    806d0ab4 hal!HalpReleaseCmosSpinLock = <no type information>
    806d2720 hal!KfReleaseSpinLock = <no type information>
    806d67aa hal!KeReleaseSpinLock = <no type information>
    lkd> uf hal!HalpReleaseSystemHardwareSpinLock
    806d0ab4 50              push    eax
    806d0ab5 ff35848e6d80    push    dword ptr [hal!HalpRebootNow+0x4 (806d8e84)]
    806d0abb 8d05201f6e80    lea     eax,[hal!HalpSystemHardwareLock (806e1f20)]
    806d0ac1 9d              popfd
    806d0ac2 58              pop     eax
    806d0ac3 c3              ret

  6. #96
    Teach, Not Flame Kayaker's Avatar
    Join Date
    Oct 2000
    Blog Entries
    Oh Oh. I've seen those warning signs before, Waxford needs a reboot!

  7. #97
    Quote Originally Posted by WaxfordSqueers View Post

    Now for the good stuff:

    Offset 160 = First VCN = 0
    Offset 168 = Last VCN = 0 (not sure what this means yet)
    Offset 170 = Data Runs Offset = 0

    This could be referring to $INDEX_ROOT which has no data runs offset or VCN offset. (ie. it's right here)

    Fast forward to:

    Offset 198 = $INDEX_ALLOCATION = Data run

    At 198, you see the sequence 11 01 2C, meaning size = 0x11, cluster count = 01 and first cluster is at 0x2C. I am still not sure what size refers to, but the cluster count is the number of times a standard-sized cluster can be divided into the file length; I have verified that in another situation.
    This is an excerpt from a post I made recently and is on page 6 of this thread.

    The reference above to VCN = 0 was not clear to me at the time. A VCN is an offset into a file from an LCN and in this case it is zero because the reference is to a root index. The data run offset is 0 as well because everything is resident in this MFT-resident root directory and data runs are data streams outside the MFT.

    The interesting part is the reference to the sequence at offset 198 which is a data run referring to data outside the MFT.

    The data run 11 01 2C is a very simple one, pointing to a single extent. It is interpreted as follows:

    11 is in format xy, where the low nibble y gives the number of bytes that follow the header byte to hold the size, in clusters, of the non-resident stream. If y = 1 there is one size byte following xy, if y = 2 there are two, if y = 3 there are three, and so on. The case where y = 0 is a special case used for sparse runs: long stretches of zeros that are omitted from the file by recording only how many zero clusters are in the run.

    The high nibble x gives the number of bytes that follow the size field, i.e. the width of the offset (LCN) field.

    Therefore 11 means there is 1 byte following the 11 for size (= 01) and 1 byte following the size for location ( = 2C).

    In this case, 11 01 2C means the size byte is 1 cluster and the offset (LCN) from the beginning of the partition is 0x2C (in clusters).

    Here's a more complicated data run:

    31 38 73 25 34 32 14 01 E5 11 02 31 42 AA 00 03 00

    ********The thing to note is that each file offset is added to the previous one in the data run.

    This data run might represent a fragmented file and it can be handled by breaking it into parts based on what was described above:

    31 38 73 25 34.........38 = cluster length at LCN1 = 0x342573

    32 14 01 E5 11 02 .... 0x0114 = cluster length at 0x0211E5...add to previous LCN1 = 0x211E5 + 0x342573 = LCN2 = 0x363758

    31 42 AA 00 03.........0x42 = cluster length at 0x0300AA.....add to previous LCN2 = 0x0300AA + 0x363758 = LCN3 = 0x393802

    00 = end of run

    Therefore this data run describes one non-resident stream fragmented into three extents, at LCNs 0x342573, 0x363758 and 0x393802. All LCNs are in clusters, and with standard parameters of 512 bytes/sector and 8 sectors/cluster (0x1000 bytes per cluster) the physical offset is found by appending three hex zeros to the cluster address. That is, LCN1 is found at byte offset 0x342573000.

    In the first example, with cluster offset = 0x2C, the byte offset is 0x2C000. That address is from my system and indicates the directory of c:\ and has a file signature of INDX. NTFS refers to directories as INDX or $I30.
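    For anyone who wants to check run lists against their own disk, the nibble rules above boil down to a few lines of C. This is my own sketch, not from the posts (the helper names are made up), but it reproduces the worked example. Note the offset field is signed, which is how runs can point backwards from the previous LCN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one NTFS data run at p. Header byte 0xXY: low nibble Y = number of
   length bytes, high nibble X = number of offset bytes. Both fields are
   little-endian; the offset is SIGNED and relative to the previous run's LCN.
   Returns bytes consumed, or 0 at the terminating 00 byte. */
static size_t decode_run(const uint8_t *p, uint64_t *run_len, int64_t *lcn_delta)
{
    uint8_t hdr = p[0];
    if (hdr == 0) return 0;                       /* 00 ends the run list */
    int len_bytes = hdr & 0x0F;
    int off_bytes = (hdr >> 4) & 0x0F;

    uint64_t len = 0;
    for (int i = 0; i < len_bytes; i++)
        len |= (uint64_t)p[1 + i] << (8 * i);

    int64_t off = 0;
    for (int i = 0; i < off_bytes; i++)
        off |= (int64_t)p[1 + len_bytes + i] << (8 * i);
    if (off_bytes && off_bytes < 8 && (p[len_bytes + off_bytes] & 0x80))
        off -= (int64_t)1 << (8 * off_bytes);     /* sign-extend */

    *run_len = len;
    *lcn_delta = off;
    return (size_t)(1 + len_bytes + off_bytes);
}

/* Decode a whole run list, accumulating deltas into absolute LCNs. */
static int decode_runs(const uint8_t *p, int max, uint64_t *lens, int64_t *lcns)
{
    int n = 0;
    int64_t lcn = 0;
    for (;;) {
        uint64_t len; int64_t delta;
        size_t used = decode_run(p, &len, &delta);
        if (used == 0 || n == max) break;
        lcn += delta;
        lens[n] = len;
        lcns[n] = lcn;
        n++;
        p += used;
    }
    return n;
}
```

    Run it over 31 38 73 25 34 32 14 01 E5 11 02 31 42 AA 00 03 00 and you get exactly the three extents worked out above.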

  8. #98

    rethinking the problem

    I have been trying to understand the MFT through a combined reverse engineering and theoretical approach. I posted this thread because I was using softice to do the reverse engineering part. The reversing has been a lot tougher than anticipated and I think that is because I still tend to understand file systems with the old DOS structure in mind.

    Unfortunately, for me, Microsoft has applied object-oriented theory to file systems. Based on the lack of practical solutions on the Net to problems like NTFS systems I think a lot of people on the Net can talk good theory when it comes to object-oriented systems but they have no idea what is going on at a hardware level or at the low level system level where Hal, NTFS.sys and Ntoskrnl reside.

    I have made some notes and I would appreciate some comments. Specifically, I am trying to find where shell32 and ole32 begin communicating with the file driver system. In an earlier reply from Blabberer he commented that the system already knew where the file was located and that seems to be the problem. I can't seem to access the process early enough to catch where the initial call to the file system drivers is made.

    Here are the notes, taken from a ppt presentation called ntfs_mod. However, this is more a summary for me since I have researched and understood the following process cerebrally. I am having trouble translating it to tracing code.

    -NTFS implements files as objects.
    -an application creates or accesses a file by means of object handles
    -by the time an I/O request reaches NTFS, the Windows NT object manager and security system have verified the calling process's authority.
    -the I/O manager has also transformed the file handle to a pointer to a file object.
    -NTFS uses the info in the file object to access a file on disk
    -the file object represents a single call to the 'open-file' service. It points to a stream control block (SCB) for the file attribute the caller is trying to R/W
    -the SCB represents individual file attributes and contains info about how to find specific attributes within a file
    -all SCBs for a file point to a common data structure (File Control Block)
    -the FCB contains a pointer to the file's record in the MFT.

    Here are the stumbling points.

    -starting with the last point above, how does the FCB get a pointer to the file's record? Is an FCB created for each file? If so, it must be connected to the namespace process I mention next. Maybe that's the missing link: the system knows, from which namespace item is selected, where to find the FCB.

    -Windows has a file structure system (namespace) that parallels the directory/sub-directory a user encounters. It is handled by shell32.dll. At the same time, shell32 works with ole32.dll to set up an object whenever a file in its namespace is executed.

    -somewhere in that process of parsing the path for the file to be executed, the object interface is formed by ole32. It's not too clear to me what happens next but somehow the file system manager gets into the act.

    -as I understand it, the object is not the file but a structure which represents the file. If I put a bpx on CreateFile, I get a hit, and Createfile returns a handle. However, that process leads to a place in NTFS code where the file location is already known. I can tell that because it accesses the file's MZ header and verification of the header reveals it to be the file I am trying to load.

    -I would think that the initial handle returned by CreateFile should be used as above by the I/O manager which changes it from a handle to a pointer to a file object. In other words, the file location should not be known yet, but it is. As it says in my notes above, the SCB contains specific info on how to find file attributes within a file.

    -it just occurred to me that the handle returned by CreateFile is a file handle and not the object handle referred to earlier. If CreateFile is not used to retrieve object handle, what is?

    -I am thinking Windows does far more when it loads the MFT into memory at boot time than what is obvious. I have theorized, perhaps wrongly, that the file is found on disk through fairly straightforward means along the lines of the DOS system. I am beginning to think it is far more convoluted than that. For one, if the file is accessed frequently it is loaded at least partly into the file cache. I have noticed while tracing that NTFS.sys queries the file cache, presumably checking to see if the file is in the cache.
    Last edited by WaxfordSqueers; September 19th, 2013 at 07:26.

  9. #99
    I wish I could offer some insight but, I see the whole thing as a linear process
    which it clearly is not. I think that windows does this on purpose so to make this endeavor
    as difficult, if not impossible as it can.

    My opinion is, because the way windows stores files in fragmented sections,
    it has been beyond my skill level to decipher.
    The calls go in directions I just cant comprehend.

    I expect things in windows to be linux like and they are not even remotely close.

    I so wish I could understand how windows does things.
    I wonder how much more difficult it is in win8 ?

    I want the old 8 bit days. Ugggfhhhhh Age has been harsh to me.

    Learn Or Die.

  10. #100
    Quote Originally Posted by Woodmann View Post
    I wish I could offer some insight but, I see the whole thing as a linear process
    which it clearly is not. I think that windows does this on purpose so to make this endeavor
    as difficult, if not impossible as it can.
    Good to hear from you, Woody. I agree 100% but I'm not so sure it is a calculated obfuscation. I think they have over-complicated things by over-thinking the problem.

    If you read the book on C++ by Bjarne Stroustrup, the guy who created C++, he gives really good examples of why the object-oriented side is required. However, having created the concepts, Stroustrup is on top of it and explains it all very clearly.

    That's not the case with 95% of the other authors I have read on object theory. They talk around examples, they don't explain them. For example, with the concept of a class in C++, I have seen author after author talk around the subject, giving juvenile and cockamamie examples of a class without stating specifically what it is. In Stroustrup's book he comes right out and states that a class is a user-defined type. In fact, he points out that C++ was based on giving more flexibility to programmers, hence the definition of a type that was not as well-defined as char, int, etc. It gave programmers the power to be more creative.

    Microsoft has taken that creativity too literally, although if you read a lot on the NTFS system it becomes clearer why they have gone that route. It's basically a good idea, but it gets far too complex and out of control. For example, the MFT, the central record-keeping database of NTFS files, can only grow, up to a theoretical point where there is no more space between the MFT and the user-based file system. Microsoft has not built in the ability to shrink it, even after scads of files have been deleted. Part of the reason, as far as I can see, is that each file has a file number and they have not built in the capacity to re-sort the numbers in such a manner as to downsize them. Slots for files at a specific number can be re-used, but due to problems with hard links, etc., NTFS won't relinquish that space until certain conditions are met.

    Disk utilities like Diskeeper will defrag the MFT but that is done in DOS mode after a reboot. Even so, I have an MFT on my C: drive that is now split into two sections and even after an MFT defrag it is still split in two. The split happens after the MFT becomes so large that it has to make part of itself non-resident as another MFT. It seems the MFT is subject to the laws of entropy and can only become more defragmented as time goes by, presumably to the point where the file system chokes to death.

    Quote Originally Posted by Woodmann View Post
    My opinion is, because the way windows stores files in fragmented sections,
    it has been beyond my skill level to decipher.
    The calls go in directions I just cant comprehend.
    That's what I am dealing with now. I have traced the creation of a file instance from the file manager well into shell32 and ole32 but I am tracing blindly. As I go, I can make breakpoints so I can return to the same point but there are thousands and thousands of lines of code that seem to do nothing but parse paths. I have created a very short path in C:\aaa but even at that, the tracing path leads through shell32, shlwapi, ole32, and occasionally into the bowels of ring 0 for apparently unknown reasons.

    One thing that really annoys me is the number of times the Windows system checks and rechecks things like paths and file names. They are anal about errors and I can imagine walking into the Microsoft programmers area and seeing men walking around with several belts holding up their pants along with several pairs of suspenders, just in case the belts should fail. You can get seriously immersed in nesting levels for function calls in the shell32/ole32 process.

    Quote Originally Posted by Woodmann View Post
    I expect things in windows to be linux like and they are not even remotely close.
    My hangup is thinking in DOS mode, or even at the assembler level. My background is in hardware and I refuse to immerse myself in a fantasy world that obfuscates the hardware level, or the assembler level. I am continually trying to interpret object theory into real life.

    No matter what object-oriented people call it, an object is code. Why not make that clear? In the old days we called them sub-routines. Stroustrup explains that in large software systems there is a necessity to objectify the process and I understand that. I also think that someone should try to relate the objects to the real world.

    Any object in a Windows system can be tracked down to a data structure. The concept of objects is far too general and that has crossed over to NTFS where everything is a file.

    Even Microsoft gets confused in their morass of double-speak. On their site they explain the MFT by laying out the first few file records in the MFT. File number 0 is called MFT but it has a $ sign in front of it to indicate that it is a system file record, not a user-mode record. $Mft is record 0 in the MFT. I caught someone on Microsoft trying to explain that $MFT is the MFT, that all files on the system were contained in it.

    That's double-talk. All files on a system are NOT in $MFT; in fact there is very little in $MFT other than a reference to the overall MFT. There is absolutely nothing in $MFT about other file records. They also tell us that $MftMirr, which is file record 1 in the MFT, can help reconstruct a damaged MFT. $MftMirr is a copy of records 0 to 3 of the MFT, so how can it help reconstruct anything unless those 4 records are damaged?

    I have suspected that the 3rd record, $LogFile, has something to do with reconstructing a damaged MFT and that Microsoft is keeping that from us. The LogFile is supposed to be for recovering from a crash. Apparently NTFS writes each transaction (any operation on a file) to the LogFile, which can grow to 50 megs or so. As it gets to its limit, it starts deleting entries near the beginning of the log. If the system crashes, NTFS has enough data in the log to recreate a consistent state.

    Quote Originally Posted by Woodmann View Post
    I so wish I could understand how windows does things.
    At least you have encouraged me to keep looking at the code in shell32 and ole32 to see if I can find the missing link. I know basically that Windows shows a directory structure in file explorer that is realized as a series of structures in a namespace. As I trace through shell32 I can see the system processing those structures, which contain the path and filename broken into a structure with offsets listed to show where each record in the structure ends.

    A bit later, ole32 gets into the act and shlwapi, which is a light-weight shell. I am just not clear on the objects being initiated in ole32. Ole contains interfaces that can be called by applications, then it becomes a matter of pointers being issued for the interfaces. After that, the object creation process begins.

    I also know what the back end looks like. I have seen handles issued, IRPs issued, and I have traced through scads of NTFS.sys. I have yet to reach the point where the disk is accessed but I just read last night that ftdisk.sys is supposed to handle file I/O. I thought it would be Hal. It seems to me that I was into ftdisk code at one time but thought I'd gotten lost.

    What I'd like to find is where the connection between the file manager and the object creation process lies. That's no guarantee of success, however, since one needs a virgin file that has not been used recently in the current NTFS system. If it has been used, it gets cached, so the data for the file comes from a memory cache, not the disk.

    Quote Originally Posted by Woodmann View Post
    I want the old 8 bit days. Ugggfhhhhh. Age has been harsh to me.
    I concur, I try not to dwell on it.

    BTW...did you ever find Splaj?
    Last edited by WaxfordSqueers; September 20th, 2013 at 01:47.

  11. #101
    Teach, Not Flame Kayaker's Avatar
    Join Date
    Oct 2000
    Blog Entries
    Quote Originally Posted by WaxfordSqueers View Post
    Disk utilities like Diskeeper will defrag the MFT but that is done in DOS mode after a reboot. Even so, I have an MFT on my C: drive that is now split into two sections and even after an MFT defrag it is still split in two. The split happens after the MFT becomes so large that it has to make part of itself non-resident as another MFT. It seems the MFT is subject to the laws of entropy and can only become more defragmented as time goes by, presumably to the point where the file system chokes to death.
    I was going to ask you actually, since you've become familiar with parsing the MFT in raw format, whether you notice any difference if you defrag the MFT, or if that might even make the convoluted path you're trying to follow for your specific test files any easier if by chance their entries happen to be fragmented across MFT sections.

    To that end I tried using contig from Sysinternals, which is supposed to be able to defrag the hidden metadata files. Unfortunately I can't get it to work, I get the same error
    Failed to open C:\$Mft::$BITMAP:
    as reported here:

    But in any case, in response to your comment about the MFT still being split in 2 even after defrag, that in itself might be normal as explained in the 2nd post in this thread:

    For the record, my MFT as reported by the default Windows defrag analyzer has 2 fragments as well, is 77% in use, and is 61Mb in size. I think I read somewhere that the MFT will split when it gets over 200Mb, is that correct?

  12. #102
    Quote Originally Posted by Kayaker View Post
    .... whether you notice any difference if you defrag the MFT, or if that might even make the convoluted path you're trying to follow for your specific test files any easier if by chance their entries happen to be fragmented across MFT sections.
    Thanks for the links, Kayaker. I don't see how an MFT defrag would help at this point. The MFT lives as a database with its origin at a specific address. Microsoft claims it has no permanent address so it can be moved in the case of a bad cluster. What I am trying to do is find where the MFT is called on to go find a file on disk. You supplied an NTFS function earlier, I think it was _NtfsCheckFileRecord, or something like that. I did explore that, but all it was doing was testing the file record in the MFT for veracity. It seemed to return to shell32 without finding the file. I presume the link I seek came before or after that.

    On both my systems (laptop Win7 and desktop XP), with 512 bytes/sector and 8 sectors/cluster, the MFT is located at cluster offset 0xC0000. On my systems that is byte offset 0xC0000000. It's likely the same on yours if you have the same cluster factor and 512 bytes/sector, and you can see a file signature of FILE at that address. If you look down the file a short ways, you will see a reference to $MFT. That is the $MFT file record, which is a reference to the MFT itself. However, many people, apparently including Russinovich, the author of Contig, are using the name in an odd manner.
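    The cluster-to-byte arithmetic here is trivial but worth pinning down. A throwaway helper (the name is mine) using the geometry just described:

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of a cluster (LCN). With 512 bytes/sector and 8
   sectors/cluster a cluster is 0x1000 bytes, so the byte offset is
   just the LCN with three hex zeros appended. */
static uint64_t lcn_to_byte_offset(uint64_t lcn,
                                   uint32_t bytes_per_sector,
                                   uint32_t sectors_per_cluster)
{
    return lcn * (uint64_t)bytes_per_sector * sectors_per_cluster;
}
```

    With that geometry, cluster 0xC0000 (the MFT) lands at byte 0xC0000000 and cluster 0x2C (the C:\ directory INDX) at byte 0x2C000, matching the addresses quoted above.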

    ***I have made an error here but I will leave it and correct it later. My error shows how easy it is to get confused in the MFT***

    That may sound arrogant but consider the definitions supplied by Microsoft. The first 16 file records in the MFT are system records and that's why they have the $ sign in front of them. The very first record is $MFT and it is record 0. When Russinovich writes C:\$MFT::$BITMAP, he is suggesting that $BITMAP is contained in $MFT and it is not. $BITMAP is a record in the overall MFT but it is not contained in $MFT.

    ***I have confused $Bitmap the file record with $Bitmap the attribute...explained below**********

    An interesting aside. Microsoft describes the MFT as a database with rows of file records and columns of attributes. That does make sense since it begins at file record 0 in the first row and each record has a header with attributes numbered from 0x10 (STANDARD_INFORMATION) upwards of 0xB0 in columns. You could visualize the columns as file header, attribute 1, attribute 2, etc. Not every file has the same attributes, however, so some columns would have no data in them. The data (header and attributes) in columns are telling you things about the file record in that row. Every file on the system is in its own row and the rows are numbered as file record numbers.

    In the root directory, normally C:\, the metafiles beginning with $ are listed in the C:\ directory but their attributes are hidden and system. The attribs cannot be changed using attrib from a DOS prompt. At this point, I don't know if the $MFT is a reference to the entire MFT or just to record 0 in the MFT. Along with $MFT can be found $MftMirr and several other system MFT metafiles all beginning with $.

    Problem is, none of them are in the C:\ directory even though they are listed there. As I said above, the MFT database is located at 0xC0000000 on my system and my C:\ directory is at 0x2C000. It seems the references to the metafiles are hard links, just as the recycle bin is a virtual file pointed at in C:\. Anyway, at 0xC0000000 is the $MFT file record with its file header and attributes; the rest of the file records follow, including $MftMirr, with the first 16 records being system records. Here are the system records in the MFT, all rows of the database:

    0 = $Mft - the record with information about the MFT only. It contains a $Bitmap attribute ($B0) hence $Mft::$Bitmap.
    1 = $MftMirr
    2 = $LogFile
    3 = $Volume
    4 = $AttrDef
    5 = no name...referred to as the dot record, or '.'. It is the directory system root.
    6 = $Bitmap - note that $Bitmap is a separate file and not contained in $Mft <<<<-------------------------
    7 = $Boot - the actual boot file is at the beginning of the partition with a copy at the end of the partition.
    8 = $BadClus
    9 = $Secure
    10 = $Upcase
    11 = $Extend
    12 - 15 - reserved

    The description given in my reference for $Mft claims it contains one base file record for each file and folder on the NTFS volume. Horsefeathers!! If you look in $MFT there is nothing there but a reference to itself. There is nary a mention of any other file.

    It has a standard header with 4 attributes: $10, $30, $80 and $B0. $10 is standard information. $30 is $File_Name, which tells you its name is $MFT. $80 is more interesting: it is $Data, and is presumably a file's data. It gives the first and last VCN and tells you the size of the MFT. However, the data reference is to this $MFT record only and to no other file on the disk.

    $80 has two data runs. The first tells you the MFT begins at cluster 0xC0000, which is also listed in the NTFS boot record at the start of the partition. The second data run is presumably a pointer to the other half of my split MFT. When I go to that address, sure enough, there is a FILE signature and the continuation of the MFT.

    $B0 = the $Bitmap 'attribute', not the $Bitmap 'file record'. Confusing??? So I am wrong, there is a $MFT::$Bitmap reference but the $Mft is a file record reference and the $Bitmap reference is to an attribute. I interpret C:\$Mft::$Bitmap as meaning the $Bitmap attribute in the $Mft file record.

    The $BITMAP record is apparently a record of available clusters on the system.

    Anyway, it has two data runs in the attribute, and data runs must be non-resident, i.e. external to the MFT. The first is at 0xBFFFF and extends for 1 cluster, which is 0x1000 bytes. That means that portion of the bitmap occupies the cluster immediately below the MFT at 0xC0000. It is full of FFs, an indication of bit fields with every mapped cluster on the drive allocated.

    The second $Bitmap data run is at 0xDCA85 and extends for 6 clusters (0x6000). That bit field is far more sparse than the first one as might be expected since it represents a region on disk that has more free space.

    To summarize, the name $MFT seems to be used incorrectly by many people. $Mft does not mean the MFT. $Mft is one record, the first, in the MFT database, and its sole purpose is to give information about the MFT itself: its name, its time/date information (when it was changed, created, etc.), its security attributes, its location, its size and the clusters it occupies. $MFT tells you nothing about any other file on the system. That information comes from the user file records beginning after the system $-records and referred to in the index records rooted in the dot file record at file record 0x5.

    That seems to be how you locate a file in the MFT. You trace the b-tree with its root node in the 'dot' file record 0x5. The b-tree is a sorted index based on the directory/file name, and once you have the path you can find the file reference in the index. At that point there is apparently a pointer to the associated file record.
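    Assuming blabberer's _NTATTR_INDEX_RECORD_ENTRY layout from post #91 (entry size is the WORD at +0x08, flags the WORD at +0x0C, and flag bit 0x02 marks the terminating entry), stepping through the entries of one INDX node looks roughly like this. It's a sketch against an in-memory buffer, not something I've validated against a live volume:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count the named entries in one chain of index entries.
   The terminating entry (flag bit 0x02) carries no name, so it
   ends the walk without being counted. */
static int count_index_entries(const uint8_t *p, size_t size)
{
    size_t off = 0;
    int n = 0;
    while (off + 0x10 <= size) {
        uint16_t esz, flags;
        memcpy(&esz,   p + off + 0x08, 2);  /* sizeOfIndexEntry */
        memcpy(&flags, p + off + 0x0C, 2);  /* flags            */
        if (flags & 0x02) break;            /* last-entry marker */
        n++;
        if (esz < 0x10) break;              /* guard against bad data */
        off += esz;
    }
    return n;
}
```

    A real lookup would compare the filename stored after the fixed part of each entry against the path component being resolved, and follow the sub-node VCN when flag bit 0x01 is set; that part is where the b-tree descent happens.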

    After having explored the MFT to a decent depth and seeing its complexity I don't think I'd want to run something like contig on my main system. I tried that once with a boot disk from Comodo written in Linux and it completely screwed my system, installing a Linux folder while writing over my directories including Program Files.

    I think part of the problem I am having now is based on what Comodo Backup did on my external drive. Ultimately, it was a bonehead error on my part, mistaking a clone operation for an image operation. Comodo started cloning to my external disk without a warning that all data would be lost; that warning would have clued me in to my error. Although it only wrote for a few seconds before I aborted it, the damage was done. It seems to have written over my external drive's MFT, but why it would begin mid-disk with an MFT is not clear. I would think a sector-by-sector clone would work from sector 1 outward.
    Last edited by WaxfordSqueers; September 20th, 2013 at 18:57.

  13. #103
    I may be going too far with my long-winded explanation so I don't mind someone saying something. I notice someone changed the title of the thread which is a good idea.

    I still plan to attack this problem using reverse engineering and next time I will lay out an outline of my code tracing methodology so someone can follow along and duplicate the process if they like. I have taken a break from code tracing to try understanding the MFT structure better. It's a steep learning curve and passing on what I have learned for peer review is tough to do with terse replies. NTFS and the MFT give up their secrets reluctantly.

    In my last windy reply to kayaker I was actually addressing the issues in the links he provided, but forgot to say so. In one link there was an error message from contig claiming it could not find $Bitmap. That's likely because the $Bitmap file is damaged, or the $MFT file record is damaged, or even the MFT itself. Just a guess, but in the MFT verification routines I have traced, the process of finding files and attributes in the MFT records is pretty foolproof. They ID the FILE signature of $MFT, then do the rest using offsets included in the file header or the attribute itself.
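That first step, checking the FILE signature and then trusting the header offsets, also involves applying the update-sequence fixups before any of those offsets can be used. A sketch, assuming 512-byte sectors and the header fields from the _NTFSMFT struct earlier in the thread (`apply_fixups` is my name, not a Windows routine):

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Verify the 'FILE' signature of an MFT record and apply the update
   sequence array (fixups): the last two bytes of every 512-byte sector
   must equal the update sequence number, and are replaced with the saved
   values from the fixup array. Returns 0 on success, -1 on a bad record. */
static int apply_fixups(uint8_t *rec, size_t rec_size)
{
    if (rec_size < 0x30 || memcmp(rec, "FILE", 4) != 0)
        return -1;
    uint16_t usa_off   = rec[4] | rec[5] << 8;   /* UpdateSeqOffset     */
    uint16_t usa_count = rec[6] | rec[7] << 8;   /* entries, incl. USN  */
    uint16_t usn       = rec[usa_off] | rec[usa_off + 1] << 8;

    for (uint16_t i = 1; i < usa_count; i++) {
        size_t sector_end = (size_t)i * 512;     /* last 2 bytes of sector */
        if (sector_end > rec_size)
            return -1;
        uint16_t check = rec[sector_end - 2] | rec[sector_end - 1] << 8;
        if (check != usn)
            return -1;                           /* torn write detected */
        rec[sector_end - 2] = rec[usa_off + 2*i];
        rec[sector_end - 1] = rec[usa_off + 2*i + 1];
    }
    return 0;
}
```

If either the signature or any sector's check word is wrong, the record is rejected, which is exactly the kind of failure that would make a tool report it cannot find $Bitmap.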

    I won't pack the tracing reply with code, but I will lay out key functions in shell32, etc., so someone can bp on them to save a lot of tracing time should they want to follow along. I am too bagged to do it right now.

    If the thread is getting too long don't be afraid to say something. I am not easily offended.

    Also, if Blabs wonders why I am not using windbg more, it's because it won't work in the VM right now. It took me a long time to debug my softice install on the VM, and I don't want to go through that again with windbg.
    Last edited by WaxfordSqueers; September 20th, 2013 at 19:19.

  14. #104
    It's not long-winded. We are trying to understand how the MFT does what it does.

    Some zen sounds good about now.

    Learn Or Die.

  15. #105
    Kayaker (Teach, Not Flame)
    Join Date
    Oct 2000
    No worries Waxford, your detailing of your findings is much appreciated; it's nice to discuss something new for a change. Yes, it's certainly not an easy subject matter, and there's definitely a lot to absorb in how NTFS is laid out, let alone the undocumented code paths you're trying to follow. I changed the title to hopefully draw others into the thread.

    Thanks for the clarification on why you think Contig might not be working. Yeah, in the first link I posted it didn't work, but in the second it apparently did, as expected. Interestingly, I tried running chkdsk and it did find an error in the MFT Bitmap:

    CHKDSK discovered free space marked as allocated in the master file table (MFT) bitmap.
    Correcting errors in the Volume Bitmap.
    Windows found problems with the file system.
    Run CHKDSK with the /F (fix) option to correct these.
    After running chkdsk /f several times, all errors disappeared. However, Contig still gives the "Failed to open C:\$Mft::$BITMAP" error.
    To be fair though, this is a VM I'm trying this on.
