All Blog Entries

  1. Inside SetUnhandledExceptionFilter


    SetUnhandledExceptionFilter() is frequently used as an anti-debug trick, especially in malware. There are various plugins for Olly that allow the reverser to transparently debug this kind of protection, so there is no real need to add more words about the purely practical part of overcoming the trick.

    Because too many young reversers today use a ton of anti-anti-xxx plugins without knowing how they work internally, I decided to present here a short summary of the internal characteristics of SetUnhandledExceptionFilter.

    First of all, what is SetUnhandledExceptionFilter? According to the MSDN documentation:

    Enables an application to supersede the top-level exception handler of each thread of a process.

    After calling this function, if an exception occurs in a process that is not being debugged, and the exception makes it to the unhandled exception filter, that filter will call the exception filter function specified by the lpTopLevelExceptionFilter parameter.

    And this is the syntax:

    LPTOP_LEVEL_EXCEPTION_FILTER WINAPI SetUnhandledExceptionFilter(
      __in  LPTOP_LEVEL_EXCEPTION_FILTER lpTopLevelExceptionFilter
    );

    lpTopLevelExceptionFilter is a pointer to a top-level exception filter function that will be called whenever the UnhandledExceptionFilter function gets control and the process is not being debugged. A value of NULL for this parameter specifies default handling within UnhandledExceptionFilter.

    Usually, when no custom filter has been registered, the topmost handler called when an uncaught exception occurs is the default one provided by Windows itself: the classic MessageBox that advises the user that an unhandled exception has occurred.

    But Windows allows programs to use custom handlers for unhandled exceptions. The core of the trick is here: if the application is NOT being debugged, the custom handler is called, but if the application IS being debugged, the custom handler is never called.

    This observable difference obviously lets the target application apply a series of countermeasures against debugging, from detection to code hiding.

    Just remember that, due to the architecture of Windows exception handling, the UnhandledExceptionFilter() function is called in every case, and this will be our point of attack (for the anti-anti-debug trick).

    This is the general inner mechanism of SetUnhandledExceptionFilter(). Going deeper, if we observe the call stack of the first thread of any Win32 application, we can see that execution always goes through BaseProcessStart. Here is the pseudo-definition:

    VOID BaseProcessStart( PPROCESS_START_ROUTINE pfnStartAddr )
    {
        __try {
            ExitThread( (pfnStartAddr)() );
        } __except( UnhandledExceptionFilter( GetExceptionInformation()) ) {
            ExitProcess( GetExceptionCode() );
        }
    }

    The same thing happens for threads, via BaseThreadStart:

    VOID BaseThreadStart( PTHREAD_START_ROUTINE pfnStartAddr, PVOID pParam )
    {
        __try {
            ExitThread( (pfnStartAddr)(pParam) );
        } __except( UnhandledExceptionFilter(GetExceptionInformation()) ) {
            ExitProcess( GetExceptionCode() );
        }
    }

    From what was said previously, every unhandled exception raised inside BaseProcessStart() and BaseThreadStart() is passed to UnhandledExceptionFilter().

    It's now time to see what UnhandledExceptionFilter() really is. According to MSDN:

    An application-defined function that passes unhandled exceptions to the debugger, if the process is being debugged. Otherwise, it optionally displays an Application Error message box and causes the exception handler to be executed. This function can be called only from within the filter expression of an exception handler.


    LONG WINAPI UnhandledExceptionFilter(
      __in  struct _EXCEPTION_POINTERS *ExceptionInfo
    );

    It becomes clear that UnhandledExceptionFilter is the last resort for processing unhandled exceptions, so the debugger presence check must be located inside this function. Let's see a simplified version of it:

    LONG UnhandledExceptionFilter( EXCEPTION_POINTERS* pep )
    {
        DWORD rv;
        EXCEPTION_RECORD* per = pep->ExceptionRecord;

        if( ( per->ExceptionCode == EXCEPTION_ACCESS_VIOLATION ) &&
             ( per->ExceptionInformation[0] != 0 ) )
        {
            rv = BasepCheckForReadOnlyResource( per->ExceptionInformation[1] );
            if( rv == EXCEPTION_CONTINUE_EXECUTION )
                return EXCEPTION_CONTINUE_EXECUTION;
        }

        // Is the process being debugged ?
        DWORD DebugPort = 0;
        rv = NtQueryInformationProcess( GetCurrentProcess(), ProcessDebugPort,
                                        &DebugPort, sizeof( DebugPort ), 0 );
        if( ( rv >= 0 ) && ( DebugPort != 0 ) )
        {
            // Yes, it is -> Pass exception to the debugger
            return EXCEPTION_CONTINUE_SEARCH;
        }

        // Is custom filter for unhandled exceptions registered ?
        if( BasepCurrentTopLevelFilter != 0 )
        {
            // Yes, it is -> Call the custom filter
            rv = (BasepCurrentTopLevelFilter)(pep);
            if( rv == EXCEPTION_EXECUTE_HANDLER )
                return EXCEPTION_EXECUTE_HANDLER;
            if( rv == EXCEPTION_CONTINUE_EXECUTION )
                return EXCEPTION_CONTINUE_EXECUTION;
        }

        // Otherwise continue with default handling (left out of this simplified version)
    }

    As you can see, UnhandledExceptionFilter() calls NtQueryInformationProcess() with our process as the first parameter and ProcessDebugPort as the second; this is done to find out whether the process is being debugged.

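    All we have to do to make the process look undebugged is to modify the first parameter (the last one pushed, when seen in the debugger). In other words, we have to change the return value of GetCurrentProcess() from 0xFFFFFFFF to 0x00000000. With an invalid handle the query fails, DebugPort stays 0, and the filter goes on to call the custom handler.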
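
    To see why the handle swap works, here is a small toy model in portable C. It is only a sketch: toy_query_debug_port() is a made-up stand-in for NtQueryInformationProcess(), the -1 status stands in for a failing NTSTATUS, and none of these names come from Windows itself.

```c
#include <stdint.h>

/* Pseudo-handle returned by GetCurrentProcess() on 32-bit Windows. */
#define TOY_CURRENT_PROCESS 0xFFFFFFFFu

/* Hypothetical stand-in for NtQueryInformationProcess(ProcessDebugPort).
 * Returns 0 on success and a negative status for any other handle. */
int32_t toy_query_debug_port(uint32_t handle, int debugger_attached,
                             uint32_t *debug_port)
{
    if (handle != TOY_CURRENT_PROCESS)
        return -1;                    /* invalid handle: the query fails */
    *debug_port = debugger_attached ? 0xFFFFFFFFu : 0u;
    return 0;
}

/* Mirrors the debugger test of the simplified filter shown earlier:
 * returns 1 when the custom filter would be reached, 0 otherwise. */
int custom_filter_reached(uint32_t handle, int debugger_attached)
{
    uint32_t debug_port = 0;
    int32_t rv = toy_query_debug_port(handle, debugger_attached, &debug_port);

    if (rv >= 0 && debug_port != 0)
        return 0;                     /* exception is passed to the debugger */
    return 1;                         /* custom filter gets its chance */
}
```

    With the real pseudo-handle and a debugger attached, custom_filter_reached() returns 0; with the handle patched to 0 it returns 1 even though the debugger is still there.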

    So remember: when you have to overcome a SetUnhandledExceptionFilter() trick, just put a breakpoint on UnhandledExceptionFilter() and step into the function to modify the parameter described above.

    Thanks to Oleg Starodumov for the pseudocode.

    See you in the next blog post.
  2. Small Devices & RCE

    Didn't want to go off-topic in the other thread, that's why I'm opening a new one. I wanted to add some thoughts about the IDA-on-IPhone news.

    Good news for real iPhone fans: we ported IDA to iPhone! It can handle any application and provides the same analysis as on other platforms. It is funny to see IDA on a such small device:

    Ilfak Guilfanov
    I think it's awesome.

    It's also funny, because in theory the new CFF Explorer will be compilable for mac os (being written in Qt), thus also IPhone. The only problem is the small display of such devices and I'm not sure if there's a possibility to reduce the needed space, but I'm quite optimistic.

    I mention this because the new CFF Explorer will support elf and other formats (lib, object, symbian etc), making it useful also for other systems and it might become part of a new generation of cross platform/device tools. It would be encouraging to know that in the future it will be possible to do reversing stuff on such a small device. The new CFF will also have zoom in/out features for the hex editor, making it very useful on devices with a small (or big) display.

    I hope that other programmers will follow the same lead.

    The main problem is writing cross platform applications and reorganizing GUIs for small displays.

    I want to share something I read on wikipedia some time ago:

    Microsoft software is also presented as a "safe" choice for IT managers purchasing software systems. In an internal memo for senior management Microsoft's head of C++ development, Aaron Contorer, stated:[7]

    "The Windows API is so broad, so deep, and so functional that most Independent Software Vendors would be crazy not to use it. And it is so deeply embedded in the source code of many Windows apps that there is a huge switching cost to using a different operating system instead... It is this switching cost that has given the customers the patience to stick with Windows through all our mistakes, our buggy drivers, our high TCO (total cost of ownership), our lack of a sexy vision at times, and many other difficulties [...] Customers constantly evaluate other desktop platforms, [but] it would be so much work to move over that they hope we just improve Windows rather than force them to move. In short, without this exclusive franchise called the Windows API, we would have been dead a long time ago."

    Companies such as Apple and Microsoft are very conscious of the strategic importance of hard-binding applications to their proprietary APIs. That's why Apple pushes Cocoa and Microsoft .NET. They don't want cross-platform development environments (oh, and don't tell me that .NET is cross-platform; before doing so, show me a .NET GUI with more than a button in it on a system which isn't Windows), because they would make it possible for users to switch to another system without losing their tools.

    However, "the times they are a changin'". Nowadays, developers are more conscious of this problem and prefer not to bind their applications to only one platform. You can notice this if you pay attention to the names of newer applications. Ten years ago there were lots of Windows applications which contained the word "win" in their names: Winhex, WinDvd, Winzip, WinRar, WinAce, Winamp etc. etc. etc. Have you noticed that this trend has stopped? It's interesting: right now a struggle between developers and OS producers is taking place. OS producers want to bind developers to their platform even more than before. Why do I say more than before? Well, consider that .NET implements its own languages; you can't simply share real C++ code with the managed one (yes, you can rely on pinvoke, but not for everything). Well, it's a bit more complicated than that, I know, but unsafe code is not encouraged in the .NET environment. Meanwhile, Apple pushes Obj-C. I want to know how this ends. Speaking for myself, I refuse to take a side and will stick with my beloved C++ (the real one).

    I hope this post won't generate a big controversy like the one about Windows Vista.

    Updated July 25th, 2008 at 08:52 by Daniel Pistelli

  3. SymbolFinder

    Dunno if this is just me or if this is for real, but if someone tries to google for some kind of example of a symbol lister, they will end up at a dead end (maybe I should work on my google skills). Anyway, I spent the last 2 days playing with and figuring out these symbols (great, MS simply points in MSDN to the PDB documentation... where is that thing!??!!?) to write this enum, struct and symbol lister, and decided to share my source so there is at least one reference on how to list and parse symbols...

    Hope someone will find it useful.
  4. Sun VirtualBox Disassembler Explantation


    Because I needed a good disassembler for my projects, I checked different distributions on the internet. Most of them are homebrew, and the support, or let's better say the MAINTENANCE, is in most cases not the best.

    I really hate it when I use a component, realize that there is a bug, and the releaser of the component is not able to fix it or sometimes has no real interest in fixing it. That sucks.

    That's why I focused on finding a disassembler which is well maintained and, last but not least, a good one.

    During my search I stumbled over VirtualBox, which is Sun's counterpart to VMware Workstation. The difference is that VirtualBox comes with source, or at least you can download the source.

    I thought that they pretty surely have to have a working disassembler inside their virtual machine and bingo... they have.
    The problem was that the disassembler did not come in the form of a library; it was simply integrated in the source.

    It took me about 2 hours to explant the needed source parts out of VirtualBox and build a library project for it.

    I now use it for my projects and it is very useful for me.

    There is only one problem you will discover when you try the example. I'm looking forward to your solutions for the problem.


    OHPen aka PAPiLLiON

    Updated July 15th, 2008 at 15:10 by OHPen

  5. CartellaUnicaTasse.exe Italian Malware RCE Analysis

    I've just released a paper on my website about the RCE analysis of an Italian downloader.

    The paper can be reached here:

    If this link does not work, just reach it from the home page of my website.

  6. Why is secure development so important?

    Here's a conversation I had recently with somebody:
    A: Why do you check the length of your strings so often and do that much validation of inputs?
    Me: It's more secure that way.
    A: Why do you need to make your program secure?
    Me: Better secure than sorry.
    A: It's a useless loss of time.
    Me: Bah, it's surprising sometimes the unforeseen problems that it can save.

    Here's a good example of an unforeseen problem that might happen, somebody managed to exploit a buffer overflow in OllyDbg and ImpREC.
    It happens when an export from a dll has a name longer than the buffer.
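
    As an illustration of the kind of check I mean, here is a minimal sketch of a bounded copy for an export name. The function name and parameters are made up for this example; it is not code from CHimpREC or from the affected tools.

```c
#include <stddef.h>
#include <string.h>

/* Copy an export name into a fixed-size buffer.
 * src_max is the number of readable bytes at src (for example the bytes
 * left in the mapped file), so the terminator search never runs off the data.
 * Returns 0 on success, -1 if the name is unterminated or does not fit. */
int copy_export_name(char *dst, size_t dst_size,
                     const char *src, size_t src_max)
{
    const char *end;
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    end = memchr(src, '\0', src_max);   /* look for the terminator in bounds */
    if (end == NULL)
        return -1;                      /* unterminated name: reject */

    len = (size_t)(end - src);
    if (len >= dst_size)
        return -1;                      /* too long for the buffer: reject */

    memcpy(dst, src, len + 1);          /* copy the name and its terminator */
    return 0;
}
```

    A name longer than the buffer is simply rejected instead of being copied, which is all it takes to avoid this class of overflow.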

    CHimpREC does not get fooled by this trick:

    Better secure than sorry...
  7. pde/pte softice plugin

    Today I needed to verify some bits in PDE/PTE from SoftICE (well while I'm debugging) so I wrote one plugin for softice which will give me all needed information about pde/pte for a given address. Note that this is for PAE systems as I'm using PAE mainly... maybe I'll update it for non pae systems someday. Anyway, source is included
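
    For readers who want to follow along without the plugin, here is a rough sketch of how a 32-bit virtual address splits under PAE. It is not taken from the plugin source. The two base constants are the PTE/PDE self-map addresses I assume for 32-bit Windows with PAE enabled, and the function names are made up for this example.

```c
#include <stdint.h>

/* Assumed self-map bases on 32-bit Windows with PAE enabled. */
#define PAE_PTE_BASE 0xC0000000u
#define PAE_PDE_BASE 0xC0600000u

/* PAE splits a 32-bit address into 2 + 9 + 9 + 12 bits. */
uint32_t pae_pdpt_index(uint32_t va) { return va >> 30; }
uint32_t pae_pd_index(uint32_t va)   { return (va >> 21) & 0x1FFu; }
uint32_t pae_pt_index(uint32_t va)   { return (va >> 12) & 0x1FFu; }

/* Entries are 8 bytes wide under PAE, hence the shift by 3. */
uint32_t pae_pte_address(uint32_t va)
{
    return PAE_PTE_BASE + ((va >> 12) << 3);
}

uint32_t pae_pde_address(uint32_t va)
{
    return PAE_PDE_BASE + ((va >> 21) << 3);
}
```

    For example, under these assumptions the address 0x00401000 has its PDE at 0xC0600010 and its PTE at 0xC0002008.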
  8. Funny coded malware

    Some days ago I had the opportunity to check one of the latest MSN malware samples. I think there's often something interesting inside a piece of malware, no matter what it does, and this is a perfect example!

    The malware is able to infect only right-handed people! I'm not kidding...
    Among all the Windows settings there's one made for left-handed people. The option I'm referring to is located under the Mouse control panel, labelled "Switch primary and secondary buttons". It lets you exchange the functions performed by the right and left mouse buttons. I don't know if this setting is useful or not; most of my left-handed friends still use the mouse like a right-handed person. Maybe they don't even know such an option exists. Anyway, look at this code:

    It's a simple query on a registry value named SwapMouseButtons.
    result_value is sent back to the caller, and the caller checks the value. If the value is equal to 0x30 (right-handed) the malware goes on running the rest of the code, but if the value is 0x31 (left-handed) the malware ends immediately. All the nasty things performed by the malware are executed after this check, which means that a left-handed user won't get infected!

    I've seen some malware using the SwapMouseButton function in the past, but never something like that. I bet the author is left-handed and wrote the check just to be sure to avoid a possible infection… I can't think of anything else. Quite funny!!!

    The malware is not really interesting per se, but it has something I've never noticed before. It's not a cool and dangerous new technique, but a coding behaviour. Look at the graph overview:

    The image represents the content of a malware procedure. Nothing strange per se, except that it contains 657 instructions, too many for a simple malware. It's a big routine and I was surprised at first, because you can do a lot of things with so many instructions. I started analysing the code: nothing is passed to the routine and nothing is returned to the original caller. I thought it would be an important part of the malware, but I was disappointed by the real content of the routine. After a few seconds I realized what's really going on: 657 lines of code for doing something that normally would require around 50 lines…
    The function contains a block of 17 instructions repeated 38 times. When I'm facing things like that I always have a little discussion with my brain. The questions are:
    - why do you need to repeat each block 38 times?
    - can't you just use a while statement?
    - is this a sort of anti-disassembling trick?
    - can you produce such a procedure by setting some specific compiler options?

    The repeated block contains the instructions below:
    00402175    push 9                       ; Length of the string to decrypt
    00402177    push offset ntdll_dll        ; String to decrypt
    0040217C    push offset aM4l0x123456789  ; key: "M4L0X123456789"
    00402181    call sub_401050              ; decrypt "ntdll.dll"
    00402186    add  esp, 0Ch
    00402189    mov  edi, eax
    0040218B    mov  edx, offset ntdll_dll
    00402190    or   ecx, 0FFFFFFFFh
    00402193    xor  eax, eax
    00402195    repne scasb
    00402197    not  ecx
    00402199    sub  edi, ecx
    0040219B    mov  esi, edi
    0040219D    mov  eax, ecx
    0040219F    mov  edi, edx
    004021A1    shr  ecx, 2
    004021A4    rep movsd
    004021A6    mov  ecx, eax
    004021A8    and  ecx, 3
    004021AB    rep movsb
    It's only a decryption routine, nothing more. The string is decrypted by the "call 401050"; the rest of the code is an inlined string copy that moves the string into the right buffer.
    OK, let's try answering the initial questions.

    According to some PE scanners the exe file was produced by Microsoft Visual C++ 6.0 SPx.
    It's possible to code the big procedure just using a loop (while, for, do-while) containing the snippet above. I don't think the author used one of these statements, because as far as I know it's not possible to tell the compiler to explode a loop into a sequence of blocks. At this point I have two options:
    - he wrote the same block 38 times
    - he defined a macro with the block's instructions and repeated the macro 38 times
    I wouldn't code something like that, but the macro option seems to be the most probable choice.
    Is it an anti-disassembling trick? My answer is no, because it's really easy to read such code. You don't have to deal with variables used inside a for/while; to understand what's going on you only have to compare three or four blocks.
    I don't have a valid answer to the doubt I had at first….

    Trying to find out some more info I studied the rest of the code. I was quite surprised to see another funny diagram.

    This time the image represents the content of the procedure used to retrieve the addresses of the API functions. Again, no while/for/do-while statement. The rectangle in the upper part of the image is a sequence of calls to GetProcAddress, and the code below is just a sequence of checks on the addresses obtained by GetProcAddress.
    It's a series of:

    address = GetProcAddress(hDLL, "function_name");

    followed by a series of:

    if (!address) goto _error;

    Apart from the non-use of a loop there's something more this time, something that I think reveals an unusual coding style: the author checks for errors at the end of the procedure. I always prefer to check return values as soon as I can. It's not a rule, but it's something that helps you avoid oversights and potential errors… The procedure has a little bug/oversight at the end: the author forgot to close an open handle. Just a coincidence?
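
    For comparison, this is roughly what the same job looks like with a loop and with each return value checked right away. It is only a sketch: resolver_fn is a hypothetical stand-in for GetProcAddress(hDLL, name) so the example stays portable, and toy_resolver with its two names exists only for the demonstration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GetProcAddress(hDLL, name). */
typedef void *(*resolver_fn)(const char *name);

/* Resolve every name and store the result in out[].
 * Returns -1 on success, or the index of the first name that failed. */
int resolve_all(resolver_fn resolve, const char *const *names,
                void **out, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        out[i] = resolve(names[i]);
        if (out[i] == NULL)          /* check right after each lookup */
            return (int)i;
    }
    return -1;
}

/* Toy resolver for demonstration only: it knows exactly two names. */
static int dummy_a, dummy_b;

void *toy_resolver(const char *name)
{
    if (strcmp(name, "NtClose") == 0)
        return &dummy_a;
    if (strcmp(name, "NtOpenFile") == 0)
        return &dummy_b;
    return NULL;
}
```

    With a table of names the whole procedure shrinks to a few lines, and the caller knows exactly which lookup failed.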
    Anyway, two procedures without a single loop. It seems the author avoided loops by choice. In case you still have some doubts, here's another cool picture for you:

    The routine in the picture contains the code used to check whether the API(s) are patched or not. The check is done by comparing the first byte with 0xE8 and 0xE9 (call and jump). If the functions are not patched the malware goes on, otherwise it ends. As you can see, no loops are used.
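
    The first-byte test described here is easy to sketch in C. The function names below are made up for this example, and the loop is of course exactly what the malware author did not write.

```c
#include <stddef.h>

/* Returns 1 if the first byte of a function is a near call (0xE8)
 * or a near jump (0xE9), the two opcodes the malware compares against. */
int first_byte_patched(const unsigned char *func)
{
    return func[0] == 0xE8 || func[0] == 0xE9;
}

/* Returns the index of the first patched function, or -1 if none is. */
int find_patched(const unsigned char *const *funcs, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (first_byte_patched(funcs[i]))
            return (int)i;
    }
    return -1;
}
```

    Note that such a check only catches hooks written as a call or jmp on the very first byte; a hook placed a few bytes later, or built from other instructions, would pass unnoticed.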

    In summary: it's not jungle code, it's not anti-disasm code and it's not a specific compiler setting. I think it's only a personal choice, but I would really like to know why the author used this particular style.
    Do you have any suggestions?

    Beyond the coding style, the malware has some more strange things. As pointed out by *asaperlo*, the code contains a bugged RC4 implementation.
    It also has a virtual machine check. The idea is pretty simple: the malware checks the name of the current user. If the name is "sandbox" or "vmware", you are under a virtual machine…
    This malware spawns another one (it's encrypted inside the file); it might be material for another post.
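
    The user-name check mentioned above boils down to something like the sketch below. The function name is made up, and the exact, case-sensitive comparison is an assumption on my part; how the sample obtains the name and whether it ignores case is not shown here.

```c
#include <string.h>

/* Returns 1 if the user name matches one of the two names the malware
 * treats as a sign of a virtual machine or sandbox, 0 otherwise.
 * Assumption: exact, case-sensitive match. */
int user_looks_like_vm(const char *user_name)
{
    return strcmp(user_name, "sandbox") == 0 ||
           strcmp(user_name, "vmware") == 0;
}
```

    Obviously, renaming the analysis account is enough to defeat it.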

    That's a funny coded malware for sure!
  9. antisptd

    antisptd is a driver that makes it possible for softice to load when sptd.sys is present. It uses the method described by Kayaker, that is, removing the notifyroutine that sptd sets to prevent ntice.sys from loading. After ntice.sys gets loaded, it restores the notifyroutine and the keyboard hooks in i8042prt.sys that have been screwed up by sptd.sys.

    Just put the startsi.exe in a directory with antisptd.sys and execute startsi.exe.

    The driver should work on XP SP2/SP3 with the latest softice installed. I have no idea if it'll work on XP SP1 (because I have used hardcoded values to locate the patch locations in i8042prt.sys). If it doesn't work, feel free to modify the sources and recompile the driver yourself.
  10. IceProbe - SoftIce Command Tracer

    IceProbe is a utility that allows live tracing and analysis of SoftIce commands using the full capability of SoftIce itself. It is a tool strictly for code exploration, designed to be able to trace running Softice code in order to augment IDA analysis. It is debugging a debugger, in order to answer the question "How does Softice work?"

    There is much that can be learned about system internals by studying Softice code. This utility will give a live hands-on method of tracing and exploring the code for the first time. It can also act as a GUI front-end for Softice, as bizarre as that might sound.


    Any SoftIce command typed into the command line window is stored in a global string buffer. The command string consists of the command name and any arguments. The buffer is passed to the individual function where it is parsed, and the command is executed.

    We can selectively replace instances of this global buffer pointer with one of our own and call Softice commands directly from a GUI interface. An (optional) embedded breakpoint which will pop-up Softice is written into our driver code immediately before calling the command, which allows us to start tracing the Softice command.

    While live tracing you have full use of all other Softice commands at your disposal, including the ability to set breakpoints in Softice code itself. There is an additional modification which will force the "Idt" command to expose the addresses of the Softice IDT hooks so you can also locate and analyse those various handlers as well.


    Iceprobe is simple to use. Select Initialize/Reinitialize from the menu and the driver will return a listview of all the Softice commands and their addresses. A log window will monitor the driver. Double click on one of the entries and you will be presented with a dialog box to add any usual arguments to the command. When you select OK, Softice will pop up at the start of the command, and you're ready to start tracing with F8.


    Disable Manual Tracing Mode
    We embed an INT 3 in our code and programmatically enable "I3HERE DRV" in order to make Softice pop up at the start of each command. Set this option if you don't want Softice to pop up. The command will still be executed and output to the Softice window as normal.

    Make "Idt" show real addresses
    Expose the addresses of the Softice IDT hooks in the listing from the "Idt" command.

    Disable extra Softice self address space checks
    These are somewhat experimental patches of locations where Softice tests if an offset is within its own address space. Specifically, they occur in the "Search" command, in a portion of breakpoint handling code where MSR LastBranch and MSR LastException information is printed, and in the Int0D handler. You may or may not see any effect.

    Include Undocumented Commands
    There is only one command here, BPTE - Breakpoint on Thread Execution was its probable purpose. Code exists to be traced, but the command appears non-functional and was never documented. If selected, the BPTE command will be added to the listview where you can run it with test arguments.

    Increase Recursive Disassembly Level (Calls nested 4 deep)
    We must find every occurrence of the Softice global command buffer used in each command, in order to replace them with a pointer to our own buffer. A recursive disassembly is therefore needed in order to trace through all nested subcalls within a command.

    A simple recursive method is used - trace each call until a RET/RETN is reached. It was found that this was sufficient with a default value of 3 nested levels of disassembly to find all instances of the global buffer for each command. A value of 4 will find further instances, but most seem to be false positives and not part of the command execution path. This is due to how Softice code is laid out (code chunks, use of jmps, etc), and the simplified method of recursive disassembly.
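
    To make the idea concrete, here is a toy model of a depth-limited recursive walk. It is not Iceprobe's code: the real driver decodes x86 instructions, while this sketch uses a made-up instruction table, and max_depth is my own parameter (0 means "this function only"), which does not necessarily match how the tool counts its levels.

```c
#include <stddef.h>

/* Toy instruction kinds; a real tracer would decode x86 bytes instead. */
enum op { OP_OTHER, OP_REF_BUFFER, OP_CALL, OP_RET };

struct insn {
    enum op op;
    int target;               /* index of the callee, used only by OP_CALL */
};

struct func {
    const struct insn *code;  /* instruction list terminated by OP_RET */
};

/* Count references to the global buffer reachable from function f,
 * following calls at most max_depth levels deep. */
int count_buffer_refs(const struct func *funcs, int f, int max_depth)
{
    const struct insn *p;
    int refs = 0;

    for (p = funcs[f].code; p->op != OP_RET; p++) {
        if (p->op == OP_REF_BUFFER)
            refs++;
        else if (p->op == OP_CALL && max_depth > 0)
            refs += count_buffer_refs(funcs, p->target, max_depth - 1);
    }
    return refs;
}

/* Demo program: f0 calls f1, f1 calls f2; each touches the buffer once. */
static const struct insn f2_code[] = { {OP_REF_BUFFER, 0}, {OP_RET, 0} };
static const struct insn f1_code[] = { {OP_REF_BUFFER, 0}, {OP_CALL, 2}, {OP_RET, 0} };
static const struct insn f0_code[] = { {OP_REF_BUFFER, 0}, {OP_CALL, 1},
                                       {OP_OTHER, 0}, {OP_RET, 0} };
const struct func demo_funcs[] = { {f0_code}, {f1_code}, {f2_code} };
```

    The depth limit is what keeps the walk cheap and finite, at the price of missing references that sit deeper than the chosen level. Stopping at the first RET is also why code reached only through jmps can be missed or misattributed.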

    Output Recursive Call Pattern for Xref with IDA (DbgPrint)
    Shows the nested recursive disassembly of all Calls and SubCalls for a command, as determined by the Increase Recursive Disassembly Level option. The pattern can be matched to what you find during the IDA analysis. It makes it easier to keep track of where you are while jumping back and forth between IDA and the Softice/Softice tracing of a command.

    Output Developmental Notes (DbgPrint)
    Prints a bunch of output about the Softice driver and internal offsets, mostly used during development.

    All these options can be "toggled" on or off by setting them and selecting Initialize/Reinitialize from the menu again.

    IDA Analysis:

    This tool is meant to work side by side with an IDA analysis of the ntice.sys driver. Iceprobe should run without problem with Driver Studio 2.7, 3.1 or 3.2. It is designed to work with the final official DS3.2.1 patch version of the Softice driver which was publicly available on their ftp site. This official patch is available here:

    This would be incomplete without an explanation of how to set up IDA properly, which fortunately I discussed previously:

    Setting up IDA for analysing Softice functions

    Briefly, Softice keeps its command names and offsets in indexed tables. The very first step is to run the following idc script. The CmdTable offsets are for the DS3.2.1 patch version. If you happen to be using a different version change the offsets accordingly, the above thread describes how to find them.

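    #include <idc.idc>

    // with idc command
    // CmdTable(0x15EAAD, 0x15E7AD);

    static CmdTable(NameTable, CommandTable) {
        auto i, j;
        auto CmdIndex, CmdName;

        i = NameTable;                  // assumed: start at the first name entry
        while (Word(i) != 0) {
            j = i;                      // assumed: scan from the start of the entry
            while (Byte(j) != 0) j++;
            CmdName = "c_" + substr(Name(i), 1, j - i);
            j++;                        // assumed: index byte follows the terminator
            CmdIndex = Byte(j) * 4;
            MakeName(Dword(CommandTable + CmdIndex), CmdName);
            i = j + 1;                  // assumed: move on to the next entry
        }
    }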

    Your IDA disassembly will now identify all of the Softice commands by name. I would then strongly suggest to look at the IDB analysis and Softice headers produced by The Owl while developing Icedump, and use them to start naming some of the internal variables already defined. The article by +Spath is old but indispensable as well.

    NTICE and WINICE IDB Files by the_owl (IDB)

    SOFTICE INTERNALS revision 2 by +Spath

    Now you can start filling in the blanks in your IDA analysis with live tracing of any command using Iceprobe. The ideal situation is to have Softice running under VMWare and have IDA on your desktop. Iceprobe is stable, but you ARE live tracing Softice, so running under VMWare, etc. is desirable.

    To further enhance the experience, you can create progressive NMS symbol files of your IDA analysis and have Softice load its own symbol file into itself using its Symbol Loader. Produce the symbol file with Mostek's Ida2Sice

    Any command can be traced, while at the same time being able to issue any other Softice command. However, if you execute the same command as you are tracing it will only rerun it with the same parameters that were initially set in the GUI, since we've overwritten the global buffer for that command with our own pointer.

    For tracing the BPX command, you can set a breakpoint with a double click, or use BPM. You can even trace the HBOOT command and watch your VM reboot! (I put a protection in the GUI so you can't inadvertently click the HBOOT command).

    WTF is this?:

    I wrote this a few years back, partly as a way of tracing Softice code, but mostly as a way of exploring system internals and how Softice made use of various system structures, variables, hardware and registers. Sort of kernel spelunking through the eyes of a ring 0 debugger.

    IceProbe was first integrated as a KDExtension driver to take advantage of the internal Softice disassembly engine available through the WINDBG_EXTENSION_APIS interface. Further Softice internal details can be found in the thread

    Guide to creating a Softice Kernel Debugger Extension (KDExtension)

    This version uses a standalone driver and the disassembler is an integration of a module I created from the Ndisasm NASM disassembler for use in drivers. The disasm module is also available separately here:

    Sysdasm: Full-Text Disassembler DLL Export Module for Kernel Mode

    Full VC6++ source is included for those interested in looking at an old friend with new eyes


  11. build rule for x64 asm

    I've made a build rule for x64 asm files compiled from msvc2008, because I had to use x64 asm in my C code. Hope it will be useful.
    <?xml version="1.0" encoding="utf-8"?>
    			CommandLine="ml64.exe /c /nologo $(InputName).asm"
    			ExecutionDescription="Assembling x64..."

    Updated June 22nd, 2008 at 19:33 by deroko

  12. nonintrusive tracer on x64

    Well, the time has come to dig a little into x64 systems, and to move our lovely tools and ideas over to them.

    Let's have a look at KiUserExceptionDispatcher from ntdll.dll:

    .text:0000000077EF31B0                 public KiUserExceptionDispatcher
    .text:0000000077EF31B0 KiUserExceptionDispatcher:              
    .text:0000000077EF31B0                 mov     rax, cs:Wow64PrepareForException
    .text:0000000077EF31B7                 test    rax, rax
    .text:0000000077EF31BA                 jz      short loc_77EF31CB
    .text:0000000077EF31BC                 mov     rcx, rsp
    .text:0000000077EF31BF                 add     rcx, 4D0h
    .text:0000000077EF31C6                 mov     rdx, rsp
    .text:0000000077EF31C9                 call    rax ; Wow64PrepareForException
    .text:0000000077EF31CB loc_77EF31CB:                         
    .text:0000000077EF31CB                 mov     rcx, rsp
    .text:0000000077EF31CE                 add     rcx, 4D0h
    .text:0000000077EF31D5                 mov     rdx, rsp
    .text:0000000077EF31D8                 call    RtlDispatchException
    .text:0000000077EF31DD                 test    al, al
    .text:0000000077EF31DF                 jz      short loc_77EF31ED
    .text:0000000077EF31E1                 mov     rcx, rsp
    .text:0000000077EF31E4                 xor     edx, edx
    .text:0000000077EF31E6                 call    RtlRestoreContext

    Wow64PrepareForException is used only when loading a WOW64 process, so in a "native x64" environment this variable is set to 0, and we can use it to install our own SEH handler written in asm, or a nonintrusive tracer. Well, let's cut to the chase and see some real code:

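                            mov     rax, KiUserExceptionDispatcher  ; first instruction: mov rax, cs:Wow64PrepareForException
                            xor     rbx, rbx
                            mov     ebx, dword ptr[rax+3]           ; disp32 of that rip-relative mov
                            add     rbx, rax
                            add     rbx, 7                          ; + instruction length = address of Wow64PrepareForException
                            mov     rax, offset __mykiuser
                            mov     [rbx], rax                      ; install our handler in the variable
                            xor     rax, rax
                            mov     [rax], rax                      ; write to address 0: access violation (3-byte instruction)
                            xor     r9, r9                          ; uType = 0
                            mov     r8, offset szntdll              ; lpCaption
                            mov     rdx, offset szkiuser            ; lpText
                            mov     rcx, 0                          ; hWnd = NULL
                            callW   MessageBoxA
                            xor     rcx, rcx                        ; exit code 0
                            callW   ExitProcess
    __mykiuser:             add     qword ptr[rdx+0F8h], 3          ; rdx = CONTEXT: Rip += 3 skips the faulting mov
                            mov     rcx, rdx                        ; ContextRecord
                            xor     rdx, rdx                        ; ExceptionRecord = NULL
                            callW   RtlRestoreContext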
    If everything worked as planned, MessageBoxA will be shown. Simple, isn't it?
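
    The address arithmetic at the top of the listing (read the 32-bit displacement at offset 3, add the instruction address, add 7) can be tried out offline. This is only a sketch in Python: the function name and the displacement value are made up for illustration, and unlike the asm above it treats the displacement as signed, which is how the CPU interprets it.

```python
import struct

def rip_relative_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the address referenced by a 7-byte `mov rax, [rip+disp32]`."""
    # Expected encoding: REX.W (48), opcode (8B), ModRM (05), then disp32.
    if insn_bytes[:3] != b"\x48\x8b\x05":
        raise ValueError("not a `mov rax, [rip+disp32]` instruction")
    # disp32 is a signed little-endian value stored at offset 3.
    (disp,) = struct.unpack_from("<i", insn_bytes, 3)
    # RIP-relative operands are measured from the end of the instruction.
    return insn_addr + 7 + disp

# Hypothetical displacement, chosen only for illustration.
example = b"\x48\x8b\x05" + struct.pack("<i", 0x1000)
print(hex(rip_relative_target(0x77EF31B0, example)))
```

    The zero-extending version in the listing gives the same result as long as the variable lies at a higher address than the instruction that reads it.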
  13. My "Unofficial" ReCon Video

    My "Unofficial" video is now available from the ReCon website along with my pdf slides.

    The official videos recorded live are not out yet.
    I recorded this one after the conference so it is available before all the others.
    The live version is funnier though.

    Also, I messed up my live Armadillo demo a bit.
    But it happens...

    So this video is a little less than 60 minutes with voice and a TOC.
    And the Armadillo demo works.

    Don't forget to watch ALL the official ones too when they come out.
    Every speaker out there was really great.

    And yes dELTA, I mentioned the CRCETL.
  14. Control Flow Deobfuscation Part 3

    Now we have:

    So to deobfuscate we only need one more thing:
    • A way to order the vertices

    Interestingly, we could also make an obfuscator this way. All we have to do is put the vertices in a random order.

    So to deobfuscate we can't just use any ordering. We have to use one that is 'intuitive' and 'natural'. Unfortunately these words mean nothing to a computer; we'll have to specify them more carefully.

    Let's start simple. In which order would you place these vertices if you wanted to deobfuscate?

    I think we can agree that 1-2-3 is the right order.
    3-1-2 is bad because we have to start with 1: this tells everyone that the function starts there.
    1-3-2 is bad too, but why? For one it creates more branches than necessary, but I think this is not the reason it is hard to read. I think the reason is that we go backwards from 2 to 3. Most of us like to read in one direction: from top to bottom or from left to right.

    Another one:

    The right order here is 1-2-4-3 or 1-4-2-3.
    1-2-3-4 is wrong because 4 then goes backwards to 3.

    We can say: the order is right iff for all edges, the start of the edge comes before the end of the edge. This ordering is known as a topological ordering. There are various ways to compute it; it is important to remember that one is the reverse postorder of a depth-first traversal.

    This might all sound like gibberish to you, so feel free to take a pause and look up some of the underlined words. Here's a little pseudocode example for the reverse postorder:
    reverse_postorder(vertex):
        mark vertex as visited
        order = []
        for each child in vertex.edges
            if child is not visited
                order = reverse_postorder(child) + order
        order = vertex + order
        return order
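
    For anyone who prefers something runnable, here is a small Python sketch of the same idea. The graph is an assumption: edges 1->2, 1->4, 2->3 and 4->3, which is one shape consistent with the four-vertex example above. The dictionary layout and the names are mine.

```python
def reverse_postorder(graph, start):
    """Return the vertices reachable from `start` in reverse DFS postorder."""
    visited = set()
    postorder = []

    def visit(vertex):
        visited.add(vertex)
        for child in graph.get(vertex, []):
            if child not in visited:
                visit(child)
        # A vertex is emitted only after everything reachable from it.
        postorder.append(vertex)

    visit(start)
    return postorder[::-1]

# Assumed edges for the four-vertex example: 1->2, 1->4, 2->3, 4->3.
diamond = {1: [2, 4], 2: [3], 4: [3], 3: []}
order = reverse_postorder(diamond, 1)
print(order)  # [1, 4, 2, 3]
```

    Visiting the children in the other order would give 1-2-4-3, the other ordering named above.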

    Well, so far so good. But what to think of this one:

    I'd say that the best ordering is 1-2-3-5-4. Some compilers order this as 1-5-2-3-4. Even 1-3-5-2-4 isn't unreasonable. But according to our ordering rules, they are all wrong.

    If your graph has a cycle, it isn't possible to get a perfect ordering. We can't have 2 before 3, 3 before 5 and 5 before 2 at the same time, so we'll have to go backwards at least one time.

    Something like 1-4-2-3-5 is even more wrong though. Now we have two backward edges, 3 to 4 and 5 to 2.
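
    Counting the backward edges of a candidate ordering is easy to do by machine. The sketch below assumes the five-vertex graph has the edges 1->2, 2->3, 3->5, 5->2 and 3->4; that edge list is my reading of the orderings discussed here, not something taken from the figure.

```python
def backward_edges(order, edges):
    """Return the edges (src, dst) whose destination is not placed after src."""
    position = {vertex: i for i, vertex in enumerate(order)}
    return [(s, d) for s, d in edges if position[d] <= position[s]]

# Assumed edges for the five-vertex example (the figure is not reproduced).
edges = [(1, 2), (2, 3), (3, 5), (5, 2), (3, 4)]
for order in ([1, 2, 3, 5, 4], [1, 5, 2, 3, 4], [1, 3, 5, 2, 4], [1, 4, 2, 3, 5]):
    print(order, backward_edges(order, edges))
```

    Under that assumption the first three orderings each have exactly one backward edge, and 1-4-2-3-5 has two.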

    One way to solve the problem is like this:

    We contract the vertices of the strongly connected component (a generalization of cycles) into one "supervertex". Now we order that and get 1-I-4. Then we order I: first we pick one vertex to start, I like to get the one that has most edges to it from outside I, but it doesn't matter much. In this example we pick 2. Then we order the rest (and if we find sub-SCC's in I we recurse). So here we get 3-5. Then the final order is 1-2-3-5-4.

    Now thinking that out is one thing, implementing it is another. The most obvious problem is how to find the SCC's. Luckily some very smart people already figured it out for us. We'll use Tarjan's strongly connected components algorithm. It is a bit hard to understand, but explains it very clearly.

    So, are we done now? Is the following algorithm enough:
    1. Find all SCC's
    2. Collapse them into supervertices
    3. Order the resulting graph
    4. Apply this algorithm to all supervertices in the result

    Yes, in a certain sense. It works fine, but it's quite a lot to code (especially in C) and it's pretty inefficient.
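
    Step 1 on its own is short in a higher-level language. What follows is a plain textbook-style version of Tarjan's algorithm in Python, not code from my tool; the example graph is the assumed five-vertex graph with edges 1->2, 2->3, 3->5, 5->2 and 3->4.

```python
def tarjan_scc(graph):
    """Return the strongly connected components of `graph` (dict: vertex -> children)."""
    index = {}        # DFS discovery number per vertex
    low = {}          # smallest discovery number reachable from the vertex
    stack = []
    on_stack = set()
    components = []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of an SCC: pop the stack down to v.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    return components

# Assumed example graph; vertices 2, 3 and 5 form the cycle.
graph = {1: [2], 2: [3], 3: [5, 4], 4: [], 5: [2]}
sccs = tarjan_scc(graph)
print(sccs)
```

    Note that the components come out in reverse topological order (4 first, then the cycle, then 1), which is exactly the property the combined algorithm exploits.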

    / this is the clever part
    Now both Tarjan's algorithm and the reverse postorder are based on depth-first search. So maybe it's possible to combine them? It turns out we can indeed do this and save ourselves a lot of work. The only modification we need to make is:
    • Whenever we find an SCC, we check if it is trivial (one vertex). If so, we add it to the order. Else we order the SCC and put the result in the order.

    So the final algorithm is:
    get_order(first_vertex, vertices_to_consider):
        /* Define some variables */
        order = []
        stack = empty_stack
        cur_dfsnum = 0
        index = []
        low = []
        /* Nested function, has access to above variables (closure) */
        visit(cur_vertex):
            index[cur_vertex] = cur_dfsnum
            low[cur_vertex] = cur_dfsnum
            cur_dfsnum = cur_dfsnum + 1
            push cur_vertex on stack
            for each child in cur_vertex.edges where child in vertices_to_consider
                if index[child] == "To be done"
                    visit(child)
                    low[cur_vertex] = min(low[cur_vertex], low[child])
                else if index[child] == "Done"
                    /* Do nothing */
                else
                    /* child is still on the stack */
                    low[cur_vertex] = min(low[cur_vertex], index[child])
            if low[cur_vertex] == index[cur_vertex]
                /* we found an SCC */
                scc = []
                do
                    popped = pop from stack
                    scc += popped
                    index[popped] = "Done"
                until popped == cur_vertex
                if scc.length == 1
                    order = cur_vertex + order
                else
                    order = get_order(choose_first(scc), scc) + order
        /* visit() ends */
        for vertex in vertices_to_consider:
            index[vertex] = "To be done"
        /* Special-case the start vertex to prevent infinite recursion */
        index[first_vertex] = "Done"
        for each child in first_vertex.edges where child in vertices_to_consider
            if index[child] == "To be done"
                visit(child)
        order = first_vertex + order
        return order
    I omitted the choose_first function because it isn't important.
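
    As a sanity check, here is a Python sketch of the algorithm above. It is a transcription under a few assumptions of mine: the graph is a dictionary of child lists, choose_first defaults to min() as a stand-in for whatever heuristic you prefer, and the example graph is the assumed five-vertex one with edges 1->2, 2->3, 3->5, 5->2 and 3->4.

```python
def get_order(graph, first, consider, choose_first=min):
    """Order the vertices in `consider`, starting at `first`."""
    TODO, DONE = "todo", "done"
    index = {v: TODO for v in consider}   # TODO, DONE, or a DFS number
    low = {}
    stack = []
    order = []
    counter = 0

    def visit(cur):
        nonlocal counter, order
        index[cur] = low[cur] = counter
        counter += 1
        stack.append(cur)
        for child in graph.get(cur, []):
            if child not in consider or index[child] == DONE:
                continue
            if index[child] == TODO:
                visit(child)
                low[cur] = min(low[cur], low[child])
            else:  # child is still on the stack
                low[cur] = min(low[cur], index[child])
        if low[cur] == index[cur]:
            # cur is the root of an SCC: pop it off the stack.
            scc = []
            while True:
                popped = stack.pop()
                scc.append(popped)
                index[popped] = DONE
                if popped == cur:
                    break
            if len(scc) == 1:
                order = [cur] + order
            else:
                members = set(scc)
                start = choose_first(members)
                order = get_order(graph, start, members, choose_first) + order

    # Special-case the start vertex so edges back to it are ignored.
    index[first] = DONE
    for child in graph.get(first, []):
        if child in consider and index[child] == TODO:
            visit(child)
    return [first] + order

# Assumed example graph; vertices 2, 3 and 5 form the cycle.
graph = {1: [2], 2: [3], 3: [5, 4], 4: [], 5: [2]}
print(get_order(graph, 1, set(graph)))  # [1, 2, 3, 5, 4]
```

    With min() as choose_first the cycle is entered at 2, so the sketch reproduces the 1-2-3-5-4 ordering from the supervertex discussion.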

    Now we have every part, putting it together is simple:
        already_done_vertices = empty
        cfg = makecfg(bytecode)
        order = get_order(cfg, already_done_vertices)
        return rebuild_from_order(cfg, order)

    Well that's it mostly. I wanted to show how to do this so maybe someone doesn't have to spend weeks figuring this stuff out and because I think
    reverse postorder + Tarjan's strongly connected components algorithm = approximate topological ordering
    is a very nice insight.

    P.S. Sorry, my tool to do this for .NET is currently private because I'm not interested in an arms race. The ideas behind it are more important anyway.
  15. Vmware snapshot and SSDT

    Some time ago I blogged about Vmware snapshots, introducing a way to recognize hidden files by simply comparing two snapshots. I wanted to extend my research on the subject a little bit more, but I didn't. I got the opportunity to put my hands on some snapshots again these days. I didn't have anything particular in mind, but I was surprised by some coincidences. Look at the information below:
    80544bc0: 804fc624 00000000 0000011c 804fca98
    80544bd0: bf995ba8 00000000 0000029a bf98f5f8
    80544be0: 00000000 00000000 00000000 00000000
    80544bf0: 00000000 00000000 00000000 00000000

    00544BC0: 24C6 4F80 0000 0000 1C01 0000 98CA 4F80 $.O...........O.
    00544BD0: A85B 99BF 0000 0000 9A02 0000 F8F5 98BF .[..............
    00544BE0: 0000 0000 0000 0000 0000 0000 0000 0000 ................
    00544BF0: 0000 0000 0000 0000 0000 0000 0000 0000 ................


    The first 4 lines are taken from Windbg while I was debugging an XP SP1 virtual machine running under Vmware; the last 4 lines are taken from a saved Vmware snapshot (same OS of course).
    Do you see anything useful? These are KeServiceDescriptorTable[0],[1],[2],[3] and they of course contain the same bytes, but there's something else. There's a connection between the addresses in the first four lines and the offsets in the last four: just remove the first 2 digits from the address. Do you see it? Look here: 80544BC0/544BC0, 80544BD0/544BD0, 80544BE0/544BE0, 80544BF0/544BF0.
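
    The relation between address and offset, and the layout of one table entry, can be written down in a few lines of Python. This is only a sketch: the function names and the field names are mine, and the address mapping is checked only against the 0x80xxxxxx addresses shown above, so the code refuses anything else.

```python
import struct

def snapshot_offset(kernel_va: int) -> int:
    """Drop the leading 0x80 byte of a kernel address, as observed above."""
    if kernel_va >> 24 != 0x80:
        raise ValueError("only 0x80xxxxxx addresses are covered by the observation")
    return kernel_va & 0x00FFFFFF

def parse_descriptor(raw: bytes) -> dict:
    """Split one 16-byte table entry into its four little-endian DWORDs."""
    base, counters, count, params = struct.unpack("<4I", raw)
    # Field names are illustrative, not taken from an official header.
    return {"base": base, "counters": counters, "count": count, "params": params}

# The 16 bytes shown at snapshot offset 00544BC0 above.
raw = bytes.fromhex("24C64F80 00000000 1C010000 98CA4F80")
entry = parse_descriptor(raw)
print(hex(snapshot_offset(0x80544BC0)), {k: hex(v) for k, v in entry.items()})
```

    The decoded values match the Windbg line: 804fc624, 00000000, 0000011c, 804fca98.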

    Seems like the kernel memory is stored inside the snapshot. That's not totally true, though: only a part of the kernel memory is stored inside a Vmware snapshot. All the KeServiceDescriptorTable entries are present, btw.
    The SSDT is inside the snapshot I have and it's complete; the SSDT Shadow seems to be inside the snapshot too, but there's no real connection between kernel memory and snapshot addresses, and it's not complete (it needs some more research, btw).

    Is it only a coincidence? I tried with some XP machines and the result is the same: it's possible to obtain real SSDT information. According to Kayaker's test it should work on win2k too (I don't remember the service pack he was using. Thx K.).

    With this new information it's pretty easy to code an SSDT revealer. I gave it a try and here is the result:

    You can use the program to display SSDT entries and to find out modified entries too by simply comparing an original snapshot with another one.

    To retrieve information from a snapshot you have to provide the address of KeServiceDescriptorTable[0] (something like 80544BC0, no "0x" prefix), and you have to select the OS of the virtual machine. After that you can:
    1. save an untouched SSDT using the button labelled "Create untouched SSDT"
    2. retrieve SSDT information from a snapshot by simply pushing the button labelled "Get snapshot SSDT". Checking "Load untouched SSDT data" you can compare the original table (previously saved) with the one from the snapshot you'll select. If a service has been changed you'll read the word "YES" in the last column.
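
    The comparison itself is nothing fancy. A minimal Python sketch of the idea, with made-up addresses and a function name of my own choosing (this is not the tool's actual code):

```python
def compare_ssdt(untouched, snapshot):
    """Return (index, original, current, changed) rows for two SSDT address lists."""
    if len(untouched) != len(snapshot):
        raise ValueError("tables have different numbers of services")
    rows = []
    for i, (orig, cur) in enumerate(zip(untouched, snapshot)):
        # Flag an entry when its service address differs from the saved one.
        rows.append((i, orig, cur, "YES" if orig != cur else ""))
    return rows

# Made-up addresses, for illustration only.
clean = [0x80501000, 0x80502000, 0x80503000]
dirty = [0x80501000, 0xF8A01234, 0x80503000]
rows = compare_ssdt(clean, dirty)
for i, orig, cur, changed in rows:
    print(f"{i:#06x} {orig:08X} {cur:08X} {changed}")
```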

    I got the name of the services from this table:
    I can't test all the OSes; if you find one or more errors, drop me a mail.
    Following this method it's also possible to get the list of the running processes/modules; more about this later.

    SSDT from snapshot available here: