Few words about Kraken

Kraken is the word of the month for sure, but it has nothing to do with the beast from the nice old book by Jules Verne, Twenty Thousand Leagues Under the Sea.
The word refers to a family of malware, something like the Storm trojan but much stronger. Kraken seems to have been around since August 2006, but until now I had never heard of it. Some days ago I read an article (,289142,sid14_gci1308645,00.html) about it; the interesting part is here:
"One somewhat interesting feature of the code is that the binary is not packed, as many malware binaries tend to be. However, Royal said that the code does have some other forms of obfuscation that make it difficult to analyze completely." I decided to take a look at it.

I'm not going to give a detailed explanation of the sample I'm working on (MD5 = 592523a88df3d043d61a14b11a79bd55), but I'll spend some words on the "forms of obfuscation" used by the malware.

Detectors are not able to recognize any specific packer/protector. The file is not packed, but from the first lines of code it's pretty easy to see that some sort of obfuscation/encryption has been applied to it. I did not find any interesting imports or strings, so I tried running the malware. To be sure of retrieving some useful information, I logged all the API calls made by the malware.
The malware calls some nice functions. Almost all the code of the binary file is decrypted at runtime. The malware drops one file and then deletes itself. You can inspect the decrypted code, but I didn't get anything useful from it. The best thing to do is to look at the code and try to identify a general obfuscation scheme or a decryption routine. Don't even think of tracing the entire exe, it's madness!

In a case like this one, if a light goes on over your head you are lucky; otherwise you can step through instruction after instruction for eternity. I was lucky... the real code is hidden behind a virtual machine. I'm not a virtual machine expert for sure, I have only read some articles about this kind of protection.
I won't rebuild the entire machine, I'll only give out my findings. If you think they are wrong and/or you want to add some more information about the virtual machine, I'll be happy to see a comment from you.

Like every virtual machine out there, after a little initialization it enters a semi-infinite loop that starts at 4012DA. The loop simply selects a virtual machine instruction and jumps to the code that implements it. The loop contains a lot of instructions; skipping the junk code, this is the snippet used to select (and then jump to) the next instruction to execute:
004012E4 MOV AL,BYTE PTR DS:[ESI-1] // Byte pointed by esi-1 decides everything
004012F3 ADD AL,BL
0040F807 DEC AL
004103D9 DEC ESI   // Shift to the next byte
004103E7 ROL AL,2
004103F7 DEC AL
0040F590 XOR AL,0CF
0040F594 SUB AL,6B
004104A6 ADD BL,AL
004104B7 MOV ECX,DWORD PTR DS:[EAX*4+40FABB]   // EAX = index of the selected instruction
004104C6 NOT ECX
0040129C ROR ECX,1C
00410213 SUB ECX,4DCBE90C
0041021F ROL ECX,7
00410229 INC ECX
0041070D BSWAP ECX
00401195 ADD ECX,5E1E81EF
0040119C XOR ECX,77B911BC
004011AE NOT ECX
0041071B ADD ECX,60334BE6   // ECX = address of the selected instruction
0040FFFF RETN 4C   // Go to the selected instruction

Everything starts from the byte at (esi-1). ESI points into a buffer of bytes that are used to select the virtual machine instruction to execute (they are also used to fetch the operands of a vm_instruction). The value left in EAX after a few minor operations is the index into the vector that starts at 0x40FABB, from which a dword is read. As the code above shows, that dword is then transformed into the address of the vm_instruction to execute.
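The selection step can be mirrored in a few lines of Python. This is only a sketch of the instructions listed above: the function names are mine, the junk code is ignored, and it assumes that the upper 24 bits of EAX are zero when the table is indexed (the listing does not show that part).

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    value &= 0xFF
    return ((value << count) | (value >> (8 - count))) & 0xFF


def next_handler_index(opcode_byte, bl):
    """Turn the byte at (esi-1) into a table index.

    bl is the running key kept in BL; it is updated on every step,
    so the same byte can select different vm_instructions over time.
    Returns (index, new_bl).
    """
    al = (opcode_byte + bl) & 0xFF   # ADD AL,BL
    al = (al - 1) & 0xFF             # DEC AL
    al = rol8(al, 2)                 # ROL AL,2
    al = (al - 1) & 0xFF             # DEC AL
    al ^= 0xCF                       # XOR AL,0CF
    al = (al - 0x6B) & 0xFF          # SUB AL,6B
    bl = (bl + al) & 0xFF            # ADD BL,AL
    return al, bl


if __name__ == "__main__":
    index, bl = next_handler_index(0x00, 0x00)
    print(hex(index), hex(bl))
```

Every operation in the chain is invertible, so for a fixed BL the 256 possible bytes map onto 256 distinct indexes. The real loop also decrements ESI and lets each vm_instruction consume its own operand bytes, which this sketch does not model.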
Unlike a classical virtual machine, this one doesn't have a plain Instruction Table: looking at the dead listing in your favorite disassembler, you won't see the address of every single vm_instruction. The Instruction Table is encrypted; its first entry is located at 0x40FABB and there are 256 entries.
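Since the table entries are only decoded by the dispatcher, the same transformation can be replayed offline to recover the handler addresses. The sketch below follows the ECX instructions of the snippet one by one; the function names are hypothetical, the inverse routine exists only to check the round trip, and the raw table bytes have to be extracted from the sample separately.

```python
import struct

MASK32 = 0xFFFFFFFF


def rol32(value, count):
    count %= 32
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def ror32(value, count):
    return rol32(value, 32 - (count % 32))


def bswap32(value):
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def decrypt_entry(ecx):
    """Replay the dispatcher's ECX chain on one encrypted table dword."""
    ecx = ~ecx & MASK32                    # NOT ECX
    ecx = ror32(ecx, 0x1C)                 # ROR ECX,1C
    ecx = (ecx - 0x4DCBE90C) & MASK32      # SUB ECX,4DCBE90C
    ecx = rol32(ecx, 7)                    # ROL ECX,7
    ecx = (ecx + 1) & MASK32               # INC ECX
    ecx = bswap32(ecx)                     # BSWAP ECX
    ecx = (ecx + 0x5E1E81EF) & MASK32      # ADD ECX,5E1E81EF
    ecx ^= 0x77B911BC                      # XOR ECX,77B911BC
    ecx = ~ecx & MASK32                    # NOT ECX
    return (ecx + 0x60334BE6) & MASK32     # ADD ECX,60334BE6


def encrypt_entry(address):
    """Inverse of decrypt_entry, used here only for self-checking."""
    x = (address - 0x60334BE6) & MASK32
    x = ~x & MASK32
    x ^= 0x77B911BC
    x = (x - 0x5E1E81EF) & MASK32
    x = bswap32(x)
    x = (x - 1) & MASK32
    x = ror32(x, 7)
    x = (x + 0x4DCBE90C) & MASK32
    x = rol32(x, 0x1C)
    return ~x & MASK32


def decrypt_table(raw):
    """Decrypt 256 little-endian dwords (1024 bytes) of table data."""
    if len(raw) != 256 * 4:
        raise ValueError("expected 1024 bytes of table data")
    return [decrypt_entry(entry) for entry in struct.unpack("<256I", raw)]
```

Feeding `decrypt_table` the 1024 bytes that start at virtual address 0x40FABB should yield the handler addresses, assuming no junk instruction that was skipped in the listing also touches ECX. Mapping that address to a file offset depends on the section layout of the sample, which is not covered here.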
The virtual machine has 16 registers (from r_0 to r_15), which can be used to store byte, word or dword data. The EDI register points to the first one, and the registers are stored consecutively in memory from r_0 to r_15.
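A small model of that register file helps when replaying a trace by hand. This is a sketch under assumptions: the class name is mine, each register is taken to be 4 bytes wide, values are little-endian, and a byte or word write leaves the upper bytes of the register untouched.

```python
class VMRegisters:
    """16 consecutive registers, as if EDI pointed at r_0."""

    COUNT = 16
    WIDTH = 4  # assumption: one dword per register

    def __init__(self):
        self.mem = bytearray(self.COUNT * self.WIDTH)

    def _offset(self, index, size):
        if not 0 <= index < self.COUNT:
            raise IndexError("register index out of range")
        if size not in (1, 2, 4):
            raise ValueError("size must be 1, 2 or 4 bytes")
        return index * self.WIDTH

    def read(self, index, size=4):
        off = self._offset(index, size)
        return int.from_bytes(self.mem[off:off + size], "little")

    def write(self, index, value, size=4):
        off = self._offset(index, size)
        mask = (1 << (8 * size)) - 1
        self.mem[off:off + size] = (value & mask).to_bytes(size, "little")


if __name__ == "__main__":
    regs = VMRegisters()
    regs.write(0, 0x0041CE05)         # dword store into r_0
    regs.write(14, 0x9031, size=2)    # word store into r_14
    print(hex(regs.read(0)), hex(regs.read(14)))
```

Whether a partial write really preserves the upper bytes is not something I can confirm for this sample; it is simply the behaviour you get when a word is stored into a dword-sized slot.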
The virtual machine has a fixed-size stack, and the EBP register holds the vm_esp value. Almost every push vm_instruction is followed by a stack overflow check. The alignment is two bytes: "push byte_value" is not allowed, so to push a single byte the virtual machine extends it to a word.
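The stack rules can be sketched the same way. The details below are assumptions made for illustration: the stack size, the downward growth, zero extension of bytes, and all names are mine; only the two-byte alignment, the fixed size and the overflow check come from the analysis.

```python
class VMStackOverflow(Exception):
    pass


class VMStack:
    """Fixed-size stack with 2-byte granularity; vm_esp plays the role of EBP."""

    def __init__(self, size=0x100):
        self.mem = bytearray(size)
        self.vm_esp = size  # assumption: empty stack starts at the top

    def _push(self, value, size):
        if self.vm_esp - size < 0:          # the overflow check after a push
            raise VMStackOverflow("vm stack is full")
        self.vm_esp -= size
        self.mem[self.vm_esp:self.vm_esp + size] = value.to_bytes(size, "little")

    def _pop(self, size):
        if self.vm_esp + size > len(self.mem):
            raise IndexError("vm stack is empty")
        value = int.from_bytes(self.mem[self.vm_esp:self.vm_esp + size], "little")
        self.vm_esp += size
        return value

    def push_byte(self, value):
        self._push(value & 0xFF, 2)         # a byte always occupies a word

    def push_word(self, value):
        self._push(value & 0xFFFF, 2)

    def push_dword(self, value):
        self._push(value & 0xFFFFFFFF, 4)

    def pop_word(self):
        return self._pop(2)

    def pop_dword(self):
        return self._pop(4)
```

With this model, pushing the dword 43179031 and popping a word gives 9031 and leaves 4317 on top of the stack, which is the kind of mixed-width access the vm_instructions rely on.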

Is there a cmp/test instruction inside the snippet? Is there a reference to a vm_eip register? It seems this virtual machine doesn't need them. vm_eip is replaced by (esi-1); it's not an eip per se, but it *guides* the virtual machine. I don't have all the vm_instructions in my notes, but I think there are no direct cmp/test instructions. It seems they are simply not part of the virtual machine, which is strange.

From what I have seen, the virtual machine has more than 45 vm_instructions, and to identify each one you have to remove a lot of junk code. Even once you have all the vm_instructions, it's not immediately clear what the malware is trying to do.
Example: here are the vm_instructions used to patch a dword at 0x41CE06 (the first column is the start address of the vm_instruction, the second column is the name I gave it):
401028: push_dword val      //    push F440C1CB
401028: push_dword val      //    push 8040414A
40F5BE: nor_stack           //    The value at vm_esp+4 is updated with a nor(vm_esp+4, vm_esp) operation
4105FA: pop_dword r_i       //    r_15 = 0x00000202
40F36F: push_dword r_i      //    r_0 = 0x0041CE05
401028: push_dword val      //    push 98754A9F
401028: push_dword val      //    push 43179031
40F198: push_dword vm_esp   //    push vm_esp
401396: mov_stack_pstack    //    mov dword ptr [vm_esp], dword ptr [dword ptr [vm_esp]]
40F25C: pop_word r_i        //    r_14 = 0x00009031
401028: push_dword val      //    push 678AB562
40F198: push_dword vm_esp   //    push vm_esp
40FEF3: push_bdword val     //    push 0x00000006, push a dword but the last 24 bits are 0, so it's like a push byte extended to dword
410452: add_stack           //    add dword ptr [vm_esp+4], dword ptr [vm_esp]
4105FA: pop_dword r_i       //    r_15 = 0x216
40F0A0: pp_mov_dword        //    mov dword ptr [pop t1], (pop t2)
40F25C: pop_word r_i        //    r_11 = 0x015E4317
410452: add_stack           //    add dword ptr [vm_esp+4], dword ptr [vm_esp] <- 98754A9F + 678AB562 = 1
4105FA: pop_dword r_i       //    r_14
410452: add_stack           //    add dword ptr [vm_esp+4], dword ptr [vm_esp] <- 41CE05 + 1 = 41CE06
4105FA: pop_dword r_i       //    r_15
410171: mov_stack_pstack    //    mov dword ptr [dword ptr [vm_esp]], dword ptr [vm_esp+4] <- patch

Quite a simple patch operation, but the author certainly didn't take the straight way. Believe it or not, this is the nature of the malware. Now you can understand the phrase: "Don't even think of tracing the entire exe, it's madness!".
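The arithmetic hidden in that sequence is easy to check outside the debugger. The sketch below only reproduces the 32-bit operations named in the comments of the listing, with helper names of my own; it does not emulate the vm_instructions, and the listing does not state which value ends up being written, so nothing is claimed about that.

```python
MASK32 = 0xFFFFFFFF


def nor32(a, b):
    """32-bit NOR, the operation performed by nor_stack."""
    return ~(a | b) & MASK32


def add32(a, b):
    """32-bit addition with wrap-around, as performed by add_stack."""
    return (a + b) & MASK32


if __name__ == "__main__":
    # Two large constants that cancel out to a plain 1.
    offset = add32(0x98754A9F, 0x678AB562)
    # r_0 held 0x0041CE05; adding the offset gives the patched address.
    target = add32(0x0041CE05, offset)
    # NOR of the first two pushed constants.
    nor_value = nor32(0xF440C1CB, 0x8040414A)
    print(hex(offset), hex(target), hex(nor_value))
```

The two "random looking" constants exist only to produce the value 1 through overflow, which is a good picture of how much noise surrounds each real operation.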

I tried inspecting some more samples of the same Kraken family. There are some similarities/differences:
- they are protected by a virtual machine too
- the routine used to select the next vm_instruction is not the same
- (I think) the vm_instructions are equivalent, but they are not implemented in the same way. I mean, the code used to implement a push is not the same, but the result is: in both cases you end up with a push vm_instruction
- the (encrypted) Instruction Table is not the same. At index i you won't have the same vm_instruction for malware_x and malware_y
- the vm protection exists for the spawned file too

Now I fully understand the words quoted in the article: it's really hard to understand what's going on...
