fr33ke
Control Flow Deobfuscation Part 2
by
, April 25th, 2008 at 08:05 (6165 Views)
In part one I explained how to get a CFG from bytecode, and today I'll tell you about the inverse operation.
Well, actually there are two parts to it:
- Decide in which order you place the vertices
- Put them in that order and insert the correct branches
I'll discuss point 1 in part three, and point 2 now.
It may sound trivial to do this: just get the data for the first vertex, then the branch for the first vertex etc. However this doesn't work because to create the branch we need to know the position of its targets. But for the position of the targets we need to know how big the branch is!
To create an optimal solution where all branches are as small as possible we use a simplification of Robertson's algorithm. The original paper is hard to find but there is good documentation of it in the source of YASM ( http://www.tortall.net/projects/yasm/browser/branches/yasm-0.7.x/libyasm/section.c#L698 ). Basically what we do is assume all branches are 0 bytes1, then increase the size for each branch that doesn't fit and repeat until we find a fixed point where all branches fit.
We can skip some sizes that are impossible, for instance there exists no 1 byte branch in .NET. However a 4 byte branch is possible with two 2 byte branches ("jnz xxx; jmp xxx"). Also we don't have to recalculate every branch if one doesn't fit, only the ones that are affected by the increased branch size; but since I saw little performance increase (surprisingly), I chose to use the simpler version.
Now that I have explained everything, here is the pseudocode:
See you in part 3 which should be ready in 2009Code:rebuild_from_order(cfg, order): for vertex in order size_of_branch[vertex] = 0 do length = 0 for vertex in order start[vertex] = length length += vertex.length + size_of_branch[vertex] has_changed = false for vertex in order for edge in vertex.edges delta[edge] = start[edge] - (start + length(vertex.bytecode)) if not branch_fits(vertex.branchtype, delta, size_of_branch[vertex]) has_changed = true do size_of_branch[vertex]++ while not possible_branch_size(size_of_branch[vertex]) while has_changed bytecode = empty for vertex in order length = 0 for vertex in order start[vertex] = length length += vertex.length + size_of_branch[vertex] for edge in vertex.edges delta[edge] = start[edge] - (start + vertex.length) bytecode += vertex.bytecode bytecode += create_branch(vertex.branchtype, delta) return bytecode
1 This is possible if it's an unconditional branch with a delta of 0 bytes: "jmp next; next:" is the same as "".