
View Full Version : Custom data types and formats


Hex Blog
February 25th, 2010, 13:17
Another new feature that will be available in the upcoming version of IDA Pro is the ability to create and render custom data types and formats.

http://hexblog.com/ida_pro/pix/custdata_cover.gif
(Embedded instructions disassembled and rendered alongside x86 code)

What are custom types and formats


Custom data type: A custom type is a way to tag some bytes for later display with a custom format, when the built-in IDA types (dt_byte, dt_word, etc.) are not enough. For example: an XMM vector, a Pascal string, a half-precision (16-bit) floating-point number, a 16:32 far pointer (fword), a uleb128 number, and so on. To define a custom type, you provide IDA with its name, size (fixed or dynamically calculated), keyword for disassembly, and a few other attributes.

Custom data format: A custom data format allows you to display a custom or built-in data type in any way you like. You can register several formats for each type and switch between representations. For example, you might want to switch the display of the same 16-byte XMM vector between four floats and two doubles. A format definition includes callbacks for printing (to display) and scanning (used during debugging to change register values).
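To make the two ideas concrete, here is a minimal standalone sketch in plain Python. It is not the IDA API: the function names are hypothetical, and registering them with IDA is not shown. The first function illustrates a dynamically calculated size (for a uleb128 number), and the other two illustrate two print formats for the same 16-byte XMM value, assuming little-endian data.

```python
import struct

def uleb128_size(data, offset=0):
    """Return how many bytes the ULEB128 number starting at offset occupies.

    Raises IndexError if the data ends before the terminating byte.
    """
    size = 0
    while True:
        byte = data[offset + size]
        size += 1
        if byte & 0x80 == 0:  # a clear high bit marks the last byte
            return size

def xmm_as_floats(raw):
    """Render 16 bytes as four little-endian single-precision floats."""
    return ", ".join(repr(v) for v in struct.unpack("<4f", raw))

def xmm_as_doubles(raw):
    """Render the same 16 bytes as two little-endian double-precision floats."""
    return ", ".join(repr(v) for v in struct.unpack("<2d", raw))
```

The same bytes can be passed to either XMM function, which is the "switch the representation" idea: the type only tags the 16 bytes, and the format decides how they are shown.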
For example, here is a custom MAKE_DWORD format applied to the built-in dword type:
http://hexblog.com/ida_pro/pix/custdata_mkdword.gif

Its implementation is very simple:

http://hexblog.com/ida_pro/pix/custdata_mkdword_code.gif
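Since the implementation above is only available as a screenshot, here is a rough sketch of what the print and scan logic for such a format might look like in plain Python. The function names and the exact output text are assumptions, not the code from the image, and the glue that hands these callbacks to IDA is omitted.

```python
import re

def format_make_dword(value):
    """Print callback logic: render a 32-bit value as MAKE_DWORD(high, low)."""
    value &= 0xFFFFFFFF
    high = value >> 16
    low = value & 0xFFFF
    return "MAKE_DWORD(0x%04X, 0x%04X)" % (high, low)

# Accepts the text produced by format_make_dword, with optional whitespace.
_MAKE_DWORD_RE = re.compile(
    r"^\s*MAKE_DWORD\(\s*(0x[0-9A-Fa-f]+)\s*,\s*(0x[0-9A-Fa-f]+)\s*\)\s*$"
)

def scan_make_dword(text):
    """Scan callback logic: parse the text back into a value, or return None."""
    match = _MAKE_DWORD_RE.match(text)
    if match is None:
        return None
    high = int(match.group(1), 16)
    low = int(match.group(2), 16)
    if high > 0xFFFF or low > 0xFFFF:  # each half must fit in 16 bits
        return None
    return (high << 16) | low
```

Keeping the scan function as the exact inverse of the print function means a value edited during debugging round-trips without surprises.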

Next we illustrate some possible uses of custom types and formats. Other uses are possible; it is up to your imagination.

Decoding embedded bytecodes

Imagine you are debugging an x86 program that implements its own VM and embeds the VM bytecode in the program.
The classical solutions to this problem are:

Write a dedicated processor module and then load the extracted bytecodes separately
Or define the bytecodes as bytes and then use comments to describe the real meaning of those bytecodes.
With this new addition, one can just write a custom data type to handle the situation:

http://hexblog.com/ida_pro/pix/custdata_vm_data.gif
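The core of such a custom type is a small decoder that reports both the text to display and the number of bytes consumed. The sketch below uses a made-up opcode table, not the simplevm instruction set from the article, and leaves out the IDA registration; it only shows the decoding logic a handler could call.

```python
# Hypothetical opcode table: opcode -> (mnemonic, number of operand bytes).
OPCODES = {
    0x01: ("push", 1),
    0x02: ("add", 0),
    0x03: ("jmp", 2),
    0xFF: ("halt", 0),
}

def decode_one(data, offset=0):
    """Decode one instruction; return (text, size in bytes)."""
    opcode = data[offset]
    if opcode not in OPCODES:
        return (".byte 0x%02X" % opcode, 1)  # unknown opcode: show raw byte
    mnemonic, operand_len = OPCODES[opcode]
    operand = data[offset + 1:offset + 1 + operand_len]
    if len(operand) < operand_len:
        return (".byte 0x%02X" % opcode, 1)  # truncated operand: show raw byte
    if operand_len == 0:
        return (mnemonic, 1)
    value = int.from_bytes(operand, "little")
    return ("%s 0x%X" % (mnemonic, value), 1 + operand_len)

def disassemble(data):
    """Decode a whole buffer into a list of instruction strings."""
    lines = []
    offset = 0
    while offset < len(data):
        text, size = decode_one(data, offset)
        lines.append(text)
        offset += size
    return lines
```

Returning the size alongside the text maps naturally onto a dynamically sized data type, where each item covers exactly one bytecode instruction.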

And if you happen to have a situation where the bytecodes are operands to instructions (as a means of obfuscation), you can still apply the custom format to those operands:

http://hexblog.com/ida_pro/pix/custdata_vm_opr.gif

The previous ("http://hexblog.com/2010/02/scriptable_processor_modules.html") blog entry showed how to write processor modules using Python. What if one simply uses the "import" statement to import a full-blown processor module script and then uses it in the custom data types/formats?

Displaying resource strings

When reversing MS Windows applications, one can encounter string IDs, but how do you easily fetch the corresponding string and display it in the disassembly listing?
Normally, one would have to use a resource editor to extract the string value corresponding to the string ID, then create an enum in IDA for each string ID with a repeatable comment:

http://hexblog.com/ida_pro/pix/custdata_rsrc_enum.gif

That works, but what about writing your own custom format instead:

http://hexblog.com/ida_pro/pix/custdata_rsrc_menu.gif

And then applying it directly. Instead of using a resource editor to extract the string value, you let the custom format do that programmatically for you:

http://hexblog.com/ida_pro/pix/custdata_rsrc.gif

This is what a resource string custom format handler can look like:

http://hexblog.com/ida_pro/pix/custdata_rsrc_code.gif
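The handler in the screenshot is not reproduced here, but the lookup it needs can be sketched in plain Python. Windows string tables store strings in blocks of 16, where each entry is a 16-bit character count followed by that many UTF-16LE characters. The function names below are hypothetical, and locating and loading the raw block bytes from the executable's resource section is assumed to be done elsewhere.

```python
import struct

def string_location(string_id):
    """Map a string ID to (string table block ID, index within the block)."""
    return (string_id >> 4) + 1, string_id & 0x0F

def read_string_from_block(block, index):
    """Return entry number `index` (0..15) from raw string table block bytes."""
    offset = 0
    # Skip the preceding entries; each is a WORD length plus UTF-16LE chars.
    for _ in range(index):
        (length,) = struct.unpack_from("<H", block, offset)
        offset += 2 + length * 2
    (length,) = struct.unpack_from("<H", block, offset)
    start = offset + 2
    return block[start:start + length * 2].decode("utf-16-le")
```

A format handler would call `string_location` on the operand value, fetch that block, and return the decoded string as the text to display. An ID with no string simply yields an empty string, because its entry has length zero.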

To take a closer look at it, you can download ("http://hexblog.com/ida_pro/files/custdata_files.zip") the custom data type handler script along with the source code of the simplevm assembler/disassembler and the C program that was used in this article.
<!-- Thank you, you know who you are. -->

http://hexblog.com/2010/02/custom_data_types_and_formats_1.html

wtbw
February 25th, 2010, 16:11
Quote:
<!-- Thank you, you know who you are. -->


Interesting feature of the blog import tool

(I don't know who it is, so it's not me)

Kayaker
February 25th, 2010, 18:58
Bwwahaha...I hadn't realized until now that some html comments come through the rss feed. I could fix the import script, but I think it could be more entertaining to leave it the way it is