Thread: Disassembling Chip8 ROM but cannot seem to separate code from data

  1. #1
    Pingcrosby
    Guest

    Disassembling Chip8 ROM but cannot seem to separate code from data

    Hi

    I hope this is the correct place to post on this forum.

    I am currently completing work on a Chip8 disassembler - however I am having difficulty separating code from data. I am hoping somebody can shed some light on techniques that would let me distinguish code from data.

    Chip8 instructions are 2 bytes wide - so currently I just step 2 bytes at a time and dump the file according to the opcodes I read.

    E.g. if my simplified source assembly looks like

    JP START
    DB 'Hello word hope your ok..etc'
    START:
    HIGH

    And my disassembly looks like (please note this is a trivialised example)

    0x200 JMP 209 ;; correct
    0x202 SHL 12, 7 ;; incorrect, this is just the "Hello word hope your ok" data
    0x204 MOV 12, 7 ;; incorrect, this is just the "Hello word hope your ok" data
    0x206 SNE 22, 4 ;; incorrect, this is just the "Hello word hope your ok" data
    0x208 SHR 22, 3 ;; incorrect, this is just the "Hello word hope your ok" data
    0x20A MOV 55, 3 ;; incorrect, it should now be 0x209 HIGH
    ;; the rest of the disassembly is out of sync and is just incorrect


    What I ideally want to see is

    0x200 JMP 209 ;; correct
    0x202 DATA;
    0x204 DATA;
    0x206 DATA;
    0x208 DATA;
    0x209 HIGH; ;; correct
    0x20B MOV 10, 19 ;; correct
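
    A minimal sketch of how a listing in that shape could be printed, assuming the set of instruction addresses is already known (finding that set is the actual question here). The names `render` and `code` are hypothetical, and mnemonic decoding is left out, so each instruction is shown as a raw opcode.

```python
def render(rom: bytes, code: set, base: int = 0x200) -> list:
    """Emit one line per instruction in `code` and one DB line per other byte."""
    lines = []
    pc, end = base, base + len(rom)
    while pc < end:
        off = pc - base
        if pc in code and pc + 2 <= end:
            # Opcodes are two bytes, most significant byte first.
            op = (rom[off] << 8) | rom[off + 1]
            lines.append(f"0x{pc:03X} OP 0x{op:04X}")  # mnemonic decoding omitted
            pc += 2
        else:
            # Anything not known to be code is dumped one byte at a time,
            # so the listing re-synchronises at odd addresses too.
            lines.append(f"0x{pc:03X} DB 0x{rom[off]:02X}")
            pc += 1
    return lines


# Hypothetical ROM: a jump over 7 bytes of text to an instruction at 0x209.
rom = bytes([0x12, 0x09]) + b"Hello w" + bytes([0x00, 0xE0])
for line in render(rom, {0x200, 0x209}):
    print(line)
```

    Stepping one byte at a time through the data is what lets the instruction at the odd address 0x209 line up again.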

    Are there any generic techniques to deal with this?


    Thanks in advance
    I promise that I have read the FAQ and tried to use the Search to answer my question.

  2. #2
    ...if such techniques existed, decompiling would be a trivial exercise... at the end of the day, THAT is the problem of decompiling.
    Short answer: no.
    Long answer: you can use heuristics and code analysis techniques to predict whether something is 'code' or 'data'. The problem is that there is always a certain failure rate, which depends on the kind of code/machine you are disassembling.
    I want to know God's thoughts ...the rest are details.
    (A. Einstein)
    --------
    ..."a shellcode is a command you do at the linux shell"...

  3. #3
    Super Moderator
    Join Date
    Dec 2004
    Posts
    1,524
    Blog Entries
    15
    Googling around to find out what Chip8 is/was, I see someone has written a disassembler:

    http://www.emulator101.com/chip-8-disassembler/ maybe it is useful to you, maybe you already saw it

    Wikipedia says CHIP-8 programs get only 3584 bytes of memory (4096 bytes in total, with the first 512 occupied by the interpreter, which is why programs start at 0x200)

    and it has only 35 opcodes

    "CHIP-8 has 35 opcodes, which are all two bytes long. The most significant byte is stored first."

    Well, IIRC Bengaly's PVDasm understands Chip8, and IIRC it is open source, so maybe you can check it out



    Anyway, after some tweaking I compiled the disassembler from the link above
    and disassembled a Tetris game that I randomly downloaded from the net.
    It seems to be working; I don't know whether it does code analysis or not.

    Code:
    VISUAL~1\Projects\chip8dis>chip8dis.exe TETRIS
    0200 a2 b4 MVI        I,#$2b4
    0202 23 e6 CALL       $3e6
    0204 22 b6 CALL       $2b6
    0206 70 01 ADI        V0,#$01
    0208 d0 11 SPRITE     V0,V1,#$1
    020a 30 25 SKIP.EQ    V0,#$25
    020c 12 06 JUMP       $206
    020e 71 ff ADI        V1,#$ff
    0210 d0 11 SPRITE     V0,V1,#$1
    0212 60 1a MVI        V0,#$1a
    0214 d0 11 SPRITE     V0,V1,#$1
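
    To illustrate the two-byte, most-significant-byte-first encoding, here is a small sketch that splits a few of the byte pairs from the listing above into their fields. The function name `fields` and the dictionary keys are hypothetical; the x/y/n/nn/nnn naming follows the usual CHIP-8 opcode tables.

```python
def fields(hi: int, lo: int) -> dict:
    """Split one big-endian CHIP-8 opcode into its conventional fields."""
    op = (hi << 8) | lo          # most significant byte is stored first
    return {
        "op": op,
        "kind": op >> 12,        # top nibble selects the instruction family
        "x": (op >> 8) & 0xF,    # register index Vx
        "y": (op >> 4) & 0xF,    # register index Vy
        "n": op & 0xF,           # 4-bit immediate
        "nn": op & 0xFF,         # 8-bit immediate
        "nnn": op & 0xFFF,       # 12-bit address
    }


# Byte pairs copied from the TETRIS listing above.
first = fields(0xA2, 0xB4)   # listed as MVI I,#$2b4
jump = fields(0x12, 0x06)    # listed as JUMP $206
sprite = fields(0xD0, 0x11)  # listed as SPRITE V0,V1,#$1
print(hex(first["nnn"]), hex(jump["nnn"]), sprite["x"], sprite["y"], sprite["n"])
# prints: 0x2b4 0x206 0 1 1
```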

  4. #4
    Pingcrosby
    Guest
    Hi,

    Thanks for the replies - I had not seen http://www.emulator101.com/chip-8-disassembler/ - but it essentially does what I do.

    Unfortunately I am trying to go further than just dumping opcodes and do some kind of analysis.

    So my current plan is:
    read the file - assume that everything is an instruction
    validate each instruction - e.g. ensure it's a valid opcode and ensure the operands are in range (check for addresses outside memory bounds and register indexes between 0 and 15)
    find the first branch in the code (and re-read, if necessary, all instructions from this offset onwards)

    repeat...
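
    The validation step in the plan above could be sketched roughly as follows. `ROM_BASE` and `is_valid` are hypothetical names, and rejecting every 0NNN opcode other than 00E0/00EE is an assumption (such machine-code calls are rare in ROMs), not a rule of the instruction set. Note that register indexes are 4-bit fields, so they are always between 0 and 15 and cannot fail a range check; only the opcode pattern and jump/call targets carry information.

```python
ROM_BASE = 0x200  # CHIP-8 programs are conventionally loaded at 0x200


def is_valid(op: int, rom_len: int) -> bool:
    """Return True if op matches a standard CHIP-8 opcode pattern."""
    kind, nnn, nn, n = op >> 12, op & 0xFFF, op & 0xFF, op & 0xF
    in_rom = ROM_BASE <= nnn < ROM_BASE + rom_len
    if kind == 0x0:
        return op in (0x00E0, 0x00EE)   # assumption: reject 0NNN machine-code calls
    if kind in (0x1, 0x2):
        return in_rom                   # JP/CALL should land inside the ROM
    if kind in (0x5, 0x9):
        return n == 0                   # only 5XY0 and 9XY0 are defined
    if kind == 0x8:
        return n in (0, 1, 2, 3, 4, 5, 6, 7, 0xE)
    if kind == 0xE:
        return nn in (0x9E, 0xA1)
    if kind == 0xF:
        return nn in (0x07, 0x0A, 0x15, 0x18, 0x1E, 0x29, 0x33, 0x55, 0x65)
    return True  # families 3, 4, 6, 7, A, B, C, D accept any operand bits


print(is_valid(0x1206, 0x100), is_valid(0x5121, 0x100))
# prints: True False
```

    Because many bit patterns are valid, a check like this can only rule bytes out as code; it cannot confirm that something is code.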

    Somehow this seems logically incorrect...! If a branch jumps backwards then I could end up in some kind of never-ending loop.

    The thing is, the Chip8 instruction set is quite simple, so I should be able to do this relatively easily, but for the life of me I cannot think of a reasonable solution - it does not have to be 100% perfect, just a really good guess.
    I think PVDasm does some form of analysis; is the source code available to even give me a hint as to how it could be done?
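
    One common way around the never-ending loop is to follow control flow from the entry point with a work list and a set of visited addresses, usually called recursive traversal disassembly. This is a rough sketch of that idea, not what PVDasm does; `find_code` is a hypothetical name, and it assumes the ROM is loaded at 0x200 and is not self-modifying. BNNN (jump to NNN + V0) depends on a run-time value, so it is only reported, not followed.

```python
def find_code(rom: bytes, base: int = 0x200) -> tuple:
    """Return (instruction addresses, addresses of unresolved BNNN jumps)."""
    end = base + len(rom)
    code = set()
    unresolved = set()
    work = [base]                      # execution starts at the load address
    while work:
        pc = work.pop()
        if pc in code or not (base <= pc and pc + 2 <= end):
            continue                   # already visited, or outside the ROM
        code.add(pc)
        off = pc - base
        op = (rom[off] << 8) | rom[off + 1]
        kind, nnn, nn = op >> 12, op & 0xFFF, op & 0xFF
        if op == 0x00EE:               # RET: this path ends here
            continue
        if kind == 0x1:                # JP addr: follow the target only
            work.append(nnn)
        elif kind == 0x2:              # CALL addr: target, then the return point
            work += [nnn, pc + 2]
        elif kind == 0xB:              # JP V0, addr: target unknown statically
            unresolved.add(pc)
        elif kind in (0x3, 0x4, 0x5, 0x9) or (kind == 0xE and nn in (0x9E, 0xA1)):
            work += [pc + 2, pc + 4]   # conditional skip: both successors
        else:
            work.append(pc + 2)        # everything else falls through
    return code, unresolved


# Hypothetical ROM: jump over 7 data bytes, then a loop that jumps backwards.
rom = bytes([0x12, 0x09]) + b"Hello w" + bytes([0x00, 0xE0, 0x12, 0x09])
code, unresolved = find_code(rom)
print(sorted(hex(a) for a in code), unresolved)
# prints: ['0x200', '0x209', '0x20b'] set()
```

    The visited check is what makes a backwards jump harmless: an address already in `code` is never queued for decoding again, so the loop ends once every reachable instruction has been seen. Bytes that are never reached can then be treated as data.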


    Thanks again

  5. #5
    There's more going on behind the scenes than this.

    Example: say you have a JMP $ADDR opcode. Now imagine you have a LDA+BNE/BEQ+ADC+STA sequence that changes the value of $ADDR depending on the condition's outcome.
    How can you discover the addresses of your code segments without actually emulating all of the code paths?

  6. #6
    Pingcrosby
    Guest
    All,

    I have downloaded the source for the excellent "Borg", which looks promising. From what I can gather, having taken a brief (20 min) look at the code, it reads blocks of instructions, stopping on branches, and decodes/validates each instruction in the block, adding them to a table. If a branch causes an overlap with code already in the list, it removes the old code and overwrites it with the new.

    That's what I think it does? I am probably wrong.


    Can anyone suggest any other references for this - I presume other people have had this issue when writing their own disassemblers.

  7. #7
    Naides is Nobody
    Join Date
    Jan 2002
    Location
    Planet Earth
    Posts
    1,647

    IDA modules

    If this is a serious project and you are willing to invest a substantial amount of time in it, consider IDA.
    It gives you the ability to build custom processor modules (see the IDA Pro book by Chris Eagle, available in the wild). It may take a substantial amount of initial analysis and effort, but once you finish, it gives you the power of IDA, with all of its bells and whistles.

  8. #8
    Pingcrosby
    Guest
    To be honest - it's not particularly serious. I was just bored at work so I knocked up a Chip8 emulator.

    Once I had that working I thought "mmmm it would be nice to debug it!" - so I began my journey with disassembling. However I soon realised that trivially dumping instructions and opcodes from file bytes is not the way forward and some form of simple analysis of the input is needed.

    It is at this point I am kind of stuck, to be honest.

  9. #9
    Super Moderator
    Join Date
    Dec 2004
    Posts
    1,524
    Blog Entries
    15
    Somebody has written a disassembler / debugger / emulator in .NET that runs on Win7

    see sharpchip8

  10. #10
    Bengaly
    Programmer Run Amock...
    Join Date
    Aug 2001
    Location
    Somewhere over the Rainbow
    Posts
    289
    Blog Entries
    1
    Here's my (very) old code for CrazyChip-8. Sadly I lost the code for the PVDasm plugin, but that can easily be re-coded.
    Sources are in VC++ / ASM.
    I know the CPU emulation has some bugs in it, but maybe it will point you in some coding direction.
    "knowledge is now free at last, everything should be free from now on, enjoy knowledge and life and never work for everybody else"
