Discovering Embedded Architectures

September 13th, 2008, 18:50
Hello All,
I am in the process of reverse engineering a piece of embedded equipment and have run into a bit of a hitch. The system is manufactured in China by a company that has almost zero web presence, and the device has no FCC registration or any real documentation. The CPUs within the box have all been relabeled and/or scrubbed. I do, however, have a complete ROM dump. I managed to identify the compiler from a string in the binary (Keil C), which allowed me to narrow the architecture to either an ARM7 or an Intel 8051. I am 90% sure that it is an Intel 8051, but the 8051 is extremely popular and many, many variants have come out over the years. I have also found several sites claiming that the device runs Linux, but I am 99% certain that this is NOT the case, given the compiler choice and the total lack of identifiers in the binary.

My question is, does anyone have any tips or tricks for discovering the exact architecture of an embedded system given a CPU pin count and a working binary? I need more details in order to give IDA Pro a correct entry point and RAM size and get a successful disassembly. I have been reading data sheets for weeks and basically performing trial and error with IDA Pro, but have yet to get valid disassembly output.



September 14th, 2008, 11:52
If it DOES run Linux, then you KNOW that it has to have SDRAM. All the embedded Linux projects that I have worked on have used a processor with an external SDRAM, so if that exists, that's a big hint that it's probably an ARM. The fact that they DIDN'T use GCC leads me to believe that it's NOT Linux. Because who in their right mind PAYS for a toolchain when GCC is free?

Another thought: if you are looking at a flash image, and it IS somehow Linux, then it might be a CRAMFS or JFFS2 image, and that might be causing the difficulties you are having getting a disassembly. The code would be IN there, but there would be all kinds of file system crap around it, not to mention that it would probably be compressed. To test this, I would try to mount it on a Linux box and see if it mounts. If it DOES, copy the executables out and IDA them.
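Before trying to mount anything, a quick scan of the dump for file system magic numbers can hint at whether an image is embedded in it. The following is only a rough sketch: it looks for the cramfs superblock magic (0x28CD3D45, confirmed by the "Compressed ROMFS" signature 16 bytes in) and for the 16-bit JFFS2 node magic (0x1985), in both byte orders. The function names are made up for illustration, and the JFFS2 check in particular is a weak heuristic, since a 2-byte value turns up by chance in any large binary.

```python
import struct

CRAMFS_MAGIC = 0x28CD3D45                # cramfs superblock magic
CRAMFS_SIGNATURE = b"Compressed ROMFS"   # sits at offset 16 of the superblock
JFFS2_MAGIC = 0x1985                     # first 16 bits of every JFFS2 node


def find_all(data, needle):
    """Return every offset at which needle occurs in data."""
    hits = []
    pos = data.find(needle)
    while pos != -1:
        hits.append(pos)
        pos = data.find(needle, pos + 1)
    return hits


def scan_for_filesystems(data):
    """Report offsets that look like cramfs superblocks or JFFS2 nodes."""
    results = {"cramfs": [], "jffs2": []}
    for endian in ("<", ">"):            # try little- and big-endian images
        magic = struct.pack(endian + "I", CRAMFS_MAGIC)
        for off in find_all(data, magic):
            # Require the signature string too, to cut false positives.
            if data[off + 16:off + 32] == CRAMFS_SIGNATURE:
                results["cramfs"].append(off)
        magic = struct.pack(endian + "H", JFFS2_MAGIC)
        for off in find_all(data, magic):
            # Heuristic: only count 4-byte aligned hits (nodes are padded).
            if off % 4 == 0:
                results["jffs2"].append(off)
    results["cramfs"].sort()
    results["jffs2"].sort()
    return results
```

A handful of scattered JFFS2 hits means little. Many hits at regular intervals, or a single confirmed cramfs superblock, would make the mount test worth the effort.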

If you have access to an oscilloscope, then you should be able to probe the pins while the CPU is running, determine the clock speed, and figure out at least a minimal pin-out that would help you in your datasheet comparisons. Something else that should help with this is identifying the peripherals that it's using. If it has an HDMI port, then you know it has to have an HDMI controller chip, and you might be able to run that down. If it has an Ethernet port, then it has to have a PHY, etc. Once you determine what it has in this respect, you might be able to find datasheets for THOSE parts, and the interfaces they use (I2C, SPI, etc.). Then you can use that as a method to weed out the chips that it "can't be, because it doesn't support THIS protocol".
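The weeding-out step is just a set comparison, and could be sketched like this. The part names and interface lists below are placeholders invented for the example, not real chips; in practice they would be filled in from the datasheets of the actual suspects.

```python
def filter_candidates(candidates, required):
    """Keep only the parts whose interface set covers everything required."""
    required = set(required)
    return sorted(
        name for name, interfaces in candidates.items()
        if required <= set(interfaces)
    )


# Hypothetical parts and feature sets, for illustration only.
candidates = {
    "part_A": {"I2C", "SPI", "UART"},
    "part_B": {"SPI", "UART"},
    "part_C": {"I2C", "SPI", "UART", "external bus"},
}

# Interfaces inferred from the peripherals found on the board (assumed here).
observed = {"I2C", "external bus"}

survivors = filter_candidates(candidates, observed)
```

Keep in mind that simple buses such as I2C can be bit-banged on plain GPIO pins, so a missing hardware block is a hint rather than proof.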

See, I TOLD you it'd be random!

September 14th, 2008, 23:21
How big is the ROM? If it's a few K then it's pretty likely an 8051 or another small processor. If it's a few hundred K it might possibly be Linux (though unlikely without a rootfs) or some embedded RTOS for ARM or another "bigger" CPU. One giveaway of ARM code is the prevalence of "E?" byte values every fourth byte (i.e. E? ?? ?? ?? E? ...). What other "interesting" strings are there? Can you upload (a piece of) the dump somewhere?
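The "E?" pattern comes from the condition field in the top four bits of each 32-bit ARM instruction, where 0xE means "always execute". A minimal sketch of the check, assuming the dump is available as a bytes object: count, for each of the four byte positions within a word, how often the high nibble is 0xE. The function name is made up for this example.

```python
def arm_condition_score(data):
    """For each byte lane 0..3 of the 4-byte words in data, return the
    fraction of words whose byte in that lane has a high nibble of 0xE."""
    words = len(data) // 4
    if words == 0:
        return [0.0, 0.0, 0.0, 0.0]
    scores = []
    for lane in range(4):
        hits = sum(
            1 for i in range(lane, words * 4, 4)
            if data[i] >> 4 == 0xE
        )
        scores.append(hits / words)
    return scores
```

Random data should score around 1/16 in every lane. ARM code tends to show one lane far above the others (lane 3 for little-endian code, lane 0 for big-endian), while a flat result across all four lanes argues against ARM. Running it over a code-heavy region rather than the whole image avoids diluting the signal with HTML and string tables.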

September 14th, 2008, 23:43
Thanks FrankRizzo, I was able to eliminate a few of my suspect architectures based on the existence of an Ethernet controller.

The ROM is 512k. After finally ripping the box apart, I am pretty sure it's not an 8051 (there are WAY too many pins).
I will check for the 'E' byte on the ARM; I didn't know that.

Here is the strings output from the ROM (as you can tell, it has a web server). Nmap identifies it as Contiki, while linuxdevices.com claims it runs uClinux.

The complete strings output is available at: http://pwntatoes.cs.uidaho.edu/hostedFiles/output.txt

There is a lot of HTML in the binary.

I am thinking that it might be of the ST10/C166 family, but am still having problems getting a clean disassembly.

September 15th, 2008, 00:30
OK, it includes a DS1307 RTC chip. Found a reference to that in the strings.
It also includes an Atmel 24C16 2 wire Serial EEPROM.

I also found this: Deer5020@yahoo.com.cn

I bet HE could tell you what the processor is! (I'd bet it's the author).

September 15th, 2008, 04:12
I think the processor architecture is 8051, or some other 8-bit architecture.
The strings contain uip_listen and other uip-related functions. These functions are part of the uIP TCP/IP stack, which is a TCP/IP stack for embedded microcontrollers (www.sics.se/~adam/uip/index.php/Main_Page)
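For anyone who wants to repeat this kind of search on the raw dump, here is a rough equivalent of running strings and grepping for known names. It is only a sketch: the marker list is an assumption picked from names mentioned in this thread, and the function names are invented for the example.

```python
import re


def extract_strings(data, min_len=4):
    """Return printable-ASCII runs of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def find_markers(data, markers=("uip_", "Keil", "Linux", "Contiki")):
    """Map each marker to the extracted strings that contain it."""
    strings = extract_strings(data)
    return {m: [s for s in strings if m in s] for m in markers}
```

Typical use would be `find_markers(open("dump.bin", "rb").read())`, where dump.bin is a placeholder file name. An empty list for a marker only means that exact spelling was not found, not that the component is absent.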

September 15th, 2008, 16:34
Someone already did some work on it and even has a boot log output:
The firmware is downloadable from aviosys: http://www.aviosys.com/update.htm (9258).
The binary definitely doesn't look like an ARM. If I set the CPU to 8051, the disassembly does look more or less correct, however all registers are clearly wrong for the CPU I chose. So you will probably need to add a new CPU config to the .cfg file and try to figure out all the various registers.
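One cheap sanity check for the 8051 guess is to look at the bytes sitting on the standard vector addresses. On a classic 8051 the reset vector is at 0x0000 and the interrupt vectors at 0x0003, 0x000B, 0x0013, 0x001B and 0x0023, and those slots usually hold a jump (LJMP is opcode 0x02, SJMP is 0x80, AJMP has 00001 in its low five bits) or a bare RETI (0x32). The sketch below assumes the dump starts at code address 0, which may not hold for a 512k image; the helper names are made up for illustration.

```python
# Standard 8051 vector addresses: reset plus the five classic interrupt sources.
VECTORS_8051 = {
    0x0000: "reset",
    0x0003: "external interrupt 0",
    0x000B: "timer 0",
    0x0013: "external interrupt 1",
    0x001B: "timer 1",
    0x0023: "serial port",
}


def classify_8051_opcode(op):
    """Name the opcodes commonly found in an 8051 vector slot."""
    if op == 0x02:
        return "LJMP"
    if op & 0x1F == 0x01:
        return "AJMP"
    if op == 0x80:
        return "SJMP"
    if op == 0x32:
        return "RETI"
    return "other"


def check_8051_vectors(data):
    """Return {address: (vector name, opcode class)} for each vector slot."""
    report = {}
    for addr, name in VECTORS_8051.items():
        if addr < len(data):
            report[addr] = (name, classify_8051_opcode(data[addr]))
    return report


def ljmp_target(data, addr):
    """Decode the 16-bit target of an LJMP at addr (high byte first)."""
    if data[addr] != 0x02:
        raise ValueError("no LJMP at 0x%04X" % addr)
    return (data[addr + 1] << 8) | data[addr + 2]
```

If most slots come back as jumps or RETI, the LJMP target at 0x0000 is a reasonable first entry point to hand to the disassembler. Derivatives with extra peripherals add more vectors beyond 0x0023, so this only covers the common core.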

September 15th, 2008, 21:53
I've seen LOTS of instances of "TF33x" and "TF33xFU" in the code.

Those could be signatures for the filesystem.

FU might be Firmware Update now that I think about it.

September 15th, 2008, 22:06
FU could mean something else.


September 15th, 2008, 22:08
Yeah, but I wasn't GOING there!

September 15th, 2008, 22:11
I know,

I had to add some lame ass off topic humor.

Luv's ya's, Woodmann

September 16th, 2008, 13:05
Here is the CRCETL entry for it:


September 17th, 2008, 08:58
LOL gg