View Full Version : Questions for hard disk experts


roocoon
November 15th, 2006, 14:35
Hi all.
To keep the story short, I have a year-old SATA WD 1600JD disk (split into two equal NTFS partitions of 80GB each) that suddenly developed problems.
The disk had checked clean a week ago and the problems started after an overnight computer crash.
It could be a mobo failure, the PSU or the disk of course. I've replaced the PSU to be on the safe side but I can't replace the mobo.

I've run many tools against it but so far nothing.

BIOS claims that SMART Status BAD, Backup and Replace.
Even if true, I cannot back it up since nothing seems to work.

As of now, I've pinpointed it to this:

In both partitions: the main MFT and its mirror are identical.
In partition 1: the mirror MFT gives a read error when accessed.
In partition 2: the main MFT gives a read error when accessed.

For both partitions, I backed up the good copy and wrote it over the corresponding bad copy.
The write was successful but the read error didn't go away.

Unless there are different hardware mechanisms involved for reads vs writes, I suspect the SMART subsystem on the drive is failing. Is that a logical assumption? Is there an actual chip or whatever that can fail or is SMART just some code in the drive's firmware?

1. Assuming it's SMART to blame, are there any recovery tools that can bypass it and talk directly to the drive?

2. Are there any tools to reset the SMART counters to defaults? WD Diagnostics refuse to run because of "counter past threshold" and if I disable SMART in the BIOS, Diagnostics just hang.
Presumably there's a tool like that for Maxtor. They reset the counters and if they increase past the thresholds again, they RMA the drive.

3. Is there any tool that can remap the MFT to a good part of the disk? It seems to be a simple process. Move the bad copy to a good area and change a pointer in the partition boot sector. I wouldn't like to try it manually before I backup my data.
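For reference, the pointer mentioned in question 3 lives in the NTFS boot sector, which stores the starting cluster of the MFT and of its mirror. Below is a minimal Python sketch, assuming a standard 512-byte NTFS boot sector has already been read into memory. The function name and the returned keys are made up for illustration, and the sketch only reads the fields; it changes nothing on disk.

```python
import struct


def ntfs_mft_locations(boot_sector: bytes) -> dict:
    """Return the byte offsets (relative to the partition start) of $MFT and $MFTMirr."""
    # The OEM ID at offset 3 identifies an NTFS boot sector.
    if len(boot_sector) < 512 or boot_sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")

    # Offset 0x0B: bytes per sector (2 bytes), offset 0x0D: sectors per cluster (1 byte).
    # Assumption: the simple case where sectors-per-cluster is a plain small count.
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", boot_sector, 0x0B)

    # Offset 0x30: starting cluster of $MFT, offset 0x38: starting cluster of $MFTMirr.
    mft_cluster, mirror_cluster = struct.unpack_from("<QQ", boot_sector, 0x30)

    cluster_size = bytes_per_sector * sectors_per_cluster
    return {
        "cluster_size": cluster_size,
        "mft_offset": mft_cluster * cluster_size,
        "mft_mirror_offset": mirror_cluster * cluster_size,
    }
```

Comparing the two offsets against the sectors that report read errors would show which copy sits on the bad area.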

Thanks in advance for any pointers.

naides
November 15th, 2006, 15:19
1. What is the brand of the HD? Some vendors provide low-level diagnostic and recovery utils. Otherwise, there is a Russian site that provides shareware low-level diagnosis and access software. Search the board; I think I posted the name when I was having trouble with one of my disks that got a botched rootkit attempt.

2. I assume you connected the disk to another computer? Try connecting it to a Linux box and see if it gives a better read of your data.

Woodmann
November 15th, 2006, 18:28
Howdy,

Try HDD Regenerator.

Woodmann

roocoon
November 16th, 2006, 02:13
The disk is Western Digital.

I don't have another computer that supports SATA so trying the disk on another mobo is out. I do have a second SATA disk connected on the mobo though that works fine. That probably puts the mobo on the clear.

I've tried most of the partition, hard disk and file recovery tools. A couple that had worked before, didn't now.

HDD Regenerator freezes while scanning the disk. Same with SpinRite, Partition Table Doctor, TestDisk and what not.
It seems they get some read error (at least PTD does when scanning for the filesystem) that either keeps them retrying forever or hangs them. They should ignore the errors and continue, but somehow they don't.
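The "ignore the errors and continue" behaviour can be sketched as a block-by-block copy that zero-fills whatever cannot be read and records where the gaps are. This is only an illustration in Python: the function and class names are hypothetical, the failing disk is simulated in memory, and it does nothing about a drive that stalls internally on its own retries before reporting an error.

```python
import io


def copy_skipping_errors(src, dst, total_size, block_size=512):
    """Copy src to dst block by block; zero-fill blocks that fail to read."""
    bad_offsets = []
    for offset in range(0, total_size, block_size):
        length = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            data = src.read(length)
        except OSError:
            data = b""
        if len(data) != length:
            # Record the gap and keep going instead of retrying forever.
            bad_offsets.append(offset)
            data = bytes(length)  # placeholder zeros
        dst.seek(offset)
        dst.write(data)
    return bad_offsets


class FlakySource(io.BytesIO):
    """Hypothetical stand-in for a failing disk: reads at bad offsets raise OSError."""

    def __init__(self, payload, bad_offsets):
        super().__init__(payload)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("simulated read error")
        return super().read(size)
```

The returned list of bad offsets is what a second, slower pass would go back to once the readable data is safely copied.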

Silver
November 16th, 2006, 14:16
Tried Ontrack EasyRecovery Pro?

SiGiNT
November 16th, 2006, 19:05
Just a long shot, have you tried a new cable? Don't overlook the simple stuff.

SiGiNT

Aimless
November 16th, 2006, 21:01
Heh!

Maybe we should now start charging for support calls...

Have Phun

roocoon
November 17th, 2006, 02:49
Hi.

I've tried EasyRecovery Pro among others.
As for the cable, I'm using the same one to plug in another working SATA disk and it works fine (the PC is open and I switch disks around to transfer data among partitions to make room for an eventual image transfer of the bad disk).

I've tried Partition Table Doctor 3.0 again (the new v3.5 seems to see only four disks max).
Even though it shows the correct disk parameters and knows it's NTFS, when it scans, it looks for NTFS and then it cycles looking for FAT32 FAT for 8-sector and down to 1-sector.
It was getting stuck at 53% of the first partition while searching for FAT32 FAT for 8-sector.
This time it got stuck at the same point but I was too fed up and left it to go do something else.
Two days later (now) it had moved and was stuck at 18% of the cycle for 1-sector.
Give it another 5 days and it might finish or finish me.

Whatever this behaviour means. Do different points on the disk give read errors at random?
It looks like a bad circuit board but I can't find the same disk anywhere to switch boards so I have to wait out Partition Table Doctor.

I located the site Naides had mentioned. It's a great place with some very nice tools. Actually I had one of them but had forgotten about it. The site is hddguru.com

Regards.

Nacho_dj
November 17th, 2006, 04:29
Have you tried GetDataBack?
hxxp://www.runtime.org

It worked for me in a similar situation...

Cheers

Nacho_dj

roocoon
November 17th, 2006, 11:21
Yep. One of the first I tried when I attempted to recover the data. No luck either.
The same with Stellar Phoenix (I had good results with it before) and a few others.
If there were no read errors on the MFTs I would have fixed them already with WinHex (it's pretty helpful at manually fixing system areas).
Anyway, we'll see. I still have to try the tools from the hddguru site but I don't want to stop PTD yet (still at 18% 12 hours later!!).

WaxfordSqueers
November 17th, 2006, 19:38
Quote:
[Originally Posted by roocoon;62411]To keep the story short, I have a year-old SATA WD 1600JD disk (split into two equal NTFS partitions of 80GB each) that suddenly developed problems. The disk had checked clean a week ago and the problems started after an overnight computer crash. It could be a mobo failure, the PSU or the disk of course.


I just went through this recently with a Maxtor drive. BTW...I'll never buy Maxtor again due to their out and out refusal to help me with parts, documents and firmware. From your description I'd say you definitely have a hard drive problem. When SMART says you have, you have. It 'could' be something simple that is repairable, but it might not be recoverable. Check out this site, and be prepared to be ignored, or to get replies from professionals looking to relieve you of considerable amounts of money.

http://hddguru.com/

This is your best bet for advice, however. There is an English language side and a Russian side. If you decide to get into the recovery process, be prepared to spend a lot of time deciphering Russian. Here's some tips to save you time:

1)DO NOT open the sealed cover on your hard drive unless you are absolutely sure about what you are doing. The low-level formatting is written to the drive in a dust-free clean room, through a slot in the bottom of the drive, after it is sealed, by a very expensive, high-precision machine. If you mess up the alignment inside, you are screwed. There are ways to change heads in there, in an emergency, for data recovery, but if you are lucky enough to get the data, you might as well chuck the drive afterward. It will never read or write correctly again.

2)stay away from the so-called data recovery experts, unless you're willing to spend upward of US $1000. They'll suck you in with a low price till they get your drive, then the price suddenly doubles and triples. The reason: most of them are hackers who have invested in expensive equipment and need a quick return on investment.

There is software available in Russia from Acelabs called the PC3000. You can buy it at a reasonable price (to North Americans) in Russia, but they don't have a lot to spend over there so it's expensive to them. The dealers for that product in North America are black-hearted capitalists who want over $10,000 for the same product.

The real PC3000 comes with a PCI board, but there are altered (shall we say) versions available on the net that work without the board. Getting the proper English documentation is the trick, since you don't ever want to play with low-level code on a hard drive blind. I say the 'real' PC3000, because there is a site in China blatantly advertising themselves as the PC3000 and it's a copy. It sells for about US$600, but from what I've heard, it's better to steer clear of them.

3)Modern hard drives often store firmware on the disk as well as in their ROM. The disk firmware is stored in sectors, and if one of them goes bad, due to maybe a bad write during a power fail, the drive will just stop loading its boot code. The PC3000 software allows you to examine that, and to re-write the bad code 'IF' you can find the right firmware to replace it. Even at that, interpreting the PC3000 findings is an art in itself. There are people on the site I gave above who 'can' help you, if you reach the right person.

Good luck.

LLXX
November 17th, 2006, 23:51
Quote:
[Originally Posted by WaxfordSqueers;62482]1)DO NOT open the sealed cover on your hard drive unless you are absolutely sure about what you are doing. The low-level formatting is written to the drive in a dust-free clean room, through a slot in the bottom of the drive, after it is sealed, by a very expensive, high-precision machine. If you mess up the alignment inside, you are screwed. There are ways to change heads in there, in an emergency, for data recovery, but if you are lucky enough to get the data, you might as well chuck the drive afterward. It will never read or write correctly again.
Actually almost all the drives manufactured since the mid 1990s aren't very critical of alignment; they track the head position based on embedded servo signals in the platters themselves, so they can almost always find the track if it's there. Much unlike the older stepper-motor drives, which relied on absolute positioning and needed very precise alignment (sometimes even thermal expansion would cause problems with those).

The best way to open a drive is when it's running, since the spinning of the platters acts as a fan and dust will get nowhere near the surface without being flung away by the air current.

I recently opened an old 512M drive just to see whether it would have any effect (was also thinking of doing a clear cover mod to it...) if "contaminated" air was introduced to it... did continuous low-level read/write testing for a day (leaving it open all the time) and no errors at all after approximately 200 reads/writes to every sector. It was a lot less sensitive than I thought.

roocoon
November 18th, 2006, 05:08
Thanks.

I've read most of the English threads in hddguru and the opinions on the Chinese lookalike of the PC3000. They could be right to be negative towards it but then again, they might want to protect their investment in the PC3000.

I looked at AceLabs but I didn't realize the PC3000 is coming in a software-only form. I thought every variety depended on some addin card and/or module.

In any case, who can afford it? The data in the drive is not life-critical to justify the cost. I'll keep trying with software.

By the way, I saw a reference to a PC3000emul 1.2. Never heard of it and with my laptop I'm not set up for searches to Usenet, eDonkey and such.
Something to look for in preparation for the next drive failure.

As for SMART, I saw a reference claiming that SMART bases its results mostly on access time. If this isn't up to par, the other measurements would be affected too. I'll verify this claim after reading the ATAPI manuals.
Also, SMART errors are not a perfect way to judge a drive's condition.
If you have, say, a bad cable that causes errors, SMART will update its counters. If you replace the cable, the counters are not reset and SMART will keep screaming its head off that the drive is bad.
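To make the threshold behaviour concrete: SMART tools report each attribute as a normalized value alongside a vendor-set threshold, and an attribute counts as failed once its value drops to or below that threshold. A minimal Python sketch follows; the attribute names are the ones commonly shown by SMART tools, but the numbers are invented for illustration and are not readings from any real drive.

```python
from typing import List, NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int       # normalized current value (higher is healthier)
    threshold: int   # vendor-set failure threshold


def failing_attributes(attrs: List[SmartAttribute]) -> List[SmartAttribute]:
    """Return the attributes whose normalized value is at or below the threshold."""
    return [a for a in attrs if a.value <= a.threshold]


# Hypothetical sample readings, made up for this example.
sample = [
    SmartAttribute(1, "Raw_Read_Error_Rate", 200, 51),
    SmartAttribute(5, "Reallocated_Sector_Ct", 100, 140),
    SmartAttribute(199, "UDMA_CRC_Error_Count", 200, 0),
]

for attr in failing_attributes(sample):
    print(f"FAILED: {attr.attr_id} {attr.name} value={attr.value} threshold={attr.threshold}")
```

Because the drive keeps the accumulated counts, fixing an external cause such as a cable does not by itself bring a tripped attribute back above its threshold.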

Update: PTD was at 18% for a day and a half up to last night. This morning it's stuck at 30%.
I get the feeling PTD is a night-person type of program. Does nothing during the day and only works during the night. Maybe I should rig some flashing lights up and make it think it's in a nightclub or something.

WaxfordSqueers
November 18th, 2006, 19:22
Quote:
[Originally Posted by roocoon;62497]I looked at AceLabs but I didn't realize the PC3000 is coming in a software-only form. I thought every variety depended on some addin card and/or module.
It was hacked a long time ago so the card wasn't required. The card is used for only a few of the features.
Quote:
[Originally Posted by roocoon;62497]By the way, I saw a reference to a PC3000emul 1.2. Never heard of it and with my laptop I'm not set up for searches to Usenet, eDonkey and such. Something to look for in preparation for the next drive failure.
The emulator is the hacked version. There are several generations of it. The original app has a dongle for protection and the emulator takes care of the dongle and emulates the card. If you spend enough time, you'll find what you want. You don't need edonkey, etc.

WaxfordSqueers
November 19th, 2006, 03:26
Quote:
[Originally Posted by LLXX;62491]Actually almost all the drives manufactured since the mid 1990s aren't very critical of alignment; they track the head position based on embedded servo signals in the platters themselves, so they can almost always find the track if it's there. Much unlike the older stepper-motor drives, which relied on absolute positioning and needed very precise alignment (sometimes even thermal expansion would cause problems with those).
I wouldn't say the drives aren't very critical of alignment. The densities are getting out of this world and they have reorganized the sectors so the sector density is higher around the larger diameter outside edge of the disk. They used to talk about cylinders, tracks and sectors as defining a disk but that's all out the window now. Modern drives can set up their cyls/trks/sects to their liking.

The problem with stepper motors was the linear inertia. As the head shot straight out, it gained momentum and usually overshot the track. Then it had to be reined in and nudged back and forth till it was centred. That took time, and it was cumbersome. The newer voice coil actuators come in on an angle, and once they pick up the embedded bytes in the track, they can lock onto them. Even that's not foolproof, however. There are CRC bytes in the same area as the servo data that have to be read and be accurate, and the platter is going so fast it might complete a full revolution before the heads get locked in. Timing can be an issue, especially if the spindle motor servo is out of whack. The spindle motor servo data is stored on disk as well.

The motor is 3-phase, and is kept in sync by feeding it phase pulses from a table on disk. On initial boot, the motor free-wheels till the heads start reading, then its speed is locked in by the disk data. If it doesn't lock in over a certain period, the main drive processor shuts the drive down. That would happen if the servo data was corrupt on disk. There are so many functions dependent on the firmware stored on the disk, and even one byte out of range will shut the drive down, or not let it start at all.

Cylinder 0 normally holds vital data for the boot. That data tells the head where to find the service sector where a lot of vital data is stored. The drive electronics and firmware are a totally self-contained unit and have nothing initially to do with the motherboard BIOS or the processor. The initial power-on boot sequence comes from the firmware in ROM and on the drive's service sector. If any of that gets messed up, the drive simply shuts down.

Quote:
[Originally Posted by LLXX;62491]The best way to open a drive is when it's running, since the spinning of the platters acts as a fan and dust will get nowhere near the surface without being flung away by the air current.
I'm not implying that it's impossible to take the cover off and keep the disk working, I'm just saying you'd better know what you're doing. Acelabs has gone so far as to claim that once the seal is broken, the drive never seems to work right again. I don't know what that's about, but they don't call it a hermetically-sealed unit for nothing.

The platter surfaces have oil on them, and even though you don't see specks of dirt with the naked eye, that doesn't mean there aren't microscopic particles adhering to the oil. The heads are flying a fraction of a human hair width above the surface, and they oscillate a bit due to air currents. If you jar the drive even slightly, or a piece of microscopic dust lodges in the air space between head and platter, you might not see any damage, but it could be significant.

Quote:
[Originally Posted by LLXX;62491]I recently opened an old 512M drive just to see whether it would have any effect (was also thinking of doing a clear cover mod to it...) if "contaminated" air was introduced to it... did continuous low-level read/write testing for a day (leaving it open all the time) and no errors at all after approximately 200 reads/writes to every sector. It was a lot less sensitive than I thought
A 512 Meg drive is tiny compared to today's drives, and you can get away with a lot of abuse with them. I have an old 10 Gig drive that is ancient. You can drop those things and they'll still work. My current drive is an 80 gig, and that's old. The technology has taken off. The 10 gig has 1 platter with two heads. The 80 gig has three platters with 6 heads.

roocoon
November 19th, 2006, 05:38
Thanks for the clarification WaxfordSqueers. I'll go look for this emulation software.

Very nice hardware explanation by the way.
Out of curiosity, would you know why I get different behaviour between reading and writing (see my original post above)? Are there different subsystems involved?

PTD is at 42% as of now. I just hope there are no power failures for the next few days.

Maximus
November 19th, 2006, 08:56
Quote:
[Originally Posted by WaxfordSqueers;62511]
Cylinder 0 normally holds vital data for the boot.


uhm... this makes me think. I wonder if older IDE protocols for directly accessing the HD are still valid in EIDE interfaces.
Also, I do not understand one thing well: is cylinder 0 write-enabled, supposedly for BIOS updates? And can it be accessed as normal sectors if one uses the multi read/write EIDE interface protocol?

Maximus

LLXX
November 19th, 2006, 22:24
That's the "real" cylinder 0, physical sector 0.

What you see as sector 0 is likely not going to be the physical sector 0. In other words, it goes below zero.

roocoon
November 20th, 2006, 13:32
I've located PC3000 for DOS v14 (and v12) software with the emulator.
Now if only I can find some instructions in English.

Does anybody know if this version supports SATA disks?

On the PTD status it's now at 66%.
For some reason, it seems to be spending about a day and a half on some spot and then quickly moves to the next freeze spot.
Somehow these freezing spots are 12 percentage points apart, e.g. 66% now, 54% before, 42% earlier, 30%, 18%.
Why would that be?

WaxfordSqueers
November 20th, 2006, 20:07
Quote:
[Originally Posted by Maximus;62515]uhm... this makes me think. I wonder if older IDE protocols for directly accessing the HD are still valid in EIDE interfaces.
Also, I do not understand one thing well: is cylinder 0 write-enabled, supposedly for BIOS updates? And can it be accessed as normal sectors if one uses the multi read/write EIDE interface protocol? Maximus
I'm not presenting myself as an expert on HDD theory, I just know enough to be dangerous. ATA/ATAPI and IDE/EIDE are the same thing, but ATA is the correct reference to the interface for HDDs. In the old days, there were no control electronics on an HDD, it was all on the disk controller on the mobo. It was possible to write directly to an HDD, and in some cases, you could give it a low-level format. Not anymore.

IDE means Integrated Drive Electronics, and refers to the electronics circuitry on the HDD itself. On the simpler, earlier HDDs, there was a bootstrap loader used, which was a form of servo. After a power on, the platter would get up to speed, the heads would load automatically onto the outermost cylinder, which was cylinder 0, and start to read under control of the mobo processor. One of the first things you would notice while exploring with Norton Utilities, was an EB instruction with an address, in cyl 0. That is assembler for 'jump direct', and it told the processor to jump to firmware code in the BIOS memory area. That would initiate other processes, which eventually got the system up and running.

That has all changed. The older drives were dumb, in the sense they were completely controlled by a controller interface on the mobo. Today, all that electronics, and more, is on the HDD. The older HDDs would allow Norton to reformat the low-level formatting, but that is not accessible anymore. It's done at the factory once, and unless you have access to one of those machines, there's nothing you can do about low-level formatting.

That low-level formatting contains the servo markers for voice-coil actuators to read the tracks and orient themselves. That entire sub-system is under the control of the HDD processor and its firmware, part of which on some drives (Maxtor) is on the platter. The sub-system requires data, in the form of tables/segments to operate. All of this is hidden from any operating system or external application that uses ATA commands to access the HDD.

That means no IDE/EIDE app can ordinarily directly access this sub-system. There are ATA commands available to get limited access, but getting complete access would involve interfacing with the HDD processor directly. There are a few apps out there, like the PC3000, Victoria, MHDD, etc., that can access the sub-system, but the latter is so fragile, there is no room for mistakes. Once you overwrite a crucial table, the drive is junk.

I'm not absolutely sure what is meant by cylinder 0 these days. The sub-system has its own service area, and it can be addressed starting at its own cyl 0. That's the one I meant. Only the HDD processor can normally access that area. The other physical cylinder 0 'should' be the outermost cylinder on the HDD, but I wouldn't even bet on that.

The boot system is not the same these days. The HDD, from power on, goes through its own housekeeping routine, even if it's not connected to an ATA ribbon cable. If it doesn't complete that routine, the OS can't access it. One way to tell is looking at the boot screen of the OS. One drive, a Maxtor 5T060H6, will be identified by that model number on the OS boot screen. If the drive is faulty, it shows up as a Maxtor 'Rigel'. That's because the 5T060H6 is identified on the platter firmware, and by 'Rigel' in the HDD processor's ROM inside the processor. Seeing 'Rigel' on the OS bootscreen tells you there is a problem with the firmware portion on disk. If the HDD processor can't read the disk firmware, it reads its own ROM to ID the drive.

When you think in terms of a HDD, you have to think in different ways. One way is the HDD from the perspective of the operating system, and the other is from the perspective of the HDD processor with its service sector. There is also the processor on the motherboard with its BIOS (firmware) and the HDD processor with its firmware. When you talk about accessing sectors on a hard drive from a user perspective, you are speaking with reference to the operating system, not the HDD itself.

On each track of a hard drive, there are a variable number of sectors that hold user data, CRC bytes and servo data. Low-level formatting is the sub-structure of cylinders/tracks/sectors written to the HDD in the factory. The formatting performed by an OS like Windows, writes information into a user area inside each sector but separate from the low-level formatting from the factory.

There are various levels of ATA. ATA-1 is equivalent to IDE and ATA-2 is equivalent to EIDE, or enhanced IDE. They are only rules for communication between a processor and a storage device. There's nothing in the ATA protocol that allows a user to access the low-level HDD structure or the service area of the HDD with its tables of data.

If the SATA drive in question in this thread has a low-level problem, then no ordinary software app can do anything about it. All the user can affect is the structures written to disk that define the OS, like partition tables, FATs and sectoring peculiar to the OS.

WaxfordSqueers
November 20th, 2006, 20:27
Quote:
[Originally Posted by roocoon;62514]would you know why I get different behaviour between reading and writing
Read my reply to Maximus. The reading/writing is done through different circuits. The head coils are split and have different resistance measurements. I put that down to a heavier current being required to write than to read. If one of your read heads is intermittent, or its amplification circuitry is suspect, that could cause a problem with the read and not the write.

A read error should not be taken literally. It doesn't mean there is a problem with the read per se. The error message is simply the HDD processor reporting back that it had difficulty retrieving the data. I caution you not to jump to any conclusions. There are problems in the service area tables that could cause that too.

For example, if the processor encounters problems in a sector, it can mark it as bad and relocate that sector. The relocations are kept in a table. If the table becomes corrupted, the HDD processor can't read the sector and flags it as an error. The fact that your problems began after a crash leads me to think you have a corrupted firmware section.

One other thing to check. In XP (assuming you're using it), under Start/Settings/Control Panel/System/Hardware/Device Manager take a look under Hard Drives and highlight one of your drives. Right click and select properties. Look under the policies tab to see if 'Enable write caching' is checked. If so, there's a high likelihood that a crash will cause data loss. Write caching buffers data intended for your hard drive and writes it later. If you crash suddenly, that data gets lost. I have turned it off on all drives.

If your drive is that new, it may still be under warranty. If there's no data on it you need, why not try for a replacement from Western Digital?

WaxfordSqueers
November 20th, 2006, 20:45
Quote:
[Originally Posted by roocoon;62536]I've located PC3000 for DOS v14 (and v12) software with the emulator. Now if only I can find some instructions in English.
There are English versions out there but one is a Chinese translation, which has its own problems. I have one for a Maxtor family that might help, but I don't know how appropriate it would be posting to this forum considering the nature of the software in question. You could always PM me with an email addy. The file is 1.17 Meg in PDF format. It might compress a bit.

Quote:
[Originally Posted by roocoon;62536]Does anybody know if this version supports SATA disks?
I don't think they support SATA per se. An HDD doesn't care about SATA/PATA, it's the mobo SATA controller that does the conversion. You might enquire about that on the URL we mentioned earlier.

It might be a hassle because the PC3000 works best in a DOS environment. The low-level format will not have been affected by the SATA controller, so you should be able to use the PC3000 to examine the service area tables for corruption.

I edited the reply to insert this URL. They claim there are reliability problems with SATA drives.

http://www.ata-atapi.com/sata.htm

Quote:
[Originally Posted by roocoon;62536]On the PTD status it's now at 66%.
I don't know what PTD is. Have you tried the Western Digital troubleshooting software, or is that what PTD is?

Also, have a look here:

http://www.acelab.ru/products/pc-en/support.html

Check the article “Modern Hard disk drive” under downloads and also the articles under Western Digital.

roocoon
November 21st, 2006, 07:07
Thanks for the offer WaxfordSqueers.
I have a Maxtor document that came with the PC3000 I downloaded. I also have many documents in Russian that don't help me much.
What I find though is that all the references point to older hard disk models. I wonder how the program will behave against my hard disk.

True to form, PTD (Partition Table Doctor 3.0) is now 12 percentage points further along. It's now frozen at 78%. Tomorrow noon it should jump to 90%. With PTD anybody can be Nostradamus.

The SATA reference you pointed to is pretty scary. Hopefully things are much better today than they were when the document was written.

dELTA
November 21st, 2006, 11:33
Refreshing to see some hardcore hardware low-level knowledge around here, compared to all the software-kind of the same we usually have. Thanks for your contributions WaxfordSqueers (and everyone else), stick around!

Maximus
November 21st, 2006, 13:41
@WaxfordSqueers: I was talking about direct access, where direct means direct, through I/O commands sent to the HDD interface. I used them to retrieve IDE HD geometry for dead HDs in DOS times (it was an optional command, but all the HDs implemented it).
This way, you need only an IDE cable and a power cable attached to the HD, nothing else.
If the interfaces have not changed, the commands should be the same, and I wonder if referring to the 'hd bios' data area would be possible this way: would be ...interesting
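The geometry query described here most likely corresponds to the ATA IDENTIFY DEVICE command, which returns a 512-byte block of 256 little-endian 16-bit words. As a rough Python sketch, assuming such a block has already been captured by some other means (the sketch does not issue the command itself), the logical geometry and the model string can be decoded like this. The function name and dictionary keys are made up for illustration.

```python
import struct


def parse_identify(block: bytes) -> dict:
    """Decode a few well-known fields from a 512-byte IDENTIFY DEVICE block."""
    if len(block) != 512:
        raise ValueError("IDENTIFY DEVICE data must be exactly 512 bytes")
    words = struct.unpack("<256H", block)

    def ata_string(first: int, last: int) -> str:
        # ATA strings keep the first character in the high byte of each word,
        # so every word is re-packed big-endian to restore the text order.
        raw = b"".join(struct.pack(">H", w) for w in words[first:last + 1])
        return raw.decode("ascii", errors="replace").strip()

    return {
        "cylinders": words[1],           # logical cylinders
        "heads": words[3],               # logical heads
        "sectors_per_track": words[6],   # logical sectors per track
        "serial": ata_string(10, 19),
        "firmware": ata_string(23, 26),
        "model": ata_string(27, 46),
        # Words 60-61: total user-addressable sectors in 28-bit LBA mode.
        "lba28_sectors": words[60] | (words[61] << 16),
    }
```

Note that these are the logical values the drive chooses to report to the host; they say nothing about the physical layout or the service area discussed above.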

WaxfordSqueers
November 21st, 2006, 16:33
Quote:
[Originally Posted by roocoon;62558] I wonder how the program will behave against my hard disk.
I was rethinking that. For some reason, I relate SATA to RAID, but you didn't mention RAID, did you? SATA really has nothing to do with RAID, which is a special formatting technique to allow data to be distributed over two or more drives, or as a parallel system of duplication. For example, in a job I did at a hospital, they were using 9 Quantum Fireballs in a RAID array, with each drive getting 1 bit of a byte and the 9th getting the parity bit. It was part of a real-time heart Xray system.
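The nine-drive layout described above (one bit of each byte per drive, plus a parity bit on the ninth) can be illustrated with a few lines of Python. This is only a sketch of the parity idea under the assumption of even (XOR) parity; the function names are hypothetical and it does not model any particular controller.

```python
def stripe_byte(value: int) -> list:
    """Split a byte into 8 data bits (one per drive) plus a parity bit for the 9th drive."""
    bits = [(value >> i) & 1 for i in range(8)]
    parity = 0
    for bit in bits:
        parity ^= bit
    return bits + [parity]


def recover_member(stripe: list, lost_index: int) -> int:
    """Rebuild the bit held by a failed drive by XOR-ing the eight surviving members."""
    rebuilt = 0
    for index, bit in enumerate(stripe):
        if index != lost_index:
            rebuilt ^= bit
    return rebuilt


def join_bits(stripe: list) -> int:
    """Reassemble the original byte from the eight data bits."""
    return sum(bit << i for i, bit in enumerate(stripe[:8]))
```

Since the XOR of all nine members is always zero, any single missing member equals the XOR of the other eight, which is why one failed drive in such an array does not lose data.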

So, you can have a SATA controller as a standard HDD interface without RAID. In that case, the only difference between SATA and the older PATA (parallel i/f) should be in the ATA interface. One uses a 40-pin connector with an 80-conductor ribbon cable (40 conductors of which are just grounds) and the other uses a 7-conductor serial cable. After the data reaches the drive, it has to be converted to parallel again, because the HDD processors are all parallel data input. Then it gets converted back to serial for writing to the drive, i.e. the write heads only have 2 conductors. From that perspective, the only difference between SATA and PATA should be in the ATA interface. The rest of the HDD should be the same in either system.

The article I pointed you at mentioned issues with the serial cable itself. It's definitely worth checking out. In modern data cabling, using the TIA/EIA category 5 standard, there are special precautions that must be taken when handling the cables. That's because data travelling at 100 MHz or greater requires special considerations, especially in twisted pair cable. Kinks in the cable can be catastrophic even though it's just twisted pair, and the terminations are critical. For example, you're not allowed to untwist more than about 3/4 inch of the twisted pairs for termination. Crosstalk is a major problem at that speed.

The traditional serial cable has an RX pair and a TX pair, meaning receive and transmit. Writing would of necessity involve the TX pair and reading the RX pair. Problems in the pairs could cause read and write problems, but unfortunately you mentioned your trouble started after a crash, and that is a classic for problems on the platter firmware itself.

If you can boot to pure DOS with your SATA drive, and I mean by inserting a floppy or CD with a pure DOS command.com, then I don't see any reason why the PC3000 shouldn't work with your drive. It shouldn't care if it's SATA or PATA because it can't see the difference. It's obviously sending commands through the ATA interface that we mere mortals can't access.

WaxfordSqueers
November 21st, 2006, 16:57
Quote:
[Originally Posted by Maximus;62566]If interfaces are not changed, commands should be the same: and I wonder if referring the 'hd bios' data area was possible this way: would be ...interesting
I'm glad you find it interesting because I do too. Many people thought I was a bit crazy for trying to fix my HDD when it went down. Maybe they were right because I did not succeed. That had more to do with a lack of support from companies like Maxtor.

Obviously, the PC3000 has a way of accessing the HDD firmware area, but I don't know if it's using undocumented ATA commands, or whether it has access to the HDD processor firmware. HDD manufacturers have ways of accessing the HDD processor and its commands, but some of them plug into a serial interface on the drive electronics PCB of the HDD. There must also be an ATA command that can access the service sector of the HDD, because the PC3000 does it through a DOS interface. The newer versions have a GUI interface.

I have to warn you, however, that if you are considering doing this, there is a steep learning curve. For one, most of the good information on the subject is in Russian. Also, even on one model of a HDD, there can be several variations of firmware. It can be maddening trying to find the exact copy for your drive, and without it, certain problems cannot be addressed.

It's not quite as straightforward as our kind of reverse engineering. Much of the firmware on a mobo is straight code, but a lot of the firmware of an HDD is data tables that make little sense. I'm sure you could figure it out eventually. I'm trying to say that hacking your way in using ATA commands is just the beginning of your problems.

What I wanted to do was gain control of the read/write heads. In the old days, you could use Basic to send commands to the head mechanism, telling it which cyl/track/sect you wanted to read or write. That one-to-one physical era is gone. There is a table in the firmware area dedicated just to translation of the physical geometry to the logical geometry used on modern hard drives. In fact, you can now specify which geometry you want, but the processor does it through its own algorithms.
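
To make the logical side of that translation concrete, here is a minimal sketch of the classic textbook CHS-to-LBA mapping. It is only the host-visible logical geometry; the drive's internal physical translation (zones, defect lists) is proprietary and is not what this shows. The function names and the default geometry of 16 heads and 63 sectors per track are assumptions picked for illustration.

```python
def chs_to_lba(cyl, head, sect, heads_per_cyl=16, sectors_per_track=63):
    """Map a logical cylinder/head/sector address to a linear block address.

    Sectors are numbered from 1 in CHS; cylinders and heads from 0.
    The default geometry is an assumed example, not read from any drive.
    """
    if not 0 <= head < heads_per_cyl:
        raise ValueError("head out of range for this geometry")
    if not 1 <= sect <= sectors_per_track:
        raise ValueError("sector out of range for this geometry")
    return (cyl * heads_per_cyl + head) * sectors_per_track + (sect - 1)


def lba_to_chs(lba, heads_per_cyl=16, sectors_per_track=63):
    """Inverse mapping: linear block address back to (cyl, head, sect)."""
    cyl, rest = divmod(lba, heads_per_cyl * sectors_per_track)
    head, sect0 = divmod(rest, sectors_per_track)
    return cyl, head, sect0 + 1


if __name__ == "__main__":
    lba = chs_to_lba(2, 5, 10)
    print(lba, lba_to_chs(lba))
```

Whatever geometry you hand the drive, this arithmetic is all the host ever sees; where block N physically lives is decided by the drive's own tables.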

Maximus
November 21st, 2006, 17:44
It has been a long time since I played with IDE I/O commands (almost the same as those used by the old INT13H). Many things have changed but, on reflection, I believe that access to the firmware is not allowed through the READ/WRITE IDE commands. A bunch of commands were optional, and I guess that each company has implemented its own service commands for retrieving/updating firmware. Such actions surely require a specific CRC/signature, as for a Pentium microcode update.
PC3000 guys must have reversed and monitored such tools to know how/what they do... they did a cool job, indeed.

WaxfordSqueers
November 22nd, 2006, 15:48
Quote:
[Originally Posted by dELTA;62561]Thanks for your contributions WaxfordSqueers ....
Thanks for the acknowledgement, Delta. It's nice to be able to contribute in an area where I know a little. Normally, I have only been able to read, in wonder, the in-depth analysis of complex software problems on this board over the past couple of years. Thanks to so many for 'edumacating' me, and helping me out with my reversing issues.

roocoon
November 29th, 2006, 03:06
I thought I should update this even though the story doesn't have a happy ending yet.

The Partition Table Doctor with its skips of 12% points finished the cycle I was watching but then went on to another never-ending cycle so I stopped it.

The PC3000 I downloaded didn't help. All Russian interface and I can't make heads or tails of it. Guessing in this case was out of the question for the obvious reasons.

I ran MHDD, which showed a regular unrecoverable error. On its sector-blocks display, around 7 lines of good data were followed by 2.5 lines of errors.
Somebody said it might be a bad head on the disk and it sounds reasonable but I can't be sure.

I tried to copy whatever good data I could get so I tried HD Duplicator (the CopyR tool).
It saw most of the data in the first partition and flagged a few as bad. The problem was that I could select what I wanted to copy but couldn't activate the copy process because the program refused to stop scanning and had gotten stuck in some part of the disk.
They have a version that runs from a CD (HDD Duplicator Emergency) that allows setting the error retries to a smaller number, but it needs activation when it starts and there's not much I can do about it. It's a Linux kernel (I think) and I have no idea how to intercept it with a debugger, especially since my main system is off-line.

Now for something completely different as the Monty Python motto used to go.

Just before this problem started, I had installed a brand-new SATA disk (WD SE16 1500KS).
That disk had a problem restarting after a power-cycle. Running WD Diagnostics against it would show a couple of critical errors that required disk replacement. Nevertheless, the disk worked after that until the next power-cycle.
I replaced that disk with a similar one and transferred the data to it.
The new disk checked OK and the bad one was still critical.
I wrote zeros on the bad disk before I sent it back and checked it again.
Now the disk was clean!!!

That tells me that only CRC errors had occurred and that it was nothing as bad as WD Diagnostics claimed.

Q1. Is this disk to be trusted or should I send it back? If it doesn't have any errors now, why would the dealer accept it?

Q2. In case my current problem is similar in nature, are there any tools that can read the bad sectors, ignore the error, and just write them back in place fixing their CRC?
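
For reference, this is roughly the kind of tool I have in mind, as a minimal sketch only. It is not the in-place rewrite I'm asking about: it copies to a separate image instead, zero-filling anything that won't read, which is similar in spirit to what rescue tools like ddrescue do. The function name, the 512-byte sector size and the retry count are my own assumptions, and it assumes the source can be opened as a plain file (an image file, or a Linux-style block device node).

```python
import os

SECTOR_SIZE = 512  # assumed sector size for the sketch


def rescue_copy(src_path, dst_path, sector_size=SECTOR_SIZE, max_retries=1):
    """Copy src to dst one sector at a time, zero-filling unreadable sectors.

    Returns the list of sector numbers that could not be read.
    """
    bad_sectors = []
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)  # total bytes available
        for offset in range(0, size, sector_size):
            want = min(sector_size, size - offset)
            data = None
            for _ in range(max_retries + 1):
                try:
                    src.seek(offset)
                    data = src.read(want)
                    break
                except OSError:
                    data = None  # read error reported by the OS; retry
            if data is None or len(data) != want:
                # Give up on this sector and keep the image aligned.
                bad_sectors.append(offset // sector_size)
                data = b"\x00" * want
            dst.write(data)
    return bad_sectors
```

Reading one sector per call is slow, but it keeps a single bad sector from taking its neighbours with it, and a low retry count avoids the endless re-scanning I ran into above.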

Regards.

LLXX
November 29th, 2006, 05:12
Maybe it's your power supply or some other hardware that's causing this...

roocoon
November 29th, 2006, 07:09
It might have been the supply or the mobo. I suspected both.

I've changed the PSU already just to make sure but it was already too late.