WaLe

2nd broken WD1200JB - just bad luck?

Hi!

I bought a Western Digital WD1200JB 2.5 years ago. Last autumn I noticed that it was quite slow, and HD-Tach gave me this result:

wd120jb_old.JPG

I RMAed it and got a new (or refurbished) HD back 6 weeks later. Performance according to HD-Tach was fine, and it felt fine in use too.

Only recently I got the impression that it was bad again (I don't know for how long, because I don't use it as my main disk anymore), so I ran HD-Tach again:

wd120jb_new.JPG

(blue line was the same HD after I got it back, red was today)

Is there any chance that this was caused by anything other than the HD itself? Or is this just bad luck and I have to RMA it again?

Thanks

Walter


Obvious entry-level question since you didn't mention it - I assume you ran chkdsk/defrag/antivirus/antispyware on the drive before testing it? Are there any telling differences in amount of data on the drive, etc?

In order of decreasing likelihood:

1) The problem is on the data level. Corrupt filesystem, fragmented files, rampant spyware, insufficient free space.

2) Something environmental is wrong - heat, dust, electrical.

3) You just had bad luck and had two drives die on you. Or, something connected to the hard drives is operating oddly and causing them to fail.

I hope you didn't tell WD's RMA people that you were sending it back because your benchmark scores were odd - and if so, they're surprisingly lenient.


Could be other factors such as heat, PSU stability and/or hardware/software configuration.

I usually don't like to tax (or benchmark) the HDD since this increases wear and tear. Two HDDs failing in a row within a few years is usually very unlikely, although I've come across some exceptions. I've never had two HDDs I bought fail on me in a row, though...

Perform backups just in case.. :rolleyes:


Based on the graph you posted, this doesn't seem to be a "benchmark" issue. It looks more like a problem with data transfer, media damage, or other electronics in your PC. To isolate the problem, I recommend trying this drive in another PC to see whether the same problem appears.

Notice that you have a similar pattern of spikes in the 70-90GB range and lots of spikes below 60GB, so there seem to be some transfer-related issues. Try forcing UDMA33 or lower and see if you get a consistent reading. Bad or shorted 80-conductor cables can cause unreliable data transfer in some cases. Also try another port on your motherboard and see if you have the same issues.

Definitely do a backup just in case.


I once read in another thread that poor mounting can cause poor performance. Try taking the HD out of the case and see if that changes anything...


Sorry for my late answer - I was a bit stressed recently (loads of work, both children and wife ill).

Obvious entry-level question since you didn't mention it - I assume you ran chkdsk/defrag/antivirus/antispyware on the drive before testing it?

I am using an up-to-date virus scanner, and antispyware didn't show anything. I first noticed the problem when running defrag (because it was so slow), and chkdsk didn't show any errors.

But I ran WD's DLGDIAG for Windows, and it already showed this error in the quick test:

Test Result: FAIL

Test Error Code: 06-Quick Test on drive 2 did not complete! Status code = 04 (Unknown failed test element), Failure Checkpoint = 64 (SMART Attribute Test) SMART self-test did not complete on drive 2!

Does anyone know what that means?

As for the other potential causes you mentioned (corrupt file system, other system components, transfer problems): do you still think one of these might be the problem, given that my other (main) disk, which is on the same IDE cable, is not having any problems like this? (I know sharing a cable is not optimal, but I had no other way to connect both HDs.)
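For anyone comparing DLGDIAG results, the error line quoted in this post can be split into its numeric fields. This is only a minimal sketch: the regular expression and the function name `parse_dlgdiag_line` are assumptions inferred from this single line, not a documented DLGDIAG format, and other messages may be laid out differently.

```python
import re

# The error line exactly as quoted in the post above.
LINE = ("Test Error Code: 06-Quick Test on drive 2 did not complete! "
        "Status code = 04 (Unknown failed test element), "
        "Failure Checkpoint = 64 (SMART Attribute Test) "
        "SMART self-test did not complete on drive 2!")

# Hypothetical pattern, inferred only from this one line.
PATTERN = re.compile(
    r"Test Error Code: (?P<error>\d+)-(?P<test>.+?) on drive (?P<drive>\d+)"
    r".*?Status code = (?P<status>\d+) \((?P<status_text>[^)]*)\)"
    r".*?Failure Checkpoint = (?P<checkpoint>\d+) \((?P<checkpoint_text>[^)]*)\)"
)


def parse_dlgdiag_line(line):
    """Return the fields of a DLGDIAG error line as a dict, or None if no match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


fields = parse_dlgdiag_line(LINE)
for name, value in fields.items():
    print(f"{name}: {value}")
```

Having the error code, status code, and checkpoint as separate values makes it easier to compare one report with another, such as the 0007 / checkpoint 65 result reported later in this thread.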

But I ran WD's DLGDIAG for Windows, and it already showed this error in the quick test:
Test Result: FAIL

Test Error Code: 06-Quick Test on drive 2 did not complete! Status code = 04 (Unknown failed test element), Failure Checkpoint = 64 (SMART Attribute Test) SMART self-test did not complete on drive 2!

Does anyone know what that means?

According to WD it means that the drive should be replaced.


Hey all,

I'm having what appears to be the same issue with my two WD1200JB drives. I set them up as a RAID 0 volume on a HighPoint HPT370A PCI controller card when I first got them, and was never really impressed by the performance. In fact, it sucked:

2xWD1200JB-RAID0-HPT370A.png

I've now had a chance to move the data off the array and test the drives individually to see what was going on.

Both drives showed highly erratic transfer rates, with very low minimum and average transfer rates, and access times were a couple of ms off spec. The striped configuration probably exaggerated the problem.
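To put a number on "erratic", one option is to summarize the transfer rate samples from a benchmark run by minimum, average, maximum, and coefficient of variation. The sketch below is only an illustration: the function name `summarize_transfer_rates` is hypothetical, and the sample values are made up for the example, not measurements from the drives in this thread.

```python
from statistics import mean, pstdev


def summarize_transfer_rates(samples_mb_s):
    """Summarize sequential transfer rate samples (MB/s) from one benchmark run."""
    avg = mean(samples_mb_s)
    return {
        "min": min(samples_mb_s),
        "max": max(samples_mb_s),
        "avg": avg,
        # Coefficient of variation: spread of the samples relative to the average.
        "cv": pstdev(samples_mb_s) / avg,
    }


# Made-up illustrative values, NOT measurements from the drives in this thread.
steady = [50.0, 48.0, 46.0, 44.0, 40.0, 36.0]
erratic = [50.0, 5.0, 46.0, 3.0, 40.0, 8.0]

for name, run in (("steady", steady), ("erratic", erratic)):
    s = summarize_transfer_rates(run)
    print(f"{name}: min={s['min']:.1f} avg={s['avg']:.1f} "
          f"max={s['max']:.1f} cv={s['cv']:.2f}")
```

A healthy drive still slows down gradually towards the inner tracks, so some spread is normal. A run with deep dips shows a much lower minimum and a much higher coefficient of variation than a smooth one.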

As suggested by Western Digital tech support, I performed a zero fill of the drives to see if that would help.

Drive #1:

Zero Fill: OK

Quick Test: "Completed with read element failure", error code 0007, checkpoint 65

Second Zero Fill: OK

Quick Test: error 0007 again

Extended Test: bad sectors (2 of them) found and repaired

Quick Test: OK

Here's a before and after comparison using HD Tune:

Before zero fill:

WD-WMA8C3717474.png

After zero fill:

WD-WMA8C3717474-after.png

The STR performance now looks normal, and access times are now on-spec. The question is, would you trust this drive now, or should I get an RMA, given that the drive continued to fail the Quick Test after the first Zero Fill?

Drive #2:

Zero Fill: Many errors from sector 86,443,008 onwards; drive stops responding ("Drive Not Ready", error code 0104); HDD Activity LED remains on

Extended Test (after cold boot): "Too many bad sectors detected"

Second Zero Fill: More errors in same area; drive becomes unresponsive

This one's definitely going back!

Before (partially successful) zero fill:

WD-WMA8C3718879.png

After:

WD-WMA8C3718879-after.png

Notice how the STR now looks normal over the part of the drive that could be zero filled.
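To find where the zero fill on drive #2 started failing on the benchmark graph, the sector number can be converted to a byte offset. This is a rough sketch that assumes 512-byte logical sectors, which the thread does not state. The helper name `sector_to_offset` is hypothetical, and since the thread does not say whether the graph axis is in decimal GB or binary GiB, both are printed.

```python
SECTOR_SIZE_BYTES = 512  # assumption: 512-byte logical sectors


def sector_to_offset(sector, sector_size=SECTOR_SIZE_BYTES):
    """Return (bytes, decimal GB, binary GiB) for the start of a sector."""
    offset_bytes = sector * sector_size
    return offset_bytes, offset_bytes / 10**9, offset_bytes / 2**30


first_bad_sector = 86_443_008  # from the zero fill result for drive #2
offset_bytes, offset_gb, offset_gib = sector_to_offset(first_bad_sector)
print(f"{offset_bytes} bytes = {offset_gb:.2f} GB = {offset_gib:.2f} GiB")
# prints "44258820096 bytes = 44.26 GB = 41.22 GiB"
```

Under that assumption, the errors begin roughly 44 GB into the 120 GB drive.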

FYI, both drives were made in Malaysia, 23 March 2003, serial numbers in the range WD-WMA8C371xxxx.

--

screwtop


screwtop, I'd send both drives back for replacement.

WaLe, I've personally had several failed 1600JBs, and I've never found WD to be particularly reliable. However, my failure symptoms were very different from yours, so I doubt they're related.

However, seeing as the DLGDIAG test failed, you have a good reason to send the drive back for replacement. If there is no important data on the drive, try a zero fill as screwtop did. If that helps, it may well be some kind of media error, or caused by some kind of local electromagnetic interference. Two drives failing is quite rare; however, since they're both the same firmware/revision, they could both have had the same "problem" to start with. Do check whether you have any highly magnetic devices near your PC, speakers for example.


Yep, Western Digital tech support agreed - both my drives should be sent back for replacement.

I was surprised at how much a zero fill seemed to help things, though (even though the drives were still having problems). I wondered if the lower ambient temperatures here (New Zealand) compared to the drives' place of manufacture (Malaysia) could cause problems.

Cheers,

screwtop


I doubt the ambient temperature is a problem, unless it's really cold (e.g. the drive runs at <5°C). However, humidity and dust could be contributing factors...

But I ran WD's DLGDIAG for Windows, and it already showed this error in the quick test:
Test Result: FAIL

Test Error Code: 06-Quick Test on drive 2 did not complete! Status code = 04 (Unknown failed test element), Failure Checkpoint = 64 (SMART Attribute Test) SMART self-test did not complete on drive 2!

Does anyone know what that means?

According to WD it means that the drive should be replaced.

Does this mean they will replace it? The same thing happened to me :(

error.JPG

bench.JPG
