cbrworm

Specific machine - multiple drive failures. Suggestions?


Hi, I have a customer with an SFF Dell OptiPlex 9020 that runs 24/7 in an air-conditioned closet. It has been in service for about 3.5 years.

It initially had a Seagate ST3000DM001, which failed within the first few months of service. I replaced it with a new ST3000DM001 (this was before the high failure rate for those drives was known). It ran non-stop for almost a year before failing. The customer has sensitive data and doesn't want the drives sent back for warranty replacement, so they pay for new drives each time. By the time the second Seagate 3 TB drive failed, I had sworn off Seagate due to an incredible number of those drives failing at my customer sites.

So I bought a WD WD3003FZEX, thinking that a Black drive should be reliable. 604 days later, it is now failing. The risk here is that the drives spin 24/7, although the actual workload is very low. It is a headless machine with only remote users, so no one is around to hear the drives making noise. Luckily, both the first drive and this one failed progressively, with SMART errors showing up first. This one currently has high reallocated and pending sector counts.
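Since nobody is near the machine to notice a drive going bad, the reallocated and pending sector counts can be polled by a script. Below is a minimal sketch, assuming smartmontools is installed and that `smartctl -A` prints its usual ten-column attribute table. The function names, the watched attribute list, and the zero threshold are illustrative choices, not anything from the setup described above.

```python
import subprocess

# SMART attribute IDs commonly watched for media degradation (an assumed list).
WATCHED = {
    5: "Reallocated_Sector_Ct",
    197: "Current_Pending_Sector",
    198: "Offline_Uncorrectable",
}


def parse_smart_attributes(text):
    """Return {attribute_id: raw_value} parsed from `smartctl -A` table output."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the header and banner lines do not, so they are skipped.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        raw = fields[9]  # first token of the RAW_VALUE column
        if raw.isdigit():
            attrs[int(fields[0])] = int(raw)
    return attrs


def degraded(attrs, limit=0):
    """Return the watched attributes whose raw value exceeds `limit`."""
    return {WATCHED[i]: v for i, v in attrs.items() if i in WATCHED and v > limit}


def check_device(device):
    """Run smartctl against `device` (e.g. a path like /dev/sda) and report problems."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return degraded(parse_smart_attributes(result.stdout))
```

A scheduled task could call `check_device` on each drive and send an email whenever the returned dictionary is not empty. The device path and the alerting method depend on the operating system and are left out here.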

Being a small machine, airflow is not ideal, but the drives never exceed 52 degrees C. That is higher than I would like, but not high enough that I would expect failures.

What do I get next? HGST? I have had great luck with HGST at other locations, but honestly I have had very good luck with WD Black drives as well. I have had fairly horrible luck with Seagate SSHDs, even though an SSHD could be an ideal solution here. Is the WD 4 TB SSHD better? Is it suitable for 24/7 use?

I don't want to move to enterprise drives, because drive-level error recovery is a desired trait here.

Due to the database work that they are doing, when the system is being utilized, I need max IOPS, so I am hesitant to use a 5,400 RPM drive.

The machine is not exposed to any vibration, and it is in a very tightly controlled, dry environment. Unfortunately, the room is held at about 80°F.

Thanks!


Based on the Backblaze stats, 3 TB drives have been unusually unreliable, except for the 3 TB drives from HGST.

It sounds like you're running without a backup and have gotten lucky so far. I suggest upgrading to a pair of 4 TB drives, with a periodic backup script that clones the primary drive to the second one. It's not ideal, but it is far better than what you have now: prayer and your good looks.
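As a rough illustration of the periodic script idea, here is a minimal sketch of a file-level mirror in Python. It is not a block-level clone: it only copies files that are missing or changed on the second drive, and it never deletes anything there. The function names are made up for this example, and a live database would need to be dumped or quiesced first, since copying its files while they are in use can produce an inconsistent copy.

```python
import os
import shutil


def _changed(src_file, dst_file):
    """True if dst_file is missing, differs in size, or is older than src_file."""
    if not os.path.exists(dst_file):
        return True
    a, b = os.stat(src_file), os.stat(dst_file)
    return a.st_size != b.st_size or int(a.st_mtime) > int(b.st_mtime)


def mirror(src, dst):
    """Copy new or changed files from src to dst; return the relative paths copied."""
    copied = []
    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            if _changed(s, d):
                # copy2 preserves the modification time, so an unchanged file
                # is skipped on the next run.
                shutil.copy2(s, d)
                copied.append(os.path.normpath(os.path.join(rel, name)))
    return copied
```

Run from a scheduler, `mirror` would be given the data directory on the primary drive and a directory on the second drive. Both paths are site-specific and are not shown here.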


Thanks for your concern. The machine is backed up locally multiple times a day and nightly to a remote location. Downtime for repair is the killer. Ironically, they have a SAN in the same space with 24 1 TB Hitachi Ultrastars that has been running 24/7 for at least 8 years. I have been hounding them to upgrade since it went out of warranty a number of years ago, but it has never had a drive failure.

I ordered an HGST 3 TB drive yesterday, along with a Samsung 850 Pro for the OS.

This particular customer is difficult to begin with, so repeated hardware failures are not good.

I guess if this next set of drives follows the same pattern of basically doubling the lifespan of the previous set, they should be good for almost 4 years.

Edited by cbrworm

