Storage Forums: RAID5, 16-Drive Issue - Storage Forums


RAID5, 16-Drive Issue

#1 cdh

  • Member
  • Group: Member
  • Posts: 4
  • Joined: 25-February 08

Posted 01 November 2009 - 09:50 PM

I hope so much that someone here can help me (very worried). I was having trouble with my RAID5 recently. Drives were coming up as failed or missing. I was able to get them all back except for two: one said failed, and the other said missing. However, I tried many times to remove and re-insert the missing drive. It would not come back up.

So, after reading online, it seemed an option was to delete the RAID set and then re-create it. I did so, and my D and E drives were both visible as they should have been. The D drive seemed to work also. However, the E drive did not work, and it holds the vast majority of the content (the D drive is about 750 GB, and the E drive is the rest of the 15TB).

I remembered that I didn't select the "Greater than 2TB" option when I re-created that RAID set (I thought the option was only needed if your operating system did not support more than 2TB; mine does, Windows XP x64). Since the other one worked, though, and it was not greater than 2TB, I figured that's what it must be. So, I went into the volume set options and changed that setting. It didn't say it was going to erase the disk contents, unless I missed it (I have 3% of my vision, so it is possible, although I was looking very closely since this is extremely important to me). Yet, it began initializing, which to me means bad news. It was at about 15% by the time I got into the web admin to see what was happening. I shut down the computer, turned off the drive boxes, and then rebooted so that I could send this message.

Firstly, is it indeed erasing all my data? If so, can it be aborted/reverted at this point (long shot, I guess, but I am hopeful!)? I very much hope someone here will have good news for me. Thanks much in advance.

#2 continuum

  • Mod
  • Group: Mod
  • Posts: 2,452
  • Joined: 31-December 01

Posted 02 November 2009 - 01:41 PM

There are software tools that exist to recover damaged RAIDsets, but a 16-disk RAID5 that's already been 15% initialized?? Eeeeeek. You have probably just screwed yourself...

(And as you experienced, rebuild times on arrays this large get really scary: you have a HUGE window of vulnerability to a 2nd disk failure, which hoses you pretty good. In the future, I would run it as a pair of 8-disk RAID6s at least.)

#3 HachavBanav

  • Member
  • Group: Member
  • Posts: 237
  • Joined: 03-August 07

Posted 06 November 2009 - 04:07 AM

RAID 5 on large SATA drives is NOT reliable. Full stop.
RAID 6 on more than 10 large SATA drives is NOT reliable either.

Use RAID 10 and enterprise-class SATA drives (with a BER of 1 per 10^15) to avoid these kinds of problems!

#4 HachavBanav

  • Member
  • Group: Member
  • Posts: 237
  • Joined: 03-August 07

Posted 09 November 2009 - 04:22 AM

Just to show how BAD a 1TB SATA drive with a BER of 1 per 10^14 is in a RAID array:

A BER/UBE of 1 per 10^14 means an 8.8% probability of ONE unreadable sector per 1TB read!
==> You have 16 of those unreliable 1TB HDDs!

Enterprise-class SATA drives (and large SAS HDDs) are one order of magnitude more reliable: 1 per 10^15.
Most small SAS HDDs are 1 per 10^16.
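
A quick way to sanity-check figures like these is to treat every bit read as an independent trial with a fixed error probability. That is a simplifying assumption (real read errors tend to cluster, and a rated BER is a spec-sheet bound rather than a measurement), and the function name below is purely illustrative:

```python
import math

def p_unreadable(bytes_read: float, ber: float) -> float:
    """Probability of at least one unrecoverable read error,
    assuming independent errors at a rate of `ber` per bit read."""
    bits = bytes_read * 8
    # 1 - (1 - ber)^bits, computed in a numerically stable way for tiny ber
    return -math.expm1(bits * math.log1p(-ber))

TB = 1e12  # decimal terabyte, as drive vendors count it

for ber in (1e-14, 1e-15, 1e-16):
    print(f"BER {ber:.0e}: {p_unreadable(TB, ber):.2%} per 1TB read")
```

Under this model a 1-per-10^14 drive comes out at roughly 7.7% per decimal terabyte read, or about 8.4% if 1TB is taken as 2^40 bytes. The 8.8% figure appears to match the expected number of errors per 2^40 bytes rather than the probability of at least one, though the two are close at this scale.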

#5 qasdfdsaq

  • Member
  • Group: Member
  • Posts: 1,174
  • Joined: 29-December 02

Posted 15 November 2009 - 08:17 PM

HachavBanav, on Nov 9 2009, 10:22 AM, said:

Just to show how BAD a 1TB SATA drive with a BER of 1 per 10^14 is in a RAID array:

A BER/UBE of 1 per 10^14 means an 8.8% probability of ONE unreadable sector per 1TB read!
==> You have 16 of those unreliable 1TB HDDs!

Enterprise-class SATA drives (and large SAS HDDs) are one order of magnitude more reliable: 1 per 10^15.
Most small SAS HDDs are 1 per 10^16.


Many non-enterprise class SATA drives have a rating of 1 sector in 10^15.

Also, rated values != reality.

Rated values == marketing.


1.5TB x5 = 7.5TB

Zero unreadable sectors in over 100TB read.

Explain that.
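
For what it's worth, an independent-error model (an assumption, not something a spec sheet promises) puts rough numbers on this observation. The sketch below, with an illustrative function name, computes the chance of reading 100TB without a single unreadable sector at two rated BERs:

```python
import math

def p_clean_read(bytes_read: float, ber: float) -> float:
    """Probability of zero unrecoverable errors, assuming
    independent errors at `ber` per bit read (a simplification)."""
    return math.exp(bytes_read * 8 * math.log1p(-ber))

read = 100e12  # 100TB, decimal

for ber in (1e-14, 1e-15):
    print(f"BER {ber:.0e}: {p_clean_read(read, ber):.2%} chance of a clean 100TB")
```

If the drives really erred once per 10^14 bits, a clean 100TB would be very unlikely under this model (well below 1%); at 1 per 10^15 it is not far from a coin flip. So a clean run of this size is at least consistent with the drives doing better than a 10^14 rating.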

#6 HachavBanav

  • Member
  • Group: Member
  • Posts: 237
  • Joined: 03-August 07

Posted 17 November 2009 - 04:22 AM

qasdfdsaq, on Nov 16 2009, 02:17 AM, said:

Many non-enterprise class SATA drives have a rating of 1 sector in 10^15.

You are welcome to keep buying desktop-class drives.

qasdfdsaq, on Nov 16 2009, 02:17 AM, said:

Zero unreadable sectors in over 100TB read.

To me, the UBE/BER issue matters most at REBUILD time, when the HBA reads the WHOLE drive at once.
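
The rebuild case can be sketched the same way. The example below assumes the 16 x 1TB RAID5 from the original post, assumes a rebuild reads all 15 surviving drives in full, and treats bit errors as independent at the rated BER. All three are simplifications, and the function name is illustrative:

```python
import math

def p_rebuild_hits_ure(drives: int, drive_bytes: float, ber: float) -> float:
    """Chance that a RAID5 rebuild meets at least one unreadable sector.
    Assumes every surviving drive is read in full and errors are independent."""
    bits = (drives - 1) * drive_bytes * 8
    return -math.expm1(bits * math.log1p(-ber))

for ber in (1e-14, 1e-15):
    p = p_rebuild_hits_ure(drives=16, drive_bytes=1e12, ber=ber)
    print(f"BER {ber:.0e}: {p:.1%} chance of a read error during rebuild")
```

Taken at face value, a 1-per-10^14 rating gives about a 70% chance of hitting an unreadable sector somewhere in a 15TB rebuild read, and a 1-per-10^15 rating about 11%. Whether real drives actually behave like their rated figure is exactly what is being argued in this thread.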

#7 qasdfdsaq

  • Member
  • Group: Member
  • Posts: 1,174
  • Joined: 29-December 02

Posted 18 November 2009 - 08:21 PM

zpool scrub is a read-verify of the entire drive at once. I do it daily.

You've still offered no explanation.

This post has been edited by qasdfdsaq: 18 November 2009 - 08:22 PM


#8 HachavBanav

  • Member
  • Group: Member
  • Posts: 237
  • Joined: 03-August 07

Posted 19 November 2009 - 04:40 AM

qasdfdsaq, on Nov 16 2009, 02:17 AM, said:

Zero unreadable sectors in over 100TB read.

Can you check the SMART data on your 5x 1.5TB drives and let us know?

#9 TRACKER_MAN

  • Member
  • Group: Member
  • Posts: 28
  • Joined: 12-December 07

Posted 19 November 2009 - 09:25 AM

qasdfdsaq, on Nov 18 2009, 08:21 PM, said:

zpool scrub is a read-verify of the entire drive at once. I do it daily.

You've still offered no explanation.

I can confirm qasdfdsaq's observations with OpenSolaris / zpool scrub. No problems at all.
Main PC: CPU E8400@3Ghz, 2x2GB DDR2 ADATA 1066+, 2xWD5000AAKS@RAID1 on Jmicron 363,
MSI P45 Platinum, VGA ASUS EAH4670 DVDRW NEC ND-3540A
NAS PC: CPU Intel E5200@2.5GHz, 2x2GB DDR2, 4.5TB with Opensolaris 06.2009 ZFS, Chieftec 550W, Thermaltake Matrix.

#10 HachavBanav

  • Member
  • Group: Member
  • Posts: 237
  • Joined: 03-August 07

Posted 19 November 2009 - 11:24 AM

TRACKER_MAN, on Nov 19 2009, 03:25 PM, said:

I can confirm qasdfdsaq's observations with OpenSolaris / zpool scrub. No problems at all.

Well, can you tell us how many sectors the scrubber fixed?
