I'm trying to design a storage system for my business, and while researching I ran across several articles claiming that RAID 5/6 are effectively dead due to large drive capacities. The theory is that if you have several 2 TB drives and one of them fails, there's a 100% chance you'll hit a read error during the rebuild and the entire array will die.
I started reading all these articles from 2007 onward about how rebuilding/resilvering a RAID 5/6 array will assuredly fail once you have 12 TB or more of data. The reasoning is that a drive has an unrecoverable read error (URE) rate of about 1 bad read per 10^14 bits read, and when you do the math, 10^14 bits works out to around 12 TB worth of data. Not a big deal if you're just reading 12 TB off a single drive, since you lose one sector and the drive remaps it and moves on, but if it happens while a degraded array is rebuilding, the controller can't reconstruct that stripe, the rebuild fails, and you lose all data in the array.
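Here's the back-of-the-envelope version of that math I keep running. This is just a sketch: it takes the spec-sheet 1-per-10^14-bits rate at face value and assumes every bit fails independently, which real drives may not honor.

```python
import math

URE_RATE = 1e-14   # spec-sheet odds of an unrecoverable read error, per bit read
TB = 1e12          # decimal terabytes, the way drive vendors count

# 1e14 bits expressed in TB -- this is where the "~12 TB" figure comes from
print(f"one expected URE per {(1 / URE_RATE / 8) / TB:.1f} TB read")    # ~12.5 TB

def p_at_least_one_ure(read_bytes):
    """Chance of hitting at least one URE while reading read_bytes once,
    assuming every bit fails independently at URE_RATE."""
    bits = read_bytes * 8
    # log1p/expm1 keep the arithmetic honest with probabilities this tiny
    return -math.expm1(bits * math.log1p(-URE_RATE))

print(f"P(>=1 URE) reading 12 TB: {p_at_least_one_ure(12 * TB):.0%}")   # ~62%, not 100%
```

Even on the spec's own terms, reading 12 TB gives you roughly a 62% chance of a URE, not a certainty, which is part of why the "guaranteed death" framing smells off to me.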
Here are the articles/discussions I'm referencing:
I primarily use HP products, which means HP SmartArray RAID controllers. The drives have traditionally been 15K 146 GB or 300 GB 2.5" SAS drives in arrays no larger than 8 drives. I've used RAID 5, 6, and 1+0, and I also have hot spares ready to go in some of the arrays. So the largest array I've built has been less than 2 TB total (1.2 or 1.6, maybe?). I run Win2K8 R2 on our servers.
I'm now looking to create something like a 24 TB or larger array. In doing that, I'm wondering what my options are for array portability (in case of server hardware failure) and RAID structure. If I have 12 x 2 TB drives or 8 x 3 TB drives, I'm starting to get nervous that during a rebuild I'll lose everything. The SmartArray controllers are pretty good about letting you move arrays between similar controllers using the same drives. Maybe Windows software RAID (dynamic disks or whatever) would make sense for portability to a different server.
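To put numbers on that nervousness, here's the same sketch applied to the two layouts I'm considering. The assumptions are all mine: spec-sheet URE rate, independent errors, and a rebuild that has to read every surviving drive end to end, which is how I understand a RAID 5 rebuild works.

```python
import math

URE_RATE = 1e-14   # per-bit odds from the datasheet, taken at face value
TB = 1e12

def p_rebuild_hits_ure(surviving_drives, drive_tb):
    """Naive odds that a RAID 5 rebuild sees at least one URE, assuming the
    controller re-reads every surviving drive end to end and that errors
    are independent at the spec-sheet rate."""
    bits = surviving_drives * drive_tb * TB * 8
    return -math.expm1(bits * math.log1p(-URE_RATE))

# 12 x 2 TB RAID 5: one drive dies, 11 x 2 TB = 22 TB has to be read back
print(f"12 x 2 TB RAID 5 rebuild: {p_rebuild_hits_ure(11, 2):.0%}")   # ~83%
# 8 x 3 TB RAID 5: one drive dies, 7 x 3 TB = 21 TB has to be read back
print(f" 8 x 3 TB RAID 5 rebuild: {p_rebuild_hits_ure(7, 3):.0%}")    # ~81%
```

Ugly odds if you take the spec literally, but still not the 100% the articles throw around. And as I understand it, RAID 6 with only one failed drive still has a second parity to fall back on, so a lone URE during that rebuild shouldn't be fatal the way it is on RAID 5.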
I know that RAID is not a backup. That point comes up from time to time, so I just want to address it up front: I'm not planning on using this as a backup, and I'm not discussing anything backup-related in this post.
The arguments I linked to say that because of the URE rate on drives, you're certain to get a failure during a rebuild, and that this is primarily because individual drives have gotten so large. By that token, it seems reading a 2 TB drive 6 times should also hand you a URE, but it wouldn't matter in that case because you're not trying to reconstruct an array from data and parity; you just lose a sector. During a RAID rebuild, that same error kills the whole thing.
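Making that comparison explicit with the same sketch (same assumptions as above, spec-sheet rate and independent errors): the URE exposure is identical whether the 12 TB comes from six passes over one drive or from a rebuild; only the consequence changes.

```python
import math

URE_RATE = 1e-14   # same spec-sheet assumption as above
TB = 1e12

def p_ure(read_bytes):
    """Odds of at least one URE over read_bytes, independent errors assumed."""
    return -math.expm1(read_bytes * 8 * math.log1p(-URE_RATE))

print(f"one pass over a 2 TB drive : {p_ure(2 * TB):.0%}")       # ~15%
print(f"six passes (12 TB total)   : {p_ure(6 * 2 * TB):.0%}")   # ~62%
```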
I get the feeling that the math doesn't match up to reality, but I don't have access to something like that Google hard drive survey to give real-world experience. So all I got is the math to feed my fear.
I figure if anyone has real-world experience with large storage arrays and DIY arrays, it's y'all. I'm not looking into some prepackaged EMC or something where the storage is a giant mystery box that "just works" and I'm not going to be trying to figure out a *nix. I don't have a need for massive performance as I'll be constrained by gigabit Ethernet or some external SAS links (assuming a drive enclosure). My main concerns are:
1) Have people run into these rebuild issues with RAID 5/6 using 1+ TB drives? If so, would some type of RAID 6+0 or sets of mirrors or something mitigate these issues? Do I just need to choose a RAID type that avoids parity as that's where the disaster can happen during rebuild?
2) Rebuilding an array has the same issues that growing an array would have, correct? So if I get 6 TB now and add a few drives at a time, am I courting disaster? I'm not planning on doing this, but from an academic standpoint it seems the conventional wisdom is that growing arrays is bad. I've done it before with nary a problem on my dinky little 146 GB drives, so I don't know if it's purely a size thing?
3) Any suggestions for a 24 TB setup? Or examples of what y'all run? Any reason to go with 2.5" versus 3.5" drives if I decided to do commodity Seagate or WD or Hitachi enterprise drives rather than HP-branded drives? It's just a drive density consideration if I want to save on space, right? Any suggestions on software like FlexRAID? I know it's probably not a first choice for an enterprise-type of software RAID solution but I don't know any other brands/companies off the top of my head for that kind of product except "install Linux and use [random lowercase utility using as few letters as possible]".
I thank you for your time reading this and for any information or suggestions you may have!