Shane GWIT

Seeking input - 6 2TB drives: RAID 5, 5EE, 50, or 6 on Adaptec 5805


Hi, I was hoping someone might have some experience with this card (5805) and various RAID configurations. The server hosting this hardware will primarily be used as a file server, and I have it set up with two Intel EXPI9400PT NICs doing bonding / link aggregation on a gigabit switch. My current plan is RAID 50 or RAID 6, each of which allows for two drive failures (in RAID 50, one in each sub-array). My two goals are high availability and performance; I want the I/O limitation to be the 2 Gbps network connection.
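For reference, here's the back-of-the-envelope capacity and fault-tolerance comparison I'm working from for the six 2 TB drives (just a rough sketch; the layouts and tolerances below are my own assumptions, not anything the card reports):

# Rough capacity / fault-tolerance comparison for six 2 TB drives.
# The layouts and tolerances below are assumptions for illustration only.
DRIVE_TB = 2
N = 6

layouts = {
    "RAID 5":  (N - 1,  "any 1 drive"),
    "RAID 6":  (N - 2,  "any 2 drives"),
    "RAID 50": (N - 2,  "2 drives, at most one per 3-drive sub-array"),
    "RAID 10": (N // 2, "1 drive per mirror pair, up to 3 in total"),
}

for name, (data_drives, tolerates) in layouts.items():
    print(f"{name:8s} usable {data_drives * DRIVE_TB} TB, tolerates {tolerates}")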

Thanks

Shane



Are you looking for drive recommendations? As to that Adaptec card, the 5805 is very well respected and is designed to be used in performance environments.


I already have the drives (six Samsung 2TB Spinpoints); I was looking for input on which RAID level to use. My understanding is that RAID 6 is slower but offers higher availability than the rest. For dishing up files across a network (two bonded gigabit NICs), am I going to notice a performance difference? What level would you use?


I wouldn't use RAID 5 alone with drives of that size. Think about your rebuild time when a disk dies: how long is that window? And during that period you are not safe if another drive breaks.

RAID 6 for me, or RAID 5+1.

Regarding performance, I guess it depends. What is your load pattern? Sequential or random? Read or write? If your disks can fill your network connection, then it's no problem.
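As a rough sanity check on the "fill the network" question (the per-drive number is an assumed ballpark for 2 TB SATA drives, not a measurement):

# Can the array fill a 2 Gbps bonded link? Back-of-the-envelope only;
# the per-drive sequential rate is an assumed ballpark, not a benchmark.
per_drive_mb_s = 100      # assumed sequential throughput of one 2 TB SATA drive
data_drives = 4           # RAID 6 or RAID 50 on six drives leaves ~4 data drives
array_mb_s = per_drive_mb_s * data_drives

link_mb_s = 2 * 1000 / 8  # 2 Gbps is roughly 250 MB/s

print(f"array ~{array_mb_s} MB/s vs link ~{link_mb_s:.0f} MB/s")
print("network is the bottleneck" if array_mb_s > link_mb_s else "disks are the bottleneck")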

//Jan Chu


It's going to see both: sequential access from those who work with Adobe products, and mostly a bunch of small random hits from everyone else.

I'm going to try RAID 6 and see how the performance is with synthetic benchmarks. Are you worried about two simultaneous drive failures? What do you think about RAID 5EE?

Also, if I increase the stripe size to 512 KB or 1024 KB, would that help performance for those working with large files and hinder performance for everyone else?

Are you worried about two simultaneous drive failures?

No - with quality enterprise class drives this would happen less often than getting struck by lightning.

What do you think about RAID 5EE?

With 6 drives I don't think this benefits you. I've never used 5EE, but my understanding is that it's most effective in smaller arrays, since it spreads the hot spare across all the drives and keeps every spindle active rather than leaving one idle.

As to stripe size I'm not sure; that's a bit beyond me. But more likely, users will have other factors slowing them down, like connection speed, that would make any stripe-size gain difficult, if not impossible, for the end user to notice.


Buy whatever's on Adaptec's compatibility list.

Are you worried about two simultaneous drive failures?
No - with quality enterprise class drives this would happen less often than getting struck by lightning.
In that case we've been struck by lightning about every six days where I am, for the past couple of years...

I would run RAID 6, as rebuild times get rather lengthy and the odds of a second disk failing during the rebuild get kinda scary, especially if you've bought all of your drives in the same batch or in the same shipment. Again, while the drives themselves are theoretically reliable, the odds of disks being damaged during handling or shipment (most people tend to buy all their drives in the same order and put them all into the same array) get very scary, quickly.

If you are confident your entire set of disks was never mishandled, then go ahead and run RAID 5. Even then I wouldn't do it: based on a 10^15 bit error rate, odds are good you will encounter an unrecoverable read error at least once while rebuilding a three-1TB-disk RAID 5, and the extra parity of RAID 6 will protect against that.
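To put rough numbers on the unrecoverable-read-error argument (the error-rate specs and the ~10 TB read during a rebuild are assumptions for illustration, not figures from these particular drives):

# Chance of at least one unrecoverable read error (URE) during a rebuild,
# assuming the spec-sheet rates below (one error per N bits read).
# Real-world rates vary; treat this as an illustration only.
def p_ure(tb_read, bits_per_error):
    bits = tb_read * 8e12   # terabytes read during the rebuild, in bits
    return 1 - (1 - 1 / bits_per_error) ** bits

for spec in (1e14, 1e15):
    # A RAID 5 rebuild on 6 x 2 TB drives reads the five survivors, ~10 TB
    print(f"1 error per {spec:.0e} bits: P(URE during rebuild) ~ {p_ure(10, spec):.0%}")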

My understanding is that RAID 6 is slower
RAID 6 is somewhat slower than RAID 5, but with a 5805 and six disks you should have no trouble saturating a 2 Gbps connection.


FWIW, I have seen dual drive failures of Hitachi's 15K300 drives in fairly close succession. At home with more ordinary disks, I've seen enough failures that I wish I'd bought a RAID 6-capable card.

One contributing factor: you say you've got six 2TB Samsung drives. That presumably means you bought them at the same time from the same shop, so they are very likely all from the same manufacturing batch, which increases the chance of them failing together. I've had four drive failures, of which three were from a set of disks I bought at the same time; they had almost sequential serial numbers and failed weeks apart from each other.

I'd definitely pick the RAID 6 solution. However, be aware that it only protects you from drive failures. A friend of mine went to a lot of effort with RAID, only to see a power spike wipe out the whole PC.


For sequential accesses, RAID6 on that card will be just fine. Too many random writes will slow things down, for sure. Small, random writes will give you essentially the IOPS of a single drive with a RAID6 of 6 disks. It would only be a bit better with RAID5, so I think RAID6 is the way to go.
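If you want to put a number on it, the usual back-of-the-envelope is the RAID write penalty (the per-drive IOPS figure below is an assumed ballpark for 7200 rpm SATA, not a measurement):

# Small-random-write IOPS estimate using the classic RAID write penalty:
# RAID 6 turns one host write into 3 reads + 3 writes (penalty 6),
# RAID 5 into 2 reads + 2 writes (penalty 4), RAID 10 into 2 writes.
# The per-drive IOPS figure is an assumed ballpark, not a measurement.
drive_iops = 75
n_drives = 6
penalty = {"RAID 10": 2, "RAID 5": 4, "RAID 6": 6}

for level, p in penalty.items():
    print(f"{level:7s} ~{n_drives * drive_iops // p} small-random-write IOPS")

With six drives that works out to roughly one drive's worth of write IOPS for RAID 6, which is why I say it's essentially a single drive for small random writes.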


It took almost a day to build and verify, but I'm now running RAID 6 on 64-bit Debian 5. Upon installing Adaptec's software, it reported all 6 drives as failing SMART with an error count of 75. I'm assuming that's something erroneous on either Samsung's or Adaptec's side. I'm currently running 3 instances of Bonnie++ from the Debian repository.
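In case anyone wants to reproduce the runs, this is roughly how I'm launching them (the mount point and test size are placeholders for my setup; bonnie++ came from the Debian repository):

# Launch several bonnie++ instances in parallel against the array.
# The mount point, test size, and user below are placeholders; adjust
# the size to roughly twice the installed RAM.
import subprocess

MOUNT = "/mnt/array"   # placeholder mount point for the RAID volume
INSTANCES = 3

procs = [
    subprocess.Popen(
        ["bonnie++", "-d", MOUNT, "-s", "68000", "-u", "root", "-m", f"run{i}"]
    )
    for i in range(INSTANCES)
]
for p in procs:
    p.wait()           # block until every instance has finished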

I plan on using Amanda to back up everything offsite; you can never be too careful.

Thanks for the input; I went from my initial plan of RAID 50 to RAID 6.

Unrelated: I think I'm going to nuke Debian, as most of my experience has been with Red Hat. I only went with it because a friend suggested it, and Fedora Core 12 doesn't recognize any drives / RAID cards. I know I can create a driver disk and use it, but that would involve a floppy, which isn't readily available and doesn't appeal to me. I started using CentOS instead of RHEL recently and will most likely go that route.


Depending on how high you place the priority of high availability/reliability, RAID 1+0 might also be a good choice. RAID 1+0 doesn't have any XOR requirements, so throughput (both in terms of I/Os and read/write speed) is higher than RAID 5 or RAID 6. MTTR is also among the lowest you could hope for, since the array only needs to copy 1 drive's worth of data to a hot spare drive to recover from a failure, so your "window of vulnerability" is very small compared to a RAID 5 or RAID 6 array of the same size. Performance for a degraded array is extremely high, since there is no need for XOR calculations. As far as reliability goes, you can potentially lose up to HALF of the drives in the array while remaining operational.
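To give a feel for how small that window is, here's a rough estimate of the mirror-copy time after a failure (the sustained copy rate is an assumed ballpark, and controllers throttle rebuilds under load, so real times will be longer):

# Rough window-of-vulnerability estimate for RAID 1+0: after a drive fails,
# the controller only copies one mirror's worth of data to the spare.
# The sustained copy rate is an assumed ballpark, not a measured figure.
drive_tb = 2
copy_mb_s = 80   # assumed sustained mirror-copy rate

hours = drive_tb * 1e6 / copy_mb_s / 3600
print(f"RAID 1+0 rebuild: copy {drive_tb} TB at ~{copy_mb_s} MB/s, about {hours:.0f} hours")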

The only real downside is that you need more drives to achieve a given storage capacity. OTOH, the cost of the high performance controller needed for RAID 5 or RAID 6 to have acceptable performance might be comparable to the cost of several drives.


I've been playing around with Bonnie++, and although it is a synthetic benchmark, I'm not happy with the results. Then again, perhaps I should have run multiple copies. Running this one test placed a very high load on the system (>10)... I can only assume the CPU is being hit this hard because it's a synthetic benchmark, since the 5805 has a 1.2 GHz dual-core processor with 512 MB of RAM on it. Write cache is enabled on the card.

Going to run some iozone benchmarks before I nuke everything and start over. I've thought about running RAID 10, at the cost of another drive's worth of capacity, but doesn't that seem like a waste of $$$ in addition to a waste of resources (i.e. the card would barely be utilized)?

Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...
Version 1.03d       ------Sequential Output------ --Sequential Input- --Random-
                    -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
twelve       68000M           111940  48  88708  26           314832  44 110.1   1
twelve,68000M,,,111940,48,88708,26,,,314832,44,110.1,1,,,,,,,,,,,,,
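If it's useful, here's how I'm pulling the interesting fields out of those CSV lines (field positions assumed from bonnie++ 1.03's CSV layout; the blanks are the skipped per-character tests):

# Parse a bonnie++ 1.03 CSV line into the fields I actually look at.
# Field positions are assumed from the 1.03 CSV layout; empty fields are
# tests that were skipped.
line = "twelve,68000M,,,111940,48,88708,26,,,314832,44,110.1,1,,,,,,,,,,,,,"
fields = line.split(",")

print("machine      :", fields[0])
print("file size    :", fields[1])
print("block write  :", fields[4], "K/sec at", fields[5], "% CPU")
print("rewrite      :", fields[6], "K/sec at", fields[7], "% CPU")
print("block read   :", fields[10], "K/sec at", fields[11], "% CPU")
print("random seeks :", fields[12], "/sec at", fields[13], "% CPU")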


A waste of $$$ and resources? If using RAID 1+0 improves your I/O throughput, reduces your window of vulnerability during a rebuild, provides higher reliability (up to 50% simultaneous drive failure while staying operational), costs nothing extra on the controller side (you already purchased the card), performs great whether the array is degraded or not, and your chassis will support the number of drives you need for a RAID 1+0 configuration, then I'd call it a win. It would seem to exceed all your goals and simply needs more drives (which are relatively inexpensive) to do so.

To me, the most important thing should be achieving all the stated goals (I/O, sequential and random read/write throughput, MTTDL, MTTR) for a cost that is reasonable given the improvements (both tangible and intangible) delivered, and thereby a reasonable time to ROI. I wouldn't hesitate for a second to consider different storage configurations in order to do so. Honestly, whether the card is "barely utilized" wouldn't even be a factor to me, because I would consider the cost of the controller negligible compared to the other longer-term costs and benefits.


I'm curious if the RAID rebuild algorithms have been improved. I know that in the past a single read error during a rebuild could cause another drive to go offline. It has always seemed like the sensible thing would be to use the rest of the array to rewrite the sector of the drive that returned the read error. This was one of the features touted in ZFS, so I'm curious whether it has made its way back to hardware RAID cards.


I went with a stripe size of 1024 KB and have been running Bonnie tests over the last few days. I'm not sure it's the most reliable benchmark, but at least I get results that have some relative meaning.



Well, after weeks of synthetic tests and then opening and closing large TIF files over the network, I'm going to stick with RAID 50 with a stripe size of 1024 KB and XFS for the home partition.

My results:

Some things were CPU-bound.

RAID 0 was run as a theoretical max.

I can't find my RAID 6 p=3 results; otherwise, enjoy:

http://www.godswordintime.com/results.php


Well, after weeks of synthetic tests and then opening and closing large TIF files over the network, I'm going to stick with RAID 50 with a stripe size of 1024 KB and XFS for the home partition.

Was RAID 1+0 not tested or considered?


Was RAID 1+0 not tested or considered?

I can't remember, to be honest. For anything with parity involved, it takes a day before I can get real-world results because of the build & verify. I thought I tested RAID 10, but I can't find any CSV files, which means either I tested it and didn't transfer them off the server before starting the next build, or I forgot to test it.

Either way, the server doesn't have to go into production for another week, so I'll put RAID 10 results up there tomorrow. I want to start testing stripe sizes as well, but I figure the larger the better given the density of the drives and platters.

Oh, and the RAID 6 array died (as in no arrays present). I was messing around with ext4 settings, but that should have nothing to do with the controller. Adaptec's official response (after 3 days) was:

Thank you for your message concerning your 5805 controller. I do not show these particular Samsung drive have been tested. You can find a compatibility guide here:

I'm not able to find any information on the Samsung HD203W1 drives on the Samsung site.

I would recommend you update the controller to the latest release of BIOS version 17544 available here

I've updated the BIOS and nothing of the sort has happened since, but it is disconcerting to say the least.

EDIT: Just as an FYI, I have nothing against RAID 1+0; I deployed it on my second production server, which currently has ~2 years of uptime on RHEL 5. I just feel like a 5805 is being wasted doing a simple stripe of mirrors.


