
How to achieve superb write speeds with nForce onboard RAID5


WORKS GREAT! New data on nForce 780i:

Specifically, I was looking to create a 6-drive RAID-0 array of 320 GB per drive. This would give 1920 GB of space, just under the 2 TB limit. The data on these drives is transient and not important - just the speed and capacity are.

Utilizing qasdfdsaq's partition offset suggestion, I went with a start sector of 3072 using Beeblebrox. I know you said that we'd be out of luck for 4- or 6-drive arrays, but I also realize you meant that for RAID-5. My assumption was that a 7-drive RAID-5 and a 6-drive RAID-0 would see the same benefit from using 3072 as the starting sector - mainly because of your stripe-size rule (since I was going to test up to a 128k block and a 128k cluster, I figured 6*128*4 to be safe).
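For anyone who wants to reproduce that arithmetic, here's a minimal sketch in Python (my own illustration, not from the original post) that arrives at the same number: it rounds a minimum start sector up to the next multiple of the full stripe width, assuming 512-byte sectors.

def aligned_start_sector(stripe_kb, data_drives, min_sector=2048):
    # sectors covered by one full stripe across all data drives
    stripe_width = stripe_kb * 1024 // 512 * data_drives
    # round min_sector up to the next multiple of the full stripe width
    return -(-min_sector // stripe_width) * stripe_width

print(aligned_start_sector(128, 6))  # -> 3072 for a 128k stripe over 6 data drives

The same 3072 is also a multiple of the smaller stripe widths (e.g. 32k or 64k across 6 drives), which is why a single offset can serve the whole test matrix.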

From there, I played with the stripe and block size – I created a matrix of cluster & stripe, and recorded the results of every possible combination. Here are the results:

[Image: matrix of measured read/write speeds for every stripe and cluster size combination]

You can see that if you're up for a 128k stripe and 64k cluster, you can achieve 1 GB/s read speeds when accessing files over 1 MB in size! Even higher with one more tweak (more below).

If you look closely at the same 128k+64k table, though, you'll see horrible read speeds with files under 128 KB in size. Which is understandable: the read/write duration is pretty long at 128k+64k, and I'm guessing there is a lot of wasted effort from the drives on small files at these settings.

So, how to determine the best balance? I scoured a Windows install, complete with games, to get a sense of how big and how many files we load on average. I decided to take the following sample file counts and create a read/write time simulator:

20,000 x 4 KB files

5,000 x 16 KB files

10,000 x 128 KB files

4,500 x 1 MB files

Basically, the time simulator estimates how long it would take to read that many files in each size range, adds the times up, and points out the quickest combination on average.
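To make the method concrete, here's a minimal reconstruction of the simulator idea in Python (my own sketch; the speed numbers are placeholders, not the measurements from the tables below):

FILE_MIX = [(20000, 4), (5000, 16), (10000, 128), (4500, 1024)]  # (count, size in KB)

def total_seconds(speed_at_size):
    # speed_at_size maps file size (KB) -> measured read speed (KB/s) at that size
    return sum(count * size / speed_at_size[size] for count, size in FILE_MIX)

# hypothetical measurements for one stripe/cluster combination:
sample = {4: 2000, 16: 8000, 128: 40000, 1024: 200000}
print(round(total_seconds(sample), 1), "seconds for the whole file mix")

Run the same calculation with the measured speeds of every stripe/cluster combination, and the lowest total wins.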

Time simulator matrix:

[Image: time-simulator matrix for each stripe and cluster combination]

And, the final results (lower is better, of course):

[Image: final time-simulator results; lower is better]

So, I was thinking of using 32k/32k as my stripe/cluster, since it had the best average access time for the specified number of files in the time simulator. I then, of course, ran into the Windows limit of a 4k cluster on the boot partition. I tried EVERYTHING I could find to get around it (disk image/restore, etc.), but since I wanted to install the OS on my first/fastest partition, I went with 32k/4k.

Suggestions (some are restated):

1) A 3072-sector offset works. I observed average performance increases of no less than 12% (see below), and I did some pretty extensive and specific testing.

2) Disabling NCQ. I was surprised to find that this helped increase average speeds by up to 9%. Once you've installed your OS, go into your device settings, select each of the three nVidia RAID controllers (Port 0 & 1 tabs) and uncheck "Enable command queuing". Click "No" to the restart prompt until you've done the last one.

3) Contrary to what you might read out there on the subject, disabling your Write Cache WILL cripple your drive speeds. Maybe I’m doing something wrong, but I strongly recommend against disabling Write Cache.

4) Your first partition will be the fastest (closest to the edge of the disk), so if you were really anal, you would create a partition for your games first, and THEN your C: partition.

5) If you're going for pure performance and want to use the full 6 drives in RAID-0, try not to get anything larger than 320 GB - with nForce having only 32-bit LBA addressing, your limit for an array is 2 TB (see the arithmetic after this list). Might as well just go with six 150 GB Raptors ;)

6) My particular time simulator is biased towards equal reads and writes. You may find your usage warrants more emphasis on reads than writes. If you would like, I can send you the time-simulator spreadsheet, and you can edit the number of files in the simulator.
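The arithmetic behind suggestion 5, as a quick Python check (my own addition; drive capacities are decimal GB, as sold):

max_bytes = 2**32 * 512        # 32-bit LBA x 512-byte sectors = 2,199,023,255,552 bytes (2 TiB)
array_bytes = 6 * 320 * 10**9  # six 320 GB drives = 1,920,000,000,000 bytes
print(array_bytes < max_bytes)  # True - just squeezes under the nForce limit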

[Image: benchmark screenshot of the array]

All on an nForce integrated 6-channel SATA RAID-0 array.


Recently I started a new topic on this subject, but thankfully Madwand pointed me to this one, so I will continue my discussion here. Here is what I posted in that topic:

I have used Linux software RAID5 and RAID6 for years and the performance for me is quite acceptable - I currently have 15 TB of Linux software RAID at work using > 50 SATA drives. With Linux software RAID I expect sequential writes to be at minimum faster than the speed of one hard drive. Since the price of drives has gone down to less than $100 for 500 GB and most motherboards have fakeraid5, I decided to try this with new computer builds.

After playing around with this for a few days I am horribly disappointed with the write performance. One example is transferring a 10 GB file over a gigabit network: with a single drive the transfer takes less than 5 minutes, while with a 3- or 4-drive software or fake RAID5 it takes > 25 minutes.

From my Linux experience, the problem appears to be that the cache manager is forcing writes too frequently and the RAID subsystem is not caching stripes, so every stripe is being read before it is written back. With Linux you can lessen the impact of this by increasing the size of the stripe cache; under XP I see no way to do that. I did enable LargeSystemCache, but that appears to help only for the first few seconds, where the gigabit bandwidth holds steady at > 50% before settling down to 3-8%. During all of these tests the CPU usage on the dual-core machine with 4 GB of memory is less than 10%.
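For reference, the Linux stripe cache mentioned above is a sysfs knob. A minimal sketch (assuming an md array named md0 and root privileges; the value is counted in 4 KB pages per member disk, so RAM use is value x 4 KB x number of disks):

# enlarge the RAID5/6 stripe cache so whole stripes stay cached between writes
with open("/sys/block/md0/md/stripe_cache_size", "w") as f:
    f.write("8192")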

And now the progress. Like others, I have had some difficulty aligning partitions in XP. Diskpart's align option does not work under XP, so I had to use the PTEdit method, and I also used Vista. My latest test was with a 4-drive RAID5, and the Vista 2048-sector alignment did not help with the nvraid problem (expected from this thread). So now I am going to use a single drive with XP on it to boot, and do my testing that way with a 3-drive RAID5 config.


Because the Achilles' heel of RAID5 is its write speed, the benefits of alignment are even greater: with proper configuration, write chunks are aligned and sized the same as the stripes, whereas in a casual setup one write can easily overlap two stripes, causing double the workload. The nature of this tweak means random I/O is helped more than sequential operations.
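A tiny illustration of that double-workload point (my own sketch, not from the thread): count how many stripes a single write touches at a given offset.

def stripes_touched(offset_kb, size_kb, stripe_width_kb):
    first = offset_kb // stripe_width_kb
    last = (offset_kb + size_kb - 1) // stripe_width_kb
    return last - first + 1

print(stripes_touched(0, 128, 128))   # aligned write: 1 stripe, one parity update
print(stripes_touched(32, 128, 128))  # misaligned: 2 stripes, two read-modify-write cycles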

On my old 3ware card (9550SX) without any tweaking under Linux I get about 100-120 MB/s writes, whereas with the tweaks it comes close to 300 MB/s. Suffice it to say I was surprised, because most articles I've read claimed gains of 10-15%, not several multiples (although the original poster here does see such gains, IIRC).


With a 32 KB stripe size and a 3-drive RAID5 I see a tremendous improvement in the 10 GB network transfer test; however, it is still a little slower than a single disk. On top of that, it is significantly slower in a 1 GB Windows dd zero-write test (90 sec for the array vs. 15 sec for a single disk).

On a Linux system with the same hardware and software RAID5 using 5 disks I get the following:

# dd if=/dev/zero of=test.test bs=1M count=1000
1000+0 records in
1000+0 records out
1048576000 bytes (1.0 GB) copied, 3.42419 s, 306 MB/s

I think I have spent enough time on this one; I will go with single 750 GB drives over 3 x 500 GB in RAID5 and use the extra drives in my Linux software RAID servers.



Now with a 64 KB cluster I got a 1 GB write with dd in 16 seconds. Not bad (about the speed of one drive), but this eliminates the possibility of using NTFS compression (which requires a cluster size of 4 KB or smaller).


What I meant in that last part was to use selective NTFS folder compression on the lesser-used parts of the filesystem while keeping the rest uncompressed for performance reasons...

With hardware RAID controllers, it shouldn't be much of an issue, as any respectable controller should be able to identify a sequential write, plus it has sufficient cache memory to handle it properly.

I did not react to this before, but I now think that this mechanism is very inefficient (except as marketing).

When you have a large DB running dozens of simultaneous queries, each issuing lots of direct write I/O, I doubt a RAID controller can really delay a write long enough to identify a sequential stream among dozens of other direct write I/Os.

==> Does anybody have real-world measurements of how efficiently a controller can identify a sequential write?


I think this depends on the controller. Some controllers have 1 GB of battery-backed cache, so the delay can be longer. With that said, RAID5 is generally a bad choice for a database server.


Hello guys,

First of all, very nice article, but I think I'm missing something: on a 3-drive (160 GB Seagate) RAID5 array (nForce 570 Ultra) I can only write at an average of ~3 MB/s!

- made my RAID5 stripe size 32k

- did the partition offset (3 partitions: the first shifted to sector 2048 instead of 63, the second booting my XP, and the third holding my data)

- resized my NTFS cluster to 64k

I mean, I see no difference between my old array and this new one using HD Tune...



I just installed a new software RAID5 under XP, and I'm getting about 5 MB/s write speeds. Will this tweak help me? If so, can someone advise me exactly how to move my partitions? I can figure out the other 3 steps (cluster size, stripe width, and the 64K I/O default in XP).

I'm not sure it even matters what my hardware is, but here goes:

AMD 64 x2 4800+ with 2GB RAM

WIN XP SP2 (with raid 5 software hack)

5 x 1 TB SATA Seagate Barracudas (ST31000340AS)

Biostar TA780G M2+ (which I purchased because I thought it supported RAID5)

Also, in the BIOS for the SATA chipset I have about 5 options:

1-Native IDE (current setting)

2-AHCI (won't boot if I select this one)

3-RAID (won't boot if I select this one)

4-IDE->AHCI

5-Legacy IDE

...

WIN XP SP2 (with raid 5 software hack)

...

The Windows XP RAID5 software hack is very low performance. I have had a system running for 4 years with 5 x 250 GB drives and it can't get more than 4-5 MB/s writes. Your best bet for software-only RAID is to see if Ciprico's RAID software works with your motherboard, though you may want to note that they are in bankruptcy and their EULA is just egregious.

-R

This is one of those "Oh my god this is so amazing i gotta tell the whole world" posts...

How do you actually get diskpart to create the proper alignment/offset?

I'm trying to follow the high-level instructions that qasdfdsaq outlined. I have a 5-drive RAID5 (5 x 320 GB drives = 1.16 TB) on an nVidia 590 SB (MCP55) with a 16 KB stripe size, and I'm running XP Pro SP3 (32-bit). When I use diskpart per Microsoft's instructions (http://support.microsoft.com/kb/929491), specifically on the "create partition primary align=1024" command, I get a "The arguments you specified for this command are not valid" error. Additionally, when I try to use the "offset" argument, it only lets me specify the offset in terms of MB instead of KB (and it only allows integer values).

Any guidance anyone could provide on how to apply a partition offset/alignment using diskpart would be greatly appreciated. I can't say I'm advanced enough to handle a manual (hex) partition table edit, but I'm pretty able up until that point. This is a non-bootable NTFS drive where I'd like to set up a single partition. Thanks!
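One thing worth checking (a sketch of my own, using this poster's numbers): when a tool only accepts whole-MB offsets, you can still see which MB values land on a full stripe boundary, since 1 MB is 2048 sectors at 512 bytes each.

from math import lcm  # Python 3.9+

stripe_width = 16 * 1024 // 512 * 4   # 16 KB stripe x 4 data drives = 128 sectors
sectors_per_mb = 2048
print(lcm(stripe_width, sectors_per_mb) // sectors_per_mb)  # -> 1

Here the full stripe width (128 sectors) divides 2048 evenly, so any integer-MB offset would already be stripe-aligned.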

FK


Does anyone know how to align an NTFS partition in XP? Apparently, the "align" function isn't included in diskpart.exe under XP, only in Server 2003 or Vista. ...Or does anyone have a Q&D guide on how to align it with a partition editor? Thanks!
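If you do go the partition-editor route, here's a small Python sketch (my own, not from the thread) to verify the result afterwards: it reads the MBR and prints each primary partition's starting sector. The drive number is an assumption; run it as Administrator.

import struct

with open(r"\\.\PhysicalDrive1", "rb") as disk:  # substitute your array's drive number
    mbr = disk.read(512)

for i in range(4):
    entry = 446 + 16 * i                         # partition table starts at byte 446
    start_lba, size = struct.unpack_from("<II", mbr, entry + 8)
    if size:
        print("partition", i, "starts at sector", start_lba)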

FK


First, thanks for the info ... I was not really specifically interested in RAID5 and nForce, but it helped me understand why a motherboard 2-drive RAID0 had a 31.5 KB offset before the striping (SiS RAID BIOS, 2005) - to help XP get correct alignment, since XP's default partition start of sector 63 sits exactly 31.5 KB in. Reformatting the Vista RAID using XP helped heaps.

Linden


Never did post an ATTO pic. Almost a year later, I'm thinking of getting into SSDs, but dismayed at their price and RAID scaling. This is the same old array, 4k cluster, 32k block:

[Image: ATTO benchmark of the same array, 4k cluster / 32k block]

I imagine you'd only take a 20% hit in performance if you did a RAID-5 instead - but you wouldn't use the 3072 partition starting point. If SSDs don't get better in a year, I think I'm going to hunt down a couple more of these WD SE16 320 GB drives and go for an 8-drive RAID-5.

Any recommendations for a 48-bit LBA controller?



Forgot this old thread was still alive! Anyway, for all those who can't get their partition offset right under XP: there is no real way of moving it to the right place, so you'll have to delete the partition and recreate it with, e.g., a Vista setup disk.

This might see more relevance now with SSDs that use a native 4K sector size, though it confuses me why they emulate 512-byte sectors when Vista was designed from the ground up to be optimized for large-sector (i.e. 4K) drives to begin with.

Anyhow, with the optimizations under Vista making a big difference to performance on both SSDs and RAID through better support for large-sector storage devices, I've finally made the switch to the Broken OS.


qasdfdsaq - many thanks for that initial post! Here's a question for ya: do you think Vista's updated partition handling would perform BETTER than XP with the modified start sector? Or should it, technically, have the same effect? I'm thinking about upgrading to the Broken OS as well, though only for the triple-SLI support. I've got three XFX 8800 GTX cards, but failed to do the research before buying the third (XP only supports two GPUs).


Vista does perform faster with both newer partitions and older (unmodified start) partitions for a number of reasons. For one, it reads and writes in 1-8 MB blocks by default rather than XP's 64 KB, which does wonders for large RAID stripes and often means you don't even have to mess with your cluster size. Personally, I much prefer Vista over XP now that I've made the switch, and I see no compelling reason to go back. Then again, I've got plenty of RAM in my machine to run Vista, and as a result it is usually quite fast and nippy.
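If you want to see that block-size effect on your own array, here's a quick timing sketch in Python (my own, not from the post), comparing sequential writes at 64 KB vs. 4 MB blocks:

import os, time

def write_speed(path, block_kb, total_mb=256):
    block = b"\0" * (block_kb * 1024)
    start = time.time()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reaches the disk
    os.remove(path)
    return total_mb / (time.time() - start)

for kb in (64, 4096):
    print(kb, "KB blocks:", round(write_speed("test.bin", kb)), "MB/s")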

If you're reluctant about the speed of Vista over XP in general use (non-HDD I/O), then you might want to try Windows 7. It is certainly faster and more efficient than Vista and has all the same large-sector optimizations, partition offsets, etc. that Vista does.


Hi!

My config:

MSI P6NGM-FD Mainboard with nForce 630i chipset

3 x 500 GB disks for RAID5

OS: Windows Server 2003 or 2008 (If 2008 has a lot of benefits I would take it.)

I have a few questions:

1) Partition offset: I don't understand what is meant by the first sentence, please explain it to me ("Your partitions on the array must be offset to a common multiple of both the number of drives minus one (e.g. 3 disks * 2 - 1 = 5 partitions?) and the stripe width (with 3 disks it should be 3?)"). At which sector does Windows Server 2003/2008 place the first partition? (See the worked example at the end of this post.)

2) Stripe size: I would take 32k for a 3-disk RAID5.

3) I/O block size: What's the Windows Server 2003/2008 default?

4) Cluster size: I would take 64k.

Thanks and regards from Austria!
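On question 1, a worked example (my own illustration, not from the thread): with 3 disks in RAID5 there are 2 data disks, so a 32k stripe gives a 64 KB (128-sector) full stripe width, and the partition start must be a multiple of that.

data_disks = 3 - 1                            # RAID5 loses one disk's worth to parity
full_stripe = 32 * 1024 // 512 * data_disks   # 128 sectors
for start in (63, 2048):  # Server 2003 default vs. Vista/Server 2008 default
    print(start, "aligned" if start % full_stripe == 0 else "misaligned")

Server 2003 places the first partition at sector 63 (misaligned); Vista and Server 2008 default to sector 2048, which is already a multiple of 128.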

