6_6_6

NCQ: Best Upgrade For a Power User!


Here are some results with NCQ disabled, with two HD_Speed instances running at the 0% and 50% positions (values are MB/s per instance). It really is impossible to say that the Samsung F1 is worse than the Seagate when you look at these. Then again, the older Samsung Spinpoint T166 sure looks bad with smaller blocks.

Samsung F1 1TB:

32k block = 2 x 25MB

64k block = 2 x 24MB

128k block = 2 x 23MB

256k block = 2 x 33MB

512k block = 2 x 12MB

1MB block = 2 x 20MB

Seagate 7200.11 750GB:

32k block = 2 x 18MB

64k block = 2 x 23MB

128k block = 2 x 27MB

256k block = 2 x 26MB

512k block = 2 x 29MB

1MB block = 2 x 31MB

Samsung HD501LJ:

32k block = 2 x 5MB

64k block = 2 x 5MB

128k block = 2 x 6MB

256k block = 2 x 7MB

512k block = 2 x 12MB

1MB block = 2 x 18MB


I just tested the Samsung F1 1TB with an Adaptec AAR-1430SA controller in single drive (JBOD) mode. It behaved exactly as it did with ICH9R and NCQ enabled, i.e. very slow when two HD_Speeds were running. I also tested two Seagate 7200.10 250GB drives and two Samsung HD401LJ drives; they were equally slow in single drive mode and in a RAID0 setting. It looks pretty sad when you get 150MB/s with one HD_Speed and that drops to 2 x 4-7MB/s when you start the second HD_Speed...

So this controller does not magically make any difference to the NCQ performance of these drives. I'll test with 7200.11 drives and the 3ware controller later this week.


Here are the results with NCQ enabled. It sure looks like you really shouldn't use NCQ with the F1 drive. Some Seagate values are my own averages since the values change a lot.

Samsung F1 1TB:

32k block = 2 x 1MB

64k block = 2 x 2MB

128k block = 2 x 4MB

256k block = 2 x 7MB

512k block = 2 x 12MB

1MB block = 2 x 20MB

Seagate 7200.11:

32k block = 2 x 20MB

64k block = 2 x 21MB

128k block = 2 x 22MB

256k block = 2 x 50MB (lots of peaks)

512k block = 2 x 13MB

1MB block = 2 x 45MB

Samsung HD501LJ:

32k block = 2 x 1MB

64k block = 2 x 2MB

128k block = 2 x 4MB

256k block = 2 x 7MB

512k block = 2 x 12MB

1MB block = 2 x 18MB


For comparison, here is a test of an older PATA drive (Maxtor 6Y160P0 , 160 GB)

Mainboard : Gigabyte GA-7VAXP Ultra (VIA KT400, PATA adapter is PDC20276)

Athlon XP2200+ , 1GB DDR

OS: Windows 2008 SP1

instances of HD Speed 1.5.4.72

speeds by hd_speed, last sum by perfmon.exe

256KB blocks

1i (pos 0%): 57 MB/s

2i (0%,50%): 2x4,5 = 9 MB/s

9i (00,10,20,...): 9x1,3 = 11 MB/s

(during the test the system is responsive; programs like calc, notepad, write, paint... load fast, and it almost feels faster than a Q6600 system with a new SATA disk...)

64KB blocks

1: 57 MB/s

2: 2x1,3 = 2,7 MB/s

9: 9x320KB/s = 3 MB/s

512 KB blocks

1: 57 MB/s

2: 2x8 = 16 MB/s

9: 9x2 = 18 MB/s

And a USB drive on the same system:

WD Elements 500 GB USB drive

256KB blocks

1i (pos 0%): 31 MB/s

2i (0%,50%): 2x 9,5 = 19 MB/s

9i (00,10,20,...): 9x2 = 18 MB/s

64KB blocks

1: 30 MB/s

2: 2x10 = 20 MB/s

9: 9x2 = 18 MB/s

512 KB blocks

1: 31 MB/s

2: 2x10,5 = 21 MB/s

9: 9x 2,5 = 22 MB/s
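
The totals above can be put in relation to the single-instance speed with a little arithmetic. Here is a minimal sketch, assuming awk is available; the function name aggregate and its argument order are made up for illustration, and the values must be entered with a decimal point instead of the decimal comma used in the post. Note that the sums in the post come from perfmon, so they can differ slightly from a plain multiplication.

```shell
#!/bin/sh
# aggregate N PER_INSTANCE_MBS SINGLE_MBS
# Prints "<total MB/s> <total as a percentage of the single-instance speed>".
# The function name and arguments are illustrative, not part of HD_Speed.
aggregate() {
    awk -v n="$1" -v each="$2" -v single="$3" \
        'BEGIN { total = n * each; printf "%g %.0f\n", total, 100 * total / single }'
}

# PATA Maxtor, 256KB blocks, 2 instances at 4.5 MB/s each, 57 MB/s alone:
aggregate 2 4.5 57    # prints "9 16"
```

So with two readers the old PATA drive under Windows delivers only about 16% of what it manages with a single reader.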


NOTE about using dd on Linux!

At least some versions of dd (I used "dd (coreutils) 6.12" from RIPLinux v6.3) read the disk from the beginning even when the skip= option is used: dd reads and discards the input until it reaches the specified position.

Here is how to check whether this happens:

Run dd with a larger block size and skip=0, and note the speed.

Then run it with a skip value that should be near the end of the disk.

- If iostat shows the same speed as before, dd is reading from the beginning of the disk.

- Stop dd with Ctrl-C; if it reports 0 bytes read, it was reading from the beginning of the disk and skipping (ignoring) the data.
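
Since skip= counts blocks of the chosen block size rather than gigabytes, picking a value "near the end of the disk" takes a small calculation. A minimal sketch, assuming 1 GB means 1024^3 bytes; the function name skip_blocks and the 140 GB example offset are illustrative, not from the post:

```shell
#!/bin/sh
# skip_blocks OFFSET_GB BS_KB
# Prints the number of BS_KB-sized blocks to pass as skip= so that dd starts
# OFFSET_GB gigabytes into the disk (assumption: 1 GB = 1024 * 1024 KB).
skip_blocks() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

# Example: start 140 GB into a 160 GB disk with 512 KB blocks
# (140 GB is an illustrative offset):
#   dd if=/dev/hde of=/dev/null bs=512k skip="$(skip_blocks 140 512)"
skip_blocks 140 512    # prints 286720
```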

Luckily there is dd_rescue which works as we need here.

dd_rescue 1.14 (note, this is not "GNU ddrescue" !) is present on many linux distributions, like RIPLinux, which I used (very small and can be started from a USB stick)

The command line is:

dd_rescue -q -d -y 0 -s 80G -B 4096 -b 512k /dev/hde /dev/null

/dev/hde - the source disk (the disk we test for read performance)

/dev/null - dd_rescue needs a destination to copy the data to, as it is really a copy tool; /dev/null just "eats" (discards) the data

-s xG - start reading at position x GB from the beginning of the disk, 80G in the command above (easier than skip=, as it takes human-readable values like gigabytes)

-b xk - read size in kilobytes

-q - do not print anything; useful when running in a script, otherwise it can be omitted

-y 0 - turn off syncing of the output; for our purposes syncing just produces a stream of error messages, as /dev/null cannot be synced

-d - turn on direct disk access; otherwise Linux sees a bunch of small sequential reads and merges them into one bigger read, and we want reads exactly as large as we specify

-B 4096 - this is needed, otherwise the -d option complains


Here is a test of the Maxtor 6Y160P0 on the same hardware, under RIPLinux 6.3

The command for test was

dd_rescue -q -d -y 0 -s 80G -B 4096 -b 512k /dev/hde /dev/null

-s xG - start reading at position 80 G from begin of disc

-b xk - read size in kilobytes

64KB

1: 58 MB/s

2: 45 MB/s

9: 41 MB/s

256KB

1: 58 MB/s

2: 47 MB/s

9: 40 MB/s

512KB

1: 58 MB/s

2: 45 MB/s

9: 40 MB/s

And the USB WD Elements 500GB drive

64k:

1: 28 MB/s

2: 27 MB/s

9: 27 MB/s

256KB

1: 31 MB/s

2: 27 MB/s

9: 25 MB/s

512KB

1: 31 MB/s

2: 26 MB/s

9: 25 MB/s

The script to run 9 instances of dd_rescue is:

(I just noticed that no process will read from 0% of the disc; if this bothers you, use "0 1 2 3 4 5 6 7 8" instead of "1 2 3 4 5 6 7 8 9")

#!/bin/sh

for a in 1 2 3 4 5 6 7 8 9
do
    # replace 16 with a value so that <value> * 9 is near the end of the disk:
    # 16 is good for a 160 GB disk, 50 for 500 GB, etc.
    POS=$((16 * a))
    # replace "sdb" with the actual disk:
    dd_rescue -q -d -y 0 -s "${POS}G" -B 4096 -b 64k /dev/sdb /dev/null &
done

echo all running
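
A variant of this script that starts at 0% and works for any number of instances could look like the sketch below. It is only a sketch under assumptions: the function names start_offsets and run_readers and the DRY_RUN switch are made up for illustration, the dd_rescue options are the ones from the command line above, and offsets are rounded down to whole gigabytes. By default it only prints the commands; set DRY_RUN=0 to actually start the readers.

```shell
#!/bin/sh
# start_offsets SIZE_GB N
# Prints N evenly spaced start positions in whole GB: 0, SIZE/N, 2*SIZE/N, ...
start_offsets() {
    size_gb=$1
    n=$2
    i=0
    while [ "$i" -lt "$n" ]; do
        echo $(( size_gb * i / n ))
        i=$(( i + 1 ))
    done
}

# run_readers DISK SIZE_GB N BLOCKSIZE
# Starts one background dd_rescue reader per offset.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
run_readers() {
    disk=$1
    for pos in $(start_offsets "$2" "$3"); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "dd_rescue -q -d -y 0 -s ${pos}G -B 4096 -b $4 $disk /dev/null"
        else
            dd_rescue -q -d -y 0 -s "${pos}G" -B 4096 -b "$4" "$disk" /dev/null &
        fi
    done
}

# Example (dry run): two readers on a 160 GB disk, at 0 GB and 80 GB
run_readers /dev/sdb 160 2 64k
```

With 2 instances this mirrors the 0%/50% HD_Speed test; with 9 instances on a 160 GB disk the offsets come out as 0, 17, 35, ... 142 GB.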

Edited by xerces8


3ware controller with Seagate 7200.10 drives in single drive mode does not make any difference, still 2x4-7MB NCQ enabled or disabled. But this RAID1 result looks very nice, it's like this with 32-512k block sizes, 1MB block size drops speed to 2x 22MB. But with 4 HD_Speeds running, total speed drops to 20MB...

3wareraid1zx9.png

Quote: "But this RAID1 result looks very nice, it's like this with 32-512k block sizes, 1MB block size drops speed to 2x 22MB. But with 4 HD_Speeds running, total speed drops to 20MB..."

Perhaps the controller reads from drive 1 for HD_Speed instance 1 and from drive 2 for HD_Speed instance 2, since the data on a RAID 1 is identical for all drives? For 4 instances, there are not enough drives. But on the other hand, it should not drop to 2x22 @ 1MB block size.

The Linux performance is really astonishing. :blink:

Why isn't that possible with Windows? :(

Quote: "Perhaps the controller reads from drive 1 for HD_Speed instance 1 and from drive 2 for HD_Speed instance 2, since the data on a RAID 1 is identical for all drives? For 4 instances, there are not enough drives."

Yes, I think it does it like you said, with two tests running there was no seek noise. But HD_Speed tests with RAID0 did not look good with those 7200.10 drives... I think Seagate 7200.11 drives will allow nice RAID1 performance even with 4 HD_Speeds running with the 3ware card, I will test that (and RAID0) next week.

The Adaptec AAR-1430 controller does not work like the 3ware and it was much slower (with lots of seek noise) when I did the same 2x HD_speed test with RAID1.

Btw. the 3ware controller lets you switch NCQ on/off via its web interface; no reboot is needed and you can do this during an HD_Speed test. NCQ on/off had no effect on the 7200.10's speeds; I did notice a little change in seek sounds but nothing else.


Sounds as if the 3ware is a really nice controller. I guessed so, but unfortunately it costs a small fortune, especially if you'd like to have more than 4 ports. :rolleyes:

- Considering your 7200.11 tests with all those different block sizes, we can conclude that the 7200.11 is not as fast as 6_6_6 stated (only with 256k blocks), but it's still faster than all the other drives. Right?

Quote: "- Considering your 7200.11 tests with all those different block sizes, we can conclude that the 7200.11 is not as fast as 6_6_6 stated (only with 256k blocks), but it's still faster than all the other drives. Right?"

I think that the 7200.11 is the only one that has good (Windows) performance with NCQ on. But then again, turning NCQ on with the Seagate has no advantage in my system.

I actually did a test and reinstalled the Samsung F1 as my system drive and did the same tests with file copying and starting .divx movies and there was no difference, it was as fast as the Seagate. With 4 HD_Speeds running (with 64k blocks) the Samsung was faster but with 5 or more the Seagate was faster. But with NCQ on, the Samsung was much slower, also in the file copy/divx test.

I'm now testing a program (for Win XP) called eBoostr. With a 768MB system RAM cache it makes a big difference to the way the system responds under heavy disk access. I hoped NCQ could do something like that, but in my system it certainly does not. SSD is the answer...

But I still think that the Seagate 7200.11 is a great hard drive and I will buy the 1TB model soon.


I think you should buy a hard drive if you need one, but don't buy a Seagate or any other drive just because you expect a big jump in performance.

According to these review tests, the Seagate might outperform other 7200 rpm drives at multitasking, but it is still way behind the Raptors.

I don't know why we got different, much worse results for the Raptors in this forum than in the review tests I've just mentioned. It probably depends on some of the settings used for IOMETER during the test.

Also, I think multitasking is OVERRATED for a regular user. I can launch a ton of applications on my Samsung and almost never encounter any kind of performance drop. It only happens when I run TWO HDD-intensive tasks at the same time, like formatting, defragmenting, scandisk, etc. Even when I format/defragment/scan a SINGLE partition, my computer remains snappy.

Anyway, my money was limited, and I thought it was more interesting to upgrade other parts of my computer. I'm replacing my PSU with a quieter one, for instance.

Edited by extrabigmehdi


So could we get a summary of the results so far?

Let me offer what I think I can understand from these posts:

1. The Steelbytes NCQ test set to 256k blocks is clearly not realistic for testing. It is also unrealistic to test with 9 instances, as there are rarely so many simultaneous requests reading big blocks of data in real-world usage.

2. The Seagate 7200.11s do operate amazingly on a 256k block test, but when you move to more real-world block sizes, the performance drops massively. However, it still drops less than on the other drives, so it is still the best overall NCQ solution.

3. The Samsungs are terrible with NCQ; stay away from NCQ on these drives.

4. No one has been able to find a way to force the OS to use 256k block sizes (which, if they could, would be blindingly fast!).

Look, if I have misunderstood any of these, don't sling mud; just provide an updated summary. That's what's important.

LittleJhon.


I think your summary is correct.

There is no way to force an OS to use only one block size, and probably it wouldn't do any good either. ;)

I think if you have an Intel ICHxR, you have two options if you want to avoid trouble:

1. Use IDE mode. No AHCI, no RAID. All drives should work with that. You will not have to deal with any drivers either, Windows XP installation is easy, etc.

2. If you want to use AHCI or RAID (or have to), try a Seagate 7200.11. Considering all posts here, it's probably the best solution. And do not buy a Western Digital AAKS; it's proven to suck with the ICHxR in AHCI/RAID mode. And it sucks a lot. The Samsung F1 seems to be similarly bad. But in IDE mode, both should work.


I tested 7200.11 drives with the 3ware controller. Not much difference from the ICH9R results in single drive mode. RAID1 works fine with 4 HD_Speeds running, with no major speed drop like with the 7200.10. Here are the results of RAID0 with 64k blocks. That drop is the point where I switched NCQ off. So it actually works!

I also noticed that I get about a 5MB better hard drive test result (15MB vs. 20MB) in the eBoostr speed test if NCQ is on. The test reads the most used files in random order. This was with a single 7200.11 drive and ICH9R. Too bad that I have those Samsung drives in my system, so leaving NCQ switched off is still the best choice. Or is there a way to switch NCQ off just for them? HDUtil does not have an option for it.

720011raid0qy6.png

Quote: "That drop is the point where I switched NCQ off. So it actually works!"

Thank you for your extensive tests. :)

They again show that there is something special about the 7200.11 with NCQ enabled, although the performance advantage is not as big as thought in the beginning of this thread.

What do you think? A recommendation for the 7200.11? For those who'd like to know if they should buy it or not?

Quote: "I'll wait for [...] when ssd becomes more affordable."

That's what I do. ;)

Here is a nice article at AnandTech about the newest Intel SSD (really fast and affordable) and the general state of the art of SSDs. It also shows some serious problems with many of today's SSDs, reminding me of my trouble with the WD AAKS @ Intel ICH9R. ;)

The symptoms are pretty obvious: horrible stuttering/pausing/lagging during the use of the drive. The drive still works, it's just that certain accesses can take a long time to complete.

Quite interesting read, IMHO.

Edited by FAT_Punisher


I just picked up a 1.5TB Seagate 7200.11 drive. The results for the drive seem pretty consistent with previous tests. Using 256KB blocks, the average read speed for a single instance was ~115MB/s; adding a second instance caused it to drop to ~100MB/s, and a third instance made little impact.

hdtunebenchmarkseagateswy5.png

hdspeedseagatest3150034hz2.png

perfmonseagatest3150034gq5.png

It's worth noting that the maximum transfer rate registered by HD Tune was only 103MB/s, while HD Speed was getting 120MB/s.


I revise the topic title to "AHCI: Best Upgrade For a Power User".

So far NCQ and AHCI have been used interchangeably, but a distinction must be made.

The improvements are not NCQ-related (see the Linux and OS X tests); they are related to how the Seagate firmware behaves when AHCI is activated under Windows.


Hi everyone,

This thread has piqued my curiosity, so I've made a couple of tests on my system.

These are the relevant hardware specs

Mainboard: ASUS P5B-E (bios 1803)

Processor: intel Core 2 Duo E6600

Chipset: intel P965 and ICH8R

RAM: 2 GiB of GEIL DDR2-800 memory (in dual channel mode)

hard drive: Seagate ST3500320AS (firmware SD15)

And the relevant software specs

OS: Windows Vista Ultimate SP1 (64 bit edition)

drivers: intel Storage Manager 8.5

BIOS settings: AHCI enabled

It's important to take into account the differences between Windows XP and Vista. They are explained here: http://technet.microsoft.com/en-us/magazine/cc162494.aspx

Since the first version of Windows NT, the Memory Manager and the I/O system have limited the amount of data processed by an individual storage I/O request to 64KB. Thus, even if an application issues a much larger I/O request, it's broken into individual requests having a maximum size of 64KB. Each I/O incurs an overhead for transitions to kernel-mode and initiating an I/O transfer on the storage device, so in Windows Vista storage I/O request sizes are no longer capped. Several Windows Vista user-mode components have been modified to take advantage of the support for larger I/Os, including Explorer's copy functionality and the command prompt's Copy command, which now issue 1MB I/Os.
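
To put the 64KB cap described in the quote into numbers, here is a minimal sketch of the arithmetic; the function name io_requests is made up for illustration, and sizes are given in KB:

```shell
#!/bin/sh
# io_requests TRANSFER_KB MAX_KB
# Prints how many storage requests are needed for a transfer of TRANSFER_KB
# when each individual request is capped at MAX_KB (rounded up).
io_requests() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

io_requests 1024 64      # 1MB read with a 64KB cap: prints 16
io_requests 1024 1024    # 1MB read without the cap: prints 1
```

So a single 1MB read turns into 16 separate requests under the 64KB cap, which is why the tests below use 64 kiB blocks to approximate XP and 1 MiB blocks to approximate Vista.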

These are the screenshots of my results:

Using 64 kiB blocks (like Windows XP): 64k.png

Using 1 MiB blocks (like Vista): 1mb.png

I don't have more time right now, but tomorrow I will test my system with AHCI disabled and post the results here.

Regards.


Hi again,

As I promised yesterday, these are the results with AHCI disabled in the bios:

With 64 kiB blocks (like Windows XP):

ide64k.png

With 1 MiB blocks (like Vista)

ide1mb.png

CONCLUSIONS

Using 1 MiB blocks, AHCI manages to obtain a sustained total throughput of around 100 MiB/s with 2 HD_Speed instances. IDE only achieves around 64 MiB/s in those circumstances, so if you're using Vista you will clearly benefit from enabling AHCI.

However, the situation is not as good with Windows XP or an earlier OS. Using 64 kiB blocks, AHCI performs equally, or even a little bit worse than IDE. AHCI struggles to achieve 44 MiB/s, while IDE manages a tiny bit more, 47 MiB/s.
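
Expressed as relative differences, the figures above work out as in this minimal sketch (assuming awk is available; the function name pct_change is made up for illustration):

```shell
#!/bin/sh
# pct_change NEW OLD
# Prints the change from OLD to NEW as a signed, rounded percentage.
pct_change() {
    awk -v new="$1" -v old="$2" \
        'BEGIN { printf "%+.0f\n", 100 * (new - old) / old }'
}

pct_change 100 64    # AHCI vs. IDE with 1 MiB blocks: prints +56
pct_change 44 47     # AHCI vs. IDE with 64 kiB blocks: prints -6
```

That is roughly a 56% gain for AHCI with 1 MiB blocks against a loss of about 6% with 64 kiB blocks.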

EXECUTIVE SUMMARY

Enable AHCI if you're using Vista or a later OS.

Don't bother with it if you're using Windows 2003, XP, 2000 or an earlier OS.

;)


STOP THE PRESSES!!

I have made another test, this time uninstalling the intel Matrix Storage drivers and using the native MSAHCI.SYS driver. Take a look at the results! I didn't expect this!!

With 64 kiB blocks like XP:

msahci64k.png

With 1 MiB blocks like Vista:

msahci1mb.png

The native Microsoft AHCI driver works better than the iaStor 8.5 driver!!! And quite a bit better I must say!!

I'm completely surprised by this, I was expecting just the opposite!

Does anyone have answers for this? Did intel screw up their drivers or what?? What's going on here?!?

Regards.

Edited by Leolo

Quote: "The native Microsoft AHCI driver works better than the iaStor 8.5 driver!!! And quite a bit better I must say!!

I'm completely surprised by this, I was expecting just the opposite!

Does anyone have answers for this? Did intel screw up their drivers or what?? What's going on here?!?"

The MS driver (stock with 2008 Server Enterprise x64) did not work well for me with AHCI. No speed difference from AHCI off. Only the Intel driver worked.
