Eugene

TCQ, RAID, SCSI, and SATA

StorageReview takes the first of several comprehensive looks at tagged command queuing and its implications for the desktop as well as the server world. Does TCQ rate? How does it mesh with RAID arrays? Is SATA TCQ as effective as SCSI TCQ? Find the answers to these questions and more in SR's latest!

TCQ, RAID, SCSI, and SATA

In a single-user environment, would a task such as copying or moving 20,000 files (say, 20 GB total) all at once benefit from command queuing?

No. OS-level caching strategies dominate in such a scenario... TCQ exerts little effect.

StorageReview takes the first of several comprehensive looks at tagged command queuing and its implications for the desktop as well as the server world. Does TCQ rate? How does it mesh with RAID arrays? Is SATA TCQ as effective as SCSI TCQ? Find the answers to these questions and more in SR's latest!

TCQ, RAID, SCSI, and SATA

I started reading your article and after a few lines I have to disagree!

TCQ (the 'parallel ATA tagged command queueing') does _not_ need a special controller! It requires a TCQ capable hard disc and a TCQ capable (SW) driver. That's a very big difference from NCQ (SATA native command queueing), where the drive and the controller interact without SW assistance. With NCQ you can start a bunch of DMAs (with different addresses and lengths), and the controller and the drive will serve these requests until completion, which is signalled by an interrupt. With TCQ you can issue several read/write commands to the drive and, when the drive has completed one request, you 'arm' the controller with the corresponding address and length. The CPU still has to do something for each of the issued commands.

That's my understanding and in my opinion the biggest advantage of NCQ!

Correct me if I'm wrong...

- berndl
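
The host-side difference described in this post can be sketched as a toy model. This is only an illustration of the claim as stated above, not of any real controller or driver: the function names, the "host step" metric, and the sample addresses and lengths are all assumptions made for the example.

```python
def tcq_host_steps(requests, completion_order):
    """Legacy ATA TCQ as described in the post: the host issues each
    command itself and arms the controller once per completed request."""
    steps = 0
    for _ in requests:
        steps += 1                        # issue one tagged read/write command
    for tag in completion_order:          # the drive chooses the service order
        _addr, _length = requests[tag]
        steps += 1                        # host arms the controller with addr/length
    return steps


def ncq_host_steps(requests, completion_order):
    """NCQ as described in the post: the host hands over a batch of DMA
    descriptors once, then waits for the completion interrupt."""
    steps = 1                             # queue the whole batch of descriptors
    for tag in completion_order:
        _addr, _length = requests[tag]    # moved by controller and drive alone
    steps += 1                            # service the completion interrupt
    return steps


# Hypothetical (address, length) pairs and a hypothetical completion order.
requests = [(0x1000, 8), (0x9000, 16), (0x4000, 8), (0x2000, 32)]
order = [0, 3, 2, 1]

print("TCQ host steps:", tcq_host_steps(requests, order))  # 8
print("NCQ host steps:", ncq_host_steps(requests, order))  # 2
```

In this model the TCQ cost grows with the number of queued commands while the NCQ cost stays constant. Whether that difference matters in practice is exactly what the replies below dispute.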

TCQ (the 'parallel ATA tagged command queueing') does _not_ need a special controller! It requires a TCQ capable hard disc and a TCQ capable (SW) driver.

Considering that the driver comes with the controller, I think this is a rather silly point, no offence. Why bother splitting this hair? In all practicality you have to buy a controller that supports TCQ.

Besides, SR notes that the differences between the S150 TX4 and the TX4200 lie only in the firmware. This implies to me that NCQ can be enabled in firmware/drivers as well.

As for your performance concerns, I think they are unwarranted.

1. ATA TCQ uses DMA to return the answered requests, just like NCQ, so the CPU utilization isn't going to be different.

2. If you're taking issue with managing the registers for the commands and tags, or with the overhead of the actual tagging, I don't think there is enough information available to distinguish between ATA TCQ and NCQ. Is your concern that ATA TCQ implementations use main memory space for the registers to store the commands instead of registers on the controller? I think it is likely that the implementation of NCQ on the TX4200 is just as 'software' an implementation as ATA TCQ is on the same controller.


Hello,

I was wondering if there would be a PDF version available for this article, specifically the RAID article. I found it very informative but I would definitely like to keep a hardcopy of it to look through without having to go link by link.


It is interesting that TCQ seems to be a disadvantage at low queue depths. Perhaps the optimal solution for TCQ-enabled drives would be to have the controller monitor the queue depth and serve requests on a first-in, first-out basis until the queue depth justifies turning on TCQ.

Are there any good utilities to monitor queue depth during real-world usage? It would seem that TCQ is a no-brainer on a server with a large number of users, but what about a server with only 10 users? If performance is of the utmost importance, it might be better in this case to choose some Raptors with a non-TCQ controller over a SCSI solution. The Win2k performance monitor has a counter called Current Disk Queue Length. Is that an accurate representation?

This was a good article - not many people ask these kinds of questions.
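
The adaptive policy suggested in this post (plain FIFO until the queue is deep enough to justify reordering) could look roughly like the sketch below. It is a simplified model, not how any shipping controller works: the threshold of 4, the nearest-LBA-first reordering rule, and the sample LBAs are assumptions chosen for illustration, and all requests are treated as already outstanding.

```python
def pick_next(pending, head_pos, reorder_threshold=4):
    """Return the index of the request to serve next.

    `pending` holds LBAs in arrival order. Below the (hypothetical)
    threshold the oldest request wins; at or above it, the request
    nearest to the current head position wins.
    """
    if len(pending) < reorder_threshold:
        return 0                                   # shallow queue: plain FIFO
    return min(range(len(pending)),
               key=lambda i: abs(pending[i] - head_pos))


def service_order(arrivals, head_pos=0, reorder_threshold=4):
    """Drain a batch of outstanding requests and return the service order."""
    pending = list(arrivals)
    order = []
    while pending:
        i = pick_next(pending, head_pos, reorder_threshold)
        head_pos = pending.pop(i)                  # head ends up at the served LBA
        order.append(head_pos)
    return order


arrivals = [500, 100, 900, 120, 880, 110]          # hypothetical LBAs
print(service_order(arrivals, reorder_threshold=4))
# [100, 110, 120, 500, 900, 880]: reordered while deep, FIFO once shallow
print(service_order(arrivals, reorder_threshold=100))
# [500, 100, 900, 120, 880, 110]: never deep enough, pure FIFO
```

The right threshold would have to come from measurements such as those in the article; the value used here is arbitrary.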


The obvious followup question is:

When will the WD740 article/benchmarks be updated with the new (MUCH) slower results? I seem to recall something like this before with WD. Is this right or am I recalling incorrectly?

Does StorageReview update this or continue with the vendor-supplied drives?


I'd be very interested in StorageReview's take on one of the 3ware 9000 series SATA controllers. Have you tried to get your hands on one of these beasts?

When will the WD740 article/benchmarks be updated with the new (MUCH) slower results?  I seem to recall something like this before with WD.  Is this right or am I recalling incorrectly?

A quick check of the SR performance database should show that there are currently two sets of results for the WD740GD. Results from the 00FLA1 version will eventually become the only results that exist in the performance database. Would immediate deletion of the full set of old results have been in the best interest of readers? How else would readers be able to assess the "MUCH" slower results for themselves? Hell, if I was that paranoid I'd assume immediate deletion of the old results was a coverup. :P

I'd be very interested in StorageReview's take on one of the 3ware 9000 series SATA controllers. Have you tried to get your hands on one of these beasts?

I hate to metoo, but I really would like to see a comparison of the 3Ware 9500S-8 and the RAIDCore BC4852. Based on the Tom's Hardware review, it seems like the RAIDCore RC4852 is the one to beat for RAID5. Unfortunately, he used an older 3Ware card.

I've searched for the BC4852 but can't seem to find it anywhere. It's the successor to the recalled RC4852 that had a flaw in the Marvell chip.

And while we're wishing, I'd like to see the comparison on a 32-bit PCI card... I'm building an AMD64-based 2 TB RAID5 array...


You make a persuasive case that TCQ/NCQ will hurt performance in single-user environments. There were some odd numbers in this review of Intel's Grantsdale 925X chipset. It looks as though everything except NCQ remained constant, and NCQ improved STR and seek times, at least in IOMeter.


Those results from the Tech Report's NCQ test seem quite interesting. They leave me wondering again what the main differences between NCQ and TCQ are... I suppose NCQ is also available on the cheaper 915x, not just the 925.

Hello,

  I was wondering if there would be a PDF version available for this article, specifically the RAID article. I found it very informative but I would definitely like to keep a hardcopy of it to look through without having to go link by link.

Eugene: You could do like Ars Technica and make PDF versions available to paid subscribers.

In fact, it is evident that the very maturity and consistent across-the-board implementation of TCQ in the SCSI world is actually one of the factors that cause mechanically superior SCSI drives to stumble in single-user scenarios.

How exactly is this evident?

Why is it TCQ and not read-ahead/write-back strategies?

Accordingly, real LBA sector addresses are requested throughout the tests. Higher-capacity drives and arrays will by definition cram these locations closer together physically and thus enjoy an advantage (as they do in actual use).

Exactly how many MB/GB do these tests span?

And last but not least, isn't it quite likely that WD will improve the TCQ code so that the drives at least perform on par with the non-TCQ 'protocol'?

TCQ (the 'parallel ATA tagged command queueing') does _not_ need a special controller! It requires a TCQ capable hard disc and a TCQ capable (SW) driver.

Considering that the driver comes with the controller, I think this is a rather silly point, no offence. Why bother splitting this hair? In all practicality you have to buy a controller that supports TCQ.

Besides, SR notes that the differences between the S150 TX4 and the TX4200 lie only in the firmware. This implies to me that NCQ can be enabled in firmware/drivers as well.

As for your performance concerns, I think they are unwarranted.

1. ATA TCQ uses DMA to return the answered requests, just like NCQ, so the CPU utilization isn't going to be different.

2. If you're taking issue with managing the registers for the commands and tags, or with the overhead of the actual tagging, I don't think there is enough information available to distinguish between ATA TCQ and NCQ. Is your concern that ATA TCQ implementations use main memory space for the registers to store the commands instead of registers on the controller? I think it is likely that the implementation of NCQ on the TX4200 is just as 'software' an implementation as ATA TCQ is on the same controller.

I still disagree! Again, TCQ is a matter of drives supporting it and a TCQ-capable driver. NCQ requires an NCQ-capable drive (basically a TCQ-capable drive), an NCQ-capable controller, and an NCQ-capable driver. TCQ does not use DMA to inform the host CPU about a served request; that is a task for the host CPU (polling). NCQ is much smarter, since you just load a bunch of commands into the controller (to be precise: into host memory, as a command chain) and wait for completion (indicated by an interrupt). You can entirely offload the processing of the command chain, in contrast to TCQ, where the host CPU has to find out which request can be served and then arm the DMA engine of the controller.

just my 2 cents...

- berndl


I have not seen any evidence in the article that command queuing on SCSI drives hurts performance. It's pretty clear however that TCQ sucks. Performance decreases at low to medium queue depths and only improves at high queue depths. The NCQ implementation of the new Intel chipsets and the Maxtor MaXLine III does not have this behaviour according to benchmarks by Tech Report. This same combination is performing very well in AnandTech's review of the MaXLine III.

It's very unlikely that TCQ will be faster in real-world circumstances. I have some RankDisk server tests in my storage suite with high average queue depths of 30 to 45 outstanding I/Os. There's not a single test where the Pacific Digital Talon ZL4-150 with TCQ enabled is faster than the ZL4-150 with TCQ disabled. The difference is smaller than in the desktop tests, though. The TCQ performance of the Silicon Image Sil 3124 is even worse than that of the Talon ZL4-150.
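
For readers wondering why queue depth matters at all here, the following sketch shows the textbook expectation that reordering saves more head movement as the queue gets deeper. It is a deliberately crude model and makes several assumptions: the drive always serves the nearest of its outstanding requests, only seek distance counts (rotational latency and command overhead are ignored), and the workload is uniformly random over an arbitrary 1,000,000-LBA range. Because overhead is ignored, it cannot reproduce the low-queue-depth penalty discussed above; it only illustrates the upside that queuing is supposed to deliver.

```python
import random


def head_travel(lbas, queue_depth):
    """Total head movement when the drive may serve the nearest of up to
    `queue_depth` outstanding requests (depth 1 is plain FIFO)."""
    stream = iter(lbas)
    pending = []
    head = 0
    travel = 0
    while True:
        # Keep the queue topped up to the requested depth.
        while len(pending) < queue_depth:
            nxt = next(stream, None)
            if nxt is None:
                break
            pending.append(nxt)
        if not pending:
            return travel
        target = min(pending, key=lambda lba: abs(lba - head))
        pending.remove(target)
        travel += abs(target - head)
        head = target


rng = random.Random(0)                    # fixed seed so runs are repeatable
workload = [rng.randrange(1_000_000) for _ in range(2_000)]
for depth in (1, 2, 4, 8, 16, 32):
    print(f"queue depth {depth:2d}: head travel {head_travel(workload, depth):,}")
```

On a tiny hand-checkable input, `head_travel([10, 90, 20, 80], 1)` gives 220 and `head_travel([10, 90, 20, 80], 4)` gives 90.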

So, if it is largely TCQ that hurts SCSI drives in single-user situations, is it possible to disable the functionality?

You can disable TQ in your device manager, under disk drives. If you have XP, find the drive and double-click on it. The option is on the "SCSI Properties" tab.

I was going to ask Eugene then, SHOULD we disable TCQ in a single-user environment because it is that significant? (I would say 5% or more improvement with it off.) Maybe he said in the article but I missed that conclusion.

You can disable TQ in your device manager, under disk drives.  If you have XP, find the drive and double-click on it.  The option is on the "SCSI Properties" tab.

unfortunately, eugene found this did not actually work in testing. it stays enabled on the ata tcq-enabled adapters.

On a less relevant note, I wonder if MSFT's bootvis could cram more I/Os per second into that bootup benchmark :D

quote microsoft: Please note that Bootvis.exe is not a tool that will improve boot/resume performance for end users. Contrary to some published reports, Bootvis.exe cannot reduce or alter a system's boot or resume performance. The boot optimization routines invoked by Bootvis.exe are built into Windows XP. These routines run automatically at pre-determined times as part of the normal operation of the operating system.

unfortunately, eugene found this did not actually work in testing.  it stays enabled on the ata tcq-enabled adapters.

Yes, what honold said. If only it were this simple. It is precisely because it isn't that simple that such TCQ vs. non-TCQ benchmarks have been so long in coming.

It appears the only way to disable TCQ in a SCSI subsystem is through custom, non-supported firmware flashes. I'm sure they exist. I might be able to get one from SOME manufacturer purely for test purposes. But I fear for my inbox if it became clear that I had such a thing. :P


Sorry for interrupting the discussion. Does anyone know if I could get a TCQ-supported SATA RAID controller NOW from anyone other than Promise Technology? If not, will it be possible to just update the controller's drivers instead of making hardware changes to enable it? Thank you.
