RedHickey

Performance Ceiling


I have 2 SATA Seagate ST3808110AS drives on XP, and if I run a process that does 1-megabyte random reads from a very large file, I get, at best, 40 reads per second (OK, by the specs that's reasonable). But if I try two instances of the program, one on the C drive and one on D (both the same type of drive), I get only the same cumulative performance level. In other words, each is cut in half when running alongside the other process. As soon as its partner stops reading, the performance picks up, so the sum is still about 40 reads per second. I used CreateFile and made sure the system buffer pool is not used.

If avg seek time is .0129 sec, avg latency is .00416 sec, and the transfer time for 1 megabyte is about .0033 sec, then 1/(.0129 + .00416 + .0033) gives about 49 per sec, and that's reasonable. But why the ceiling on performance for the pair? Looking at Device Manager, DMA is allowed if available for both, but the current transfer mode is set to Ultra DMA Mode 2 on one and Ultra DMA Mode 5 on the other (if that matters). Any ideas?
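In case it helps, here is a rough sketch of the kind of test I mean (this is not my real program; the file path, file size, and read count are just placeholders): 1 MB random reads through CreateFile with FILE_FLAG_NO_BUFFERING so the system buffer pool is bypassed.

// Sketch only: 1 MB unbuffered random reads, timed to get reads/sec.
// Placeholders: test file path, file size, number of reads.
#include <windows.h>
#include <cstdio>
#include <cstdlib>

int main()
{
    const wchar_t* path     = L"C:\\bigfile.dat";              // placeholder test file
    const LONGLONG fileSize = 8LL * 1024 * 1024 * 1024;        // placeholder: 8 GB file
    const DWORD    recSize  = 1024 * 1024;                     // 1 MB records
    const int      reads    = 400;                             // number of random reads to time

    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
    if (h == INVALID_HANDLE_VALUE)
    {
        fprintf(stderr, "open failed: %lu\n", GetLastError());
        return 1;
    }

    // FILE_FLAG_NO_BUFFERING requires sector-aligned buffers, offsets, and lengths;
    // a page-aligned VirtualAlloc block and 1 MB record-aligned offsets satisfy that.
    void* buf = VirtualAlloc(NULL, recSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);

    LARGE_INTEGER freq, t0, t1;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t0);

    const LONGLONG slots = fileSize / recSize;                 // number of 1 MB records in the file
    for (int i = 0; i < reads; ++i)
    {
        LARGE_INTEGER pos;
        pos.QuadPart = (LONGLONG)(rand() % slots) * recSize;   // record-aligned random offset
        SetFilePointerEx(h, pos, NULL, FILE_BEGIN);

        DWORD got = 0;
        if (!ReadFile(h, buf, recSize, &got, NULL) || got != recSize)
        {
            fprintf(stderr, "read failed at %lld\n", pos.QuadPart);
            break;
        }
    }

    QueryPerformanceCounter(&t1);
    double secs = (double)(t1.QuadPart - t0.QuadPart) / (double)freq.QuadPart;
    printf("%d reads in %.2f s = %.1f reads/sec (= MB/sec at 1 MB per read)\n",
           reads, secs, reads / secs);

    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
    return 0;
}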


I am using pure SATA connectors for C and D. There is an IDE connector, and I have two DVD drives connected to it, so I am assuming it's not germane. There are no non-standard connectors involved, just plain SATA connectors to the motherboard. Device Manager says I have an Intel 82801GB Serial ATA Storage Controller 27c0, location PCI bus 0, device 31, function 2. The other disk shows an 82801GB Ultra ATA Storage Controller 27df, location PCI bus 0, device 31, function 1.

Both disks have driver version 7.0.0.1014, driver provider Intel, date 1/19/2005.

Resources for Storage Controller 27c0 are I/O ranges fe00-fe07, fe10-fe13, fe20-fe27, fe30-fe33, and fea0-feaf, with IRQ 20. The other disk (controller 27df) has resources I/O range ffa0-ffaf.

In my ignorance about this, I have probably wasted your time with this tedious recitation, but maybe there's a clue in there somewhere.


I will try the new driver. This non-additivity does not happen on our more modern Windows server machines, but it happens on a lot of non-server desktops around here. Thanks a lot for listening to the story! I will tell you if the new driver helps. I am surprised that a bus limit of 40 MB/sec could exist, so it must be some setting or driver issue, but it's not uncommon around this shop.


That may be due to the fact that 32-bit XP uses the SCSI minidriver and Windows Server 2003 uses the native I/O interface (I forget what it's called).

Frank


I have a modern machine to test on now: a 2003 R2 server, 2 dual-core 3.2 GHz processors, and 4 disks, 2 of which are very hot 10,000 RPM WD Raptors. The throughput does add nicely up until the fourth disk; it tops out at 183 MB/sec. This is in terms of random reads, 1 MB at a time. You hit some ceiling at less than 200 MB/sec. Am I bumping against the bus speed here, or what? The 3.2 GHz cycles per second times 8 bytes per cycle gives a naive limit of 3200 * 8 = 25,600 MB/sec, so that limit is way, way too high. What do you think is the source of the limit I am seeing at around 200 MB/sec? The disks are SATA disks. Would SCSI disks be any better? Somehow I think this 200 MB/sec limit is more fundamental than that.

http://developer.intel.com/technology/pcie...sPCIExpress.pdf

Look at page 7. Maybe 200 MB/sec is all that PCIe can deliver. That's too bad if it's true.

What are you talking about? You have some crazy math, I must say. I get close to 800 MB/sec on my RAID, so it's obviously not PCIe. It's your controller. Are you using onboard or a crappy/cheap PCIe RAID controller?


The 200 MB/sec is attained just by plugging the disks into the SATA motherboard connectors, without using a RAID controller of any kind. As I add disks, with reads running concurrently, I get cumulative addition of the number of reads per second, until I top out at 200 MB/sec as some kind of ceiling (with 4 disks being read). This is a brand new Dell server, 2003 R2. Device Manager says, I think, that I have a PCIe bus. The next phase of the testing will be to use a RAID controller and plug the disks into that instead of directly into the motherboard. The first step will be to leave RAID0 turned off and just pass the disks through the controller; let's see what kind of throughput I get with that. Then I will turn on the RAID0 functionality for the 4 disks, with a stripe size as close to 1/4 MB as I can, and see if a single thread running on a RAID0 concatenation of the four volumes can get close to the throughput I get now with 4 threads on 4 separate volumes (that is, 200 MB/sec). I am not too happy with the 200 MB/sec cap either. I hoped for more, but that's all there is. I found a few URLs that hinted that that's all I will get through PCIe, but nobody would be more pleased than I if that is wrong.

I am talking about a large disk file with the read locations arranged so that no duplicate read addresses occur. On top of that, my program uses CreateFile with an option that does not use any buffering, so the system buffer pool is deliberately kept from cheating. I want real I/O, not polluted by any fortunate cache hits from the system buffer pool.

I forgot to add that the record size is 1 MB, so the 200 random reads per second comes to 200 MB/sec.
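To be concrete about the "reads running concurrently" part, this is roughly the shape of the test (again a sketch, not my actual code; the file paths, file size, and duration are placeholders, and std::thread is used only to keep the sketch short): one reader thread per drive, each doing unbuffered 1 MB random reads from its own large file, with the per-thread read counts summed for the aggregate MB/sec.

// Sketch only: one reader thread per drive, unbuffered 1 MB random reads,
// aggregate throughput = total reads / elapsed seconds.
// Placeholders: file paths, file size, test duration.
#include <windows.h>
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

static const DWORD    kRecSize  = 1024 * 1024;               // 1 MB records
static const LONGLONG kFileSize = 8LL * 1024 * 1024 * 1024;  // placeholder: 8 GB per file
static const int      kSeconds  = 10;                        // placeholder test duration

static void reader(const wchar_t* path, std::atomic<long long>* reads,
                   std::atomic<bool>* stop)
{
    HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_FLAG_NO_BUFFERING, NULL);
    if (h == INVALID_HANDLE_VALUE) return;

    // Alignment for FILE_FLAG_NO_BUFFERING: page-aligned buffer, 1 MB-aligned offsets.
    void* buf = VirtualAlloc(NULL, kRecSize, MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    const LONGLONG slots = kFileSize / kRecSize;
    unsigned int seed = GetCurrentThreadId();                 // cheap per-thread PRNG seed

    while (!stop->load())
    {
        seed = seed * 1664525u + 1013904223u;                 // simple LCG, good enough here
        LARGE_INTEGER pos;
        pos.QuadPart = (LONGLONG)(seed % slots) * kRecSize;   // record-aligned random offset
        SetFilePointerEx(h, pos, NULL, FILE_BEGIN);

        DWORD got = 0;
        if (!ReadFile(h, buf, kRecSize, &got, NULL) || got != kRecSize) break;
        reads->fetch_add(1);
    }
    VirtualFree(buf, 0, MEM_RELEASE);
    CloseHandle(h);
}

int main()
{
    // Placeholder: one large test file per physical drive.
    const wchar_t* files[] = { L"C:\\test.dat", L"D:\\test.dat",
                               L"E:\\test.dat", L"F:\\test.dat" };

    std::atomic<long long> total(0);
    std::atomic<bool> stop(false);
    std::vector<std::thread> workers;
    for (const wchar_t* f : files)
        workers.emplace_back(reader, f, &total, &stop);

    std::this_thread::sleep_for(std::chrono::seconds(kSeconds));
    stop = true;
    for (std::thread& t : workers) t.join();

    printf("%lld reads in %d s = %.1f MB/sec aggregate (1 MB per read)\n",
           total.load(), kSeconds, (double)total.load() / kSeconds);
    return 0;
}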


I am hoping that when one naively plugs the SATA disks into the bays that come with the computer, there may be some PCIe-to-PCI-X bridge that causes a performance cap, and that by using the RAID card, and plugging the disks into that, even without RAID0 actually turned on, I may see the 200 MB/sec ceiling removed. That is a hope. In a day or so I will test this and see. With four WD1500ADFD 10,000 RPM disks, each of which I actually measured to deliver about 60 MB/sec (it can do 60 one-meg random reads per second), let's see if four of them working concurrently can deliver 4 * 60 = 240 MB/sec, or whether I will still see a cap at 200.

I am really new at the disk-performance-measuring game, but I am a very experienced hand at C/C++ programming, and I am confident that what I am reporting to you is correct. My ignorance about the details of disk performance is vast. I know RAID controllers promise throughput levels like 1.5 GB/sec. I am trying to see if I can really get this level of performance. Somewhere there is a PCIe bus limit. Reading the online literature, it's not too easy to get an unambiguous answer if one does not know the buzzwords, but I only believe what I can measure anyway.


I have a Dell 690 to test on now. The SATA controller Dell supplies will max out at 200 MB/sec throughput on random I/O, one MB per record. I have 4 WD1500ADFD 10,000 RPM disks that will each do 60 MB/sec, but with four threads and 4 files, one on each disk, the controller will max out at 200. BUT if you put in another controller, even a cheap one like a Promise Technology FastTrak TX4310, not using its RAID but only using it as another SATA controller, and put 2 of the disks on it and leave 2 on the Dell SATA, I can break the 200 MB/s barrier. 2 disks on the Dell can give 120 MB/s, and 2 disks on the TX4310 can give 114.8 (not as good as the Dell SATA), but the sum of all 4, 2 on each controller, gives 232.5 MB/s. Both controllers are plugged into PCIe x4 slots. I wonder, if I put another SATA controller in a remaining x4 slot, would I get a limit in the 6 * 60 MB/s range with 6 disks? Don't know. Soon I will have a Dell with an x8 slot to test. I wonder if a single SATA controller plugged into the x8 slot, with 6 SATA disks, will hit limits before a configuration with 3 of the PCIe x4 controllers, each with 2 disks on it. I will try the ARC-1220 with 6 SATA WD1500ADFD disks connected, and I hope I can get 60 * 6 MB/s throughput. Then I will try turning on RAID0.

Seek time is 4.6 ms, latency is 3 ms, and the transfer rate is about 120 MB/sec. So with 6 disks, to read a single 1 MB record, I will need 1 seek and 1 latency plus the transfer time. For one disk, the transfer time for a 1 MB record would be 1/120 = 8.3 ms, but with 6 of them running concurrently the transfer time, I hope, would be 8.3/6 = 1.4 ms. So my 1 MB record would take (4.6 + 3 + 1.4) ms = 9 ms, and throughput on a single-thread RAID0 read would be 1/.009 = 111 MB/sec. Anybody have any predictions about the actual performance I will see?

I did some RAID0 testing with the x4 Dell controller, and the formula I used above was correct: 1 seek + 1 latency + transfer time / 4 was about right for a RAID0 1 MB random read. Using this formula, with a single-thread RAID0 disk array, you can cut the transfer time down the more disks you add, but you are still stuck with the seek and latency; only the transfers happen concurrently.
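For anyone who wants to play with that model, here it is as a throw-away calculator (just the arithmetic above, nothing measured): one seek plus one rotational latency per record, plus a transfer time that divides by the number of disks the stripe spans.

// The single-thread RAID0 model from above as a throw-away calculator:
// time per 1 MB record = seek + latency + (record / per-disk rate) / disks.
// Numbers are the ones quoted in the post.
#include <cstdio>

int main()
{
    const double seekMs    = 4.6;    // average seek, ms
    const double latencyMs = 3.0;    // average rotational latency, ms
    const double diskMBps  = 120.0;  // per-disk media transfer rate, MB/s
    const double recMB     = 1.0;    // record size, MB

    for (int disks = 1; disks <= 6; ++disks)
    {
        double transferMs = recMB / diskMBps * 1000.0 / disks;  // transfer split across the stripe
        double totalMs    = seekMs + latencyMs + transferMs;    // still pay one seek + one latency
        printf("%d disk(s): %.1f ms per record -> about %.0f MB/sec single-thread\n",
               disks, totalMs, recMB * 1000.0 / totalMs);
    }
    return 0;
}

For 6 disks it prints the same roughly 9 ms / 111 MB/sec figure as above, and for 1 disk it lands near the 60 MB/sec I measured per drive.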

