gundersausage

The Death Of Raid


RAID 1 is fault tolerant.  Write performance does take a hit.  You can minimize it by placing one drive on each channel, if applicable.  Does the board have SATA RAID or just SATA?

The board has Silicon Image Sil3114 RAID, which supports RAID 0, 1, and 10. Everything I have seen says the hit in write performance is minimal (a few percentage points).

BTW correct me if I am wrong, but one of the advantages of SATA is that each drive uses a separate channel.


I do lots of high end image, audio and video editing. I do quite a bit of CAD work. I play games, and do lots of other "crap".

I have two 74 gig Raptors.... I want to set up Raid 0 with them.

No, wait!!! Don't bite my head off yet. The reason is not to benefit from any performance possibilities... I know it probably ain't gonna happen. What I will get is a 150 gig (approx) 10,000 RPM SATA WD Raptor hard drive. I also know there is no fault protection with this setup.

Additionally, I have a 7200 RPM 150 gig SATA drive and a Kingwin SATA Mobile Drive Rack... see where I'm going with this yet?

I want the performance of the Raptor with the volume of the 150 gig drive.

I religiously do my full system virus scans (yes, real-time scanning is always on for front-line protection) and system defrags every Sunday evening.

It's not a big deal to engage the internet lock after the system is done doing its thing, slide the carriage with the 150 gig drive into the rack, and do my system backup while I get ready for bed. In the morning, pull the 150 gig drive out of the rack, put it on the shelf / in the safe, disengage the internet lock, and get on with my life.

Does this scenario seem viable and like a realistic use of a Raid 0 configuration?

The SR benchmarks prove there is little benefit from striping when bandwidth is bottlenecked and latencies are increased by use of the legacy PCI bus, and usage patterns have very low queue depths. In the upcoming era of PCI Express, bandwidth is no longer scarce.

Funny you should mention that: bandwidth isn't scarce now, if you know what you are doing. It just seems that people willing to spend money on extra drives and a RAID card won't spend a little extra for a current board with PCI66/64 or PCI-X (which is mostly 64-bit PCI at 100 and 133MHz). PCIe does change the topology to a completely segmented one, but the bandwidth is there today, and several current boards have multiple PCI segments.
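To put rough numbers on that: peak theoretical PCI bandwidth is just bus width times clock. The helper below is an illustrative sketch (my own, not from this thread); real-world throughput lands well under these ceilings once arbitration and protocol overhead are paid.

```python
# Peak theoretical PCI bandwidth: bus width (bytes) x clock (MHz) = MB/s.
# These are ceilings only; arbitration and overhead cut real throughput.
def pci_peak_mb_s(width_bits: int, clock_mhz: int) -> int:
    return width_bits // 8 * clock_mhz

print(pci_peak_mb_s(32, 33))    # legacy PCI: 132 MB/s
print(pci_peak_mb_s(64, 66))    # PCI66/64: 528 MB/s
print(pci_peak_mb_s(64, 133))   # PCI-X 133: 1064 MB/s
```

A pair of striped Raptors sustains well under 200 MB/s combined, so of those three tiers only legacy PCI is in real danger of saturating.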

Yes, that's quite a realistic use. For the AV stuff you might well get better performance using separate disks, though.

I think this is why some people make distinctions between desktops, workstations, and servers. AV, CAD, and such may very well benefit from RAID 0, but aren't necessarily in this discussion because those are often considered workstation-type tasks. Games and office apps are generally considered desktop uses.

One person I know that sets up video editing systems puts the OS & software on its own drive, and stripes two drives together for data. I think he does striping for capacity, and not speed.



I do some multitrack audio recording.

Not as much as I used to but I still do the occasional job.

In the past I found that when recording large continuous files in real time, glitches would occur.

A bottleneck somewhere in the system would cause a dropout in the recording.

In a pro environment I found that moving to SCSI drives solved the problem. Not only that, but it also avoided dropouts from untimely thermal recalibration self-tests.

At home I found that moving to a RAID 0 config (2 x Maxtor 133 20Gb/Highpoint 370 on an Abit board) worked better than a single 133 drive.

The time has come for a rebuild and I decided to stick with a Highpoint controller so I've got an EPoX 4PDA2V/P4E 2.8.

However, after I bought the board (typical), it occurred to me that, what with the bigger caches now available and SATA (through the Intel ICH5R), one WD Raptor SATA150 may perform as well as two ATA100/133 drives in RAID 0.

I can't go to SATA because, for compatibility reasons, I want to stay with 98SE as an OS.

My thinking is that the SATA controller won't be using the PCI bus (I hope I'm right here) leaving PCI to handle the Audio I/O.


Wot no edit?

I can't go to SATA because, for compatibility reasons, I want to stay with 98SE as an OS.

Should have read

I can't go to SATA RAID because, for compatibility reasons, I want to stay with 98SE as an OS.


It's funny how many comments have zipped right by the first page of this thread where a post seems to lay waste to the theory that RAID 0 offers no improvement in performance. It seems like for enthusiast computer users, the average gain is about 46% vs. non-RAID performance.

Booting Windows is a big gain. Defragging is a big gain. Loading game levels is a huge win, especially since the average game going forward will be more than 3 to 5 GB, with single texture packages a few dozen to a few hundred megabytes each.

In addition to the stripping, which by simple logic dictates faster performance on average, you are also talking about 16 MB of cache vs. 8 MB. Depending on the app, that is also a big win.

http://www.tweakers.net/benchdb/test/120

(has some interesting results).

It seems like for enthusiast computer users, the average gain is about 46% vs. non-RAID performance.

*yawn*

Booting Windows is a big gain.

*yawn!*

Defragging is a big gain.

*burp?*

Loading game levels is a huge win

Huh... gamez?

In addition to the stripping, which by simple logic dictates faster performance on average, you are also talking about 16 MB of cache vs. 8 MB. Depending on the app, that is also a big win.

Stripping? Wanna see hOOters!!! Yeah!

If you don't have a clue, just shut up.

Booting Windows is a big gain... Loading game levels is a huge win

This is just ridiculous. Please have the self-respect to educate yourself --at least superficially-- on the subject before posting.

This has been linked several times now.

And this is ancient news now.

And did you mention defragging performance? Your post is a joke, right?

A superficial understanding of the factors that determine storage performance in single-user, multiple-disk situations may make FemmeT's objections seem very reasonable, but anyone with an understanding of the factors responsible for the performance of striped arrays can appreciate that his conclusions were drawn with little or no consideration, or appreciation, of the details that produce the disparities between his results and everyone else's.


The following information has been posted before in response to FemmeT's conclusions and his benchmarks. Much of it is very elementary knowledge regarding the manner in which IO is generated by a system and the manner in which a disk subsystem responds to it. Consistently, however, individuals who make claims regarding striping performance demonstrate that they have no grasp of the following simple facts, so I include them here and begin from first principles.

Basic striping laws:

Striped arrays of disks respond to workloads in two ways:

1. Independently. For an independent response, positional performance can improve proportionally to the number of disks in the array. An independent response occurs when different spindles act to satisfy separate requests.

2. Dependently. For a dependent response sequential transfer rates can improve proportionally to the number of disks in the array. A dependent response occurs when different spindles act to satisfy the same request.

The independent response is governed by the following factors:

1. The degree to which the requests are random determines the degree to which the array can potentially respond in an independent manner. 100% randomness permits positional performance improvements proportional to the number of disks in the array, relative to a single disk, if the following other conditions are also satisfied.

2. The consistency with which the requests are smaller than the stripe size also determines the degree to which the array can potentially respond in an independent manner. If 100% of the requests are smaller than the stripe size, the array can potentially demonstrate positional performance superior to a single disk that is proportional to the number of disks in the array.

3. Finally, the ratio of time for which the queue depth is greater than the number of disks in the array also determines the degree to which positional performance can be magnified.

The dependent response is a consequence of the following conditions:

1. The degree to which the requests are sequential determines the degree to which the array can potentially respond dependently. A 100% sequential workload will permit STR performance improvements proportional to the number of disks relative to a single disk.

2. The consistency with which requests are larger than the stripe size also determines the extent to which the array can respond dependently.

3. Only a sustained queue depth of one is necessary for the realization of these performance benefits.
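The two response modes follow directly from how a striped array maps a request onto its members. Here is a minimal sketch of that mapping (a hypothetical two-disk array with a 64 KiB stripe unit, not any particular controller): a request smaller than the stripe unit usually touches one disk, leaving the other free for an independent response, while a request spanning many stripe units recruits both disks for a dependent response.

```python
STRIPE_SIZE = 64 * 1024   # stripe unit in bytes (assumed)
NUM_DISKS = 2             # two-disk RAID 0 (assumed)

def disks_touched(offset: int, length: int) -> set:
    """Return which member disks a request at (offset, length) covers,
    with stripe units laid out round-robin across the members."""
    first = offset // STRIPE_SIZE
    last = (offset + length - 1) // STRIPE_SIZE
    return {stripe % NUM_DISKS for stripe in range(first, last + 1)}

# A small random read fits inside one stripe unit -> one disk responds,
# and the other spindle is free to satisfy a separate request.
print(disks_touched(offset=300 * 1024, length=4 * 1024))     # -> {0}

# A large sequential read spans eight stripe units -> both disks cooperate
# on the same request, which is what multiplies STR.
print(disks_touched(offset=0, length=512 * 1024))            # -> {0, 1}
```

Note that a small request straddling a stripe boundary still touches two disks, which is why the *consistency* with which requests stay under the stripe size matters.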

Minor issues with FemmeT's Benchmarks:

1. Many of his benchmarks are misleading. Very misleading. They are traces of tasks which are not disk bound. Specifically 'Windows Update,' 'Program Installation,' and 'DVD strip' are not tasks for which RAID, or any storage subsystem for that matter, improves performance.

2. Many benchmarks, which may be slightly affected by IO performance, demonstrate distorted results when the primary limiting factors are removed. For example, Anandtech measured Far Cry level load times as well... how does one account for the (remarkable) disparities between the Anandtech results and FemmeT's? As another example, FemmeT has also noted that he included filesharing traces operating concurrently with other operations. Filesharing requests in the real world are likely to be so infrequent as to have little to no effect on the resolution of other disk IO. In a disk trace, however, for which the internet connection bottleneck has been removed, the benchmark proceeds as if the disk subsystem were actually in an environment like that of a network fileserver!

THE MOST IMPORTANT POINT:

FemmeT argues that his tests demonstrate the superiority of striped sets because they have particular characteristics that make them different from other tests. He particularly stresses the importance of queue depth and suggests that 'real power users' will see the benefits. While I have already pointed out that many of these tests would see little or no benefit from the IOPS of even an SSD, there is a more relevant point that counters the very fundamentals upon which his conclusions are based:

The benchmarks that FemmeT uses to 'support' his points involve precisely the characteristics that demand the greatest benefit, not from striped arrays, but from independent disks! The demonstration of this follows. Section 4 is the most relevant, but all the following analyses are relevant to the performance of striped arrays for power users, gamers, enthusiasts, and everyday computer users alike.

While certain workloads certainly benefit from striping, I am debunking the common myth (albeit long after Eugene ;) ) that multitasking, in and of itself, produces such a workload, and also that, even when multitasking does, striping is only a second-rate solution.

The Consequences of 'The Basic Striping Laws':

1. Stripe size. You can't have it both ways.

You are either tuning for positional performance or for STR. Moreover, anyone attempting to tune for positional performance in single-user access patterns is trying to overcome an inherent, unavoidable disadvantage: you've doubled the number of read/write heads (spindles), but you've also doubled the number of places you have to get data from. Controllers don't allow stripe sizes large enough to contain an area of localization, which would fix this doubling problem by letting the drives work independently on separate access patterns (and even then other consequences of the striping laws would intrude). Even if such controllers existed, a random distribution of localizations is inherently disadvantaged relative to a conscious balancing of spindle workloads --that is, relative to individual disks.

Having two disks tackle an area of localization is wasteful. This is important because of the effect of...

2. Caching. Single-user workloads are not at all random. They are localized. This means that subsequent requests are more likely to be close to previous requests. This is obvious rationally, observable empirically, and has huge consequences for performance. Because a disk knows (for desktop workloads) that the next request is likely to be near the previous one, caching can be tremendously effective. The veracity of this claim has been confirmed. Multiple disks in an array aren't going to improve performance over a single disk if the data is cached. You're just wasting read-ahead time.

This is almost certainly the primary reason for the dismal improvement offered by striped sets over single disks in most tests other than FemmeT's.
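The caching argument can be made concrete with a toy simulation (all parameters here are assumptions for illustration, not measurements): a drive that reads ahead a segment on every miss, serving a workload that usually stays near the previous request.

```python
import random

SEGMENT = 256        # sectors pulled into the cache per miss (assumed)
LOCALITY = 0.9       # chance the next request stays in the hot area (assumed)
DISK_SECTORS = 1_000_000

def hit_rate(requests: int, seed: int = 1) -> float:
    """Fraction of requests served from the read-ahead cache."""
    random.seed(seed)
    cached = range(0)  # empty cache to start
    pos, hits = 0, 0
    for _ in range(requests):
        if random.random() < LOCALITY:
            pos = min(pos + random.randint(1, 32), DISK_SECTORS - 1)  # short hop
        else:
            pos = random.randrange(DISK_SECTORS)                      # long seek
        if pos in cached:
            hits += 1
        else:
            cached = range(pos, pos + SEGMENT)  # miss: read ahead from here
    return hits / requests

print(f"cache hit rate on a localized workload: {hit_rate(10_000):.0%}")
```

With locality this high, the large majority of requests never reach the platters at all, so a second spindle has nothing to contribute to them.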

3. STR. Numerous tests have confirmed that STR is the least important aspect of storage subsystem performance in real-world tests. It ranks behind, in order: caching/firmware, rotational latency, and seek time. Look how little striping affected level load times (an area where striping was commonly suspected to be of significant use). Hint: it had no effect. Changing chipsets makes a bigger difference.

4. Queues. This is the final killer. Even when striped sets can offer improved performance, the very conditions that enable such performance contrive to render a striped array only a second-rate solution.

Single-user workloads cannot generate queues easily. Single-user applications do not issue multiple outstanding requests; they wait for earlier requests to be satisfied. This means you can't have a queue depth greater than one unless you're multitasking. Even then, the applications have to be consistently competing with one another for array time if requests are going to go unsatisfied and build into a queue. A fast enough disk subsystem will not allow a queue to build in this situation.

While a queue is necessary for a striped array to offer positional performance that is at all better than a single disk, the above should also demonstrate that the queue is, itself, only symptomatic of a problem. Your disk subsystem isn't fast enough for the workload, because it is seeking too much. A queue only serves to demonstrate this. It shows that your disks can't keep up with the load, but for a striped array to have any chance of improving positional performance it has to be falling behind --the queue depth has to be >1.
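That bound is easy to demonstrate with a toy FIFO model (assumptions: a single disk, strictly synchronous clients that issue their next request only after the previous one completes):

```python
from collections import deque

def peak_queue_depth(clients: int, requests_each: int) -> int:
    """Deepest backlog a single FIFO disk ever sees when each synchronous
    client re-issues a request only after its previous one completes."""
    disk = deque(range(clients))       # every client's first request
    done = [0] * clients
    peak = 0
    while disk:
        peak = max(peak, len(disk))
        c = disk.popleft()             # serve one request to completion
        done[c] += 1
        if done[c] < requests_each:
            disk.append(c)             # only now does client c issue another
    return peak

print(peak_queue_depth(clients=1, requests_each=100))   # single user: 1
print(peak_queue_depth(clients=4, requests_each=100))   # four busy tasks: 4
```

No matter how many requests a lone synchronous client issues, the depth never exceeds one; it takes several simultaneously blocked tasks to reach the depths where striping can respond independently.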

Most interestingly, if you separated the disks, one disk per localized workload for example, the average rate of IO satisfaction would certainly be superior to that of a striped set. Such a configuration eliminates the seeking that is destroying the array's throughput. So, to get improved positional IO performance from a striped set, you have to create a workload that would be better served by independent disks. Ironic, isn't it?

Lastly, and incidentally, I respect the time FemmeT has invested in his benchmarking but I am afraid that a combination of several factors is beginning to suggest to me that he is trying to sell RAID controllers. Or that he is specifically contriving to give positive reviews of products when such reviews are not warranted by the facts.

Many reviewers hesitate to give a negative but honest review of a product, for fear of damaging their relationship with the manufacturer on whose goodwill they depend for review pieces. This problem has become endemic to the Internet. FemmeT has seen the information I posted here before, and I know from the manner in which he argues that he is intelligent enough to understand its significance, but he seems to have no interest in sharing it... I intend no offense, but it is important to be critical of data, the benchmarks that produce it, and the scientists (hopefully) who compile and present it. In light of some recent points it appears that some individuals are seizing hold, non-critically and exclusively, of data which needs to be weighed relative to other tests.

Booting Windows is a big gain... Loading game levels is a huge win

This is just ridiculous. Please have the self-respect to educate yourself --at least superficially-- on the subject before posting.

This has been linked several times now.

And this is ancient news now.

And did you mention defragging performance? Your post is a joke, right?

A superficial understanding of the factors that determine storage performance in single-user, multiple-disk situations may make FemmeT's objections seem very reasonable, but anyone with an understanding of the factors responsible for the performance of striped arrays can appreciate that his conclusions were drawn with little or no consideration, or appreciation, of the details that produce the disparities between his results and everyone else's.

You forgot this.

This is the setup that lets me defrag while creating Shrek 3 while doing a full-blown network backup - which I always like to do in the middle of the workday. Defragging during the night is a typical beginner tactic - doing it in the middle of the day is for pros. I mean, seriously - any professional knows that you don't get paid for the nightly defrags - plus it's better if you watch it, so you can take the appropriate action should the defragger start shredding your data.

I know what you're going to say... "But RAID 0 risks all your data!!" which is why I software stripe it to my DVD-R - my data is always safe.

For a while I actually believed all this benchmark mumbo-jumbo - then I actually read some of the posts, where people told me point blank "RAID 0 is like super fast, I can totally tell the difference between RAID 0 and my unknown work computer's drive when I do email and stuff (while defragging)" and that's when I saw the light - the dude said it was fast and stuff - so it must be - he wouldn't lie to me, and someone who defrags while they work on disk-intensive stuff is clearly a professional that should be listened to.

I've been converted!!!

You forgot this.

I can't believe I missed that thread - thanks Mars, I had tears coming down my cheeks from reading it start to finish...and your post was hilarious as well.

I think the mods should just consider moving pro-RAID 0 posts to the B&G... ;)

Future Shock

1. Many of his benchmarks are misleading. Very misleading. They are traces of tasks which are not disk bound. Specifically 'Windows Update,' 'Program Installation,' and 'DVD strip' are not tasks for which RAID, or any storage subsystem for that matter, improves performance.

2. Many benchmarks, which may be slightly affected by IO performance, demonstrate distorted results when the primary limiting factors are removed. For example, Anandtech measured Far Cry level load times as well... how does one account for the (remarkable) disparities between the Anandtech results and FemmeT's? As another example, FemmeT has also noted that he included filesharing traces operating concurrently with other operations. Filesharing requests in the real world are likely to be so infrequent as to have little to no effect on the resolution of other disk IO. In a disk trace, however, for which the internet connection bottleneck has been removed, the benchmark proceeds as if the disk subsystem were actually in an environment like that of a network fileserver!

The major issue with IPEAK RankDisk is that it measures I/O performance independently of the host system, meaning benchmark results may show improvements in I/O performance while performance in real-world workloads would not improve, because I/O is not a bottleneck. Game level loading times seem to be heavily dependent on CPU power, and therefore show very small differences in loading times between various disk and RAID setups.

Improved I/O performance does, however, mean there is more performance headroom while multitasking in these particular workloads.

The filesharing requests in some of the desktop workloads accounted for only ~200KB/s of disk traffic in the traces, so there was no major impact on the characteristics of the trace. The traces were recorded on an internet connection with 60Kbps upload speed. The filesharing requests will be 'time compressed' while replaying the trace in RankDisk (which happens with all traces), but it comes nowhere close to the workload of a fileserver.

What is apparent, however, is that the Winstone 2004 traces I used for primary application performance have larger queue sizes than the traces from Storage Review. The new software releases in Winstone 2004 probably have higher I/O workloads, which is logical as software progresses over time. Newer versions of the Winstone system benchmarks also have a more realistic way of simulating multitasking user activity than the Content Creation Winstone 2001 used in SR's High-End DriveMark 2002. This has a great impact on queue depths. The statistics of my Multimedia Content Creation 2004 trace show an average queue depth of 8.82 I/Os compared to 1.40 I/Os in SR's Content Creation Winstone 2001 trace. I use a plain MCCW2004 trace without any additional activity.

The benchmarks that FemmeT uses to 'support' his points involve precisely the characteristics that demand the greatest benefit, not from striped arrays, but from independent disks! The demonstration of this follows. Section 4 is the most relevant, but all the following analyses are relevant to the performance of striped arrays for power users, gamers, enthusiasts, and everyday computer users alike.

The problem with independent disks is that it is a real pain to plan out the way you use your drives in order to get maximum performance. In many situations you cannot get improved performance from both drives. If you want to copy data from drive A to A, using independent drives will not improve performance at all. I have done some stopwatch timings with single-drive, RAID 0, and independent-drive configurations which show a nice improvement for independent configurations in some situations, but zero improvement in many other circumstances. From a usability point of view, one fat physical drive is preferable for many users.

Lastly, and incidentally, I respect the time FemmeT has invested in his benchmarking but I am afraid that a combination of several factors is beginning to suggest to me that he is trying to sell RAID controllers. Or that he is specifically contriving to give positive reviews of products when such reviews are not warranted by the facts.

You almost seem to be religious about disproving the benefits of striping.

I have developed the 2004 benchmark suite with the usage patterns of power users in mind. The excellent performance scaling in RAID 0 configurations and possible nice performance benefits with NCQ hard drives (I haven't tested any NCQ drives yet) are just side effects. I don't care about the performance of individual products. Vendors will provide samples of their hardware anyway and otherwise there are always local sellers happy to help out with review hardware.

The results of the stopwatch timings. You will be delighted to see the performance decrease in the SP1 installation and Explorer file search tests:

[graphs: stopwatch timing results]

Test configuration:

Dual Athlon XP 2400+

1GB PC2100 ECC Registered DDR

GeForce FX 5900 videocard

64-bit 66MHz PCI

Intel Pro/1000MT gigabit ethernet controller

A Ghost image was used to place identical copies of the system on all disk configurations.

Windows pagefiles, temp directories and Photoshop scratch files were placed on the second drive in the configuration of independent Raptor WD740GD drives.

Short descriptions of the tests:

1) Photoshop image loading

Loading a photo shoot of 25 eight-megapixel images into memory. Heavy usage of the scratch file.

2) Photoshop image loading + backup

NTBackup network backup of a 10GB set of images while running the Photoshop image loading tests.

3) Thunderbird mailbox search query

Search query for the word 'pentium' in the message bodies of a 600MB mailbox. Obviously this test is almost completely CPU limited.

4) Thunderbird mailbox search query + Kaspersky virus scan

Virus scan on Windows partition while performing the Thunderbird test.

5) File copy

Copy of a 535MB set of images from drive A to A. In the configuration of independent Raptors the files were copied from disk A to B.

6) File copy + Kaspersky

Virus scan on Windows partition while performing file copy on data partition.

7) Photoshop CS start-up times

Time used to start Photoshop CS.

8) Windows XP boot-up time

Time to boot Windows from the disappearance of the POST screen to the appearance of the login screen. The MegaRAID SCSI 320-2X did not perform any I/O activity in the first seven seconds of the boot procedure.

9) Windows XP SP1 installation

SP1 setup from the archived version. As you can see, the overhead of striping actually decreased the performance of the Raptor RAID configurations.

10) Windows Explorer file search query

Query for files containing the word 'pentium'.

11) Windows Explorer file search query + network file copy

File copy over gigabit ethernet in the background while performing the search query in Explorer.


I just wanted to remind everyone that the performance characteristics of RAID 0 are not worth getting emotionally involved in, making enemies, making an ass of yourself, or trying to put down other people.

RAID 5, however...


BTW: after rebooting the various configurations used for the stopwatch tests more than 30 times each, it was very easy to feel the differences in performance. The Raptor WD740GD RAID 0 felt a lot smoother and more responsive than the single WD740GD. The single WD360GD, on the other hand, almost felt like a slow drive. Any power user should be able to observe these subtle distinctions in performance.

Also interesting are these results of a poll on the frontpage of Tweakers.net about RAID user satisfaction. This site is visited by people who are not predisposed to believe RAID 0 is 'insane' or useless. After 7661 votes, 29.5 percent of the voters indicated they were using RAID, out of which 56.2 percent experienced a huge improvement in performance, 29.5 percent experienced a small improvement, and 14.6 percent experienced no improvement at all. It's not scientific evidence, but it does say something about how people think about the performance of their RAID systems.

Personally I am very happy with my RAID 5 configuration. I wouldn't recommend RAID 0 to anyone besides gamers who don't have data to care about. I do believe, however, that RAID 0 can show considerable performance improvements in demanding desktop environments.

It would be a good idea for Eugene to develop a new generation of desktop benchmarks, targeted towards realistic multitasking activity in up-to-date versions of popular office and content creation applications. The new benchmarks will probably show some nice improvements in I/O performance from striping. I wonder how he is going to retreat from his current 'RAID 0 is insane' viewpoint.

Also interesting are these results of a poll on the frontpage of Tweakers.net about RAID user satisfaction. This site is visited by people who are not predisposed to believe RAID 0 is 'insane' or useless. After 7661 votes, 29.5 percent of the voters indicated they were using RAID, out of which 56.2 percent experienced a huge improvement in performance, 29.5 percent experienced a small improvement, and 14.6 percent experienced no improvement at all. It's not scientific evidence, but it does say something about how people think about the performance of their RAID systems.

Those results aren't interesting at all. Of course a power user who just dumped a load of cash on his new RAID system is going to feel like it's fast. Once again, a user's perception is -almost- worthless.

Read Here

Many experiments have been conducted to show the extraordinary extent to which the information obtained by an observer depends upon the observer's own assumptions and preconceptions.

I have developed the 2004 benchmark suite with the usage patterns of power users in mind.

Keep it up and I will be forced to post graphs of my banana card. And don't get Sivar worked up - he has an Uber Experimental Raid that we try to keep him from using due to the blackouts we have been having...

Now where is proof - I have to tell him about this trick I learned with hotmelt and that big outlet behind the dryer... I have already turned 3 bananas into a burnt pile of something on my desk that smells really, really bad, and blew that coconut through the neighbor's Honda - but I think I'm on to something - somehow I just need to create 1.21 gigawatts. (And get new homeowner's insurance - after my third call trying to get a little 'accident' resolved they canceled me - jerk-offs.)

Note: My apologies to those looking for a serious discussion. Please move along - they are all spread out over pages 1-20009....


I have to agree with Gilbo here, FemmeT; it really looks like you are trying to manufacture scenarios where there will be a high queue depth. Perhaps some people are foolish enough to use their computers while backing up or scanning for viruses, but most people deliberately schedule these things for when they are NOT trying to do anything productive.

In addition, you suggest that RAID is primarily something gamers should use, but fail to specify a single gaming benefit. The most sensible tests there are things like opening multiple photos in Photoshop, and that's not a common task performed regularly, save by specific types of user. And such a user with such a specific task should really be assessing on their own what the best setup is for their individual usage pattern. There are of course uses for RAID. But I think your application patterns are fairly unrealistic.


I see no issue. This is precisely the point of the exercise.

It would be a good idea for Eugene to develop a new generation of desktop benchmarks
I believe Eugene has subtly indicated on numerous occasions over the past six or eight months that work on TB4 is underway.
targetted towards realistic multi-tasking activity in up to date versions of popular office and content creation applications.
Up-to-date versions aside, I would let Eugene's words speak for themselves:
if one is to contest the SR Desktop DriveMarks, one must do so on the grounds that the application usage selected for recording is not representative of the majority of users
A couple of questions: I'm not certain if you stated it before, but what do you not find realistic about SR's usage pattern FemmeT? To what do you attribute the higher queue depths in your finding too in comparision to SR's? What were the queue depths for your tests?
The new benchmarks will probably show some nice improvements in I/O performance from striping.
I don't think Eugene would be publishing something if he thought he would be retracting its findings only a short time later. I suspect that the further developments of SR's test suites have only collaborated with earlier evidence. I of course am conjecturing - but we'll have to wait to see what Eugene and a future TB4 has to say on the matter.
I wonder how he is going to retract from his current 'RAID 0 is insane' viewpoint.
Well that is a rather incendiary statement, but regardless, I think Eugene has shown in the past that he is not beyond admitting mistakes and re-evaluation of his conventional wisdom. The question I pose to you Femme, are you?


Oops, I cut off a quote by mistake. My "I see no issue" comment was in regard to FemmeT's statement that "The major issue with IPEAK RankDisk is that it measures I/O performance independent from the host system".


As my previous posts throughout this forum indicate, I believe that striping is a valuable tool when used in the right situation. I also do a lot of Photoshop work with very large files -- in fact, this is my whole reason for being interested in striping these days.

So, even though I am a fan of properly-used striping, I have to say that the Photoshop components of this benchmark set are not representative of how a power Photoshop user would set up a multi-disk system. Specifically:

> Test configuration: Windows pagefiles, temp directories and Photoshop scratch files were placed on the second drive in the configuration of independent Raptor WD740GD drives.

In Photoshop the scratch file is the third most important factor determining performance (after memory and CPU; in many cases the speed of the scratch file may even matter more than CPU). A power Photoshop user would never put his scratch file on a single disk and his OS and images on a RAID array. With two disks, the config would be one disk for OS/Programs/Images and the other for scratch. With three disks, it would probably be one for OS/Programs/Images and the other two in a RAID 0 array for scratch. With four disks, it would probably be one for OS/Programs, one for Images, and two in a RAID 0 array for scratch.

When reading and writing files, Photoshop does a lot of concurrent disk I/O on the scratch disk and the disk containing the file being read/written. You want these two files (images + scratch) to be on separate spindles.
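The "separate spindles for image and scratch" claim is easy to sanity-check yourself with a quick timing sketch. This is my own illustration, not anything measured in this thread: it times a large sequential read (the "image") overlapped with a large sequential write (the "scratch"). The paths here are placeholders in one temp directory; to see the real effect, point the two paths at directories on different physical disks, and use sizes large enough that the OS page cache doesn't dominate.

```python
import os
import tempfile
import threading
import time

CHUNK = 1 << 20   # 1 MiB per I/O
BLOCKS = 64       # 64 MiB of traffic per side

def make_file(path):
    # Create the "image" file the reader will consume.
    with open(path, "wb") as f:
        for _ in range(BLOCKS):
            f.write(os.urandom(CHUNK))

def read_all(path):
    # Sequential read, like Photoshop pulling an image in.
    with open(path, "rb") as f:
        while f.read(CHUNK):
            pass

def write_all(path):
    # Sequential write, like Photoshop filling its scratch file.
    with open(path, "wb") as f:
        for _ in range(BLOCKS):
            f.write(b"\0" * CHUNK)

def overlapped_io(image_path, scratch_path):
    """Wall-clock seconds for a scratch write overlapped with an image read."""
    make_file(image_path)
    t = threading.Thread(target=write_all, args=(scratch_path,))
    start = time.perf_counter()
    t.start()
    read_all(image_path)
    t.join()
    return time.perf_counter() - start

# Both files on the same spindle; replace the two paths with directories
# on two different physical disks to measure the separate-spindle case.
with tempfile.TemporaryDirectory() as d:
    same = overlapped_io(os.path.join(d, "image.psd"),
                         os.path.join(d, "scratch.tmp"))
print(f"same-disk overlapped I/O: {same:.2f}s")
```

On a single disk the two streams force the head to shuttle between the file regions; on separate spindles each head serves one sequential stream, which is exactly the configuration argued for above.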

> 1) Photoshop image loading
>
> Loading a photo shoot of 25 8-megapixel images into memory. Heavy usage of scratch file.

Yes, heavy usage of scratch file. Again, a Photoshop power user would not set up a machine in the way used in these tests, with the scratch file in a single disk and OS/Programs/Images on a RAID array. That is backwards!

> 2) Photoshop image loading + backup
>
> NTBackup network backup of a 10GB set of images while running the Photoshop image loading tests.

A power Photoshop user would never do this. When doing serious editing it is easy to use up all the available resources of a fully-configured Windows machine: the entire 2GB RAM address space consumed, CPUs maxed out, disks cranking away. A power Photoshop user would shut down all nonessential applications and would never run backups while editing. He/she would *definitely* do backups, but not concurrently with edit sessions.

> 7) Photoshop CS start-up times
>
> Time used to start Photoshop CS.

This is not an important metric for a power Photoshop user, who typically starts Photoshop once and then uses it for several hours. Regardless of whether it is an important metric, though, the Photoshop CS startup times would probably be faster if the OS, Programs, and Scratch were on multiple distinct spindles rather than grouped together on a single RAID array. A complete benchmark analysis should include this scenario along with the RAID-only solutions.

I think the point people are trying to make is this: if you've got N disks available to solve a problem, is it better to configure them as a single N-disk RAID array (RAID 0, 1, or 5), as N distinct independent spindles, or as some combination? Most applications will speed up a little when going from a single disk to RAID 0 -- that is no surprise. But a true benchmark comparison would include test cases that compare striping against multiple distinct spindles as well.

As for the argument that "the problem with multiple separate disks is that you have to figure out in advance what files go on which disk": yes, this is a problem, but it is specifically the kind of problem a *power user* would solve.

Another example: when I was doing full-time software development on my machine, I noticed that my productivity was often limited by disk I/O time. I did some investigation -- OK, a LOT of investigation -- and ended up configuring my development machine with four small disks used as follows: one for OS, one for Programs, one for temp files, and one for source code. I chose this configuration because every time I compiled there was significant concurrent I/O to/from each of these four sources. I wanted each of those four sources on separate spindles to minimize head movement, increase concurrency, etc. Switching to this configuration made a HUGE difference. Would I have set up those four disks in a RAID 5 array? No way! A pair of RAID 1 arrays? Probably not.

Again, documenting that there are small speedups between a single disk and a RAID array is no big surprise. We would all expect this without even seeing the tests. Comparing a single RAID array to multiple spindles...now that would be interesting.

> You almost seem to be religious about disproving the benefits of striping.

I am simply critical of information that comes my way. All the other information on this subject that I have examined received equally careful scrutiny.

Since your results are the only ones that disagree with the other data I have already examined, I am evaluating them to demonstrate, for those confused, why you have reached a conclusion opposite to that of the other careful observers.

Lastly, I apologize if the speculations I concluded with offended you, and contributed to you terming me 'religious'. In case it wasn't clear before, I made my assertion because:

1. Your posts demonstrate that you are intelligent, that you understand how to make an argument, and that you understand the basic elements that determine the performance of storage subsystems.

2. However, you publish observations that you must know (see 1) are irrelevant. Coincidentally, these tests often show significant benefits from striped arrays.

3. You appear to have an excellent relationship with manufacturers of RAID controllers.

Most importantly (to reiterate):

Much of your argument depends on your observations of high queue depth situations.

High queue depth situations generated during multitasking are exactly the category of situations for which a power user should be using independent disks. No one should use striping in these situations.
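The independent-disks argument can be made concrete with a toy bookkeeping model (entirely my own simplification, not anything measured in this thread). Two concurrent sequential streams are served either by two independent disks, one stream per spindle, or by a two-disk stripe set where blocks alternate across both disks. We count a "seek" whenever a disk has to switch from serving one stream to serving the other:

```python
def seeks(requests, placement):
    """Count head movements. `requests` is a list of (stream, block) pairs;
    `placement(stream, block)` returns the disk id that serves the request.
    A seek is charged whenever a disk switches to a different stream than
    it served last time."""
    last_stream = {}  # disk id -> stream it served most recently
    moves = 0
    for stream, block in requests:
        disk = placement(stream, block)
        if last_stream.get(disk) not in (None, stream):
            moves += 1
        last_stream[disk] = stream
    return moves

# Two sequential streams, interleaved (multitasking, queue depth ~2).
workload = [(s, b) for b in range(100) for s in (0, 1)]

# Independent spindles: stream s lives entirely on disk s.
independent = seeks(workload, lambda s, b: s)

# Two-disk RAID 0: block b lands on disk b % 2, shared by both streams.
striped = seeks(workload, lambda s, b: b % 2)

print(independent, striped)
```

In this model the independent spindles never switch streams at all, while every disk in the stripe set bounces between the two file regions on almost every request. Real drives are far more complicated (caching, reordering, stripe sizes), but the model captures why multitasking workloads reward independent disks over striping.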

