Defiler

Terrible SCSI performance in Windows XP


If M$ is so darn interested in a fix to this problem, why is it taking more than 6 months to fix a major file subsystem bug?  Think about that - their flagship OS has a glaring performance problem and 6 months later they haven't fixed it?  One would think they might use part of their 40 billion in cash reserves to hire a programmer or two to fix this problem?

Absolutely amazing.

Perhaps the "performance" issue is intentional, as a method of pushing the Windows-only "dynamic disk" partition format. Why should MS change anything? They want you to use Windows and only Windows: Windows partition formats, Windows filesystems, anything that is MS-specific and hasn't become commoditized yet.

The best solution to these performance problems is to uninstall Windows completely and use something a little more efficient for an OS.


I managed to get much more speed by using a bigger cluster size.

I have an Abit KG7-RAID with two IBM Deskstar 60GXP 40 GB drives and a stripe size of 64; the disks are basic. After testing, I noticed that ATTO scores for 16-1024 KB reads/writes went from single-drive level (~45 MB/s) to 72-75 MB/s :D when I used a 32 KB cluster. It seems to me that the default 4 KB cluster size isn't good for a RAID-0 setup.
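For a sense of scale, the figures reported in the post above can be worked through with a few lines of arithmetic. This is only a sketch: it assumes the "64" stripe size means 64 KB, the function names are made up for illustration, and it does not explain why the larger cluster helped.

```python
def clusters_per_stripe(stripe_kb: int, cluster_kb: int) -> int:
    """How many filesystem clusters fit in one RAID-0 stripe unit."""
    return stripe_kb // cluster_kb


def speedup_percent(before_mb_s: float, after_mb_s: float) -> float:
    """Relative throughput gain, in percent."""
    return (after_mb_s - before_mb_s) / before_mb_s * 100.0


if __name__ == "__main__":
    # Assumption: the stripe size quoted in the post is 64 KB.
    print(clusters_per_stripe(64, 4))    # default 4 KB clusters
    print(clusters_per_stripe(64, 32))   # 32 KB clusters
    # Reported ATTO figures: ~45 MB/s before, 72-75 MB/s after.
    print(round(speedup_percent(45, 72)))
    print(round(speedup_percent(45, 75)))
```

With these assumptions, a stripe holds 16 default-size clusters but only 2 of the larger ones, and the reported gain is roughly 60 to 67 percent.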

XP is more for the low budget computers, a more simple version of Win2000.

Your understanding is wrong. Just because XP combines the NT and the 9x lines, it doesn't mean that it is a simpler version of Win2k. Instead, think of it as the next step up.

NT -> 2000 -> XP

95 -> 98 -> ME -> XP

That's how the two lines merge. Oh, and by the way, you don't see Windows XP Server family simply because it will be called .NET when it is released.

Leo

That's right Leo, except I would not even put XP in the 95 product line. The only merging really is the 'usability' of 95 line with the more advanced everything else of the NT product line. Not that the usability wasn't already there with 2000 anyway!

Francois, this was my point with my last post, I guess my sarcasm wasn't obvious enough! :)

I am using Win2000 myself, but am familiar with XP as well. I liked NT4, I like Win2000, but do not like XP that much. Every time I use XP, I am glad that Win2000 is installed at home.

It certainly is not for the power user (dixit Microsoft).

Still, to each his own, and I respect the opinions of others.

François

I worked on NT 4.0 Workstation for a very long time. It is the best MS OS ever built, but it was no longer up to date.

Then I used W2k. I missed the performance. OK, it had better features than NT 4.0, but I thought of W2k as just something in between NT 4.0 and "?".

Then came XP Professional. I was impressed (I used the same hardware). It is fast (if you give it enough RAM), and you can do everything with it.

I develop databases. Unfortunately, I bought a SCSI controller and drives. The problem is that I use XP and Win 98SE on the same computer: Win98 on FAT32 for Internet, mail, and Internet games; XP for database development, graphics, DVD, programming, and other things. If I have XP and Win98 on the first HDD, I must use basic disks. Otherwise, XP is much faster and better than W2k.

Now I have to use W2k again, because I use XP (W2k) and 98 about 50-50, and I hate changing the boot device on the controller.

I can't wait for XP SP1. Then I can kick out the in-between step, the "W2k" between NT 4.0 and XP.

Bye, dino


I have been following this forum for quite some time. It piqued my interest, as I too have experienced slow disk access times since I upgraded to XP. When I contacted Microsoft, they provided me with the hot-fix q308219. I loaded this. Contrary to some of you, I feel that fixed the problem with the SCSI disks. However, performance was still not back to what it was with W2k. But what I now saw seemed to be affecting both SCSI and IDE disks. I contacted Microsoft again and, after the regular run-around, they gave me a startling response. They claimed that there is a bug in W2k (Workstation and Server): the call FlushFileBuffers, a Kernel32 system call, does not actually flush the contents of the file buffer all the way to the disk. It falls short and leaves it in cache. If the data is read from the same PC before the cache is routinely flushed to disk, then the OS would grab it out of cache. This would occur really fast since no disk access ever occurs. However, if another PC were to try to access this data assuming that it had already been committed to disk, then the data would not be there. This could cause data corruption or an application crash. Recognizing this, Microsoft fixed the problem in XP. Now a FlushFileBuffers call actually flushes the contents of the file buffer all the way to disk. But, of course, to do this it takes time. The end result is that we all have been living a lie with respect to disk performance. Microsoft maintains that all of our applications that thought they were committing data to disk actually were not and therefore performance seemed much higher than it really should have been.

I have done some testing with the FlushFileBuffers call. A simple program was written in C++ that performed a write to disk and flush to disk a fixed number of times. The elapsed time to perform this loop is then recorded. When run on W2k, the time is 1/30th of what it is on XP. My most controlled test was performed on the same hardware. I have the luxury of having two of the same model disks. I installed a clean W2k OS on one and a clean XP prof on the other. By swapping the disks in and out of the same PC, a like comparison of the difference in the OS speed can be measured. I would have to agree that the time to complete the loop on W2k is faster than I think a hard drive can react.
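The original C++ test program is not shown in the thread. A rough equivalent of the write-and-flush loop described above might look like the following Python sketch. The function name, iteration count, and block size are assumptions chosen for illustration. On Windows, Python documents `os.fsync` as calling the CRT `_commit()` function, which is understood to end up in FlushFileBuffers; on other systems it calls `fsync()`.

```python
import os
import tempfile
import time


def timed_write_flush_loop(path: str, iterations: int = 200, block_size: int = 4096) -> float:
    """Write a block and force it to disk `iterations` times; return elapsed seconds."""
    block = b"\0" * block_size
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer so each write goes straight to the OS.
    with open(path, "wb", buffering=0) as f:
        for _ in range(iterations):
            f.write(block)
            # Ask the OS to commit the file's data to the device before continuing.
            os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        elapsed = timed_write_flush_loop(os.path.join(tmp, "flush_test.bin"))
        print(f"{elapsed:.3f} s for 200 write+flush cycles")
```

Running the same script on two installations that share identical hardware, as in the disk-swapping test above, would give a comparable pair of timings. Absolute numbers depend heavily on the drive and its cache settings.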

In fact, I believe the same bug was “fixed” on .Net Server. I’m running the beta 3 (build 3615) on a Dell 2500 server with a 5-disk RAID 5 array. My W2k workstation IDE disk completes the Write/Flush loop faster... about twice as fast as the .Net Server machine!

Has anybody heard this story? What do you think? Have we all been duped?

If so, how have all our applications not been crashing? I would think many applications would have been trying to read data that never actually made it to disk. Granted, some applications are written not to be susceptible to this. For example, Microsoft’s own SQL Server takes requests from other clients, but all the disk access is done locally, thus pulling data out of cache, quickly. But there have to be many other applications out there that did not take this bug into account. Why have those applications been running?

GottaAsk

On a side note: During the run-around period of this, Microsoft supplied me with a pre-ship of XP’s SP1 (Build 2600, 020606-1800 (v1052)) to see if that satisfied me. It made no difference (I already had q308219 installed).

I have been following this forum for quite some time. It piqued my interest, as I too have experienced slow disk access times since I upgraded to XP. When I contacted Microsoft, they provided me with the hot-fix q308219. I loaded this. Contrary to some of you, I feel that fixed the problem with the SCSI disks. However, performance was still not back to what it was with W2k. But what I now saw seemed to be affecting both SCSI and IDE disks. I contacted Microsoft again and, after the regular run-around, they gave me a startling response. They claimed that there is a bug in W2k (Workstation and Server): the call FlushFileBuffers, a Kernel32 system call, does not actually flush the contents of the file buffer all the way to the disk. It falls short and leaves it in cache. If the data is read from the same PC before the cache is routinely flushed to disk, then the OS would grab it out of cache. This would occur really fast since no disk access ever occurs. However, if another PC were to try to access this data assuming that it had already been committed to disk, then the data would not be there. This could cause data corruption or an application crash. Recognizing this, Microsoft fixed the problem in XP. Now a FlushFileBuffers call actually flushes the contents of the file buffer all the way to disk. But, of course, to do this it takes time. The end result is that we all have been living a lie with respect to disk performance. Microsoft maintains that all of our applications that thought they were committing data to disk actually were not and therefore performance seemed much higher than it really should have been.

[...]

If so, how have all our applications not been crashing? I would think many applications would have been trying to read data that never actually made it to disk. Granted, some applications are written not to be susceptible to this. For example, Microsoft’s own SQL Server takes requests from other clients, but all the disk access is done locally, thus pulling data out of cache, quickly. But there have to be many other applications out there that did not take this bug into account. Why have those applications been running?

GottaAsk

Whoa! Very interesting...

Crashes would have occurred only in very rare circumstances, but still... amazing it wasn't noticed before. Apparently, FlushFileBuffers simply queues a write and returns before the write actually completes in Win2k. Of course, it doesn't take long for it to complete, so a problem would have occurred only if there were a nearly immediate request for data from the same file. Something that could only occur under an extremely heavy access load from different PCs.

Leo
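The scenario being discussed (a flush that returns while the data is still only in cache) can be pictured with a toy model. This is purely illustrative: the class, its methods, and the `really_commit` switch are invented here and say nothing about how Windows is actually implemented.

```python
class ToyDisk:
    """Toy model of a drive with a volatile write-back cache (not real Windows behaviour)."""

    def __init__(self):
        self.cache = {}  # contents would be lost on power failure
        self.media = {}  # contents that have really reached the platters

    def write(self, block, data):
        # Writes land in the cache first.
        self.cache[block] = data

    def read(self, block):
        # Local reads are served from the cache when possible, so they look instant.
        if block in self.cache:
            return self.cache[block]
        return self.media.get(block)

    def on_media(self, block):
        # What an observer bypassing the cache would find.
        return self.media.get(block)

    def flush(self, really_commit):
        # really_commit=False models the behaviour attributed to W2k in this thread;
        # really_commit=True models the behaviour attributed to XP.
        if really_commit:
            self.media.update(self.cache)
            self.cache.clear()


if __name__ == "__main__":
    disk = ToyDisk()
    disk.write(0, b"order #1")
    disk.flush(really_commit=False)
    print(disk.read(0), disk.on_media(0))  # data is readable locally, absent on media
    disk.flush(really_commit=True)
    print(disk.read(0), disk.on_media(0))  # data is now on media as well
```

In the first case the local reader sees the data while the media does not have it yet, which is the window in which the problems described above could occur.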


Has this been officially announced yet? Surely corporations running fileservers, DB, or other servers with heavy disk I/O should know this. Why have they not fixed it in Win2K yet?

Has this been officially announced yet? Surely corporations running fileservers, DB, or other servers with heavy disk I/O should know this. Why have they not fixed it in Win2K yet?

The problem could have shown up only with a number of different servers accessing the same files. This is common in the UNIX world, but not with Win2k.

Leo

Has this been officially announced yet? Surely corporations running fileservers, DB, or other servers with heavy disk I/O should know this. Why have they not fixed it in Win2K yet?

This has been fixed in the upcoming service pack

Has this been officially announced yet? Surely corporations running fileservers, DB, or other servers with heavy disk I/O should know this. Why have they not fixed it in Win2K yet?

This has been fixed in the upcoming service pack

I guess the interesting thing now is whether we'll see the same performance impact as seen in XP under 2k SP3? And some people seem to have been experiencing what appears to be the same or similar problem already in 2k, perhaps it's already included in a hotfix?


That's all well and good, but how come the problem doesn't show up in other OSes not using the MS calls?

Also, if we've gone this long with this "bug" without major disaster while at the same time reaping double the speed in disk writes, is it even something that should be fixed? At least make it selectable through some checkbox somewhere...

Sounds sketchy to me.

-Chris

That's all well and good, but how come the problem doesn't show up in other OSes not using the MS calls?

Because those other OSes implement things differently.

Leo

Because those other OSes implement things differently.

That results in a twofold improvement in write performance with no loss of data security? If that were the case then one could safely say that windows disk IO has some serious problems.

I don't really buy the claim at all. It's not like there are widespread reports of data corruption in w2k on scsi hardware supporting the claim.

-Chris


Combine this:

They claimed that there is a bug in W2k (Workstation and Server): the call FlushFileBuffers, a Kernel32 system call, does not actually flush the contents of the file buffer all the way to the disk. It falls short and leaves it in cache. If the data is read from the same PC before the cache is routinely flushed to disk, then the OS would grab it out of cache. This would occur really fast since no disk access ever occurs. However, if another PC were to try to access this data assuming that it had already been committed to disk, then the data would not be there. This could cause data corruption or an application crash. Recognizing this, Microsoft fixed the problem in XP. Now a FlushFileBuffers call actually flushes the contents of the file buffer all the way to disk. But, of course, to do this it takes time. The end result is that we all have been living a lie with respect to disk performance.

with this:

The reason why IDE disks are not affected is because the ATA spec does not define an equivalent for ForceUnitAccess, which instructs the drive to commit the data onto the media before completing the request [ this is triggered by the FILE_FLAG_WRITE_THROUGH attribute that the ATTO Disk Benchmark sets ]

and you start to get an interesting picture.

ATTO thinks that it is writing directly through to the disk, but it isn't, in the case of W2K and SCSI. In the case of IDE, it doesn't matter, as apparently there isn't a protocol command yet to flush the internal cache. This could also explain in many cases the "W2K shutdown bug", where it tends to eat data at shutdown. (I really hope SP3 fixes this 'little' issue.)

This is all fine and dandy, but it *still* doesn't explain the real-world slowdown with XP, basic disks, and SCSI. Simple things like copying files between drives shouldn't, in normal cases, require the FlushFileBuffers call to be made. That's the whole point of write-back caching as a performance advantage. Something else is still clearly screwed up, although I think we now have a reasonably solid explanation for the W2K shutdown bug.


I don’t think the FlushFileBuffers fix is going to be in w2k’s sp3. I asked my Microsoft contact and the reply was that the bug would be fixed in W2k, but there was no schedule of when. There is a bug that is fixed in sp3 that references FlushFileBuffers(q288794). But that is not the same issue.

I would have to agree that there is still a disk access performance degradation in XP. However, outside of q308219 and unoptimized SCSI drivers, I’m not convinced it is specific to SCSI. Here is the test I just ran. I booted my PC with the IDE disk with XP on it. I copied a directory with 817 MB of randomly sized files from one directory to another. This took 4 minutes and 3 seconds. I powered down and replaced the hard drive with the same model hard disk with W2k installed on it. I performed the copy of the same files. It took 1 minute and 58 seconds. I also have a tool called TrueTime that is used in performance analysis of software. You can run an exe from it and it will track every method called. I normally use it to produce reports for optimization of application software. In this case, I wanted to see if the FlushFileBuffers call was used in a file copy. It is not. Thus, regardless of FlushFileBuffers and regardless of SCSI or not, there is an overall degradation in disk performance from W2k to XP.
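Converted to throughput, the two copy times reported above work out as follows. This is just arithmetic on the posted figures; the helper name is made up for illustration.

```python
def throughput_mb_s(size_mb: float, minutes: int, seconds: int) -> float:
    """Average throughput in MB/s for a copy of `size_mb` that took minutes:seconds."""
    return size_mb / (minutes * 60 + seconds)


if __name__ == "__main__":
    xp = throughput_mb_s(817, 4, 3)     # 817 MB in 243 s
    w2k = throughput_mb_s(817, 1, 58)   # 817 MB in 118 s
    print(f"XP:    {xp:.2f} MB/s")
    print(f"W2k:   {w2k:.2f} MB/s")
    print(f"ratio: {w2k / xp:.2f}x")
```

That is about 3.36 MB/s on XP against about 6.92 MB/s on W2k, so the W2k copy ran roughly 2.06 times faster on the same hardware.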

Thus, as is usually the case with very difficult problems, there are multiple problems.

1. SCSI disk performance problem in NTFS.SYS fixed with q308219

2. Some SCSI drivers are more optimized than others. Usually when an OS is released, all the hardware vendors throw out whatever driver they have “working” as fast as possible to maintain their presence on the cutting edge. Many times those drivers are bug-ridden and slow. As time passes they will put out optimized and more robust drivers.

3. FlushFileBuffers is making some software run really slow as it is now writing all the way through to the disk. Software companies really need to know about this “fixed” bug and reassess when they use this call, as the cost in XP has gone up.

4. The outstanding issue I mentioned above. SCSI or IDE, disk access on XP is slower.

Anything I missed?

I have contacted Microsoft again with the latest test. I’ll let you all know how it goes.

I don’t think the FlushFileBuffers fix is going to be in w2k’s sp3. I asked my Microsoft contact and the reply was that the bug would be fixed in W2k, but there was no schedule of when. There is a bug that is fixed in sp3 that references FlushFileBuffers(q288794). But that is not the same issue.

It's not going to be fixed in SP3? MS really has W2K users over a barrel, don't they. They want to drop W2K like a hot potato, because it doesn't have all the crap like activation in it that XP does. Their recent disabling of Corporate Windows Update speaks volumes about their plans.

Probably, SP3 for W2K was withdrawn and re-written, to add "features" like product activation, and forced use of their automatic Windows Update feature. (Office 2K added product activation in a SP, if you recall, so it's not unprecedented.)

Oh yeah, don't get me started on the "where are the USB 2.0 drivers for W2K" issue. They were release-quality months ago. Where are they?

I would have to agree that there is still a disk access performance degradation in XP. However, outside of q308219 and unoptimized SCSI drivers, I’m not convinced it is specific to SCSI. Here is the test I just ran. I booted my PC with the IDE disk with XP on it. I copied a directory with 817 MB of randomly sized files from one directory to another. This took 4 minutes and 3 seconds. I powered down and replaced the hard drive with the same model hard disk with W2k installed on it. I performed the copy of the same files. It took 1 minute and 58 seconds. I also have a tool called TrueTime that is used in performance analysis of software. You can run an exe from it and it will track every method called. I normally use it to produce reports for optimization of application software. In this case, I wanted to see if the FlushFileBuffers call was used in a file copy. It is not. Thus, regardless of FlushFileBuffers and regardless of SCSI or not, there is an overall degradation in disk performance from W2k to XP.

Ahh, but switching to MS's proprietary "dynamic disk" partitioning scheme seems to magically fix the performance problems, while at the same time locking out all other OSes from accessing that partition. (Until Linux reverse-engineers it, that is.)

Thus, as is usually the case with very difficult problems, there are multiple problems.

4. The outstanding issue I mentioned above. SCSI or IDE, disk access on XP is slower.

Anything I missed?

That perhaps it (the slowdown) is intentional... nah, MS would *never* do anything like that... (snicker).

Ahh, but switching to MS's proprietary "dynamic disk" partitioning scheme seems to magically fix the performance problems, while at the same time locking out all other OSes from accessing that partition. (Until Linux reverse-engineers it, that is.)

I really haven’t mentioned anything about Basic versus Dynamic disks. I guess I figured nobody was really using a Basic disk. But since you brought it up: from my testing, and all issues mentioned in my previous post aside, there is no difference in the disk performance of a W2k Basic disk and a Windows XP Basic disk. It's when you upgrade to a Dynamic disk that W2k gets all of its speed, whereas there is no performance difference between Basic and Dynamic disks on XP. This is Issue 4: why doesn't an XP Dynamic disk perform the same as a W2k Dynamic disk?

That perhaps it (the slowdown) is intentional... nah, MS would *never* do anything like that... (snicker).

I disagree. I don’t think that Microsoft has done this intentionally. However, I am worried about the state of things. As for how we got here? I would agree that Microsoft is spending too much time and effort on its commercial interests and, as a result, has let quality slip. This is especially true for its corporate power users. We’re really in a bind.

I really haven’t mentioned anything about Basic versus Dynamic disks. I guess I figured nobody was really using a Basic disk. But since you brought it up: from my testing, and all issues mentioned in my previous post aside, there is no difference in the disk performance of a W2k Basic disk and a Windows XP Basic disk. It's when you upgrade to a Dynamic disk that W2k gets all of its speed, whereas there is no performance difference between Basic and Dynamic disks on XP. This is Issue 4: why doesn't an XP Dynamic disk perform the same as a W2k Dynamic disk?

Without going all the way back through this thread, I believe that this is not exactly correct. Changing to Dynamic Disks has not solved the problem for most people here. It does appear to make a difference, particularly in ATTO, but real-world tests show that there is no performance difference between Basic and Dynamic Disks. Dynamic Disks is essentially a partitioning scheme; I can't see how it would make any significant difference to performance.


First: a lot of topics concern Win XP complaints, so it was far from fully tested before release :(

Maybe you should try the Adaptec 2940 utility; it tests the write speed as well. That way you will know whether the ATTO tests are correct.

François


Chew, my mistake. I reviewed my test data and you are correct: there is no performance difference between a Basic disk and a Dynamic disk. I had gotten mixed up with the FlushFileBuffers issue. The FlushFileBuffers performance is the same on XP and W2k when using a Basic disk. It is only with a Dynamic disk on W2k that it is broken and causes faster than expected performance. To be sure, I went back and tested a simple file copy on a Basic disk and a Dynamic disk on W2k. They performed the same.

So this brings us back to the outstanding issue that disk performance, SCSI or IDE, is simply slower on XP.

I still haven’t heard back from Microsoft on this yet.

Guest Eugene

Microsoft's official response:

"The observations of lower benchmark scores stem from a number of bug fixes, over several product and service pack releases, where it was discovered that the FILE_FLAG_WRITE_THROUGH flag was not being correctly handled. The flag generally indicates that data should be written through all the way to disk media before the command is completed. If not observed, writeback caching may be used, yielding higher throughput numbers, but without providing the application with the requested semantics. Applications developed for high bandwidth during NT4 days may have been setting this flag as a matter of course, but NT4 did not propagate this as far as the disk.

These bug fixes only affect disk devices that a) use the SCSI command set and b) observe the "Force Unit Access" semantics on write commands. Benchmark tests performed on individual SCSI or FC hard drives are the most likely to be impacted, although some RAID controllers and arrays may also show the FUA behavior. IDE drives are not affected because there is currently no equivalent capability. One of the bugs explains the difference between Basic and Dynamic disks: we were not propagating FILE_FLAG_WRITE_THROUGH on dynamic disks, only basic. This has been fixed in the upcoming XP SP1. On Windows 2000, it has been noted that performance was good unless the user toggled the write cache setting; this, too, is a bug that will be addressed in an upcoming Service Pack – we were not checking the state of the write cache enable and it was assumed to be off.

Generally speaking, we try to observe the requested semantics and will fix any bugs that cause them to be violated. Some applications exist, however, that request WRITE_THROUGH semantics, thinking that this will help (or at least not hurt) performance. You can see the performance difference in many apps by setting the flag each way under Windows XP. For maximum throughput, the CreateFile flag FILE_FLAG_NO_BUFFERING should be used without FILE_FLAG_WRITE_THROUGH.

In cases where it's not possible to update the application, we will be taking an application compatibility approach to these applications, to allow the WRITE_THROUGH flag to be ignored on a case-by-case basis.

We will have a KB article on this very shortly. The fix associated with Q308219 is not related to this problem and we will update that KB article as well to provide better information."

Regards,

Eugene
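For readers who want to try the two flag combinations mentioned in the response above, here is a hedged Python sketch. The constant values are the ones defined in the Windows SDK headers; the helper names `create_file_flags` and `open_for_write` are invented for this example, and the `ctypes` call into `CreateFileW` only runs on Windows. It is an illustration of how the flags are passed, not a benchmark.

```python
import ctypes
import sys

# Win32 constants as defined in the Windows SDK headers.
GENERIC_WRITE = 0x40000000
CREATE_ALWAYS = 2
FILE_ATTRIBUTE_NORMAL = 0x00000080
FILE_FLAG_WRITE_THROUGH = 0x80000000
FILE_FLAG_NO_BUFFERING = 0x20000000


def create_file_flags(write_through: bool, no_buffering: bool) -> int:
    """Build the dwFlagsAndAttributes value passed to CreateFile."""
    flags = FILE_ATTRIBUTE_NORMAL
    if write_through:
        flags |= FILE_FLAG_WRITE_THROUGH
    if no_buffering:
        # With this flag, writes must use sector-aligned sizes and offsets.
        flags |= FILE_FLAG_NO_BUFFERING
    return flags


def open_for_write(path: str, flags: int) -> int:
    """Open `path` via CreateFileW and return the raw handle (Windows only).

    The caller is responsible for closing the handle with CloseHandle.
    """
    if sys.platform != "win32":
        raise OSError("CreateFileW is only available on Windows")
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.CreateFileW.restype = ctypes.c_void_p
    kernel32.CreateFileW.argtypes = [
        ctypes.c_wchar_p, ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
        ctypes.c_uint32, ctypes.c_uint32, ctypes.c_void_p,
    ]
    handle = kernel32.CreateFileW(path, GENERIC_WRITE, 0, None, CREATE_ALWAYS, flags, None)
    # INVALID_HANDLE_VALUE is (HANDLE)-1.
    if handle is None or handle == ctypes.c_void_p(-1).value:
        raise ctypes.WinError(ctypes.get_last_error())
    return handle


if __name__ == "__main__":
    # Combination suggested in the response for maximum throughput.
    fast = create_file_flags(write_through=False, no_buffering=True)
    # Combination that requests write-through semantics.
    safe = create_file_flags(write_through=True, no_buffering=False)
    print(hex(fast), hex(safe))
```

Timing the same sequence of writes through handles opened with each flag value would show the difference the response describes on affected SCSI hardware.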


I find all this interesting, but don't see how the write cache issue could be the cause of significant performance hits on large file copies. I've seen 50% reductions on write speed when copying 1+ GB files - certainly cache is not a likely issue there, is it?

I'm of the opinion that this issue may be a red herring, but I'm far from an expert here...

Guest Eugene
I find all this interesting, but don't see how the write cache issue could be the cause of significant performance hits on large file copies.  I've seen 50% reductions on write speed when copying 1+ GB files - certainly cache is not a likely issue there, is it?

I'm of the opinion that this issue may be a red herring, but I'm far from an expert here...

Are large sequential transfers affected by the lack of write caching? Yes...

There's an easy way to test this for those who've reverted to Win2k... try a TotalCopy with caching activated and deactivated via control panel.

I've done this on the testbed a while back... identical symptoms to the XP problem.

Regards,

Eugene
