ace101

Seek Times


What I want is the snappiest response from the computer that I can get. But I don't want to spend a small fortune. The machine will be used to record live music and convert it from analog to digital on the fly, but that is more reliant on disk throughput (and CPU).

The other demanding function of the computer is playing FPS and MMORPG style games. Data is read from and written to the disk almost constantly, and from many different locations on the disk. Graphic textures are read and loaded into the video card, and data is written to the disk for almost any action in the game. These are all relatively small sets of data. I've used some disk performance tools and seen that the disk spends a lot of time seeking between spots on the disk.

So I'm assuming that seek times will have a very large impact on the speed of these functions. But my question is: will 10ms "feel" faster than 13ms? At what point would there be a noticeable difference? Would 2 Raptors in RAID 0 be very noticeably faster than a Hitachi 7K500 when it comes to small files all over the disk? That's 8.1ms compared to 12.9ms. It seems like it would stand out like a sore thumb. Is this right?

I know that servers use very large raid-array caches to solve a lot of these kinds of issues, which masks seek times by queuing up writes and focusing on reads during heavy loads. Is this how the SATA disk caches are used? Is there a way to mask slower seek times with a raid controller with a much larger cache?

Ron

Guest Eugene

Seek time is a primitive metric that has not shed its facade of usefulness, even though many years and changes in many, many factors have abstracted away the importance of raw physical actuator movement.

Straight up, those who insist that systems "just feel more responsive" because of seek time are full of BS. I can't say it more frankly.

You state that you've "used some disk performance tools to see that the disk is spending a lot of time seeking." Can you elaborate? What tool?


RAID 0 has no positive effect on seek times. It can induce additional seeking delays.

Converting analog to digital audio is not a task the CPU is involved in. The ADC on the sound card does that, and any old sound card will do, within limits of the S/N of its inputs and ADC. There is only one write stream involved, which any remotely modern hard disk drive will handle effortlessly.
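To put rough numbers on that single write stream, here's a back-of-the-envelope sketch (the 96kHz/24-bit stereo figures are illustrative assumptions, not anyone's actual setup):

```python
# Write rate for an uncompressed PCM capture stream.
# Sample rate, bit depth, and channel count are illustrative assumptions.
def audio_write_rate(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Bytes per second for an uncompressed PCM stream."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

rate = audio_write_rate(96_000, 24, 2)
print(f"{rate / 1e6:.2f} MB/s")  # ~0.58 MB/s -- far below any modern drive's STR
```

Even a few dozen simultaneous tracks at that rate stay comfortably within a single modern drive's sequential throughput.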

As for game level loading: those accesses are localized. Subsequent accesses are more likely to be near previous accesses than far away. For this type of workload, cache and caching algorithms are by far the most significant factor. As Eugene notes, higher-level metrics are needed to quantify these effects. Seek time is utterly irrelevant. Old 15K SCSI disks with access times that are factors(!) faster than current 7200RPM disks get obliterated by the lazy-seeking desktop disks.

Bear in mind that the complexity of the access pattern alters the results from caching algorithm to caching algorithm. Among a given group of disks, the winner at loading one level in a game may be the slowest at other levels in the same game. It may be all over the map from game to game. Of course, all current-generation disks are close enough that the actual breakdown from test to test is insignificant. Both The Tech Report and AnandTech perform stopwatched level-loading benchmarks in their disk reviews. If you want the trivial breakdown you could have it, but I wouldn't bother, because you couldn't usefully extrapolate from the results even if the differences were consequential.

Edited by Gilbo

Guest Eugene

When it comes to tools to assess just what a drive is doing, IPEAK SPT is among the best there is. Let's take a look at one of our disk traces... FarCry.

First up, a macrocosmic look at overall disk accesses as they happen over the time of the recording:

[image: farcry_sequence.png]

The red dots are read accesses, the blue dots are write accesses. The x-axis is the time elapsed within the recording (which is about 427 seconds long), while the y-axis represents the location (remember, TB4 traces are culled from a 40 GB partition... the very end of the partition would be the 80-millionth sector, as each sector is 512 bytes).

Here we can see that reads occur in "clumps" around the same general disk area over a given short interval of time. Writes are somewhat more dispersed but nonetheless tend to happen in a relatively select few areas.

Overall, though, this "from the top" view makes it a bit difficult to summarize trends. Let's take a closer look at how these requests are distributed in relation to each other.

[image: farcry_distances.png]

This graph groups the relative % of distances from a given request to the one that immediately preceded it (this distance, btw, is "stride") into a histogram. In this particular version of the graph, the y-axis extends all the way to 70% to fully show the values for zero stride. When stride is zero, a request immediately follows its predecessor... in other words, it's sequential. So, 68% or so of this FPS game's disk accesses are sequential; not too huge a leap given the massive level loads the title is famous for. This large scale unfortunately compresses the other results and dilutes their visual impact. It -is- interesting to note that there are clumps at the 32-million and 64-million sector results. Let's zoom in a bit to make the distinctions clearer:

[image: farcry_distances_zoom.png]

Keep in mind that this is a logarithmic graph... each "bin" along the x-axis doubles in size, so each bin encompasses a range of results roughly as large as all the bins to the left of it combined. Roughly eyeballing the graph, the % of strides that span 16 million or more sectors is about 11%.
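For the curious, here is a minimal sketch of how such a stride histogram is built. The (start_lba, sector_count) trace format is a stand-in for illustration, not IPEAK SPT's actual format:

```python
import math
from collections import Counter

# Bin the stride from each request to its predecessor into power-of-two
# buckets, as in the graph above.
def stride_histogram(trace):
    bins = Counter()
    for prev, cur in zip(trace, trace[1:]):
        prev_end = prev[0] + prev[1]        # first sector after the previous request
        stride = abs(cur[0] - prev_end)     # 0 == perfectly sequential
        key = 0 if stride == 0 else 1 << max(0, math.ceil(math.log2(stride)))
        bins[key] += 1
    total = sum(bins.values())
    return {k: 100.0 * v / total for k, v in sorted(bins.items())}

# Toy trace: three sequential reads, then a long jump away and back.
trace = [(1000, 16), (1016, 16), (1032, 8), (9_000_000, 64), (1040, 16)]
for sectors, pct in stride_histogram(trace).items():
    print(f"{sectors:>10} sectors ({sectors * 512:>13,} bytes): {pct:.0f}%")
```

The `sectors * 512` conversion is the same arithmetic as the cheat sheet below.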

BTW, for those too lazy to do the math :P

0 sectors = 0 bytes

16 sectors = 8 kilobytes

512 sectors = 256 kilobytes

16K sectors = 8 megabytes

512K sectors = 256 megabytes

16M sectors = 8 gigabytes

11% of requests stride more than 8 gigabytes... 80%, however, are within 8 megabytes. Given a decent read-ahead buffer algorithm, a large majority of those requests will be cached outright, while the remainder will probably be on the same track (no seek) or at most a few tracks away (track-to-track seeking).
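A toy illustration of that buffering effect; the 2048-sector read-ahead window is an invented parameter, and real firmware is far more sophisticated:

```python
# A read-ahead buffer that keeps the sectors following the last request.
class ReadAheadBuffer:
    def __init__(self, ahead_sectors=2048):
        self.ahead = ahead_sectors
        self.start = self.end = 0            # currently buffered LBA range

    def access(self, lba, count):
        hit = self.start <= lba and lba + count <= self.end
        # Fetch the request plus read-ahead (a real drive is smarter than this).
        self.start, self.end = lba, lba + count + self.ahead
        return hit

buf = ReadAheadBuffer()
print([buf.access(lba, 16) for lba in (0, 16, 32, 9_000_000)])
# [False, True, True, False] -- sequential reads hit; the long stride misses
```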

Ok, fine... does that 11% occupy a large % of the disk's time? We can take a look at that too :)

[image: farcry_time.png]

Again, just eyeballing results, the aforementioned 11% of requests that have to stride 8 gigabytes or more occupy about 25% of the disk's time. The remainder is either sequential, buffered by the drive, or a minimal distance away from the previous request.
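Continuing the sketch from above, the same strides can be weighted by an assumed per-stride service time rather than by request count, which is roughly the question this graph answers. The millisecond figures here are invented for illustration:

```python
def toy_service_ms(stride):
    # Invented figures: sequential access streams cheaply; long strides pay
    # a full seek plus rotational latency.
    if stride == 0:
        return 0.1
    if stride < 16_000:
        return 0.5
    return 12.0

def pct_time_on_long_strides(trace, cutoff_sectors=16_000_000):
    strides = [abs(cur[0] - (prev[0] + prev[1]))
               for prev, cur in zip(trace, trace[1:])]
    total = sum(toy_service_ms(s) for s in strides)
    far = sum(toy_service_ms(s) for s in strides if s >= cutoff_sectors)
    return 100.0 * far / total

# ~99.6 on this tiny toy trace; the real FarCry trace, with its ~68%
# sequential accesses, lands near the 25% figure above.
print(pct_time_on_long_strides([(1000, 16), (1016, 16), (90_000_000, 64), (1040, 16)]))
```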

As you improve the performance of actuator movement spanning 8 - 40 GBs (still a relatively small figure on today's drives), you're in effect addressing this 25%. Though not formally published here at SR (or, hell, anywhere else :P), IPEAK SPT's AnalyzeDisk can easily assess a given drive's random access performance with given stride distances. Let's take a look at two contemporary drives:

The Hitachi Deskstar 7K400:

[image: 7K400_rsp.png]

And the mighty Fujitsu MAU3147:

[image: MAU3147_rsp.png]

These are three-dimensional graphs... along the x-axis, you have stride, on the y-axis, response time (access time), and on the z-axis (color), the % of access times that fell into the particular stride/time combo.

The average random access time for the 7K400 with stride distances of 16 to 64 million sectors (that is, 8 to 32 gigabytes) is perhaps 7 milliseconds. For the MAU3147, we're looking at about 5 milliseconds. What this means is that for that remaining 11% of disk accesses under FarCry that occupy about 25% of the time spent to retrieve, your improvement when going from a good ATA drive (the 7K400) to a drive that has a supremely low seek time (the MAU3147) is the difference between 7 and 5 milliseconds per access... or about a 40% difference WITHIN the handful of accesses where this difference actually exerts itself.

In the past, I've had requests come in for short-stroked access times... if a given multi-hundred-gigabyte drive turns in a certain access time score, wouldn't it do even better if it were partitioned? What if we take this to an extreme and partition the drive to 10 MB or so in hopes of simulating that elusive "locality"? Follow this line of thought far enough and you eventually arrive at (tada) the SR DriveMarks and game traces!

Has anyone sat down and thought about what "I/Os per second" are? They're the inverse of response time, measured in milliseconds. In fact, RankDisk (the playback component of IPEAK SPT) delivers results in milliseconds. I divide them into 1000 to yield IOPS figures. I chose this route way back with Testbed3 to avoid the silly "no one can feel milliseconds anyway" argument. If you can't feel milliseconds, how can you possibly feel nanoseconds, the inverse unit to GHz? Anyway, that's another topic.
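The conversion in both directions is trivial:

```python
# "I/Os per second" is just the reciprocal of the mean service time in ms.
def iops_from_mean_ms(mean_ms: float) -> float:
    return 1000.0 / mean_ms

def mean_ms_from_iops(iops: float) -> float:
    return 1000.0 / iops

print(f"{iops_from_mean_ms(1.31):.0f} IO/s")  # ~763
print(f"{mean_ms_from_iops(763):.2f} ms")     # ~1.31 -- the 7K500 figure below
```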

The SR DriveMarks and gaming tests are "random access time" measurements that take into account real-world conditions such as the OS's caching system, data localization, drive buffers and strategies, etc., and deliver a response time (access time) result that is the average yielded by the given drive in the given application.

When you see the Deskstar 7K500 doing "763 IO/s" in FarCry, what you're seeing is that the Deskstar 7K500 turns in an average random access time of (1000/763) 1.31 milliseconds in FarCry.

All of the above is why high-level results such as the DriveMarks and game captures should vastly supersede "random access time" in the minds of readers when evaluating potential drives for single-user use.

Whew. Who's still with me? :)

You state that you've "used some disk performance tools to see that the disk is spending a lot of time seeking." Can you elaborate? What tool?

Weeeelll, I use a combination of Microsoft Performance Monitor and DiskView. However, I have to interpret the results, so I'm making certain assumptions about what's actually happening at the lowest level and basing my ideas on those assumptions. So basically I'm saying I'm pretty sure what I'm saying is correct. :unsure:

That being said, WOW is IPEAK SPT awesome! I had no idea something like that existed. And thanks for the incredible post concerning FPS style games! Sorry for two sentences in a row with exclamation points, but that shines a whole different light on my illusion of how disks interact with FPS style games. There is a lot more sequential access than I had expected, and the actual time spent seeking is way way less than I had ever imagined.

But not only that, I have a better understanding of what I should be looking at when I evaluate one drive over another, and I understand much better why you guys do your analysis the way you do. Thanks for such an informative post!

Ron


Do more platters alter the overall time spent seeking at all? In other words, with 5 platters, are the chances higher that you would need to seek to a spot on a different platter than with a drive that has only 3 platters? Granted, most applications would have the majority of their data residing on one platter, but I could see an application split between two platters. Is there any additional latency or overhead associated with this?

Ron


If FarCry doesn't span 8GB and the drive isn't fragmented, I would assume those large-stride accesses are to system files, which leads me to believe a dedicated OS drive would remove them. Is IPEAK able to show what files a given access is accessing? If not, a retest with a dedicated OS drive would prove it.

Guest Eugene
Do more platters alter the overall time spent seeking at all? In other words, with 5 platters, are the chances higher that you would need to seek to a spot on a different platter than with a drive that has only 3 platters? Granted, most applications would have the majority of their data residing on one platter, but I could see an application split between two platters. Is there any additional latency or overhead associated with this?

Ron


I think you have it the wrong way around. Data does not first occupy one platter before starting with another. Rather, data is arranged in concentric cylinders. Once one TRACK is filled on a given platter, the data then moves to the same track on the next platter. Hence the very split you refer to happens all the time, not "rarely."

No, there's generally no latency associated with this. Today's disks use an offset arrangement: given head-switch time and rotation speed, the sector that picks up from the last sector of the previous platter is placed at the spot on the new platter most likely to be under the associated read/write head when the switch completes.

The result is that net actuator movement for a given span of LBA sectors is smaller on larger-capacity drives than on smaller ones. The overall speed increase due to this lessened movement depends on the stride lengths themselves; as the read seek profiles above demonstrate, actuator movement speed versus stride distance is not linear.

If you're wondering how that affects the given FarCry trace above, it doesn't. All the accesses contained within are simply linear LBA sectors virtually ordered one after another. Drive geometry does not come into play there, but rather at a lower level. It's what makes these trace playbacks so applicable.
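To make the cylinder-major layout concrete, here is an idealized LBA-to-geometry mapping; the head and sector counts are invented, and real drives add zoned recording plus the skew offsets described above:

```python
# Idealized mapping: a track fills, then the same track on the next surface,
# and only after a full cylinder does the actuator move.
def lba_to_chs(lba, heads, sectors_per_track):
    cylinder, rem = divmod(lba, heads * sectors_per_track)
    head, sector = divmod(rem, sectors_per_track)
    return cylinder, head, sector

# With 8 heads and 1000 sectors per track, LBAs 0-9999 span just two
# cylinders -- one reason more platters mean less actuator movement.
print(lba_to_chs(9_999, heads=8, sectors_per_track=1000))  # (1, 1, 999)
```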

Guest Eugene
If FarCry doesn't span 8GB and the drive isn't fragmented, I would assume those large-stride accesses are to system files, which leads me to believe a dedicated OS drive would remove them. Is IPEAK able to show what files a given access is accessing? If not, a retest with a dedicated OS drive would prove it.


You're probably correct. Details like which files encompass the LBA locations and sizes requested are of a higher level than what WinTrace32 and RankDisk deal with... they don't care, nor should they.

An installation's amount of fragmentation is one of the "input variables" that has been debated in the past here in the SR community. Good arguments span the range from a system defragmented as much as possible to one intentionally fragmented as much as possible. The installation from which all five single-user traces were drawn featured light fragmentation.

The ability to "snapshot" a given state of fragmentation, btw, is another strength of the exacting record-playback system. If one were running the given applications themselves (executing the Winstones, say, rather than captures), fragmentation state would become a confounding variable that's different in every run on every tested drive.


Beautiful data Eugene. Just awesome. I love to see the numbers.

Do more platters alter the overall time spent seeking at all? In other words, with 5 platters, are the chances higher that you would need to seek to a spot on a different platter than with a drive that has only 3 platters? Granted, most applications would have the majority of their data residing on one platter, but I could see an application split between two platters. Is there any additional latency or overhead associated with this?

Ron


Depends on the HDD make; there are drives with vertical formats and drives with horizontal formats.

Is IPEAK able to show what files a given access is accessing?


The Event Tracing for Windows (aka ETW) instrumentation in Windows XP and up has enough information to do what you're asking. It includes filename, disk offset, disk service time, I/O time (which includes queueing), process, and a few other things. There aren't any tools shipping in Windows that display this, though they could be written.

Beautiful data Eugene.  Just awesome.  I love to see the numbers.


Agreed! This thread has reminded me why I've spent so much time here. This stuff is fascinating!

(Or does that say more about my obsession with technical details & explanations than about the community here? :unsure: )

Guest Eugene

Given the recent return of various "should I raid?!?!??!" posts in the community, I should also take time to point out that the FarCry data presented above is relatively STR-heavy compared to the Office and even the High-End patterns that are broken down in the TB4 article.

Despite this, note that while sequential transfers represent about 70% of all accesses in FarCry, the drive spends only 15% of its time on them. Doubling STR through a two-drive array halves this 15%.

Hence, the 10-20% improvement we see in SR's tests when going from a single drive to two-drive RAID 0 comes from the doubling in capacity (which, in the past, has more or less established itself as a 7-10% performance boost in our tests) plus this small improvement.

In other words, sequential transfers already complete so quickly that they have in effect written themselves out of the performance equation. Doubling the performance of a factor that exerts such a small effect nets a small improvement... as one should expect.
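As an Amdahl's-law sanity check on those figures (a sketch using the 15% time share above; the 2x STR scaling is idealized):

```python
# If sequential transfers occupy 15% of total time and RAID 0 doubles STR,
# only that 15% slice shrinks; the seek-bound remainder is untouched.
def raid0_speedup(seq_fraction=0.15, str_scaling=2.0):
    new_total = (1.0 - seq_fraction) + seq_fraction / str_scaling
    return 1.0 / new_total

print(f"{(raid0_speedup() - 1) * 100:.1f}% faster")  # ~8.1% -- a modest gain
```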

In other words, sequential transfers already complete so quickly that they have in effect written themselves out of the performance equation. Doubling the performance of a factor that exerts such a small effect nets a small improvement... as one should expect.


This is something I've long considered to be one of the dominant reasons behind the poor performance improvement offered by striping. It's great to have exact numbers to quantify it.

Is IPEAK able to show what files a given access is accessing?


The Event Tracing for Windows (aka ETW) instrumentation in Windows XP and up has enough information to do what you're asking. It includes filename, disk offset, disk service time, I/O time (which includes queueing), process, and a few other things. There aren't any tools shipping in Windows that display this, though they could be written.


Sysinternals' long-standing Filemon and Diskmon do such tracing. It'd be more convenient if iomon did it, but you could probably use Sysinternals' tools in conjunction to reveal what's going on. Have you tried making use of these before, Eugene?


Basically, starting Filemon or Diskmon logging (each access includes a timestamp) and then running the canned benchmark should reveal what is accessed and when, in sequence, crossed with access load.

In other words, sequential transfers already complete so quickly that they have in effect written themselves out of the performance equation. Doubling the performance of a factor that exerts such a small effect nets a small improvement... as one should expect.


But RAID 0 does not 'just' double STR.

If stripes are big enough or accesses are aligned, then one drive can serve one request while the next drive works on another.

In RAID 1 any drive can serve any read request, so it may be nice to benchmark how that would affect game/level load times.

However, this does require a 'high queue depth', and I'm afraid most apps/games only use sync IO instead of async IO.
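A toy timing model of that point (the service time and request counts are invented):

```python
import math

SERVICE_MS = 10.0  # invented per-request service time

def sync_io_ms(n_requests):
    # Queue depth 1: each request waits for the previous one to finish.
    return n_requests * SERVICE_MS

def async_io_ms(n_requests, n_drives=2):
    # Queue depth >= n_drives on a stripe: independent requests overlap.
    return math.ceil(n_requests / n_drives) * SERVICE_MS

print(sync_io_ms(8), async_io_ms(8))  # 80.0 vs 40.0 -- the overlap only
                                      # materializes with async I/O
```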

Edited by Olaf van der Spek

Basically, starting Filemon or Diskmon logging (each access includes a timestamp) and then running the canned benchmark should reveal what is accessed and when, in sequence, crossed with access load.


If I recall correctly, these utilities log the file accesses, not the actual disk cluster accesses. They are therefore useless for this kind of benchmarking.


I don't think you understand the significance of SR's tests, keith.

Tracking cluster accesses, and therefore the physical location of requests on the platter relative to other requests, is the most important thing when analyzing a workload, since it is the sole determinant of how a disk reacts to that workload and how it performs. If you read the article on SR's Testbed3, you'll get a much better idea of what is going on.


Very interesting, Eugene.

A drive may spend ~25% of its time servicing the 16M+ sector requests, but uncached requests encompass much more than just those "long distance calls".

I would say that analyzing differences in seek performance beyond 16K sectors, as opposed to 16M, is more representative when answering the question of how significant seek performance is to responsiveness.

Reframing the analysis to look at 16K+ sector requests, we find that the drive spends ~60% of its time servicing uncached requests. And that means the seek performance of a drive is more significant than you're giving it credit for.

Am I missing something here?


I think Eugene's "electron cloud" map of seek times explains neatly why average seek time doesn't tell us much; most accesses are fairly localized on the disk. Unfortunately, only SCSI drives seem to list data for average track-to-track seek, which may be better but still does not adequately describe uncached performance. The 18.4GB 36Z15 (short-stroked and quicker than the high-capacity model SR tested) has average seek times shorter than even an Atlas 15K II, but look at the awful track-to-track time: http://www.xbitlabs.com/articles/storage/d...eetah-15k3.html

http://www.xbitlabs.com/articles/storage/d.../ibm-36z15.html

Note how each successive generation has quicker track-to-track despite increasing platter density.

Excellent work, Eugene!

