Everything posted by doctorj

  1. I'd like to know what really limits the performance of desktop applications. It is conventional wisdom that they are "I/O limited," i.e., waiting for the hard disk. And this in turn is because of the large seek times involved in random I/O. Then, SSDs come along and reduce seek time by a factor of 100 -- and performance goes up only a few percent. If seek times are really the bottleneck, why don't we see desktop apps with instantaneous response on an SSD? Why are they only marginally faster? Here are some guesses: Most mechanical drive requests are serviced from the buffer? Operating systems' I/O strategies are tuned for mechanical drives? Sequential transfer rate is actually the bottleneck? File systems and storage layouts are optimized for mechanical drives? Performance is limited by I/O transaction overhead? Speculation is nice, but I'd prefer answers backed up by real data.
  2. The 'Caviar vs. Raptor' graph is labeled "Command Queueing Disabled" but the Caviar's scores are actually with command queueing enabled. The Caviar would put up a better showing vs. the Raptor if you actually graphed the results with both disabled.
  3. "The trace doesn't record a delay as you have described. Whether the request is serviced by a cache hit or not, it is only recorded in the trace as a request of the particular data block(s) along with the current queue depth. The slower response of the drive with the smaller cache in your example doesn't change the capture or playback stage of the process. Well, except for the fact that the slower drive will play back the trace slower, which is the whole idea of the trace/playback method of benchmarking." Then what does this mean? "That's correct, request order and interarrival times are properly preserved."
  4. Since the TB4 benchmarks are now, well, benchmarks, couldn't you just run the actual Veritest software instead of playing back captured traces of it? That would at least eliminate the CQ/request-reordering/thread-timing issues. I imagine the benchmark is scriptable so it shouldn't be too labor intensive. Either that, or we need to see some data on the variation between actually running the benchmark and playing back a trace, on several different drives with/without CQ. And just to throw some more fuel on the fire, I have to imagine that the buffer size of the drive on which the trace was recorded significantly affects the timing of requests as well. Consider a trace recorded on a drive with an 8MB buffer. If a request misses the cache, that particular drive will take a few milliseconds to fetch the data from the platter, and so the trace will record a delay before the next request is issued. However, if the same request is serviced instantly from a 16MB buffer drive because it's in the cache, then the next request won't have to wait. The trace adds an artificial delay and skews the results toward drives with the same buffer size as the reference drive. Mmmm, life is complicated.
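The buffer-size concern in the last post can be sketched with a toy model. This is only an illustration of the hypothesis, not a description of how the actual capture/playback tool works. Everything in it is an assumption made up for the example: the 8 ms miss cost, the 0 ms hit cost, the 1 ms of application "think time" between requests, the hit/miss patterns, and the function names `run_live` and `replay_timed`. It assumes the application issues each request only after the previous one completes, and that playback issues each request no earlier than its recorded timestamp.

```python
def run_live(service_ms, think_ms):
    """Closed loop: the app issues each request think_ms after the
    previous one completes. Returns (issue timestamps, finish time)."""
    issue_times, clock, finish = [], 0, 0
    for s in service_ms:
        issue_times.append(clock)
        finish = clock + s          # this request completes
        clock = finish + think_ms   # app thinks, then issues the next one
    return issue_times, finish


def replay_timed(issue_times, service_ms):
    """Playback that preserves recorded issue times: a request is never
    issued before its recorded timestamp (one request at a time)."""
    finish = 0
    for t, s in zip(issue_times, service_ms):
        start = max(t, finish)
        finish = start + s
    return finish


# Hypothetical per-request service times in ms (8 = cache miss, 0 = hit).
small_buffer = [8, 8, 8, 8]   # reference drive: every request misses
large_buffer = [8, 0, 0, 8]   # assumed: the larger buffer turns two into hits

trace, ref_finish = run_live(small_buffer, think_ms=1)
live_large = run_live(large_buffer, think_ms=1)[1]
replayed_large = replay_timed(trace, large_buffer)

print("trace timestamps:", trace)
print("reference drive, live run:", ref_finish, "ms")
print("large-buffer drive, live run:", live_large, "ms")
print("large-buffer drive, timed replay:", replayed_large, "ms")
```

Under these assumptions the large-buffer drive finishes a live run well ahead of the reference drive, but a replay pinned to the recorded timestamps finishes at the same time as the reference drive, because the waits caused by the reference drive's misses are baked into the trace. Whether the real tool behaves this way depends on whether it records issue times or only request order and queue depth, which is exactly the point in dispute above.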