I've seen this mentioned a couple of times, but I must have missed where this was revealed. Can somebody point me to it?
The "editor's choice" seems to have had a significant effect on the results. I think part of the problem here is the extremely long period (in computer terms) between benchmark refreshes. It's been 3 or 4 years since TB3 was put together, and SR have been benchmarking new drives, that have likely been tuned to perform for today's applications, on benchmarks based on older applications.
Having said that, for that possible explanation to ring true, you'd expect to see newer drives benefiting under TB4 and older drives losing ground. That doesn't seem to be the case.
So it starts to look like a case of whether the drives have been tuned better or worse for the particular applications used in a benchmark. This would explain why other review sites produce end results that aren't always consistent with SR's. While most other review sites don't use a testing methodology that matches SR's, many are still done well enough that their results are valid, even where they contradict SR's. It seems to me that to provide the most broadly applicable, general-use benchmarks, you need to use as large an application set as possible. SR have always said that the one point in their methodology they would consider conceding as inaccurate is the set of applications used (my wording) - perhaps there was more to this than we previously expected?
The DriveMarks have been designed as a way to measure the performance of the drives in 'real world' usage, but in isolation from all other factors. It's never been implied that a drive that's twice as fast in one of these benchmarks will result in the entire system running twice as fast. If a trace is taken of activity that only has disk activity during 20% of that time period, a drive that plays back that trace twice as fast obviously doesn't make any difference to the other 80% of that time period.
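To put some rough numbers on that (purely hypothetical figures, just to illustrate the 20% / twice-as-fast point above - nothing SR has actually published):

    # back-of-the-envelope check on the 20% disk-activity example
    trace_seconds = 100.0        # hypothetical length of the captured trace
    disk_busy_fraction = 0.20    # disk only active for 20% of that time
    drive_speedup = 2.0          # a drive that replays its portion twice as fast

    disk_time = trace_seconds * disk_busy_fraction        # 20 s of disk activity
    other_time = trace_seconds - disk_time                # 80 s the drive can't touch
    new_total = other_time + disk_time / drive_speedup    # 80 + 10 = 90 s

    print(trace_seconds / new_total)   # ~1.11x overall, nowhere near 2x

So a drive that doubles its benchmark score only trims the overall elapsed time by about 10% in that scenario.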
What might be interesting, along those lines, is the overall time elapsed during the trace captures, and in addition the time it takes to play back each trace. That would help put the benchmarks into proper context. The DriveMark numbers are derived by simply dividing the total number of IOs in a trace by the number of seconds it takes to play back the trace, so the information is probably available, just not published. Eugene?
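For what it's worth, here's roughly what that calculation looks like (all figures made up for illustration, since the per-trace totals aren't published):

    # sketch of the score calculation described above: total IOs / playback seconds
    total_ios = 250_000          # hypothetical IO count captured in the trace
    playback_seconds = 180.0     # hypothetical time to replay the trace
    capture_seconds = 1200.0     # hypothetical elapsed time during the original capture

    score = total_ios / playback_seconds
    print(round(score), "IO/s")                 # the DriveMark-style number
    print(playback_seconds / capture_seconds)   # how compressed the playback is vs. the capture

Publishing the capture and playback times alongside the IO counts would make it much easier to judge how much of a real-world difference the scores actually represent.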