I think you have some interesting arguments there, but I still disagree with your idea that this is somehow a controller-only problem and not a drive issue. I think we actually agree on the final point, though: you should NOT need more expensive drives just to get a feature that should be in the drives to begin with, but that manufacturers are disabling or leaving out to try to force people to buy more expensive drives. I also agree that for a regular user with a small array, a controller that delays and waits on drives (or can be configured to do so) is an acceptable solution. But just because it works for some doesn't mean this isn't a real problem for others, and "ignoring it" isn't an option when real data is on the line.
My take on the same arguments is basically like this:
Re: 1) Most controllers do, and I want mine set at 7 seconds so the array doesn't lock up during heavy writes and start dropping data once the write cache is consumed. (I have had this happen before in a high-performance system... it isn't pretty.)
Re: 2) There isn't an "elsewhere" to read the data from. RAID reads data in stripes across all drives participating in the array, both so the integrity of the data can be confirmed and so the necessary data is on hand should it be altered and the parity recalculated. The drive needs to complete its request before the card can continue its calculations; this is what locks up the controller while it waits for a response.
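To make the stripe point concrete, here is a simplified sketch of RAID-5-style parity (real controllers do this in firmware, of course): the parity block is the XOR of the data blocks in a stripe, so recalculating it, or rebuilding a lost block, requires reading the whole stripe. The controller literally cannot proceed until every drive has answered.

```python
# Simplified RAID-5-style parity sketch: parity is the byte-wise XOR of the
# data blocks in a stripe, so every drive's block is needed to recompute it.
from functools import reduce

def parity(blocks):
    """XOR all data blocks in a stripe to produce the parity block."""
    return bytes(reduce(lambda a, b: a ^ b, chunk) for chunk in zip(*blocks))

def rebuild(surviving_blocks, parity_block):
    """Reconstruct a missing block from the survivors plus the parity."""
    return parity(surviving_blocks + [parity_block])

stripe = [b"\x0f\x0f", b"\xf0\xf0", b"\xaa\xaa"]      # data blocks on 3 drives
p = parity(stripe)                                    # stored on a 4th drive
assert rebuild([stripe[0], stripe[2]], p) == stripe[1]  # drive 2 "failed"
```

If one drive stalls mid-stripe, every operation that touches that stripe stalls with it, which is exactly the lock-up described above.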
Re: 3) We should reinforce choice in design: what you believe to be correct is apparently what I consider completely incorrect behavior. Good RAID cards allow me to specify what I want; the hard drive manufacturers should not be dictating the terms on which I use their drives.
Re: 4) Actually, there is a big difference between "not responding" and "disconnected". Detecting an unresponsive drive takes several attempts at communication to confirm the drive really is completely unresponsive; at that point it is treated the same as a disconnect. Disconnect events, by contrast, can be triggered instantly by the hot-plug notification system, so the controller can cope with them much faster.
In fact, many cards do drop a drive because it doesn't respond fast enough. It took too long to respond (and some drives don't respond while they are doing something like attempting error recovery on a sector, which is exactly what TLER is supposed to fix), so it's assumed the drive has lost power, been unplugged, had its firmware lock up, or hit any of a host of known problems. For data integrity, the drive is knocked off the array. Also, bad-sector relocations, while automatic on the controller, absolutely send notifications of the problem encountered; they are in no way "hidden". This is precisely because drives developing read/write problems are quite likely to fail soon, so the card keeps you apprised (through userland utilities, obviously). TLER simply allows the card to recover gracefully and promptly, and to make a note of the problem for someone to fix.
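The drop-a-slow-drive behavior above can be sketched roughly like this (all names hypothetical; real controllers implement this in firmware). With TLER the drive gives up on a bad sector and reports the error within the timeout, so the card can handle it; without TLER the drive goes silent while it retries internally, and eventually gets treated like a disconnect.

```python
# Hypothetical sketch of a controller's read path: retry a few times, and if
# the drive never answers within the timeout, treat it the same as a
# disconnected drive and drop it from the array for data integrity.
class DriveTimeout(Exception):
    pass

def controller_read(drive, lba, timeout_s=7.0, retries=3):
    for _ in range(retries):
        result = drive.try_read(lba, timeout_s)
        if result is not None:
            return result            # data, or the error report TLER allows
    drive.online = False             # knocked off the array
    raise DriveTimeout(f"no response after {retries} attempts")

class FakeDrive:
    """Stand-in: with TLER it reports a read error inside the timeout;
    without TLER it silently keeps retrying the sector and never answers."""
    def __init__(self, tler):
        self.tler = tler
        self.online = True
    def try_read(self, lba, timeout_s):
        return "read-error" if self.tler else None
```

With `FakeDrive(tler=True)` the array stays intact and the error gets logged; with `tler=False` the same read ends with the drive dropped, which is the failure mode this whole thread is about.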
On top of all that, you shouldn't need a "more expensive" drive to fix the TLER problem. It is part of the ATA-8 spec; as far as I am concerned, if the drives claim ATA-8 compliance they should implement it. The "RAID Enabled" drives are a bunch of bull, though, and that is really the entire point here. Western Digital and others have tried to mark the drives up 50% when basically the only thing they change is enabling the ability to control TLER. Supposedly they pick better batches and such too, but I am skeptical of any difference between the drives other than firmware-locked features. I shouldn't need a new controller or more expensive disks; as far as I am concerned the controllers are doing everything the correct way and the drives should be obeying them, not the other way around.
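For drives whose firmware does expose the ATA-8 mechanism (SCT Error Recovery Control), smartmontools can set the timeout from userland. A sketch of building that invocation, with /dev/sda as a placeholder device path; note that smartctl takes the timeouts in deciseconds:

```python
# Build the smartctl command that sets SCT Error Recovery Control, the ATA-8
# feature behind TLER, on a drive whose firmware exposes it.
# /dev/sda is a placeholder path; smartctl wants timeouts in deciseconds.
def scterc_command(device, read_timeout_s=7.0, write_timeout_s=7.0):
    rd, wr = int(read_timeout_s * 10), int(write_timeout_s * 10)
    return ["smartctl", "-l", f"scterc,{rd},{wr}", device]

print(" ".join(scterc_command("/dev/sda")))
# -> smartctl -l scterc,70,70 /dev/sda
```

On many drives this setting does not survive a power cycle, so it typically has to be reapplied at boot; and of course the "RAID Enabled" drives we're complaining about are exactly the ones where the cheaper model refuses the command.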
In terms of ZFS, until it supports resizable arrays it isn't an option for most of my projects, which periodically require reallocating resources between machines.