dietrc70, on 10 February 2013 - 09:08 PM, said:
These are the options I can think of:
1. Shrinking a partition is not very difficult, and you could create a new one with default clusters, move the applications to the new partition, and hard-link the directories. Your users would not see any change. The problem with this option is that it would slow the array by forcing it to seek to the new partition. I don't really like this option.
2. Reformat the whole array with 8 KB clusters, which are large enough for your array and will eliminate most of your slack space issues. This would probably be the cleanest and best solution.
3. On a side note, I wonder if you should consider changing your array configuration. Your array is huge, and I'd be concerned about rebuild times and a possible two-drive failure on a 10-drive array. RAID 6 or 60 might be worth considering.
I've seen endless arguments on cluster size vs. stripe size. My own impression is that it's usually best to use the manufacturer's recommended settings for your application, and then just go with the default (minimum) cluster size for your array size. Perhaps a RAID 5 expert could give more specific recommendations. Since you have so many small files, stripes in the smaller range might be better.
Well, that's just the applications directory (the directory the applications are served out of). In another directory, I have a LOT of really big files, and there the ratio of file size to size on disk is much better (somewhere around 0.9-0.95). And that's the thing too: I couldn't predict what the installation files would unpack to, or whether the application will run over the network. (Some do. Some won't. Some won't even let me install to a mapped network drive.)
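For anyone who wants to put numbers on the slack-space issue, here is a rough sketch of how that size vs. size-on-disk factor falls out of the cluster size. The file sizes below are made up purely for illustration (they are not my actual data), and it assumes every file is simply rounded up to a whole number of clusters, ignoring any filesystem tricks like storing tiny files inside metadata records.

```python
def allocated_bytes(file_size, cluster_size):
    """Space a file occupies on disk when rounded up to whole clusters."""
    # Ceiling division using integers only; an empty file takes no clusters here.
    return -(-file_size // cluster_size) * cluster_size


def utilization(file_sizes, cluster_size):
    """Ratio of total file size to total size on disk (1.0 means no slack)."""
    total = sum(file_sizes)
    on_disk = sum(allocated_bytes(s, cluster_size) for s in file_sizes)
    return total / on_disk if on_disk else 1.0


if __name__ == "__main__":
    # Hypothetical workloads: many small application files vs. a few big files.
    small_files = [700, 1500, 3000, 5000] * 1000
    big_files = [700 * 1024 * 1024 + 123] * 10

    for cluster_kb in (4, 8, 64):
        cluster = cluster_kb * 1024
        print(
            f"{cluster_kb:>2} KB clusters: "
            f"small files {utilization(small_files, cluster):.3f}, "
            f"big files {utilization(big_files, cluster):.3f}"
        )
```

The point is just that the factor tanks for lots of small files as the cluster size goes up, while big files barely notice, which matches what I'm seeing between the two directories.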
(As for backup, I'm starting to think about getting an LTO-3 drive and doing a grandfather-father-son type dealio. But that's another discussion. ORIGINALLY, the plan was for me to build a second live server running ZFS and then run rsync weekly, but I dunno.)
I know that usually, the discussion around stripe vs. cluster size is about performance. In my case, I'm not really concerned about that. (I've got a 12-port SATA 3 Gbps controller, and my network is only 1 Gbps, which means that I'm much more likely to bottleneck on and saturate the network long before I run out of performance headroom on the array.) So this ends up being an optimization of a different kind - one that I don't think very many people have done.
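Back-of-the-envelope version of the bottleneck argument above, as a quick sketch. It only compares raw link ceilings: it assumes SATA's 8b/10b line encoding (8 payload bits per 10 bits on the wire) and ignores protocol overhead on both sides, and it says nothing about what the drives can actually sustain, especially on small-file random I/O.

```python
def usable_mb_per_s(line_rate_mbit, payload_bits=1, line_bits=1):
    """Convert a line rate in Mbit/s to MB/s, scaled by encoding efficiency."""
    return line_rate_mbit * payload_bits / (line_bits * 8)


# 1 Gbps Ethernet, treated as a raw ceiling (no protocol overhead subtracted).
network_mb_s = usable_mb_per_s(1000)

# One SATA 3 Gbps link with 8b/10b encoding.
sata_link_mb_s = usable_mb_per_s(3000, payload_bits=8, line_bits=10)

if __name__ == "__main__":
    print(f"network ceiling:   {network_mb_s:.0f} MB/s")
    print(f"single SATA link:  {sata_link_mb_s:.0f} MB/s")
    print(f"link-to-network ratio: {sata_link_mb_s / network_mb_s:.1f}x")
```

So even a single port's link ceiling is well above what the network can carry, never mind the whole array, which is why I'm treating this as a space problem rather than a speed problem.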
All 'round übergeek.