First, I'll concede the title for longest post in this thread to you.
Those are older documents; however, the 3945E's are nice routers. I just put some in here at work for some 600Mbps MPLS VPNs. As for switches, the 3560X's are the current low-end line. As I mentioned, they're actually not that bad in price (barring your taxes/fees, ouch!).
Good that you have a backup plan; most places forget about it until it's too late. Doing partition-based backups (VSS) would greatly improve performance, as it avoids individual file locking.
You have it partially right: yes, you'll have a max cap at your Inet speeds, which is good. The problem is that the request data itself is going to be randomized and small, which requires a lot of IOPS from your subsystem. Some tuning can help (e.g., caching, turning off access-time updates, etc.), but ultimately you're at the mercy of the drive subsystem. This is where more 'spindles' wins.
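As a rough illustration of why spindle count matters here, a back-of-envelope sketch (all the numbers below are assumptions for the example, not measurements of your setup):

```python
# Back-of-envelope spindle estimate for a small, random I/O workload.
# Every figure here is an illustrative assumption, not a measurement.

iops_per_disk = 100        # ballpark for a 7200rpm SATA drive on random I/O
avg_io_size_kib = 16       # assumed average request size
link_mbps = 100            # assumed Inet link feeding the requests

# Convert link speed into requests per second at the assumed I/O size.
target_iops = (link_mbps / 8 * 1024) / avg_io_size_kib

spindles = target_iops / iops_per_disk
print(f"~{target_iops:.0f} IOPS -> roughly {spindles:.1f} spindles (before any RAID write penalty)")
```

Caching and turning off access-time updates shave some of that off, but the floor is set by the drives.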
This is good, as it helps separate your I/O workloads. At this point I should make a comment on security. Generally, allowing Inet directly into an internal system is a bad idea; your willingness to have it as an end repository helps with functional network separation of your environment. Even if you do not have a firewall itself, you can create a 'poor man's firewall', which is better than nothing.

A simple layout would be: an external interface on the router to your provider, with ACLs to filter traffic (remove 'impossibilities' like incoming RFC1918 ranges or seeing your own IP coming back at you, plus your ingress filter for the types of traffic you want to allow to your DMZ). Then a DMZ segment on a separate router interface, with its own inbound ACL for what you want to allow. Keep the mindset that this is an 'untrusted' segment, since it has direct Inet reaching it, so it should not initiate connections back internally; or at the very least scrutinize exactly what you allow, as this is a popular attack method (transitive trust). Lastly, an 'internal' interface (the one facing your internal LAN), again a separate interface on the router, with an ACL allowing only specific items/ports out to the Inet and towards your DMZ. This helps block internal virus issues from spreading to your DMZ as well as to other sites.

I would also suggest using private IPs (RFC1918) internally and on your DMZ, and NATing (1:1) for your DMZ going to the Inet. That makes it a little harder to attack and also allows for growth a bit better. Security design layouts are just as involved as storage subsystems; I could go into more detail, but it would probably derail this thread. I understand you have a small environment, so some items you may not be able to easily accomplish with the assets you have; I'm just raising it so you can hopefully make some design decisions with security in mind. (I see way too many sites/companies that give it no consideration; most survive a while, but in the end it's costly if you get attacked.)
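Just to make the 'impossibilities' filter concrete, here's a minimal sketch of the check the external ACL performs (Python stdlib only; the public IP is a made-up documentation address):

```python
# Sketch of the external-interface 'impossibility' filter described above:
# drop inbound packets claiming RFC1918 sources or our own public address.
import ipaddress

RFC1918 = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]
our_public_ip = ipaddress.ip_address("203.0.113.10")  # example value only

def should_drop(src_ip: str) -> bool:
    """Return True if an inbound packet with this source should be dropped."""
    src = ipaddress.ip_address(src_ip)
    if src == our_public_ip:                       # our own IP arriving from outside: spoofed
        return True
    return any(src in net for net in RFC1918)      # private ranges can't be valid inbound sources

print(should_drop("192.168.1.5"))    # True: RFC1918 source from the Inet
print(should_drop("198.51.100.7"))   # False: plausible external source
```

Your actual ingress/egress rules for allowed ports layer on top of that same pattern.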
Due to the size of the drives you are looking at, the issue comes down to trust in both data integrity and correlated failures (temporal as well as physical), plus the amount of time a recovery takes. That time can be substantial (>24 hours). Since drive failures are temporally correlated (you have a greater chance of a second drive failing once a first drive has failed), you can end up in the state where you lose one drive, start the recovery process, and lose a second drive while that is happening. At that point you have two failures, so any failure to read a sector (a UBE) means data loss. (This is basically what the spreadsheet I posted above calculates.) I try to get the probability of not reading all sectors on a sub-array to less than 10% (ideally less than 5%). This is what you have to balance against your stripe width and the MTTDL for the RAID type chosen.
The points about multiple chassis come mainly from past experience. With a single chassis that also contains your head (motherboard), you invariably outgrow the chassis, and then you end up with separate power feeding your MB/head and your drive arrays. A controller with drives on physically separated power systems runs into situations where the drives lose power but the controller does not; even with BBUs you can have more data in flight than can be handled, causing corruption. You also get split arrays (degraded, et al.) when you come back up and the drives in one chassis are not seen/spun up before the head/controller is. I've seen the same problems even with SANs (EMC, STK, et al.). Best to have a hard separation.
Next, the issue of external chassis size. 1) For HA, having more chassis and spreading drives across them (if done properly) means you can lose a chassis and still have your array. Simple analogy: in a RAID 1+0 with two chassis and one drive of each mirror in each, losing a chassis loses half your drives, but your array is still online. The same idea works for other array types (for example, RAID 5 across 3 chassis with one drive each: stripe width of 3, 2 data + 1 parity). 2) Look at your bandwidth. With 3Gbps SAS in a 4-channel multilane you get 12Gbps, or ~1.2GiB/s of throughput; with 6Gbps SAS that's 24Gbps, or ~2.4GiB/s. With hard drives now at ~120MiB/s on the outer tracks, down to ~60MiB/s on the inner, that's (assuming streaming) 10-20 drives before saturation on 12Gbps, or 20-40 drives on 24Gbps. In your case, since your workload will be mostly small transfers in operational mode, you can hang more drives off it without really hitting that bandwidth limit; the exception is data scrubbing and recoveries, where oversubscribing too much will impact your array maintenance functions.
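A quick sketch of that arithmetic (the drive throughput figures are the rough assumptions from above):

```python
# Rough SAS uplink saturation estimate, matching the rule-of-thumb numbers above.
# 8b/10b encoding means ~100 MB/s of payload per 1 Gbps of line rate.

def drives_before_saturation(lane_gbps: float, lanes: int = 4,
                             drive_mb_s: float = 120.0) -> float:
    usable_mb_s = lane_gbps * lanes * 100   # ~1200 MB/s for 3Gbps x4, ~2400 for 6Gbps x4
    return usable_mb_s / drive_mb_s

# Streaming case: ~120 MB/s outer-track vs ~60 MB/s inner-track (assumed).
for lane in (3, 6):
    fast = drives_before_saturation(lane, drive_mb_s=120)
    slow = drives_before_saturation(lane, drive_mb_s=60)
    print(f"{lane}Gbps x4 multilane: ~{fast:.0f}-{slow:.0f} streaming drives before saturation")
```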
Assuming you have drives with a UBE rate of 1:10^15, a stripe width of 6 drives of 2TB gives a ~9.15% probability of not reading every sector; going to 3TB drives raises that to ~13.4%. This is a statistic, so it does not directly apply to /your/ drive build, but it holds in general. With that, I would be hard pressed not to use RAID 6 with at most 6 drives per parity group, and then add multiples of that to reach your IOPS/space requirements.
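For reference, here's the calculation behind those percentages (a sketch: it assumes all sectors of the parity group are read and treats bit errors as independent):

```python
# Probability of hitting at least one unrecoverable bit error (UBE) while
# reading every sector of a parity group. Assumes independent bit errors.

def p_unreadable(ube_rate: float, drives: int, drive_tb: float) -> float:
    bits = drives * drive_tb * 1e12 * 8      # total bits in the parity group
    return 1 - (1 - ube_rate) ** bits        # chance of at least one error

for tb in (2, 3):
    p = p_unreadable(1e-15, drives=6, drive_tb=tb)
    print(f"6 x {tb}TB @ 1:10^15 UBE -> {p:.2%} chance of an unreadable sector")
```

That prints ~9.15% and ~13.41%, which is where the numbers above come from.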
Also, as a rule of thumb, I normally have at least 1 hot spare per 12-15 drives, with a minimum of 1 hot spare per chassis, since you want to start a recovery AS SOON AS A DRIVE FAILS given the recovery time involved.
Areca's are nice cards (I have several here), though there is no 'perfect' card. Things to be aware of:
- Cards with built-in expanders (basically cards with > 8 channels) may have interaction issues with external expanders and certain chipsets. For example, the ARC-1680's had issues with some LSI expander chips as well as the Intel 5520 IOH (negotiating at lower link rates). That may not hit you with the 1880, but it raises the point that you need to test.
- Cards run hot, so you need good airflow across them.
- You are limited to 4 cards per system.
- You cannot share a RAID parity group across controllers (you need external means like dynamic disks, LVM, or similar at the host level).
- With the larger channel-count cards, 4 channels are directly wired to the back-end SFF-8088 connector and the other 4 channels feed the internal expander, so even if you have 24 channels they all funnel into the same 4 channels on the chip. This is why getting a card with, say, 2x 8088s or 2x 8087s (i.e., 8 direct channels) can be better; see the quick sketch below. (*Note: depending on workload, if you are doing a lot of small random writes, then cache can help, up to a point.)
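A hedged back-of-envelope for that funneling effect (slot counts and the 3Gbps uplink are illustrative assumptions):

```python
# Oversubscription behind a RAID card's internal expander: many drive slots
# funneled into 4 uplink channels. Numbers here are illustrative assumptions.

def per_drive_share_mb_s(drive_slots: int, lane_gbps: float = 3.0,
                         uplink_lanes: int = 4) -> float:
    uplink_mb_s = lane_gbps * uplink_lanes * 100  # ~100 MB/s payload per Gbps (8b/10b)
    return uplink_mb_s / drive_slots              # fair-share ceiling when all drives stream

for slots in (8, 16, 24):
    print(f"{slots} slots behind a 4x3Gbps expander uplink: "
          f"~{per_drive_share_mb_s(slots):.0f} MB/s per drive when streaming")
```

For small random operational I/O this rarely bites; during scrubs and rebuilds (full-speed sequential reads) it does.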
As I mentioned above, the number of drives per parity group goes directly into the MTTDL calcs.
UPS's protect against hard site power failure; BBU's protect in-flight data to the arrays. You should always have both, /OR/ disable write caching altogether, but that has performance implications. Basically, power failure is not the only situation where you have data loss issues: a CPU or OS crash has no bearing on power but can leave non-flushed data in your cache. That's what the BBU protects against.
IOH = I/O Hub. The 5520 is a dual-IOH chipset; each IOH (5500 or 5520) has 36 PCIe lanes, so having two gives you 72 lanes. That's why it's much better than using a PCIe bridge chip (e.g., the nVidia NF200 or similar): those chips do the same thing as SAS expanders, taking say 16 lanes in and then time-slicing them out to 32 lanes, so you get queuing issues (longer latency). Dual IOHs provide full link bandwidth with no sharing/blocking, as both are directly tied to the QPI interconnect.
For the boards, the main difference really is that the X8DAH has more memory capacity than the X8DTH. The X8DTH has 7 physically 16x slots, but all are 8x electrically; the X8DAH has a mix of 8x and 16x physical slots on the board. Not much of a difference, as there are no 16x network or HBA/RAID cards. For a server system, unless you are pumping it full of RAM, either is the same.
To answer a later question: you DO want the -F version (IPMI) of the boards. IPMI is your KVM and console replacement; it gives you power control and remote console (graphical & serial) access to the system. This avoids having to put a keyboard and monitor on the box and gives you out-of-band access in case there is a problem with the OS. The cost is negligible for its function.
I won't even go into 'backblaze'; there is just way too much wrong with that environment from an availability and integrity standpoint. It's the epitome of the 'if it works with one drive, it will work with 1000' mindset. No, it doesn't. It has the /potential/ to work out, but not at that price point. You would probably use ZFS (zvols) plus a distributed file system on top, which would need to be presented to front-end filer hosts. The additional network and server robustness to handle different failure scenarios would also have to be added: different power grids, network switch zone separation, multi-pathing to chassis. Then you run into other items with consumer/desktop drives, such as vibration tolerance, error-recovery behavior, and bad failure scenarios where a single 'failing' drive can cause problems for all other drives attached to the same controller/expander chip, et al. I'm not saying you /need/ enterprise-level everything; however, you have to know what those features provide and build your environment to compensate for the issues.
Larger drives have the same problems listed above: a 3TB drive, for example, may take up to 2 days to re-sync in a failure mode; what is its UBE rating; how does it handle vibration; etc. You can run the numbers; I haven't found any yet that, from the published specs, I would trust without much more in-depth compensating controls. (For example, ZFS adds checksumming of every sector on both read and write, which RAID cards don't do; this helps catch a lot of issues. It doesn't solve them, but it pushes the bar out a bit, and then you can add other HA items like chassis/controller/power separation et al.)
This is why a lot of enterprise solutions cost much more money: you are paying the industry to do a lot of the testing for you. Testing takes time. For example, I just got a sample SC837E26 box here and am running into a connectivity issue where I can't see any internal SATA drives on the unit; it may take a week or more to work out. It comes down to time or money.
If you're separating your functions out (file server separate from your video conversion systems et al.), your file server has no need for SSDs. You should have a RAID 1 for your boot (OS) drive, but that's about it; its main function is static, and the I/O will be going to the array(s). I normally buy a couple of 2.5" SAS drives (10K RPM, 72GB or so) for the system; the main thing I'm looking for is reliability. Now, if you were using a non-Windows OS, say Solaris or something else that can run ZFS, then there /are/ uses for SSDs there (cache and log devices), but that's a different discussion specific to that type of deployment.
Leave your SSDs for the scratch disk space on your conversion systems; that would speed up your compositing. You generally won't need a fast OS drive for those systems either, as that's just your application/OS. You will need RAM, but once the apps load it's all scratch and I/O to your data drives. Talk to your app users.
You may not need it from a CPU (processing) standpoint, but it does allow you to better utilize interrupts on heavily I/O-loaded systems. You can always start with one; just remember you have to add the same model for the second later on, and keep an eye on interrupts (software & hardware).
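If you want to keep an eye on how interrupts spread across cores, a minimal sketch (this assumes a Linux host; on Windows you'd watch the equivalent counters in Performance Monitor instead):

```python
# Minimal sketch: total hardware-interrupt counts per CPU from /proc/interrupts.
# Assumes a Linux host; adjust for your OS.

def interrupt_totals(path: str = "/proc/interrupts") -> dict:
    with open(path) as f:
        cpus = f.readline().split()              # header row: CPU0 CPU1 ...
        totals = {cpu: 0 for cpu in cpus}
        for line in f:
            fields = line.split()
            counts = fields[1:1 + len(cpus)]     # per-CPU counts follow the IRQ label
            for cpu, n in zip(cpus, counts):
                if n.isdigit():
                    totals[cpu] += int(n)
    return totals

for cpu, total in interrupt_totals().items():
    print(f"{cpu}: {total} interrupts serviced")
```

If one core is servicing nearly everything while the rest idle, that's the imbalance I'm talking about.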
As for hyperthreading, that goes with the above. Think about what hyperthreading does: basically, it allows the CPU to QUEUE up an additional instruction stream, but only one can be executed at a time. That is great for keeping the pipeline full with heavily threaded applications. With deployments that are interrupt-driven (for example storage systems, firewalls/network switching, etc.), you do not want to 'queue' that interrupt behind the execution pipeline of a single core; you want it handled as fast as possible, so put it on a core that can take it and run. (This is very simplistic; if you're interested, books on computer architecture/design may be helpful.)