stevecs

Member
Everything posted by stevecs

  1. stevecs

    Intel SSD 520 Review Discussion

    Check to make sure you have the latest firmware on the Dell as well as the SSD. I don't have a Dell here, but I do have 10 of the Intel 520 240GB SSDs here in various systems (native in laptops, as well as RAIDed with LSI and Areca) with no issues on any. Albeit, this is a small sample size.
  2. Now that's cool. Never built anything that large; for the smaller projectile accelerator I put together years ago, I had some issues with the photocell triggers and spacing. Looks like a fun project. And yes, I /think/ that will do a pretty good job of erasing data, or welding it if all else fails.
  3. stevecs

    Intel SSD 520 Review Discussion

    Yes, I know from testing that you can sometimes extend that up to an order of magnitude, but still that's 1) not the design spec, and 2) much lower than what you can get from server-class drives, which are at the same or lower price points. Just looked at one of my workstations here that I've moved over from a 4-drive to an 8-drive RAID 10 (SAS), and that's doing about 500GB/day in writes total (so about 125GB/day per drive assuming equal loading). The last system that I had running for about 5 years on SAS was up into the multi-PB range of writes per drive. Just looking at some of my 'light use' drives, they are averaging about 4GB/hour of I/O.
  4. stevecs

    Intel SSD 520 Review Discussion

    Nice write-up on the drive, and I'm glad Intel/you have included the UBER rates, which is something very lacking from other vendors. For me though the big killer here is the pathetically low 36TB write endurance, or 20GB/day. That just rules it out completely (heck, in the time I've written this message I've already written ~2GB per drive * 4 drives, or actually 8 as it's a RAID 1+0). For the price points they're looking at (say $500 for the 240GB version) that's more expensive than a 2.5" 15K rpm SAS 300GB drive which has a latency of ~2ms. Unless these become significantly cheaper than the SAS drives and at least come up an order of magnitude or more in write endurance I just don't see it. (And yes, I have Intel 320's and 510's in my laptops, though frankly haven't really seen any 'big' improvement there, at least under Linux where you have the system already optimized and enough RAM for your applications, i.e. no swap.)
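If you want to run the endurance arithmetic yourself, here's roughly what I'm doing (a quick sketch in python; helper names are just illustrative, and it assumes RAID 1+0 mirroring doubles drive-level writes with an even spread, like the 500GB/day 8-drive example above):

# Rough write-endurance math (illustrative names; assumes mirroring doubles
# drive-level writes and the load spreads evenly across the stripe).

def per_drive_writes_gb(host_writes_gb_per_day, drives, mirrored=True):
    # GB/day landing on each drive of a RAID 1+0 set.
    drive_level = host_writes_gb_per_day * (2 if mirrored else 1)
    return drive_level / drives

def years_to_endurance(endurance_tb, gb_per_day):
    # Years until the rated write endurance is used up at a constant rate.
    return endurance_tb * 1000.0 / gb_per_day / 365.0

per_drive = per_drive_writes_gb(500, drives=8)   # ~125 GB/day per drive
print(f"per-drive writes: {per_drive:.0f} GB/day")
print(f"36TB rating at 20 GB/day: {years_to_endurance(36, 20):.1f} years")
print(f"36TB rating at {per_drive:.0f} GB/day: {years_to_endurance(36, per_drive):.1f} years")

At the spec'd 20GB/day the 36TB rating works out to about 5 years; at my measured per-drive write rate it's well under a year, which is the point.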
  5. stevecs

    Seagate Barracuda Green 2TB Review

    Yes, they are more expensive but: 1) they are still sold/supported and have a 5-year warranty as opposed to 3-year; 2) they are designed for 24x7 (8760 hours/year) use as opposed to ~6 hours/day (2100 hours/year); 3) since they were spec'd for raid systems they should be more robust in firmware support to avoid dropping out or causing errors like I'm seeing with all the ST2000DL003's. Having been 'wooed' (and yes, I /SHOULD/ know better as I do this all the friggen time) by the lower price of the ST2000DL003's, they are not worth it IMHO. I've had to replace about 20 of them so far due to hanging the bus or other similar 'soft' errors, which is way more than the Hitachis or the better Seagate drives. (Been burned too often by WD in the past that I don't even try them anymore.) Likewise though, your mileage may vary. Just so you're aware, and it may/may not affect your roll of the dice.
  6. stevecs

    Seagate Barracuda Green 2TB Review

    I don't have the QNAP but I am running about 120 of these drives now in a raid. Frankly, don't get them if you have a choice; instead you may want to look at the SV35 series, which are rated for 24x7 use unlike the 6-hours-a-day use for the 'green' ones. Main problems I've seen: long error recovery periods both during initial spin-up as well as in operation (no means to enable TLER); less compatibility with backplanes/controllers (LSI; Areca (which use an LSI expander chip) and the older Vitesse chipsets), basically errors such as "log_info(0x31120303): originator(PL), code(0x12), sub_code(0x0303)" which are messages relating to the drives. I think they're on the ragged edge and not really designed for much more than a tertiary drive for /very/ light use.
  7. Ok, I thought you would have had more HBA's or raid cards for the head unit, which is what raised the Q. To me, for server platforms/main OS/application et al I would stick with rotating rust. SSD's are better for caching, or where IOPS are important and you can justify the cost of replacements (database solutions where transaction time is paramount). Most solutions I've seen with SSD's are more for ego than for real-world use (BER rates, P/E cycles, internal data integrity checks, et al are all behind that of HD's). Unless you're mitigating this by external design or business needs it falls into the 'bling' category in my opinion. For the OS itself, and since you have a SAS controller on-board as well as add-in cards, I would probably go with two ST9146853SS's in a mirror for the OS and general apps. For the VM's, it really comes down to what you are going to be doing in the guests. Most guests are rather light on I/O but chew up memory like no tomorrow. Would either use SSD's in a mirror here or another couple ST9146853SS or ST9300653SS's depending on more information as to the guests and what they're doing. Would never deploy an SSD or HD without some type of HA. Even with backups, the time generally taken to do restores in a production environment is too costly in comparison to the rather small $$ for the higher availability. The X8DTH-6F has two SFF-8087's on-board; that SC835TQ appears to use a version of their M28E1/2 system which, if true, has an expander on it (either 1 or 2). However I thought those backplanes were SAS/SATA 3Gbps only. For the 'local backups' either something like the ST91000640SS or ST91000640NS (since they're only about $20 different I would probably go w/ the SAS versions just to keep everything SAS, but that's just me). In a raid-5 that would give 3TB of backup space.
  8. Would suggest /not/ to use RAID0 for any type of business function really, as you have MTTF/#drives minimally for availability (quick sketch of that math below). If you have it as purely 'scratch' (i.e. not cache or similar but literally stuff that will have zero impact on the rest of the system if the drive/access to the drive fails) possibly, but even then would strongly caution against it. Striping/raid0 is like playing chicken: you are /never/ 'lucky' all the time. For your setup, would probably save some $$ and do:
boot - 2 enterprise SAS drives, either 10K rpm or 15K rpm depending on how much swap you may need (though would probably put the $$ into more ram if possible)
VM guests - SSD's in RAID-1 mode. Make sure you turn off swap in your guests (use your host's memory).
backup - 4 disk raid5 or 6 depending on usage workload and drive size/type (if BER 1:10^16 and < say 500GB you could probably get away with raid-5); with only 4 drives you are really limited here if you need a lot of iops, so I'm assuming it's going to be as you stated, batch transfer data/backups et al.
If you're going w/ a 5.25" full height (or dual HH) enclosure you can find them with built-in SAS expanders that can handle 8 2.5" drives; that would let you use the two SFF-8087's on-board depending on your chassis setup. Otherwise the SATA2 ports would be fine. SATA is not bi-directional signaling (can't send/receive at the same time to the controller), which is why larger caches are on SATA drives to 'hide' this from the user's point of view. Though it shouldn't be a problem with the spindle counts you have here. Is this for another task machine (encoding or something)? Since I didn't see the large data store in your list I'm trying to figure out what your workload will look like for this node.
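Here's the quick sketch of that MTTF/#drives math (simplified, and assuming independent, exponentially distributed failures, which real drives don't strictly follow):

import math

# Simplified RAID0 availability: with independent exponential failures the
# array dies when ANY drive dies, so array MTTF ~= drive MTTF / N.

def raid0_mttf_years(drive_mttf_hours, n_drives):
    return drive_mttf_hours / n_drives / 8760.0

def p_survive_one_year(drive_mttf_hours, n_drives):
    return math.exp(-8760.0 * n_drives / drive_mttf_hours)

drive_mttf = 1_000_000.0   # hours; a typical (optimistic) spec-sheet figure
for n in (1, 2, 4, 8):
    print(f"{n} drives: array MTTF ~{raid0_mttf_years(drive_mttf, n):.0f} years, "
          f"P(no failure in a year) ~{p_survive_one_year(drive_mttf, n):.3f}")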
  9. Using identical raid cards helps with interrupt 19 handling from the bios/bootstrapping and, if written correctly, will also cut down on ROM space used, as cards with the same firmware/code could share the same space. That's the main reason why I was using the LSI 2008 chips (9200-8e), as it's the same as the on-board chip; however even with that, with effectively 7 chips (6 9200-8e's + 1 on-board) I can't get into the LSI bios as it doesn't have enough ROM space to run; I have to remove some cards to get in to configure them or do it from the OS utilities. This doesn't stop the cards from working, just the bios utilities are too large to fit in operational memory. So yes, you should be able to use different raid cards, though like anything testing is needed. As for the 2nd response above from their 'lab engineers', unless they have just recently updated firmware (which would upset me as I just returned the SC837E26) it does /NOT/ work with the LSI 2008. The last I heard, ~a week ago, was that they were looking into it but nothing from that point onward. I would get them to validate it in their lab for assurance or be prepared to fight it with them. I couldn't wait as I needed the system up here in the next 2-3 weeks, so had to go with a known working config.
  10. Yeah, that's a large MB. Why are you looking at that one as opposed to the X8DTH-6F, which is a normal extended ATX size, not the custom enhanced extended ATX? You'd just have 12 DIMM sockets so only 96GB of ram (or 192 if you can find the 16GB sticks, which are hard to come by). If you do that you shouldn't have a problem with the fit. As for Chenbro, I've had similar experiences with them (lack of response). I haven't used their chassis though, mainly using SM and AIC, at least recently (past 3-5 years).
  11. Don't have enough data points to say it will or won't work with the SM chassis (the 2208 chip). That is from the same family as the chip used in the 9200-8e's that I have here w/ the SC837E's, and those do not work. Would suggest you send an e-mail to SM for a complete compatibility list for that particular chassis; it's possible it has a different expander so may have other supportability.
  12. Remember dual-porting does not work with SATA drives (they only have a single port), nor does this work with consumer raid cards (you need to use software raid and a volume manager to handle the multi-pathing). If you ever consider going to SAS or may want HA in the future, I think it's a worthwhile option to have even if not used right away, as it's usually a small cost in the price of the chassis. In most cases adding it later may require a complete chassis swap. Multi-pathing is a means to have redundant connections to a particular drive. With enterprise drives (FC, SAS, etc.) you have two data ports on the drive. This is normally set up so that you have separate physical expanders, cables and HBA's, and in some cases you can have them going to different computers (but for this you'd normally use SAS or FC switches). The idea is to make the data path itself redundant, so if you lose say an HBA or cable you will still have your drives/data available. Generally this is not something a home user or small business would need initially. As for card support, this is what I got from SM: "I got a compatible list from SM and find that AOC-USAS2LP-H8iR works with the EL2 backplanes in the JBOD chassis. However, it's an UIO interface so it will not work in the PCIe slots on your motherboard. The AOC-USAS2LP-H8iR does utilize a LSI 2108 chipset. I find MegaRAID SAS 9280-4i4e and 3Ware 9750-4i4e both using the same LSI 2108 chipset so they have better chances to be compatible with the EL2 backplanes."
  13. On-line UPS's have the benefit of removing transfer delays, but that's generally not that important in general server environments (more for medical, or cases where your load doesn't have enough capacitance to handle a couple missed cycles, which most server PSU's have more than enough to handle). On the down-side you go through batteries faster due to their constant use: with normal UPS's you want to replace them every 3-5 years; with on-line units going direct to battery (as opposed to supercapacitors & batteries) probably every 1-3 years depending on use. Yes, the XJ-SA24-448R-B's and similar are a bad design. STK was using that design in their SANs back about 5-8 years ago and there was no end to the problems with them (mainly drives falling offline, mating problems et al). To get around the issue of pulling multiple drives, STK had a dual-head design so the front drive was controlled by one raid system and the back drive by another storage processor. You would have to offline both A/B drives and wait until it was synced to a hot spare before you pulled the blade. Slow, and yes, if you had another problem at the same time you're kind of screwed (they were only doing RAID3/5/[1+0] back then). The later ones you picked, 48 drives vertically mounted in a 4U, are a copy of the Sun Thumper systems, which are good for high density, but air-flow is an issue due to the amount of restriction in the cases, plus to replace a dead drive you have to shut down/pull the entire unit out of the rack to get to them. To mitigate that you need a decent number of hot spares in there so you can do replacements in standard outage or downtime windows. I normally do a 1:10 - 1:15 ratio depending on quality of drives (consumer or enterprise). Another item is how those chassis are wired; the ones you linked do not show internal wiring schematics at all. You want to avoid situations where, like I mentioned before, a bad drive could take down an expander and all other drives attached to it. This is why I opted toward multiple chassis et al. It comes down to your HA requirements and what you can live with in cost. Just FYI, SM is supposedly working on the issue I had with the SC837E* chassis and the LSI HBA's but no solution yet. I've since gone with plan B here and just purchased some more of the AIC chassis. Not as good a density solution but couldn't wait for SM (considering that they use the LSI2008 chip on their motherboards, this implies that they didn't do much QA testing).
  14. Correct, the SUA3000RM2U is not 'on-line', however it does have a high sensitivity rating and does have an internal buck-boost transformer to even out high/low input voltage (i.e. 82V-144V). And yes, I would use a UPS with buck/boost especially in a noisy environment as it will help prolong your equipment life. Clean power is IMHO very important. I would find it hard to see you pulling 1800VA on both PDU's with a 25U cabinet with the number of drives you're looking at, unless you're doing blade chassis. Especially with staggered spin-up of drives. Remember that 'Major Loads' is not really a problem; your APC unit itself, unless you're going to 220V lines, is at 20A, that's ~2000W, but the APC itself is only rated for 2100 or 2700 depending on the model you're looking at. I would suggest having separate circuits (two pdu's) would be better. That APC AP7801 is only rated at 16A at 100/120V, though it IS metered, which is kind of nice (would avoid me having to use my AC clamp to get data). It just takes up a U which I would rather use for servers. The case(s) that I mentioned before all have SAS expander backplane options: http://www.aicipc.com/ProductParts.aspx?ref=RSC-3EG2 (sku: RSC-3EG2-80R-SA2S-0[A-C]) for single expander versions. The XJ line is designed for drives only; the 3EG2 is for MB's as well. The Supermicros as well (SC837's, 938's, et al). There are quite a few, but chassis like these are designed more for servers or business/enterprise farms where you need to string multiples together to get performance/storage goals. Basically with an expander backplane (like the AIC ones I have here) you attach the drives to the backplane (hot swap) and then the backplane has two SFF-8087's (one in, one out) so you can daisy-chain chassis together. I normally use the Supermicro CBL-0167L's in them so I can convert this to SFF-8088's and deal with each unit either directly tied or daisy-chained depending on needs.
  15. First, you should be looking at rackmount units, assuming that's what you're getting (a rack), and the newer versions with the more efficient inverters. http://www.apc.com/products/family/index.cfm?id=165#anchor1 The SUA3000RM2U is about right, 3000VA/2700W: http://www.apc.com/products/resource/include/techspec_index.cfm?base_sku=SUA3000RM2U&total_watts=1400 It requires you to hook up an L5-30 outlet, which shouldn't be much of a problem (you can find them usually at hardware stores for ~30US or less if you're doing it yourself) and then a 30A breaker for your breaker box. Or get your electrician to do it. As for the PDU's, I use the AP9567 http://www.apc.com/products/resource/include/techspec_index.cfm?base_sku=AP9567, two of these which you can hang on the back cable tray of your rack. Each goes to a different 20A outlet on the back of your UPS (each outlet is on a different breaker), then you plug each server/chassis into BOTH PDU's; that way if a PSU dies/shorts, which would blow one circuit, the other should still retain power. Each pdu can handle 1800VA/120V/15A which should be fine for your intended load; it doesn't have the fancy led meter on them, but those are hard to get with a short rack in zero-U format. As for those sas expander cards, those are just plain expanders, no logic (raid) at all. The Chenbro one at least mentions that it is using the LSISASII36 chip; the HP one doesn't specify (it may, as they do source a lot of their stuff from LSI). Were you looking at those as opposed to having an integrated expander backplane in the chassis for the drives, or something?
  16. Glad that our past experiences are a help. Basically how I got started in this back in the 70's as a kid. Information/experience is really limited if not shared freely. As for UPS load, I normally try to keep the load to about 1/2 the capacity of the UPS solution or less during normal running, but be able to handle a full power-on load without getting above 85-90%. The spreadsheet has some rudimentary calcs. For some perspective, currently I have 7 AIC RSC-3EG2 chassis; one Supermicro X8DTH-6F w/ 60GB ram & 6 LSI00188 cards; 1 Areca 1680ix-24 w/ 2GB cache; (2) Quantum LTO4 tape drives; a Dell 6224 switch; 64 1TB ST31000340NS drives; 16 ST32000444SS drives; 12 ST2000DL003 drives; 2 2.5" 10K rpm sas boot drives and an older Cisco 2610XM router. All this takes ~1500W and is running on an APC RMA3000U (2U 3000VA rackmount UPS). I expect when I replace the drives here all with ST2000DL003's (total 112 of those) it should lower this down to 1100W or so.
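If you want to check your own build against that rule of thumb, the math is trivial (a sketch; the 0.9 power factor matches the SUA3000RM2U's 3000VA/2700W rating, and the power-on wattage below is just an illustrative guess, not a measurement):

# UPS headroom check for the rule of thumb above.

def ups_headroom_pct(ups_va, steady_watts, poweron_watts, power_factor=0.9):
    cap_w = ups_va * power_factor
    return 100.0 * steady_watts / cap_w, 100.0 * poweron_watts / cap_w

steady, poweron = ups_headroom_pct(3000, steady_watts=1500, poweron_watts=2300)
print(f"steady-state load: {steady:.0f}% of capacity (target: about half or less)")
print(f"power-on load:     {poweron:.0f}% of capacity (target: under ~85-90%)")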
  17. You're right, most companies use 42U racks, but they do get a lot of miscellaneous stuff as well. The company I work for gets all types (out-sourcer). They may not all be 'pretty' after years of use but they're functional. Even though a full 42U rack is large, don't knock the price. A couple guys and a pick-up truck can do wonders. The APC one you have linked to is not a bad rack. The back is split which is good; the front is a full door but that may not be a big issue to you (i.e. you don't have tight aisles of racks, which is where the split doors come in real handy). The cable tray in the back is nice as it lets you hang cable management guides as well as pdu's. Also with loading, you want heavy items at the bottom and to work your way up. Generally I put UPS's at the bottom, then the drive units, then the server(s) no higher than shoulder height if a full rack, and then things like switches etc. at the top. I also like to keep high voltage and low voltage (network/san/scsi, etc.) on separate sides of the unit.
  18. Didn't say racks were cheap. However, if you are in an area that has high-tech business (data centers, computer centers, etc.) you may want to call them up and see if you can get to their facilities group. Most places dump or go through racks frequently, especially in outsourcing fields (when they do migrations, data center updates for new types of cooling standards, or just standardizing on floor space management techniques). You may be able to find one just for finding a truck to pick it up. Generally with racks you want to look for depth so you have at least 2" in front of the servers and at least 6" in back, if not more. Generally split door designs are better as this lets you open the rack easier to move systems in/out in constrained spaces. Having a wider rack is also good for better cable management. Same thing goes for the back/sides for your pdu's ('zero U' power strips basically), just so you can hang them out of the way so they don't interfere when you want to pull out a hot swap supply or do general maint. Square hole racks are generally better, at least here in the states, as we're hard pressed to find any equipment with round-hole designs for rail kits. Most have 'universal rails', but generally even in the UK I've seen square hole taking over. The main thing is good air-flow (don't get glass or closed-off racks as they will cook your systems unless they're designed /very/ well). It's hard to go really wrong though, as the main purpose is just to have 4 posts to hook rails to; the rest is just sheet metal. As for the 'wish', no, lady luck is way too fickle for me, rather work with skill. Plan for the worst, hope for the best. As for the reason for the incompatibility, nothing yet as to a technical detail besides 'not compatible'. Without a scsi/sas analyzer I probably won't get real reasons for it. It seems that it's in the enumeration of the expander chip back to the HBA, but what exactly, I don't know. In the AIC chassis here with the 2000rpm Scythe fans I have drives staying in the low 30's C with ambient temps around 23-25C. So not bad. The array here is currently running 1TB drives (soon to be replaced w/ 2TB versions, which would be ~128TB usable): 96 drives in raidz2 as 16 vdevs of 4+2 drives, plus another 16 drives (8 hot spares and 8 as scratch for various other items that don't run well under zfs's cow structure). No, my old 2002 audi is going to last until it falls apart.
  19. I don't have the acoustic rack yet, just an old Tripp Lite one which was about $2500 or so; the Ucoustic is around $5000 (passive) or $7500 (active). Still debating if my hearing is worth it. As for fans I went with the Scythes mainly for their long life bearings and hooked them up to a pwm controller so I can manually set the speed to find a good balance for cooling. If you are looking at sound pressure levels, remember that it's vector & phase additive for multiple fans. Assuming no other interference/everything in phase and properly balanced et al, you can use the following formula (assuming 3 fans at 44dB): 10 x log10(10^(44/10) x 3) = ~48.8 dB. As for the MegaRAID 9265-8i, I don't see anything there that would indicate that it does; its LSI controller is an eight channel chip so I would doubt that it does. However I should note now that I heard back concerning my own validation testing: the LSI00188 (which uses the LSI2008 chipset) is incompatible with the Supermicro SC837E[1|6]6 chassis. Seems that they are only supported by the LSI 2108 chipset (wish they said that a month ago, which would have saved me validation time, but this is the type of thing that you run into frequently). In my case it's just a return of the chassis, but you should work out plan B's with your vendors if you don't have similar validation that items will work. As for the home system, that's just my main storage array. It's just a hobby, either that or spend $$ on cars or something.
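Here's the same formula generalized so you can plug in any mix of fans (a small sketch carrying the same assumptions as above: incoherent, equal-distance sources with no cancellation):

import math

def combined_spl(levels_db):
    # Incoherent source addition: 10 * log10( sum of 10^(L/10) ).
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))

print(f"{combined_spl([44, 44, 44]):.1f} dB")   # three 44 dB fans -> ~48.8 dB
print(f"{combined_spl([44] * 9):.1f} dB")       # nine 44 dB fans  -> ~53.5 dB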
  20. Basically those vendors you mention (LSI, Areca, Adaptec) are all in very close competition and performance is close to the same. Some may be a little better with certain drives (firmware tweaks, etc). But it really comes down to specific build items (i.e. out of band network, removable memory, BBU on-card or off-card, etc). The reason I went with the AIC RSC-3EG2 chassis was the 120x120x38mm fans; larger fans can push the same volume of air with lower blade speed, which lets you cut down noise. The main thing you have to be careful of is that you don't want a drive to be >40 degrees centigrade, as that will decrease lifespan. However even replacing systems with lower speed fans you'll run into a lot of white noise with larger builds. Right now I'm running ~112 drives here at home across 7 chassis plus switches et al. It's getting to the point where I'm even looking at acoustic racks such as the Ucoustic 9210 http://www.quiet-rack.com/ucoustic-active.php But those are not cheap.
  21. The main point of RAID cards is to off-load the host CPU, which is good for systems running applications or that are constrained (interrupt handling, or other issues). They also make it a bit easier to manage as it's generally a 'pull the failed drive and replace' type of situation. 'Software' based raids (I have that in quotes as technically even raid cards are software, just with accelerated parity functions and some other items) require more care and feeding (user maintenance). Generally most of the options that are handled automatically by the hardware raid solution are now brought to the surface for you to make decisions on. At this stage I would probably say that most software raid solutions are really only viable under a *nix type environment. Most of the 'tunables' are not really there or are hard to get to under windows. Things you generally look for are low level drive stats (so as to know when to pull a drive out of a system /before/ it fails, or to troubleshoot low level performance issues); logical volume management functions; partition alignment items; etc. Generally if you're with windows (and remember my 02:00 rule) it's probably best to stick with a hardware raid. There are half-way solutions like openfiler and freenas that run unix but generally provide an easier management interface to the user, but I would play with them first to get used to it as opposed to starting your first build on it for business purposes. SAS supports SATA through SATA tunneling (STP). Unless you're talking about a first generation SAS card it will support SATA drives/signaling. Likewise a SAS port/backplane will accept a SATA drive, but not the reverse in either case. Generally it's better to get sas cards & chassis as you can use either type of drive. As for RAID card types, I like the Areca's mainly for their out of band management (ethernet jack) which is real nice. Besides that, performance is more or less the same with the same generation of chips. LSI probably would be my second choice due to their large industry support, generally more compatible with systems; Adaptec also makes good cards, though I haven't used them too recently. The adaptec 6805 you linked to does not have an expander (just the 8 channels/two 4-channel SFF-8087's). The areca ARC-1880i, ARC-1880x and ARC-1880LP are without expanders; all others have them (that's why you can have more than 8 channels). A lot of LSI's products are old 3Ware cards; probably want to stick with the 'MegaRAID' line. You want cards that are PCIe v2, and the same thing about built-in expanders applies to their raid cards (>8 channels). HBA's like the LSI00276 actually use a 16 channel sas chip. You can only do raid 1+0 with that or use it as a normal HBA (no raid), which is good, but with only 8x PCIe lanes you can't fully populate it without over saturating the host slot. As for that computer link at the end, that's kind of a mess, not to mention a dust magnet. Could really be cleaner, and even go to a peltier or phase change for sub-ambient. I only WC my desktop system mainly to cut down noise levels, though that's not much of an argument anymore with the 19" rack and fan noise from it.
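To put rough numbers on that slot-saturation point (a sketch with assumed round figures: ~500MB/s usable per PCIe 2.0 lane after encoding, and every attached device pushing data at once, which is the worst case; it's the fast 6Gbps devices that run the x8 slot out of headroom):

# Back-of-envelope slot saturation for a fully populated 16-port HBA in a
# PCIe 2.0 x8 slot (assumes ~500 MB/s usable per lane and all drives busy).

PCIE2_LANE_MBPS = 500.0

def slot_demand(lanes, drives, per_drive_mbps):
    slot_mbps = lanes * PCIE2_LANE_MBPS
    demand_mbps = drives * per_drive_mbps
    return slot_mbps, demand_mbps

for label, per_drive in (("16 HDs streaming ~120 MB/s", 120.0),
                         ("16 fast 6Gbps devices ~500 MB/s", 500.0)):
    slot, demand = slot_demand(8, 16, per_drive)
    print(f"{label}: {demand:.0f} MB/s of demand vs {slot:.0f} MB/s of slot")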
  22. First, I will concede the post-length title to you for this thread. Older documents, however the 3945E's are nice routers, just put some in here at work for some 600mbps mpls vpn's. As for switches, the 3560X's are the current low-end line. They are actually, as I mentioned, not that bad in price (barring your taxes/fees, ouch!). Good that you have a backup plan; most places forget about it until it's too late. Doing partition based backups (vss) would greatly improve performance as it avoids individual file locking. You have it partially: yes, you'll have a max cap of your inet speeds, which is good. The problem is the request data itself is going to be randomized and small, which requires a lot of iops from your subsystem. Some tuning can help (i.e. caching; turning off access time updates; etc) but ultimately you're going to be at the mercy of the drive subsystem. This is where more 'spindles' wins. This is good as it helps separate your I/O workloads. At this point I should make a comment as to security. Generally allowing Inet directly into an internal system is a bad idea. With your willingness to have it as an end repository, that helps in functional network separation of your environment. Even if you do not have a firewall itself, you can create a 'poor man's firewall' which is better than nothing. A simple environment would be to have an external interface on the router to your provider, with acls to filter traffic (remove 'impossibilities' like incoming RFC1918 ranges and seeing your own IP coming back into you; plus your ingress filter as to the types of traffic you want to allow to your DMZ). A DMZ segment on a separate interface on the router, to which you would also apply an acl in-bound (for what you want to allow), with the mind set that this is an 'untrusted' segment as it has direct inet going to it, so it should not initiate back internally, or at the very least scrutinize exactly what you want to allow, as this is a popular method for attacks (transient trust). Then lastly an 'internal' interface (one facing your internal lan), which would be another separate interface on the router, which would also have an acl to allow only specific items/ports out to the Inet and towards your dmz. This lets you help block internal virus issues spreading to your dmz as well as to other sites. Also would suggest using private ip's (RFC1918) internally and on your DMZ, and natting (1:1) for your dmz going to the Inet. This makes it a little harder to attack but also allows for growth a bit better. Security design layouts are just as involved as storage subsystems; I could go into more detail but it would probably derail this thread. I understand you have a small environment so some items you may not be able to easily accomplish with the assets you have, just raising it up so as to hopefully allow you to make some design decisions with security in mind. (See way too many sites/companies that have no consideration for it; most will survive a while but in the end it's costly if you get attacked.) Due to the size of the drives you are looking at, the issue comes down to trust in both data integrity as well as correlated failures (temporal as well as physical) and the amount of time for a recovery. This can be substantial (>24 hours), and since simple errors (hardware/main failure) are temporally correlated (you have a greater chance of a second drive failing if a first drive fails), you can be in the state where you lose 1 drive, start a recovery process and lose a 2nd drive while that is happening.
Now at this point you have two failures, so if you fail to read a sector or you get a UBE you now have data loss (basically what the spreadsheet I posted earlier calculates). I try to get the probability of not being able to read all sectors on a sub-array to less than 10% (ideally less than 5%). This is what you have to balance against your stripe width and MTTDL for the raid type chosen. The points about multiple chassis are mainly from past experience. First, with a single chassis that also contains your head (motherboard): you invariably outgrow the chassis and then you run into situations where you have separate power going to your MB/head and drive arrays (i.e. a controller having drives on physically separated power systems would cause issues where your drives lose power but not the controller, and similar glitches which, even in the case of BBU's, have more data in flight than can be handled and so cause corruption, or cause split arrays (degraded et al) when you come back up and drives in one chassis are not seen/spun up before the head/controller is, et al). I've seen the same problems even with sans (emc; stk, et al). Best to have a hard separation. Next the issue of external chassis size. 1) You have the issue of HA: having more chassis and spreading drives across them, if done properly, means you can lose a chassis and still have your array. Simple analogy: in a raid 1+0 you have two chassis, one drive of each mirror in each; you lose a chassis, you lose 1/2 your drives but your array is still on-line. The same idea can be used for other array types (for example raid 5 across 3 chassis, one drive each (stripe width of 3, 2 data + 1 parity)). And 2) look at your bandwidth: with 3Gbps sas in a 4-channel multilane you get 12Gbps or ~1.2GiB/s throughput; with 6Gbps sas that's 2.4GiB/s. With hard drives now at ~120MiB/s at the outer track down to 60MiB/s at the inner track, that's (assuming streaming) 10-20 drives before saturation on 12Gbps or 20-40 drives w/ 24Gbps. Now in your case, since your workload is going to be more small transfers in operational mode, you can have more drives without really hitting that bandwidth limit, with the exception of when you do your data scrubbing and recoveries, in which case if you oversubscribe too much you'll impact your array maintenance functions. Assuming you have drives with a UBE ratio of 1:10^15, then having a stripe width of 6 drives of 2TB would give a 9.15% probability of not reading every sector; going to 3TB drives that raises to ~13.4%. However this is a statistic, so it does not directly apply to /your/ drive build, but in general. With that I would be hard pressed to not use raid6 w/ 6 drives per parity group at max, then add in multiples of that to reach your IOPS/space requirements. Also as a rule of thumb I normally have at least 1 hot spare drive per 12-15 drives, minimum of 1 hot spare per chassis, as you want to start a recovery AS SOON AS A DRIVE FAILS due to the recovery time involved. Areca's are nice cards (I have several here), though there is no 'perfect' card. Things to be aware of:
- Cards with built-in expanders (basically cards that have > 8 channels) may have interaction issues with external expanders and certain chipsets (for example the ARC-1680's had issues with some LSI expander chips as well as the intel 5520 IOH (negotiating at lower link rates)). That may not hit you with the 1880 but it raises the point that you need to test.
- Cards run hot so you need good airflow across them.
- You are limited to 4 cards per system.
- You cannot share a raid parity group across controllers (you need external means like dynamic disks, lvm, or similar at the host level).
- With the larger channel # cards there are 4 channels directly wired for the back-end SFF-8088 connector; the other 4 channels are fed to the internal expander (so even if you have 24 channels they get fed to the same 4 channels on the chip). This is why getting a card with say 2 8088's or 2 8087's (i.e. 8 channels) can be better. (*Note, depending on workload, if you are doing a lot of small random writes then cache can help more, up to a point.)
As I mentioned above, the # of drives per parity group goes directly toward MTTF_DL calcs. UPS's protect against site hard power failure; BBU's protect in-flight data to the arrays. You should always have both /OR/ disable write caching all together, but that has performance implications. Basically power failure is not the only situation where you have data loss issues. For example, a cpu or OS crash has no bearing on power but could leave data non-flushed in your cache. That's what the BBU protects. IOH = I/O Hub. The 5520 supports a dual IOH configuration; each IOH (5500 or 5520) has 36 PCIe lanes. Having two of them gives you 72 lanes, which is why it's much better than using a PCIe bridge chip (i.e. nvidia nf200 or similar), as those chips do the same thing as the sas expanders: they take say 16 lanes in and then time slice to 32 lanes. So you have queuing issues (longer latency). Dual IOH's provide full link bandwidth, no sharing/blocking, as both are directly tied to the QPI interconnect. For the boards, the main difference really is that the X8DAH has more memory capability than the X8DTH. The DTH has 7 x16 slots but all are 8x electrically; the DAH has 8x & 16x physical slots on the board. Not much of a difference as there are no 16x network or HBA/RAID cards. For a server system, unless you are pumping it full of ram, either is the same. To answer a later question, you DO want the -F version (IPMI) of the boards. IPMI is your KVM and console replacement. This gives you power control and remote console (graphic & serial) to the system. This avoids you having to put a keyboard and monitor on the box and allows you out of band access to it in case there is a problem with the OS. The cost is negligible for its function. I won't even go into 'backblaze'; there is just way too much wrong with that environment from an availability and integrity standpoint. That's the epitome of an 'if it works with one drive, it will work with 1000' mindset. No, it doesn't. It has the /potential/ to work out, but not at that price point. You would probably use ZFS (zvols) and a distributed file system on top of that, which would need to be presented to front end filer hosts. The additional network & server robustness to handle different failure scenarios will also have to be added (different power grids; network switch zone separation; multi-pathing to chassis); then you run into other items with consumer/desktop drives such as vibration tolerance; error recovery items; bad failure scenarios where a single 'failing' drive can cause problems for all other drives attached to the same controller/expander chip; et al. Not saying that you /need/ enterprise level everything; however you have to know what they provide and build your environment to compensate for the issues.
Larger drives have the same problems as listed above: 3TB for example may take up to 2 days to re-sync in a failure mode; what is the UBE rating; how does it handle vibrations; etc. You can run the numbers; I haven't found any yet that, from the published specs, I would trust without some much more in-depth compensatory controls (i.e. ZFS adds in checksumming of every sector both read & write, which raid cards don't do; this helps catch a lot of issues, doesn't solve them, but pushes the bar out a bit; then you can do other HA items like chassis/controller/power separation et al). This is why a lot of enterprise solutions cost much more money: you are paying the industry to do a lot of the testing for you. Testing takes time. For example, I just got a sample SC837E26 box here and am running into a connectivity issue where I can't see any internal sata drives on the unit. It may take a week or more to work out. Comes down to time or money. If you're separating your functions out (file server separate from your video conversion systems et al), your file server has no need for SSD's. You should have a raid 1 for your boot drive (os) but that's about it. Its main function is static and the i/o will be going to the array(s). I normally buy a couple SAS 2.5" drives (10K rpm or something like that; 72GB or so in size) for the system. Main thing I'm looking for is reliability. Now if you were using a non-windows OS, say solaris or something that can run ZFS, then there /are/ uses for SSD's there (cache and log devices) but that's a different discussion related specifically to that type of deployment. Leave your SSD's for your scratch disk space on your conversion systems; that would speed your compositing. You won't need a fast OS drive for these systems either generally, as that's just your application/OS; you would need ram, but once the apps load it's all scratch & I/O to your data drives. Talk to your app users. You may not need the second cpu from a processing standpoint, but it does allow you to better utilize interrupts for heavily loaded systems in I/O. You can always start with one, though remember you have to add the same model for the second later on, and keep an eye on interrupts (software & hardware). As for hyperthreading, that goes with the above. Think about what hyperthreading does: basically it allows the cpu to QUEUE up an additional instruction but only one can be operated on at a time. This is great for filling your queues for applications that are heavily threaded. With deployments that are interrupt driven (for example storage systems; firewall/network switching, etc) you do not want to 'queue' that interrupt waiting on the execution pipeline of a single core; you want that interrupt to be handled as fast as you can, so put it on a core that can take it and run. (This is very simplistic; if interested, books on computer architecture/design may be helpful.)
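For reference, the 9.15%/13.4% UBE figures above come out of this kind of calculation (a rough sketch; it treats every bit read as an independent trial at the published 1:10^15 rate, which is a simplification):

import math

def p_ube_full_read(drives, drive_tb, ube_rate=1e-15):
    # Chance of hitting at least one unrecoverable bit error when reading
    # every sector of the group; each bit treated as an independent trial.
    bits = drives * drive_tb * 1e12 * 8
    return 1.0 - math.exp(-bits * ube_rate)   # ~ 1 - (1 - rate)^bits

print(f"6 x 2TB @ 1:10^15: {p_ube_full_read(6, 2):.2%}")          # ~9.15%
print(f"6 x 3TB @ 1:10^15: {p_ube_full_read(6, 3):.2%}")          # ~13.4%
print(f"6 x 2TB @ 1:10^16: {p_ube_full_read(6, 2, 1e-16):.2%}")   # ~0.96%

Playing with the last line shows why the 1:10^16 rating on the better enterprise drives changes the picture so much.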
  23. Actually, an update on the switch: seems you can get a decent low-end cisco switch these days at a 'reasonable' price. The 3560X series only handles 2 10GbE ports max (at a premium, as the module is another $1500 + SFP+'s), but sans that you can get the 24-port switch for about $1500 or the 48-port version for double. Which is actually not that shabby. (Really wish for some more 10GbE, but you could over-subscribe by stacking switches I guess, or go out and get the 4900M or Nexus 5K's.)
  24. Ok, long post and a lot of mixed directions; I'll try to at least summarize some design points. On the storage front first: from your description of what you are using the system for, it sounds like you will end up with a highly random I/O to disk workload (both reads & writes). This is a killer for parity raids (the reason why many large sans do small stripe width parity groups or use raid 1+0 to offset that). However that also has an effect on your usable storage space (which is why, to reach a certain performance goal, you may end up with several times the amount of space you were looking for in the first place to get the spindles). You don't mention your local network speed, assuming gbit, nor do you mention how many 802.3ad (link aggregation/port channel/teamed nics) links you may have to your server. With (I'm assuming again) SMB drive mappings your request size would be ~64KiB. Also I don't see mention of backup traffic (are the backups happening on the main server or over the network to another backup system? Is it file or volume level backups?). Would suggest NOT to have a single large chassis for all drives but to have multiple chassis. You mention the Chenbro 50x; I would rather suggest having TWO Supermicro SC837E26, which you can get for around ~$1650/each. This avoids putting all your eggs in one basket and has higher storage density (56 drives in 6U). The E26 version is a dual-expander, which won't help you with hardware raid cards (they don't support multi-pathing) but it avoids a forklift upgrade in the future if you want to go towards a HA type build (if you don't see this happening you can save a little bit with the E16 version). Ideally you would want to run a single chassis off a single card, mainly due to issues with stack pointers with hardware raid cards, or for higher redundancy with HBA's. Basically what this means is that, for example, the Areca (and others) have a finite limit in recording state change and recovery pointer information (usually 128 or 256 entries). This is fine for small environments, but say you have a power failure with 28 drives combined with a failed drive (for example a failed drive which causes a short, or external power problems): when power returns the card has a reference of where the array was; now let's say it happens again (bad breaker/fuse so it keeps tripping). Since the array started a recovery it has pointers for 28 volumes; another fault would show worst-case 28 drive 'removals', and when power comes back another 28 'device inserted/start recovery' events, so now you have the initial 28 + 28 removed + 28 inserted = 84 entries. This really fills up the queue. I would be hard pressed to put more than say 32 drives directly off a raid card without some other type of infrastructure behind it (sas switch/more robust pathing et al). You need to feel out your own risk level for this particular type of problem. On top of the above, if you are looking at areca or any hardware raid card, look for ones WITHOUT internal expanders. Yes, this limits you to 8 channels. The reason is mainly compatibility: removing the on-board expander on the card seems to really help in working with other vendors' (LSI, Vitesse, etc.) chassis expanders.
This has another side effect: these cards lack memory, and if you are running windows (ntfs, or other traditional file systems) the raid cache really does help even out workloads (small random writes can be saved up to write an entire stripe at a time, thereby avoiding the parity raid write hole or at least mitigating it). Other options would be to run other file systems where you can mitigate this by running HBA's (as opposed to raid cards, i.e. 'software raid') and use different caching algorithms (you mention ssd's in conjunction with ZFS; those can really help with your ZIL volumes), and put $$ into memory for your san head. However I DO NOT RECOMMEND this if you are not familiar with the OS/file system. The 02:00 rule applies strongly here (i.e. ultimately it's you who needs to administrate the system. What system are you most comfortable with to troubleshoot at 02:00? A good admin on windows will do better than a poor admin on unix and vice versa). As for raid 'levels' and their resultant availability metrics, I've attached a spreadsheet that you can play with to see what your comfort zone is. As you will see, large drives >1TB are very poor with uncorrectable bit error ratios, which take precedence with large capacity as a data loss item (this is why CRC file systems are so hot now, as it's a means for 'cheap' solutions to help bridge the data integrity chasm). As for your san/nas head system, that's a good choice, either that or the X8DTH-6F, as these have dual IOH's which gives you much more room for growth. Specific Q responses: 1) For video editing it really comes down to your software support and what you're trying to do as to the type of card to buy. Quadro cards have better support for colour accuracy and multiple channels but, as you correctly point out, they're much more expensive. I haven't done much video work recently (mainly used Premiere in the past) but that was before a lot of the offloads as well. This is more of a question related to your application and its support. 2a) It /should/, however you would need to test specifically. For example right now I'm getting the SC837E26 shipped here along with some ST2000DL003 drives and the LSI00188 HBA for validation testing. When doing large builds, if all the parts are not under support matrices you need to test. Fallout items can be simple recognition of the hardware; issues under stress; how things are handled when devices are not working correctly (i.e. a slow/bad drive could take down/degrade an entire chassis). If possible, work with your vendor to see if you can do this with perhaps just a restocking fee if it doesn't work out for a particular part. 2b) RAID cards should not disable on-board items, however remember that there is limited ROM space for all add-in items on a system, so you may need to disable items to free up space for other items to load. Also some items may have compatibility issues with chipsets (link speed negotiations, etc.). 3) It's possible, see #2b above. With raid cards you are generally limited in the number you can put in (areca for example has a limit of 4). For larger builds I'm moving towards HBA's (up to 6) and 10GbE, eventually going to a clustered file system for growth (i.e. each 'node' would have say up to 168 drives attached, on a 10GbE switched fabric, using Lustre or GFS), but that's much more involved than what you are looking at.
4) Yes, from a RAID6 level itself; however you'll run into UBE availability issues as well as performance issues. I've found that with 4KiB sector drives you get better performance when your data drives are in binary multiples (2^n, so 2, 4, 8), which for raid6 would be 4, 6, or 10 total drives. A wider stripe also hurts performance compared to the same number of drives in a smaller stripe format (i.e. with 24 drives, 4 raid6 groups of 6 drives will give better write performance than 3 raid6 groups of 8 drives, which beat 2 raid6 groups of 12 drives). Now with large writes or streaming performance, where the write hole does not apply, this is not true. You have to match your subsystem to your workload. Now on to general arch: from what you've described, not knowing your budget, and assuming that you are going to use windows as your san/nas head and that data integrity is not an issue you can afford to fight now, I would probably do something like this:
1Gbit Ethernet switch fabric w/ 802.3ad support (cisco is pricey here but look at the Dell (6224) or HP lines, which though the interfaces suck have decent performance for the $$). Redundant if you really need that, but doubtful initially.
NAS/SAN head:
Supermicro X8DTH-6F or X8DAH-6F
small 3U chassis for the board above (3U as that allows full-height cards)
(6) DDR3 ECC ram sticks; minimum of 12GB (2GB each), 1333MHz
(2) Xeon CPU's, X56xx series with 6.4GT/s QPI; does not need high clock rates unless you want software raid, since with this offloaded to the raid cards you are just pushing internal I/O. Hyperthreading should be disabled, as well as power saving options.
(3) SC837E26 or SC837E16 chassis
(2) local SATA/SAS drives for OS (mirrored); can be low speed.
*(45) drives assuming you want to use 3TB drives (3 hot spares, 1 per chassis; 35TB initial plus 20% YoY growth for 5 years (~90TB)) in a 4D+2P deployment
**(3) ARC-1880x or ARC-1880ix-12 w/ 4GB RAM
**(3) BBU's for the above cards
*** cheap LTO4 drive + tapes; or another SC837E26 chassis with a separate card (LSI00188 HBA) and do a dynamic disk volume for it.
* Other option would be to use more drives (78 2TB drives with 3 HS).
** This in a pinch could be a single card and you can daisy-chain the chassis (SFF-8088 to SFF-8088), with plans to go to this mode in the future.
*** Personally I would prefer tape as it's offline, however it's slow (120MB/s) so a full backup here would take a long while.
Main video workstation:
Supermicro or desktop board (can your video process take advantage of SLI?)
CPU's - dependent on the video process (do you need cores or high frequency or both?)
Memory - what you need for your application
Internal drives (SSD or traditional HD), perhaps SSD for scratch space in raid-0 for post work
Video card(s) dependent on application
Basically, once you build your nas/san head it's relatively static and doesn't need to be extreme. Then you can build out your application systems to whatever specs you need for that particular application. That way your cost of upgrade is cheaper (just hit the piece you need). 20090530-Raid_Reliability_Worksheet.zip
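If you want to sanity-check the drive counts above, the capacity math is simple enough to script (a rough sketch; decimal TB, no filesystem overhead, and hot spares excluded from usable space):

# Capacity sanity check for the parts list above.

def growth_target_tb(initial_tb, yoy_growth, years):
    return initial_tb * (1.0 + yoy_growth) ** years

def usable_tb(total_drives, hot_spares, data_drives, parity_drives, drive_tb):
    groups = (total_drives - hot_spares) // (data_drives + parity_drives)
    return groups * data_drives * drive_tb

print(f"5-year target: {growth_target_tb(35, 0.20, 5):.0f} TB")
print(f"45 x 3TB, 4D+2P, 3 HS: {usable_tb(45, 3, 4, 2, 3):.0f} TB usable")
print(f"78 x 2TB, 4D+2P, 3 HS: {usable_tb(78, 3, 4, 2, 2):.0f} TB usable")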
  25. You seem to have a lot of different types of workloads here (file sharing; VM machines; p2p apps; et al). These workloads are not the same, so to handle them all you will be going toward low performance and just capacity (mainly what I see in general IT shops, unfortunately). Plus you seem to want to combine other functions on your NAS head (CIFS/SMB; NFS; et al), but you mention items like HTTP; VM; et al, and those should be on a separate system that mounts your storage head via iscsi; cifs/smb; nfs, etc. It should not be the same box. Anyway, some suggestions for high availability/flexibility:
- Your server 'head' should be a different box than your disk chassis, OR find a box that you can put /all/ your disks in. The problem is if you split drives between your head and remote units you have a much higher risk of data loss due to external factors (power; cabling, etc).
- If using external chassis (assuming direct attachment), have them configured with enough redundancy to handle loss, or take the risk. I.e. if doing a RAID 1+0, have two chassis and split each raid 1 between them so you can lose an entire chassis.
As for chassis types, I've used the AIC RSC-3EG's in the past http://www.aicipc.com/ProductParts.aspx?ref=RSC-3EG2 mainly as they have large (120mm) fans so can run quieter. However I'm now switching over to the Supermicro 837E26 http://www.supermicro.com/products/chassis/3U/837/SC837E26-RJBOD1.cfm as it holds 28 drives per 3U and has a dual expander (for multi-pathing with sas drives). I would suggest a SAS chassis just because it can handle sata & sas drives; plus with a sas backplane/expanders you would have the ability to grow/expand your data store a bit easier (even more so with a sas switch). As for raid widths, a large width (many drives per raid group) has some benefit with large streaming I/O but hinders performance with random I/O, not to mention lower MTTF_DL values. With larger drives like you're talking about, I would be hard pressed to put together a system (if data availability is an issue) with a width larger than 6 drives in a RAID6 (raidz2; 4 data + 2 parity). This also increases write performance for random writes when using small parity groups in a multi-level raid setup (a single parity group, regardless of how large, will have the same write performance for writes <= stripe width, i.e. ~the speed of a single drive). If you're interested in data integrity in addition to availability (i.e. what you read is what you wrote, to put it simply), then you are down to using ZFS and/or SAS drives w/ T10 DIF. However SAS really increases the cost for such a small project (assume 3-4x the cost of the same sized SATA drive). Anyway, I would suggest you really narrow down exactly what you want this system to DO, what performance metrics you want to meet for X application, and then design towards that. Right now this seems more of a first-generation list of items mainly from a capacity standpoint with no regard to availability/integrity/performance for any one of the applications.