Henri BR

72Tb+ NAS, Extensible - Newbie build


While searching for "forum storage chassis" on Google, I found this site and joined the community :)

I've started a thread on Tom's Hardware, and a user there has been of GREAT help, since I'm not familiar with storage servers.

We've decided on a new build, and I'd really appreciate it if you could have a look at it and help with this new rig.

Link to the thread

There, we've started specifying a NAS solution; some of the requirements are:

  • At least 48TB or 16 HDDs; we'll probably go with 24 HDDs / 72TB;
  • Support for SATA and SAS 6 Gbit/s HDDs; SSDs for the operating system(s) are a plus;
  • Capable of handling 15 local users (on the LAN) editing and working with educational content/files:
    ... The content is in audio, video (some videos are HD) and book formats;
  • At the same time, share this content via eMule/BitTorrent-style apps, as well as through HTTP/FTP;
    ... This content will be shared mainly via file-sharing apps.
    ... HTTP/FTP is not the main concern or goal.
  • Remote access for ~20 people;
  • The solution will also be used for learning and using virtualization (Hyper-V R2, VMware, etc.);
  • And most important: be somewhat future-proof.

We've been reading a lot to learn as much as we can.

Anyway, we couldn't understand some things about the available equipment parts/technologies.

It won't be that hard.

Would you care to read that thread (link) and share some insight on how we could meet these requirements?

If there's a way to achieve better results and we need to change something stated in the thread, that's very welcome.

Critical analysis and questions are just as welcome.

...

Right now, it's been somewhat hard to understand and choose which enclosure/chassis we'll need.

The only thing I'm sure of is that we'd need at least a 24-bay chassis.

Maybe more headroom would be wise; I don't know, since we may actually need more than 40-50TB of storage.

So, we're completely lost on this.

Could you review that and help?

Thanks for sharing,

:D

Edited by Henri BR


You seem to have a lot of different types of workloads here (file sharing, VMs, P2P apps, et al). These workloads are not the same, so handling them all on one box pushes you toward low performance and mere capacity (mainly what I see in general IT shops, unfortunately). You also seem to want to combine file-serving functions on your NAS head (CIFS/SMB, NFS, et al) with items like HTTP and VMs; those should run on a separate system that mounts your storage head via iSCSI, CIFS/SMB, NFS, etc. It should not be the same box.

Anyway, some suggestions for high availability/flexibility:

- Your server 'head' should be a different box from your disk chassis, OR find a chassis that can hold /all/ your disks. The problem is that if you split drives between your head and remote units, you have a much higher risk of data loss due to external factors (power, cabling, etc.).

- If using external chassis (assuming direct attachment), configure them with enough redundancy to handle the loss of one, or accept the risk. I.e., if doing RAID 1+0, have two chassis and split each RAID 1 pair between them so you can lose an entire chassis.

As for chassis types, I've used the AIC RSC-3EGs in the past http://www.aicipc.com/ProductParts.aspx?ref=RSC-3EG2 mainly because they have large (120mm) fans and so can run quieter. However, I'm now switching over to the Supermicro 837E26 http://www.supermicro.com/products/chassis/3U/837/SC837E26-RJBOD1.cfm as it holds 28 drives in 3U and has dual expanders (for multi-pathing with SAS drives).

I would suggest a SAS chassis just because it can handle both SATA and SAS drives; plus, with a SAS backplane/expanders you can grow/expand your data store a bit more easily (even more so with a SAS switch).

As for RAID widths, a large width (many drives per RAID group) has some benefit for large streaming I/O, but it hurts random I/O performance, not to mention lowering MTTDL values. With drives as large as you're talking about, I would be hard pressed (if data availability matters) to put together a system with a width larger than 6 drives in RAID6 (raidz2; 4 data + 2 parity). Small parity groups in a multi-level RAID setup also improve random write performance: a single parity group, regardless of how large, will have roughly the write performance of a single drive for writes smaller than the stripe width.
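As a back-of-the-envelope illustration of that capacity/write trade-off, here is a small sketch. The penalty factors are the standard read-modify-write figures (RAID1+0 = 2, RAID5 = 4, RAID6 = 6) and are assumptions for illustration, not measurements from this thread:

```python
# Rough comparison of usable capacity vs. the classic small-write penalty
# (disk I/Os per random host write). The penalty factors are the standard
# read-modify-write figures, not numbers measured on this hardware.
LAYOUTS = {
    "RAID1+0 (any width)":            (0.50, 2),
    "RAID6, 6-drive group (4D+2P)":   (4 / 6, 6),
    "RAID6, 12-drive group (10D+2P)": (10 / 12, 6),
}

for name, (usable_frac, penalty) in LAYOUTS.items():
    print(f"{name:32s} usable/raw = {usable_frac:4.0%}, "
          f"disk I/Os per random write = {penalty}")
```

Wider RAID6 groups buy usable capacity, not write performance, which is why the narrow-group recommendation above costs space but not speed.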

If you're interested in data integrity in addition to availability (i.e., what you read is what you wrote, to put it simply), then you are down to using ZFS and/or SAS drives with T10 DIF. However, SAS really increases the cost for such a small project (assume 3-4x the cost of a SATA drive of the same size).

Anyway, I would suggest you really narrow down exactly what you want this system to DO and what performance metrics you want to meet for each application, then design toward that. Right now this reads more like a first-generation list of items, mainly from a capacity standpoint, with no regard to availability/integrity/performance for any one of the applications.


Thanks for your great work Steve!

I'll update this post with some information once I have finished studying some points you've explained.

Probably today.

Thanks again.


Thanks for your efforts Steve,

You're more than right that we should narrow down the priorities for this system. Although we can't afford two or more specialized systems/servers right now, I'll try to explain the priorities so it may be easier to design accordingly; in the future we can add a second box to deal with different workloads, as you've wisely explained.

We've been studying some points, and I've been reading a lot to learn as much as I can since we decided on a custom build. Some of your considerations are a bit hard for me to figure out, so I'm still studying systems, protocols, etc. What is also nice is that you point me toward what I need to devote more study time to. Thanks for this.

What is the scope we need to design for?

We are going to start receiving a large number of files in August, once we finish dealing with copyrights and owners – video, audio and eBooks – so we need to prepare the infrastructure to edit, organize and share them. Within 4 to 8 months after we start, we may have more than 2,000 videos, many in HD quality, thousands and thousands of long audio lectures and a few hundred eBook files.

Around 20% to 30% of the video files will need small edits and corrections before we share them on the P2P networks – and we'll need to encode ALL these videos into a specific file format. The same goes for the audio lectures: we'll need to cut them into shorter pieces, encode them as MP3 and compress (zip or rar) them before sharing. We're probably talking about 20TB to 35TB of files within one year.

We thought of providing these files through HTTP and FTP in addition to P2P, but that is not a must IF it affects the overall performance of our resources – unlike the P2P networks, which are a must (torrents, eDonkey or similar). Although HTTP/FTP is not a must, we do have to provide a way for the raw/base files to be uploaded to the server whenever needed – we'll need to provide remote access to around 20 people/groups so they can upload files and help with these edits, corrections, encodings, cataloging, etc.

Also, as we've needed centralized storage for our team, mostly for system and file backups, we'll be using the server to fill this need. We have only 5 computers on our network, but we frequently have 15 to 20-25 people helping us with our daily activities, each with a laptop on the network. The server will be accessed by these people in addition to the remote users, so we can improve the production process. We'll only back up those 5 computers, not our friends' laptops.

These are the primary roles of this server, and we can adapt if something doesn't work.

Summarizing it:

- The main role is video, audio and eBook storage and sharing through P2P (and maybe HTTP/FTP IF it works) over a 200Mbit FTTx connection (maybe 500Mbit in the near future);

- Some minor video and audio editing, but fairly heavy encoding and compression workloads;

- Remote access for ~20 people so they can upload files and also help with the production process;

- Local access for up to 25 people so they can also help with daily activities;

- Weekly backup for 5 computers on the LAN;

- 20TB to 35TB of files within one year, in addition to another 15TB to 30TB of backups (so, 35 to 65TB/year).

We thought of building a workstation for video editing and audio encoding. As our priority is the server, and the budget is somewhat low, building a workstation in addition to this server would be costly compared to just throwing a video card into it. So I may be wrong, but since our main role is not video editing, I think this approach may fulfill the video editing need and save us some money for a better server and network infrastructure, which we also don't have yet. I've mentioned it in this link.

1. What do you think about this as a way to contain costs? Should that kind of video card work well enough that we don't need those overpriced and outdated Quadro cards?

By saving some money, we can afford a better infrastructure and follow your advice about putting all the disks in a single chassis, with some headroom. I visited all these manufacturers' websites and found this one: Chenbro, 50x 3.5" hot-swap 6Gb/s SAS HDDs. Maybe we could buy a second Areca ARC-1880ix-24-4G and fill it with 44 3TB disks and 4 SSDs for the O/S. Alternatively, we could use 2 SSDs in RAID for the O/S and another 1 or 2 for specific purposes where performance is needed - I really don't know yet - What I do know is that we won't have more than 4 or 6 SSDs, due to cost.

2. Should this enclosure/chassis work with the hardware parts I've selected so far (link)?

As I have never had a dedicated RAID controller card, I don't know whether it disables the motherboard controller. I mention it because I've planned 3 SSDs in RAID 5 for the O/S and for virtualization learning and tests. As that RAID card has 24 ports, I've picked 20 HDDs, 3 SSDs and a slim Blu-ray burner. If the onboard controller is not disabled when we use a dedicated card, I'd like to use 4 SSDs in RAID 10 or some other configuration like the one above, and plug the Blu-ray burner into the motherboard.

3. Is that possible without problems? I mean, using the onboard controller in addition to a dedicated RAID controller card, or maybe even 2 of these RAID cards?

We are not so worried about availability for the external public - we'd love to improve it, but it's not something that will cause us headaches, as we're not charging for or selling anything. Data integrity, though, is all-important. With regard to SAS disks, it's exactly as you said: costly. A cost we'd like to pay, but cannot afford. RAID6, with 6 or more HDDs, right? I'm wondering about the rebuild time of 8 3TB SATA3 drives in RAID 6 and how it would perform for what we're proposing.

4. Would 8 drives in RAID6 work well for us, or would 6 or another configuration be better?

As a final note, with regard to OS/system/software, we are going to use whatever fits the requirements best.

Linux-based options would be a bit painful because we'd need to learn them, but that's no problem at all.

Maybe VMs for learning more...

I'm sorry for the long post; I was trying to explain as well as I can.

Have a great week :D

Edited by Henri BR


OK, long post and a lot of mixed directions; I'll try to at least summarize some design points. On the storage front first:

From your description of what you are using the system for, it sounds like you will end up with a highly random I/O workload on disk (both reads and writes). This is a killer for parity RAID (which is why many large SANs use small stripe-width parity groups or RAID 1+0 to offset it). However, that also affects your usable storage space, which is why, to reach a certain performance goal, you may end up with several times the capacity you were originally looking for just to get the spindles.

You don't mention your local network speed (assuming gigabit), nor how many 802.3ad links (link aggregation/port channel/teamed NICs) you may have to your server. With SMB drive mappings (I'm assuming again), your request size would be ~64KiB. I also don't see any mention of backup traffic (are the backups happening on the main server, or over the network to another backup system? Are they file-level or volume-level backups?)
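For a rough sense of scale, a quick sketch assuming a plain gigabit link and the ~64KiB request size mentioned above (both are illustration figures, not measurements):

```python
# How many ~64KiB SMB-sized requests fit through one 1GbE link at line rate?
# 125 MB/s is the raw 1Gbit figure; real-world SMB throughput will be lower.
link_bytes_per_s = 1_000_000_000 / 8        # 1 Gbit/s ~ 125 MB/s raw
request_bytes = 64 * 1024                   # assumed ~64 KiB per SMB read/write
requests_per_s = link_bytes_per_s / request_bytes
print(f"~{requests_per_s:.0f} requests/s per gigabit link at line rate")
# ~1900 requests/s; 802.3ad adds links, but any single flow still rides one link.
```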

I would suggest NOT having a single large chassis for all drives, but multiple chassis. You mention the Chenbro 50-bay; I would rather suggest TWO Supermicro SC837E26, which you can get for around ~$1650 each. This avoids putting all your eggs in one basket and has higher storage density (56 drives in 6U). The E26 version is a dual-expander model, which won't help you with hardware RAID cards (they don't support multi-pathing), but it avoids a forklift upgrade in the future if you want to go toward an HA-type build; if you don't see that happening, you can save a little with the E16 version.

Ideally you would want to run a single chassis off a single card, mainly due to issues with state-tracking limits on hardware RAID cards (or, with HBAs, for higher redundancy). Basically, cards like the Areca (and others) have a finite limit on recording state-change and recovery-pointer information (usually 128 or 256 entries). This is fine for small environments, but consider a power failure combined with a failed drive (say a failed drive causes a short, or you have external power problems) on a 28-drive chassis: when the array comes back up, the card records a reference to where it was. Now suppose it happens again (a bad breaker/fuse keeps tripping); since the array started a recovery it has pointers for 28 volumes, another fault would show, worst case, 28 drive 'removals', and when power comes back another 28 'device inserted/start recovery' events, so you now have the initial 28 + 28 removed + 28 inserted = 84 entries. That really fills up the queue. I would be hard pressed to put more than about 32 drives directly off a RAID card without some other type of infrastructure behind it (SAS switch, more robust pathing, et al). You need to gauge your own risk tolerance for this particular type of problem.

On top of that, if you are looking at Areca or any hardware RAID card, look for ones WITHOUT internal expanders. Yes, this limits you to 8 channels. The reason is mainly compatibility: removing the on-board expander on the card seems to really help when working with other vendors' (LSI, Vitesse, etc.) chassis expanders. A side effect is that these cards come with less cache memory, and if you are running Windows (NTFS or other traditional file systems), the RAID cache really does help even out workloads (small random writes can be accumulated to write an entire stripe at a time, thereby avoiding, or at least mitigating, the parity RAID write hole).

Another option would be to run a different file system, using HBAs (as opposed to RAID cards, i.e. 'software RAID') with different caching algorithms (you mention SSDs; in conjunction with ZFS those can really help as ZIL devices) and put your money into memory for your SAN head. However, I DO NOT RECOMMEND this if you are not familiar with the OS/file system. The 02:00 rule applies strongly here (i.e., ultimately it's you who needs to administer the system: what system are you most comfortable troubleshooting at 02:00? A good admin on Windows will do better than a poor admin on Unix, and vice versa).

As for RAID 'levels' and their resultant availability metrics, I've attached a spreadsheet you can play with to see where your comfort zone is. As you will see, large drives (>1TB) fare very poorly against uncorrectable bit error ratios, which become the dominant data-loss factor at large capacities (this is why checksumming file systems are so hot now; they're a means for 'cheap' solutions to help bridge the data integrity chasm).

As for your SAN/NAS head system, that's a good choice; either that or the X8DTH-6F, as these have dual IOHs, which gives you much more room for growth.

Specific Q responses:

1) For video editing, the type of card to buy really comes down to your software's support and what you're trying to do. Quadro cards have better support for colour accuracy and multiple channels, but as you correctly point out, they're much more expensive. I haven't done much video work recently (mainly used Premiere in the past), and that was before a lot of the offload features as well. This is more a question of your application and its support.

2a) It /should/, however you would need to test it specifically. For example, right now I'm getting an SC837E26 shipped here along with some ST2000DL03 drives and an LSI00188 HBA for validation testing. When doing large builds, if all the parts are not on support matrices, you need to test. Fallout items can be simple recognition of the hardware, issues under stress, or how things are handled when devices are not working correctly (i.e. a slow/bad drive could take down or degrade an entire chassis). If possible, work with your vendor to see whether you can do this with perhaps just a restocking fee if it doesn't work out for a particular part.

2b) RAID cards should not disable on-board controllers; however, remember that there is limited option ROM space for all add-in devices on a system, so you may need to disable some items to free up space for others to load. Also, some cards may have compatibility issues with certain chipsets (link speed negotiation, etc.).

3) It's possible; see #2b above. With RAID cards you are generally limited in the number you can put in (Areca, for example, has a limit of 4 per system). For larger builds I'm moving toward HBAs (up to 6) and 10GbE, eventually going to a clustered file system for growth (i.e. each 'node' would have, say, up to 168 drives, attached to a 10GbE switched fabric and using Lustre or GFS), but that's much more involved than what you are looking at.

4) Yes, at the RAID6 level itself; however, you'll run into UBE availability issues as well as performance issues. I've found that with 4KiB-sector drives you get better performance when your data drives are in binary multiples (2^n: 2, 4, 8), which for RAID6 means 4, 6 or 10 drives total. A wider stripe also hurts performance compared to the same number of drives split into smaller groups (i.e. with 24 drives, 4 RAID6 groups of 6 drives will give better write performance than 3 RAID6 groups of 8 drives, which in turn beat 2 RAID6 groups of 12 drives). With large writes or streaming, where the write hole does not apply, this is not true. You have to match your subsystem to your workload.
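To make the 24-drive comparison concrete, here is a quick sketch of the usable space each grouping gives (3TB drives assumed purely for illustration):

```python
# 24 drives carved into RAID6 groups of different widths: more, smaller
# groups trade usable capacity for more independent parity groups
# (better random-write behaviour, smaller failure domains).
def carve(total_drives, group_width, drive_tb=3.0):
    groups = total_drives // group_width
    data_per_group = group_width - 2          # RAID6: 2 parity drives per group
    return groups, groups * data_per_group * drive_tb

for width in (6, 8, 12):
    groups, usable = carve(24, width)
    print(f"{groups} x RAID6 groups of {width}: {usable:.0f} TB usable")
# 4 x 6 -> 48 TB, 3 x 8 -> 54 TB, 2 x 12 -> 60 TB
```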

Now on to general arch:

From what you've described, not knowing your budget, and assuming that you are going to run Windows on your SAN/NAS head and that data integrity is not an issue you can afford to fight right now, I would probably do something like this:

1Gbit Ethernet switch fabric with 802.3ad support (Cisco is pricey here, but look at the Dell (6224) or HP lines, which, though their interfaces are clunky, have decent performance for the money). Redundant if you really need that, but doubtful initially.

NAS/SAN Head:

Supermicro X8DTH-6F or X8DAH-6F

Small 3U chassis for the board above (3U allows full-height cards)

(6) DDR3 ECC RAM sticks; minimum of 12GB (2GB sticks), 1333MHz

(2) Xeon X56xx-series CPUs with 6.4GT/s QPI. These do not need high clock rates unless you want software RAID; since the work is offloaded to the RAID cards, you are just pushing internal I/O. Hyperthreading should be disabled, as should power-saving options.

(3) SC837E26 or SC837E16 chassis

(2) local SATA/SAS drives for the OS (mirrored); they can be low speed.

*(45) drives, assuming you want to use 3TB drives (3 hot spares, 1 per chassis; sized for 35TB initial plus 20% YoY growth for 5 years, ~90TB), deployed as 4D+2P groups - see the sizing check just below this list.

**(3) ARC-1880x or ARC-1880-ix-12 with 4GB RAM

**(3) BBUs for the above cards

*** A cheap LTO4 drive + tapes, or another SC837E26 chassis with a separate card (LSI00188 HBA) set up as a dynamic disk volume.

* Another option would be to use more, smaller drives (78 2TB drives with 3 hot spares).

** In a pinch this could be a single card, daisy-chaining the chassis (SFF-8088 to SFF-8088), with plans to go to three cards in the future.

*** Personally I would prefer tape as it's offline; however, it's slow (120MB/s), so a full backup here would take a long while.
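A quick arithmetic check of the starred 45-drive line (decimal TB, 20% yearly growth and 4D+2P groups as listed; this is only a rough consistency check, not a sizing tool):

```python
# Consistency check for the starred line above: 45 x 3TB drives, 3 hot spares,
# RAID6 groups of 4 data + 2 parity, against 35TB growing 20% a year for 5 years.
total_drives, hot_spares = 45, 3
group_width, data_per_group, drive_tb = 6, 4, 3.0

groups = (total_drives - hot_spares) // group_width     # 42 // 6 = 7 groups
usable_tb = groups * data_per_group * drive_tb          # 7 * 4 * 3 = 84 TB
target_tb = 35 * 1.2 ** 5                               # ~87 TB after 5 years
print(f"{groups} groups -> {usable_tb:.0f} TB usable vs ~{target_tb:.0f} TB 5-year target")
```

So the 45-drive figure lands roughly on the ~90TB target quoted in the list (slightly under, before file-system overhead).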

Main video workstation:

Supermicro or Desktop board (Can your video process take advantage of SLI?)

CPUs - dependent on the video process (do you need cores, high frequency, or both?)

Memory - whatever you need for your application

Internal drives (SSD or traditional HD), perhaps SSDs in RAID 0 as scratch space for post work.

Video card(s) - dependent on the application

Basically, once you build your NAS/SAN head, it's relatively static and doesn't need to be extreme. Then you can build out your application systems to whatever specs you need for each particular application. That way upgrades are cheaper (you just hit the piece you need).

20090530-Raid_Reliability_Worksheet.zip


Thanks very much Steve!

I have just edited my last post while you were writing yours.

I edited from question 1 onward; it was about SSDs and O/S specifics.

I'm going to read your comments right now ;)


Actually, an update on the switch: it seems you can get a decent low-end Cisco switch these days at a 'reasonable' price. The 3560X series only handles 2 10GbE ports max (at a premium, as the module is another $1500 plus SFP+s), but without that you can get the 24-port switch for about $1500, or the 48-port version for double. Which is actually not that shabby. (I really wish for some more 10GbE, but you could over-subscribe by stacking switches, I guess, or go out and get the 4900M or Nexus 5Ks.)


Thanks for organizing my mess.

By separating things you helped a lot!

I think it takes me 10x more time to read and understand it than it takes you to write it. That is due to my lack of knowledge about storage and file systems - not the English language or your great explanation. So here are some opportunities to learn - to research what RAID parity, stripe width and stripe size, multi-pathing, etc. are... As you may see, for the 02:00 rule, I'll be much more comfortable if it is NOT Unix/Linux, at least for a while - and that's why I've been talking about virtualization: Hyper-V / Server R2 in some edition / VMware, and others. With that, we'll have the opportunity to learn about a lot of things we can't do right now, as we'll have a real server in our environment to help with it. That's the importance of virtualization for now - nothing critical at all.

To start, I'm going to answer some of your questions so we can refine the thread a bit more...

We have a modest network with 5 computers on the LAN. Although all of these computers have dual GigE onboard adapters, we only have a 10/100 switch. As our network requirements are about to change, we are going to upgrade it with Cisco parts. For a router, we were thinking about something with integrated services that is able to handle (see the attachment) at least a 200Mbit FTTx internet connection with a dual WAN port; as gigabit internet connections seem likely to become relatively cheap within 3-5 years, a router that can handle at least 500Mbit may be the way to go. For a switch, we'd like a Cisco with 24 or maybe 48 ports, with at least 2 or 4 10GbE ports - I haven't researched it yet. Anyway, the PDFs below show the performance of some Cisco parts - these are the most up-to-date files I could find. It is important to remember that it's time for NAS right now, not a SAN structure, due to cost - so the network equipment will follow this requirement.

Cisco Routers Performance

Cisco Switches Performance

We're running Windows 7 Ultimate and, for backups, external SATA disks with Acronis True Image Home; we do partition-level backups every 15 days.

We also do file-level backups every 15 days, but no system files are involved.

We'll do this on the server once we're ready to buy it, but weekly, keeping up to 7 images of the partitions. That means around 15-20TB to 30TB if we consider file-level backups as well.

With regard to budget, we'll be able to afford around $10K to $14K for the system hardware without the HDDs.

Network is also not included on this budget.

When it comes to operating system, software, etc. I'm quite lost as I'm new to storage;

I have never used Unix/Linux on a daily basis, only a few times at the university.

I am, and we are, quite comfortable with Windows and similar, and willing to learn more about Unix/Linux, but not to administer it right now.

Quote: "From your description of what you are using the system for, it sounds like you will end up with a highly random I/O workload on disk (both reads and writes)..."

My main concern is with regard to P2P. When we have 500 to 1,500-2,000 people connected to this server downloading something, I'm not sure how things are going to run. I know we can set some limits and fine-tune the configuration, but I have no experience with a server providing files over a 200Mbit (and higher) connection. For torrents we can manage the maximum speed per person, which is nice, so we can set limits of e.g. 200-500KB/s per downloader, providing a faster rotation. As we are going to use other apps like eDonkey/eMule, things change a bit, as we can only set a global limit and the program decides how the connection is used - meaning that if we have 2,000 people waiting for files, the app reduces the speed per person to as little as 5KB/s so more people can download at the same time - over a 200Mbit connection, 5KB/s means up to 5,120 people connected. We still don't have a 200Mbit internet connection, only 35Mbit - in the near future we may even upgrade to 500Mbit rather than 200Mbit, so we need to plan the structure a bit more. What I mean is that I'm not sure what this will represent in terms of resource consumption, as you started to explain.
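A quick sketch of the arithmetic in the paragraph above, using the same numbers:

```python
# Upstream budget per downloader on a 200Mbit link, at the 5KB/s floor
# eMule-style clients fall back to (figures from the paragraph above).
link_kib_per_s = 200 * 1024 * 1024 / 8 / 1024   # 200 Mbit/s ~ 25,600 KiB/s
per_peer_kib_per_s = 5
print(f"max concurrent downloaders ~ {link_kib_per_s / per_peer_kib_per_s:.0f}")
# ~5120, matching the figure above; at 200-500 KB/s per torrent peer it's roughly 50-130.
```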

About videos/audio/PDFs...

When we need to edit a video, we can do it on our current 5 computers - they are fine for that.

Once the videos are properly edited, we can upload them to the server and convert them in batch: *.xyz to MPEG/AVI or something else.

Then we catalog them and start sharing.

We can do the same with audio - edit on the computers, upload, and batch convert/compress, keeping a lossless version as well.

PDFs won't be an issue anyway ;-)

1. What kind of array/RAID arrangement do you advise for these P2P needs? RAID 6, as you said? How many disks? What configuration?

We'll be dealing mostly with files of 20 MB and over.

As for small files, we may have no more than 3,000 PDFs - maybe those can be "hosted" in some other arrangement.

Quote: "I would suggest NOT having a single large chassis for all drives, but multiple chassis... I would rather suggest TWO Supermicro SC837E26..."

Yes, I do understand your point about putting all the eggs in one basket.

I first chose that Chenbro due to the simplicity of "setting things up" and better cable management.

I have never seen one of these storage chassis in person. Two of them seems like more complexity to me - but maybe it's not that hard.

The Supermicro ones seem very nice; I'm just not sure about airflow.

Another thing I'm not sure about is whether we'd need a separate chassis for the server to plug into them.

If so, I'm completely lost on how to assemble 1 server with 2 of these Supermicro chassis. Completely... :unsure:

That 50-HDD Chenbro seems, so far, easier "to set up and run". :D

2. Would a build with that Chenbro cause us future bottlenecks or problems?

I don't see us needing more than two rigs like this one of ~48/50 disks for the next 5 to 8 years; a total of 96/100 disks.

As a side note, we may buy the disks as needed - We are not going to buy more than 24 disks right now (probably).

Quote: "Ideally you would want to run a single chassis off a single card, mainly due to issues with state-tracking limits on hardware RAID cards... if you are looking at Areca or any hardware RAID card, look for ones WITHOUT internal expanders..."

I would never have thought about these issues myself! Thanks...

For the sake of simplicity, I thought about going with one or two of these Arecas.

We'll buy the battery backup units for these if we choose this path. So we'll have 4 slots used just for that.

I saw some threads about Arecas with HP SAS expanders; however, I'm not sure what the implications of running that kind of config are.

If we choose that Chenbro and go with cards without internal expanders, we'll need 6 slots, and things will require some changes - I mean to the storage design.

If these changes are needed, things get more complex - mostly beneficial, of course, but I'm not sure whether I can handle it, or about the costs.

As you have seen, we are tempted to go with Windows, and maybe VMware or Citrix.

If an edition of Server 2008 R2 SP1 can do all the things we need, including playing with some VMs (for learning in general), maybe that's our choice, because I'm already very familiar with Windows, if not the server editions exactly. Also, all our infrastructure is based on Windows systems - maybe there is a benefit to going with it - not sure, however.

3. Is there a better card than the Areca ARC-1880ix-24-4G for what we're after?

4. If we work with two of these cards, battery backup units and a UPS, are the chances of running into trouble high?

4a. What if we choose arrays with fewer disks in them, e.g. 6 HDDs? Do we get any benefit (fewer chances of trouble, faster rebuilds, etc.)?

Quote: "Another option would be to run a different file system, using HBAs... However, I DO NOT RECOMMEND this if you are not familiar with the OS/file system. The 02:00 rule applies strongly here..."

I'll learn more about what you're talking about.

As you "DO NOT RECOMMEND", and I'm / we are not educated on HBA's, ZFS, etc, chances are very hight that we'll mess up the things.

I don't know.

Quote: "As for RAID 'levels' and their resultant availability metrics, I've attached a spreadsheet... As for your SAN/NAS head system, that's a good choice; either that or the X8DTH-6F, as these have dual IOHs..."

Thanks for the spreadsheet. We'll probably have different RAID configs for different needs.

I'll study it further once I can break down these needs in an organized fashion.

For the motherboard, I first selected that one (link) due to the chipset version, the number of PCIe and x16 slots, and the lack of integrated video.

I'm not sure what IOH stands for - is it the I/O Hub, part of the Intel chipset architecture?

If so, the X8DAH+ model seems to have it as well:

Chipset - Intel® 5520 (Tylersburg) chipset / ICH10R +2x IOH-36D

Anyway, I still have no idea what it represents... :P

By the way, I've been reading this article (link) a friend provided me. That company seems to be using "home grade" HDDs. It would reduce the costs A LOT.

5. Would it be a VERY bad choice to buy "cheap" 3TB HDDs for what we're talking about, even if we try to "protect them" with anti-vibration kits? Why?

5a. In what cases would those cheap 3TB drives work for us? (Maybe we could buy SATA3 "enterprise grade" for some roles and "home grade" for others, in the same enclosure.)

Question: 2. Should this enclosure/chassis work with the hardware parts I've selected so far (link)?

Quote: "2a) It /should/, however you would need to test it specifically... 2b) RAID cards should not disable on-board controllers; however, remember that there is limited option ROM space..."

I didn't know we could put 2 dedicated RAID cards on the same board. I thought it would not work, or could cause some problems.

Unfortunately, we may not have the chance to play with some hardware parts and send them back if they don't work for us.

6. Are there any pros or cons to using 2 SSDs in RAID plugged directly into the motherboard for the OS, leaving the RAID card just for storage?

6a. Could we also use 2 or 4 SSDs in RAID 0/10 (with daily backup) and plug in another 2 for other purposes (video/audio conversion, VMs), and also have another 2 RAID cards?

Maybe only 2 SSDs for the O/S, 2 for other purposes, and a slim BD drive, or...

4 SSDs for the O/S (RAID10), 1 for other purposes, and 1 BD drive.

(The insistence is for peace of mind, as this isn't something we can afford to rebuild every year.) :-)

As a side note, here is the relevant X8DAH+ spec:

SATA:

- Intel ICH10R SATA 3.0Gbps controller (I'm not sure whether it'd be a bottleneck in the above config)

- RAID 0, 1, 5, 10 support (Windows)

- RAID 0, 1, 10 support (Linux)

- Six Serial ATA ports

- Six SATA hard drives supported

Quote: "Now on to general arch: From what you've described... I would probably do something like this:" [the 1Gbit switch fabric, NAS/SAN head parts list and video workstation spec from the post above]

I wish I had your knowledge to put these parts together and say "hey, it's right there and ready, just use it!" :D

About the motherboards: the "F" versions seem to be the ones with "Integrated IPMI 2.0 with Dedicated LAN" and integrated graphics.

I don't see the need for IPMI at the moment - I don't even know how to use it =D

Something we're really lacking here is rackmount storage chassis - very few options on the market.

Importing 3 of them could be very costly due to the weight and those much-appreciated 60% government taxes.

As I also lack some knowledge on how to assemble things this way, I'm not sure whether it is the way to go.

There are a few drawbacks here, and most are due to costs we can't afford right now.

We'll also have to build the network infrastructure, so this "toy" may not be something we can play with :P

Anyway, it is great for further study, and there are interesting points you've shown us:

7. From what you've learned, would we actually need 2 CPUs right now? (X56xx series)

8. Why should hyperthreading be disabled, as well as power-saving options? What if we leave HT enabled?

Quite long, I know. I hope it doesn't take you a whole morning to help.

I mean, I hope it doesn't take a lot of your time and effort.

Many thanks,

Cisco.pdf

Edited by Henri BR


First, I will concede the longest-post title to you for this thread. ;)

Quote: "...for a router we were thinking about something with integrated services that is able to handle at least a 200Mbit FTTx internet connection with a dual WAN port... For a switch, we'd like a Cisco with 24 or maybe 48 ports, with at least 2 or 4 10GbE ports..."

Those are older documents; however, the 3945Es are nice routers - we just put some in here at work for some 600Mbps MPLS VPNs. As for switches, the 3560Xs are the current low-end line. They are actually, as I mentioned, not that bad in price (barring your taxes/fees, ouch!).

Quote: "We're running Windows 7 Ultimate and, for backups, external SATA disks with Acronis True Image Home... around 15-20TB to 30TB if we consider file-level backups as well."

Good that you have a backup plan; most places forget about it until it's too late. Doing partition-based backups (VSS) will greatly improve performance, as it avoids individual file locking.

Quote: "My main concern is with regard to P2P. When we have 500 to 1,500-2,000 people connected to this server downloading something, I'm not sure how things are going to run..."

You have it partially: yes, you'll have a maximum cap at your internet speed, which is good. The problem is that the requested data itself is going to be randomized and small, which requires a lot of IOPS from your subsystem. Some tuning can help (i.e. caching, turning off access-time updates, etc.), but ultimately you're going to be at the mercy of the drive subsystem. This is where more 'spindles' wins.
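For a rough feel of what that means in spindles, a hedged sketch: treating the P2P traffic as random ~64KiB reads and assuming ~80 random IOPS per 7,200rpm SATA drive (both figures are assumptions, not measurements):

```python
# Very rough spindle estimate for serving a 200Mbit P2P pipe as random reads.
# 64KiB per read and ~80 random IOPS per 7200rpm SATA drive are assumptions;
# caching and any sequential luck only improve on this.
pipe_bytes_per_s = 200_000_000 / 8          # 200 Mbit/s = 25 MB/s
read_size = 64 * 1024                       # assumed ~64 KiB per random read
iops_needed = pipe_bytes_per_s / read_size  # ~380 read IOPS
iops_per_drive = 80
print(f"~{iops_needed:.0f} read IOPS -> ~{iops_needed / iops_per_drive:.0f}+ spindles "
      "just for P2P, before RAID overhead and other workloads")
```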

Quote: "About videos/audio/PDFs... When we need to edit a video, we can do it on our current 5 computers... upload them to the server and convert them in batch... We can do the same with audio... PDFs won't be an issue anyway ;-)"

This is good, as it helps separate your I/O workloads. At this point I should make a comment about security. Generally, allowing the Internet directly into an internal system is a bad idea. Your willingness to treat this box as an end repository helps with functional network separation of your environment. Even if you do not have a firewall itself, you can create a 'poor man's firewall', which is better than nothing. A simple layout would be: an external interface on the router to your provider, with ACLs to filter traffic (drop 'impossibilities' like incoming RFC1918 ranges and your own IP coming back at you, plus your ingress filter for the types of traffic you want to allow into your DMZ). Then a DMZ segment on a separate router interface, which also gets an inbound ACL for what you want to allow; treat it as an 'untrusted' segment, since it has direct Internet exposure, so it should not initiate connections back into the internal network - or at the very least scrutinize exactly what you allow, as transient trust is a popular attack method. Lastly, an 'internal' interface (the one facing your internal LAN), again separate on the router, with an ACL allowing only specific items/ports out to the Internet and toward your DMZ. This helps block internal virus issues from spreading to your DMZ and to other sites. I would also suggest using private IPs (RFC1918) internally and on your DMZ, and NATing (1:1) the DMZ toward the Internet; this makes it a little harder to attack and also allows for growth a bit better. Security design layouts are just as involved as storage subsystems - I could go into more detail, but it would probably derail this thread. I understand you have a small environment, so some items you may not be able to accomplish easily with the assets you have; I'm just raising it so you can hopefully make some design decisions with security in mind. (I see way too many sites/companies that give it no consideration; most survive a while, but in the end it's costly if you get attacked.)

Quote: "1. What kind of array/RAID arrangement do you advise for these P2P needs? RAID 6, as you said? How many disks? What configuration?"

Quote: "2. Would a build with that Chenbro cause us future bottlenecks or problems? ... I don't see us needing more than two rigs like this one of ~48/50 disks for the next 5 to 8 years; a total of 96/100 disks."

Due to the size of the drives you are looking at, the issue comes down to trust in data integrity, correlated failures (temporal as well as physical), and the amount of time for a recovery. Recovery can be substantial (>24 hours), and since simple failures are temporally correlated (you have a greater chance of a second drive failing once a first drive fails), you can be in the state where you lose 1 drive, start a recovery process, and lose a 2nd drive while that is happening. At that point you have two failures, so if you then fail to read a sector (a UBE), you have data loss (basically what the spreadsheet I attached above calculates). I try to get the probability of not being able to read all sectors on a sub-array to less than 10% (ideally less than 5%). This is what you have to balance against your stripe width and the MTTDL for the RAID type chosen.

The points about multiple chassis come mainly from past experience. With a single chassis that also contains your head (motherboard), you invariably outgrow the chassis, and then you run into situations where separate power feeds go to your motherboard/head and your drive arrays. A controller with drives on physically separate power systems can hit glitches where the drives lose power but not the controller, which even with BBUs can leave more data in flight than can be handled and cause corruption, or cause split/degraded arrays when you come back up and the drives in one chassis are not seen/spun up before the head/controller is. I've seen the same problems even with SANs (EMC, STK, et al). Best to have a hard separation.

Next, the issue of external chassis size. 1) For HA, having more chassis and spreading drives across them, if done properly, means you can lose a chassis and still have your array. Simple analogy: in RAID 1+0 with two chassis and one drive of each mirror in each, if you lose a chassis you lose half your drives, but your array is still online. The same idea can be used for other array types (for example, RAID 5 across 3 chassis with one drive in each: stripe width of 3, 2 data + 1 parity). 2) Look at your bandwidth: with 3Gbps SAS in a 4-channel multilane you get 12Gbps, or ~1.2GiB/s of throughput; with 6Gbps SAS that's 2.4GiB/s. With hard drives now at ~120MiB/s on the outer tracks, down to ~60MiB/s on the inner tracks, that's (assuming streaming) 10-20 drives before saturation on 12Gbps, or 20-40 drives with 24Gbps. In your case, since your workload is going to be mostly small transfers in operational mode, you can have more drives without really hitting that bandwidth limit, with the exception of data scrubbing and recoveries: if you oversubscribe too much, you'll impact your array maintenance functions.
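The bandwidth arithmetic above, spelled out (same figures as in the paragraph; the 80% payload efficiency is the usual 8b/10b assumption):

```python
# Drives-to-saturation for a 4-lane SAS uplink, using the figures above.
def drives_to_saturate(lane_gbps, lanes=4, efficiency=0.8):
    # 8b/10b encoding leaves roughly 80% of the raw bit rate as payload.
    link_mb_s = lane_gbps * lanes * efficiency * 1000 / 8   # ~1200 MB/s for 3Gbps x4
    return link_mb_s / 120, link_mb_s / 60                  # 120 MB/s outer, 60 MB/s inner

for lane in (3, 6):
    outer, inner = drives_to_saturate(lane)
    print(f"{lane}Gbps x4: ~{outer:.0f}-{inner:.0f} streaming drives before saturating the uplink")
```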

Assuming drives with a UBE ratio of 1 in 10^15, a stripe width of 6 drives of 2TB gives a 9.15% probability of not being able to read every sector; going to 3TB drives raises that to ~13.4%. This is a statistic, so it does not apply directly to /your/ particular drives, only in general. With that, I would be hard pressed not to use RAID6 with at most 6 drives per parity group, then add multiples of that to reach your IOPS/space requirements.
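Those UBE figures can be reproduced from the 1-in-10^15 rate with a small sketch (decimal TB assumed):

```python
import math

def p_unreadable(drives, tb_per_drive, ube_rate=1e-15):
    """Probability of hitting at least one unrecoverable bit error while
    reading every sector of `drives` drives of `tb_per_drive` (decimal) TB."""
    bits = drives * tb_per_drive * 1e12 * 8
    return 1 - math.exp(-bits * ube_rate)    # ~ 1 - (1 - ube_rate)**bits

print(f"6 x 2TB: {p_unreadable(6, 2):.2%}")   # ~9.15%
print(f"6 x 3TB: {p_unreadable(6, 3):.2%}")   # ~13.4%
```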

Also, as a rule of thumb, I normally have at least 1 hot spare per 12-15 drives, with a minimum of 1 hot spare per chassis, as you want to start a recovery AS SOON AS A DRIVE FAILS because of the recovery time involved.

Quote: "For the sake of simplicity, I thought about going with one or two of these Arecas... As you have seen, we are tempted to go with Windows, and maybe VMware or Citrix..."

3. Is there a better card than the Areca ARC-1880ix-24-4G for what we're after?

4. If we work with two of these cards, battery backup units and a UPS, are the chances of running into trouble high?

4a. What if we choose arrays with fewer disks in them, e.g. 6 HDDs? Do we get any benefit (fewer chances of trouble, faster rebuilds, etc.)?

Arecas are nice cards (I have several here), though there is no 'perfect' card. Things to be aware of:

- Cards with built-in expanders (basically, cards with more than 8 channels) may have interaction issues with external expanders and certain chipsets (for example, the ARC-1680s had issues with some LSI expander chips as well as the Intel 5520 IOH, negotiating at lower link rates). That may not hit you with the 1880, but it reinforces that you need to test.

- The cards run hot, so you need good airflow across them.

- You are limited to 4 cards per system.

- You cannot share a RAID parity group across controllers (you need external means like dynamic disks, LVM, or similar at the host level).

- With the higher-channel-count cards, 4 channels are wired directly to the back-end SFF-8088 connector and the other 4 channels feed the internal expander (so even if you have 24 ports, they are funnelled through the same 4 channels on the chip). This is why getting a card with, say, 2 SFF-8088s or 2 SFF-8087s (i.e. 8 channels) can be better. (*Note: depending on workload, if you are doing a lot of small random writes then more cache can help, up to a point.)

As I mentioned above, the number of drives per parity group goes directly into the MTTDL calculations.

UPSes protect against a hard site power failure, while BBUs protect in-flight data to the arrays. You should always have both, /OR/ disable write caching altogether, though that has performance implications. Power failure is not the only situation where you can lose data: for example, a CPU or OS crash has no bearing on power but could leave unflushed data in your cache. That's what the BBU protects against.

Quote: "I'm not sure what IOH stands for - is it the I/O Hub...? ... By the way, I've been reading this article (link) a friend provided me. That company seems to be using "home grade" HDDs. It would reduce the costs A LOT."

Quote: "5. Would it be a VERY bad choice to buy "cheap" 3TB HDDs for what we're talking about, even if we try to "protect them" with anti-vibration kits? Why? 5a. In what cases would those cheap 3TB drives work for us?"

IOH = I/O Hub the 5520 is a dual IOH chip each IOH (5500 or 5520) has 36 PCIe lanes. Having two of them gives you 72 lanes which is why it's much better than using a PCIe bridge chip (i.e. nvidia nf200 or similar) as those chips do the same thing as the sas expanders, the take say 16 lanes in and then time slice to 32 lanes. So you have queuing issues (longer latency). dual IOH's provide full link bandwidth no sharing/blocking as both are directly tied to the QPI interconnect.

For the boards the main differences really are that the X8DAH has more memory capabilities than the X8DTH. The DTH has 7 16x slots but all slots are 8x electrically, the X8DTH has 8x & 16x physical slots on the board. Not much of a difference as there are no 16x network or HBA/RAID cards. For a server system unless you are pumping it full of ram either is the same.

To answer a later question you DO want the -F version (IPMI) of the boards. IPMI is your KVM and console replacement. This gives you power control and remote console (graphic & serial) to the system. This avoids you having to put a keyboard and monitor on the box and allows you out of band access to it in case there is a problem with the OS. The cost is negligible for its' function.

I won't even go into 'backblaze' there is just way too much wrong with that environment from an availability and integrity standpoint. That's the epitome of a 'if it works with one drive; it will work with 1000' mindset. no it doesn't. ;) It has the /potential/ to work out but not at that price point. you would probably use ZFS (zvols) and a distributed file system on top of that which would need to be presented to front end filer hosts. The additional network & server robustness to handle different failure scenarios will also have to be added (different power grids; network switch zone separation; multi-pathing to chassis; then you run into other items with consumer/desktop drives such as vibration tolerance; error recovery items; bad failure scenarios where a single 'failing' drive can cause problems for all other drives attached to the same controller/expander chip; et al. Not saying that you /need/ enterprise level everything; however you have to know what they provide and build your environment to compensate for the issues.

Larger drives have the same problems as listed above: a 3TB drive, for example, may take up to 2 days to re-sync in a failure mode; what is the UBE rating; how does it handle vibration; etc. You can run the numbers; I haven't found any yet that, from the published specs, I would trust without much more in-depth compensatory controls (i.e. ZFS adds checksumming of every sector on both read & write, which RAID cards don't do; this helps catch a lot of issues, doesn't solve them, but pushes the bar out a bit; then you can do other HA items like chassis/controller/power separation et al).
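
To put a rough number on the UBE point, here's a back-of-the-envelope sketch (Poisson approximation) of the chance of hitting at least one unrecoverable read error while re-reading a whole drive during a re-sync. The 1e-14 and 1e-15 rates are typical published figures for desktop-class and enterprise-class drives respectively, not specs for any particular model:

    import math

    def ure_probability(bytes_read, bit_error_rate):
        # chance of at least one unrecoverable error while reading this many bytes
        bits = bytes_read * 8
        return 1 - math.exp(-bits * bit_error_rate)

    print(ure_probability(3e12, 1e-14))   # ~0.21 for one 3TB pass at 1 error per 1e14 bits
    print(ure_probability(3e12, 1e-15))   # ~0.02 at 1 error per 1e15 bits

So roughly a 1-in-5 chance per full-drive pass at the desktop-class rating, which is why the compensating controls above matter.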

Unfortunately, we may not have the chance to play with some hardware parts and send them back if they don't work for us.

This is why a lot of enterprise solutions cost much more money: you are paying the industry to do a lot of the testing for you. Testing takes time. For example, I just got a sample SC837E26 box here and am running into a connectivity issue where I can't see any internal SATA drives on the unit. It may take a week or more to work out. It comes down to time or money. :)

6. Is there any benefit or downside to using 2 SSDs in RAID plugged directly into the motherboard for the OS, leaving the RAID card just for storage?

6a. Could we also use 2 or 4 SSDs in RAID 0/10 (with daily backups), plug in another 2 for other purposes (video/audio conversion, VMs), and also have 2 other RAID cards?

Maybe only 2 SSDs for the O/S, 2 for other purposes, and a slim BD drive, or...

4 SSDs for the O/S (RAID 10), 1 for other purposes, and 1 BD drive.

(The insistence is for peace of mind, as this isn't something we can afford to rebuild every year.) :-)

If you're separating your functions out (file server separate from your video conversion systems et al), your file server has no need for SSDs. You should have a RAID 1 for your boot (OS) drive, but that's about it. Its main function is static, and the I/O will be going to the array(s). I normally buy a couple of SAS 2.5" drives (10K rpm or so; 72GB or so in size) for the system; the main thing I'm looking for is reliability. Now, if you were using a non-Windows OS, say Solaris or something that can run ZFS, then there /are/ uses for SSDs there (cache and log devices), but that's a different discussion specific to that type of deployment.

Leave your SSDs for your scratch disk space on your conversion systems; that would speed up your compositing. You won't need a fast OS drive for those systems either, generally, as that's just your application/OS. You would need RAM, but once the apps load it's all scratch & I/O to your data drives. Talk to your app users.

7. From what you've learned, would we actually need 2 CPUs right now? (X56xx series)

8. Why should hyperthreading be disabled, as well as the power saving options? What if we leave HT enabled?

You may not need two from a CPU (processing) standpoint, but a second CPU does let you spread interrupts better on systems heavily loaded with I/O. You can always start with one; just remember you'll have to add the same model for the second later on, and keep an eye on interrupts (software & hardware).

As for hyperthreading, that goes with the above. Think about what hyperthreading does: basically it allows the CPU to QUEUE up an additional instruction stream, but only one can be operated on at a time. This is great for keeping the pipeline full with applications that are heavily threaded. With deployments that are interrupt driven (for example storage systems, firewall/network switching, etc.) you do not want to 'queue' that interrupt waiting on the execution pipeline of a single core; you want that interrupt handled as fast as you can, so put it on a core that can take it and run. (This is very simplistic; if interested, books on computer architecture/design may be helpful.)
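
On the 'keep an eye on interrupts' point: if the box ends up on a *nix OS, something as simple as the sketch below (reading /proc/interrupts, so Linux-specific; on Windows you'd lean on perfmon counters instead) shows how hardware interrupts are spread across cores. Just an illustration, not tied to any particular build here:

    from collections import defaultdict

    totals = defaultdict(int)
    with open("/proc/interrupts") as f:
        cpus = f.readline().split()                  # header row: CPU0 CPU1 ...
        for line in f:
            fields = line.split()[1:1 + len(cpus)]   # per-CPU counts for this IRQ
            for cpu, count in zip(cpus, fields):
                if count.isdigit():
                    totals[cpu] += int(count)

    for cpu, count in sorted(totals.items()):
        print(cpu, count)                            # per-core interrupt totals

If one core is eating nearly all of them while the rest sit idle, that's the situation the advice above is trying to avoid.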

Edited by stevecs


Wow, I'm embarrassed by how much effort I'm asking of you! Thank you for the patience as well.

You could partner with FireWire2 and run a successful storage consulting business, if not already running it... :D

I read the post, but I'll need to go through it once or twice more to take in everything I should.

After I have done that, I'll post a decent reply.

It may happen tomorrow.

Until a few days ago, I knew very little about servers, mainly with regard to storage servers.

Right now, I'm able to understand it somewhat better.

Something I used to think before studying was that the only way to assemble a bunch of disks was with a RAID card.

Now I see I was wrong about that. And it makes me wonder if I really need a RAID card, or an HBA solution with software RAID.

Something that was (and still is) a bit hard to figure out is SATA3 support.

If a manufacturer's website says a card supports up to 6Gb/s or SAS 6G, does that mean it also supports SATA III disks as well as SSDs?

I'm not sure if this is a valid comparison, or if the products are even equivalent enough to compare, so please correct me if something's wrong.

If needed, pick an equivalent or better product for comparison.

Considering the requirements, what are your thoughts about...?

Adaptec RAID 6805 (Kit 2271200-R) url 2 vs. Areca ARC-1880x (with BBU) vs. LSI MegaRAID 9265-8i (W/BBU) url 2, 3 vs. What else?

I'm not sure if these 3 cards have built-in expanders, as explained before.

Anyway, there are some conversations (more or less) related to these here, here, here, and others 1, 2, 3, 4, 5... for those who want to read them.

Just to leave a note, I first selected that Areca because it is SATA III/SAS 6G.

Maybe there are several other options/configs out there supporting it.

Some manufacturers' links: Areca, Adaptec, HTP, LSI (3Ware)...

I hope I never end up with a computer like this :-D

Nice set up! And I liked the fire extinguisher around there...

Edited by Henri BR


The main point of RAID cards is to off-load the host CPU, which is good for systems that are running applications or are otherwise constrained (interrupt handling, or other issues). They also make things a bit easier to manage, as it's generally a 'pull the failed drive and replace it' type of situation. 'Software' based RAID (I have that in quotes as technically even RAID cards are software, just with accelerated parity functions and some other items) requires more care and feeding (user maintenance). Generally, most of the options that are handled automatically by a hardware RAID solution are brought to the surface for you to make decisions on.

At this stage I would probably say that most software RAID solutions are really only viable under a *nix type environment. Most of the 'tunables' are either not really there or are hard to get to under Windows. Things you generally look for are low-level drive stats (so you know when to pull a drive out of a system /before/ it fails, or can troubleshoot low-level performance issues), logical volume management functions, partition alignment items, etc. Generally, if you're on Windows (and remember my 02:00 rule) it's probably best to stick with a hardware RAID. There are half-way solutions like Openfiler and FreeNAS that run unix but generally provide an easier management interface to the user; I would play with them first to get used to them, as opposed to starting your first build on them for business purposes.

SAS supports SATA through SATA tunneling (STP). Unless you're talking about a first-generation SAS card, it will support SATA drives/signaling. Likewise, a SAS port/backplane will accept a SATA drive, but not the reverse in either case. Generally it's better to get SAS cards & chassis, as you can then use either type of drive.

As for RAID card types, I like the Arecas mainly for their out-of-band management (ethernet jack), which is real nice. Besides that, performance is more or less the same within the same generation of chips. LSI would probably be my second choice due to their large industry support; they're generally more compatible with systems. Adaptec also makes good cards, though I haven't used them recently.

The Adaptec 6805 you linked to does not have an expander (just the 8 channels/two 4-channel SFF-8087s). The Areca ARC-1880-i, ARC-1880x and ARC-1880LP are without expanders; all the others have them (that's why they can offer more than 8 channels). A lot of LSI's products are old 3Ware cards, so you probably want to stick with the 'MegaRAID' line; you want cards that are PCIe v2, and the same >8 channel caveat applies to their RAID cards. For HBAs, the LSI00276 is actually a 16-channel SAS chip. You can only do RAID 1+0 with it or use it as a normal HBA (no RAID), which is good, but with only 8 PCIe lanes you can't fully populate it without over-saturating the host slot.
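
For a feel of that slot-saturation point, here's the rough arithmetic (assumed round numbers, not any vendor's spec sheet):

    sas_channels  = 16     # e.g. a 16-channel HBA chip
    mb_per_sas    = 600    # MB/s per SAS 6Gb/s lane after 8b/10b encoding
    pcie_lanes    = 8
    mb_per_pcie2  = 500    # MB/s per PCIe 2.0 lane, before protocol overhead

    print(sas_channels * mb_per_sas)    # 9600 MB/s of drive-side bandwidth
    print(pcie_lanes * mb_per_pcie2)    # 4000 MB/s available at the x8 host slot

Spinning drives won't each sustain 600 MB/s of course, but load all 16 channels with fast drives and the x8 slot becomes the ceiling.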

As for that computer link at the end, that's kind of a mess, not to mention a dust magnet. It could really be cleaner, or even go to a peltier or phase change for sub-ambient. I only water-cool my desktop system, mainly to cut down noise levels, though that's not much of an argument anymore with the 19" rack and its fan noise.


Good morning Steve!

I've been reading all those links all night long.

I haven't finished the 5th one yet.

What I saw people commenting on was basically:

LSI: dual core, DDR3 and only 1Gb of cache, plus a lot of talk about numbers of SSDs in RAID 0.

I could not find much useful information about working with 24+ drives on it, nor about SAS expanders to use with it.

I saw someone stating that the HP SAS expander works with that card - but no confirmation of it.

I couldn't find (or wasn't able to properly interpret, or both) useful comparisons to the Areca cards.

What I saw was people pointing to the single core on the Arecas and DDR2 as drawbacks against the LSI one; but:

Up to 4Gb of memory, nice interfaces, and the ethernet jack as a plus over the LSI card.

No deep explanation of what those differences would mean "in the real world"...

Regarding the Adaptec one, there is still very little information. Most people are talking about the "Zero-Maintenance Cache".

There are a few complaints regarding drivers (as it's still a 'new' card)...

Finally, there is not much information on how good (or not) the Adaptec and LSI would be for what we've been talking about.

About the Arecas we can find much more information; however, I don't know when they released their latest cards compared to these others.

I don't know if you have done any research on these cards recently.

Anyway, if you could post some notes about whether these differences would be a benefit or not, it'd be much appreciated.

Just to remember, there are many things you've already explained that could fit here, so there's no need to repeat them and take much more of your time.

Fan noise!?

That's something I started researching yesterday, as we'll spend the entire day very close to this server, in a quiet room.

By the way, we'll plug an LCD monitor into it to play with the possibilities a little further :-)

Edited by Henri BR


Basically those vendors you mention (LSI, Areca, Adaptec) are all in very close competition, and performance is about the same. Some may be a little better with certain drives (firmware tweaks, etc.), but it really comes down to specific build items (i.e. out-of-band network, removable memory, BBU on-card or off-card, etc.).

The reason I went with the AIC RSC-3EG2 chassis was the 120x120x38mm fans: larger fans can push the same volume of air at lower blade speed, which lets you cut down noise. The main thing you have to be careful of is that you don't want a drive running over 40 degrees centigrade, as that decreases its lifespan. However, even after swapping in lower-speed fans you'll still run into a lot of white noise with larger builds. Right now I'm running ~112 drives here at home across 7 chassis, plus switches et al. It's getting to the point where I'm even looking at acoustic racks such as the Ucoustic 9210 http://www.quiet-rack.com/ucoustic-active.php

But those are not cheap.


Jealous of those racks, Steve! :)

It'd cost a fortune just to ship it.

I'm compiling a list with all the PWM fans I can find (80mm and 120mm).

Here is a preview of it:

120mm PWM Fans

Akasa AK-FN059

Arctic Cooling F12 PWM (0872767002654)

Cooler Master R4-XFBL-22PR-R1

Cooler Master R4-EXBB-20PK-R0

Cooler Master R4-BMBS-20PK-R0

Cooler Master R4-P2B-12AK-GP

Coolink SWiF2 120P (471612331341)

Deepcool UF 120

Enermax UCTB12P

Everflow R121225BU

Gelid FN-PX12-15

Logisys SF120

Nexus D12SL-12 PWM

Noiseblocker XLP (4150051901528)

Rexus Rexflo DF1212025BH-PWMG

Scythe SY1225SL12LM-P

Scythe SA1225FDB12H-P

Sharkoon Silent Eagle SE

80mm PWM Fans

Akasa AK-FN051

Arctic Cooling F8 PWM (0872767002630)

Cooler Master R4-P8B-25AK-GP

Cooler Master R4-BM8S-30PK-R0

Coolink SWiF2 80P (471612331334)

Deepcool UF 80

Enermax UCTB8P

Everflow R128025BU

Gelid FN-PX08-20

Nexus SP802512H-03PWM

Rexus Rexflo DF128025BH-PWMG

Sharkoon Silent Eagle SE

* This list is sorted alphabetically, and not by features/preference/etc.

** I didn't include the Delta fans due to noise, which some of the above products may cause as well.

** Anyway, the Delta fans seem to be worthwhile just the same.

Steve, I started talking about those cards because I'm seriously considering your advice NOT to go with a RAID card WITH built-in expanders. I still haven't completely understood what you explained regarding the 8 channels (quote below) and its relationship to the three mentioned cards, if there is a relationship with all three.

...with the larger channel # cards there are 4 channels directly wired for the back-end SFF8088 connector; the other 4 channels are fed to the internal expander (so even if you have 24 channels they get fed to the same 4 channels on the chip). This is why getting a card with say 2 8088's or 2 8087's (i.e. 8 channels) can be better.

Do you know if the LSI MegaRAID 9265-8i has a built-in expander? I'm not sure.

BTW: 112 drives AT HOME? ;)

Edited by Henri BR


I don't have the acoustic rack yet, just an old Tripp Lite one which was about $2500 or so; the Ucoustic is around $5000 (passive) or $7500 (active). Still debating whether my hearing is worth it.

As for fans, I went with the Scythes mainly for their long-life bearings, and hooked them up to a PWM controller so I can manually set the speed to find a good balance for cooling. If you are looking at sound pressure levels, remember that multiple fans are vector & phase additive. Assuming no other interference, everything in phase and properly balanced et al, you can use the following formula (assuming 3 fans at 44dB):

10 x (log (10^(44/10) * 3)) = 48.7 dB
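
If you want to play with different fan counts or mixed ratings, the same formula generalizes easily (a small sketch, same energy-sum assumption as above):

    import math

    def combined_spl(levels_db):
        # sum sound pressure levels on an energy basis (no cancellation assumed)
        return 10 * math.log10(sum(10 ** (db / 10) for db in levels_db))

    print(combined_spl([44, 44, 44]))        # ~48.8 dB for three 44 dB fans
    print(combined_spl([44, 44, 44, 30]))    # a quieter 4th fan barely moves it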

As for the MegaRAID 9265-8i, I don't see anything there that would indicate it does. The LSI2008 is an eight-channel chip, so I doubt that it does.

However, I should note that I just heard back concerning my own validation testing: the LSI00188 (which also uses the LSI2008 chipset) is incompatible with the Supermicro SC837E[1|6]6 chassis. It seems they are only supported with the LSI 2108 chipset (wish they had said that a month ago, which would have saved me validation time, but this is the type of thing you run into frequently). In my case it's just a return of the chassis, but you should work out plan B's with your vendors if you don't have similar validation that items will work.

As for the home system, that's just my main storage array. It's just a hobby; either that or spend $$ on cars or something. ;)


Absolutely, that rack would not be an option for us...

~$5K + 60% taxes + ~$2K to ship = ~$10K :P

For the rack, we're thinking about buying the APC AR3104 24U (NetShelter) from a local dealer.

We'll need a 24U rack. For this one we'll pay around $1,700.

Is there anything special we should pay attention to when choosing a rack?

With regard to this one, is there anything we could add (addon/accessory/etc.) that would be useful?

...

Nice formula, though. I'm sure I'll use it.

Don't you wish you had known that a month ago? lol, but that's sad; I know.

Why didn't that card work with the SM SC837E16? Did you discover the cause?

By the way, I'm wondering how cool the HDDs would run in that kind of case. It was my first thought when I saw it.

Yes, there must be a plan B, which will add some expense but can save us time and money.

A 200-to-300Tb system at home is nice - something I can't imagine or afford here.

You could buy a brand new car by selling this "little thing". Would you? I doubt it! :-)

Edited by Henri BR


Didn't say racks were cheap. However, if you are in an area that has high-tech business (data centers, computer centers, etc.) you may want to call them up and see if you can get to their facilities group. Most places dump or go through racks frequently, especially in outsourcing fields (when they do migrations, data center updates for new cooling standards, or just standardize their floor space management). You may be able to get one for nothing more than finding a truck to pick it up.

Generally with racks you want to look for depth, so you have at least 2" in front of the servers and at least 6" in back, if not more. Split-door designs are generally better, as they let you open the rack more easily to move systems in/out in constrained spaces. Having a wider rack is also good for cable management; the same goes for the back/sides for your PDUs ('zero U' power strips, basically), just so you can hang them out of the way so they don't interfere when you want to pull a hot-swap supply or do general maintenance. Square-hole racks are generally better; at least here in the States we're hard pressed to find any equipment with round-hole designs for rail kits (most have 'universal rails', and even in the UK I've seen square holes taking over). The main thing is good airflow (don't get glass or closed-off racks, as they will cook your systems unless they're designed /very/ well). It's hard to go really wrong though, as the main purpose is just to have 4 posts to hook rails to; the rest is just sheet metal.

As for the 'wish', no. :) Lady luck is way too fickle for me; I'd rather work with skill. Plan for the worst, hope for the best.

As for the reason for the incompatibility: nothing yet as to a technical detail beyond 'not compatible'. Without a SCSI/SAS analyzer I probably won't get the real reason. It seems to be in the enumeration of the expander chip back to the HBA, but exactly what, I don't know.

In the AIC chassis here with the 2000rpm Scythe fans, drives stay in the low 30s C with ambient temps around 23-25C. So not bad.

The array here is currently running 1TB drives (soon to be replaced with 2TB versions), which would be ~128TB usable (96 drives as raidz2: 16 vdevs of 4+2 drives; plus another 16 drives: 8 hot spares and 8 as scratch for various other items that don't run well under ZFS's COW structure). No, my old 2002 Audi is going to last until it falls apart. :)
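
The usable-capacity math for a layout like that is straightforward if you want to plug in your own numbers (sketch below uses the figures from this post; real ZFS usable space will come out a bit lower after metadata and slop):

    vdevs        = 16
    data_drives  = 4      # each raidz2 vdev is 4 data + 2 parity
    tb_per_drive = 2      # after the move to 2TB drives

    print(vdevs * data_drives * tb_per_drive)   # 128 TB of usable space, before overhead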


Thanks for the tips Steve. And I hadn't thought of buying something from a datacenter!

There is a large telecom company and a datacenter 50 miles from here, and there are 2 small ones right here where I live. As I have never played with racks, "major" servers, and that sort of equipment, I had thought of paying them a visit just to learn a bit more, but I hadn't thought about buying equipment from them. So I'll call them and ask if I can have access to their facilities, and also call that telecom and DC - maybe they can do something for us. What I'm not sure of is whether these kinds of companies work with smaller racks such as the 24U APC I've told you about. A major rack would be overkill for us and would not help at this point. Anyhow, contacting and visiting them will be constructive; a must-do.

Have you had the opportunity to see or manage that APC rack? (link again)

Regarding its dimensions, here they are:

Maximum Height: 1198.00 mm

Maximum Width: 600.00 mm

Maximum Depth: 1070.00 mm

I liked it, but as I'm not able to fully understand these subjects, I'm not sure if it can "take care" of all the needs.

What do you think about it?


You're right, most companies use 42U racks, but they do get a lot of miscellaneous stuff as well. The company I work for (an out-sourcer) gets all types. They may not all be 'pretty' after years of use, but they're functional. Even though a full 42U rack is large, don't knock the price: a couple of guys and a pick-up truck can do wonders. :)

The APC one you linked to is not a bad rack. The back is split, which is good; the front is a full door, but that may not be a big issue for you (i.e. you don't have tight aisles of racks, which is where the split doors come in really handy). The cable tray in the back is nice, as it lets you hang cable management guides as well as PDUs.

Also, with loading you want heavy items at the bottom, working your way up. Generally I put UPSs at the bottom, then the drive units, then the server(s) no higher than shoulder height in a full rack, and then things like switches etc. at the top. I also like to keep high voltage and low voltage (network/SAN/SCSI, etc.) on separate sides of the unit.


I'm just remembering FireWire2's post:

Hope this will get you started.

Yes, it did, and he helped a lot!

I'm learning a lot with you too! Even if we study alone and research everything, there is always something we can't fully understand - even if it's just the basics; as you may know, we are unsure what to look for, and there may not be an online answer to our specific question. Sharing - that's what it's all about. I'm 26 today, and I'm now remembering the internet in the 90's. These opportunities are very nice, and that's what we've been working to do: to spread more opportunities, in essence like you 'guys' are doing right now. Many thanks for your time, effort and, yes, patience :P

A couple guys and a pick-up truck?

Yeah, this is one of the reasons we're going to choose a 24U. :-)

Nice tips regarding the arrangement of the equipment.

I'm now wondering what kind of UPS is going to be enough for us.

Server, ~48 drives and a few upgrades if and when needed, network stuff, etc.

APC equipment is easy to find here...

Care to clear it up?


Glad that our past experience is a help. That's basically how I got started in this back in the 70's as a kid. Information/experience is of limited value if not shared freely.

As for UPS load: I normally try to keep the load to about 1/2 the capacity of the UPS solution or less during normal running, but be able to handle a full power-on load without getting above 85-90%. The spreadsheet has some rudimentary calcs. For some perspective, currently I have 7 AIC RSC-3EG2 chassis; one Supermicro X8DTH-6F w/ 60GB RAM & 6 LSI00188 cards; 1 Areca 1680ix-24 w/ 2GB cache; (2) Quantum LTO4 tape drives; a Dell 6224 switch; 64 1TB ST31000340NS drives; 16 ST32000444SS drives; 12 ST2000DL003 drives; 2 2.5" 10K rpm SAS boot drives; and an older Cisco 2610XM router. All this takes ~1500W and runs on an APC RMA3000U (2U 3000VA rackmount UPS). I expect that when I replace all the drives here with ST2000DL003s (112 of them total) this should drop to 1100W or so.
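
That rule of thumb is easy to sanity-check for your own build once you have wattage estimates. A minimal sketch with placeholder numbers (the spin-up figure is made up; measure or sum your drives' spin-up specs for a real value):

    ups_watts     = 2700    # e.g. a 3000VA unit rated around 2700W
    running_watts = 1300    # estimated steady-state draw
    spinup_watts  = 2200    # hypothetical peak with all drives spinning up at once

    print(f"running:  {running_watts / ups_watts:.0%}")   # aim for ~50% or less
    print(f"power-on: {spinup_watts / ups_watts:.0%}")    # keep under ~85-90%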


APC SMART-UPS RT 3000VA 120V; 2100 Watts / 3000 VA (SURTA3000XL)

Rack PDU, Metered, 1U, 20A, 120V, 8x NEMA 5-20R (AP7801)

As this UPS is capable of up to 2100 Watts, it can deliver a current of up to 17.5 amperes (2100W / 120V = 17.5A).

If we buy a single PDU to handle all this equipment, the PDU should be capable of handling this 17.5A current, or we'll have a bottleneck, right?
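
Just to show the arithmetic I'm doing here (a quick sketch; the 16A figure is the AP7801's derated rating quoted below):

    ups_watts  = 2100
    line_volts = 120
    pdu_amps   = 16          # derated rating of a single AP7801

    ups_amps = ups_watts / line_volts
    print(ups_amps)              # 17.5 A that the UPS could push at full load
    print(ups_amps <= pdu_amps)  # False -> a single 16A PDU would be the limit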

Here is the APC AP7801 PDU specification:

Input

Nominal Input Voltage: 100V, 120V

Input Frequency: 50/60 Hz

Regulatory Derated Input Current (North America): 16A

Maximum Input Current per phase: 20A

Load Capacity: 1920 VA

Output

Nominal Output Voltage: 120V

Maximum Total Current Draw per Phase: 16A

Overload Protection: No

Now, a bit of confusion...

The UPS has an output power capacity of 2100 Watts / 3000 VA, and 17.5A (if the formula is correct).

The PDU has an input capacity of 20A per phase and 1920 VA. It also has an output capacity of 16A.

Following these numbers, it seems the PDU can handle the current the UPS can deliver, but it cannot handle the 3000 VA load - only 1920 VA of input.

Moreover, I'm not sure how many VA or Watts the PDU can deliver; we only know it can 'deliver' 16A. That's the only number I have, and I can't figure out the others.

  • Is this PDU capable hardware at all? I mean, to deliver what the UPS has to offer.
  • Do you know the other numbers for the PDU output? I mean, VA and Watts. How do I figure them out?

If that PDU is actually a bottleneck, there is another option: APC 2U, 30A, 120V, 16x NEMA 5-20R (AP7802)

Input

Nominal Input Voltage: 100V, 120V

Input Frequency: 50/60 Hz

Regulatory Derated Input Current (North America): 24A

Maximum Input Current per phase: 30A

Load Capacity: 2880 VA

Output

Nominal Output Voltage: 120V

Maximum Total Current Draw per Phase: 24A

Overload Protection: Yes

This second one is able to 'receive' almost the full 3000 VA from the UPS: 2880 VA. It also has overload protection. Just to remember, the UPS has it too.

  • Should this one be our choice (AP7802)? Or would 2x AP7801 be much more worthwhile than this second option?

As a side note, for Zero U options there is only a single PDU, the AP7831, due to the size of the rack (24U).

Also, it has a "Maximum Input/Line Current per phase" of 15A and a "Load Capacity" of 1440 VA.

For the UPS, there is also a rack version of that one, but we don't want to use 3U of the rack for the UPS.

It can live beside the rack, which also means less heat inside it.

Edited by Henri BR


The more I study hardware vs. software RAID pros and cons, the more I'd be happy to manage 24/48 drives individually and forget about all the performance concerns... :)

All kidding aside, this Chenbro CK23601 (CK23601H0C01) SAS expander looks a bit more promising than the HP SAS expander when looking into cards without built-in expanders.

I'm wondering what the chances of incompatibility with the LSI and Areca cards are.


First, you should be looking at rackmount units, assuming that's what you're getting (a rack), and at the newer versions with the more efficient inverters.

http://www.apc.com/products/family/index.cfm?id=165#anchor1

The SUA3000RM2U is about right at 3000VA/2700W:

http://www.apc.com/products/resource/include/techspec_index.cfm?base_sku=SUA3000RM2U&total_watts=1400

It requires you to hook up an L5-30 outlet, which shouldn't be much of a problem (you can usually find them at hardware stores for ~30 US or less if you're doing it yourself), plus a 30A fuse for your breaker box. Or get your electrician to do it.

As for the PDUs, I use the AP9567 http://www.apc.com/products/resource/include/techspec_index.cfm?base_sku=AP9567, two of which you can hang on the back cable tray of your rack. Each goes to a different 20A outlet on the back of your UPS (each outlet is on a different fuse); then you plug each server/chassis into BOTH PDUs, so that if a PSU dies/shorts and blows one circuit, the other should still retain power. Each PDU can handle 1800VA/120V/15A, which should be fine for your intended load. It doesn't have the fancy LED meter on it, but those are hard to get in a zero-U format that fits a short rack.

As for those SAS expander cards, they are just plain expanders, no (RAID) logic at all. The Chenbro one at least mentions that it uses the LSISASII36 chip; the HP one doesn't specify (it may, as they source a lot of their stuff from LSI). Were you looking at those as opposed to having an integrated expander backplane in the chassis for the drives, or something?

