Size, speed, capacity: 6 x HDDs for a non-RAID array
Posted 16 February 2013 - 11:21 AM
I'm having quite a difficult time deciding what will be the best drives to purchase. So I would like to hear your thoughts on the matter. Here is some information about what I am trying to accomplish and some of the options I am considering.
I am building a test server that will host about 30 virtual machines, and I'm trying to build it as inexpensively as possible. I have sized all of the other components appropriately, but I am very concerned that the storage will become the bottleneck. I cannot afford the storage I would like to have. I will have a ~500GB SSD for some things and a 3TB Green drive for data storage, but for holding the virtual disks for the VMs I cannot afford to run all SSDs, SAS, hardware RAID, etc.
I know that running multiple concurrent VMs can be severely limited by the storage system. This is mostly an IOPS issue, not a throughput issue. As such, my understanding is that the number of spindles is a much bigger factor than the spindle speed (RPM). I do not require much capacity: I only need about 1.5TB to 2TB across all of these VM drives, so 250GB to 500GB per drive is fine. The most I can do for now is 6 drives (in addition to the two mentioned previously), and I need to keep the cost under $600 for the drives, though the less expensive, the better.
Also, I will most likely not be creating a RAID array from them. I cannot afford a good RAID controller, so I would be limited to software RAID, and depending on which RAID level is used, it would either reduce performance too much or be too risky. Keeping a separate volume per disk is probably the best option to avoid a performance hit or increased risk, and since I don't need one large volume, I don't need to combine them.
My problem is deciding on which drives to use. 2.5" drives would use less power and create less heat, but at the cost of some performance and possibly reliability; these are laptop drives and are not designed for 24x7 server use. Since spindle speed makes less of a difference for my use than the number of spindles, would I be just as well off using low-speed drives, or is a 7200 RPM drive worth the difference in price (times 6)?
I highly doubt the 10,000 RPM or faster drives will be worth the cost for my use. If I were transferring large files, maybe, but for high-IO use they should not help enough to justify the cost. Also, what about enterprise-class or RAID-rated drives that are designed for 24x7 continuous use, such as the WD RE4 or WD Red? What are your thoughts?
Posted 16 February 2013 - 05:00 PM
Actually, for IOPS spindle speed is very important, as well as the number of platters that can be read from simultaneously.
For your purposes, you should only look at enterprise SATA or better. Don't consider laptop or cheap consumer drives for a RAID/VM application, except for backups. They aren't designed for it. Use that Green you have for backups; don't try to use it in a RAID array of any kind.
You didn't say if you were using Linux, which I know has excellent software RAID capabilities. If you are using Windows, I strongly suggest considering a RAID 0/1/10 hardware card like the Adaptec 6805E, which has eight ports and excellent asynchronous read/write performance. You could get such a controller and four 1TB WD RE4s for a 2TB RAID 10 array and not go very far over your budget. If you found refurb 10K/15K drives and used six or eight in a RAID 10 array, you would get screaming performance on both reads and writes.
(I strongly recommend using RAID10--it will give you the balance of fast reads and writes that VMs usually need, as well as being extremely fault tolerant.)
Posted 17 February 2013 - 06:58 AM
You only need about 1.5 - 2 TB for the VMs. However, how much of this will be accessed frequently? SSD caching might be an attractive option. Even several Raptors can't beat a single high-end desktop SSD for random IO. If you could get away with 256 GB for your main drive, you could get another 256 GB SSD for the cache without any additional expense. I suspect this would hold pretty much all of your "hot" files, and the drives being cached wouldn't matter as much. I could see a reliable RAID of four 3.5" 'Cuda 2 TB drives as a good option. Only downside: I don't know how to set up such a large SSD cache for an entire RAID volume.
Posted 18 February 2013 - 03:06 PM
Thanks, that's not a bad idea. I would be a little concerned about the reliability of the refurbs. I will be using these in a 24x7 server. It is not a production server; it is not even for business test/development. It will be a server I am building at home to learn and experiment on. But I'd still hate to take chances on stability if there is any risk. Has anyone had good or bad experiences with these refurbs?
Sorry, I didn't mean to imply there is no relationship between IOPS and spindle speed. So I should clarify a few things. First, please note that I am no storage expert; this is all just based on the reading I have done.
IOPS are typically measured as sequential reads, sequential writes, random reads, and random writes, and are usually averaged together based on typical workloads. For the VMs I will be running, and given that I will have multiple VMs per disk (regardless of whether RAID is used), I expect the vast majority of IO in the drives' workload to be random, not sequential.
As I understand it, random reads/writes do not scale as well with platter RPM as sequential reads/writes do. All do increase with rotational speed, just not at the same rate. Additionally, since the heads all move together, the number of platters will pretty much only affect sequential reads/writes and should have virtually no effect on random IO. Platter size is also a factor: the larger the platters, the faster the outside tracks are moving and the faster the drive can read from them, but also the further the heads have to travel between the inside and outside tracks, and thus the potential for slower random access.
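The usual back-of-the-envelope model for single-spindle random IOPS is one average seek plus half a rotation per IO. A minimal sketch (the seek times below are illustrative assumptions, not vendor specs):

```python
# Rough random-IOPS estimate for a single spindle.
# Assumed average seek times are illustrative, not measured values.

def avg_rotational_latency_ms(rpm: float) -> float:
    # On average the target sector is half a revolution away:
    # (60 s / rpm) / 2, converted to milliseconds.
    return (60.0 / rpm) / 2.0 * 1000.0

def est_iops(rpm: float, avg_seek_ms: float) -> float:
    # One random IO ~ one average seek + half a rotation.
    return 1000.0 / (avg_seek_ms + avg_rotational_latency_ms(rpm))

for rpm, seek in [(5400, 11.0), (7200, 8.5), (10000, 4.2)]:
    print(f"{rpm:>5} rpm: ~{est_iops(rpm, seek):.0f} random IOPS")
```

Note that the seek term dominates, which is why random IO improves less than proportionally with RPM alone.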
Most of the data I can find indicates you will see anywhere from just under 1x to nearly 1.75x the random-IO improvement per unit of additional rotational speed. For example, a 10,000 RPM drive like the WD VelociRaptor spins a little less than 39% faster than a 7200 RPM drive (strictly in rotational speed), but its IOPS will likely be around 66% higher, roughly 1.7x the IOPS once the increased rotational speed is factored in. However, comparing the VelociRaptor against the WD Caviar Black, the VR costs just under 2x as much per GB per rotation. That means per IO/GB/dollar, the VR provides about 87% of the value of the Black. So if I purchase 2 x 500GB WD Blacks, I should get greater total IO performance than 1 x 1TB VelociRaptor and come out $60 less expensive.
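The value comparison above can be sketched numerically. The prices and per-drive IOPS figures below are illustrative placeholders (not quotes or benchmarks); the point is only that two slower spindles can out-deliver one fast one per dollar:

```python
# Back-of-the-envelope IOPS-per-dollar comparison.
# All prices and per-drive IOPS are assumed example numbers.

options = {
    "2 x 500GB Caviar Black (7200 rpm)": {"iops_each": 80,  "count": 2, "price_each": 70},
    "1 x 1TB VelociRaptor (10000 rpm)":  {"iops_each": 135, "count": 1, "price_each": 200},
}

for name, d in options.items():
    total_iops = d["iops_each"] * d["count"]
    total_cost = d["price_each"] * d["count"]
    print(f"{name}: {total_iops} IOPS for ${total_cost} "
          f"({total_iops / total_cost:.2f} IOPS per dollar)")
```

Under these assumptions the two Blacks give more aggregate IOPS for less money, matching the reasoning above.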
I am leaning toward enterprise drives. But they are expensive. I'm thinking something like the WD RE4 is probably the best I can do. I definitely can't afford to go down the SAS road right now.
As far as RAID goes, I will be using Windows for the host and most of the guests. If I buy an actual hardware RAID card, it will eat up too much of my budget. Additionally, there is a write performance hit for RAID (except RAID 0, which is too risky). RAID 1 and 10 take the smallest write-IO hit, but it is still there: every write goes to two drives, a 2-to-1 penalty, and I expect many of my VMs (the client OS VMs) to be write-heavy. I do appreciate the value of RAID, and maybe I can add it in version 2 of this server, but for now the cost is too high. No matter the drives or configuration, I will get better random write performance by not putting them in a RAID, and it will cost some $200+ less since I won't need a RAID card.
Also, I really have to try to maximize the number of spindles (drives). Most information I have found indicates you should expect 4 to 5 client VMs per disk, which works out to 6 to 8 drives for 30 VMs. Granted, some of mine will be servers and will not have the same read/write patterns, but it's a good foundation for a variety of VMs.
I'd love to have the redundancy, but I do not need the space to all be available as one volume, so this is not as risky as RAID 0 or JBOD. If I have 6 drives with 5 VMs per drive and perform nightly backups, then a single drive failure only means restoring or rebuilding 5 VMs, not all 30. And since these are not production or business machines (just for learning and experimenting), it's not a big deal if some of them are down for a bit.
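The sizing logic in the last few paragraphs can be reduced to a couple of lines. The per-drive IOPS figure is an assumed placeholder; the RAID 1/10 write penalty of 2 (each logical write hits two drives) is the standard rule of thumb:

```python
import math

# Sizing sketch for the plan above. iops_per_drive is an assumed number.

def effective_write_iops(drives: int, iops_per_drive: float, write_penalty: int) -> float:
    # RAID 1/10 mirrors every write (penalty = 2);
    # independent per-disk volumes have no penalty (penalty = 1).
    return drives * iops_per_drive / write_penalty

vms, vms_per_disk = 30, 5
drives_needed = math.ceil(vms / vms_per_disk)
print("drives needed:", drives_needed)                                   # 6
print("no RAID :", effective_write_iops(6, 80, 1), "aggregate write IOPS")
print("RAID 10 :", effective_write_iops(6, 80, 2), "aggregate write IOPS")
```

This is the trade being made: separate volumes keep the full aggregate write IOPS at the cost of redundancy, while RAID 10 halves effective writes but survives a drive failure.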
You also mentioned using refurb drives. I am kind of leery. Is it common to use refurbs? Are they reliable? I'd love to shave off some cost, but even though these aren't production machines, I really don't want to deal with a drive failure. So if I would be at increased risk of failure over the next 2 years, I'd probably rather just go with new.
Here is one of the many articles I used as a reference. While it is geared towards VDI, it contains some excellent information on calculating and sizing requirements.
I agree that nothing can compare to an SSD for random IO performance. The ~500 GB SSD I mentioned is specifically for caching, though I will not be using conventional caching. The motherboard I am using (server class) does not include SRT (which would be limited to ~64GB anyway), and the dedicated "cache drives" out there are only about 60GB, with software that only works on that one drive. I have heard good and bad things about both implementations. In any case, those make more sense for a conventional (single-OS) system; they just aren't designed to handle a hypervisor and 30 VMs. I would love to try out VeloBit or something like it, but even if it were incredible, $1250 for the software is out of my range. The solution I have come up with is this. All of my VMs will likely be based on 7 or 8 operating systems:
Windows Server 2003 32-Bit
Windows Server 2008 32-Bit or 64-Bit (not sure about this one)
Windows Server 2008 R2 64-Bit
Windows Server 2012 64-Bit
Windows XP 32-Bit
Windows 7 32-Bit
Windows 7 64-Bit
Windows 8 64-Bit
I plan to create a base image (in the form of a VHD) for each of these operating systems, all located on the SSD. The VMs will all use differencing disks based off of those parent image VHDs, and those differencing disks will reside on the mechanical drives. Once created, the pagefiles for the VMs will reside on separate fixed-size VHDs on the SSD. I will probably also put a SQL database on the SSD.
The result: the base OS and base applications for all VMs will be read-only from the SSD, the pagefile for all VMs will be written to and read from the SSD, and any saved changes to the OS/applications will be written to and read from the mechanical drives. It is not perfect. Over time (with patching, etc.), more and more of the data will shift over to the HDDs rather than the SSD. But I figure I will end up rebuilding the client VMs at least every 6 months, and I can recreate the parent VHDs with all patches at those times. The servers I may not rebuild for a year or more, but by then I should be able to tell which VMs are the IO hogs and which are not. I can merge the server VMs into individual fully merged VHDs and move the less IO-intensive ones to the mechanical drives and the more demanding ones to the SSD. Besides, by then I may also be able to add SSDs or a more robust disk subsystem.
I believe this is the best compromise. It will make the most efficient use of the relatively limited capacity on the SSD and get the most performance out of it for the entire range of VMs. Once I have run them for a while and have a better idea of how the various servers perform, I may move things around, since the servers will be used in different ways and the demands they put on the host will be very different.
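The layout described above can be sketched as a simple planner. Everything here is hypothetical (drive letters, paths, VM names); it just shows the intended mapping of parent VHDs and pagefile VHDs to the SSD and differencing disks spread round-robin across the spindles:

```python
# Layout sketch for the base-image / differencing-disk plan.
# All drive letters, directory names, and VM names are made-up examples.

SSD = "S:\\"
HDDS = [f"H{i}:\\" for i in range(1, 7)]  # six mechanical drives

def plan_vm(vm_name: str, os_key: str, index: int) -> dict:
    hdd = HDDS[index % len(HDDS)]  # round-robin across spindles
    return {
        "parent":   f"{SSD}base\\{os_key}.vhd",           # read-mostly, on SSD
        "diff":     f"{hdd}vms\\{vm_name}-diff.vhd",      # writes land on HDD
        "pagefile": f"{SSD}pagefiles\\{vm_name}-pf.vhd",  # fixed-size, on SSD
    }

layout = [plan_vm(f"vm{n:02d}", "win7-x64", n) for n in range(30)]
for vm in layout[:6]:
    print(vm["diff"])
```

With 30 VMs and six drives this yields five differencing disks per spindle, matching the 4-5 VMs-per-disk rule of thumb above.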
Thanks again to all of you for replying with your feedback. The more I discuss it the more thoroughly I think it through and the better understanding I have. I am still interested in any additional feedback anyone has, even if it is to tell me that I am wrong in my understanding or assumptions.
Posted 18 February 2013 - 06:47 PM
Your arrangement sounds very carefully thought out to me. I'll be interested if you post back with performance observations. Using the SSD for base VHDs and differencing them to the platter drives is a particularly clever idea.
I think WD REs would be ideal. They are more resistant to heat and vibration (important when several drives are mounted close together) than desktop drives, and they tend to be a bit faster as well. Since you are buying several drives, you would also benefit from having matched REs that you could later use in a RAID array if you chose. That said, the Caviar Blacks are very robust, and I'm sure they would perform well for your purposes.
PS-- On RAID: Although more platters will not change the write-a-random-block benchmark, in the real world a RAID 1 or 10 array is dealing with a long queue of requests from the OS, and has likely completed several of them while finding that single block. It is important not to overlook how clever RAID controllers and modern drive firmware (i.e. NCQ) can be at servicing a long queue in the most efficient order. I've always been impressed by RAID 10's "magical" ability to balance multiple VHDs while playing/re-encoding media and copying files to and from the array simultaneously.
PPS-- True enterprise server hardware is built to amazing levels of quality, and if not abused it can run for well over a decade. Buying decommissioned or re-certified server hardware can be risky; but generally, if such hardware works, it's likely to keep working until you're sick of the heat and noise and want to get rid of it.
Posted 18 February 2013 - 06:57 PM
Sounds like you'll already be making pretty good use of that SSD. You could still consider using 2 x 256 GB models, as these are pretty much as fast as the current 512 GB ones. That gives you some more speed, if you can spread your data easily across both drives (which sounds possible).
Regarding the HDDs: compared to the prices end users pay, the power consumption cost is negligible. I'd go for 3.5" drives due to the much better value (since you've got enough space). Forget 5400 rpm: the performance is limited and they're not any cheaper anyway. The 10k rpm drives are surely better at purely random IO; however, I think your workload will be more of a mixture of random IO and typical desktop/workstation patterns. In the latter case, a Raptor one generation behind is just as good as a current 7.2k rpm drive. I'd probably go with 'Cuda 2 TB drives (officially 3 platters; newer ones may use 2) if the budget allows, or with the 1 TB ones (single platter). The latter cost almost 50% more per GB, though.
BTW: I don't think the platter count of regular desktop HDDs influences random IOPS much. The reason is that they've only got one actuator, which can only work with one read/write head at a time. So the parallelism inherent in a multi-platter design cannot be exploited, mainly for cost reasons.
Posted 21 February 2013 - 03:59 PM
Sorry it took so long to reply. Been a helluva week...
Thanks. Yes, I have tried to be very methodical about the component selections. I wish I could take the time to perform some experiments and pre-create the VMs so I could take some measurements from them; then I would be much more confident that the components were all well balanced to allow the most concurrent VMs possible. I will be happy to provide some follow-up on performance. I actually plan to start a vlog or blog (or both) about the build, physical and virtual, soup to nuts, save for component selection. I'll probably cover why I selected what I selected, but since I haven't started it yet, it won't cover the actual discovery/decision-making process.
I'm glad you liked the SSD/HDD base-VHD/differencing-disk idea. I haven't gotten much feedback on it, and I can't find any information on anyone else doing it, so I'm not sure how well it will work, but it makes sense to me. I guess I'll find out.
I am leaning toward the REs. I also like the idea of adding a RAID card at some point, if for nothing more than performance testing and more experience working with RAID arrays. I've configured a few, but I've never really been able to play around with different levels and settings.
Thanks for the points about RAID. I am kind of anxious to play with/test out what the real impacts of it are. But I feel like for now, I'll start with what I already have decided on, then a little later in the year, I may scrape up a few hundred to throw in a decent controller. That will also give me time to really get a feel for how it performs in non-RAID, as well as what the challenges are with it.
Also, thanks for the info on the refurbs. That's a good point. I'd probably feel better about it if they were components less prone to failure and/or with less to lose upon failure. Even though I won't have production data on the drives, I would not want to have to restore or rebuild due to a failure, at least if I can avoid it. And of all the things I have had fail on computers (after the initial week or two), it has been hard drives more than anything else.
Thanks, I hope to. I have thought about 2 x 256GB. I'd like to go that route, but it would cost me one of my other drives: the motherboard I am going with only has 8 onboard SATA connections, and the power supply I am going with also only has 8 SATA connectors. I really like the arrangement of the 7 mechanical drives. If I could replace one of the mechanicals and get 2 x 512GB SSDs, that would be appealing, but it's too expensive. I know I could manage to fit more than 8 drives in, but for now I am content with this arrangement, even though I agree the split would be beneficial. Hmm, maybe it would be worth sacrificing one mechanical: it would cost $10 more for the 2 x 256, but $90 less to go with 5 HDDs rather than 6. The question then is, would splitting the files I was going to store on the SSD across two SSDs make up for having 6 VM differencing disks per HDD instead of 5?
Thanks for the info on the HDDs. You've given me lots to think about, for sure. Possibly I'll throw one Raptor into the mix and put the VMs I expect to benefit the most on it... As far as the budget... lol, well, I started off committed to a $2000 budget. I am now at $2419 without any additional case fans, and with 9 mechanical drives it probably wouldn't hurt to put an extra fan or two in there. I don't want to make any choices that I will regret, but I really can't afford to add any additional expense without recovering the cost somewhere else.
On the platters, that was what I was trying to say: all the heads move together, so I was thinking the number of platters should really only help with large sequential writes and subsequent reads, and shouldn't help random access at all. Like you said, there is only one actuator to move the heads back and forth. All of the heads are connected and traverse the platters at the same time and rate; where one stops, they all stop. I believe drives can read from and write to more than one head at the same time, but given the rotational speed, I would not expect more platters to help except for larger sequential transfers.
Thanks for the feedback. I didn't realize WD would cover them, but that is cool.
Thanks again to all three of you. I hope to get my tax refund the end of next week. Hopefully I can get this stuff ordered in the next couple of weeks... I'll be sure to follow up with info as it progresses...
Posted 22 February 2013 - 05:46 PM
Whether 2 x 256 GB SSDs are better than 1 x 512 GB depends on the load. If the single big one can handle it just fine, the split won't help; in that case the single large drive would provide better space/load balancing.
Sequential transfer rates of HDDs don't scale with platter count, as the electronics are only made to read/write from/to one head at a time. I don't have a link at hand, but there are surely tests out there if you're in doubt.
And 1 or 2 case fans are certainly a good idea for this many HDDs, especially tightly packed. They're not that expensive, anyway.
All the best for your project!