Everything posted by Skouperd

1. Evening everybody, I have a serious problem on my hands and I have now absolutely run out of options, so I would really appreciate some input from you guys as I am really stuck. I've built a file server for a company; the server specs are as follows:
- NORCO 4U rackmount chassis with 10 hot-swappable bays
- Intel i7-4820K
- Gigabyte GA-X79-UP4
- 32GB Corsair RAM (4x8GB)
- Corsair CX750W PSU
- Intel RT3WB080 RAID controller
- 120GB SSD (OS drive)
- Intel i340-T4 quad 1Gb NIC
- Asus GT610 GPU
- 8x Western Digital RED WD40EFRX drives

The SSD was connected straight to the motherboard, while the 4TB drives were connected to the RAID controller. The SSD acts as the OS drive, while the 8x4TB drives were set up in RAID 6. Over a period of about 8 months, I've lost more than 10 of the 4TB drives on this setup. The drive failures seem completely random. The NORCO chassis contains two bays with 5 drives each, and the failures were not tied to a specific bay, nor to the top or the bottom drives; they were genuinely random. Sometimes the drives will run fine for weeks, and then all of a sudden 2 (or even 3) drives will fail in one weekend. The SMART information was not completely out of the norm, but there were slightly elevated read allocation error counts on the drives I've had to replace.

As a process of elimination, I replaced the SAS cables (SAS cables fanning out to 4 SATA connectors which plug into the back of the drive bays). This did not solve anything; the drives continued popping out of the array and I continued replacing them with new ones. At this stage I was getting really desperate to figure out what was causing the issues and to preserve the data (recovering from our backups took ages), so I removed 4 of the drives and re-allocated 3 of them to 3 other servers as internal storage (nearline backups). The fourth drive I took off the RAID controller and connected straight to the motherboard (via a SATA cable to the back of the Norco backplane). This was done about 6 months ago, and to date not a single one of these four drives has failed. The four drives that remained in the array, however, kept on failing fast.

Discussing the symptoms (random failures, slightly more frequent over weekends), the suggestion was that it may be the PSU. (I stay in South Africa and our power supply is really unreliable.) So the next step of elimination was the possibly dodgy PSU, which I replaced with a new one. Unfortunately, the drives in the array kept on popping out. I then had a massive UPS installed that powers the whole rack. This did not help either and the drives kept on popping. (Pick an error message a drive can show, I've had them all...) OK, then it must obviously be the RAID controller, so I replaced the RAID controller with a new one, again an Intel RT3WB080; this did not resolve anything either. Despite having replaced pretty much all the drives by now with new ones, I then decided maybe the WD Red drives just aren't good enough. I had two failed drives at my supplier, they agreed to credit me the cost of the two RED drives, and I took 2x WD 4TB SAS drives instead.
For starters, I only realised after installing the SAS drives that this Intel RAID controller is not suitable for SAS drives, so I took another RAID controller that I had close by (a RocketRaid 2722), which is a really cheap controller but one that has been running in another server for close to a year without any issues. In any event, the one SAS drive seems to be completely dead as it doesn't even register (I still need to test it on another SAS system). The second SAS drive was accepted and I rebuilt the array (again RAID 6). Truth be told, with a new RAID controller, a new PSU and new SAS cables, 3 of the 4 drives popped within 24 hours of building the array. I figured the only thing that had not been changed yet was the backplane of the Norco chassis, so I removed the Norco chassis and built the server into one of my old no-name chassis, with all 4 drives connected straight to the RAID controller (i.e. no backplanes between the drives and the controller). Guess what, it still failed after 24 hours.

I am really, really running out of options here and will appreciate any suggestions. In summary, here is what I've done:
- Replaced the SAS cables
- Replaced the RAID controller (twice)
- Removed the Norco chassis (i.e. no backplanes to worry about)
- Replaced the PSU
- Installed the whole setup onto a high-end UPS

The only thing that might be slightly out of the norm is that the last failure happened at exactly the same time that our backups kicked off. Most (not all) of the drive failures were also over weekends, but I've never heard of a backup program being the cause of drives popping out. I have not replaced the CPU, RAM, GPU, NIC or motherboard, as none of these in my mind is directly connected to the drives, but if you reckon one of them may be the problem, then please let me know. Please, any suggestions will really be appreciated. Kind regards Skouperd
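PS: since the only semi-consistent SMART signal has been the slightly elevated read allocation error counts, something like the rough Python sketch below is what I have in mind for logging the SMART counters on every drive over time, so I can see whether they creep up before a drive pops. This assumes smartmontools is installed; the device paths and attribute names are placeholders and will differ per system.

```python
# Rough sketch: log a few SMART counters for every drive so that I can see
# whether they creep up before a drive pops out of the array.
# Assumes smartmontools is installed; device paths and attribute names are
# placeholders and will differ per system.
import csv
import datetime
import subprocess

DRIVES = ["/dev/sda", "/dev/sdb", "/dev/sdc"]  # adjust to the actual drives
ATTRIBUTES = ("Reallocated_Sector_Ct", "Raw_Read_Error_Rate", "Current_Pending_Sector")

def read_attributes(device):
    """Parse `smartctl -A` output and return the raw values of the attributes above."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True, check=False).stdout
    values = {}
    for line in out.splitlines():
        parts = line.split()
        # ATA attribute lines have 10 columns; the name is column 2, RAW_VALUE is the last.
        if len(parts) >= 10 and parts[1] in ATTRIBUTES:
            values[parts[1]] = parts[9]
    return values

with open("smart_log.csv", "a", newline="") as f:
    writer = csv.writer(f)
    stamp = datetime.datetime.now().isoformat(timespec="seconds")
    for dev in DRIVES:
        for name, raw in read_attributes(dev).items():
            writer.writerow([stamp, dev, name, raw])
```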
2. I am located in South Africa, so any drive that arrives here has travelled at least 3000 km from the factory. However, despite the distance, these drives are sourced from a very reputable dealer who buys directly from the two main importers. I would guess that about 80% of all drives in South Africa are sourced via these two importers, so they obviously know how to import them, while the dealer I am buying from is one of the biggest in the country as well. I could add that I must have bought in excess of 100 drives in the last 5 years from one or other of these 3 companies and, barring this server, never really had any problems (sure, the occasional drive failure, but nothing like this). The rack is a 4-post rack in a restricted-access server room (only 2 people access the server room, me and one other person). You mentioned software could be causing drive failures; would you mind giving me some examples of software that has caused such failures, apart from the obvious ones being RAID controller drivers and RAID controller firmware? I have since moved the drives out of the rack into a separate computer just to run some tests, but I think the damage has been done already, as the SATA drives keep on failing (I received the two new SAS drives and this time round they seem to hold up well). Again, thank you very much for the feedback, very much appreciated.
3. Thank you very much for the information. To answer some of your questions: This is the Norco chassis that I've got: http://www.directron.com/rpc450th.html and this is what the backplanes look like (from the back): http://ep.yimg.com/ay/directron/norco-rpc-450th-4u-server-rackmount-chassis-with-10-hot-swappable-drive-bays-3x-5-25-inch-drive-bays-8.gif The chassis did not ship with a PSU, but I bought a Corsair CX750W PSU: http://www.corsair.com/en/cx750-80-plus-bronze-certified-power-supply I replaced the PSU with the same model some time ago (as my thoughts were also on the PSU), but neither of these two PSUs shows any fluctuations on any of its rails, nor gives me issues when I run it in any other computer (the one I replaced is used in another gaming rig and it is running smoothly). I've eliminated the backplanes by rebuilding the file server into a chassis that does not require any backplanes (i.e. the SAS/SATA cables go directly from the controller to the drives). However, the drives crashed within 24 hours in the new build (that was the 2 new SAS drives and 3 of the WD Red drives). I've since returned the two SAS drives, as the one is completely dead (it looks like it was dead on arrival) and the second one is giving me issues in that it does not register every time, and when it does, it seems like I am stuck on 2TB only. (The second drive seems like it took a knock or something, as there is minor damage to the outside of the drive.)

Regarding rack mounting, even though these WD Red drives are not meant to be rack mounted, I should add that the rack contains only the following devices:
1. A 24-port switch (no vibrations, as it doesn't even have a fan)
2. Three "servers"; however, none of these servers have any spinning drives in them, they are fully equipped with SSDs. The only vibrations in these may come from the fans (120mm PSU fan, a normal Intel LGA 2011 CPU fan, and one low-RPM 120mm exhaust fan)
3. A KVM switch (no vibrations, as it doesn't have a fan installed)
4. Keyboard, mouse and screen (vibrations only when the keyboard is used, which is rare considering we remote-desktop into the servers)
5. The main file server (which is giving the issues and the only one with spinning drives)

I should add that I've removed 3 of the 8 drives from the file server and installed one into each of the 3 servers mentioned in point 2 above. Also, in order to reduce vibration of the tower itself, I've mounted the file server at the bottom of the rack (less vibration, if any, the closer it is to the floor). The UPS etc. are all outside the rack. Given the above contents of the rack, together with the placement of the server, I am not sure that the vibration caused by other equipment is enough to damage the drives. I've also checked just now: the rack seems quite solid with no visible vibration (I've touched it lightly and honestly cannot feel any vibration on it). I had a long chat with my supplier as well when I returned the two SAS drives, and he is predominantly of the opinion that I am just having an exceptionally, unnaturally, super unlucky run of getting all the really bad hard drives. However, even I am getting the feeling that surely one person cannot be that unlucky and that something else is at play here. Has anybody ever had issues with software causing drive failures? The data stored on these drives is super sensitive (bank data) and as such we have various encryption packages installed on the servers.
Thank you very much for taking the time to read this, but more importantly for taking the time to respond. Kind regards Skouperd
4. Hi everybody, I am looking for some guidance / advice on the best RAID controllers for use with multiple Vertex 3 60GB SSDs. I need to upgrade my existing SAS controller, so I figured I'd just ask for some input here. Please note, as this is the OS drive on my gaming PC, I don't care at all about redundancy. My saved games etc. are stored in a separate RAID 5 array on the PC, whereas my real data is stored on a server with more than sufficient redundancy, so let's not debate the various RAID levels please. Full details on my current rig can be found here: http://blog.skoups.com/?p=272 but from a hard drive perspective I have:
- SAS Controller: Adaptec 6405
- SAS Expander: Intel 6-bay hot-swap SAS expander (AXX6DRV3DEXP)
- Game Drives: 6 x VelociRaptor, 150GB, 10,000 RPM (located in the SAS expander)
- Game Array: RAID 5, ~Read = 400MB/s, ~Write = 400MB/s
- OS Drives: 2 x OCZ Vertex 3, 60GB
- OS Array: RAID 0, Read ~ 800MB/s+, Write ~ 400MB/s+

The price of 60GB Vertex 3 SSDs has come down so much, and during a recent promotion I picked up 4 more at a real bargain. My thinking now is to get an 8-port SAS controller, use 6x SSDs in RAID 0 for the OS / programs drive, and use the remaining two ports to drive the SAS expander housing the 6x Raptors. If you followed my link above, you will note that I've designed my case for exactly the above kit. My question now: I've used Adaptec controllers for ever and a day, but lately their benchmarks just aren't that impressive anymore (as can be seen from the existing RAID 0 array). It is also very difficult for me to find benchmarks with 6 SandForce drives in RAID 0 to do meaningful comparisons. Which is why I would appreciate any advice / comments from people who have actually tested a similar kind of setup themselves. Thank you in advance for taking the time to comment / respond, it is really much appreciated. Please feel free to ask questions if anything is not clear. Kind regards Skouperd. PS, there is no particular reason why I want to do this, it just sounds like a nice cool project, and you can't argue with that! ;-)
5. That 7 series Adaptec is most likely the RAID controller that we've been waiting for! I would definitely seriously consider the Adaptec 7805 for my gaming rig. I notice the website says available in October; do we have any indication whether they will really be "available", or will they just be "available" to some elite companies? Kind regards Skouperd
6. No, your understanding is the same as mine. I would go with the smaller drives any time of the day since, as you rightly put it, you will get a much bigger benefit from the speed aspect of things. Just bear in mind that when you RAID 0 SSDs, TRIM does not always work, so you may need to clean-wipe the drives every now and again. Apart from that, I would personally have opted for 8x128GB drives instead of 4x256GB, since then you can get even more speed, plus chances are the RAID controller you've opted for will be able to handle 8 drives as well. My 2 cents on the matter.
7. Hi there, I had the opportunity to play with the same RAID card using Vertex 3 drives. When I used 4 drives (individual read / write speed = ~500MB/s), my throughput on the card (RAID 0) peaked at about 1400MB/s, with about 600MB/s writes. When I upped that to 8 drives, the reads peaked at around 3000MB/s and the writes at 2000MB/s. You are using 4 drives and hitting 1800MB/s reads. That in my mind is quite fast. Remember you need to account for some overhead with the RAID, etc., so your speeds are more or less what I'd expect. I would not be concerned with the smaller block sizes; those are always slow. BTW, I reckon that when one starts getting to that kind of speed, one starts approaching the PCI Express bandwidth bottleneck, but I am not sure; see the rough numbers below.
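Just as a back-of-envelope sanity check (and assuming the card is a PCIe 2.0 x8 controller with roughly 500MB/s usable per lane; that is my assumption, not something I've confirmed for this specific card):

```python
# Back-of-envelope PCIe check; all figures approximate and assumed,
# not measured (PCIe 2.0 x8 card, ~500 MB/s usable per lane).
lanes = 8
per_lane_mb_s = 500          # rough usable throughput per PCIe 2.0 lane
drives = 8
drive_read_mb_s = 500        # Vertex 3 sequential read, roughly

bus_budget = lanes * per_lane_mb_s            # ~4000 MB/s
raw_drive_total = drives * drive_read_mb_s    # ~4000 MB/s

print(f"PCIe budget     : {bus_budget} MB/s")
print(f"Raw drive total : {raw_drive_total} MB/s")
# With 8 drives the raw drive bandwidth already matches the bus budget, so a
# measured ~3000 MB/s read is plausibly bus / controller limited rather than
# limited by the drives themselves.
```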
8. OK, a couple of months later, I found myself a RocketRaid 2720SGL, as seen here: http://thessdreview.com/our-reviews/highpoint-2720sgl-rocketraid-controller-review-amazing-3gbs-recorded-with-8-crucial-c400-ssds/ The performance on it using 4x Vertex 3 60GB SSDs was quite nice, achieving around 1.4GB/s reads and about 600MB/s writes. I then needed to build a bigger array for somebody else (this time 8x Vertex 3 60GB drives) using the same card. The performance was around 3GB/s read and about 2GB/s write (faster CPU etc. than my rig). So, if anybody else is reading this and is in the same boat as I was, I can recommend the RocketRaid 2720SGL if the only thing you are looking for is raw speed. I have to warn you though, the RAID 5 performance really sucks. Now, an unashamed bump here... since I've lost my RR2720SGL (the client desperately needed that card and the supplier could only get more in a couple of weeks' time). Question: does anybody have experience with the ATTO R608 ExpressSAS card and how it will perform with multiple (4, 6, 8) SSDs in RAID 0?
9. Thanks for the input. My understanding (and correct me if I am wrong) is that the Adaptec 6x05Q series only adds hybrid SSD caching into the mix. My understanding of that is that it will use normal SATA drives together with an SSD, reading from the one and writing to the other. As such I cannot really see that their Q series will add anything over and above the normal cards, but I may be wrong. Thank you for the tip on the LSI and Areca cards. I don't have any experience with either, so if anybody here has some spare SSDs they don't mind putting into one of those cards to see how they perform, that would be so cool! Thanks for taking the time to respond, appreciated.
10. Hi Userwords, here is my suggestion, and I am sure somebody else will be able to give you more information. My suggestion is that you create two arrays (see the capacity sketch at the end of this post):

1. Array 1 = RAID 5
- Create a RAID 5 array, meaning the total space available = (number of drives less one) multiplied by the drive size. So, if you have 5 x 2TB drives, you will only be able to use 8TB.
- Install say four or five 1TB or 2TB drives in it, depending on how much "usable" space you need.
- Move the following folders to that array:
  o Application software
  o Game installs
  o Media
- RAID 5 means that should you lose any one drive, the data will still be intact. So, you could lose 1 out of 5 drives and still not lose your data.

2. Array 2 = RAID 1
- Create a RAID 1 array, meaning that the drives will be mirrored in real time.
- Unless you go with RAID 10, I'd suggest you just use 2 x 1TB drives for a total space of 1TB.
- This array will host the following data:
  o Windows installation / boot volume
  o Your personal data (photos etc.)
- This means that you could lose either one of the two drives and still be able to go on with your business.

However, RAID is never a substitute for proper offsite backups, especially of your personal data (which I assume is irreplaceable, such as photos). So what I am doing is I actually bought myself another hard drive and installed it at a friend's place, and he did the same at mine. I created a TrueCrypt volume on my off-site drive, and every month or so I make a full backup of all my photos and stuff (also around 400GB); he does the same with his photos onto the drive I host for him. This means that should my house burn down and all my drives be destroyed, for instance, I still will not lose my photos. I am just doing this offsite backup manually every now and again (he stays about 3km away from me, so we just do it wirelessly), but I am using a program called "allwaysync" to keep my USB drive's data synced / backed up between various PCs. In any event, I am sure somebody else will give you other suggestions.
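As promised above, here is the capacity arithmetic I am using, as a rough Python sketch (the drive counts and sizes are just the example numbers from my post, and filesystem / formatting overhead is ignored):

```python
# Quick sanity check on usable capacity for the two arrays suggested above.
# Drive counts and sizes are just the example numbers from the post;
# filesystem / formatting overhead is ignored.
def usable_tb(drives: int, size_tb: float, level: str) -> float:
    """Usable capacity for a few common RAID levels."""
    if level == "RAID0":
        return drives * size_tb
    if level == "RAID1":
        return size_tb                    # everything is mirrored
    if level == "RAID5":
        return (drives - 1) * size_tb     # one drive's worth of parity
    if level == "RAID6":
        return (drives - 2) * size_tb     # two drives' worth of parity
    raise ValueError(f"unknown RAID level: {level}")

print(usable_tb(5, 2, "RAID5"))  # 8.0 TB usable from 5 x 2TB (Array 1)
print(usable_tb(2, 1, "RAID1"))  # 1.0 TB usable from 2 x 1TB (Array 2)
```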
11. Hi Kevin, thanks for the feedback. I agree with your assessment that Adaptec is just not able to keep up when it comes to very fast arrays. Your point on the write speeds is well noted, but my argument is that since this is the OS drive, where a lot of reads (at least in theory) happen, I am more interested in the read speeds (both sequential and random). The writes that occur are typically user files and saved files, which go to the RAID 5 array in any event. With regard to budget, I learned a long time ago that it is better to get the right stuff the first time, or else it just works out more expensive. If I am unable to afford it now, I just need to wait a bit longer. My current budget will be around $600 to $1000 USD, which I reckon should put me in the ballpark for most consumer and mid-level RAID controllers. Hope that helps? Thanks for taking the time to respond. Regards Skouperd
12. Hi everybody, after researching several sites on the web, the closest somebody came to solving my problem was on this site. So, without further ado, my current setup is as follows:
1. My "server" is a relatively high-end desktop (old gaming motherboard), quad core CPU, and several network cards. The server basically just acts as a file server, so there is very little load on the CPU. The OS installed is Windows Server 2008 R2.
2. The motherboard has 5 on-board SATA ports, populated with 5x 1.5TB HDDs (1.5TB is the cheapest per GB).
3. A cheap PCI SATA card houses my OS drive and a spare drive (only two SATA ports on the card).
4. My DAS enclosure, http://www.chyangfun.com/pro01_2_3.asp, is equipped with two SIL3726 chips.
5. I've plugged 5x 1.5TB hard drives into the DAS enclosure, 4 drives on the one SIL3726 board and one on the second SIL3726 board (the remaining slots have been filled with other drives).
6. Bundled with the DAS enclosure came a SIL3132 RAID controller.

I set this up as follows:
1. Using the onboard RAID of my motherboard, I created a RAID 0 array with the 5x 1.5TB drives. I call this Array1.
2. Using the SIL3132 controller, I then created another RAID 0 array in the DAS, again with 5x 1.5TB drives. Let's refer to this as Array2.
3. Using Windows Server 2008 R2, I then mirrored these two arrays, resulting in a RAID 0+1 setup with 7.5TB usable space.
4. In summary, I have 10x 1.5TB hard drives, 5 in the DAS and 5 internally (and several other ones that are not really applicable to this discussion).

The reason why I opted for RAID 0+1 and not RAID 10 is because I also end up having to extend my arrays. Having a RAID 0+1 array allows me to break the mirror (in software), destroy one of the arrays, plug more hard drives into that array, recreate a new, larger RAID 0 array, and copy the files from the original array onto the newly enlarged RAID 0 array. When the files have been copied successfully, I am able to break the first array, increase its size, and just mirror it again. I appreciate the risk in actually doing this, but given the hardware I have available, it is a risk I am comfortable with.

Where I currently have a problem is that my read speed on the RAID 0+1 array was shocking. I was getting in the region of 250MB/s. This is acceptable in most scenarios, but the mere fact that I have 10 drives, each capable of around 80MB/s to 100MB/s, just tickled that "nerd" feeling in me, saying that I can get more. Analysing the two arrays in more detail, I was getting around 230MB/s on Array1 (the motherboard RAID 0), but only 130MB/s on Array2 (the DAS with the SIL3132). With a mirror (reading from both sources), this ties back to the 250MB/s I observed on the RAID 0+1 setup (speed of the slowest array times the number of arrays: 130x2 = 260, less some additional overhead = 250MB/s; see the worked numbers in the PS below). My initial hope was to get over 400MB/s (hoping for 500MB/s, but I am a realist). From researching the problem, it appears as if the bottleneck is tied to the SIL3132; refer to this post here: I am not sure if the limitation comes in with the bus speed of the PCI Express x1 slot (250MB/s), or if it is perhaps eSATA. However, the SIL3726 is stated to be SATA 2 (3Gb/s) compliant, which means the bus speed is definitely a potential bottleneck. It is obvious that if I want to reach my goal of 400MB/s throughput, I will need to upgrade some hardware in my "server". These are my current restrictions / objectives:
1. I have reached a limit in terms of how many hard drives I can house directly on my internal motherboard, so ideally I would need something that can expand the internal drives a little bit more, or I will need to obtain another DAS at some point.
2. I reside in South Africa, and we don't get all the name brands here, such as HighPoint, Areca etc. We do however get Adaptec controllers. A good indication of the cards that I can get is this store here: http://www.sybaritic.co.za (the Rand / Dollar exchange rate is around R6.5 for every USD).
3. This is a home setup, so obviously I would like to keep the cost as low as possible and re-use as much of the hardware as I can, but still have the ability to expand in future.
4. I continue to grow my array on a regular basis. I've managed to do just that with the RAID 0+1 setup for quite some time now without any data loss, but if one could incorporate a dynamic RAID expansion or RAID migration solution, that would be awesome.

Now, from a hardware perspective, I realise that I will need to chuck the SIL3132 card if I want better throughput, but I would prefer to keep the DAS, since I reckon one should be able to get at least 250MB/s from it (which, if paired with an internal 250MB/s, should give me my 400MB/s read target). The RAID card that I have been eyeing for a very long time now is the Adaptec 3805, but I cannot find any confirmation that it will work with the SIL3726 chipset in my DAS. Also, I am kind of new to dedicated RAID cards (I've been dealing with the cheapies mostly), so I am not exactly sure how SATA / SAS expanders work, since the SIL3726 uses eSATA cables. If anybody asks why I aim for 400MB/s, truth be told, it just sounds better than 200MB/s and I know that the hard drives are definitely capable of doing it (and more). So if I cannot reach it, it will not be a train smash, but if I can reach it, that will be awesome! I appreciate any constructive feedback and suggestions please. Also, I apologise for any spelling and grammatical errors, English is not my first language. Kind regards Skouperd
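PS: to make the arithmetic behind the 250MB/s observation and the 400MB/s target explicit, here is a rough Python sketch. The PCIe figure is my assumption of a first-generation x1 link; all numbers are approximate.

```python
# Back-of-envelope numbers for the current mirror and the upgrade target.
# The PCIe figure assumes a gen-1 x1 link; all numbers are approximate.
array1_mb_s = 230        # 5x 1.5TB RAID 0 on the motherboard ports (measured)
array2_mb_s = 130        # 5x 1.5TB RAID 0 behind the SIL3132 / SIL3726 DAS (measured)
pcie_x1_mb_s = 250       # rough usable bandwidth of a PCIe gen-1 x1 slot (assumed)

# Reasoning from the post: the mirror reads from both sides, but the slower
# side paces it, i.e. roughly 2x the slowest array, less some overhead.
observed_mirror_read = 2 * min(array1_mb_s, array2_mb_s)   # ~260 MB/s
print(observed_mirror_read)

# Target: if the DAS side could also be pushed to ~250 MB/s, reading from
# both sides should comfortably clear the 400 MB/s goal.
target_read = 250 + 250                                    # ~500 MB/s ideal case
print(target_read)
```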
13. Hi Mockingbird, it was only very recently that I needed to "upgrade" my arrays again with an additional 4 drives, which is what made me acquire the DAS device. Before then, I connected some of the drives to just normal el-cheapo SATA controllers (2 and 4 ports). qasdfdsaq, with regard to the two links you've sent me: the multilane adaptor looks very cool indeed, thanks for linking that. With regard to the Adaptec 16-port card, I would rather keep an eye out for one that is PCI Express. I received confirmation from Adaptec as well that none of their 3 series or 5 series RAID controllers will work with port multipliers. So it looks like the HighPoint (here: http://news.softpedia.com/news/HighPoint-Is-First-to-Launch-PCI-E-2-0-x16-SATA-Port-Multiplier-172734.shtml) is looking more and more like the option / solution for me. (Now if they will just release it!) Thank you all for your input, really much appreciated.
14. Hi guys, thank you all for your responses. With regard to the suggestion to run cables directly to the hard drives in the DAS, I agree that that is actually a very good point and one that has crossed my mind. The only thing is that in order to make that work, you actually need a SATA card capable of handling a fair number of drives. I think at last count I have 15 drives (and growing), but that should keep me out of mischief at least until I can get a decent enclosure / PM. Thanks for all the comments people, very much appreciated.
15. Thank you for the response. (I was wondering if anybody actually read this.) With regard to your point that the SIL3726 is a piece of junk, I am tempted to agree with you. However, since my DAS is fitted with 2 of them (each capable of running 4 drives), each using its own eSATA cable, then even if I am only able to squeeze 100MB/s out of each card, I will be able to reach my goal of 400MB/s (200MB/s on my internal array, and 100MB/s + 100MB/s on the DAS arrays). If I can squeeze 125MB/s out of each of the SIL3726s, that will increase my read speed to almost 500MB per second, which for me is more or less the realistic maximum I should expect given the cost of the DAS. But my original reasoning for opting for SATA port multiplier technology is as follows. If each SATA port multiplier (such as the SIL3726) is capable of housing 4 drives (some can house 5), then an Adaptec 3805 or 5805 would effectively be able to manage 8 of these port multipliers, which gives you 32 SATA drives (or 40 in the case of 5). If each of the port multipliers (such as the SIL3726) is restricted to say 100MB/s (you mentioned 130MB/s), then that should equate to a total throughput of 800MB/s, which is virtually approaching the bus bandwidth limitation of the Adaptec 3805 (being PCI Express x4 = 8Gb/s = 1GB/s); see the PS below. I suppose, in summary, I am very curious to find out whether RAID controllers such as the Adaptec 3805 or Adaptec 5805 will work with normal, el-cheapo SATA port multipliers. I know that HighPoint has released a card specifically targeted at SATA port multiplier technology, as can be seen here, which I suppose supports my theory that the above does seem like a poor man's way to get both storage space and reasonably good throughput: http://news.softpedia.com/news/HighPoint-Is-First-to-Launch-PCI-E-2-0-x16-SATA-Port-Multiplier-172734.shtml At the end of the day, I will only be able to get either a new DAS or a new RAID controller. Ideally, I would love to upgrade both at the same time; however, I will need to choose between getting a proper RAID controller or getting a better DAS. Getting a better DAS will not help me a lot on its own, since I still need a decent RAID controller. In any event, the above is just my thinking, but I would appreciate it if you could say what you would do in the above situation. Let me know if my maths makes sense, and if I perhaps misunderstand something. Kind regards
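PS: here is the same maths as a small Python sketch, just so it is easy to poke holes in. The per-multiplier ceiling is an assumption somewhere between the 100MB/s and 130MB/s figures mentioned above.

```python
# Checking the port-multiplier arithmetic from the post above.
ports_on_card = 8          # Adaptec 3805 / 5805: 8 ports
drives_per_pm = 4          # SIL3726-style port multiplier (some take 5)
per_pm_mb_s = 100          # assumed ceiling per multiplier (100-130 MB/s quoted)

total_drives = ports_on_card * drives_per_pm      # 32 drives
aggregate_mb_s = ports_on_card * per_pm_mb_s      # ~800 MB/s

# PCIe x4 budget on the 3805, as per the post: ~8 Gb/s usable, roughly 1 GB/s
pcie_x4_mb_s = 8000 / 8                           # ~1000 MB/s

print(f"Drives          : {total_drives}")
print(f"Aggregate reads : {aggregate_mb_s} MB/s")
print(f"PCIe x4 budget  : {pcie_x4_mb_s:.0f} MB/s")
```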