A short introduction to my tech level: I’ve been a system manager for the past 8 years, and I’ve been playing around with computers for nearly 20 years.
Until recently, I had never faced a problem with my own computer that I couldn’t solve, or at least diagnose. That’s why I need your help and a fresh look at things.
First, my setup:
Motherboard: ASUS P6T SE
Motherboard BIOS: v.0808
CPU: Intel Core i7-920 @ 2.66 GHz (Stock)
CPU Heatsink: Prolimatech Megahalems + 2 Cooler Master 120mm fans
CPU Idle Temp: 30-35 degrees C per core
CPU Load Temp: 45-50 degrees C per core (Prime95)
Memory Part Number: 6 GB OCZ Gold PC3-8500U + 6 GB OCZ Gold PC3-10700U (Both OCZ3G1333LV6GK @ 1066 MHz, Stock)
Memory Voltage: 1.65 V
Video Card(s): Asus nVidia GTX295
Sound Card: Sound Blaster Fatal1ty X-Fi
PSU Model Number: Cooler Master 1000W Real PowerPro
Hard Drive(s): Intel X25-M SSD 80 GB (OS/Boot), 4x 2TB HD204UI Samsung Spinpoint F4 (all on ICH10 / SATA2)
Optical Drive(s): GGW-H20L Blu-ray RW
Other Cooling: Cooler Master Stacker 831 case with 6x Cooler Master 120mm fans
Operating System: Windows 7 Professional x64
This machine is running 24x7, rock solid. I never had any issues with performance or stability.
Now, for the problem:
Until a few weeks ago, I didn’t have the 4x 2TB drives in it.
I bought the drives, made a RAID5 array, using the ICH10R, and that’s when the problems began:
When copying (or downloading) large amounts of data to the array, and thus creating high I/O on any of the 2TB drives, my computer suddenly reboots. There are no blue screens (although that option is enabled), and the only entries in Event Viewer are that the system rebooted unexpectedly (Event ID 6008) and a possible cause of power failure (Event ID 41).
No error codes, no nothing.
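For anyone who wants to pull the same two events without clicking through Event Viewer, here is a minimal sketch that wraps the Windows `wevtutil` tool from Python. The helper names and the count of 10 entries are my own choices for illustration; the event IDs are the two mentioned above.

```python
import subprocess

# Event IDs named in the post: 6008 (unexpected reboot) and 41 (possible power failure).
REBOOT_EVENT_IDS = (6008, 41)


def build_query(event_ids):
    """Build an XPath filter that matches any of the given event IDs."""
    clause = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    return f"*[System[({clause})]]"


def build_wevtutil_command(event_ids=REBOOT_EVENT_IDS, count=10):
    """Return the argument list for querying the System log, newest entries first.

    The default count of 10 is an arbitrary choice for this sketch.
    """
    return [
        "wevtutil", "qe", "System",
        f"/q:{build_query(event_ids)}",
        "/f:text",       # plain-text output instead of XML
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # reverse direction: most recent first
    ]


def recent_reboot_events(event_ids=REBOOT_EVENT_IDS, count=10):
    """Run the query (Windows only) and return the raw text output."""
    result = subprocess.run(
        build_wevtutil_command(event_ids, count),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `print(recent_reboot_events())` from a command prompt on the affected machine should list the most recent matching entries, which makes it easier to line up reboot times with the copy jobs.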
I decided to break up the array and see what happens when I copy 1.5 TB of data from one drive to the next: the computer reboots again. The drives are now single SATA2 drives, each with a 2 TB partition. After 10-80 minutes, the system reboots without notice or error.
So, I started to systematically remove drives, and test with the other drives.
Regardless of which drive is the source or the destination (I tried all combos, in all directions), the system reboots when a large amount of data is being moved. This happens not just when copying existing data, but also when downloading (and at the same time repairing files (.PAR) and extracting).
I tried the following:
Check for overheating: All values are well below 40 degrees C; (also checked drives);
Even put an active cooler on my southbridge (The ICH10);
Memory checks. Ran 3 different programs to check / test my memory, ran overnight for hours and hours, multiple passes, 0 errors.
Remove all other hardware, except for the absolute minimal necessary;
Swap / replace power cords and SATA cables, and even rotate the drive positions on the SATA connectors;
Reset BIOS settings;
Reinstall Windows 7 (delete the entire 80GB partition on the SSD and reinstall, no other tweaks, but right after install, start the copy transactions) to rule out the possibility of faulty software and / or drivers;
Reformat / create the drives / partitions: Tried both MBR and GPT partition; Different block sizes;
Turn off Write Back Cache, to even further rule out a problem with my RAM;
Calculate the PSU needs: I tried multiple programs, even a paid one, and counted manually. Granted, under a full Direct3D load (games on high settings, etc.), my GFX card needs around 450 watts, which makes a grand total of 950 watts. However, the problems occur while idling in Windows, where the GFX card draws 100 watts at most, making a total of (roughly) 600 watts, well within the limits of my 1000W PSU;
CPU check / Prime95; runs for days, stable, without a single error;
the problem only (and only) occurs when copying / downloading a large amount of data;
The system runs flawlessly under high load (playing games, watching movies, running programs etc);
I never had this problem before, but then again, I never had the space to start downloading 250 GB of data, or to copy 1.5 TB of data to other drives (I didn’t even have 1.5 TB ^^).
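The PSU estimate in the list above boils down to a bit of arithmetic, sketched here so the numbers can be checked or adjusted. All wattages come from that estimate; the one assumption (mine, implied by the 600 W figure) is that everything except the GFX card draws the same in both scenarios.

```python
# Figures from the PSU estimate above, all in watts.
GPU_FULL_LOAD_W = 450     # GFX card under full Direct3D load
TOTAL_FULL_LOAD_W = 950   # whole system under full load
GPU_DESKTOP_W = 100       # GFX card while idling in Windows (upper estimate)
PSU_RATED_W = 1000        # rated output of the PSU


def rest_of_system_w(total_w, gpu_w):
    """Everything except the GFX card: total draw minus the GPU share."""
    return total_w - gpu_w


def headroom_w(psu_w, load_w):
    """Remaining capacity on paper; says nothing about per-rail limits."""
    return psu_w - load_w


# Assumption: the non-GPU draw is the same at full load and on the desktop.
base_w = rest_of_system_w(TOTAL_FULL_LOAD_W, GPU_FULL_LOAD_W)
desktop_total_w = base_w + GPU_DESKTOP_W

print(f"Rest of system:      {base_w} W")
print(f"Desktop total:       {desktop_total_w} W")
print(f"Headroom on desktop: {headroom_w(PSU_RATED_W, desktop_total_w)} W")
print(f"Headroom, full load: {headroom_w(PSU_RATED_W, TOTAL_FULL_LOAD_W)} W")
```

On paper this leaves plenty of margin during a copy job, but it only looks at total wattage and ignores how the load is spread over the individual rails.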
I am able to reproduce a “fast” reboot error:
I created 3 separate batch files, which basically tell Robocopy to copy data from:
Drive D: to Drive E:
Drive D: to Drive F:
Drive D: to Drive G:
When I run these scripts separately, each runs for a while, but the system still reboots / crashes after an hour or so.
When I start these scripts all at once, it reboots within 5 minutes.
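For anyone who wants to reproduce the "all at once" case, the three batch files can be approximated with one short Python script. This is only a sketch: the drive letters are the ones listed above, but the `/E` flag (copy subdirectories, including empty ones) and the helper names are assumptions, not necessarily what the original batch files use.

```python
import subprocess

SOURCE = "D:\\"
DESTINATIONS = ["E:\\", "F:\\", "G:\\"]


def build_robocopy_command(source, destination):
    """Argument list for one copy job.

    /E (copy subdirectories, including empty ones) is an assumed flag;
    the real batch files may use different options.
    """
    return ["robocopy", source, destination, "/E"]


def build_all_commands(source=SOURCE, destinations=DESTINATIONS):
    """One robocopy command per destination drive."""
    return [build_robocopy_command(source, dest) for dest in destinations]


def run_all_at_once(commands):
    """Start every copy job simultaneously, then wait for all of them.

    Returns the exit code of each job, in the same order as the commands.
    """
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]
```

Running `run_all_at_once(build_all_commands())` on the affected machine starts all three copies in parallel, which is the scenario that triggers the reboot within minutes; passing a single command in the list gives the slower, one-at-a-time case.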
Remember that I had already cloned the drives, so I could rotate the source drive and systematically remove / switch a destination drive, thus trying all the different combos (and checking whether one of the drives might be faulty).
This also makes me doubt whether it’s the sheer size of the data that causes the problem, because within 5 minutes not even 100 MB has been touched, and still it reboots.
This makes me think that the PSU might be the problem. As soon as these drives are actively called upon, I can imagine that a sudden increase in load on the 12V rail overloads it. Although my PSU has 6x 12V rails, I'm not too convinced this works as well as people say; there are many discussions on the web about the use of 6x 12V rails. Could it be that my 12V rail is maxed out? Consider that it's powering the mobo, the CPU (4/6 pins, can't remember), the GFX card (both 6 and 8 pins), 5 drives (SSD + 4x 2 TB), an optical BD-RW, and of course the onboard devices (sound card) and USB devices (headset, webcam).
I am all out of ideas. If there is something that I haven’t checked / tried, please tell me. I think I wrote down everything I tried thus far; maybe I missed something, but I’ve been testing and trying for 3 weeks now.
For now my conclusion / suspects are:
The motherboard. Either a chip in the ICH10 was fried, or the SMBus got a dent;
The motherboard (or ICH10 / SMBus) is just not capable of processing such large amounts of data.
My PSU, mainly the 12V rail. It could be that my 12V rail is loaded to its maximum, and that when the extra drive operations kick in, the load fluctuates and tips a bit above that maximum, thus crashing my computer. Looking at the symptoms (sudden reboots without any errors), it might be a more plausible cause than all the other things I tried. And yet, if you look at my hardware setup, I cannot imagine I reached the 12V max. However, I will be testing this week by taking another PSU from a second desktop and connecting my hard drives to that power supply. Or maybe I'll even take an el cheapo GFX card, remove my GTX295, and see if the computer stays stable during copying…
I’ll post the results shortly. In the meantime, if anyone has seen this problem before, or has other ideas / solutions to try, please let me know here and I’ll try them.
According to the PSU calculator, this is the recommended wattage / amperage for my setup at a full 100% load. Below that is a table of the power my PSU can handle:
AFAIK, I can add the 12V rails together, so my PSU can handle max. 128 A on the 12V rail?
This is my PSU: http://www.coolermaster.com/product.php?product_id=2519
Does this mean my PSU should have more than enough power?
I'm still gonna try with a different PSU or GFX card to be sure, but according to the above I think all should be covered... Any thoughts?
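A quick sanity check on adding the rails together, as a sketch rather than a measurement: converting the summed 128 A to watts gives more than the whole PSU is rated for, so the per-rail limits presumably cannot all be reached at the same time. The 128 A and 1000 W figures are the ones above; the actual combined 12V limit should be on the PSU label and may well be lower than the upper bound computed here.

```python
RAIL_VOLTAGE_V = 12.0
SUMMED_RAIL_CURRENT_A = 128.0   # the six 12V rail limits added together
PSU_RATED_W = 1000.0            # total rated output of the PSU


def power_w(voltage_v, current_a):
    """Electrical power: P = V * I."""
    return voltage_v * current_a


def max_current_a(available_w, voltage_v):
    """Largest current a given power budget allows at this voltage."""
    return available_w / voltage_v


summed_power_w = power_w(RAIL_VOLTAGE_V, SUMMED_RAIL_CURRENT_A)
upper_bound_a = max_current_a(PSU_RATED_W, RAIL_VOLTAGE_V)

print(f"128 A at 12 V would be {summed_power_w:.0f} W")
print(f"PSU rating:            {PSU_RATED_W:.0f} W")
print(f"Upper bound on total 12V current: {upper_bound_a:.1f} A")
```

So even if every watt of the rating went to 12V, the combined current would top out well below 128 A, and each connector is still limited by whichever rail it happens to hang on.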
Many thanks in advance for thinking with me.