uart

Everything posted by uart

  1. Hi guys. It's been a long time since I've posted here, but recently I've had a lot of silent errors when transferring backup image files to external USB hard drives (USB-powered drives). The HDD media itself seems to be fine and the drives themselves are in good condition (going by SMART etc.), and the copy operation appears successful, with no warning of the failure (from Windows 7). However, when I run a checksum (hash) on the copied files there is sometimes an error. I first noticed this when I needed to restore an image and found it to be corrupt. I did some testing and found that several of my other image files were also corrupt. So then I started using SHA-type checksums to verify each time I copied, and an alarmingly high percentage (several percent, depending on the combination of motherboard USB hardware and USB drive) fail the checksum but report no error while copying. This is pretty surprising and alarming to me! At first I thought it was one particular hardware combination, a Gigabyte motherboard and a Seagate external (USB-powered) HDD. Well, that combination was probably the worst, with about a 5 to 10 percent chance of one of these silent write errors per 2GB image file, but I've since noticed other combinations of motherboard and external USB drives giving the same issue (though with lower probability). For now I'm still using these drives to save my backup images, but I'm checksumming every file after every copy. Anyway, just wondering if any other users have had data integrity problems like this with external (USB-powered) hard drives?
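The copy-then-checksum routine described in this post could be sketched roughly as follows. This is only an illustration, not the tool actually used: the function names `sha256_of` and `copy_and_verify` are made up here, and SHA-256 is an assumption (the post only says "sha type").

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then re-read both files and compare their hashes.

    Returns True when the digests match, False otherwise.
    """
    shutil.copyfile(src, dst)
    return sha256_of(src) == sha256_of(dst)
```

One caveat: straight after a copy the operating system may serve the re-read from its cache rather than from the external drive, so a check done after safely removing and reconnecting the drive is likely to be more trustworthy.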
  2. Ah, I just had a thought! Recently my two external HDDs have been packed away and not used, and I've had their USB3 cables disconnected and packed away as well. So today when I got out the "good" drive (the one that previously didn't seem to be suffering from these silent write errors) I may well have swapped over the USB cables between these two devices. If so, then that would make the USB cable the only common factor between the setup that gave me my errors today and the other hardware combination that had caused me grief in the past. I'm a bit busy at the moment, but as soon as I get the time I'm going to do some tests with a bunch of large transfers, interchanging the USB cables between tests, to try to isolate the cable as the cause. BTW, just wondering if this issue has been noticed before by anyone here. Can a slightly dodgy cable cause totally silent USB write errors at a rate of about 1 per 10^10 bytes?
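To put the "1 per 10^10 bytes" figure in perspective, here is a rough back-of-envelope sketch. It assumes errors strike bytes independently at a fixed rate, which is a simplification (a flaky cable may well produce errors in bursts), and the function names are just illustrative.

```python
import math


def prob_at_least_one_error(n_bytes, errors_per_byte):
    """Probability that a transfer of n_bytes contains at least one error.

    Computes 1 - (1 - r)^n via log1p/expm1 to stay accurate for tiny r.
    """
    return -math.expm1(n_bytes * math.log1p(-errors_per_byte))


def expected_errors(n_bytes, errors_per_byte):
    """Average number of errors expected in a transfer of n_bytes."""
    return n_bytes * errors_per_byte


rate = 1e-10  # about 1 error per 10^10 bytes, as quoted in the post
print(prob_at_least_one_error(2e9, rate))  # one 2GB image file (taken as 2 * 10^9 bytes)
print(expected_errors(20e9, rate))         # a 20GB backup set
```

Under those assumptions a single 2GB file would have roughly an 18 percent chance of containing at least one error, and a 20GB backup would average about two errors.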
  3. Thanks for the reply Continuum. Yeah, I might have expected the possibility of an error when looking at multi-PB data sets, but definitely not in a few tens of GB (typical backup sizes are only 10 to 20 GB for me). At first I thought it was just faulty hardware, either the motherboard or the external drive, but even so I would have expected some kind of error message. But then today I got the same problem on a completely different motherboard and external drive, so now I'm really stumped. I started to wonder if this type of error is a lot more common than people know about (with USB-powered devices) but just going undetected because most people don't hash test their files, or if there is something else going on here that I'm overlooking. I'm really confused about this now. I just can't see how I'm getting such low reliability (and especially bad because I get no indication of the failure) on two completely independent pieces of hardware. This just can't be right.
  4. I wouldn't be surprised if this has been discussed ad nauseam in the past, so apologies in advance for dredging up old news if that is the case. However, returning here recently after many years of absence, it really struck me just how quiet these forums are these days. I've noticed similar trends in some other computer hardware forums as well. Is this related to a general downturn in forum usage in favour of things like Facebook and the like? Or is it merely a reflection of the rise of the mobile device and a consequent loss of interest in the "home builds" and upgrades that were so popular back when the desktop computer reigned a decade or more ago?
  5. It was my understanding that modern hard drives were pretty tolerant of sudden power loss. I knew that logical disk corruption was always a possibility, but I thought that the heads retracted and parked fairly safely in the event of power loss these days. The thing that has me questioning that belief is that I just recently lost the hard drive in my HTPC after a bad storm that resulted in a number of sudden losses of power. My wife was watching the TV to get news about the storm, and the power kept cutting out throughout the night. Normally I would turn off the PC and watch straight from the TV under these conditions, but of course the wife just kept restarting the PC after each event. Eventually we lost all power for nearly a week (yeah, it was a cracker of a storm), but when I eventually got it up and running again I noticed that the SMART status of the drive had gone from being perfectly healthy to crap. I first ran a "chkdsk /f" and noticed it seemed to run a little slow. So then I looked at the SMART status and there was a whole butt load of reallocated sectors. Then after doing a full surface scan there were a bunch of bad sectors. This is a pretty non-critical application to me (no important data to lose) so I left the bad HDD in there for a while, and over the period of a few weeks it pretty much "fell apart", with new bad sectors popping up all over the place. The drive is pretty much toast now, that's for certain. So now I'm wondering if this was just a coincidence and the drive was about to die anyway, or am I right to be suspicious that the multiple power outages were a factor in its demise? BTW, the HDD was just an old WD Green 1.5TB. It had been in service for about 3 years and I was pretty sure that it was in perfect health before this event.
  6. Ok, I know that the raw data reported by SMART is essentially proprietary, and so we should really only pay attention to the "Current", "Worst" and "Threshold" fields. However I've never come across a HDD that didn't report the actual number of reallocated sectors in the SMART raw data. I just bought a "Toshiba Canvio Simple 3.0", which is a 1TB external USB powered usb3 drive, and I was a bit alarmed to see the raw data for reallocated sectors count was "...00F0" straight out of the box! I returned the drive to the shop and they gave me a replacement, which initially reported zero reallocated sectors on the first usage. However after I got home and ran a "chkdsk /r" that blew out to "...001658" in the raw data, and the "Crystal Disk Info" software I'm using reports an "Amber Warning" on the drive. The thing is, the "current" and "worst" fields still read 100 (with threshold = 50), so I know that the drive must actually be healthy. Also chkdsk /r found no bad sectors. So I tend to think that the drive is actually ok and that this is just something weird about how Toshiba handles the raw data for realloc sector count. Anyway, just wondering if anyone else has noticed this anomaly with the SMART data from Toshiba drives lately. BTW, Crystal Disk Info reports the drive as being a "TOSHIBA MQ01ABD100" if that means anything.
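The two readings of the attribute described in this post can be put side by side in a small sketch. The `SmartAttribute` class below is purely illustrative (it is not any real tool's API), it takes only the visible low-order hex digits quoted in the post, and it assumes the raw field really is a plain hexadecimal count, which is exactly what is in doubt for this drive.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one SMART attribute as shown by a viewer."""
    name: str
    current: int    # normalized "Current" value
    worst: int      # normalized "Worst" value
    threshold: int  # vendor-defined failure threshold
    raw_hex: str    # visible low-order hex digits of the raw field only

    def below_threshold(self):
        """True when the normalized value has fallen to or below the threshold."""
        return self.current <= self.threshold

    def raw_as_int(self):
        """Interpret the visible raw digits as a plain hexadecimal number."""
        return int(self.raw_hex, 16)


realloc = SmartAttribute("Reallocated Sectors Count", 100, 100, 50, "1658")
print(realloc.below_threshold())  # False: normalized value 100 is above threshold 50
print(realloc.raw_as_int())       # 5720 if the raw field is a simple counter
```

So a check based only on the normalized value and a check based on the raw count can reach different conclusions about the same drive.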
  7. As far as I know, all hard drives report the current temperature (in hexadecimal) in the low-order byte of the temperature raw data. However, some drives also seem to report the temperature (in decimal) in the "value" field, and what looks like the historical maximum temperature in the "worst" field. For example, my Seagate drives look like they report the actual temperature in the "value" and "worst" fields, but my WD and Hitachi drives seem to just use some kind of generic health value that counts down in those fields. So just wondering, is there a way to get the historical maximum operating temperature from SMART for all drives? Or is it only certain drives (like some Seagates) that provide this SMART info?
  8. Thanks Brian. To make it a little clearer, here are some examples of the SMART values I'm referring to. With the Seagate drive in the first image you can see that the current temperature is 28 hex, which is 40C, and that the worst temperature on this drive has been 52C. With the WD drive in the second image, however, you can only see that the current temperature is 1F hex, which is 31C.
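The hex-to-Celsius reading used in this post amounts to taking the low-order byte of the raw value. A minimal sketch, assuming the drive follows that convention (the function name is made up, and the third raw value below is an invented example with extra data in the higher bytes):

```python
def temperature_from_raw(raw_value):
    """Return the low-order byte of a SMART temperature raw value, in Celsius."""
    return raw_value & 0xFF


print(temperature_from_raw(0x28))        # 40, the Seagate example
print(temperature_from_raw(0x1F))        # 31, the WD example
print(temperature_from_raw(0x0034001F))  # 31, higher bytes are ignored
```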
  9. No, not on a UPS. Like I said it's a pretty non critical application, basically just a glorified set top box.
  10. It's out of warranty so it's heading for the bin anyway. I was going to run the WD diags but I'm pretty sure I know what's going on. The bad sectors have grown and they're literally all over the platter. Just for fun I had a go at making a 400G partition at various positions on the drive, and there wasn't one place I could put it that didn't have bad sectors. I think that there were some "brown outs" and other glitches with the power that night, so maybe it was just some type of freak occurrence.
  11. All brands of hard drive are subject to random failure; it's just the nature of the beast. I've used just about every major brand of HDD over the past 25 years, and believe me, any brand can fail unexpectedly. At different times I've had a very good run with one brand (say WD) and a poor run with another (say Seagate), and then at other times those fortunes have been reversed. The main problem is that good information about the reliability of any particular brand/model is rarely available until many units have been in operation for many years, by which time that particular model is obsolete anyway. Yes, I've certainly had a few Seagate drives fail early on me, but I've also had many still in perfect health after 5+ years of daily use. Averaged over many years I'd say I've had very similar experiences with Seagate as I've had with WD and Samsung. As you've apparently just now discovered, the hardware is relatively inexpensive but data recovery is not necessarily so. No manufacturer warranty covers data recovery; warranties are always limited to the hardware itself. The key of course is to have a good backup strategy in place before disaster happens. And to be honest this has never been easier or more affordable, with the low cost per GB of external drives these days. Sorry to be the one to say it, but there really is no good reason for failing to back up your data.
  12. I ended up exchanging it for a Seagate Expansion drive, (model STBX1000301), as it was the only other option they had in stock. So far this drive is working flawlessly, and even seems to work correctly on my sometimes troublesome front usb ports. Here are the SMART values of the new drive: BTW. If you notice how the "fitness" value on the above drive is 100% in speedfan, that read zero on the previous Toshiba drive with all the reallocated sectors. I'm really glad I exchanged the drive now.
  13. Thanks Mighty. Yeah, it's been nagging away at me since I got it. So today I'd had enough and took it back to the store and exchanged it for a "Seagate Expansion". I know what reallocated sectors are, and in my opinion that is the single most important SMART value for predicting drive failure. I just couldn't understand how two brand new drives could have that many reallocated sectors straight out of the box, so I thought it might have just been something anomalous about how Toshiba were handling the raw data on that attribute. To be honest, I'm starting to think that slyphnier might have been on the right track re the USB error thing, and that something dicky with the USB connection might be the common factor here. Both "failed" drives came with their own (new) cables, so I don't think the cable was the common factor. I did however test each drive on one of the front USB ports of my (desktop) computer, and I know that these are a little suspect with USB-powered drives. (My very old and well-used WD Passport won't even power up properly when plugged into either of these!) So one of the first tests I did on each drive (to see how good its USB compatibility was) was to try it on one of these front ports. For the record, both drives powered up and were detected correctly, but both had some degree of difficulty transferring files, and on one or more occasions the transfer hung and I had to abort it. So perhaps this is where the offending SMART anomalies occurred?
  14. Thanks for your input slyphnier. I'm going to try returning the drive today and see if I can swap it out for another brand. I don't know what the issue is, but two out of two is not a good sign.
  15. BTW. Here is the screenshot from Crystal Disk Info.
  16. Can anyone recommend a good utility to find and remove duplicate files on a computer? Are there any good freeware programs available for this? Thanks.
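For anyone curious how such utilities usually work, the common approach is to group files by size first and then hash only the files whose sizes collide. The sketch below is a rough illustration of that idea, not a recommendation of any particular program; it only reports duplicates and deletes nothing, and the function names are made up.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return a list of groups (lists of paths) whose contents hash the same."""
    # Pass 1: bucket regular files by size, which is cheap to obtain.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only files that share a size with at least one other file.
    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_hash[(size, file_digest(path))].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates` on a folder returns one list of paths per set of identical files; deciding which copy to keep is left to the user.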
  17. In the past I've always just bought enclosures and stuck a HDD of my choice in. Then if I ever needed to do any serious diagnostics I'd pull the drive out and connect it directly to SATA or IDE as the case may be. Today however I bought a "Seagate GoFlex" 1.5TB external drive. It is USB3 capable but for now I only have USB2 ports so that's how it's connected. Before I loaded any data I (quick) formatted it as exFAT (for compatibility with my PVR) and did a "chkdsk /r". Obviously that was fairly slow, an hour or two, but it showed the percentage complete and ran fairly smoothly and predictably. Now however I've loaded about 650GB of data on it and I want to run chkdsk /r again, but this time it seems to freeze and give no percentage completion or anything. I know it's going to be different now that it's half full of data, because the "chkdsk /r" is going to have to scan the data as well as verify free space (whereas before it was all just verifying free space). So I'm not sure if it's hanging or if it's just taking a long time (without any progress report) on the file verification. Basically it only displays the following lines: The type of the file system is exFAT. Volume Serial Number is F411-B32E Windows is verifying files and folders... Volume label is Seagate1T5. And that's it, no more information or progress for about 2 hours, after which I abort and reboot. A normal chkdsk or a "chkdsk /f" runs fine, just a few seconds and it says the file system is healthy with no bad sectors and correctly reports the file usage and free space. Is USB2 just too slow to adequately "chkdsk /r" a 1.5TB external drive, or is there something else wrong? In either case, can anyone suggest a good alternative for testing the health of an external drive like this? Thanks.
  18. Hey that's a good idea FastMHz. Thanks, I didn't think of that. Anyway I've gone and run it now. I'd already figured out that it might take 6 to 8 hours to run, but I was just taken aback by the lack of any reported progress (percent). As far as I can now tell this is just how chkdsk (WinXP) works with an exFAT partition: no progress indicators at all for the file verification part. Anyway I just let it run all day and sure enough, after about 5 or 6 hours, it finished scanning the data and started scanning the free space (and finally started giving progress indication). I know that with an NTFS partition it would have reported the progress correctly, but honestly I don't think it would have been any quicker; that's just the nature of 1.5TB over USB2. So apart from this anomaly with chkdsk I'm now wondering what other problems I might encounter by using exFAT on this drive. Thought I probably should start a new thread to discuss that.
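As a sanity check on those timings, the figures from this thread (about 650GB of data scanned in 5 to 6 hours) imply an average read rate that can be worked out directly. A rough sketch, taking 650GB as 650 * 10^9 bytes and assuming the whole drive reads at the same average rate; the helper names are illustrative only.

```python
def throughput_mb_per_s(bytes_read, hours):
    """Average throughput in MB/s (1 MB = 10^6 bytes) for a read of the given duration."""
    return bytes_read / (hours * 3600) / 1e6


def hours_to_read(total_bytes, mb_per_s):
    """Hours needed to read total_bytes at a steady rate of mb_per_s."""
    return total_bytes / (mb_per_s * 1e6) / 3600


data_bytes = 650e9  # roughly 650GB of files on the drive
print(throughput_mb_per_s(data_bytes, 5))  # about 36 MB/s
print(throughput_mb_per_s(data_bytes, 6))  # about 30 MB/s
print(hours_to_read(1.5e12, 30))           # about 14 hours for the full 1.5TB at 30 MB/s
```

A rate in the low-to-mid 30s of MB/s is in the range commonly seen for USB2 in practice, which fits the conclusion that the interface, not the file system, sets the pace.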
  19. I recently formatted my 1.5TB external drive with exFAT. I did this for compatibility with my DTV recorder (aka PVR), which unfortunately can't handle NTFS. I'm now having second thoughts about using this drive for recording TV directly, and think I'll probably just use it to archive recorded stuff from my internal hard drive. So I'm sort of regretting formatting exFAT, but at the moment I don't want to clear everything off the drive to reformat. So is exFAT going to give me any serious problems on a 1.5TB external drive? It's mostly going to be used for archiving (off line storage), typically larger files like media and backup images? At the moment the drive is only USB2 connected so I assume that will limit speed far more than anything to do with the file system. Should I go to the bother of getting everything off this drive and reverting to NTFS?
  20. Just posting this extra info in case it helps someone else. I just tested a smaller external HDD with the exFAT file system, and it turns out that the behavior of chkdsk is indeed different. With an NTFS partition "chkdsk /r" gives a running "percent completed" dialog for both the file verification and the free space verification phases. Whereas with the exFAT partition "chkdsk /r" only gives the running "percent complete" dialog for the free space verification phase. So the upshot is that with the exFAT file system on an external drive, "chkdsk /r" will just sit completely silently during the entire file verification phase! This of course may take many hours in my case, given that the drive is USB2 connected and now has over 650GB of data on it. So basically that's my mystery solved. BTW. This is running on WinXP. Not sure if the behavior is the same or not on later Win OSes.
  21. Yeah I might try it overnight tonight. It just seems strange that the first time I ran it (no data) that it proceeded smoothly, giving percent complete information over several hours. But now when I try to run it (chkdsk /r) it sits for two hours without even giving me any "percent complete" info at all. I'm not sure if this is something to do with the 650GB of data it's now got or if it's related to the fact that I've since reformatted from NTFS to exFAT? Anyway, looking at some other threads here I've found the program "CrystalDiskInfo" that can read the SMART values for external drives, so at least I have some drive health info now (and it all looks ok). BTW. Is CrystalDiskInfo considered a good program for this purpose? Any other recommendations?
  22. I know this has been the standard way of quoting capacity of hard drives for a long time but I was a little surprised to see that a lot of USB "Thumb Drives" are now doing the same thing. I just bought a new Sandisk 8GB USB memory stick (aka thumb drive) and I noticed it said "1GB = 1000,000,000 bytes" on the back of the pack. Unsurprisingly it shows up as 7.44 GB capacity in Windows. I guess the only reason that I'm surprised is because semiconductor memory chips are still based on binary (power of two) memory sizes for obvious address mapping reasons and I assumed that the flash memory chips that these "thumb drives" are constructed from would be the same. So how are they getting the reduced "decimal" capacity, is it due to bad/remapped/spare sectors or something? Edit: Whoops I meant to post this in the "Other Storage" forum but posted it in the "Hard drives" forum by mistake.
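The gap between the advertised and displayed capacity is mostly a units difference, which a quick calculation shows. This sketch assumes Windows reports sizes in binary units (1 "GB" = 2^30 bytes) while the packaging uses decimal gigabytes, as quoted from the pack; the function name is made up.

```python
GIB = 2 ** 30  # bytes in one binary gigabyte (GiB)


def advertised_gb_to_gib(advertised_gb):
    """Convert a capacity in decimal GB (10^9 bytes) to binary GiB (2^30 bytes)."""
    return advertised_gb * 10**9 / GIB


print(advertised_gb_to_gib(8))  # about 7.45
```

That gives about 7.45 for an 8GB stick, so the 7.44 shown by Windows is close to what the units alone predict; the small remainder might be formatting overhead or a slightly different true capacity, though that is a guess.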
  23. Yeah I've got a couple of 1GB sticks and they are all over 10^9 bytes (about 1028.1 million bytes). Honestly though, I'd never really paid that much attention to their exact capacity before. For some reason I just thought that their capacities would be more in line with semiconductor memory, which still uses the "binary" sizes (e.g. if you buy a 1GB stick of RAM you know you'll get exactly 1024 MiB, 1048576 KiB, 1073741824 bytes). I don't know much about the internal construction of the actual flash memory chips themselves, but I'm guessing that they are still constructed around "binary" capacities (for reasons of address decoding and memory mapping etc.) but then the actual capacity gets reduced by the sector management of these devices.
  24. Thanks everyone, I got it sorted now and avoided needing the recovery install. Not sure what the problem with the recovery install was. I was going to try again with as many drivers as possible unloaded or set to generic and my AV uninstalled etc., but in the end I bypassed the need altogether. I was thinking of moving to Windows 7 with a clean install at my next major H/W upgrade anyway, so I shouldn't need any more recovery installs. Thanks to czr for alerting me to the fact that a simple controller swap was all that's needed. Honestly I was very poorly informed about the whole AHCI on Win XP thing before I started. I'd googled it and read a lot of stuff at various random sites, much of which turned out to be total rubbish parroted from one person to the next. Things like "you need to re-install Windows every time you change between IDE mode and AHCI" (either way) in BIOS, and of course the endlessly repeated "you'll get no performance improvement anyway, NCQ is only for heavy server loads and does nothing on a desktop computer". No wonder I was confused after reading so much junk. I really thought the darn re-install was some type of essential process in making AHCI work when clearly it isn't. BTW, in the end I was going with my "migrate back to IDE, load SATA drivers, migrate back to SATA" idea, which I'm now 100% certain would have worked, but halfway through the process I realized that I already effectively had two SATA controllers (built-in) anyway. I suddenly noticed that my BIOS had two relevant settings: 1. "SATA MODE", which could be set to "IDE", "raid" or "AHCI", and 2. "SATA 4/5 as IDE", which could be true/false when the main SATA mode was AHCI. So suddenly it was obvious that the second setting would let me do czr's two controller trick, all on the one controller! I just set it to AHCI (first setting) but with SATA 4/5 still IDE (second setting) and then popped the SATA cable over from SATA 0 to SATA 4.
Booted it (still in IDE mode but with AHCI now active in BIOS so it would prompt for the driver once XP had loaded), loaded the driver, turned off, popped the SATA cable back to SATA 0 and I was done. It would have been 5 minutes tops if I had known what I was doing from the start; too bad I wasted a whole day beforehand. Anyway that's a handy trick to keep in mind. I see a lot of mainboards nowadays have that split SATA/IDE option that saved the day for me.
  25. Grrrr this is really annoying. I've been putting off going to native SATA drivers (XP Pro) for ages because I couldn't be bothered with the hassle, but today I finally decided to go for it. I spent all afternoon chasing down a floppy drive, backing up my system and getting everything ready for a recovery install with "F6" boot drivers. At first the "F6" drivers wouldn't load but eventually I got that sorted (for some totally unknown reason the drivers would crash unless I kept the floppy out of the drive until after I had pressed F6, been prompted for drivers, selected "s" and finally been prompted for the floppy. That's just nuts!) Anyway, after wasting about an hour with the above problem I finally got the "F6" drivers loaded and started a recovery install. The recovery install copied all the necessary files and rebooted. It then started the "GUI" part of the installation, but only a few minutes into the process it blue screened on me with a driver error. I've retried several times and it always does the same thing: blue screens at the same point in the install every time. I then tried setting the BIOS back to IDE mode for the SATA ports but the damage was done and I still couldn't get it to complete the recovery install without crashing. Thankfully I made a backup, which I've just finished loading. So I'm back in action, but after wasting a whole afternoon and evening messing with the unholy crappiness of floppy drives etc., I'm still without AHCI for my SATA drive. This has been a totally painful experience; I'll probably just go without AHCI for now. Does anyone have any suggestions of anything else to try? BTW, the "F6" drivers did work when I tested them on a clean install, so the floppy and drivers etc. are OK. The problem is that I don't want a clean install right now. The other thing I tested is that my current install will allow me to do a recovery install successfully if I stay in IDE mode in BIOS and don't load the F6 drivers.