uart

Data integrity problems with external usb drives!


Hi guys. It's been a long time since I've posted here, but recently I've had a lot of silent errors when transferring backup image files to external USB hard drives (USB-powered drives).

The HDD media itself seems to be fine and the drives themselves are in good condition (judging by the SMART data etc.), and the copy operation appears successful, with no warning of any failure from Windows 7. However, when I run a checksum (hash) on the copied files there is sometimes an error. :o

I first noticed this when I needed to restore an image and found it to be corrupt. I did some testing and found that several of my other image files were also corrupt. So I started using SHA-type checksums to verify every copy, and an alarmingly high percentage (several percent, depending on the combination of motherboard USB hardware and USB drive) fail the checksum even though no error is reported while copying.

This is pretty surprising and alarming to me! At first I thought it was one particular hardware combination, a Gigabyte motherboard and a Seagate external (USB-powered) HDD. That combination was probably the worst, with about a 5 to 10 percent chance of one of these silent write errors per 2 GB image file, but I've since noticed other combinations of motherboard and external USB drive giving the same issue (though with lower probability).

For now I'm still using these drives to save my backup images, but I'm checksumming every file after every copy. Anyway, just wondering if any other users have had data integrity problems like this with external (USB-powered) hard drives?
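For anyone wanting to run the same kind of check, here is a minimal sketch of the verify-after-copy step in Python. It is not the tool used in the post above (that isn't named); the function names `sha256_of` and `copy_matches` are made up for illustration, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-GB images never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_matches(source, copy):
    """Return True when the source file and its copy have the same SHA-256."""
    return sha256_of(source) == sha256_of(copy)
```

Calling `copy_matches` with the path of the original image and the path of the copy on the external drive returns `False` for a corrupted copy. One caveat: a file read back immediately after copying may be served from the operating system's cache rather than from the drive, so it is probably safer to eject and reconnect the drive before verifying.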


We easily copy a few PB of data per year between various computers, using external drives of various brands and sizes, without any errors that we've noticed.

Given typical rated error rates of 1 in 10^14 bits, you shouldn't be seeing errors anywhere near this often; maybe one every ten TB or so. What programs are you using to copy?
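To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes the rated figure means one error per 10^14 bits and that bit errors are independent; the 2 GB file size and the 5 percent figure are taken from the first post, and the function name is made up. Strictly speaking the rated figure describes read errors that the drive reports, so this is only a scale comparison.

```python
import math

RATED_BITS_PER_ERROR = 1e14  # assumption: spec read as 1 error per 10^14 bits
FILE_BYTES = 2e9             # 2 GB image file, as in the first post
OBSERVED_LOW = 0.05          # lower end of the 5 to 10 percent reported


def p_error_in_file(file_bytes, bits_per_error=RATED_BITS_PER_ERROR):
    """Chance of at least one bit error in a file, assuming independent bits."""
    bits = file_bytes * 8
    # 1 - (1 - 1/N)^bits, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


rated = p_error_in_file(FILE_BYTES)
print(f"bytes per rated error: {RATED_BITS_PER_ERROR / 8:.3g}")
print(f"rated chance of an error in one 2 GB file: {rated:.2e}")
print(f"observed 5% is roughly {OBSERVED_LOW / rated:.0f} times higher")
```

Under these assumptions the rated chance comes out on the order of 1 in 10^4 per 2 GB file, so a failure rate of several percent is hundreds of times too high to be explained by the drive's rated error rate alone.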

 

If you think about it, if you're really getting errors as often as you describe, you should be seeing frequent system instability as well...?


Thanks for the reply Continuum.

Yeah, I might have expected the possibility of an error when looking at multi-PB data sets, but it definitely shouldn't be likely in a few tens of GB (my typical backup sizes are only 10 to 20 GB).

 

At first I thought it was just faulty hardware, either the motherboard or the external drive, but even so I would have expected some kind of error message. But then today I got the same problem on a completely different motherboard and external drive, so now I'm really stumped???

I started to wonder if this type of error is a lot more common than people realise (with USB-powered devices) but is going undetected because most people don't hash-test their files, or if there is something else going on here that I'm overlooking.

 

I'm really confused about this now. I just can't see how I'm getting such low reliability (and it's especially bad because I get no indication of the failure) on two completely independent pieces of hardware. This just can't be right.

Edited by uart


Ah, I just had a thought!

 

Recently my two external HDDs have been packed away and not used, and their USB3 cables have been disconnected and packed away as well. So today, when I got out the "good" drive (the one that previously didn't seem to suffer from these silent write errors), I may well have swapped the USB cables between the two devices. If so, the USB cable would be the only common factor between the setup that gave me errors today and the other hardware combination that caused me grief in the past.

 

I'm a bit busy at the moment, but as soon as I get the time I'm going to do some tests with a bunch of large transfers, interchanging the USB cables between tests, to try and isolate the cable as the cause.
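A rough way to size such a test: if a bad cable corrupts each 2 GB transfer independently with probability p (5 to 10 percent was the figure in the first post), the number of clean transfers needed before a cable can reasonably be called good follows from (1 - p)^n. The sketch below is only an estimate under that independence assumption; the 95 percent confidence level and the function name are my own choices.

```python
import math


def clean_transfers_needed(p_fail, confidence=0.95):
    """Smallest n such that a cable failing with probability p_fail per
    transfer would have shown at least one error in n transfers with the
    given confidence, i.e. (1 - p_fail)**n <= 1 - confidence."""
    if not 0.0 < p_fail < 1.0:
        raise ValueError("p_fail must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))


for p in (0.05, 0.10):
    print(f"p = {p:.2f}: {clean_transfers_needed(p)} clean 2 GB transfers")
```

So a handful of clean copies per cable is not enough to clear it; at these failure rates it takes a few dozen error-free 2 GB transfers per cable.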

 

BTW, just wondering if this issue has been noticed before by anyone here. Can a slightly dodgy cable cause totally silent USB write errors at a rate of about 1 per 10^10 bytes?

Edited by uart


Hello,

I had the same problem some time ago. I investigated and found that it was related to big files (I do not remember exactly, but IIRC it was files of more than 2 GB).

At the 2 GB (or 1 GB, I do not remember) boundary there was very often data corruption affecting several bytes.
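A way to check whether corruption clusters near such a boundary is to compare a good copy with a bad copy and list the byte offsets where they differ. The sketch below is only an illustration, not the method used in this post; the function names are made up, and the default boundary of 2^30 bytes (1 GiB) is an assumption, since "1 GB" could also mean 10^9 bytes.

```python
def differing_offsets(path_a, path_b, chunk_size=1024 * 1024, limit=100):
    """Return up to `limit` byte offsets at which the two files differ."""
    offsets = []
    position = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while len(offsets) < limit:
            block_a = a.read(chunk_size)
            block_b = b.read(chunk_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                # Walk the mismatching chunk byte by byte; a missing byte
                # (one file shorter than the other) counts as a difference.
                for i in range(max(len(block_a), len(block_b))):
                    byte_a = block_a[i] if i < len(block_a) else None
                    byte_b = block_b[i] if i < len(block_b) else None
                    if byte_a != byte_b:
                        offsets.append(position + i)
                        if len(offsets) >= limit:
                            break
            position += max(len(block_a), len(block_b))
    return offsets


def distance_to_boundary(offset, boundary=2**30):
    """Distance in bytes from `offset` to the nearest multiple of `boundary`."""
    remainder = offset % boundary
    return min(remainder, boundary - remainder)
```

If most of the offsets returned by `differing_offsets` give a small `distance_to_boundary`, that would support the boundary explanation; offsets scattered throughout the file would point elsewhere.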

It was an external disk connected over USB 3. The same disk also has an eSATA connector; with eSATA, I have never had data corruption.
So I use only eSATA at the moment.

I searched on the internet and found other people having the problem. It seemed to be related to Windows 7.

As the problem occurs only with big files, not a lot of people see it (and who checks archive integrity when backing up?)

http://superuser.com/questions/698856/cause-of-disk-data-corruption

 

Edited by LeJav


I've seen issues at times where USB controllers (even ones built into the motherboard) start to go unstable and occasionally disconnect for a split second. This can cause errors in the data if the system was writing to the disk at that moment. It is especially likely to cause such issues when only part of a large file, such as a database, is being modified. Doing nothing but data recovery work, we actually tend to wear out USB controllers and always keep a few add-in cards handy for when they go. So I've seen them act quite strangely when they are on the way out.

