LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED


2 replies to this topic

#1 agarwaen

  • Member
  • 6 posts

Posted 22 April 2003 - 05:45 AM

I get the following when probing my Atlas 10K III disk's SMART information (with the UCSC S.M.A.R.T. Suite for Linux):

SMART: Sense (02) LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED

I also got the warning at bootup from my Adaptec 39320D-R card, but it showed the message only twice and doesn't show it anymore.

I'd appreciate it if someone who is more familiar with SCSI SMART stuff could tell me what that message actually means and how severe it is.

I've already backed up all important data, but I don't have time to install Linux and XP on a new drive right now, so I'm going to wait and see what happens.

There is a list of those messages and codes at http://www.t10.org/lists/asc-num.htm

the system:
P4T-E 512MB 2.4GHz
Adaptec 39320D-R, 1 Atlas on channel A and 2x Atlas on channel B (RAID 0, and the SCSI card says it's still optimal)

Ati radeon 8500
intel pro100s
Sb Live 5.1
550W antec truepower

#2 GPz1100a

  • Member
  • 50 posts

Posted 22 April 2003 - 10:18 AM

I started getting this error on an Atlas 10K II 74G about 2-3 weeks after purchase/installation.

The drive would take extremely long periods of time to do simple tasks like directory listings.

I'd RMA the drive ASAP if I were you.

The replacement Maxtor sent out has been working fine.

I should point out that the bad drive, even on first power up, didn't quite sound right. It made a growling sound on spindown. I didn't pay attention to it at first, since I figured the drive had probably been sitting on a shelf for a long time and just needed to be 'broken in'.

About 2 weeks later, the machine started pausing for no reason, and that drive's LED was ON for long stretches of time.

#3 MaxtorSCSI

  • Member
  • 346 posts

Posted 22 April 2003 - 03:18 PM

SMART: Sense (02) LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED


Unfortunately, the level of detail in this error message is not sufficient to tell much more about the drive other than that SMART is predicting an impending failure. If I'm guessing right and the sense qualifier code was "02", this means that a READ or WRITE error rate threshold has been exceeded.
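For anyone who wants to check the raw codes rather than the tool's text, here is a minimal sketch of pulling the sense key, ASC and ASCQ out of fixed-format sense data. It assumes the "02" above is the additional sense code qualifier paired with additional sense code 0x5D, which is how the T10 list linked earlier in the thread words this message. The function names and the sample buffer are made up for illustration; they are not part of any SMART tool.

```python
# Subset of the ASC 0x5D entries from the T10 ASC/ASCQ list.
FAILURE_PREDICTION_ASCQ = {
    0x00: "FAILURE PREDICTION THRESHOLD EXCEEDED",
    0x01: "MEDIA FAILURE PREDICTION THRESHOLD EXCEEDED",
    0x02: "LOGICAL UNIT FAILURE PREDICTION THRESHOLD EXCEEDED",
    0x03: "SPARE AREA EXHAUSTION PREDICTION THRESHOLD EXCEEDED",
}


def decode_fixed_sense(sense: bytes):
    """Return (sense_key, asc, ascq) from fixed-format sense data."""
    if len(sense) < 14:
        raise ValueError("need at least 14 bytes of sense data")
    response_code = sense[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    # Sense key is the low nibble of byte 2; ASC and ASCQ are bytes 12 and 13.
    return sense[2] & 0x0F, sense[12], sense[13]


def describe(sense: bytes) -> str:
    """Map a failure-prediction sense buffer to its T10 message text."""
    _key, asc, ascq = decode_fixed_sense(sense)
    if asc == 0x5D:
        return FAILURE_PREDICTION_ASCQ.get(
            ascq, f"failure prediction, ASCQ {ascq:#04x}"
        )
    return f"ASC {asc:#04x} / ASCQ {ascq:#04x}"


# Hypothetical sample buffer, built by hand rather than captured from a drive.
sample = bytearray(18)
sample[0] = 0x70   # current error, fixed format
sample[2] = 0x01   # sense key chosen for the sample: RECOVERED ERROR
sample[7] = 10     # additional sense length
sample[12] = 0x5D  # ASC
sample[13] = 0x02  # ASCQ
print(describe(bytes(sample)))
```

The decoder only tells you which T10 message applies. What a given drive's firmware counts toward that threshold is vendor-specific, so it can't confirm or rule out the read/write error rate guess above.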

The SCSI specification for SMART includes a provision for controlling the reporting method used by the drive, defined in Mode Page 0x1C, the Informational Exceptions Control page. It also provides for reporting more detailed SMART error information through Log Page 0x2F, the ANSI SMART page.

By default, the Atlas 10K III is supposed to be configured to report an error only once (MRIE=4, Report Count=1). However, some OS implementations may change this configuration. If the drive is still at the factory defaults, then after reporting the error condition, the A10KIII SMART implementation resets and starts accumulating fresh statistics. Since the predictor only runs once every 15 minutes, you should not see errors reported at a rate greater than 4 per hour.
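If you can get the raw bytes of Mode Page 0x1C out of the drive with a MODE SENSE tool, the MRIE and Report Count settings can be read back roughly as below. This is a sketch under assumptions: it follows the standard page layout (MRIE in the low nibble of byte 3, a 4-byte interval timer, then a 4-byte report count), it expects the mode parameter header and any block descriptors to have been stripped already, and `parse_ie_mode_page` and the sample bytes are invented for illustration.

```python
import struct

# MRIE (method of reporting informational exceptions) values from the
# SCSI Primary Commands standard.
MRIE_METHODS = {
    0x0: "no reporting",
    0x1: "asynchronous event reporting",
    0x2: "generate unit attention",
    0x3: "conditionally generate recovered error",
    0x4: "unconditionally generate recovered error",
    0x5: "generate no sense",
    0x6: "only report on request",
}


def parse_ie_mode_page(page: bytes) -> dict:
    """Extract MRIE, interval timer and report count from mode page 0x1C."""
    if len(page) < 12:
        raise ValueError("mode page 0x1C needs at least 12 bytes")
    if page[0] & 0x3F != 0x1C:
        raise ValueError("not mode page 0x1C")
    mrie = page[3] & 0x0F
    # Bytes 4-7: interval timer, bytes 8-11: report count (both big-endian).
    interval_timer, report_count = struct.unpack(">II", page[4:12])
    return {
        "mrie": mrie,
        "method": MRIE_METHODS.get(mrie, "reserved or vendor specific"),
        "interval_timer": interval_timer,
        "report_count": report_count,
    }


# Hand-built sample matching the defaults described above (MRIE=4, count=1).
sample_page = bytes([0x1C, 0x0A, 0x00, 0x04]) + struct.pack(">II", 0, 1)
print(parse_ie_mode_page(sample_page))
```

If the values you read back differ from MRIE=4 and Report Count=1, that would be consistent with the OS or controller having changed the reporting configuration.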

But it's important to understand that SMART is a *predictive* warning system. It can't actually see the future; it has to guess. To do this, the drive monitors the error rate and issues an alert when that rate exceeds a limit. This testing is qualified by a minimum required number of IOs. If the predictor runs and a head hasn't accumulated enough IO for a valid computation, the predictor doesn't *do* a computation in that interval (it just continues to accumulate data and tries again at the next interval). The key takeaway from this is that a drive won't trip SMART if it's just sitting idle without IO.
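To make the gating concrete, here is a toy model of that predictor. It is not the drive's firmware: `MIN_IOS` and `MAX_ERROR_RATE` are made-up numbers, the class and function names are hypothetical, and the sketch only resets the counters after a trip, since that is the only reset behavior described above.

```python
from dataclasses import dataclass

MIN_IOS = 10_000        # hypothetical minimum IO count for a valid computation
MAX_ERROR_RATE = 0.001  # hypothetical error-rate limit


@dataclass
class HeadStats:
    ios: int = 0
    errors: int = 0


def run_predictor(stats: HeadStats) -> str:
    """One predictor interval for one head: 'skipped', 'ok' or 'trip'."""
    if stats.ios < MIN_IOS:
        # Not enough IO yet: no computation, keep accumulating.
        return "skipped"
    if stats.errors / stats.ios > MAX_ERROR_RATE:
        # After a trip the statistics start fresh.
        stats.ios = 0
        stats.errors = 0
        return "trip"
    return "ok"


idle_head = HeadStats(ios=0, errors=0)
busy_bad_head = HeadStats(ios=20_000, errors=50)
busy_good_head = HeadStats(ios=20_000, errors=5)
for head in (idle_head, busy_bad_head, busy_good_head):
    print(run_predictor(head))
```

The idle head is skipped no matter how many intervals pass, which is why an idle drive can't trip SMART and why a quiet period without IO tells you nothing about the drive's health.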

The good news is, there are many reasons why a drive might demonstrate a spurious (i.e., short term) increase in error rates that have nothing to do with an impending failure. If the error condition goes away and doesn't come back (provided you're actually doing IO with the drive), there's little to worry about.

The bad news is, if you see the error multiple times, that means the condition causing the SMART Trip is recurring.

So, generally, if you get a SMART Trip and it goes away and the drive doesn't fail catastrophically soon after, you can safely assume the trip was a transient event and ignore it. So long as you have your data backed up, you can afford to wait. Maxtor will honor the warranty on the drive no matter how good or bad the drive is, when you ultimately request a replacement (provided it is still within the warranty period, that is!).

If you have a program that you can trust to do a lot of IO to the drive (or a tool that will let you initiate a Drive Self Test), use it. Set the drive up performing IO and let it run for a few hours. If you don't get any more SMART trips, you're probably A-OK. If you do get additional Trips, you probably want to trust the SMART Predictor and return the drive for replacement.
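If you don't have such a program handy, even a simple sequential read pass generates steady IO. Below is a minimal sketch, not a proper burn-in tool: `read_pass` is a made-up helper, and the demo reads a temporary file only so that it runs anywhere. To exercise the actual disk you would point it at a large file on that drive or at the raw device node (which needs root, and the exact path depends on your system). Note that reads of a regular file may be served from the OS cache rather than from the drive.

```python
import os
import tempfile


def read_pass(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Read the whole file or device sequentially; return bytes read."""
    total = 0
    # buffering=0 avoids Python-level buffering; the OS cache still applies.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


# Demo on a small temporary file (stand-in for a real target).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(3 * 1024 * 1024 + 5))
    demo_path = tmp.name
try:
    print(read_pass(demo_path), "bytes read")
finally:
    os.remove(demo_path)
```

Run it in a loop for a few hours, then check again for SMART trips as described above.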


