[TriLUG] HD failure imminent?

Michael Hrivnak mhrivnak at triad.rr.com
Fri Dec 17 14:53:03 EST 2004


Brian,

If I recall correctly, "passing" the overall SMART health check only means
that catastrophic failure is not expected within the next 24 hours.
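
As a rough illustration of what sits behind that pass/fail result, here is a
minimal Python sketch (mine, not part of smartmontools) that reads the
attribute rows from smartctl output and reports how far each normalized VALUE
sits above its THRESH.  As I understand the smartmontools documentation, an
attribute counts as failed once VALUE drops to THRESH or below, so for the
normalized columns higher is better; RAW_VALUE is vendor-specific, and a large
raw number is not by itself a bad sign.  The function names are made up for
this example, and the sample rows are copied from the output quoted below with
the wrapped lines rejoined.

```python
import re

# One attribute row of the smartctl attribute table, for example:
#   5 Reallocated_Sector_Ct   0x0032   099   099   036    Old_age   Always       -       46
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def attribute_margins(report):
    """Map attribute name -> (value, thresh, value - thresh, raw)."""
    margins = {}
    for line in report.splitlines():
        match = ROW.match(line.lstrip("> "))  # tolerate mail quoting
        if not match:
            continue  # header lines and other sections are skipped
        value, thresh = int(match.group(3)), int(match.group(5))
        raw = int(match.group(9))  # leading digits of RAW_VALUE only
        margins[match.group(2)] = (value, thresh, value - thresh, raw)
    return margins


def failed_attributes(margins):
    """Names of attributes whose normalized VALUE is at or below THRESH."""
    return [name for name, (value, thresh, _, _) in margins.items()
            if value <= thresh]


# Three rows from the quoted report (assumption: wrapped lines rejoined).
SAMPLE = """\
  1 Raw_Read_Error_Rate     0x000e   054   048   025    Old_age   Always       -       101257730
  5 Reallocated_Sector_Ct   0x0032   099   099   036    Old_age   Always       -       46
  7 Seek_Error_Rate         0x000e   075   060   030    Old_age   Always       -       34823207
"""

for name, (value, thresh, margin, raw) in attribute_margins(SAMPLE).items():
    print(f"{name:24} value={value:3} thresh={thresh:3} margin={margin:3} raw={raw}")
```

A small margin, or a WORST value close to THRESH, is worth watching even while
the overall result still says PASSED.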

It looks like you haven't run the SMART self-tests lately: the log below shows
only one short test, at 4 power-on hours.  Try both of these:

smartctl -t short /dev/hda

The results will show up in the self-test log at the end of the smartctl -a
output quoted below.  Then try:

smartctl -t long /dev/hda

After a while (your drive reports about 23 minutes for the extended test),
those results will also show up in the self-test log of the output below.
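
If you would rather check the log from a script than by eye, something like
the following Python sketch could work.  It assumes smartctl is on the PATH,
that you have the privileges to query the drive, and that the self-test log
lines look like the one in your output (columns separated by two or more
spaces).  The helper names run_smartctl and parse_self_test_log are my own
invention, not anything smartmontools provides.

```python
import re
import subprocess


def run_smartctl(*args):
    """Run smartctl with the given arguments and return its text output.

    smartctl uses its exit status as a bit mask, so a non-zero status is
    not treated as an error here.
    """
    result = subprocess.run(["smartctl", *args],
                            capture_output=True, text=True, check=False)
    return result.stdout


# One self-test log entry, for example:
# # 1  Short offline       Completed without error       00%         4         -
ENTRY = re.compile(
    r"^#\s*(\d+)\s+(.+?)\s{2,}(.+?)\s{2,}(\d+)%\s+(\d+)\s+(\S+)"
)


def parse_self_test_log(text):
    """Return one dict per entry found in the self-test log text."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.lstrip("> "))  # tolerate mail quoting
        if match:
            entries.append({
                "num": int(match.group(1)),
                "test": match.group(2),
                "status": match.group(3),
                "remaining_pct": int(match.group(4)),
                "lifetime_hours": int(match.group(5)),
                "first_error_lba": match.group(6),
            })
    return entries
```

After starting a test with run_smartctl("-t", "short", "/dev/hda") and waiting
for it to finish, parse_self_test_log(run_smartctl("-l", "selftest",
"/dev/hda")) should give one entry per logged test; anything whose status is
not "Completed without error" deserves a closer look.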

I recently worked with a drive that passed the overall SMART health check but
was consistently producing errors, which the self-tests mentioned above did
reveal.

Michael

On Friday 17 December 2004 01:04 pm, Brian Henning wrote:
> Hi List,
>   Someone mentioned that my HD might be on its way out, in response to the
> strange errors I described earlier.  So I went and ran some SMART tests,
> and they all came back negative..  but the general SMART stats looked kind
> of iffy, so I wanted to post them here and find out if anyone had any
> comments, since I really don't completely know what I'm looking at here. 
> Some of the numbers seem high, but I don't know whether high is good or
> bad..  :-P
>
> Here's the complete output of smartctl -a /dev/hda:
>
> smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Device Model:     ST320413A
> Serial Number:    7ED2Y9PT
> Firmware Version: 3.40
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   5
> ATA Standard is:  Exact ATA specification draft version not indicated
> Local Time is:    Thu Dec 16 08:43:02 2004 EST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x82) Offline data collection activity
> was completed without error.
>      Auto Offline Data Collection: Enabled.
> Self-test execution status:      (   0) The previous self-test routine
> completed
>      without error or no self-test has ever
>      been run.
> Total time to complete Offline
> data collection:    ( 422) seconds.
> Offline data collection
> capabilities:     (0x1b) SMART execute Offline immediate.
>      Auto Offline data collection on/off support.
>      Suspend Offline collection upon new
>      command.
>      Offline surface scan supported.
>      Self-test supported.
>      No Conveyance Self-test supported.
>      No Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>      power-saving mode.
>      Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>      No General Purpose Logging support.
> Short self-test routine
> recommended polling time:   (   1) minutes.
> Extended self-test routine
> recommended polling time:   (  23) minutes.
>
> SMART Attributes Data Structure revision number: 10
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000e   054   048   025    Old_age   Always       -       101257730
>   3 Spin_Up_Time            0x0002   083   077   000    Old_age   Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       357
>   5 Reallocated_Sector_Ct   0x0032   099   099   036    Old_age   Always       -       46
>   7 Seek_Error_Rate         0x000e   075   060   030    Old_age   Always       -       34823207
>   9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       5904
>  10 Spin_Retry_Count        0x0012   100   100   097    Old_age   Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       514
> 194 Temperature_Celsius     0x0022   037   054   000    Old_age   Always       -       37
> 195 Hardware_ECC_Recovered  0x001a   056   051   000    Old_age   Always       -       187358950
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   100   000    Old_age   Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%         4         -
>
> Thanks y'all..  I guess I should just start looking into getting a
> replacement drive anyhow..  but I wanted some expert advice.
>
> Cheers,
> ~Brian
>
>
>
> ----------------
> Brian A. Henning
> Strutmasters.com
> 866.597.2397
> ----------------


