[TriLUG] smartmontools - selftest fails, Health status passed. ???
lfwelty at nc.rr.com
Thu Jan 8 14:21:31 EST 2004
I may be able to answer my own question - a Google search seems
to indicate that people recommend replacing the hdd in this
situation.
http://216.239.41.104/search?q=cache:OpA21QVKE7wJ:lists.debian.org/debian-powerpc/2003/debian-powerpc-200310/msg00413.html+smartctl+PASSED+extended+read+failure&hl=en&ie=UTF-8
Can anyone confirm or provide more information?
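
For what it's worth, my understanding (an assumption on my part, not
something the smartctl output states) is that the overall-health result
only reports whether a normalized attribute VALUE has dropped to its
THRESH, while the self-tests actually read the disk surface. A rough
Python sketch of that comparison, using a few rows copied from the
attribute table quoted below; the function names are made up for
illustration:

```python
# Rows copied from the quoted `smartctl -a` output:
# (id, name, type, normalized value, threshold, raw value)
ATTRIBUTES = [
    (1, "Raw_Read_Error_Rate", "Pre-fail", 100, 20, 0),
    (3, "Spin_Up_Time", "Pre-fail", 68, 20, 4092),
    (5, "Reallocated_Sector_Ct", "Pre-fail", 99, 20, 5),
    (7, "Seek_Error_Rate", "Pre-fail", 100, 23, 0),
    (11, "Calibration_Retry_Count", "Pre-fail", 100, 20, 0),
    (13, "Read_Soft_Error_Rate", "Pre-fail", 100, 23, 0),
    (197, "Current_Pending_Sector", "Old_age", 100, 20, 3),
    (198, "Offline_Uncorrectable", "Old_age", 100, 0, 0),
]


def threshold_failures(attrs):
    """Names of Pre-fail attributes whose normalized value is at or below threshold."""
    return [
        name
        for _id, name, kind, value, thresh, _raw in attrs
        if kind == "Pre-fail" and value <= thresh
    ]


def nonzero_sector_counters(attrs, watch=(5, 197, 198)):
    """Raw values of the sector-related counters (IDs 5, 197, 198) that are not zero."""
    return {
        name: raw
        for attr_id, name, _kind, _value, _thresh, raw in attrs
        if attr_id in watch and raw != 0
    }


if __name__ == "__main__":
    print("threshold failures:", threshold_failures(ATTRIBUTES))
    print("nonzero sector counters:", nonzero_sector_counters(ATTRIBUTES))
```

With these numbers nothing has reached its threshold, which would be
consistent with PASSED, yet the reallocated and pending sector counts
are already nonzero.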
Thanks again,
F.
lfwelty at nc.rr.com wrote:
> Hi y'all,
>
> I have a hdd that is showing some seemingly (to me at least)
> conflicting information. smartctl's health status shows the
> hdd as PASSED, but it's failing the short and long selftests
> at the same place.
>
> - relevant smartctl output below.
>
> If the health status were FAILED and I were seeing the errors,
> I would definitely replace the hdd. But since they conflict,
> I'm not sure whether I need to replace it.
>
> Thanks for the help,
>
> - Frank.
>
> tiresias|ROOT:lfwelty-2# smartctl -a /dev/hda
> smartctl version 5.1-18 Copyright (C) 2002-3 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Device Model: MAXTOR 6L080J4
> Serial Number: 664204750210
> Firmware Version: A93.0500
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 5
> ATA Standard is: ATA/ATAPI-5 T13 1321D revision 1
> Local Time is: Thu Jan 8 13:54:32 2004 EST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Off-line data collection status: (0x00) Offline data collection activity was never started.
>     Auto Off-line Data Collection: Disabled.
> Self-test execution status: ( 112) The previous self-test completed having the read element of the test failed.
> Total time to complete off-line data collection: ( 35) seconds.
> Offline data collection capabilities: (0x1b) SMART execute Offline immediate.
>     Automatic timer ON/OFF support.
>     Suspend Offline collection upon new command.
>     Offline surface scan supported.
>     Self-test supported.
>     No Conveyance Self-test supported.
>     No Selective Self-test supported.
> SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode.
>     Supports SMART auto save timer.
> Error logging capability: (0x01) Error logging supported.
>     No General Purpose Logging support.
> Short self-test routine recommended polling time: ( 2) minutes.
> Extended self-test routine recommended polling time: ( 40) minutes.
>
> SMART Attributes Data Structure revision number: 11
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x0029   100   253   020    Pre-fail  Offline  -           0
>   3 Spin_Up_Time            0x0027   068   065   020    Pre-fail  Always   -           4092
>   4 Start_Stop_Count        0x0032   100   100   008    Old_age   Always   -           162
>   5 Reallocated_Sector_Ct   0x0033   099   099   020    Pre-fail  Always   -           5
>   7 Seek_Error_Rate         0x000b   100   100   023    Pre-fail  Always   -           0
>   9 Power_On_Hours          0x0012   079   079   001    Old_age   Always   -           14083
>  10 Spin_Retry_Count        0x0026   100   100   000    Old_age   Always   -           0
>  11 Calibration_Retry_Count 0x0013   100   100   020    Pre-fail  Always   -           0
>  12 Power_Cycle_Count       0x0032   100   100   008    Old_age   Always   -           62
>  13 Read_Soft_Error_Rate    0x000b   100   093   023    Pre-fail  Always   -           0
> 194 Temperature_Celsius     0x0022   086   082   042    Old_age   Always   -           37
> 195 Hardware_ECC_Recovered  0x001a   100   001   000    Old_age   Always   -           99292106
> 196 Reallocated_Event_Count 0x0010   100   100   020    Old_age   Offline  -           0
> 197 Current_Pending_Sector  0x0032   100   100   020    Old_age   Always   -           3
> 198 Offline_Uncorrectable   0x0010   100   253   000    Old_age   Offline  -           0
> 199 UDMA_CRC_Error_Count    0x001a   200   200   000    Old_age   Always   -           0
>
> SMART Error Log Version: 1
> ATA Error Count: 39 (device log contains only the most recent five errors)
> CR = Command Register [HEX]
> FR = Features Register [HEX]
> SC = Sector Count Register [HEX]
> SN = Sector Number Register [HEX]
> CL = Cylinder Low Register [HEX]
> CH = Cylinder High Register [HEX]
> DH = Device/Head Register [HEX]
> DC = Device Command Register [HEX]
> ER = Error register [HEX]
> ST = Status register [HEX]
> Timestamp = decimal seconds since the previous disk power-on.
> Note: timestamp "wraps" after 2^32 msec = 49.710 days.
>
> Error 39 occurred at disk power-on lifetime: 10518 hours
> When the command that caused the error occurred, the device was in an
> unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 10 59 06 e9 01 0b e0
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> c8 00 08 e7 01 0b e0 0b 217.381 READ DMA
> c8 00 08 0f 02 0b e0 0b 217.381 READ DMA
> c8 00 08 07 02 0b e0 0b 217.380 READ DMA
> c8 00 08 47 1a 0b e0 00 217.380 READ DMA
> c8 00 08 ff 01 0b e0 0b 217.365 READ DMA
>
> Error 38 occurred at disk power-on lifetime: 10335 hours
> When the command that caused the error occurred, the device was in an
> unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 59 06 e9 01 0b e0
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> c8 00 08 e7 01 0b e0 0b 85.432 READ DMA
> c8 00 08 0f 02 0b e0 0b 85.432 READ DMA
> c8 00 08 07 02 0b e0 0b 85.431 READ DMA
> c8 00 08 47 1a 0b e0 00 85.431 READ DMA
> c8 00 08 ff 01 0b e0 0b 85.423 READ DMA
>
> Error 37 occurred at disk power-on lifetime: 10215 hours
> When the command that caused the error occurred, the device was in an
> unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 59 06 e9 01 0b e0
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> c8 00 08 e7 01 0b e0 0b 59.635 READ DMA
> ca 00 10 cf 00 60 e0 60 59.635 WRITE DMA
> c8 00 08 0f 02 0b e0 00 59.634 READ DMA
> ca 00 10 af 00 60 e0 60 59.634 WRITE DMA
> c8 00 08 07 02 0b e0 00 59.633 READ DMA
>
> Error 36 occurred at disk power-on lifetime: 9762 hours
> When the command that caused the error occurred, the device was in an
> unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 10 59 06 e9 01 0b e0
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> c8 00 08 e7 01 0b e0 0b 139.350 READ DMA
> c8 00 08 0f 02 0b e0 0b 139.350 READ DMA
> c8 00 08 07 02 0b e0 0b 139.350 READ DMA
> c8 00 08 47 1a 0b e0 00 139.349 READ DMA
> c8 00 08 ff 01 0b e0 0b 139.333 READ DMA
>
> Error 35 occurred at disk power-on lifetime: 9403 hours
> When the command that caused the error occurred, the device was in an
> unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 59 06 e9 01 0b e0
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> c8 00 08 e7 01 0b e0 0b 291.066 READ DMA
> c8 00 08 0f 02 0b e0 0b 291.059 READ DMA
> c8 00 08 07 02 0b e0 0b 291.058 READ DMA
> c8 00 08 47 1a 0b e0 00 291.058 READ DMA
> c8 00 08 ff 01 0b e0 0b 291.042 READ DMA
>
> SMART Self-test log structure revision number 1
> Num  Test_Description   Status                   Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Extended off-line  Completed: read failure  90%        14063            0x000aea3d
> # 2  Short off-line     Completed: read failure  40%        14063            0x000aea3d
>
>
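
One more observation on the output above. If I read the error log
correctly, DH = e0 has the LBA bit (0x40) set, so the SN/CL/CH/DH
registers should hold a 28-bit LBA. A small Python sketch, assuming that
register layout (the helper name is made up for illustration), to
compare the logged error address with the self-test log's
LBA_of_first_error:

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine ATA task-file registers into a 28-bit LBA.

    Assumes LBA addressing: DH bit 6 set, DH low nibble = LBA bits 27..24,
    CH = bits 23..16, CL = bits 15..8, SN = bits 7..0.
    """
    if not dh & 0x40:
        raise ValueError("DH bit 6 is clear: registers hold CHS, not an LBA")
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Registers after Errors 35-39 in the quoted log: SN=e9 CL=01 CH=0b DH=e0
error_log_lba = lba28_from_registers(0xE9, 0x01, 0x0B, 0xE0)

# LBA_of_first_error reported by both self-tests
self_test_lba = 0x000AEA3D

if __name__ == "__main__":
    print(f"error log LBA: {error_log_lba:#08x}")
    print(f"self-test LBA: {self_test_lba:#08x}")
    print("sectors apart:", error_log_lba - self_test_lba)
```

By that decoding the five logged errors are all at LBA 0x0b01e9, about
6060 sectors past the self-test failure at 0x0aea3d, so there may be
more than one unreadable spot.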
--
----------------------------------------------------------------------
Frank Welty | Earth is a beta site, I just wish that damn
lfwelty at nc.rr.com | pink elephant would give me my mouse back.
----------------------------------------------------------------------