[TriLUG] controller causing read errors?

David Black dave at jamsoft.com
Tue Jan 25 10:59:08 EST 2011


Speaking of sensors, if you can at least spot-check the power supply voltages, or better still monitor them over time, you may see something.  In particular, watch the +5V rail for sag or overvoltage and, if you have the benefit of a scope, look for noise or spikes on it or on +12V (which also feeds the drive).

The BIOS usually has a screen where you can see 'PC Health', including supply voltages.  It's worth a look but of limited use, since the CPU and hard drives are usually idle while you're sitting in the BIOS.  Unless you've simply had a run of bad drives, something could be stressing them.  If it's not thermal, the power supply is where I'd look next, including the filter capacitors on the motherboard, as you already plan to do.
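
A rough sketch of the "monitor over time" idea, in Python. It assumes you already have a way to collect rail readings as numbers (for example by hand from a multimeter, or parsed from whatever your sensor chip reports; labels vary by board, so that part is left out). The rail names, the helper names check_rails and worst_case, and the +/-5% window are illustrative assumptions; +/-5% is the tolerance usually quoted for the positive ATX rails, but check the figure for your own supply. Motherboard sensor readings can also be off, so treat a flagged value as a hint to go measure, not as proof.

```python
# Nominal rail voltages. The +/-5% window is an assumption based on the
# tolerance usually quoted for positive ATX rails; adjust for your supply.
NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
TOLERANCE = 0.05


def check_rails(readings, tolerance=TOLERANCE):
    """Return {rail: (measured, low, high)} for each rail outside tolerance.

    `readings` maps a rail name such as "+5V" to a measured voltage.
    Rails not listed in NOMINAL_RAILS are ignored.
    """
    out_of_range = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS.get(rail)
        if nominal is None:
            continue
        low = nominal * (1 - tolerance)
        high = nominal * (1 + tolerance)
        if not (low <= measured <= high):
            out_of_range[rail] = (measured, low, high)
    return out_of_range


def worst_case(samples):
    """Return {rail: (lowest, highest)} seen across a list of readings dicts.

    Sag tends to show up only under load, so the extremes over a logging
    run are more telling than any single idle reading.
    """
    extremes = {}
    for readings in samples:
        for rail, measured in readings.items():
            lo, hi = extremes.get(rail, (measured, measured))
            extremes[rail] = (min(lo, measured), max(hi, measured))
    return extremes


if __name__ == "__main__":
    # Made-up sample values, purely to show the calls.
    log = [{"+5V": 5.02, "+12V": 12.10}, {"+5V": 4.70, "+12V": 11.90}]
    print(worst_case(log))
    for sample in log:
        print(check_rails(sample))
```

Taking the samples while the machine is busy (a large file copy, say) gets around the idle-BIOS limitation mentioned above.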

Dave

----- Original Message -----
> These are internal SATA drives. The case is a small form factor Dell
> Optiplex 755. Sensors tell me that the CPUs are running at 32 C and
> hddtemp tells me that /dev/sda is running at 40 C. There's only one
> drive in this machine. It's formatted for ext4. The first two drives
> to die in this machine were 400GB Hitachi Deskstar drives. The current
> drive is a Maxtor 250 GB. Not sure about firmwares. The drive is not
> part of a RAID. I'm not passing any arguments. They are not SSD. The
> read errors started appearing in December on the first drive. When I
> replaced that, the second drive started reporting errors in the first
> week. When I replaced that the third drive started reporting errors in
> about a week.
> 
> I'll look for bulging capacitors.
> 
> Thanks for the help.
> 
> 
> 
> On 01/25/2011 10:21 AM, William Chandler wrote:
> > Also note that SMART doesn't work on USB devices (last time I
> > checked, over
> > 2 years ago.) How's the cooling situation in the case? Try to make
> > sure
> > it's not overheating if possible. Are the Hard Drives fairly large?
> > What's
> > the vendor/firmware? Are they connected to a RAID array? What
> > filesystem
> > are you using? Are you passing any specific arguments? How often are
> > they
> > failing? SSD or Platter? You can also look on your machine for
> > bulging
> > capacitors -- they may be sending incorrect voltages to your devices
> > (crazy,
> > but possible.)
> >
> > On Tue, Jan 25, 2011 at 9:42 AM, Jim<jjtuttle at trilug.org> wrote:
> >
> >> Hi,
> >>
> >> I'm on my 3rd hard drive in a month. Each reported read errors and
> >> then
> >> reallocated sectors. I'm starting to suspect that this isn't
> >> coincidence
> >> and that the hard drive controller or other hardware is damaging
> >> the drives
> >> or misreporting the SMART data. Is this possible?
> >>
> >> To successfully get a replacement machine I'll have to prove that
> >> I'm not
> >> making this up. Is there any way to track down the cause?
> >>
> >> Thanks!
> >> --
> >> This message was sent to:
> >> wcchandler at gmail.com<wcchandler at gmail.com>
> >> To unsubscribe, send a blank message to trilug-leave at trilug.org
> >> from that
> >> address.
> >> TriLUG mailing list : http://www.trilug.org/mailman/listinfo/trilug
> >> Unsubscribe or edit options on the web :
> >> http://www.trilug.org/mailman/options/trilug/wcchandler%40gmail.com
> >> TriLUG FAQ :
> >> http://www.trilug.org/wiki/Frequently_Asked_Questions
> >>
> 

-- 
A wise and frugal government, ... which shall leave men free to regulate their own pursuits of industry and improvement, and shall not take from the mouth of labor the bread it has earned - this is the sum of good government. - Thomas Jefferson, First Inaugural Address, March 4, 1801. 



