[TriLUG] debugging random reboot

Steve Litt slitt at troubleshooters.com
Tue Dec 5 11:32:39 EST 2006


It could be anything. If it were me, I'd check the power cord. Many power 
cords get loose where they plug into the computer (as opposed to where they 
plug into the wall). So loose they can almost FALL out, and certainly loose 
enough that vibration will eventually make them unplug, at least partially.

If it's loose, just for fun, replace it.

Another thing -- do you have a way of seeing whether the reboot switch (not 
the power on switch, but the little reboot switch) is dirty? Perhaps 
disconnect it from the mobo until you get this straightened out.

Just for fun, put it on a UPS. I live in Orlando, Florida, home of unreliable 
power. We regularly have 1/4 second brownouts. I've found that my boxes with 
really beefy power supplies can keep running through such events, but others 
reboot or turn off.

Speaking of rebooting or turning off -- is your BIOS set up to restart upon 
restoration of power after power loss? If so, try switching it so it doesn't 
restart, and see if the symptom becomes "computer shuts off". If so, it's a 
pretty good indication that the problem was caused by power loss.
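
A software-side cross-check, as a rough sketch only: a reboot that went 
through the OS (someone running shutdown or reboot) normally leaves a 
shutdown record in wtmp, while a power loss or a hard reset does not. The 
Python below takes the text printed by `last -x` and counts boots that have 
no shutdown record before them. It assumes the usual layout where those 
lines start with "reboot" or "shutdown". The function name and the sample 
text are made up for illustration, not taken from the box in question.

```python
def count_unclean_boots(last_output):
    """Count boots that have no shutdown record right before them.

    Assumes `last -x` style text: newest entry first, boot lines starting
    with "reboot" and clean-shutdown lines starting with "shutdown".
    """
    events = []
    for line in last_output.splitlines():
        fields = line.split()
        if fields and fields[0] in ("reboot", "shutdown"):
            events.append(fields[0])
    events.reverse()  # oldest first

    unclean = 0
    previous = None
    for event in events:
        # A boot that directly follows another boot means the system
        # went down without logging a shutdown in between.
        if event == "reboot" and previous == "reboot":
            unclean += 1
        previous = event
    return unclean


# Hypothetical sample, newest first, in the general shape `last -x` prints.
SAMPLE = """\
reboot   system boot  2.6.9-22.0.1.EL  Tue Dec  5 09:12
reboot   system boot  2.6.9-22.0.1.EL  Mon Dec  4 21:40
shutdown system down  2.6.9-22.0.1.EL  Mon Dec  4 08:01
reboot   system boot  2.6.9-22.0.1.EL  Sun Dec  3 17:30
"""

if __name__ == "__main__":
    print(count_unclean_boots(SAMPLE))
```

If every random reboot shows up as an unclean boot, that points away from 
somebody rebooting the box through the OS and toward power or hardware. The 
oldest boot in the log is not counted either way, since there is nothing 
before it to compare against.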

I've been criticized for suggesting this, but I do it all the time. I wiggle 
the mobo and cables while the thing's running to see if I can get it to 
happen. Likewise, I can use a heat gun to activate thermal intermittents, 
although of course with a computer GREAT CARE needs to be taken to avoid 
overheating a CPU or video card.

It could be the power supply. You haven't said how often this intermittent 
symptom occurs, but given the inconvenience and expense of this situation, 
you might want to prophylactically replace the power supply. Keep the old one 
and mark it with a question mark. When you eventually find the root cause, 
either mark the old one good (if it didn't cause the problem) or, if it did 
cause the problem, slam it with a hammer until unusable, because you DON'T 
want anyone putting it in another box and giving that box this intermittent.
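
Since how often it happens matters, here is a minimal sketch of a heartbeat 
logger that pins down when the box was last alive before each reboot. It 
appends a timestamp to a file at a fixed interval and forces each line to 
disk, so the last line that survives a crash is close to the moment the 
machine went down. The function names, the log path in the usage note, and 
the 60 second interval are all assumptions for illustration. It also assumes 
a reasonably current Python 3, which a stock CentOS 4.2 install won't have.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamp line and force it out to disk."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    with open(path, "a") as f:
        f.write(stamp + "\n")
        f.flush()
        os.fsync(f.fileno())  # don't leave the line sitting in a cache
    return stamp


def last_heartbeat(path):
    """Return the final timestamp line, or None if the file is missing or empty."""
    try:
        with open(path) as f:
            lines = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run(path, interval_seconds=60):
    """Report the last heartbeat from the previous run, then keep logging."""
    previous = last_heartbeat(path)
    if previous:
        print("last heartbeat before this start:", previous)
    while True:
        write_heartbeat(path)
        time.sleep(interval_seconds)
```

Start something like run("/var/tmp/heartbeat.log") at boot and leave it 
going. After a few reboots the log gives you the times of death, which you 
can line up against cron jobs, the furnace kicking on, or whatever else runs 
on a schedule.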

HTH

SteveT


On Monday 04 December 2006 22:16, Phillip Rhodes wrote:
> One of my linux boxes has recently taken to rebooting itself at random
> intervals, and I'm at my wits end trying to figure out why.  I'm hoping
> somebody here might have some suggestions.  Here's what I know / have done
> so far:
>
> 1. I've been in the house (albeit different room) when it rebooted
> and there was no power event. Additionally none of my other
> boxes are rebooting.  I think it's safe to eliminate power events
> even though the box isn't on a UPS.
>
> 2. Installed Memtest86 and booted into that.  Ran for about
> 9 hours and found no memory errors.  To my mind that
> eliminates two possibilities: memory and power-supply; as the
> box doesn't stay up 9 hours when it boots into the OS.
>
> 3. checked CPU temperature and fan speed, all look to be normal.
>
> 4. checked hard-drive with the short offline test using smartctl, found
> no problems.
>
> 5. Suspected somebody rebooting me remotely using
> some apache exploit, so I shut down port 80 traffic to the
> box, which did not help.
>
> 6.  Examined /var/log/messages, the httpd logs, jboss logs, etc.
> found nothing that looked unusual.
>
> At this point I suspect a hardware problem, but I'm not sure
> what to try next.  I think memory and power-supply are out
> as possibilities, which leaves the CPU or the motherboard as
> the most likely culprits.  Other than swapping either or both
> out for a different part, I don't really know any way to test
> that theory...  any suggestions?
>
> The other possibility might be a rootkit of some sort (this box
> is exposed to the public Internet, so anything's possible I guess).
>
> If it matters, the box is running Centos 4.2,  uname -a reports:
>
> Linux mariner 2.6.9-22.0.1.EL #1 Thu Oct 27 12:26:11 CDT 2005 i686 i686
> i386 GNU/Linux
>
>
> TTYL,
>
> Phil

-- 
Steve Litt
Author: 
   * Universal Troubleshooting Process courseware
   * Troubleshooting Techniques of the Successful Technologist
   * Manager's Guide to Technical Troubleshooting
   * Twenty Eight Tales of Troubleshooting
   * Rapid Learning: Secret Weapon of the Successful Technologist

http://www.troubleshooters.com/bookstore
http://www.troubleshooters.com/utp/tcourses.htm

(Legal Disclaimer) Follow these suggestions at your own risk.


