[TriLUG] debugging random reboot

Paul G. Szabady Paul at ThyService.com
Tue Dec 5 00:49:49 EST 2006


Phil,

I ran into a similar situation a while back on a box running auditd.  I
can't recall the specifics at this late hour (it was about 2 years ago),
but a default setting (/etc/audit/filter.conf) made it think the log
partition was full, even though there was plenty of space available.  At
random times, this would cause the system to reboot.  It took me about
2 weeks to find.  :(  I stopped running auditd to confirm the cause, then
played around with the settings.
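
For anyone who wants to check for this kind of setting, here is a rough
sketch, not the exact check I did back then.  It assumes the current audit
package, where the daemon reads /etc/audit/auditd.conf (not the filter.conf
mentioned above) and the disk-space options are space_left_action,
admin_space_left_action, disk_full_action and disk_error_action.  It flags
the "single" and "halt" actions, which take the machine down rather than
literally rebooting it, so treat it only as an analogue of the problem
described.  The function names are made up for illustration.

```python
from pathlib import Path

# Assumption: path used by the current audit package, not the older
# /etc/audit/filter.conf named in the message.
CONF_PATH = Path("/etc/audit/auditd.conf")

# auditd.conf options that decide what happens when log space runs low.
ACTION_KEYS = (
    "space_left_action",
    "admin_space_left_action",
    "disk_full_action",
    "disk_error_action",
)

# Actions that take the system down instead of just logging or rotating.
DISRUPTIVE = {"single", "halt"}


def parse_conf(text):
    """Parse 'key = value' lines, skipping blanks and '#' comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def risky_actions(settings):
    """Return the space/disk actions that are set to a disruptive value."""
    return {
        key: settings[key]
        for key in ACTION_KEYS
        if settings.get(key, "").lower() in DISRUPTIVE
    }


if __name__ == "__main__":
    try:
        found = risky_actions(parse_conf(CONF_PATH.read_text()))
    except OSError as err:
        # The file is usually readable only by root, or may not exist.
        print(f"cannot read {CONF_PATH}: {err}")
    else:
        for key, value in found.items():
            print(f"{key} = {value}")
        if not found:
            print("no disruptive disk-space actions configured")
```

Run it as root.  If it prints anything, compare the space_left and
admin_space_left thresholds in the same file against what df reports for
the log partition.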

I hope this helps!

-- 
Paul
@ Thy Service

> One of my Linux boxes has recently taken to rebooting itself at random
> intervals, and I'm at my wits' end trying to figure out why.  I'm hoping
> somebody here might have some suggestions.  Here's what I know / have done
> so far:
>
> 1. I've been in the house (albeit in a different room) when it rebooted
> and there was no power event. Additionally, none of my other
> boxes are rebooting.  I think it's safe to eliminate power events
> even though the box isn't on a UPS.
>
> 2. Installed Memtest86 and booted into that.  Ran for about
> 9 hours and found no memory errors.  To my mind that
> eliminates two possibilities, memory and power supply, since the
> box doesn't stay up 9 hours when it boots into the OS.
>
> 3. Checked CPU temperature and fan speed; both look normal.
>
> 4. Checked the hard drive with the short offline test using smartctl, found
> no problems.
>
> 5. Suspected somebody was rebooting me remotely using
> some Apache exploit, so I shut down port 80 traffic to the
> box, which did not help.
>
> 6. Examined /var/log/messages, the httpd logs, jboss logs, etc.,
> and found nothing that looked unusual.
>
> At this point I suspect a hardware problem, but I'm not sure
> what to try next.  I think memory and power-supply are out
> as possibilities, which leaves the CPU or the motherboard as
> the most likely culprits.  Other than swapping either or both
> out for a different part, I don't really know any way to test
> that theory...  any suggestions?
>
> The other possibility might be a rootkit of some sort (this box
> is exposed to the public Internet, so anything's possible I guess).
>
> If it matters, the box is running CentOS 4.2; uname -a reports:
>
> Linux mariner 2.6.9-22.0.1.EL #1 Thu Oct 27 12:26:11 CDT 2005 i686 i686
> i386 GNU/Linux
>
>
> TTYL,
>
> Phil
> --
> TriLUG mailing list        : http://www.trilug.org/mailman/listinfo/trilug
> TriLUG Organizational FAQ  : http://trilug.org/faq/
> TriLUG Member Services FAQ : http://members.trilug.org/services_faq/
>
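
One way to tell whether the reboots in the quoted message are clean
(something asked for a shutdown) or abrupt (hardware or a crash) is to walk
/var/log/messages and look for a boot that is not preceded by a shutdown
message.  The sketch below is only an illustration.  Both marker strings
are assumptions about what that syslog setup writes ("kernel: Linux
version" at boot, "exiting on signal 15" when syslogd is stopped during a
normal shutdown) and are exposed as parameters, so check them against the
real log before trusting the result.  The function name is hypothetical.

```python
import sys


def unclean_restarts(lines,
                     boot_marker="kernel: Linux version",
                     shutdown_marker="exiting on signal 15"):
    """Return boot lines with no shutdown line seen since the previous boot.

    Both marker defaults are assumptions about the log format; pass your
    own strings if your syslog daemon words things differently.
    """
    suspicious = []
    # The state before the first line is unknown, so do not flag the
    # first boot in the file.
    saw_shutdown = True
    for line in lines:
        if shutdown_marker in line:
            saw_shutdown = True
        elif boot_marker in line:
            if not saw_shutdown:
                suspicious.append(line.rstrip("\n"))
            saw_shutdown = False
    return suspicious


if __name__ == "__main__":
    # Usage: python script.py /var/log/messages
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as log:
        for boot_line in unclean_restarts(log):
            print(boot_line)
```

A boot that shows up here with nothing logged in the minutes before it
points toward hardware or a kernel crash.  A boot that never shows up here
means something requested the shutdown, which is what a misbehaving daemon
would look like.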