[TriLUG] Question - Linux Server CPR
Mike Johnson
mike at enoch.org
Wed Mar 19 11:07:59 EST 2003
Jeremy Portzer [jeremyp at pobox.com] wrote:
> Along the lines of "don't let it happen to begin with", is this type of
> thing one reason people recommend multi-processor machines for servers?
> I'm under the impression that if one process goes totally berserk and
> overcomes a processor, the rest of the processes will be assigned to the
> remaining processor and life can continue on. Is that a naive
> understanding of SMP? Does SMP really improve the situation with a
> hosed system like that?
The catch is that the CPU may not be the bottleneck at all. What if the
system is swapping (thrashing) like mad? Something I've run into before
is a bad SCSI card taking down an entire system. Processes start
blocking on I/O because the SCSI card isn't completing its part of the
deal. Any process that then needs to talk to one of the I/O-blocked
processes blocks as well. You end up in a death spiral that takes down
the system entirely, and there's nothing you can do about it.
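
While the box still responds, one way to spot that kind of pile-up is to
look for processes in uninterruptible sleep (state "D" in ps), which
usually means they are waiting on I/O. A minimal sketch, assuming the
procps version of ps; the function names filter_blocked and list_blocked
are made up for illustration:

```shell
# Keep the header line plus any row whose STAT column (field 2)
# starts with "D" (uninterruptible sleep, typically blocked on I/O).
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Hypothetical wrapper: list PID, state, kernel wait channel and
# command line for every process, then filter for state D.
list_blocked() {
    ps -eo pid,stat,wchan:32,args | filter_blocked
}
```

Running list_blocked a few times in a row and seeing the same PIDs stuck
in D is a hint that the problem is I/O rather than CPU. If the machine
is far enough gone, ps itself may hang, so this only helps early on.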
So, SMP may or may not help here; it's hard to say. You might try a
script like this, which would let you do a post-mortem if it dies again:
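    # Append a snapshot of memory, VM and process state every 10 seconds.
    while true
    do
        {
            date
            echo
            echo "Free: "
            free
            echo
            echo "VMstat: "
            # A bare vmstat only reports averages since boot; with
            # "1 2" the second row shows the last second of activity.
            vmstat 1 2
            echo
            echo "Top: "
            # Batch mode, one iteration, full command lines; the wide
            # COLUMNS keeps long command lines from being truncated.
            COLUMNS=500 top -n 1 -b -c
            echo
            echo
            echo
        } >> test.log
        sleep 10
    done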
(Yes, it's a hastily thrown together script -- the 'top' line is the
most helpful)
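
After a crash, the interesting part of the log is usually which
processes were stuck just before it stopped. Here is a rough sketch for
pulling those out of a log in the format the loop above writes. It
assumes the process state is the 8th column of top's batch output (true
of the procps layouts I've seen, but check yours) and that date prints
English weekday names; blocked_in_log is a made-up name:

```shell
# Print every top row in state D, prefixed with the timestamp of the
# sample it came from.
blocked_in_log() {
    awk '
        # Remember the most recent "date" line (assumes English weekdays).
        /^(Mon|Tue|Wed|Thu|Fri|Sat|Sun) / { stamp = $0; next }
        # Process rows start with a numeric PID; field 8 is assumed
        # to be the state column.
        $1 ~ /^[0-9]+$/ && $8 ~ /^D/      { print stamp ": " $0 }
    ' "$1"
}
```

Usage would be something like "blocked_in_log test.log | tail -50" to
see what was blocked in the last few samples.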
Mike
--
"If life hands you lemons, YOU BLOW THOSE LEMONS TO BITS WITH
YOUR LASER CANNONS!" -- Brak
GNUPG Key fingerprint = ACD2 2F2F C151 FB35 B3AF C821 89C4 DF9A 5DDD 95D1
GNUPG Key = http://www.enoch.org/mike/mike.pubkey.asc