[TriLUG] Question - Linux Server CPR

Mike Johnson mike at enoch.org
Wed Mar 19 11:07:59 EST 2003


Jeremy Portzer [jeremyp at pobox.com] wrote:

> Along the lines of "don't let it happen to begin with", is this type of
> thing one reason people recommend multi-processor machines for servers? 
> I'm under the impression that if one process goes totally berserk and
> overcomes a processor, the rest of the processes will be assigned to the
> remaining processor and life can continue on.  Is that a naive
> understanding of SMP?  Does SMP really improve the situation with a
> hosed system like that?

The problem you run into is that the CPU may not be the bottleneck at
all.  What if the system is swapping (thrashing) like mad?  Something
I've run into before is a bad SCSI card taking down an entire system.
Processes start blocking on I/O because the SCSI card isn't completing
its part of the deal.  When another process needs to talk to one that is
blocked on I/O, -it- blocks too.  You end up in a death spiral that
takes down the system entirely, and there's nothing you can do about it.
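
One way to spot this kind of pile-up while the box still responds is to
look for processes in uninterruptible sleep (state "D" in ps), which
usually means they are waiting on I/O.  A minimal sketch, assuming a
procps-style ps as found on Linux; the function name blocked_procs is
made up for illustration:

```shell
#!/bin/sh
# blocked_procs (hypothetical helper): reads "PID STAT COMMAND" lines on
# stdin and prints only those whose state starts with D (uninterruptible
# sleep, typically a process waiting on disk or other I/O).
blocked_procs() {
    awk '$2 ~ /^D/'
}

# Assumes a procps-style ps; the header line is dropped by the filter.
ps -eo pid,stat,comm | blocked_procs
```

A list that keeps growing between runs suggests processes are queueing
up behind stuck I/O rather than fighting over the CPU.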

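So SMP may or may not help here.  A second processor covers you against
a runaway CPU hog, but it does nothing for processes stuck waiting on
I/O or swap.  It's hard to say.  You might try a script like this, which
would let you do a post-mortem if it dies again: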
                # Append a snapshot of memory, VM, and process state
                # to test.log every 10 seconds.
                while true
                do
                        {
                                date
                                echo
                                echo "Free: "
                                free
                                echo
                                echo "VMstat: "
                                # The first vmstat report is an average
                                # since boot; the second covers the
                                # last second.
                                vmstat 1 2
                                echo
                                echo "Top: "
                                # Batch mode, one iteration, full
                                # command lines.
                                COLUMNS=500 top -n 1 -b -c
                                echo
                                echo
                                echo
                        } >> test.log
                        sleep 10
                done


(Yes, it's a hastily thrown together script -- the 'top' line is the
most helpful)
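
For the post-mortem itself, the interesting part is usually the last
snapshot written before the machine died.  A rough sketch for pulling it
out, assuming the log layout produced by the script above (a date line,
a blank line, then "Free: "); the function name last_snapshot is
hypothetical:

```shell
#!/bin/sh
# last_snapshot (hypothetical helper): print the final snapshot in the log.
# Assumes each snapshot starts with a date line, a blank line, and then a
# "Free: " marker line, as written by the logging loop.
last_snapshot() {
    log=${1:-test.log}
    # Line number of the last "Free: " marker in the log.
    marker=$(grep -n '^Free: ' "$log" | tail -n 1 | cut -d: -f1)
    [ -n "$marker" ] || return 1
    # The date stamp sits two lines above the marker.
    start=$((marker - 2))
    [ "$start" -ge 1 ] || start=1
    tail -n "+$start" "$log"
}
```

After a reboot, "last_snapshot test.log | less" shows what was running
in the seconds before the system went down.
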
Mike
-- 
"If life hands you lemons, YOU BLOW THOSE LEMONS TO BITS WITH 
 YOUR LASER CANNONS!" -- Brak

GNUPG Key fingerprint = ACD2 2F2F C151 FB35 B3AF  C821 89C4 DF9A 5DDD 95D1
GNUPG Key = http://www.enoch.org/mike/mike.pubkey.asc
