[TriLUG] System overload issues
Ron Kelley
rkelleyrtp at gmail.com
Fri May 24 11:44:49 EDT 2013
Adding more swap will only prolong the inevitable death. Add more RAM or get an SSD...
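For a rough sense of scale, here is a minimal sketch that converts the figures from the dmesg output quoted below into sizes. The 4 KiB page size is an assumption (it is typical on x86, but the log does not state it), and the variable names are only illustrative.

```python
# Figures copied from the dmesg output quoted below.
PAGE_SIZE_KIB = 4            # assumption: 4 KiB pages; not stated in the log
ram_pages = 2293760          # "2293760 pages of RAM"
total_swap_kib = 4192888     # "Total swap = 4192888kB"
free_swap_kib = 0            # "Free swap = 0kB"

KIB_PER_GIB = 1024 * 1024

# Convert page and kB counts into GiB, and work out how much swap is in use.
ram_gib = ram_pages * PAGE_SIZE_KIB / KIB_PER_GIB
swap_gib = total_swap_kib / KIB_PER_GIB
swap_used_pct = 100 * (total_swap_kib - free_swap_kib) / total_swap_kib

print(f"RAM:  {ram_gib:.2f} GiB")
print(f"Swap: {swap_gib:.2f} GiB, {swap_used_pct:.0f}% used")
```

Under that assumption the box has roughly 8.75 GiB of RAM and about 4 GiB of swap, all of it in use, which is consistent with the OOM killer picking off httpd.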
Thanks,
-Ron
On May 24, 2013, at 11:28 AM, Brian McCullough wrote:
> On Fri, May 24, 2013 at 09:56:56AM -0400, Bill Farrow wrote:
>> On Fri, May 24, 2013 at 9:34 AM, Brian McCullough <bdmc at buadh-brath.com> wrote:
>>> Frequently during the day, the system will become ( or the web sites will become )
>>> non-responsive for periods ranging from one minute to well over an hour.
>>
>> Have you thought about putting limits on processes to prevent them
>> from taking the system to its knees? I would start by looking at
>> ulimit. If you can prevent the system from becoming unresponsive,
>> then you can start investigating which process is going haywire and
>> hopefully fix it properly.
>
> Thank you, Bill. I hadn't thought of ulimit, since I have only used
> that to limit disk space ( if I remember correctly ) in the past.
>
>
>>> Now, other things seem to be showing failure symptoms; for instance, bzip2, which
>>> compresses the MySQL database backup, seems to take hours instead of minutes;
>>
>> How big is the MySQL dump file that is being compressed?
>
> I think it is somewhere about 7.5G; it compresses to 1.1G. I am in the
> process of unpacking one of the backups to confirm the original size.
>
>> Time how
>> long it takes when the system is running normally, and compare with
>> when the system is under load.
>>
>> time bzip2 test-backup
>>
>>
>> I'm going to second Ron Kelley's suggestion that it might be a bad
>> hard drive. Check dmesg and syslog for hard drive error messages.
>
> I just took a look at dmesg, which I haven't done for a while, and found
> something that I think is MUCH more interesting.
>
> My ( gut ) feeling has been that things are thrashing, and I see
> something at the bottom of the current dmesg that suggests that that may
> be ( part of ) the issue.
>
> What I see is:
>
>
> Swap cache: add 17613573, delete 17613356, find 25621613/26574285, race 41+1296
> Free swap = 0kB
> Total swap = 4192888kB
> Free swap: 0kB
> 2293760 pages of RAM
> 249431 reserved pages
> 311175 pages shared
> 585 pages swap cached
> Out of memory: Killed process 21911, UID 48, (httpd).
>
>
> There are more DMA statistics and CPU statistics prior to that, but the
> "Free swap: 0kB" is a red flag to me.
>
> Am I correct, and should I start by increasing swap space, or should I
> work on reducing the need for it?
>
>
> Brian