[TriLUG] Nagios: Problems with check_esx3.pl plugin
Roy Vestal
rvestal at trilug.org
Thu Nov 11 11:27:39 EST 2010
Hey all,
I found the issue.
The production system is in a cluster with a shared PERL5LIB
environment. Even though the root user and nagios user have the correct
PERL5LIB path, the Nagios daemon did not. I was making the assumption
that the daemon and the nagios user used the same environment variables.
The plugin requires Nagios::Plugin, which we didn't have installed, so
it was installed to the shared library.
Once we changed the PERL5LIB variable in /etc/profile.d/perl.sh to point
to the shared library, everything worked correctly.
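
For anyone wanting to confirm the same mismatch, here is a minimal sketch
(not part of the plugin, and only an assumption about how one might check)
that reads the environment a running process was started with from
/proc/<pid>/environ on Linux. The function names are made up for
illustration. Reading another user's process usually requires root or the
same user.

```python
from pathlib import Path


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE format used by /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" not in entry:
            continue  # skip the empty trailing entry or malformed data
        key, _, value = entry.partition(b"=")
        env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def daemon_env_var(pid: int, name: str):
    """Return the value of `name` as seen by process `pid`, or None if unset.

    Hypothetical helper: /proc/<pid>/environ holds the environment the
    process was started with, which can differ from a login shell's.
    """
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return parse_environ(raw).get(name)
```

Calling daemon_env_var(<nagios pid>, "PERL5LIB") and comparing the result
with "echo $PERL5LIB" in a shell as the nagios user should show whether the
daemon and the login shell disagree.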
Hope this helps someone,
-Roy
On 11/10/10 9:59 AM, David M. wrote:
> I was thinking somewhat differently, if he is running ESX3 (as in VMware) he
> won't be on different distros per se. Also, I don't believe there is any
> SELinux involved in ESX at all. Are the test and production machines the
> same version of ESX? Next, have you checked your Configuration tab,
> Security Profile and allowed SNMP? (I presume this check is using SNMP) If
> so have you also checked /etc/snmp/snmpd.conf and the esx firewall at
> 'esxcfg-firewall -q'?
>
> In Configuration - Security Profile:
> Incoming - SNMP Server 161 udp
> Outgoing - SNMP Server 162 udp
>
> Is your SNMP community set for public in testing and something else in
> production and have you defined it in snmpd.conf if different?
>
> The last bits of 'esxcfg-firewall -q' for us:
> Opened ports:
> ntpServer : port 123 udp.in
> OpenManageRequest : port 1311 tcp.in
> hostdSnmp : port 171 udp.in udp.out
> We don't use the script you mentioned but we do have our own custom one
> which polls for these things:
> Service Console CPU, Service Console Disk /, Service Console Disk /boot,
> Service Console Disk /var/log
>
> Any VMs we monitor with SNMP individually at the guest OS level. Have you
> considered monitoring the individual guests as an alternative?
>
> David McDowell
>
>
> 2010/11/9 Cristóbal Palmer<cmp at cmpalmer.org>
>
>
>> On Tue, Nov 9, 2010 at 4:02 PM, Roy Vestal<rvestal at trilug.org> wrote:
>>
>>> Hey guys,
>>> I'm using the op5.org check_esx3.pl plugin for nagios. I have it on a test
>>> system running correctly. However, on the production system, it's failing.
>>
>> Are the test system and the production system the same distro and
>> release? Can you do an "rpm -qa |sort" if it's an rpm-based machine
>> and then diff the output on the two? If you're on a debian derivative,
>> you can "dpkg --get-selections" and similarly diff the results. Do you
>> perhaps have selinux running on one and not the other?
>> /usr/sbin/sestatus to check. Those are possible things getting in the
>> way off the top of my head. I'm also not clear where in the chain
>> you're seeing the failure. Maybe check iptables on both?
>>
>> Cheers,
>> --
>> Cristóbal M. Palmer
>>
>>
--
------------------------------------------
Roy Vestal
http://rpp.linuxmaniac.net
.-.
/v\ L I N U X
// \\>Phear the Penguin<
/( )\
^^-^^
------------------------------------------