[TriLUG] Nagios: Problems with check_esx3.pl plugin

Thu Nov 11 11:27:39 EST 2010

Hey all,
  I found the issue.

The production system is in a cluster with a shared PERL5LIB
environment. Even though the root user and the nagios user have the
correct PERL5LIB path, the Nagios daemon did not. I had assumed that
the daemon and the nagios user used the same environment variables.
The plugin requires Nagios::Plugin, which we didn't have installed, so
it was installed to the shared library.
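
For anyone hitting the same thing, here is a minimal sketch of how to
check whether a Perl module loads under a given PERL5LIB value. The
function name perl_can_load is made up for illustration, and the
example path in the comment is a placeholder, not our real path.

```shell
#!/bin/sh
# perl_can_load MODULE [LIBPATH]
# Returns 0 if perl can load MODULE, non-zero otherwise.
# If LIBPATH is given, it is used as PERL5LIB for this one check;
# otherwise the caller's current PERL5LIB (if any) is used.
perl_can_load() {
    PERL5LIB="${2-$PERL5LIB}" perl -M"$1" -e 1 2>/dev/null
}

# Hypothetical usage (the path is a placeholder):
# perl_can_load Nagios::Plugin /path/to/shared/perl5 && echo "module found"
```

Running it once with the library path the daemon actually sees and once
with the shared one should show quickly whether the module is the
missing piece.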

Once we changed the PERL5LIB variable in /etc/profile.d/perl.sh to
point to the shared library, everything worked correctly.
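
Since the catch was that the daemon's environment differed from the
login environment, here is a rough sketch of how to look at what a
running process actually has set. It assumes Linux, where
/proc/<pid>/environ holds the environment the process started with as
NUL-separated NAME=value pairs. The function name env_var_from_file and
the process name "nagios" in the usage comment are assumptions; adjust
them for your install. Reading another user's environ file normally
requires root.

```shell
#!/bin/sh
# env_var_from_file FILE NAME
# Print the value of variable NAME from a NUL-separated environ file
# such as /proc/<pid>/environ. Prints nothing if NAME is not present.
env_var_from_file() {
    tr '\0' '\n' < "$1" | sed -n "s/^$2=//p"
}

# Hypothetical usage: compare the daemon's value with your shell's.
# pid=$(pgrep -o -x nagios)
# env_var_from_file "/proc/$pid/environ" PERL5LIB
# echo "$PERL5LIB"
```

If the two values differ, the daemon was started with a different
environment than the one you get at a login shell.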

Hope this helps someone,

-Roy

On 11/10/10 9:59 AM, David M. wrote:
> I was thinking somewhat differently: if he is running ESX3 (as in VMware) he
> won't be on different distros per se.  Also, I don't believe there is any
> SELinux involved in ESX at all.  Are the test and production machines the
> same version of ESX?  Next, have you checked your Configuration tab,
> Security Profile and allowed SNMP? (I presume this check is using SNMP)  If
> so have you also checked /etc/snmp/snmpd.conf and the esx firewall at
> 'esxcfg-firewall -q'?
>
> In Configuration - Security Profile:
> Incoming - SNMP Server 161 udp
> Outgoing - SNMP Server 162 udp
>
> Is your SNMP community set for public in testing and something else in
> production and have you defined it in snmpd.conf if different?
>
> The last bits of 'esxcfg-firewall -q' for us:
> Opened ports:
>          ntpServer           : port 123 udp.in
>          OpenManageRequest   : port 1311 tcp.in
>          hostdSnmp           : port 171 udp.in udp.out
> We don't use the script you mentioned but we do have our own custom one
> which polls for these things:
> Service Console CPU, Service Console Disk /, Service Console Disk /boot,
> Service Console Disk /var/log
>
> Any VMs we monitor with SNMP individually at the guest OS level.  Have you
> considered monitoring the individual guests as an alternative?
>
> David McDowell
>
>
> 2010/11/9 Cristóbal Palmer<cmp at cmpalmer.org>
>
>
>> On Tue, Nov 9, 2010 at 4:02 PM, Roy Vestal<rvestal at trilug.org>  wrote:
>>> Hey guys,
>>>   I'm using the op5.org check_esx3.pl plugin for nagios. I have it on a
>>> test system running correctly. However, on the production system, it's
>>> failing.
>>
>> Are the test system and the production system the same distro and
>> release? Can you do an "rpm -qa |sort" if it's an rpm-based machine
>> and then diff the output on the two? If you're on a debian derivative,
>> you can "dpkg --get-selections" and similarly diff the results. Do you
>> perhaps have selinux running on one and not the other?
>> /usr/sbin/sestatus to check. Those are possible things getting in the
>> way off the top of my head. I'm also not clear where in the chain
>> you're seeing the failure. Maybe check iptables on both?
>>
>> Cheers,
>> --
>> Cristóbal M. Palmer
>>
>>      

-- 
------------------------------------------
Roy Vestal
http://rpp.linuxmaniac.net

   .-.
   /v\    L   I   N   U   X
  // \\>Phear the Penguin<
/(   )\
  ^^-^^
------------------------------------------