[TriLUG] Jihad! (was Remote server monitoring)
William Sutton
william at trilug.org
Thu Sep 1 12:14:40 EDT 2005
I'd like to introduce a scenario and ask how OpenNMS, Nagios, Big
Brother, et al. handle it.
You have a server (or servers) where network connectivity can be spotty.
You need to track CPU usage, memory usage, active processes for up/down
status, etc.
If I understand how these systems operate, they won't be able to poll
the server (or servers) while network connectivity is down (i.e., no
knowledge of what happened during the outage, and no history for it).
Please correct me if I'm wrong.
It seems like a more sensible alternative to polling is to have separate
tools for monitoring and for data collection/reporting: place the monitors
on the servers, and let them queue up reports in the event of network
problems.
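
To make the idea concrete, here is a rough sketch of what I mean by a
monitor that queues its own reports. Everything in it is made up for
illustration (the QueueingMonitor class, the sample and send callables,
the max_queued limit); it is not how OpenNMS, Nagios, or Big Brother
work, and I'm assuming the sender raises OSError when the network is
unreachable.

```python
import time
from collections import deque


class QueueingMonitor:
    """Hypothetical local agent: samples on the server, forwards when it can."""

    def __init__(self, sample, send, max_queued=10000):
        self.sample = sample  # callable returning a dict of metrics
        self.send = send      # callable; assumed to raise OSError if the link is down
        # Bounded queue: once full, the oldest reports are dropped first.
        self.queue = deque(maxlen=max_queued)

    def tick(self, now=None):
        """Take one timestamped sample, queue it, then try to drain the queue."""
        stamp = time.time() if now is None else now
        self.queue.append({"time": stamp, **self.sample()})
        return self.flush()

    def flush(self):
        """Send queued reports oldest-first; stop at the first failure."""
        sent = 0
        while self.queue:
            try:
                self.send(self.queue[0])
            except OSError:
                break             # link is down; keep everything for later
            self.queue.popleft()  # discard only after a successful send
            sent += 1
        return sent
```

Because each report carries the time it was sampled, the collector can
rebuild the history for the outage once the backlog arrives. This sketch
keeps the queue in memory, so it survives a network outage but not a
restart of the monitor; spooling to disk would be the obvious next step.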
Thoughts? Responses?
William
On Thu, 1 Sep 2005, Tarus Balog wrote:
>
> On Sep 1, 2005, at 11:41 AM, Aaron Joyner wrote:
>
> > Did you mention that its auto detection (perhaps its only feature
> > advantage over Nagios) is notoriously prone to slaughter the network
>
> Wrong. At LinuxWorld I discovered a whole Class B, 25K active servers
> and 50K services, without "slaughtering" the network. OpenNMS is
> extremely configurable, and you can make discovery as nice or as
> vicious as you would like. OpenNMS is a tool - a powerful tool. If
> someone uses a chainsaw to tear up a couple of people, you can't just
> assume that in the hands of a lumberjack it wouldn't be a useful tool
> for cutting up trees. It would have been nice in your "notorious"
> claim to cite a reference or two. Notoriety implies experience
> outside of your own.
>
> > and it's implemented entirely in Java (and consumes resources like
> > your average java application, accordingly?)
>
> As someone who has never set up OpenNMS, this seems to be more of an
> "I hate Java" rant than anything to do with OpenNMS. I run OpenNMS on
> a small file server running various web pages and mail services and
> the system's load is rarely above 0.2. Then again, we have a huge
> system in Geneva monitoring 80K devices that is constantly busy.
>
> > Also, don't forget to point out that it doesn't understand
> > network topology, and as such will page you for services Y and Z
> > that depend on X, whenever X goes down, because you can't express
> > the dependencies.
>
> Don't you have to set up those dependencies manually? How do you do
> that on an 80K node network? Plus, in 1.3.0 or 1.3.1 we'll introduce
> Linkd, which will do both topology and mapping. At Dev-Jam 2005 the
> plan is to include AJAX in the webUI so that our maps will act more
> like Google maps. Give us until the end of the year and this argument
> goes away.
>
> >
> > Have I mentioned enough, or should we continue the holy war? :) I
> > can go on for pages...
> > Aaron S. Joyner
>
> So can I, but outside of topology it doesn't seem like you know what
> you are talking about with respect to OpenNMS. Plus, since OpenNMS
> 1.3.0 supports NRPE, almost every bit of data available to Nagios is
> available to OpenNMS.
>
> You said "[Nagios] It be the best. Tie in MRTG or better yet your
> own RRDtool back end for historical monitoring, and layer on
> smokeping for good latency measurements if you need them." Shane's
> point was that OpenNMS does service monitoring, like Nagios, latency,
> like smokeping, data collection, like MRTG, as well as event
> management and notifications like no other app out there. It does it
> without "double polling" (SNMP data can generate threshold events and
> reports without having a separate process for each) and it actually
> has a database in which network information can be stored, versus log
> files.
>
> When you use loaded language like "notoriously" and "slaughter" you
> are definitely trolling for flame.
>
> -T
>
> -----
>
> Tarus Balog
> The OpenNMS Group, Inc.
> Main : +1 919 545 2553 Fax: +1 503-961-7746
> Direct: +1 919 647 4749 Skype: tarusb
> Key Fingerprint: 8945 8521 9771 FEC9 5481 512B FECA 11D2 FD82 B45C
>
>