[op5-users] importing thoughts
Andreas Ericsson
ae at op5.se
Mon Aug 24 11:33:27 CEST 2009
Michael Hobbs wrote:
> Hi all
>
> As you can see from a few other posts I've been having issues with Merlin
> importing properly
>
> The system Nagios will be monitoring will have hosts being added/removed
> often (I'm talking 10's added or removed each day) so it's important that
> Merlin can keep up with the changes. Unfortunately at the moment Merlin is
> unable to do this and I'm scratching my head as to why.
>
Happily, I think I've discovered the reason for this.
By increasing the IPC socket pressure to ~1000 checks / minute, I found
that a lot of events were dropped in the transfer.
What probably happens when Merlin fails to import the new hosts and services
is that the pathspec event that the merlin module generates to trigger an
import is dropped, causing the daemon to not know it should run the import
program.
I'm working on boosting the performance of the reading side now, which
should alleviate the pressure quite nicely. I'll also actually implement
the binary backlog, which will help with short bursts of a huge number
of events where Merlin or the database simply chokes.
>
>
> With this in mind I thought I would look at the basics and see if I'm doing
> something fundamentally wrong
>
>
>
> When we add or remove a host we restart Nagios via a script; this script
> just reloads Nagios and then calls Merlin's import.php file
>
>
>> <snip>
>> #!/bin/bash
>> /etc/init.d/nagios reload
>> # import to merlin
>
Add a "sleep 2" here, since Nagios needs time to write its objects.cache
and status.log files properly before the import script is run.
>> php /usr/local/nagios/etc/addons/merlin/import.php --cache=/usr/local/nagios/var/objects.cache
>> </snip>
>
>
>
> What I'm wondering is should we be stopping/starting Merlin during this
> process as well instead of just doing the import?
>
>
>
> And if so when should we restart merlin
>
Merlin should preferably be running before Nagios is started, or you might
end up missing events. The correct way to restart Merlin and Nagios is therefore:
STOP nagios
STOP merlind
START merlind
START nagios
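
As a rough sketch, that order could be wrapped in a small shell function.
The init script paths are assumptions: your script shows /etc/init.d/nagios,
but /etc/init.d/merlind is only a guess at where the daemon's init script
lives on your system. The run helper and the DRY_RUN switch are made up for
illustration, so the order can be checked without touching the services.

```shell
#!/bin/bash
# Sketch only: both paths can be overridden from the environment.
# The merlind path is an assumption, adjust it to your installation.
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios}"
MERLIN_INIT="${MERLIN_INIT:-/etc/init.d/merlind}"

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop Nagios first and start it last, so that merlind is already
# running when Nagios begins generating events.
restart_monitoring() {
    # A failing stop (e.g. service already down) does not abort the restart.
    run "$NAGIOS_INIT" stop
    run "$MERLIN_INIT" stop
    # Only start Nagios if merlind came up.
    run "$MERLIN_INIT" start && run "$NAGIOS_INIT" start
}
```

Running "DRY_RUN=1 restart_monitoring" prints the four commands in order;
without DRY_RUN it executes them.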
>
> To be honest guys anything I can do to improve Merlin's import would be
> great otherwise I'll be under pressure to go back to using NDOUtils and put
> Merlin/Ninja on the backburner
>
Well, you could also start using the --status-log= option to the import
script. That will cause it to read the status.log file Nagios produces at
startup. Other than that, adding the 'sleep 2' call to your restart script
should work around the issue for now (set it higher if 2 seconds isn't
enough for your Nagios server to write the necessary files).
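
A fixed sleep is still a guess at how long Nagios needs. One possible
alternative, sketched here and not something Merlin ships, is to poll until
objects.cache is newer than a marker file created just before the reload.
The wait_for_fresh and reload_and_import helpers, the marker file and the
30 second default timeout are all made up for illustration, and the
status.log path is assumed to sit next to objects.cache.

```shell
#!/bin/bash
# wait_for_fresh FILE MARKER [TIMEOUT]
# Poll once per second until FILE is non-empty and newer than MARKER.
# Returns 0 on success, 1 if TIMEOUT seconds (default 30) pass first.
wait_for_fresh() {
    local file=$1 marker=$2 timeout=${3:-30} waited=0
    until [ -s "$file" ] && [ "$file" -nt "$marker" ]; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Reload Nagios, wait for objects.cache to be rewritten, then import.
reload_and_import() {
    local var=/usr/local/nagios/var   # status.log location is an assumption
    local marker rc=0
    marker=$(mktemp) || return 1
    # Make sure a rewrite lands in a later second than the marker,
    # in case timestamps are only compared with one-second resolution.
    sleep 1
    /etc/init.d/nagios reload
    if wait_for_fresh "$var/objects.cache" "$marker"; then
        php /usr/local/nagios/etc/addons/merlin/import.php \
            --cache="$var/objects.cache" \
            --status-log="$var/status.log" || rc=$?
    else
        echo "objects.cache was not rewritten in time" >&2
        rc=1
    fi
    rm -f "$marker"
    return "$rc"
}
```

A newer timestamp does not prove that Nagios has finished writing the file,
so keeping a short sleep after the wait may still be worthwhile.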
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.