[op5-users] Réf. : Re: Réf. : Re: Réf. : Re: Nagios start time delay with Merlin

nicolas.raspail at bnpparibas.com nicolas.raspail at bnpparibas.com
Wed Aug 26 14:45:29 CEST 2009


op5-users-bounces at lists.op5.com wrote on 26/08/2009 14:07:07:

> nicolas.raspail at bnpparibas.com wrote:
> > 
> > I will try the new beta. Is a tar file available, or must this version be
> > retrieved from the git repository?
> > 
> 
> You'll need to get it from git, I think. We still haven't gotten around
> to providing tar-balls automagically every time we tag. That should
> happen in the next few days though, as we're in release-frenzy at the
> moment and usually face a slightly calmer period the first week or so
> after a major release.
> 

I got it from
http://git.op5.org/git/?p=nagios/merlin.git;a=commitdiff;h=4ae5c2ba5bb494d60a865df8a34ec66f9f32676e


> >>>         * Nagios was restarted at timestamp 1251281934, and since
> >>> then no checks were made until timestamp 1251282809! That is 15 minutes
> >>> of update, which is a lot of time.
> >> And far, far more than we're experiencing here. Merlin is designed in
> >> such a way that it rather drops messages than interferes with the
> >> running Nagios daemon, so what you're seeing is almost certainly not
> >> a result of Merlin doing something weird.
> >>
> >> This should be alleviated by upgrading to the latest Merlin version
> >> though, since it ignores all events Nagios throws at it until Nagios
> >> has entered the main event execution loop. Under such circumstances,
> >> the startup time can't be affected at all by Merlin.
> >>
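
For reference, the gap between the two timestamps quoted above can be checked with a few lines of Python. This is only an arithmetic sketch; the constant and function names are made up for illustration, and the timestamps are the ones from the message.

```python
from datetime import datetime, timezone

# Epoch timestamps quoted in the message above.
RESTART_TS = 1251281934      # Nagios restarted
FIRST_CHECK_TS = 1251282809  # first check seen after the restart


def gap_seconds(start_ts: int, end_ts: int) -> int:
    """Return the number of seconds between two epoch timestamps."""
    return end_ts - start_ts


gap = gap_seconds(RESTART_TS, FIRST_CHECK_TS)
minutes, seconds = divmod(gap, 60)

# Print both instants in UTC so the result does not depend on the local timezone.
for label, ts in (("restart", RESTART_TS), ("first check", FIRST_CHECK_TS)):
    stamp = datetime.fromtimestamp(ts, tz=timezone.utc)
    print(f"{label:12s} {stamp:%Y-%m-%d %H:%M:%S} UTC")
print(f"gap: {gap} s ({minutes} min {seconds} s)")
```

The gap comes to 875 seconds (14 min 35 s), which matches the "15 minutes" mentioned above.
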
> > 
> > OK, but without Merlin, my Nagios starts running checks immediately. With
> > NDO there is also a delay (7-8 minutes), and the MySQL server is very busy
> > during this period. Maybe the problem lies elsewhere, but I can't find where.
> > 
> 
> With the latest Merlin, checks should start immediately with merlin too.
> 

I will compile it and give it a try.

> > 
> >>> Also, during these tests, I have seen my service and host check
> >>> latencies grow, and now, with Merlin enabled, I have a large number of
> >>> orphaned checks.
> >>> With Merlin I have the following latencies (and they are increasing as I
> >>> write this email):
> >>>
> >>> Service Check Latency:  0.00 / 1299.86 / 160.156 sec
> >>> Host Check Execution Time:      2.54 / 3.19 / 2.564 sec
> >>> Host Check Latency:     0.00 / 723.49 / 304.415 sec
> >>>
> >>> I don't have the exact values from before Merlin, but I remember that
> >>> the service latency was under 50 s and the host latency under 1 s.
> >>>
> >> Was the latency slowly increasing before, or was it totally stable?
> >> Merlin does add a small overhead to the processing of each check
> >> result, status update and a plethora of other things. If your
> >> latency was previously increasing slowly, Merlin will make it
> >> increase faster. If it was stable before, it's possible that the
> >> (very small) overhead that Merlin adds is pushing it over the
> >> limit so that the latency starts converging on infinity.
> > 
> > Before Merlin, the latency was totally stable. I understand why Merlin and
> > NDO add a small overhead, but what I'm facing is a huge overhead.
> > 
> > I have disabled Merlin and enabled NDO to compare. Here is my current
> > latency with NDO after 15-20 minutes:
> > 
> > Service Check Execution Time:   0.04 / 30.03 / 0.389 sec
> > Service Check Latency:  0.00 / 1952.96 / 300.960 sec
> > Host Check Execution Time:      2.54 / 2.69 / 2.566 sec
> > Host Check Latency:     0.00 / 17.98 / 5.105 sec
> > 
> > And the latency is still decreasing as I write my email.
> > 
> 
> Ah, I think I know what's happening now. Merlin schedules a reaper
> event once every 5 seconds, where it basically just sits and waits
> for input from the daemon. This reaper event can be dropped
> completely if no pollers or peers are configured, so I'll make
> sure to add something to that effect. This should make Merlin
> perform better than NDOUtils, since we're transferring fewer
> events and we're transferring them with a more efficient protocol.
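
To picture why a periodic blocking event could make latency climb, here is a toy model in Python. It is not Merlin's or Nagios's actual scheduler: it assumes a single serial worker, one check due every second, and a reaper that occupies the worker for a fixed time every 5 seconds. Only the 5-second period comes from the explanation above. The check cost (0.9 s), the blocking time (1.0 s) and all names are hypothetical.

```python
from typing import List, Optional


def simulate(n_checks: int, interval: float, cost: float,
             reaper_period: Optional[float] = None,
             reaper_block: float = 0.0) -> List[float]:
    """Return the latency (start time minus scheduled time) of each check.

    A single serial worker handles events in order of scheduled time.
    Checks are due every `interval` seconds and take `cost` seconds each.
    If `reaper_period` is set, an extra event occupies the worker for
    `reaper_block` seconds once per period.
    """
    # Each event is (scheduled time, duration, is_check).
    events = [(k * interval, cost, True) for k in range(n_checks)]
    if reaper_period:
        horizon = n_checks * interval
        t = reaper_period
        while t < horizon:
            events.append((t, reaper_block, False))
            t += reaper_period
    # The sort is stable, so a check due at the same time as a reaper runs first.
    events.sort(key=lambda event: event[0])

    free_at = 0.0  # time at which the worker becomes idle again
    latencies = []
    for scheduled, duration, is_check in events:
        start = max(free_at, scheduled)
        if is_check:
            latencies.append(start - scheduled)
        free_at = start + duration
    return latencies


baseline = simulate(600, interval=1.0, cost=0.9)
with_reaper = simulate(600, interval=1.0, cost=0.9,
                       reaper_period=5.0, reaper_block=1.0)
print(f"max latency without reaper: {max(baseline):.1f} s")
print(f"last latency with reaper:   {with_reaper[-1]:.1f} s")
```

In this model the worker has 5 s available per period but is asked to do 5 x 0.9 + 1.0 = 5.5 s of work once the reaper is added, so the backlog never drains and latency keeps growing. Without the reaper the same load is stable.
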

Should I test Merlin beta5, or wait for the release that includes
this change?

Right now, with NDO, my latencies seem to be stable at these values:

Service Check Execution Time:   0.04 / 30.02 / 0.372 sec
Service Check Latency:  0.00 / 1299.86 / 46.321 sec
Host Check Execution Time:      2.54 / 2.62 / 2.563 sec
Host Check Latency:     0.00 / 3.45 / 1.179 sec
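
For anyone who wants to compare the figures in this thread side by side, here is a minimal Python sketch that parses the statistics lines. It assumes the three fields are minimum / maximum / average, which is consistent with the values shown. The regular expression and the names `Stat` and `parse_stats` are illustrative, not part of Nagios, Merlin or NDO.

```python
import re
from typing import Dict, NamedTuple


class Stat(NamedTuple):
    minimum: float
    maximum: float
    average: float


# Assumed layout: "<name>:  <min> / <max> / <avg> sec"
LINE_RE = re.compile(
    r"^(?P<name>[^:]+):\s*(?P<min>[\d.]+)\s*/\s*(?P<max>[\d.]+)"
    r"\s*/\s*(?P<avg>[\d.]+)\s*sec$"
)


def parse_stats(text: str) -> Dict[str, Stat]:
    """Map each statistic name to its (minimum, maximum, average) values."""
    stats = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines that do not fit the assumed layout are skipped
            stats[match["name"].strip()] = Stat(
                float(match["min"]), float(match["max"]), float(match["avg"])
            )
    return stats


# Figures copied from this thread.
WITH_MERLIN = """\
Service Check Latency:  0.00 / 1299.86 / 160.156 sec
Host Check Execution Time:      2.54 / 3.19 / 2.564 sec
Host Check Latency:     0.00 / 723.49 / 304.415 sec
"""

WITH_NDO_STABLE = """\
Service Check Execution Time:   0.04 / 30.02 / 0.372 sec
Service Check Latency:  0.00 / 1299.86 / 46.321 sec
Host Check Execution Time:      2.54 / 2.62 / 2.563 sec
Host Check Latency:     0.00 / 3.45 / 1.179 sec
"""

merlin = parse_stats(WITH_MERLIN)
ndo = parse_stats(WITH_NDO_STABLE)
for name in ("Service Check Latency", "Host Check Latency"):
    print(f"{name}: Merlin avg {merlin[name].average:.3f} s, "
          f"NDO avg {ndo[name].average:.3f} s")
```
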

Thanks for your help

Nicolas



