[op5-users] merlin & "duration" in nagios

Russell Jennings russ at geekwhiz.com
Fri Oct 16 03:37:45 CEST 2009


So, I finally figured out that making the checks on the NOC actual passive checks gives me what I want: the service goes UNKNOWN if no data is received for n seconds. Before, I had the active check run every 5 minutes (a simple script that generated the UNKNOWN state) and had the Nagios poller's active check run every 3 minutes. At the time I figured that was an OK setup, but I've found that making them passive is a much cleaner solution.
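
For reference, a passive setup like this would look roughly like the sketch below. It is only a sketch: the host name, service description, template name, command name, and the 300 second threshold are placeholders, not values from my real config. It relies on Nagios running the check_command when a passive result goes stale, even with active checks disabled.

```
define service {
    use                     generic-service          ; assumed template name
    host_name               poller-node-01           ; hypothetical host
    service_description     Example Passive Check    ; hypothetical service
    active_checks_enabled   0                        ; no regularly scheduled active checks
    passive_checks_enabled  1                        ; accept submitted results
    check_freshness         1                        ; watch for stale results
    freshness_threshold     300                      ; the "n seconds" (placeholder value)
    check_command           passive_stale            ; only run when the result is stale
}

define command {
    command_name    passive_stale                    ; hypothetical command name
    command_line    $USER1$/check_dummy 3 "No passive result received"
}
```

The check_dummy plugin just returns the state it is given (3 is UNKNOWN), so it replaces the custom script that generated the unknown state.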

Anyway, there is one thing I'm not sure is due to Merlin or not. If a service check starts out as UNKNOWN and Merlin updates the state 4 minutes later, the duration counter keeps running from before, so the service shows a duration of OK for 4 minutes, which is clearly incorrect. This doesn't only happen when a check starts out as UNKNOWN. Back when I was doing my crazy active checks for passive results, I had the poller node offline for a spell, so those services were in a CRITICAL state for 8 days. When I brought the poller node back online, all the checks went OK, but the duration kept counting from the earlier state instead of resetting.


Thanks,
Russell

