[op5-users] Merlin: A Few Questions

Eric Schoeller eschoeller at users.sourceforge.net
Thu Jan 14 04:58:31 CET 2010


Good Evening,


I am new to the list, and have just set up my first two-node NOC/Poller/Peer
Merlin installation. I have been digging through the mailing list
archives and the wiki to try and gain a better understanding of
exactly how Merlin works ... but I still have a few questions.
Please excuse me if they're silly!

    1. For a pair of NOCs peering with each other, should both be
       using the same MySQL database? I can't seem to find any
       explanation of why or why not.

    2. Does configuration sharing work yet between NOC peers?

    3. How are notifications handled between a pair of NOCs?
       Specifically, if a notification is sent out from one
       NOC, is Merlin responsible for telling the other not
       to notify?

    4. Is there a diagram depicting how Merlin integrates with
       the rest of the Nagios core?

    5. From my understanding, two NOC peers are essentially just
       two instances of Nagios running independently of each other,
       scheduling their own checks etc. Merlin simply passes check
       results between the two? How does this interface with the
       Nagios scheduler? If checkA was run on peerB, peerB would
       submit that result to peerA ... but what if peerA is in the
       process of running checkA as well, or it's scheduled to run
       within a few nanoseconds? Can the scheduler on peerA remove
       that check from the queue since it already received a result?
       Perhaps the answer to #4 will shed light on #5!

    6. Is there a chance that an event_handler would ever get executed
       twice (on peerA and peerB) when it should have only been executed
       once (on peerA) ?

    7. Has anyone integrated DNX with Merlin successfully? I tried to
       load dnxServer.so along with running Merlin, and got some complaints:

    [1263441167] 3: Non-control packet of type 0 with zero size length (this should never happen)
    [1263441167] 7: Read 64 bytes from 172.20.0.16. protocol: 0, type: 0, len: 0
    [1263441167] 6: ipc socket isn't ready to accept data: Success
    [1263441167] 3: Unknown callback type. Weird, to say the least...
    [1263441167] 6: Data available from peer 'a-test ' (172.20.0.16)
    [1263441167] 3: recv(7, (buf + total), 1070757272, MSG_DONTWAIT | MSG_NOSIGNAL) returned -1 (Bad address)
    [1263441167] 4: Bogus read in proto_read_event(). got -1, expected 1070793228
    [1263441167] 3: read() from peer node a-test  failed: Bad address

    8. What are the popular methods for synchronizing configurations
       between Nagios hosts (if configuration sharing doesn't work)?
       Ideas that come to mind: DRBD, csync2, rsync, etc.

    9. I believe these three log lines from Merlin aren't good. Can
       someone explain briefly what they mean, and under what
       circumstances one would encounter them?

    ipc socket isn't ready to accept data: Success
    ipc is not connected
    Nulling OOB ptr 0. type: 0; offset: 0x30302e303030333b; len: 336; overshot with 3472326097204425195 bytes
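
As a side note on the log excerpts above, here is a minimal Python sketch
for picking them apart. It is not part of Merlin: parse_log_line and
as_ascii are hypothetical helper names, and reading the bracketed number as
a Unix timestamp and the number after it as a log level is an assumption
drawn only from the excerpts.

```python
import re
from datetime import datetime, timezone

# Pattern for the lines quoted above: "[<epoch seconds>] <level>: <message>".
# Treating the first number as a Unix timestamp and the second as a numeric
# log level is an assumption based on the excerpts, not on Merlin's source.
LOG_LINE = re.compile(r"^\s*\[(\d+)\]\s+(\d+):\s+(.*)$")


def parse_log_line(line):
    """Return (utc_datetime, level, message), or None if the line does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    epoch, level, message = match.groups()
    when = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    return when, int(level), message


def as_ascii(value, width=8):
    """Render an integer as `width` big-endian bytes, decoded as ASCII."""
    return value.to_bytes(width, "big").decode("ascii", errors="replace")


if __name__ == "__main__":
    sample = "    [1263441167] 3: read() from peer node a-test  failed: Bad address"
    when, level, message = parse_log_line(sample)
    print(when.isoformat(), level, message)

    offset = 0x30302E303030333B  # "offset" value from the "Nulling OOB ptr" line
    print(repr(as_ascii(offset)))  # the same eight bytes, viewed as text
    print(offset - 336)            # offset minus the logged len
```

With the values quoted above, the timestamp 1263441167 comes out as
2010-01-14 03:52:47 UTC, the offset 0x30302e303030333b decodes to the ASCII
text "00.0003;", and the offset minus the logged len (336) equals the
reported overshoot of 3472326097204425195. That may hint that text is being
read where a binary header was expected, but that is only a guess.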



Sorry for the barrage of questions. I am clearly interested in this
software <g> I appreciate your patience and any responses or
suggestions you may have.



Eric Schoeller
University of Colorado, Boulder
Information Technology Services

