[op5-users] Ninja backend and architecture
Andreas Ericsson
ae at op5.se
Tue Mar 31 09:46:45 CEST 2009
Matthias Flacke wrote:
> Hi Andreas & Johannes,
>
> thanks for your comprehensive answers! Comments and more questions
> following inline:
>
> Andreas Ericsson wrote:
>> Yes, we're using the Merlin module, which was originally designed
>> as an event transport module/daemon pair, much like NDOUtils. We
>> found that the NDO database scheme made it scale very poorly, so
>> we had to design our own. The idea is to make Nagios scale to
>> tens of thousands (or hundreds of thousands) hosts,
>
> Can you explain your decision a bit more? We all know about
> constraints and weaknesses in NDO. But was performance the only
> reason to create a proprietary DB model and leave the compatibility
> path for all addons based on NDO?
>
Not entirely. The new database model is also a lot simpler to
understand and write queries for. Those two sort of go hand in hand
though, since complex queries are often very difficult to optimize
well, and difficult to create an optimized database for.
However, the primary reason we didn't stick with NDOUtils is that
it doesn't scale linearly, and we failed to find a way to fix that.
In other words, a query against a database with 100 services may
take 1 second, but if you increase the number of services to 1000,
the same query takes somewhere around 50 or 60 seconds instead of
the expected 10. This behaviour can be seen relatively early, and
indicates that the database model just won't work in a network with
oh, let's say 50000 services.
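As a rough illustration of those figures only: if query time is assumed to follow a power law t = c * n^k (an assumption made for this sketch, not something that was measured), the two data points above give the exponent k directly. The function names below are made up for the example.

```python
import math

def scaling_exponent(n1, t1, n2, t2):
    """Exponent k of an assumed power law t = c * n**k through two points."""
    return math.log(t2 / t1) / math.log(n2 / n1)

def extrapolate(n1, t1, k, n):
    """Query time predicted by the same assumed power law at n services."""
    return t1 * (n / n1) ** k

# Figures from the text: 100 services -> 1 s, 1000 services -> 50 to 60 s.
k_low = scaling_exponent(100, 1.0, 1000, 50.0)
k_high = scaling_exponent(100, 1.0, 1000, 60.0)

# Linear scaling would give k == 1; anything above 1 grows faster than that.
print(f"exponent between {k_low:.2f} and {k_high:.2f}")
print(f"linear estimate, 50000 services: {extrapolate(100, 1.0, 1.0, 50000):.0f} s")
print(f"power-law estimate, 50000 services: {extrapolate(100, 1.0, k_low, 50000):.0f} s")
```

The extrapolated number is only as good as the power-law assumption; the point is that an exponent well above 1 shows up already between 100 and 1000 services.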
> which becomes
>> rather simple since the Merlin module allows events to be sent not
>> only from the Nagios daemon on the same host, but also over the
>> network. The protocol is open, so other applications can use it
>> to transfer data to a merlin daemon as well.
>
> Is there some specification for this protocol (beside the C header
> files ;))?
Nope. I'm afraid not. It's simple enough though, so I might as well
write it up here.
Each packet can be a maximum of 32768 bytes. It consists of a 64-byte
header and a body that can be anywhere from 0 to 32704 bytes.
The header has the following fields:
  protocol:  2 bytes
  pkt_type:  2 bytes
  selection: 2 bytes (only used for module -> daemon communication)
  body_len:  4 bytes
  sent:      8 bytes (struct timeval)
  padding to 64 bytes for future protocol extensions.
Note that since the protocol bytes come first it's possible to
extend the packet header beyond 64 bytes in the future, although
that would require a rewrite of the event transmission functions.
The body is free-form and should be determined by the "pkt_type"
field. Each merlin-instance is expected to know how to handle
the packets it receives, although any merlin daemon can forward
packets it doesn't understand as well.
If pkt_type is 0xffff, it's a merlin control packet. In such
cases, the "body_len" field works as a flag to indicate the type of
control we're sending. It's possible this will get extended
in the future.
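The handling rules above (control packets, known types, forwarding of unknown types) could be sketched like this in Python. The function name, the return values and the idea of a "known_types" set are hypothetical and only serve the illustration; no actual control codes are assumed.

```python
CTRL_PACKET = 0xFFFF  # pkt_type value the text reserves for control packets

def classify(pkt_type, body_len, known_types):
    """Decide what to do with a received header (hypothetical helper)."""
    if pkt_type == CTRL_PACKET:
        # For control packets the length field is read as a control code.
        return ("control", body_len)
    if pkt_type in known_types:
        # This instance knows how to interpret the free-form body.
        return ("handle", pkt_type)
    # Unknown type: a daemon can still pass the packet on untouched.
    return ("forward", pkt_type)
```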
> From your NEB presentation in Nuremberg I understood Merlin as a
> means to exchange events between several Nagios instances. What I
> didn't get: who is the real player, Nagios or Merlin? If all the
> monitoring objects are instantiated in the Nagios instances, how do
> you achieve goals like
> - load balancing
> - redundancy / failover
> - configuration synchronization
> via Merlin? Do you hijack the Nagios scheduling to some extent?
>
Yes and no. When the merlin module receives a host-check event, it
updates the status of the host in question and, in the load-balanced
scenario, requests Nagios to reschedule the check as if that Nagios
instance had done the check and received its results right now. It
turns into a race, where one server will handle as much as it can
until latency creeps up on it, at which point the other server will
start executing checks because it hasn't received a recent enough
result from the server that normally would have done the check.
In the failover scenario, one host will simply stop doing checks
when it crashes (for whatever reason), so the remaining host will
keep doing its checks according to schedule. When the crashed host
comes back up, the backlog will be transferred to it, thereby
updating its status.
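The rescheduling idea can be shown with a toy model in Python. This is a sketch under stated assumptions, not Merlin's code: the Node class, its methods and the fixed check interval are invented for the example, and time is just an integer tick.

```python
class Node:
    """Toy model of one Nagios instance with a merlin module attached."""

    def __init__(self, name, hosts, interval):
        self.name = name
        self.interval = interval
        self.next_check = {h: 0 for h in hosts}  # host -> tick the check is due
        self.executed = []                        # (tick, host) checks run locally

    def on_peer_result(self, host, checked_at):
        # A result arrived from the peer: reschedule as if we ran the check.
        self.next_check[host] = checked_at + self.interval

    def tick(self, now, peers):
        for host, due in sorted(self.next_check.items()):
            if due <= now:
                self.executed.append((now, host))
                self.next_check[host] = now + self.interval
                for peer in peers:
                    peer.on_peer_result(host, now)

a = Node("a", ["web", "db"], interval=5)
b = Node("b", ["web", "db"], interval=5)
for now in range(20):
    a_alive = now < 10  # node a crashes after tick 9
    if a_alive:
        a.tick(now, [b])
    # While a is down, b's results cannot reach it (no backlog in this model).
    b.tick(now, [a] if a_alive else [])
```

While node a is alive it always gets there first, so b keeps pushing its own checks into the future. Once a stops sending results, b's schedule is no longer pushed back and it runs the checks itself.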
The config sync is actually turning out to be the tricky part,
although we'll likely solve that by transferring a pre-cached file
of all the objects the other node should have. For the failover
case, that's fairly straightforward (since we just send the entire
file), but for the distributed tiered case, it gets a bit trickier
since downstream instances shouldn't know about hosts they can't
monitor.
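One way to picture the tiered case is a filter that cuts the full object list down to what a downstream node may see. The Python sketch below assumes, purely for illustration, that nodes are assigned hostgroups and that host objects are plain dictionaries; neither is stated as the actual design.

```python
def objects_for_node(hosts, node_hostgroups):
    """Return only the host objects a downstream node is allowed to see.

    ASSUMPTION: visibility is decided by hostgroup membership.
    """
    allowed = set(node_hostgroups)
    return [h for h in hosts if allowed & set(h["hostgroups"])]

# Hypothetical object configuration.
all_hosts = [
    {"name": "web1", "hostgroups": ["dmz"]},
    {"name": "db1", "hostgroups": ["internal"]},
    {"name": "gw", "hostgroups": ["dmz", "internal"]},
]

# A failover peer gets everything; a downstream node gets a subset.
peer_config = list(all_hosts)
dmz_node_config = objects_for_node(all_hosts, ["dmz"])
```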
> BTW - if you combine multiple Nagios instances in terms of
> redundancy and failover, does this also include the DB backend? Or
> is the DB singular and has to be secured in the classical sense of
> clustering and HA?
>
The db is solely for status viewing purposes. You can have as many
databases as you like, and as many "view-only" servers as you like
(well, up to 65534 anyway, but I'm assuming that will suffice), since
each Merlin daemon can update a database with events received from
pretty much anywhere.
>> This'll be hard without some means of drawing stuff, but here's
>> what happens when an event is triggered inside Nagios:
>> * Nagios calls the eventbroker module part of Merlin.
>> * merlinmod creates a merlin packet from the event and
>> - transfers it to the merlin daemon if the connection is live
>> or
>> - writes it to a backlog file for later feeding to the unix
>> domain socket connecting the module to the daemon
>
> Backlog is a nice idea, since Nagios lives in the present and does not
> care for non received results after leaving the retry_intervals. But
> - is Nagios capable of dealing with such events from the past? How does
> it fit into concepts like timeperiods, flapping, reporting,
> performance metering etc.?
>
Good question. Nagios does some of it "for free" for us. Notifications,
for example, are validated by checking "is *now* inside the timeperiod?",
so it won't start sending notifications outside a contact's configured
timeperiods. For reports etc, we fully expect users to use our reporting
solution (it's simply much, much better than what's in Nagios today, or
anywhere else too for that matter), and since the database is updated
with the timestamps of the checks, those will work too.
I'm not too sure about flapping. It's possible we'll have to add flapping
checks manually to the module, or patch Nagios to make status updating a
bit more programmer friendly in terms of handling such things on the fly.
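The backlog behaviour discussed here (send if the link is up, otherwise write to a file and feed it in later, keeping the original check timestamps) might look roughly like this in Python. The class, the JSON-lines file format and the "send" callable are assumptions for the sketch; Merlin itself writes binary packets from C.

```python
import json
import os

class BacklogSender:
    """Send events over a link, or park them in a file while it is down."""

    def __init__(self, send, backlog_path):
        self.send = send                  # callable(event); raises OSError if the link is down
        self.backlog_path = backlog_path  # hypothetical file location

    def emit(self, event):
        try:
            self.send(event)
        except OSError:
            # Link is down: append the event, timestamp included, for later.
            with open(self.backlog_path, "a", encoding="utf-8") as fh:
                fh.write(json.dumps(event) + "\n")

    def replay(self):
        """Feed backlogged events to the link in their original order."""
        try:
            with open(self.backlog_path, encoding="utf-8") as fh:
                events = [json.loads(line) for line in fh if line.strip()]
        except FileNotFoundError:
            return 0
        for event in events:
            self.send(event)
        os.remove(self.backlog_path)
        return len(events)
```

Because each event carries the time of the check rather than the time of delivery, a database fed from the replay ends up with the right timestamps. A real implementation would also have to cope with the link dropping again halfway through a replay, which this sketch does not.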
>>> - Is it perhaps planned to add some kind of an API to structure the
>>> interaction between GUI and backend?
>>>
>> Yes. We have to do that since a new status map will also be included
>> in the GUI, and the people writing the status map are contracted and
>> shouldn't have to worry about database layout and things like that.
>> We're aware of possible changes that may need to be done to the db
>> layout later, and we want a stable API that everyone can use to pull
>> stuff out of the database. Per Åsberg knows more about that, as I've
>> only been working on the datafeeding part.
>
> I would appreciate getting some more info about that API which is IMO
> more important than bells and whistles on the GUI ;) Does it only
> reflect data retrieval and manipulation or will it also cover things
> like event handling, command queuing, messaging etc.?
>
The exact details are a "wait and find out" for both you and me ;-)
The (gui) API will be about status data retrieval and manipulation. It
will also cover sending commands to Nagios. I'm not sure what you mean
by event handling and messaging, since Merlin is the one that handles
the IPC parts.
> I expect to work quite a
>> bit on the API once the Merlin stuff is about 90% complete though.
>
> Good news;) One last question: is it already time to start to play
> around with Merlin? AFAIR you stated somewhere that it's not yet
> production stable...
>
You can definitely play around with it. The current "master" branch
in the git repository has been running on a test-server here for about
two weeks now, so it seems pretty stable. It doesn't have all the
functionality yet (reports-module is to be integrated into Merlin, and
not all events are handled properly yet). It's also under rather heavy
development at the moment, so some parts may be missing, and it's most
likely not too easy to install it. I'm not sure if I've added the
database schema to the repository yet, for example.
If you want to try it out and run into problems, contact me off-list
and I'll send you my msn or skype username so I can guide you a bit
more hands-on. It's quite important to us that it's easy to install
the Merlin module and the GUI stuff (which it isn't today, I'm afraid)
so feedback in that area is most welcome :)
The easiest way right now is probably to get an op5 Monitor vmware
image and then install Merlin onto that. The reason is a rather
unfortunate dependency on the monitor-gui package, which is required
to initially populate the database with the object configuration.
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.