[op5-users] Merlin crashed on me?
Andreas Ericsson
ae at op5.se
Wed Jul 1 10:56:20 CEST 2009
Frater, Greg J wrote:
>
> I never get neb.log file, should I? When I start nagios I see a console
> message that says 'Starting nagios:Logging to
> '/usr/local/nagios/merlin/logs/neb.log' but the log file never appears.
This is almost certainly a directory permissions problem: the user Nagios
runs as probably cannot create files in the logs directory. As root, try
# chmod 777 /usr/local/nagios/merlin/logs
# (restart nagios)
and it should start working.
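
To confirm that permissions are the culprit, a minimal sketch like the one
below may help. It only reports whether a directory exists and is writable
by whoever runs it, so it should be run as the user Nagios runs as (root can
write almost anywhere, which would hide the problem). The helper name
check_log_dir is made up for illustration; the path is the one from the
startup message.

```shell
# Hypothetical helper: report whether a directory exists and is writable
# by the user running this script.
check_log_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing"
    elif [ -w "$dir" ]; then
        echo "writable"
    else
        echo "not writable"
    fi
}

# Path taken from the Nagios startup message quoted above.
check_log_dir /usr/local/nagios/merlin/logs
```

If it prints "not writable" or "missing", the chmod above (or creating the
directory first) is the thing to try.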
> For that matter I don't see the binary log files 'daemon.ipc.read.bin'
> and 'daemon.ipc.write.bin' either.
>
Right. I actually think those have been removed, so it's not that
strange ;-)
> Ah, there's my crash, it dumped while I was writing this message. I
> still did not find any core dump files, I checked the places you
> suggested. I'm also not totally confident I'm doing this right, I'm
> primarily a Windows admin, Linux wannabe at best :-). I did get the
> following on the console, I think this is the backtrace, it says it is
> anyways.
> *** glibc detected *** /usr/local/nagios/merlin/merlind: free(): invalid
> size: 0x00007fffa3361ef0 ***
> ======= Backtrace: =========
> /lib64/libc.so.6[0x3613a71ce2]
> /lib64/libc.so.6(cfree+0x8c)[0x3613a7590c]
> /usr/local/nagios/merlin/merlind[0x406262]
> /usr/local/nagios/merlin/merlind[0x403a71]
> /lib64/libc.so.6(__libc_start_main+0xf4)[0x3613a1d974]
> /usr/local/nagios/merlin/merlind[0x402289]
It is a backtrace, but without symbol names resolved. In a "normal" debug
backtrace I'd get to see the function name and the line number in the
source file where the crash actually happened. Since I can't reproduce
this myself I'm still hoping I can get those from you.
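
One way to get names out of raw addresses is addr2line from GNU binutils,
where -e names the executable and -f also prints function names. The sketch
below only pulls the merlind frame addresses out of a backtrace and prints
the command to run; it assumes binutils is installed and that the merlind
binary is the exact build that crashed and has not been stripped, otherwise
the output will mostly be "??". The helper name merlind_addrs is made up
for illustration.

```shell
# Hypothetical helper: read a glibc backtrace on stdin and print the
# addresses of the frames that belong to the merlind binary.
merlind_addrs() {
    sed -n 's|.*/merlind\[\(0x[0-9a-f]*\)\]$|\1|p'
}

# The merlind frames from the backtrace quoted above.
backtrace='/usr/local/nagios/merlin/merlind[0x406262]
/usr/local/nagios/merlin/merlind[0x403a71]
/usr/local/nagios/merlin/merlind[0x402289]'

# Join the addresses onto one line and show the command to run.
addrs=$(printf '%s\n' "$backtrace" | merlind_addrs | paste -s -d ' ' -)
echo "addr2line -f -e /usr/local/nagios/merlin/merlind $addrs"
```

The libc frames are skipped on purpose, since their addresses belong to the
shared library and not to merlind.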
What does
cat /proc/sys/kernel/core*
say?
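
While you are at it, the core file size limit matters too: if it is 0, no
core file is written regardless of the kernel settings. A rough sketch for
checking both follows. Note that it reports the limit of the shell it runs
in, which is only an approximation of what merlind inherited from whatever
started it. The helper name core_status is made up for illustration.

```shell
# Hypothetical helper: summarise the core file size limit of this shell.
core_status() {
    limit=$(ulimit -c)
    if [ "$limit" = "0" ]; then
        echo "disabled"
    else
        echo "enabled ($limit)"
    fi
}

core_status

# Print the kernel's core file naming pattern when it is available.
if [ -r /proc/sys/kernel/core_pattern ]; then
    cat /proc/sys/kernel/core_pattern
fi
```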
> Hope this is useful. By the way, were you able to get your 64-bit
> system up and running?
>
Yes, we have a 64-bit system up and running now, but I still haven't
seen any crashes on it so I'm guessing we're just not exercising it
as heavily as you are. Does the crash by any chance always happen
after receiving the same type of event? Inspecting the last 10 or so
lines of daemon.log after a crash should tell you if this is so, since
it logs the event type quite a long time before it starts messing
around with free()ing any pointers.
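
Something along these lines would show those last lines. It is only a
sketch: the helper name last_lines is made up, and the daemon.log path is an
assumption (the same logs directory as neb.log), so adjust it to wherever
your merlind actually writes its log.

```shell
# Hypothetical helper: print the last N lines (default 10) of a log file.
last_lines() {
    file="$1"
    count="${2:-10}"
    if [ -r "$file" ]; then
        tail -n "$count" "$file"
    else
        echo "cannot read $file" >&2
        return 1
    fi
}

# Assumed location of daemon.log; change it to match your installation.
last_lines /usr/local/nagios/merlin/logs/daemon.log 10 || true
```

Comparing that output across two or three crashes should show whether the
same event type keeps turning up right before the crash.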
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.