[op5-users] Merlin crashed on me?

Andreas Ericsson exon at op5.com
Tue Jun 16 20:02:38 CEST 2009


Frater, Greg J wrote:
>>> Looks like the daemon died, after restarting it, I got this:
>>>
>>> [root at host1 merlin]# service merlind restart
>>> Logging to '/usr/local/nagios/merlin/logs/daemon.log'
>>> No daemon running
>>> Logging to '/usr/local/nagios/merlin/logs/daemon.log'
>>> [root at host1 merlin]# Importing objects to database merlin
>>> importing objects from /usr/local/nagios/var/objects.cache
>>> importing status from /usr/local/nagios/var/status.dat
>>> SQL query failed with the following error message;<br />
>>> Table 'merlin.hostdowntime' doesn't exist<br />
>> I'll amend that. Thanks for reporting this.
> 
> I'm getting repeated segfaults from the daemon.  It looks like it only
> runs for a short time and then crashes again.  I'm not seeing any
> symptoms other than segfault events in the log file.
> 

Hmm, that sucks.

> From /var/log/messages:
> 
> Jun 15 15:42:09 host1 kernel: merlind[3176]: segfault at
> 0000000005c31e70 rip 0000003613a78d80 rsp 00007ffff50e9988 error 4
> Jun 16 08:45:10 host1 kernel: merlind[20873]: segfault at
> 00000000205cfe90 rip 0000003613a78d80 rsp 00007fffc939ebc8 error 4
> 
> Is there some other info I can send that would help track this down?

A backtrace from the daemon would be the most useful thing, but I think
I'll need to test this extensively on a 64-bit machine myself. Plenty of
people are running Merlin on 32-bit machines without any problems
whatsoever.

You can get a backtrace by locating the core dump and running:

  gdb /path/to/merlind /path/to/corefile
  (gdb) bt

That should allow me to see *where* it crashes at least, which will
provide some insight into whether it's a 32-bit vs 64-bit issue or
whether your Nagios setup simply does something that others' don't.
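
If no core file turns up, the shell that starts the daemon may have core
dumps disabled. Here is a rough sketch of the whole procedure. It assumes
core files are named "core" or "core.<pid>" and end up in a directory you
know about (both depend on your system). newest_core, /path/to/coredir
and merlind-backtrace.txt are made-up placeholders for illustration, not
anything Merlin provides.

```shell
# Hypothetical helper: print the most recently modified core file in a
# directory (defaults to the current directory). The "core*" naming
# pattern is an assumption; adjust it to match your system.
newest_core() {
    dir="${1:-.}"
    ls -t "$dir"/core* 2>/dev/null | head -n 1
}

# Allow core dumps for processes started from this shell. An init script
# may set its own limit, so this is not guaranteed to be enough.
ulimit -c unlimited 2>/dev/null || echo "could not raise core size limit" >&2

# Placeholder paths: point these at the real directory and binary.
core="$(newest_core /path/to/coredir)"
if [ -n "$core" ]; then
    # -batch runs the given command non-interactively and then exits.
    gdb -batch -ex bt /path/to/merlind "$core" > merlind-backtrace.txt 2>&1
else
    echo "no core file found" >&2
fi
```

The ulimit line only helps for a crash that happens after the daemon has
been restarted from that same shell. The resulting text file can simply
be attached to a reply.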

/Andreas

