[op5-users] Réf. : Re: Réf. : Re: Réf. : Re: merlind crash after loosing mysql connection

Andreas Ericsson ae at op5.se
Tue Sep 1 14:59:01 CEST 2009


nicolas.raspail at bnpparibas.com wrote:
> op5-users-bounces at lists.op5.com wrote on 01/09/2009 13:46:07:
> 
>> nicolas.raspail at bnpparibas.com wrote:
>>> op5-users-bounces at lists.op5.com wrote on 31/08/2009 17:04:24:
>>>
>>>> nicolas.raspail at bnpparibas.com wrote:
>>>>> op5-users-bounces at lists.op5.com wrote on 31/08/2009 10:38:27:
>>>>>
>>> <snip> 
>>>
>>>> Ok. In that case it's not a configuration error. Can you try using the
>>>> latest git snapshot (download it directly from git for simpler updates)
>>>> and see if that solves this particular problem?
>>>>
>>>> The latest core code changes can be found in v0.6.2-beta11.
>>>>
>>>> Thanks for your reports. I really appreciate them :-)
>>>>
>>> Hi,
>>>
>>> I have just compiled and installed the tarball from the git commit 
>>> b0703a40b91d39b57d84d52ddb81a8e34933c362. I have modified the 
>>> gen-version.sh script to add 
>>> DEF_VER=v0.6.2-b0703a40b91d39b57d84d52ddb81a8e34933c362 
>>>
>>> * When I start merlind, I get the following messages
>>>
>>> [1251803549] 6: Initializing IPC socket '/bnp/apps/nagios/merlin/ipc.sock' for daemon
>>> [1251803549] 6: dbi_conn_query_null(): Failed to run [SELECT host_name, current_state, state_type FROM merlindb.host ORDER BY host_name]: no database connection. Error-code is 7
>>> [1251803549] 3: Attempting to reconnect to database
>>> [1251803549] 6: Successfully ran the previously failed query
>>> [1251803550] 6: Primed object states for 0 hosts and 14639 services
>>> [1251803550] 6: Merlin daemon v0.6.2-b0703a40b91d39b57d84d52ddb81a8e34933c362 successfully initialized
>>> [1251803550] 6: Accepting inbound connection on ipc socket
>>>
>>> * After that, a large number (41) of PHP importer processes are launched
>>>
>>> [1251803550] 6: Executing import command 'php 
>>> /bnp/apps/nagios/merlin/import.php 
>>> --nagios-cfg=/bnp/apps/nagios/etc/nagios.cfg 
>>> --cache=/bnp/apps/nagios/var/objects.cache --db-name=merlindb 
>>> --db-user=merlin --db-pass=xxx --db-host=eqd-nagios-sql' 
>>> <40 times>
>>> [1251803554] 6: Handled 86 ipc events in 4.326 seconds
>>> [1251803566] 6: Executing import command 'php 
>>> /bnp/apps/nagios/merlin/import.php 
>>> --nagios-cfg=/bnp/apps/nagios/etc/nagios.cfg 
>>> --cache=/bnp/apps/nagios/var/objects.cache --db-name=merlindb 
>>> --db-user=merlin --db-pass=xxx --db-host=eqd-nagios-sql' 
>>> [1251803566] 6: Handled 6 ipc events in 0.006 seconds
>>> [1251803566] 6: Handled 2 ipc events in 0.003 seconds
>>> [1251803566] 6: Handled 128 ipc events in 0.083 seconds
>>> [1251803566] 6: Handled 1 ipc events in 0.000 seconds
>>>
>>> I stopped merlind because the load on the server was very high.
>>> For several minutes after I stopped the merlind process I could still
>>> see php processes running, but they eventually disappeared.
>>>
>>> I could not test the reconnection after a failover of my mysql server
>>> because of this new merlind behaviour :)
>>>
>> Can you please send me the log from your eventbroker module as well
>> (the one called neb.log in the example config file)?
>>
>> What I think is happening is that your config is simply so large that
>> the import takes far too much time. I'll add a check to make sure we
>> aren't running one import while another one's working.
>>
> 
> Hi
> 
> Unfortunately, wrong permissions on the logs directory prevented nagios
> from writing neb.log. Maybe a warning in nagios.log from the module would
> be a nice feature :)
> 
> But I have corrected the permissions, enabled merlind again for a few
> minutes and then stopped it. I have attached the file to this email.
> 

Thanks. I'll dig into this tomorrow.
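
For what it's worth, the check mentioned earlier in the thread (not
starting one import while another one is still working) could look
roughly like the sketch below. This is only an illustration in Python
and not the actual merlin code; the names ImportGuard and maybe_start
are hypothetical, and the command passed in stands for whatever import
command the daemon would run.

```python
import subprocess
import sys


class ImportGuard:
    """Start an import command only if the previous one has finished.

    Hypothetical helper for illustration; not part of merlin.
    """

    def __init__(self, command):
        self.command = command  # argv list of the import command (assumed)
        self.child = None       # Popen handle of the import in flight, if any

    def is_running(self):
        # poll() returns None while the child is still alive. Once the
        # child has exited, poll() also reaps it, so no zombie is left.
        return self.child is not None and self.child.poll() is None

    def maybe_start(self):
        """Start the import unless one is already running.

        Returns True if a new import was started, False if it was skipped.
        """
        if self.is_running():
            return False
        self.child = subprocess.Popen(self.command)
        return True

    def wait(self):
        # Block until the current import (if any) has exited.
        if self.child is not None:
            self.child.wait()
```

With a guard like this, a burst of import requests arriving while one
import is still working ends up starting a single process: the first
maybe_start() call launches it and the following calls return False
until it has exited.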

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

