Xymon Mailing List Archive

Out of memory possibly due to hobbitd.chk corruption

Charles Goyard
Tue, 16 Oct 2007 11:33:42 +0200
Message-Id: <user-dfca3cb9582c@xymon.invalid>

Hi,

We had a total outage of hobbit yesterday, with the following message in
hobbitd's log:

Could not fork checkpoint child:Cannot allocate memory

The kernel was also logging the problem and started killing processes:

kernel: Out of Memory: Killed process 1341 (devmon).
kernel: Out of Memory: Killed process 1342 (devmon).
kernel: Out of Memory: Killed process 24132 (hobbitd).
kernel: Out of Memory: Killed process 1343 (devmon).
kernel: Out of Memory: Killed process 1344 (devmon).
kernel: Out of Memory: Killed process 1345 (devmon).
kernel: Out of Memory: Killed process 1345 (devmon).
kernel: Out of Memory: Killed process 1284 (pdnsd).
kernel: Out of Memory: Killed process 1286 (pdnsd).
kernel: Out of Memory: Killed process 669 (hobbitd).
kernel: Out of Memory: Killed process 827 (bbtest-net).
kernel: Out of Memory: Killed process 832 (hobbitd).
kernel: Out of Memory: Killed process 1346 (devmon).
kernel: Out of Memory: Killed process 1347 (devmon).
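
To see which processes the OOM killer hit most often, kernel lines like the ones above can be tallied with a short script. This is only a minimal sketch, assuming the log format shown above; the function name `count_oom_kills` and the `sample` list are made up for illustration and are not part of hobbit.

```python
import re
from collections import Counter

# Matches lines like: kernel: Out of Memory: Killed process 1341 (devmon).
OOM_LINE = re.compile(r"Out of Memory: Killed process (\d+) \(([^)]+)\)")


def count_oom_kills(lines):
    """Return a Counter mapping process name to the number of kill messages."""
    counts = Counter()
    for line in lines:
        match = OOM_LINE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


# A few of the lines quoted above, used as sample input.
sample = [
    "kernel: Out of Memory: Killed process 1341 (devmon).",
    "kernel: Out of Memory: Killed process 24132 (hobbitd).",
    "kernel: Out of Memory: Killed process 1284 (pdnsd).",
    "kernel: Out of Memory: Killed process 669 (hobbitd).",
]

for name, kills in count_oom_kills(sample).most_common():
    print(name, kills)
```

In practice the lines would come from the system log rather than a hard-coded list.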

We rebooted and got the same thing. We rebooted again, stopped hobbit, moved
hobbitd.chk away and started over. It works fine now (17h uptime).
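
The "move the checkpoint away" step could be scripted roughly as follows, keeping the suspect file for later inspection instead of deleting it. This is a hedged sketch: the helper name `move_checkpoint_aside`, the timestamp suffix and the example path are assumptions, and hobbit has to be stopped before running it.

```python
import shutil
import time
from pathlib import Path


def move_checkpoint_aside(checkpoint):
    """Rename the checkpoint file with a timestamp suffix.

    Returns the new path, or None if the file does not exist.
    Stop hobbit before calling this, so the file is not rewritten.
    """
    src = Path(checkpoint)
    if not src.exists():
        return None
    suffix = time.strftime("%Y%m%d-%H%M%S")
    dst = src.with_name(src.name + "." + suffix + ".bad")
    shutil.move(str(src), str(dst))
    return dst


# Hypothetical path; use wherever your installation keeps hobbitd.chk.
# move_checkpoint_aside("/path/to/hobbitd.chk")
```

With the file out of the way, hobbitd starts without the saved state, which is what the restart described above relied on.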

Henrik, I have a backup copy of the hobbitd.chk file. Is it of any
interest to you?

We run hobbit 4.2.0 with some patches on a Linux box (2.6 kernel). The
server usually uses only 20% of its memory.

Regards,


-- 
Charles Goyard - user-a6cdca7046e2@xymon.invalid - (+33) 1 45 38 01 31
Orange Business Services - online multimedia  // ingénierie