Could not fork checkpoint child:Cannot allocate memory
Yes, I can provide all the information you need. Xymon was upgraded on 15
July, and this is the memory graph:
[image: Inline image 1]
Xymon is installed on Debian 6.0.9.
This is our ulimit configuration. We didn't change it, so these should be the
default values:
root: #ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 63975
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 63975
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
xymon: $ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) 0
memory(kbytes) unlimited
locked memory(kbytes) 64
process 63975
nofiles 1024
vmemory(kbytes) unlimited
locks unlimited
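
The ulimit -a output above shows the limits of an interactive shell. A daemon inherits its limits from whatever started it, so the running xymond may differ. A minimal sketch for reading the limits of the live process from /proc/<pid>/limits, assuming a Linux kernel that exposes that file and that the PID is passed as the first argument (the helper name parse_limits is hypothetical):

```python
import re
import sys


def parse_limits(text):
    """Parse the text of /proc/<pid>/limits into {name: (soft, hard)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) >= 3:
            limits[fields[0]] = (fields[1], fields[2])
    return limits


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = sys.argv[1]  # PID of the running xymond, e.g. read from xymond.pid
    with open("/proc/%s/limits" % pid) as fh:
        for name, (soft, hard) in sorted(parse_limits(fh.read()).items()):
            print("%-28s soft=%s hard=%s" % (name, soft, hard))
```

Comparing the "Max address space" and "Max processes" rows against the shell output would show whether the daemon really runs with the defaults.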
We monitor 1200 hosts and don't add or remove many of them, maybe 20 hosts a
week:
[image: Inline image 2]
The xymond test output is:
Statistics for Xymon daemon
Version: 4.3.17
Up since 21-Jul-2014 11:37:37 (0 days, 22:00:00)
Incoming messages : 4335475
- status : 2251363
- combo : 24050
- extcombo : 104869
- page : 160
- summary : 0
- data : 1354532
- client : 92341
- notes : 0
- enable : 0
- disable : 0
- ack : 2
- config : 5968
- query : 4471
- xymondboard : 146651
- xymondlog : 341645
- drop : 3
- rename : 0
- dummy : 328
- ping : 0
- notify : 0
- schedule : 298
- download : 0
- Bogus/Timeouts : 8794
Incoming messages/sec : 52 (average last 300 seconds)
status channel messages: 2244782 (1 readers)
stachg channel messages: 10913 (1 readers)
page channel messages: 49720 (1 readers)
data channel messages: 1349077 (1 readers)
notes channel messages: 0 (0 readers)
enadis channel messages: 0 (0 readers)
client channel messages: 90465 (1 readers)
clichg channel messages: 360 (1 readers)
user channel messages: 0 (0 readers)
backfeed messages : 0
Ghost reports:
10.6.71.66 reported host
10.6.42.103 reported host 10.6.42.10
10.6.42.103 reported host 10.6.42.11
10.6.42.103 reported host 10.6.42.12
10.6.42.103 reported host 10.6.42.13
10.6.42.103 reported host 10.6.42.14
10.6.42.103 reported host 10.6.42.15
10.6.42.103 reported host 10.6.42.4
10.6.42.103 reported host 10.6.42.5
10.1.0.194 reported host ePagess_app_3
10.1.0.86 reported host IIS6_6_1
10.1.0.87 reported host IIS6_6_2
10.1.0.194 reported host main0404
Multi-source statuses
admin01:conn reported by 10.6.42.103 and 10.0.0.29
And the xymond section in tasks.cfg:
[xymond]
ENVFILE /usr/lib/xymon/server/etc/xymonserver.cfg
CMD xymond --pidfile=$XYMONSERVERLOGS/xymond.pid \
--restart=$XYMONTMP/xymond.chk \
--checkpoint-file=$XYMONTMP/xymond.chk --checkpoint-interval=600 \
--log=$XYMONSERVERLOGS/xymond.log \
--admin-senders=127.0.0.1,$XYMONSERVERIP \
--store-clientlogs=!msgs \
--maint-senders=127.0.0.1,$XYMONSERVERIP \
--www-senders=127.0.0.1,10.0.0.0/24,10.6.42.103 \
--flap-count=10 \
--flap-seconds=900
We also changed some MAX variables in xymonserver.cfg:
MAXLINE="32768"
MAXMSG_DATA="5242880"
MAXMSG_CLIENT="5242880"
MAXMSG_STATUS="5242880"
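
For scale, each MAXMSG value above equals 5 x 1024 x 1024. The sketch below prints what that amounts to under two possible unit interpretations. Which unit xymond applies to the MAXMSG_* settings is an assumption here and should be checked against the xymonserver.cfg man page for the installed version; the function name to_mib is hypothetical.

```python
MAXMSG = 5242880  # value set for MAXMSG_DATA, MAXMSG_CLIENT and MAXMSG_STATUS


def to_mib(value, unit_bytes):
    """Convert a setting to MiB, given the size of one unit in bytes."""
    return value * unit_bytes / float(1024 * 1024)


# Two candidate interpretations of the same number (the unit is an assumption).
for label, unit_bytes in (("bytes", 1), ("kB", 1024)):
    print("if the unit is %-5s -> %.0f MiB per message buffer"
          % (label, to_mib(MAXMSG, unit_bytes)))
```

The two readings differ by a factor of 1024, so it is worth confirming the unit before ruling these settings in or out as a factor in the memory growth.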
Thank you for your help.
On Mon, Jul 21, 2014 at 7:12 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid>
wrote:
On Mon, July 21, 2014 3:24 am, Raul GN wrote:

After upgrading from xymon version 4.3.12 to 4.3.17, the xymond daemon's memory
grows without any limit. After 2 or 3 days these messages appear in the logs:

2014-07-17 16:27:46 Setup complete
2014-07-18 04:16:49 Flapping detected for web.int:http - 10 changes in 868 seconds
2014-07-18 04:16:49 Flapping detected for web.int:tomcat - 10 changes in 868 seconds
2014-07-18 04:18:23 Flapping detected for web.int:http - 10 changes in 892 seconds
2014-07-18 04:18:23 Flapping detected for web.int:tomcat - 10 changes in 892 seconds
2014-07-18 18:25:44 Flapping detected for web.int:http - 10 changes in 808 seconds
2014-07-18 18:25:44 Flapping detected for web.int:tomcat - 10 changes in 808 seconds
2014-07-18 23:40:53 Could not fork checkpoint child:Cannot allocate memory
2014-07-18 23:50:54 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:00:55 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:10:56 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:20:57 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:30:58 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:40:59 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 00:51:00 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:01:01 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:11:02 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:21:03 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:31:04 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:41:05 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 01:51:06 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 02:01:07 Could not fork checkpoint child:Cannot allocate memory
2014-07-19 02:11:08 Could not fork checkpoint child:Cannot allocate memory

Does anybody know how to avoid this problem?
If I don't reboot xymon it crashes.

Hmm. If it's growing truly without limit there's something unusual going on;
I'd take the memory allocation error later on at face value.

Can you provide any additional details? Do you have an unusual workload or
ulimits on the xymon user? Or a large number of host inserts/removals? What OS
are you running?

Regards,
-jc
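
The growth described in the quoted report can also be tracked numerically rather than from a graph. A minimal sketch, assuming Linux and that the xymond PID is passed as the first argument: it samples VmSize and VmRSS from /proc/<pid>/status at the same 600-second spacing as the checkpoint interval in the configuration. The function names and the default sample count are hypothetical choices.

```python
import sys
import time


def parse_vm_fields(status_text, keys=("VmSize", "VmRSS")):
    """Extract selected memory fields (in kB) from /proc/<pid>/status text."""
    values = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in keys:
            values[name] = int(rest.split()[0])  # value precedes the "kB" unit
    return values


def sample(pid, interval=600, count=6):
    """Print a timestamped memory reading every `interval` seconds."""
    for _ in range(count):
        with open("/proc/%s/status" % pid) as fh:
            vm = parse_vm_fields(fh.read())
        print(time.strftime("%Y-%m-%d %H:%M:%S"), vm)
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    sample(sys.argv[1])
```

A VmSize that climbs steadily from one sample to the next, up to the point where the fork errors start, would support taking the allocation failure at face value.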