Xymon Mailing List Archive

xymond_filestore: crashed but still working (too well?)

3 messages in this thread

David Mills · Tue, 9 Jul 2013 18:02:21 +0000
All --

Am turning on the "xymond_filestore" worker module for the first time and have a couple of odd things (to my mind) happening:

1) The status icon in Xymon ("xymond_filestore") is red, and the details page says "Program crashed / Fatal signal caught!". Found the core file under ~xymon/server/tmp/ and pstack shows:

core '/home/hobbit/xymon/server/tmp/core' of 16895:     xymond_filestore --data --debug
 ff13ebd4 _lwp_kill (6, 0, 0, ff11e0f0, ffffffff, 6) + 8
 ff0b29f0 abort    (0, 1, 42b94, ffb04, ff1b5518, 0) + 110
 0001f170 sigsegv_handler (b, 0, ffbfc830, 1, 0, 0) + 30
 ff13b00c __sighndlr (b, 0, ffbfc830, 1f140, 0, 1) + c
 ff12f6bc call_user_handler (b, 0, 0, 0, ff302a00, ffbfc830) + 3b8
 ff12f8a4 sigacthandler (b, 0, ffbfc830, ffffffff, 0, 0) + 60
 --- called from signal handler with signal 11 (SIGSEGV) ---
 ff11f858 fwrite   (4ba93, 16d, 1, 0, ff0000, 80808080) + 8
 00013c08 update_file (ffbfd4d0, 0, 4ba93, 0, 0, ffffffff) + 78
 00014a1c main     (2c000, 4ba55, 4, 0, 0, ffbfd4d0) + 99c
 00013a28 _start   (0, 0, 0, 0, 0, 0) + 5c

 (Running v. 4.3.3 / Solaris 10, FWIW)

2) I turned on xymond_filestore because I'm writing a server-side script to analyze incoming data and assign status. I imagined I would *only* get files under $XYMONDATADIR sent from my custom client script, but instead I'm getting zillions (actually a few thousand) showing up in that dir. Can someone please explain why I'm getting all these client "data" files in this dir?

More to the point -- is there any way to restrict this dir to only the data files I want?

Thanks!

david
~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
David Mills
Systems Administrator
Northrop Grumman
(XXX) XXX-XXXX
Japheth Cleaver · Tue, 9 Jul 2013 18:48:03 -0000 (UTC)
quoted from David Mills
All --

Am turning on the "xymond_filestore" worker module for the first time and
have a couple of odd things (to my mind) happening:

1) The status icon in Xymon ("xymond_filestore") is red, and the details
page says "Program crashed / Fatal signal caught!". Found the core file
under ~xymon/server/tmp/ and pstack shows:

core '/home/hobbit/xymon/server/tmp/core' of 16895:     xymond_filestore
--data --debug
 ff13ebd4 _lwp_kill (6, 0, 0, ff11e0f0, ffffffff, 6) + 8
 ff0b29f0 abort    (0, 1, 42b94, ffb04, ff1b5518, 0) + 110
 0001f170 sigsegv_handler (b, 0, ffbfc830, 1, 0, 0) + 30
 ff13b00c __sighndlr (b, 0, ffbfc830, 1f140, 0, 1) + c
 ff12f6bc call_user_handler (b, 0, 0, 0, ff302a00, ffbfc830) + 3b8
 ff12f8a4 sigacthandler (b, 0, ffbfc830, ffffffff, 0, 0) + 60
 --- called from signal handler with signal 11 (SIGSEGV) ---
 ff11f858 fwrite   (4ba93, 16d, 1, 0, ff0000, 80808080) + 8
 00013c08 update_file (ffbfd4d0, 0, 4ba93, 0, 0, ffffffff) + 78
 00014a1c main     (2c000, 4ba55, 4, 0, 0, ffbfd4d0) + 99c
 00013a28 _start   (0, 0, 0, 0, 0, 0) + 5c

 (Running v. 4.3.3 / Solaris 10, FWIW)

Yikes. IIRC there were some changes in this code over time; I'd be curious
whether this works OK with 4.3.11.

quoted from David Mills

2) I turned on xymond_filestore because I'm writing a server-side script
to analyze incoming data and assign status. I imagined I would *only* get
files under $XYMONDATADIR sent from my custom client script, but instead
I'm getting zillions (actually a few thousand) showing up in that dir. Can
someone please explain why I'm getting all these client "data" files in
this dir?

You'll actually be getting a copy of all data messages coming through on
the channel. xymond_client takes the incoming client message and turns it
into both status *and* data messages. The status messages are what you
see as test results, but the data messages include things like parsed
vmstat/ifstat data. The reason xymond_rrd is configured to run twice (once
listening to the status channel, once to the data channel) is so that it
can pick up both feeds and make pretty RRDs out of -- say -- bandwidth in
the trends page.
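
For anyone who would rather attach a worker to the data channel than read
files, here is a minimal Python sketch of splitting the stream that
xymond_channel hands a worker on stdin. It assumes each message starts with
a header line beginning "@@" with "|"-separated fields and ends with a line
containing only "@@"; that framing, the function name and the sample header
are assumptions for illustration, so check the xymond_channel documentation
for your version before relying on it. Control messages are not handled.

```python
def split_channel_messages(lines):
    """Group an iterable of lines into (header_fields, body) tuples.

    Assumed framing (verify against your Xymon version):
      - a header line starting with "@@", fields separated by "|"
      - any number of body lines
      - a terminator line containing only "@@"
    """
    header, body = None, []
    for raw in lines:
        line = raw.rstrip("\n")
        if line == "@@":
            # Terminator: emit the message collected so far, if any.
            if header is not None:
                yield header, "\n".join(body)
            header, body = None, []
        elif header is None and line.startswith("@@"):
            # Start of a new message; keep the raw fields without
            # assigning meaning to their positions.
            header = line[2:].split("|")
        elif header is not None:
            body.append(line)


# Hypothetical sample input; the header contents are invented.
sample_stream = [
    "@@data#1/web01|1373392083.000000|10.0.0.1||web01|mycustom||\n",
    "queue_depth: 17\n",
    "@@\n",
    "@@data#2/web01|1373392084.000000|10.0.0.1||web01|vmstat||\n",
    "0 0 0 123456\n",
    "@@\n",
]
messages = list(split_channel_messages(sample_stream))
```

A real worker would pass sys.stdin instead of the sample list and decide
per message whether to act on it.
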
quoted from David Mills
More to the point -- is there any way to restrict this dir to only the
data files I want?

I think xymond_filestore's "--only=test[,test,test]" filter might work for
data messages, but I'm not certain. Another option is to use --filter= at
the xymond_channel level, but it depends on how fine-tuned your
requirements are (that option is a simple regex include).
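
To illustrate what a "simple regex include" means in practice, here is a
small Python sketch: a message is kept only if its first line matches the
expression, and everything else is dropped. It is not Xymon's code, just
the same idea under stated assumptions. The function name and the sample
first lines are invented, and Python's re module is not identical to the
regex engine Xymon uses, so test the real expression against real traffic.

```python
import re


def include_filter(messages, pattern):
    """Return only the messages whose first line matches `pattern`."""
    rx = re.compile(pattern)
    kept = []
    for msg in messages:
        first_line = msg.split("\n", 1)[0]
        if rx.search(first_line):
            kept.append(msg)
    return kept


# Hypothetical messages; the first-line layout is an assumption.
sample = [
    "data web01.example.com.vmstat\n0 0 0 123456",
    "data web01.example.com.mycustom\nqueue_depth: 17",
    "data db01.example.com.ifstat\neth0 1000 2000",
]
kept = include_filter(sample, r"\.mycustom$")
```

Because it is include-only, anything the expression does not match is
silently discarded, so an overly tight pattern loses data without any
error.
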


If the box you're on has a lot of ram, I've found using a tmpfs for the
filestore /data directory can be quite convenient. If you're doing
background analysis of data messages, it's sometimes easier to grep a dir
than create another xymond_channel listening script and wait.
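
The "grep a dir" approach can be sketched in Python roughly as below. The
function name is hypothetical, and the sketch deliberately does not assume
how xymond_filestore lays out or names files under $XYMONDATADIR: it walks
the whole tree and matches on file name, optionally on content too.

```python
import re
from pathlib import Path


def find_data_files(data_dir, name_pattern, content_pattern=None):
    """List files under data_dir whose name (and optionally content) match."""
    name_rx = re.compile(name_pattern)
    content_rx = re.compile(content_pattern) if content_pattern else None
    hits = []
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file() or not name_rx.search(path.name):
            continue
        if content_rx is not None:
            # errors="replace" avoids failing on odd bytes in a message
            text = path.read_text(errors="replace")
            if not content_rx.search(text):
                continue
        hits.append(path)
    return hits


# Example call (the directory and patterns are placeholders):
# find_data_files(os.environ["XYMONDATADIR"], r"mycustom$", r"queue_depth")
```

On a tmpfs-backed directory a scan like this stays cheap even with a few
thousand files, which is the convenience mentioned above.
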


HTH,

-jc
David Mills · Tue, 9 Jul 2013 19:10:34 +0000
*Very* helpful, JC.

Thx!

david
