
bbtest rrd file isn't getting updated anymore

5 messages in this thread

Tom Georgoulias · Thu, 27 Sep 2007 07:50:38 -0400
Yesterday afternoon my bbtest rrd file stopped getting updated with new data, even though the bbtest-net test is running fine and the latest status update is displayed under the bbtest column.  All of the other rrd files for my hobbit server are updating just fine and do not have any issues.

I added --debug to the CMD line for the rrdstatus module in hobbitlaunch.cfg, but nothing bbtest-related is reported in rrd-status.log.

The bbtest data shows up fine when I watch the status channel, which makes sense since the data is making it to the webpage.

What can I check next?  There seems to be a breakdown between the bbtest data that goes into the status channel and what rrdstatus is getting from the status channel, but I don't know how to drill into that area.

Thanks,
Tom
Henrik Størner · Thu, 27 Sep 2007 17:21:12 +0200
On Thu, Sep 27, 2007 at 07:50:38AM -0400, Tom Georgoulias wrote:
> Yesterday afternoon my bbtest rrd file stopped getting updated
Could you check the timestamp and permissions on the bbtest.rrd file?
And the rrd-status.log file.

Regards,
Henrik
Tom Georgoulias · Thu, 27 Sep 2007 11:44:23 -0400
Henrik Stoerner wrote:
> On Thu, Sep 27, 2007 at 07:50:38AM -0400, Tom Georgoulias wrote:
>> Yesterday afternoon my bbtest rrd file stopped getting updated
> Could you check the timestamp and permissions on the bbtest.rrd file?
> And the rrd-status.log file.

Clipped the hostname from my paths:

[hobbit@radm000p server]$ ls -ld ~/data/rrd/<>/bbtest.rrd
-rw-r--r--    1 hobbit   hobbit      19548 Sep 27 09:43 /home/hobbit/data/rrd/<>/bbtest.rrd

Nothing bbtest-related in rrd-status.log:
[hobbit@<> server]$ grep bbtest ~/log/rrd-status.log
[hobbit@<> log]$ ls -ld rrd-status.log
-rw-rw-r--    1 hobbit   hobbit    9518262 Sep 27 11:38 rrd-status.log
[hobbit@<> log]$


Stopped and restarted hobbit, and now I have core files in my ~/server 
dir.  Want one or something done to one?

-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:13 core.1646
-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:18 core.2711
-rw-------    1 hobbit   hobbit    4534272 Sep 27 10:58 core.31625
-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:03 core.32027
-rw-------    1 hobbit   hobbit    4632576 Sep 27 11:23 core.3812
-rw-------    1 hobbit   hobbit    4534272 Sep 27 11:28 core.5832
-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:33 core.5847
-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:07 core.624
-rw-------    1 hobbit   hobbit    4628480 Sep 27 11:38 core.6972

rrd-status and hobbitlaunch entries:

[hobbit@<> log]$ tail rrd-status.log
2007-09-27 10:54:23 Worker process died with exit code 139, terminating
2007-09-27 10:58:05 Worker process died with exit code 139, terminating
2007-09-27 11:03:01 Worker process died with exit code 139, terminating
2007-09-27 11:07:54 Worker process died with exit code 139, terminating
2007-09-27 11:13:02 Worker process died with exit code 139, terminating
2007-09-27 11:18:01 Worker process died with exit code 139, terminating
2007-09-27 11:23:09 Worker process died with exit code 139, terminating
2007-09-27 11:28:15 Worker process died with exit code 139, terminating
2007-09-27 11:33:11 Worker process died with exit code 139, terminating
2007-09-27 11:38:18 Worker process died with exit code 139, terminating
[hobbit@<> log]$ tail hobbitlaunch.log
2007-09-27 10:57:09 Setting up logfiles
2007-09-27 10:58:05 Task rrdstatus terminated, status 1
2007-09-27 11:03:01 Task rrdstatus terminated, status 1
2007-09-27 11:07:54 Task rrdstatus terminated, status 1
2007-09-27 11:13:02 Task rrdstatus terminated, status 1
2007-09-27 11:18:01 Task rrdstatus terminated, status 1
2007-09-27 11:23:09 Task rrdstatus terminated, status 1
2007-09-27 11:28:15 Task rrdstatus terminated, status 1
2007-09-27 11:33:11 Task rrdstatus terminated, status 1
2007-09-27 11:38:18 Task rrdstatus terminated, status 1
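
A note on reading the rrd-status.log lines above: by shell convention, an exit code above 128 means the process was killed by signal number (code - 128), so 139 points at signal 11, which is SIGSEGV on Linux and fits the core files. A minimal sketch of that arithmetic follows. The function name `decode_exit_code` is made up for illustration, and it assumes the logged value follows the 128+N convention; the signal name comes from the shell's own `kill -l`, so it depends on the platform.

```sh
#!/bin/sh
# Hypothetical helper (not part of Hobbit): interpret an exit code using
# the shell convention that codes above 128 mean "killed by signal N",
# where N = code - 128.
decode_exit_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        sig=$((code - 128))
        # "kill -l N" prints the name of signal N without the SIG prefix.
        echo "signal $sig ($(kill -l "$sig"))"
    else
        echo "plain exit status $code"
    fi
}

decode_exit_code 139
```

The "status 1" in hobbitlaunch.log is a different process: it appears to be the parent task exiting after it notices that its worker died.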
Henrik Størner · Thu, 27 Sep 2007 18:49:27 +0200
On Thu, Sep 27, 2007 at 11:44:23AM -0400, Tom Georgoulias wrote:
> Stopped and restarted hobbit, and now I have core files in my ~/server
> dir.  Want one or something done to one?

Seems like your rrdstatus task dies once every 5 minutes, which would
match the times that your network tests run. Could you run one of the
core files through gdb (see the "Reporting bugs" section in the Help
menu)? Also, please grab the bbtest report with

  bb 127.0.0.1 "hobbitdlog YOURHOBBITSERVER.bbtest" >bbtest.txt

and send me the bbtest.txt file (attached, please - to avoid it
getting messed up by the mail program).
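
The five-minute cadence can be checked directly from the log timestamps. A small sketch, assuming all entries fall on the same day; `crash_intervals` is a hypothetical name, and the sample lines are copied from the rrd-status.log tail earlier in the thread:

```sh
#!/bin/sh
# Hypothetical helper: read log lines of the form
#   YYYY-MM-DD HH:MM:SS message...
# on stdin and print the gap in seconds between consecutive entries.
# Assumes every entry is from the same day (no midnight rollover).
crash_intervals() {
    awk '{
        split($2, t, ":")                      # HH, MM, SS
        now = t[1] * 3600 + t[2] * 60 + t[3]   # seconds since midnight
        if (NR > 1) print now - prev
        prev = now
    }'
}

crash_intervals <<'EOF'
2007-09-27 10:58:05 Worker process died with exit code 139, terminating
2007-09-27 11:03:01 Worker process died with exit code 139, terminating
2007-09-27 11:07:54 Worker process died with exit code 139, terminating
2007-09-27 11:13:02 Worker process died with exit code 139, terminating
EOF
```

Gaps close to 300 seconds are consistent with a crash on each run of the network tests.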


Regards,
Henrik
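
For readers following along, a backtrace can usually be pulled from a core file non-interactively with gdb's batch mode. The sketch below only prints the gdb command for each core file rather than running it. The binary path `$HOME/server/bin/hobbitd_rrd` is an assumption (the thread does not say which program dumped core; `file core.N` normally names it), and `core_bt_cmd` is a hypothetical helper. The "Reporting bugs" page remains the authority on what to send.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) a gdb command that would dump a
# backtrace from one core file.
#   $1 = path to the binary that crashed (assumed; verify with "file core.N")
#   $2 = path to the core file
core_bt_cmd() {
    printf 'gdb -batch -ex bt %s %s\n' "$1" "$2"
}

# Assumed location of the RRD worker binary; adjust to your install.
BINARY=${BINARY:-$HOME/server/bin/hobbitd_rrd}

# Core file names taken from the listing earlier in the thread.
for core in core.1646 core.2711; do
    core_bt_cmd "$BINARY" "$core"
done
```

Once the printed commands look right, they can be pasted into a shell and the output redirected to a file for attaching.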
Tom Georgoulias · Thu, 27 Sep 2007 13:09:14 -0400
Henrik Stoerner wrote:
> On Thu, Sep 27, 2007 at 11:44:23AM -0400, Tom Georgoulias wrote:
>> Stopped and restarted hobbit, and now I have core files in my ~/server dir.  Want one or something done to one?
> Seems like your rrdstatus task dies once every 5 minutes, which would
> match the times that your network tests run. Could you run one of the
> core files through gdb (see the "Reporting bugs" section in the Help
> menu)? Also, please grab the bbtest report with
>
>   bb 127.0.0.1 "hobbitdlog YOURHOBBITSERVER.bbtest" >bbtest.txt
>
> and send me the bbtest.txt file (attached, please - to avoid it
> getting messed up by the mail program).

I will send you this data off-list, because there is some sensitive info exposed in the backtrace and in the http test that is causing the failure.

When I remove that http test from bb-hosts, everything is back to normal.  So I know what is breaking it.

Tom