hobbitd coredumping and purple trends
list Richard Deal
My hobbitd is core dumping every so often and, less often but still occasionally, the trends column turns purple. Looking through the Makefile, the only oddity is MAXMSG=32768, where my old BBd was set to #define MAXLINE 11264. I have core files in /tmp from hobbitd.
Logs:
more bb-display.log
2005-04-01 15:47:59 Whoops ! bb failed to send message - timeout
2005-04-01 16:02:59 Whoops ! bb failed to send message - timeout
2005-04-01 16:03:00 connect to bbd failed - Connection refused
2005-04-01 16:03:00 Whoops ! bb failed to send message - Connection failed
2005-04-01 16:03:00 connect to bbd failed - Connection refused
2005-04-01 16:03:00 Whoops ! bb failed to send message - Connection failed
2005-04-01 16:03:00 connect to bbd failed - Connection refused
2005-04-01 16:03:00 Whoops ! bb failed to send message - Connection failed
2005-04-01 16:18:05 Whoops ! bb failed to send message - timeout
2005-04-01 17:03:08 Whoops ! bb failed to send message - timeout
more hobbitd.log
2005-04-01 15:32:47 Setup complete
2005-04-01 15:32:54 Setup complete
2005-04-01 15:48:01 Setup complete
2005-04-01 16:03:01 Setup complete
2005-04-01 16:33:03 Setup complete
2005-04-01 16:48:04 Setup complete
I have a lot of these errors in larrd-data.log from various hosts.
2005-04-01 17:17:53 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/ray1.tigr.org/netstat.rrd from 172.17.10.20: expected 12 data source readings (got 16) from 1112393873:597496849:203665680:0:1400608:474490:380897:4323:190:65584910 3:2750185864:9271815:54370878:358842800:919424657:55608:57615:...
2005-04-01 17:18:15 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/akela.tigr.org/netstat.rrd from 172.17.10.87: expected 12 data source readings (got 16) from 1112393894:7278664:4601574:0:2187293:80558:15408:1028:18:3786687185:3319 9304:551592:3055134:392628802:534540232:12324:8938:...
2005-04-01 17:18:22 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/vader.tigr.org/netstat.rrd from 172.16.4.50: expected 12 data source readings (got 16) from 1112393902:844147:844153:0:173177:11681993:15774:1756237:109:2946405093: 1171800154:1508:44541250:1263968085:53592252:29:1305303:...
2005-04-01 17:18:49 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/invino.tigr.org/netstat.rrd from 172.17.10.29: expected 12 data source readings (got 16) from 1112393929:161474660:161355279:0:979032:1013326:8108:2751:26:3077107260: 3115145104:3779497608:1171327:3474031250:2366740414:176290878:15382:...
I used the moverrd.sh.
And these errors from lard-status.log:
005-04-01 17:18:10 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/IGR51RRTB.tigr.org/temperature .module_6_asic-.rrd from 172.17.10.16: illegal attempt to update using time 1112393889 when last update time is 1112393889 (minimum one second step)
2005-04-01 17:20:04 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/utah.tigr.org/disk.rrd from 172.17.10.79: illegal attempt to update using time 1112394004 when last update time is 1112394004 (minimum one second step)
2005-04-01 17:20:04 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/utah.tigr.org/disk.rrd from 172.17.10.79: illegal attempt to update using time 1112394004 when last update time is 1112394004 (minimum one second step)
2005-04-01 17:21:27 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/atlas.tigr.org/netstat.rrd from 172.17.10.80: expected 11 data source readings (got 16) from 1112394087:23501770:2904610:0:97558:26724:76:17:8:U:U:U:U:226801128:2976 62863:U:956:...
any suggestions?
Thanks
list Olivier Beau
I'm still running RC6, and I see the same behaviour: several cores in tmp/ (about a dozen per day). They seem to be from bbtest-net, but there are also bbgen cores!
I have also seen my hobbitd balk at listening on port 1984 (telnet localhost 1984 would not answer; a couple of seconds later it would).
Henrik: can these two problems be related?
Olivier
list Henrik Størner
▸
On Fri, Apr 01, 2005 at 05:22:42PM -0500, Deal, Richard wrote:
My hobbitd is core dumping every so often and less often but still occasional the trends column turns purple.
hobbitd crashing - that's bad.
Could you run the core dump through gdb and send me the call trace?
Do this:
$ gdb ~hobbit/server/bin/hobbitd /tmp/core-file-from-hobbitd
[messages from gdb]
gdb> bt
and send me the output from that "bt" command.
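When there are several core files, the same backtrace can be collected without an interactive session. The following is only a sketch: it assumes a gdb that supports the -batch and -ex options, and save_backtraces, the GDB override variable and the ".bt" output suffix are hypothetical names introduced here, not part of Hobbit.

```shell
# Hypothetical helper: write the "bt" output for each core file to <core>.bt.
# GDB may be overridden (e.g. for a dry run); it defaults to the gdb on PATH.
save_backtraces() {
    binary=$1
    shift
    for core in "$@"; do
        # Skip arguments that are not files (e.g. an unmatched glob).
        [ -f "$core" ] || continue
        "${GDB:-gdb}" -batch -ex bt "$binary" "$core" > "$core.bt" 2>&1
        echo "wrote $core.bt"
    done
}
```

A call such as save_backtraces ~hobbit/server/bin/hobbitd /tmp/core* would then leave one text file per core, ready to be mailed to the list.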
▸
Looking through the makefile the only oddity is MAXMSG=32768 Were my old BBd was set to #define MAXLINE 11264
Shouldn't cause any problems, it just means Hobbit will accept larger messages than your BB setup.
▸
more bb-display.log
2005-04-01 15:47:59 Whoops ! bb failed to send message - timeout
2005-04-01 16:02:59 Whoops ! bb failed to send message - timeout
2005-04-01 16:03:00 connect to bbd failed - Connection refused
Probably a result of hobbitd being down.
▸
I have a lot of these errors in larrd-data.log from various hosts. 2005-04-01 17:17:53 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/ray1.tigr.org/netstat.rrd from 172.17.10.20: expected 12 data source readings (got 16) from
The "netstat" and "vmstat" RRD files from LARRD are not compatible with Hobbit. Do a
$ find ~hobbit/data/rrd -name netstat.rrd | xargs rm -f
to delete the old files.
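Since the deletion cannot be undone, it may be worth listing the matching files before removing them. A minimal sketch, assuming the RRD directory layout shown in the thread; purge_rrd and its "delete" argument are hypothetical names, not Hobbit tools.

```shell
# Hypothetical helper: list matching RRD files by default (dry run);
# remove them only when the third argument is "delete".
purge_rrd() {
    dir=$1
    name=$2
    mode=${3:-list}
    if [ "$mode" = delete ]; then
        find "$dir" -type f -name "$name" -exec rm -f {} +
    else
        find "$dir" -type f -name "$name" -print
    fi
}
```

For example, purge_rrd ~hobbit/data/rrd netstat.rrd prints the candidates, and repeating the call with delete appended removes them. Because the file name is a parameter, the same call can be used for vmstat.rrd.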
▸
005-04-01 17:18:10 RRD error updating /local/packages/IT/HOBBIT/hobbit/data/rrd/IGR51RRTB.tigr.org/temperature .module_6_asic-.rrd from 172.17.10.16: illegal attempt to update using time 1112393889 when last update time is 1112393889 (minimum one second step)
This is a bit more tricky. It means that the same RRD file was being updated by two status messages within one second - that normally should not happen, because a status is sent every 5 minutes. It can happen if you have two hosts reporting the same hostname (one of them would be the 172.17.10.16 IP you have in that error message).
Regards,
Henrik
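To see which RRD files are affected and which sender IP each rejected update came from, the log can be summarised. This is a rough sketch that assumes the log lines have the layout quoted in this thread; same_second_updates is a hypothetical name.

```shell
# Hypothetical helper: count "same second" RRD update errors per file and sender IP.
# Assumes the line layout "... updating <file> from <ip>: illegal attempt ...".
same_second_updates() {
    awk '/illegal attempt to update using time/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "updating") file = $(i + 1)
            if ($i == "from")     ip = $(i + 1)
        }
        sub(/:$/, "", ip)          # drop the colon that follows the IP
        count[file " " ip]++
    }
    END { for (k in count) print count[k], k }' "$@" | sort -rn
}
```

Running same_second_updates on the status log prints one line per file and IP pair, most frequent first. A file that shows up with more than one IP would be a candidate for the "two hosts reporting the same hostname" case.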
list Henrik Størner
▸
On Sat, Apr 02, 2005 at 01:13:00AM +0200, user-fe6e0e6a0d05@xymon.invalid wrote:
i'm still running RC6, and i have the same behaviour : serveral cores in tmp/ (about a dozen per day) they seem to be bbtest-net, but also bbgen cores !
I'd like to see call-traces from those core files:
cd ~hobbit/server
gdb bin/bbgen tmp/core-from-bbgen
[messages from gdb starting up]
gdb> bt
and send me the output.
i have also seem my hobbitd bark to listen to port 1984... (telnet localhost 1984 would not answer; couple seconds after it would...) henrik : can these 2 problems be related ?
Perhaps ... but I wouldn't expect them to be, unless it was hobbitd that crashed.
Henrik
list Terry Barnes
I experienced the same thing after making some changes to hobbit. It might be a longshot, but here is what caused it for me. After restarting hobbit and getting the same errors as you, I found that some hobbit processes were hung. When I stopped hobbit, I could see that most processes were still running. Even after multiple attempts to do a ~/server/hobbit.sh stop, the processes continued to run. I killed those processes and restarted hobbit - problem solved. Like I say, it could be a longshot, but it is worth a look.
Terry Barnes
Siemens Com @ HFHS
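A quick way to check for leftovers after a stop is to search the process table. The sketch below is an assumption-laden example: leftover_procs is a hypothetical name, and the pattern in the usage note is only a guess based on the program names mentioned in this thread.

```shell
# Hypothetical helper: print processes whose command line matches a pattern.
# The process table is captured first, so the grep below never lists itself.
leftover_procs() {
    snapshot=$(ps -e -o pid= -o args=)
    printf '%s\n' "$snapshot" | grep -E -- "$1"
}
```

After ~/server/hobbit.sh stop, something like leftover_procs 'hobbit|bbgen|bbtest-net' should print nothing; any PIDs it does print are the processes that would need to be killed by hand before restarting.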
list Henrik Størner
▸
In <user-b51923391996@xymon.invalid> "Terry Barnes" <user-0e29285d9a67@xymon.invalid> writes:
I experienced same thing after making some changes to hobbit - might be a longshot, but here is what caused this for me.
I think I've found the cause for this particular problem. A usable work-around is to add the "--no-meta" option to the bb-larrdcolumn command in hobbitlaunch.cfg.
Regards,
Henrik
list Richard Deal
--no-meta fixed the purple trends issue and the patch fixed the core dumps. Thanks
▸
-----Original Message-----
From: Henrik Storner [mailto:user-ce4a2c883f75@xymon.invalid]
Sent: Sunday, April 03, 2005 3:54 PM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] hobbitd coredumping and purple trends
In <user-b51923391996@xymon.invalid> "Terry Barnes" <user-0e29285d9a67@xymon.invalid> writes:
I experienced same thing after making some changes to hobbit - might be a longshot, but here is what caused this for me.
I think I've found the cause for this particular problem. A usable work-around is to add the "--no-meta" option to the bb-larrdcolumn command in hobbitlaunch.cfg.
Regards,
Henrik