Xymon Mailing List Archive

xymond core dumping, not restarting correctly

4 messages in this thread

list Sean Clark · Wed, 13 Jun 2012 09:03:14 -0400 ·
I've had xymond core dump on me twice now since updating to 4.3.7

It's dumping with:

#3  0x080632b3 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  0x006a46f9 in strcasecmp () from /lib/libc.so.6
#6  0x08067215 in binsearch (mytree=0x9509038, key=0x322e3432 <Address 0x322e3432 out of bounds>) at tree.c:47
#7  0x080673e8 in xtreeDelete (treehandle=0x9509038, key=0x322e3432 <Address 0x322e3432 out of bounds>) at tree.c:253
#8  0x0804c652 in clear_cookie (log=0x233ced58) at xymond.c:1163
#9  0x0804c704 in find_cookie (cookie=0xbfc30c02 "546260") at xymond.c:1183

And

#3  0x080632b3 in sigsegv_handler (signum=11) at sig.c:57
#4  <signal handler called>
#5  0x006a46f9 in strcasecmp () from /lib/libc.so.6
#6  0x08067215 in binsearch (mytree=0x8773038, key=0x10 <Address 0x10 out of bounds>) at tree.c:47
#7  0x080673e8 in xtreeDelete (treehandle=0x8773038, key=0x10 <Address 0x10 out of bounds>) at tree.c:253
#8  0x0804c652 in clear_cookie (log=0x1ca5cbe8) at xymond.c:1163
#9  0x0804c704 in find_cookie (cookie=0xbf905e72 "97097") at xymond.c:1183
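
A side note on the two `key` values in the frames above, offered as a rough sketch rather than a diagnosis. Assuming a 32-bit little-endian (i386) build, which the 32-bit addresses and `/lib/libc.so.6` suggest but do not prove, the bytes of each bogus pointer can be printed as text to see whether they look like string data. The helper name `pointer_as_ascii` is made up for this illustration.

```python
import struct


def pointer_as_ascii(value: int) -> str:
    """Show the four bytes a 32-bit pointer occupies in memory as text.

    Assumption: little-endian byte order (i386). Non-printable bytes
    are shown as '?'.
    """
    raw = struct.pack("<I", value)
    return "".join(chr(b) if 32 <= b < 127 else "?" for b in raw)


# The two key values reported by gdb in the backtraces.
for key in (0x322E3432, 0x10):
    print(f"{key:#010x} -> {pointer_as_ascii(key)!r}")
```

Under that assumption the first value reads as the characters `24.2` and the second is not printable text at all. One possible reading is that the first key pointer was overwritten with string data, but the values alone cannot confirm that.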


I've tried it on two machines to rule out any weird hardware/RAM issues, and it still happens. When it crashes, a stale xymond process stays behind, which prevents the automatic restart from binding to port 1984. I then have to go in, kill everything, and remove the .chk file (it still contains the out-of-bounds status message).

Makes for an irritating time. A different host/test combination caused each crash.
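
For anyone wanting to detect the stale-process situation before cleaning up by hand, here is a minimal sketch that checks whether something is still accepting connections on the port. The function name `port_is_bound` and the default host `127.0.0.1` are assumptions for illustration; only port 1984 comes from the description above.

```python
import socket


def port_is_bound(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    XYMON_PORT = 1984  # port mentioned in the report above
    state = "still bound (stale xymond?)" if port_is_bound(XYMON_PORT) else "free"
    print(f"port {XYMON_PORT} is {state}")
```

This only shows that some process is listening on the port. It does not identify which process holds it, so it would need to be combined with a process listing before killing anything.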

Any ideas on what to do next? Is it just something like max_msg that is causing an Out of Bounds crash?


--

Sean Clark
Sr. Engineer, Software
ATG Network Operations & Planning, Integrated Regional OSS <http://www.twcable.com/DepartmentOverview/AdvancedTechnologyGroup/ATG/NOP/OSS/Network.aspx>
user-2db5fbcae9a7@xymon.invalid · AIM: devaudio
Cell: (XXX) XXX-XXXX

This E-mail and any of its attachments may contain Time Warner Cable proprietary information, which is privileged, confidential, or subject to copyright belonging to Time Warner Cable. This E-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this E-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in relation to the contents of and attachments to this E-mail is strictly prohibited and may be unlawful. If you have received this E-mail in error, please notify the sender immediately and permanently delete the original and any copy of this E-mail and any printout.
list Jeremy Laidman · Fri, 15 Jun 2012 15:56:44 +1000 ·
Could be a bad link to the libredblack library?  Did you "make clean"
before you built Xymon?

list Sean Clark · Tue, 19 Jun 2012 09:28:21 -0400 ·
Yeah, and it has been crashing since 4.3.3. I updated to 4.3.7 and it still crashes about every 7-15 days, with a different host/test combo each time.

Oddly, it happens on only 1 of the 8 Xymon instances I have (I have tried a completely different server for that same instance), so now I am investigating my configs.
list Jeremy Laidman · Thu, 28 Jun 2012 15:17:48 +1000 ·
Are you acknowledging any alerts when it crashes?  Seems to be related to
ack cookie handling.