xymond core dumping, not restarting correctly
list Sean Clark
I've had xymond core dump on me twice now since updating to 4.3.7. It's dumping with:

#3 0x080632b3 in sigsegv_handler (signum=11) at sig.c:57
#4 <signal handler called>
#5 0x006a46f9 in strcasecmp () from /lib/libc.so.6
#6 0x08067215 in binsearch (mytree=0x9509038, key=0x322e3432 <Address 0x322e3432 out of bounds>) at tree.c:47
#7 0x080673e8 in xtreeDelete (treehandle=0x9509038, key=0x322e3432 <Address 0x322e3432 out of bounds>) at tree.c:253
#8 0x0804c652 in clear_cookie (log=0x233ced58) at xymond.c:1163
#9 0x0804c704 in find_cookie (cookie=0xbfc30c02 "546260") at xymond.c:1183

And:

#3 0x080632b3 in sigsegv_handler (signum=11) at sig.c:57
#4 <signal handler called>
#5 0x006a46f9 in strcasecmp () from /lib/libc.so.6
#6 0x08067215 in binsearch (mytree=0x8773038, key=0x10 <Address 0x10 out of bounds>) at tree.c:47
#7 0x080673e8 in xtreeDelete (treehandle=0x8773038, key=0x10 <Address 0x10 out of bounds>) at tree.c:253
#8 0x0804c652 in clear_cookie (log=0x1ca5cbe8) at xymond.c:1163
#9 0x0804c704 in find_cookie (cookie=0xbf905e72 "97097") at xymond.c:1183

I've tried it on two machines, to rule out any weird hardware/RAM issues, and it still happens. When it crashes, a stale xymond stays behind, which prevents it from auto-restarting and binding to port 1984, so I have to go in, kill everything, and remove the .chk file (as it still has that out-of-bounds status message in it). Makes for an irritating time. The two crashes were caused by different host/test combinations.

Any ideas on what to do next? Is it just something like max_msg that is causing an Out of Bounds crash?

--
Sean Clark
Sr. Engineer, Software
ATG Network Operations & Planning
Integrated Regional OSS <http://www.twcable.com/DepartmentOverview/AdvancedTechnologyGroup/ATG/NOP/OSS/Network.aspx>
user-2db5fbcae9a7@xymon.invalid
devaudio <aim://devaudio>
Cell: (XXX) XXX-XXXX

This E-mail and any of its attachments may contain Time Warner Cable proprietary information, which is privileged, confidential, or subject to copyright belonging to Time Warner Cable. This E-mail is intended solely for the use of the individual or entity to which it is addressed. If you are not the intended recipient of this E-mail, you are hereby notified that any dissemination, distribution, copying, or action taken in relation to the contents of and attachments to this E-mail is strictly prohibited and may be unlawful. If you have received this E-mail in error, please notify the sender immediately and permanently delete the original and any copy of this E-mail and any printout.
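For the "stale xymond still holds port 1984" symptom described in this message, a quick check before restarting might look like the sketch below. It is only an illustration: the helper name `port_in_use` is made up here, and it merely tests whether something accepts TCP connections on the port. It does not identify which process owns the socket.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=2.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration; not part of Xymon.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return s.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 1984 is the xymond port mentioned in the message above.
    if port_in_use(1984):
        print("Port 1984 is still held; a stale xymond may be running.")
    else:
        print("Port 1984 is free.")
```

If the port is still held after a crash, the leftover process has to be stopped before a new xymond can bind to it.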
list Jeremy Laidman
Could be a bad link to the libredblack library? Did you "make clean" before you built Xymon?
On Wed, Jun 13, 2012 at 11:03 PM, Clark, Sean <user-2db5fbcae9a7@xymon.invalid> wrote:
I've had xymond core dump on me twice now since updating to 4.3.7 [...]
list Sean Clark
Yeah, and I've had it crashing since 4.3.3. I updated to 4.3.7 and it still crashes about every 7-15 days, each time with a different host/test combo. Oddly, it happens on only 1 of the 8 instances of Xymon I have (I have tried a completely different server for that same instance), so now I am investigating my configs.
list Jeremy Laidman
Are you acknowledging any alerts when it crashes? Seems to be related to ack cookie handling.
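One detail in the first backtrace is consistent with some form of memory corruption around the cookie entry: the bad key pointer, 0x322e3432, is made up entirely of printable ASCII bytes. The sketch below decodes it, assuming a 32-bit little-endian (x86) process, which the 8-digit addresses and /lib/libc.so.6 suggest but do not prove. The function names are illustrative only.

```python
def pointer_bytes(value, width=4):
    """Return the bytes a pointer value occupies in little-endian memory."""
    return value.to_bytes(width, "little")


def looks_like_text(value, width=4):
    """True if every byte of the pointer is a printable ASCII character."""
    return all(32 <= b < 127 for b in pointer_bytes(value, width))


if __name__ == "__main__":
    # Key pointers taken from the two backtraces in this thread.
    for key in (0x322E3432, 0x10):
        print(hex(key), pointer_bytes(key), looks_like_text(key))
```

The first key decodes to the characters "24.2", which hints that string data was written over the pointer (or that the pointer is being read from memory that has since been reused). The second key, 0x10, is not text; it is simply a small invalid address. Neither observation pins down the bug, but both point at a stale or overwritten key rather than at a message-size limit.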
On Tue, Jun 19, 2012 at 11:28 PM, Clark, Sean <user-2db5fbcae9a7@xymon.invalid> wrote:
Yeah, and I've had it crashing since 4.3.3 – updated to 4.3.7, still crashes about every 7-15 days, with a different host/test combo