Hi again,
On Thu, Jul 02, 2015 at 03:49:17PM +0200, Axel Beckert wrote:
today our xymongen check went purple for about an hour. In the
xymongen.log I found tons of crash reports like this one:
*** buffer overflow detected ***: xymongen terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x731ff)[0x7f31a13fc1ff]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f31a147f4c7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf46e0)[0x7f31a147d6e0]
xymongen[0x40d526]
xymongen[0x403b72]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f31a13aab45]
xymongen[0x40510c]
======= Memory map: ========
[...]
After ca. one hour it went yellow again [...]
In the meantime I have had cases where it took about 15 hours to
recover.
Feel free to tell me what else could be helpful to track down this
buffer overflow. I've found no (recent) core dump.
In the meantime I have gathered hundreds of core dumps, but the
backtraces generated from them are not very helpful, apart from showing
the exact command line:
[New LWP 104939]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f96ef76a107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
... because the error message...
"symlink nongreen.xml->index.wml failed: Transport endpoint is not
connected".
... comes from xymongen/wmlgen.c, so I assume the buffer overflow is
related to the generation of the WML view of Xymon. That would also
explain why nobody else has run into it so far, as this feature is
probably rarely used nowadays.
I'll also disable it in our setup for now to see whether my assumption
about the source of the buffer overflow is right. But I still think it
should be fixed if it is indeed in there.
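
For illustration only: an abort via __fortify_fail with the message
"buffer overflow detected" generally means that a fortified libc
function (sprintf, strcpy and friends) was about to write past the end
of a fixed-size buffer. The sketch below is not taken from wmlgen.c; the
helper name build_path and its parameters are hypothetical. It only
shows the general pattern of building a file name with a bounded write
and an explicit truncation check instead of an unbounded sprintf.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, NOT from the xymongen sources: join a directory
 * and a file name into dst without ever writing more than dstsize bytes.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int build_path(char *dst, size_t dstsize,
                      const char *dir, const char *name)
{
    assert(dst != NULL && dstsize > 0);

    /* snprintf never writes more than dstsize bytes (including the
     * terminating NUL) and returns the length it would have needed. */
    int needed = snprintf(dst, dstsize, "%s/%s", dir, name);

    if (needed < 0 || (size_t)needed >= dstsize) {
        dst[0] = '\0';   /* leave a well-defined empty string behind */
        return -1;       /* caller can log the problem and skip the file */
    }
    return 0;
}
```

With a helper like this, an overly long directory or file name leads to
an error return that the caller can report, rather than to an abort of
the whole xymongen run.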
Kind regards, Axel Beckert