
buffer overflow detected in xymongen (4.3.21)

9 messages in this thread

list Axel Beckert · Thu, 2 Jul 2015 15:49:17 +0200 ·
Hi,

today our xymongen check went purple for about an hour. In the
xymongen.log I found tons of crash reports like this one:

*** buffer overflow detected ***: xymongen terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x731ff)[0x7f31a13fc1ff]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f31a147f4c7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf46e0)[0x7f31a147d6e0]
xymongen[0x40d526]
xymongen[0x403b72]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f31a13aab45]
xymongen[0x40510c]
======= Memory map: ========
00400000-00445000 r-xp 00000000 fd:00 151583                             /usr/lib/xymon/server/bin/xymongen
00644000-00645000 r--p 00044000 fd:00 151583                             /usr/lib/xymon/server/bin/xymongen
00645000-00647000 rw-p 00045000 fd:00 151583                             /usr/lib/xymon/server/bin/xymongen
00647000-00658000 rw-p 00000000 00:00 0 
01ea3000-022fe000 rw-p 00000000 00:00 0                                  [heap]
7f31a0c26000-7f31a0c3c000 r-xp 00000000 fd:00 524307                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0c3c000-7f31a0e3b000 ---p 00016000 fd:00 524307                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0e3b000-7f31a0e3c000 rw-p 00015000 fd:00 524307                     /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0e3c000-7f31a0f68000 rw-p 00000000 00:00 0 
7f31a0f68000-7f31a0f80000 r-xp 00000000 fd:00 535294                     /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a0f80000-7f31a117f000 ---p 00018000 fd:00 535294                     /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a117f000-7f31a1180000 r--p 00017000 fd:00 535294                     /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a1180000-7f31a1181000 rw-p 00018000 fd:00 535294                     /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a1181000-7f31a1185000 rw-p 00000000 00:00 0 
7f31a1185000-7f31a1188000 r-xp 00000000 fd:00 535286                     /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1188000-7f31a1387000 ---p 00003000 fd:00 535286                     /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1387000-7f31a1388000 r--p 00002000 fd:00 535286                     /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1388000-7f31a1389000 rw-p 00003000 fd:00 535286                     /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1389000-7f31a1528000 r-xp 00000000 fd:00 535301                     /lib/x86_64-linux-gnu/libc-2.19.so
7f31a1528000-7f31a1728000 ---p 0019f000 fd:00 535301                     /lib/x86_64-linux-gnu/libc-2.19.so
7f31a1728000-7f31a172c000 r--p 0019f000 fd:00 535301                     /lib/x86_64-linux-gnu/libc-2.19.so
7f31a172c000-7f31a172e000 rw-p 001a3000 fd:00 535301                     /lib/x86_64-linux-gnu/libc-2.19.so
7f31a172e000-7f31a1732000 rw-p 00000000 00:00 0 
7f31a1732000-7f31a179e000 r-xp 00000000 fd:00 524394                     /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a179e000-7f31a199e000 ---p 0006c000 fd:00 524394                     /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a199e000-7f31a199f000 r--p 0006c000 fd:00 524394                     /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a199f000-7f31a19a0000 rw-p 0006d000 fd:00 524394                     /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a19a0000-7f31a1b6b000 r-xp 00000000 fd:00 131288                     /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1b6b000-7f31a1d6b000 ---p 001cb000 fd:00 131288                     /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d6b000-7f31a1d88000 r--p 001cb000 fd:00 131288                     /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d88000-7f31a1d98000 rw-p 001e8000 fd:00 131288                     /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d98000-7f31a1d9b000 rw-p 00000000 00:00 0 
7f31a1d9b000-7f31a1df1000 r-xp 00000000 fd:00 133362                     /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1df1000-7f31a1ff1000 ---p 00056000 fd:00 133362                     /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ff1000-7f31a1ff4000 r--p 00056000 fd:00 133362                     /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ff4000-7f31a1ffb000 rw-p 00059000 fd:00 133362                     /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ffb000-7f31a201b000 r-xp 00000000 fd:00 535260                     /lib/x86_64-linux-gnu/ld-2.19.so
7f31a220b000-7f31a2210000 rw-p 00000000 00:00 0 
7f31a2217000-7f31a221b000 rw-p 00000000 00:00 0 
7f31a221b000-7f31a221c000 r--p 00020000 fd:00 535260                     /lib/x86_64-linux-gnu/ld-2.19.so
7f31a221c000-7f31a221d000 rw-p 00021000 fd:00 535260                     /lib/x86_64-linux-gnu/ld-2.19.so
7f31a221d000-7f31a221e000 rw-p 00000000 00:00 0 
7fff1cf05000-7fff1cf2d000 rw-p 00000000 00:00 0                          [stack]
7fff1cffc000-7fff1cffe000 r-xp 00000000 00:00 0                          [vdso]
7fff1cffe000-7fff1d000000 r--p 00000000 00:00 0                          [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

After ca. one hour it went yellow again with the following error
output: "symlink nongreen.xml->index.wml failed: Transport endpoint is
not connected". Shortly thereafter it went green again, but the
runtime is about 1.5 times as high as before.

xymon is installed from Debian's xymon packages.

Feel free to tell me what else could be helpful to track down this
buffer overflow. I've found no (recent) core dump.

		Kind regards, Axel Beckert
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Axel Beckert · Tue, 29 Sep 2015 19:27:08 +0200 ·
Hi again,

On Thu, Jul 02, 2015 at 03:49:17PM +0200, Axel Beckert wrote:
> today our xymongen check went purple for about an hour. In the
> xymongen.log I found tons of crash reports like this one:
>
> *** buffer overflow detected ***: xymongen terminated
> ======= Backtrace: =========
> /lib/x86_64-linux-gnu/libc.so.6(+0x731ff)[0x7f31a13fc1ff]
> /lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f31a147f4c7]
> /lib/x86_64-linux-gnu/libc.so.6(+0xf46e0)[0x7f31a147d6e0]
> xymongen[0x40d526]
> xymongen[0x403b72]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f31a13aab45]
> xymongen[0x40510c]
> ======= Memory map: ========
> [...]
> After ca. one hour it went yellow again [...]

In the meantime I have had cases where it took about 15 hours to
recover.

> Feel free to tell me what else could be helpful to track down this
> buffer overflow. I've found no (recent) core dump.

In the meantime I have gathered hundreds of core dumps, but the backtrace
generated from them is not that helpful except for the exact
command line:

[New LWP 104939]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f96ef76a107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.

The command line is useful because the error message
"symlink nongreen.xml->index.wml failed: Transport endpoint is not
connected"
comes from xymongen/wmlgen.c. I therefore assume the buffer overflow is
related to the generation of the WML view of Xymon. That would also explain
why nobody else has suffered from it so far, as this feature is probably
rarely used nowadays.

I'll also disable it for now in our setup to see if I'm right about
the source of the buffer overflow. But I still think
it should be fixed if it's indeed in there.

		Kind regards, Axel Beckert
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Japheth Cleaver · Tue, 29 Sep 2015 10:40:32 -0700 ·
Hi Axel,

Would you be able to do a full BT on one of those core dumps? I'm not 
certain if Debian splits the debuginfo off into a separate package (like 
is done on the RH side), but that might need to be installed first.

Having the specific line this is coming from would make tracking down 
the root cause easier.

WML is indeed probably one of the more rarely used features nowadays, so 
it's quite possible there's a latent bug in there.

-jc
list Axel Beckert · Wed, 30 Sep 2015 11:39:23 +0200 ·
Hi,

On Tue, Sep 29, 2015 at 10:40:32AM -0700, Japheth Cleaver wrote:
> Would you be able to do a full BT on one of those core dumps?

Only with recompiling the packages.

> I'm not certain if Debian splits the debuginfo off into a separate
> package (like is done on the RH side), but that might need to be
> installed first.

We currently don't build debug packages for Debian's xymon
package. They're (currently) optional in Debian and only present if
the package's maintainer adds them explicitly. And we didn't need them
for xymon so far.

The handling of debug packages in Debian is about to change:
automatically built debug packages will be available in Debian
Unstable soon-ish. So I won't manually add debug packages to the
official Debian package, both because they will become obsolete soon
and because adding them requires an additional review process.

So I'll try to add debug packages to my backports the classical way by
specifying them manually.

P.S.: No new crashes since I disabled the WML generation yesterday,
but that's probably not yet a significant amount of time. ;-)

   Kind regards, Axel (one of Debian's xymon package maintainers)
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Axel Beckert · Fri, 2 Oct 2015 13:32:51 +0200 ·
Hi,

On Tue, Sep 29, 2015 at 10:40:32AM -0700, Japheth Cleaver wrote:
> Would you be able to do a full BT on one of those core dumps?

Today it finally crashed again after I deployed the debug
packages. (It's seldom that I'm so happy to see a core dump. ;-)

Here's the most recent backtrace:

Reading symbols from /usr/lib/xymon/server/bin/xymongen...done.
Reading symbols from /usr/lib/debug/.build-id/c8/bc967d123c1a308a9973d19da8c2357af16a6d.debug...done.
[New LWP 154856]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.

#0  0x00007f29044aa107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007f29044aa107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f29044ab4e8 in __GI_abort () at abort.c:89
#2  0x00007f29044e8204 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f29045d8a2b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f290456b4c7 in __GI___fortify_fail (msg=msg@entry=0x7f29045d89c2 "buffer overflow detected") at fortify_fail.c:31
#4  0x00007f29045696e0 in __GI___chk_fail () at chk_fail.c:28
#5  0x000000000040d526 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:104
#6  generate_wml_statuscard (host=<optimized out>, entry=<optimized out>) at wmlgen.c:150
#7  do_wml_cards (webdir=0x7f29045d0200 <_itoa_lower_digits> "0123456789abcdefghijklmnopqrstuvwxyz") at wmlgen.c:296
#8  0x0000000000403b72 in main (argc=4404274, argv=0x25ce8) at xymongen.c:678
(gdb) 

And here's the oldest one:

Reading symbols from /usr/lib/xymon/server/bin/xymongen...done.
Reading symbols from /usr/lib/debug/.build-id/c8/bc967d123c1a308a9973d19da8c2357af16a6d.debug...done.
[New LWP 122254]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.

#0  0x00007f12adf18107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56      ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  0x00007f12adf18107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f12adf194e8 in __GI_abort () at abort.c:89
#2  0x00007f12adf56204 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f12ae046a2b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f12adfd94c7 in __GI___fortify_fail (msg=msg@entry=0x7f12ae0469c2 "buffer overflow detected") at fortify_fail.c:31
#4  0x00007f12adfd76e0 in __GI___chk_fail () at chk_fail.c:28
#5  0x000000000040d526 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:104
#6  generate_wml_statuscard (host=<optimized out>, entry=<optimized out>) at wmlgen.c:150
#7  do_wml_cards (webdir=0x7f12ae03e200 <_itoa_lower_digits> "0123456789abcdefghijklmnopqrstuvwxyz") at wmlgen.c:296
#8  0x0000000000403b72 in main (argc=4404274, argv=0x1dd8e) at xymongen.c:678
(gdb) 

They look more or less identical to me.

It's indeed in the WML generation code, should be this line here:
https://sources.debian.net/src/xymon/4.3.21-1/xymongen/wmlgen.c/#L150

HTH

		Kind regards, Axel Beckert
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Axel Beckert · Fri, 2 Oct 2015 13:58:46 +0200 ·
Hi,

On Fri, Oct 02, 2015 at 01:32:51PM +0200, Axel Beckert wrote:
> Today it finally crashed again after I deployed the debug
> packages. (It's seldom that I'm so happy to see a core dump. ;-)
>
> Here's the [...] oldest one:
> [...]
> It's indeed in the WML generation code, should be this line here:
> https://sources.debian.net/src/xymon/4.3.21-1/xymongen/wmlgen.c/#L150

Looking at which statuses changed in the five minutes before the crash, I
found two status reports containing lines of 6576 and 6897
characters, but both are far shorter than the MAX_LINE_LEN of 16384
characters.

Another possibility I could imagine is lines with no trailing newline at
the end. But then again, I have no idea where they could come from or
how I should look for them.
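
A minimal sketch of how one might check a captured status message for
over-long lines or a missing trailing newline. The names scan_lines and
struct line_stats are hypothetical and not part of Xymon; how the message
text is obtained (log, dump, or otherwise) is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one status message; not a Xymon type. */
struct line_stats {
    size_t lines;        /* number of lines found */
    size_t longest;      /* longest line length, excluding '\n' */
    int    unterminated; /* 1 if the last line has no trailing '\n' */
};

/* Walk a NUL-terminated message line by line and collect the stats. */
struct line_stats scan_lines(const char *msg)
{
    struct line_stats st = {0, 0, 0};
    const char *p = msg;

    assert(msg != NULL);

    while (*p != '\0') {
        const char *nl = strchr(p, '\n');
        size_t len = nl ? (size_t)(nl - p) : strlen(p);

        st.lines++;
        if (len > st.longest) st.longest = len;

        if (nl == NULL) {        /* final line without newline */
            st.unterminated = 1;
            break;
        }
        p = nl + 1;
    }
    return st;
}
```

Comparing the reported longest value against the size of the destination
buffer used at wmlgen.c:150 would show whether a given message could
trigger the overflow.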

		Kind regards, Axel Beckert
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Japheth Cleaver · Fri, 2 Oct 2015 11:58:28 -0700 ·
A malformed message could be returned as a result of a truncated network
connection to xymond, or other sorts of interesting cases.

                p = strchr(nextline, '\n'); if (p) *p = '\0';
                strcpy(l, nextline);
                if (p) nextline = p+1; else nextline = NULL;

There does seem to be something a little odd there.

I suppose the below patch would at least keep us from writing past the end
of 'l', but in a proper situation we shouldn't need something like it.
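
The attached patch itself is not reproduced in this archive. Purely as an
illustration, and not as the actual patch, a length-checked variant of the
copy above could look like the sketch below. copy_next_line is a
hypothetical helper, not Xymon code, and the capacity of 'l' is passed in
as a parameter because its real size is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from Xymon): copy the next line starting at
 * *cursor into dst, which holds dstsize bytes, truncating if necessary.
 * Unlike the original snippet it does not modify the source buffer.
 * On return *cursor points past the newline, or is NULL after the last
 * line. Returns 1 if the line had to be truncated, 0 otherwise.
 */
int copy_next_line(char *dst, size_t dstsize, const char **cursor)
{
    const char *line, *nl;
    size_t len, n;

    assert(dst != NULL && dstsize > 0);
    assert(cursor != NULL && *cursor != NULL);

    line = *cursor;
    nl = strchr(line, '\n');
    len = nl ? (size_t)(nl - line) : strlen(line);

    /* Never copy more than fits, leaving room for the terminator. */
    n = (len < dstsize) ? len : dstsize - 1;
    memcpy(dst, line, n);
    dst[n] = '\0';

    *cursor = nl ? nl + 1 : NULL;
    return len >= dstsize;
}
```

Returning a truncation flag lets the caller log the offending status
instead of silently shortening it, which might help identify the trigger.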

-jc

Attachments (1)
list Axel Beckert · Mon, 5 Oct 2015 11:36:41 +0200 ·
Hi,

On Fri, Oct 02, 2015 at 11:58:28AM -0700, J.C. Cleaver wrote:
> A malformed message could be returned as a result of a truncated network
> connection to xymond,

Hrm. Wouldn't expect that here. Not _that_ often. :-)

> or other sorts of interesting cases.

It's probably one of these cases. ;-)

> I suppose the below patch would at least keep us from writing past the end
> on 'l', but in a proper situation we shouldn't need something like it.

The patch seems to help. Until installing xymon with the patch applied
I had segfaults every minute and since then, no segfault has happened
anymore for at least 10 minutes.

Thanks!

		Kind regards, Axel Beckert
-- 
Axel Beckert <user-96d9963fe797@xymon.invalid>       support: +41 44 633 26 68
IT Services Group, HPT H 6                  voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland		   http://nic.phys.ethz.ch/
list Japheth Cleaver · Mon, 5 Oct 2015 20:32:58 -0700 ·
On Mon, October 5, 2015 2:36 am, Axel Beckert wrote:
> > I suppose the below patch would at least keep us from writing past the
> > end on 'l', but in a proper situation we shouldn't need something like it.
> The patch seems to help. Until installing xymon with the patch applied
> I had segfaults every minute and since then, no segfault has happened
> anymore for at least 10 minutes.

Hmm. I didn't realize it was crashing that often for you. :/

Would it be possible to run the previous copy of xymongen for a cycle or
two with --debug mode enabled? In theory, that should give us a good idea
of the trigger.


Regards,

-jc