buffer overflow detected in xymongen (4.3.21)
list Axel Beckert
Hi,
today our xymongen check went purple for about an hour. In the xymongen.log I found tons of crash reports like this one:
*** buffer overflow detected ***: xymongen terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x731ff)[0x7f31a13fc1ff]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f31a147f4c7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf46e0)[0x7f31a147d6e0]
xymongen[0x40d526]
xymongen[0x403b72]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f31a13aab45]
xymongen[0x40510c]
======= Memory map: ========
00400000-00445000 r-xp 00000000 fd:00 151583 /usr/lib/xymon/server/bin/xymongen
00644000-00645000 r--p 00044000 fd:00 151583 /usr/lib/xymon/server/bin/xymongen
00645000-00647000 rw-p 00045000 fd:00 151583 /usr/lib/xymon/server/bin/xymongen
00647000-00658000 rw-p 00000000 00:00 0
01ea3000-022fe000 rw-p 00000000 00:00 0 [heap]
7f31a0c26000-7f31a0c3c000 r-xp 00000000 fd:00 524307 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0c3c000-7f31a0e3b000 ---p 00016000 fd:00 524307 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0e3b000-7f31a0e3c000 rw-p 00015000 fd:00 524307 /lib/x86_64-linux-gnu/libgcc_s.so.1
7f31a0e3c000-7f31a0f68000 rw-p 00000000 00:00 0
7f31a0f68000-7f31a0f80000 r-xp 00000000 fd:00 535294 /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a0f80000-7f31a117f000 ---p 00018000 fd:00 535294 /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a117f000-7f31a1180000 r--p 00017000 fd:00 535294 /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a1180000-7f31a1181000 rw-p 00018000 fd:00 535294 /lib/x86_64-linux-gnu/libpthread-2.19.so
7f31a1181000-7f31a1185000 rw-p 00000000 00:00 0
7f31a1185000-7f31a1188000 r-xp 00000000 fd:00 535286 /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1188000-7f31a1387000 ---p 00003000 fd:00 535286 /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1387000-7f31a1388000 r--p 00002000 fd:00 535286 /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1388000-7f31a1389000 rw-p 00003000 fd:00 535286 /lib/x86_64-linux-gnu/libdl-2.19.so
7f31a1389000-7f31a1528000 r-xp 00000000 fd:00 535301 /lib/x86_64-linux-gnu/libc-2.19.so
7f31a1528000-7f31a1728000 ---p 0019f000 fd:00 535301 /lib/x86_64-linux-gnu/libc-2.19.so
7f31a1728000-7f31a172c000 r--p 0019f000 fd:00 535301 /lib/x86_64-linux-gnu/libc-2.19.so
7f31a172c000-7f31a172e000 rw-p 001a3000 fd:00 535301 /lib/x86_64-linux-gnu/libc-2.19.so
7f31a172e000-7f31a1732000 rw-p 00000000 00:00 0
7f31a1732000-7f31a179e000 r-xp 00000000 fd:00 524394 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a179e000-7f31a199e000 ---p 0006c000 fd:00 524394 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a199e000-7f31a199f000 r--p 0006c000 fd:00 524394 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a199f000-7f31a19a0000 rw-p 0006d000 fd:00 524394 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
7f31a19a0000-7f31a1b6b000 r-xp 00000000 fd:00 131288 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1b6b000-7f31a1d6b000 ---p 001cb000 fd:00 131288 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d6b000-7f31a1d88000 r--p 001cb000 fd:00 131288 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d88000-7f31a1d98000 rw-p 001e8000 fd:00 131288 /usr/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
7f31a1d98000-7f31a1d9b000 rw-p 00000000 00:00 0
7f31a1d9b000-7f31a1df1000 r-xp 00000000 fd:00 133362 /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1df1000-7f31a1ff1000 ---p 00056000 fd:00 133362 /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ff1000-7f31a1ff4000 r--p 00056000 fd:00 133362 /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ff4000-7f31a1ffb000 rw-p 00059000 fd:00 133362 /usr/lib/x86_64-linux-gnu/libssl.so.1.0.0
7f31a1ffb000-7f31a201b000 r-xp 00000000 fd:00 535260 /lib/x86_64-linux-gnu/ld-2.19.so
7f31a220b000-7f31a2210000 rw-p 00000000 00:00 0
7f31a2217000-7f31a221b000 rw-p 00000000 00:00 0
7f31a221b000-7f31a221c000 r--p 00020000 fd:00 535260 /lib/x86_64-linux-gnu/ld-2.19.so
7f31a221c000-7f31a221d000 rw-p 00021000 fd:00 535260 /lib/x86_64-linux-gnu/ld-2.19.so
7f31a221d000-7f31a221e000 rw-p 00000000 00:00 0
7fff1cf05000-7fff1cf2d000 rw-p 00000000 00:00 0 [stack]
7fff1cffc000-7fff1cffe000 r-xp 00000000 00:00 0 [vdso]
7fff1cffe000-7fff1d000000 r--p 00000000 00:00 0 [vvar]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
After ca. one hour it went yellow again with the following error output: "symlink nongreen.xml->index.wml failed: Transport endpoint is not connected". Shortly thereafter it went green again, but the runtime is about 1.5 times as high as before.
xymon is installed from Debian's xymon packages.
Feel free to tell me what else could be helpful to track down this buffer overflow. I've found no (recent) core dump.
Kind regards, Axel Beckert
--
Axel Beckert <user-96d9963fe797@xymon.invalid> support: +41 44 633 26 68
IT Services Group, HPT H 6 voice: +41 44 633 41 89
Departement of Physics, ETH Zurich
CH-8093 Zurich, Switzerland http://nic.phys.ethz.ch/
list Axel Beckert
Hi again,
On Thu, Jul 02, 2015 at 03:49:17PM +0200, Axel Beckert wrote:
today our xymongen check went purple for about an hour. In the xymongen.log I found tons of crash reports like this one:
*** buffer overflow detected ***: xymongen terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x731ff)[0x7f31a13fc1ff]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f31a147f4c7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf46e0)[0x7f31a147d6e0]
xymongen[0x40d526]
xymongen[0x403b72]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f31a13aab45]
xymongen[0x40510c]
======= Memory map: ========
[...]
After ca. one hour it went yellow again [...]
In the meantime I have had cases where it took 15 hours or so to recover.
Feel free to tell me what else could be helpful to track down this buffer overflow. I've found no (recent) core dump.
In the meantime I have gathered hundreds of core dumps, but the backtrace generated from them is not that helpful, except for the exact command line:
[New LWP 104939]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f96ef76a107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
... because the error message...
"symlink nongreen.xml->index.wml failed: Transport endpoint is not connected".
... comes from xymongen/wmlgen.c. So I assume the buffer overflow is related to the generation of the WML view of Xymon. That would also explain why nobody else has suffered from it so far, as this feature is probably rarely used nowadays. I'll disable it in our setup for now to see whether my assumption about the source of the buffer overflow is right. But I still think it should be fixed if it's indeed in there.
Kind regards, Axel Beckert
list Japheth Cleaver
Hi Axel, Would you be able to do a full BT on one of those core dumps? I'm not certain if Debian splits the debuginfo off into a separate package (like is done on the RH side), but that might need to be installed first. Having the specific line this is coming from would make tracking down the root easier. WML is indeed probably one of the more rarely used features nowadays, so it's quite possible there's a latent bug in there. -jc
list Axel Beckert
Hi,
On Tue, Sep 29, 2015 at 10:40:32AM -0700, Japheth Cleaver wrote:
Would you be able to do a full BT on one of those core dumps?
Only with recompiling the packages.
I'm not certain if Debian splits the debuginfo off into a separate package (like is done on the RH side), but that might need to be installed first.
We currently don't build debug packages for Debian's xymon package. They're (currently) optional in Debian and only present if the package's maintainer adds them explicitly, and we haven't needed them for xymon so far.
Debug packages in Debian are about to change in general, as automatically built debug packages will be available in Debian Unstable soon-ish. So I won't manually add debug packages to the official Debian package: they would soon become obsolete, and adding them requires an additional review process. Instead, I'll try to add debug packages to my backports the classical way, by specifying them manually.
P.S.: No new crashes since I disabled the WML generation yesterday, but that's probably not yet a significant amount of time. ;-)
Kind regards, Axel (one of Debian's xymon package maintainers)
list Axel Beckert
Hi,
On Tue, Sep 29, 2015 at 10:40:32AM -0700, Japheth Cleaver wrote:
Would you be able to do a full BT on one of those core dumps?
Today it finally crashed again after I deployed the debug packages. (It's seldom that I'm so happy to see a core dump. ;-)
Here's the most recent backtrace:
Reading symbols from /usr/lib/xymon/server/bin/xymongen...done.
Reading symbols from /usr/lib/debug/.build-id/c8/bc967d123c1a308a9973d19da8c2357af16a6d.debug...done.
[New LWP 154856]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f29044aa107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f29044aa107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f29044ab4e8 in __GI_abort () at abort.c:89
#2 0x00007f29044e8204 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f29045d8a2b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007f290456b4c7 in __GI___fortify_fail (msg=msg@entry=0x7f29045d89c2 "buffer overflow detected") at fortify_fail.c:31
#4 0x00007f29045696e0 in __GI___chk_fail () at chk_fail.c:28
#5 0x000000000040d526 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:104
#6 generate_wml_statuscard (host=<optimized out>, entry=<optimized out>) at wmlgen.c:150
#7 do_wml_cards (webdir=0x7f29045d0200 <_itoa_lower_digits> "0123456789abcdefghijklmnopqrstuvwxyz") at wmlgen.c:296
#8 0x0000000000403b72 in main (argc=4404274, argv=0x25ce8) at xymongen.c:678
(gdb)
And here's the oldest one:
Reading symbols from /usr/lib/xymon/server/bin/xymongen...done.
Reading symbols from /usr/lib/debug/.build-id/c8/bc967d123c1a308a9973d19da8c2357af16a6d.debug...done.
[New LWP 122254]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `xymongen --recentgifs --subpagecolumns=2 --wml --rss --nongreen-ignorecolumns=l'.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f12adf18107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007f12adf18107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f12adf194e8 in __GI_abort () at abort.c:89
#2 0x00007f12adf56204 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f12ae046a2b "*** %s ***: %s terminated\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3 0x00007f12adfd94c7 in __GI___fortify_fail (msg=msg@entry=0x7f12ae0469c2 "buffer overflow detected") at fortify_fail.c:31
#4 0x00007f12adfd76e0 in __GI___chk_fail () at chk_fail.c:28
#5 0x000000000040d526 in strcpy (__src=<optimized out>, __dest=<optimized out>) at /usr/include/x86_64-linux-gnu/bits/string3.h:104
#6 generate_wml_statuscard (host=<optimized out>, entry=<optimized out>) at wmlgen.c:150
#7 do_wml_cards (webdir=0x7f12ae03e200 <_itoa_lower_digits> "0123456789abcdefghijklmnopqrstuvwxyz") at wmlgen.c:296
#8 0x0000000000403b72 in main (argc=4404274, argv=0x1dd8e) at xymongen.c:678
(gdb)
They look more or less identical to me.
It's indeed in the WML generation code, should be this line here: https://sources.debian.net/src/xymon/4.3.21-1/xymongen/wmlgen.c/#L150
HTH
Kind regards, Axel Beckert
list Axel Beckert
Hi,
On Fri, Oct 02, 2015 at 01:32:51PM +0200, Axel Beckert wrote:
Today it finally crashed again after I deployed the debug packages. (It's seldom that I'm so happy to see a core dump. ;-)
Here's the [...] oldest one:
[...]
It's indeed in the WML generation code, should be this line here: https://sources.debian.net/src/xymon/4.3.21-1/xymongen/wmlgen.c/#L150
Looking at which statuses changed in the five minutes before the crash, I found two status reports with line lengths of 6576 and 6897 characters respectively, but both are far shorter than the MAX_LINE_LEN of 16384 characters.
Another thing I could imagine is lines with no trailing newline at the end. But then again, I have no idea where they could come from, nor how I should look for them.
Kind regards, Axel Beckert
list Japheth Cleaver
A malformed message could be returned as a result of a truncated network
connection to xymond, or other sorts of interesting cases.
p = strchr(nextline, '\n'); if (p) *p = '\0';
strcpy(l, nextline);
if (p) nextline = p+1; else nextline = NULL;
There does seem to be something a little odd there.
I suppose the below patch would at least keep us from writing past the end
on 'l', but in a proper situation we shouldn't need something like it.
-jc
Attachments (1)
list Axel Beckert
Hi,
On Fri, Oct 02, 2015 at 11:58:28AM -0700, J.C. Cleaver wrote:
A malformed message could be returned as a result of a truncated network connection to xymond,
Hrm. Wouldn't expect that here. Not _that_ often. :-)
or other sorts of interesting cases.
It's probably one of these cases. ;-)
I suppose the below patch would at least keep us from writing past the end on 'l', but in a proper situation we shouldn't need something like it.
The patch seems to help. Until I installed xymon with the patch applied, I had segfaults every minute; since then, no segfault has happened for at least 10 minutes. Thanks!
Kind regards, Axel Beckert
list Japheth Cleaver
On Mon, October 5, 2015 2:36 am, Axel Beckert wrote:
I suppose the below patch would at least keep us from writing past the end on 'l', but in a proper situation we shouldn't need something like it.
The patch seems to help. Until installing xymon with the patch applied I had segfaults every minute and since then, no segfault has happened anymore for at least 10 minutes.
Hmm. I didn't realize it was crashing that often for you. :/
Would it be possible to run the previous copy of xymongen for a cycle or two with --debug mode enabled? In theory, that should give us a good idea of the trigger.
Regards, -jc