Xymon Mailing List Archive

hobbitd_rrd problem (core)

5 messages in this thread

Nicolas Lienard · Thu, 21 Aug 2008 11:33:20 +0200 (CEST)
Hi,

I have a problem with hobbitd_rrd.

It generates a lot of core files.

Here is the backtrace from one of the cores:


Core was generated by `hobbitd_rrd --rrddir=/opt/hobbit/data/rrd'.
Program terminated with signal 6, Aborted.
Reading symbols from /usr/local/rrdtool-1.2.27/lib/librrd.so.2...done.
Loaded symbols for /usr/local/rrdtool-1.2.27/lib/librrd.so.2
Reading symbols from /usr/lib64/libpng12.so.0...done.
Loaded symbols for /usr/lib64/libpng12.so.0
Reading symbols from /lib64/libpcre.so.0...done.
Loaded symbols for /lib64/libpcre.so.0
Reading symbols from /lib64/tls/libc.so.6...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /usr/lib64/libfreetype.so.6...done.
Loaded symbols for /usr/lib64/libfreetype.so.6
Reading symbols from /usr/lib64/libz.so.1...done.
Loaded symbols for /usr/lib64/libz.so.1
Reading symbols from /usr/lib64/libart_lgpl_2.so.2...done.
Loaded symbols for /usr/lib64/libart_lgpl_2.so.2
Reading symbols from /lib64/tls/libm.so.6...done.
Loaded symbols for /lib64/tls/libm.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
#0  0x0000002a95a2626d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a95a2626d in raise () from /lib64/tls/libc.so.6
#1  0x0000002a95a27a6e in abort () from /lib64/tls/libc.so.6
#2  0x000000000040f623 in sigsegv_handler (signum=Variable "signum" is not available.
) at sig.c:57
#3  <signal handler called>
#4  0x0000002a95695639 in write_RRA_row (rrd=0x7fbfffb0a0, rra_idx=3, rra_current=0x7fbfffaff0, CDP_scratch_idx=Variable "CDP_scratch_idx" is not available.
)
    at rrd_update.c:1534
#5  0x0000002a95696d02 in _rrd_update (filename=0x525520 "/opt/hobbit/data/rrd/mail.pitsystem.it/netstat.rrd", tmplt=Variable "tmplt" is not available.
)
    at rrd_update.c:1239
#6  0x0000002a95698027 in rrd_update (argc=5, argv=0x7fbfffb160) at rrd_update.c:185
#7  0x00000000004036b6 in create_and_update_rrd (hostname=Variable "hostname" is not available.
) at do_rrd.c:183
#8  0x000000000040a665 in update_rrd (hostname=0x2a9623708e "mail.pitsystem.it", testname=0x2a962370a0 "netstat",
    msg=0x2a962370a8 "data mail,pitsystem,it.netstat\nlinux\nIp:\n    856780605 total packets received\n    0 forwarded\n    0 incoming packets discarded\n    853032398 incoming packets delivered\n    1540620911 requests sent out"...,
    tstamp=1219310955, sender=Variable "sender" is not available.
) at do_rrd.c:344
#9  0x0000000000402951 in main (argc=Variable "argc" is not available.
) at hobbitd_rrd.c:153
(gdb)


Hobbit version: 4.2.0


If anybody could help me find out what is wrong, I would be grateful.

Regards,
Nicolas Lienard
David Peters · Fri, 22 Aug 2008 18:41:51 +1000
How often does it happen?

Nicolas Lienard · Fri, 22 Aug 2008 14:15:40 +0200 (CEST)
Hi,

I have to keep deleting them because of disk space.

It happens all the time:

]# ls -ltr core*
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:36 core.24773
-rw-------  1 hobbit hobbit 2662400 Aug 22 12:36 core.26341
-rw-------  1 hobbit hobbit 2646016 Aug 22 12:37 core.11284
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:37 core.24953
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:39 core.31159
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:41 core.6837
-rw-------  1 hobbit hobbit 2662400 Aug 22 12:41 core.12842
-rw-------  1 hobbit hobbit 2646016 Aug 22 12:42 core.31114
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:42 core.12550
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:44 core.20583
-rw-------  1 hobbit hobbit 2662400 Aug 22 12:46 core.617
-rw-------  1 hobbit hobbit 2646016 Aug 22 12:48 core.18711
-rw-------  1 hobbit hobbit 2662400 Aug 22 12:51 core.20452
-rw-------  1 hobbit hobbit 2646016 Aug 22 12:53 core.10551
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:54 core.27585
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:56 core.1045
-rw-------  1 hobbit hobbit 2662400 Aug 22 12:56 core.7084
-rw-------  1 hobbit hobbit 2646016 Aug 22 12:57 core.28050
-rw-------  1 hobbit hobbit 2523136 Aug 22 12:58 core.6518
-rw-------  1 hobbit hobbit 2523136 Aug 22 13:00 core.15028
-rw-------  1 hobbit hobbit 2523136 Aug 22 13:01 core.21864
-rw-------  1 hobbit hobbit 2662400 Aug 22 13:01 core.27360
-rw-------  1 hobbit hobbit 2523136 Aug 22 13:03 core.26816
-rw-------  1 hobbit hobbit 2646016 Aug 22 13:03 core.14766
-rw-------  1 hobbit hobbit 2519040 Aug 22 13:08 core.14296
-rw-------  1 hobbit hobbit 2523136 Aug 22 13:11 core.26935
-rw-------  1 hobbit hobbit 2658304 Aug 22 13:11 core.28118
-rw-------  1 hobbit hobbit 2523136 Aug 22 13:13 core.27671
-rw-------  1 hobbit hobbit 2641920 Aug 22 13:13 core.14420

Regards,
Nicolas Lienard
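
To put a number on "how often", the timestamps in a listing like the one above can be summarised. The following is only a sketch: the helper names `core_times` and `average_gap_minutes` are made up for illustration, the year is an assumption (`ls -l` does not print it for recent files), and the parser expects the `Mon DD HH:MM core.PID` layout shown in this message.

```python
import re
from datetime import datetime

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Matches the "Aug 22 12:36 core.24773" tail of an `ls -l` line.
CORE_LINE = re.compile(
    r"([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}):(\d{2})\s+core\.\d+$")

def core_times(listing, year=2008):
    """Return the sorted modification times of core files in an `ls -l` listing.

    The year is not part of the listing, so it is passed in (assumption).
    Lines that do not look like core-file entries are skipped.
    """
    times = []
    for line in listing.splitlines():
        m = CORE_LINE.search(line.strip())
        if m and m.group(1) in MONTHS:
            mon, day, hh, mm = m.groups()
            times.append(
                datetime(year, MONTHS[mon], int(day), int(hh), int(mm)))
    return sorted(times)

def average_gap_minutes(times):
    """Average number of minutes between consecutive core files."""
    if len(times) < 2:
        return None  # not enough data to compute a gap
    span = (times[-1] - times[0]).total_seconds() / 60
    return span / (len(times) - 1)
```

Feeding the output of `ls -ltr core*` into `core_times` and then into `average_gap_minutes` gives a rough crash interval; since `ls` only shows minutes, the result is approximate.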

Henrik Størner · Mon, 25 Aug 2008 21:11:53 +0000 (UTC)
quoted from Nicolas Lienard
In <user-620508b69d1e@xymon.invalid> "Nicolas LIENARD" <user-2c7a80acbe2b@xymon.invalid> writes:
Core was generated by `hobbitd_rrd --rrddir=/opt/hobbit/data/rrd'.
Program terminated with signal 6, Aborted.
#3  <signal handler called>
#4  0x0000002a95695639 in write_RRA_row (rrd=0x7fbfffb0a0, rra_idx=3, rra_current=0x7fbfffaff0, CDP_scratch_idx=Variable "CDP_scratch_idx" is not available.
)
    at rrd_update.c:1534
#5  0x0000002a95696d02 in _rrd_update (filename=0x525520 "/opt/hobbit/data/rrd/mail.pitsystem.it/netstat.rrd", tmplt=Variable "tmplt" is not available.
)
    at rrd_update.c:1239
#6  0x0000002a95698027 in rrd_update (argc=5, argv=0x7fbfffb160) at rrd_update.c:185
#7  0x00000000004036b6 in create_and_update_rrd (hostname=Variable "hostname" is not available.
) at do_rrd.c:183
#8  0x000000000040a665 in update_rrd (hostname=0x2a9623708e "mail.pitsystem.it", testname=0x2a962370a0 "netstat",
    msg=0x2a962370a8 "data mail,pitsystem,it.netstat\nlinux\nIp:\n    856780605 total packets received\n    0 forwarded\n    0 incoming packets discarded\n    853032398 incoming packets delivered\n    1540620911 requests sent out"...,

It looks like a problem with the RRDtool library; the crash happens
inside one of the RRD functions (one handling updates of an RRD file).
The data that gdb shows Hobbit feeding into RRD looks all right.

It seems you have rrdtool 1.2.27. I see that 1.2.28 is the last of the
1.2.x series of rrdtool; there's nothing in the changelog that
screams "this is the bug" at me, but I would suggest rebuilding
Hobbit with the latest 1.2.x rrdtool version to see if that
solves the problem.

Also, if you could run hobbitd_rrd with the "--debug" option for a
few minutes until it crashes, it would show exactly what data was
being sent to the RRD-update function.


Regards,
Henrik
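
The reading of the backtrace above (the frame right below `<signal handler called>` is where the fault was raised, and the quoted `.rrd` path is the file being updated) can be sketched as a small parser. This is an illustration only: `crash_site` is a hypothetical helper name, and the patterns assume gdb output in the format pasted in this thread, possibly re-wrapped by a mail client.

```python
import re

def crash_site(backtrace):
    """Return (function, rrd_path) extracted from a gdb backtrace.

    function: name of the frame directly below "<signal handler called>",
              i.e. the code that was running when the signal arrived.
    rrd_path: first quoted absolute path ending in ".rrd", if any.
    Either element is None when it cannot be found.
    """
    # Collapse all whitespace so that mail line-wrapping does not matter.
    text = " ".join(backtrace.split())

    frame = re.search(
        r"<signal handler called>\s+#\d+\s+0x[0-9a-f]+ in (\w+)", text)
    func = frame.group(1) if frame else None

    rrd = re.search(r'"(/[^"\s]+\.rrd)"', text)
    path = rrd.group(1) if rrd else None

    return func, path
```

Run over many core files, a helper like this would show whether the crash always involves the same RRD file or the same library function, which helps decide between a damaged file and a library bug.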
Nicolas Lienard · Wed, 27 Aug 2008 13:31:59 +0200
Hi,

I finally deleted the old RRD files, and everything works now.

I'll have a look at the version. The problem appeared after updating
devmon.

Thanks for the reply.
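
The thread does not say how the old files were picked out. One possible way to narrow down candidates before deleting anything is to walk the RRD directory and flag files that `rrdtool info` cannot read. This is a sketch under assumptions: `find_unreadable_rrds` and `rrdtool_info_ok` are hypothetical helper names, `rrdtool` is assumed to be on the PATH, and a file that passes this check may still trigger the crash on update.

```python
import os
import subprocess

def rrdtool_info_ok(path):
    """True if `rrdtool info` exits successfully for the file.

    Assumes the rrdtool binary is on PATH. This only checks that the
    file can be read by rrdtool, not that updates will succeed.
    """
    result = subprocess.run(
        ["rrdtool", "info", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def find_unreadable_rrds(root, check=rrdtool_info_ok):
    """Return a sorted list of .rrd files under root that fail `check`."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".rrd"):
                path = os.path.join(dirpath, name)
                if not check(path):
                    bad.append(path)
    return sorted(bad)
```

Calling `find_unreadable_rrds("/opt/hobbit/data/rrd")` would list suspect files under the directory named in the backtrace; moving them aside rather than deleting them keeps the history recoverable.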