Xymon Mailing List Archive

Error in logs after recent update to 4.2.0

3 messages in this thread

list David Pullman · Mon, 16 Jul 2007 10:15:31 -0400 ·
I'm afraid I have no idea where to start looking for what the cause of this error may be.  We upgraded our display and bbtest server, which is a RHEL 4 x86_64 box, to 4.2.0 at the very end of June.  Since then we have the following about every ten minutes in /var/log/messages:

Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:48:41 frodo-70 kernel: hobbitd_history[28148]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:48:50 frodo-70 kernel: hobbitd_history[28161]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:58:55 frodo-70 kernel: hobbitd_history[28899]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:00 frodo-70 kernel: hobbitd_history[28905]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:06 frodo-70 kernel: hobbitd_history[28915]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:10 frodo-70 kernel: hobbitd_history[28928]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:15 frodo-70 kernel: hobbitd_history[28934]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:20 frodo-70 kernel: hobbitd_history[28940]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
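
A note on reading kernel lines like these: on x86_64 kernels of this era, the value after "segfault at" is the faulting address and the number after "error" is the hexadecimal page-fault error code. The Python sketch below is not part of Hobbit; the name decode_segfault and the regular expression are assumptions made for illustration, and they target only the exact message format shown in this thread. In the error code, bit 0 set means the page was present, bit 1 set means the access was a write, and bit 2 set means the fault came from user mode.

```python
import re

# Matches only the 2.6-era x86_64 format seen in this thread (an assumption;
# other kernels word the message differently).
SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints the error code in hex
    return {
        "process": m.group("proc"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "page_present": bool(err & 1),              # bit 0
        "access": "write" if err & 2 else "read",   # bit 1
        "mode": "user" if err & 4 else "kernel",    # bit 2
    }
```

For the lines above this reports a user-mode read of address 0 on a page that was not present, which usually points to a NULL pointer dereference inside the process rather than a problem in the kernel.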

This system ran 4.1.2p1 for more than a year with no issues.  I did an upgrade in place.  No changes otherwise.

I would be very grateful for even the slightest hint on how to start looking for the problem.

Thanks very much

-- 
David Pullman
Systems Administrator
Manufacturing Engineering Laboratory
National Institute of Standards & Technology
Mail Stop XXXX
XXX Bureau Drive
Gaithersburg, MD XXXXX-XXXX
Tel: (XXX) XXX-XXXX
Fax: (XXX) XXX-XXXX
E-mail: user-27782d82b327@xymon.invalid
list Henrik Størner · Mon, 16 Jul 2007 16:23:40 +0200 ·
On Mon, Jul 16, 2007 at 10:15:31AM -0400, David Pullman wrote:
> I'm afraid I have no idea where to start looking for what the cause of
> this error may be.  We upgraded our display and bbtest server, which is a
> RHEL 4 x86_64 box, to 4.2.0 at the very end of June.  Since then we have
> the following about every ten minutes in /var/log/messages:
>
> Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at
> 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4

The only clue here is that it's hobbitd_history (the history logging
module) in Hobbit that crashes. There might be a core file from it
in ~hobbit/data/tmp/ - if there is, then running it through gdb as
described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport
would help.

There aren't any known issues with that module in 4.2.0, so I'm a bit
puzzled as to what might be the cause of this.

Anything in the ~hobbit/server/logs/history.log file?


Regards,
Henrik
list David Pullman · Wed, 18 Jul 2007 10:33:14 -0400 ·
Henrik Stoerner wrote:
> On Mon, Jul 16, 2007 at 10:15:31AM -0400, David Pullman wrote:
> > I'm afraid I have no idea where to start looking for what the cause of this error may be.  We upgraded our display and bbtest server, which is a RHEL 4 x86_64 box, to 4.2.0 at the very end of June.  Since then we have the following about every ten minutes in /var/log/messages:
> >
> > Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
>
> The only clue here is that it's hobbitd_history (the history logging
> module) in Hobbit that crashes. There might be a core file from it
> in ~hobbit/data/tmp/ - if there is, then running it through gdb as
> described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport
> would help.
>
> There aren't any known issues with that module in 4.2.0, so I'm a bit
> puzzled as to what might be the cause of this.
>
> Anything in the ~hobbit/server/logs/history.log file?
>
> Regards,
> Henrik

Henrik,

You pointed me in the right direction by suggesting the history log file. In there it complained that it could not open allevents.  I found that the allevents file had its ownership changed from hobbit to root.  Fixed that and no more segfaults.
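
The ownership check described above can be scripted so that a changed owner is noticed before the daemon starts crashing. This is a minimal Python sketch, not something shipped with Hobbit: check_owner is a hypothetical helper, and the path in the usage note is a placeholder, since this thread does not give the full location of allevents.

```python
import os
import pwd


def check_owner(path, expected_user):
    """Return (ok, actual_owner) for the file at path.

    ok is True when the file's owner matches expected_user.
    actual_owner is the user name, or the numeric uid as a string
    when the uid has no entry in the password database.
    """
    st = os.stat(path)
    try:
        actual = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        actual = str(st.st_uid)
    return actual == expected_user, actual
```

Calling check_owner("/path/to/allevents", "hobbit") would return (False, "root") in the situation described here. Restoring the owner with chown is still a manual step.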

I don't know how the ownership changed, but I'll find out; flogging will commence shortly :)

Thanks very much.

--David