Error in logs after recent update to 4.2.0
list David Pullman
I'm afraid I have no idea where to start looking for what the cause of this error may be. We upgraded our display and bbtest server, which is a RHEL 4 x86_64 box, to 4.2.0 at the very end of June. Since then we have had the following about every ten minutes in /var/log/messages:

Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:48:41 frodo-70 kernel: hobbitd_history[28148]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:48:50 frodo-70 kernel: hobbitd_history[28161]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:58:55 frodo-70 kernel: hobbitd_history[28899]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:00 frodo-70 kernel: hobbitd_history[28905]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:06 frodo-70 kernel: hobbitd_history[28915]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:10 frodo-70 kernel: hobbitd_history[28928]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:15 frodo-70 kernel: hobbitd_history[28934]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
Jul 16 09:59:20 frodo-70 kernel: hobbitd_history[28940]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4

This system ran 4.1.2p1 for more than a year with no issues. I did an upgrade in place. No changes otherwise.

I would be very grateful for even the slightest hint on how to start looking for the problem.

Thanks very much

--
David Pullman
Systems Administrator
Manufacturing Engineering Laboratory
National Institute of Standards & Technology
Mail Stop XXXX
XXX Bureau Drive
Gaithersburg, MD XXXXX-XXXX
Tel: (XXX) XXX-XXXX
Fax: (XXX) XXX-XXXX
E-mail: user-27782d82b327@xymon.invalid
list Henrik Størner
On Mon, Jul 16, 2007 at 10:15:31AM -0400, David Pullman wrote:
I'm afraid I have no idea where to start looking for what the cause of this error may be. We upgraded our display and bbtest server, which is a RHEL 4 x86_64 box, to 4.2.0 at the very end of June. Since then we have had the following about every ten minutes in /var/log/messages:

Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at 0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4
The only clue here is that it's hobbitd_history (the history logging module) in Hobbit that crashes. There might be a core file from it in ~hobbit/data/tmp/ - if there is, then running it through gdb as described in http://www.hswn.dk/hobbit/help/known-issues.html#bugreport would help.

There aren't any known issues with that module in 4.2.0, so I'm a bit puzzled as to what might be the cause of this. Anything in the ~hobbit/server/logs/history.log file?

Regards,
Henrik
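The kernel line itself carries a little more information than the process name. The sketch below is a rough illustration only, not part of Hobbit: it parses a line in the format quoted in this thread and decodes the trailing error value, assuming the standard x86 page-fault error-code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The function name `decode_segfault` is made up for this example.

```python
import re

# Matches the x86_64 kernel segfault line format quoted in this thread.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<prog>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"segfault at (?P<addr>[0-9a-f]+) "
    r"rip (?P<rip>[0-9a-f]+) rsp (?P<rsp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # the kernel prints this value in hex
    return {
        "program": m.group("prog"),
        "pid": int(m.group("pid")),
        "address": int(m.group("addr"), 16),
        "page_present": bool(err & 1),                # bit 0
        "access": "write" if err & 2 else "read",     # bit 1
        "mode": "user" if err & 4 else "kernel",      # bit 2
    }


line = (
    "Jul 16 09:48:36 frodo-70 kernel: hobbitd_history[28138]: segfault at "
    "0000000000000000 rip 0000003e05f5ce3f rsp 0000007fbfff5020 error 4"
)
info = decode_segfault(line)
print(info)
```

Under those assumptions, "segfault at 0000000000000000 ... error 4" reads as a user-mode read of address zero, which is what a null pointer dereference typically looks like.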
list David Pullman
Henrik,

You pointed me in the right direction by suggesting the history log file. In there it complained that it could not open allevents. I found that the allevents file had its ownership changed from hobbit to root. Fixed that and no more segfaults.

I don't know how the ownership changed, but I'll find out; flogging will commence shortly :)

Thanks very much.

--David
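For anyone hitting the same symptom, a quick way to spot files whose ownership has drifted is to walk the data directory and compare each owner against the expected user. This is only a sketch under assumptions: the helper name `files_not_owned_by` is hypothetical, and the directory and account in the usage comment (`~hobbit/data`, user `hobbit`) are taken from this thread and may differ on other installations.

```python
import os


def files_not_owned_by(root, expected_uid):
    """Return sorted (path, uid) pairs under root not owned by expected_uid."""
    mismatched = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that a symlink is reported itself, not its target
            uid = os.lstat(path).st_uid
            if uid != expected_uid:
                mismatched.append((path, uid))
    return sorted(mismatched)


# Example usage (assumed paths and account name, adjust to your install):
#
#   import pwd
#   hobbit_uid = pwd.getpwnam("hobbit").pw_uid
#   data_dir = os.path.expanduser("~hobbit/data")
#   for path, uid in files_not_owned_by(data_dir, hobbit_uid):
#       print(uid, path)
```

The function only reports mismatches and changes nothing, so it is safe to run before deciding what to chown.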