hobbitfetch infinite loop?
list Daniel J McDonald
Every couple of days I note that hobbitfetch is hung:
yellow Load is HIGH
System clock is 0 seconds off
top - 07:23:48 up 67 days, 15:36, 0 users, load average: 3.57, 5.42, 5.18
Tasks: 94 total, 4 running, 90 sleeping, 0 stopped, 0 zombie
Cpu(s): 47.8% us, 6.5% sy, 0.3% ni, 43.0% id, 1.8% wa, 0.1% hi, 0.5% si
Mem: 2074580k total, 2008140k used, 66440k free, 46652k buffers
Swap: 1052216k total, 2576k used, 1049640k free, 1325888k cached
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
30444 hobbit    25   0  2020 1016  616 R 96.3  0.0  3869:03 hobbitfetch
I then go kill -9 the hobbitfetch process, and it starts "working" again, but I get an error reported in the hobbitfetch status:
- Program crashed
Fatal signal caught!
Nothing is written to hobbitfetch.log. But I do get correct statuses
for the few hosts that I am polling.
I have 4 hosts that I poll using hobbitfetch. I think there will be
just two more. But they are all internet facing, so keeping them up is
a priority...
I would appreciate any suggestions as to where to begin troubleshooting
this, as I really need this to work.
--
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
Austin Energy
http://www.austinenergy.com
list Henrik Størner
On Sat, Jan 27, 2007 at 11:43:58AM -0600, Daniel J McDonald wrote:
> Every couple of days I note that hobbitfetch is hung:
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 30444 hobbit    25   0  2020 1016  616 R 96.3  0.0  3869:03 hobbitfetch
>
> I then go kill -9 the hobbitfetch process, and it starts "working" again,
Next time, please do a "kill -6" and run the resulting coredump through gdb to get the stacktrace (see the "Reporting bugs" online help). If possible, I'd also like to have a copy of the coredump and the hobbitfetch binary.

Regards,
Henrik
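For readers following along, here is a rough sketch of what the "kill -6" step does. It uses a throwaway sleep process as a stand-in for the runaway hobbitfetch; the paths in the trailing comments are placeholders, not the actual locations on any particular Hobbit install. Whether a core file is written depends on the core size limit (ulimit -c) in effect for the process that gets aborted.

```shell
# Stand-in for the spinning hobbitfetch process (hypothetical; on a real
# system, take the PID from top or ps instead of starting a sleep).
sleep 300 &
pid=$!

# Signal 6 is SIGABRT: the process aborts and, if core dumps are enabled
# for it (ulimit -c), leaves a core file behind. kill -9 (SIGKILL) never
# produces a core, which is why it gives nothing to debug.
kill -6 "$pid"

# Collect the exit status; a process terminated by a signal reports 128 + signal number.
status=0
wait "$pid" || status=$?
echo "exit status: $status"

# With a real core file, the stack trace would then come from gdb.
# Both paths below are assumptions; substitute your own:
#   gdb /path/to/hobbitfetch /path/to/core
#   (gdb) bt
```

The output of "bt" is the stacktrace asked for above; it shows which function the process was executing when it was aborted.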
list Daniel J McDonald
On Sat, 2007-01-27 at 23:31 +0100, Henrik Stoerner wrote:
> On Sat, Jan 27, 2007 at 11:43:58AM -0600, Daniel J McDonald wrote:
> > Every couple of days I note that hobbitfetch is hung:
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 30444 hobbit    25   0  2020 1016  616 R 96.3  0.0  3869:03 hobbitfetch
> >
> > I then go kill -9 the hobbitfetch process, and it starts "working" again,
>
> Next time, please do a "kill -6" and run the resulting coredump through gdb to get the stacktrace (see the "Reporting bugs" online help). If possible, I'd also like to have a copy of the coredump and the hobbitfetch binary.
I've sent you a couple of those in the past...

On a whim, I specified IP addresses for those servers (rather than 0.0.0.0) and the infinite loop issue went away. I still get the

- Program crashed
Fatal signal caught!

message on the hobbitfetch monitoring page, but it is still pulling down data much of the time.

--
Daniel J McDonald, CCIE # 2495, CISSP # 78281, CNX
Austin Energy
http://www.austinenergy.com