
[unsolved] defunct problem on sun 5.7

2 messages in this thread

list Thomas Seglard · Wed, 6 Sep 2006 14:49:33 +0200 ·
Hello,

I wrote some weeks ago about a 'defunct' problem on a Solaris 7 client. Today I hit the same problem on another server. It worries me a bit, as I can't find any clue on how to solve it. Is it a bug in the hobbit client? A bug in the system? Has anyone seen this kind of error with hobbit or on Solaris? All the system commands respond normally when I launch them separately, and all my other applications on this server work normally. I really don't understand the problem.
Thanks for your help.
Best regards,

Thomas


Hello,

I am seeing strange behaviour on a server running Solaris 7. This is the output I get when I launch the client:

$ ./runclient.sh start
Hobbit client for sunos started on edsvideo
$ ps -efd|grep hobbit
  hobbit 21806 21805  0 11:46:21 ?        0:00 /usr/lib/sa/sadc 300 2
  hobbit 21898 21897  0 11:46:24 ?        0:00 vmstat 300 2
  hobbit 21780     1  0 11:46:20 ?        0:00 /opt/hobbit/client/bin/hobbitlaunch --config=/opt/hobbit/client/etc/clientlaunc
  hobbit 21785 21780  0                   0:00 <defunct>
  hobbit 21781 21780  0                   0:00 <defunct>
  hobbit 21784 21780  0                   0:00 <defunct>
  hobbit 21787 21780  0                   0:00 <defunct>
  hobbit 21783 21780  0 11:46:20 ?        0:00 /usr/bin/sh /opt/hobbit/client/ext/bb-sar.sh
  hobbit 21897     1  0 11:46:24 ?        0:00 sh -c vmstat 300 2 1>/opt/hobbit/client/tmp/hobbit_vmstat.edsvideo.21811 2>&1;
  hobbit 21904 21903  0 11:46:24 ?        0:00 iostat -c 300 2
  hobbit 21903     1  0 11:46:24 ?        0:00 sh -c iostat -c 300 2 1>/opt/hobbit/client/tmp/hobbit_iostatcpu.edsvideo.21811
  hobbit 15431 15421  0 10:51:57 pts/2    0:00 -ksh
  hobbit 21805 21783  0 11:46:21 ?        0:00 /usr/bin/sar -A -o /opt/hobbit/client/tmp/sar.21783 300 1
  hobbit 21786 21780  0                   0:00 <defunct>
  hobbit 21782 21780  0                   0:01 <defunct>

A few minutes later, I got this:

$ ps -efd|grep hobbit
  hobbit 21780     1  0 11:46:20 ?        0:00 /opt/hobbit/client/bin/hobbitlaunch --config=/opt/hobbit/client/etc/clientlaunc
  hobbit 21785 21780  0                   0:00 <defunct>
  hobbit 21781 21780  0                   0:00 <defunct>
  hobbit 21784 21780  0                   0:00 <defunct>
  hobbit 21787 21780  0                   0:00 <defunct>
  hobbit 21783 21780  0                   0:00 <defunct>
  hobbit 15431 15421  0 10:51:57 pts/2    0:00 -ksh
  hobbit 21786 21780  0                   0:00 <defunct>
  hobbit 21782 21780  0                   0:01 <defunct>

As a result, nothing is graphed on the hobbit server and all tests are purple. I have several other servers running Solaris 7 and everything works fine on those, so I'm posting here to find out whether anyone else has had this problem.
Sincerely,

Thomas
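
For readers unfamiliar with the `<defunct>` marker in the listings above: on Unix systems, `ps` shows a process as defunct (a "zombie") when it has exited but its parent has not yet collected its exit status with a `wait()`-family call. The sketch below only illustrates that general mechanism in Python. It is not taken from the hobbit client and does not diagnose why the children of hobbitlaunch stay defunct here; the function name `zombie_then_reap` and its parameters are made up for the illustration.

```python
import os
import time


def zombie_then_reap(exit_code=7, pause=0.2):
    """Fork a child that exits at once, leave it unreaped for `pause`
    seconds, then collect its exit status (hypothetical helper)."""
    pid = os.fork()
    if pid == 0:
        # Child: exit immediately without running any cleanup handlers.
        os._exit(exit_code)

    # Parent: until waitpid() is called, the exited child still occupies
    # a process-table slot and ps reports it as <defunct>.
    time.sleep(pause)

    # Reaping the child removes the <defunct> entry.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None


if __name__ == "__main__":
    print("child exit code:", zombie_then_reap())
```

Running `ps -ef` from another terminal during a longer `pause` should show the child as `<defunct>` with the Python process as its parent; the entry disappears once `waitpid()` returns.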


list John Glowacki · Wed, 06 Sep 2006 11:13:50 -0400 ·
I am seeing defunct processes on a few Solaris 7 clients too. Most Solaris 7 clients work OK. I have not had time to look into why. The client seems to run once, which sends the status to the hobbit server OK. Then the processes go defunct and a second status never gets sent. Results are purple after 30 minutes. My first guess is that it is a patch level issue. So far all working servers are at 106541-16 or higher.

# ps -ef | grep bb
       bb 22873     1  0 10:08:20 ?        0:00 /opt/hobbit/client/bin/hobbitlaunch --config=/opt/hobbit/client/etc/clientlaunc
       bb 22911 22909  0 10:08:22 ?        0:00 vmstat 300 2
       bb   549     1  0   Aug 19 ?        0:16 /opt/bbc1.9e-btf/bin/bbrun -a /opt/bbc1.9e-btf/bin/bb-local.sh
     root 22921 22195  0 10:08:33 pts/0    0:00 grep bb
       bb 22875 22873  1 10:08:20 ?        0:00 /opt/RICHPse/bin/se.sparcv9.5.7 /opt/hobbit/client/etc/hobbit.se
       bb 22909     1  0 10:08:22 ?        0:00 sh -c vmstat 300 2 1>/opt/hobbit/client/tmp/hobbit_vmstat.22878 2>&1; mv /opt/h
       bb 22876 22873  0                   0:00 <defunct>
       bb 22874 22873  0                   0:00 <defunct>

# cat clientlaunch.log
2006-09-06 10:08:20 hobbitlaunch starting
2006-09-06 10:08:20 Loading tasklist configuration from /opt/hobbit/client/etc/clientlaunch.cfg
# ps -ef | grep bb
       bb 22873     1  0 10:08:20 ?        0:00 /opt/hobbit/client/bin/hobbitlaunch --config=/opt/hobbit/client/etc/clientlaunc
       bb   549     1  0   Aug 19 ?        0:16 /opt/bbc1.9e-btf/bin/bbrun -a /opt/bbc1.9e-btf/bin/bb-local.sh
       bb 22875 22873  0 10:08:20 ?        0:00 /opt/RICHPse/bin/se.sparcv9.5.7 /opt/hobbit/client/etc/hobbit.se
       bb 22876 22873  0                   0:00 <defunct>
       bb 22874 22873  0                   0:00 <defunct>

John
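
In both sets of listings, every `<defunct>` entry has the hobbitlaunch process as its parent (PPID 21780 in the first report, 22873 in the second). When comparing several affected clients, a small script can do that grouping from saved `ps -ef` output. The following is a minimal sketch, assuming the column order seen above (UID, PID, PPID first, `<defunct>` as the last field); the function name `defunct_by_parent` is hypothetical, and the sample text is copied from the second listing in this message.

```python
from collections import defaultdict


def defunct_by_parent(ps_output):
    """Map parent PID -> list of defunct child PIDs from `ps -ef` text."""
    zombies = defaultdict(list)
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumed layout: UID PID PPID ...; zombies end with <defunct>.
        if len(fields) >= 4 and fields[-1] == "<defunct>":
            try:
                pid, ppid = int(fields[1]), int(fields[2])
            except ValueError:
                continue  # header line or unexpected format
            zombies[ppid].append(pid)
    return dict(zombies)


SAMPLE = """\
       bb 22873     1  0 10:08:20 ?        0:00 /opt/hobbit/client/bin/hobbitlaunch --config=/opt/hobbit/client/etc/clientlaunc
       bb 22875 22873  0 10:08:20 ?        0:00 /opt/RICHPse/bin/se.sparcv9.5.7 /opt/hobbit/client/etc/hobbit.se
       bb 22876 22873  0                   0:00 <defunct>
       bb 22874 22873  0                   0:00 <defunct>
"""

if __name__ == "__main__":
    for ppid, pids in defunct_by_parent(SAMPLE).items():
        print("parent", ppid, "has defunct children", pids)
```

Grouping by parent only shows which process has not collected its children; it says nothing about why, so the patch-level guess above remains untested.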