Xymon Mailing List Archive

phantom processes causing false alerts

list Greg Shea
Mon, 12 Apr 2010 16:56:08 +0000 (UTC)
Message-Id: <user-f380c635bd24@xymon.invalid>

DARN HTML Email
----- user-f00ed6e065e8@xymon.invalid wrote: 

----- "Odinn" <user-385fc81b8782@xymon.invalid> wrote:
> It sounds like your hobbit server is in several different groups in
> hobbit-clients.cfg. For each group that it matches, it adds the checks
> from that group to its list of checks. It doesn't match one group and
> then leave hobbit-clients.cfg; it's additive all the way to the end for
> every group that matches.
> example
> PAGE=servers
>   PROC sshd
> HOST=hobbitsrv
>   PROC sshd
>   PROC hobbitd
> HOST=%(.*)srv(.*)
>   PROC telnetd
> if hobbitsrv is in the servers page, then it will check procs for sshd, sshd (again), hobbitd, and telnetd
>  --
> Jim Sloan
> Just remember, today is the day you thought tomorrow was going to be yesterday.
> ----- Original Message ----
> From: "user-762ee872a5a4@xymon.invalid" <user-762ee872a5a4@xymon.invalid>
> To: user-ae9b8668bcde@xymon.invalid
> Cc: user-762ee872a5a4@xymon.invalid
> Sent: Fri, April 9, 2010 5:35:15 PM
> Subject: [hobbit] phantom processes causing false alerts
> Hi all,
> I have a strange problem with procs and am reaching out to the group for
> help. In the picture below there are numerous 'cron' and 'sshd' processes
> showing up for this particular host (the hobbitserver).
> The problem is, all these processes, and in particular the ones in error
> (red), don't belong on this server.
> I have tried to locate where these phantom processes are coming from,
> searching the hist, histlog, and hostdata directories, but I'm stuck.
> The Client Data file does not have any of these processes.
> Any assistance would be appreciated
> Thanks
> Gregory R Shea
> EMC Corporation
> =========================================================================
> Entry from hobbit-alerts.cfg
> ## HobbitServer
> GROUP=HHEARTBEAT
>         SCRIPT /apps/hobbit/server/etc/hb-alert.pl heartbeat
> =========================================================================
> Entry from hobbit-clients.cfg
> HOST=hobbitserver
>         LOAD 40.0 45.0
>         MEMPHYS 101 102
>         MEMACT  98 99
>         MEMSWAP 97 98
>         #PROC hobbitd_channel
>         PROC heartbeat 1 1 yellow GROUP=HHEARTBEAT
>         #PROC sendmail TRACK=sendmail
>         FILE /apps/hobbit/server/etc/bb-hosts yellow MTIME>600 TRACK
>         FILE /apps/hobbit/server/etc/hobbit-alerts.cfg yellow MTIME>600 TRACK
>         FILE /apps/hobbit/server/etc/hobbit-clients.cfg yellow MTIME>600 TRACK
>         PORT LOCAL=0.0.0.0:1984 TEXT=HobbitD
> ==========================================================================
> http://hobbitserver/hobbit-cgi/bb-hostsvc.sh?HOST=hobbitmon&SERVICE=procs
> Mon Mar 22 20:55:39 EDT 2010 - Processes NOT ok
>   heartbeat (found 0, req. 1 or more)                This should be YELLOW
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   ns-slapd (found 0, req. 1 or more)                These 2 are RED
>   ns-httpd (found 0, req. 1 or more)                These 2 are RED
>   cron (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   /usr/sbin/sshd (found 1, req. 1 or more)
>   crond (found 1, req. 1 or more)
>   sshd (found 5, req. 1 or more)
>   PID  PPID USER      STARTED S PRI %CPU     TIME %MEM  RSZ   VSZ CMD
>     1     0 root       Mar 21 S  24  0.0 00:00:02  0.0  568  4772 init [3]
>     2     1 root       Mar 21 S 139  0.0 00:00:13  0.0    0     0 [migration/0]
>     3     1 root       Mar 21 S   5  0.0 00:00:02  0.0    0     0 [ksoftirqd/0]
>     4     1 root       Mar 21 S 139  0.0 00:00:19  0.0    0     0 [migration/1]
>     5     1 root       Mar 21 S   5  0.0 00:00:03  0.0    0     0 [ksoftirqd/1]
>     6     1 root       Mar 21 S 139  0.0 00:00:27  0.0    0     0 [migration/2]
>     7     1 root       Mar 21 S   5  0.0 00:00:03  0.0    0     0 [ksoftirqd/2]
>     8     1 root       Mar 21 S 139  0.0 00:00:23  0.0    0     0 [migration/3]
>     9     1 root       Mar 21 S   5  0.0 00:00:04  0.0    0     0 [ksoftirqd/3]
>    10     1 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [events/0]
>    11     1 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [events/1]
>    12     1 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [events/2]
>    13     1 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [events/3]
>    14    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [khelper]
>    15    10 root       Mar 21 S  24  0.0 00:00:00  0.0    0     0 [kacpid]
>    73    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [kblockd/0]
>    74    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [kblockd/1]
>    75    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [kblockd/2]
>    76    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [kblockd/3]
>    77     1 root       Mar 21 S  24  0.0 00:00:00  0.0    0     0 [khubd]
>   112    10 root       Mar 21 S  19  0.0 00:00:00  0.0    0     0 [pdflush]
>   113    10 root       Mar 21 S  24  0.0 00:01:46  0.0    0     0 [pdflush]
>   114     1 root       Mar 21 S  24  0.0 00:00:35  0.0    0     0 [kswapd0]
>   115    10 root       Mar 21 S  30  0.0 00:00:00  0.0    0     0 [aio/0]
>   116    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [aio/1]
>   117    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [aio/2]
>   118    10 root       Mar 21 S  34  0.0 00:00:00  0.0    0     0 [aio/3]
>   262     1 root       Mar 21 S  18  0.0 00:00:00  0.0    0     0 [kseriod]
>   310 32684 hobbit   20:56:55 S  23  0.0 00:00:00  0.0  944  7424 /usr/sbin/ntpq -n -c rv xxx.xxx.xxx.xxx
>   316 32667 hobbit   20:56:55 S  18  0.0 00:00:00  0.0  364  2608
> /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.mailq
> green Mon Mar 22 20:56:55 EDT 2010?Mail queue contains 0 requests?
>   337 32675 hobbit   20:56:55 S  21  0.0 00:00:00  0.0  364  2608
> /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.sendmail
> green Mon Mar 22 20:56:55 EDT 2010  <sendmail>?Statistics from Tue Dec
> 12 04:02:02 2006? M   msgsfr  bytes_from   msgsto    bytes_to  msgsrej
> msgsdis msgsqur  Mailer? 3        2          2K        0          0K
> 0       0       0  smtp? 4    16709      82093K    16515      80675K 0
> 0       0  esmtp? 7        0          0K  4686309    9618429K 0       0
> 0  relay? 8        0          0K        2          2K 0       0       0
> procmail? 9  4695543   10022009K     8595    1678935K 0       0       0
> local?==================================================================
> ===? T  4712254   10104104K  4711421   11378041K        0       0 0? C
> 4640434              4695450                    0
>   356 32661 hobbit   20:56:55 S  21  0.0 00:00:00  0.0  364  2608
> /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.temp
> green Mon Mar 22 20:56:54 EDT 2010 Temperature status: ?Device Temp(C)
> Temp(F)
> Threshold(C)?------------------------------------------------------?&gre
> en System Board Ambient Temp    22       71
> 42?------------------------------------------------------?Status green:
> All devices look okay
>   385  4788 apache   18:41:00 S  23  0.0 00:00:00  0.0 7836 119508 /usr/sbin/httpd
>   386  4788 apache   18:41:00 S  23  0.0 00:00:00  0.0 7840 119508 /usr/sbin/httpd
>   448 32663 hobbit   20:56:55 Z  23  0.0 00:00:00  0.0    0     0 [letstat.pl] <defunct>
>   505     1 root       Mar 21 S  20  0.0 00:00:00  0.0    0     0 [scsi_eh_0]
>   522 32663 hobbit   20:56:55 S  22  0.0 00:00:00  0.0  364  2608
> /apps/hobbit/client/bin/bb xxx.xxx.xxx.xxx status hobbitserver.network
> green Mon Mar 22 20:56:54 EDT 2010 - EMS - ?No Recent Network
> Input/Output Errors detected. ??eth0: autoneg on, 1GB/s-FDX, link
> ok.?eth1: autoneg on, 1GB/s-FDX, link ok.? ? ?Name   Mtu   Net/Dest
> Address               Ipkts        Ierrs  Opkts        Oerrs  Collis
> Queue ?lo     16436  loopback             loopback             381956450
> 38195645     0      0        0    ?eth1   16896  hobbitserver
> hobbitserver            5078278      0      2000647      0      0 0
> ?eth0   16896  hobbitserver         hobbitserver 203321736    0
> 135196037    0      0        0    ?
>   556     1 root       Mar 21 S  24  0.0 00:01:29  0.0    0     0
xx SNIP xx

Hi Jim,

Thanks for the response. That was the first place I checked, but no luck. The Hobbit server is all by itself: no includes, no pages, no wildcards.
The URL http://hobbitserver/hobbit-cgi/bb-hostsvc.sh?HOST=hobbitserver&SERVICE=procs uses hobbitsvc.cgi, and the man page states: "hobbitsvc.cgi is a CGI program to present a Hobbit status log in HTML form". I went through all the logs and deleted "procs" for hobbitserver, but that did not help. I also looked to see if somehow the logs had the same inode, but again no luck.
This is driving me crazy 
Thanks again
Gregory R Shea
EMC Corporation
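
For anyone reading this thread later: the additive matching that Jim describes in the quoted reply can be modelled with a short sketch. This is only a toy model of his example, not Xymon's actual parser. The `RULES` table, the function names, and the treatment of a leading `%` as a regular expression marker are assumptions made for illustration.

```python
import re

# Toy model of the three rule groups in Jim's example (hypothetical data
# structure; the real hobbit-clients.cfg is parsed by the Hobbit server).
RULES = [
    ({"PAGE": "servers"}, ["sshd"]),
    ({"HOST": "hobbitsrv"}, ["sshd", "hobbitd"]),
    ({"HOST": "%(.*)srv(.*)"}, ["telnetd"]),
]


def selector_matches(pattern, value):
    """Assumption: a leading '%' marks a regex; anything else is an exact match."""
    if pattern.startswith("%"):
        return re.search(pattern[1:], value) is not None
    return pattern == value


def collect_procs(host, page):
    """Walk every rule group and accumulate PROC checks from all that match."""
    attrs = {"HOST": host, "PAGE": page}
    checks = []
    for selectors, procs in RULES:
        if all(selector_matches(p, attrs[k]) for k, p in selectors.items()):
            # No early exit: later matching groups keep adding checks.
            checks.extend(procs)
    return checks


if __name__ == "__main__":
    print(collect_procs("hobbitsrv", "servers"))
```

With the host `hobbitsrv` on the `servers` page, the sketch collects sshd twice, then hobbitd and telnetd, which is the duplication Jim points out. A host that matches none of the groups collects nothing, so repeated entries in a procs status would suggest more than one group is matching.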