Xymon Mailing List Archive

False Process Down Alerts

Josh Luthman
Sun, 17 Jan 2010 18:21:15 -0500
Message-Id: <user-a7a1a92efcf4@xymon.invalid>

Is there only one client sending data as this name?  I don't think you
answered Lars' email.

What does the alert read and what does the data say?  Missing process?  Too
high of a load?

Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

"The secret to creativity is knowing how to hide your sources."
--- Albert Einstein


On Sun, Jan 17, 2010 at 6:11 PM, Chris Naude <user-aaac7867ee41@xymon.invalid> wrote:
The problem has suddenly become much much worse. I verified with tcpdump
that the data coming from the client is 100% correct. It seems something on
the Xymon server side is not handling the client data correctly. Anyone have
any other ideas?

[image: red] 89%     /testdb3 (37771472% used) has reached the PANIC level (95%)

Filesystem            1024-blocks  Used  Available Capacity Mounted on
/dev/vgtestdb1/lvol1    107844344 70901816 36942528    66%     /testdb1
/dev/vgtestdb2/lvol1    35962064 25453128 10508936    71%     /testdb2
/dev/vgtestdb4/lvol1    970909400 825006344 145903056    85%     /testdb4
/dev/vgtestdb3/lv
l1 ]  338788224 301016752 37771472    89%     /testdb3
/dev/vgtestdb5/lvol1    179789048 150553912 29235136    84%     /testdb5
/dev/vg00/lvol8       24580711    74501 24506210     1%     /home
/dev/vg00/lvol4       10226680  6339283  3887397    62%     /opt
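
One way to narrow down where the rows get damaged is to check every row of a captured disk report against the shape a normal df row has. The following is only a minimal Python sketch and is not part of Xymon: the helper name `malformed_df_rows` and the six-column rule are assumptions made for illustration, based on the listing above.

```python
from typing import List


def malformed_df_rows(report: str) -> List[str]:
    """Return the data rows that do not look like a normal df row.

    Assumption: every data row has six whitespace-separated fields:
    device, total blocks, used, available, capacity (NN%), mount point.
    """
    bad = []
    # Drop empty lines, then skip the first remaining line (the header).
    lines = [ln for ln in report.splitlines() if ln.strip()]
    for line in lines[1:]:
        fields = line.split()
        ok = (
            len(fields) == 6
            and all(f.isdigit() for f in fields[1:4])
            and fields[4].endswith("%")
            and fields[4][:-1].isdigit()
            and fields[5].startswith("/")
        )
        if not ok:
            bad.append(line)
    return bad
```

Running this on the same report captured on the client and as displayed by the server would show on which side the broken rows first appear. Mount points containing spaces would be flagged as well, so treat a hit as a hint rather than proof.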


On Sat, Jan 16, 2010 at 10:44 AM, Chris Naude <user-aaac7867ee41@xymon.invalid> wrote:
That makes a lot of sense. I did have some issues with the startup scripts
on HP-UX. I'll check it out later tonight. Hopefully I can get it fixed
before it goes live tonight. Thanks!


On Sat, Jan 16, 2010 at 7:56 AM, Lars Ebeling <user-1fecd3eafd52@xymon.invalid> wrote:
It looks like two instances of the client are writing to the file at
the same time or almost ;)

Lars
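
A quick way to test the two-instances theory is to count how many client launcher processes show up in a process listing taken on the affected host. This is a minimal Python sketch, not a Xymon tool: `count_launchers` is a hypothetical helper, and it assumes the launcher appears as `hobbitlaunch` in the command column, as it does in the listing quoted in the original message.

```python
def count_launchers(ps_output: str, name: str = "hobbitlaunch") -> int:
    """Count lines of a ps listing that run the named launcher binary.

    Assumption: the binary appears as its own whitespace-separated
    field, with or without a leading directory path.
    """
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        # Compare the basename of each field, so that
        # /opt/xymon/client/bin/hobbitlaunch matches "hobbitlaunch"
        # but --config=.../clientlaunch.cfg does not.
        if any(f.rsplit("/", 1)[-1] == name for f in fields):
            count += 1
    return count
```

A result greater than 1 for a single client installation would support the idea that two launchers are running side by side. If the listing was produced through a `grep hobbitlaunch` pipeline, the grep process itself would be counted too, so feed it the unfiltered listing.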

----- Original Message -----
 *From:* Chris Naude <user-aaac7867ee41@xymon.invalid>
*To:* user-ae9b8668bcde@xymon.invalid
*Sent:* Saturday, January 16, 2010 4:59 AM
*Subject:* [hobbit] False Process Down Alerts

I've run into a strange problem with my Xymon server. I noticed today that
I'm receiving random false alerts for processes being down. When I look at
the process list output in the alert it looks as if the data coming from the
clients isn't correct. Here is an example. Has anyone seen anything like
this?

 9613  1944 root      Jan 11  S 154  0.00 00:00:00    6128 cmclconfd -c
10389  1944 root      Jan 11  S 154  0.00 00:00:00    6128 cmclconfd -c
 9794     1 oracle   10:55:57 S 154  0.00 00:00:0
  217600]oracleTEST (LOCAL=NO)
 1592     1 oracle    Jan 11  S 154  0.00 00:00:11  217136 ora_mman_TEST
12751  1944 root      Jan 11  S 154  0.00 00:00:00    6128 cmclconfd -c
 8965  1944 root      Jan 11  S 154  0.00 00:00:00    6128 cmclconfd -c


11819     1 oracle    Jan 12  S 154  0.00 00:00:07  217280 ora_j015_TEST
 2711     1 roo
      ]ec  4  S 120  0.04 00:02:16     868 /usr/sbin/xntpd
 3547     1 xymon     Dec  4  S 168  0.00 00:00:43     268 /opt/xymon/client/bin/hobbitlaunch --config=/opt/xymon/client/etc/clientlaunch.cfg --log=/opt/xymon/client/logs/clientlaunch.log --pidfile=/opt/xymon/client/logs/clientlaunch.101.example.com.pid
 3728     1 root      Dec  4  R 152  0.00 00:00:37    4208 /usr/sbin/stm/uut/bin/tools/monitor/WbemWrapperMonitor


Xymon version: 4.3.0-0.beta2
Xymon server: CentOS 5.4 32 bit

Client: HP-UX 11.31 Itanium

--
Chris Naude
