Xymon Mailing List Archive

procs monitor size fluctuations

4 messages in this thread

list Gus Ferrer · Thu, 9 Apr 2009 17:24:55 -0400 ·
Hi,

I started seeing weird behavior with one of my Hobbit clients 2 days ago and it's got me stumped. What's happening is that my 'procs' monitor is flipping between red and green every polling cycle. When it is red, the monitored process list shows about half my monitored processes are down. However, doing a 'ps' on the client itself shows that these processes are actually running. I noticed that during the red cycles, the active process list is consistently about 100 lines shorter than when the procs monitor is green, so the Hobbit server is marking the processes down since they aren't being reported by the client.
It doesn't look like the client messages are being truncated by the server (no messages regarding truncation in the hobbitd logs anyway), but I raised MAXMSG_STATUS and MAXMSG_CLIENT on the server, with no obvious effect. I also don't see any network issues between the client and the server, and no extreme loads on either. I'm stuck....

BTW - I'm running the 4.2.0 software. The client is Solaris 10, the  server is SuSE Enterprise 9.4.


This is a small snippet taken from my Hobbit server. First, from the histlogs directory: you can see the size of the file bouncing back and forth between about 27186 and 16714 bytes:

-rw-r--r--   1 hobbit users 27186 Apr  9 16:35 Thu_Apr_9_16:35:13_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:39 Thu_Apr_9_16:39:10_2009
-rw-r--r--   1 hobbit users 28108 Apr  9 16:40 Thu_Apr_9_16:40:11_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:44 Thu_Apr_9_16:44:14_2009
-rw-r--r--   1 hobbit users 27186 Apr  9 16:45 Thu_Apr_9_16:45:09_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:49 Thu_Apr_9_16:49:17_2009

And the corresponding bit from the hist directory shows the flapping  between red and green:

Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55

This is a very small example of what's happening, but it's been happening with the same regularity for 2 days now.
Does anyone have a clue what might be happening?

Thanks,
Gus
list Bruce White · Thu, 9 Apr 2009 17:30:45 -0500 ·
It sounds like the client is not getting a full ps listing.  Did you
check the contents of the hostdata file on the server for the client in
question?


  Bruce White
 Senior Enterprise Systems Engineer | Phone: XXX-XXX-XXXX | Fax: XXX-XXX-XXXX | user-58f975e8bf9d@xymon.invalid | http://www.fellowes.com/
list Thomas R. Brand · Fri, 10 Apr 2009 10:03:10 -0400 ·
quoted from Bruce White
-----Original Message-----
...  my 'procs'
monitor is changing from red to green every polling cycle

-rw-r--r--   1 hobbit users 27186 Apr  9 16:35 Thu_Apr_9_16:35:13_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:39 Thu_Apr_9_16:39:10_2009
-rw-r--r--   1 hobbit users 28108 Apr  9 16:40 Thu_Apr_9_16:40:11_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:44 Thu_Apr_9_16:44:14_2009
-rw-r--r--   1 hobbit users 27186 Apr  9 16:45 Thu_Apr_9_16:45:09_2009
-rw-r--r--   1 hobbit users 16714 Apr  9 16:49 Thu_Apr_9_16:49:17_2009

I've seen this behavior when a test system was reporting to the server
using the same hostname as a production system.

Notice the timestamps on the files: 16:39 & 16:40, 16:44 & 16:45...

System 'a' reports at "Thu_Apr_9_16:44:14", followed a minute later by
system 'b', which reports at "Thu_Apr_9_16:45:09".
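
[Editor's note] The timing argument above can be checked against the hist entries posted earlier in the thread. The following is only a rough sketch: it assumes each hist line has the layout shown in the original post (date, color, start time in epoch seconds, duration), and the function name is made up for illustration. If two separate clients are reporting under one hostname, each color should recur at roughly the normal polling interval on its own.

```python
from collections import defaultdict

# Hist entries copied from the first message in this thread.
HIST_LINES = """\
Thu Apr  9 16:35:13 2009 green 1239309313 237
Thu Apr  9 16:39:10 2009 red 1239309550 61
Thu Apr  9 16:40:11 2009 green 1239309611 243
Thu Apr  9 16:44:14 2009 red 1239309854 55
Thu Apr  9 16:45:09 2009 green 1239309909 248
Thu Apr  9 16:49:17 2009 red 1239310157 55
"""


def intervals_by_color(hist_text):
    """Return {color: [seconds between consecutive entries of that color]}.

    Assumes whitespace-separated fields where field 6 is the color and
    field 7 is the start time in epoch seconds, as in the sample above.
    """
    starts = defaultdict(list)
    for line in hist_text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or malformed lines
        color, epoch = fields[5], int(fields[6])
        starts[color].append(epoch)
    return {
        color: [later - earlier for earlier, later in zip(times, times[1:])]
        for color, times in starts.items()
    }


print(intervals_by_color(HIST_LINES))
# -> {'green': [298, 298], 'red': [304, 303]}
```

Both the green and the red entries repeat about every 300 seconds, which fits two independent reporters each on its own 5-minute cycle better than one client alternating its output.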


Check the procs web page for the system; just above the graph, it states
which client IP reported the message:
	Status message received from 172.28.60.11
See if this changes when the screen flips color.
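
[Editor's note] If several saved status pages are at hand, the same check can be scripted. This is a minimal sketch, not part of Hobbit itself: it assumes the saved text contains the "Status message received from <IP>" line exactly as quoted above, the function name is hypothetical, and the second address (192.0.2.99) is an invented placeholder, not a value from this thread.

```python
import re

# Matches the sender line as it is quoted in this thread.
SENDER_RE = re.compile(
    r"Status message received from (\d{1,3}(?:\.\d{1,3}){3})"
)


def senders_by_color(messages):
    """Map each status color to the set of sender IPs seen with it.

    `messages` is an iterable of (color, page_text) pairs.
    """
    result = {}
    for color, text in messages:
        for ip in SENDER_RE.findall(text):
            result.setdefault(color, set()).add(ip)
    return result


# Sample input: the first IP is from the thread, the second is a placeholder.
SAMPLE = [
    ("green", "procs OK\nStatus message received from 172.28.60.11\n"),
    ("red", "procs NOT ok\nStatus message received from 192.0.2.99\n"),
    ("green", "procs OK\nStatus message received from 172.28.60.11\n"),
]

by_color = senders_by_color(SAMPLE)
all_senders = set().union(*by_color.values())
if len(all_senders) > 1:
    print("More than one host is reporting under this hostname:")
    for color, ips in sorted(by_color.items()):
        print(f"  {color}: {', '.join(sorted(ips))}")
```

More than one distinct sender address for a single hostname points to the duplicate-hostname situation described above.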

Tom
list Gus Ferrer · Fri, 10 Apr 2009 10:52:03 -0400 ·
Good catch! That's exactly what was happening. A misconfigured start-up script was setting the hostname to another server's name.

Thanks very much,
Gus