Xymon Mailing List Archive

Loss of Apache graphs

list Werner Michels
Fri, 24 Feb 2006 09:06:04 -0300
Message-Id: <user-fb1c8909fbbc@xymon.invalid>

Thomas,

	Anyway, as long as the shared memory is not working, you'll
likely have trouble.

	Have you already gone through the IPC/shared memory
verification? If not, take a look at the one described by Henrik at

http://www.hswn.dk/hobbiton/2005/10/msg00200.html

	IPC also has a dedicated section in the install manual
(http://www.hswn.dk/hobbit/help/install.html).
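
As a quick sanity check that is independent of Hobbit, the sketch below asks the kernel for a private System V shared memory segment of 262144 bytes (the size in the "Could not get shm" log lines quoted further down) and removes it again. This is not the procedure from Henrik's message linked above. probe_shm is a hypothetical helper, and the IPC_* constants are the Linux values, which is an assumption about the platform.

```python
import ctypes
import os

# Assumption: Linux values for the System V IPC constants.
IPC_PRIVATE = 0
IPC_CREAT = 0o1000
IPC_RMID = 0


def probe_shm(size=262144):
    """Try to create and immediately remove a private shm segment.

    Returns (True, "ok") on success, or (False, <error text>) on failure.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.shmget.argtypes = [ctypes.c_int, ctypes.c_size_t, ctypes.c_int]
    libc.shmget.restype = ctypes.c_int
    libc.shmctl.argtypes = [ctypes.c_int, ctypes.c_int, ctypes.c_void_p]
    libc.shmctl.restype = ctypes.c_int

    shmid = libc.shmget(IPC_PRIVATE, size, IPC_CREAT | 0o600)
    if shmid == -1:
        return False, os.strerror(ctypes.get_errno())

    # Mark the segment for removal so the probe leaves nothing behind.
    libc.shmctl(shmid, IPC_RMID, None)
    return True, "ok"


if __name__ == "__main__":
    print(probe_shm())
```

If the probe fails, the error text points at a kernel-level limit or permission problem. If it succeeds, the kernel can hand out a segment of that size to this user. Note that shmget() reports "No such file or directory" (ENOENT) when no segment exists for the requested key and IPC_CREAT was not given, so that log message may mean a worker looked for a segment that had not been created; that is an inference from the error text, not something confirmed in this thread.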

	Hope this helps.

-wm


On Thu, 23 Feb 2006 16:45:37 +0100
Thomas <user-97316fb2dd2a@xymon.invalid> wrote:
Yes, that's right, mine are usually around 2 GB when I get into problems, so this is not it. Also, I don't have this in my logs:

2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
2006-02-09 15:31:54 Could not get shm of size 262144: No such file or directory
2006-02-09 15:31:54 Channel not available

so something else is wrong.

Rob Munsch wrote:
... are these not the rrd logs you meant..?  Can't really find any others... also, if it's that, why is only one host affected?

Rob Munsch wrote:
Hmm.

Well, there are two webservers; one is showing apache graphs, the other isn't.
Went ahead and added 'em to a (pretty aggressive) rotation schedule; rrd-data.log is 600k, rrd-status was about 3.5M.  Just in case, rrd-status is now limited to 1M.

Stopped and restarted the server, but no apparent effect.  The server that had its Apache graphs still does, and the one that doesn't, doesn't.

Here are some recent rrd log entries, if that sheds any light.  "Mo" is the server with the graphs, "ws-1" is the one without:

rrd-data.log

2006-02-03 02:55:17 RRD error updating /home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal attempt to update using time 1138953317 when last update time is 1138953317 (minimum one second step)
2006-02-03 02:55:17 RRD error updating /home/hobbit/data/rrd/mo/apache.rrd from 10.10.10.47: illegal attempt to update using time 1138953317 when last update time is 1138953317 (minimum one second step)
2006-02-03 02:58:25 RRD error updating /home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal attempt to update using time 1138953505 when last update time is 1138953505 (minimum one second step)
2006-02-03 02:58:25 RRD error updating /home/hobbit/data/rrd/mo/apache.rrd from 10.10.10.47: illegal attempt to update using time 1138953505 when last update time is 1138953505 (minimum one second step)
2006-02-09 04:04:12 Could not get shm of size 262144: No such file or directory
2006-02-09 04:04:12 Channel not available
2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
2006-02-09 15:31:54 Could not get shm of size 262144: No such file or directory
2006-02-09 15:31:54 Channel not available
2006-02-16 11:18:57 Tried to down BOARDBUSY: Invalid argument
2006-02-16 11:18:57 Worker process died with exit code 0, terminating
2006-02-22 11:41:59 Tried to down BOARDBUSY: Invalid argument
root at randomaccess /var/log/hobbit #

rrd-status.log (the former 3.5M log - current is an empty file with no entries post-rotate)

2006-02-09 15:24:36 Tried to down BOARDBUSY: Invalid argument
2006-02-09 15:31:54 Could not get shm of size 262144: No such file or directory
2006-02-09 15:31:54 Channel not available
2006-02-09 22:25:38 RRD error updating /home/hobbit/data/rrd/randomaccess/bbgen.rrd from 10.10.10.47: illegal attempt to update using time 1139541938 when last update time is 1139545383 (minimum one second step)
2006-02-16 11:18:57 Tried to down BOARDBUSY: Invalid argument
root at randomaccess /var/log/hobbit #

Not sure what's going on here.
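
RRDtool rejects any update whose timestamp is not strictly later than the last one stored in the file, which is what the "illegal attempt to update" lines report. A rough sketch for sorting such lines into "same second" and "older than last update" cases follows. classify and the two sample constants are hypothetical names; the sample lines are copied from the logs above with the line wrapping removed.

```python
import re

# Matches the "illegal attempt to update" lines quoted in this thread.
RRD_ERR = re.compile(
    r"^(?P<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) RRD error updating "
    r"(?P<rrd>\S+) from (?P<sender>\S+): illegal attempt to update using "
    r"time (?P<attempt>\d+) when last update time is (?P<last>\d+)"
)

SAMPLE_DUPLICATE = (
    "2006-02-03 02:55:17 RRD error updating "
    "/home/hobbit/data/rrd/ws-1/apache.rrd from 10.10.10.47: illegal "
    "attempt to update using time 1138953317 when last update time is "
    "1138953317 (minimum one second step)"
)
SAMPLE_STALE = (
    "2006-02-09 22:25:38 RRD error updating "
    "/home/hobbit/data/rrd/randomaccess/bbgen.rrd from 10.10.10.47: illegal "
    "attempt to update using time 1139541938 when last update time is "
    "1139545383 (minimum one second step)"
)


def classify(line):
    """Return details for an RRD update error line, or None if no match."""
    m = RRD_ERR.match(line)
    if m is None:
        return None
    attempt, last = int(m.group("attempt")), int(m.group("last"))
    if attempt == last:
        kind = "duplicate"   # second update within the same second
    elif attempt < last:
        kind = "stale"       # update is older than what the file holds
    else:
        kind = "other"
    return {"rrd": m.group("rrd"), "kind": kind, "behind_by": last - attempt}


if __name__ == "__main__":
    for sample in (SAMPLE_DUPLICATE, SAMPLE_STALE):
        print(classify(sample))
```

In the excerpts above, the same-second errors appear for both ws-1 and mo at identical times, so on their own they do not seem to explain why only ws-1 lost its graphs. The bbgen.rrd entry is a different case: the attempted update is 3445 seconds older than the last stored one.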

Thomas wrote:
Hi Rob,

I don't know if this can help you, but every time I have had problems with missing graphs it's been because the rrd logfiles were too big.

Just some info..
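
One way to keep an eye on this is a small script that lists the rrd logs above a size limit. This is only a sketch: oversized_logs is a hypothetical helper, the "rrd-*.log" pattern is inferred from the file names rrd-data.log and rrd-status.log mentioned in this thread, and the default 1 MB limit is an arbitrary choice.

```python
from pathlib import Path


def oversized_logs(log_dir, limit_bytes=1_000_000, pattern="rrd-*.log"):
    """Return (file name, size in bytes) for matching logs above the limit."""
    hits = []
    for path in sorted(Path(log_dir).glob(pattern)):
        size = path.stat().st_size
        if size > limit_bytes:
            hits.append((path.name, size))
    return hits


if __name__ == "__main__":
    # Assumption: the logs live in /var/log/hobbit, as in the shell prompt
    # shown elsewhere in this thread.
    for name, size in oversized_logs("/var/log/hobbit"):
        print(f"{name}: {size} bytes")
```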

/Thomas

Rob Munsch wrote:
Hello,

There are two webservers being monitored by hobbit (among many other different servers).
Both have bb-hosts entries that are nearly identical.  Both have the same version of the client on them (4.1.2p1).  Both seem to be working perfectly well in all other respects - both internal (CPU, disk etc) and external (conn, http) tests seem to be working, and have graphs.

However on one, the apache trends show up as expected, and on the other, they have stopped graphing.  Current values for the graphless one are good ol' "nan," but *just* for the 4 apache trend graphs - Utilization, Workers, CPU Ut and RPS.

All other trend graphs are there.

Historical data from before the sudden loss of graphing is there (i.e., about a week ago the graphing stopped - 12 day graph shows data before this cutoff).

Nothing has changed, been added, or modified as far as I can tell.

What am I missing..?

Thanks!