Using --no-cache
list Tom L. Stewart
On 4.3.0, has --no-cache been taken out for the rrd daemons? I have been having issues in testing where I am getting missing data for 5-15 minutes at random times. I have increased my MAX... items in xymonserver.cfg and have tried to use --no-cache in tasks.cfg. It currently looks like this in a process status listing:

xymon 27206 27201 0 Mar16 ? 00:00:00 xymond_channel --channel=status --log=/home/xymon/log/rrd-status.log xymond_rrd --rrddir=/home/xymon/data/rrd --no-cache
xymon 27207 27201 0 Mar16 ? 00:00:00 xymond_channel --channel=data --log=/home/xymon/log/rrd-data.log xymond_rrd --rrddir=/home/xymon/data/rrd --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --no-cache
xymon 27225 27206 0 Mar16 ? 00:00:02 xymond_rrd --rrddir=/home/xymon/data/rrd --no-cache
xymon 27249 27207 0 Mar16 ? 00:00:01 xymond_rrd --rrddir=/home/xymon/data/rrd --extra-tests=mpstat,zonestat --extra-script=/home/xymon/server/ext/rrd_data.pl --no-cache

The missing data seems to be affecting the Solaris systems more than the other systems. Any ideas would be great.

Thank you,
Tom
list Henrik Størner
On Thu, 17 Mar 2011 15:28:34 -0500, "Stewart, Tom L." <user-f210f371749e@xymon.invalid> wrote:
On 4.3.0, has --no-cache been taken out for the rrd daemons? I have been having issues in testing where I am getting missing data for 5-15 minutes at random times.
"xymond_rrd --no-cache" should work fine. Are you seeing any errors in the rrd-status.log or rrd-data.log files?

Regards,
Henrik
list Tom L. Stewart
I am using --no-cache but I still get holes. They seem to be related to Solaris 10 systems, on both SPARC and x86, and are totally random. I have included a picture from over the weekend of one of the systems, which does not even have a heavy load. The only graphs that are affected are "CPU Load" and "Users and Processes".

Here are the log files from the last restart, with nothing from the weekend.

[root@xxxxx log]# cat rrd-data.log
2011-03-24 13:28:41 Tried to down BOARDBUSY: Invalid argument
2011-03-24 13:28:41 Peer not up, flushing message queue
2011-03-24 13:28:41 Shutting down, flushing cached updates to disk
2011-03-24 13:28:41 Cache flush completed
2011-03-24 13:28:56 Peer not up, flushing message queue

[root@xxxxx log]# cat rrd-status.log
2011-03-24 13:28:41 Tried to down BOARDBUSY: Invalid argument
2011-03-24 13:28:41 Peer not up, flushing message queue
2011-03-24 13:28:41 Shutting down, flushing cached updates to disk
2011-03-24 13:28:41 Cache flush completed
2011-03-24 13:28:56 Peer not up, flushing message queue

I did see some improvement when I added more memory for the following, based on error messages from xymond.

xymonserver.cfg:MAXMSG_CLIENT=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_DATA=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_STATUS=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_NOTES=2048 # added more by Tom S
xymonserver.cfg:MAXMSG_USER=2048 # added more by Tom S

The server is RHEL6 64-bit:

[root@xxxxx etc]# uname -a
Linux xxxx 2.6.32-71.18.1.el6.x86_64 #1 SMP Wed Feb 2 17:49:59 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

The only other item is that the clients are still using 4.2.3, but I don't think that would make a difference?

Tom

-----Original Message-----
From: xymon-bounces at xymon.com [mailto:xymon-bounces at xymon.com] On Behalf Of user-ce4a2c883f75@xymon.invalid
Sent: Monday, March 28, 2011 7:28 AM
To: xymon at xymon.com
Subject: Re: [Xymon] Using --no-cache
On Thu, 17 Mar 2011 15:28:34 -0500, "Stewart, Tom L." <user-f210f371749e@xymon.invalid> wrote:
On 4.3.0, has --no-cache been taken out for the rrd daemons? I have been having issues in testing where I am getting missing data for 5-15 minutes at random times.
"xymond_rrd --no-cache" should work fine. Are you seeing any errors in the rrd-status.log or rrd-data.log files?

Regards,
Henrik
Attachments (1)
list Jeremy Laidman
On Tue, Mar 29, 2011 at 2:10 AM, Stewart, Tom L. <user-f210f371749e@xymon.invalid> wrote:
I am using the --no-cache but I still get holes.
The only graphs that are affected are: "CPU Load" and "Users and Processes".
I get empty graphs for these two when I have --no-cache. Without --no-cache I get graphs with gaps. I get no errors in the logs.
I did see some improvement when I added more memory for the following based on error messages from xymond.
Interesting. I wondered about these settings, but when I looked for errors in xymond.log, I saw nothing. I'll add these in my setup and see if it helps.

My clients and servers are all running 4.3.0 on SUSE. So for me, it's not a Solaris thing, or a Xymon version mismatch thing. But I would imagine that Solaris messages would be different (perhaps larger) than Linux messages, and that might be why you're seeing problems only on your Solaris servers.

Cheers
Jeremy
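To put numbers on the holes described in this thread, one option is to dump an affected RRD file with `rrdtool fetch` and look for runs of rows where every value is NaN. The following is a minimal Python sketch and not something posted by anyone in the thread: the function name `find_gaps` and the sample data are hypothetical, and it assumes the usual `rrdtool fetch` text layout of `timestamp: value value ...` rows below a header line.

```python
import math


def find_gaps(fetch_output, min_rows=1):
    """Return (first_ts, last_ts, rows) for each run of all-NaN rows.

    fetch_output is the text printed by `rrdtool fetch`; header and blank
    lines are skipped because they do not start with "<digits>:".
    """
    gaps = []
    run_start = None
    run_len = 0
    last_ts = None
    for line in fetch_output.splitlines():
        head, sep, rest = line.partition(":")
        if not sep or not head.strip().isdigit():
            continue  # header line or blank line
        ts = int(head)
        values = [float(v) for v in rest.split()]  # float() accepts "nan" and "-nan"
        if values and all(math.isnan(v) for v in values):
            if run_start is None:
                run_start, run_len = ts, 0
            run_len += 1
            last_ts = ts
        else:
            if run_start is not None and run_len >= min_rows:
                gaps.append((run_start, last_ts, run_len))
            run_start = None
    if run_start is not None and run_len >= min_rows:
        gaps.append((run_start, last_ts, run_len))
    return gaps


# Made-up sample in the shape of `rrdtool fetch` output (5-minute step).
SAMPLE = """\
                          la

1300000000: 1.2000000000e+01
1300000300: nan
1300000600: -nan
1300000900: 1.5000000000e+01
"""

for first, last, rows in find_gaps(SAMPLE):
    print(f"gap of {rows} rows from {first} to {last}")
```

In practice the input would come from something like `rrdtool fetch <file>.rrd AVERAGE -s -2d` run against one of the files under the `--rrddir` directory shown above (the exact file for the "CPU Load" graph is left as a placeholder here). The most recent row or two is normally NaN, so a short trailing gap is expected and can be ignored. Comparing the gap timestamps between Solaris and Linux hosts, and against xymond.log, may help confirm whether the holes line up with message-size errors.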
attachment.png