graphing problem
list Phil Crooker
I'm a graphing newbie and can't get the graph to work. Could anyone please help? This is to measure internet latency; the test is called "internet". This is running over four hosts, so we should be getting four sets of graphs. The rrd file is created, but I just see NaN values; no data is being recorded.
many thanks, Phil
I've got the following data coming in from an ext script every 10 minutes:
google : 0.15
businessspectator : 1.28
bloomberg : 0.05
I set this up as a gauge graph. Here is one of the entries:
<ds>
<name> google </name>
<type> GAUGE </type>
<minimal_heartbeat> 600 </minimal_heartbeat>
<min> NaN </min>
<max> NaN </max>
<!-- PDP Status -->
<last_ds> U </last_ds>
<value> 0.0000000000e+00 </value>
<unknown_sec> 172 </unknown_sec>
</ds>
Here are the config files.
in xymonserver.cfg:
I added this to the GRAPHS string: internet=ncv
and put in this line after that
NCV_internet="google:GAUGE,businessspectator:GAUGE,bloomberg:GAUGE"
graphs.cfg:
[internet]
TITLE Internet Latency
YAXIS Seconds
DEF:google=internet.rrd:google:GAUGE
DEF:businessspectator=internet.rrd:businessspectator:GAUGE
DEF:bloomberg=internet.rrd:bloomberg:GAUGE
LINE2:google#00CCCC:Inode cache
LINE2:businessspectator#FF0000:Dentry cache
LINE2:bloomberg#00FF00:In
COMMENT:Time to load home page in seconds.\n
list Phil Crooker
should be getting four sets of graphs. The rrd file is created, but I just see NaN values, no data is being recorded.
First, it sometimes takes a few cycles for RRD entries to show up. Although I suspect this is normally for non-GAUGE data sources.
This has been running for several days.
Check your RRD file permissions and make sure that Xymon can write to them. If they contain nothing but NaNs, you could delete them and see if they get recreated.
Yes they are recreated.
in xymonserver.cfg: I added this to the GRAPHS string: internet=ncv and put in this line after that NCV_internet="google:GAUGE,businessspectator:GAUGE,bloomberg:GAUGE"
Good. Also, you might need to add "internet=ncv" to TEST2RRD.
It is now in both GRAPHS and TEST2RRD and has gone through several cycles... Probably something simple I missed out. thanks.
list Adam Goryachev
On 17/04/13 15:30, Phil Crooker wrote:
should be getting four sets of graphs. The rrd file is created, but I just see NaN values, no data is being recorded.
You mentioned that your data is updated every 10 minutes, so ensure that the RRD is defined as receiving one update every 10 minutes. By default, it expects data every 5 minutes, and only half the values can be invalid before all data is invalid; one update every 10 minutes means half the data is invalid/unknown. Just something to check anyway (or consider sending data updates more frequently, every 5 minutes like Xymon expects).
Regards,
Adam
--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
list Jeremy Laidman
On 17 April 2013 15:30, Phil Crooker <user-e8e31cd73303@xymon.invalid> wrote:
Probably something simple I missed out.
See if there are errors showing in your rrd-status.log file.
J
list Michael Beatty
The problem is most certainly that your data is coming in at 10-minute intervals. The RRD files are created with a step of 300 seconds and a heartbeat of 600 seconds. This means that rrd expects a new data point every 300 seconds (5 minutes), and if it doesn't get a new data point within 600 seconds (10 minutes) it considers the data junk and disregards it. Typically, the heartbeat is set to double the step, so if you miss one update it's okay, but if you miss two it returns NaN (Not a Number). There is nothing to stop you from going larger. Since your data is coming in every 10 minutes, it is in violation of your heartbeat and the data is being ignored.
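To make the heartbeat rule concrete, here is a rough Python sketch. It is a simplified model and not rrdtool's actual code: it only assumes that an update is discarded as unknown when the gap since the previous update is longer than the heartbeat. The timestamps are made-up values for a script that runs roughly every 10 minutes with a few seconds of jitter.

```python
def classify_updates(timestamps, heartbeat):
    """Return (timestamp, accepted) pairs for a series of update times.

    Simplified model: an update is accepted only when the gap since the
    previous update does not exceed the heartbeat. The first update has
    no predecessor, so it is reported as not accepted.
    """
    results = []
    previous = None
    for ts in timestamps:
        accepted = previous is not None and (ts - previous) <= heartbeat
        results.append((ts, accepted))
        previous = ts
    return results


# Hypothetical update times (seconds): about every 600 s, slightly late each time.
updates = [0, 603, 1204, 1807]

for heartbeat in (600, 1200):
    accepted = [ok for _, ok in classify_updates(updates, heartbeat)]
    print(f"heartbeat={heartbeat}: accepted={accepted}")
```

With a heartbeat of 600 every gap in this made-up series is a little too long, so nothing is recorded; with 1200 every update after the first is kept.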
I have asked around before and didn't get a solid answer as to if there is a better way to do it, but I have worked out a way to fix this. If there is a better way, please let me know!
There are two places you need to tweak. The first is in xymonserver.cfg, where you need to fully define your NCV value for that RRD.
You have:
NCV_internet="google:GAUGE,businessspectator:GAUGE,bloomberg:GAUGE"
When an rrd file is created, each data source is defined in the following format:
dataSourceName:dataSourceType:heartbeat:min_value:max_value
So, your NCV value should be fully defined as:
NCV_internet="google:GAUGE:1200:0:U,businessspectator:GAUGE:1200:0:U,bloomberg:GAUGE:1200:0:U"
This will create your 3 datasets, all of type GAUGE, with a heartbeat of 1200 seconds (double your 10-minute test interval), a minimum expected value of 0 and an "U"nlimited maximum value.
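If you have several tests like this, the setting can be generated rather than typed by hand. The following Python sketch is only an illustration: build_ncv_setting is a made-up helper, not part of Xymon, and it assumes the heartbeat should be twice the script interval as described above.

```python
def build_ncv_setting(test_name, ds_names, interval_seconds,
                      ds_type="GAUGE", minimum="0", maximum="U"):
    """Build an NCV_<test> line for xymonserver.cfg.

    Each data source is written as name:type:heartbeat:min:max, with the
    heartbeat set to double the interval at which the script reports.
    """
    heartbeat = 2 * interval_seconds
    definitions = [
        f"{name}:{ds_type}:{heartbeat}:{minimum}:{maximum}"
        for name in ds_names
    ]
    joined = ",".join(definitions)
    return f'NCV_{test_name}="{joined}"'


# The three data sources from this thread, reported every 600 seconds.
print(build_ncv_setting("internet",
                        ["google", "businessspectator", "bloomberg"],
                        600))
```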
Next, you need to set the "step" in rrddefinition.cfg to override the default 300-second value. Since your script runs every 10 minutes, it should be 600. To do this, put a "-s" parameter in your rrddefinition.cfg:
[internet]
-s 600
RRA:AVERAGE:0.5:1:576
One thing to note: while you can change the heartbeat on the fly, the step is permanent. Once the file is created, changing your rrddefinition.cfg won't change the RRD file. As long as you are still in development, every time you make a change, just delete the rrd file and let Xymon create a new one; it will use the step you have defined in rrddefinition.cfg. If you aren't in development and do not want to lose the data you currently have, the only option I know of is to export the rrd to an XML file using "rrdtool dump", manually edit the STEP in that file, then do an "rrdtool restore" to convert that XML back into an rrd.
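The manual edit in the middle of that dump/restore cycle can be scripted. Below is a minimal Python sketch, assuming the dump contains a single <step> element near the top as rrdtool dumps normally do; set_step and the file names in the comments are invented for illustration. It only rewrites the step value, so whether the archived rows still make sense at the new spacing is something to check on a copy of the file first.

```python
import re

# Matches the <step> element, with or without spaces around the number.
STEP_PATTERN = re.compile(r"(<step>\s*)\d+(\s*</step>)")


def set_step(dump_xml, new_step):
    """Return the dump text with the first <step> value replaced."""
    def replace(match):
        return f"{match.group(1)}{new_step}{match.group(2)}"

    new_xml, count = STEP_PATTERN.subn(replace, dump_xml, count=1)
    if count != 1:
        raise ValueError("no <step> element found in the dump")
    return new_xml


# Intended workflow (file names are hypothetical):
#   rrdtool dump internet.rrd > internet.xml
#   ... run set_step() on the contents of internet.xml ...
#   rrdtool restore internet.xml internet-new.rrd
sample = "<rrd>\n<version> 0003 </version>\n<step> 300 </step>\n</rrd>\n"
print(set_step(sample, 600))
```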
Here is a link to the rrd man page; it's a good read.
http://oss.oetiker.ch/rrdtool/doc/rrdgraph.en.html
Michael Beatty
list Phil Crooker
First, many thanks for everyone's help, I really appreciate it and it saves me a lot of time. Now, I followed Michael's directions, and the data was recorded in the rrd files. But no graphs. So, I then changed the DEF entries in graphs.cfg from GAUGE to AVERAGE as per Wim Nelis's advice and voila - they are there! I think I will change the colour scheme, though.... ;-) Many, many thanks, all, this is excellent. regards, Phil
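The reason the change from GAUGE to AVERAGE matters is that the last field of a DEF line names a consolidation function (such as AVERAGE, MIN, MAX or LAST) stored in the RRD's archives, whereas GAUGE is a data source type. A small Python sketch of the corrected DEF lines follows; def_line is a hypothetical helper written for this example, and it only checks the four basic consolidation functions.

```python
# The basic rrdtool consolidation functions; GAUGE is a data source type,
# not one of these, so it is rejected here.
VALID_CFS = {"AVERAGE", "MIN", "MAX", "LAST"}


def def_line(ds_name, rrd_file, cf="AVERAGE"):
    """Build a graphs.cfg DEF line of the form DEF:vname=file:ds:CF."""
    if cf not in VALID_CFS:
        raise ValueError(f"{cf!r} is not a consolidation function")
    return f"DEF:{ds_name}={rrd_file}:{ds_name}:{cf}"


for name in ("google", "businessspectator", "bloomberg"):
    print(def_line(name, "internet.rrd"))
```

The rrddefinition.cfg entry earlier in the thread only creates an AVERAGE archive, so AVERAGE is the function the DEF lines have to ask for.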
list David Welker
Maybe I have a unique problem, but I'm hoping someone can help me out as I have scoured the archives, tried all kinds of combinations, and still haven't found the perfect solution.
Problem: I have a column that reports a list of system and user processes (my-conn). I have a script that gets the summary data into a file (counts) in NCV format. I would like to report the summary data as a graph on the same column as the list of systems and processes (my-conn).
I initially had both my-conn and counts=ncv in my TEST2RRD variable, which allowed my graph to show up, but I was getting all kinds of DS errors with the my-conn rrd, even after making it type GAUGE. I also was sending status messages. I tried sending data messages instead, since I don't want a separate column displayed with the summary data, but that didn't seem to help. Sometimes I get links on the page, and sometimes I get the graph on the trends page but not on the column page.
I've tried all kinds of variations on the theme...
TEST2RRD=counts=ncv, my-conn=counts
TEST2RRD=counts, my-conn
TEST2RRD=counts
TEST2RRD=my-conn=ncv
GRAPH=counts
GRAPH=my-conn
GRAPH=my-conn=ncv
Bottom line, can this be done? If so, what should I be sending, and what should the variable values be?
Thanks!
David
list Jeremy Laidman
On 2 July 2013 00:32, David Welker <user-04cf53598626@xymon.invalid> wrote:
Maybe I have a unique problem, but I'm hoping someone can help me out as I have scoured the archives, tried all kinds of combinations, and still haven't found the perfect solution.
This is doable. Not sure if you're doing anything wrong, but one thing comes to mind. If you changed your DS type to/from GAUGE, then you need to rebuild or erase your RRD file.
Regardless, it might help to see a setup that works. For example, I graph SMART media errors. My script (see http://xymonton.org/monitors:xymon-smart) presents the data values for a sample in a data "trends" message, and also sends a status "smart" message. The script that creates these messages is run from a file in tasks.d/. The data messages include various metrics including uncorrected errors, corrected errors, drive temperature and so on.
In graphs.cfg, I append entries for [smart], [smart_uncorrected], [smart_corrected] and [smart_temp]. Here's an example:
[smart]
# total read/write errors
TITLE S.M.A.R.T. Total Media Errors
YAXIS errors per second
FNPATTERN ^smart.(.*).rrd
DEF:rc@RRDIDX@=@RRDFN@:err_r_c:AVERAGE
DEF:ru@RRDIDX@=@RRDFN@:err_r_u:AVERAGE
DEF:wc@RRDIDX@=@RRDFN@:err_w_c:AVERAGE
DEF:wu@RRDIDX@=@RRDFN@:err_w_u:AVERAGE
CDEF:re@RRDIDX@=rc@RRDIDX@,ru@RRDIDX@,+
CDEF:we@RRDIDX@=wc@RRDIDX@,wu@RRDIDX@,+
COMMENT:@RRDPARAM@\:\n
LINE1:re@RRDIDX@#@COLOR@:Read Errors :
GPRINT:re@RRDIDX@:LAST:\: %5.1lf %s (cur)
GPRINT:re@RRDIDX@:MAX: %5.1lf %s (max)
GPRINT:re@RRDIDX@:MIN: %5.1lf %s (min)
GPRINT:re@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
LINE1:we@RRDIDX@#@COLOR@:Write Errors :
GPRINT:we@RRDIDX@:LAST:\: %5.1lf %s (cur)
GPRINT:we@RRDIDX@:MAX: %5.1lf %s (max)
GPRINT:we@RRDIDX@:MIN: %5.1lf %s (min)
GPRINT:we@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
In xymonserver.cfg, I append "smart" to the TEST2RRD variable, thusly:
TEST2RRD="cpu-la,disk,...xymonproxy,xymond,smart"
This should result in the RRD file being created and populated. It should also result in the graph being shown on the status view of the "smart" test, using the definition added into graphs.cfg.
Also in xymonserver.cfg, I append "smart" to the GRAPHS variable, like so:
GRAPHS="la,disk,...,xymond,ntp,smart"
This should result in a single "smart" graph being added to the trends page.
In hosts.cfg I include smart in a TRENDS parameter like so:
10.1.2.3 servername # conn dns this that TRENDS:smart:smart|smart_uncorrected|smart_corrected|smart_temp
This is not strictly necessary for what you want, but I want all possible graphs to show on the trends page for this host. This essentially says that on the trends page, where the "smart" graph would have gone, instead put the four graphs smart, smart_uncorrected, smart_corrected and smart_temp.
J