repair rrd data
list Grégory Bulot
Hello,
I have many NCV graphs, and they often appear "dotted" instead of being drawn as lines. It looks as if some data is missing from the RRD (see attachment).
* How can I correct this without losing all the data?
* How did it happen?
I searched for "rrd repair tools", but nothing seems to work on my server (or it's PEBCAK ;-)
(I hope my English is not too bad.)
list Jeremy Laidman
Hi Diffusion
This is most likely a problem with the data source feeding into Xymon, in
which case the RRD file will have times with no data, shown as "nan" (not a
number) in "rrdtool fetch <filename> AVERAGE" output. Note that it's common
to see one or two "NaN" records at the end of each timescale in the RRD file
(and a bunch at the start also, if the host has been added relatively
recently), but it's less common to have numbers intermingled with NaN
lines. Here's an example:
$ rrdtool fetch disk,root.rrd AVERAGE
pct used
1718868300: 8.6000000000e+01 6.1474990000e+06
1718868600: 8.6000000000e+01 6.1474990000e+06
1718868900: -nan -nan
1718869200: -nan -nan
1718869500: -nan -nan
1718869800: 8.6000000000e+01 6.1474990000e+06
1718870100: 8.6000000000e+01 6.1474990000e+06
1718870400: 8.6000000000e+01 6.1474990000e+06
1718870700: 8.6000000000e+01 6.1474990000e+06
...
This example is a sequence of data samples with numbers, then nan due to
missing samples and then more numbers. If you find this in your RRD files,
then the data doesn't exist, and there's no way to repair it. The first column
is epoch time, which can be converted to local time using (on Linux or with
GNU date) "date --date @<epochtime>". This might be helpful in correlating
the nan entries with the gaps in your graphs.
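To make that correlation less tedious, the check described above can be sketched as a small shell helper. This is only an illustration: the function name list_gaps is made up for this example, and it assumes the "rrdtool fetch" output format shown above (an epoch timestamp followed by a colon, then one value per data source).

```shell
#!/bin/sh
# list_gaps (hypothetical helper): read "rrdtool fetch" output on stdin and
# print one line per run of consecutive all-NaN rows, in the form
#   <first_epoch> <last_epoch> <number_of_rows>
list_gaps() {
    awk '
        # Data rows look like "1718868900: -nan -nan"; skip headers and blanks.
        $1 !~ /^[0-9]+:$/ { next }
        {
            ts = $1; sub(/:$/, "", ts)
            isnan = 1
            for (i = 2; i <= NF; i++)
                if (tolower($i) !~ /nan/) isnan = 0
            if (isnan) {
                if (!n) first = ts      # a new gap starts here
                last = ts; n++
            } else if (n) {
                print first, last, n    # the gap ended on the previous row
                n = 0
            }
        }
        END { if (n) print first, last, n }
    '
}

# Possible usage (GNU date assumed, as in the message above):
#   rrdtool fetch disk,root.rrd AVERAGE | list_gaps |
#   while read -r first last rows; do
#       echo "$(date --date "@$first") .. $(date --date "@$last") ($rows rows)"
#   done
```

A run that reaches the very end of the output is normal for the most recent one or two samples, so only the gaps in the middle are of interest.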
In my experience, this is often caused by a connectivity problem between
the client and the server. You may find that data from the client (e.g. disk,
cpu) has gaps, but data from xymonnet probes (e.g. conn, http, dns) has no
gaps. This is consistent with missing client data samples.
You might find log messages in xymond.log that help identify a problem.
Also on the client, the xymonclient.log file might be worth a look.
Cheers
Jeremy
list Ralph Mitchell
Gaps like that happen when there's missing data. There's not a lot you can
do about that.
If you really, really want to make the gaps go away, you can try dumping
out the RRD to XML, editing the dump, then loading it back in.
rrdtool dump xymon.rrd
After getting past the header, you'll see the actual data in this form:
<!-- 2024-06-19 18:55:00 EDT / 1718837700 -->
<row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:00:00 EDT / 1718838000 -->
<row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:05:00 EDT / 1718838300 -->
<row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:10:00 EDT / 1718838600 -->
<row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
<!-- 2024-06-19 19:15:00 EDT / 1718838900 -->
<row><v>4.0000000000e+00</v><v>3.0000000000e+01</v></row>
I think where there's a missing value, you'll see "NaN" or something
similar. You could rewrite the empty values with an average of the row
above and below. After saving the file, import it back into a new RRD.
The problem with doing all that is you have to get it done between samples
being written, which is typically 5 minute intervals. If you don't swap
the old RRD for the new one in time, you'll miss the next sample.
Ralph Mitchell
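A minimal sketch of the dump, edit and restore idea from the message above, for anyone who wants to script it rather than edit the XML by hand. The function name fill_nan_rows and the file names in the usage comments are invented for this example. It assumes the row layout shown in the dump excerpt (one or more <v> elements per <row>), and it fills each NaN with the average of the nearest known row before and after it, which produces invented values and not recovered data.

```shell
#!/bin/sh
# fill_nan_rows (hypothetical helper): read "rrdtool dump" XML on stdin and
# write it to stdout, replacing NaN values that sit between two known rows
# with the average of the nearest known row before and after. NaN rows at the
# start or end of an RRA are left untouched. LC_ALL=C keeps "." as the
# decimal separator whatever the locale of the shell.
fill_nan_rows() {
    LC_ALL=C awk '
        # Store the <v>...</v> values of one row line in vals[]; return the count.
        function parse(line, vals,    n, rest) {
            n = 0; rest = line
            while (match(rest, /<v>[^<]*<\/v>/)) {
                vals[++n] = substr(rest, RSTART + 3, RLENGTH - 7)
                gsub(/[ \t]/, "", vals[n])
                rest = substr(rest, RSTART + RLENGTH)
            }
            return n
        }
        # Print buffered NaN rows unchanged (no known row follows them).
        function emit_pending(    k) {
            for (k = 1; k <= np; k++) print pending[k]
            np = 0
        }
        /<database>/   { emit_pending(); have_prev = 0 }  # each RRA starts afresh
        /<\/database>/ { emit_pending() }
        /<row>/ {
            n = parse($0, cur)
            nan = 0
            for (i = 1; i <= n; i++) if (cur[i] == "NaN") nan = 1
            if (nan && have_prev) { pending[++np] = $0; next }
            if (!nan) {
                # A known row closes the gap: rewrite the buffered rows.
                for (k = 1; k <= np; k++) {
                    m = parse(pending[k], pv)
                    prefix = pending[k]; sub(/<row>.*/, "", prefix)
                    out = prefix "<row>"
                    for (i = 1; i <= m; i++) {
                        v = pv[i]
                        if (v == "NaN") v = sprintf("%.10e", (prev[i] + cur[i]) / 2)
                        out = out "<v>" v "</v>"
                    }
                    print out "</row>"
                }
                np = 0
                for (i = 1; i <= n; i++) prev[i] = cur[i]
                have_prev = 1
            }
            print; next
        }
        { print }
    '
}

# Possible usage (placeholder file names; keep a backup of the original):
#   cp xymon.rrd xymon.rrd.bak
#   rrdtool dump xymon.rrd | fill_nan_rows > filled.xml
#   rrdtool restore filled.xml xymon.new.rrd
#   mv xymon.new.rrd xymon.rrd
```

The last three commands are the part that has to fit between two samples. It is also worth checking that the restored file ends up with the same owner and permissions as the original.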
list Grégory Bulot
On Fri, 21 Jun 2024 10:23:24 +1000, Jeremy Laidman <user-0608abae5e7c@xymon.invalid> wrote:
▸
This is most likely a problem with the data source feeding into Xymon, in which case the RRD file will have times with no data, shown as "nan" (not a number) in "rrdtool fetch <filename> AVERAGE" output. Note that it's common for one or two "NaN" records at the end of each timescale in the RRD file (and a bunch at the start also, if the host has been added relatively recently), but it's less common to have numbers intermingled with NaN lines. Here's an example:
$ rrdtool fetch "${RRDBASEHOST}/ups,BCHARGE.rrd" AVERAGE | tail -15
1719046500: 1,0000000000e+02
1719046800: 1,0000000000e+02
1719047100: 1,0000000000e+02
1719047400: 1,0000000000e+02
1719047700: 1,0000000000e+02
1719048000: 1,0000000000e+02
1719048300: 1,0000000000e+02
1719048600: 1,0000000000e+02
1719048900: -nan
1719049200: -nan
1719049500: -nan
1719049800: -nan
1719050100: 1,0000000000e+02
1719050400: -nan
1719050700: -nan
▸
You might find log messages in xymond.log that help identify a problem. Also on the client, the xymonclient.log file might be worth a look.
OK, I'll have a look. At first glance there is nothing, but I'll add
verbosity to my "ups/bcharge" check (the easiest one to examine, and the
one with the least important history).
list Grégory Bulot
On Thu, 20 Jun 2024 20:35:31 -0400, Ralph M <user-00a5e44c48c0@xymon.invalid> wrote:
▸
Gaps like that happen when there's missing data. There's not a lot you can do about that.
I think so too (as does Jeremy Laidman's previous answer).
▸
If you really, really want to make the gaps go away, you can try
dumping out the RRD to XML, edit the dump, then load it back in.
rrdtool dump xymon.rrd
$ rrdtool dump "${RRDBASEHOST}/ups,BCHARGE.rrd" | tail -7 | sed 's#.*-->\(.*\)# \1#'
<row><v>9.9378222222e+01</v></row>
<row><v>NaN</v></row>
<row><v>1.0000000000e+02</v></row>
<row><v>1.0000000000e+02</v></row>
</database>
</rra>
</rrd>
I'll try to find the cause of the problem first, and after that I'll try
to fill in the missing data with values that are "not too bad".