Xymon Mailing List Archive

Xymon + graphite

5 messages in this thread

Galen Johnson · Mon, 7 Dec 2015 18:48:36 +0000
Hey,


Has anyone tried to integrate alerting based on Graphite?  Or used Graphite as a trending replacement to rrd?  I love Xymon for my monitoring but the limitations and aggregations of rrds are starting to become an issue.


thanks


=G=


NB: I was sent this link today, http://blog.takipi.com/graphite-vs-grafana-build-the-best-monitoring-architecture-for-your-application/, and on the Graphite page there is this link under monitoring that I thought I might consider modifying for Xymon: https://github.com/blacked/graphite-to-zabbix.


=G=
Jeremy Laidman · Thu, 10 Dec 2015 10:49:08 +0000
On Tue, Dec 8, 2015 at 5:49 AM Galen Johnson <user-87f955643e3d@xymon.invalid> wrote:
> Has anyone tried to integrate alerting based on Graphite?  Or used
> Graphite as a trending replacement to rrd?  I love Xymon for my monitoring
> but the limitations and aggregations of rrds are starting to become an
> issue.

Nope, but I'm intrigued by Graphite.  Most of my servers have enormously
long trends pages because of all the extra graphs I've added.  These are
indispensable for tracking down weird faults.  But the number of graphs and
RRD files has become unwieldy.  One major shortcoming is that I can't put
metrics from different hosts onto the same graph.  I've used RRGrapher
<http://pages.cs.wisc.edu/~plonka/RRGrapher/> to let me create ad-hoc graphs
like this, but it's obviously from last millennium, and could do with a
facelift.

I'd also like to integrate Smokeping into my monitoring service.  Having
multiple interfaces into the suite of RRD files makes for a
less-than-intuitive user experience.

For trending, Xymon can threshold (alert) on RRD files with the "DS"
operator in analysis.cfg.  Perhaps this can be extended to alert on
Holt-Winters aberrant behaviour thresholds.  Doing the same sort of thing
with a rewrite of the g2zproxy probably wouldn't be too difficult, at least
not on the Xymon side.
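
A minimal Python sketch of the aberrant-behaviour idea, for illustration only. It is not RRDtool's Holt-Winters implementation (which also models a seasonal component) and it is not anything Xymon ships; it only smooths a level and a deviation and flags samples that fall outside the resulting band. The function name and all parameters (`alpha`, `gamma`, `width`, `warmup`) are hypothetical.

```python
def find_aberrations(samples, alpha=0.3, gamma=0.3, width=3.0, warmup=5):
    """Return the indexes of samples that fall outside the predicted band.

    Simplified, non-seasonal sketch: `level` is an exponentially smoothed
    prediction and `deviation` is an exponentially smoothed absolute error.
    """
    level = samples[0]
    deviation = 0.0
    flagged = []
    for i, value in enumerate(samples[1:], start=1):
        error = abs(value - level)
        # Skip the first few samples while the estimates settle.
        if i >= warmup and error > width * deviation:
            flagged.append(i)
        # Update after the check so each prediction uses only past data.
        deviation = gamma * error + (1 - gamma) * deviation
        level = alpha * value + (1 - alpha) * level
    return flagged


if __name__ == "__main__":
    load = [10, 11, 10, 12, 11, 10, 11, 50, 11, 10]
    print(find_aberrations(load))
```

An index returned here would be the point at which a wrapper could turn the status yellow or red; how that maps onto analysis.cfg is left open.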

J
Japheth Cleaver · Thu, 10 Dec 2015 08:02:15 -0800

On Thu, December 10, 2015 2:49 am, Jeremy Laidman wrote:
> On Tue, Dec 8, 2015 at 5:49 AM Galen Johnson <user-87f955643e3d@xymon.invalid> wrote:
>> Has anyone tried to integrate alerting based on Graphite?  Or used
>> Graphite as a trending replacement to rrd?  I love Xymon for my monitoring
>> but the limitations and aggregations of rrds are starting to become an
>> issue.
> Nope, but I'm intrigued by Graphite.  Most of my servers have enormously
> long trends pages because of all the extra graphs I've added.  These are
> indispensable for tracking down weird faults.  But the number of graphs and
> RRD files has become unwieldy.  One major shortcoming is that I can't put
> metrics from different hosts onto the same graph.  I've used RRGrapher
> <http://pages.cs.wisc.edu/~plonka/RRGrapher/> to let me create ad-hoc graphs
> like this, but it's obviously from last millennium, and could do with a
> facelift.

I'd been looking at http://www.flotcharts.org/ and a few other RRD
graphing packages that could be used to provide a more browseable
interface. There's absolutely a need (aside from the CSS work and a
potential "dashboard" view generally) for improved multi-host and
multi-graph views besides the linear trends output, I agree.

> For trending, Xymon can threshold (alert) on RRD files with the "DS"
> operator in analysis.cfg.  Perhaps this can be extended to alert on
> Holt-Winters aberrant behaviour thresholds.  Doing the same sort of thing
> with a rewrite of the g2zproxy probably wouldn't be too difficult, at least
> not on the Xymon side.

(Actually, the RRD files generated on new RPM installs have had HWPREDICT,
SEASONAL, and a few other RRA's configured for a while now, if anyone
feels like experimenting...)


One problem with the current RRD paradigm is that alerting happens only
on the data available at insertion time, not on data already stored in
the RRD file (or whatever metric store you have), so xymond_rrd can't
efficiently alert on anything beyond that.

A "xymond_trend" could operate asynchronously on the RRD files, but to get
useful trend data back out of RRDs you'll need to flush the data to disk
first, which more or less blows out your I/O performance. Fine if you're
on SSD, but more of a problem if you're on heavily loaded spinning disks.

The problem there is that there are just so many different ways of
doing this with a lot of different needs. To make something flexible
enough would require a good survey of what people are looking for.

(With that in mind -- What are people looking for? :) Maybe it's easier
than I'm thinking.)


Alternatively, sending the metric data off entirely to a different
package, which can reinject an alert into xymon if/when it notices a
trend, is an easily-approachable option using the RRD --processor option,
which can fork your metric feed off to whatever you like (OpenTSDB,
graphite, splunk, etc...). The re-posting of alerts back into xymon can be
done with that package's notification tool set and some scripting of xymon
messages.
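
As a rough illustration of the "fork the feed off to Graphite" half of this, here is a Python sketch that formats samples in Carbon's plaintext protocol (`path value timestamp`, one per line, by default on TCP port 2003). It assumes the samples have already been parsed out of whatever the --processor feed delivers; that parsing is not shown. The `xymon.` path prefix, the function names and the host-name mangling are all hypothetical choices, not anything Xymon or Graphite prescribes.

```python
import socket
import time


def graphite_line(host, test, dataset, value, timestamp=None):
    """Format one sample as a Carbon plaintext line: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    # Dots separate path components in Graphite, so replace those in hostnames.
    safe_host = host.replace(".", "_")
    return f"xymon.{safe_host}.{test}.{dataset} {value} {int(timestamp)}\n"


def send_to_carbon(lines, address=("localhost", 2003)):
    """Send pre-formatted lines to a Carbon plaintext listener (address assumed)."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall("".join(lines).encode("ascii"))


if __name__ == "__main__":
    print(graphite_line("www.example.com", "la", "la", 1.5, 1449770535), end="")
```

`send_to_carbon([graphite_line(...)])` would push the sample, given a reachable Carbon listener at the assumed address.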


Regards,
-jc
Bruce Ferrell · Thu, 10 Dec 2015 09:05:40 -0800
Having done a bit of this type of thing in another life, what you're discussing is what we termed an alert manager/data collector architecture.  The entire beauty of rrd data
storage is its simplicity, and it automatically does rollups.

I started my charting using flot, but because of the complexity of managing js charting on all the different browsers, I eventually scrapped js charting entirely and used GD to
generate chart images.  For the particular use case, RRD didn't make sense, as exact storage of historical data was mandatory... rollups/data averaging was not allowed.
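
To make the rollup trade-off concrete, here is a small Python sketch of average-style consolidation. It only mimics the idea behind an averaging rollup and is not RRDtool code; the function name and the sample data are made up for illustration.

```python
def rollup_average(samples, factor):
    """Consolidate each consecutive group of `factor` samples into its mean.

    A trailing partial group is dropped, for simplicity.
    """
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, len(samples) - factor + 1, factor)
    ]


if __name__ == "__main__":
    raw = [2, 2, 2, 98, 2, 2, 2, 2, 2, 2, 2, 2]  # one short spike
    rolled = rollup_average(raw, 6)
    print(max(raw), rolled)
```

The spike of 98 in the raw samples survives only as an average of 18 after consolidation, which is why averaged rollups are a poor fit when exact history must be kept.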
John Thurston · Thu, 10 Dec 2015 08:24:53 -0900
On 12/10/2015 7:02 AM, J.C. Cleaver wrote:
> Alternatively, sending the metric data off entirely to a different
> package, which can reinject an alert into xymon if/when it notices a
> trend, is an easily-approachable option using the RRD --processor option,
> which can fork your metric feed off to whatever you like (OpenTSDB,
> graphite, splunk, etc...). The re-posting of alerts back into xymon can be
> done with that package's notification tool set and some scripting of xymon
> messages.

This seems like the "Xymonish" way to do this.

Attempting to embed cross-host, historic-metric, and trending analysis 
seems to stray pretty far from the Big Brother/Xymon tradition. "Give me 
a red/yellow/green message and I'll put it on a web page and send an email
to whomever you have specified." (See my sig-line)
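
For anyone scripting that hand-off from an external package, a hedged Python sketch of building and sending a Xymon status message follows. It assumes the usual `status[+LIFETIME] HOST.TEST COLOR text` form, with dots in the host name written as commas, sent to xymond on TCP port 1984. The function names, the `graphite` test name and the server address are placeholders.

```python
import socket

STATUS_COLORS = {"green", "yellow", "red", "clear"}


def xymon_status(host, test, color, text, lifetime=None):
    """Build a Xymon status message; `lifetime` is in minutes if given."""
    if color not in STATUS_COLORS:
        raise ValueError(f"unsupported color: {color}")
    keyword = "status" if lifetime is None else f"status+{lifetime}"
    # Xymon expects commas in place of dots in the host part of HOST.TEST.
    return f"{keyword} {host.replace('.', ',')}.{test} {color} {text}\n"


def send_to_xymon(message, address=("localhost", 1984)):
    """Send one message to xymond (address assumed); no reply is read."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(message.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # signal end of message


if __name__ == "__main__":
    print(xymon_status("www.example.com", "graphite", "red",
                       "load trend rising"), end="")
```

The external tool does the trend analysis and Xymon only receives a colour and some text, which keeps the division of labour described above.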

-- 
    Do things because you should, not just because you can.

John Thurston    XXX-XXX-XXXX
user-ce4d79d99bab@xymon.invalid
Enterprise Technology Services
Department of Administration
State of Alaska