On Sat, Jul 21, 2007 at 09:34:11PM -0400, Scott Walters wrote:
Great to see the summary; these features look great. I'd like to
request more RRDs and reports about the monitoring system and the
servers/services monitored. For example:
I think the following could be "gauge" metrics:
Number of devices monitored
Number of services monitored
Number of host.service in green state
Number of host.service in yellow state
Number of host.service in red state
Number of host.service in XXX state
You mean like this:
Statistics:
 Hosts                      :  4321
 Pages                      :   286
 Status messages            : 22331
 - Red                      :   907 ( 4.06 %)
 - Red (non-propagating)    :   809 ( 3.62 %)
 - Yellow                   :   353 ( 1.58 %)
 - Yellow (non-propagating) :   210 ( 0.94 %)
 - Clear                    :  1970 ( 8.82 %)
 - Green                    : 17052 (76.36 %)
 - Purple                   :   452 ( 2.02 %)
 - Blue                     :   578 ( 2.59 %)
The first three are from the current "bbgen --report" status message;
I've added the breakdown of the colors now. Will put these into an RRD
for tracking trends.
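
For anyone who wants to reproduce this kind of breakdown outside of
bbgen, here is a rough Python sketch of the arithmetic. It is not how
bbgen computes it; the function names (color_breakdown, format_report)
and the input, a plain list of color names, are assumptions made up
for illustration.

```python
from collections import Counter


def color_breakdown(colors):
    """Return (total, {color: (count, percent)}) for an iterable of color names.

    Hypothetical helper: the input is simply one color string per status.
    """
    counts = Counter(colors)
    total = sum(counts.values())
    breakdown = {}
    for color, count in counts.items():
        # Guard against an empty input so we never divide by zero.
        percent = 100.0 * count / total if total else 0.0
        breakdown[color] = (count, round(percent, 2))
    return total, breakdown


def format_report(total, breakdown):
    """Render the breakdown as text, one line per color, sorted by name."""
    lines = ["Status messages : %d" % total]
    for color, (count, percent) in sorted(breakdown.items()):
        lines.append("- %-8s : %6d (%5.2f %%)" % (color.capitalize(), count, percent))
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, not taken from a real Hobbit server.
    sample = ["green"] * 7 + ["red"] * 2 + ["yellow"]
    total, breakdown = color_breakdown(sample)
    print(format_report(total, breakdown))
```

Each per-color count is a natural "gauge" value: it is a snapshot of
the current state rather than something that only ever increases.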
I am thinking these could be done by creating counters within hobbit
(since boot):
Number of state changes
Number of state changes per server
Number of state changes per service
Number of notifications sent
The state changes can be calculated from the history logs. This is
preferable, I think, because that way it won't get reset if the Hobbit
server is restarted.
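
As a rough illustration of the idea, the sketch below counts state
changes from log lines in Python. Note that the line format used here
("timestamp host service color", whitespace separated, in
chronological order) is an assumption invented for this example and is
not the actual Hobbit history log format; count_state_changes is
likewise a hypothetical name.

```python
from collections import Counter


def count_state_changes(lines):
    """Count color transitions per host and per service.

    Assumed (hypothetical) input: one "timestamp host service color"
    record per line, oldest first. This is NOT the real Hobbit format.
    """
    last_color = {}          # (host, service) -> most recent color seen
    per_host = Counter()
    per_service = Counter()
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue         # skip blank or malformed lines
        _timestamp, host, service, color = fields
        key = (host, service)
        previous = last_color.get(key)
        # The first record for a host.service only sets the baseline;
        # a change is counted when the color differs from the last one.
        if previous is not None and previous != color:
            total += 1
            per_host[host] += 1
            per_service[service] += 1
        last_color[key] = color
    return total, per_host, per_service


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        "1185000000 www1 http green",
        "1185000300 www1 http red",
        "1185000600 www1 http red",
        "1185000900 www1 http green",
        "1185000900 db1 disk yellow",
        "1185001200 db1 disk red",
    ]
    total, per_host, per_service = count_state_changes(sample)
    print(total, dict(per_host), dict(per_service))
```

Because the counts are recomputed from the stored records on each run,
nothing is lost when the server process restarts.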
Notifications - it would make sense to have the alert module provide
some statistics that we could put into a trend graph.
If you like, I could draft up some graphs and reports I'd like to see.
My above description might be hard to visualize. I definitely think
hobbit could benefit from internal counters, similar to how an OS
keeps track of context switches and the like.
Please do. The graphs I've created about the Hobbit "internals" have
been mostly for my own use as debugging / performance evaluation data.
If we can provide some data that is interesting to management, that
would be a good thing.
Regards,
Henrik