History File Management
list David B. Ritch
I recently installed hobbit to monitor nodes in a cluster. It's a very good product - thanks to all who have worked on it!
After a while, I noticed that the hobbit home directory was growing in an apparently boundless fashion. On further investigation, I discovered that on one server, my history files had grown to several gigabytes. After a while, this growth becomes unsustainable.
Am I doing something wrong? How do others manage the growth of hobbit history and log files? I'm not referring to those in /var/log/hobbit - that's easy to handle with logrotate. I'm concerned about the directories associated with each client under the server data directory.
Thanks!
David
list Rich Smrcina
Let me get this right, you're saying that on some (most, all) of your Hobbit client machines some log file(s) is/are growing to multiple GB in size? Which file(s) and can you post a few lines of the data?
list Vernon Everett
Hi all
Has anybody had any success in having different alerts on the PROCS test?
Example: We have 2 different applications running on a server, with different application custodians.
If process FOO dies, we need to email/alert user-f9235f338fae@xymon.invalid
If process BAR dies, we need to email/alert user-3a783a60a905@xymon.invalid
As I see it, PROCS is all process tests bundled together, and hobbit cannot differentiate between them.
Has anybody managed to have different alert recipients based on process?
Thanks
Vernon
NOTICE: This email and any attachments are confidential. They may contain legally privileged information or copyright material. You must not read, copy, use or disclose them without authorisation. If you are not an intended recipient, please contact us at once by return email and then delete both messages and all attachments.
list Carl Bruhn
Hi Vernon,
As far as I can read from the localclient.cfg it should be possible:
Quote :
#
# You can also associate a GROUP id with a rule. The group-id is passed to
# the alert module, which can then use it to control who gets an alert when
# a failure occurs. E.g. the following associates the "httpd" process check
# with the "web" group, and the "sshd" check with the "admins" group:
#    PROC httpd 5 GROUP=web
#    PROC sshd 1 GROUP=admins
# In the hobbit-alerts.cfg file, you could then have rules like
#    GROUP=web
#       MAIL user-b7c20e0da76a@xymon.invalid
#    GROUP=admins
#       MAIL user-04caac0eb454@xymon.invalid
#
best regards.
Carl Bruhn
Systems Consultant
CSC Consulting Group A/S
Oldenburg Allé 1
DK-2630 Taastrup
Tel: direct 3614 5224
Tel: mobile 2923 5224
Tel: CSC Consulting Group 3614 5200
Fax: CSC Consulting Group 3614 5390
user-a25c6c56d0d6@xymon.invalid
Company website http://www.csc.dk
There's more to life than Oracle....... Unix for instance.
CSC • This is a PRIVATE message. If you are not the intended recipient, please delete without copying and kindly advise us by e-mail of the mistake in delivery. NOTE: Regardless of content, this e-mail shall not operate to bind CSC to any order or other contract unless pursuant to explicit written agreement or government initiative expressly permitting the use of e-mail for such purpose • CSC Consulting Group A/S • Registered Office: Retortvej 8, 1780 Copenhagen V, Denmark • Registered in Denmark No: 26031362
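For the FOO/BAR case in the question, the two files mentioned in the quote might be filled in along these lines. This is only a sketch that renders config text in the shape shown in the quoted comment; the group ids (`foo-owners`, `bar-owners`), the minimum count of 1 and the example.com addresses are placeholders that do not come from the thread.

```python
# Hypothetical mapping: process name -> (group id, alert recipient).
OWNERS = {
    "FOO": ("foo-owners", "foo-team@example.com"),
    "BAR": ("bar-owners", "bar-team@example.com"),
}


def render_proc_rules(owners, min_count=1):
    """Render PROC lines in the style of the quoted client config comment."""
    return "\n".join(
        f"PROC {proc} {min_count} GROUP={group}"
        for proc, (group, _mail) in owners.items()
    )


def render_alert_rules(owners):
    """Render GROUP/MAIL rules in the style quoted for hobbit-alerts.cfg."""
    lines = []
    for _proc, (group, mail) in owners.items():
        lines.append(f"GROUP={group}")
        lines.append(f"    MAIL {mail}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_proc_rules(OWNERS))
    print()
    print(render_alert_rules(OWNERS))
```

Each process gets its own group id, so the alert rules can send FOO failures and BAR failures to different people even though both show up in the same procs column.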
list Vernon Everett
Thanks, missed that. Only read the section under PROCS. Should probably have read the whole thing first.
list David B. Ritch
Not precisely - it's only on my hobbit server. It appears to collect historical data. It looks like BB messages - I suspect that it keeps the messages whenever a system state is changed.
That's great - up to a point. I can click on "history", and then see exactly what caused a state change. However, my systems have been in development, and have had frequent state changes, so the history has grown quickly. I'm concerned that even when the system settles down, these files will grow over time, and need to be managed.
And no - I don't have individual clients with associated files growing to GBs. I have over a hundred clients on each of a couple of servers. On one of the servers, I have history files growing to tens of megabytes per client - which adds up to gigabytes.
Unfortunately, I don't have access to those systems from here, so I can't post data.
Thanks,
dbr
list Vernon Everett
Hi all
Grouping subpages?
How?
Grouping hosts is dead easy, but subpages?
Is it even possible?
I am looking for a page
And on the page, a heading
Then a few links to subpages (with hosts in these subpages)
Then another heading
And more links to more subpages (with more hosts)
Regards
Vernon
list Vernon Everett
According to the docs, and confirmed by testing:
NOCOLUMNS:column[,column]
Used to drop certain of the status columns generated by the Hobbit client. column is one of cpu, disk, files, memory, msgs, ports, procs. This setting stops these columns from being updated for the host. Note: If the columns already exist, you must use the bb(1) utility to drop them, or they will go purple.
This is the bit that's got me stuck:
column is one of cpu, disk, files, memory, msgs, ports, procs.
Is there a way to nocolumn a custom test?
We have a server, appearing twice in bb-hosts, because it performs 2 different functions.
It has 2 custom tests on it, one monitoring app1 and the other monitoring app2.
There are different custodians for the 2 apps, so I want them to appear in their own groups.
We have it listed in bb-hosts as
.
.
group App1 Servers
1.2.3.4 foo # NOCOLUMNS:app2
.
.
.
group App2 Servers
0.0.0.0 foo # NOCOLUMNS:app1
How do I make this work?
Thanks
Vernon
list Dominique Frise
From bb-hosts(5) man page:
group-only COLUMN1|COLUMN2|COLUMN3 [group-title]
Same as the "group" and "group-compress" lines, but
includes only the columns explicitly listed in the
group. Any columns not listed will be ignored for these
hosts.
group-except COLUMN1|COLUMN2|COLUMN3 [group-title]
Same as the "group-only" lines, but includes all
columns EXCEPT those explicitly listed in the group.
Any columns listed will be ignored for these hosts -
all other columns are shown.
In your case maybe:
group-except app2 App1 Servers
1.2.3.4 foo #
.
.
group-except app1 App2 Servers
0.0.0.0 foo #
Dominique
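The difference between the two directives can be modelled in a few lines. The sketch below is not Hobbit code: `visible_columns`, `parse_column_list` and the sample column list are made up for illustration, and the logic only mirrors the man page wording quoted above (group-only keeps the listed columns, group-except hides them).

```python
def visible_columns(host_columns, only=None, except_=None):
    """Return the columns a group would display for a host.

    only    -- columns listed on a "group-only" line (keep just these)
    except_ -- columns listed on a "group-except" line (hide just these)
    """
    if only is not None and except_ is not None:
        raise ValueError("use either only or except_, not both")
    if only is not None:
        return [c for c in host_columns if c in only]
    if except_ is not None:
        return [c for c in host_columns if c not in except_]
    return list(host_columns)


def parse_column_list(spec):
    """Split the COLUMN1|COLUMN2 form used on the group line."""
    return spec.split("|")


if __name__ == "__main__":
    # Hypothetical set of columns reported for host "foo".
    foo = ["conn", "cpu", "disk", "app1", "app2"]
    print(visible_columns(foo, except_=parse_column_list("app2")))
    print(visible_columns(foo, except_=parse_column_list("app1")))
```

With the same host listed in both groups, the first call corresponds to the "App1 Servers" group and the second to the "App2 Servers" group.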
list Vernon Everett
Bloody brilliant! Thanks.
list Tom Kauffman
Looks like this -- the "Timeclocks" entry shows on the main page; on the "Timeclocks" page there is an entry for each site; the site-level subpage shows the detail for all the timeclocks at the site.
#
page Timeclocks <H4><I> Timeclocks </I></H4>
title <H3> Timeclocks </H3>
subparent Timeclocks atl-clock <H3><I>Atlanta Timeclocks</I></H3>
group-compress <H3><I>Atlanta Timeclocks</I></H3>
include hosts/atl-stuff/timeclocks
subparent Timeclocks bly-clock <H3><I>Blytheville Timeclocks</I></H3>
group-compress <H3><I>Blytheville Timeclocks</I></H3>
include hosts/bly-stuff/timeclocks
subparent Timeclocks cha-clock <H3><I>Charleston Timeclocks</I></H3>
group-compress <H3><I>Charleston Timeclocks</I></H3>
include hosts/cha-stuff/timeclocks
subparent Timeclocks cor-clock <H3><I>Corona Timeclocks</I></H3>
group-compress <H3><I>Corona Timeclocks</I></H3>
include hosts/cor-stuff/timeclocks
# hosts/atl-stuff/timeclocks
10.168.20.1 atl-timeclock1 # noconn
10.168.20.2 atl-timeclock2 # noconn
I have entries on my main page for routers, LAN switches, print servers, and other such. I also have entries by physical site, with the full detail of the devices at that site (used by the site admins).
Tom
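To check what page tree a layout like this produces, the page directives can be read into a small parent/child map. This is a rough sketch, not how Hobbit itself loads bb-hosts: it assumes `subparent` takes the parent page name first and the new page name second (as in the Timeclocks layout), that `subpage` hangs off the most recent `page`, and it ignores every other line. The sample text is a cut-down copy of the layout with the HTML stripped from the titles.

```python
SAMPLE = """\
page Timeclocks Timeclocks
title Timeclocks
subparent Timeclocks atl-clock Atlanta Timeclocks
group-compress Atlanta Timeclocks
include hosts/atl-stuff/timeclocks
subparent Timeclocks bly-clock Blytheville Timeclocks
group-compress Blytheville Timeclocks
include hosts/bly-stuff/timeclocks
"""


def build_tree(lines):
    """Map each page name to the list of its child page names."""
    children = {"": []}          # "" stands for the main page
    current_page = None
    for raw in lines:
        parts = raw.split()
        if not parts:
            continue
        kind = parts[0]
        if kind == "page" and len(parts) >= 2:
            parent, name = "", parts[1]
            current_page = name
        elif kind == "subpage" and len(parts) >= 2 and current_page is not None:
            parent, name = current_page, parts[1]
        elif kind == "subparent" and len(parts) >= 3:
            parent, name = parts[1], parts[2]
        else:
            continue             # hosts, groups, titles, includes, comments
        children.setdefault(parent, []).append(name)
        children.setdefault(name, [])
    return children


def render(children, parent="", depth=0):
    """Return the tree as indented lines, one page per line."""
    out = []
    for name in children.get(parent, []):
        out.append("    " * depth + name)
        out.extend(render(children, name, depth + 1))
    return out


if __name__ == "__main__":
    print("\n".join(render(build_tree(SAMPLE.splitlines()))))
```

Running it on a real bb-hosts file would also need `include` lines to be followed, which this sketch deliberately leaves out.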
list Geoff Hallford
Correct. I believe the 2nd one should be 'subpage' though and not 'subparent'. You can do this as many times as you want (I think). I have up to 3 pages deep:
page partner1 <H4>Partner 1</H4>
subpage p1-research <H4>Research</H4>
subparent p1-research p1-res-servers <H4>Servers</H4>
1.1.1.1 SERVER1
subparent p1-research p1-res-network <H4>Network</H4>
2.2.2.2 SWITCH1
subpage p1-business <H4>Business</H4>
subparent p1-business p1-bus-servers <H4>Servers</H4>
1.1.1.2 SERVER2
subparent p1-business p1-bus-network <H4>Network</H4>
2.2.2.3 SWITCH2
Hierarchy looks like this:
Main Page
--- Partner 1
    --- Research
        --- Servers
        --- Network
    --- Business
        --- Servers
        --- Network
--- Partner 2
.........
'Nobody goes there anymore. It's too crowded.' --Yogi Berra
list Henrik Størner
In <user-e6524b53948f@xymon.invalid> "David B. Ritch" <user-23cafa473f8d@xymon.invalid> writes:
I noticed after a while, that the hobbit home directory was growing in an apparently boundless fashion. On further investigation, I discovered that on one server, my history files had grown to several gigabytes. After a while, this growth becomes unsustainable.
The hist/HOSTNAME.STATUSNAME and histlogs/* files shouldn't grow *that* much. They will grow, but usually slowly - they are only updated when there is a status change (red->green or similar).
If they are growing quickly, it is usually because a status is "flapping", i.e. changing rapidly between red and green. I've seen this happen if you have two hosts reporting with the same name (e.g. two hosts called "www"), with one reporting a green status and the other reporting a red status. A look at the bottom line of the status report which says "Message received from ..." will help you figure out if that is the case.
Regardless, the 'trimhistory' utility can be used to clean up the size of the history logs.
Regards,
Henrik
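One way to spot flapping statuses before trimming anything is to see which history files hold the most entries. A rough sketch, assuming each line of a hist/HOSTNAME.STATUSNAME file records one status change (that matches the description above, but check it against your own files); the function name and the path in the usage comment are placeholders.

```python
import os


def busiest_history_files(hist_dir, top=10):
    """Return (filename, entries, bytes) for the files with the most lines.

    Assumption: one line per recorded status change, so a large line count
    suggests a status that changes colour often.
    """
    stats = []
    for name in os.listdir(hist_dir):
        path = os.path.join(hist_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as fh:
            entries = sum(1 for _ in fh)
        stats.append((name, entries, os.path.getsize(path)))
    stats.sort(key=lambda item: item[1], reverse=True)
    return stats[:top]


if __name__ == "__main__":
    import sys

    # Usage (path is a placeholder): python busiest.py /path/to/data/hist
    if len(sys.argv) > 1:
        for name, entries, size in busiest_history_files(sys.argv[1]):
            print(f"{entries:8d} {size:10d} {name}")
```

Files near the top of the list are the ones worth opening in the web history view to check the "Message received from ..." line for duplicate host names.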
list Henrik Størner
In <user-ec5aeb54c9d2@xymon.invalid> "Everett, Vernon" <user-9da1a1882f49@xymon.invalid> writes:
Grouping subpages? How?
Grouping hosts is dead easy, but subpages? Is it even possible?
I am looking for a page And on the page, a heading Then a few links to subpages (with hosts in these subpages) Then another heading And more links to more subpages (with more hosts)
Separate the "subpage" entries in bb-hosts with "title" sections.
E.g.
title Perth servers
subpage win-perth Windows server in Perth
subpage unix-perth Unix server in Perth
title Melbourne
subpage win-melbourne ...
Henrik
list Henrik Størner
In <user-0c6348f4a52f@xymon.invalid> "Everett, Vernon" <user-9da1a1882f49@xymon.invalid> writes:
group App1 Servers
1.2.3.4 foo # NOCOLUMNS:app2
Use "group-except app2 App1 Servers", and drop the NOCOLUMNS.
Henrik
list David B. Ritch
Thank you! I believe that trimhistory is just what I was looking for. David
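trimhistory takes its cutoff as a number of seconds since the Epoch (the `--cutoff=TIME` option in its man page). A small sketch for computing that value; the 90-day retention period and the function names are arbitrary choices for illustration, nothing is executed here, and the resulting command still has to be run with the Hobbit environment set up, so read trimhistory(8) before using it on real data.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


def cutoff_epoch(keep_days, now=None):
    """Return the Unix timestamp keep_days before now (whole seconds)."""
    if keep_days < 0:
        raise ValueError("keep_days must not be negative")
    if now is None:
        now = time.time()
    return int(now) - keep_days * SECONDS_PER_DAY


def trimhistory_command(keep_days, now=None):
    """Build the argument list; the command is not run by this function."""
    return ["trimhistory", f"--cutoff={cutoff_epoch(keep_days, now)}"]


if __name__ == "__main__":
    # 90 days is an arbitrary retention period chosen for the example.
    print(" ".join(trimhistory_command(90)))
```

Printing the command instead of running it leaves room to review the cutoff date and take a backup of the history directories first.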