Bogus hosts filling up alert.log
list David Mills
Hi, all!

I have recently set up a new Xymon (Xymon 4.3.28-1.el6.terabithia<http://xymon.sourceforge.net/>) server on RHEL 6.7 that, for the most part, is doing just fine. However, I've discovered that my /var/log/xymon/alert.log file is growing at a crazy rate, to the point that it periodically needs to be zeroed out or it will swamp the file system. The problem is that every 15 seconds the /var/log/xymon/alert.log file receives a flurry of log entries like this:

...
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_94_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_95_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
2017-10-31 14:53:55 Checking criteria for host '0FS_96_192_168_22_1__export_', which is not yet defined; some alerts may not immediately fire
...
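As a rough triage aid, the flood can be tallied per hostname to see which names account for it. This is only a sketch: the function name count_undefined_hosts is made up for illustration, and it assumes one log entry per line in the message format shown in the excerpt.

```shell
#!/bin/sh
# count_undefined_hosts LOGFILE   (hypothetical helper, not part of Xymon)
# Prints "COUNT HOSTNAME" for every host named in a
# "which is not yet defined" message, busiest host first.
count_undefined_hosts() {
    # Pull the quoted hostname out of each matching line, then tally.
    sed -n "s/.*Checking criteria for host '\([^']*\)', which is not yet defined.*/\1/p" "$1" |
        sort | uniq -c | sort -rn |
        awk '{print $1, $2}'    # normalise uniq's padded count column
}
```

For the log in question this could be run as `count_undefined_hosts /var/log/xymon/alert.log`.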
The '0FS_96_192_168_22_1__export_' is actually the name of a host I've defined in the past, but it is no longer in Xymon's memory (AFAIK!!). I have reduced the alerts.cfg down to a minimal stub, commented out the "directory /etc/xymon/alerts.d/..." directive, and "rm -r"'d any reference to this host (and others similar to it) under the data file directories (e.g. rrd/, hist/, histlogs/, hostdata/, etc.). I have gone as far as running "find /etc/xymon -type f | xargs egrep 0FS_" looking for "surprises". I've also stopped and restarted the server and scanned what's active in memory via "xymon localhost xymondboard | egrep 0FS_".

This "host" is not a real client host but an artifact I've created on the server side to represent a file system I'm monitoring in a server-side ext script, so I know it is not announcing its presence over port 1984. For the life of me I can't figure out where the alerts daemon is running across this hostname. Help?

~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~*~
David Mills
Systems Administrator
Northrop Grumman
(XXX) XXX-XXXX (mobile)
list John Thurston
On 10/31/2017 12:31 PM, Mills,David (HHSC Contractor) wrote:
This “host” is not a real client host but an artifact I’ve created on the server side to represent a file system I’m monitoring in a server-side ext script, so I know it is not announcing it’s presence over port 1984. For the life of me I can’t figure out where the alerts daemon is running across this hostname.
Do you get any information if you ask xymond_alert to react in the foreground?
xymoncmd xymond_alert --test 0FS_96_192_168_22_1__export_ foo --color=red
--
Do things because you should, not just because you can.
John Thurston XXX-XXX-XXXX
user-ce4d79d99bab@xymon.invalid
Department of Administration
State of Alaska
list David Mills
Thx, John! Here's the output, though I'm not quite sure what to make of it:
.../xymon> /usr/share/xymon/bin/xymoncmd /usr/share/xymon/bin/xymond_alert --test 0FS_96_192_168_22_1__export --color=RED
2017-10-31 15:41:18.587126 Host not found in hosts.cfg - assuming it is on the top page
00081791 2017-10-31 15:41:18 send_alert 0FS_96_192_168_22_1__export:--color=RED state Paging
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 121
00081791 2017-10-31 15:41:18 Failed 'GROUP=phys-dba' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 124
00081791 2017-10-31 15:41:18 Failed 'GROUP=lgcl-dba' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 127
00081791 2017-10-31 15:41:18 Failed 'GROUP=maxi-dba' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 131
00081791 2017-10-31 15:41:18 Failed 'GROUP=unix-sadm' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 134
00081791 2017-10-31 15:41:18 Failed 'GROUP=windows-sadm' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 138
00081791 2017-10-31 15:41:18 Failed 'GROUP=env-mgmt' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 141
00081791 2017-10-31 15:41:18 Failed 'GROUP=tools-adm' (group not in include list)
2017-10-31 15:41:18 Checking criteria for host '0FS_96_192_168_22_1__export', which is not yet defined; some alerts may not immediately fire
00081791 2017-10-31 15:41:18 Matching host:service:dgroup:page '0FS_96_192_168_22_1__export:--color=RED:(NULL):' against rule line 146
00081791 2017-10-31 15:41:18 Failed 'GROUP=net-adm' (group not in include list)
list John Thurston
On 10/31/2017 12:31 PM, Mills,David (HHSC Contractor) wrote:
This “host” is not a real client host but an artifact I’ve created on the server side to represent a file system I’m monitoring in a server-side ext script, so I know it is not announcing it’s presence over port 1984. For the life of me I can’t figure out where the alerts daemon is running across this hostname.
Based on the output of your interactive run, I suggest the host named "0FS_94_192_168_22_1__export_" is not defined in your hosts.cfg, but the xymon server is seeing messages being sent to it under this name. Does it appear in your 'ghost report'? If you define this name in hosts.cfg, do the nasty messages cease?
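A minimal sketch for checking by hand whether a name is defined, assuming the usual "IP HOSTNAME # tags" line layout of hosts.cfg. The function name host_is_defined is hypothetical, and the sketch does not follow include or directory directives, so every file in the hierarchy has to be passed explicitly.

```shell
#!/bin/sh
# host_is_defined NAME FILE...   (hypothetical helper, not part of Xymon)
# Succeeds if NAME is the hostname (second field) of a non-comment
# host line in any of the given hosts.cfg-style files.
host_is_defined() {
    name=$1
    shift
    awk -v name="$name" '
        /^[[:space:]]*#/ { next }                      # skip comment lines
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $2 == name { found = 1 }
        END { exit !found }                            # 0 = defined, 1 = not
    ' "$@"
}
```

For example, `host_is_defined 0FS_96_192_168_22_1__export_ /etc/xymon/hosts.cfg` (the path is an assumption) returns success only on an exact match, so a name missing its trailing underscore is reported as undefined.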
list David Mills
Well, John, I added a bogus entry to what we call our "orphans" page -- part of the hosts.cfg hierarchy -- culled from the current ghostlist report, so Xymon will officially have it in the hosts.cfg hierarchy.
$ tail -1 orphaned-hosts.cfg
10.11.22.245 0FS_96_192_168_22_1__export_ # noconn
$ pkill -HUP 'xymond '
Perhaps because this is not really a host you can interact with (hence the "noconn" tag) Xymon claims it still doesn't know about this host:
$ /usr/share/xymon/bin/xymon localhost "xymondboard host=0FS_96_192_168_22_1__export_"
$
list David Mills
Amendment: Eventually, Xymon did see the bogus entry for 0FS_96_192_168_22_1__export_ and it did stop appearing in the alert.log. It's another data point, but this is kind of expected behavior, and now I have just this bogus artifact hanging around. Any ideas where the alerts daemon gets its list of host names to check against? ;-)
list Japheth Cleaver
On 10/31/2017 2:49 PM, Mills,David (HHSC Contractor) wrote:
Amendment: Eventually, Xymon did see the bogus entry for 0FS_96_192_168_22_1__export_ and it did stop appearing in the alert.log. It's another data point, but this is kind of expected behavior, and now I have just this bogus artifact hanging around. Any ideas where the alerts daemon gets its list of host names to check against? ;-)

If there had been a previous alert for the virtual/fake host, it would have been stored in memory, and frozen out into the alerts.chk file (probably in /var/lib/xymon/tmp/) during restarts.

The general reason for the alert is that xymond(_alert) is getting a report about something it doesn't (yet) know about. Usually, this clears a few minutes later, as soon as xymond next checks hosts.cfg for changes, since typically that's the action pending to be taken. Similarly, when a host is removed from hosts.cfg (or by a drop command), that message is passed to xymond_alert to clear its record of the alert as well.

Adding the host to hosts.cfg and then dropping it should work for clearing out the errant alert. Alternatively, you can stop xymon (or at least xymond_alert, by adding DISABLED into tasks.cfg), grep out the line for the alert in alerts.chk manually (it's just a normal single-line-record text file), and then bring it back up.

HTH,
-jc
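The manual cleanup described above could look roughly like the sketch below. The function name purge_host_from_checkpoint is hypothetical, and the checkpoint file's name and location should be taken from your own tasks.cfg rather than assumed. No record layout is assumed either: the name is matched as a literal substring, so a longer hostname that contains it would be removed too.

```shell
#!/bin/sh
# purge_host_from_checkpoint HOST FILE   (hypothetical helper, not part of Xymon)
# Rewrites FILE without the lines that mention HOST, keeping FILE.bak.
# Run it only while xymond_alert is stopped; otherwise the daemon may
# write its in-memory state back over the edited file.
purge_host_from_checkpoint() {
    host=$1
    file=$2
    [ -n "$host" ] || return 2              # an empty pattern would match every line
    cp -p "$file" "$file.bak" || return 1   # keep a copy to roll back to
    # -F matches the name literally; -v keeps only the lines that do NOT match.
    grep -vF -- "$host" "$file.bak" > "$file"
    # grep exits 1 when no lines are left to print, which is not an error here;
    # only a status above 1 (e.g. unreadable file) is treated as a failure.
    [ $? -le 1 ]
}
```

A call might look like `purge_host_from_checkpoint 0FS_96_192_168_22_1__export_ /var/lib/xymon/tmp/alerts.chk`, using the path guessed at in the message; restoring the .bak copy undoes the change.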
list David Mills
JC -- Brilliant! That solved my problem perfectly. Thanks to John again for lending a hand! ;-)