TS - bb-hosts problem
list Tom Schmitt
I am getting a yellow alert on my Xymon server under 'BBTEST':
Mon Aug 30 07:43:57 2010
bbtest-net version 4.3.0-0.beta2
SSL library : OpenSSL 0.9.8e-rhel5 01 Jul 2008
LDAP library: OpenLDAP 20343
Statistics:
Hosts total : 951
Hosts with no tests : 0
Total test count : 1008
Status messages : 1012
Alert status msgs : 0
Transmissions : 14
DNS statistics:
# hostnames resolved : 975
# succesful : 573
# failed : 382
# calls to dnsresolve : 1002
TCP test statistics:
# TCP tests total : 40
# HTTP tests : 30
# Simple TCP tests : 10
# Connection attempts : 40
# bytes written : 4682
# bytes read : 109923
Error output:
Host cswflex4.csw.l-3com.com appears twice in bb-hosts! This may cause strange results
TIME SPENT
Event                                     Starttime      Duration
bbtest-net startup                        8802.037154    -
Service definitions loaded                8802.038670    0.001516
Tests loaded                              8802.134488    0.095818
DNS lookups completed                     8807.226739    5.092251
Test engine setup completed               8807.245081    0.018342
TCP tests completed                       8810.416313    3.171232
PING test completed (951 hosts)           8859.946266    49.529953
PING test results sent                    8859.949945    0.003679
Test result collection completed          8859.950004    0.000059
LDAP test engine setup completed          8859.950006    0.000002
LDAP tests executed                       8859.950008    0.000002
LDAP tests result collection completed    8859.950010    0.000002
DNS tests executed                        8860.077831    0.127821
NTP tests executed                        8861.305830    1.227999
Test results transmitted                  8861.339357    0.033527
bbtest-net completed                      8861.340730    0.001373
TIME TOTAL                                               59.303576
Here is a 'grep' of bb-hosts:
[root at monitorp etc]# grep -i cswflex4 bb-hosts
0.0.0.0 cswflex4.csw.L-3com.com
128.170.31.14 cswflex4.csw.L-3com.com # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com
[root at monitorp etc]#
I cannot find a second entry with an IP address.
I have hundreds of other servers with 0.0.0.0 entries, so I cannot figure out what is happening.
Running Xymon 4.3.0-0.beta2 on CentOS 5.5 on a Dell R-610.
I tried to do a 'drop hostname'. Could there be a duplicate in the Xymon database?
If so, how do you remove the bad one so you don't lose the history being gathered by the client on the Windows server?
The client is the Big Brother client.
Thanks,
Tom Schmitt
Senior IT Staff - R&D
L-3 Communication Systems West
640 North 2200 West
P.O. Box 16850
Salt Lake City, UT XXXXX
Phone (XXX) XXX-XXXX
Cell (XXX) XXX-XXXX
eFax (XXX) XXX-XXXX
user-9c1ae820b621@xymon.invalid
\\\\||////
\ ~ ~ /
| @ @ |
--oOo---(_)---oOo--
list Johan Sjöberg
Isn't the hostname the "primary key"? Can you use the same hostname with multiple IP addresses?
/Johan
list Steve Holmes
It says the duplicate is in bb-hosts. Grep bb-hosts for that hostname. If you have 'include'd files, grep those too. You should be able to verify the duplicate and maybe even find it.
Steve
list Steve Holmes
Sorry, ignore my last reply. I read the OP first on my iPhone and missed the part about your grep of bb-hosts. You might try adding the 'prefer' keyword on the line with the IP address.
Steve
list Epperson
On Mon, August 30, 2010 09:54, user-9c1ae820b621@xymon.invalid wrote: [SNIP]
Error output: Host cswflex4.csw.l-3com.com appears twice in bb-hosts! This may cause strange results
[SNIP]
[root at monitorp etc]# grep -i cswflex4 bb-hosts
0.0.0.0 cswflex4.csw.L-3com.com
128.170.31.14 cswflex4.csw.L-3com.com # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com
[root at monitorp etc]#
I cannot find a second entry with an IP address.
I have hundreds of other servers with 0.0.0.0 entries so I cannot figure
out what is happening?
Since case isn't relevant in hostnames, I'd say the error message isn't strictly correct: cswflex4.csw.l-3com.com actually appears four times in bb-hosts, not two. As indicated by the error, this may cause strange results. I'm not sure what your point is about the 0.0.0.0 IP. With no TESTIP tag, the IP will only be used if hostname lookup fails. And used or not, you have four instances of the hostname in bb-hosts, which can give strange results. Or maybe I'm not reading your post correctly.
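For anyone wanting to check this on their own file, here is a minimal sketch of a case-insensitive duplicate count. It assumes host entries are lines whose first field is a dotted-quad IP and whose second field is the hostname; the function name `dup_hosts` is made up for illustration and is not part of Xymon. It only reads the one file it is given, so include'd files would need to be checked separately.

```shell
# Count how often each hostname occurs in a bb-hosts style file,
# folding case so that L-3com.com and l-3com.com compare equal.
# Usage: dup_hosts FILE
# Prints "count hostname" for every name that occurs more than once.
dup_hosts() {
    awk '
        # Only look at host entries: first field is an IP, second the hostname.
        # Comments, page/group lines and include directives are skipped.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && NF >= 2 {
            seen[tolower($2)]++
        }
        END {
            for (h in seen)
                if (seen[h] > 1)
                    print seen[h], h
        }
    ' "$1" | sort -k2
}
```

Run as `dup_hosts bb-hosts`. On the four lines from the grep above it should report the lower-cased hostname with a count of 4, which is the point being made here: a case-sensitive comparison sees two spellings, a case-insensitive one sees a single host listed four times.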
list Torsten Richter
Hi,
On Mon, August 30, 2010 09:54, user-9c1ae820b621@xymon.invalid wrote:
[SNIP]
[root at monitorp etc]# grep -i cswflex4 bb-hosts
0.0.0.0 cswflex4.csw.L-3com.com
128.170.31.14 cswflex4.csw.L-3com.com # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com
If you really have something like this in your bb-hosts, you need to add the noconn tag to the 3 entries that are not the "real" one. On the other hand, you can also drop the conn tag from the second entry, since the conn test is always executed unless specified otherwise. So your bb-hosts should read like this:
0.0.0.0 cswflex4.csw.L-3com.com # noconn
128.170.31.14 cswflex4.csw.L-3com.com
0.0.0.0 cswflex4.csw.l-3com.com # noconn
0.0.0.0 cswflex4.csw.l-3com.com # noconn
Then everything should be fine.
HTH
Torsten
--
E-mail  : user-c862b499d9fa@xymon.invalid
Homepage: http://www.richter-it.net/
Download my public key from: http://gpg-keyserver.de/pks/lookup?search=0x899093AC&op=get
list Tom Schmitt
Only the device with an IP address is tested. The 0.0.0.0 entries are there so that you can show the tests on other web pages in Xymon. I do this on over 100 devices, so the 0.0.0.0 entries should not be causing the problem.
I am still at a loss as to where the message is coming from. I should be able to have one actual test entry with the IP address on the line and multiple 0.0.0.0 entries that are associated with it. Why is just this one entry having a problem?
Do I have to remove all the entries for this host, including the Xymon DB entry, and add them back in again?
Thanks,
Tom Schmitt
list Craig Cook
Tom Schmitt wrote:
Only the device with an IP address is tested. The 0.0.0.0 entries are so that you can show the tests on other web pages in xymon. I do this on over 100 devices so the 0.0.0.0 entries should not be causing the problem.
This is not correct. These lines in the bb-hosts file:
0.0.0.0 cswflex4.csw.L-3com.com
128.170.31.14 cswflex4.csw.L-3com.com # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com
will result in that host being ping tested 4 times. 0.0.0.0 means do a DNS lookup for the IP address, then ping test it. I think your IP "128.170.31.14" will be ignored as well, since DNS will be used first, but test that theory yourself.
What you want is to add the "noping" option at the end of 3 of these entries, and add a "prefer" option to the line you want Xymon to work with.
You also need to check your DNS entries for hosts; 382 lookup failures sounds like something to investigate. If you have around 100 unique hostnames in your bb-hosts file, this stat shows that you have around 1002 total lines. DNS lookups for all ;)
DNS statistics:
# hostnames resolved : 975
# succesful : 573
# failed : 382
# calls to dnsresolve : 1002
Craig
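As a rough way to see which entries for a host still lack a noconn or noping tag, something like the sketch below may help. It is only an illustration: the function name `conn_entries` is made up, it assumes the same "IP hostname # tags" line layout as the grep output in this thread, and it takes the posts above at their word that an entry without either tag gets a ping test. It does not query Xymon itself.

```shell
# List every bb-hosts entry for one host (case-insensitive match) and mark
# whether the line carries a noconn or noping tag.
# Usage: conn_entries FILE HOSTNAME
# Prints "LINENO: IP tested" or "LINENO: IP skipped" for each matching entry.
conn_entries() {
    awk -v host="$2" '
        BEGIN { host = tolower(host) }
        # Host entries: first field is an IP, second field is the hostname.
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && tolower($2) == host {
            status = "tested"
            # Any later field equal to noconn or noping marks the entry as skipped.
            for (i = 3; i <= NF; i++)
                if ($i == "noconn" || $i == "noping")
                    status = "skipped"
            print NR ": " $1 " " status
        }
    ' "$1"
}
```

For example, `conn_entries bb-hosts cswflex4.csw.l-3com.com` lists the matching entries by line number. With the file as originally posted, all four would show as tested; after tagging the three 0.0.0.0 lines, only the 128.170.31.14 line should remain tested.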