Xymon Mailing List Archive

TS - bb-hosts problem

9 messages in this thread

list Tom Schmitt · Mon, 30 Aug 2010 07:54:27 -0600 ·
I am getting a yellow alert on my Xymon server under 'BBTEST':

Mon Aug 30 07:43:57 2010

 
bbtest-net version 4.3.0-0.beta2

SSL library : OpenSSL 0.9.8e-rhel5 01 Jul 2008

LDAP library: OpenLDAP 20343

 
Statistics:

 Hosts total           :      951

 Hosts with no tests   :        0

 Total test count      :     1008

 Status messages       :     1012

 Alert status msgs     :        0

 Transmissions         :       14

 
DNS statistics:

 # hostnames resolved  :      975

 # succesful           :      573

 # failed              :      382

 # calls to dnsresolve :     1002

 
TCP test statistics:

 # TCP tests total     :       40

 # HTTP tests          :       30

 # Simple TCP tests    :       10

 # Connection attempts :       40

 # bytes written       :     4682

 # bytes read          :   109923

 
Error output:

Host cswflex4.csw.l-3com.com appears twice in bb-hosts! This may cause
strange results

 
TIME SPENT
Event                                            Starttime          Duration
bbtest-net startup                             8802.037154                 -
Service definitions loaded                     8802.038670          0.001516
Tests loaded                                   8802.134488          0.095818
DNS lookups completed                          8807.226739          5.092251
Test engine setup completed                    8807.245081          0.018342
TCP tests completed                            8810.416313          3.171232
PING test completed (951 hosts)                8859.946266         49.529953
PING test results sent                         8859.949945          0.003679
Test result collection completed               8859.950004          0.000059
LDAP test engine setup completed               8859.950006          0.000002
LDAP tests executed                            8859.950008          0.000002
LDAP tests result collection completed         8859.950010          0.000002
DNS tests executed                             8860.077831          0.127821
NTP tests executed                             8861.305830          1.227999
Test results transmitted                       8861.339357          0.033527
bbtest-net completed                           8861.340730          0.001373
TIME TOTAL                                                         59.303576

 
Here is a 'grep' of bb-hosts:

[root@monitorp etc]# grep -i cswflex4 bb-hosts
0.0.0.0         cswflex4.csw.L-3com.com
128.170.31.14   cswflex4.csw.L-3com.com         # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com
[root@monitorp etc]#

 
I cannot find a second entry with an IP address.

I have hundreds of other servers with 0.0.0.0 entries, so I cannot figure
out what is happening.

Running Xymon 4.3.0-0.beta2 on CentOS 5.5 on a Dell R-610.

I tried to do a 'drop hostname'. Could there be a duplicate in the Xymon
database?

If so, how do you remove the bad one so you don't lose the history
being gathered by the client on the Windows server?

The client is the Big Brother client.

 
Thanks,

 
Tom Schmitt

Senior IT Staff - R&D

L-3 Communication Systems West

640 North 2200 West

P.O. Box 16850

Salt Lake City, UT  XXXXX

Phone (XXX) XXX-XXXX

Cell      (XXX) XXX-XXXX

eFax    (XXX) XXX-XXXX

user-9c1ae820b621@xymon.invalid

           \\\\||////

             \ ~  ~ /  

             | @  @ |   

--oOo---(_)---oOo--
list Johan Sjöberg · Mon, 30 Aug 2010 15:58:43 +0200 ·
Isn't the hostname the "primary key"? Can you use the same hostname with multiple IP addresses?

/Johan

list Steve Holmes · Mon, 30 Aug 2010 10:06:44 -0400 ·
It says the duplicate is in bb-hosts. Grep bb-hosts for that hostname. If you have 'include'd files grep those, too. You should be able to verify the duplicate and maybe even find it. 
Steve
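
Steve's suggestion to grep the included files as well can be sketched as a small shell function. This is only an illustration: `grep_bbhosts` is a made-up helper name, not a Xymon tool, and it assumes that relative `include` paths are relative to the directory of the main bb-hosts file and that includes are followed one level deep only.

```shell
#!/bin/sh
# grep_bbhosts HOSTNAME FILE
# Hypothetical helper (not a Xymon tool): search FILE and the files it names
# on "include" lines for HOSTNAME, ignoring case. Follows one level only.
grep_bbhosts() {
    host=$1
    main=$2
    dir=$(dirname "$main")

    # /dev/null as a second file makes grep print the file name on each hit.
    grep -i -n -F -- "$host" "$main" /dev/null || true

    # Assumption: relative include paths are relative to the main file.
    awk '$1 == "include" { print $2 }' "$main" |
    while read -r inc; do
        case $inc in
            /*) path=$inc ;;
            *)  path=$dir/$inc ;;
        esac
        if [ -r "$path" ]; then
            grep -i -n -F -- "$host" "$path" /dev/null || true
        fi
    done
}
```

It would be called as `grep_bbhosts cswflex4 bb-hosts` from the directory holding bb-hosts. The match is a fixed string (`-F`), so the dots in a hostname are not treated as wildcards.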
list Steve Holmes · Mon, 30 Aug 2010 10:14:11 -0400 ·
Sorry, ignore my last reply. I read the OP first on my iPhone and missed
the part about your grep of bb-hosts.
You might try adding the 'prefer' keyword on the line with the IP address.

Steve
list Epperson · Mon, 30 Aug 2010 11:13:07 -0400 ·
On Mon, August 30, 2010 09:54, user-9c1ae820b621@xymon.invalid wrote:
[SNIP]
Error output:

Host cswflex4.csw.l-3com.com appears twice in bb-hosts! This may cause
strange results
[SNIP]
                [root at monitorp etc]# grep -i cswflex4 bb-hosts

0.0.0.0         cswflex4.csw.L-3com.com

128.170.31.14   cswflex4.csw.L-3com.com         # conn

0.0.0.0 cswflex4.csw.l-3com.com

0.0.0.0 cswflex4.csw.l-3com.com

[root at monitorp etc]#


I cannot find a second entry with an IP address.

I have hundreds of other servers with 0.0.0.0 entries so I cannot figure
out what is happening?

Since case isn't relevant in hostnames, I'd say the error message isn't
strictly correct:  cswflex4.csw.l-3com.com actually appears four times in
bb-hosts, not two.  As indicated by the error, this may cause strange
results.

I'm not sure what your point is about the 0.0.0.0 ip.  With no TESTIP tag,
the ip will only be used if hostname lookup fails.  And used or not used,
you have four instances of the hostname in bb-hosts, which can give
strange results.

Or maybe I'm not reading your post correctly.
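
The point about case can be checked mechanically. The sketch below counts hostnames in a bb-hosts file with case folded and prints those that occur more than once. `dup_hosts` is a hypothetical name, and the function assumes host lines start with an IPv4 address followed by the hostname; it does not follow include files.

```shell
#!/bin/sh
# dup_hosts FILE
# Hypothetical helper: print "COUNT HOSTNAME" for every hostname that occurs
# more than once in FILE, comparing names case-insensitively.
dup_hosts() {
    awk '
        # Assumption: a host line is "IPv4-address hostname [# tags]".
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            count[tolower($2)]++
        }
        END {
            for (h in count)
                if (count[h] > 1)
                    print count[h], h
        }
    ' "$1" | sort -k2
}
```

Run against the four lines from the grep output above, it prints `4 cswflex4.csw.l-3com.com`.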
list Xymon User in Richmond · Mon, 30 Aug 2010 11:32:55 -0400 ·
On Mon, August 30, 2010 09:54, user-9c1ae820b621@xymon.invalid wrote:
[SNIP]
Error output:

Host cswflex4.csw.l-3com.com appears twice in bb-hosts! This may cause
strange results
[SNIP]
                [root at monitorp etc]# grep -i cswflex4 bb-hosts

0.0.0.0         cswflex4.csw.L-3com.com

128.170.31.14   cswflex4.csw.L-3com.com         # conn

0.0.0.0 cswflex4.csw.l-3com.com

0.0.0.0 cswflex4.csw.l-3com.com

[root at monitorp etc]#


I cannot find a second entry with an IP address.

I have hundreds of other servers with 0.0.0.0 entries so I cannot figure
out what is happening?

Since case isn't relevant in hostnames, I'd say the error message isn't
strictly correct:  cswflex4.csw.l-3com.com actually appears four times in
bb-hosts, not two.  As indicated by the error, this may cause strange
results.

I'm not sure what your point is about the 0.0.0.0 ip.  With no TESTIP tag,
the ip will only be used if hostname lookup fails.  And used or not used,
you have four instances of the hostname in bb-hosts, which can give
strange results.

Or maybe I'm not reading your post correctly.
list Torsten Richter · Mon, 30 Aug 2010 19:27:29 +0200 ·
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,


On Mon, August 30, 2010 09:54, user-9c1ae820b621@xymon.invalid wrote:

[SNIP]
                [root at monitorp etc]# grep -i cswflex4 bb-hosts

0.0.0.0         cswflex4.csw.L-3com.com

128.170.31.14   cswflex4.csw.L-3com.com         # conn

0.0.0.0 cswflex4.csw.l-3com.com

0.0.0.0 cswflex4.csw.l-3com.com
if you really have something like this in your bb-hosts you need to add
the noconn tag to the 3 entries that are not the "real" one. On the
other hand you can also drop the conn tag for the second entry since the
conn test is always executed if not specified otherwise.
So your bb-hosts should read like this:

0.0.0.0         cswflex4.csw.L-3com.com         # noconn

128.170.31.14   cswflex4.csw.L-3com.com

0.0.0.0 	cswflex4.csw.l-3com.com         # noconn

0.0.0.0 	cswflex4.csw.l-3com.com         # noconn

Then everything should be fine.

HTH
Torsten
- -- 
+---------------------------------------------------------+
| E-mail  : user-c862b499d9fa@xymon.invalid			  |
|							  |
| Homepage: http://www.richter-it.net/			  |
+---------------------------------------------------------+
Download my public key from:
http://gpg-keyserver.de/pks/lookup?search=0x899093AC&op=get
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)

iEYEARECAAYFAkx76gEACgkQ7DlmxomQk6wkoACfaKt8kx7oNtvd7Bk4lrUAAw/l
ykMAn0eFrbj3GRyjUCMGqBAOUOmdm6qB
=Cqhr
-----END PGP SIGNATURE-----
list Tom Schmitt · Mon, 30 Aug 2010 16:45:55 -0600 ·
Only the device with an IP address is tested.
The 0.0.0.0 entries are so that you can show the tests on other web
pages in xymon.
I do this on over 100 devices so the 0.0.0.0 entries should not be
causing the problem.

I am still at a loss as to where it is getting the message.
I should be able to have one actual test entry with the IP address on
the line and multiple 0.0.0.0 entries that are associated with it.
Why is just this one entry having a problem?

Do I have to remove all the entries for this host, including the Xymon
DB entry, and add them back in again?

         Thanks,
         
         Tom Schmitt
         Senior IT Staff - R&D
         L-3 Communication Systems West
         640 North 2200 West
         P.O. Box 16850
         Salt Lake City, UT  XXXXX
         Phone (XXX) XXX-XXXX
         Cell      (XXX) XXX-XXXX
         eFax    (XXX) XXX-XXXX
         user-9c1ae820b621@xymon.invalid
                 \\\\||////
                  \ ~  ~ /  
                  | @  @ |   
     		--oOo---(_)---oOo--


list Craig Cook · Tue, 31 Aug 2010 09:04:50 -0400 ·
Tom Schmitt wrote:
Only the device with an IP address is tested.
The 0.0.0.0 entries are so that you can show the tests on other web pages in xymon.
I do this on over 100 devices so the 0.0.0.0 entries should not be causing the problem.

This is not correct.

These entries in the bb-hosts file:

0.0.0.0         cswflex4.csw.L-3com.com
128.170.31.14   cswflex4.csw.L-3com.com         # conn
0.0.0.0 cswflex4.csw.l-3com.com
0.0.0.0 cswflex4.csw.l-3com.com

will result in that host being ping tested 4 times.

0.0.0.0 means do a DNS lookup for the IP address, then ping test it.

I think your IP "128.170.31.14" will be ignored as well, since DNS will be used first, but test that theory yourself.


What you want is to add the "noping" option at the end of 3 of these entries, and add a "prefer" option to the line you want xymon to work with.
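
To see which hosts are affected, a rough check is to count, per hostname, the entries that carry neither the "noconn" nor the "noping" tag mentioned in this thread. This is a sketch only: `multi_ping` is a hypothetical name, it assumes that every untagged entry is ping tested as described above, and it ignores include files.

```shell
#!/bin/sh
# multi_ping FILE
# Hypothetical helper: print "COUNT HOSTNAME" for hostnames that have more
# than one entry in FILE without a noconn or noping tag.
multi_ping() {
    awk '
        # Assumption: a host line is "IPv4-address hostname [# tags]".
        $1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            host = tolower($2)
            tagged = 0
            for (i = 3; i <= NF; i++)
                if ($i == "noconn" || $i == "noping")
                    tagged = 1
            if (!tagged)
                untagged[host]++
        }
        END {
            for (h in untagged)
                if (untagged[h] > 1)
                    print untagged[h], h
        }
    ' "$1" | sort -k2
}
```

A hostname with one untagged entry and any number of tagged ones produces no output, which is the layout suggested earlier in the thread.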


You also need to check the DNS entries for your hosts; 382 lookup failures sounds like something to investigate.

If you have around 100 unique hostnames in your bb-hosts file, this stat shows that you have around 1002 total lines.  DNS lookups for all ;)

DNS statistics:

 # hostnames resolved  :      975
 # succesful           :      573
 # failed              :      382
 # calls to dnsresolve :     1002


Craig