Xymon Mailing List Archive

strange behavior

8 messages in this thread

list Jason Brockdorf · Fri, 11 Mar 2016 23:39:25 +0000 (UTC) ·
I finished my masterpiece hosts.cfg and have had my xymon server up and running for a while but now that I've got client data coming in, I've noticed some very strange behavior.  When I have the host on the front page, all tests show up fine.  When I move the host to a page (first level, using the page command) the tests that depend on client data go "purple" and the data doesn't update.  Also: hosts on pages cannot be seen when I click the "info" column and they're not seen in enable/disable tests either, even when I click on that option from the page they show up on.  I've tried explicitly setting the groups they're in to display the "missing" data but it doesn't help.  I have made some changes... but none that I think would/should be affecting it like this.

I've changed the xymonclient-linux.sh script to use different commands, like adding the -h switch to df, or changing MACHINEDOTS="`hostname -s | tr [:upper:] [:lower:]`" (always a short name, always lower case, to address inconsistencies) but as soon as I move them back to the front page, the data shows up again and they're "green".  The data from when they were "missing" still shows as missing in the history...  I'm stumped on this one.  I'm relatively certain the only changes I've made server side are only cosmetic changes to the underlying HTML and CSS files.
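A minimal sketch of the hostname normalization described above, assuming a POSIX shell. The function name short_lower is hypothetical and not part of the Xymon client. The character classes are quoted because an unquoted [:upper:] is a glob pattern that the shell may expand if a matching single-character filename exists in the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Xymon): reduce a hostname to its
# short, lowercase form.
short_lower() {
    # "${1%%.*}" drops everything from the first dot onward.
    # The tr classes are quoted so the shell cannot glob-expand them.
    printf '%s\n' "${1%%.*}" | tr '[:upper:]' '[:lower:]'
}

# In the client script this could feed the existing variable, e.g.:
#   MACHINEDOTS="$(short_lower "$(hostname)")"
short_lower "HHSCE4SVEIMHTTP1.example.com"   # prints hhsce4sveimhttp1
```

Stripping the domain in the shell rather than relying on hostname -s keeps the behavior the same on systems where that switch is unavailable.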
I'm using the Terabithia RPMs for my server and built from source for my client:
xymon.x86_64                         4.3.26-1.el7                   @xymon-testing

I DID change the config so that xymon is at the / of my web server, I'm not hosting anything else here.
I'm relatively certain any client-side changes I've made aren't the problem since it works fine when on the main page.  Any ideas how to fix this?  I've spent a lot of time on this already and I just want to be done with it. :(
Thanks in advance for any help,
Signed, Extremely Frustrated (Jason Brockdorf)
list John Thurston · Fri, 11 Mar 2016 14:48:08 -0900 ·
quoted from Jason Brockdorf
On 3/11/2016 2:39 PM, Jason Brockdorf wrote:
I finished my masterpiece hosts.cfg and have had my xymon server up and
running for a while but now that I've got client data coming in, I've
noticed some very strange behavior.
Can you give us a snippet of your hosts.cfg?


-- 
    Do things because you should, not just because you can.

John Thurston    XXX-XXX-XXXX
user-ce4d79d99bab@xymon.invalid
Enterprise Technology Services
Department of Administration
State of Alaska
list Adam Goryachev · Sat, 12 Mar 2016 10:59:50 +1100 ·
quoted from John Thurston
On 12 March 2016 10:48:08 am AEDT, John Thurston <user-ce4d79d99bab@xymon.invalid> wrote:
On 3/11/2016 2:39 PM, Jason Brockdorf wrote:
I finished my masterpiece hosts.cfg and have had my xymon server up and
running for a while but now that I've got client data coming in, I've
noticed some very strange behavior.
Can you give us a snippet of your hosts.cfg?


Check the ghosts report. Maybe xymon isn't including them from the sub page because the syntax is wrong?
If a client sends a report for a host the server doesn't know about, it is added to the ghost report.

Regards
Adam
list Jason Brockdorf · Sat, 12 Mar 2016 00:00:32 +0000 (UTC) ·
#snipped comments and whitespace throughout for brevity, changed ips and hostname for security.  The rest of the file keeps the same format.  Now that I've pasted it in here I'm thinking... Maybe because I'm using the same group name on different pages?
0.0.0.0 .default. # NOCOLUMNS:files
127.0.0.1 hhsiamxymon      # bbd http://localhost/
123.32.139.15 cent5vm # my test client @ home
#123.45.67.8 somehttpserver # NAME:"testing on front page"
# include additional host configurations
directory /etc/xymon/hosts.d
title <font size="+2">Production</font>
page prod-httpservers A bunch of HTTP servers
group-only conn|cpu|memory|disk|info|ports|procs|sslcert|clientlog HTTP
#group http <-- tried commenting this and using explicit group-only out to see if it was problem with groups
123.56.69.76 hhsce4sveimhttp1 #
123.56.69.77 hhsce4sveimhttp2 #
123.56.69.78 hhsce4sveimhttp3 #
123.56.69.79 hhsce4sveimhttp4 #
group Load Balancer
123.56.69.38 hhsceseimhttplb #
title <font size="+2">Test</font>
page test-http A bunch of test HTTP servers
group http
168.40.165.51 hhsce4aveimwst1
list Jason Brockdorf · Sat, 12 Mar 2016 00:02:19 +0000 (UTC) ·
No, they don't show up on the ghosts report.  That was the trouble I had with the inconsistencies with names, the reason why I changed client to report short, lowercase hostname only.  But thank you for your input :)
list John Thurston · Fri, 11 Mar 2016 16:26:06 -0900 ·
On 3/11/2016 3:00 PM, Jason Brockdorf wrote:
0.0.0.0 .default. # NOCOLUMNS:files

127.0.0.1 hhsiamxymon      # bbd http://localhost/
123.32.139.15 cent5vm # my test client @ home
This concerns me a little. The # is not a comment delimiter in this context. I have no idea what xymonnet is going to do with those tags. If you want comments, put them on the line above.

I don't like the non-qualified domain names, but that shouldn't prevent this stuff from working. If you really are using short-names, I suggest putting an explicit TESTIP in your .default. line. Again, it shouldn't be the cause of your breakage, but it would be a good idea.

- snip -
page prod-httpservers A bunch of HTTP servers
Have you tried this without the hyphen in the page name?
group-only conn|cpu|memory|disk|info|ports|procs|sslcert|clientlog HTTP
re-using the group name should not cause any trouble.

If you run
   xymoncmd xymoncfg --web
or
   xymoncmd xymoncfg --net

Is the hosts.cfg correctly parsed?

If you run
   xymoncmd xymonnet --no-update hhsce4aveimwst1
does the output look reasonable?

Is there anything interesting in xymond.log ?
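
The parse check suggested above can be scripted. This is only a sketch: host_in_config is a hypothetical helper, and the sample text stands in for real xymoncfg output so the snippet runs without a Xymon server. The -w and -F options are assumed to be available (they are in GNU and BSD grep).

```shell
#!/bin/sh
# Hypothetical helper: succeed if host $1 appears as a whole word in the
# configuration text read from standard input.
host_in_config() {
    # -F: treat the name as a fixed string, -w: whole-word match only,
    # -q: no output, just an exit status.
    grep -qwF -- "$1"
}

# Intended use on a Xymon server (not executed here):
#   xymoncmd xymoncfg --web | host_in_config hhsce4aveimwst1 || echo "not parsed"
#   xymoncmd xymoncfg --net | host_in_config hhsce4aveimwst1 || echo "not parsed"

# Stand-in input so the sketch can be tried anywhere.
sample='page test-http A bunch of test HTTP servers
group http
168.40.165.51 hhsce4aveimwst1'

if printf '%s\n' "$sample" | host_in_config hhsce4aveimwst1; then
    echo "host found in parsed config"
fi
```

The whole-word match keeps hhsce4sveimhttp1 from matching hhsce4sveimhttp10, but hyphens and dots count as word boundaries, so a name like web-1 would still match inside web-1-old.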
-- 
    Do things because you should, not just because you can.

John Thurston    XXX-XXX-XXXX
user-ce4d79d99bab@xymon.invalid
Enterprise Technology Services
Department of Administration
State of Alaska
list Jason Brockdorf · Sun, 13 Mar 2016 06:41:15 +0000 (UTC) ·
Well, I figured it out.
I uninstalled and reinstalled xymon server to see if using all stock would help and it didn't.  All I added back into hosts.cfg was ONLY the IPs and hostnames with no page definitions.
It was one of two things (I didn't wait long enough between doing both of them to know which one fixed it...)
First: I commented out the default directory /etc/xymon/hosts.d line (because I noticed everything above it was working and everything below it was not), and then I noticed something else while waiting for it to update:
In my history.log there were a bunch of lines like this one (someservername was one of the offenders):
2016-03-13 00:15:10.061160 Cannot create /var/lib/xymon/histlogs/someservername/cpu: File exists
So I went and deleted everything in /var/lib/xymon/histlogs
After that everything magically started to appear on the main page.  I believe it was deleting the directories in histlogs that fixed it because they didn't all come back at once, they came back a few at a time (just as my winscp was deleting the directories serially instead of all at once) but that could be the client updates were coming in at different times as well *shrug*.
I'm sure it was something I did wrong but it was extremely frustrating to say the least.  Perhaps Xymon could try to append (is that appropriate for those files?) in such a situation instead of just logging an error and giving up?
Thanks for all your help everyone that tried.
list Adam Goryachev · Mon, 14 Mar 2016 10:21:02 +1100 ·
quoted from Jason Brockdorf
On 13/03/16 17:41, Jason Brockdorf via Xymon wrote:
Well, I figured it out.

I uninstalled and reinstalled xymon server to see if using all stock would help and it didn't.  All I added back into hosts.cfg was ONLY the IPs and hostnames with no page definitions.

It was then either one of two things that I noticed (I didn't wait long enough between doing both of them to know which one fixed it...)

First: I commented out the default #directory /etc/xymon/hosts.d line (because I noticed everything above was working and everything below was not) and then I noticed something else while waiting for it to update:

In my history.log there was a bunch of lines: 2016-03-13 00:15:10.061160 Cannot create /var/lib/xymon/histlogs/someservername/cpu: File exists (someservername was one of the offenders)
Well, this was certainly one of the problems...
quoted from Jason Brockdorf
So I went and deleted everything in /var/lib/xymon/histlogs

After that everything magically started to appear on the main page.  I believe it was deleting the directories in histlogs that fixed it because they didn't all come back at once, they came back a few at a time (just as my winscp was deleting the directories serially instead of all at once) but that could be the client updates were coming in at different times as well *shrug*.

I'm sure it was something I did wrong but it was extremely frustrating to say the least.  Perhaps Xymon could try to append (is that appropriate for those files?) in such a situation instead of just logging an error and giving up?
The problem is that it was a file, and it should be a directory. Xymon doesn't delete the file (because it might contain valuable information), but it needs a directory to store its information in multiple files. So the best it can do is log a message to "alert" you.
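
A less drastic alternative to deleting everything in histlogs is to list only the offending entries first. This is a sketch, assuming the layout implied by the log line (histlogs/HOSTNAME/TESTNAME, where TESTNAME should be a directory) and a find that supports -mindepth and -maxdepth (GNU and BSD find do). The function name find_misplaced_files is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list regular files sitting at HOST/TEST level,
# where the log message implies a directory is expected.
find_misplaced_files() {
    # Depth 1 is the host directory, depth 2 is the per-test entry.
    find "$1" -mindepth 2 -maxdepth 2 -type f
}

# Path taken from the log line in this thread; override via HISTLOGS.
HISTLOGS="${HISTLOGS:-/var/lib/xymon/histlogs}"
if [ -d "$HISTLOGS" ]; then
    find_misplaced_files "$HISTLOGS"
fi
```

Anything it prints (for example /var/lib/xymon/histlogs/someservername/cpu) can then be inspected and moved aside by hand, leaving the history of unaffected hosts in place.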
Thanks for all your help everyone that tried.
Glad it is working for you now.

Regards,
Adam
-- 
Adam Goryachev Website Managers www.websitemanagers.com.au