Frequent purple alerts
list Jaime Kikpole
I have a number of systems running the PowerShell Xymon client. One and only one of them frequently gives purple alerts on all tests. I haven't found any pattern for this yet. They just seem to happen several times per day and then clear themselves after anything from a few minutes to a few hours. I tried removing and reinstalling the client, but that didn't seem to help.
Any suggestions?
Jaime Kikpole
Director of Technology & Innovations
Cairo-Durham Central School District
(XXX) XXX-XXXX, x59500
cairodurham.org <http://www.cairodurham.org>
Technical Support: user-2eed5d3dd752@xymon.invalid
go.cairodurham.org/techtips
[image: Google Certified Educator, Level 1] [image: Google Certified Educator, Level 2]
--
This electronic message and any attachment(s) may contain confidential or legally privileged information protected by law from further disclosure and is intended only for the individual or entity identified above as the addressee. If you are not the addressee (or the employee or agency responsible to deliver it to the addressee), or if this message has been addressed to you in error, you are hereby notified that you may not copy, forward, disclose or use any part of this message or any attachment(s). Please notify the sender immediately by return email or telephone and permanently delete this message and attachment(s) from your system.
list Timothy Williams
What OS version are they? Does it happen more often after Patch Tuesday? We have some 2008 to 2012 servers where the CPU load goes very high while the OS runs its update scans to see what patches it needs. That happens every few hours (the frequency is set in WSUS) and is severe enough that XymonClient can't run and send its files to the Xymon server.
Tim Williams
list Jaime Kikpole
It's Windows Server 2012R2 and it happens several times a day, every day. I could take a look at the CPU load, though. Thanks for the idea.
list Paul Root
Purple means that it is not getting updates to the server in a timely manner. This could be the network is too congested. Or the computer is overloaded. Or even that it is sending messages that are too big (a frequent Windows problem in my experience – log files get so big). Does it go purple for 1 minute, 5 minutes, 30 minutes? The shorter suggests that it isn’t finishing the run and sending the update in time. Longer suggests connectivity issues.
This communication is the property of CenturyLink and may contain confidential or privileged information. Unauthorized use of this communication is strictly prohibited and may be unlawful. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy all copies of the communication and any attachments.
list Jaime Kikpole
On Wed, Jul 10, 2019 at 11:44 AM Root, Paul T <user-76fdb6883669@xymon.invalid> wrote:
> This could be the network is too congested.
That seems unlikely. The Xymon server and clients are all on one of two VM clusters, so there aren't a lot of "hops" and the bandwidth is very high. The Xymon client is one of two AD controllers, named dir1 and dir2. The one with purple alerts is dir2. On dir1, we also have .1x authentication, but they're otherwise pretty much the same. Their biggest difference is RAM: 8GB for dir1 and 4GB for dir2. Do you think I should increase the RAM? Other than leaving Task Manager open at all times and hoping to catch it as soon as the purple alert occurs, I'm not sure how to check if it's running out of RAM.
> Or the computer is overloaded. Or even that it is sending messages that are too big (a frequent Windows problem in my experience – log files get so big).
Any way to check on the log size?
> Does it go purple for 1 minute, 5 minutes, 30 minutes?
I just looked at the log. The purple alert durations are roughly 40 minutes each during the last three occasions. If it helps, this server is an Active Directory domain controller, but it is one of two and the other one (a) has more stuff running on it, such as .1x authentication for our wifi and (b) isn't having this issue. Thanks for the advice!
list Timothy Williams
In my xymonclient_config.xml file I specify that the logs are generated in C:\Logs:
    <clientlogfile>c:\Logs\xymonclient.log</clientlogfile>
Yours may be in the folder with the script. Look at the xymonclient.log file, which contains the transmission data, and check the size in the "Sent" line. If the buffers are set too low on the Xymon server, the data file gets truncated, which usually causes white, not purple.
    2019-07-10 12:28:43.087 Sending to server
    2019-07-10 12:28:43.087 Using "original" ASCII encoding
    2019-07-10 12:28:43.087 Connecting to host 128.172.5.33
    2019-07-10 12:28:43.103 Sent 54761 bytes to server
    2019-07-10 12:28:43.309 Received 436 bytes from server
Tim Williams
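For anyone who would rather not read the log by eye, the "Sent ... bytes to server" lines can be pulled out with a short script. This is only a rough sketch in Python, assuming the log lines look like the excerpt above. The function names and the size limit are made up for illustration, and the limit is not a real Xymon buffer value, so compare against whatever your server is actually configured with.

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as "2019-07-10 12:28:43.103 Sent 54761 bytes to server".
# The fractional seconds are optional because some logs omit them.
SENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(?:\.\d+)?) Sent (\d+) bytes to server"
)


def sent_sizes(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (timestamp, byte_count) for every 'Sent N bytes to server' line."""
    results = []
    for line in lines:
        match = SENT_RE.match(line.strip())
        if match:
            results.append((match.group(1), int(match.group(2))))
    return results


def oversized(entries: List[Tuple[str, int]], limit_bytes: int) -> List[Tuple[str, int]]:
    """Return the entries larger than limit_bytes (a hypothetical threshold)."""
    return [(ts, size) for ts, size in entries if size > limit_bytes]


# Sample lines copied from the log excerpt in this thread.
SAMPLE = [
    "2019-07-10 12:28:43.087 Sending to server",
    '2019-07-10 12:28:43.087 Using "original" ASCII encoding',
    "2019-07-10 12:28:43.087 Connecting to host 128.172.5.33",
    "2019-07-10 12:28:43.103 Sent 54761 bytes to server",
    "2019-07-10 12:28:43.309 Received 436 bytes from server",
]

for timestamp, size in sent_sizes(SAMPLE):
    print(f"{timestamp}: sent {size} bytes")
```

To run it against a real log, open the file and pass the file object to sent_sizes() in place of SAMPLE. You may need to set the encoding when opening the file, depending on how the client wrote it.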
list Jaime Kikpole
Sorry to resurrect this old thread, but I finally was able to grab the logs from the Xymon client during a purple alert. Usually, it would go back to green before I could notice, switch gears, and begin working on it.
Thanks, Timothy Williams, for pointing out the file uploading parts of the logs. Based on that, I found these lines in the xymonclient.log file:
    2019-07-31 15:25:38 Connecting to host 163.153.163.90
    2019-07-31 15:25:59 ERROR: Cannot connect to host monitor1.cairodurham.org (163.153.163.90) : System.Management.Automation.MethodInvocationException: Exception calling "Connect" with "2" argument(s): "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 163.153.163.90:1984" ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 163.153.163.90:1984
It looks like it was somehow resolving the FQDN (monitor1.cairodurham.org) to its external IP address instead of its internal IP address. I'm not sure why. I just checked the DNS settings and they're the same as another Windows 2012R2 server that isn't having this issue.
I changed the FQDN to the internal IP address and restarted the service. Everything went green almost immediately. Any idea how it could resolve to the public IP address 2 - 4 times each day, but only for a few hours total each day?
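One way to catch the moment the name flips to the public address is to resolve it on a schedule from the affected machine and record any answer that is not a private address. Below is a minimal sketch in Python, not part of the Xymon client. The function names are hypothetical, and it assumes the internal address is in a private (RFC 1918) range, which the thread does not actually state. It asks the operating system's resolver, so it should roughly reflect what other programs on that host see, though that is not guaranteed to match the PowerShell client exactly.

```python
import ipaddress
import socket
from typing import Callable, List, Tuple

# Type of the lookup function: host -> (name, aliases, addresses),
# the same shape that socket.gethostbyname_ex returns.
Resolver = Callable[[str], Tuple[str, List[str], List[str]]]


def is_internal(ip: str) -> bool:
    """True if the address is in a private range (assumed to mean 'internal')."""
    return ipaddress.ip_address(ip).is_private


def check_resolution(
    host: str, resolver: Resolver = socket.gethostbyname_ex
) -> List[Tuple[str, bool]]:
    """Resolve host and return (ip, is_internal) for each IPv4 answer."""
    _name, _aliases, addresses = resolver(host)
    return [(ip, is_internal(ip)) for ip in addresses]


def external_answers(
    host: str, resolver: Resolver = socket.gethostbyname_ex
) -> List[str]:
    """Return only the answers that are NOT private addresses."""
    return [ip for ip, internal in check_resolution(host, resolver) if not internal]
```

Calling external_answers("monitor1.cairodurham.org") from a scheduled task every minute or so, and writing any non-empty result to a file with a timestamp, would show when the public address is being returned. The resolver argument exists only so the function can be exercised without a live DNS lookup.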
list Timothy Williams
I don't know about the DNS switching around, unless it is due to some DC synchronization issue where one has a manual entry the other doesn't? Two ways to circumvent that are to use the IP in the <servers> tag of the Xymon settings file (I think that is what you said you did), or to add the internal IP to the server's HOSTS file; both require future editing if the IP of the hostname changes.
I should have mentioned that I use the tag <clientlogretain>4</clientlogretain> in my xymonclient_config.xml file to save multiple versions of the logs. That gives me some time to look at them and to track changes from one file to another when I make a change.
Glad you are able to get it stable.
Tim Williams
VCU Computer Center
list Paul Root
You can tell xymon to use the IP address in the hosts.cfg file. Put testip in the comment section of that host. See the hosts.cfg man page.