Formatting errors on log files
list Greg Krpan
Recently, my monitoring has been generating frequent false errors
caused by improper formatting. It is happening on both Windows and Linux
clients. I've included an example of how the tests are sending data back
to the xymon server. I have not made any changes to my client or server
configurations. Has anyone else been experiencing this behavior, or know
of a fix?
Greg.
Name StartupType Status DisplayName
AeLookupSvc manual stopped
Application Experience
ALG manual stopped
Application Layer Gateway Service
AppIDSvc manual stopped
Application Identity
Appinfo manual stopped
Application Information
AppMgmt manual stopped
Application Management
AppReadiness manual stopped App Readiness
AppXSvc manual stopped AppX
Deployment Service (AppXSVC)
AudioEndpointBuilder manual
toppe] Windows Audio Endpoint Builder
Audiosrv manual stopped Windows Audio
BBWin automatic started Big
Brother Xymon Client
BFE automatic started Base
Filtering Engine
BITS automatic started
Background Intelligent Transfer Serv
ce
BrokerInfrastructure ] automatic started
Background Tasks Infrastructure Service
Browser disabled stopped Computer Browser
CcmExec automatic started SMS Agent Host
CertPropSvc manual started
Certificate Propagation
CmRcService disabled stopped
Configuration Manager Remote Control
COMSysApp manual
started COM+ Sys]
m Application
CryptSvc]
]
utomatic started Cr]
tographic Services
DcomLaunch ]
automatic sta]
ed DCOM Serv]
Process Launcher
defra]svc manual stopped Optimize drives
DeviceAssociationService manual stopped Device
Association Service
list Japheth Cleaver
On Fri, October 14, 2016 3:52 pm, Greg Krpan wrote:
[...]
Hi Greg,
Is there anything unusual about the process names on the lines immediately
before the corruption? There's a known issue in that lines starting with a
bracket will cause missing data, and this can happen more frequently on
Windows servers just by virtue of some of the data that's coming across,
but that doesn't appear to be causing this specific issue.
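To illustrate the known issue mentioned above: a Xymon client message is divided into sections by header lines of the form `[name]`. The sketch below is a deliberately simplified model, not Xymon's actual parser; `split_sections` and the sample message (including the `[Example]` service line) are hypothetical. It shows how a data line that happens to start with a bracket could be mistaken for a section header, so the rows after it go missing from the section they belong to.

```python
def split_sections(client_message):
    """Split a client message into sections keyed by "[name]" header lines.

    Simplified model for illustration only; not Xymon's real parser.
    """
    sections = {}
    current = None
    for line in client_message.splitlines():
        if line.startswith("["):
            # Any line starting with "[" is treated as a section header,
            # even when it is really a data line.
            current = line[1:].split("]", 1)[0]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return sections


message = "\n".join([
    "[svcs]",
    "Name StartupType Status DisplayName",
    "BBWin automatic started Big Brother Xymon Client",
    "[Example] Service automatic started",  # hypothetical data line
    "BFE automatic started Base Filtering Engine",
])

parsed = split_sections(message)
print(parsed["svcs"])     # the BFE row is no longer part of this section
print(parsed["Example"])  # it ended up under a bogus section instead
```

In this model the `svcs` section silently loses every row from the bracketed line onward, which matches the "missing data" symptom but not the mid-line garbling shown in the sample.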
Can you confirm which version of Xymon server you're using? Do you see the
same corruption in the "raw" Client Data for the affected servers, or is
it only occurring on the status pages?
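One way to answer that question mechanically is to save the raw client data section and the text of the status page to two strings and diff them. The following is a minimal sketch using Python's standard `difflib`; the function name `differing_lines` is hypothetical, and how the two texts are obtained from the server is left out because it depends on the installation.

```python
import difflib


def differing_lines(raw_text, status_text):
    """Return the lines that differ between raw client data and the status page.

    Lines prefixed "- " exist only in the raw data, "+ " only in the status page.
    """
    delta = difflib.ndiff(raw_text.splitlines(), status_text.splitlines())
    # ndiff also emits "  " (unchanged) and "? " (hint) lines; drop those.
    return [line for line in delta if line.startswith(("- ", "+ "))]


raw = "\n".join([
    "defragsvc manual stopped Optimize drives",
    "Dhcp automatic started DHCP Client",
])
status = "\n".join([
    "defra]svc manual stopped Optimize drives",
    "Dhcp automatic started DHCP Client",
])

changes = differing_lines(raw, status)
for line in changes:
    print(line)
```

If the list is empty, the status page matches the raw data; any entries point at rows that were altered somewhere between reception and display.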
Also -- anything unusual in the log files? Has this problem been constant
since it started, or is it getting worse? Does restarting the xymon
service fix it (temporarily)?
Regards,
-jc
list Greg Krpan
Hi JC,
Thanks for the response. I am currently using Xymon 4.3.27. The raw client data looks fine: there are no corrupted lines and no added brackets or special characters that I can see. The corruption only occurs on the status pages.
The server has been running since May, and this particular problem started at the end of September, after running Windows Update on my servers. Since both Windows and Linux clients show the behavior, I have ruled out the updates as the cause. Restarting the service has no effect on the behavior, and nothing in the log files shows a problem that I can see.
The level of false positives due to formatting errors has remained relatively consistent. It tends to be limited to the PROCS (Windows, Linux) and SVCS (Windows only) tests; the same error occasionally occurs on the DISK and CPU tests, although significantly less often and not across all configured machines. The PROCS/SVCS tests show random errors on one machine or another approximately every 5 minutes.
Thanks,
Greg.
On Fri, Oct 14, 2016 at 6:52 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid>
wrote:
[...]
--
In honor of those who lost their lives exploring the final frontier:
Apollo 1; January 27, 1967 Virgil "Gus" Ivan Grissom, Edward Higgins White
II, Roger Bruce Chaffee
Space Shuttle Challenger, Mission STS-51-L; January 28, 1986 Francis R.
Scobee, Michael J. Smith, Judith A. Resnik, Ellison S. Onizuka, Ronald E.
McNair, Gregory B. Jarvis, Sharon Christa McAuliffe
Space Shuttle Columbia, Mission STS-107; February 1, 2003 Rick D. Husband,
William C. McCool, Michael P. Anderson, Kalpana Chawla, David M. Brown,
Laurel Blair Salton Clark, Ilan Ramon
list Japheth Cleaver
Hmm. Does the data after the corrupted lines appear to match the remaining data for the server in question? From the sample below it seems not (as I believe this is reported in alphabetical order), which might indicate a broader memory corruption issue within xymond_client, where it is somehow losing track of the end of, or garbling the data in, the buffer used to hold status output. If it's causing a false positive, then the problem is not merely in the final output but is occurring earlier in processing.
What OS and distro is the server running on? Any chance you could run xymond_client in debug mode for a bit while this is occurring?
-jc
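The alphabetical-order check suggested above can be sketched as follows. This assumes the service names have already been extracted into a list and that the report is sorted case-insensitively (consistent with `AeLookupSvc` preceding `ALG` in the sample); `first_out_of_order` is a hypothetical helper, and the second list is a made-up example of misplaced data, not taken from the report.

```python
def first_out_of_order(names):
    """Return the index of the first name that sorts before its predecessor.

    Comparison is case-insensitive. Returns None if the list is in order.
    """
    for i in range(1, len(names)):
        if names[i].lower() < names[i - 1].lower():
            return i
    return None


# Names as they appear in the sample, in report order.
in_order = ["COMSysApp", "CryptSvc", "DcomLaunch", "defragsvc",
            "DeviceAssociationService"]

# Hypothetical case: a row from earlier in the report reappears later.
misplaced = ["BBWin", "BFE", "BITS", "Browser", "AppMgmt"]

print(first_out_of_order(in_order))
print(first_out_of_order(misplaced))
```

A non-`None` result would suggest the rows following a corrupted line come from a different part of the buffer rather than being the natural continuation of the list.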
On 10/17/2016 7:56 AM, Greg Krpan wrote:
[...]
list Greg Krpan
I've included an entire "SVCS" status below from a failed status screen. As you can see, how and where the output corrupts is random. On the Windows systems, I run BBWin as the client (version 0.13). I should be able to put the xymond_client process into debug mode to monitor it for a while as well. The problem is less prevalent on Linux clients than on Windows, but it is occurring on both.
The server is running on CentOS 7 with current patches.
# uname -a
Linux ************************* 3.10.0-327.36.1.el7.x86_64 #1 SMP Sun Sep 18 13:04:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linu
# cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)
[# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"
Name StartupType Status DisplayName
AeLookupSvc manual stopped
Application Experience
ALG manual stopped
Application Layer Gateway Service
AppHostSvc automatic started
Application Host Helper Service
AppIDSvc manual stopped
Application Identity
Appinfo manual stopped Application I
forma]ion
AppMgmt manual stopped
Application Management
AppReadiness manual stopped App Readiness
AppXSvc manual stopped AppX
Deployment Service (AppXSVC)
aspnet_state manual stopped
ASP.NET State Service
AudioEndpointBuilder man
al stopped Wi]dows Audio Endpoint Builder
Audiosrv manual stopped Windows Audio
BBWin automatic started Big
Brother Xymon Client
BFE automatic started Base
Filtering Engine
BITS automatic started Back
round Intelligent Trans]
r Service
BrokerInfrastru]
ure automatic ]
started Background Tasks]
nfrastructure Service
Browser ]
]isabled stopped Computer Browser
CcmExec automatic started SMS Agent Host
CertPropSvc manual started
Certificate Propagation
CmRcService disabled stopped
Configuration Manager Remote Control
COMSysApp manual started COM+
System Application
CryptSvc automatic started
Cryptographic Services
DcomLaunch automatic started DCOM
Server Process Launcher
defragsvc manual stopped Optimize drives
DeviceAssociationService manual stopped Device
Association Service
DeviceInstall manual stopped Device
Install Service
Dhcp automatic started DHCP Client
DiagTrack automatic started
Diagnostics Tracking Service
Dnscache automatic started DNS Client
dot3svc manual stopped Wired AutoConfig
DPS automatic started
Diagnostic Policy Service
DsmSvc manual started Device
Setup Manager
Eaphost manual stopped
Extensible Authentication Protocol
EFS manual stopped
Encrypting File System (EFS)
EventLog automatic started
Windows Event Log
EventSystem automatic started COM+
Event System
fdPHost manual stopped
Function Discovery Provider Host
FDResPub manual stopped
Function Discovery Resource Publication
FontCache automatic started
Windows Font Cache Service
gpsvc automatic started Group
Policy Client
hidserv manual stopped Human
Interface Device Service
hkmsvc manual stopped Health
Key and Certificate Management
IEEtwCollectorService manual stopped
Internet Explorer ETW Collector Service
IISADMIN automatic started IIS
Admin Service
IKEEXT automatic started IKE
and AuthIP IPsec Keying Modules
iphlpsvc automatic started IP Helper
KeyIso manual started CNG
Key Isolation
KPSSVC manual stopped KDC
Proxy Server service (KPS)
KtmRm manual stopped KtmRm
for Distributed Transaction Coordinator
LanmanServer automatic started Server
LanmanWorkstation automatic started Workstation
lltdsvc manual stopped
Link-Layer Topology Discovery Mapper
lmhosts automatic started TCP/IP
NetBIOS Helper
lpasvc manual stopped
Microsoft Policy Platform Local Authority
lppsvc manual stopped
Microsoft Policy Platform Processor
LSM automatic started Local
Session Manager
McAfeeFramework automatic started McAfee
Framework Service
McShield automatic started McAfee McShield
McTaskManager automatic started McAfee
Task Manager
MMCSS manual stopped
Multimedia Class Scheduler
MpsSvc
automatic started ] Windows Firewall
MSDTC automatic started
Distributed Transaction Coordinator
MSiSCSI manual stopped
Microsoft iSCSI Initiator Service
msiserver manual stopped
Windows Installer
napagent manual stopped
Network Access Protection Agent
NcaSvc manual stopped
Network Connectivity Assistant
Netlogon automatic started Netlogon
Netman manual stopped
Network Connections
netprofm manual started
Network List Service
NetTcpPortSharing disabled stopped
Net.Tcp Port Sharing Service
NlaSvc automatic started
Network Location Awareness
nsi automatic started
Network Store Interface Service
PerfHost manual stopped
Performance Counter DLL Host
pla manual stopped
Performance Logs & Alerts
PlugPlay manual started Plug and Play
PolicyAgent manual started IPsec
Policy Agent
Power automatic started Power
PrintNotify manual stopped
Printer Extensions and Notifications
ProfSvc automatic started User
Profile Service
QBCFMonitorService automatic started
QuickBooks Database Manager Service
QBFCService manual stopped Intuit
QuickBooks FCS
QuickBooksDB17 automatic started QuickBooksDB17
RasAuto manual stopped Remote
Access Auto Connection Manager
RasMan manual stopped Remote
Access Connection Manager
RemoteAccess disabled stopped
Routing and Remote Access
RemoteRegistry automatic stopped Remote Registry
RpcEptMapper automatic started RPC
Endpoint Mapper
RpcLocator manual stopped Remote
Procedure Call (RPC) Locator
RpcSs automatic started Remote
Procedure Call (RPC)
RSoPProv manual stopped
Resultant Set of Policy Provider
sacsvr manual stopped
Special Administration Console Helper
SamSs automatic started
Security Accounts Manager
SCardSvr disabled stopped Smart Card
ScDeviceEnum manual stopped Smart
Card Device Enumeration Service
Schedule automatic started Task Scheduler
SCPolicySvc manual stopped Smart
Card Removal Policy
seclogon manual stopped Secondary Logon
SENS automatic started System
Event Notification Service
SessionEnv manual started Remote
Desktop Configuration
SharedAccess disabled stopped
Internet Connection Sharing (ICS)
ShellHWDetection automatic stopped Shell
Hardware Detection
smphost manual stopped
Microsoft Storage Spaces SMP
smstsmgr manual stopped
ConfigMgr Task Sequence Agent
SNMP automatic started SNMP Service
SNMPTRAP automatic started SNMP Trap
Spooler automatic started Print Spooler
sppsvc automatic stopped
Software Protection
SSDPSRV disabled stopped SSDP Discovery
SstpSvc manual stopped Secure
Socket Tunneling Protocol Service
svsvc manual stopped Spot Verifier
swprv manual stopped
Microsoft Software Shadow Copy Provider
SysMain manual stopped Superfetch
SystemEventsBroker automatic started System
Events Broker
TapiSrv manual stopped Telephony
TermService manual started Remote
Desktop Services
Themes automatic started Themes
THREADORDER manual stopped Thread
Ordering Server
TieringEngineService manual stopped
Storage Tiers Management
TrkWks automatic started
Distributed Link Tracking Client
TrustedInstaller manual stopped
Windows Modules Installer
UALSVC automatic started User
Access Logging Service
UI0Detect manual stopped
Interactive Services Detection
UmRdpService manual started Remote
Desktop Services UserMode Port Redirector
upnphost disabled stopped UPnP Device Host
VaultSvc manual stopped
Credential Manager
vds manual stopped Virtual Disk
VGAuthService automatic started VMware
Alias Manager and Ticket Service
vmicguestinterface manual stopped
Hyper-V Guest Service Interface
vmicheartbeat manual stopped
Hyper-V Heartbeat Service
vmickvpexchange manual stopped
Hyper-V Data Exchange Service
vmicrdv manual stopped
Hyper-V Remote Desktop Virtualization Service
vmicshutdown manual stopped
Hyper-V Guest Shutdown Service
vmictimesync manual stopped
Hyper-V Time Synchronization Service
vmicvss manual stopped
Hyper-V Volume Shadow Copy Requestor
VMTools automatic started VMware Tools
vmvss manual stopped VMware
Snapshot Provider
VSS manual stopped Volume
Shadow Copy
W32Time manual started Windows Time
w3logsvc manual stopped W3C
Logging Service
W3SVC automatic started World
Wide Web Publishing Service
WAS manual started
Windows Process Activation Service
Wcmsvc automatic started
Windows Connection Manager
WcsPlugInService manual stopped
Windows Color System
WdiServiceHost manual stopped
Diagnostic Service Host
WdiSystemHost manual stopped
Diagnostic System Host
Wecsvc manual stopped
Windows Event Collector
WEPHOSTSVC manual stopped
Windows Encryption Provider Host Service
wercplsupport manual stopped
Problem Reports and Solutions Control Panel Support
WerSvc manual stopped
Windows Error Reporting Service
WinHttpAutoProxySvc manual started
WinHTTP Web Proxy Auto-Discovery Service
Winmgmt automatic started
Windows Management Instrumentation
WinRM automatic started
Windows Remote Management (WS-Management)
wmiApSrv manual stopped WMI
Performance Adapter
WPDBusEnum manual stopped
Portable Device Enumerator Service
WSService manual stopped
Windows Store Service (WSService)
wuauserv automatic started Windows Update
wudfsvc manual stopped
Windows Driver Foundation - User-mode Driver Framework
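Many of the damaged rows in the status above contain a `]` with no opening bracket before it on the same line. As a rough way to locate such rows in a saved copy of a status page, the sketch below flags any line with an unmatched closing bracket. `suspect_lines` is a hypothetical helper, and this heuristic only catches bracket-style damage; a row that merely lost a character (such as `Back` / `round` in the BITS entry) would not be flagged.

```python
def suspect_lines(status_text):
    """Yield (line_number, line) for lines containing an unmatched "]"."""
    for number, line in enumerate(status_text.splitlines(), start=1):
        depth = 0
        for ch in line:
            if ch == "[":
                depth += 1
            elif ch == "]":
                if depth == 0:
                    # Closing bracket with no opener earlier on this line.
                    yield number, line
                    break
                depth -= 1


sample = "\n".join([
    "BITS automatic started Back",
    "round Intelligent Trans]",
    "r Service",
    "Browser ]",
    "]isabled stopped Computer Browser",
    "CcmExec automatic started SMS Agent Host",
])

flagged = list(suspect_lines(sample))
for number, line in flagged:
    print(number, line)
```

Running this periodically against saved status pages would give a count of affected rows per host, which could help confirm whether the problem is constant or getting worse.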
On Mon, Oct 17, 2016 at 4:10 PM, Japheth Cleaver <user-87556346d4af@xymon.invalid>
wrote:
[...]
list Greg Krpan
As an additional update, I've checked multiple times and the raw data that is passed to the system is correct. The problem appears to occur when it is displayed on the web page. I have not updated Xymon; the version I am on (4.3.27) still appears to be the most recent. Is anyone aware of any conflicts that may have been introduced after patching? Since this is a production server, I apply patches monthly; the most recent patching was on 9/29, which coincides with when the problem started.
On Mon, Oct 17, 2016 at 4:27 PM, Greg Krpan <user-aaa61fac9dfe@xymon.invalid> wrote:
[...]
Requestor VMTools automatic started VMware Tools vmvss manual stopped VMware Snapshot Provider VSS manual stopped Volume Shadow Copy W32Time manual started Windows Time w3logsvc manual stopped W3C Logging Service W3SVC automatic started World Wide Web Publishing Service WAS manual started Windows Process Activation Service Wcmsvc automatic started Windows Connection Manager WcsPlugInService manual stopped Windows Color System WdiServiceHost manual stopped Diagnostic Service Host WdiSystemHost manual stopped Diagnostic System Host Wecsvc manual stopped Windows Event Collector WEPHOSTSVC manual stopped Windows Encryption Provider Host Service wercplsupport manual stopped Problem Reports and Solutions Control Panel Support WerSvc manual stopped Windows Error Reporting Service WinHttpAutoProxySvc manual started WinHTTP Web Proxy Auto-Discovery Service Winmgmt automatic started Windows Management Instrumentation WinRM automatic started Windows Remote Management (WS-Management) wmiApSrv manual stopped WMI Performance Adapter WPDBusEnum manual stopped Portable Device Enumerator Service WSService manual stopped Windows Store Service (WSService) wuauserv automatic started Windows Update wudfsvc manual stopped Windows Driver Foundation - User-mode Driver Framework

On Mon, Oct 17, 2016 at 4:10 PM, Japheth Cleaver <user-87556346d4af@xymon.invalid> wrote:

Hmm. Does the data after the corrupted lines appear to match the remaining data for the server in question? From the sample below it seems not (as I believe this is reported in alphabetical order), which might indicate a broader memory corruption issue within xymond_client, where it's somehow losing track of the end of, or garbling the data in, the buffer used for holding status output. If it's causing a false positive, then it's not merely the final output that's the problem, but something occurring earlier in processing.

What OS+distro is the server running on? Any chance you might be able to run xymond_client in debug mode for a bit while this is occurring?

-jc

On 10/17/2016 7:56 AM, Greg Krpan wrote:

Hi JC,

Thanks for the response. I am using Xymon 4.3.27 currently. The raw client data looks fine; there are no corrupted lines and no added brackets or special characters that I can see. This only occurs on the status pages. The server has been running since May, and this particular problem started at the end of September, after running Windows Update on my servers. Because both Windows and Linux clients are showing the behavior, I have ruled out the updates as the cause. I have tried restarting the service with no effect on the behavior, and there is nothing in the log files that shows a problem that I can see.

The level of false positives due to formatting errors has remained relatively consistent. It tends to be limited to the PROCS (Windows, Linux) and SVCS (Windows only) tests, but I occasionally see the same error on the DISK and CPU tests, although that is significantly less frequent and does not happen across all configured machines. The PROCS/SVCS tests are showing random errors on one machine or another approximately every 5 minutes.

Thanks,
Greg.

On Fri, Oct 14, 2016 at 6:52 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:

On Fri, October 14, 2016 3:52 pm, Greg Krpan wrote:

Recently, my monitoring has been generating frequent errors that are false, due to improper formatting. It is happening on both Windows and Linux clients. I've included an example of how the tests are sending data back to the xymon server. I have not made any changes to my client or server configurations. Has anyone else been experiencing this behavior, or know of a fix?

Greg.

Name StartupType Status DisplayName AeLookupSvc manual stopped Application Experience ALG manual stopped Application Layer Gateway Service AppIDSvc manual stopped Application Identity Appinfo manual stopped Application Information AppMgmt manual stopped Application Management AppReadiness manual stopped App Readiness AppXSvc manual stopped AppX Deployment Service (AppXSVC) AudioEndpointBuilder manual toppe] Windows Audio Endpoint Builder Audiosrv manual stopped Windows Audio BBWin automatic started Big Brother Xymon Client BFE automatic started Base Filtering Engine BITS automatic started Background Intelligent Transfer Serv ce BrokerInfrastructure ] automatic started Background Tasks Infrastructure Service Browser disabled stopped Computer Browser CcmExec automatic started SMS Agent Host CertPropSvc manual started Certificate Propagation CmRcService disabled stopped Configuration Manager Remote Control COMSysApp manual started COM+ Sys] m Application CryptSvc] ] utomatic started Cr] tographic Services DcomLaunch ] automatic sta] ed DCOM Serv] Process Launcher defra]svc manual stopped Optimize drives DeviceAssociationService manual stopped Device Association Service

Hi Greg,

Is there anything unusual about the process names on the lines immediately before the corruption? There's a known issue in that lines starting with a bracket will cause missing data, and this can happen more frequently on Windows servers just by virtue of some of the data that's coming across, but that doesn't appear to be causing this specific issue.

Can you confirm which version of Xymon server you're using? Do you see the same corruption in the "raw" Client Data for the affected servers, or is it only occurring on the status pages?

Also, is there anything unusual in the log files? Has this problem been constant since it started, or is it getting worse? Does restarting the xymon service fix it (temporarily)?

Regards,
-jc

--
In honor of those who lost their lives exploring the final frontier:
Apollo 1; January 27, 1967
Virgil "Gus" Ivan Grissom, Edward Higgins White II, Roger Bruce Chaffee
Space Shuttle Challenger, Mission STS-51-L; January 28, 1986
Francis R. Scobee, Michael J. Smith, Judith A. Resnik, Ellison S. Onizuka, Ronald E. McNair, Gregory B. Jarvis, Sharon Christa McAuliffe
Space Shuttle Columbia, Mission STS-107; February 1, 2003
Rick D. Husband, William C. McCool, Michael P. Anderson, Kalpana Chawla, David M. Brown, Laurel Blair Salton Clark, Ilan Ramon
list Japheth Cleaver
Hi,
On Fri, October 21, 2016 9:19 am, Greg Krpan wrote:

As an additional update, I've checked multiple times, and the raw data that is passed to the system is correct. The problem appears to occur when the data is displayed on the web page. I have not updated Xymon; the version I am on (4.3.27) still appears to be the most recent. Is anyone aware of any conflicts that may have been introduced after patching? Since this is a production server, I run patches on a monthly basis. The most recent patching was on 9/29, which is coincidentally when the problem started occurring.

Can you run from the command line:
xymoncmd xymon 127.0.0.1 "xymondlog <hostname>.<affectedtestname>"
for a test (like svcs) that you're seeing now? I'm curious if the garbled
data is showing in memory or if it's happening (just) on the web layer.
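
One way to make that comparison less dependent on eyeballing is to save the command's output to a file and scan it mechanically. The following is a minimal Python sketch, not part of Xymon. It assumes that corruption looks like the samples in this thread, namely a stray "]" with no "[" before it on the same line. The function name find_suspect_lines is made up for illustration.

```python
# Heuristic sketch: flag lines that contain a "]" with no preceding "[" on
# the same line. This pattern is an assumption taken from the corrupted
# samples in this thread; it is not something Xymon itself defines.

def find_suspect_lines(status_text):
    """Return (line_number, line) pairs that contain an unmatched ']'."""
    suspects = []
    for number, line in enumerate(status_text.splitlines(), start=1):
        depth = 0  # count of "[" seen on this line and not yet closed
        for ch in line:
            if ch == "[":
                depth += 1
            elif ch == "]":
                if depth == 0:
                    suspects.append((number, line))
                    break
                depth -= 1
    return suspects


if __name__ == "__main__":
    sample = "\n".join([
        "Audiosrv manual stopped Windows Audio",
        "BrokerInfrastructure ] automatic started",
        "defra]svc manual stopped Optimize drives",
    ])
    for number, line in find_suspect_lines(sample):
        print(number, line)
```

If the saved xymondlog output yields suspect lines while the raw client data for the same host yields none, that would suggest the damage happens on the server side before the web layer renders anything.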
I'm not aware of any outstanding issues in xymond_client code that
wouldn't be affecting lots of people, but it's possible we have a bug
here. Most of the recent changes have been either new features or CSP/XSS
fixes at the display layer.
Have you noticed any errors coming through xymond_client, or any patterns
from running it in --debug mode?
-jc
list Greg Krpan
I haven't noticed any errors through xymond_client or from debug mode. After running the xymoncmd line above, I get the following:

./xymoncmd xymon 127.0.0.1 "xymondlog wstc-lr0dhcp1.svcs"
wstc-lr0dhcp1|svcs|red||1477331698|1477331698|1477333498|0|0|15.1.160.11|460159|||Y|
red Mon Oct 24 11:55:55 2016 - Services NOT ok
&red BBWin: No matching service - want started/automatic
&green DHCPServer is started/automatic - want started/automatic
&green McAfeeFramework is started/automatic - want started/automatic
&green McShield is started/automatic - want started/automatic
&green McTaskManager is started/automatic - want started/automatic
&green VMTools is started/automatic - want started/automatic
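
For anyone who wants to pull the same details out of this kind of response programmatically, here is a rough Python sketch that splits the pipe-delimited first line and collects the per-service "&color" lines. The field names are an assumption based on the xymondlog description in the xymon(1) man page and should be checked against the installed version. parse_xymondlog and SAMPLE are names made up for this example.

```python
# Field names are an assumption taken from the xymondlog description in the
# xymon(1) man page; verify them against your installed version.
FIELDS = [
    "hostname", "testname", "color", "testflags", "lastchange",
    "logtime", "validtime", "acktime", "disabletime", "sender",
    "cookie", "ackmsg", "dismsg", "client",
]

# Shortened copy of the response shown above.
SAMPLE = "\n".join([
    "wstc-lr0dhcp1|svcs|red||1477331698|1477331698|1477333498|0|0|15.1.160.11|460159|||Y|",
    "red Mon Oct 24 11:55:55 2016 - Services NOT ok",
    "&red BBWin: No matching service - want started/automatic",
    "&green DHCPServer is started/automatic - want started/automatic",
])


def parse_xymondlog(output):
    """Split a xymondlog response into a header dict and (color, text) checks."""
    lines = output.splitlines()
    header = dict(zip(FIELDS, lines[0].split("|")))
    # The three timestamps are epoch seconds; convert them for arithmetic.
    for key in ("lastchange", "logtime", "validtime"):
        header[key] = int(header[key])
    checks = []
    for line in lines[1:]:
        if line.startswith("&"):
            color, _, text = line[1:].partition(" ")
            checks.append((color, text))
    return header, checks


if __name__ == "__main__":
    header, checks = parse_xymondlog(SAMPLE)
    print(header["hostname"], header["testname"], header["color"])
    print("valid for", header["validtime"] - header["logtime"], "seconds")
    for color, text in checks:
        if color != "green":
            print(color, text)
```

Filtering the checks for anything that is not green makes it quick to see which configured service failed to match, which can then be compared against the service listing in the status body.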
Name StartupType Status DisplayName
AeLookupSvc manual stopped Application
Experience
ALG manual stopped Application
Layer Gateway Service
AppIDSvc manual stopped Application
Identity
Appinfo manual
sta]ted Application Information
AppMgmt manual stopped Application
Management
AppReadiness manual stopped App
Readiness
AppXSvc manual stopped AppX
Deployment Service (AppXSVC)
AudioEndpointBuilder manual stopped Windows
Audio Endpoint Builder
Audiosrv ]
manual st]
ped Windows Audio
]BWin automatic started Big Brother
Xymon Client
BFE automatic started Base
Filtering Engine
BITS automatic started Background
Intelligent Transfer Service
BrokerInfrastructure automatic started Background
Tasks Infrastructure Service
Browser disabled stopped Computer
Browser
CcmExec automatic started SMS Agent
Host
CertPropSvc manual started Certificate
Propagation
CmRcService disabled stopped
Configuration Manager Remote Control
COMSysApp manual started COM+ System
Application
CryptSvc automatic started
Cryptographic Services
DcomLaunch automatic started DCOM Server
Process Launcher
defragsvc manual stopped Optimize
drives
DeviceAssociationService manual stopped Device
Association Service
DeviceInstall manual stopped Device
Install Service
Dhcp automatic started DHCP Client
DHCPServer automatic started DHCP Server
DiagTrack automatic started Diagnostics
Tracking Service
Dnscache automatic started DNS Client
dot3svc manual stopped Wired
AutoConfig
DPS automatic started Diagnostic
Policy Service
DsmSvc manual stopped Device
Setup Manager
Eaphost manual stopped Extensible
Authentication Protocol
EFS manual stopped Encrypting
File System (EFS)
EventLog automatic started Windows
Event Log
EventSystem automatic started COM+ Event
System
fdPHost manual stopped Function
Discovery Provider Host
FDResPub manual stopped Function
Discovery Resource Publication
FontCache automatic started Windows
Font Cache Service
gpsvc automatic started Group
Policy Client
hidserv manual stopped Human
Interface Device Service
hkmsvc manual stopped Health Key
and Certificate Managemen
IEEtwCollectorService ]
manual stopped ]
Internet Explorer ETW Collector Servi]
IKEEXT ]
automat]c started IKE and AuthIP IPsec Keying
Modules
iphlpsvc automatic started IP Helper
KeyIso manual started CNG Key
Isolation
KPSSVC manual stopped KDC Proxy
Server service (KPS)
KtmRm manual stopped KtmRm for
Distributed Transaction Coordinator
LanmanServer automatic started Server
LanmanWorkstation automatic started Workstation
lltdsvc manual stopped Link-Layer
Topology Discovery Mapper
lmhosts automatic started TCP/IP
NetBIOS Helper
lpasvc manual stopped Microsoft
Policy Platform Local Authority
lppsvc manual stopped Microsoft
Policy Platform Processor
LSM automatic started Local
Session Manager
McAfeeFramework automatic started McAfee
Framework Service
McShield automatic started McAfee
McShield
McTaskManager automatic started McAfee Task
Manager
MMCSS manual stopped Multimedia
Class Scheduler
MpsSvc automatic started Windows
Firewall
MSDTC automatic started Distributed
Transaction Coordinator
MSiSCSI manual stopped Microsoft
iSCSI Initiator Service
msiserver manual stopped Windows
Installer
napagent manual stopped Network
Access Protection Agent
NcaSvc manual stopped Network
Connectivity Assistant
Netlogon automatic started Netlogon
Netman manual stopped Network
Connections
netprofm manual started Network
List Service
NetTcpPortSharing disabled stopped Net.Tcp
Port Sharing Service
NlaSvc automatic started Network
Location Awareness
nsi automatic started Network
Store Interface Service
PerfHost manual stopped Performance
Counter DLL Host
pla manual stopped Performance
Logs & Alerts
PlugPlay manual started Plug and
Play
PolicyAgent manual started IPsec
Policy Agent
Power automatic started Power
PrintNotify manual stopped Printer
Extensions and Notifications
ProfSvc automatic started User
Profile Service
RasAuto manual stopped Remote
Access Auto Connection Manager
RasMan manual stopped Remote
Access Connection Manager
RemoteAccess disabled stopped Routing and
Remote Access
RemoteRegistry automatic stopped Remote
Registry
RpcEptMapper automatic started RPC
Endpoint Mapper
RpcLocator manual stopped Remote
Procedure Call (RPC) Locator
RpcSs automatic started Remote
Procedure Call (RPC)
RSoPProv manual stopped Resultant
Set of Policy Provider
sacsvr manual stopped Special
Administration Console Helper
SamSs automatic started Security
Accounts Manager
SCardSvr disabled stopped Smart Card
ScDeviceEnum manual started Smart Card
Device Enumeration Service
Schedule automatic started Task
Scheduler
SCPolicySvc manual stopped Smart Card
Removal Policy
seclogon manual stopped Secondary
Logon
SENS automatic started System
Event Notification Service
SessionEnv manual started Remote
Desktop Configuration
SharedAccess disabled stopped Internet
Connection Sharing (ICS)
ShellHWDetection automatic started Shell
Hardware Detection
smphost manual stopped Microsoft
Storage Spaces SMP
smstsmgr manual stopped ConfigMgr
Task Sequence Agent
SNMP automatic started SNMP Service
SNMPTRAP manual stopped SNMP Trap
Spooler automatic started Print
Spooler
sppsvc automatic stopped Software
Protection
SSDPSRV disabled stopped SSDP
Discovery
SstpSvc manual stopped Secure
Socket Tunneling Protocol Service
svsvc manual stopped Spot
Verifier
swprv manual stopped Microsoft
Software Shadow Copy Provider
SysMain manual stopped Superfetch
SystemEventsBroker automatic started System
Events Broker
TapiSrv manual stopped Telephony
TermService manual started Remote
Desktop Services
Themes automatic started Themes
THREADORDER manual stopped Thread
Ordering Server
TieringEngineService manual stopped Storage
Tiers Management
TrkWks automatic started Distributed
Link Tracking Client
TrustedInstaller manual stopped Windows
Modules Installer
UALSVC automatic started User Access
Logging Service
UI0Detect manual stopped Interactive
Services Detection
UmRdpService manual started Remote
Desktop Services UserMode Port Redirector
upnphost disabled stopped UPnP Device
Host
VaultSvc manual stopped Credential
Manager
vds manual stopped Virtual Disk
VGAuthService automatic started VMware
Alias Manager and Ticket Service
vmicguestinterface manual stopped Hyper-V
Guest Service Interface
vmicheartbeat manual stopped Hyper-V
Heartbeat Service
vmickvpexchange manual stopped Hyper-V
Data Exchange Service
vmicrdv manual stopped Hyper-V
Remote Desktop Virtualization Service
vmicshutdown manual stopped Hyper-V
Guest Shutdown Service
vmictimesync manual stopped Hyper-V
Time Synchronization Service
vmicvss manual stopped Hyper-V
Volume Shadow Copy Requestor
VMTools automatic started VMware Tools
vmvss manual stopped VMware
Snapshot Provider
VSS manual stopped Volume
Shadow Copy
W32Time manual started Windows Time
Wcmsvc automatic started Windows
Connection Manager
WcsPlugInService manual stopped Windows
Color System
WdiServiceHost manual stopped Diagnostic
Service Host
WdiSystemHost manual stopped Diagnostic
System Host
Wecsvc manual stopped Windows
Event Collector
WEPHOSTSVC manual stopped Windows
Encryption Provider Host Service
wercplsupport manual stopped Problem
Reports and Solutions Control Panel Support
WerSvc manual stopped Windows
Error Reporting Service
WinHttpAutoProxySvc manual stopped WinHTTP Web
Proxy Auto-Discovery Service
Winmgmt automatic started Windows
Management Instrumentation
WinRM automatic started Windows
Remote Management (WS-Management)
wmiApSrv manual stopped WMI
Performance Adapter
WPDBusEnum manual stopped Portable
Device Enumerator Service
WSService manual stopped Windows
Store Service (WSService)
wuauserv manual started Windows
Update
wudfsvc manual stopped Windows
Driver Foundation - User-mode Driver Framework
On Fri, Oct 21, 2016 at 5:24 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:
Hi,

On Fri, October 21, 2016 9:19 am, Greg Krpan wrote:
As an additional update, I've checked multiple times and the raw data that is passed to the system is correct. The problem appears to occur when the data is displayed on the webpage. I have not updated Xymon; the version I am on (4.3.27) still appears to be the most recent. Is anyone aware of any conflicts that may have been introduced after patching? Since this is a production server, I run patches on a monthly basis, with the most recent patching occurring on 9/29, which is coincidentally when the problem started occurring.

Can you run from the command line:
xymoncmd xymon 127.0.0.1 "xymondlog <hostname>.<affectedtestname>"
for a test (like svcs) that you're seeing now? I'm curious if the garbled data is showing in memory or if it's happening (just) on the web layer.

I'm not aware of any outstanding issues in xymond_client code that wouldn't be affecting lots of people, but it's possible we have a bug here. Most of the recent changes have been either new features or CSP/XSS fixes at the display layer. Have you noticed any errors coming through xymond_client, or any patterns from running it in --debug mode?

-jc

On Mon, Oct 17, 2016 at 4:27 PM, Greg Krpan <user-aaa61fac9dfe@xymon.invalid> wrote:
I've included an entire "SVCS" status below from a failed status screen. As you can see, how and where the output corrupts is random. On the Windows systems, I run BBWin for the client (version 0.13). I should be able to put the xymond_client process into debug mode to monitor for a while as well. The problem is less prevalent on Linux clients than on Windows, but it is occurring on both. The server is running on CentOS 7 with current patches.
# uname -a
Linux ************************* 3.10.0-327.36.1.el7.x86_64 #1 SMP Sun Sep 18 13:04:29 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/centos-release
CentOS Linux release 7.2.1511 (Core)
# cat /etc/os-release
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

Name StartupType Status DisplayName AeLookupSvc manual stopped Application Experience ALG manual stopped Application Layer Gateway Service AppHostSvc automatic started Application Host Helper Service AppIDSvc manual stopped Application Identity Appinfo manual stopped Application I forma]ion AppMgmt manual stopped Application Management AppReadiness manual stopped App Readiness AppXSvc manual stopped AppX Deployment Service (AppXSVC) aspnet_state manual stopped ASP.NET State Service AudioEndpointBuilder man al stopped Wi]dows Audio Endpoint Builder Audiosrv manual stopped Windows Audio BBWin automatic started Big Brother Xymon Client BFE automatic started Base Filtering Engine BITS automatic started Back round Intelligent Trans] r Service BrokerInfrastru] ure automatic ] started Background Tasks] nfrastructure Service Browser ] ]isabled stopped Computer Browser CcmExec automatic started SMS Agent Host CertPropSvc manual started Certificate Propagation CmRcService disabled stopped Configuration Manager Remote Control COMSysApp manual started COM+ System Application CryptSvc automatic started Cryptographic Services DcomLaunch automatic started DCOM Server Process Launcher defragsvc manual stopped Optimize drives DeviceAssociationService manual stopped Device Association Service DeviceInstall manual stopped Device Install Service Dhcp automatic started DHCP Client
DiagTrack automatic started Diagnostics Tracking Service Dnscache automatic started DNS Client dot3svc manual stopped Wired AutoConfig DPS automatic started Diagnostic Policy Service DsmSvc manual started Device Setup Manager Eaphost manual stopped Extensible Authentication Protocol EFS manual stopped Encrypting File System (EFS) EventLog automatic started Windows Event Log EventSystem automatic started COM+ Event System fdPHost manual stopped Function Discovery Provider Host FDResPub manual stopped Function Discovery Resource Publication FontCache automatic started Windows Font Cache Service gpsvc automatic started Group Policy Client hidserv manual stopped Human Interface Device Service hkmsvc manual stopped Health Key and Certificate Management IEEtwCollectorService manual stopped Internet Explorer ETW Collector Service IISADMIN automatic started IIS Admin Service IKEEXT automatic started IKE and AuthIP IPsec Keying Modules iphlpsvc automatic started IP Helper KeyIso manual started CNG Key Isolation KPSSVC manual stopped KDC Proxy Server service (KPS) KtmRm manual stopped KtmRm for Distributed Transaction Coordinator LanmanServer automatic started Server LanmanWorkstation automatic started Workstation lltdsvc manual stopped Link-Layer Topology Discovery Mapper lmhosts automatic started TCP/IP NetBIOS Helper lpasvc manual stopped Microsoft Policy Platform Local Authority lppsvc manual stopped Microsoft Policy Platform Processor LSM automatic started Local Session Manager McAfeeFramework automatic started McAfee Framework Service McShield automatic started McAfee McShield McTaskManager automatic started McAfee Task Manager MMCSS manual stopped Multimedia Class Scheduler MpsSvc automatic started ] Windows Firewall MSDTC automatic started Distributed Transaction Coordinator MSiSCSI manual stopped Microsoft iSCSI Initiator Service msiserver manual stopped Windows Installer napagent manual stopped Network Access Protection Agent NcaSvc manual stopped Network 
Connectivity Assistant Netlogon automatic started Netlogon Netman manual stopped Network Connections netprofm manual started Network List Service NetTcpPortSharing disabled stopped Net.Tcp Port Sharing Service NlaSvc automatic started Network Location Awareness nsi automatic started Network Store Interface Service PerfHost manual stopped Performance Counter DLL Host pla manual stopped Performance Logs & Alerts PlugPlay manual started Plug and Play PolicyAgent manual started IPsec Policy Agent Power automatic started Power PrintNotify manual stopped Printer Extensions and Notifications ProfSvc automatic started User Profile Service QBCFMonitorService automatic started QuickBooks Database Manager Service QBFCService manual stopped Intuit QuickBooks FCS QuickBooksDB17 automatic started QuickBooksDB17 RasAuto manual stopped Remote Access Auto Connection Manager RasMan manual stopped Remote Access Connection Manager RemoteAccess disabled stopped Routing and Remote Access RemoteRegistry automatic stopped Remote Registry RpcEptMapper automatic started RPC Endpoint Mapper RpcLocator manual stopped Remote Procedure Call (RPC) Locator RpcSs automatic started Remote Procedure Call (RPC) RSoPProv manual stopped Resultant Set of Policy Provider sacsvr manual stopped Special Administration Console Helper SamSs automatic started Security Accounts Manager SCardSvr disabled stopped Smart Card ScDeviceEnum manual stopped Smart Card Device Enumeration Service Schedule automatic started Task Scheduler SCPolicySvc manual stopped Smart Card Removal Policy seclogon manual stopped Secondary Logon SENS automatic started System Event Notification Service SessionEnv manual started Remote Desktop Configuration SharedAccess disabled stopped Internet Connection Sharing (ICS) ShellHWDetection automatic stopped Shell Hardware Detection smphost manual stopped Microsoft Storage Spaces SMP smstsmgr manual stopped ConfigMgr Task Sequence Agent SNMP automatic started SNMP Service SNMPTRAP automatic 
started SNMP Trap Spooler automatic started Print Spooler sppsvc automatic stopped Software Protection SSDPSRV disabled stopped SSDP Discovery SstpSvc manual stopped Secure Socket Tunneling Protocol Service svsvc manual stopped Spot Verifier swprv manual stopped Microsoft Software Shadow Copy Provider SysMain manual stopped Superfetch SystemEventsBroker automatic started System Events Broker TapiSrv manual stopped Telephony TermService manual started Remote Desktop Services Themes automatic started Themes THREADORDER manual stopped Thread Ordering Server TieringEngineService manual stopped Storage Tiers Management TrkWks automatic started Distributed Link Tracking Client TrustedInstaller manual stopped Windows Modules Installer UALSVC automatic started User Access Logging Service UI0Detect manual stopped Interactive Services Detection UmRdpService manual started Remote Desktop Services UserMode Port Redirector upnphost disabled stopped UPnP Device Host VaultSvc manual stopped Credential Manager vds manual stopped Virtual Disk VGAuthService automatic started VMware Alias Manager and Ticket Service vmicguestinterface manual stopped Hyper-V Guest Service Interface vmicheartbeat manual stopped Hyper-V Heartbeat Service vmickvpexchange manual stopped Hyper-V Data Exchange Service vmicrdv manual stopped Hyper-V Remote Desktop Virtualization Service vmicshutdown manual stopped Hyper-V Guest Shutdown Service vmictimesync manual stopped Hyper-V Time Synchronization Service vmicvss manual stopped Hyper-V Volume Shadow Copy Requestor VMTools automatic started VMware Tools vmvss manual stopped VMware Snapshot Provider VSS manual stopped Volume Shadow Copy W32Time manual started Windows Time w3logsvc manual stopped W3C Logging Service W3SVC automatic started World Wide Web Publishing Service WAS manual started Windows Process Activation Service Wcmsvc automatic started Windows Connection Manager WcsPlugInService manual stopped Windows Color System WdiServiceHost manual stopped 
Diagnostic Service Host WdiSystemHost manual stopped Diagnostic System Host Wecsvc manual stopped Windows Event Collector WEPHOSTSVC manual stopped Windows Encryption Provider Host Service wercplsupport manual stopped Problem Reports and Solutions Control Panel Support WerSvc manual stopped Windows Error Reporting Service WinHttpAutoProxySvc manual started WinHTTP Web Proxy Auto-Discovery Service Winmgmt automatic started Windows Management Instrumentation WinRM automatic started Windows Remote Management (WS-Management) wmiApSrv manual stopped WMI Performance Adapter WPDBusEnum manual stopped Portable Device Enumerator Service WSService manual stopped Windows Store Service (WSService) wuauserv automatic started Windows Update wudfsvc manual stopped Windows Driver Foundation - User-mode Driver Framework

On Mon, Oct 17, 2016 at 4:10 PM, Japheth Cleaver <user-87556346d4af@xymon.invalid> wrote:
Hmm. Does the data after the corrupted lines appear to match the remaining data for the server in question? From the sample below it seems not (as I believe this is reported in alphabetical order), which might indicate a broader memory corruption issue going on within xymond_client, where it's somehow losing track of the end or garbling the data in the buffer being used for holding status output.

If it's causing a false positive, then it's not merely the final output that's the problem, but something occurring earlier in processing.

What OS+distro is the server running on? Any chance you might be able to run xymond_client in debug mode for a bit while this is occurring?

-jc

On 10/17/2016 7:56 AM, Greg Krpan wrote:
Hi JC-

Thanks for the response. I am using Xymon 4.3.27 currently. The raw client data looks fine: there are no corrupted lines and no added brackets or special characters that I can see. This only occurs on the status pages.
The server has been running since May, and this particular problem started at the end of September, after running Windows Update on my servers. However, as both Windows and Linux clients are showing the behavior, I have ruled out the updates as the issue. I have tried restarting the service, with no effect on the behavior, and there is nothing in the log files that shows a problem that I can see. The level of false positives due to formatting errors has remained relatively consistent and tends to be limited to the PROCS (Windows, Linux) and SVCS (Windows only) tests. I occasionally see the same error on the DISK and CPU tests, although that is significantly less frequent and is not across all configured machines. The PROCS/SVCS tests show random errors on one machine or another approximately every 5 minutes.

Thanks
Greg.

On Fri, Oct 14, 2016 at 6:52 PM, J.C. Cleaver <user-87556346d4af@xymon.invalid> wrote:
On Fri, October 14, 2016 3:52 pm, Greg Krpan wrote:
Recently, my monitoring has been generating frequent errors that are false, due to improper formatting. It is happening on both Windows and Linux clients. I've included an example of how the tests are sending data back to the xymon server. I have not made any changes to my client or server configurations. Has anyone else been experiencing this behavior, or know of a fix?
Greg.
Name StartupType Status DisplayName AeLookupSvc manual stopped Application Experience ALG manual stopped Application Layer Gateway Service AppIDSvc manual stopped Application Identity Appinfo manual stopped Application Information AppMgmt manual stopped Application Management AppReadiness manual stopped App Readiness AppXSvc manual stopped AppX Deployment Service (AppXSVC) AudioEndpointBuilder manual toppe] Windows Audio Endpoint Builder Audiosrv manual stopped Windows Audio BBWin automatic started Big Brother Xymon Client BFE automatic started Base Filtering Engine BITS automatic started Background Intelligent Transfer Serv ce BrokerInfrastructure ] automatic started Background Tasks Infrastructure Service Browser disabled stopped Computer Browser CcmExec automatic started SMS Agent Host CertPropSvc manual started Certificate Propagation CmRcService disabled stopped Configuration Manager Remote Control COMSysApp manual started COM+ Sys] m Application CryptSvc] ] utomatic started Cr] tographic Services DcomLaunch ] automatic sta] ed DCOM Serv] Process Launcher defra]svc manual stopped Optimize drives DeviceAssociationService manual stopped Device Association Service

Hi Greg,

Is there anything unusual about the process names on the lines immediately before the corruption? There's a known issue in that lines starting with a bracket will cause missing data, and this can happen more frequently on Windows servers just by virtue of some of the data that's coming across, but that doesn't appear to be causing this specific issue.

Can you confirm which version of Xymon server you're using? Do you see the same corruption in the "raw" Client Data for the affected servers, or is it only occurring on the status pages?

Also -- anything unusual in the log files? Has this problem been constant since it started, or is it getting worse? Does restarting the xymon service fix it (temporarily)?
Regards,
-jc

--
In honor of those who lost their lives exploring the final frontier:
Apollo 1; January 27, 1967
Virgil "Gus" Ivan Grissom, Edward Higgins White II, Roger Bruce Chaffee
Space Shuttle Challenger, Mission STS-51-L; January 28, 1986
Francis R. Scobee, Michael J. Smith, Judith A. Resnik, Ellison S. Onizuka, Ronald E. McNair, Gregory B. Jarvis, Sharon Christa McAuliffe
Space Shuttle Columbia, Mission STS-107; February 1, 2003
Rick D. Husband, William C. McCool, Michael P. Anderson, Kalpana Chawla, David M. Brown, Laurel Blair Salton Clark, Ilan Ramon
list Japheth Cleaver
On 10/24/2016 10:59 AM, Greg Krpan wrote:
I haven't noticed any errors through xymond_client or from debug mode.
After running the xymoncmd line above I get the following:
./xymoncmd xymon 127.0.0.1 "xymondlog wstc-lr0dhcp1.svcs"
wstc-lr0dhcp1|svcs|red||1477331698|1477331698|1477333498|0|0|15.1.160.11|460159|||Y|
red Mon Oct 24 11:55:55 2016 - Services NOT ok
&red BBWin: No matching service - want started/automatic
&green DHCPServer is started/automatic - want started/automatic
&green McAfeeFramework is started/automatic - want started/automatic
&green McShield is started/automatic - want started/automatic
&green McTaskManager is started/automatic - want started/automatic
&green VMTools is started/automatic - want started/automatic
Name StartupType Status DisplayName
AeLookupSvc manual stopped Application Experience
ALG manual stopped Application Layer Gateway Service
AppIDSvc manual stopped Application Identity
Appinfo manual
sta]ted Application Information
AppMgmt manual stopped Application Management
AppReadiness manual stopped App Readiness
AppXSvc manual stopped AppX Deployment Service (AppXSVC)
AudioEndpointBuilder manual stopped Windows Audio Endpoint Builder
Audiosrv ]
manual st]
ped Windows Audio
]BWin automatic started Big Brother Xymon Client
BFE automatic started Base Filtering Engine
*snip*

Thanks; that confirms that the issue involves xymond_client or xymond, and isn't related to the web display. Looking through the changes from 4.3.25 to 4.3.27, it's hard to see what might be causing this issue.

Is there any chance you're under significant memory pressure on this machine? Would you be able to add some glibc debugging at all? If so, would you be able to add:

(export) MALLOC_CHECK_=3
(export) MALLOC_PERTURB_=1

... into the environment? This might help trigger a memory issue that could otherwise go unnoticed.

Alternatively, the next step might be to downgrade to 4.3.25 and see if that fixes the problem (if so, that would really indicate there's a specific hidden issue here).

Also, it might be interesting to see if the el7 Terabithia RPMs show the same problem for you. There was a significant increase in lookup/buffer debugging in xymond_client in there that's also in the 4.x-master branch but isn't in 4.3.x when compiled from source.

Regards,
-jc
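The corruption visible in the quoted xymondlog output (rows split across lines, stray "]" characters) could be searched for mechanically in saved status text. The following is only a rough sketch: the function name, the row pattern, and the rule that any "]" is suspicious are assumptions made for illustration and are not part of Xymon or BBWin.

```python
import re

# Assumed row layout for the svcs table: name, startup type, status, display name.
ROW = re.compile(r"^(\S+)\s+(automatic|manual|disabled)\s+(started|stopped)\s+\S.*$")
HEADER = ["Name", "StartupType", "Status", "DisplayName"]


def find_suspect_lines(status_text):
    """Return (line_number, line) pairs that look corrupted.

    Heuristics (assumptions, not Xymon behaviour):
    - only lines after the table header are checked
    - a row is suspect if it contains "]" or does not match ROW
    """
    suspects = []
    in_table = False
    for number, line in enumerate(status_text.splitlines(), start=1):
        stripped = line.strip()
        if not in_table:
            # Skip everything up to and including the header row.
            in_table = stripped.split() == HEADER
            continue
        if not stripped:
            continue
        if "]" in stripped or not ROW.match(stripped):
            suspects.append((number, line))
    return suspects


# A few rows taken from the status quoted in this thread.
sample = """Name StartupType Status DisplayName
ALG manual stopped Application Layer Gateway Service
Appinfo manual
sta]ted Application Information
AppMgmt manual stopped Application Management
"""

for number, line in find_suspect_lines(sample):
    print(number, repr(line))
```

Run against output saved from the xymondlog command on several hosts, a check like this might show whether the damage clusters around particular services or is spread at random. A display name that legitimately contains "]" would be flagged as a false hit.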
list Greg Krpan
So the memory usage on the machine is fairly high. This system is a VM, and was built with only 2GB of memory, of which about 1.8GB is in use. I have a maintenance window coming up this week where I am going to increase the memory available to the server, but I will also try to inject the two MALLOC debugging variables that you suggested, to watch for additional issues as time goes on. Hopefully that can help to identify where the problem lies, and I can be a testbed to help determine a resolution.

Greg.

On Mon, Oct 24, 2016 at 11:57 PM, Japheth Cleaver <user-87556346d4af@xymon.invalid> wrote:
On 10/24/2016 10:59 AM, Greg Krpan wrote:
I haven't noticed any errors through xymond_client or from debug mode.
After running the xymoncmd line above I get the following:
./xymoncmd xymon 127.0.0.1 "xymondlog wstc-lr0dhcp1.svcs"
wstc-lr0dhcp1|svcs|red||1477331698|1477331698|1477333498|0|0|15.1.160.11|460159|||Y|
red Mon Oct 24 11:55:55 2016 - Services NOT ok
&red BBWin: No matching service - want started/automatic
&green DHCPServer is started/automatic - want started/automatic
&green McAfeeFramework is started/automatic - want started/automatic
&green McShield is started/automatic - want started/automatic
&green McTaskManager is started/automatic - want started/automatic
&green VMTools is started/automatic - want started/automatic
Name StartupType Status DisplayName
AeLookupSvc manual stopped Application Experience
ALG manual stopped Application Layer Gateway Service
AppIDSvc manual stopped Application Identity
Appinfo manual
sta]ted Application Information
AppMgmt manual stopped Application Management
AppReadiness manual stopped App Readiness
AppXSvc manual stopped AppX Deployment Service (AppXSVC)
AudioEndpointBuilder manual stopped Windows Audio Endpoint Builder
Audiosrv ]
manual st]
ped Windows Audio
]BWin automatic started Big Brother Xymon Client
BFE automatic started Base Filtering Engine
*snip*

Thanks; that confirms that the issue involves xymond_client or xymond, and isn't related to the web display. Looking through the changes from 4.3.25 to 4.3.27, it's hard to see what might be causing this issue.

Is there any chance you're under significant memory pressure on this machine? Would you be able to add some glibc debugging at all? If so, would you be able to add:

(export) MALLOC_CHECK_=3
(export) MALLOC_PERTURB_=1

... into the environment? This might help trigger a memory issue that could otherwise go unnoticed.

Alternatively, the next step might be to downgrade to 4.3.25 and see if that fixes the problem (if so, that would really indicate there's a specific hidden issue here).

Also, it might be interesting to see if the el7 Terabithia RPMs show the same problem for you. There was a significant increase in lookup/buffer debugging in xymond_client in there that's also in the 4.x-master branch but isn't in 4.3.x when compiled from source.

Regards,
-jc
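For reference, the two glibc variables mentioned in this exchange only take effect if they are present in the environment of the process being debugged. Below is a minimal sketch of passing them to a child process. The helper name is hypothetical, and the child here is a throwaway Python process that just echoes the values back; how the variables reach xymond_client on a real install depends on how Xymon is launched there, which is not shown in this thread.

```python
import os
import subprocess
import sys


def run_with_malloc_debug(command, check_level="3", perturb_byte="1"):
    """Run `command` with the glibc malloc debugging variables set.

    MALLOC_CHECK_=3 asks glibc to report detected heap errors and abort;
    MALLOC_PERTURB_ sets a fill byte for allocated and freed memory.
    Newer glibc releases may need extra setup for MALLOC_CHECK_ to apply.
    """
    env = dict(os.environ)  # copy, so the parent environment is untouched
    env["MALLOC_CHECK_"] = check_level
    env["MALLOC_PERTURB_"] = perturb_byte
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Demonstration only: the child reports the variables it received.
demo = run_with_malloc_debug(
    [sys.executable, "-c",
     "import os; print(os.environ['MALLOC_CHECK_'], os.environ['MALLOC_PERTURB_'])"]
)
print(demo.stdout.strip())
```

Setting the variables per process like this keeps them out of the rest of the system, which may matter on a production server that is already short on memory.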