questions

Jeremy Laidman
Fri, 23 Jun 2017 01:02:10 +1000
Message-Id: <CAAnki7C=user-3946108e5fbc@xymon.invalid>

Chad

What does the director do? How does it communicate with the servers?

Does the director or the server create a log message when there's a
problem? Xymon can detect and alarm on that.

Does the director connect to the servers on a specific TCP port? If that
port is rejecting connections, the Xymon server can test for that (typically
every 5 minutes, but it can be more often) and alarm on it.
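
To make the idea concrete, here is a minimal sketch of what a TCP port probe boils down to. It is not Xymon's own network tester, just an illustration in Python; the function name and the timeout value are my own choices.

```python
import socket


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

A rejected or timed-out connection comes back as False, which is the condition you would alarm on.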

When a server fails, does it stop listening on a particular TCP port? Or
perhaps a process crashes and restarts, causing the connection to fail?
Xymon can test for these and alarm when it detects a missing listening or
established TCP socket, or a missing process.

It's also possible to write a script that has Xymon look at the process
listing ("ps" output), find a particular process's lifetime, and alert
when it's less than 5 minutes.
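
A rough sketch of such a lifetime check, in Python. It assumes a Linux "ps" from procps-ng, where the "etimes" keyword gives the elapsed time in seconds; other platforms may only offer "etime" in a different format. The process name "appserver" in the usage note is a placeholder, not something from your setup.

```python
import subprocess


def young_processes(ps_output, name, min_age_seconds=300):
    """Return (pid, age_seconds) for processes called `name` younger than the threshold."""
    young = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) != 3:
            continue
        pid, etimes, comm = fields
        if not (pid.isdigit() and etimes.isdigit()):
            continue  # skips a header line, if one is present
        if comm.strip() == name and int(etimes) < min_age_seconds:
            young.append((int(pid), int(etimes)))
    return young


def read_ps():
    """Run ps and return its output; the trailing '=' suppresses the column headers."""
    result = subprocess.run(
        ["ps", "-eo", "pid=,etimes=,comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling young_processes(read_ps(), "appserver") returns an empty list when every matching process has been up for at least 5 minutes; anything else suggests a recent restart.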

One thing to note is that Xymon's probes and processes typically look for
things every 5 minutes. Transient failures that come and go within a few
seconds may not be detected by the standard probes and checks. The
frequency of some of these probes can be increased to make it more likely
that failures are caught, and a custom script can be written to check the
state as often as you need. For transient faults, however, it's more
reliable to look for artefacts of a failure (log errors and warnings, short
process lifetime) than to check periodically for a successful state.
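
As an illustration of looking for artefacts, the sketch below picks out log lines that mention an error or warning. The pattern and the log format in the usage note are assumptions for the example; a real check would use whatever your application actually writes.

```python
import re

# Assumed pattern: match the words "error" or "warning" in any letter case.
FAILURE_PATTERN = re.compile(r"\b(error|warning)\b", re.IGNORECASE)


def failure_lines(log_lines):
    """Return the log lines that look like evidence of a failure."""
    return [line.rstrip("\n") for line in log_lines if FAILURE_PATTERN.search(line)]
```

Feeding it the lines written since the last run means a fault that lasted only a few seconds is still caught, because the log entry stays behind after the fault clears.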

J


On 22 June 2017 at 22:59, Chad Rodriguez <user-d4c92497b647@xymon.invalid> wrote:
Symptom: we open up the director and see application servers not
communicating, while at the same time we can ping the servers by hostname
and IP.


Respectfully,


Chad Rodriguez | Systems Administrator

XXXXX N. XXth Ave., Phoenix, AZ XXXXX

office: XXX-XXX-XXXX | fax: XXX-XXX-XXXX

email – user-d4c92497b647@xymon.invalid

*Upcoming Out-of-Office dates**:*

*June 26th through July 4th*

*July 21st*


*From:* Jeremy Laidman [mailto:user-71895fb2e44c@xymon.invalid]
*Sent:* Thursday, June 22, 2017 3:23 AM
*To:* Chad Rodriguez <user-d4c92497b647@xymon.invalid>
*Cc:* xymon at xymon.com
*Subject:* Re: [Xymon] questions


Chad


Situations like what exactly? When a server is rebooted? Or when a server
stops communicating? Can you explain what symptoms you see? What is a
"director"? Sorry, I'm not familiar with the SolarWinds product.


Out of the box, Xymon can detect a few different types of communication
issues (eg ping checks, TCP port responses) as well as monitoring logfiles
for messages that indicate trouble. Furthermore, Xymon is highly
extensible, so if you can write a script to perform a test for your
problem, you can turn it into a message for Xymon to display, and
optionally alarm via email or other means.
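
For a feel of what "turning it into a message" means, here is a minimal sketch in Python. A Xymon status report is a line of text of the form "status HOSTNAME.TESTNAME COLOR text", with the dots in the hostname written as commas. The test name "director" and the host name in the usage note are made up, and the sketch assumes the XYMON and XYMSRV environment variables that Xymon normally provides to extension scripts.

```python
import os
import subprocess


def build_status(hostname, test, color, text):
    """Build a Xymon status message; dots in the hostname become commas."""
    host = hostname.replace(".", ",")
    return f"status {host}.{test} {color} {text}"


def send_status(message):
    """Hand the message to the xymon client program for delivery to the server."""
    # XYMON is the path to the xymon command and XYMSRV the server address;
    # both are assumed to be set in the script's environment.
    subprocess.run([os.environ["XYMON"], os.environ["XYMSRV"], message], check=True)
```

For example, build_status("app01.example.com", "director", "red", "no connection to director") gives "status app01,example,com.director red no connection to director", which send_status would then deliver.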


Cheers

Jeremy


On 22 June 2017 at 07:07, Chad Rodriguez <user-d4c92497b647@xymon.invalid> wrote:

We have no monitoring in place other than SolarWinds, which monitors
heartbeats. Essentially we have a few application servers that are randomly
not communicating with the director, and we're having to reboot them. Would
your application alert on situations like this by email notification?


Respectfully,


Chad Rodriguez | Systems Administrator

XXXXX N. XXth Ave., Phoenix, AZ XXXXX

office: XXX-XXX-XXXX | fax: XXX-XXX-XXXX

email – user-d4c92497b647@xymon.invalid

*Upcoming Out-of-Office dates**:*

*June 26th through July 4th*

*July 21st*