Xymon Mailing List Archive

Green status during a total blackout

11 messages in this thread

list L.M.J · Mon, 25 Jul 2011 11:28:30 +0200 ·
 Hi,

   My Xymon server (running Hobbit 4.2.2) was shut down for 6 hours. When the power came back, Xymon came back too. I checked all monitored equipment, and everything showed green for the past 6 hours. I was expecting purple or white because no data was reported during that time. Is that a bug, or a setting I can adjust?

   Thanks.
list Josh Luthman · Mon, 25 Jul 2011 09:18:21 -0400 ·
It won't turn purple until it runs again, about 5 minutes.  That is, if it
doesn't obtain data that is more than 30 minutes old before it runs.
list L.M.J · Wed, 27 Jul 2011 06:45:53 +0200 ·
quoted from L.M.J
On Mon, 25 Jul 2011 11:28:30 +0200, "L.M.J" <user-78bb6d5d9024@xymon.invalid> wrote:
When the power came back, Xymon came back too. I checked all monitored equipment, and everything showed green for the past 6 hours. I was expecting purple or white because no data was reported during that time. Is that a bug, or a setting I can adjust?
Hi,

  Has nobody had this issue? My boss asked me when the power outage occurred, and I
  could not tell him: everything stayed green the whole time :-/

-- 
 LMJ
 "May the source be with you my young padawan"
 http://sites.google.com/site/imatruelinuxmasterjedi/
list David Baldwin · Wed, 27 Jul 2011 15:49:20 +1000 ·
quoted from L.M.J
On 27/07/11 2:45 PM, L.M.J wrote:
  Has nobody had this issue? My boss asked me when the power outage occurred, and I
  could not tell him: everything stayed green the whole time :-/
Status colour has to change explicitly - i.e. by notification of status
change. If the server died when the power went off, and came back when
everything was back again it would have no reason to show anything had
changed. Purple only appears when a status has expired, i.e. no fresh
update arrived within its validity period while Xymon was running. If
Xymon had been running the whole time (e.g. on a UPS) you would not have
seen all green.

David.

-- 
David Baldwin - Assistant Director, Infrastructure (acting)
Information and Communication Technology Services
Australian Sports Commission          http://ausport.gov.au
Tel 02 62147830 Fax 02 62141830       PO Box 176 Belconnen ACT 2616
user-cbbf693f2c89@xymon.invalid          Leverrier Street Bruce ACT 2617


list Vernon Everett · Wed, 27 Jul 2011 14:00:44 +0800 ·
When you had the outage, did your Xymon server go down too?
If it did, then what you are seeing makes sense.

Xymon looks for status changes.
If everything went down, including Xymon, then it would never have
received any messages, nor been able to perform any other tests, like
ping etc.

I am guessing that when power was restored, you brought up all your
other servers first, and then started Xymon.
Or they all started up around the same time.

When it came back up, it would have assigned the last known status to
each server.
Last known status of all servers was green. At this point, all pings
and other tests would return green, so there is no status change to
report. So no errors.
(At worst, I would expect your CPU columns to go yellow, for an hour,
with the "Machine recently rebooted" message)
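
The "last known status" behaviour described above can be illustrated with a toy model. This is only a minimal Python sketch under stated assumptions, not Xymon's actual code: the `StatusHistory` class and its fields are hypothetical, and times are plain minutes. It shows why a history that records only colour changes stays empty when the monitoring server is down together with everything else.

```python
from dataclasses import dataclass, field


@dataclass
class StatusHistory:
    """Toy event-driven store (hypothetical, not Xymon's real data model)."""
    color: str = "green"
    changes: list = field(default_factory=list)  # (minute, new_color) entries

    def report(self, minute, color):
        # A history entry is written only when the colour actually changes.
        if color != self.color:
            self.changes.append((minute, color))
            self.color = color


# Case 1: the monitoring server loses power together with everything else.
down_with_the_rest = StatusHistory()
for minute in range(0, 60, 5):        # green reports until the cut at minute 60
    down_with_the_rest.report(minute, "green")
# minutes 60..419: server is off, so nothing is received or recorded
for minute in range(420, 480, 5):     # green reports again after the restart
    down_with_the_rest.report(minute, "green")

# Case 2: the monitoring server stays up (separate UPS) and sees the failure.
stayed_up = StatusHistory()
stayed_up.report(0, "green")
stayed_up.report(60, "red")           # ping test fails once the host is down
stayed_up.report(420, "green")        # host answers again

print(down_with_the_rest.changes)     # []
print(stayed_up.changes)              # [(60, 'red'), (420, 'green')]
```

In the first case the history has nothing to show for the six hours, which matches the all-green display reported in this thread.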

The only way to prevent this in future is to make sure your Xymon
server is on a UPS.
Better still, on a different UPS to the rest of your kit.
That way, it will stay up, and happily report that everything else had
gone to hell in a handbasket.

If your manager needs it, you can probably get an idea of the start and
end time of the outage from your messages files and other logs.
They will all show a boot message at around the time of the power recovery.
Checking the last timestamp before the recovery will give you an
idea of the outage start time.
Check a few servers; the latest such value is the correct one.
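
The log-based estimate above could be sketched as follows. This is an illustration only: the host names and timestamps are invented, `largest_gap` and `estimate_outage` are hypothetical helpers, and taking the earliest boot entry as the end of the outage is an assumption rather than something stated in this thread.

```python
from datetime import datetime


def largest_gap(timestamps):
    """Return (last_before, first_after) around the widest silence in one host's log."""
    ordered = sorted(timestamps)
    return max(zip(ordered, ordered[1:]), key=lambda pair: pair[1] - pair[0])


def estimate_outage(log_times_by_host):
    gaps = [largest_gap(times) for times in log_times_by_host.values()]
    start = max(before for before, _ in gaps)  # latest "last entry before the cut"
    end = min(after for _, after in gaps)      # earliest boot entry (an assumption)
    return start, end


# Invented timestamps for two hypothetical hosts.
logs = {
    "web1": [datetime(2011, 7, 24, 22, 10), datetime(2011, 7, 24, 22, 40),
             datetime(2011, 7, 25, 4, 52), datetime(2011, 7, 25, 5, 0)],
    "db1": [datetime(2011, 7, 24, 22, 5), datetime(2011, 7, 24, 22, 47),
            datetime(2011, 7, 25, 4, 55), datetime(2011, 7, 25, 5, 1)],
}

start, end = estimate_outage(logs)
print(start, end)  # 2011-07-24 22:47:00 2011-07-25 04:52:00
```

Extracting the timestamps from real syslog files is left out here, since the log format differs between systems.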

The root cause of your issue, or rather the lack of issue, is that
Xymon never received any reports of problems. Because it couldn't.

Hope that helps.

Regards
     Vernon

list L.M.J · Wed, 27 Jul 2011 12:45:45 +0200 ·
quoted from Vernon Everett
 On Wed, 27 Jul 2011 14:00:44 +0800, Vernon Everett wrote:
When you had the outage, did your Xymon server go down too?
If it did, then what you are seeing makes sense.

Xymon looks for status changes.
If everything went down, including Xymon, then it would never have
received any messages, nor been able to perform any other tests, like
ping etc.

I am guessing that when power was restored, you brought up all your
other servers first, and then started Xymon. Or they all started up around the same time.

The root cause of your issue, or rather the lack of issue, is that
Xymon never received any reports of problems. Because it couldn't.

 Yes, that's exactly what happened. Is no one surprised by this behaviour?

 I would expect, at least, a white stripe in the "History" page of each status check. Nothing was reported for 6 hours, which never happens in everyday operation.
list Neil Simmonds · Wed, 27 Jul 2011 11:54:04 +0100 ·
If your Xymon server was down how could it recognise that it had not received any data?

If it received data within 5 minutes of it restarting then I would fully expect it to show green for the intervening period. I don't think Xymon can recognise that it was itself down and therefore show white/purple in the history. The timer that causes these statuses would have been reset by the reboot which would mean that as long as it then receives data from the clients within the new refreshed timer period it would stay green.
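
The timer reset described above can be modelled in a few lines. This is a toy Python sketch, not Xymon internals: `StatusTimer` is a hypothetical class, times are plain minutes, and the 30-minute validity is the figure mentioned earlier in this thread.

```python
VALIDITY = 30  # minutes a status stays valid; the 30-minute figure is from this thread


class StatusTimer:
    """Toy staleness timer (hypothetical model, not Xymon internals)."""

    def __init__(self, now, color="green"):
        # On (re)start the clock begins at "now", not at the last update
        # received before the server went down.
        self.color = color
        self.valid_until = now + VALIDITY

    def update(self, now, color):
        self.color = color
        self.valid_until = now + VALIDITY

    def color_at(self, now):
        return "purple" if now > self.valid_until else self.color


# Server that stayed up: last update at minute 55, nothing afterwards.
always_on = StatusTimer(now=0)
always_on.update(55, "green")
print(always_on.color_at(120))   # purple

# Server that was off from minute 60 and restarted at minute 420.
restarted = StatusTimer(now=420, color="green")
restarted.update(425, "green")   # clients report within the fresh window
print(restarted.color_at(430))   # green
```

Only the server that kept running ever sees the validity period run out; the restarted one starts a fresh window and receives data before it expires.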

The only way I can think of that would allow you to see when the power outage occurred on Xymon would be to have your Xymon server attached to a UPS that would keep it alive for at least 10 minutes.
list Craig Whilding · Wed, 27 Jul 2011 11:55:21 +0100 ·
quoted from L.M.J

On 27/07/11 11:45, L.M.J wrote:
On Wed, 27 Jul 2011 14:00:44 +0800, Vernon Everett wrote:
When you had the outage, did your Xymon server go down too?
If it did, then what you are seeing makes sense.

Xymon looks for status changes.
If everything went down, including Xymon, then it would never have
received any messages, nor been able to perform any other tests, like
ping etc.

I am guessing that when power was restored, you brought up all your
other servers first, and then started Xymon. Or they all started up around the same time.

The root cause of your issue, or rather the lack of issue, is that
Xymon never received any reports of problems. Because it couldn't.

Yes, that's exactly what happened. Is no one surprised by this behaviour?

I would expect, at least, a white stripe in the "History" page of each status check. Nothing was reported for 6 hours, which never happens in everyday operation.
In our case the UPS would alert Xymon that it's on battery and has lost input power. Then it would start shutting down boxes after a few minutes, causing box status reports to change colour, and then it would shut down cleanly.

Everything would already be red before it's turned back on again.
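
A UPS-triggered report like the one described above is an ordinary Xymon status message. The Python sketch below builds such a message in the `status HOSTNAME.TESTNAME COLOR text` form; the host name `ups1.example.com`, the `power` column and the message text are made up for illustration, and the `send_to_xymon` helper is a hypothetical stand-in for the bundled command-line client that would normally do the sending.

```python
import socket

XYMON_PORT = 1984  # standard Xymon/Hobbit TCP port


def build_status(hostname, testname, color, text, lifetime_minutes=None):
    """Format a Xymon 'status' message for one host/test column."""
    # Dots in the hostname are written as commas so that the last dot
    # separates the host from the test name.
    host_token = hostname.replace(".", ",")
    command = "status" if lifetime_minutes is None else f"status+{lifetime_minutes}"
    return f"{command} {host_token}.{testname} {color} {text}"


def send_to_xymon(server, message, port=XYMON_PORT, timeout=10):
    """Send one message over TCP; not called in this sketch."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(message.encode("utf-8"))


msg = build_status("ups1.example.com", "power", "red", "On battery: input power lost")
print(msg)  # status ups1,example,com.power red On battery: input power lost
```

Because the red status arrives before the Xymon server itself shuts down, the change is already in the history when the server comes back.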

As part of a clean shutdown, though, maybe Xymon shouldn't leave stuff green for the duration it's turned off. I've never really checked what it does at that point.

Craig
list L.M.J · Wed, 27 Jul 2011 13:05:36 +0200 ·
quoted from Neil Simmonds
 On Wed, 27 Jul 2011 11:54:04 +0100, Neil Simmonds wrote:
If your Xymon server was down how could it recognise that it had not
received any data?
 Yes, but when it restarted, couldn't it see that no data or graphs had been updated for hours? What about updating the RRD graphs with a 6-hour hole?
 It turns purple when no data has been reported for 30 minutes, right? Xymon switches to purple on its own, without receiving a notification, doesn't it?

 BTW : I'm already on UPS, but it ran out of power :-/
 Also, Hobbit starts after all other servers due to server-side  dependencies.
list Henrik Størner · Fri, 29 Jul 2011 17:47:50 +0200 ·
quoted from L.M.J
On 27-07-2011 13:05, L.M.J wrote:
On Wed, 27 Jul 2011 11:54:04 +0100, Neil Simmonds wrote:
If your Xymon server was down how could it recognise that it had not
received any data?
Yes, but when it restarted, couldn't it see that no data or graphs had been
updated for hours? What about updating the RRD graphs with a 6-hour hole?
It turns purple when no data has been reported for 30 minutes, right? Xymon
switches to purple on its own, without receiving a notification, doesn't it?
You have a valid point, but Xymon is - by design - essentially event-driven when it comes to updates: It won't update anything unless it receives some sort of notification that something has changed. And when it is down, it doesn't receive anything.

Yes, Xymon could look at the RRD files and see that they haven't been updated. There might be other reasons for that, though - e.g. the xymond_rrd module might have crashed, but the rest of Xymon has been running fine.
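
Looking for such a hole by hand might be sketched as below. This is not something Xymon does, only a hedged illustration: it assumes the text layout printed by `rrdtool fetch` (a header line with data source names, then `timestamp: value` rows with `nan` or `-nan` for unknown data), and the sample rows and the `find_holes` helper are invented.

```python
import math


def find_holes(fetch_output):
    """Return (first_ts, last_ts) for each run of rows containing only NaN values."""
    holes = []
    run_start = run_end = None
    for line in fetch_output.splitlines():
        stamp, sep, rest = line.partition(":")
        values = rest.split()
        if not sep or not stamp.strip().isdigit() or not values:
            continue  # header line, blank line, or anything that is not a data row
        timestamp = int(stamp)
        if all(math.isnan(float(v)) for v in values):
            if run_start is None:
                run_start = timestamp
            run_end = timestamp
        elif run_start is not None:
            holes.append((run_start, run_end))
            run_start = None
    if run_start is not None:
        holes.append((run_start, run_end))
    return holes


# Invented sample in the assumed `rrdtool fetch` layout.
SAMPLE = """\
                             la

1311540000: 1.2000000000e-01
1311540300: nan
1311540600: -nan
1311540900: 3.4000000000e-01
"""

print(find_holes(SAMPLE))  # [(1311540300, 1311540600)]
```

As noted above, a hole found this way only says that the RRD file was not updated; it cannot tell a power outage apart from a crashed xymond_rrd module.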

And what happens if all your tests go purple immediately when Xymon is brought back on-line ? You'll potentially fire off all the alerts you have configured. At a time when things are really running OK...

So it's a design choice to behave the way it does. If your entire datacenter has been down because of a power outage, people will know. They don't need the Xymon history to learn about that.


Regards,
Henrik
list Ryan Novosielski · Fri, 29 Jul 2011 12:11:24 -0400 ·
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I think he does have a point here... couldn't it retroactively, when
drawing up history and graphs, say "there's a big hole here where
nothing was erased?" The way Xymon so far processes information might
make this a non-trivial task, however -- don't know.
- -- 
- ---- _  _ _  _ ___  _  _  _
|Y#| |  | |\/| |  \ |\ |  | |Ryan Novosielski - Sr. Systems Programmer
|$&| |__| |  | |__/ | \| _| |user-ae4522577e16@xymon.invalid - 973/972.0922 (2-0922)
\__/ Univ. of Med. and Dent.|IST/CST-Academic Svcs. - ADMC 450, Newark
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk4y26wACgkQmb+gadEcsb4VyACgnErBwoS6A0/bcvoHOzYhq/9Q
W7AAn3UUexXGw2IOOA1MA8Xc2wyggDoo
=CLMP
-----END PGP SIGNATURE-----