xymon disk not alerting at 100%, need another set of eyes

Japheth Cleaver
Thu, 5 Jan 2017 12:10:59 -0800
Message-Id: <user-02c7866151e7@xymon.invalid>

Eyeballing it, it seems to be correct, and if Windows matches are
working then it seems like class (or at least OS) is being sensed
properly. Can you put xymond_client in debug mode (-USR2) and show the
output from the processing of the disk section for this client? It
should indicate the thresholds it *thinks* apply to this host.

Also, when running manually like this:
     /usr/libexec/xymon/xymond_client --dump-config
...can you prefix with xymoncmd and see if anything changes? Weird
configs I'd forgotten about in xymonserver.cfg have bitten me on occasion.

-jc

On 1/5/2017 11:38 AM, Scot Kreienkamp wrote:
No… it’s showing up on the page and in the graph.  Even if it was 
ignored, reverting to the default out-of-the-box config would have 
removed the ignore also.

*From:* user-f00ed6e065e8@xymon.invalid [mailto:user-f00ed6e065e8@xymon.invalid]
*Sent:* Thursday, January 5, 2017 2:36 PM
*To:* Scot Kreienkamp
*Cc:* user-87556346d4af@xymon.invalid; xymon
*Subject:* Re: [Xymon] xymon disk not alerting at 100%, need another 
set of eyes

Is /boot ignored?


    It’s not the partition the client is on, and it’s been that way
    for days.

    So, a bit more troubleshooting: I moved all the files out of
    analysis.d, so the only analysis config is the default included
    with the install, and restarted xymon.

    [root@monvxymon analysis.d]# /usr/libexec/xymon/xymond_client --dump-config --config=etc/analysis.cfg ; echo Done
    UP 3600 -1 (line: 365)
    LOAD 5.00 10.00 (line: 366)
    DISK * 90% 95% 0 -1 red (line: 367)
    INODE * 70% 90% 0 -1 red (line: 368)
    MEMREAL 100 101 (line: 369)
    MEMSWAP 50 80 (line: 370)
    MEMACT 90 97 (line: 371)
    Done

    Then I restarted my client to force it to report in.  The disk
    test is still green with the /boot partition at 100% full!  All my
    Windows clients are alerting correctly, but NONE of my Linux
    clients with disk-full conditions are.
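
For reference, the default rule in the dump above, `DISK * 90% 95%`, would be expected to turn a filesystem at 100% red. A minimal sketch of that comparison, assuming the two percentages are the warning (yellow) and panic (red) levels and that a value at or above a level triggers it. `disk_color` is a hypothetical helper for illustration, not Xymon code:

```python
def disk_color(used_pct: float, warn_pct: float = 90.0, panic_pct: float = 95.0) -> str:
    """Map a filesystem's usage percentage to a status color.

    Assumption: a value at or above a level triggers that level.
    The defaults mirror the 'DISK * 90% 95%' rule from the dump.
    """
    if used_pct >= panic_pct:
        return "red"
    if used_pct >= warn_pct:
        return "yellow"
    return "green"


if __name__ == "__main__":
    # Print the color for a healthy, a warning-level and a full filesystem.
    for pct in (42, 92, 100):
        print(pct, disk_color(pct))
```

Under these assumptions a green status at 100% means the default thresholds are not the ones being applied to the host, or the disk data is not being parsed at all.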

    Something is definitely broken!

    JC, any ideas?

    *From:* user-f00ed6e065e8@xymon.invalid
    *Sent:* Thursday, January 5, 2017 2:18 PM
    *To:* Scot Kreienkamp
    *Cc:* xymon
    *Subject:* Re: [Xymon] xymon disk not alerting at 100%, need
    another set of eyes

    Hi Scot,

    What may have happened is that the disk filled up faster than the
    client could send the alert, if the client is installed on the
    same disk that is full.  That's caught me a few times.

    HTH

    Regards

    Greg Shea


        So I had another thought: I copied the class statement to
        another file, so it’s now both first and last in the list,
        and my disk test is still green.  Is the class match broken?

        I’m on 4.3.27-1 from Terabithia.

        Thanks!

        *From:* Scot Kreienkamp
        *Sent:* Thursday, January 5, 2017 1:53 PM
        *To:* xymon at xymon.com
        *Subject:* RE: xymon disk not alerting at 100%, need another
        set of eyes

        After re-reading, I can see how that may not be totally clear.
        By alerting, I mean that the disk test is still green, even
        though a partition is at 100% full.

        I found two hosts that weren’t alerting on a disk-full condition
        and started digging into the problem further.  As I understand
        it, xymon uses the first matching entry from the analysis config
        files.  So I dumped the analysis config for disks:

        Client line:

        [collector:]
        client corpvskreienl,na,lzb,hq.linux linux

        [root@monvxymon hosts.d]# /usr/libexec/xymon/xymond_client --dump-config --config=etc/analysis.cfg |grep -i ^disk
        DISK %^(D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) 15728640U 10485760U 0 -1 red HOST=%(mondbexec.*|mondb.*|retmaildb.*).na.lzb.hq (line: 515)
        DISK %^(1|2|3|4|5|6|7|8|9|0).* IGNORE HOST=%(mondbexec.*|mondb.*|retmaildb.*).na.lzb.hq (line: 516)
        DISK %^(D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) 15728640U 10485760U 0 -1 red HOST=%(mon|new|red|neo|taz|sil|kin|sal|hpt)exch.*.na.lzb.hq (line: 527)
        DISK %^(1|2|3|4|5|6|7|8|9|0).* IGNORE HOST=%(mon|new|red|neo|taz|sil|kin|sal|hpt)exch.*.na.lzb.hq (line: 528)
        DISK %^(D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|U|V|W|X|Y|Z) 15728640U 10485760U 0 -1 red HOST=%dayexch.*.na.lzb.hq (line: 539)
        DISK %^T IGNORE HOST=%dayexch.*.na.lzb.hq (line: 540)
        DISK %^(1|2|3|4|5|6|7|8|9|0|).* IGNORE HOST=%dayexch.*.na.lzb.hq (line: 541)
        DISK C 204800U 102400U 0 -1 red HOST=mdas4000.mdmza.dmz.hq (line: 567)
        DISK E 101% 101% 0 -1 red HOST=mdas4000.mdmza.dmz.hq (line: 568)
        DISK F 99% 100% 0 -1 red HOST=mons6000.na.lzb.hq (line: 576)
        DISK %^(D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z) 15728640U 10485760U 0 -1 red PAGE=infrastructure/fileserv (line: 582)
        DISK D 99% 100% 0 -1 red HOST=lzbv5223.na.lzb.hq,lzbv6016.na.lzb.hq (line: 746)
        DISK * 90% 95% 0 -1 red HOST=%dvrvas(0|1)\.mdmza.dmz.hq (line: 762)
        DISK * 90% 95% 0 -1 red CLASS=powershell (line: 1054)
        DISK * 90% 95% 0 -1 red CLASS=win32 (line: 1073)
        DISK * 90% 95% 0 -1 red CLASS=linux (line: 1090)
        DISK * 90% 95% 0 -1 red (line: 1132)

        I can’t find any line above where the hostname matches, and the
        host is on page Infrastructure/Miscellaneous, so none of the
        PAGE statements match.  It should therefore match on the class
        or, failing that, on the very last line, which is the system
        default and should apply if nothing else does.  My server is
        sitting at 100% full on one partition, so it SHOULD be alerting.
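
The reasoning above can be checked mechanically. The following is a minimal sketch of the first-match lookup as described in this message, not Xymon's actual implementation. It assumes that a leading `%` marks a regular expression, that `*` matches any filesystem, that a selector without `%` is a comma-separated list of exact names, and that the hostname is the client line with commas read as dots. `DiskRule`, `first_match` and the `/boot` mount name are hypothetical, and only a subset of the dumped rules is included:

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DiskRule:
    fs: str                      # "*", a literal name, or "%regex"
    action: str                  # thresholds, or "IGNORE"
    line: int                    # line number reported by --dump-config
    host: Optional[str] = None   # HOST= selector, if any
    page: Optional[str] = None   # PAGE= selector, if any
    cls: Optional[str] = None    # CLASS= selector, if any


def _matches(pattern: str, value: str) -> bool:
    """Assumed semantics: '*' is a wildcard, '%' prefixes a regex,
    anything else is a comma-separated list of exact names."""
    if pattern == "*":
        return True
    if pattern.startswith("%"):
        return re.search(pattern[1:], value) is not None
    return value in pattern.split(",")


def first_match(rules: List[DiskRule], host: str, page: str,
                cls: str, fs: str) -> Optional[DiskRule]:
    """Return the first rule whose selectors and filesystem all match."""
    for rule in rules:
        if rule.host is not None and not _matches(rule.host, host):
            continue
        if rule.page is not None and not _matches(rule.page, page):
            continue
        if rule.cls is not None and not _matches(rule.cls, cls):
            continue
        if _matches(rule.fs, fs):
            return rule
    return None


# Subset of the dumped rules, in file order.
RULES = [
    DiskRule("%^(D|E|F|G|H|I|J|K|L|M|N|O|P|Q|R|S|T|U|V|W|X|Y|Z)",
             "15728640U 10485760U", 582, page="infrastructure/fileserv"),
    DiskRule("*", "90% 95%", 762, host=r"%dvrvas(0|1)\.mdmza.dmz.hq"),
    DiskRule("*", "90% 95%", 1054, cls="powershell"),
    DiskRule("*", "90% 95%", 1073, cls="win32"),
    DiskRule("*", "90% 95%", 1090, cls="linux"),
    DiskRule("*", "90% 95%", 1132),
]

if __name__ == "__main__":
    rule = first_match(RULES, "corpvskreienl.na.lzb.hq",
                       "Infrastructure/Miscellaneous", "linux", "/boot")
    print(rule.line, rule.action)
```

Under these assumptions the lookup for this host lands on the `CLASS=linux` rule at line 1090 with thresholds of 90% and 95%, which is what the reasoning above predicts. If the real server behaves differently, the debug output should show which rule it actually picked.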

        Thanks for any help.
