3 vmstat procs and other issues

6 messages in this thread

list Galen Johnson · Thu, 5 Jan 2017 16:40:37 -0500 ·

Hey,

While I'm questioning things, I noticed that there are 3 vmstat calls run
on the server every 5 minutes instead of the 1 that I would expect...Anyone
else seeing that behavior?

I'm running the Terabithia RPM (yes, I know there is a new release about to
come out) on CentOS 7.

I'm also not getting the disk column and most of the other local tests that
I've come to expect, either (memory, cpu, files, etc).

thanks

=G=

list Japheth Cleaver · Thu, 5 Jan 2017 14:15:38 -0800 ·

▸ quoted from Galen Johnson

On 1/5/2017 1:40 PM, Galen Johnson wrote:

Hey,

While I'm questioning things, I noticed that there are 3 vmstat calls run on the server every 5 minutes instead of the 1 that I would expect...Anyone else seeing that behavior?

I'm running the Terabithia RPM (yes, I know there is a new release about to come out) on CentOS 7.

This is actually normal. It's a side effect of the fact that those RPMs fire the client off every 100s instead of every 5m. Although vmstat (and anything launched the same way) is collecting info for the previous 5m, it's doing it once for each client execution that occurs during that time (3x). You end up with a rolling 5m average rather than a discrete 5m block when $interval != $collectionperiod.

I'm also not getting the disk column and most of the other local tests that I've come to expect, either (memory, cpu, files, etc).

FQDN issue possible here as well?

-jc
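
The arithmetic behind the three overlapping vmstat processes described above can be sketched as follows. This is only an illustration, assuming each client run starts one collector that stays alive for the whole collection period; the function names are hypothetical and are not part of Xymon.

```python
import math


def concurrent_collectors(launch_interval_s: int, collection_period_s: int) -> int:
    """Upper bound on collectors alive at once in steady state.

    Assumes every client run starts one collector that lives for the
    full collection period (an assumption made for illustration).
    """
    return math.ceil(collection_period_s / launch_interval_s)


def sample_windows(launch_interval_s: int, collection_period_s: int, runs: int):
    """(start, end) second offsets covered by each of the first `runs` collectors."""
    return [
        (k * launch_interval_s, k * launch_interval_s + collection_period_s)
        for k in range(runs)
    ]


if __name__ == "__main__":
    # Client fired every 100s, each vmstat covering 5 minutes (300s).
    print(concurrent_collectors(100, 300))  # 3
    # Client fired every 5 minutes: one collector at a time.
    print(concurrent_collectors(300, 300))  # 1
    # Overlapping windows are what turn the result into a rolling average.
    print(sample_windows(100, 300, 3))  # [(0, 300), (100, 400), (200, 500)]
```

Each window starts 100s after the previous one but still spans 300s, so consecutive reports share two thirds of their samples.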

list Galen Johnson · Thu, 5 Jan 2017 18:06:48 -0500 ·

Ok...I fixed this by adding '-x tmpfs -x devtmpfs' to the df command.
Definitely something you may want to consider.  It was cluttering up my
disk graphs badly (especially on systems that had lots of users logged
in).

=G=
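
For anyone scripting the same change, here is a minimal sketch of building a df command line with those type exclusions. It assumes GNU df (the `-x` option) and adds `-P` for POSIX-format output; the base command the Xymon client actually runs may differ, and the helper names are hypothetical.

```python
import subprocess


def build_df_command(excluded_types):
    """Build a df argument list that skips the given filesystem types."""
    cmd = ["df", "-P"]  # -P: POSIX output format, one line per filesystem
    for fs_type in excluded_types:
        cmd += ["-x", fs_type]  # -x TYPE: GNU df option to exclude a type
    return cmd


def run_df(excluded_types):
    """Run df and return its output lines (needs GNU df on PATH)."""
    result = subprocess.run(
        build_df_command(excluded_types),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    print(" ".join(build_df_command(["tmpfs", "devtmpfs"])))
    # df -P -x tmpfs -x devtmpfs
```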

On Thu, Jan 5, 2017 at 6:00 PM, Galen Johnson <user-fc632e705d24@xymon.invalid> wrote:

Thanks for the explanation on the vmstat point...I feel much better now,
I hadn't actually noticed the differences in the timestamps until you
pointed it out.

The FQDN was the issue with my missing tests.  They show up now.  Now I
just need the df command to ignore tmpfs paths and I think I'll be good.

=G=

list Galen Johnson · Thu, 5 Jan 2017 18:18:06 -0500 ·

Ok...looking more closely at the xymonclient-linux.sh script, it appears
that tmpfs was intentionally not included in the exclude list.  At the very
least, you may want to consider adding /run to the default excludes in
analysis.cfg.  I was also noticing in my df output that other virtual
filesystems were showing (and being tracked) as well, such as /dev/shm and
/sys.  I'm not sure why you would want those tracked either.  For example,
here's a list of tmpfs "disks" I see on one of my systems:

tmpfs                 2033585      9   2033576    1% /dev/shm
tmpfs                 2033585    598   2032987    1% /run
tmpfs                 2033585     13   2033572    1% /sys/fs/cgroup
tmpfs                 2033585     29   2033556    1% /tmp
tmpfs                 2033585      1   2033584    1% /run/user/1984

=G=

list Japheth Cleaver · Thu, 5 Jan 2017 15:33:19 -0800 ·

▸ quoted from Galen Johnson

On 1/5/2017 3:18 PM, Galen Johnson wrote:

Ok...looking more closely at the xymonclient-linux.sh script, it appears that tmpfs was intentionally not included in the exclude list.  At the very least, you may want to consider adding /run to the default excludes in analysis.cfg. I was also noticing in my df output that other virtual filesystems were showing (and being tracked) as well, such as /dev/shm and /sys.  I'm not sure why you would want those tracked either.  For example, here's a list of tmpfs "disks" I see on one of my systems:

tmpfs                 2033585      9   2033576    1% /dev/shm
tmpfs                 2033585    598   2032987    1% /run
tmpfs                 2033585     13   2033572    1% /sys/fs/cgroup
tmpfs                 2033585     29   2033556    1% /tmp
tmpfs                 2033585      1   2033584    1% /run/user/1984

Oy.

I'd forgotten about the newer per-user /run/ environments that were coming in now in EL7. Definitely worth adding specific excludes for those.

/dev/shm/ is occasionally used as dedicated scratch space for some purposes. In fact, it has been the default write location for the temporary message in the xymon-client RPM for a while now (this way you can actually get a report when /tmp/ fills to 100%). With the proliferation of spurious tmpfs mounts now, it might be a good idea to take another look.

OTOH, at least as far as excludes go, this might be something best handled at a per-distribution level... leaving new, admin-created tmpfs points tracked by default.


Regards,
-jc
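
The idea of excluding specific mount points while leaving other tmpfs mounts tracked can be sketched like this. It is an illustration only: the patterns are hypothetical choices, not Xymon defaults, and this is not analysis.cfg syntax. The sample data is the df output quoted in the thread.

```python
import re

# Hypothetical exclude patterns, chosen for illustration only.
EXCLUDE_MOUNTS = [
    re.compile(r"^/run/user/\d+$"),   # per-user runtime directories
    re.compile(r"^/sys/fs/cgroup$"),  # kernel bookkeeping, not storage
]

DF_SAMPLE = """\
tmpfs                 2033585      9   2033576    1% /dev/shm
tmpfs                 2033585    598   2032987    1% /run
tmpfs                 2033585     13   2033572    1% /sys/fs/cgroup
tmpfs                 2033585     29   2033556    1% /tmp
tmpfs                 2033585      1   2033584    1% /run/user/1984
"""


def tracked_mounts(df_output: str):
    """Return mount points from df-style lines that match no exclude pattern."""
    kept = []
    for line in df_output.splitlines():
        fields = line.split(None, 5)  # mount point is the sixth column
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        mount = fields[5]
        if not any(p.match(mount) for p in EXCLUDE_MOUNTS):
            kept.append(mount)
    return kept


if __name__ == "__main__":
    print(tracked_mounts(DF_SAMPLE))  # ['/dev/shm', '/run', '/tmp']
```

Matching on the mount point rather than the filesystem type keeps /dev/shm and /tmp visible, so a full scratch area still raises an alert.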

list Greg Earle · Sat, 05 Aug 2017 15:03:23 -0700 ·

▸ quoted from Japheth Cleaver

Way back on 6 Jan 2017, at 3:00, Japheth Cleaver <user-87556346d4af@xymon.invalid> wrote:

On 1/5/2017 3:18 PM, Galen Johnson wrote:

Ok...looking more closely at the xymonclient-linux.sh script, it
appears that tmpfs was intentionally not included in the exclude
list.  At the very least, you may want to consider adding /run to the
default excludes in analysis.cfg.  I was also noticing in my df output
that other virtual filesystems were showing (and being tracked) as
well, such as /dev/shm and /sys.  I'm not sure why you would want
those tracked either.  For example, here's a list of tmpfs "disks" I
see on one of my systems:

tmpfs                 2033585      9   2033576    1% /dev/shm
tmpfs                 2033585    598   2032987    1% /run
tmpfs                 2033585     13   2033572    1% /sys/fs/cgroup
tmpfs                 2033585     29   2033556    1% /tmp
tmpfs                 2033585      1   2033584    1% /run/user/1984

Oy.

I'd forgotten about the newer per-user /run/ environments that were
coming in now in EL7.  Definitely worth adding specific excludes for those.

/dev/shm/ is occasionally used as dedicated scratch space for some
purposes.  In fact, it has been the default write location for the temporary
message in the xymon-client RPM for a while now (this way you can
actually get a report when /tmp/ fills to 100%).  With the proliferation
of spurious tmpfs mounts now, it might be a good idea to take another look.

Sorry to dredge up an old thread from January, but as I just ran into this ...

I agree with Japheth that getting reports about "/tmp" being 100% full is a Good Thing™.

But I'm seeing strange instances on a couple of our RHEL 7.3 systems where, during boot, "/dev/shm" is mysteriously getting chmod'ed from mode 1777 to 1755.  No idea why/where.

Xymon starts up OK but of course it can't write its temp files in there so everything goes Purple and "stopped reporting".  Took me a while to suss that one out.  (This mystery chmod'ing also affects "gdm" as "pulseaudio" can't write its tmp files in there either.)
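
A quick way to spot this condition is to test the directory's permission bits directly. The sketch below assumes that "usable as shared scratch space" means world-writable, world-searchable and sticky; the function names are hypothetical.

```python
import os
import stat


def is_shared_scratch_mode(mode: int) -> bool:
    """True if the permission bits let any user create files safely.

    That means world-writable and world-searchable, plus the sticky bit
    so users cannot delete each other's files. Mode 1777 satisfies
    this; mode 1755 does not.
    """
    perms = stat.S_IMODE(mode)
    world_wx = stat.S_IWOTH | stat.S_IXOTH
    return (perms & world_wx) == world_wx and bool(perms & stat.S_ISVTX)


def check_dir(path: str) -> bool:
    """Check a real directory, for example check_dir("/dev/shm")."""
    return is_shared_scratch_mode(os.stat(path).st_mode)


if __name__ == "__main__":
    print(is_shared_scratch_mode(0o1777))  # True
    print(is_shared_scratch_mode(0o1755))  # False
```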

Not sure what the best solution is (I assume if you put them in "/var/tmp" or "/var/run/xymon" then if "/var" is part of "/", you wouldn't get 100% full alerts for "/" - which would be a Bad Thing™), but thought I should mention this scenario.

I might try changing XYMONTMP in "xymonclient.cfg" on this latest system where it happened just to test it out, but it's a VM with little chance of "/" filling up.

BTW, I just noticed there's a "/run/xymon" (in the RHEL 7.3 RPM, anyway).  Why not use that for XYMONTMP on RHEL 7/CentOS 7/Fedora?  Or would it be too much of a pain to maintain two different "xymonclient.cfg" files with two different XYMONTMP settings?
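
One way to avoid maintaining two configuration files would be to choose the temp directory at startup from a list of candidates. The sketch below is purely hypothetical: neither the candidate order nor the helper is something Xymon does or provides.

```python
import os
import tempfile


def first_writable_dir(candidates):
    """Return the first candidate that is an existing directory the
    current user can write to and enter, or None if there is none."""
    for path in candidates:
        if os.path.isdir(path) and os.access(path, os.W_OK | os.X_OK):
            return path
    return None


if __name__ == "__main__":
    # Candidate order is an assumption for illustration only.
    candidates = ["/dev/shm", "/run/xymon", "/var/tmp", tempfile.gettempdir()]
    print(first_writable_dir(candidates))
```

With this approach a /dev/shm that has lost its write bit would be skipped instead of leaving the client unable to report.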

		- Greg

P.S. Need Fedora 26 RPMs for 4.3.28 please :)
