Xymon Mailing List Archive

Devmon - scalability ?

4 messages in this thread

list Nicolas Lienard · Sun, 16 Sep 2012 14:13:36 +0200 ·
hello

we got a xymon server polling ~ 8100 network devices (mainly cisco).
devmon is used as well but the polling period is high: > 600 sec (10 min).
some purple statuses come and go randomly :(

the server is not overloaded: load average: 0.70, 0.71, 0.74
no disk I/O load (SSD disk). server is 94% idle (16 CPUs).
outbound bandwidth reaches 60Mb.

I'm wondering if anyone here has tuning recommendations for a better polling time (less than 300 sec ideally).
for the moment, NUMFORKS is 60 (default 10).

dm stats:
Polled devices:   8183
Polled tests:     16535
Avg tests/node:   n/a
# clear msgs:     0

SNMP test time:   321
Test logic time:  164
BB msg xfer time: 176
This poll period: 661
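
As a quick sanity check on the stats above, the three phase timings can be added up and expressed as shares of the poll period. This is only a back-of-the-envelope sketch: the variable names and the `share` helper are made up for illustration and are not part of devmon.

```shell
#!/bin/sh
# Figures copied from the "dm stats" output above, in seconds.
snmp=321
logic=164
xfer=176
period=661

# The three phases should add up to the reported poll period.
total=$((snmp + logic + xfer))
echo "sum of phases: ${total}s (reported: ${period}s)"

# share SECS TOTAL: print SECS as a percentage of TOTAL, one decimal place.
# (Hypothetical helper, for illustration only.)
share() {
  awk -v s="$1" -v t="$2" 'BEGIN { printf "%.1f", 100 * s / t }'
}

echo "SNMP test time:   $(share "$snmp" "$period")%"
echo "Test logic time:  $(share "$logic" "$period")%"
echo "BB msg xfer time: $(share "$xfer" "$period")%"
```

The phases do sum to the reported 661 seconds, and SNMP polling accounts for a bit under half of the cycle, with the rest split between test logic and message transfer.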


maybe running a second devmon instance on the same server could be better.

any experience is welcome.

cheers
Nico
list Jeremy Laidman · Mon, 17 Sep 2012 16:14:23 +1000 ·
quoted from Nicolas Lienard
On 16 September 2012 22:13, Nico <user-4f1d872b9031@xymon.invalid> wrote:
> we got a xymon server polling ~ 8100 network devices (mainly cisco).
> devmon is used as well but the polling period is high: > 600 sec (10 min).
> some purple statuses come and go randomly :(
> I'm wondering if anyone here has tuning recommendations for a
> better polling time (less than 300 sec ideally).
> for the moment, NUMFORKS is 60 (default 10).

Have you tried increasing numforks beyond 60?

Have you tried decreasing cycletime (default is 60 seconds)?  This
determines how long a fork will wait until looking for more work to do, so
if it's too large, there will be work in the queue waiting to be processed
for a long time, with a bunch of sleeping forks doing nothing.

> maybe running a second devmon instance on the same server could be better.

You could try this.  But I don't see how this could be any better than
increasing numforks.  If anything, it could be worse, especially if you
don't separate the work.

Ensure you define cid() and model() in hosts.cfg for every host with
"DEVMON:".  For optimal efficiency, the following should produce no output:

  $ xymoncmd xymongrep 'devmon*'|egrep -v "cid.*model|model.*cid"
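
To see what that filter catches without touching a live server, here is a small sketch that runs the same pattern over a few made-up lines. The addresses, host names, community string and model values are all hypothetical, and the line layout is only an assumption about what xymongrep prints. `grep -Ev` is used as the equivalent of `egrep -v`.

```shell
#!/bin/sh
# Hypothetical lines standing in for `xymongrep 'devmon*'` output.
# Hosts, community string and models are invented for illustration.
sample='10.0.0.1 sw-core-01 # DEVMON:cid(public),model(cisco;2950)
10.0.0.2 sw-edge-02 # DEVMON:cid(public)
10.0.0.3 rtr-wan-03 # DEVMON:model(cisco;3750),cid(public)
10.0.0.4 rtr-lab-04 # DEVMON'

# Same filter as above: keep only lines that do NOT carry both
# cid() and model(), in either order.
incomplete=$(printf '%s\n' "$sample" | grep -Ev 'cid.*model|model.*cid')

# Anything printed here is a host that still needs cid(), model(), or both.
printf '%s\n' "$incomplete"
```

With this sample, only sw-edge-02 (no model) and rtr-lab-04 (neither tag) are printed. The two hosts that define both tags are filtered out regardless of the order the tags appear in.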

Cheers
Jeremy
list Nicolas Lienard · Mon, 17 Sep 2012 22:52:31 +0200 ·
Hello

I tried 80, but that took even longer (>800 sec).
Will play with cycletime tomorrow to see if it helps.
About discovery: it runs only once per day, and forcing the template and community will only speed up the discovery.

Thanks!
Cheers
Nico