Xymon Mailing List Archive

Handling SNMP traps with Hobbit

6 messages in this thread

Andy Farrior · Fri, 15 Jul 2005 21:21:10 -0500
 
 
My technicians have been bugging me about wanting to receive SNMP traps from various equipment.
 
The only references to SNMP traps and Hobbit I could find were forwarding Hobbit events as SNMP traps to
NMS servers like OpenView or something.
 
I don't have either; so I've tried to implement SNMP trap handling with Hobbit using an external perl script, snmptrapd, SNMPTT, and SEC.
 
I've put my configuration notes here if you're interested:
http://cerebro.victoriacollege.edu/hobbit-trap.html
 
 
If you find something grossly wrong, let me know.
 
thanks,
andy
Henrik Størner · Sat, 16 Jul 2005 09:22:00 +0200
On Fri, Jul 15, 2005 at 09:21:10PM -0500, FARRIOR, Andy wrote:
> My technicians have been bugging me about wanting to receive SNMP traps from various equipment.
>
> The only references to SNMP traps and Hobbit I could find were forwarding Hobbit events as SNMP traps to
> NMS servers like OpenView or something.
>
> I don't have either; so I've tried to implement SNMP trap handling with Hobbit using an external perl script, snmptrapd, SNMPTT, and SEC.
>
> I've put my configuration notes here if you're interested:
> http://cerebro.victoriacollege.edu/hobbit-trap.html

This is a very elegant solution for handling SNMP traps. The
SNMPTT and SEC tools make these statuses really usable, instead
of just dumping all traps directly into Hobbit. I'll definitely
get this up and running on my own system after the holidays.

The only criticism I have is about the way you keep statuses
from going purple - I would do that differently, because right 
now you depend on a certain format of the Hobbit checkpoint-file
which may change (there's a reason it isn't documented anywhere).

Instead of reading the checkpoint file, I'd query the hobbit daemon
directly. You do this with the bb client tool and the "hobbitdboard"
command. E.g. to fetch the hostname and expiry-time for all "trap"
statuses you can do this:
   $BB $BBDISP "hobbitdboard test=trap fields=hostname,validtime"
The output looks like this:
   adsl.hswn.dk|1121498714
   backup-mx.post.tele.dk|1121498714
   www.sslug.dk|1121498623
So changing your script to do this should be really simple - use
Perl's open() to run the command in a pipe and read the output - and
you no longer rely on the checkpoint-file being updated, or the
format of this file not changing. And you only get the hosts
that really do have a "trap" status logged in Hobbit, so you no
longer need to read the bb-hosts file - you shouldn't do it the
way you do, because it doesn't handle bb-hosts files that have
been split up into multiple files and then combined via the
"include" statement. If you must, use the "bbhostgrep trap" command
and read the output from it.
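
A minimal sketch of the approach described above, written in Python rather than the Perl of trap.pl. It assumes the `hostname|validtime` output format shown in this message and that `validtime` is an epoch timestamp. The function names `parse_board`, `fetch_board`, and `expiring_soon` are hypothetical, and `fetch_board` assumes the bb client path and display address are passed in from the `$BB` and `$BBDISP` environment variables.

```python
import subprocess


def parse_board(output):
    """Parse 'hostname|validtime' lines into a dict of host -> expiry time."""
    expiry = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        host, _, validtime = line.partition("|")
        if not validtime.isdigit():
            continue  # skip lines that are not in the expected form
        expiry[host] = int(validtime)
    return expiry


def fetch_board(bb, bbdisp, test="trap"):
    """Run the bb client and return its raw output.

    Assumption: bb and bbdisp hold the values of $BB and $BBDISP.
    """
    query = f"hobbitdboard test={test} fields=hostname,validtime"
    result = subprocess.run(
        [bb, bbdisp, query], capture_output=True, text=True, check=True
    )
    return result.stdout


def expiring_soon(expiry, now, margin):
    """Return hosts whose status expires within 'margin' seconds of 'now'."""
    return sorted(h for h, t in expiry.items() if t - now <= margin)


# Sample output copied from the message above.
sample = """adsl.hswn.dk|1121498714
backup-mx.post.tele.dk|1121498714
www.sslug.dk|1121498623
"""
board = parse_board(sample)
```

On a live system the sample string would be replaced by something like `fetch_board(os.environ["BB"], os.environ["BBDISP"])`. The hosts returned by `expiring_soon` are the candidates for a refreshed status before they turn purple.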

Another - perhaps more elegant - solution is to change Hobbit so
that you can send a status-message that does not expire. I'd be
willing to implement such a change since it does make sense for
this kind of integration with other systems. (I have a similar
problem on my system where it receives e-mails instead of SNMP
traps). However, then you will not get any indication if your 
SNMP module stops working, so each method has its benefits and
drawbacks.

My Perl skills are really poor, so I'd love it if you could
change the trap.pl script to use the hobbitdboard command instead 
of the checkpoint file. 


Thanks,
Henrik
Andy Farrior · Sat, 16 Jul 2005 14:01:31 -0500
-----Original Message-----
From: Henrik Stoerner [mailto:user-ce4a2c883f75@xymon.invalid]
Sent: Sat 7/16/2005 2:22 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Handling SNMP traps with Hobbit

> Instead of reading the checkpoint file, I'd query the hobbit daemon
> directly. You do this with the bb client tool and the "hobbitdboard"
> command. E.g. to fetch the hostname and expiry-time for all "trap"
> statuses you can do this:
>   $BB $BBDISP "hobbitdboard test=trap fields=hostname,validtime"
> The output looks like this:
>   adsl.hswn.dk|1121498714
>   backup-mx.post.tele.dk|1121498714
>   www.sslug.dk|1121498623

I should have known you could do something like that with Hobbit....  I'll play with that.  (There was a voice in the back of my head that told me not to use the checkpoint file, but I didn't know what else to use.  must read all docs...)  Thanks.

> Another - perhaps more elegant - solution is to change Hobbit so
> that you can send a status-message that does not expire. I'd be
> willing to implement such a change since it does make sense for
> this kind of integration with other systems. (I have a similar
> problem on my system where it receives e-mails instead of SNMP
> traps). However, then you will not get any indication if your
> SNMP module stops working, so each method has its benefits and
> drawbacks.

For now, I think I'll try using long LIFETIME values like 24h or 48h and set the status to "no traps to report" after that time period.
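
A rough sketch of what such a message could look like, in Python for illustration. It assumes the `status+LIFETIME HOSTNAME.TESTNAME COLOR text` message form of the bb/Hobbit client, with dots in the hostname replaced by commas; the helper name `build_status` is hypothetical, and the exact text and lifetime are whatever trap.pl chooses to send.

```python
def build_status(host, test, color, text, lifetime=None):
    """Build a Hobbit status message string, optionally with a LIFETIME.

    Assumption: the hostname is written with commas instead of dots,
    following the Big Brother convention for status messages.
    """
    head = "status" if lifetime is None else f"status+{lifetime}"
    machine = host.replace(".", ",")
    return f"{head} {machine}.{test} {color} {text}"


# A green status that stays valid for 24 hours before it would go purple.
message = build_status(
    "www.sslug.dk", "trap", "green", "no traps to report", lifetime="24h"
)
```

The resulting string would be passed to the bb client as its message argument, in the same way as the hobbitdboard query earlier in the thread.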

> My Perl skills are really poor, so I'd love it if you could
> change the trap.pl script to use the hobbitdboard command instead
> of the checkpoint file.

I *should* have something by Monday.


thanks Henrik!

Andy
Andy Farrior · Sat, 16 Jul 2005 18:35:23 -0500
I've updated the trap.pl script to read from hobbitdboard.
 
 
let's try that again.
 
http://cerebro.victoriacollege.edu/hobbit-trap.html
 
 
thanks,
Andy
Thomas Pedersen · Mon, 18 Jul 2005 10:20:44 +0200
Hi andy,

I am also using SNMPTT, but I set it up with the MySQL web page and have an ext script that tests whether any unacknowledged events are on the web page. My primary reason for doing it this way was that I was not able to make sure that all traps were recorded. How do you make sure that two consecutive traps from the same device are both recorded, and not only the last?

Also, you mention the "problem" with classification. This has raised some issues in my organisation, because some server people did not agree on specific alert levels (i.e. CRITICAL etc.). Do you edit this by hand?

Best regards,
Thomas

FARRIOR, Andy wrote:

> My technicians have been bugging me about wanting to receive SNMP traps from various equipment.
>
> The only references to SNMP traps and Hobbit I could find were forwarding Hobbit events as SNMP traps to
> NMS servers like OpenView or something.
>
> I don't have either; so I've tried to implement SNMP trap handling with Hobbit using an external perl script, snmptrapd, SNMPTT, and SEC.
>
> I've put my configuration notes here if you're interested:
>
> If you find something grossly wrong, let me know.
>
> thanks,
> andy
Andy Farrior · Mon, 18 Jul 2005 07:27:05 -0500
-----Original Message-----
From: Thomas [mailto:user-97316fb2dd2a@xymon.invalid]
Sent: Mon 7/18/2005 3:20 AM
To: user-ae9b8668bcde@xymon.invalid
Subject: Re: [hobbit] Handling SNMP traps with Hobbit

> Hi andy,
>
> I am also using SNMPTT, but I set it up with the MySQL web page and have an ext script that tests whether any unacknowledged events are on the web page. My primary reason for doing it this way was that I was not able to make sure that all traps were recorded. How do you make sure that two consecutive traps from the same device are both recorded, and not only the last?

If two Normal traps arrive, only the last one will be seen.  If a WARNING or CRITICAL trap arrives followed by a Normal trap, you will still get an alert for the yellow/red trap but you'll see the green trap on the page; however, the yellow/red trap should be in the history.

I wanted to include MySQL logging with SNMPTT, but haven't had a chance to work with it.  I'd also want to have some link from the Hobbit status page to an SNMPTT web page so you could see all past alerts.

Instead of waiting for or forcing a user to acknowledge a trap, I treat it like a regular alert.

Just wanted to keep it simple.

> Also, you mention the "problem" with classification. This has raised some issues in my organisation, because some server people did not agree on specific alert levels (i.e. CRITICAL etc.). Do you edit this by hand?

I edited these by hand.  We're fairly small, only two people receive network alerts, two receive server alerts; so, we didn't have any issues.


Thanks,
Andy