Xymon Mailing List Archive

list Brian Lynch
Mon, 28 Feb 2005 14:41:26 -0800
Message-Id: <user-ac96db1b7ba3@xymon.invalid>

Is it possible that the SIGUSR2 signal is not being sent properly?
What signal does hobbitlaunch send to kill hobbitd if it doesn't
receive the heartbeat?

     The HEARTBEAT keyword can only be used for one task, and is
     specifically aimed at monitoring the hobbitd(8) task. This
     task must send a SIGUSR2 signal to hobbitlaunch regularly;
     if this signal fails to arrive for more than 15 seconds,
     hobbitlaunch will kill the running task and start up a new
     one. hobbitd(8) will send this signal every 5 seconds.
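
To make the rule quoted above concrete, here is a minimal sketch of the timeout decision in Python. It is not hobbitlaunch's actual code: the class HeartbeatWatchdog and the install() helper are hypothetical names, and only SIGUSR2 and the 15-second limit come from the man page text.

```python
import signal
import time

HEARTBEAT_TIMEOUT = 15  # seconds, taken from the man page text quoted above


class HeartbeatWatchdog:
    """Hypothetical helper: remembers when the last heartbeat arrived."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT, now=time.monotonic):
        self.timeout = timeout
        self.now = now              # injectable clock, handy for testing
        self.last_beat = now()

    def beat(self, signum=None, frame=None):
        # Same signature as a Python signal handler, so this method can be
        # registered directly for SIGUSR2.
        self.last_beat = self.now()

    def expired(self):
        # True once more than `timeout` seconds have passed with no heartbeat.
        return self.now() - self.last_beat > self.timeout


def install(dog):
    # Assumption: the supervisor receives heartbeats as SIGUSR2 (Unix only).
    signal.signal(signal.SIGUSR2, dog.beat)
```

A supervisor loop would poll dog.expired() and, once it returns True, kill the monitored task and start a new one. With a heartbeat every 5 seconds, two heartbeats in a row can go missing before the 15-second limit is exceeded.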


On Mon, 28 Feb 2005 14:26:55 -0800, Brian Lynch <user-0420823115a8@xymon.invalid> wrote:
Compiled with -DDEBUG and ran hobbitd with --debug.
Same output from gdb:

#0  0xff19fc14 in _libc_kill () from /usr/lib/libc.so.1
(gdb) backtrace
#0  0xff19fc14 in _libc_kill () from /usr/lib/libc.so.1
#1  0xff13598c in abort () from /usr/lib/libc.so.1
#2  0x000281f4 in sigsegv_handler (signum=0) at sig.c:57
#3  <signal handler called>
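
The backtrace shows sigsegv_handler calling abort(), and abort() raises SIGABRT, which is signal 6 on both Solaris and Linux. That is consistent with the "terminated by signal 6" line in hobbitlaunch.log, and the handler's name suggests a segmentation fault was caught first. A small sketch, using a throwaway Python child process rather than hobbitd itself, of how a parent process sees such an exit:

```python
import signal
import subprocess
import sys


def run_aborting_child():
    # The child calls os.abort(), which raises SIGABRT in that process,
    # much like the abort() frame in the backtrace above.
    proc = subprocess.run(
        [sys.executable, "-c", "import os; os.abort()"],
        stderr=subprocess.DEVNULL,
    )
    return proc.returncode


if __name__ == "__main__":
    rc = run_aborting_child()
    # On POSIX, a return code of -N means "terminated by signal N".
    print("child terminated by signal", -rc)
```

If that reading is right, the restart in the log comes from the crash itself and not from a missed heartbeat, although this excerpt alone cannot prove it.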

Here are the sections of each log:

hobbitlaunch.log

2005-02-28 22:24:56 Task bbhistory started with PID 2567
2005-02-28 22:24:56 Task bbenadis started with PID 2572
2005-02-28 22:24:56 Task bbpage started with PID 2575
2005-02-28 22:24:56 Task larrdstatus started with PID 2578
2005-02-28 22:24:56 Task larrddata started with PID 2580
2005-02-28 22:24:56 Task bbdisplay started with PID 2582
2005-02-28 22:24:56 Task bbnet started with PID 2584
2005-02-28 22:24:56 Task bbretest started with PID 2585
2005-02-28 22:24:58 Task larrdcolumn started with PID 2608
2005-02-28 22:24:58 Task infocolumn started with PID 2609
2005-02-28 22:25:16 Task hobbitd terminated by signal 6
2005-02-28 22:25:17 Loading hostnames
2005-02-28 22:25:17 Task hobbitd started with PID 2712
2005-02-28 22:25:17 Loading saved state
2005-02-28 22:25:17 Setting up network listener on 0.0.0.0:1984
2005-02-28 22:25:17 Setting up signal handlers
2005-02-28 22:25:17 Setting up hobbitd channels
2005-02-28 22:25:17 Setting up status channel (id=1)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',1)
2005-02-28 22:25:17 ftok() returns: 0x10078B4
2005-02-28 22:25:17 shmget() returns: 0x8FC
2005-02-28 22:25:17 Setting up stachg channel (id=2)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',2)
2005-02-28 22:25:17 ftok() returns: 0x20078B4
2005-02-28 22:25:17 shmget() returns: 0x8FD
2005-02-28 22:25:17 Setting up page channel (id=3)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',3)
2005-02-28 22:25:17 ftok() returns: 0x30078B4
2005-02-28 22:25:17 shmget() returns: 0x8FE
2005-02-28 22:25:17 Setting up data channel (id=4)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',4)
2005-02-28 22:25:17 ftok() returns: 0x40078B4
2005-02-28 22:25:17 shmget() returns: 0x8FF
2005-02-28 22:25:17 Setting up notes channel (id=5)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',5)
2005-02-28 22:25:17 ftok() returns: 0x50078B4
2005-02-28 22:25:17 shmget() returns: 0x900
2005-02-28 22:25:17 Setting up enadis channel (id=6)
2005-02-28 22:25:17 calling ftok('/opt/hobbit/server',6)
2005-02-28 22:25:17 ftok() returns: 0x60078B4
2005-02-28 22:25:17 shmget() returns: 0xB59
2005-02-28 22:25:17 Setting up logfiles
2005-02-28 22:25:17 Task bbenadis terminated, status 1
2005-02-28 22:25:22 Task bbhistory started with PID 2740
2005-02-28 22:25:22 Task bbenadis started with PID 2743
2005-02-28 22:25:22 Task bbpage started with PID 2747
2005-02-28 22:25:22 Task larrdstatus started with PID 2750
2005-02-28 22:25:22 Task larrddata started with PID 2752

hobbitd.log

2005-02-28 22:15:16 Posting message 12 to 1 readers
2005-02-28 22:15:16 Message posted
2005-02-28 22:15:16 Posting message 13 to 1 readers
2005-02-28 22:15:16 Message posted
2005-02-28 22:15:16 oldcolor=6, oldas=2, newcolor=0, newas=0
2005-02-28 22:15:16 posting to stachg channel
2005-02-28 22:15:17 Setup complete
2005-02-28 22:15:17 Sending heartbeat to pid 28397
2005-02-28 22:15:18 posting to status channel
2005-02-28 22:15:18 Dropping message - no readers
2005-02-28 22:15:19 posting to status channel
2005-02-28 22:15:19 Dropping message - no readers
2005-02-28 22:15:21 posting to status channel
2005-02-28 22:15:21 Dropping message - no readers
2005-02-28 22:15:22 Sending heartbeat to pid 28397
2005-02-28 22:15:22 posting to status channel
2005-02-28 22:15:22 Dropping message - no readers
2005-02-28 22:15:23 posting to status channel
2005-02-28 22:15:23 Posting message 1 to 1 readers
2005-02-28 22:15:23 Message posted
2005-02-28 22:15:24 posting to status channel
2005-02-28 22:15:24 Posting message 2 to 1 readers
2005-02-28 22:15:24 Message posted
2005-02-28 22:15:26 posting alert to page channel
2005-02-28 22:15:26 Posting message 1 to 1 readers
2005-02-28 22:15:26 Message posted
2005-02-28 22:15:26 posting to status channel
2005-02-28 22:15:26 Posting message 3 to 1 readers
2005-02-28 22:15:26 Message posted
2005-02-28 22:15:28 Sending heartbeat to pid 28397
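
One way to check whether heartbeats were going out on schedule is to measure the gaps between the "Sending heartbeat" lines. The sketch below does that for the three heartbeat lines in the excerpt above; the function name heartbeat_gaps is made up for illustration. Note that the excerpt covers 22:15, roughly ten minutes before the restart at 22:25:16 recorded in hobbitlaunch.log, so it says nothing about the last seconds before that crash.

```python
from datetime import datetime

# The three heartbeat lines copied from the hobbitd.log excerpt above.
LOG_EXCERPT = """\
2005-02-28 22:15:17 Sending heartbeat to pid 28397
2005-02-28 22:15:22 Sending heartbeat to pid 28397
2005-02-28 22:15:28 Sending heartbeat to pid 28397
"""


def heartbeat_gaps(log_text):
    """Return the seconds between consecutive 'Sending heartbeat' lines."""
    stamps = []
    for line in log_text.splitlines():
        if "Sending heartbeat" in line:
            # The first 19 characters hold 'YYYY-MM-DD HH:MM:SS'.
            stamps.append(datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S"))
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(heartbeat_gaps(LOG_EXCERPT))
```

For this excerpt the gaps are 5 and 6 seconds, well inside the 15-second limit, so at least in this window hobbitd logged its heartbeats on time.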


On Mon, 28 Feb 2005 14:11:43 -0800, Brian Lynch <user-0420823115a8@xymon.invalid> wrote:
Have it running with '--debug' now... Hopefully, this will yield more
information.

- Brian


On Mon, 28 Feb 2005 23:10:14 +0100, Henrik Stoerner <user-ce4a2c883f75@xymon.invalid> wrote:
On Mon, Feb 28, 2005 at 02:00:34PM -0800, Brian Lynch wrote:
It is also dumping core... here are the gdb results from the core file.
[snip]
(gdb) backtrace
#0  0xff19fc14 in _libc_kill () from /usr/lib/libc.so.1
#1  0xff13598c in abort () from /usr/lib/libc.so.1
#2  0x0001cf54 in sigsegv_handler (signum=-14958584) at sig.c:57
#3  <signal handler called>
No more lines than that? I'd expect a few more unless there's some
serious memory corruption taking place.

Henrik