Xymon Mailing List Archive

hobbit quick search for FireFox 2 ?

9 messages in this thread

list T.J. Yang · Sat, 25 Aug 2007 20:48:33 -0500 ·
I used to be able to switch the default Google search to an internal BB server
search in Firefox 1, but my old external quick-search module no longer works
in Firefox 2.

Does anyone have a Hobbit search module for Firefox 2?

T.J. Yang

list Sean R. Clark · Sun, 26 Aug 2007 07:53:44 -0400 ·
The hobbitd_channel for rrd is crashing on me every 2 days or so, and about 3
hours after it crashes, the main hobbitd board locks up and reports only
"BOARDBUSY".

I can't see why it's crashing, but I do have a core file:

pstack core
core 'core' of 8975:    hobbitd_channel --channel=status --log=/sw/bbserver/bbvar/acks/rrd-sta
 fef70717 _lwp_kill (1, 6) + 7
 fef1ced3 raise    (6) + 1f
 fef00969 abort    (8052c4d, 2, 19, 8046ab4, 1, febdd980) + cd
 0804b45d xstrdup  (fec00000, 8046ab4, 1, fef6fce7, 8046d54, 8046ac0) + 31
 08049d8b main     (5, 8046b04, 8046b1c) + 293
 08049950 _start   (5, 8046dbc, 8046dcc, 8046ddd, 8046e0a, 8046e16) + 80

gdb server/bin/hobbitd_channel core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-pc-solaris2.10"...

warning: exec file is newer than core file.
Core was generated by `hobbitd_channel --channel=status --log=/sw/bbserver/bbvar/acks/rrd-status.log h'.
Program terminated with signal 6, Aborted.
Reading symbols from /lib/libc.so.1...done.
Loaded symbols for /lib/libc.so.1
#0  0xfef70717 in _lwp_kill () from /lib/libc.so.1


Any help on what to do next would be appreciated. I am running Hobbit 4.2
with the 'all-in-one' patch applied on x86 Solaris 10.


-Sean
list Sean R. Clark · Mon, 27 Aug 2007 13:49:55 -0400 ·
Following up on my own thread 


Using sunpro DBX

Reading hobbitd_channel
core file header read successfully
Reading ld.so.1
Reading libc.so.1
program terminated by signal ABRT (Abort)
0xfef70717: __lwp_kill+0x0007:  jae      __lwp_kill+0x15        [ 0xfef70725, .+0xe ]
Current function is xstrdup
  175                   abort();
(dbx) where

  [1] __lwp_kill(0x1, 0x6), at 0xfef70717 
  [2] _thr_kill(0x1, 0x6), at 0xfef6ded4 
  [3] raise(0x6), at 0xfef1ced3 
  [4] abort(0x8052c4d, 0x2, 0x29), at 0xfef00969 
=>[5] xstrdup(s = (nil)), line 175 in "memory.c"
  [6] main(argc = 5, argv = 0x8046b04), line 239 in "hobbitd_channel.c"


It's crashing with an "Out of memory" abort.

What can cause the hobbitd channel to reach this condition? Am I sending it
too much information? 


-Sean
list Henrik Størner · Tue, 28 Aug 2007 04:46:27 +0200 ·
On Mon, Aug 27, 2007 at 01:49:55PM -0400, Sean R. Clark wrote:
> Following up on my own thread
> It's crashing with an "Out of memory" abort.
>
> What can cause the hobbitd channel to reach this condition? Am I sending it
> too much information?
It shouldn't do that normally, but I have seen some cases where it does
that when the hobbitd_rrd process cannot keep up with the amount of
updates being fed into it.

How many RRD files do you have? Could you check whether this server has a
high I/O load?

If that is the problem, then I probably do have a solution ready for you
(I've run into the same problem at work).


Regards,
Henrik
list Sean R. Clark · Tue, 28 Aug 2007 09:26:51 -0400 ·
 /sw/bbserver/bbvar/rrd> find . -name "*.rrd" -print | wc -l
18102
[sclark][tools-www-01] /sw/bbserver/bbvar/rrd> 


I have 18,102 RRDs, 17,671 of which are controlled by the hobbitd_channel
(the others are written/populated from other sources)
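
For anyone who would rather script the count than use the shell pipeline, here is a rough Python equivalent. The function name is made up for illustration, and unlike `find -name` it counts regular files only.

```python
from pathlib import Path

def count_rrd_files(root):
    """Count regular files ending in .rrd anywhere below root."""
    return sum(1 for p in Path(root).rglob("*.rrd") if p.is_file())

# Example call (the path is the one from the shell prompt above):
# print(count_rrd_files("/sw/bbserver/bbvar/rrd"))
```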


The slice I have the data on has a busy% between 16-88% depending on what's
going on (so yes, high I/O as well)

iostat -xtc 5 2:

device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b 
sd20       382.6  217.2 2863.0  265.4  0.0  2.4    4.0   0  87 

device       r/s    w/s   kr/s   kw/s wait actv  svc_t  %w  %b
sd20       138.7  108.0 1304.1  605.9  0.6  2.6   12.7   1  31  


So I would be interested in your solution ;)

-Sean
list Henrik Størner · Tue, 28 Aug 2007 16:56:44 +0200 ·
On Tue, Aug 28, 2007 at 09:26:51AM -0400, Sean R. Clark wrote:
> I have 18,102 RRD's, 17,671 of which are controlled by the hobbitd_channel
> (the others are written/populated from other sources)
>
> The slice I have the data on has a busy% between 16-88% depending on what's
> going on (so yes, high I/O as well)
OK, then I'd suggest that you pick up the current snapshot of Hobbit
from http://www.hswn.dk/beta/ and build that. The only parts you need
to replace in your current setup are these binaries:

  * hobbitd/hobbitd_channel
  * hobbitd/hobbitd_rrd 
  * web/hobbitgraph.cgi

After running "make", shut down Hobbit and copy these files to your 
~hobbit/server/bin/ directory (it's probably wise to save the original 
ones first).  Then start Hobbit again, and everything should be working 
fine - with a lot less I/O load, and no memory leak in hobbitd_channel.
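
The backup-then-copy step could be scripted roughly as below. This is only a sketch: the function name, the ".orig" suffix and the example paths are assumptions, not part of Hobbit, and it should be run only while Hobbit is stopped.

```python
import shutil
from pathlib import Path

# The three files named above, relative to the top of the build tree.
NEW_BINARIES = [
    "hobbitd/hobbitd_channel",
    "hobbitd/hobbitd_rrd",
    "web/hobbitgraph.cgi",
]

def install_binaries(build_dir, bin_dir, rel_paths, suffix=".orig"):
    """Back up each existing binary in bin_dir, then copy the new build over it."""
    build_dir, bin_dir = Path(build_dir), Path(bin_dir)
    for rel in rel_paths:
        src = build_dir / rel
        dst = bin_dir / Path(rel).name
        if dst.exists():
            # Keep the original so the change can be rolled back.
            shutil.copy2(dst, dst.with_name(dst.name + suffix))
        shutil.copy2(src, dst)

# Example call (both paths are placeholders):
# install_binaries("/path/to/hobbit-snapshot", "/path/to/hobbit/server/bin", NEW_BINARIES)
```

Running it a second time would overwrite the saved originals with the snapshot binaries, so move the ".orig" copies somewhere safe if you expect to repeat the step.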

What's changed internally is that updates of the RRD files are now
cached for up to 30 minutes before being written to disk; the RRDtool
library can handle "batch" updates of the data, so instead of updating
the RRD file with 1 dataset every 5 minutes, it now gets 6 datasets in
one operation every 30 minutes.
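
As a back-of-the-envelope check on what batching means for a setup of this size, the figures from this thread can be plugged in as follows. It assumes every file is updated exactly once per 5 minutes, that updates are spread evenly over time, and that each update costs one write operation per file; real numbers will differ.

```python
RRD_FILES = 17_671        # files fed by hobbitd_channel, as counted earlier in the thread
UPDATE_INTERVAL_S = 300   # one dataset per file every 5 minutes
CACHE_WINDOW_S = 1800     # updates cached for up to 30 minutes

def file_updates_per_second(files, seconds_between_writes):
    """Average number of RRD file updates per second."""
    return files / seconds_between_writes

unbatched = file_updates_per_second(RRD_FILES, UPDATE_INTERVAL_S)
batched = file_updates_per_second(RRD_FILES, CACHE_WINDOW_S)
datasets_per_batch = CACHE_WINDOW_S // UPDATE_INTERVAL_S

print(f"unbatched: {unbatched:.1f} file updates/s")
print(f"batched:   {batched:.1f} file updates/s")
print(f"datasets per batched update: {datasets_per_batch}")
```

Under those assumptions this works out to about 58.9 file updates per second without caching and about 9.8 with it, each batched update carrying 6 datasets.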

This also means that when you shut down Hobbit, you'll see that the
hobbitd_rrd process takes quite a long time to finish - it is busy
writing all of the cached updates to disk. On my work server, this 
takes about 5 minutes.
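
To illustrate the caching idea, here is a minimal sketch in Python. It is not the actual hobbitd_rrd code (which is written in C); the class and method names are invented, and the real write is replaced by a callback you supply.

```python
import time
from collections import defaultdict

class UpdateCache:
    """Buffer per-file updates and hand them to a writer in batches."""

    def __init__(self, write_batch, max_age_s=1800, clock=time.time):
        self.write_batch = write_batch    # callable(filename, [(timestamp, value), ...])
        self.max_age_s = max_age_s        # how long updates may sit in memory
        self.clock = clock                # injectable so the cache can be tested
        self.pending = defaultdict(list)  # filename -> buffered updates
        self.first_seen = {}              # filename -> time the oldest update was buffered

    def add(self, filename, timestamp, value):
        now = self.clock()
        first = self.first_seen.get(filename)
        if first is not None and now - first >= self.max_age_s:
            # The oldest buffered update has reached the age limit: write the batch.
            self.flush(filename)
        self.pending[filename].append((timestamp, value))
        self.first_seen.setdefault(filename, now)

    def flush(self, filename):
        updates = self.pending.pop(filename, [])
        self.first_seen.pop(filename, None)
        if updates:
            self.write_batch(filename, updates)

    def flush_all(self):
        """Call on shutdown; anything still pending would otherwise be lost."""
        for filename in list(self.pending):
            self.flush(filename)
```

With one update per file every 5 minutes and a 30-minute limit, each batch holds 6 datasets. A real writer could pass them to a single `rrdtool update` call, which accepts several `timestamp:value` arguments at once. The final `flush_all()` corresponds to the slow shutdown described above.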


Regards,
Henrik
list Tom Georgoulias · Tue, 28 Aug 2007 11:08:56 -0400 ·
Henrik Stoerner wrote:
>> The slice I have the data on has a busy% between 16-88% depending on what's
>> going on (so yes, high I/O as well)
> What's changed internally is that updates of the RRD files are now
> cached for up to 30 minutes before being written to disk
> This also means that when you shutdown Hobbit, you'll see that the
> hobbitd_rrd process takes quite a long time to finish - it is busy
> writing all of the cached updates to disk. On my work server, this
> takes about 5 minutes.
This is probably obvious, but if Hobbit crashes or is rudely shut down, 
we could lose up to 30 mins of data, right?
-- 
Tom Georgoulias
Sr. Systems Engineer
McClatchy Interactive
user-6a0b8b0f0ae1@xymon.invalid
list Henrik Størner · Tue, 28 Aug 2007 17:24:13 +0200 ·
On Tue, Aug 28, 2007 at 11:08:56AM -0400, Tom Georgoulias wrote:
> Henrik Stoerner wrote:
>> What's changed internally is that updates of the RRD files are now
>> cached for up to 30 minutes before being written to disk
>> This also means that when you shutdown Hobbit, you'll see that the
>> hobbitd_rrd process takes quite a long time to finish - it is busy
>> writing all of the cached updates to disk. On my work server, this
>> takes about 5 minutes.
> This is probably obvious, but if hobbit crashes or is rudely shutdown,
> we could lose up to 30 mins of data, right?
Correct.


Regards,
Henrik
list Sean R. Clark · Tue, 28 Aug 2007 19:27:17 -0400 ·
I replaced the binaries

It ran for ~ 3 hours


I just get:


2007-08-28 12:17:12 Setup complete
2007-08-28 12:44:53 BOARDBUSY locked at 2, GETNCNT is 0, GETPID is 9917, 2 clients
2007-08-28 15:08:58 BOARDBUSY locked at 2, GETNCNT is 0, GETPID is 9917, 2 clients
2007-08-28 15:10:20 BOARDBUSY locked at 1, GETNCNT is 0, GETPID is 10501, 2 clients
2007-08-28 15:11:23 BOARDBUSY locked at 1, GETNCNT is 0, GETPID is 9917, 1 clients
2007-08-28 15:14:53 BOARDBUSY locked at 2, GETNCNT is 0, GETPID is 9917, 2 clients
2007-08-28 15:16:53 BOARDBUSY locked at 1, GETNCNT is 0, GETPID is 9917, 1 clients


And a red hobbitd_channel sent to the daemon

Core file says:

Reading hobbitd_channel
core file header read successfully
Reading ld.so.1
Reading libresolv.so.2
Reading libsocket.so.1
Reading libnsl.so.1
Reading libc.so.1
program terminated by signal ABRT (Abort)
0xfee60717: __lwp_kill+0x0007:  jae      __lwp_kill+0x15        [ 0xfee60725, .+0xe ]
Current function is sigsegv_handler
   57           abort();
(dbx) where

  [1] __lwp_kill(0x1, 0x6), at 0xfee60717 
  [2] _thr_kill(0x1, 0x6), at 0xfee5ded4 
  [3] raise(0x6), at 0xfee0ced3 
  [4] abort(0x80599c0, 0x0, 0x8046758, 0xfee4dd4f, 0x8046758, 0xfee4dd4f), at 0xfedf0969 
=>[5] sigsegv_handler(signum = 11), line 57 in "sig.c"
  [6] __sighndlr(0xb, 0x0, 0x80467f0, 0x804ebe8), at 0xfee5fadf 
  [7] call_user_handler(0xb, 0x0, 0x80467f0), at 0xfee560d3 
  [8] sigacthandler(0xb, 0x0, 0x80467f0, 0xf, 0x0, 0x0), at 0xfee56253 
  ---- called from signal handler with signal 11 (SIGSEGV) ------
  [9] main(argc = 4, argv = 0x8046b28), line 676 in "hobbitd_channel.c"

Meaning it tried to spawn a thread and dumped core


Is this a "nicer" crash? Meaning that it will keep running since it just
core dumped on a fork and not the whole channel? Or is this something more?


-Sean

