Xymon Mailing List Archive

bbgen using full CPU

7 messages in this thread

list Josh Luthman · Fri, 1 Feb 2008 00:13:36 -0500 ·
I have a problem after using the all-in-one patch. ( Last updated:
2007-02-09 10:30 UTC ).  The bbgen ps takes the entire CPU and seemingly
never stops.  I've watched it take all the free CPU time for a good fifteen
minutes now.

I got the server running with the packages necessary.  I got the
4.2.0 (release) up and running.  Later, I applied the patch and redid a
make && make install as hobbituser and root respectively.  Once Hobbit is
started again, the bbgen ps takes over the CPU.

Anyone know where I should begin to look for the problem?

Thanks in advance!

-- 
Josh Luthman
Office: XXX-XXX-XXXX
Direct: XXX-XXX-XXXX
XXXX Wayne St
Suite XXXX
Troy, OH XXXXX

Those who don't understand UNIX are condemned to reinvent it, poorly.
--- Henry Spencer
list Anna Jonna Armannsdottir · Fri, 01 Feb 2008 11:42:13 +0000 ·
quoted from Josh Luthman
On Fri, 2008-02-01 at 00:13 -0500, Josh Luthman wrote:
I got the server running with the packages necessary.  I got the 4.2.0
(release) up and running.  Later, I applied the patch and redid a make
&& make install as hobbituser and root respectively.  Once Hobbit is
started again, the bbgen ps takes over the CPU.

Anyone know where I should begin to look for the problem?
Hi Josh, my server is also running 4.2.0 with the all-in-one patch, and it is running quite nicely. I would suggest checking the bbgen test of the Hobbit server itself. Is that test available on your server?
The test results on my server are:

Fri Feb 1 11:31:35 2008
bbgen for Hobbit version 4.2.0

Statistics:
 Hosts               :   149
 Status messages     :  1189
 Purple messages     :     0
 Pages               :     8


TIME SPENT
Event                                            Starttime          Duration
Startup                                  1201865495.414615                 -
Load links done                          1201865495.415258          0.000643
Load bbhosts done                        1201865495.421488          0.006230
ACK removal done                         1201865495.421608          0.000120
Load STATE done                          1201865495.465410          0.043802
Color calculation done                   1201865495.465800          0.000390
Hobbit pagegen start                     1201865495.465857          0.000057
Hobbit pagegen done                      1201865495.494465          0.028608
BB2 generation done                      1201865495.504538          0.010073
BBNK generation done                     1201865495.505616          0.001078
Summary transmission done                1201865495.505620          0.000004
Run completed                            1201865495.505622          0.000002
TIME TOTAL                                                          0.091007

The average total time has been about 0.1 sec over the last 48 hours. The server
has one 1 GHz CPU and 500 MB of RAM. Hope this helps you find the problem.
-- 
Kindest Regards, Anna Jonna Ármannsdóttir,       %&   A: Because people read from top to bottom.
Unix System Administration, Computing Services,  %&   Q: Why is top posting bad?
University of Iceland.
list Henrik Størner · Fri, 1 Feb 2008 15:14:48 +0100 ·
quoted from Josh Luthman
On Fri, Feb 01, 2008 at 12:13:36AM -0500, Josh Luthman wrote:
I have a problem after using the all-in-one patch. ( Last updated:
2007-02-09 10:30 UTC ).  The bbgen ps takes the entire CPU and seemingly
never stops.  I've watched it take all the free CPU time for a good fifteen
minutes now.
bbgen shouldn't take more than a few seconds to run. It's cpu-intensive,
but not that much. So something's broken.

If you haven't done it yet, kill it with a "kill -6 <process-ID>". This
should generate a core file, and it would be interesting to see what the
backtrace is (see the "Reporting bugs" on-line help).
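
The effect of "kill -6" can be tried safely on a throwaway process first. The sketch below is only an illustration: "sleep 30" stands in for the runaway bbgen, and the ulimit line assumes the shell is allowed to raise its core-size limit. Signal 6 is SIGABRT, and with core dumps enabled the core file is normally written to the process's current working directory.

```shell
#!/bin/sh
# Sketch: abort a stand-in process with signal 6 (SIGABRT), as "kill -6" does.
# "sleep 30" is a placeholder; on a real system use the PID of the stuck bbgen.

# Allow core files to be written, if the hard limit permits it.
ulimit -c unlimited 2>/dev/null || true

sleep 30 &
pid=$!

# Send SIGABRT; the default action terminates the process and dumps core.
kill -6 "$pid"

# Collect the exit status without tripping "set -e" style error handling.
status=0
wait "$pid" || status=$?

# Shells report death by signal as 128 + signal number, so SIGABRT gives 134.
echo "exit status: $status"
```

On the real process, the resulting core file is what gets loaded into gdb together with the bbgen binary to obtain the backtrace.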
quoted from Josh Luthman
make install as hobbituser and root respectively.  Once Hobbit is started
again, the bbgen ps takes over the CPU.
Sounds like it is reproducible. That's good (for debugging, at least!)


Regards,
Henrik
list Josh Luthman · Fri, 1 Feb 2008 10:33:15 -0500 ·
Anna,

I forgot to mention that once this bbgen test kicks into gear, it never
reports anything.  On the main view it is green, but on details of the page
it is purple - and stays that way.

Henrik,

In case it is relevant the core was put in /home/shire/data/acks - not the
usual ~hobbit/server/tmp

It is reproducible - I can kill the ps and restart hobbit, and every single
time, a few seconds after I start hobbit, the CPU is overrun by this one PID.

Here is the gdb output:

# gdb bin/bbgen ../data/acks/core.31412
GNU gdb Red Hat Linux (6.5-25.el5rh)
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db
library "/lib/libthread_db.so.1".


warning: Can't read pathname for load map: Input/output error.
Reading symbols from /lib/libpcre.so.0...done.
Loaded symbols for /lib/libpcre.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Core was generated by `bbgen --recentgifs --subpagecolumns=2 --report'.
Program terminated with signal 6, Aborted.
#0  0x0805274f in calc_pagecolors (phead=0x81ddaa0) at process.c:96
96                      for (h = toppage->hosts; (h); h = h->next) {
(gdb) bt
#0  0x0805274f in calc_pagecolors (phead=0x81ddaa0) at process.c:96
#1  0x08049d31 in main (argc=4, argv=0xbfbdd4b4) at bbgen.c:594

Is this what you need?
list Josh Luthman · Sun, 3 Feb 2008 01:11:08 -0500 ·
My CPU is still at 100% =P

Are there any ideas on what I can test out?
list Henrik Størner · Sun, 3 Feb 2008 09:47:39 +0100 ·
quoted from Josh Luthman
On Sun, Feb 03, 2008 at 01:11:08AM -0500, Josh Luthman wrote:
My CPU is still at 100% =P

Is there any ideas on what I can test out?
Could you send me (off-list) the output from "bb 127.0.0.1 hobbitdboard"
and "bbcmd bbhostshow", please ?


Regards,
Henrik
list Josh Luthman · Tue, 12 Feb 2008 09:09:00 -0500 ·
Henrik's solution for this problem is shown below:

cd hobbit-4.2.0
make clean
make
/etc/init.d/hobbit stop
killall bbgen
cp bbdisplay/bbgen ~hobbit/server/bin
/etc/init.d/hobbit start

The problem per Henrik:
"the problem probably was that some of the C modules had not been
recompiled. If the number of parameters to a function changes and the
code is not re-compiled, there will be some stack corruption when a
function returns - and this can cause all sorts of odd behaviour,
including the looping that you've seen."
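
One way to look for leftover object files before resorting to "make clean" is to compare timestamps. The sketch below is a hypothetical helper, not part of Hobbit: stale_objects and the file names in the demo are made up for illustration, and it assumes that an object file older than a patched header was built before the patch.

```shell
#!/bin/sh
# Sketch: list object files that are not newer than a reference file,
# for example a header that the patch modified.
# Usage: stale_objects <source-dir> <reference-file>
stale_objects() {
    find "$1" -name '*.o' ! -newer "$2" -print
}

# Demo in a scratch directory with hypothetical file names.
demo=$(mktemp -d)
touch -t 200702090000 "$demo/old.o"      # compiled before the patch
touch -t 200802010000 "$demo/patched.h"  # header changed by the patch
touch -t 200802020000 "$demo/new.o"      # recompiled after the patch

result=$(stale_objects "$demo" "$demo/patched.h")
echo "$result"   # only the path to old.o is listed

rm -rf "$demo"
```

Any object file this reports would need to be rebuilt; "make clean" followed by "make", as in the steps above, rebuilds everything and avoids the question altogether.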