Missed out the other bug, which is a duplicate of the one below. It gives
more detail about the issue:
"Bug ID: 6454060 Synopsis: select()/poll() indicate a socket as readable
when there is no data available
Description:
pollsys() (on behalf of select()) indicates available data on a socket but
a following recvfrom() fails with EAGAIN:"
I'm not sure whether this applies here, but it is a similar kind of spurious issue.
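
To show what that symptom looks like from the application side, here is a minimal Python sketch of a reader that tolerates it. It is not taken from hobbitd or bbgen. The function name recv_when_ready and its parameters are made up for illustration, and it assumes the socket has been put into non-blocking mode.

```python
import select
import socket


def recv_when_ready(sock, nbytes, timeout=5.0, max_spurious=10):
    """Read from a non-blocking socket, tolerating spurious readiness.

    Returns the bytes read (b"" means the peer closed the connection).
    Raises TimeoutError if select() never reports the socket readable,
    or if readiness turns out to be spurious max_spurious times in a row.
    """
    for _attempt in range(max_spurious):
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            raise TimeoutError("socket not readable within timeout")
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: select() reported the socket readable,
            # but no data was there. Go back and wait again.
            continue
    raise TimeoutError("too many spurious wake-ups")
```

A client that treats EAGAIN after select() as a hard error would give up at exactly the point where this sketch retries, which would match an intermittent "no data" result.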
Colin Spargo/GIS/SC/CSC
19/04/2007 18:10
To
user-ae9b8668bcde@xymon.invalid
cc
Subject
RE: [hobbit] "hobbitd status-board not available" from bbgen on solaris 10
Good to hear!
A trawl through sunsolve shows a few bugs that may have something to do
with it:
Bug ID: 6458410 Synopsis: read() may spuriously return EAGAIN while
unfusing a TCP connection
No patch for this yet I believe.
"T.J. Yang" <user-8e841282cda5@xymon.invalid>
19/04/2007 17:46
Please respond to
user-ae9b8668bcde@xymon.invalid
To
user-ae9b8668bcde@xymon.invalid
cc
Subject
RE: [hobbit] "hobbitd status-board not available" from bbgen on solaris 10
1. stop hobbit server
2. zero out the existing log file
3. apply the online fix
4. So far so good: I can confirm the status-board error message is now
gone ;)
bash-3.00# grep -i status-board *.log
bash-3.00# pwd
/var/opt/hobbitserver42/log
bash-3.00# ls *.log
acknowledge.log  cgierror.log    hobbitlaunch.log   rrd-data.log
bb-display.log   clientdata.log  hobbitlaunch.pid   rrd-status.log
bb-network.log   history.log     hostdata.log
bb-retest.log    hobbitd.log     notifications.log
bbcombotest.log  hobbitd.pid     page.log
bash-3.00# cat /etc/release
Solaris 10 6/06 s10s_u2wos_09a SPARC
Copyright 2006 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 09 June 2006
bash-3.00#
Good job on tracking down the cause and providing the fix.
T.J. Yang
From: Colin Spargo <user-4148d5b43ace@xymon.invalid>
Reply-To: user-ae9b8668bcde@xymon.invalid
To: user-ae9b8668bcde@xymon.invalid
Subject: [hobbit] "hobbitd status-board not available" from bbgen on
solaris 10
Date: Thu, 19 Apr 2007 12:30:23 +0100
If anyone has been having issues with bbgen logging this error message on
Solaris 10 and intermittently failing, resulting in blank status pages,
then I think I have found a workaround.
If you disable TCP fusion by adding the following kernel parameter to
/etc/system and rebooting, hopefully you will find that the problem goes
away.
set ip:do_tcp_fusion = 0
Apparently this can be done on a live system as well (without rebooting),
but will require hobbit to be restarted. To do this:
echo do_tcp_fusion/W0 | mdb -kw
TCP fusion is only used on local loopback connections to speed them up by
bypassing the normal TCP stack. I found that the problem only occurred when
connecting to hobbitd locally. I tried running "bb localhost hobbitdboard"
once a second, and found it would often return no data, but if I ran the
same command from another host to the hobbit server, it always returned
correct data. This made me suspect TCP fusion, as I have run into issues
with it before. It is best left disabled in my opinion.
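
The once-a-second test described above can be scripted. The following is a rough Python sketch, not the exact procedure used: it assumes the bb client is on the PATH, and the helper names fetch_board and count_empty_replies are invented for this example.

```python
import subprocess
import time


def fetch_board(host="localhost"):
    """Run `bb HOST hobbitdboard` and return its raw stdout (assumes bb is on PATH)."""
    result = subprocess.run(["bb", host, "hobbitdboard"], capture_output=True)
    return result.stdout


def count_empty_replies(fetch, attempts=60, delay=1.0):
    """Call fetch() repeatedly and count how many replies contain no data."""
    empty = 0
    for i in range(attempts):
        if not fetch().strip():
            empty += 1
        # Pause between attempts, but not after the last one.
        if i < attempts - 1:
            time.sleep(delay)
    return empty
```

Calling count_empty_replies(fetch_board) on the hobbit server itself, and then again from another machine with fetch_board pointed at the server's hostname, gives two counts to compare before and after disabling TCP fusion.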